Conversation

@ydshieh ydshieh commented Aug 27, 2025

What does this PR do?

The CI job running the tests in Qwen2MoeIntegrationTest is killed from time to time because it hits the CPU memory limit (60 GB); it is unclear to me why the memory usage differs between runs.

This PR simply reuses the same model, set up once in the first test, across all tests except test_model_a2_7b_long_prompt_flash_attn, which loads the model with flash_attn.
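The load-once, reuse-everywhere idea can be sketched with a class-level cache on the test class. This is a minimal illustration, not the PR's actual code: the load_model helper and the test method bodies below are hypothetical stand-ins for the expensive from_pretrained call and the real integration tests.

```python
import unittest

LOAD_CALLS = 0  # counts how many times the expensive load actually runs

def load_model(name):
    # Hypothetical stand-in for AutoModelForCausalLM.from_pretrained(name):
    # expensive in CPU memory, so it should run exactly once per test class.
    global LOAD_CALLS
    LOAD_CALLS += 1
    return {"name": name}

class Qwen2MoeIntegrationTest(unittest.TestCase):
    model = None  # shared across all tests in the class

    @classmethod
    def get_model(cls):
        # Lazily load on first use; every later test reuses the same
        # object, keeping peak CPU memory at a single model copy.
        if cls.model is None:
            cls.model = load_model("Qwen/Qwen1.5-MoE-A2.7B")
        return cls.model

    def test_model_logits(self):
        self.assertIsNotNone(self.get_model())

    def test_model_generation(self):
        self.assertIs(self.get_model(), type(self).model)

# Both lookups return the same cached instance; the loader ran once.
shared = Qwen2MoeIntegrationTest.get_model()
again = Qwen2MoeIntegrationTest.get_model()
```

A test that needs a differently-configured model (here, the flash_attn one) simply bypasses the cache and loads its own copy.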

@ydshieh ydshieh requested a review from zucchini-nlp August 27, 2025 13:45
out = model(input_ids).logits.float().cpu()
# Expected mean on dim = -1
-EXPECTED_MEAN = torch.tensor([[-4.2125, -3.6416, -4.9136, -4.3005, -4.9938, -3.4393, -3.5195, -4.1621]])
+EXPECTED_MEAN = torch.tensor([[-4.2106, -3.6411, -4.9111, -4.2840, -4.9950, -3.4438, -3.5262, -4.1624]])
ydshieh (Collaborator, Author) commented:

Changed because we now use fp16 (previously fp32).
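As a rough illustration (not the PR's code) of why the golden values shift: round-tripping a float through IEEE 754 half precision perturbs it around the third decimal place, so expected means recorded under fp32 no longer match exactly once the model runs in fp16.

```python
import struct

def to_fp16(x: float) -> float:
    # Round-trip a Python float through IEEE 754 half precision
    # (struct format 'e' is binary16).
    return struct.unpack('e', struct.pack('e', x))[0]

# Two of the fp32 golden values from the old EXPECTED_MEAN
vals_fp32 = [-4.2125, -3.6416]
vals_fp16 = [to_fp16(v) for v in vals_fp32]

# fp16 keeps roughly 3 significant decimal digits, so the round trip
# drifts slightly -- enough to break exact golden-value comparisons.
drift = [abs(a - b) for a, b in zip(vals_fp32, vals_fp16)]
```

In the real model the drift compounds through every fp16 matmul, which is why the whole EXPECTED_MEAN tensor had to be re-recorded rather than adjusted elementwise.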

@github-actions (Contributor) commented:

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen2_moe

def test_speculative_generation(self):
    EXPECTED_TEXT_COMPLETION = (
-        "To be or not to be, that is the question.\nThe answer is to be, of course. But what does it"
+        "To be or not to be, that is the question. Whether 'tis nobler in the mind to suffer the sl"
ydshieh (Collaborator, Author) commented:

The previous expected value never passed.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp (Member) left a comment:

Thanks

assistant_model = model
assistant_model.generation_config.num_assistant_tokens = 2
assistant_model.generation_config.num_assistant_tokens_schedule = "constant"
generated_ids = model.generate(input_ids, max_new_tokens=4, temperature=0)
A reviewer (Member) commented:
another test where assistant_model is not actually used 😄
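The reviewer's point: the snippet configures attributes on assistant_model but never hands it to generate, so no assisted (speculative) decoding actually happens; in transformers, the draft model must be passed explicitly via the assistant_model argument of generate. A minimal sketch with a hypothetical FakeModel (not the real transformers API) showing that the keyword argument, not the attribute setup, is what activates the assistant:

```python
class FakeModel:
    # Hypothetical stand-in for a causal LM with a generate() method;
    # it only records whether an assistant model was actually supplied.
    def __init__(self):
        self.generation_config = type("Cfg", (), {})()

    def generate(self, input_ids, max_new_tokens=4, assistant_model=None, **kw):
        # Assisted decoding is only triggered when an assistant model
        # is passed in explicitly -- configuring one elsewhere is inert.
        return {"used_assistant": assistant_model is not None}

model = FakeModel()
assistant_model = model
assistant_model.generation_config.num_assistant_tokens = 2  # configured...

out_without = model.generate([1, 2, 3])  # ...but never passed: not used
out_with = model.generate([1, 2, 3], assistant_model=assistant_model)
```

Note also that here assistant_model is the model itself, so even if it were passed, the "draft" model would be as slow as the target; a genuine speculative setup uses a smaller draft model.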

@ydshieh ydshieh merged commit 6350636 into main Aug 27, 2025
19 checks passed
@ydshieh ydshieh deleted the fix_qwen2_moe branch August 27, 2025 14:22

4 participants