
@mmathew23
Collaborator

The latest TRL redefines DataCollatorForLanguageModeling and uses that version internally. We do import it, but the name is then overridden because we later import the same name from transformers.

https://github.com/huggingface/trl/blob/v0.18.0/trl/trainer/sft_trainer.py#L75
vs
https://github.com/huggingface/transformers/blob/v4.52.3/src/transformers/data/data_collator.py#L764

Since the signature changed, the Unsloth SFTTrainer fails because we initialize the transformers version with the arguments expected by the sft_trainer version. My fix imports the transformers version under a different name so the two don't clobber each other.

I've successfully verified that the fix works for SFT and DPO.

Failed original runs:
llama: https://colab.research.google.com/drive/1UGMaG2NQMho1z9i7ZDC9J_HYMW0S3L9k?usp=sharing
qwen: https://colab.research.google.com/drive/1w83mQOzSL6T8sbBA2GjJO8n1ycA6kzoj?usp=sharing

fixed:
qwen: https://colab.research.google.com/drive/1MmZJFz6VSHKnyPknUxDa-TDjKI1hd-1U?usp=sharing
dpo: https://colab.research.google.com/drive/1psXyPg9RFJIPUPjoeHQrUDdJA5aTT1XX?usp=sharing

Please note: with the new TRL 0.18.0, non-completion tokens are masked by default if prompt is a valid key in the dataset.
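
As a rough sketch of what masking non-completion tokens means in practice: prompt positions get the label -100, which the cross-entropy loss ignores by convention, so only completion tokens contribute to the loss. The helper `mask_prompt_labels` and the token ids below are hypothetical and are not TRL's implementation.

```python
IGNORE_INDEX = -100  # label value skipped by the loss (PyTorch / Hugging Face convention)


def mask_prompt_labels(prompt_ids, completion_ids):
    """Concatenate prompt and completion; keep labels only for the completion."""
    input_ids = list(prompt_ids) + list(completion_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return {"input_ids": input_ids, "labels": labels}


# Made-up token ids, purely for illustration.
example = mask_prompt_labels(prompt_ids=[101, 7592], completion_ids=[2088, 102])
print(example["labels"])
```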

@danielhanchen danielhanchen merged commit 8a040b7 into unslothai:main May 28, 2025
