Fix AttributeError in Wuerstchen prior training model card #13529
Open
Ricardo-M-L wants to merge 1 commit into huggingface:main from
Conversation
Both Wuerstchen prior training scripts build their HF Hub model card
with an f-string that references `{args.weight_dtype}`:
```python
pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype={args.weight_dtype})
```
`weight_dtype` is a local variable in `main()` (set from
`args.mixed_precision`), not an attribute of the `args` namespace.
`--weight_dtype` is never defined on the parser, so when
`save_model_card(args, ...)` runs (triggered by `--push_to_hub`; line 925
of the prior script, line 938 of the LoRA variant), the f-string
formatter raises:

```
AttributeError: 'Namespace' object has no attribute 'weight_dtype'
```
at the tail end of training, right after the model has been saved
locally — losing the hub upload for anyone who trained with
`--push_to_hub`.
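A minimal sketch of the failure mode (names and values here are hypothetical; the real scripts pass a much larger `args` namespace):

```python
from argparse import Namespace

# Hypothetical minimal args; the parser never defines --weight_dtype.
args = Namespace(mixed_precision="fp16", push_to_hub=True)
# weight_dtype lives only as a local in main(), derived from
# args.mixed_precision -- it is never stored back on the namespace.
weight_dtype = "torch.float16" if args.mixed_precision == "fp16" else "torch.float32"

def save_model_card(args, repo_id):
    # The card template looks up args.weight_dtype, which does not exist.
    return f'pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype={args.weight_dtype})'

try:
    save_model_card(args, "user/wuerstchen-prior")
    err_msg = None
except AttributeError as exc:
    err_msg = str(exc)
print(err_msg)
```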
Affected files:
- `examples/research_projects/wuerstchen/text_to_image/train_text_to_image_prior.py` (lines 104-105)
- `examples/research_projects/wuerstchen/text_to_image/train_text_to_image_lora_prior.py` (lines 105, 108)
Both only use the value to illustrate a `torch_dtype=` argument in the
generated README code block. Mirror the canonical
`examples/text_to_image/train_text_to_image.py:95`, which hardcodes
`torch.float16` (the standard inference dtype) into the model card
template. Drop the broken f-string interpolation.
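A sketch of the fixed template under the same hypothetical names, mirroring `examples/text_to_image/train_text_to_image.py`: `torch.float16` is hardcoded into the README snippet, so nothing is looked up on `args` at card-generation time.

```python
from argparse import Namespace

# Hardcode the standard inference dtype in the illustrative snippet
# instead of interpolating a (non-existent) args.weight_dtype.
def save_model_card(args, repo_id):
    return f'pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype=torch.float16)'

card = save_model_card(Namespace(mixed_precision="fp16"), "user/wuerstchen-prior")
print(card)
```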
Before submitting
- `--push_to_hub` functionality.

Who can review?
@sayakpaul