
Fix AttributeError in Wuerstchen prior training model card #13529

Open
Ricardo-M-L wants to merge 1 commit into huggingface:main from Ricardo-M-L:fix-wuerstchen-prior-model-card-weight-dtype

Conversation

@Ricardo-M-L
Contributor

What this PR does

Both Wuerstchen prior training scripts build their HF Hub model card with an f-string that references {args.weight_dtype}:

train_text_to_image_prior.py (lines 104-105):

pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype={args.weight_dtype})
pipe_t2i = DiffusionPipeline.from_pretrained("{args.pretrained_decoder_model_name_or_path}", torch_dtype={args.weight_dtype})

train_text_to_image_lora_prior.py (lines 105, 108):

pipeline = AutoPipelineForText2Image.from_pretrained(
                "{args.pretrained_decoder_model_name_or_path}", torch_dtype={args.weight_dtype}
            )
pipeline.prior_pipe.load_lora_weights("{repo_id}", torch_dtype={args.weight_dtype})

Why this is a real bug

weight_dtype is a local variable in main() (derived from args.mixed_precision), not an attribute of the args namespace, and --weight_dtype is never defined on the parser. So when save_model_card(args, ...) runs (triggered by --push_to_hub — line 925 of the prior script, line 938 of the LoRA variant), evaluating the f-string raises:

AttributeError: 'Namespace' object has no attribute 'weight_dtype'

This fires at the tail end of training, right after the model has been saved locally — losing the hub upload for anyone who trained with --push_to_hub.
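The failure mode can be reproduced in isolation with a few lines of Python. This is an illustrative sketch, not the actual diffusers code — the Namespace fields and the dtype derivation simply mirror the pattern in the training script:

```python
import argparse

# Mimic the parsed args: mixed_precision exists, weight_dtype does not.
args = argparse.Namespace(
    pretrained_decoder_model_name_or_path="warp-ai/wuerstchen",
    mixed_precision="fp16",
)

# In main(), weight_dtype is a *local* variable derived from
# args.mixed_precision -- it is never stored back on the Namespace.
weight_dtype = "torch.float16" if args.mixed_precision == "fp16" else "torch.float32"

error_message = ""
try:
    # The model-card template references args.weight_dtype, which does not
    # exist; evaluating the f-string raises AttributeError.
    snippet = (
        f'pipe_prior = DiffusionPipeline.from_pretrained('
        f'"{{repo_id}}", torch_dtype={args.weight_dtype})'
    )
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Because the f-string is evaluated inside save_model_card, the crash surfaces only on the --push_to_hub code path, after training has already completed.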

Fix

The torch_dtype= argument in the model card's usage snippet is purely illustrative, so there is no need to interpolate the training dtype at all. Mirror the canonical examples/text_to_image/train_text_to_image.py:95, which hardcodes torch.float16 (the standard inference dtype) into the card template:

-pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype={args.weight_dtype})
-pipe_t2i = DiffusionPipeline.from_pretrained("{args.pretrained_decoder_model_name_or_path}", torch_dtype={args.weight_dtype})
+pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype=torch.float16)
+pipe_t2i = DiffusionPipeline.from_pretrained("{args.pretrained_decoder_model_name_or_path}", torch_dtype=torch.float16)

Same edit in the LoRA variant (2 occurrences).
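After the fix, the template only interpolates attributes that actually exist on args. A minimal sketch of the corrected pattern (usage_snippet is a hypothetical helper standing in for the template-building part of save_model_card):

```python
import argparse

def usage_snippet(repo_id: str, args: argparse.Namespace) -> str:
    """Build the README usage example with torch.float16 hardcoded,
    matching examples/text_to_image/train_text_to_image.py."""
    return (
        f'pipe_prior = DiffusionPipeline.from_pretrained('
        f'"{repo_id}", torch_dtype=torch.float16)\n'
        f'pipe_t2i = DiffusionPipeline.from_pretrained('
        f'"{args.pretrained_decoder_model_name_or_path}", torch_dtype=torch.float16)'
    )

# Only pretrained_decoder_model_name_or_path is read -- no weight_dtype needed.
args = argparse.Namespace(pretrained_decoder_model_name_or_path="warp-ai/wuerstchen")
card_code = usage_snippet("my-user/wuerstchen-prior-finetuned", args)
print(card_code)
```

Since every interpolated name is a real argparse argument, the card renders cleanly regardless of which --mixed_precision setting was used for training.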

Before submitting

  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue or the forum? N/A — straightforward crash fix.
  • Did you make sure to update the documentation with your changes? N/A — only touches the generated model-card template, matches the canonical script's pattern.
  • Did you write any new necessary tests? N/A — restores --push_to_hub functionality.

Who can review?

@sayakpaul

Both Wuerstchen prior training scripts build their HF Hub model card
with an f-string that references `{args.weight_dtype}`:

    pipe_prior = DiffusionPipeline.from_pretrained("{repo_id}", torch_dtype={args.weight_dtype})

`weight_dtype` is a local variable in `main()` (set from
`args.mixed_precision`), not an attribute of the `args` namespace.
`--weight_dtype` is never defined on the parser, so when
`save_model_card(args, ...)` runs (triggered by `--push_to_hub`), the
f-string formatter raises:

    AttributeError: 'Namespace' object has no attribute 'weight_dtype'

at the tail end of training, right after the model has been saved
locally — losing the hub upload for anyone who trained with
`--push_to_hub`.

Affected files:
- `examples/research_projects/wuerstchen/text_to_image/train_text_to_image_prior.py` (lines 104-105)
- `examples/research_projects/wuerstchen/text_to_image/train_text_to_image_lora_prior.py` (lines 105, 108)

Both only use the value to illustrate a `torch_dtype=` argument in the
generated README code block. Mirror the canonical
`examples/text_to_image/train_text_to_image.py:95`, which hardcodes
`torch.float16` (the standard inference dtype) into the model card
template. Drop the broken f-string interpolation.
@github-actions github-actions Bot added examples size/S PR with diff < 50 LOC labels Apr 21, 2026
