Thank you for your tremendous contributions to the community. I have a question regarding the de-distillation LoRA strategy described in the repository. To enable fine-tuning of the few-step distilled model (zimage-turbo), the approach first generates a synthetic dataset using the distilled model itself, and then trains a de-distillation LoRA on this data. My specific question is: when training this de-distillation LoRA, is the optimization objective simply the standard flow matching loss (i.e., corrupting clean samples with noise to obtain noisy states and predicting the velocity), or does it employ distillation-specific losses?
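To make the question concrete, here is a minimal sketch of what I mean by the "standard flow matching loss". It assumes the common rectified-flow convention, where the noisy state is `x_t = (1 - t) * x0 + t * noise` and the target velocity is `noise - x0`; the repository may use a different time direction or timestep sampling. `predict_velocity` is a hypothetical stand-in for the model with the LoRA applied, not a function from the repository, and the sample is a flat list of floats purely for illustration.

```python
import random


def flow_matching_loss(predict_velocity, x0, rng):
    """Mean squared error between predicted and target velocity for one sample.

    x0: clean sample as a flat list of floats (e.g. a flattened latent).
    predict_velocity: hypothetical model stand-in, called as
        predict_velocity(x_t, t) and returning a list the same length as x0.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    # Assumption: uniform timestep sampling in [0, 1).
    t = rng.random()
    noise = [rng.gauss(0.0, 1.0) for _ in x0]

    # Corrupt the clean sample: linear interpolation between data (t=0)
    # and pure noise (t=1).
    x_t = [(1.0 - t) * a + t * n for a, n in zip(x0, noise)]

    # Target velocity is the time derivative of x_t, i.e. noise - x0.
    target = [n - a for a, n in zip(x0, noise)]

    pred = predict_velocity(x_t, t)
    return sum((p - v) ** 2 for p, v in zip(pred, target)) / len(x0)


# Usage: a dummy "model" that always predicts zero velocity.
x0 = [0.5, -1.0, 2.0]
loss = flow_matching_loss(lambda x_t, t: [0.0] * len(x_t), x0, random.Random(0))
```

If the de-distillation LoRA is trained with something like this on the synthetic images, then no teacher model or distillation-specific term is involved; that is the part I would like to confirm.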