stablediffusion bar (Baidu Tieba) — followers: 47,364 · posts: 199,074
  • 8 replies · page 1 of 1

Guys, why is my Prodigy run having no effect — did I misconfigure something?


Whether I use cosine or constant, there is no effect. With constant, the learning rate never changes — it's a flat line — and the preview images are all identical. The downloaded LoRA has no effect either. With AdamW8bit, I can already see an effect by the time the second epoch is saved.



IP: Fujian · Android client · Floor 1 · 2025-06-01 14:05
    Config as follows:
    model_train_type = "sdxl-lora"
    pretrained_model_name_or_path = "/root/autodl-tmp/ill/illustriousXL_v01.safetensors"
    resume = ""
    train_data_dir = "/root/autodl-tmp/lora-scripts/train/theresa/"
    prior_loss_weight = 1
    resolution = "1024,1024"
    enable_bucket = true
    min_bucket_reso = 1
    max_bucket_reso = 2560
    bucket_reso_steps = 32
    bucket_no_upscale = true
    output_name = "theresav2"
    output_dir = "/root/autodl-tmp/lora-scripts/output/theresav2"
    save_model_as = "safetensors"
    save_precision = "fp16"
    save_every_n_epochs = 2
    save_state = true
    max_train_epochs = 50
    train_batch_size = 4
    gradient_checkpointing = true
    gradient_accumulation_steps = 1
    network_train_unet_only = true
    network_train_text_encoder_only = false
    learning_rate = 1
    unet_lr = 1
    text_encoder_lr = 1
    lr_scheduler = "cosine"
    lr_warmup_steps = 0
    optimizer_type = "Prodigy"
    network_module = "networks.lora"
    network_dim = 32
    network_alpha = 32
    network_dropout = 0.01
    sample_prompts = "masterpiece, best quality, theresa,1girl ,full body,turtleneck dress,high heel boots --w 1024 --h 1024 --l 7 --s 24 --d 1337"
    sample_sampler = "euler_a"
    sample_every_n_epochs = 2
    log_with = "tensorboard"
    log_prefix = "theresav2"
    log_tracker_name = "LORA"
    logging_dir = "./logs"
    caption_extension = ".txt"
    shuffle_caption = true
    keep_tokens = 1
    max_token_length = 255
    seed = 1019
    mixed_precision = "fp16"
    full_fp16 = true
    no_half_vae = true
    xformers = true
    sdpa = true
    lowram = false
    cache_latents = true
    cache_latents_to_disk = true
    persistent_data_loader_workers = true
    optimizer_args = [
    "decouple=True",
    "weight_decay=0.01",
    "use_bias_correction=True",
    "d_coef=2.0"
    ]


    IP: Fujian · Android client · Floor 2 · 2025-06-01 14:06
      With Prodigy, d*lr is the real learning rate. It cold-starts: d begins at 0 and rises automatically bit by bit, so the first few epochs of training look almost unchanged from the base model. With AdamW8bit, if you don't set warmup steps, the learning rate starts at exactly the value you configured and then gradually decays toward 0 — so you see a big effect right away, and smaller and smaller changes later on.
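The asymmetry described above can be sketched numerically. This is a toy pure-Python illustration, not the real Prodigy update rule: the exponential ramp for d is invented just to visualize the cold start, and 1e-4 is an assumed AdamW base learning rate.

```python
import math

total_steps = 1000

def adamw_lr(step, base_lr=1e-4):
    # AdamW with no warmup: full lr from step 0, cosine decay toward 0.
    return base_lr * 0.5 * (1 + math.cos(math.pi * step / total_steps))

def prodigy_effective_lr(step, lr=1.0, d0=1e-6):
    # Crude stand-in for Prodigy's adaptive d estimate: starts at d0 (~1e-6)
    # and ramps up until it saturates at a "discovered" step size (~1e-4 here).
    d = min(1e-4, d0 * math.exp(step / 60))
    return d * lr

for step in (0, 100, 300, 600):
    print(step, f"{adamw_lr(step):.2e}", f"{prodigy_effective_lr(step):.2e}")
```

The printout shows why the first Prodigy epochs look like no-ops: its effective step size starts two orders of magnitude below AdamW's and only catches up later.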


      IP: Beijing · Android client · Floor 3 · 2025-06-02 09:30


        IP: Fujian · Android client · Floor 4 · 2025-06-02 11:39
          I remember that after selecting Prodigy there's a parameter below called prodigy_d_coef — try setting it to 2.0. Also, the black line in your screenshot shows only two epochs trained, with epoch 3 in progress and only 300-some steps total; it's normal for the lr to still be near 0 at that point. You probably won't see it start rising until around epoch 6 or 7. Hover your mouse over the d*lr chart and check whether the value on the curve is 0 or a very small number close to 0, like 1e-8.
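Since the replies reason in steps rather than epochs, a quick helper makes the conversion explicit. The 600-image dataset size below is hypothetical; batch_size=4 and gradient_accumulation_steps=1 match the config posted above.

```python
def steps_per_epoch(num_images, batch_size, grad_accum=1, repeats=1):
    # One optimizer step consumes batch_size * grad_accum images;
    # repeats is the per-image repeat count some trainers apply.
    return (num_images * repeats) // batch_size // grad_accum

# e.g. a hypothetical 600-image set at batch_size=4:
print(steps_per_epoch(600, 4))  # 150 steps/epoch -> ~300 steps after 2 epochs
```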


          IP: Beijing · Android client · Floor 5 · 2025-06-02 15:22
            Ran a dozen-plus epochs with no change at all. Config:
            model_train_type = "sdxl-lora"
            pretrained_model_name_or_path = "/root/autodl-tmp/ill/[link]"
            resume = ""
            train_data_dir = "/root/autodl-tmp/lora-scripts/train/theresa/"
            prior_loss_weight = 1
            resolution = "512,512"
            enable_bucket = true
            min_bucket_reso = 1
            max_bucket_reso = 1024
            bucket_reso_steps = 32
            bucket_no_upscale = true
            output_name = "theresav3"
            output_dir = "/root/autodl-tmp/lora-scripts/output/theresav2"
            save_model_as = "safetensors"
            save_precision = "fp16"
            save_every_n_epochs = 2
            save_state = true
            max_train_epochs = 40
            train_batch_size = 4
            gradient_checkpointing = true
            gradient_accumulation_steps = 1
            network_train_unet_only = false
            network_train_text_encoder_only = false
            learning_rate = 1
            unet_lr = 1
            text_encoder_lr = 1
            lr_scheduler = "constant"
            lr_warmup_steps = 0
            optimizer_type = "Prodigy"
            network_module = "[link]"
            network_dim = 32
            network_alpha = 32
            network_dropout = 0.01
            sample_prompts = "masterpiece, best quality, 1girl, solo,theresa,turtleneck dress, chest-shoulder device,single pauldron, high heel boots,sky,grass --w 1024 --h 1024 --l 7 --s 24 --d 1337"
            sample_sampler = "euler_a"
            sample_every_n_epochs = 1
            log_with = "tensorboard"
            log_prefix = "theresav1"
            log_tracker_name = "LORA"
            logging_dir = "./logs"
            caption_extension = ".txt"
            shuffle_caption = true
            keep_tokens = 1
            max_token_length = 255
            seed = 1019
            mixed_precision = "fp16"
            full_fp16 = true
            no_half_vae = true
            xformers = true
            sdpa = true
            lowram = false
            cache_latents = true
            cache_latents_to_disk = true
            persistent_data_loader_workers = true
            optimizer_args = [
            "decouple=True",
            "weight_decay=0.01",
            "use_bias_correction=True",
            "d_coef=2.0"
            ]
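For reference, the two configs posted in this thread differ in only a handful of keys; a quick diff (values copied from the posts above) makes the changes between runs explicit.

```python
# Keys that changed between the first config (theresav2) and the second
# (theresav3), values taken verbatim from the two posts in this thread.
v2 = {"resolution": "1024,1024", "lr_scheduler": "cosine",
      "network_train_unet_only": True, "max_train_epochs": 50,
      "max_bucket_reso": 2560}
v3 = {"resolution": "512,512", "lr_scheduler": "constant",
      "network_train_unet_only": False, "max_train_epochs": 40,
      "max_bucket_reso": 1024}
changed = {k: (v2[k], v3[k]) for k in v2 if v2[k] != v3[k]}
print(changed)
```

Note that none of these changes touch the optimizer itself — both runs use Prodigy with learning_rate = 1 and the same optimizer_args, so the "no effect" symptom is the same in both.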




            IP: Fujian · Android client · Floor 6 · 2025-06-02 21:40
              I trained with the flux lora trainer — any model whose loss curve slowly declined over the run turned out pretty well in the end. "constant" means a fixed learning rate, which is why your lr plot is a straight line. I use the adafactor optimizer. Just for reference.
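TensorBoard's loss smoothing is an exponential moving average, so "the loss curve slowly declines" refers to the smoothed curve, not the noisy raw one. A minimal sketch (the alpha value and sample losses are illustrative):

```python
def ema(values, alpha=0.9):
    # Exponential moving average, seeded with the first value,
    # matching the smoothing TensorBoard applies to scalar plots.
    out, prev = [], values[0]
    for v in values:
        prev = alpha * prev + (1 - alpha) * v
        out.append(prev)
    return out

smoothed = ema([1.0, 0.5, 0.25])
print(smoothed)  # each point drifts slowly toward the newest raw loss
```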


              IP: Zhejiang · Floor 7 · 2025-06-02 22:10