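For a long time I've just captioned with an auto-tagger and never paid much attention as long as the descriptions were roughly right; the dozen or so LoRAs I've trained so far are all usable. Yesterday a realistic-person LoRA wasn't converging well, so I had Doubao analyze my captions. Doubao said the captions were the problem. Since Doubao can only be half trusted, I tested it myself, and the results were good. Here's the process.
The dataset is 22 headshots of a real person at 1024x1024, with multiple angles and expressions. Base model: an Illustrious-based realistic model. Settings: repeat 40, epoch 80 (deliberately set high; I stop whenever it has converged, usually somewhere between 3000 and 5000 steps), learning rate 1e-4, batch size 8. In the first run the preview images still looked nothing like the subject at 1600 steps, so I stopped it. I had Doubao analyze the parameters and captions; it said my captions were wrong and the repeat was too high, and recommended repeat 20 with simplified captions. I tried it without expecting much, and after only 550 steps I had a LoRA that converged well and generalizes very well.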
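To put the step counts above in perspective, here is a rough sketch of the arithmetic. It assumes the common kohya-style convention that one epoch is images x repeats / batch size optimizer steps (rounded up), with no gradient accumulation; the function names are made up for illustration, and your trainer may count steps differently.

```python
import math

def steps_per_epoch(num_images: int, repeats: int, batch_size: int) -> int:
    """Optimizer steps in one epoch, assuming images * repeats samples per epoch."""
    return math.ceil(num_images * repeats / batch_size)

def views_per_image(steps: int, batch_size: int, num_images: int) -> float:
    """Average number of times each image has been seen after `steps` steps."""
    return steps * batch_size / num_images

# First run: 22 images, repeat 40, batch size 8, stopped at 1600 steps.
run1 = steps_per_epoch(22, 40, 8)   # 110 steps per epoch
print(run1, 1600 / run1, views_per_image(1600, 8, 22))

# Second run: 22 images, repeat 20, batch size 8, stopped at 550 steps.
run2 = steps_per_epoch(22, 20, 8)   # 55 steps per epoch
print(run2, 550 / run2, views_per_image(550, 8, 22))
```

Under these assumptions, the first run had shown each image roughly 580 times by step 1600 and still didn't resemble the subject, while the second run got there after 200 views per image (10 epochs at repeat 20).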
---------------------------------------------------------
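Improvements over my previous training approach: 1. Faster convergence. My earlier LoRAs generally only became usable at 3000 to 5000 steps; training time dropped from 3 hours to under 1 hour (convergence at around 500 steps). 2. Better generalization. My earlier LoRAs were actually somewhat overfit; for example, the clothes in the dataset got baked in.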
---------------------------------------------------------
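Root cause of the problem: the captions produced by the tagger were overly long natural-language sentences, so every step carried too many uninformative noise tokens (the trigger word was repeated far too often), and the facial features could only be ground in slowly by piling on thousands of steps.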
----------------------------------------------------------
Previous caption (DAMEINV is the trigger word): DAMEINV, DAMEINV is depicted in a medium close-up shot.The camera angle is approximately 90 degrees relative to DAMEINV’s forward facing position.DAMEINV’s expression appears thoughtful, with a subtle, slight upturn of the lips suggesting a gentle smile.The lighting is soft and even, highlighting the contours of DAMEINV’s face and creating a warm tone.The background is a plain, light-colored surface, devoid of any distinct details, which isolates DAMEINV as the primary focus.
Optimized caption: DAMEINV, medium close-up, 90 degrees side face, thoughtful expression, faint gentle smile, slightly upturned lips, soft even lighting, distinct facial contours, warm tone, plain light solid background, no extra details, face centered
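For anyone who wants to check their own captions the same way, here is a minimal sketch that compares the two captions above. It counts plain alphanumeric words as a rough stand-in for length; this is not the text encoder's real tokenizer, so treat the numbers as relative only. `caption_stats` is a made-up helper name.

```python
import re

TRIGGER = "DAMEINV"

old_caption = (
    "DAMEINV, DAMEINV is depicted in a medium close-up shot."
    "The camera angle is approximately 90 degrees relative to DAMEINV’s forward facing position."
    "DAMEINV’s expression appears thoughtful, with a subtle, slight upturn of the lips "
    "suggesting a gentle smile."
    "The lighting is soft and even, highlighting the contours of DAMEINV’s face and "
    "creating a warm tone."
    "The background is a plain, light-colored surface, devoid of any distinct details, "
    "which isolates DAMEINV as the primary focus."
)

new_caption = (
    "DAMEINV, medium close-up, 90 degrees side face, thoughtful expression, "
    "faint gentle smile, slightly upturned lips, soft even lighting, "
    "distinct facial contours, warm tone, plain light solid background, "
    "no extra details, face centered"
)

def caption_stats(caption: str, trigger: str = TRIGGER) -> dict:
    """Rough caption length (alphanumeric words) and trigger-word repetitions."""
    words = re.findall(r"[A-Za-z0-9]+", caption)
    return {"words": len(words), "trigger_count": caption.count(trigger)}

print("old:", caption_stats(old_caption))
print("new:", caption_stats(new_caption))
```

The old caption repeats the trigger word 6 times and the new one uses it once. Running the same helper over every caption file in a dataset folder is an easy way to spot captions where the trigger is repeated or the text is much longer than the rest.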
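With the optimized captions, the likeness was already very strong at 550 steps, and outfits swap out cleanly no matter what I prompt. If medium or long shots don't look like the subject, raising the weight a little or inpainting the face restores the likeness very well. Looking back, all my earlier LoRAs had some problems: they were brute-forced to convergence, and some are clearly overfit (the output looks too greasy so the weight has to be lowered, and clothes are hard to change so I have to keep rerolling images). Just tossing this out to get a discussion going; let's compare notes.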










