About the multi-stage training
#21
by xxxcliu - opened
Reading the paper, I'm a bit confused: is the pipeline first RetroMAE pre-training, then dense retrieval training on unsupervised data, and then self-distillation training over the three retrieval modes?
Yes. The order is RetroMAE pre-training -> dense retrieval training -> unified fine-tuning (with self-distillation).
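For intuition, here is a minimal sketch of what a self-distillation objective over the three retrieval modes could look like. The head names (`s_dense`, `s_sparse`, `s_colbert`), the plain sum used as the teacher, the equal loss weighting, and the temperature are all illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(s_dense, s_sparse, s_colbert, target, temperature=1.0):
    """Sketch of a unified fine-tuning objective over three retrieval heads.

    Each s_* is a [batch, num_candidates] score matrix from one retrieval
    mode; `target` holds the index of the positive passage for each query.
    """
    # Contrastive (InfoNCE-style) loss for each head individually.
    l_dense = F.cross_entropy(s_dense / temperature, target)
    l_sparse = F.cross_entropy(s_sparse / temperature, target)
    l_colbert = F.cross_entropy(s_colbert / temperature, target)

    # Ensemble "teacher": here simply the sum of the three heads' scores,
    # detached so gradients only flow through the student heads.
    p_teacher = F.softmax((s_dense + s_sparse + s_colbert).detach() / temperature, dim=-1)

    # Self-distillation: pull each head's distribution toward the ensemble.
    def kl(s):
        return F.kl_div(F.log_softmax(s / temperature, dim=-1), p_teacher,
                        reduction="batchmean")

    l_distill = kl(s_dense) + kl(s_sparse) + kl(s_colbert)
    return l_dense + l_sparse + l_colbert + l_distill
```

The idea is that the ensemble of the three scoring modes is usually stronger than any single mode, so each head learns from the combined signal as well as from the hard labels.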
Thanks for your reply! Do both bge and bge-m3 adopt a bi-encoder architecture? If so, is there actually a single RoBERTa encoder, or two separate models?
bge and bge-m3 are both bi-encoder models, and the query and the passage share the same encoder.
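As a rough sketch of what the shared-encoder setup looks like in practice (using `transformers` directly rather than the FlagEmbedding wrapper; the CLS pooling, L2 normalization, and the example strings are my assumptions):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# One encoder instance serves both queries and passages.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
encoder = AutoModel.from_pretrained("BAAI/bge-m3")
encoder.eval()

def embed(texts):
    """CLS-pooled, L2-normalized dense embeddings from the shared encoder."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    cls = out.last_hidden_state[:, 0]        # [CLS] token pooling
    return F.normalize(cls, p=2, dim=-1)

# Query and passage go through the very same weights.
query_emb = embed(["what is dense retrieval?"])
passage_emb = embed(["Dense retrieval encodes queries and documents into vectors."])
score = (query_emb @ passage_emb.T).item()   # cosine similarity after normalization
print(score)
```

So there is no second "passage model" to load: the bi-encoder refers to the two *forward passes* (one for the query, one for the passage), both through a single set of shared weights.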