Is there any library similar to TRL, with support for methods like DPO, for this type of model?
Perhaps @nouamanetazi or @lvwerra knows?