Discussion

#2
by Nicolas-BZRD - opened

Hey @Omartificial-Intelligence-Space , thank you for using our model. Out of curiosity, do you have any comparison models? I can't tell whether the score you obtained is good.

Hey @Nicolas-BZRD , if you're referring to a comparison with other Arabic embedding models, performance on STS benchmarks typically ranges between 70 and 86, depending on the architecture used, such as AraBERT, MARBERT, ARBERT, ModernBERT, and EuroBERT. So far, this model has not outperformed those architectures.

Still, we believe that achieving nearly 80 on the STS17 benchmark and around 60 on STS22.v2 demonstrates the model's strong potential to stand on its own with its unique architecture and features.
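For readers unfamiliar with how these STS numbers are produced: they are typically the Spearman rank correlation, scaled by 100, between the cosine similarity of the model's sentence-pair embeddings and human similarity ratings. A minimal, dependency-free sketch with hypothetical toy vectors and gold scores (not real benchmark data):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def spearman(xs, ys):
    """Spearman correlation (no tie handling; enough for this toy example)."""
    def rank(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for pos, i in enumerate(order):
            r[i] = pos + 1.0
        return r
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    vy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (vx * vy)

# Hypothetical embeddings for three sentence pairs and their gold ratings.
pairs = [
    ([1.0, 0.0], [0.9, 0.1]),   # near-paraphrase
    ([0.2, 0.8], [0.1, 0.9]),   # similar meaning
    ([1.0, 1.0], [-1.0, 1.0]),  # unrelated
]
gold = [4.8, 4.5, 1.0]  # human similarity ratings on a 0-5 scale

preds = [cosine(a, b) for a, b in pairs]
score = spearman(preds, gold) * 100
print(round(score, 1))  # → 100.0: the toy model ranks all pairs correctly
```

A score of 80 on STS17 therefore means the model's similarity ranking correlates strongly, though not perfectly, with human judgments.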

That said, the comparison I provided is intended to showcase how Arabic-specific semantic fine-tuning applied to a multilingual embedding model can significantly enhance performance, enabling the model to better understand Arabic semantics.

Best,
Omer

Thanks for your answer. I think EuroBERT could benefit from longer training because, unlike other Arabic models, it was not trained exclusively on Arabic and has been exposed to multiple languages. Additionally, if you have the compute, you might want to try the 610M model. While its base performance is lower, I am confident it will surpass the 210M model after fine-tuning. It achieved the best results on our evaluation set for Arabic, and compared to the 210M we did not observe significant multilinguality issues with it (smaller models often struggle to cover a wide range of languages). If you need help, don't hesitate to ask. Thanks a lot for this work, it's very nice 🙌

I totally agree with you. In fact, the 610M model is currently in training as we speak. Excited for the results!

For information, we just added the hyperparameters we found during our fine-tuning to our model card:
https://huggingface.co/EuroBERT/EuroBERT-610m
