Validating fine-tuning of AnglE model

#27
by hugguc

Hey,

I could use some guidance on using AnglE. I'd appreciate a response.

I'd like to (i) download a pre-trained AnglE model (e.g., UAE-Large-V1) and then (ii) fine-tune that model on my subject area. I haven't been able to do that successfully, hence a few questions:

  1. Does this approach (using a pre-trained AnglE model and then fine-tuning it to my space) even make sense, or do I absolutely have to fine-tune AnglE from BERT or the like, in a single step?

  2. Can I run a fine-tuning experiment using only a handful (say 20 total) of positive and negative examples?

  3. In machine learning, test-on-train inference usually achieves very high accuracy: if one pulls example/label pairs from the training set and forms the test set out of those pairs, inference works close to perfectly on that test set. I'm expecting the same effect with AnglE: once I've trained the model on my own positive/negative examples, those same examples should be characterized as positive/negative with a very high degree of certainty. Is that a correct assumption? Somehow I'm not observing this to be the case, so I presume I'm doing something wrong.

Thanks again!

WhereIsAI org

Thanks for using AnglE. Here are the answers to your questions:

  1. It makes sense, and is recommended, to fine-tune WhereIsAI/UAE-Large-V1 with your domain data; it is suggested to set a smaller learning rate, e.g., 1e-6 or 1e-7 (a minimal sketch follows this list). To help you fine-tune your model, here is a tutorial on fine-tuning a medical-domain embedding model with AnglE and WhereIsAI/UAE-Large-V1: https://angle.readthedocs.io/en/latest/notes/tutorial.html

  2. Yes, you can try it, but 20 samples are not enough. I encourage you to collect more, at least a few hundred.

  3. It depends on your data and training hyperparameters. To verify it, you can compare the average similarity of positive/negative pairs between the fine-tuned and non-fine-tuned models. Taking positive pairs as an example: if the average similarity under the fine-tuned model is higher than under the non-fine-tuned one, the fine-tuning should be working well (see the second sketch below).
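
For question 1, a minimal fine-tuning sketch using the angle_emb Python API might look as follows. The text1/text2/label data format and the fit() arguments are taken from the angle_emb README; the medical pairs are placeholders, so adapt everything to your installed version and your own data.

from datasets import Dataset
from angle_emb import AnglE, AngleDataTokenizer

# Load the pre-trained model that will be fine-tuned.
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', max_length=128, pooling_strategy='cls').cuda()

# Toy labeled pairs (placeholders): label 1 = similar, 0 = dissimilar.
pairs = [
    {'text1': 'myocardial infarction', 'text2': 'heart attack', 'label': 1},
    {'text1': 'myocardial infarction', 'text2': 'ankle sprain', 'label': 0},
]
train_ds = Dataset.from_list(pairs).map(AngleDataTokenizer(angle.tokenizer, angle.max_length))

angle.fit(
    train_ds=train_ds,
    output_dir='ckpts/uae-medical-large-v1',
    batch_size=2,
    epochs=1,
    learning_rate=1e-6,  # the small learning rate recommended above
)

And for question 3, the suggested before/after check could be scripted roughly like this (encode() with to_numpy=True is from the angle_emb README; the positive pairs are again placeholders):

import numpy as np

def avg_similarity(model, text_pairs):
    # Mean cosine similarity over (text_a, text_b) pairs.
    a = model.encode([p[0] for p in text_pairs], to_numpy=True)
    b = model.encode([p[1] for p in text_pairs], to_numpy=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

base = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()
tuned = AnglE.from_pretrained('ckpts/uae-medical-large-v1', pooling_strategy='cls').cuda()
pos_pairs = [('myocardial infarction', 'heart attack')]  # placeholder data
print(avg_similarity(base, pos_pairs), avg_similarity(tuned, pos_pairs))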

Thanks for providing this guidance; this recipe seems to have worked for me!

I have what is probably a simple question. I've fine-tuned WhereIsAI/UAE-Large-V1 following your instructions and saved it with --save_dir ckpts/uae-medical-large-v1. I didn't push the model to the Hub, however (I removed arguments like --push_to_hub 1).

When I load the model, I see what seems to be a warning message, which I include below.

I wonder, am I using the correct syntax to load the model? Is this warning message expected?

Thank you!

from angle_emb import AnglE

angle_t = AnglE.from_pretrained('ckpts/uae-medical-large-v1', pooling_strategy='cls').cuda()
Some weights of BertModel were not initialized from the model checkpoint at ckpts/uae-medical-large-v1 and are newly initialized: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

WhereIsAI org

Hi @hugguc, you got this message because you set --load_mlm_model 1 during training. It is actually a normal message; you can go ahead and test the model.
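
For example, a quick sanity check on the loaded checkpoint might look like this (the texts are placeholders; any pair from your training data will do):

import numpy as np
from angle_emb import AnglE

angle_t = AnglE.from_pretrained('ckpts/uae-medical-large-v1', pooling_strategy='cls').cuda()
vecs = angle_t.encode(['myocardial infarction', 'heart attack'], to_numpy=True)
vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
print('cosine similarity:', float(vecs[0] @ vecs[1]))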

WhereIsAI org

I have already removed --load_mlm_model from the tutorial notes.

Hey @SeanLee97 ,

Thanks for explaining that issue and fixing the doc!

As I reported earlier, I'm noticing that my fine-tuning run, which uses only very few examples, terminates very quickly while not achieving good prediction performance even on the samples I used for training. I'm wondering, is there a way to have the trainer run through more iterations, in the hope that it will at least produce decent predictions on the training set? I tried changing the --epoch argument from your recommendation of 1 to 10, but this didn't affect the result much, and the run was still very quick. For reference, the command I'm running looks roughly like the sketch below.
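
(The paths below are placeholders and, apart from the flags already discussed in this thread, the exact angle-trainer flag names are assumptions; check python -m angle_emb.angle_trainer --help for your version.)

# Hypothetical sketch of the run described above, not the exact command.
python -m angle_emb.angle_trainer \
  --model_name_or_path WhereIsAI/UAE-Large-V1 \
  --train_name_or_path path/to/my_pairs.jsonl \
  --save_dir ckpts/uae-medical-large-v1 \
  --learning_rate 1e-6 \
  --epoch 10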

Thank you for your advice!
