Update instruction with infinity

#23
by michaelfeil - opened

Ready for review.

Command tested:

docker run --gpus all -v $PWD/data:/app/.cache -e HF_TOKEN=$HF_TOKEN -p "7997":"7997" michaelf34/infinity:0.0.68 v2 --model-id intfloat/multilingual-e5-large-instruct --revision "main" --dtype float16 --batch-size 32 --device cuda --engine torch --port 7997

INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO     2024-11-13 00:46:42,260 infinity_emb INFO:        infinity_server.py:89
         Creating 1engines:                                                     
         engines=['intfloat/multilingual-e5-large-instruct                      
         ']                                                                     
INFO     2024-11-13 00:46:42,264 infinity_emb INFO: Anonymized   telemetry.py:30
         telemetry can be disabled via environment variable                     
         `DO_NOT_TRACK=1`.                                                      
INFO     2024-11-13 00:46:42,272 infinity_emb INFO:           select_model.py:64
         model=`intfloat/multilingual-e5-large-instruct`                        
         selected, using engine=`torch` and device=`cuda`                       
INFO     2024-11-13 00:46:42,367                      SentenceTransformer.py:216
         sentence_transformers.SentenceTransformer                              
         INFO: Load pretrained SentenceTransformer:                             
         intfloat/multilingual-e5-large-instruct                                
INFO     2024-11-13 00:46:46,145 infinity_emb INFO: Adding    acceleration.py:56
         optimizations via Huggingface optimum.       
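
Once the server is up, a quick way to check it is to request an embedding over HTTP. The snippet below is a minimal sketch, not part of the tested command: it assumes the container exposes an OpenAI-style `/embeddings` route at `http://localhost:7997` with no URL prefix, and that the response carries the vectors under `data[i].embedding`. The helper names `build_embedding_request` and `embed` are hypothetical, introduced only for illustration.

```python
import json
import urllib.request

# Assumptions: port taken from the docker command above; the route and the
# response shape follow the OpenAI embeddings format and may differ per setup.
BASE_URL = "http://localhost:7997"
MODEL_ID = "intfloat/multilingual-e5-large-instruct"


def build_embedding_request(texts, model=MODEL_ID, base_url=BASE_URL):
    """Build (but do not send) a POST request for the assumed /embeddings route."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, timeout=30.0):
    """Send the request and return one embedding (list of floats) per input text."""
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return [item["embedding"] for item in body["data"]]


if __name__ == "__main__":
    # Requires the container from the command above to be running.
    vectors = embed(["how much protein should a female eat each day"])
    print(len(vectors), len(vectors[0]))
```

Building the request separately from sending it keeps the payload easy to inspect without a running server.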

@intfloat Can you review this?

Thanks for the contribution! Sorry for the late reply.

intfloat changed pull request status to merged