dhamaniasad committed
Commit 622983d · verified · 1 Parent(s): 5617a9f

Fix milvus example link

Files changed (1): README.md (+4 -4)
README.md CHANGED
@@ -26,7 +26,7 @@ A classic example: using both embedding retrieval and the BM25 algorithm.
 Now, you can try to use BGE-M3, which supports both embedding and sparse retrieval.
 This allows you to obtain token weights (similar to the BM25) without any additional cost when generate dense embeddings.
 To use hybrid retrieval, you can refer to [Vespa](https://github.com/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/mother-of-all-embedding-models-cloud.ipynb
-) and [Milvus](https://github.com/milvus-io/pymilvus/blob/master/examples/hello_hybrid_sparse_dense.py).
+) and [Milvus](https://github.com/milvus-io/pymilvus/blob/master/examples/hybrid_search/hello_hybrid_sparse_dense.py).
 
 - As cross-encoder models, re-ranker demonstrates higher accuracy than bi-encoder embedding model.
 Utilizing the re-ranking model (e.g., [bge-reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker), [bge-reranker-v2](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_reranker)) after retrieval can further filter the selected text.
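Editor's note on the hunk above, for readers who don't open the linked examples: BGE-M3 produces both retrieval signals in one pass via the FlagEmbedding package the README already cites. A minimal sketch, assuming the published `BGEM3FlagModel` API (output keys `dense_vecs` / `lexical_weights`; verify shapes against your installed version):

```python
# Minimal sketch: one BGE-M3 forward pass yields both retrieval signals.
# Assumes `pip install -U FlagEmbedding`; output keys follow the
# FlagEmbedding API ("dense_vecs", "lexical_weights").
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

docs = ["BGE-M3 supports dense, sparse, and multi-vector retrieval."]
out = model.encode(docs, return_dense=True, return_sparse=True)

dense_vecs = out["dense_vecs"]          # dense vectors for ANN search
token_weights = out["lexical_weights"]  # per-token {token_id: weight} dicts, BM25-like
```

A hybrid retriever scores candidates with both signals and fuses them (for instance, a weighted sum of the dense and sparse scores), which is roughly what the linked Vespa and Milvus examples implement end to end.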
@@ -40,8 +40,8 @@ Utilizing the re-ranking model (e.g., [bge-reranker](https://github.com/FlagOpen
 The previous test results were lower because we mistakenly removed the passages that have the same id as the query from the search results. After correcting this mistake, the overall performance of BGE-M3 on MIRACL is higher than the previous results, but the experimental conclusion remains unchanged. The other results are not affected by this mistake. To reproduce the previous lower results, you need to add the `--remove-query` parameter when using `pyserini.search.faiss` or `pyserini.search.lucene` to search the passages.
 
 </details>
-- 2024/3/20: **Thanks Milvus team!** Now you can use hybrid retrieval of bge-m3 in Milvus: [pymilvus/examples
-/hello_hybrid_sparse_dense.py](https://github.com/milvus-io/pymilvus/blob/master/examples/hello_hybrid_sparse_dense.py).
+- 2024/3/20: **Thanks Milvus team!** Now you can use hybrid retrieval of bge-m3 in Milvus: [pymilvus/examples/hybrid_search/
+/hello_hybrid_sparse_dense.py](https://github.com/milvus-io/pymilvus/blob/master/examples/hybrid_search/hello_hybrid_sparse_dense.py).
 - 2024/3/8: **Thanks for the [experimental results](https://towardsdatascience.com/openai-vs-open-source-multilingual-embedding-models-e5ccb7c90f05) from @[Yannael](https://huggingface.co/Yannael). In this benchmark, BGE-M3 achieves top performance in both English and other languages, surpassing models such as OpenAI.**
 - 2024/3/2: Release unified fine-tuning [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/unified_finetune) and [data](https://huggingface.co/datasets/Shitao/bge-m3-data)
 - 2024/2/6: We release the [MLDR](https://huggingface.co/datasets/Shitao/MLDR) (a long document retrieval dataset covering 13 languages) and [evaluation pipeline](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB/MLDR).
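On the reproduction tip in the hunk above: only `--remove-query` is quoted from the README. The sketch below is an assumption-heavy illustration of invoking `pyserini.search.lucene` with that flag; the topic and index names are placeholders, and the remaining flags are the usual pyserini search options, so verify all of them against your pyserini version.

```python
# Reproduction sketch for the MIRACL note above. Topic/index names are
# placeholders; --remove-query is the flag quoted in the README and restores
# the earlier, lower numbers by dropping passages whose id matches the query.
import subprocess

subprocess.run(
    [
        "python", "-m", "pyserini.search.lucene",
        "--bm25",
        "--topics", "miracl-v1.0-ar-dev",  # placeholder topic set
        "--index", "miracl-v1.0-ar",       # placeholder prebuilt index
        "--output", "run.bm25.txt",
        "--remove-query",
    ],
    check=True,
)
```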
@@ -85,7 +85,7 @@ For embedding retrieval, you can employ the BGE-M3 model using the same approach
 The only difference is that the BGE-M3 model no longer requires adding instructions to the queries.
 
 For hybrid retrieval, you can use [Vespa](https://github.com/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/mother-of-all-embedding-models-cloud.ipynb
-) and [Milvus](https://github.com/milvus-io/pymilvus/blob/master/examples/hello_hybrid_sparse_dense.py).
+) and [Milvus](https://github.com/milvus-io/pymilvus/blob/master/examples/hybrid_search/hello_hybrid_sparse_dense.py).
 
 
 **3. How to fine-tune bge-M3 model?**
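Finally, for the cross-encoder re-ranking step mentioned in the first hunk, a minimal sketch with FlagEmbedding's reranker (model name as published by the BGE authors; the query and candidate passages are placeholders):

```python
# Minimal re-ranking sketch: a cross-encoder scores each (query, passage)
# pair jointly, which is more accurate than bi-encoder similarity alone.
from FlagEmbedding import FlagReranker

reranker = FlagReranker("BAAI/bge-reranker-v2-m3", use_fp16=True)

query = "what is hybrid retrieval?"
candidates = [  # placeholder passages, e.g. the top-k from hybrid retrieval
    "Hybrid retrieval fuses dense-vector and sparse/BM25-style scores.",
    "BGE-M3 is a multilingual embedding model.",
]

scores = reranker.compute_score([[query, p] for p in candidates])
reranked = [p for _, p in sorted(zip(scores, candidates), reverse=True)]
```

Re-ranking the hybrid top-k this way is the "further filter the selected text" step the README describes.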
 