nadiinchi committed · verified
Commit 2eba72f · Parent(s): 421f913

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED

@@ -87,7 +87,7 @@ Interface of the `process` function:
  * Training data: MS Marco (document) + NQ training sets, with synthetic silver labelling of which sentences to keep, produced using [LLama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B).
  * Languages covered: English
  * Context length: 512 tokens (similar to the pretrained DeBERTa model)
- * Evaluation: we evaluate Provence on 7 datasets from various domains: Wikipedia, biomedical data, course syllabi, and news. We find that Provence is able to prune irrelevant sentences with little-to-no drop in performance, in all domains, and outperforms existing baselines on the Pareto front (top right corners of the plots).
+ * Evaluation: we evaluate Provence on 7 datasets from various domains: Wikipedia, biomedical data, course syllabi, and news. Evaluation is conducted on the model trained only on MS Marco data. We find that Provence is able to prune irrelevant sentences with little-to-no drop in performance, in all domains, and outperforms existing baselines on the Pareto front (top right corners of the plots).

  Check out more analysis in the [paper](https://arxiv.org/abs/2501.16214)!
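For context on the `process` interface referenced in the hunk header, here is a minimal usage sketch. It assumes the model exposes a `process(question, context)` method via `trust_remote_code=True`, as the README describes; the repo ID, the example inputs, and the shape of the return value are illustrative assumptions, not taken from this commit.

```python
# Hedged sketch of calling Provence's `process` interface.
# The repo ID below is an assumption; check the model card for the actual ID.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "naver/provence-reranker-debertav3-v1",  # assumed repo ID
    trust_remote_code=True,  # `process` is provided by the model's remote code
)

question = "What causes seasons on Earth?"
context = (
    "Seasons result from the tilt of Earth's rotational axis. "
    "The Eiffel Tower was completed in 1889. "
    "As Earth orbits the Sun, the tilt changes how sunlight hits each hemisphere."
)

# Expected behavior per the README: irrelevant sentences (here, the Eiffel
# Tower one) are pruned from the context. The exact return type (string vs.
# dict with a pruned-context field) depends on the remote code.
output = model.process(question, context)
print(output)
```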