Add pipeline tag and transformers library
#2
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,20 +1,20 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - sfairXC/FsfairX-LLaMA3-RM-v0.1
+language:
+- en
+license: apache-2.0
 tags:
 - reward model
 - fine-grained
+pipeline_tag: text-ranking
+library_name: transformers
 ---
 
 # MDCureRM
 
-
 [📄 Paper](https://arxiv.org/pdf/2410.23463) | [🤗 HF Collection](https://huggingface.co/collections/yale-nlp/mdcure-6724914875e87f41e5445395) | [⚙️ GitHub Repo](https://github.com/yale-nlp/MDCure)
 
-
 ## Introduction
 
 **MDCure** is an effective and scalable procedure for generating high-quality multi-document (MD) instruction tuning data to improve MD capabilities of LLMs. Using MDCure, we construct a suite of MD instruction datasets complementary to collections such as [FLAN](https://github.com/google-research/FLAN) and fine-tune a variety of already instruction-tuned LLMs from the FlanT5, Qwen2, and LLAMA3.1 model families, up to 70B parameters in size. We additionally introduce **MDCureRM**, an evaluator model specifically designed for the MD setting to filter and select high-quality MD instruction data in a cost-effective, RM-as-a-judge fashion. Extensive evaluations on a wide range of MD and long-context benchmarks spanning various tasks show MDCure consistently improves performance over pre-trained baselines and over corresponding base models by up to 75.5%.
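For reference, the two added metadata fields are what the Hub uses to list the model under the text-ranking task filter and to pick transformers as the default loading library. Below is a minimal sketch of checking that the fields are live once this change is merged, assuming the repo id is `yale-nlp/MDCureRM` (the repo id itself is not shown in this diff):

```python
# Minimal check of the Hub metadata added by this PR.
# Assumption: the model lives at yale-nlp/MDCureRM (repo id not visible in the diff).
from huggingface_hub import HfApi

info = HfApi().model_info("yale-nlp/MDCureRM")
print(info.pipeline_tag)   # expected "text-ranking" after this PR is merged
print(info.library_name)   # expected "transformers" after this PR is merged
```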
@@ -113,10 +113,16 @@ reward_weights = torch.tensor([1/9, 1/9, 1/9, 2/9, 2/9, 2/9], device="cuda")
 source_text_1 = ...
 source_text_2 = ...
 source_text_3 = ...
-context = f"{source_text_1}\n\n{source_text_2}\n\n{source_text_3}"
+context = f"{source_text_1}
+
+{source_text_2}
+
+{source_text_3}"
 instruction = "What happened in CHAMPAIGN regarding Lovie Smith and the 2019 defense improvements? Respond with 1-2 sentences."
 
-input_text = f"Instruction: {instruction}\n\n{context}"
+input_text = f"Instruction: {instruction}
+
+{context}"
 tokenized_input = tokenizer(
     input_text,
     return_tensors='pt',
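The hunk above touches only how the scoring prompt is assembled; the README's surrounding lines (not part of this diff) load the tokenizer and reward model and define `reward_weights`, as seen in the hunk header. A hedged sketch of how the remaining scoring step could look, assuming the model is available as `reward_model` and returns one logit per fine-grained criterion (six in total, matching the six-element weight vector):

```python
import torch

# Continuation sketch, not part of the diff. Assumes `reward_model`, `tokenizer`, and
# `reward_weights` are defined earlier in the README; the six-logit output shape is an
# assumption inferred from the six-element reward_weights vector.
with torch.no_grad():
    outputs = reward_model(**tokenized_input.to("cuda"))
    fine_grained_scores = outputs.logits.squeeze()   # one score per criterion, shape (6,)
    overall_score = (fine_grained_scores * reward_weights).sum()

print(f"Overall MDCureRM score: {overall_score.item():.4f}")
```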
@@ -141,7 +147,7 @@ Beyond MDCureRM, we open-source our best MDCure'd models at the following links:
 | **MDCure-Qwen2-1.5B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-Qwen2-1.5B-Instruct) | **Qwen2-1.5B-Instruct** fine-tuned with MDCure-72k |
 | **MDCure-Qwen2-7B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-Qwen2-7B-Instruct) | **Qwen2-7B-Instruct** fine-tuned with MDCure-72k |
 | **MDCure-LLAMA3.1-8B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-8B-Instruct) | **LLAMA3.1-8B-Instruct** fine-tuned with MDCure-72k |
-| **MDCure-LLAMA3.1-70B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-70B-Instruct) | **LLAMA3.1-70B-Instruct** fine-tuned with MDCure-
+| **MDCure-LLAMA3.1-70B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-70B-Instruct) | **LLAMA3.1-70B-Instruct** fine-tuned with MDCure-72k |
 
 ## Citation
 
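The checkpoints in the table above are standard instruction-tuned causal LMs (Qwen2 and LLaMA-3.1 bases), so they should load with the usual `transformers` classes. A minimal, hedged sketch with the 1.5B model; the chat-template usage and generation settings are illustrative, not taken from the model card:

```python
# Sketch: load and query one of the released MDCure'd models listed above.
# Assumption: AutoModelForCausalLM + chat template work as for the Qwen2-Instruct base.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yale-nlp/MDCure-Qwen2-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize the key points shared across the documents."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```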