datasets:
- GAIR/LIMO
base_model:
- Josephgflowers/Tinyllama-STEM-Cinder-Agent-v1
---
## Model Overview

**TinyLlama-R1-LIMO** is a small, efficient transformer-based model designed to improve mathematical reasoning with minimal but high-quality training data. It was fine-tuned on the **LIMO** dataset, which emphasizes the principle that "Less Is More" for reasoning tasks. The model is part of ongoing research to enhance instruction-following and reasoning capabilities using a dataset of only 817 curated samples.



Model Name: `Josephgflowers/Tinyllama-R1-LIMO-Agent`

This model was made possible by the generous support of www.cherryrepublic.com.

---
## Key Features

- **Data Efficiency**: Achieves competitive reasoning performance using the **LIMO** dataset with only 817 training samples.
- **Mathematical Reasoning Focus**: Tailored for tasks requiring logical and numerical problem-solving.
- **Instruction Adaptation**: The model shows improved chain-of-thought (CoT) reasoning but may require further refinement for handling complex, multi-step prompts.
- **Training Pipeline**: Built using the LLaMA-Factory framework with dataset-specific optimizations.

---
## Model Details

- **Model Type**: Transformer-based (TinyLlama architecture)
- **Parameter Count**: 1.1B
- **Training Framework**: Unsloth (8k context) / Hugging Face Transformers
- **Primary Use Cases**:
  - Mathematical and logical reasoning
  - STEM education and problem-solving
  - Instruction-following conversations

---
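Below is a minimal inference sketch using the Hugging Face Transformers API. The prompt, dtype, and generation settings are illustrative assumptions rather than documented recommendations; the model is addressed by the name given above.

```python
# Minimal inference sketch (assumptions: plain-text prompting, fp16 weights,
# sampling settings chosen for illustration only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Josephgflowers/Tinyllama-R1-LIMO-Agent"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 1.1B parameters fit easily on a single consumer GPU
    device_map="auto",
)

# A simple math prompt that invites step-by-step (chain-of-thought) reasoning.
prompt = "Solve step by step: if 3x + 7 = 22, what is x?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # leave room for the reasoning trace before the final answer
    temperature=0.7,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```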
## Training Data

This model was fine-tuned using the **LIMO** dataset, which emphasizes the power of high-quality data over quantity.
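For quick inspection, the dataset can be pulled from the Hugging Face Hub with the `datasets` library; the split name used below is an assumption to verify against the dataset card.

```python
# Sketch: load and inspect the LIMO dataset (GAIR/LIMO).
# Assumes a "train" split; confirm split and column names on the dataset card.
from datasets import load_dataset

limo = load_dataset("GAIR/LIMO", split="train")

print(len(limo))          # expected to be 817 curated samples
print(limo.column_names)  # field names for the question/solution pairs
print(limo[0])            # one example, useful for checking the prompt format
```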
### Dataset Highlights

- **Name**: LIMO (Less Is More for Reasoning)
- **Size**: 817 samples
## Acknowledgments

Thanks to the creators of the LIMO dataset and contributors to the LLaMA-Factory training framework. Special thanks to Joseph Flowers for model fine-tuning and experimentation.

## Citation

If you use this model or dataset, please cite the following paper:
```bibtex
@misc{ye2025limoreasoning,
      title={LIMO: Less is More for Reasoning},
      author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
      year={2025},
      eprint={2502.03387},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.03387},
}
```