DeepSeek-R1-Distill-Qwen-7B-News-Classifier

Model Description

DeepSeek-R1-Distill-Qwen-7B-News-Classifier is a fine-tuned version of DeepSeek-R1-Distill-Qwen-7B, optimized for news classification. The base model was distilled from DeepSeek-R1 using Qwen2.5-Math-7B as its foundation.

Training Details

Training Data

The model was fine-tuned on a custom dataset of 300 news classification examples in ShareGPT format. Each example contains:

  • A news headline with a classification request prefix (e.g., "新闻分类:", "news classification:", or similar)
  • The expected category output with reasoning chain
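
An entry in ShareGPT format pairs a human turn with the model's response. Below is a minimal sketch of what one training example might look like; the field names follow the common ShareGPT convention, and the headline, reasoning, and category text are illustrative, not taken from the actual dataset:

```python
import json

# Hypothetical ShareGPT-format training example. The headline, reasoning
# chain, and category below are illustrative placeholders, not real data.
example = {
    "conversations": [
        {
            "from": "human",
            # "News classification: <headline>"
            "value": "新闻分类:某球队上周六以2:1赢得决赛",
        },
        {
            "from": "gpt",
            # Reasoning chain followed by the category ("Sports")
            "value": "这条新闻报道了一场球赛的结果,属于体育新闻。分类:体育",
        },
    ]
}

print(json.dumps(example, ensure_ascii=False, indent=2))
```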

Training Procedure

  • Framework: LLaMA Factory
  • Fine-tuning Method: LoRA with LoRA+ (a higher learning rate for the adapter B matrices)
  • LoRA Parameters:
    • LoRA+ learning rate ratio: 16
    • Target modules: all linear layers
    • Base learning rate: 5e-6
    • Gradient accumulation steps: 2
    • Training epochs: 3
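
In LLaMA Factory terms, the settings above correspond roughly to the following argument values. The argument names mirror LLaMA Factory's conventions but are an assumption; only the values come from this model card. Note that under LoRA+, the adapter B matrices train at the base learning rate scaled by the ratio:

```python
# Sketch of the fine-tuning hyperparameters described above.
# Argument names follow LLaMA Factory conventions (an assumption);
# only the values are taken from this model card.
config = {
    "finetuning_type": "lora",
    "lora_target": "all",            # all linear layers
    "loraplus_lr_ratio": 16,         # LoRA+ learning rate ratio
    "learning_rate": 5e-6,           # base learning rate
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 3,
}

# LoRA+ trains the adapter B matrices at base_lr * ratio:
lr_b = config["learning_rate"] * config["loraplus_lr_ratio"]
print(f"LoRA+ B-matrix learning rate: {lr_b:.0e}")  # 8e-05
```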

Evaluation Results

The model was evaluated on a test set and achieved the following metrics:

  • BLEU-4: 29.67
  • ROUGE-1: 56.56
  • ROUGE-2: 31.31
  • ROUGE-L: 39.86

These scores indicate strong performance for the news classification task, with good alignment between model outputs and reference classifications.
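
For intuition, ROUGE-1 is an F1 score over unigram overlap between a model output and the reference. A minimal self-contained sketch is below; the scores above were presumably computed with a standard evaluation library, not this toy code, and the example sentences are illustrative:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Toy ROUGE-1: F1 over unigram (whitespace-token) overlap."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped count of matching tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Illustrative strings, not real model outputs:
score = rouge1_f1("news about football", "sports news about football")
print(f"{score:.3f}")  # precision 1.0, recall 0.75 -> F1 ~ 0.857
```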

Citation

If you use this model in your research, please cite:

@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
      title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning}, 
      author={DeepSeek-AI},
      year={2025},
      eprint={2501.12948},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.12948}, 
}

Acknowledgements

This model was fine-tuned using the LLaMA Factory framework. We thank the DeepSeek-AI team for the original distilled model.

Model size: 7.62B params (BF16, Safetensors)