nielsr (HF staff) committed
Commit 4e8d1f3 · verified · 1 Parent(s): e300d83

Improve Metadata and add Paper/Github links

This PR improves the metadata by adding the `datasets` tag, correcting the `pipeline_tag` to `image-text-to-text`, and including the `library_name`. It also adds links to the paper and the GitHub repository.

Files changed (1)
  1. README.md +12 -5
README.md CHANGED
@@ -1,14 +1,19 @@
  ---
- license: apache-2.0
- language:
- - en
  base_model:
  - OpenGVLab/InternVL2_5-8B
- pipeline_tag: visual-question-answering
+ language:
+ - en
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
+ datasets:
+ - ayeshaishaq/DriveLMMo1
  ---

  **DriveLMM-o1: A Large Multimodal Model for Autonomous Driving Reasoning**

+ [Paper](https://arxiv.org/abs/2503.10621)
+
  DriveLMM-o1 is a fine-tuned large multimodal model designed for autonomous driving. Built on InternVL2.5-8B with LoRA-based adaptation, it leverages stitched multiview images to produce step-by-step reasoning. This structured approach enhances both final decision accuracy and interpretability in complex driving tasks like perception, prediction, and planning.

  **Key Features:**
@@ -57,6 +62,8 @@ tokenizer = AutoTokenizer.from_pretrained(

  For detailed usage instructions and additional configurations, please refer to the [OpenGVLab/InternVL2_5-8B](https://huggingface.co/OpenGVLab/InternVL2_5-8B) repository.

+ Code: [https://github.com/Vision-CAIR/DriveLMM](https://github.com/Vision-CAIR/DriveLMM)
+
  **Limitations:**
- While DriveLMM-o1 demonstrates strong performance in autonomous driving tasks, it is fine-tuned for domain-specific reasoning. Users may need to further fine-tune or adapt the model for different driving environments.
+ While DriveLMM-o1 demonstrates strong performance in autonomous driving tasks, it is fine-tuned for domain-specific reasoning. Users may need to further fine-tune or adapt the model for different driving environments.
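
Since the updated metadata declares `library_name: transformers` and the model is built on InternVL2.5-8B, loading the checkpoint should follow the base model's remote-code pattern. The snippet below is a minimal sketch under that assumption, not the authoritative usage: the repository ID is a placeholder, and the preprocessing and `model.chat(...)` helper come from the InternVL2.5 remote code documented on the [OpenGVLab/InternVL2_5-8B](https://huggingface.co/OpenGVLab/InternVL2_5-8B) card.

```python
# Minimal loading sketch, assuming DriveLMM-o1 keeps the InternVL2.5 remote-code
# interface of its base model. The repo ID below is a placeholder, not confirmed
# by this PR.
import torch
from transformers import AutoModel, AutoTokenizer

path = "<org>/DriveLMM-o1"  # placeholder: replace with the actual model repo ID

model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,  # InternVL2.5 ships custom modeling code
).eval().cuda()

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# Inference on a stitched multiview driving image would then go through the base
# model's `model.chat(tokenizer, pixel_values, question, generation_config)`
# helper; see the OpenGVLab/InternVL2_5-8B card for the image preprocessing code.
```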