---
base_model:
- OpenPipe/mistral-ft-optimized-1218
- mlabonne/NeuralHermes-2.5-Mistral-7B
library_name: transformers
tags:
- mergekit
- merge
- mistral
- optimized
---

# Optimized Mistral-Hermes Merge (3B Parameters)

This is an optimized merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). It reduces the two original 7B models to approximately 3B parameters while retaining their core capabilities.

## Model Size Optimization

The parameter reduction comes from keeping only the first 12 of the original 32 transformer layers: at roughly 218M parameters per Mistral layer, plus the embedding and output matrices, 12 layers land at about 2.9B parameters. Concretely:

- Layer reduction from 32 to 12 layers, selected via the `layer_range` option in the configuration below
- Weights stored in bfloat16 (half precision), which halves the memory footprint per parameter but does not change the parameter count
- The retained layers merged with SLERP rather than copied from a single parent model

## About Me

I'm David Soeiro-Vuong, a third-year Computer Science student working as an apprentice at TW3 Partners, a company specializing in Generative AI. Passionate about artificial intelligence and language model optimization, I focus on creating efficient model merges that balance performance and resource usage.

🔗 [Connect with me on LinkedIn](https://www.linkedin.com/in/david-soeiro-vuong-a28b582ba/)

## Merge Details

### Merge Method & Optimization

This model was merged using the [SLERP](https://en.wikipedia.org/wiki/Slerp) merge method with the following choices:

- Only layers 0–11 are merged, for a smaller memory footprint
- Weights are stored in bfloat16
- Attention and MLP tensors use separate interpolation schedules (the `self_attn` and `mlp` filters in the configuration)

### Models Merged

The following models were included in the merge:

* [OpenPipe/mistral-ft-optimized-1218](https://huggingface.co/OpenPipe/mistral-ft-optimized-1218)
* [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: OpenPipe/mistral-ft-optimized-1218
dtype: bfloat16
merge_method: slerp
parameters:
  t:
    - filter: self_attn
      value: [0.0, 0.5]
    - filter: mlp
      value: [1.0, 0.5]
    - value: 0.5
slices:
  - sources:
      - layer_range: [0, 12]
        model: OpenPipe/mistral-ft-optimized-1218
      - layer_range: [0, 12]
        model: mlabonne/NeuralHermes-2.5-Mistral-7B
```
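To reproduce the merge, a configuration like the one above can be saved to a file and passed to mergekit's `mergekit-yaml` entry point, e.g. `mergekit-yaml config.yaml ./output-model` (the output path here is arbitrary).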
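For readers unfamiliar with the merge method: SLERP interpolates each pair of corresponding weight tensors along the arc between them rather than along a straight line, which preserves tensor norms better than plain averaging. Below is a minimal, self-contained PyTorch sketch of the formula; it is an illustration, not mergekit's actual implementation, and the helper name `slerp` and the fallback threshold are my own choices.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between tensors a and b at fraction t."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Angle between the two weight vectors.
    cos_omega = torch.dot(a_flat, b_flat) / (a_flat.norm() * b_flat.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly (anti)parallel vectors: fall back to linear interpolation.
        return (1.0 - t) * a + t * b
    # slerp(a, b; t) = sin((1-t)*omega)/sin(omega) * a + sin(t*omega)/sin(omega) * b
    out = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
        + (torch.sin(t * omega) / sin_omega) * b_flat
    return out.reshape(a.shape).to(a.dtype)

# t=0.0 returns the first tensor, t=1.0 the second, t=0.5 an equal blend
# (the default `value: 0.5` used for tensors not matched by a filter).
w_a = torch.randn(8, 8)
w_b = torch.randn(8, 8)
w_merged = slerp(0.5, w_a, w_b)
```

As I read the configuration, the two-point gradients spread `t` across the merged layers: `value: [0.0, 0.5]` for `self_attn` ramps from 0.0 at the first layer to 0.5 at the last, keeping early attention layers close to the base model, while the `mlp` schedule `[1.0, 0.5]` starts from the other model.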
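Assuming the merged weights are published to the Hugging Face Hub (the repository id below is a placeholder), loading the model in the same bfloat16 precision looks like this with `transformers`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/optimized-mistral-hermes-3b"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the dtype the merge was produced in
    device_map="auto",           # requires the `accelerate` package
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```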