---
base_model:
- OpenPipe/mistral-ft-optimized-1218
- mlabonne/NeuralHermes-2.5-Mistral-7B
library_name: transformers
tags:
- mergekit
- merge
- mistral
- optimized
---

# Optimized Mistral-Hermes Merge (3B Parameters)

This is an optimized merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). It reduces the two original 7B models to approximately 3B parameters while retaining their core capabilities.

## Model Size Optimization

The parameter reduction comes from keeping only the first 12 of the original 32 transformer layers: at roughly 218M parameters per Mistral layer, plus the embedding and output matrices, 12 layers land at about 2.9B parameters. Concretely:

- Layer reduction from 32 to 12 layers, selected via the `layer_range` option in the configuration below
- Weights stored in bfloat16 (half precision), which halves the memory footprint per parameter but does not change the parameter count
- The retained layers merged with SLERP rather than copied from a single parent model

## About Me

I'm David Soeiro-Vuong, a third-year Computer Science student working as an apprentice at TW3 Partners, a company specializing in Generative AI. Passionate about artificial intelligence and language model optimization, I focus on creating efficient model merges that balance performance and resource usage.

🔗 [Connect with me on LinkedIn](https://www.linkedin.com/in/david-soeiro-vuong-a28b582ba/)

## Merge Details

### Merge Method & Optimization

This model was merged using the [SLERP](https://en.wikipedia.org/wiki/Slerp) merge method with the following choices:

- Only layers 0–11 are merged, for a smaller memory footprint
- Weights are stored in bfloat16
- Attention and MLP tensors use separate interpolation schedules (the `self_attn` and `mlp` filters in the configuration)

### Models Merged

The following models were included in the merge:

* [OpenPipe/mistral-ft-optimized-1218](https://huggingface.co/OpenPipe/mistral-ft-optimized-1218)
* [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: OpenPipe/mistral-ft-optimized-1218
dtype: bfloat16
merge_method: slerp
parameters:
  t:
    - filter: self_attn
      value: [0.0, 0.5]
    - filter: mlp
      value: [1.0, 0.5]
    - value: 0.5
slices:
  - sources:
      - layer_range: [0, 12]
        model: OpenPipe/mistral-ft-optimized-1218
      - layer_range: [0, 12]
        model: mlabonne/NeuralHermes-2.5-Mistral-7B
```
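To reproduce the merge, a configuration like the one above can be saved to a file and passed to mergekit's `mergekit-yaml` entry point, e.g. `mergekit-yaml config.yaml ./output-model` (the output path here is arbitrary).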
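For readers unfamiliar with the merge method: SLERP interpolates each pair of corresponding weight tensors along the arc between them rather than along a straight line, which preserves tensor norms better than plain averaging. Below is a minimal, self-contained PyTorch sketch of the formula; it is an illustration, not mergekit's actual implementation, and the helper name `slerp` and the fallback threshold are my own choices.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between tensors a and b at fraction t."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Angle between the two weight vectors.
    cos_omega = torch.dot(a_flat, b_flat) / (a_flat.norm() * b_flat.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly (anti)parallel vectors: fall back to linear interpolation.
        return (1.0 - t) * a + t * b
    # slerp(a, b; t) = sin((1-t)*omega)/sin(omega) * a + sin(t*omega)/sin(omega) * b
    out = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
        + (torch.sin(t * omega) / sin_omega) * b_flat
    return out.reshape(a.shape).to(a.dtype)

# t=0.0 returns the first tensor, t=1.0 the second, t=0.5 an equal blend
# (the default `value: 0.5` used for tensors not matched by a filter).
w_a = torch.randn(8, 8)
w_b = torch.randn(8, 8)
w_merged = slerp(0.5, w_a, w_b)
```

As I read the configuration, the two-point gradients spread `t` across the merged layers: `value: [0.0, 0.5]` for `self_attn` ramps from 0.0 at the first layer to 0.5 at the last, keeping early attention layers close to the base model, while the `mlp` schedule `[1.0, 0.5]` starts from the other model.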
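Assuming the merged weights are published to the Hugging Face Hub (the repository id below is a placeholder), loading the model in the same bfloat16 precision looks like this with `transformers`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/optimized-mistral-hermes-3b"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the dtype the merge was produced in
    device_map="auto",           # requires the `accelerate` package
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```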