Marsh Harrier

The Marsh Harrier (MSH) is a language model developed by MedIT Solutions using an advanced checkpoint merging technique. It represents a novel fusion of the Speakleash Bielik 11B v2.3 Instruct and Speakleash Bielik 11B v2 models, employing our proprietary weight-merging methodology.

Key Features:

  • Built on a pioneering approach to neural network weight fusion
  • Supports merging models of identical parameter counts while maintaining architecture flexibility
  • Demonstrates superior performance compared to its base models
  • Optimized for Polish language understanding and generation
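The proprietary merging method itself is not disclosed, but the general idea of parameter-wise weight fusion between two models with identical parameter counts can be sketched with a simple linear interpolation (a common baseline technique; the function name and dict-of-lists representation below are illustrative assumptions, not MedIT's actual method):

```python
# Illustrative linear checkpoint merge. The actual MedIT Solutions technique is
# proprietary; this sketch only shows the generic idea of fusing the weights of
# two models that share an identical parameter layout.

def merge_checkpoints(ckpt_a, ckpt_b, alpha=0.5):
    """Return a parameter-wise weighted average of two checkpoints.

    ckpt_a, ckpt_b: dicts mapping parameter names to lists of floats.
    alpha: interpolation weight for ckpt_a (1 - alpha goes to ckpt_b).
    """
    if ckpt_a.keys() != ckpt_b.keys():
        raise ValueError("checkpoints must share an identical parameter set")
    merged = {}
    for name, w_a in ckpt_a.items():
        w_b = ckpt_b[name]
        merged[name] = [alpha * a + (1 - alpha) * b for a, b in zip(w_a, w_b)]
    return merged

# Toy example: two "checkpoints" with a single two-element weight vector each.
instruct = {"layer.weight": [1.0, 2.0]}
base     = {"layer.weight": [3.0, 4.0]}
print(merge_checkpoints(instruct, base, alpha=0.5))  # {'layer.weight': [2.0, 3.0]}
```

In practice such merges operate on full tensors (e.g. via a framework's state dict) rather than Python lists, and more elaborate schemes weight layers or parameters non-uniformly.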

Performance:

The model shows significant improvements over its predecessors across multiple metrics in the Open PL LLM Leaderboard evaluation framework (0-shot), which is part of the SpeakLeash.org open-science initiative.

Technical Details:

  • Base Models: Speakleash Bielik 11B v2.3 Instruct and Bielik 11B v2
  • Architecture: Compatible with original Bielik architecture
  • Parameter Count: 11 billion parameters
  • Special Feature: Utilizes MedIT Solutions' proprietary checkpoint merging technology

This model represents a step forward for Polish language modeling, demonstrating how merging techniques can enhance model performance while maintaining architectural efficiency.

Benchmark Results

Core Leaderboards:

  • MT-Bench-PL: slight decrease of 0.3 points (8.27 vs 8.56)
  • Open PL LLM Leaderboard: improved performance by 0.09 points (65.80 vs 65.71)

Sentiment Analysis (PolEmo2):

  • In-domain accuracy: Matches Bielik at 77.70%
  • Out-of-domain accuracy: Improved performance at 79.76% (vs 79.35%)

Text Classification Tasks:

  • 8tags classification: Significant improvement of ~3pp (76.14% vs 73.17%)
  • Belebele benchmark: Matching performance at 88.56%
  • CBD task: Substantial F1 score improvement by 10pp (23.91% vs 13.73%)

Language Understanding:

  • DYK ("Did you know..."): Improved F1 score (69.77% vs 69.14%)
  • Named Entity Recognition (KLEJ NER): Notable improvement of ~8pp (45.53% vs 37.61%)
  • PolQA reranking: Slight decrease (81.99% vs 83.21%)
  • PPC: Enhanced accuracy (78.00% vs 77.20%)
  • PSC: Minor F1 score decrease (90.46% vs 93.63%)

Overall Performance: MSH-v1 achieves a higher average score of 71.18% compared to Bielik v2.3's 69.33%, demonstrating the effectiveness of our checkpoint merging technique in improving model performance across diverse NLP tasks.
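The per-task differences above can be recomputed directly from the listed scores; a quick sketch (score pairs copied from the list above, MSH-v1 first, Bielik v2.3 second; this is not an official aggregate):

```python
# Percentage-point deltas for the per-task scores listed above
# (MSH-v1 vs Bielik v2.3, as reported in this model card).
scores = {
    "8tags":    (76.14, 73.17),
    "Belebele": (88.56, 88.56),
    "CBD":      (23.91, 13.73),
    "DYK":      (69.77, 69.14),
    "KLEJ NER": (45.53, 37.61),
    "PolQA":    (81.99, 83.21),
    "PPC":      (78.00, 77.20),
    "PSC":      (90.46, 93.63),
}

for task, (msh, bielik) in scores.items():
    print(f"{task}: {msh - bielik:+.2f} pp")
```

Note that the overall leaderboard averages (71.18% vs 69.33%) are computed over the full task set of the Open PL LLM Leaderboard, not only the tasks listed here.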

All evaluations were conducted using the Open PL LLM Leaderboard framework (0-shot) as part of the SpeakLeash.org open-science initiative.

Kudos to the SpeakLeash project and ACK Cyfronet AGH for their extraordinary work.

Model repository: meditsolutions/MSH-v1-Bielik-v2.3-Instruct-MedIT-merge-GGUF (GGUF format, 11.2B params, llama architecture, 8-bit quantization available).