Update README.md
README.md
CHANGED
@@ -1,13 +1,49 @@
|
|
1 |
-
# This is an embedding model trained on triplet turkish corpus data. It performs well compared to other preexisting turkish embedding models. Thats all
|
2 |
---
|
3 |
license: apache-2.0
|
4 |
datasets:
|
5 |
-
- emrecan/
|
6 |
language:
|
7 |
- tr
|
8 |
base_model:
|
9 |
- nomic-ai/nomic-embed-text-v2-moe
|
10 |
library_name: sentence-transformers
|
11 |
-
|
12 |
-
|
13 |
-
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
datasets:
|
4 |
+
- emrecan/all-nli-tr
|
5 |
language:
|
6 |
- tr
|
7 |
+
- en
|
8 |
+
metrics:
|
9 |
+
- spearmanr
|
10 |
+
- accuracy
|
11 |
+
- bertscore
|
12 |
base_model:
|
13 |
- nomic-ai/nomic-embed-text-v2-moe
|
14 |
+
pipeline_tag: zero-shot-classification
|
15 |
library_name: sentence-transformers
|
16 |
+
---
|
17 |
+
# Model Card: Turkish Triplet Embedding Model (Nomic MoE)
|
18 |
+
|
19 |
+
## Model Description
|
20 |
+
|
21 |
+
This is an embedding model trained on Turkish triplet data from the dataset [`emrecan/all-nli-tr`](https://huggingface.co/datasets/emrecan/all-nli-tr). The model is based on **Nomic Mixture of Experts (MoE)**, specifically `nomic-ai/nomic-embed-text-v2-moe`, and is reported to perform well compared to other Turkish embedding models.
|
22 |
+
|
23 |
+
### **Intended Use**
|
24 |
+
- Semantic similarity tasks
|
25 |
+
- Text clustering
|
26 |
+
- Information retrieval
|
27 |
+
- Sentence and document-level embedding generation
|
28 |
+
|
29 |
+
### **Training Details**
|
30 |
+
- **Architecture:** Nomic Mixture of Experts (MoE), based on `nomic-ai/nomic-embed-text-v2-moe`
|
31 |
+
- **Dataset:** `emrecan/all-nli-tr`
|
32 |
+
- **Training Objective:** Triplet loss for contrastive learning
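To make the training objective concrete, here is a minimal, dependency-free sketch of a triplet loss on toy vectors. The Euclidean distance, the margin of `1.0`, and the helper names `euclidean` and `triplet_loss` are assumptions for illustration; the model card does not state which distance, margin, or loss implementation was actually used.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (margin value is an assumption).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D vectors standing in for sentence embeddings (illustrative only).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # identical to the anchor
negative = [0.0, 0.0]   # distance 1.0 from the anchor

easy = triplet_loss(anchor, positive, negative)  # constraint already satisfied
hard = triplet_loss(anchor, negative, positive)  # roles swapped, constraint violated
print(easy, hard)
```

In a triplet dataset such as an NLI-derived one, the anchor would typically be a premise, the positive an entailed sentence, and the negative a contradicting one; that mapping is a common convention rather than something this card confirms.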
|
33 |
+
|
34 |
+
### **Evaluation & Performance**
|
35 |
+
Compared to other Turkish embedding models, this model is reported to capture semantic relationships in Turkish text well. Quantitative evaluations and benchmarks will be shared as they become available.
|
36 |
+
|
37 |
+
### **How to Use**
|
38 |
+
You can use this model with the `sentence-transformers` library; replace the placeholder model name with this repository's model ID:
|
39 |
+
|
40 |
+
```python
|
41 |
+
from sentence_transformers import SentenceTransformer
|
42 |
+
|
43 |
+
model = SentenceTransformer("your-huggingface-model-name")
|
44 |
+
embeddings = model.encode(["Merhaba dünya!", "Bugün hava çok güzel."])
|
45 |
+
print(embeddings)
|
46 |
+
```
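For the semantic similarity and retrieval uses listed earlier, embeddings are usually compared with cosine similarity. The sketch below shows that step in plain Python on toy vectors so it runs without downloading a model; the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and in practice the vectors would be rows of the array returned by `model.encode(...)`.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero, equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, candidate_vecs):
    """Return candidate indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(candidate_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for sentence embeddings (illustrative only).
query = [1.0, 0.0]
candidates = [
    [0.0, 1.0],   # orthogonal to the query
    [1.0, 0.1],   # nearly parallel to the query
    [-1.0, 0.0],  # opposite direction
]

ranking = rank_by_similarity(query, candidates)
print(ranking)
```

This sketch does not guard against zero-length vectors; a real pipeline would either normalize embeddings up front or handle that case explicitly.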
|
47 |
+
|
48 |
+
### **License & Citation**
|
49 |
+
This model is released under the Apache 2.0 license, as declared in the metadata above. Please refer to the repository for citation instructions.
|