teoyidu committed on
Commit
f976017
·
verified ·
1 Parent(s): 936a387

Update README.md

Files changed (1)
  1. README.md +41 -5
README.md CHANGED
@@ -1,13 +1,49 @@
- # This is an embedding model trained on triplet turkish corpus data. It performs well compared to other preexisting turkish embedding models. Thats all
  ---
  license: apache-2.0
  datasets:
- - emrecan/stsb-mt-turkish
  language:
  - tr
  base_model:
  - nomic-ai/nomic-embed-text-v2-moe
  library_name: sentence-transformers
- tags:
- - embedding
- ---
  ---
  license: apache-2.0
  datasets:
+ - emrecan/all-nli-tr
  language:
  - tr
+ - en
+ metrics:
+ - spearmanr
+ - accuracy
+ - bertscore
  base_model:
  - nomic-ai/nomic-embed-text-v2-moe
+ pipeline_tag: zero-shot-classification
  library_name: sentence-transformers
+ ---
+ # Model Card: Turkish Triplet Embedding Model (Nomic MoE)
+
+ ## Model Description
+
+ This is an embedding model trained on a Turkish triplet corpus, utilizing the dataset [`emrecan/all-nli-tr`](https://huggingface.co/datasets/emrecan/all-nli-tr). The model is based on **Nomic Mixture of Experts (MoE)** and achieves strong performance compared to other existing Turkish embedding models.
+
+ ### **Intended Use**
+ - Semantic similarity tasks
+ - Text clustering
+ - Information retrieval
+ - Sentence and document-level embedding generation
+
+ ### **Training Details**
+ - **Architecture:** Nomic Mixture of Experts (MoE)
+ - **Dataset:** `emrecan/all-nli-tr`
+ - **Training Objective:** Triplet loss for contrastive learning
+
+ ### **Evaluation & Performance**
+ Compared to other Turkish embedding models, this model demonstrates strong performance in capturing semantic relationships within the language. Further evaluations and benchmarks will be shared as they become available.
+
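+ ### **How to Use**
+ You can use this model with Hugging Face's `transformers` or `sentence-transformers` library. With `sentence-transformers`:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Placeholder: replace with this model's Hugging Face repository ID.
+ # Assumption: trust_remote_code=True is needed because the base model,
+ # nomic-ai/nomic-embed-text-v2-moe, loads custom modeling code.
+ model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)
+ # Encode two Turkish sentences; returns one embedding vector per sentence.
+ embeddings = model.encode(["Merhaba dünya!", "Bugün hava çok güzel."])
+ print(embeddings)
+ ```
+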
+ ### **License & Citation**
+ Please refer to the repository for licensing details and citation instructions.
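
The card names triplet loss as the training objective but does not state the distance function, the margin, or the implementation used. As a rough illustration only, the idea can be sketched in pure Python, assuming Euclidean distance and a margin of 1.0 (both are assumptions, as are the toy vectors and the helper names `euclidean` and `triplet_loss`). The `sentence-transformers` library ships a `losses.TripletLoss` class, though the card does not say whether it was used here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D "embeddings" (hypothetical values, not model output).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# Negative is already far enough away, so nothing is penalized.
well_separated = triplet_loss(anchor, positive, negative)

# Swapping the roles puts the "positive" farther away than the "negative",
# so the loss becomes positive.
violating = triplet_loss(anchor, negative, positive)

print(well_separated, violating)
```

During training, a loss of this shape pulls the anchor and positive sentence embeddings together and pushes the negative away until the margin is satisfied, which is why the resulting vectors suit the similarity and retrieval uses listed in the card.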