Commit 1e6bee6 by convaiinnovations (verified) · 1 parent: b60a961

Upload Hindi CausalLM model with RoPE

Files changed (4)
  1. README.md +2 -132
  2. config.json +1 -0
  3. model.safetensors +2 -2
  4. pytorch_model.bin +2 -2
README.md CHANGED
@@ -6,6 +6,7 @@ tags:
  - text-generation
  - causal-lm
  - lm
+ - rope
  license: mit
  datasets:
  - custom_hindi_corpus
@@ -14,145 +15,14 @@ datasets:
  # Hindi-CausalLM

  A Hindi language generation model with the following specifications:
- ## Usage

- You can use this model with the following code:
-
- ```python
- import torch
- from hindi_embeddings import SentencePieceTokenizerWrapper
- from convaicausallm_model import ConvaiCausalLM, ConvaiCausalLMConfig
- from safetensors.torch import load_file
- import os
- class HindiLLMGenerator:
-     def __init__(self, model_path, device=None):
-         # Set device
-         if device is None:
-             self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-         else:
-             self.device = torch.device(device)
-
-         print(f"Using device: {self.device}")
-
-         # Load tokenizer
-         tokenizer_path = os.path.join(model_path, "tokenizer.model")
-         self.tokenizer = SentencePieceTokenizerWrapper(tokenizer_path)
-
-         # Load model config
-         config_path = os.path.join(model_path, "config.json")
-         import json
-         with open(config_path, 'r') as f:
-             config_dict = json.load(f)
-
-         self.config = ConvaiCausalLMConfig(**config_dict)
-
-         # Load model - try safetensors first, fall back to PyTorch bin if needed
-         safetensors_path = os.path.join(model_path, "model.safetensors")
-         pytorch_path = os.path.join(model_path, "pytorch_model.bin")
-
-         self.model = ConvaiCausalLM(self.config)
-
-         # Check which format is available and load accordingly
-         if os.path.exists(safetensors_path):
-             print(f"Loading model from SafeTensors")
-             state_dict = load_file(safetensors_path, device="cpu")
-             self.model.load_state_dict(state_dict)
-         elif os.path.exists(pytorch_path):
-             print(f"Loading model from PyTorch bin")
-             self.model.load_state_dict(torch.load(pytorch_path, map_location="cpu"))
-
-         # Move model to device and set to evaluation mode
-         self.model.to(self.device)
-         self.model.eval()
-
-     def generate(self, prompt, max_length=100, temperature=0.8, top_k=50, top_p=0.9,
-                  repetition_penalty=1.1, do_sample=True):
-         # Tokenize the prompt
-         input_ids = self.tokenizer.sp_model.EncodeAsIds(prompt)
-         input_tensor = torch.tensor([input_ids], dtype=torch.long).to(self.device)
-
-         # Start with the input tensor
-         output_sequence = input_tensor.clone()
-
-         # Generate tokens one by one
-         for _ in range(max_length - len(input_ids)):
-             with torch.no_grad():
-                 # Get the model's output for the current sequence
-                 outputs = self.model(output_sequence)
-                 next_token_logits = outputs[0, -1, :]
-
-                 # Apply temperature
-                 if temperature > 0:
-                     next_token_logits = next_token_logits / temperature
-
-                 # Apply repetition penalty
-                 if repetition_penalty > 1.0:
-                     for token_id in output_sequence[0].tolist():
-                         next_token_logits[token_id] /= repetition_penalty
-
-                 # Filter with top-k sampling
-                 if top_k > 0:
-                     top_k_values, top_k_indices = torch.topk(next_token_logits, top_k)
-                     next_token_logits = torch.full_like(next_token_logits, float('-inf'))
-                     next_token_logits.scatter_(0, top_k_indices, top_k_values)
-
-                 # Filter with top-p/nucleus sampling
-                 if top_p < 1.0 and do_sample:
-                     sorted_logits, sorted_indices = torch.sort(next_token_logits, descending=True)
-                     cumulative_probs = torch.cumsum(torch.softmax(sorted_logits, dim=-1), dim=-1)
-
-                     # Remove tokens with cumulative probability above the threshold
-                     sorted_indices_to_remove = cumulative_probs > top_p
-                     # Shift the indices to the right to keep the first token above the threshold
-                     sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()
-                     sorted_indices_to_remove[..., 0] = 0
-
-                     indices_to_remove = sorted_indices[sorted_indices_to_remove]
-                     next_token_logits[indices_to_remove] = float('-inf')
-
-                 # Sample or choose the next token
-                 if do_sample:
-                     probs = torch.softmax(next_token_logits, dim=-1)
-                     next_token = torch.multinomial(probs, num_samples=1)
-                 else:
-                     next_token = torch.argmax(next_token_logits, dim=-1).unsqueeze(0)
-
-                 # Add the next token to the sequence
-                 output_sequence = torch.cat([output_sequence, next_token.unsqueeze(0)], dim=1)
-
-                 # Check if we've generated an end token
-                 if next_token.item() == self.tokenizer.eos_token_id:
-                     break
-
-         # Decode the generated sequence
-         generated_ids = output_sequence[0].tolist()
-         generated_text = self.tokenizer.sp_model.DecodeIds(generated_ids)
-
-         return generated_text
-
- # Example usage
- if __name__ == "__main__":
-     generator = HindiLLMGenerator("path/to/model")
-     result = generator.generate("भारत एक विशाल देश है")
-     print(result)
- ```
-
- ## Example Prompts
-
- Try the model with these example prompts:
-
- ```
- भारत एक विशाल देश है
- मुझे हिंदी में एक कहानी सुनाओ
- आज का मौसम बहुत अच्छा है
- हिंदी साहित्य की प्रमुख विशेषताएं
- ```
  ## Model Architecture
  - **Type**: Causal Language Model with Transformer architecture
  - **Hidden size**: 768
  - **Layers**: 12
  - **Attention heads**: 16
  - **Key-value heads**: 4 (using grouped-query attention)
+ - **Position encoding**: Rotary Position Embeddings (RoPE)
  - **Vocabulary size**: 16000
  - **Parameters**: ~74.1M
  - **Context window**: 512 tokens
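
The architecture list in this README implies a few derived attention dimensions. The following is a rough sketch rather than the repository's own code: it assumes the hidden size is split evenly across the attention heads and that query heads map to key-value heads in contiguous groups, which is the usual grouped-query attention layout. The function name `gqa_layout` and its return keys are hypothetical.

```python
def gqa_layout(hidden_size=768, num_heads=16, num_kv_heads=4):
    """Derive per-head dimensions for grouped-query attention (GQA)."""
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    head_dim = hidden_size // num_heads
    group_size = num_heads // num_kv_heads
    # Assumption: query head q reads keys/values from KV head q // group_size.
    kv_head_for_query = [q // group_size for q in range(num_heads)]
    return {
        "head_dim": head_dim,
        "queries_per_kv_head": group_size,
        "kv_proj_width": num_kv_heads * head_dim,
        "kv_head_for_query": kv_head_for_query,
    }


layout = gqa_layout()
print(layout["head_dim"], layout["queries_per_kv_head"], layout["kv_proj_width"])
# prints: 48 4 192
```

Under these assumptions each head is 48-dimensional and every key-value head is shared by 4 query heads, so the key and value projections are a quarter of the width of the query projection.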
config.json CHANGED
@@ -78,6 +78,7 @@
  "intermediate_size": 3072,
  "hidden_act": "silu",
  "max_position_embeddings": 512,
+ "rope_theta": 10000.0,
  "model_type": "convaicausallm",
  "auto_map": {
  "AutoConfig": "configuration_convaicausallm.ConvaiCausalLMConfig",
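
To illustrate what the new `rope_theta` value controls, here is a minimal pure-Python sketch of rotary position embeddings. It is not taken from the model's code. It assumes a head dimension of 48 (768 / 16) and the interleaved-pair convention, where elements `2i` and `2i+1` are rotated together; the actual implementation may pair the two halves of the vector instead. The helper names are hypothetical.

```python
import math


def rope_inv_freq(head_dim, theta=10000.0):
    """Inverse frequencies theta^(-2i/d) for i = 0 .. d/2 - 1."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even for RoPE")
    return [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def apply_rope(vec, position, theta=10000.0):
    """Rotate each pair (vec[2i], vec[2i+1]) by the angle position * inv_freq[i]."""
    out = []
    for i, freq in enumerate(rope_inv_freq(len(vec), theta)):
        angle = position * freq
        c, s = math.cos(angle), math.sin(angle)
        x0, x1 = vec[2 * i], vec[2 * i + 1]
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


head_dim = 768 // 16  # assumption: hidden size split evenly across 16 heads
q = [0.1 * (i % 5) for i in range(head_dim)]
k = [0.2 * (i % 3) for i in range(head_dim)]

# Both position pairs have the same offset (7) and lie inside the 512-token window.
score_a = dot(apply_rope(q, 10), apply_rope(k, 3))
score_b = dot(apply_rope(q, 507), apply_rope(k, 500))
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree because the rotations cancel down to the relative offset between query and key positions, which is the property that makes RoPE a relative position encoding.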
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0b013bef88c7f7cbf72bd25d7868854da142b00a899adc94175294b50a04d4dd
- size 408609208
+ oid sha256:2ba8afe67cfd8a9622ba63f0607352cab2fda4a584a712a941cce9e82946c4a4
+ size 409791136
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f57c2ac93848af94c4dcbb2f93a6406135c030a2e0c9b717588f4f5929b13551
- size 408661966
+ oid sha256:c1e3091e78fdc85ef2ed3a1e1bb0e07408327b187fb7bb733f63014a1f6d25a0
+ size 409849254
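
The weight files in this commit are stored as Git LFS pointers, which record the SHA-256 digest and byte size of the real file. As a hedged sketch, a downloaded weight file could be checked against its pointer roughly as follows; the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and only the `sha256` oid type is handled.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 oids are handled in this sketch")
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so a ~400 MB weight file is never fully held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2ba8afe67cfd8a9622ba63f0607352cab2fda4a584a712a941cce9e82946c4a4\n"
    "size 409791136\n"
)
print(pointer["size"])  # prints: 409791136
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then confirm that the download corresponds to this commit's version of the file.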