cicdatopea committed on
Commit 1e1897b · verified · 1 Parent(s): 9b8cc6e

Update README.md

Files changed (1): README.md +21 -0
README.md CHANGED
@@ -444,6 +444,27 @@ print(make_table(res))
 
  ### Generate the model
 
+ **1 add meta data to bf16 model** https://huggingface.co/opensourcerelease/DeepSeek-R1-bf16
+
+ ~~~python
+ import safetensors
+ from safetensors.torch import save_file
+
+ for i in range(1, 164):
+     idx_str = "0" * (5-len(str(i))) + str(i)
+     safetensors_path = f"model-{idx_str}-of-000163.safetensors"
+     print(safetensors_path)
+     tensors = dict()
+     with safetensors.safe_open(safetensors_path, framework="pt") as f:
+         for key in f.keys():
+             tensors[key] = f.get_tensor(key)
+     save_file(tensors, safetensors_path, metadata={'format': 'pt'})
+ ~~~
+
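Reviewer note: one way to check that the metadata step took effect is to read the shard header directly. The sketch below assumes the standard safetensors file layout (an 8-byte little-endian header length, followed by a JSON header that may carry a `__metadata__` entry). `read_safetensors_metadata` is a hypothetical helper name used for illustration, not part of the safetensors library.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the `__metadata__` dict from a safetensors header, or {} if absent."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next `header_len` bytes: the JSON header describing tensors and metadata.
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})
```

For a shard rewritten by the loop in the diff, calling this helper on `model-00001-of-000163.safetensors` would be expected to return a dict containing `'format': 'pt'`. Only the header is read, so the check does not load any tensor data.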
+
+
+ **2 remove torch.no_grad** in modeling_deepseek.py as we need some tuning in AutoRound.
+
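Reviewer note: the diff does not say in which form `torch.no_grad` appears in modeling_deepseek.py. The sketch below assumes it is used as a standalone `@torch.no_grad()` decorator line and removes only those lines. `strip_no_grad_decorators` and `patch_file` are hypothetical helper names. If the file uses `with torch.no_grad():` blocks instead, those are left untouched and would need a manual edit.

```python
import re

# A line holding only the decorator, with optional indentation and trailing spaces.
_NO_GRAD_DECORATOR = re.compile(
    r"^[ \t]*@torch\.no_grad\(\)[ \t]*(?:\r?\n|$)", re.MULTILINE
)


def strip_no_grad_decorators(source):
    """Return `source` with every standalone `@torch.no_grad()` decorator line removed."""
    return _NO_GRAD_DECORATOR.sub("", source)


def patch_file(path):
    """Rewrite the file at `path` in place without `@torch.no_grad()` decorator lines."""
    with open(path, "r", encoding="utf-8") as f:
        original = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(strip_no_grad_decorators(original))
```

With a local copy of the model repository, `patch_file("modeling_deepseek.py")` would apply the change. Keeping a backup of the original file is advisable, since the rewrite happens in place.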
  5*80g and 1.4T-1.6T memory is required
 
  ~~~python