Update README.md
README.md
CHANGED
@@ -17,6 +17,8 @@ reduced from 64 experts to 32 experts. The pruned model is mainly used for [code
 
 This is a test validation to see if we can prune the model according to professional requirements and still maintain acceptable performance. The model size has been reduced by about half, and no distortion has occurred.
 
+This allows the model to be pruned according to one's needs.
+
 The total parameter is equivalent to 8B.
 
 This model has the same architecture as [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) model, and we will try the pruned version of the [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) model.