Commit ef5f951 (verified) by satreysa · 1 Parent(s): 7a1e195

Update README.md

Files changed (1)
  1. README.md +24 -37
README.md CHANGED
@@ -3,49 +3,36 @@ language:
3
  - en
4
  pipeline_tag: text-generation
5
  base_model:
6
- - google/gemma-2-2b
7
  license: gemma
8
  ---
9
 
10
  # gemma-2-2b-awq-uint4-asym-g128-lmhead-g32-fp16-onnx
11
  - ## Introduction
12
- This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from Pile dataset.
 
13
  - ## Quantization Strategy
14
- - ***Quantized Layers***: All linear layers
15
- - ***Weight***: uint4 asymmetric per-group. group_size=32 for lm_head, and group_size=128 for the rest.
 
16
  - ## Quick Start
17
- 1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
18
- 2. Run the quantization script in the example folder using the following command line:
19
- ```sh
20
- export MODEL_DIR = [local model checkpoint folder] or google/gemma-2-2b
21
- # single GPU
22
- python quantize_quark.py --model_dir $MODEL_DIR \
23
- --output_dir output_dir $MODEL_NAME-awq-uint4-asym-g128-lmhead-g32-fp16 \
24
- --quant_scheme w_uint4_per_group_asym \
25
- --num_calib_data 128 \
26
- --quant_algo awq \
27
- --dataset pileval_for_awq_benchmark \
28
- --model_export hf_format \
29
- --group_size 128 \
30
- --group_size_per_layer lm_head 32 \
31
- --data_type float32 \
32
- --exclude_layers
33
- # cpu
34
- python quantize_quark.py --model_dir $MODEL_DIR \
35
- --output_dir output_dir $MODEL_NAME-awq-uint4-asym-g128-lmhead-g32-fp16 \
36
- --quant_scheme w_uint4_per_group_asym \
37
- --num_calib_data 128 \
38
- --quant_algo awq \
39
- --dataset pileval_for_awq_benchmark \
40
- --model_export hf_format \
41
- --group_size 128 \
42
- --group_size_per_layer lm_head 32 \
43
- --data_type float32 \
44
- --exclude_layers \
45
- --device cpu
46
- ```
47
- ## Deployment
48
- Quark has its own export format, quark_safetensors, which is compatible with autoAWQ exports.
49
 
50
  #### License
51
- Modifications copyright(c) 2025 Advanced Micro Devices,Inc. All rights reserved.
3
  - en
4
  pipeline_tag: text-generation
5
  base_model:
6
+ - amd/gemma-2-2b-awq-uint4-asym-g128-lmhead-g32-fp16-onnx
7
  license: gemma
8
  ---
9
 
10
  # gemma-2-2b-awq-uint4-asym-g128-lmhead-g32-fp16-onnx
11
  - ## Introduction
12
+ This model was prepared using the AMD Quark Quantization tool, followed by necessary post-processing.
13
+
14
  - ## Quantization Strategy
15
+ - AWQ / Group size lm_head 32 / Group size 128 / Asymmetric / UINT4 Weights / FP16 activations
16
+ - Excluded Layers: None
17
+
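Reviewer note: the new strategy line packs several terms together. As a minimal sketch of what "asymmetric UINT4 weights, group size 128 (32 for lm_head)" generally means, the snippet below gives each group of consecutive weights its own scale and zero point. It is a generic illustration in plain Python, not Quark's implementation: the function names are hypothetical, the weights are random toy values, and AWQ's activation-aware scaling step is omitted.

```python
import random

def quantize_group(values):
    """Asymmetric UINT4 quantization of one group: returns (codes, scale, zero_point)."""
    # Extend the range to include 0 so the zero point always fits in 0..15.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = max(hi - lo, 1e-8) / 15.0          # 4 bits -> 16 levels -> 15 steps
    zero_point = round(-lo / scale)            # integer code that represents 0.0
    codes = [min(15, max(0, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point

def quantize_per_group(weights, group_size):
    """Quantize a flat list of weights in consecutive groups of `group_size`."""
    assert len(weights) % group_size == 0, "length must be a multiple of group_size"
    return [quantize_group(weights[i:i + group_size])
            for i in range(0, len(weights), group_size)]

def dequantize(groups):
    """Map the 4-bit codes back to approximate floating-point weights."""
    return [(c - zp) * scale for codes, scale, zp in groups for c in codes]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(256)]   # toy data, not model weights
body = quantize_per_group(weights, group_size=128)       # group size used for most layers
head = quantize_per_group(weights, group_size=32)        # group size used for lm_head
restored = dequantize(body)
```

Smaller groups mean more scale/zero-point pairs per weight, which typically lowers rounding error at the cost of extra metadata; that is one plausible reason for the finer group size on `lm_head`.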
18
  - ## Quick Start
19
+ For a quick start, refer to the [Ryzen AI documentation](https://ryzenai.docs.amd.com/en/latest/hybrid_oga.html).
20
+
21
+ #### Evaluation scores
22
+ The perplexity measurement is run on the wikitext-2-raw-v1 (raw data) dataset provided by Hugging Face. Perplexity score measured for prompt length 2k is 66.625.
23
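Reviewer note: the added evaluation line reports a perplexity figure without saying how it is computed. Perplexity is conventionally the exponential of the mean negative log-likelihood per token; the sketch below shows that formula together with a simple windowing helper. It assumes that "prompt length 2k" means 2048-token windows, which the README does not state, and both function names are hypothetical. It does not reproduce the reported score.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) over the evaluated tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def chunk(tokens, window=2048):
    """Split a token sequence into non-overlapping windows (assumed reading of '2k')."""
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]

# Toy check: a model that assigns probability 1/4 to every token has perplexity 4.
uniform_log_probs = [math.log(0.25)] * 8
ppl = perplexity(uniform_log_probs)

# Windowing a 5000-token sequence gives two full windows and one shorter remainder.
windows = chunk(list(range(5000)))
```

In a real evaluation the log-probabilities would come from the model's output for each window of the dataset, and the per-window results would be aggregated over all tokens.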
 
24
  #### License
25
+ Modifications copyright(c) 2024 Advanced Micro Devices,Inc. All rights reserved.
26
+
27
+ MIT License
28
+
29
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal
30
+ in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
31
+ copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
32
+
33
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
34
+
35
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
36
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
37
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
38
+ SOFTWARE.