Question: Quantization through GPTQ
Hi Team, I’m trying to quantize a 13B model on an A100 using the configuration below. I tried the following options:
from transformers import GPTQConfig

quantization_config = GPTQConfig(
    bits=4,
    group_size=128,
    dataset="wikitext2",
    batch_size=16,
    desc_act=False
)
1. Enforce batch_size = 16 (and also batch_size = 2) in the quantization config
2. Set tokenizer.pad_token_id = tokenizer.eos_token_id (which is 2)
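For reference, here is a minimal sketch of the full flow these settings sit in (the model name is a placeholder for the actual 13B checkpoint; the tokenizer is passed to GPTQConfig so the "wikitext2" calibration set can be tokenized):

from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "meta-llama/Llama-2-13b-hf"  # placeholder: the actual 13B checkpoint being quantized

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token_id = tokenizer.eos_token_id  # eos_token_id is 2 here

quantization_config = GPTQConfig(
    bits=4,
    group_size=128,
    dataset="wikitext2",
    batch_size=16,        # the value I'm trying to enforce (also tried 2)
    desc_act=False,
    tokenizer=tokenizer,  # used to tokenize the "wikitext2" calibration data
)

# Passing quantization_config here triggers GPTQ calibration and quantization on load
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)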
I observed that even when we explicitly set the batch size and set pad_token_id to a value other than None, neither setting is actually picked up.
Is it possible to set batch_size and pad_token_id to other values, or is this expected behavior with GPTQ? What is the reason behind it? Please suggest whether there is any way to override the batch size configuration.
https://github.com/huggingface/optimum/blob/main/optimum/gptq/data.py#L51
Could you kindly advise? I appreciate your support.
Thanks