ValueError: Tokenizer class AksharaTokenizer does not exist or is not currently imported.
Dear @journalesque,
Thank you for bringing this to our attention. We understand that you're encountering the following error while using Akshara-2B-Hindi:
ValueError: Tokenizer class AksharaTokenizer does not exist or is not currently imported.
This issue likely stems from the tokenizer not being correctly registered in transformers. We are currently investigating this on our end. In the meantime, please try the following:
- Ensure you have the latest version of transformers
pip install --upgrade transformers
- Load the tokenizer explicitly with trust_remote_code=True
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
"SVECTOR-CORPORATION/Akshara-2B-Hindi",
trust_remote_code=True
)
If the issue persists, could you share your Python and transformers versions? This will help us diagnose the problem faster. We appreciate your patience and will update you as soon as we have a resolution.
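To make the version details easy to share, here is a minimal sketch that prints them in one go. It uses only the standard library. The helper name collect_versions and the extra packages it checks (tokenizers, torch) are assumptions for illustration, not something we require.

```python
import platform
from importlib import metadata


def collect_versions(packages=("transformers", "tokenizers", "torch")):
    """Return a dict of interpreter and package versions for a bug report.

    The default package list is an assumption; pass whatever is relevant.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Pasting the printed lines into your reply should be enough for us to reproduce your environment.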
Best regards,
SVECTOR Support Team
Dear @jayahariv,
Thank you for bringing this to our attention. We understand that you're encountering the following error while using Akshara-2B-Hindi:
ValueError: Tokenizer class AksharaTokenizer does not exist or is not currently imported.
This issue likely stems from the tokenizer not being correctly registered in transformers. We are currently investigating this on our end. In the meantime, please try the following:
- Ensure you have the latest version of transformers
Run the following command to update:
pip install --upgrade transformers
- Load the tokenizer explicitly with trust_remote_code=True
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
"SVECTOR-CORPORATION/Akshara-2B-Hindi",
trust_remote_code=True
)
If the issue persists, could you share your Python and transformers versions? This will help us diagnose the problem faster. We appreciate your patience and will update you as soon as we have a resolution.
Best regards,
SVECTOR Support Team
Thanks for the quick reply, I have tried both. Still the same issue.
Fixed the AksharaTokenizer Registration Issue
Issue
@jayahariv, thank you for reporting this issue with our model. We've identified the root cause of the error you're encountering: the AksharaTokenizer class was not properly registered with the transformers library.
Solution
To resolve this issue, please follow these steps:
1. Download the Tokenizer File
Download the akshara_tokenizer.py file we've provided and save it in your project directory (the same directory where your script is located).
2. Import the Tokenizer Module
At the beginning of your script, add the following import:
import akshara_tokenizer
3. Use AutoTokenizer as Normal
After importing the tokenizer module, load your model as usual:
from transformers import AutoTokenizer
# Ensure the path is correct
tokenizer = AutoTokenizer.from_pretrained("path/to/model")
This ensures that the AksharaTokenizer class is registered before being loaded with AutoTokenizer.
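If the same error still appears after these steps, a common cause is that Python cannot find akshara_tokenizer.py on its import path. The following is a minimal sketch for checking that before loading anything. The helper name tokenizer_module_available is hypothetical, and the check assumes the file sits next to the script you run, as described in step 1.

```python
import importlib.util


def tokenizer_module_available(module_name="akshara_tokenizer"):
    """Return True if module_name can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if tokenizer_module_available():
        # Imported only for its side effect of registering the tokenizer class.
        import akshara_tokenizer  # noqa: F401
        print("akshara_tokenizer found and imported")
    else:
        print("akshara_tokenizer.py not found; place it next to this script")
```

The check only confirms the module can be located; it does not verify that registration itself succeeded.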
Dependencies
Please make sure you have the following dependencies installed:
pip install "transformers>=4.49.0" regex
Additional Notes
- We have updated our model repository to include akshara_tokenizer.py for all users.
- Future releases will have this component pre-registered, eliminating the need for manual registration.
If you continue to experience issues, please don't hesitate to contact our support team.
Best regards,
SVECTOR Support Team