
Unofficial Mistral Community
community
AI & ML interests
Unofficial org for community uploads of Mistral's open-source models.
Recent Activity
mistral-community's activity
Cannot apply chat template from tokenizer
1
#31 opened 3 days ago
by DarkLight1337
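For context, the usual call looks like the sketch below, assuming a recent transformers version; the repo id is a placeholder, not necessarily the model this thread is about.
```python
# Minimal sketch of applying a chat template with transformers.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")  # illustrative repo id

messages = [{"role": "user", "content": "Hello, who are you?"}]

# This raises if the tokenizer config ships no chat_template, a common
# cause of "cannot apply chat template" errors.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```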
Post
7094
I was chatting with @peakji, one of the co-founders of Manus AI, who told me he was on Hugging Face (very cool!).
He shared an interesting insight: agentic capabilities might be more of an alignment problem than a foundational capability issue. Similar to the difference between GPT-3 and InstructGPT, some open-source foundation models are simply trained to 'answer everything in one response regardless of the complexity of the question' - after all, that's the user preference in chatbot use cases. Just a bit of post-training on agentic trajectories can make an immediate and dramatic difference.
As a thank you to the community, he shared 100 invite codes, first-come first-served; just use "HUGGINGFACE" to get access!
Post
4648
10,000+ models based on DeepSeek R1 have been publicly shared on Hugging Face! Which ones are your favorites: https://huggingface.co/models?sort=trending&search=r1. Truly a game-changer!
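For reference, a small sketch of reproducing that search programmatically with huggingface_hub; the web URL sorts by trending, while the sketch uses the downloads sort key, which the client API supports.
```python
# Sketch: list public models matching "r1" via the Hub API.
from huggingface_hub import list_models

for model in list_models(search="r1", sort="downloads", direction=-1, limit=20):
    print(model.id)
```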

JustinLin610 authored a paper 10 days ago
prefill assistant responses
1
#16 opened 12 days ago
by alexsafayan
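For context, a hedged sketch of one way to prefill an assistant response in transformers, assuming a version that supports continue_final_message (4.45+); the repo id is a placeholder.
```python
# Sketch: prefill an assistant response so the model continues it.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")  # illustrative repo id

messages = [
    {"role": "user", "content": "Write a haiku about autumn."},
    # The final assistant turn is left open and will be continued.
    {"role": "assistant", "content": "Golden leaves drift down"},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    continue_final_message=True,  # do not close the last message
)
print(prompt)
```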

Post
5878
Super happy to welcome Nvidia as our latest Enterprise Hub customer. They have almost 2,000 team members using Hugging Face, and their org has close to 20,000 followers. Can't wait to see what they'll open-source for all of us in the coming months!
Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise

JustinLin610 authored a paper 17 days ago

JustinLin610 authored a paper 24 days ago
Post
2823
What are the best organizations to follow on @huggingface?
Off the top of my head:
- DeepSeek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- Microsoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- xAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch
Bonus: the Agents Course org with 17,000 followers: https://huggingface.co/agents-course
Post
3484
We crossed 1B+ tokens routed to our inference provider partners on HF, a feature we released just a few days ago.
Just getting started, of course, but early users seem to like it, and we're always happy to partner with cool startups in the ecosystem.
Have you been using any of the integrations, and how can we make them better?
https://huggingface.co/blog/inference-providers
Support of flash attention 2?
2
#29 opened about 1 month ago
by LuciusLan
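For context, opting into FlashAttention-2 in transformers generally looks like the sketch below, assuming flash-attn is installed and the architecture supports it; the repo id is a placeholder.
```python
# Sketch: load a model with the FlashAttention-2 backend.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",  # illustrative repo id
    torch_dtype=torch.bfloat16,            # FA2 requires fp16 or bf16
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```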
Fastest way for inference?
3
#28 opened about 1 month ago
by psycy
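One common answer to this kind of question is a dedicated serving engine; a hedged sketch with vLLM follows, assuming it supports the checkpoint in question (the repo id is a placeholder).
```python
# Sketch: batched generation with vLLM's offline engine.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")  # illustrative repo id
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```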

reach-vb authored a paper about 1 month ago
Getting shape mismatch while loading saved Pixtral model
4
#24 opened about 1 month ago
by ss007
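For context, a minimal save/reload round-trip for the transformers-format Pixtral checkpoint; shape mismatches often come from reloading with a different class or config than the one used to save. This is a sketch under that assumption, not a diagnosis of the specific thread.
```python
# Sketch: save and reload Pixtral with the same class so the config
# and weight shapes line up on reload.
from transformers import LlavaForConditionalGeneration

model = LlavaForConditionalGeneration.from_pretrained("mistral-community/pixtral-12b")
model.save_pretrained("./pixtral-local")

# Reloading with the same class avoids config/shape drift.
reloaded = LlavaForConditionalGeneration.from_pretrained("./pixtral-local")
```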
Post
2079
GPT4chan Series Release
GPT4chan is a series of models I trained on the v2ray/4chan dataset, which is based on lesserfield/4chan-datasets. The dataset contains mostly posts from 2023. Not every board is included; for example, /pol/ is NOT included. To see which boards are included, visit v2ray/4chan.
This release contains two model sizes, 8B and 24B. The 8B model is based on meta-llama/Llama-3.1-8B and the 24B model is based on mistralai/Mistral-Small-24B-Base-2501.
Why did I make these models? Because for a long time after the original gpt-4chan model, there weren't any serious fine-tunes on 4chan datasets. 4chan is a good data source since it contains coherent replies and interesting topics. It's fun to talk to an AI-generated version of 4chan and get instant replies, without the need to actually visit 4chan. You can also analyze the content and behavior of 4chan posts by probing the model's outputs.
Disclaimer: The GPT4chan models should only be used for research purposes; the outputs they generate do not represent my views on the subjects. Moderate the responses before posting them online.
Model links:
Full model:
- v2ray/GPT4chan-8B
- v2ray/GPT4chan-24B
Adapter:
- v2ray/GPT4chan-8B-QLoRA
- v2ray/GPT4chan-24B-QLoRA
AWQ:
- v2ray/GPT4chan-8B-AWQ
- v2ray/GPT4chan-24B-AWQ
FP8:
- v2ray/GPT4chan-8B-FP8
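A hedged sketch of loading one of the adapters above onto its base model with peft; the repo ids are taken from the release list, and the call pattern assumes a standard PEFT adapter layout.
```python
# Sketch: attach the 8B QLoRA adapter to its base model.
from peft import PeftModel
from transformers import AutoModelForCausalLM

# The Llama base repo is gated and requires accepted access.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
model = PeftModel.from_pretrained(base, "v2ray/GPT4chan-8B-QLoRA")
```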

JustinLin610 authored 2 papers about 2 months ago
Post
7233
AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!