
Unofficial Mistral Community
community
AI & ML interests
Unofficial org for community uploads of Mistral's open-source models.
Recent Activity
mistral-community's activity
Cannot apply chat template from tokenizer
1
#31 opened 3 days ago
by DarkLight1337
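For context, the usual call looks like the sketch below, assuming a recent transformers version; the repo id is a placeholder, not necessarily the model this thread is about.
```python
# Minimal sketch of applying a chat template with transformers.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")  # illustrative repo id

messages = [{"role": "user", "content": "Hello, who are you?"}]

# This raises if the tokenizer config ships no chat_template, a common
# cause of "cannot apply chat template" errors.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```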
Post
7094
I was chatting with @peakji, one of the co-founders of Manus AI, who told me he was on Hugging Face (very cool!).
He shared an interesting insight: agentic capabilities might be more of an alignment problem than a foundational capability issue. Similar to the difference between GPT-3 and InstructGPT, some open-source foundation models are simply trained to 'answer everything in one response regardless of the complexity of the question' - after all, that's the user preference in chatbot use cases. Just a bit of post-training on agentic trajectories can make an immediate and dramatic difference.
As a thank you to the community, he shared 100 invite codes, first-come first-served; just use "HUGGINGFACE" to get access!
Post
4648
10,000+ models based on DeepSeek R1 have been publicly shared on Hugging Face! Which ones are your favorites: https://huggingface.co/models?sort=trending&search=r1. Truly a game-changer!
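For reference, a small sketch of reproducing that search programmatically with huggingface_hub; the web URL sorts by trending, while the sketch uses the downloads sort key, which the client API supports.
```python
# Sketch: list public models matching "r1" via the Hub API.
from huggingface_hub import list_models

for model in list_models(search="r1", sort="downloads", direction=-1, limit=20):
    print(model.id)
```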

JustinLin610 authored a paper 10 days ago
prefill assistant responses
1
#16 opened 12 days ago
by alexsafayan
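For context, a hedged sketch of one way to prefill an assistant response in transformers, assuming a version that supports continue_final_message (4.45+); the repo id is a placeholder.
```python
# Sketch: prefill an assistant response so the model continues it.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")  # illustrative repo id

messages = [
    {"role": "user", "content": "Write a haiku about autumn."},
    # The final assistant turn is left open and will be continued.
    {"role": "assistant", "content": "Golden leaves drift down"},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    continue_final_message=True,  # do not close the last message
)
print(prompt)
```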

Post
5878
Super happy to welcome Nvidia as our latest Enterprise Hub customer. They have almost 2,000 team members using Hugging Face, and their org has close to 20,000 followers. Can't wait to see what they'll open-source for all of us in the coming months!
Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise

JustinLin610 authored a paper 17 days ago

JustinLin610 authored a paper 24 days ago
Post
2823
What are the best organizations to follow on @huggingface?
Off the top of my head:
- DeepSeek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- Microsoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- xAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch
Bonus: the Agents Course org with 17,000 followers: https://huggingface.co/agents-course
Post
3484
We crossed 1B+ tokens routed to our inference provider partners on HF, a feature we released just a few days ago.
Just getting started, of course, but early users seem to like it, and we're always happy to partner with cool startups in the ecosystem.
Have you been using any of the integrations, and how can we make them better?
https://huggingface.co/blog/inference-providers
Support of flash attention 2?
2
#29 opened about 1 month ago
by LuciusLan
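For context, opting into FlashAttention-2 in transformers generally looks like the sketch below, assuming flash-attn is installed and the architecture supports it; the repo id is a placeholder.
```python
# Sketch: load a model with the FlashAttention-2 backend.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",  # illustrative repo id
    torch_dtype=torch.bfloat16,            # FA2 requires fp16 or bf16
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```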
Fastest way for inference?
3
#28 opened about 1 month ago
by psycy
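One common answer to this kind of question is a dedicated serving engine; a hedged sketch with vLLM follows, assuming it supports the checkpoint in question (the repo id is a placeholder).
```python
# Sketch: batched generation with vLLM's offline engine.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")  # illustrative repo id
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```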

reach-vb authored a paper about 1 month ago
Getting shape mismatch while loading saved Pixtral model
4
#24 opened about 1 month ago
by ss007
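For context, a minimal save/reload round-trip for the transformers-format Pixtral checkpoint; shape mismatches often come from reloading with a different class or config than the one used to save. This is a sketch under that assumption, not a diagnosis of the specific thread.
```python
# Sketch: save and reload Pixtral with the same class so the config
# and weight shapes line up on reload.
from transformers import LlavaForConditionalGeneration

model = LlavaForConditionalGeneration.from_pretrained("mistral-community/pixtral-12b")
model.save_pretrained("./pixtral-local")

# Reloading with the same class avoids config/shape drift.
reloaded = LlavaForConditionalGeneration.from_pretrained("./pixtral-local")
```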
Post
2079
GPT4chan Series Release
GPT4chan is a series of models I trained on the v2ray/4chan dataset, which is based on lesserfield/4chan-datasets. The dataset contains mostly posts from 2023. Not every board is included; for example, /pol/ is NOT included. To see which boards are included, visit v2ray/4chan.
This release contains two model sizes, 8B and 24B. The 8B model is based on meta-llama/Llama-3.1-8B and the 24B model is based on mistralai/Mistral-Small-24B-Base-2501.
Why did I make these models? Because for a long time after the original gpt-4chan model, there weren't any serious fine-tunes on 4chan datasets. 4chan is a good data source since it contains coherent replies and interesting topics. It's fun to talk to an AI-generated version of 4chan and get instant replies, without the need to actually visit 4chan. You can also analyze the content and behavior of 4chan posts by probing the model's outputs.
Disclaimer: The GPT4chan models should only be used for research purposes; the outputs they generate do not represent my views on the subjects. Moderate the responses before posting them online.
Model links:
Full model:
- v2ray/GPT4chan-8B
- v2ray/GPT4chan-24B
Adapter:
- v2ray/GPT4chan-8B-QLoRA
- v2ray/GPT4chan-24B-QLoRA
AWQ:
- v2ray/GPT4chan-8B-AWQ
- v2ray/GPT4chan-24B-AWQ
FP8:
- v2ray/GPT4chan-8B-FP8
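A hedged sketch of loading one of the adapters above onto its base model with peft; the repo ids are taken from the release list, and the call pattern assumes a standard PEFT adapter layout.
```python
# Sketch: attach the 8B QLoRA adapter to its base model.
from peft import PeftModel
from transformers import AutoModelForCausalLM

# The Llama base repo is gated and requires accepted access.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
model = PeftModel.from_pretrained(base, "v2ray/GPT4chan-8B-QLoRA")
```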

JustinLin610 authored 2 papers about 2 months ago
Post
7233
AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!