
Andrew Reed

andrewrreed

AI & ML interests

Applied ML, Practical AI, Inference & Deployment, LLMs, Multi-modal Models, RAG

Recent Activity

Organizations

Hugging Face, Demo Corp, Atmos Bank, Hugging Test Lab, HuggingFaceM4, Cloudera Fast Forward Labs, Code Llama, Xlscout Ltd, Olto, Enterprise Explorers, Navigate360, Ryght AI, Marker Learning, Sanofi, Social Post Explorers, Xsolla, open/ acc, Langfuse

andrewrreed's activity

reacted to MJannik's post with 🔥 2 days ago
I've published an article showing five ways to use 🪢 Langfuse with 🤗 Hugging Face.

My personal favorite is Method #4: Using Hugging Face Datasets for Langfuse Dataset Experiments. This lets you benchmark your LLM app or AI agent with a dataset hosted on Hugging Face. In this example, I chose the GSM8K dataset (openai/gsm8k) to test the mathematical reasoning capabilities of my smolagent :)
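To make the benchmarking idea concrete, here is a minimal sketch of preparing GSM8K-style items for a dataset experiment. This is illustrative, not the article's code: `extract_final_answer` is a hypothetical helper, the two rows are hardcoded stand-ins for `datasets.load_dataset("openai/gsm8k", "main")`, and actually uploading the items would go through the Langfuse SDK.

```python
# GSM8K stores the ground-truth answer after a "#### " marker.
# extract_final_answer is a hypothetical helper for benchmark scoring;
# a real run would pull rows via datasets.load_dataset("openai/gsm8k", "main").
def extract_final_answer(answer_field: str) -> str:
    """Return the final numeric answer following the '#### ' marker."""
    return answer_field.rsplit("####", 1)[-1].strip()

# Two rows mimicking the GSM8K schema (question / answer fields).
rows = [
    {"question": "Natalia sold 48 clips in April and half as many in May. How many in total?",
     "answer": "In May she sold 48 / 2 = 24 clips. Total is 48 + 24 = 72.\n#### 72"},
    {"question": "A book costs $5 and a pen costs $2. What do 3 books and 2 pens cost?",
     "answer": "Books: 3 * 5 = 15. Pens: 2 * 2 = 4. Total 15 + 4 = 19.\n#### 19"},
]

# Items in the shape a dataset-experiment tool expects: input + expected output.
items = [{"input": r["question"], "expected_output": extract_final_answer(r["answer"])}
         for r in rows]
print(items[0]["expected_output"], items[1]["expected_output"])  # → 72 19
```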

Link to the Article here on HF: https://huggingface.co/blog/MJannik/hugging-face-and-langfuse
posted an update 2 months ago
🚀 Supercharge your LLM apps with Langfuse on Hugging Face Spaces!

Langfuse brings end-to-end observability and tooling to accelerate your dev workflow from experiments through production.

Now available as a Docker Space directly on the HF Hub! 🤗

๐Ÿ” Trace everything: monitor LLM calls, retrieval, and agent actions with popular frameworks
1โƒฃ One-click deployment: on Spaces with persistent storage and integrated OAuth
๐Ÿ›  Simple Prompt Management: Version, edit, and update without redeployment
โœ… Intuitive Evals: Collect user feedback, run model/prompt evaluations, and improve quality
๐Ÿ“Š Dataset Creation: Build datasets directly from production data to enhance future performance
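As a toy illustration of the trace/span structure that observability tools like Langfuse record (this is not the Langfuse SDK, just a sketch of the nested-span concept):

```python
import time
from contextlib import contextmanager

spans = []   # completed spans, appended as they finish
_depth = 0   # current nesting level

@contextmanager
def span(name):
    """Record a named, nested timing span, like tracing SDKs do."""
    global _depth
    start, depth = time.perf_counter(), _depth
    _depth += 1
    try:
        yield
    finally:
        _depth -= 1
        spans.append({"name": name, "depth": depth,
                      "seconds": time.perf_counter() - start})

with span("agent-run"):
    with span("retrieval"):
        pass  # a real app would fetch documents here
    with span("llm-call"):
        pass  # ...and call the model here

print([(s["name"], s["depth"]) for s in spans])
# → [('retrieval', 1), ('llm-call', 1), ('agent-run', 0)]
```

A real SDK persists these spans to a backend where they can be visualized and evaluated; the nesting is what lets you see which step of an agent run was slow or wrong.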

Kudos to the Langfuse team for this collab and the awesome, open-first product they're building! 👏 @marcklingen @Clemo @MJannik

🔗 Space: langfuse/langfuse-template-space
🔗 Docs: https://huggingface.co/docs/hub/spaces-sdks-docker-langfuse
reacted to julien-c's post with 🤗❤️🔥 3 months ago
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub.

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
posted an update 4 months ago
Trace LLM calls with Arize AI's Phoenix observability dashboards on Hugging Face Spaces! 🚀

✨ I just added a new recipe to the Open-Source AI Cookbook that shows you how to:
1️⃣ Deploy Phoenix on HF Spaces with persistent storage in a few clicks
2️⃣ Configure LLM tracing with the Serverless Inference API
3️⃣ Observe multi-agent application runs with the CrewAI integration

๐—ข๐—ฏ๐˜€๐—ฒ๐—ฟ๐˜ƒ๐—ฎ๐—ฏ๐—ถ๐—น๐—ถ๐˜๐˜† ๐—ถ๐˜€ ๐—ฐ๐—ฟ๐˜‚๐—ฐ๐—ถ๐—ฎ๐—น for building robust LLM apps.

Phoenix makes it easy to visualize trace data, evaluate performance, and track down issues. Give it a try!

🔗 Cookbook recipe: https://huggingface.co/learn/cookbook/en/phoenix_observability_on_hf_spaces
🔗 Phoenix docs: https://docs.arize.com/phoenix
reacted to m-ric's post with โค๏ธ 4 months ago
Made a new app to visualize the LLM race ⇒ No European company in the top 10 🇪🇺❌

See the app here 👉 m-ric/llm-race-to-the-top

I've adapted an app by @andrewrreed that tracks the progress of LLMs on the Chatbot Arena leaderboard (andrewrreed/closed-vs-open-arena-elo) to compare companies from different countries.

The outcome is quite sad, as a Frenchman and European.

The top 10 is exclusively US 🇺🇸 and Chinese 🇨🇳 companies (after great Chinese LLM releases recently, like the Qwen2.5 series), with the notable exception of Mistral AI 🇫🇷.

American companies are making fast progress, Chinese ones even faster. Europe is at risk of being left behind. And the EU AI Act hasn't even come into force yet to slow down the EU market. We need to wake up 😬

โš ๏ธ Caution: This Chatbot Arena ELO ranking is not the most accurate, especially at high scores like this, because LLM makers can game it to some extent.
reacted to jsulz's post with ❤️🔥 4 months ago
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That's where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

โฉ Only upload the chunks that changed.
๐Ÿš€ Download just the updates, not the whole file.
๐Ÿง  We store your file as deduplicated chunks
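The chunking idea can be sketched with a toy content-defined chunker. This is an illustrative rolling-hash scheme, not XetHub's actual algorithm: a boundary is declared wherever a hash of the most recent bytes hits a target pattern, so an insertion early in a file leaves later chunk boundaries (and their hashes, hence dedup) unchanged.

```python
def cdc_chunks(data: bytes, window: int = 16, mask: int = 0x3F) -> list[bytes]:
    """Split data at content-defined boundaries: a boundary is declared when a
    toy rolling hash of recent bytes matches a pattern, at least `window` bytes
    after the previous boundary."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF  # toy hash; real systems use gearhash/buzhash
        if i - start + 1 >= window and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

data = bytes(range(256)) * 8
edited = b"XYZ" + data  # insert 3 bytes at the front
shared = set(cdc_chunks(data)) & set(cdc_chunks(edited))
# Boundaries resynchronize shortly after the edit, so most chunks are
# byte-identical between the two versions and never need re-uploading.
print(len(cdc_chunks(data)), len(shared))
```

Fixed-offset chunking would shift every boundary after the insertion; content-defined boundaries are what make "only upload the chunks that changed" work.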

In our benchmarks, we found that using CDC to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn't just a performance boost. It's a rethinking of how we manage models and datasets on the Hub.

We're planning to roll out our new storage backend on the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks
reacted to m-ric's post with 🔥 4 months ago
๐—ง๐—ต๐—ฒ ๐—ป๐—ฒ๐˜…๐˜ ๐—ฏ๐—ถ๐—ด ๐˜€๐—ผ๐—ฐ๐—ถ๐—ฎ๐—น ๐—ป๐—ฒ๐˜๐˜„๐—ผ๐—ฟ๐—ธ ๐—ถ๐˜€ ๐—ป๐—ผ๐˜ ๐Ÿฆ‹, ๐—ถ๐˜'๐˜€ ๐—›๐˜‚๐—ฏ ๐—ฃ๐—ผ๐˜€๐˜๐˜€! [INSERT STONKS MEME WITH LASER EYES]

See below: I got 105k impressions since regularly posting Hub Posts, coming close to my 275k on Twitter!

โš™๏ธ Computed with the great dataset maxiw/hf-posts
โš™๏ธ Thanks to Qwen2.5-Coder-32B for showing me how to access dict attributes in a SQL request!

cc @merve who's far in front of me
reacted to maxiw's post with ❤️ 4 months ago
I was curious to see what people post here on HF so I created a dataset with all HF Posts: maxiw/hf-posts

Some interesting stats:

Top 5 Authors by Total Impressions:
-----------------------------------
@merve : 171,783 impressions (68 posts)
@fdaudens : 135,253 impressions (81 posts)
@singhsidhukuldeep : 122,591 impressions (81 posts)
@akhaliq : 119,526 impressions (78 posts)
@MonsterMMORPG : 112,500 impressions (45 posts)

Top 5 Users by Number of Reactions Given:
----------------------------------------
@osanseviero : 1278 reactions
@clem : 910 reactions
@John6666 : 899 reactions
@victor : 674 reactions
@samusenps : 655 reactions

Top 5 Most Used Reactions:
-------------------------
โค๏ธ: 7048 times
๐Ÿ”ฅ: 5921 times
๐Ÿ‘: 4856 times
๐Ÿš€: 2549 times
๐Ÿค—: 2065 times
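Stats like the ones above are simple group-bys over the posts dataset. A hedged sketch with toy rows (the real data lives in maxiw/hf-posts; the field names here are illustrative, not necessarily the dataset's schema):

```python
from collections import Counter, defaultdict

# Toy rows standing in for maxiw/hf-posts; "author"/"impressions"/"reactions"
# are illustrative field names, not necessarily the dataset's actual schema.
posts = [
    {"author": "merve", "impressions": 100, "reactions": ["❤️", "🔥"]},
    {"author": "merve", "impressions": 50, "reactions": ["❤️"]},
    {"author": "fdaudens", "impressions": 120, "reactions": ["🚀"]},
]

# Total impressions and post counts per author.
totals, counts = defaultdict(int), defaultdict(int)
for p in posts:
    totals[p["author"]] += p["impressions"]
    counts[p["author"]] += 1

top_authors = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
reaction_freq = Counter(r for p in posts for r in p["reactions"])

print(top_authors)  # → [('merve', 150), ('fdaudens', 120)]
print(reaction_freq.most_common(1))  # → [('❤️', 2)]
```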
reacted to clem's post with 🚀🔥 5 months ago
This is no Woodstock AI but will be fun nonetheless haha. I'll be hosting a live workshop with team members next week about the Enterprise Hugging Face Hub.

1,000 spots available, first-come first-served, with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
reacted to melisa's post with 🔥 7 months ago
🔥 Introducing "Writing in the Margins" (WiM) - a better inference pattern for long-context LLMs that solves the Lost-in-the-Middle problem 🔥

Paper page: Writing in the Margins: Better Inference Pattern for Long Context Retrieval (2408.14906)

TL;DR
Make your model write "margin notes" as you chunk-prefill the KV cache. Then ask it to reread all the notes before it speaks up.
Works with humans, works with AI 🤖

WiM leverages the chunked prefill of the key-value cache, which concurrently generates query-based extractive summaries at each step of the prefill that are subsequently reintegrated at the end of the computation. We term these intermediate outputs "margins", drawing inspiration from the practice of making margin notes for improved comprehension of long contexts in human reading. We show that this technique, which adds only minimal additional computation, significantly improves LLMs' long-context reasoning capabilities.

Think: every chunk has a chance to be attended to / be at the end of the context at least once. 🎉
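The control flow can be sketched as follows. This is schematic, not the paper's implementation: `model` is a stand-in callable, and a real run would prefill the KV cache chunk by chunk rather than re-sending the growing context as text.

```python
# Writing-in-the-Margins control flow, schematically: as context is consumed
# chunk by chunk, the model emits a short extractive "margin" note per chunk,
# and the final answer conditions on the query plus all margins.
def wim_answer(model, chunks, query):
    margins, context_so_far = [], ""
    for chunk in chunks:
        context_so_far += chunk  # corresponds to chunked prefill of the KV cache
        note = model(f"Context: {context_so_far}\nQuery: {query}\nExtract relevant facts:")
        margins.append(note)
    # Each chunk got a chance to be "at the end of the context" once.
    return model(f"Query: {query}\nNotes:\n" + "\n".join(margins) + "\nAnswer:")

# Toy stand-in model: echoes the role of the last prompt line,
# just to show the call pattern without a real LLM.
toy = lambda prompt: prompt.splitlines()[-1].split(":")[0]
out = wim_answer(toy, ["chunk-a ", "chunk-b "], "who?")
print(out)  # → Answer
```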

📊 Results:
- An average accuracy boost of 7.5% in multi-hop reasoning tasks like HotpotQA and MultiHop-RAG.
- Even a 30% increase in F1-score for summarisation-like tasks (CWE).

Plus, WiM fits seamlessly into interactive applications (think: progress bar!). It can provide real-time progress updates during data retrieval and integration, making it user-friendly and transparent - a stark contrast to feeding 1M tokens to an LLM and waiting 6 minutes for the first token. 🤯

👩‍💻🧑‍💻 Check it out and contribute to our open-source project here: https://github.com/writer/writing-in-the-margins

🧠 More about chunked prefill: https://docs.vllm.ai/en/latest/models/performance.html#chunked-prefill
reacted to m-ric's post with 🔥 7 months ago
๐—Ÿ๐—น๐—ฎ๐—บ๐—ฎ-๐Ÿฏ.๐Ÿญ ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น๐˜€ ๐—ณ๐—ถ๐—ป๐—ฎ๐—น๐—น๐˜† ๐—ด๐—ฒ๐˜ ๐˜๐—ต๐—ฒ๐—ถ๐—ฟ ๐—–๐—ต๐—ฎ๐˜๐—ฏ๐—ผ๐˜ ๐—”๐—ฟ๐—ฒ๐—ป๐—ฎ ๐—ฟ๐—ฎ๐—ป๐—ธ๐—ถ๐—ป๐—ด ๐ŸŽ–๏ธ

Given the impressive benchmarks published by Meta for their Llama-3.1 models, I was curious to see how these models would compare to top proprietary models on Chatbot Arena.

Now we've got the results! LMSys released the ELO derived from thousands of user votes for the new models, and here are the rankings:

💥 405B model ranks 5th overall, in front of GPT-4-Turbo! But behind GPT-4o, Claude-3.5 Sonnet, and Gemini-Advanced.
👏 70B model climbs up to 9th place! From 1206 ➡️ 1244.
👏 8B model improves from 1152 ➡️ 1170.

✅ This confirms that Llama-3.1 is a good contender for any task: all three of its model sizes are much cheaper to run than equivalent proprietary models!

For instance, here are the inference prices for the top models:
➤ GPT-4-Turbo inference price from OpenAI: $5/M input tokens, $15/M output tokens
➤ Llama-3.1-405B from HF API (for testing only): $3/M for input or output tokens (source linked in the first comment)
➤ Llama-3.1-405B from HF API (for testing only): free ✨
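To make the price gap concrete, a quick cost calculation using the figures quoted above ($5/M input and $15/M output for GPT-4-Turbo, a flat $3/M for Llama-3.1-405B); the request sizes are made up for illustration:

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_per_m: float, out_per_m: float) -> float:
    """Cost of one request given per-million-token prices."""
    return (input_tokens * in_per_m + output_tokens * out_per_m) / 1_000_000

# A hypothetical long-context request: 100k input tokens, 2k output tokens.
gpt4_turbo = cost_usd(100_000, 2_000, 5.0, 15.0)
llama_405b = cost_usd(100_000, 2_000, 3.0, 3.0)
print(round(gpt4_turbo, 3), round(llama_405b, 3))  # → 0.53 0.306
```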

Get a head start on the HF API (resource by @andrewrreed) 👉 https://huggingface.co/learn/cookbook/enterprise_hub_serverless_inference_api
reacted to dvilasuero's post with 🤗❤️🚀🔥 9 months ago
Today is a huge day in Argilla's history. We couldn't be more excited to share this with the community: we're joining Hugging Face!

We're embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.

Over the past year, we've been collaborating with Hugging Face on countless projects: becoming a launch partner for Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr's learnings, the Data is Better Together initiative with hundreds of community contributors, and releasing argilla/OpenHermesPreferences, one of the largest open preference-tuning datasets.

After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we're now the same team.

To those of you who've been following us, this won't be a huge surprise, but it will be a big deal in the coming months. This acquisition means we'll double down on empowering the community to build and collaborate on high-quality datasets, we'll bring full support for multimodal datasets, and we'll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.

As a founder, I am proud of the Argilla team. We're now part of something bigger and a larger team but with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amélie.

Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.

Would love to answer any questions you have so feel free to add them below!
reacted to lunarflu's post with ❤️ 10 months ago
cooking up something... anyone interested in a daily activity tracker for HF?