JLouisBiz (Jean Louis)

reacted to ZennyKenny's post with 👍 about 11 hours ago

Post

1666

I've created a new dataset using the Algorithm of Thoughts architecture proposed by Sel et al. (2023) in a reasoning context. (paper: https://arxiv.org/pdf/2308.10379)

The dataset simulates the discovery phase of a fictitious VC firm called Reasoned Capital and, once expanded, can be used to create models which are able to make complex, subjective financial decisions based on different criteria.

The generation process uses increasingly complex prompts to encourage recursive problem-solving, pushing models to assess and reevaluate the conclusions and opinions generated by upstream models. Pretty neat stuff, and I'm not aware of this architecture being used in a reasoning context anywhere else.

Check it out: ZennyKenny/synthetic_vc_financial_decisions_reasoning_dataset

reacted to AdinaY's post with 🔥 1 day ago

Post

4753

Kimi-Audio 🚀🎧 an OPEN audio foundation model released by Moonshot AI
moonshotai/Kimi-Audio-7B-Instruct
✨ 7B
✨ 13M+ hours of pretraining data
✨ Novel hybrid input architecture
✨ Universal audio capabilities (ASR, AQA, AAC, SER, SEC/ASC, end-to-end conversation)

reacted to jasoncorkill's post with 🔥 1 day ago

Post

5055

🚀 Building Better Evaluations: 32K Image Annotations Now Available

Today, we're releasing an expanded version: 32K images annotated with 3.7M responses from over 300K individuals, completed in under two weeks using the Rapidata Python API.

Rapidata/text-2-image-Rich-Human-Feedback-32k

A few months ago, we published one of our most-liked datasets, with 13K images based on @data-is-better-together's dataset, following Google's research on "Rich Human Feedback for Text-to-Image Generation" (https://arxiv.org/abs/2312.10240). It collected over 1.5M responses from 150K+ participants.

Rapidata/text-2-image-Rich-Human-Feedback

In the examples below, users highlighted words from prompts that were not correctly depicted in the generated images. Higher word scores indicate more frequent issues. If an image captured the prompt accurately, users could select [No_mistakes].
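The scoring idea described above can be sketched in a few lines. This is only an illustration under assumptions: the input format (one list of flagged words per annotator) and the helper name `word_scores` are hypothetical, and the post does not say how the published scores are normalized.

```python
from collections import Counter

NO_MISTAKES = "[No_mistakes]"


def word_scores(prompt, responses):
    """Count how often each prompt word was flagged as not depicted.

    `responses` is a hypothetical format: one list of flagged words per
    annotator, or [NO_MISTAKES] when the image matched the prompt.
    """
    counts = Counter()
    for selection in responses:
        if NO_MISTAKES in selection:
            continue  # this annotator found nothing wrong
        counts.update(set(selection))  # one vote per word per annotator
    # Report every prompt word, including those never flagged (score 0).
    return {word: counts[word] for word in prompt.split()}


prompt = "a red cat on a blue sofa"
responses = [["red"], ["red", "sofa"], [NO_MISTAKES], ["blue"]]
scores = word_scores(prompt, responses)
print(scores)
```

A higher count means more annotators flagged that word, which matches the "higher score, more frequent issue" reading in the post.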

We're continuing to work on large-scale human feedback and model evaluation. If you're working on related research and need large, high-quality annotations, feel free to get in touch: [email protected].

reacted to Xenova's post with 🔥 1 day ago

Post

2762

Introducing the ONNX model explorer: Browse, search, and visualize neural networks directly in your browser. 🤯 A great tool for anyone studying Machine Learning! We're also releasing the entire dataset of graphs so you can use them in your own projects! 🤗

Check it out! 👇
Demo: onnx-community/model-explorer
Dataset: onnx-community/model-explorer
Source code: https://github.com/xenova/model-explorer

replied to as-cle-bert's post 1 day ago

Thanks so much for working on that. It's really good.

replied to hassenhamdi's post 1 day ago

So you are talking about a dataset, but you can't prevent its use. If someone publishes a dataset for military usage, what's wrong with that? If it isn't published, some countries will simply have better datasets and take advantage of them. But if it is open-sourced, there is some competition in the military market, and they may think twice or create better datasets. Whether it is published or not really doesn't matter, because we have to change the consciousness of this society to get peace in the world. So it's not that I think it's okay. I appreciate your intentions, and that's actually the way to go. But forbidding people to publish it is not going to stop the world. By forbidding, we don't change people's consciousness about peace in the world. Do you get me?

replied to hassenhamdi's post 3 days ago

@JLouisBiz I have read your comment, but it does not make any sense. What is the purpose of regulations and law if there are no limitations on what one can do?

I didn't speak about law at that level, but about licensing. And I have the same opinion and the same notion: leave it to the law to decide.

You can't limit people by nicely telling them to read the license and follow guidelines on what you think is ethical or moral. Nice people behave nicely anyway; bad people don't listen.

We are not talking about a chair here or any ordinary object for innocent usage; we are talking about war tech development. Are a chair and war tools, made primarily for destruction and killing, the same?!

Yes, and? There are numerous books written on that subject and available from many libraries including online. Information is accessible.

Please think deeper on what I said in my first paragraph in this message.

They are not. Compared with a gun, it might be more comparable: you can have a gun to protect yourself, but as a US citizen you need a permit to possess one, or you get arrested for possessing an unauthorized item. You ask why? To ensure public safety. But even with such procedures, horrifying incidents still happen. We are not talking about something similar to a chair here; the analogy is poorly representative of the present situation.

Sure, but it is not related to LLM licensing and free software.

And it is not about open source and free software; it is about not developing war tools.

I speak only related to Free Software.

And yes, it can and IS USED to develop war tools. It is also used for criminal purposes.

As you said in the first paragraph, it is for the law to decide, not for the author.

Because the author(s) of nice software or an LLM simply cannot prevent any war by placing some kind of "warning", like "it is forbidden to make war by using this LLM as a tool". That is nonsense. You cannot enforce it. You cannot even know.

Keep it fully free software so that there are no doubts about how it can be used. There is a reason for the freedom.

Knives are sold every day, and there is no warning on how to use them.

Take an example: someone using their computer for piracy, cyber threats, scams, fraudulent activities, etc. Do you just let them go their way, or do some actions need to be taken to protect people? War tech is far worse than anything mentioned earlier.

You and I are not crime hunters, and if we are, we have our ways.

Crime hunting is not done by limiting how people use some software programs. It is simply not feasible. That approach is rather prone to unjustly accusing people with good intentions.

Remember, no matter how many permissions a citizen may need to carry a gun, the bad guy doesn't care about it, and is going to get it so much easier than the nice guy.

replied to hassenhamdi's post 4 days ago

I don't agree with that. Any such limitation makes the model proprietary, which means it will reach fewer places and help fewer people on this planet. A truly free software or open source model such as an LLM should not be limited, because there is no limitation on how a user is supposed to use it. Everybody should be able to use it as they wish. Another issue is that there is no way for the author or anybody else to find out who used it in a way that you or the author consider inappropriate.

Let's just compare it to the everyday objects we use in our lives. Let's say a chair and a keyboard, monitor, or desk. When you buy them or get them for free from someone, do you get some kind of limitation? "Please, this is a chair, but you are not allowed to use it inappropriately, sit on it inappropriately, or use it for purposes other than sitting." You don't.

So please think about that. People can definitely use a chair to kill somebody or a desk to destroy things, or use a keyboard or monitor to fight.

And now how is somebody going to use their computer and software? Leave that for them because that's the point of free software. They can use it as they wish.

And just remember there is no way for anybody to control how somebody on the other side of the planet is using it.

This means that appealing to the ethical sense of honest and ethical people isn't needed, because they are already honest and ethical. And appealing to the ethical sense of dishonest people is not going to work anyway. So you can't ensure that either way.

What is Free Software? - GNU Project - Free Software Foundation:
https://www.gnu.org/philosophy/free-sw.html

So read here and learn about the four free software freedoms.

replied to as-cle-bert's post 4 days ago

I am using Nomic Embed Text and Nomic Embed Vision from a local API endpoint. In my opinion, your package should be more flexible about how embeddings are generated, because some people may use remote embeddings as well. What matters most here is that any kind of document can be ingested. Another question: have you thought about page numbers?

replied to as-cle-bert's post 4 days ago

That sounds like a much-needed thing. How can I use my own embedder?

reacted to as-cle-bert's post with 🔥 4 days ago

Post

2817

Ever dreamt of ingesting into a vector DB that pile of CSVs, Word documents and presentations lying in some remote folders on your PC?🗂️
What if I told you that you can do it within three to six lines of code?🤯
Well, with my latest open-source project, ingest-anything (https://github.com/AstraBert/ingest-anything), you can take all your non-PDF files, convert them to PDF, extract their text, chunk, embed and load them into a vector database, all in one go!🚀
How? It's pretty simple!
📁 The input files are converted into PDF by PdfItDown (https://github.com/AstraBert/PdfItDown)
📑 The PDF text is extracted using LlamaIndex readers
🦛 The text is chunked using Chonkie
🧮 The chunks are embedded with Sentence Transformers models
🗄️ The embeddings are loaded into a Qdrant vector database

And you're done!✅
Curious to try it? Install it by running:

pip install ingest-anything

And you can start using it in your Python scripts!🐍
Don't forget to star it on GitHub and let me know if you have any feedback! ➡️ https://github.com/AstraBert/ingest-anything
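The chunk, embed and load steps listed above can be illustrated with a small standard-library sketch. This is not the ingest-anything API: `chunk_text`, `toy_embedder` and `ingest` are hypothetical stand-ins, the hash-based vector is a placeholder for a real embedding model, and a plain list replaces the vector database. It also shows one way an embedder could be made pluggable, which is the question raised in the replies.

```python
import hashlib
from typing import Callable, List, Tuple

# Any function mapping a text chunk to a vector can serve as the embedder,
# whether it calls a local model, a local API endpoint, or a remote service.
Embedder = Callable[[str], List[float]]


def chunk_text(text: str, max_words: int = 50) -> List[str]:
    """Split text into chunks of at most `max_words` words (naive chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embedder(chunk: str, dim: int = 8) -> List[float]:
    """Placeholder embedding: deterministic but meaningless hash-based vector."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def ingest(text: str, embed: Embedder,
           store: List[Tuple[str, List[float]]], max_words: int = 50) -> int:
    """Chunk the text, embed each chunk, and append (chunk, vector) to the store."""
    chunks = chunk_text(text, max_words)
    for chunk in chunks:
        store.append((chunk, embed(chunk)))
    return len(chunks)


store: List[Tuple[str, List[float]]] = []
n = ingest("one two three four five six seven", toy_embedder, store, max_words=3)
print(n, [chunk for chunk, _ in store])
```

Passing the embedder as a callable keeps the pipeline independent of where the vectors come from, so swapping in a different model only means passing a different function.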

5 replies

reacted to orasul's post with 🔥 4 days ago

Post

2088

hi, it is deki, and now I am open sourced.

An Android AI agent powered by an open-source ML model, deki, is now fully open-sourced.

It understands what’s on your screen and can perform tasks based on your voice or text commands.

Some examples:
* "Write my friend "some_name" in WhatsApp that I'll be 15 minutes late"
* "Open Twitter in the browser and write a post about something"
* "Read my latest notifications"
* "Write a linkedin post about something"

Currently, it works only on Android, but support for other operating systems is planned.

The ML and backend code is also fully open-sourced.

Video prompt example:

"Open linkedin, tap post and write: hi, it is deki, and now I am open sourced. But don't send, just return"

License: GPLv3

You can find other AI agent demos and usage examples, such as code generation or object detection, on GitHub.

GitHub: https://github.com/RasulOs/deki

2 replies

reacted to as-cle-bert's post with 🤗 4 days ago

reacted to ProCreations's post with 🚀 6 days ago

Post

2098

Come check out my new dataset, Mistake to Meaning, an attempt to help smaller models understand user typos better! Hope you guys enjoy it.

ProCreations/Mistake-To-Meaning

replied to onekq's post 6 days ago

Ollama? It takes more VRAM! It requires GGUF files, which are created by llama.cpp anyway. It is slower than llama.cpp, and using models not published on the Ollama website requires the user to think about it and configure things, unlike llama.cpp.

No way.

replied to hannayukhymenko's post 6 days ago

There is nothing to be proud of. You have based it on a proprietary model, preventing people from using it as they wish and totally disregarding free software principles. Why don't you follow the good example of companies such as Microsoft, IBM, Mistral, Allen AI, Qwen, or DeepSeek, which distribute free software models?

Gemma License (danger) is not Free Software and is not Open Source
https://gnu.support/gnu-emacs/emacs-lisp/Gemma-License-danger-is-not-Free-Software-and-is-not-Open-Source.html

The Gemma Terms of Use and Prohibited Use Policy govern the use, modification, and distribution of Google's Gemma machine learning model and its derivatives. While Gemma is available for public use, it does not conform to Free Software or Open Source principles as defined by the Free Software Foundation (FSF) or Open Source Initiative (OSI). The terms impose significant restrictions, including prohibited use cases (e.g., illegal, harmful, or malicious activities), requirements to enforce Google's use restrictions on downstream users, and limitations on redistribution and derived works. Additionally, the terms do not guarantee access to source code or the freedom to use the software for any purpose, and they include broad disclaimers of warranty and liability. As a result, Gemma is a proprietary model with limited permissions, rather than a truly free or open-source software offering.

What is Free Software? - GNU Project - Free Software Foundation
https://www.gnu.org/philosophy/free-sw.html

replied to AdinaY's post 6 days ago

Totally fantastic, obviously one of the best. I hope it will run on a 3090 GPU with 24 GB.

replied to onekq's post 8 days ago

What is hard about it when I already gave you a command that works well?

replied to onekq's post 8 days ago

It works with llama.cpp.

Here is how you can run it:

llama-server -ngl 999 --host 192.168.1.68 --override-kv glm4.rope.dimension_count=int:64 --override-kv tokenizer.ggml.eos_token_id=int:151336 -m /mnt/nvme0n1/LLM/quantized/GLM-4-9B-0414-Q8_0.gguf

Read here why:

Eval bug: GLM-Z1-9B-0414 · Issue #12946 · ggml-org/llama.cpp:
https://github.com/ggml-org/llama.cpp/issues/12946#issuecomment-2803564782
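For anyone launching the server from a script, the command above can be assembled from its parts. This is a minimal sketch: the helper name `build_llama_server_cmd` is hypothetical, and every flag and value is copied from the command in this reply rather than taken from the llama.cpp documentation. The code only builds and prints the command; it does not start the server.

```python
import shlex
from typing import Dict, List


def build_llama_server_cmd(model_path: str, host: str,
                           overrides: Dict[str, str],
                           gpu_layers: int = 999) -> List[str]:
    """Assemble the llama-server argument list (hypothetical helper)."""
    cmd = ["llama-server", "-ngl", str(gpu_layers), "--host", host]
    # Each metadata override is passed as its own --override-kv KEY=TYPE:VALUE.
    for key, typed_value in overrides.items():
        cmd += ["--override-kv", f"{key}={typed_value}"]
    cmd += ["-m", model_path]
    return cmd


cmd = build_llama_server_cmd(
    model_path="/mnt/nvme0n1/LLM/quantized/GLM-4-9B-0414-Q8_0.gguf",
    host="192.168.1.68",
    overrides={
        "glm4.rope.dimension_count": "int:64",
        "tokenizer.ggml.eos_token_id": "int:151336",
    },
)
command_line = shlex.join(cmd)
print(command_line)
```

The resulting list can be handed to `subprocess.run(cmd)` on a machine where `llama-server` is installed and the model path exists.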

reacted to fantos's post with 🔥 9 days ago

Post

4160

🎨 BadgeCraft: Create Beautiful Badges with Ease! ✨
Hello there! Today I'm introducing BadgeCraft, a simple app that lets you create stunning badges for your websites, GitHub READMEs, and documentation.

🌟 Key Features

🖌️ 14 diverse color options including vibrant neon colors
🔤 Custom text input for label and message
🖼️ Support for 2000+ logos via Simple Icons
🔗 Clickable link integration
👁️ Real-time preview
💻 Ready-to-use HTML code generation

📝 How to Use

Label - Enter the text to display on the left side of the badge (e.g., "Discord", "Version", "Status")
Message - Enter the text to display on the right side of the badge
Logo - Type the name of a logo provided by Simple Icons (e.g., "discord", "github")
Style - Choose the shape of your badge (flat, plastic, for-the-badge, etc.)
Color Settings - Select background color, label background color, and logo color
Link - Enter the URL that the badge will link to when clicked

✅ Use Cases

Add social media links to your GitHub project README
Display version information or download links on your website
Include tech stack badges in blog posts
Show status indicators in documentation (e.g., "in development", "stable")

💡 Tips

Click on any of the prepared examples to automatically fill in all settings
Copy the generated HTML code and paste directly into your website or blog
HTML works in GitHub READMEs, but if you prefer markdown, use the ![alt text](badge URL) format

👨‍💻 Tech Stack
This app was built using Gradio and leverages the shields.io API to generate badges. Its simple UI makes it accessible for everyone!

🔗 openfree/Badge

✨ Available under MIT License - feel free to use and modify.

1 reply
