---
title: 'Vanishing Voices: Language Atlas'
emoji: 🌎
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.44.1
app_file: rag_hf.py
pinned: false
---
# Vanishing Voices: South America's Endangered Language Atlas 🌎
This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:
- **Standard Search**: Retrieves using Wikipedia/Wikidata embeddings only.
- **Hybrid Search**: Combines those embeddings with RDF cultural knowledge.
- **GraphSAGE Search**: Adds structural information from a GraphSAGE graph neural network.
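
The difference between the first two methods can be illustrated with a minimal sketch. It assumes passages are stored with precomputed embedding vectors and that RDF knowledge is available as (subject, predicate, object) tuples. The function names, toy data, and the `spokenIn` predicate are hypothetical and not taken from `rag_hf.py`. Plain Python lists keep the example dependency-free, whereas the app stores its embeddings as numpy arrays.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def standard_search(query_vec, docs, k=2):
    """Rank documents by embedding similarity only."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]

def hybrid_search(query_vec, docs, triples, k=2):
    """Same ranking, plus RDF triples whose subject is the retrieved entity."""
    results = []
    for doc in standard_search(query_vec, docs, k):
        facts = [t for t in triples if t[0] == doc["id"]]
        results.append({**doc, "facts": facts})
    return results

# Toy data (hypothetical): 2-d vectors stand in for real sentence embeddings.
docs = [
    {"id": "Quechua", "vec": [0.9, 0.1]},
    {"id": "Mapudungun", "vec": [0.1, 0.9]},
    {"id": "Aymara", "vec": [0.8, 0.3]},
]
triples = [
    ("Quechua", "spokenIn", "Peru"),
    ("Aymara", "spokenIn", "Bolivia"),
]

hits = hybrid_search([1.0, 0.0], docs, triples, k=2)
for hit in hits:
    print(hit["id"], hit["facts"])
```

A GraphSAGE variant would presumably bring graph-derived node embeddings into the ranking step; that part is omitted here.
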
## 🧠 Powered by
- 🤗 [Hugging Face Inference Endpoints](https://huggingface.co/inference-endpoints)
- 🧱 SentenceTransformers for multilingual embeddings
- 🧮 NetworkX + RDFLib for cultural graphs
- 🔗 Glottolog, Wikidata, Wikipedia
## 📊 Features
- RAG with local numpy embeddings
- RDF triple inspection
- Comparison of the three methods in terms of relevance and hallucination
- Custom prompt injected into a Hugging Face endpoint
> Note: This app requires your own Hugging Face API token in `.streamlit/secrets.toml`.
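
As a rough illustration of the custom-prompt feature, the sketch below assembles a prompt from retrieved passages and prepares an authenticated request. The prompt wording, the `build_prompt` and `build_request` helpers, the placeholder URL and token, and the `{"inputs": ..., "parameters": ...}` payload shape (common for text-generation endpoints) are all assumptions; the app's actual prompt and payload may differ. The request is only constructed, not sent.

```python
import json
import urllib.request

def build_prompt(question, passages):
    """Inline retrieved passages as numbered context ahead of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def build_request(endpoint, token, prompt, max_new_tokens=256):
    """Prepare (but do not send) a POST request for a text-generation endpoint."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

prompt = build_prompt(
    "Where is Aymara spoken?",
    ["Aymara is spoken in Bolivia, Peru, and Chile."],
)
# Placeholder endpoint and token (hypothetical values).
request = build_request("https://example.invalid/generate", "hf_xxx", prompt)
# Sending would be: urllib.request.urlopen(request)
```

Telling the model to rely only on the supplied context is one common way to reduce hallucination, which is the property the method comparison looks at.
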
## 📄 Instructions
1. Upload your own `.ttl`, `.pkl`, and `.npy` files for the graph and embeddings.
2. Set `HF_ENDPOINT` and `HF_API_TOKEN` in `.streamlit/secrets.toml`.
3. Deploy via Streamlit or Hugging Face Spaces.
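
Step 2 can be checked at startup with a small helper. The sketch below assumes the two keys sit at the top level of `.streamlit/secrets.toml` (shown in the comment with placeholder values) and that the app passes Streamlit's `st.secrets` mapping to a hypothetical `load_hf_settings` function; a plain dict stands in for it here.

```python
from collections.abc import Mapping

# Assumed layout of .streamlit/secrets.toml (placeholder values):
#   HF_ENDPOINT = "https://<your-endpoint-url>"
#   HF_API_TOKEN = "hf_..."

REQUIRED_KEYS = ("HF_ENDPOINT", "HF_API_TOKEN")

def load_hf_settings(secrets: Mapping) -> dict:
    """Return the required settings, failing early with a clear message."""
    missing = [key for key in REQUIRED_KEYS if not secrets.get(key)]
    if missing:
        raise KeyError(f"Missing in .streamlit/secrets.toml: {', '.join(missing)}")
    return {key: secrets[key] for key in REQUIRED_KEYS}

# In the app this would be load_hf_settings(st.secrets); a dict stands in here.
settings = load_hf_settings(
    {"HF_ENDPOINT": "https://example.invalid", "HF_API_TOKEN": "hf_xxx"}
)
```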
---