---
title: 'Vanishing Voices: Language Atlas'
emoji:
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.44.1
app_file: rag_hf.py
pinned: false
---
# Vanishing Voices: South America's Endangered Language Atlas

This app explores three retrieval-augmented generation (RAG) methods to support the documentation of South America's endangered indigenous languages:

- **Standard Search**: Based on Wikipedia/Wikidata embeddings only.
- **Hybrid Search**: Combines embeddings with RDF cultural knowledge.
- **GraphSAGE Search**: Includes structural information from a graph neural network.
## 🧠 Powered by
- 🤗 [Hugging Face Inference Endpoints](https://huggingface.co/inference-endpoints)
- 🧱 SentenceTransformers for multilingual embeddings
- 🧮 NetworkX + RDFLib for cultural graphs
- Glottolog, Wikidata, Wikipedia
## Features
- RAG with local numpy embeddings
- RDF triple inspection
- Comparison of methods in terms of relevance and hallucination
- Custom prompt injected into a Hugging Face endpoint

> Note: This app requires your own Hugging Face API token in `.streamlit/secrets.toml`.
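The "RAG with local numpy embeddings" feature comes down to a nearest-neighbour lookup over a stored embedding matrix. The following is a minimal sketch of that idea, not the code in `rag_hf.py`: the function name `top_k_cosine`, the toy 2-dimensional vectors, and the file name in the comment are all assumptions made for illustration, and it assumes no embedding is the zero vector.

```python
import numpy as np

def top_k_cosine(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k rows of doc_matrix closest to query_vec."""
    # Normalize both sides so a dot product equals cosine similarity.
    # Assumes no vector has zero norm.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    # argsort is ascending, so reverse it and keep the first k entries.
    order = np.argsort(scores)[::-1][:k]
    return order, scores[order]

# Toy stand-in for a matrix loaded with np.load("embeddings.npy")
# (hypothetical file name; real embeddings would have hundreds of dimensions).
docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([1.0, 0.1])

idx, sims = top_k_cosine(query, docs, k=2)
print(idx, sims)
```

The returned indices would then be used to look up the matching passages, which become the context for the generation step.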
## Instructions
1. Upload your own `.ttl`, `.pkl`, and `.npy` files for the graph and embeddings.
2. Set `HF_ENDPOINT` and `HF_API_TOKEN` in `.streamlit/secrets.toml`.
3. Deploy via Streamlit or Hugging Face Spaces.
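As a rough illustration of how the two secrets from step 2 might be used, the sketch below builds a prompt from retrieved passages and prepares a request for the endpoint. It is an assumption-laden example rather than the app's actual code: `build_prompt`, `build_request`, the prompt template, and the placeholder URL and token are hypothetical, and the `{"inputs": ..., "parameters": ...}` payload assumes the endpoint serves a text-generation model. In the app itself the values would normally be read from `st.secrets` instead of a plain dictionary.

```python
import json
import urllib.request

def build_prompt(question, passages):
    """Join retrieved passages into a context block ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def build_request(endpoint, token, prompt, max_new_tokens=256):
    """Prepare (but do not send) a POST request for a text-generation endpoint."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder values standing in for the entries in .streamlit/secrets.toml.
secrets = {"HF_ENDPOINT": "https://example.invalid/endpoint", "HF_API_TOKEN": "hf_xxx"}

prompt = build_prompt("Where is Quechua spoken?", ["Quechua is spoken in the Andes."])
req = build_request(secrets["HF_ENDPOINT"], secrets["HF_API_TOKEN"], prompt)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Keeping request construction separate from sending makes it easy to check the headers and payload without a live endpoint or a real token.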
---