---
language: [es]
license: mit
tags:
- text-classification
- agriculture
- climate
- potato
- Peru
- Huancavelica
- LLaMA
- environmental-prediction
model-index:
- name: llama-lateblight-classifier
  results:
  - task:
      type: text-classification
      name: Potato Late Blight Risk Classification
    dataset:
      name: Huancavelica Late Blight Benchmark (Balanced)
      type: tabular
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.97
    - name: F1 (macro)
      type: f1
      value: 0.97
    - name: Precision
      type: precision
      value: 0.97
    - name: Recall
      type: recall
      value: 0.97
pipeline_tag: text-classification
library_name: transformers
---

# 🌾 LLaMA Late Blight Classifier (Huancavelica, Peru)

This model is a fine-tuned classifier based on `openlm-research/open_llama_3b`, trained to predict **potato late blight risk levels** (`Bajo`, `Moderado`, `Alto`) in the highlands of Huancavelica, Peru. It uses environmental inputs (temperature, humidity, precipitation) and crop variety metadata to output discrete classifications.

---

## 🤝 Use Case

**Direct Use**: Agronomic advisory systems or research tools predicting potato late blight risk from structured prompts or API queries.

**Not for**: Open-ended generation, conversational use, or regions with different pathogen pressures without retraining.


---

## 🌐 Model Details

- **Base model**: `openlm-research/open_llama_3b`
- **Architecture**: LLaMA-3B with classification head (`AutoModelForSequenceClassification`)
- **Fine-tuning method**: Full fine-tuning on a balanced, curated dataset (not LoRA)
- **Tokenizer**: Compatible LLaMA tokenizer (`tokenizer.model` included)
- **Language**: Spanish (with structured Spanish prompts)
- **Task**: Hard classification (3-class)

---

## 🎓 Training

- **Dataset**: 156 training + 24 validation examples (balanced across 3 classes)
- **Labels**: `Bajo`, `Moderado`, `Alto`
- **Format** (JSONL):

  ```json
  {
    "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.",
    "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay",
    "output": "Moderado"
  }
  ```

- **Epochs**: 10
- **Optimizer**: AdamW (mixed precision)
- **Hardware**: 1x A100 40GB (Colab Pro, single GPU)

---

## 🌿 Evaluation (Balanced Test Set, n = 90)

| Class        | Precision | Recall | F1       | Support |
|--------------|-----------|--------|----------|---------|
| Bajo         | 1.00      | 0.90   | 0.95     | 30      |
| Moderado     | 0.91      | 1.00   | 0.95     | 30      |
| Alto         | 1.00      | 1.00   | 1.00     | 30      |
| **Accuracy** |           |        | **0.97** | 90      |

---

## 📈 Intended Use and Limitations

- **Designed for**: Highland regions in Peru (esp. Huancavelica), with expert-labeled ground truth and local pathogen behavior.
- **Limitations**:
  - May generalize poorly to lowland areas or different varieties.
  - Not a substitute for in-field disease monitoring.

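
The summary figures in the evaluation table can be cross-checked against its per-class rows. The sketch below is plain arithmetic on the published (rounded) numbers, not the author's evaluation script; the variable names are illustrative. It takes macro averages as unweighted means over the three classes and recovers accuracy from recall times support, which gives the number of correctly classified examples per class.

```python
# Per-class rows copied from the evaluation table:
# (precision, recall, f1, support)
per_class = {
    "Bajo": (1.00, 0.90, 0.95, 30),
    "Moderado": (0.91, 1.00, 0.95, 30),
    "Alto": (1.00, 1.00, 1.00, 30),
}

rows = list(per_class.values())
n_classes = len(rows)
total = sum(support for _, _, _, support in rows)

# Macro average: unweighted mean of the per-class values.
macro_precision = sum(p for p, _, _, _ in rows) / n_classes
macro_recall = sum(r for _, r, _, _ in rows) / n_classes
macro_f1 = sum(f for _, _, f, _ in rows) / n_classes

# Accuracy: correctly classified examples over all examples.
# recall * support is the count of correct predictions for a class.
accuracy = sum(r * support for _, r, _, support in rows) / total

summary = {
    "accuracy": round(accuracy, 2),
    "precision_macro": round(macro_precision, 2),
    "recall_macro": round(macro_recall, 2),
    "f1_macro": round(macro_f1, 2),
}
print(summary)
```

All four values round to 0.97, matching the metrics reported in the card metadata.
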

---

## 📑 Citation

If you use this model, please cite:

> Jorge Luis Alonso, *Predicting Potato Late Blight in Huancavelica Using LLaMA Models*, 2025

---

## 🌍 License

MIT License (model + training data)

---

## ⚡ Quick Inference Example

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hub.
model = AutoModelForSequenceClassification.from_pretrained("jalonso24/llama-lateblight-classifier")
tokenizer = AutoTokenizer.from_pretrained("jalonso24/llama-lateblight-classifier")

# top_k=1 keeps only the highest-scoring risk label.
clf = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=1)

prompt = "Escenario: Temperatura 18.1 °C, Humedad 85%, Variedad Amarilis"
clf(prompt)
# ➞ [{'label': 'Alto', 'score': 0.95}]  (example output)
```
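
The training records use a fixed Spanish template in their `input` field, so structured readings from a weather station or an API can be rendered into that template before classification. The helper below is a minimal sketch: `build_prompt` and its parameter names are hypothetical and not part of the released model or of `transformers`. It reproduces only the `input` text from the training example; the card does not say whether the `instruction` field was also fed to the classifier, so it is left out here.

```python
def build_prompt(temp_c: float, humidity_pct: float, precip_mm: float,
                 variety: str, scenario: int = 1) -> str:
    """Render readings in the template of the training example's `input` field.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    return (
        f"Escenario {scenario}: "
        f"Temperatura promedio {temp_c:.1f} °C, "  # mean temperature in °C
        f"Humedad {humidity_pct:.0f}%, "           # relative humidity in percent
        f"Precipitación {precip_mm:.1f} mm, "      # precipitation in mm
        f"Variedad {variety}"                      # potato variety name
    )


# Values taken from the training example shown in the Training section.
prompt = build_prompt(17.2, 83, 3.4, "Yungay")
print(prompt)
```

The resulting string can be passed to the `clf` pipeline from the quick inference example in place of the hand-written prompt.
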