---
language: [es]
license: mit
tags:
  - text-classification
  - agriculture
  - climate
  - potato
  - Peru
  - Huancavelica
  - LLaMA
  - environmental-prediction
model-index:
  - name: llama-lateblight-classifier
    results:
      - task:
          type: text-classification
          name: Potato Late Blight Risk Classification
        dataset:
          name: Huancavelica Late Blight Benchmark (Balanced)
          type: tabular
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.97
          - name: F1 (macro)
            type: f1
            value: 0.97
          - name: Precision
            type: precision
            value: 0.97
          - name: Recall
            type: recall
            value: 0.97
pipeline_tag: text-classification
library_name: transformers
---

# 🌾 LLaMA Late Blight Classifier (Huancavelica, Peru)

This model is a fine-tuned version of `openlm-research/open_llama_3b` that classifies **potato late blight risk** as `Bajo` (low), `Moderado` (moderate), or `Alto` (high) for the highlands of Huancavelica, Peru. It takes environmental inputs (temperature, humidity, precipitation) plus the crop variety and returns a single risk class.

---

## 🤝 Use Case

**Direct Use**: Agronomic advisory systems or research tools predicting potato late blight risk from structured prompts or API queries.

**Not for**: Open-ended generation, conversational use, or regions with different pathogen pressures without retraining.

---

## 🌐 Model Details

- **Base model**: `openlm-research/open_llama_3b`
- **Architecture**: OpenLLaMA 3B with a sequence-classification head (`AutoModelForSequenceClassification`)
- **Fine-tuning method**: Full fine-tuning on a balanced, curated dataset (not LoRA)
- **Tokenizer**: Compatible LLaMA tokenizer (`tokenizer.model` included)
- **Language**: Spanish (structured prompts)
- **Task**: Single-label classification (3 classes)

---

## 🎓 Training

- **Dataset**: 156 training + 24 validation examples (balanced across 3 classes)
- **Labels**: `Bajo`, `Moderado`, `Alto`
- **Format** (JSONL):
  ```json
  {
    "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.",
    "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay",
    "output": "Moderado"
  }
  ```
- **Epochs**: 10
- **Optimizer**: AdamW, with mixed-precision training
- **Hardware**: 1x A100 40GB (Colab Pro, single GPU)
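
The JSONL format above can be produced from structured readings with a small helper. This is a minimal sketch, not the author's preprocessing code: the function name `make_record` and its parameters are hypothetical, and the number formatting is inferred from the single example shown in this card.

```python
import json

# Instruction text and label set copied from this model card.
INSTRUCTION = "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad."
LABELS = ("Bajo", "Moderado", "Alto")


def make_record(scenario_id, temp_c, humidity_pct, precip_mm, variety, label):
    """Build one training record in the card's JSONL layout (hypothetical helper)."""
    if label not in LABELS:
        raise ValueError(f"label must be one of {LABELS}, got {label!r}")
    # Formatting assumption: one decimal for temperature and precipitation,
    # whole numbers for humidity, as in the card's example.
    input_text = (
        f"Escenario {scenario_id}: Temperatura promedio {temp_c:.1f} °C, "
        f"Humedad {humidity_pct:.0f}%, Precipitación {precip_mm:.1f} mm, "
        f"Variedad {variety}"
    )
    return {"instruction": INSTRUCTION, "input": input_text, "output": label}


record = make_record(1, 17.2, 83, 3.4, "Yungay", "Moderado")
# ensure_ascii=False keeps accented characters readable in the JSONL file.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Writing one such `line` per example to a file yields a JSONL dataset in the layout shown above.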

---

## 🌿 Evaluation (Balanced Test Set, n = 90)

| Class     | Precision | Recall | F1    | Support |
|-----------|-----------|--------|-------|---------|
| Bajo      | 1.00      | 0.90   | 0.95  | 30      |
| Moderado  | 0.91      | 1.00   | 0.95  | 30      |
| Alto      | 1.00      | 1.00   | 1.00  | 30      |
| **Accuracy** |         |        | **0.97** | 90      |
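
The per-class figures can be cross-checked with a few lines of arithmetic. The confusion matrix in this sketch is not reported by the author; it is reconstructed from the table under the assumption that the three misclassified `Bajo` examples were all predicted as `Moderado`, which is the only assignment consistent with the rounded precision and recall values.

```python
LABELS = ["Bajo", "Moderado", "Alto"]

# Rows are true classes, columns are predicted classes.
# Reconstructed from the table above (assumption), not taken from the author's logs.
CONFUSION = [
    [27, 3, 0],   # Bajo: 27 correct, 3 predicted as Moderado
    [0, 30, 0],   # Moderado: all 30 correct
    [0, 0, 30],   # Alto: all 30 correct
]


def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[r][k] for r in range(n))  # column sum
        actual = sum(cm[k])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


metrics = per_class_metrics(CONFUSION)
total = sum(sum(row) for row in CONFUSION)
accuracy = sum(CONFUSION[k][k] for k in range(len(LABELS))) / total
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)

for name, (p, r, f1) in zip(LABELS, metrics):
    print(f"{name:<9} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")
```

Under that assumption, accuracy is 87/90 and the rounded values match both the table and the macro F1 in the card metadata.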

---

## 📈 Intended Use and Limitations

- **Designed for**: Highland regions of Peru (especially Huancavelica), where the expert-labeled ground truth reflects local pathogen behavior.
- **Limitations**:
  - May generalize poorly to lowland areas or to potato varieties absent from the training data.
  - Not a substitute for in-field disease monitoring.

---

## 📑 Citation

If you use this model, please cite:

> Jorge Luis Alonso, *Predicting Potato Late Blight in Huancavelica Using LLaMA Models*, 2025

---

## 🌍 License

MIT License (model + training data)

---

## ⚡ Quick Inference Example

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hub
model_id = "jalonso24/llama-lateblight-classifier"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# top_k=1 returns only the highest-scoring class
clf = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=1)

# Structured Spanish prompt (the training examples also include precipitation in mm)
prompt = "Escenario: Temperatura 18.1 °C, Humedad 85%, Variedad Amarilis"
print(clf(prompt))
# ➞ [{'label': 'Alto', 'score': 0.95}]
```
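
Downstream code usually needs just the label and its score. The helper below is a sketch under two assumptions: that the pipeline output may arrive either as a flat list of dicts or nested one level deeper (as can happen with batched inputs), and that a caller may want a confidence cut-off. Both `top_prediction` and `min_score` are hypothetical names, and no threshold is specified by this model card. The sample results are hand-written stand-ins, not real model output.

```python
def top_prediction(result, min_score=0.0):
    """Return (label, score) for the best class, or None if below min_score."""
    # Unwrap one level if the pipeline returned a list of lists.
    candidates = result[0] if result and isinstance(result[0], list) else result
    if not candidates:
        return None
    best = max(candidates, key=lambda item: item["score"])
    if best["score"] < min_score:
        # Low-confidence case: the caller can defer to field inspection.
        return None
    return best["label"], best["score"]


# Hand-written examples in the two output shapes (not real predictions).
flat = [{"label": "Alto", "score": 0.95}]
nested = [[{"label": "Moderado", "score": 0.55}, {"label": "Alto", "score": 0.40}]]

print(top_prediction(flat))
print(top_prediction(nested, min_score=0.6))
```

Returning `None` for low-confidence cases keeps the decision to escalate with the caller, in line with the note that the model is not a substitute for in-field monitoring.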