---
language: [es]
license: mit
tags:
- text-classification
- agriculture
- climate
- potato
- Peru
- Huancavelica
- LLaMA
- environmental-prediction
model-index:
- name: llama-lateblight-classifier
  results:
  - task:
      type: text-classification
      name: Potato Late Blight Risk Classification
    dataset:
      name: Huancavelica Late Blight Benchmark (Balanced)
      type: tabular
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.97
    - name: F1 (macro)
      type: f1
      value: 0.97
    - name: Precision
      type: precision
      value: 0.97
    - name: Recall
      type: recall
      value: 0.97
pipeline_tag: text-classification
library_name: transformers
---
# 🌾 LLaMA Late Blight Classifier (Huancavelica, Peru)

This model is a classifier fine-tuned from `openlm-research/open_llama_3b` to predict **potato late blight risk levels** (`Bajo`, `Moderado`, `Alto`, i.e. low, moderate, high) in the highlands of Huancavelica, Peru. It takes environmental readings (temperature, humidity, precipitation) and the crop variety as a structured text prompt and returns one of the three classes.

---
## 🤝 Use Case

**Direct Use**: Agronomic advisory systems or research tools that predict potato late blight risk from structured prompts or API queries.

**Not for**: Open-ended generation, conversational use, or use in regions with different pathogen pressures unless the model is retrained.

---
## 🌐 Model Details

- **Base model**: `openlm-research/open_llama_3b`
- **Architecture**: LLaMA architecture (3B parameters) with a classification head (`AutoModelForSequenceClassification`)
- **Fine-tuning method**: Full fine-tuning on a balanced, curated dataset (not LoRA)
- **Tokenizer**: Compatible LLaMA tokenizer (`tokenizer.model` included)
- **Language**: Spanish (structured prompts)
- **Task**: Hard (single-label) classification over 3 classes

---
## 🎓 Training

- **Dataset**: 156 training + 24 validation examples (balanced across 3 classes)
- **Labels**: `Bajo`, `Moderado`, `Alto`
- **Format** (JSONL, one record per line; pretty-printed here):

  ```json
  {
    "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.",
    "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay",
    "output": "Moderado"
  }
  ```

- **Epochs**: 10
- **Optimizer**: AdamW, with mixed-precision training
- **Hardware**: 1x A100 40GB (Colab Pro)
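
A record in this format can be turned into a (text, label id) pair for a sequence-classification head roughly as follows. This is a minimal sketch, not the author's preprocessing code: the `LABEL2ID` ordering, the `to_example` helper, and the choice to join `instruction` and `input` with a newline are all assumptions.

```python
import json

# Assumed label order; the checkpoint's own config.id2label is authoritative.
LABEL2ID = {"Bajo": 0, "Moderado": 1, "Alto": 2}


def to_example(jsonl_line: str) -> dict:
    """Convert one JSONL record into classifier input text and an integer label."""
    record = json.loads(jsonl_line)
    # Assumption: instruction and input are concatenated into a single prompt.
    text = f"{record['instruction']}\n{record['input']}"
    return {"text": text, "label": LABEL2ID[record["output"]]}


# One line of the JSONL file, built from the record shown in the Training section.
line = json.dumps(
    {
        "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.",
        "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay",
        "output": "Moderado",
    },
    ensure_ascii=False,
)

example = to_example(line)
print(example["label"])  # 1 under the assumed LABEL2ID mapping
```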
---

## 🌿 Evaluation (Balanced Test Set, n = 90)

| Class        | Precision | Recall | F1       | Support |
|--------------|-----------|--------|----------|---------|
| Bajo         | 1.00      | 0.90   | 0.95     | 30      |
| Moderado     | 0.91      | 1.00   | 0.95     | 30      |
| Alto         | 1.00      | 1.00   | 1.00     | 30      |
| **Accuracy** |           |        | **0.97** | 90      |
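
The per-class figures in this table are consistent with a single error pattern: three `Bajo` examples predicted as `Moderado`, and every other example classified correctly. The sketch below rebuilds that confusion matrix (an inference from the table, not a published artifact) and recomputes the metrics in plain Python to show how the 0.97 accuracy and macro F1 follow from it.

```python
LABELS = ["Bajo", "Moderado", "Alto"]

# Rows = true class, columns = predicted class.
# Inferred from the reported precision/recall/support, not taken from the author.
confusion = [
    [27, 3, 0],
    [0, 30, 0],
    [0, 0, 30],
]


def per_class_metrics(cm, k):
    """Return (precision, recall, f1) for class index k."""
    tp = cm[k][k]
    predicted = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual = sum(cm[k])                    # row sum: everything truly k
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


metrics = [per_class_metrics(confusion, k) for k in range(len(LABELS))]
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[k][k] for k in range(len(LABELS))) / total
macro_f1 = sum(m[2] for m in metrics) / len(metrics)

for label, (p, r, f1) in zip(LABELS, metrics):
    print(f"{label:<9} P={p:.2f} R={r:.2f} F1={f1:.2f}")
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")
```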
---

## 📈 Intended Use and Limitations

- **Designed for**: Highland regions of Peru, especially Huancavelica. The ground-truth labels were assigned by experts and reflect local pathogen behavior.
- **Limitations**:
  - May generalize poorly to lowland areas or to other potato varieties.
  - Not a substitute for in-field disease monitoring.

---
## 📑 Citation

If you use this model, please cite:

> Jorge Luis Alonso, *Predicting Potato Late Blight in Huancavelica Using LLaMA Models*, 2025

---
## 🌍 License

MIT License (model + training data)

---
## ⚡ Quick Inference Example

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_id = "jalonso24/llama-lateblight-classifier"

# Load the fine-tuned classifier and its tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# top_k=1 keeps only the highest-scoring class
clf = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=1)

prompt = "Escenario: Temperatura 18.1 °C, Humedad 85%, Variedad Amarilis"
print(clf(prompt))
# ➞ [{'label': 'Alto', 'score': 0.95}]
```
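
The prompt in the example is free-form text, so small formatting differences from the training data could affect predictions. One way to keep inputs consistent is a helper that renders structured readings into the `input` layout shown in the Training section. This is only a sketch: the function name `build_prompt` and its parameters are hypothetical, and whether the model also expects the `instruction` text in front of it is an assumption to check against the training setup.

```python
def build_prompt(temp_c: float, humidity_pct: float, precip_mm: float,
                 variety: str, scenario_id: int = 1) -> str:
    """Render readings in the same layout as the training `input` field."""
    return (
        f"Escenario {scenario_id}: "
        f"Temperatura promedio {temp_c:.1f} °C, "
        f"Humedad {humidity_pct:.0f}%, "
        f"Precipitación {precip_mm:.1f} mm, "
        f"Variedad {variety}"
    )


prompt = build_prompt(17.2, 83, 3.4, "Yungay")
print(prompt)
# Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay
```

The resulting string can be passed to the `clf` pipeline in place of the hand-written prompt.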