metadata
language:
- es
license: mit
tags:
- text-classification
- agriculture
- climate
- potato
- Peru
- Huancavelica
- LLaMA
- environmental-prediction
model-index:
- name: llama-lateblight-classifier
  results:
  - task:
      type: text-classification
      name: Potato Late Blight Risk Classification
    dataset:
      name: Huancavelica Late Blight Benchmark (Balanced)
      type: tabular
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.97
    - name: F1 (macro)
      type: f1
      value: 0.97
    - name: Precision
      type: precision
      value: 0.97
    - name: Recall
      type: recall
      value: 0.97
pipeline_tag: text-classification
library_name: transformers
🌾 LLaMA Late Blight Classifier (Huancavelica, Peru)
This model is a fine-tuned classifier based on `openlm-research/open_llama_3b`, trained to predict potato late blight risk levels (`Bajo`, `Moderado`, `Alto`, i.e. low, moderate, high) in the highlands of Huancavelica, Peru. It takes environmental inputs (temperature, humidity, precipitation) and crop variety metadata and outputs one of these three discrete classes.
🤝 Use Case
Direct Use: Agronomic advisory systems or research tools predicting potato late blight risk from structured prompts or API queries.
Not for: Open-ended generation, conversational use, or regions with different pathogen pressures without retraining.
🌐 Model Details
- Base model: `openlm-research/open_llama_3b`
- Architecture: LLaMA-3B with classification head (`AutoModelForSequenceClassification`)
- Fine-tuning method: Full fine-tuning on a balanced, curated dataset (not LoRA)
- Tokenizer: Compatible LLaMA tokenizer (`tokenizer.model` included)
- Language: Spanish (with structured Spanish prompts)
- Task: Hard classification (3-class)
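As a rough illustration of the architecture described above, the sketch below attaches a 3-class head to the base checkpoint through `AutoModelForSequenceClassification`. The label order in `ID2LABEL`, the helper name `load_base_classifier`, and the pad-token handling are assumptions made for illustration; the card does not state how the author configured these, so check the published `config.json` before relying on the mapping.

```python
# Assumed label order; the real mapping lives in the checkpoint's config.json.
ID2LABEL = {0: "Bajo", 1: "Moderado", 2: "Alto"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def load_base_classifier(base="openlm-research/open_llama_3b"):
    """Return the base model with a freshly initialised 3-class head (hypothetical helper)."""
    # Imported lazily so the label mappings can be reused without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(
        base,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    # LLaMA-style tokenizers may ship without a pad token; batched
    # sequence classification needs one, so fall back to EOS here.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    model.config.pad_token_id = tokenizer.pad_token_id
    return model, tokenizer
```

Calling `load_base_classifier()` downloads the 3B base weights, so it is only worth running on a machine with enough memory for full fine-tuning.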
🎓 Training
- Dataset: 156 training + 24 validation examples (balanced across 3 classes)
- Labels: `Bajo`, `Moderado`, `Alto`
- Format (JSONL):
{ "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.", "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay", "output": "Moderado" }
- Epochs: 10
- Optimizer: AdamW (mixed precision)
- Hardware: 1x A100 40GB (Colab Pro, single GPU)
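To make the JSONL format above concrete, here is a minimal sketch that parses records and counts labels to check class balance. The way `instruction` and `input` are joined into one text is an assumption (the card does not say how the author assembled the classifier input), and `read_records` and `to_example` are hypothetical helper names.

```python
import json
from collections import Counter

# One record copied from the format example in this card.
SAMPLE_JSONL = (
    '{"instruction": "Evalúa el riesgo de tizón tardío basado en los datos '
    'climáticos y la variedad.", "input": "Escenario 1: Temperatura promedio '
    '17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay", '
    '"output": "Moderado"}'
)


def read_records(jsonl_text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def to_example(record):
    """Turn a record into (text, label); joining with a newline is an assumption."""
    text = f"{record['instruction']}\n{record['input']}"
    return text, record["output"]


records = read_records(SAMPLE_JSONL)
examples = [to_example(r) for r in records]

# On the full training file, a balanced split would show equal counts per label.
label_counts = Counter(label for _, label in examples)
print(label_counts)  # Counter({'Moderado': 1})
```

If the 156 training examples are exactly balanced, the same count on the full file would give 52 per class, and 8 per class for the 24 validation examples.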
🌿 Evaluation (Balanced Test Set, n = 90)
Class | Precision | Recall | F1 | Support |
---|---|---|---|---|
Bajo | 1.00 | 0.90 | 0.95 | 30 |
Moderado | 0.91 | 1.00 | 0.95 | 30 |
Alto | 1.00 | 1.00 | 1.00 | 30 |
Accuracy | | | 0.97 | 90 |
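The per-class figures can be cross-checked with a short calculation. The confusion matrix below is not reported by the author; it is reconstructed here as the matrix consistent with the rounded table values (three `Bajo` cases predicted as `Moderado`, everything else correct), so treat it as an assumption.

```python
LABELS = ["Bajo", "Moderado", "Alto"]

# Rows = true class, columns = predicted class.
# Reconstructed from the table above, not taken from the author's logs.
CONFUSION = [
    [27, 3, 0],
    [0, 30, 0],
    [0, 0, 30],
]


def per_class_metrics(cm):
    """Return (precision, recall, f1) per class from a square confusion matrix."""
    n = len(cm)
    results = []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column total
        actual = sum(cm[i])                          # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


metrics = per_class_metrics(CONFUSION)
total = sum(sum(row) for row in CONFUSION)
accuracy = sum(CONFUSION[i][i] for i in range(len(CONFUSION))) / total
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)

for label, (p, r, f1) in zip(LABELS, metrics):
    print(f"{label:<9} P={p:.2f} R={r:.2f} F1={f1:.2f}")
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")
```

Under this reconstruction, 87 of 90 test examples are classified correctly, which rounds to the reported accuracy and macro F1 of 0.97.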
📈 Intended Use and Limitations
- Designed for: Highland regions in Peru (esp. Huancavelica), with expert-labeled ground truth and local pathogen behavior.
- Limitations:
  - May generalize poorly to lowland areas or different varieties.
  - Not a substitute for in-field disease monitoring.
📑 Citation
If you use this model, please cite:
Jorge Luis Alonso, Predicting Potato Late Blight in Huancavelica Using LLaMA Models, 2025
🌍 License
MIT License (model + training data)
⚡ Quick Inference Example
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
# Load the fine-tuned classifier and its tokenizer from the Hub
model = AutoModelForSequenceClassification.from_pretrained("jalonso24/llama-lateblight-classifier")
tokenizer = AutoTokenizer.from_pretrained("jalonso24/llama-lateblight-classifier")
# top_k=1 keeps only the highest-scoring class
clf = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=1)
prompt = "Escenario: Temperatura 18.1 °C, Humedad 85%, Variedad Amarilis"
clf(prompt)
# ➞ [{'label': 'Alto', 'score': 0.95}]
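When readings arrive as numbers (for example from a weather station or an API), a small formatter can produce prompts shaped like the training records. This is a sketch only: `build_prompt` is a hypothetical helper, and the exact field order, decimal places, and `Escenario:` prefix mirror the JSONL example in the Training section rather than any documented input contract.

```python
def build_prompt(temp_c, humidity_pct, precip_mm, variety):
    """Format readings into a Spanish prompt resembling the training inputs.

    The layout is copied from the card's JSONL example; whether the model
    needs this exact wording is an assumption.
    """
    return (
        f"Escenario: Temperatura promedio {temp_c:.1f} °C, "
        f"Humedad {humidity_pct:.0f}%, "
        f"Precipitación {precip_mm:.1f} mm, "
        f"Variedad {variety}"
    )


prompt = build_prompt(17.2, 83, 3.4, "Yungay")
print(prompt)
# Escenario: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay
```

The resulting string can be passed to the `clf` pipeline from the quick inference example in place of a hand-written prompt.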