riphunter7001x/PaliGemma3_FT_OCR

Generated from Trainer

riphunter7001x committed on Feb 13

Commit 9612b7d · verified

1 Parent(s): 4f73a4f

Update README.md

Files changed (1)

README.md +42 -3

@@ -46,11 +46,50 @@ The following hyperparameters were used during training:
 ### Training results
 ### Framework versions
 - PEFT 0.14.0
 - Transformers 4.47.0
 - Pytorch 2.2.1+cu121
-- Tokenizers 0.21.0

 ### Training results
 ### Framework versions
 - PEFT 0.14.0
 - Transformers 4.47.0
 - Pytorch 2.2.1+cu121
+- Tokenizers 0.21.0
+## Inference
+```python
+from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
+from PIL import Image
+import torch
+import json
+# Load model and processor
+model_id = "google/paligemma-3b-pt-448"
+peft_adapter_id = "riphunter7001x/PaliGemma3_FT_OCR"
+model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, device_map="auto")
+processor = AutoProcessor.from_pretrained(model_id)
+model.load_adapter(peft_adapter_id)  # returns None, so eval() cannot be chained onto it
+model.eval()
+TORCH_DTYPE = model.dtype
+DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+# Load and process image
+image = Image.open("image.jpg")
+prefix = "<image>extract Document data in JSON format"
+inputs = processor(
+    text=prefix,
+    images=image,
+    return_tensors="pt"
+).to(TORCH_DTYPE).to(DEVICE)
+prefix_length = inputs["input_ids"].shape[-1]
+with torch.inference_mode():
+    generation = model.generate(**inputs, max_new_tokens=512, do_sample=False)
+    generation = generation[0][prefix_length:]
+    decoded = processor.decode(generation, skip_special_tokens=True)
+    print(json.dumps(json.loads(decoded), indent=4))
+```
+This code loads the base PaliGemma model, attaches the fine-tuned adapter, processes an input image, and prints the extracted document data as formatted JSON.
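
The inference snippet passes the decoded text straight to `json.loads`, which raises `json.JSONDecodeError` whenever the generated text is not a single valid JSON document (for example, if generation stops at `max_new_tokens` before the closing brace). A minimal sketch of a more tolerant parse follows. `parse_model_json` is a hypothetical helper introduced here for illustration; it is not part of this repository or of Transformers, and the fallback rule (take the outermost `{...}` span) is an assumption about how stray text might appear.

```python
import json
from typing import Any, Optional


def parse_model_json(decoded: str) -> Optional[Any]:
    """Best-effort parse of generated text into JSON (hypothetical helper).

    Returns the parsed value, or None if no valid JSON can be recovered.
    """
    text = decoded.strip()

    # First try the text as-is, which is what the original snippet does.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Assumed fallback: keep only the span from the first "{" to the last "}",
    # in case the model emitted extra text around the JSON object.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            pass

    return None
```

In the inference snippet, the final `print` could then check for `None` before calling `json.dumps`, so that a truncated or malformed generation is reported instead of raising.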