Marathi-Sign-Language-Detection

Marathi-Sign-Language-Detection is an image classification model fine-tuned from the vision-language encoder google/siglip2-base-patch16-224. It recognizes Marathi sign language hand gestures and maps each one to one of 43 Devanagari characters, using the SiglipForImageClassification architecture.

Classification Report:
              precision    recall  f1-score   support

           अ     0.9881    0.9911    0.9896      1009
           आ     0.9926    0.9237    0.9569      1022
           इ     0.8132    0.9609    0.8809      1101
           ई     0.9424    0.8894    0.9151      1103
           उ     0.9477    0.9073    0.9271      1198
           ऊ     0.9436    1.0000    0.9710      1071
           ए     0.9153    0.9378    0.9264      1141
           ऐ     0.7790    0.8871    0.8295      1089
           ओ     0.9188    0.9581    0.9381      1075
           औ     1.0000    0.9226    0.9598      1021
           क     0.9566    0.9160    0.9358      1083
           क्ष     0.9287    0.9667    0.9473      1200
           ख     0.9913    1.0000    0.9956      1140
           ग     0.9753    0.9982    0.9866      1109
           घ     0.8398    0.7908    0.8146      1200
           च     0.9388    0.9016    0.9198      1158
           छ     0.9764    0.8127    0.8870      1169
           ज     0.9599    0.9967    0.9779      1200
           ज्ञ     0.9878    0.9483    0.9677      1200
           झ     0.9939    0.9567    0.9749      1200
           ट     0.8917    0.8992    0.8954      1200
           ठ     0.9075    0.8425    0.8738      1200
           ड     0.9354    0.9900    0.9619      1200
           ढ     0.8616    0.9025    0.8816      1200
           ण     0.9114    0.9425    0.9267      1200
           त     0.9280    0.9025    0.9151      1200
           थ     0.9388    0.9717    0.9550      1200
           द     0.8648    0.9275    0.8951      1200
           ध     0.9876    0.9917    0.9896      1200
           न     0.7256    0.8967    0.8021      1200
           प     0.9991    0.9683    0.9835      1200
           फ     0.8909    0.8575    0.8739      1200
           ब     0.9814    0.7917    0.8764      1200
           भ     0.9758    0.8383    0.9018      1200
           म     0.8121    0.8142    0.8132      1200
           य     0.5726    0.9133    0.7039      1200
           र     0.7635    0.7339    0.7484      1210
           ल     0.9239    0.8800    0.9014      1200
           ळ     0.8950    0.7533    0.8181      1200
           व     0.9597    0.7542    0.8446      1200
           श     0.8829    0.8667    0.8747      1200
           स     0.8449    0.8758    0.8601      1200
           ह     0.9604    0.8883    0.9229      1200

    accuracy                         0.9027     50099
   macro avg     0.9117    0.9039    0.9051     50099
weighted avg     0.9107    0.9027    0.9040     50099
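
The summary columns can be checked against the per-class rows. As a minimal sketch (the helper names `f1_score` and `weighted_average` are illustrative and not part of the model repository), F1 is the harmonic mean of precision and recall, and the weighted average weights each class by its support:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(values, supports):
    """Mean of per-class values, weighted by the number of samples per class."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


# Precision and recall taken from the first row of the report (class "अ").
print(round(f1_score(0.9881, 0.9911), 4))  # 0.9896
```

Overall accuracy equals the support-weighted average of per-class recall, which is why the `accuracy` row and the recall entry of the `weighted avg` row both read 0.9027.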

Label Space: 43 Classes

The model classifies a hand sign into one of the following 43 Marathi characters:

"id2label": {
  "0": "अ", "1": "आ", "2": "इ", "3": "ई", "4": "उ", "5": "ऊ",
  "6": "ए", "7": "ऐ", "8": "ओ", "9": "औ", "10": "क", "11": "क्ष",
  "12": "ख", "13": "ग", "14": "घ", "15": "च", "16": "छ", "17": "ज",
  "18": "ज्ञ", "19": "झ", "20": "ट", "21": "ठ", "22": "ड", "23": "ढ",
  "24": "ण", "25": "त", "26": "थ", "27": "द", "28": "ध", "29": "न",
  "30": "प", "31": "फ", "32": "ब", "33": "भ", "34": "म", "35": "य",
  "36": "र", "37": "ल", "38": "ळ", "39": "व", "40": "श", "41": "स", "42": "ह"
}
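
Predicted class indices are decoded through this mapping. The following is a minimal sketch that rebuilds the mapping from an ordered list of the 43 characters and derives the inverse lookup; the names `LABELS`, `label2id`, and `decode` are illustrative assumptions rather than identifiers from the model repository.

```python
# The 43 characters in index order, matching the id2label block above.
LABELS = (
    "अ आ इ ई उ ऊ ए ऐ ओ औ क क्ष ख ग घ च छ ज ज्ञ झ ट ठ ड ढ "
    "ण त थ द ध न प फ ब भ म य र ल ळ व श स ह"
).split()

# String keys mirror the JSON config format.
id2label = {str(i): ch for i, ch in enumerate(LABELS)}

# Inverse mapping, useful when preparing labelled data.
label2id = {ch: int(i) for i, ch in id2label.items()}


def decode(class_index: int) -> str:
    """Return the Devanagari character for a predicted class index."""
    return id2label[str(class_index)]


print(len(id2label), decode(11), label2id["ह"])
```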

Install Dependencies

pip install -q transformers torch pillow gradio

Inference Code

import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/Marathi-Sign-Language-Detection"  # Hugging Face Hub ID, or a local checkpoint path
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Marathi label mapping
id2label = {
    "0": "अ", "1": "आ", "2": "इ", "3": "ई", "4": "उ", "5": "ऊ",
    "6": "ए", "7": "ऐ", "8": "ओ", "9": "औ", "10": "क", "11": "क्ष",
    "12": "ख", "13": "ग", "14": "घ", "15": "च", "16": "छ", "17": "ज",
    "18": "ज्ञ", "19": "झ", "20": "ट", "21": "ठ", "22": "ड", "23": "ढ",
    "24": "ण", "25": "त", "26": "थ", "27": "द", "28": "ध", "29": "न",
    "30": "प", "31": "फ", "32": "ब", "33": "भ", "34": "म", "35": "य",
    "36": "र", "37": "ल", "38": "ळ", "39": "व", "40": "श", "41": "स", "42": "ह"
}

def classify_marathi_sign(image):
    """Return a {character: probability} dict for a NumPy image array."""
    if image is None:  # no image was uploaded
        return None
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Inference only, so skip gradient tracking
    with torch.no_grad():
        logits = model(**inputs).logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map each class index to its Devanagari character
    return {id2label[str(i)]: round(p, 3) for i, p in enumerate(probs)}

# Gradio Interface
iface = gr.Interface(
    fn=classify_marathi_sign,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=5, label="Marathi Sign Classification"),
    title="Marathi-Sign-Language-Detection",
    description="Upload an image of a Marathi sign language hand gesture to identify the corresponding character."
)

if __name__ == "__main__":
    iface.launch()
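
When classify_marathi_sign is called outside Gradio, the full 43-entry dictionary it returns is often more than a caller needs. Below is a minimal sketch of a post-processing helper; the name `top_k` and the `min_prob` cutoff are assumptions made for illustration and are not part of the code above.

```python
def top_k(prediction: dict, k: int = 5, min_prob: float = 0.0) -> list:
    """Return up to k (label, probability) pairs, highest probability first."""
    ranked = sorted(prediction.items(), key=lambda item: item[1], reverse=True)
    return [(label, p) for label, p in ranked[:k] if p >= min_prob]


# Hypothetical classifier output, truncated to four labels for brevity.
example = {"अ": 0.912, "आ": 0.051, "इ": 0.030, "ई": 0.007}
print(top_k(example, k=2))  # [('अ', 0.912), ('आ', 0.051)]
```

Setting `min_prob` lets a caller drop low-confidence guesses instead of always returning k labels.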

Intended Use

Marathi-Sign-Language-Detection can be applied in:

  • Educational platforms for learning regional sign language.
  • Assistive communication tools for Marathi-speaking users with hearing impairments.
  • Interactive applications that translate signs into text.
  • Research and data collection for sign language development and recognition.
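
For the sign-to-text use case, per-frame predictions generally need filtering and de-duplication before they read as text. The following is only a sketch under stated assumptions: the input is a hypothetical list of (label, confidence) pairs, one per video frame, and the 0.8 threshold is an arbitrary illustrative value, not a tuned setting.

```python
def frames_to_text(frames, min_conf=0.8):
    """Join confident per-frame labels, collapsing consecutive repeats."""
    chars = []
    previous = None
    for label, conf in frames:
        if conf < min_conf:
            previous = None  # a low-confidence frame acts as a separator
            continue
        if label != previous:
            chars.append(label)
        previous = label
    return "".join(chars)


# Hypothetical per-frame predictions; the third frame is below the threshold.
frames = [("क", 0.95), ("क", 0.93), ("म", 0.40), ("म", 0.91), ("ल", 0.88)]
print(frames_to_text(frames))  # कमल
```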

Model size: 92.9M params (Safetensors, tensor type F32)