Qwen2.5-1.5B-Auto-FunctionCaller

Model Details

  • Model Name: Qwen2.5-1.5B-Auto-FunctionCaller
  • Base Model: Qwen/Qwen2.5-1.5B
  • Model Type: Language Model fine-tuned for Function Calling.
  • Recommended Quantization: Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf
    • This GGUF file uses Q4_K_M quantization with an importance matrix. Based on the evaluation, it is recommended as offering the best balance between task performance and computational efficiency (inference speed, memory usage).
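
As a rough illustration of how the recommended file might be loaded, the sketch below uses the llama-cpp-python bindings. It assumes the GGUF file has already been downloaded into a local directory. The helper names `model_path` and `generate`, the context size, and the plain-text prompt are illustrative assumptions; the prompt format used during fine-tuning is not documented in this card.

```python
from pathlib import Path

# File name as given in this model card.
MODEL_FILE = "Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf"


def model_path(model_dir: str) -> Path:
    """Return the expected location of the GGUF file inside model_dir."""
    return Path(model_dir) / MODEL_FILE


def generate(model_dir: str, prompt: str, max_tokens: int = 64) -> str:
    """Run one greedy completion with llama-cpp-python (assumed to be installed)."""
    # Imported lazily so the path helper works without the dependency.
    from llama_cpp import Llama

    llm = Llama(model_path=str(model_path(model_dir)), n_ctx=2048, verbose=False)
    # temperature=0.0 keeps decoding deterministic, which suits structured output.
    out = llm(prompt, max_tokens=max_tokens, temperature=0.0)
    return out["choices"][0]["text"]
```

Calling `generate("models", "Set the driver seat heating to level 2")` would return the raw model text; how that text is structured depends on the fine-tuning format described in the associated publication.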

Intended Use

  • Primary Use: Extraction of function calls from natural-language queries in an automotive context. The model is designed to identify user intent and extract the relevant parameters (arguments/slots) for triggering vehicle functions or infotainment actions.
  • Research Context: This model was specifically developed and fine-tuned as part of a research publication investigating the feasibility and performance of Small Language Models (SLMs) for function-calling tasks in resource-constrained automotive environments.
  • Target Environment: Embedded systems or edge devices within vehicles where computational resources may be limited.
  • Out-of-Scope Uses: General conversational AI, creative writing, tasks outside automotive function calling, safety-critical vehicle control.
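
To make the intent-plus-parameters idea concrete, here is a minimal sketch of turning a model response into a structured call. It assumes, purely for illustration, that the model emits a JSON object with `name` and `arguments` keys; the actual output format is not specified in this card, and `FunctionCall`, `parse_function_call`, and the example function name are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FunctionCall:
    """A parsed call: the intent (function name) plus its extracted slots."""
    name: str
    arguments: dict = field(default_factory=dict)


def parse_function_call(raw: str) -> Optional[FunctionCall]:
    """Parse a JSON-encoded call; return None if the text is not a valid call.

    The {"name": ..., "arguments": {...}} layout is an assumed format.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not isinstance(data.get("name"), str):
        return None
    arguments = data.get("arguments", {})
    if not isinstance(arguments, dict):
        return None
    return FunctionCall(name=data["name"], arguments=arguments)


# Hypothetical response for the query "Set the driver side to 21 degrees".
call = parse_function_call(
    '{"name": "set_temperature", "arguments": {"zone": "driver", "celsius": 21}}'
)
```

Returning `None` for malformed output lets the caller fall back to a clarification prompt instead of dispatching a partially parsed request.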

Performance Metrics

The following metrics were measured on the Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf model:

  • Evaluation Setup:
    • Total Evaluation Samples: 2074
  • Performance:
    • Exact Match Accuracy: 0.8414
    • Average Component Accuracy: 0.9352
  • Efficiency & Confidence:
    • Throughput: 10.31 tokens/second
    • Latency (Per Token): 0.097 seconds
    • Latency (Per Instruction): 0.427 seconds
    • Average Model Confidence: 0.9005
    • Calibration Error: 0.0854
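
This card does not define how the accuracy and calibration figures were computed, so the sketch below shows one plausible set of definitions rather than the evaluation code itself. It assumes that exact match requires the full call to be identical, that component accuracy is the fraction of the function name and expected arguments predicted correctly, and that calibration error is the expected calibration error over equal-width confidence bins. All function names and sample data are illustrative.

```python
def exact_match(pred: dict, gold: dict) -> bool:
    """True only when the function name and all arguments match exactly."""
    return pred == gold


def component_accuracy(pred: dict, gold: dict) -> float:
    """Fraction of components (name plus each expected argument) that match."""
    total = 1 + len(gold["arguments"])
    correct = int(pred.get("name") == gold["name"])
    pred_args = pred.get("arguments", {})
    for key, value in gold["arguments"].items():
        if key in pred_args and pred_args[key] == value:
            correct += 1
    return correct / total


def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Two illustrative samples: one perfect, one with a wrong argument value.
gold = {"name": "set_temperature", "arguments": {"zone": "driver", "celsius": 21}}
preds = [gold, {"name": "set_temperature", "arguments": {"zone": "driver", "celsius": 22}}]

em_accuracy = sum(exact_match(p, gold) for p in preds) / len(preds)
avg_component = sum(component_accuracy(p, gold) for p in preds) / len(preds)
```

Under these definitions a prediction with one wrong argument counts as a complete miss for exact match but still earns partial credit on component accuracy, which is consistent with the component figure being the higher of the two.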

Note: Latency and throughput figures are hardware-dependent and should be benchmarked on the target deployment environment.
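
A minimal sketch of such a benchmark is shown below. It assumes a caller-supplied `generate` function that runs one query and returns the number of tokens produced; that function, `benchmark`, and the result keys are hypothetical names, not part of any published tooling. Because everything is timed in a single loop, per-token latency is simply the reciprocal of throughput, which matches the reported figures (1 / 10.31 ≈ 0.097 seconds).

```python
import time
from typing import Callable, Dict, Iterable


def benchmark(generate: Callable[[str], int], queries: Iterable[str]) -> Dict[str, float]:
    """Time generate() over all queries and derive throughput and latency."""
    total_tokens = 0
    instructions = 0
    start = time.perf_counter()
    for query in queries:
        total_tokens += generate(query)  # generate returns the token count
        instructions += 1
    elapsed = time.perf_counter() - start
    if instructions == 0 or total_tokens == 0 or elapsed <= 0:
        raise ValueError("benchmark needs at least one query that produces tokens")
    return {
        "instructions": instructions,
        "tokens": total_tokens,
        "throughput_tokens_per_s": total_tokens / elapsed,
        "latency_per_token_s": elapsed / total_tokens,
        "latency_per_instruction_s": elapsed / instructions,
    }


def fake_generate(query: str) -> int:
    """Stand-in for a real model call; pretends to emit five tokens."""
    time.sleep(0.001)
    return 5


results = benchmark(fake_generate, ["open the sunroof", "play some jazz"])
```

On target hardware, `fake_generate` would be replaced by a wrapper around the actual inference call, ideally after a warm-up run so that model loading time is not counted.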

Limitations

  • Domain Specificity: Performance is optimized for automotive function calling. Generalization to other domains or complex, non-structured conversations may be limited.
  • Quantization Impact: The Q4_K_M_I quantization reduces memory usage and speeds up inference but may slightly reduce accuracy compared to higher-precision versions (e.g., FP16).
  • Complex Queries: May struggle with highly nested, ambiguous, or unusually phrased requests not well-represented in the fine-tuning data.
  • Safety Criticality: This model is not intended or validated for safety-critical vehicle operations (e.g., braking, steering). Use should be restricted to non-critical systems like infotainment and comfort controls.
  • Bias: Like any model, performance and fairness depend on the underlying data. Biases present in the fine-tuning or evaluation datasets may be reflected in the model's behavior.
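
One way to enforce the restriction to non-critical systems is to check every predicted call against an explicit allowlist before it is dispatched. The sketch below illustrates that pattern; the function names in `ALLOWED_FUNCTIONS` and the helper `is_dispatchable` are hypothetical examples, not an interface defined by this model.

```python
# Hypothetical comfort and infotainment functions that are safe to trigger.
ALLOWED_FUNCTIONS = frozenset({
    "set_temperature",
    "set_seat_heating",
    "play_media",
    "set_volume",
})


def is_dispatchable(function_name: str, allowed=ALLOWED_FUNCTIONS) -> bool:
    """Return True only for functions explicitly listed as non-critical.

    Anything not on the allowlist is rejected by default, including
    unknown names the model may hallucinate.
    """
    return function_name in allowed


for name in ("set_volume", "apply_brakes", "unknown_function"):
    action = "dispatch" if is_dispatchable(name) else "reject"
    print(f"{name}: {action}")
```

A default-deny allowlist is used here instead of a blocklist so that a misprediction can never reach a function nobody thought to exclude.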

Training Data (Summary)

The model was fine-tuned on a synthetic dataset specifically curated for automotive function calling tasks. Details will be referenced in the associated publication.

Citation

TBD

GGUF File Details

  • Model Size: 1.54B params
  • Architecture: qwen2