# Model Card for gemma-2-2B-it-thinking-function_calling-V0
This model is a fine-tuned version of google/gemma-2-2b-it, specifically trained for function calling with an added "Thinking Layer". The model was trained using TRL and incorporates an explicit thinking process before making function calls.
## 🎯 Key Features
- Function Calling: Generation of structured function calls
- Thinking Layer: Explicit reasoning process before execution
- Supported Functions:
  - `convert_currency`: Currency conversion
  - `calculate_distance`: Distance calculation between locations
## 🚀 Quick Start
### Function Calling Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "Sellid/gemma-2-2B-it-thinking-function_calling-V0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example for currency conversion
prompt = """<bos><start_of_turn>human
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags.
Here are the available tools:<tools>[{
"type": "function",
"function": {
"name": "convert_currency",
"description": "Convert from one currency to another",
"parameters": {
"type": "object",
"properties": {
"amount": {"type": "number", "description": "The amount to convert"},
"from_currency": {"type": "string", "description": "The currency to convert from"},
"to_currency": {"type": "string", "description": "The currency to convert to"}
},
"required": ["amount", "from_currency", "to_currency"]
}
}
}]</tools>
Hi, I need to convert 500 USD to Euros. Can you help me with that?<end_of_turn><eos>
<start_of_turn>model"""

# Generate response
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0]))
```
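The generated continuation should contain a `<think>` block followed by a `<tool_call>` block (see the architecture section below). A minimal sketch for extracting the call from the decoded text, assuming the model emits exactly one well-formed JSON object inside `<tool_call>` tags:

```python
import json
import re

# Decode only the newly generated tokens (everything after the prompt).
generated = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)

# Extract the JSON payload between <tool_call> tags (assumed format).
match = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", generated, re.DOTALL)
if match:
    call = json.loads(match.group(1))
    print(call["name"], call["arguments"])
```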
## 🤖 Model Architecture
The model uses a special prompt structure with three main components:
- Tools Definition:

  ```
  <tools>
  [Function signatures in JSON format]
  </tools>
  ```

- Thinking Layer:

  ```
  <think>
  [Explicit thinking process of the model]
  </think>
  ```

- Function Call:

  ```
  <tool_call>
  {
    "name": "function_name",
    "arguments": {
      "param1": "value1",
      ...
    }
  }
  </tool_call>
  ```
### Thinking Layer Process
The Thinking Layer executes the following steps (an example turn is shown after this list):
- Analysis of user request
- Selection of appropriate function
- Validation of parameters
- Generation of function call
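For the currency request from the Quick Start example, a complete model turn could look roughly like the following. This is illustrative only; the exact wording of the thinking block and the formatting of the call will vary between generations:

```
<think>The user wants to convert 500 USD to EUR. The convert_currency function
matches this request, and all required parameters (amount, from_currency,
to_currency) can be filled from the message.</think>
<tool_call>
{"name": "convert_currency", "arguments": {"amount": 500, "from_currency": "USD", "to_currency": "EUR"}}
</tool_call>
```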
## 📊 Performance & Limitations
- Memory Requirements: ~4GB RAM
- Inference Time: ~1-2 seconds/request
- Supported Platforms (a device-selection sketch follows this list):
  - CPU
  - NVIDIA GPUs (CUDA)
  - Apple Silicon (MPS)
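A minimal sketch for picking the best available backend at load time, assuming a PyTorch build with CUDA and/or MPS support:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Prefer CUDA, then Apple Silicon (MPS), then fall back to CPU.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

model_name = "Sellid/gemma-2-2B-it-thinking-function_calling-V0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

# Inputs must be moved to the same device before calling generate():
# inputs = tokenizer(prompt, return_tensors="pt").to(device)
```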
### Limitations
- Limited to pre-trained functions
- No function call chaining
- No dynamic function extension
## 🔧 Training Details
The model was trained using SFT (Supervised Fine-Tuning) with TRL; a minimal sketch of such a setup follows the version list below.
### Framework Versions
- TRL: 0.15.1
- Transformers: 4.49.0
- Pytorch: 2.7.0.dev20250222
- Datasets: 3.3.2
- Tokenizers: 0.21.0
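For reference, a minimal sketch of what an SFT run with TRL's `SFTTrainer` can look like. The dataset name, hyperparameters, and sequence length below are illustrative placeholders, not the exact configuration used to train this checkpoint:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: substitute the actual function-calling dataset
# (conversations containing <think> and <tool_call> turns).
dataset = load_dataset("your-function-calling-thinking-dataset", split="train")

base_model = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Illustrative hyperparameters only.
training_args = SFTConfig(
    output_dir="gemma-2-2B-it-thinking-function_calling-V0",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    num_train_epochs=1,
    max_seq_length=1024,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```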
## 📚 Citations
If you use this model, please cite TRL:
```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```
And this model:
```bibtex
@misc{gemma-function-calling-thinking,
    title        = {Gemma Function-Calling with Thinking Layer},
    author       = {Sellid},
    year         = 2024,
    publisher    = {Hugging Face Model Hub},
    howpublished = {\url{https://huggingface.co/Sellid/gemma-2-2B-it-thinking-function_calling-V0}}
}
```