Model Card: Falcon3-Mamba-R1-v0

Model Details
Model Description:
This model is a fine-tuned version of Falcon3-Mamba-7B-Instruct, optimized to reason through problems in a structured, step-by-step way before generating a response.
It leverages the Mamba architecture, whose compute scales linearly with sequence length, making it an efficient and fast reasoning model while maintaining high response quality.
This release corresponds to an early checkpoint of the fine-tuning pipeline.
- Developed by: Hanzla Javaid
- Base Model: tiiuae/Falcon3-Mamba-7B-Instruct
- Model Type: Mamba-based causal decoder
- Model Release Date: March 2025
Intended Uses
Direct Use:
This model is designed for:
- Reasoning-heavy tasks (math, logic, and structured problem-solving)
- STEM-based question-answering
- General-purpose text generation
Downstream Use:
- Fine-tuning for domain-specific applications such as finance, law, medicine, and research.
- Integration into chatbots and virtual assistants that require advanced reasoning skills.
- Enhancement of automated coding assistants with structured logic building.
Out-of-Scope Use:
- Misinformation or deceptive applications
- Automated decision-making in high-risk fields (e.g., medical diagnosis without human oversight)
- Bias-sensitive applications where fairness is critical but not explicitly controlled
Bias and Limitations
Known Biases:
- The training data is predominantly English, so performance on multilingual tasks may be weaker.
- Fine-tuning may introduce or amplify biases present in the training data, especially in areas like ethics, politics, and cultural perspectives.
Technical Limitations:
- Performance may degrade on long-form generation beyond 64K tokens.
Recommendations:
- Users should verify outputs for accuracy, especially in critical applications.
- Regular bias evaluation should be conducted when deploying in production environments.
Getting Started
To use this model, you can load it with transformers:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_name = "hanzla/Falcon3-Mamba-R1-v0"

# Load the tokenizer and model, sharding across available devices in half precision
tokenizer = AutoTokenizer.from_pretrained(repo_name)
model = AutoModelForCausalLM.from_pretrained(
    repo_name,
    device_map="auto",
    torch_dtype=torch.float16,
)

def generate_text(prompt, generation_model, generation_tokenizer, max_tokens=1024):
    # Format the prompt with the model's chat template
    messages = [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": prompt},
    ]
    input_text = generation_tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    input_ids = generation_tokenizer(input_text, return_tensors="pt").input_ids.to(generation_model.device)
    outputs = generation_model.generate(input_ids, max_new_tokens=max_tokens)
    # Decode only the newly generated tokens, skipping the prompt
    generated_tokens = outputs[0][len(input_ids[0]):]
    return generation_tokenizer.decode(generated_tokens, skip_special_tokens=True)
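For example (the prompt below is purely illustrative):

response = generate_text(
    "A train travels 120 km in 1.5 hours. What is its average speed in km/h?",
    model,
    tokenizer,
)
print(response)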
Training Details
Training Procedure:
- Pretrained Base Model: Falcon3-Mamba-7B-Instruct
- Fine-tuning Data: A subset of STEM problems from open-thoughts/OpenThoughts-114k
- Training Strategy: GRPO (Group Relative Policy Optimization); see the sketch after this list
- Training Hyperparameters:
  - Batch Size: 4
  - Epochs: 3
  - Precision: Mixed (fp16 / bf16)
- Hardware: 2x H100 GPUs
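The exact training script is not published in this card. The snippet below is a minimal sketch of what a GRPO run could look like with TRL's GRPOTrainer; the reward function, dataset pre-processing, and output path are hypothetical placeholders, while the batch size, epochs, and precision mirror the hyperparameters listed above.

from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Hypothetical pre-processing: GRPOTrainer expects a "prompt" column, so the raw
# OpenThoughts-114k records would need to be mapped into that format (omitted here).
train_dataset = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

def reward_shows_reasoning(completions, **kwargs):
    # Placeholder reward: favor completions that contain explicit reasoning steps.
    # The actual reward used to train this model is not documented in the card.
    return [1.0 if "step" in completion.lower() else 0.0 for completion in completions]

training_args = GRPOConfig(
    output_dir="falcon3-mamba-grpo",   # hypothetical output path
    per_device_train_batch_size=4,     # batch size from the card
    num_train_epochs=3,                # epochs from the card
    bf16=True,                         # mixed precision, as stated above
)

trainer = GRPOTrainer(
    model="tiiuae/Falcon3-Mamba-7B-Instruct",  # base model from the card
    reward_funcs=reward_shows_reasoning,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()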
Evaluation
Testing Data and Metrics:
The fine-tuned model was evaluated on benchmarks that assess its reasoning ability and general capabilities. The table below compares it with the base model:
| Category | Benchmark | Falcon3-Mamba-R1-v0 | Base Falcon3-Mamba-7B-Instruct |
|---|---|---|---|
| General | MMLU (5-shot) | 72.1 | 65.3 |
| Math | GSM8K (5-shot) | 89.5 | 65.2 |
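The card does not state which evaluation harness produced these numbers. As a reproducibility sketch, assuming EleutherAI's lm-evaluation-harness and its standard mmlu and gsm8k tasks, a comparable 5-shot run could look like this:

import lm_eval

# 5-shot evaluation of the fine-tuned checkpoint; swap in the base model id
# to reproduce the second column of the table above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=hanzla/Falcon3-Mamba-R1-v0,dtype=float16",
    tasks=["mmlu", "gsm8k"],
    num_fewshot=5,
)
print(results["results"])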
Technical Specifications
Model Architecture:
- Mamba Blocks: 64
- Hidden Size: 4096
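These values can be read off the published configuration; a quick check, assuming the standard Falcon-Mamba config attribute names (num_hidden_layers, hidden_size):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("hanzla/Falcon3-Mamba-R1-v0")
print(config.num_hidden_layers)  # number of Mamba blocks (64 per the card)
print(config.hidden_size)        # hidden size (4096 per the card)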
Software Requirements:
- transformers >= 4.38
- torch >= 2.1
- accelerate >= 0.25
- mamba-ssm
- causal-conv1d >= 1.4.0