This is a BF16 conversion of the original FP8 weights, provided so that the model can be quantized to GGUF.
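To make the FP8-to-BF16 step more concrete, here is a minimal sketch of block-wise dequantization in plain Python. It assumes the FP8 checkpoint stores one scale factor per square tile of each weight matrix, which this card does not state. The function name `dequantize_blockwise`, the tile size, and the toy values are hypothetical and chosen only for illustration. A real conversion would work on tensors and finish by casting the result to BF16.

```python
def dequantize_blockwise(weights, scales, block_size):
    """Multiply each weight by the scale of the tile it falls in.

    weights:    2-D list of floats (values already decoded from FP8)
    scales:     2-D list with one scale per block_size x block_size tile
    block_size: tile edge length (hypothetical; set by the checkpoint format)
    """
    out = []
    for i, row in enumerate(weights):
        out_row = []
        for j, w in enumerate(row):
            # Integer division maps an element index to its tile index.
            scale = scales[i // block_size][j // block_size]
            out_row.append(w * scale)
        out.append(out_row)
    return out


# Toy 4x4 matrix split into four 2x2 tiles, each with its own scale.
toy_weights = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
]
toy_scales = [
    [0.5, 2.0],
    [1.0, 10.0],
]
dequantized = dequantize_blockwise(toy_weights, toy_scales, block_size=2)
for row in dequantized:
    print(row)
```

The higher-precision copy matters because GGUF quantizers typically start from 16-bit or 32-bit weights rather than from FP8.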

DeepSeek-R1T-Chimera

Model merge of DeepSeek-R1 and DeepSeek-V3 (0324)

An open-weights model that combines the intelligence of R1 with the token efficiency of V3.

Model Details

Architecture: DeepSeek-MoE Transformer-based language model
Combination Method: Merged model weights from DeepSeek-R1 and DeepSeek-V3 (0324)
Release Date: 2025-04-27

Safetensors

Model size: 684B params
Tensor types: F32, BF16
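As a rough guide to what the parameter count means for storage, the sketch below estimates the size of the weights from the 684B figure above. It assumes every parameter is stored in a single dtype. The checkpoint mixes BF16 and F32 tensors and this card does not give the split, so the BF16 number is only a lower-bound approximation. The helper name `weight_bytes` is hypothetical.

```python
PARAMS = 684_000_000_000  # 684B parameters, as listed on this card

# Bytes per parameter for the two tensor types listed on this card.
BYTES_PER_PARAM = {"BF16": 2, "F32": 4}


def weight_bytes(n_params, dtype):
    """Approximate storage for n_params parameters held in one dtype."""
    return n_params * BYTES_PER_PARAM[dtype]


bf16_bytes = weight_bytes(PARAMS, "BF16")
print(f"BF16 weights: about {bf16_bytes / 1e12:.2f} TB")  # about 1.37 TB
```

Plan for at least this much disk space before starting a GGUF conversion, plus room for the quantized output.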
