Does this support text-embeddings-inference?
#5 · opened by Copycats
I'm having trouble getting the mixedbread-ai/mxbai-rerank-large-v2 model to work with text-embeddings-inference. It seems like it's not supported. Here's the Docker command I'm using for testing (on a T4 GPU):
revision=main
volume=$PWD/data
image=ghcr.io/huggingface/text-embeddings-inference:turing-1.3
model=mixedbread-ai/mxbai-rerank-large-v2
docker run -d --restart always --gpus all -p 8001:80 -v $volume:/data --pull always $image --model-id $model --revision $revision --auto-truncate --pooling mean
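For context, if the model loaded successfully, the plan was to query TEI's `/rerank` route on the mapped port. A sketch of such a request (the query and texts are placeholder values):

```shell
# Hypothetical request against TEI's /rerank endpoint, assuming the
# container above were running and mapped to port 8001 as shown.
curl -s http://localhost:8001/rerank \
  -H 'Content-Type: application/json' \
  -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is not pizza.", "Deep learning is a subfield of machine learning."]}'
```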
When I run this command, the container crashes with the following error:
2025-03-20T07:48:48.720904Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "mix*******-**/*****-******-****e-v2", revision: Some("main"), tokenization_workers: None, dtype: None, pooling: Some(Mean), max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: true, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "f921468cccc9", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2025-03-20T07:48:48.721260Z INFO hf_hub: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2025-03-20T07:48:50.049790Z INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:108: Downloading `config_sentence_transformers.json`
2025-03-20T07:48:50.114051Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-03-20T07:48:50.114227Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:22: Downloading `config.json`
2025-03-20T07:48:50.114316Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:25: Downloading `tokenizer.json`
2025-03-20T07:48:50.114733Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:52: Downloading `model.safetensors`
2025-03-20T07:48:50.115193Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:39: Model artifacts downloaded in 1.144562ms
thread 'main' panicked at /usr/src/router/src/lib.rs:143:62:
tokenizer.json not found. text-embeddings-inference only supports fast tokenizers: Error("data did not match any variant of untagged enum ModelWrapper", line: 757444, column: 1)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
I'm wondering if TEI doesn't support this model yet.
Yes, TEI does not support this model yet. Please open an issue with them and ask for support to be added. Unfortunately, we can't do anything about that on our side.
aamirshakir changed discussion status to closed
https://github.com/huggingface/text-embeddings-inference/issues/532
I opened an issue on the TEI repository and received a response stating that the model is not compatible with TEI and that I should use TGI instead.
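For anyone landing here later, a minimal sketch of the suggested TGI alternative (the image tag and host port are assumptions; check the TGI docs for current options). Note that TGI serves the model through its text-generation API rather than TEI's `/rerank` route, so client code would need to score documents via prompting:

```shell
model=mixedbread-ai/mxbai-rerank-large-v2
volume=$PWD/data

# Serve the reranker as a causal LM with text-generation-inference.
docker run -d --gpus all -p 8002:80 -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id $model
```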