SentenceTransformer
This is a sentence-transformers model trained on the cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation and geo_70k_multiplets_natural_language_annotation datasets. It maps sentences and paragraphs to a 1024-dimensional dense vector space (see the usage example below) and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 512 tokens (the text encoder's position embedding size)
- Output Dimensionality: 1024 dimensions (see the usage example below)
- Similarity Function: Cosine Similarity
- Training Datasets: cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation and geo_70k_multiplets_natural_language_annotation
- Language: en
Model Sources
- Documentation: [Sentence Transformers Documentation](https://www.sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
```
SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(28996, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (text_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=768, out_features=512, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=512, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (omics_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=512, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
)
```
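The two AdapterModule blocks project both modalities into a shared space: the text adapter maps 768-dimensional BERT output to 2048 dimensions, and the omics adapter maps 512-dimensional omics embeddings to the same 2048-dimensional space. Below is a minimal PyTorch sketch that mirrors the printed adapter layout; the real class ships with the MMContextEncoder implementation, so the constructor signature here is an assumption, not the actual API.

```python
import torch
import torch.nn as nn

# Minimal sketch of the AdapterModule structure printed above. Only the layer
# layout is taken from the architecture listing; the constructor signature is
# an assumption for illustration.
class AdapterModule(nn.Module):
    def __init__(self, in_features: int, hidden_dim: int = 512, out_features: int = 2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_features),
            nn.BatchNorm1d(out_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

text_adapter = AdapterModule(768)   # 768-d BERT pooled output -> 2048-d shared space
omics_adapter = AdapterModule(512)  # 512-d omics embedding -> 2048-d shared space
print(text_adapter(torch.randn(4, 768)).shape)  # torch.Size([4, 2048])
```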
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jo-mengr/mmcontext-100k-natural_language_annotation-geneformer-2024-text-unfrozen")
# Run inference
sentences = [
    'Endothelial cell of lymphatic vessel derived from fresh fimbria tissue sample of a 65-year old female.',
    'Neuron cell type from a 29-year-old human, specifically from the thalamic complex, specifically the thalamus (THM) - posterior nuclear complex of thalamus (PoN) - medial geniculate nuclei (MG).',
    'Plasma cells derived from lung parenchyma tissue of a female individual in her eighth decade, with a 24-hour delay between sample collection and processing.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
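The same two calls also support simple semantic search: encode a query and a corpus, then rank the corpus by cosine similarity. A short sketch reusing the `model` loaded above; the query and corpus strings are illustrative, not taken from the training data:

```python
# Semantic search sketch: rank corpus entries against a query.
# The query and corpus texts below are made up for illustration.
query = "Lymphatic endothelial cells from reproductive tract tissue of an older female donor."
corpus = [
    "Endothelial cell of lymphatic vessel derived from fresh fimbria tissue sample of a 65-year old female.",
    "Plasma cells derived from lung parenchyma tissue of a female individual in her eighth decade.",
]
query_emb = model.encode([query])
corpus_emb = model.encode(corpus)
scores = model.similarity(query_emb, corpus_emb)  # shape [1, 2]
best = int(scores.argmax())
print(f"Best match ({scores[0, best].item():.3f}): {corpus[best]}")
```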
Evaluation
Metrics
Triplet
- Evaluated with TripletEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9543 |

Triplet
- Evaluated with TripletEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy | 0.949 |
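These accuracies measure how often the anchor embedding is closer to the positive than to a negative. They can be reproduced with sentence-transformers' TripletEvaluator; a sketch with placeholder triplets (the reported numbers come from the held-out splits described under Evaluation Datasets below):

```python
from sentence_transformers.evaluation import TripletEvaluator

# Placeholder triplets for illustration only; the actual evaluation used the
# held-out splits of the two training datasets.
evaluator = TripletEvaluator(
    anchors=["Endothelial cell of lymphatic vessel from fimbria tissue."],
    positives=["Lymphatic endothelial cell from a fresh fimbria sample."],
    negatives=["Plasma cells derived from lung parenchyma tissue."],
    name="example",
)
results = evaluator(model)  # `model` as loaded in the Usage section
print(results)  # includes a cosine accuracy key, e.g. 'example_cosine_accuracy'
```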
Training Details
Training Datasets
cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation
- Dataset: cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation at 3c6f498
- Size: 31,500 training samples
- Columns: anndata_ref, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:

| | anndata_ref | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | dict | string | string | dict |
| details | - | min: 53 characters, mean: 163.04 characters, max: 743 characters | min: 43 characters, mean: 169.26 characters, max: 829 characters | - |

- Loss: MultipleNegativesRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
geo_70k_multiplets_natural_language_annotation
- Dataset: geo_70k_multiplets_natural_language_annotation at 449eb79
- Size: 63,000 training samples
- Columns: anndata_ref, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:

| | anndata_ref | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | dict | string | string | dict |
| details | - | min: 21 characters, mean: 139.4 characters, max: 696 characters | min: 23 characters, mean: 142.09 characters, max: 705 characters | - |

- Loss: MultipleNegativesRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
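Both datasets use MultipleNegativesRankingLoss, which, in addition to the explicit negative columns, treats the other positives in each batch as negatives (making the batch size of 128 part of the effective training signal). A sketch of constructing the loss with the listed parameters, reusing `model` from the Usage section:

```python
from sentence_transformers import losses
from sentence_transformers.util import cos_sim

# MultipleNegativesRankingLoss with the parameters listed above.
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)
```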
Evaluation Datasets
cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation
- Dataset: cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation at 3c6f498
- Size: 3,500 evaluation samples
- Columns: anndata_ref, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:

| | anndata_ref | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | dict | string | string | dict |
| details | - | min: 51 characters, mean: 168.27 characters, max: 829 characters | min: 51 characters, mean: 167.36 characters, max: 963 characters | - |

- Loss: MultipleNegativesRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
geo_70k_multiplets_natural_language_annotation
- Dataset: geo_70k_multiplets_natural_language_annotation at 449eb79
- Size: 7,000 evaluation samples
- Columns: anndata_ref, positive, negative_1, and negative_2
- Approximate statistics based on the first 1000 samples:

| | anndata_ref | positive | negative_1 | negative_2 |
|---|---|---|---|---|
| type | dict | string | string | dict |
| details | - | min: 22 characters, mean: 138.7 characters, max: 702 characters | min: 22 characters, mean: 131.79 characters, max: 702 characters | - |

- Loss: MultipleNegativesRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- learning_rate: 2e-05
- num_train_epochs: 16
- warmup_ratio: 0.1
- fp16: True
- dataloader_num_workers: 1
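These map directly onto SentenceTransformerTrainingArguments; a sketch setting only the non-default values above (output_dir is a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Sketch of the non-default hyperparameters listed above.
args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=2e-5,
    num_train_epochs=16,
    warmup_ratio=0.1,
    fp16=True,
    dataloader_num_workers=1,
)
```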
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 16
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 1
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | cellxgene pseudo bulk 35k multiplets natural language annotation loss | geo 70k multiplets natural language annotation loss | cosine_accuracy |
---|---|---|---|---|---|
0.1351 | 100 | - | 16.5681 | 15.3425 | 0.5510 |
0.2703 | 200 | 15.2121 | 16.3962 | 14.5975 | 0.6669 |
0.4054 | 300 | - | 15.1565 | 13.5315 | 0.7754 |
0.5405 | 400 | 13.4551 | 12.2976 | 11.6012 | 0.8340 |
0.6757 | 500 | - | 10.1066 | 8.5850 | 0.8704 |
0.8108 | 600 | 8.9059 | 7.8946 | 6.7269 | 0.8931 |
0.9459 | 700 | - | 6.1265 | 5.8313 | 0.9036 |
1.0811 | 800 | 5.8557 | 5.3230 | 5.3629 | 0.9107 |
1.2162 | 900 | - | 4.7961 | 5.0623 | 0.9209 |
1.3514 | 1000 | 4.8756 | 4.6028 | 4.7280 | 0.9279 |
1.4865 | 1100 | - | 4.6467 | 4.4183 | 0.9373 |
1.6216 | 1200 | 4.3719 | 4.7835 | 4.1918 | 0.9440 |
1.7568 | 1300 | - | 4.4550 | 4.0311 | 0.9476 |
1.8919 | 1400 | 4.0077 | 4.5942 | 3.8520 | 0.9497 |
2.0270 | 1500 | - | 4.0982 | 3.8556 | 0.9517 |
2.1622 | 1600 | 3.7523 | 4.3389 | 3.7847 | 0.9554 |
2.2973 | 1700 | - | 4.1296 | 3.8354 | 0.9521 |
2.4324 | 1800 | 3.7573 | 4.3382 | 3.7801 | 0.9553 |
2.5676 | 1900 | - | 4.1184 | 3.8465 | 0.9521 |
2.7027 | 2000 | 3.7301 | 4.2711 | 3.7977 | 0.9540 |
2.8378 | 2100 | - | 4.0863 | 3.8529 | 0.9516 |
2.9730 | 2200 | 3.7111 | 4.1145 | 3.8415 | 0.9517 |
3.1081 | 2300 | - | 4.2684 | 3.8076 | 0.9536 |
3.2432 | 2400 | 3.7155 | 3.8739 | 3.9858 | 0.9476 |
3.3784 | 2500 | - | 4.5718 | 3.7554 | 0.9556 |
3.5135 | 2600 | 3.7532 | 4.7481 | 3.7515 | 0.9573 |
3.6486 | 2700 | - | 4.3598 | 3.7741 | 0.9544 |
3.7838 | 2800 | 3.7255 | 4.2423 | 3.8044 | 0.9544 |
3.9189 | 2900 | - | 4.1150 | 3.8462 | 0.9517 |
4.0541 | 3000 | 3.7 | 4.2966 | 3.7923 | 0.9553 |
4.1892 | 3100 | - | 4.1954 | 3.8200 | 0.9524 |
4.3243 | 3200 | 3.7556 | 4.3824 | 3.7742 | 0.9556 |
4.4595 | 3300 | - | 4.5560 | 3.7541 | 0.9560 |
4.5946 | 3400 | 3.7283 | 3.9065 | 3.9552 | 0.9487 |
4.7297 | 3500 | - | 3.8415 | 4.0087 | 0.9481 |
4.8649 | 3600 | 3.741 | 4.4399 | 3.7655 | 0.9557 |
5.0 | 3700 | - | 4.5457 | 3.7542 | 0.9561 |
5.1351 | 3800 | 3.6978 | 3.9224 | 3.9533 | 0.9487 |
5.2703 | 3900 | - | 4.3493 | 3.7846 | 0.9554 |
5.4054 | 4000 | 3.7399 | 4.3480 | 3.7832 | 0.9549 |
5.5405 | 4100 | - | 3.9356 | 3.9337 | 0.9500 |
5.6757 | 4200 | 3.7406 | 4.3089 | 3.7905 | 0.9546 |
5.8108 | 4300 | - | 4.4414 | 3.7711 | 0.9550 |
5.9459 | 4400 | 3.7161 | 4.0804 | 3.8547 | 0.9521 |
6.0811 | 4500 | - | 3.9827 | 3.9103 | 0.9509 |
6.2162 | 4600 | 3.7038 | 3.8720 | 3.9825 | 0.9486 |
6.3514 | 4700 | - | 3.9803 | 3.9070 | 0.9503 |
6.4865 | 4800 | 3.7522 | 4.2410 | 3.8043 | 0.9551 |
6.6216 | 4900 | - | 4.5504 | 3.7628 | 0.9557 |
6.7568 | 5000 | 3.7252 | 4.3341 | 3.7837 | 0.9550 |
6.8919 | 5100 | - | 4.5281 | 3.7531 | 0.9560 |
7.0270 | 5200 | 3.6791 | 4.0975 | 3.8550 | 0.9517 |
7.1622 | 5300 | - | 4.3336 | 3.7814 | 0.9553 |
7.2973 | 5400 | 3.7546 | 4.1190 | 3.8355 | 0.9523 |
7.4324 | 5500 | - | 4.3390 | 3.7763 | 0.9554 |
7.5676 | 5600 | 3.725 | 4.1069 | 3.8476 | 0.9516 |
7.7027 | 5700 | - | 4.2602 | 3.7962 | 0.9546 |
7.8378 | 5800 | 3.7309 | 4.0831 | 3.8483 | 0.9517 |
7.9730 | 5900 | - | 4.1081 | 3.8386 | 0.9519 |
8.1081 | 6000 | 3.7056 | 4.2598 | 3.8045 | 0.9534 |
8.2432 | 6100 | - | 3.8669 | 3.9848 | 0.9479 |
8.3784 | 6200 | 3.7322 | 4.5549 | 3.7529 | 0.9559 |
8.5135 | 6300 | - | 4.7403 | 3.7472 | 0.9576 |
8.6486 | 6400 | 3.7317 | 4.3473 | 3.7718 | 0.9547 |
8.7838 | 6500 | - | 4.2320 | 3.7998 | 0.9546 |
8.9189 | 6600 | 3.7208 | 4.1063 | 3.8423 | 0.9519 |
9.0541 | 6700 | - | 4.2851 | 3.7893 | 0.9547 |
9.1892 | 6800 | 3.6945 | 4.1825 | 3.8167 | 0.9526 |
9.3243 | 6900 | - | 4.3738 | 3.7702 | 0.9560 |
9.4595 | 7000 | 3.7437 | 4.5468 | 3.7502 | 0.9560 |
9.5946 | 7100 | - | 3.8960 | 3.9519 | 0.9489 |
9.7297 | 7200 | 3.7285 | 3.8328 | 4.0028 | 0.9474 |
9.8649 | 7300 | - | 4.4250 | 3.7606 | 0.9557 |
10.0 | 7400 | 3.6724 | 4.5225 | 3.7482 | 0.9563 |
10.1351 | 7500 | - | 3.9094 | 3.9493 | 0.9486 |
10.2703 | 7600 | 3.7461 | 4.3360 | 3.7803 | 0.9550 |
10.4054 | 7700 | - | 4.3358 | 3.7772 | 0.9553 |
10.5405 | 7800 | 3.7407 | 3.9274 | 3.9251 | 0.9499 |
10.6757 | 7900 | - | 4.2977 | 3.7844 | 0.9543 |
10.8108 | 8000 | 3.728 | 4.4351 | 3.7666 | 0.9551 |
10.9459 | 8100 | - | 4.0689 | 3.8480 | 0.9521 |
11.0811 | 8200 | 3.6982 | 3.9707 | 3.9039 | 0.9509 |
11.2162 | 8300 | - | 3.8588 | 3.9769 | 0.9481 |
11.3514 | 8400 | 3.7318 | 3.9676 | 3.9023 | 0.9503 |
11.4865 | 8500 | - | 4.2258 | 3.7993 | 0.9549 |
11.6216 | 8600 | 3.7316 | 4.5318 | 3.7566 | 0.9559 |
11.7568 | 8700 | - | 4.3155 | 3.7782 | 0.9544 |
11.8919 | 8800 | 3.7158 | 4.5147 | 3.7473 | 0.9559 |
12.0270 | 8900 | - | 4.0836 | 3.8483 | 0.9517 |
12.1622 | 9000 | 3.6941 | 4.3180 | 3.7766 | 0.9546 |
12.2973 | 9100 | - | 4.1086 | 3.8267 | 0.9530 |
12.4324 | 9200 | 3.7351 | 4.3192 | 3.7696 | 0.9550 |
12.5676 | 9300 | - | 4.0972 | 3.8375 | 0.9516 |
12.7027 | 9400 | 3.7224 | 4.2462 | 3.7891 | 0.9543 |
12.8378 | 9500 | - | 4.0651 | 3.8419 | 0.9514 |
12.9730 | 9600 | 3.7019 | 4.0886 | 3.8325 | 0.9514 |
13.1081 | 9700 | - | 4.2453 | 3.7956 | 0.9533 |
13.2432 | 9800 | 3.6979 | 3.8549 | 3.9746 | 0.9480 |
13.3784 | 9900 | - | 4.5402 | 3.7440 | 0.9556 |
13.5135 | 10000 | 3.7436 | 4.7189 | 3.7372 | 0.9571 |
13.6486 | 10100 | - | 4.3368 | 3.7617 | 0.9546 |
13.7838 | 10200 | 3.7129 | 4.2180 | 3.7909 | 0.9540 |
13.9189 | 10300 | - | 4.0913 | 3.8344 | 0.9509 |
14.0541 | 10400 | 3.6821 | 4.2673 | 3.7803 | 0.9543 |
14.1892 | 10500 | - | 4.1662 | 3.8081 | 0.9524 |
14.3243 | 10600 | 3.7336 | 4.3547 | 3.7615 | 0.9554 |
14.4595 | 10700 | - | 4.5219 | 3.7425 | 0.9560 |
14.5946 | 10800 | 3.7057 | 3.8819 | 3.9436 | 0.9484 |
14.7297 | 10900 | - | 3.8188 | 3.9952 | 0.9479 |
14.8649 | 11000 | 3.7205 | 4.4094 | 3.7525 | 0.9547 |
15.0 | 11100 | - | 4.5114 | 3.7421 | 0.9556 |
15.1351 | 11200 | 3.6753 | 3.8929 | 3.9439 | 0.9483 |
15.2703 | 11300 | - | 4.3207 | 3.7717 | 0.9543 |
15.4054 | 11400 | 3.7216 | 4.3187 | 3.7698 | 0.9551 |
15.5405 | 11500 | - | 3.9106 | 3.9202 | 0.9490 |
Framework Versions
- Python: 3.10.10
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.43.4
- PyTorch: 2.6.0+cu124
- Accelerate: 0.33.0
- Datasets: 2.14.4
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title = {Efficient Natural Language Response Suggestion for Smart Reply},
    author = {Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year = {2017},
    eprint = {1705.00652},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}
```