SentenceTransformer

This is a sentence-transformers model trained on the cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation and geo_70k_multiplets_natural_language_annotation datasets. It maps sentences and paragraphs to a dense vector space (1024-dimensional in the usage example below) and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Full Model Architecture

SentenceTransformer(
  (0): MMContextEncoder(
    (text_encoder): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(28996, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
    (text_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=768, out_features=512, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=512, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (omics_adapter): AdapterModule(
      (net): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): ReLU(inplace=True)
        (2): Linear(in_features=512, out_features=2048, bias=True)
        (3): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
)
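
For orientation, here is a minimal PyTorch sketch of what the AdapterModule blocks above could look like, reconstructed purely from the printed architecture. The class and constructor names, and the plain feed-forward pass, are assumptions rather than the actual mmcontext implementation.

import torch
import torch.nn as nn

class AdapterModule(nn.Module):
    # Projection head matching the printed layout:
    # Linear -> ReLU -> Linear -> BatchNorm1d.
    def __init__(self, in_features: int, hidden_dim: int = 512, out_features: int = 2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_features),
            nn.BatchNorm1d(out_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# text_adapter projects the 768-dim BERT output; omics_adapter projects 512-dim omics features.
text_adapter = AdapterModule(in_features=768)
omics_adapter = AdapterModule(in_features=512)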

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jo-mengr/mmcontext-100k-natural_language_annotation-geneformer-2024-text-unfrozen")
# Run inference
sentences = [
    'Endothelial cell of lymphatic vessel derived from fresh fimbria tissue sample of a 65-year old female.',
    'Neuron cell type from a 29-year-old human, specifically from the thalamic complex, specifically the thalamus (THM) - posterior nuclear complex of thalamus (PoN) - medial geniculate nuclei (MG).',
    'Plasma cells derived from lung parenchyma tissue of a female individual in her eighth decade, with a 24-hour delay between sample collection and processing.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Triplet (cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation)

Metric            Value
cosine_accuracy   0.9543

Triplet (geo_70k_multiplets_natural_language_annotation)

Metric            Value
cosine_accuracy   0.949
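
Here, cosine_accuracy is the standard triplet metric from Sentence Transformers: the fraction of (anchor, positive, negative) triplets for which the anchor embedding is closer, by cosine similarity, to the positive than to the negative. A hedged sketch of how such a score is computed with TripletEvaluator (the strings are placeholders, not the actual evaluation data):

from sentence_transformers.evaluation import TripletEvaluator

# Placeholder triplets; the real evaluation uses the multiplet datasets described below.
evaluator = TripletEvaluator(
    anchors=["Endothelial cell of lymphatic vessel ..."],
    positives=["Lymphatic endothelial cell, fimbria tissue ..."],
    negatives=["Plasma cell, lung parenchyma ..."],
)
print(evaluator(model))  # e.g. {'cosine_accuracy': ...}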

Training Details

Training Datasets

cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation

geo_70k_multiplets_natural_language_annotation

Evaluation Datasets

cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation

geo_70k_multiplets_natural_language_annotation
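
The datasets are referenced by name only. Assuming they are hosted on the Hugging Face Hub under the same namespace as the model (an assumption, not stated in this card), they could be loaded like this:

from datasets import load_dataset

# Hub paths are assumptions inferred from the dataset names and the model's namespace.
cellxgene = load_dataset("jo-mengr/cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation")
geo = load_dataset("jo-mengr/geo_70k_multiplets_natural_language_annotation")
print(cellxgene)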

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • num_train_epochs: 16
  • warmup_ratio: 0.1
  • fp16: True
  • dataloader_num_workers: 1
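
As a sketch, these non-default values map directly onto SentenceTransformerTrainingArguments from Sentence Transformers v3+ (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/mmcontext",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=2e-5,
    num_train_epochs=16,
    warmup_ratio=0.1,
    fp16=True,
    dataloader_num_workers=1,
)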

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 16
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 1
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Column key: "cellxgene_35k loss" abbreviates cellxgene_pseudo_bulk_35k_multiplets_natural_language_annotation loss, and "geo_70k loss" abbreviates geo_70k_multiplets_natural_language_annotation loss. A "-" in the Training Loss column means no training loss was logged at that step.

Epoch   Step   Training Loss   cellxgene_35k loss   geo_70k loss   cosine_accuracy
0.1351 100 - 16.5681 15.3425 0.5510
0.2703 200 15.2121 16.3962 14.5975 0.6669
0.4054 300 - 15.1565 13.5315 0.7754
0.5405 400 13.4551 12.2976 11.6012 0.8340
0.6757 500 - 10.1066 8.5850 0.8704
0.8108 600 8.9059 7.8946 6.7269 0.8931
0.9459 700 - 6.1265 5.8313 0.9036
1.0811 800 5.8557 5.3230 5.3629 0.9107
1.2162 900 - 4.7961 5.0623 0.9209
1.3514 1000 4.8756 4.6028 4.7280 0.9279
1.4865 1100 - 4.6467 4.4183 0.9373
1.6216 1200 4.3719 4.7835 4.1918 0.9440
1.7568 1300 - 4.4550 4.0311 0.9476
1.8919 1400 4.0077 4.5942 3.8520 0.9497
2.0270 1500 - 4.0982 3.8556 0.9517
2.1622 1600 3.7523 4.3389 3.7847 0.9554
2.2973 1700 - 4.1296 3.8354 0.9521
2.4324 1800 3.7573 4.3382 3.7801 0.9553
2.5676 1900 - 4.1184 3.8465 0.9521
2.7027 2000 3.7301 4.2711 3.7977 0.9540
2.8378 2100 - 4.0863 3.8529 0.9516
2.9730 2200 3.7111 4.1145 3.8415 0.9517
3.1081 2300 - 4.2684 3.8076 0.9536
3.2432 2400 3.7155 3.8739 3.9858 0.9476
3.3784 2500 - 4.5718 3.7554 0.9556
3.5135 2600 3.7532 4.7481 3.7515 0.9573
3.6486 2700 - 4.3598 3.7741 0.9544
3.7838 2800 3.7255 4.2423 3.8044 0.9544
3.9189 2900 - 4.1150 3.8462 0.9517
4.0541 3000 3.7 4.2966 3.7923 0.9553
4.1892 3100 - 4.1954 3.8200 0.9524
4.3243 3200 3.7556 4.3824 3.7742 0.9556
4.4595 3300 - 4.5560 3.7541 0.9560
4.5946 3400 3.7283 3.9065 3.9552 0.9487
4.7297 3500 - 3.8415 4.0087 0.9481
4.8649 3600 3.741 4.4399 3.7655 0.9557
5.0 3700 - 4.5457 3.7542 0.9561
5.1351 3800 3.6978 3.9224 3.9533 0.9487
5.2703 3900 - 4.3493 3.7846 0.9554
5.4054 4000 3.7399 4.3480 3.7832 0.9549
5.5405 4100 - 3.9356 3.9337 0.9500
5.6757 4200 3.7406 4.3089 3.7905 0.9546
5.8108 4300 - 4.4414 3.7711 0.9550
5.9459 4400 3.7161 4.0804 3.8547 0.9521
6.0811 4500 - 3.9827 3.9103 0.9509
6.2162 4600 3.7038 3.8720 3.9825 0.9486
6.3514 4700 - 3.9803 3.9070 0.9503
6.4865 4800 3.7522 4.2410 3.8043 0.9551
6.6216 4900 - 4.5504 3.7628 0.9557
6.7568 5000 3.7252 4.3341 3.7837 0.9550
6.8919 5100 - 4.5281 3.7531 0.9560
7.0270 5200 3.6791 4.0975 3.8550 0.9517
7.1622 5300 - 4.3336 3.7814 0.9553
7.2973 5400 3.7546 4.1190 3.8355 0.9523
7.4324 5500 - 4.3390 3.7763 0.9554
7.5676 5600 3.725 4.1069 3.8476 0.9516
7.7027 5700 - 4.2602 3.7962 0.9546
7.8378 5800 3.7309 4.0831 3.8483 0.9517
7.9730 5900 - 4.1081 3.8386 0.9519
8.1081 6000 3.7056 4.2598 3.8045 0.9534
8.2432 6100 - 3.8669 3.9848 0.9479
8.3784 6200 3.7322 4.5549 3.7529 0.9559
8.5135 6300 - 4.7403 3.7472 0.9576
8.6486 6400 3.7317 4.3473 3.7718 0.9547
8.7838 6500 - 4.2320 3.7998 0.9546
8.9189 6600 3.7208 4.1063 3.8423 0.9519
9.0541 6700 - 4.2851 3.7893 0.9547
9.1892 6800 3.6945 4.1825 3.8167 0.9526
9.3243 6900 - 4.3738 3.7702 0.9560
9.4595 7000 3.7437 4.5468 3.7502 0.9560
9.5946 7100 - 3.8960 3.9519 0.9489
9.7297 7200 3.7285 3.8328 4.0028 0.9474
9.8649 7300 - 4.4250 3.7606 0.9557
10.0 7400 3.6724 4.5225 3.7482 0.9563
10.1351 7500 - 3.9094 3.9493 0.9486
10.2703 7600 3.7461 4.3360 3.7803 0.9550
10.4054 7700 - 4.3358 3.7772 0.9553
10.5405 7800 3.7407 3.9274 3.9251 0.9499
10.6757 7900 - 4.2977 3.7844 0.9543
10.8108 8000 3.728 4.4351 3.7666 0.9551
10.9459 8100 - 4.0689 3.8480 0.9521
11.0811 8200 3.6982 3.9707 3.9039 0.9509
11.2162 8300 - 3.8588 3.9769 0.9481
11.3514 8400 3.7318 3.9676 3.9023 0.9503
11.4865 8500 - 4.2258 3.7993 0.9549
11.6216 8600 3.7316 4.5318 3.7566 0.9559
11.7568 8700 - 4.3155 3.7782 0.9544
11.8919 8800 3.7158 4.5147 3.7473 0.9559
12.0270 8900 - 4.0836 3.8483 0.9517
12.1622 9000 3.6941 4.3180 3.7766 0.9546
12.2973 9100 - 4.1086 3.8267 0.9530
12.4324 9200 3.7351 4.3192 3.7696 0.9550
12.5676 9300 - 4.0972 3.8375 0.9516
12.7027 9400 3.7224 4.2462 3.7891 0.9543
12.8378 9500 - 4.0651 3.8419 0.9514
12.9730 9600 3.7019 4.0886 3.8325 0.9514
13.1081 9700 - 4.2453 3.7956 0.9533
13.2432 9800 3.6979 3.8549 3.9746 0.9480
13.3784 9900 - 4.5402 3.7440 0.9556
13.5135 10000 3.7436 4.7189 3.7372 0.9571
13.6486 10100 - 4.3368 3.7617 0.9546
13.7838 10200 3.7129 4.2180 3.7909 0.9540
13.9189 10300 - 4.0913 3.8344 0.9509
14.0541 10400 3.6821 4.2673 3.7803 0.9543
14.1892 10500 - 4.1662 3.8081 0.9524
14.3243 10600 3.7336 4.3547 3.7615 0.9554
14.4595 10700 - 4.5219 3.7425 0.9560
14.5946 10800 3.7057 3.8819 3.9436 0.9484
14.7297 10900 - 3.8188 3.9952 0.9479
14.8649 11000 3.7205 4.4094 3.7525 0.9547
15.0 11100 - 4.5114 3.7421 0.9556
15.1351 11200 3.6753 3.8929 3.9439 0.9483
15.2703 11300 - 4.3207 3.7717 0.9543
15.4054 11400 3.7216 4.3187 3.7698 0.9551
15.5405 11500 - 3.9106 3.9202 0.9490

Framework Versions

  • Python: 3.10.10
  • Sentence Transformers: 3.5.0.dev0
  • Transformers: 4.43.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 0.33.0
  • Datasets: 2.14.4
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}