gabrielloiseau/LUAR-MUD-sentence-transformers

@@ -15,37 +15,13 @@ language:
 All credits go to [(Rivera-Soto et al. 2021)](https://aclanthology.org/2021.emnlp-main.70/)
-<!--
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-distilroberta-base-v1](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1). It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
-## Model Details
-### Model Description
-- **Model Type:** Sentence Transformer
-- **Base model:** [sentence-transformers/paraphrase-distilroberta-base-v1](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1) <!-- at revision 0520e7529d15c250345a95871495ea016ca93754 -->
-<!--- **Maximum Sequence Length:** 128 tokens
-- **Output Dimensionality:** 512 tokens
-- **Similarity Function:** Cosine Similarity
-<!-- - **Training Dataset:** Unknown -->
-<!-- - **Language:** Unknown -->
-<!-- - **License:** Unknown -->
-<!--
-### Model Sources
-- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
-- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
-- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-### Full Model Architecture
-```
-SentenceTransformer(
-  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel
-  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
-  (2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
-)
-```
 ## Usage
@@ -61,7 +37,6 @@ Then you can load this model and run inference.
 ```python
 from sentence_transformers import SentenceTransformer
-# Download from the 🤗 Hub
 model = SentenceTransformer("gabrielloiseau/LUAR-MUD-sentence-transformers")
 # Run inference
 sentences = [
@@ -72,78 +47,23 @@ sentences = [
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 # [3, 512]
-# Get the similarity scores for the embeddings
-similarities = model.similarity(embeddings, embeddings)
-print(similarities.shape)
-# [3, 3]
 ```
-### Direct Usage (Transformers)
-<details><summary>Click to see the direct usage in Transformers</summary>
-</details>
--->
-<!--
-### Downstream Usage (Sentence Transformers)
-You can finetune this model on your own dataset.
-<details><summary>Click to expand</summary>
-</details>
--->
-<!--
-### Out-of-Scope Use
-*List how the model may foreseeably be misused and address what users ought not to do with the model.*
--->
-<!--
-## Bias, Risks and Limitations
-*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
--->
-<!--
-### Recommendations
-*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
--->
-<!--
-## Training Details
-### Framework Versions
-- Python: 3.12.7
-- Sentence Transformers: 3.1.1
-- Transformers: 4.40.1
-- PyTorch: 2.4.1+cu121
-- Accelerate:
-- Datasets: 3.0.1
-- Tokenizers: 0.19.1
 ## Citation
-### BibTeX
-<!--
-## Glossary
-*Clearly define terms in order to be accessible across audiences.*
--->
-<!--
-## Model Card Authors
-*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
--->
-<!--
-## Model Card Contact
-*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
--->

 All credits go to [(Rivera-Soto et al. 2021)](https://aclanthology.org/2021.emnlp-main.70/)
+---
+Author Style Representations using [LUAR](https://aclanthology.org/2021.emnlp-main.70.pdf).
+The LUAR training and evaluation repository can be found [here](https://github.com/llnl/luar).
+This model was trained on the Reddit Million User Dataset (MUD) found [here](https://aclanthology.org/2021.naacl-main.415.pdf).
 ## Usage
 ```python
 from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("gabrielloiseau/LUAR-MUD-sentence-transformers")
 # Run inference
 sentences = [
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 # [3, 512]
 ```
 ## Citation
+If you find this model helpful, feel free to cite:
+```
+@inproceedings{uar-emnlp2021,
+  author    = {Rafael A. Rivera Soto and Olivia Miano and Juanita Ordonez and Barry Chen and Aleem Khan and Marcus Bishop and Nicholas Andrews},
+  title     = {Learning Universal Authorship Representations},
+  booktitle = {EMNLP},
+  year      = {2021},
+}
+```
+## License
+LUAR is distributed under the terms of the Apache License (Version 2.0).
+All new contributions must be made under the Apache-2.0 license.
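
The Usage snippet stops at `model.encode(sentences)`, and the lines removed in this change named cosine similarity as the comparison function. As a minimal sketch of comparing author-style embeddings, the example below computes pairwise cosine similarity with NumPy. The helper name `cosine_similarity_matrix` is hypothetical, and the random array is only a stand-in with the same `(3, 512)` shape as the real embeddings, so the example runs without downloading the model.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between the rows of a 2-D array."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    normalized = embeddings / np.clip(norms, 1e-12, None)
    return normalized @ normalized.T


# Assumption: stand-in for `model.encode(sentences)`, which the card
# reports as producing 3 vectors of dimension 512.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 512))

similarities = cosine_similarity_matrix(embeddings)
print(similarities.shape)
# (3, 3)
```

With real embeddings, replace the stand-in array with the output of `model.encode(sentences)`; higher off-diagonal values would then suggest more similar writing style between the corresponding inputs.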