Commit `bff4577` (parent `7a34ec2`): upd README (README.md)
## Description

*bert2bert* model, initialized with the `DeepPavlov/rubert-base-cased` pretrained weights and
fine-tuned on the first 90% of the ["Rossiya Segodnya" news dataset](https://github.com/RossiyaSegodnya/ria_news_dataset) for 1.6 epochs.
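A *bert2bert* setup wires two BERT stacks into an encoder-decoder, with cross-attention added to the decoder. As a minimal sketch of how such a model is assembled in `transformers` (config-only toy sizes so it builds without downloading weights; the actual model was warm-started from `DeepPavlov/rubert-base-cased` instead):

```python
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Toy dimensions so the skeleton builds quickly; the real model uses
# bert-base dimensions taken from DeepPavlov/rubert-base-cased.
enc_cfg = BertConfig(hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=2, intermediate_size=128)
dec_cfg = BertConfig(hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=2, intermediate_size=128,
                     is_decoder=True, add_cross_attention=True)

config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=config)  # randomly initialized skeleton
```

To warm-start from pretrained weights as done here, `EncoderDecoderModel.from_encoder_decoder_pretrained("DeepPavlov/rubert-base-cased", "DeepPavlov/rubert-base-cased")` builds the same structure with both halves initialized from the checkpoint.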
## Usage example

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "dmitry-vorobiev/rubert_ria_headlines"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

text = "Скопируйте текст статьи / новости"  # "Paste the article / news text here"

encoded_batch = tokenizer.prepare_seq2seq_batch(
    [text],
    return_tensors="pt",
    padding="max_length",
    truncation=True,
    max_length=512)

output_ids = model.generate(
    input_ids=encoded_batch["input_ids"],
    max_length=32,
    no_repeat_ngram_size=3,
    num_beams=5,
    top_k=0
)

headline = tokenizer.decode(output_ids[0],
                            skip_special_tokens=True,
                            clean_up_tokenization_spaces=False)
print(headline)
```
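Note that `prepare_seq2seq_batch` was deprecated and later removed from `transformers`; on recent versions, calling the tokenizer directly builds the same batch (a sketch assuming a transformers 4.x install, not part of the original card):

```python
from transformers import AutoTokenizer

MODEL_NAME = "dmitry-vorobiev/rubert_ria_headlines"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Same arguments as prepare_seq2seq_batch, passed to the tokenizer itself.
encoded_batch = tokenizer(
    ["Скопируйте текст статьи / новости"],  # "paste the article text here"
    return_tensors="pt",
    padding="max_length",
    truncation=True,
    max_length=512,
)
```

The resulting `encoded_batch["input_ids"]` feeds into `model.generate` exactly as in the example above.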

## Datasets

- [ria_news](https://github.com/RossiyaSegodnya/ria_news_dataset)
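The "first 90% of the dataset" split mentioned above can be sketched as a plain leading-slice split (a hypothetical helper, not the author's code; field names and file layout of ria_news are not assumed here):

```python
def first_90_percent(records):
    """Take the leading 90% of records for training, the rest for eval."""
    split = int(len(records) * 0.9)
    return records[:split], records[split:]

# Toy stand-in for the parsed ria_news records:
train, val = first_90_percent(list(range(100)))
# train holds the first 90 items, val the remaining 10
```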

## How was it trained?

Short answer: it's a mess :D

1. [0.4 ep](https://www.kaggle.com/dvorobiev/train-seq2seq?scriptVersionId=52758945)
2. [0.8 ep](https://www.kaggle.com/dvorobiev/train-seq2seq?scriptVersionId=52794838)
3. [1.2 ep](https://www.kaggle.com/dvorobiev/train-seq2seq?scriptVersionId=52838778)
4. [1.6 ep](https://www.kaggle.com/dvorobiev/train-seq2seq?scriptVersionId=52876230)

Common train params:
```shell
python nlp_headline_rus/src/train_seq2seq.py \
    --do_train \
    --fp16 \
    --tie_encoder_decoder \
    --max_source_length 512 \
    --max_target_length 32 \
    --val_max_target_length 48 \
    --per_device_train_batch_size 14 \
    --gradient_accumulation_steps 4 \
    --warmup_steps 2000 \
    --learning_rate 3e-4 \
    --adam_epsilon 1e-6 \
    --weight_decay 1e-5
```
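For orientation, the batch-size flags above combine into an effective training batch per optimizer step (simple arithmetic inferred from the flags; the single-GPU assumption matches typical Kaggle runs but is not stated in the card):

```python
# Effective examples per optimizer step implied by the flags above.
per_device_train_batch_size = 14
gradient_accumulation_steps = 4
n_gpus = 1  # assumption: a single-GPU Kaggle session

effective_batch = per_device_train_batch_size * gradient_accumulation_steps * n_gpus
print(effective_batch)  # 56
```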