I also evaluated the model on a dataset of 20K YouTube videos. For each video, we extract the title, which serves as the model input, and the existing tags where available. For videos that already have tags, we compare the generated tags directly against the existing ones; otherwise, the generated tags are evaluated by human annotators. The results are available at: https://drive.google.com/drive/folders/1RvywNl41QYNa2lthp-O8hakVCMsfX456
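For videos that already carry tags, the direct comparison above can be scored as set overlap between the generated and existing tag lists. A minimal sketch of such a metric (my own illustration, assuming comma-separated tag strings; the actual evaluation script is not shown here):

```python
def tag_f1(generated: str, existing: str) -> float:
    """Set-based F1 between generated and reference tag lists."""
    gen = {t.strip().lower() for t in generated.split(",") if t.strip()}
    ref = {t.strip().lower() for t in existing.split(",") if t.strip()}
    overlap = len(gen & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```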
## How to use the model
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("banhabang/vit5-base-tag-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("banhabang/vit5-base-tag-generation")

# Encode a video title and generate tags
# (the input text and max_length below are placeholders; adjust for your data)
input_ids = tokenizer("your video title here", return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)

for output in outputs:
    tags = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(tags)
```
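The decoded output is a single string per generated sequence. Assuming the model emits its tags as a comma-separated list (an assumption on my part, not stated above), it can be split into individual tags:

```python
# Assumes comma-separated tags in the decoded string (hypothetical output format)
tag_list = [t.strip() for t in tags.split(",") if t.strip()]
```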
## Reference
[1] T. V. Bui, O. T. Tran, and P. Le-Hong. 2020. Improving Sequence Tagging for Vietnamese Text Using Transformer-based Neural Models. In Proceedings of PACLIC 2020. Code: https://github.com/fpt-corp/vELECTRA

[2] Dat Quoc Nguyen and Anh Tuan Nguyen. 2020. PhoBERT: Pre-trained language models for Vietnamese. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1037–1042.

[3] Long Phan, Hieu Tran, Hieu Nguyen, and Trieu H. Trinh. 2022. ViT5: Pretrained text-to-text transformer for Vietnamese language generation. arXiv preprint arXiv:2205.06457. Code: https://github.com/vietai/ViT5