Model Card for Qwen3-4b-it-bookMeta

Model Details

  • Model ID: nesemenpolkov/Qwen3-4b-it-bookMeta
  • Model Type: Causal Language Model
  • Library: Transformers
  • Framework: PyTorch

Model Description

The Qwen3-4b-it-bookMeta model extracts metadata from book descriptions. Given a text, it identifies the title, authors, publishers, year, page count, translators, illustrators, editors, and compilers, as well as opening-speech and literary-note credits (for example, "Предисловие", i.e. a foreword). The model was trained with Alpaca-style prompts.

Usage

Below is an example of how to use the Qwen3-4b-it-bookMeta model to extract metadata from a book description:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nesemenpolkov/Qwen3-4b-it-bookMeta"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

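# Prompt example (in Russian). English gloss: "There is a string from which the
# following attributes must be extracted: 1) title ... 11) page count. If an
# attribute is missing, answer "нет" ("no"). Extract the attributes from this
# string: {text}"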
prompt_template = """Есть строка, из которой нужно выделять атрибуты:

    1) название (title)
    2) авторы (authors)
    3) паблишеры (publishers)
    4) иллюстраторы (illustrators)
    5) переводчики (translators)
    6) редакторы (editors)
    7) составители (compilers)
    8) литературная запись (lit_note)
    9) вступительное слово (opening_speech)
    10) год (year)
    11) кол-во страниц (pages_cnt) 
    
    Если какой-либо атрибут отсутствует, то укажи "нет".

    Выдели атрибуты из этой строки:
    {text}
    """

# Example input strings
# text = "Mikolov T., Corrado G., Chen K., Dean J. Efficient Estimation of Word Representations in Vector Space. Proceedings of the International Conference on Learning Representations ICLR, 2013. P. 1-12."
text = "Арефьев Н., Панченко А., Лукании А., Лесота О., Романов П. Сравнение трех систем семантической близости для русского языка // ДИАЛОГ-2015 // графика Михеев В. // языковая адаптация Анисимов А. [Электронный ресурс] URL: https://www.dialog=21.ru/digests/dialog2015/materials/pdf/ArefyevNVetal.pdf (дата обращения: 10.08.2022)."
messages = [
    {"role": "user", "content": prompt_template.format(text=text)}
]
# Use a separate name so the raw input string in `text` is not overwritten
chat_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([chat_text], return_tensors="pt").to(model.device)

# Generate the response
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

# Decode the generated tokens
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")

print("content:", content)

# ->title: Сравнение трех систем семантической близости для русского языка
# ->authors: Арефьев Н., Панченко А., Лукании А., Лесота О., Романов П.
# ->publishers: ДИАЛОГ-2015
# ->illustrators: графика Михеев В.
# ->translators: языковая адаптация Анисимов А.
# ->editors: нет
# ->compilers: нет
# ->lit_note: нет
# ->opening_speech: нет
# ->year: 2015
# ->pages_cnt: нет
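
When several descriptions need processing, the message-building step from the example can be factored into a small helper. The sketch below is only an illustration: `build_conversations` is a hypothetical name, not part of the model or of Transformers, and it assumes the template contains a single `{text}` placeholder, as the prompt above does.

```python
from typing import Dict, List


def build_conversations(
    texts: List[str], prompt_template: str
) -> List[List[Dict[str, str]]]:
    """Build one single-turn chat conversation per input string.

    Hypothetical helper: it only fills the template and wraps the result
    in the same message structure used in the example above.
    """
    return [
        [{"role": "user", "content": prompt_template.format(text=text)}]
        for text in texts
    ]


if __name__ == "__main__":
    # A short stand-in template; in practice pass the full prompt_template.
    demo = build_conversations(["first book", "second book"], "Extract: {text}")
    for conversation in demo:
        print(conversation)
```

Each returned conversation can then be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the example.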

Input/Output

  • Input: A text string containing book metadata.
  • Output: Extracted metadata fields (title, authors, publishers, illustrators, translators, editors, compilers, literary note, opening speech, year, page count), one `field: value` line per attribute. Missing attributes are reported as "нет".
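
Because the output is plain text, downstream code usually needs it as a structured record. The following is a minimal sketch of one way to do that, assuming the model answers with one `field: value` line per attribute, as in the usage example. The names `parse_metadata` and `EXPECTED_FIELDS` are hypothetical and not part of the model's API; real outputs may deviate from this format.

```python
from typing import Dict, Optional

# Field names as they appear in the example output of the usage section.
EXPECTED_FIELDS = (
    "title", "authors", "publishers", "illustrators", "translators",
    "editors", "compilers", "lit_note", "opening_speech", "year", "pages_cnt",
)


def parse_metadata(
    content: str, missing_marker: str = "нет"
) -> Dict[str, Optional[str]]:
    """Parse 'field: value' lines into a dict; absent fields become None."""
    parsed: Dict[str, Optional[str]] = {name: None for name in EXPECTED_FIELDS}
    for line in content.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in parsed:
            continue  # skip lines that do not look like a known field
        value = value.strip()
        parsed[key] = None if value == missing_marker else value
    return parsed


if __name__ == "__main__":
    sample = "title: Some Book\nauthors: Doe J.\nyear: 2015\npages_cnt: нет"
    print(parse_metadata(sample))
```

Unknown or malformed lines are skipped rather than raising an error, which is a design choice for tolerance; stricter handling may be preferable in a curated pipeline.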

Limitations

  • The model may struggle with non-standard or incomplete book descriptions.
  • Accuracy may vary depending on the formatting and language of the input text.
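
Given these limitations, it can help to flag outputs for manual review. The sketch below applies a few simple heuristics to an already parsed record; the function name `find_suspicious_fields` and the specific checks (four-digit year, a digit in the page count, title present verbatim in the source) are assumptions chosen for illustration, not rules the model guarantees.

```python
import re
from typing import Dict, List, Optional


def find_suspicious_fields(
    metadata: Dict[str, Optional[str]], source_text: str
) -> List[str]:
    """Return names of fields whose values look unreliable (heuristic only)."""
    suspicious: List[str] = []

    # A year is expected to be exactly four digits.
    year = metadata.get("year")
    if year is not None and not re.fullmatch(r"\d{4}", year):
        suspicious.append("year")

    # A page count should contain at least one digit.
    pages = metadata.get("pages_cnt")
    if pages is not None and not re.search(r"\d", pages):
        suspicious.append("pages_cnt")

    # An extracted title should normally appear verbatim in the input.
    title = metadata.get("title")
    if title is not None and title not in source_text:
        suspicious.append("title")

    return suspicious


if __name__ == "__main__":
    source = "Doe J. Some Book. Publisher, 2015. 120 p."
    record = {"title": "Some Book", "year": "2015", "pages_cnt": "120"}
    print(find_suspicious_fields(record, source))
```

Fields set to `None` (reported as missing) are not flagged; whether a missing field is itself suspicious depends on the data set.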

Ethical Considerations

  • Ensure that the model is used responsibly and ethically, respecting privacy and copyright laws when processing book metadata.

Contact

For more information or support, please contact the model maintainers.

Model tree for nesemenpolkov/Qwen3-4b-it-bookMeta

  • Base model: Qwen/Qwen3-4B-Base
  • Fine-tuned from: Qwen/Qwen3-4B
  • Model size: 4.02B params (Safetensors)
  • Tensor type: F32