# Sequentially Fine-Tuned Language Model: jnjj/xd_v1
## Model Description
This repository hosts a language model that is sequentially fine-tuned using Low-Rank Adaptation (LoRA) on a diverse range of datasets from the Hugging Face Hub. The process starts from the base model `jnjj/multi-dataset-model` (or its last fine-tuned state from this repository) and adapts continuously by merging the LoRA weights into the model after each dataset training cycle. This experiment aims to build a model with broad, cumulatively acquired knowledge.
**Current Base for Fine-Tuning:** `jnjj/multi-dataset-model`
The fully merged model weights and tokenizer are updated periodically at the root of this repository.
## Training Methodology
- **Iterative Fine-Tuning:** The model undergoes cycles of training on different dataset configurations.
- **LoRA Integration:** PEFT's LoRA is employed for parameter-efficient fine-tuning. Adapters are merged into the base weights after each training cycle (see the sketch after this list).
- **Dynamic Dataset Source:** The script iterates through a wide array of Hugging Face Hub datasets.
- **Rapid Iteration Strategy:** Training per dataset configuration is brief (`max_steps=1`), prioritizing breadth of exposure over depth on any single dataset.
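One cycle of this workflow can be sketched with the standard `peft` + `transformers` APIs. This is a minimal illustration, not the repository's actual training script: the dataset choice, LoRA hyperparameters, and training arguments below are placeholder assumptions.

```python
# Minimal sketch of one fine-tune-then-merge cycle, assuming the standard
# peft/transformers workflow. The dataset, LoRA hyperparameters, and
# TrainingArguments are illustrative placeholders, not the actual script.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "jnjj/multi-dataset-model"  # or the last merged state of this repo
tokenizer = AutoTokenizer.from_pretrained(base_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach LoRA adapters: only the low-rank matrices are trained.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Stream one dataset configuration and tokenize on the fly.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train", streaming=True)
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
).filter(lambda ex: len(ex["input_ids"]) > 0)  # drop empty lines

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="lora_out", max_steps=1, per_device_train_batch_size=2
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # a single optimizer step per dataset configuration

# Fold the adapter weights back into the base model before the next cycle.
merged = model.merge_and_unload()
merged.save_pretrained("merged_model")
tokenizer.save_pretrained("merged_model")
```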
## Training Progress
- **Datasets Processed** (successfully trained on at least one config): 1
- **Text Examples Streamed** (total): 6
- **Tokens Processed** (total): 3072
- **Last Successful Model Update:** 2025-05-08 18:02:08 UTC
## Evaluation Snapshot (Approximate)
- **Current Perplexity** (wikitext subset): 282.70
- **Perplexity Change:** -0.51 ⬇️ (vs. previous cycle's perplexity; a sketch of the measurement follows this list)
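The perplexity snapshot can be approximated with a standard sliding-window evaluation. The sketch below is illustrative, assuming non-overlapping 512-token windows over a small `wikitext` test slice; the exact subset and window size used to produce the number above are not documented here.

```python
# Minimal perplexity sketch over a wikitext slice, assuming non-overlapping
# 512-token windows; the repository's exact evaluation subset is unknown.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jnjj/xd_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "\n\n".join(
    load_dataset("wikitext", "wikitext-2-raw-v1", split="test[:1%]")["text"]
)
input_ids = tokenizer(text, return_tensors="pt").input_ids

window = 512
total_nll, total_tokens = 0.0, 0
for start in range(0, input_ids.size(1) - 1, window):
    ids = input_ids[:, start : start + window]
    if ids.size(1) < 2:
        break  # a trailing window too short to score
    with torch.no_grad():
        # Labels equal inputs: the model shifts them internally for next-token loss.
        loss = model(ids, labels=ids).loss
    total_nll += loss.item() * ids.size(1)
    total_tokens += ids.size(1)

print(f"Perplexity: {math.exp(total_nll / total_tokens):.2f}")
```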
## Generated Examples (Qualitative Assessment)
| Category | Input Prompt Snippet | Generated Output Snippet |
|---|---|---|
| Story Continuation | Once upon a time, in a small villag... | How do I get the best picture of what we... |
| Simple Instruction | Explain in one sentence why trees a... | I have been trying to make progress and ... |
| Creative Prompt | Describe a friendly robot that love... | We are pleased to announce the launch of... |
| Question Answering (Basic) | What is the main color of a ripe ba... | As an example we've been using the same ... |
| Code Generation (Simple Python) | Write a Python function that takes ... | We are looking forward to seeing us in t... |
| Reasoning (Simple) | If a train leaves station A at 10:0... | The time of day we were trying to get ou... |
## Standard Benchmarks (via `lighteval`)

Note: Running standard benchmarks requires a dedicated setup using the `lighteval` harness. The table below shows scores if available in `evaluation_stats.json`, otherwise N/A.
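If `evaluation_stats.json` is present at the repository root, it can be fetched and inspected directly. A minimal sketch, assuming the file exists there (its schema is not documented in this card):

```python
# Minimal sketch: download and inspect evaluation_stats.json from the repo
# root, assuming it exists there; its exact schema is not documented here.
import json

from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="jnjj/xd_v1", filename="evaluation_stats.json")
with open(path) as f:
    stats = json.load(f)

print(json.dumps(stats, indent=2))  # inspect whichever benchmark fields exist
```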
### Common Benchmarks
| Category | Benchmark | # Shots | Metric | This Model (`xd_v1`) | Llama 3.1 70B (Ref) |
|---|---|---|---|---|---|
| Reasoning & Knowledge | MMLU (Avg) | 5 | acc_norm | N/A | 79.3 |
| Reasoning & Knowledge | MMLU-Pro | 5 | acc | N/A | 53.8 |
| Reasoning & Knowledge | MATH | 4 | acc | N/A | 41.6 |
| Reasoning & Knowledge | TruthfulQA (MC2) | 0 | mc2 | N/A | - |
| Reasoning & Knowledge | GPQA Diamond | 0 | acc | N/A | 50.5 |
| Code | MBPP | 3 | pass@1 | N/A | 66.4 |
| Code | LiveCodeBench | 0 | pass@1 | N/A | 33.3 |
| Multilingual | TydiQA | 1 | f1 | N/A | 29.9 |
| Multilingual | MGSM | 0 | acc | N/A | 91.1 |
## How to Use
Load the model and tokenizer via `transformers`:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jnjj/xd_v1"
# For local usage after downloading:
# model_id = "./model_files"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# model.to("cuda")  # uncomment if a GPU is available

prompt = "Explain the concept of photosynthesis in simple terms:"
inputs = tokenizer(prompt, return_tensors="pt")
# inputs = inputs.to("cuda")  # move inputs to the model's device if using a GPU

output_sequences = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

generated_text = tokenizer.decode(output_sequences[0], skip_special_tokens=True)
print(generated_text)
```
Limitations & Considerations
- This model is an experimental artifact of continuous learning; quality and coherence may vary.
- Biases present in the underlying datasets may be reflected or amplified.
- Performance on specific tasks is not guaranteed and may fluctuate as new datasets are incorporated.
- Intended for research and exploration of sequential fine-tuning dynamics. For rigorous benchmarking, consider using tools like `lighteval`.
## Disclaimer
This model is provided as-is. It may generate inaccurate, biased, or otherwise problematic content. Users should exercise discretion.