Llama-3.1-8B-Instruct-LegalCite-gguf
Built with LLaMA - Fine-tuned for legal citation extraction from documents like EU regulations.
This model specializes in quoting short legal passages in response to user questions, grounded directly in the input text. It was fine-tuned on legislative corpora to enable compliance-minded, transparent responses.
Technical Notes
Finetuning was performed using a 4-bit MLX version of the original model, enabling efficient experimentation on Apple Silicon hardware.
On the evaluation set, we observed a significant reduction in loss and perplexity:
- Test loss: ~2.4 → ~1.1
- Test perplexity: ~11.1 → ~2.9
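Perplexity is the exponential of the cross-entropy loss, so the two pairs of figures above are consistent with each other. A quick sanity check (values are approximate):

```python
import math

# Perplexity = exp(cross-entropy loss), so the reported pairs line up:
print(round(math.exp(2.4), 1))  # 11.0 -- before fine-tuning (~11.1 reported)
print(round(math.exp(1.1), 1))  # 3.0  -- after fine-tuning (~2.9 reported)
```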
The model was then converted and exported to the GGUF format.
- This is a 4-bit GGUF version
- Transformers-compatible model: https://huggingface.co/seasparks/Llama-3.1-8B-Instruct-LegalCite
How to use
With llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama
model_path = hf_hub_download(
    repo_id="seasparks/Llama-3.1-8B-Instruct-LegalCite-gguf",
    filename="Llama-3.1-8B-Instruct-LegalCite-q4_k.gguf",
)
llm = Llama(model_path=model_path, n_ctx=2048)  # context window large enough for the example passage
sys_message = {"role": "system", "content": "You are an expert assistant answering questions about texts only by accurately citing and providing direct quotes from the text."}
example_text = """(3)
AI systems can be easily deployed in a large variety of sectors of the economy and many parts of society, including
across borders, and can easily circulate throughout the Union. Certain Member States have already explored the
adoption of national rules to ensure that AI is trustworthy and safe and is developed and used in accordance with
fundamental rights obligations. Diverging national rules may lead to the fragmentation of the internal market and
may decrease legal certainty for operators that develop, import or use AI systems. A consistent and high level of
protection throughout the Union should therefore be ensured in order to achieve trustworthy AI, while divergences
hampering the free circulation, innovation, deployment and the uptake of AI systems and related products and
services within the internal market should be prevented by laying down uniform obligations for operators and
guaranteeing the uniform protection of overriding reasons of public interest and of rights of persons throughout the
internal market on the basis of Article 114 of the Treaty on the Functioning of the European Union (TFEU). To the
extent that this Regulation contains specific rules on the protection of individuals with regard to the processing of
personal data concerning restrictions of the use of AI systems for remote biometric identification for the purpose of
law enforcement, of the use of AI systems for risk assessments of natural persons for the purpose of law
enforcement and of the use of AI systems of biometric categorisation for the purpose of law enforcement, it is
appropriate to base this Regulation, in so far as those specific rules are concerned, on Article 16 TFEU. In light of
those specific rules and the recourse to Article 16 TFEU, it is appropriate to consult the European Data Protection
Board."""
question = "How will trustworthy AI be achieved?"
messages = [
    sys_message,
    {"role": "user", "content": example_text + "\nQuestion: " + question},
]
response = llm.create_chat_completion(
    messages=messages,
    max_tokens=100,
)
# Print the assistant's reply
print(response['choices'][0]['message']['content'])
# Example output (actual model behavior may vary):
#
# Q: "How will trustworthy AI be achieved?"
# A: "...A consistent and high level of protection throughout the Union should therefore be ensured in order to achieve trustworthy AI, while divergences hampering the free circulation, innovation, deployment and the uptake of AI systems and related products and services within the internal market should be prevented..."
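Because the model is meant to answer only with verbatim quotes, a simple post-processing check can verify that a reply actually appears in the source passage. This is a hypothetical helper, not part of the model or any library:

```python
# Hypothetical grounding check: is the model's answer a verbatim quote
# from the source passage?

def normalize(text: str) -> str:
    """Collapse all whitespace so line breaks in the source don't break matching."""
    return " ".join(text.split())

def is_grounded(answer: str, source: str) -> bool:
    """True if the answer (surrounding quotes/ellipses trimmed) appears verbatim in the source."""
    quote = normalize(answer).strip(' ".')
    return bool(quote) and quote in normalize(source)
```

In the example above, `is_grounded(response['choices'][0]['message']['content'], example_text)` should return True whenever the model quotes faithfully.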
With Ollama installed locally
# Create Ollama local model with the Modelfile from the repository
ollama create Llama-3.1-8B-Instruct-LegalCite-q4_k -f Modelfile
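Use the Modelfile shipped in the repository, which also configures the Llama 3.1 chat template; purely for orientation, the general shape of a Modelfile pointing at a local GGUF file is:

```
# Illustrative only - the repository's Modelfile is the one to use,
# since it additionally sets up the chat template.
FROM ./Llama-3.1-8B-Instruct-LegalCite-q4_k.gguf
PARAMETER temperature 0.2
```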
from langchain_ollama import ChatOllama
llm = ChatOllama(
    model="Llama-3.1-8B-Instruct-LegalCite-q4_k",
    num_predict=100,
)
sys_message = {"role": "system", "content": "You are an expert assistant answering questions about texts only by accurately citing and providing direct quotes from the text."}
example_text = """(3)
AI systems can be easily deployed in a large variety of sectors of the economy and many parts of society, including
across borders, and can easily circulate throughout the Union. Certain Member States have already explored the
adoption of national rules to ensure that AI is trustworthy and safe and is developed and used in accordance with
fundamental rights obligations. Diverging national rules may lead to the fragmentation of the internal market and
may decrease legal certainty for operators that develop, import or use AI systems. A consistent and high level of
protection throughout the Union should therefore be ensured in order to achieve trustworthy AI, while divergences
hampering the free circulation, innovation, deployment and the uptake of AI systems and related products and
services within the internal market should be prevented by laying down uniform obligations for operators and
guaranteeing the uniform protection of overriding reasons of public interest and of rights of persons throughout the
internal market on the basis of Article 114 of the Treaty on the Functioning of the European Union (TFEU). To the
extent that this Regulation contains specific rules on the protection of individuals with regard to the processing of
personal data concerning restrictions of the use of AI systems for remote biometric identification for the purpose of
law enforcement, of the use of AI systems for risk assessments of natural persons for the purpose of law
enforcement and of the use of AI systems of biometric categorisation for the purpose of law enforcement, it is
appropriate to base this Regulation, in so far as those specific rules are concerned, on Article 16 TFEU. In light of
those specific rules and the recourse to Article 16 TFEU, it is appropriate to consult the European Data Protection
Board."""
question = "How will trustworthy AI be achieved?"
messages = [
    sys_message,
    {"role": "user", "content": example_text + "\nQuestion: " + question},
]
print(llm.invoke(messages).content)
# Example output (actual model behavior may vary):
#
# Q: "How will trustworthy AI be achieved?"
# A: "...A consistent and high level of protection throughout the Union should therefore be ensured in order to achieve trustworthy AI, while divergences hampering the free circulation, innovation, deployment and the uptake of AI systems and related products and services within the internal market should be prevented..."
To some extent, you can steer how the model answers (e.g. how long or how precise the quote is) using standard generation parameters such as temperature, top_p, and top_k.
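Temperature rescales the logits before softmax, so low values concentrate probability on the top token, which favors exact, verbatim quoting. A minimal illustrative sketch (not the model's internals):

```python
import math

# Temperature rescales logits before softmax; lower temperature sharpens
# the distribution toward the top token.
def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
# Low temperature puts far more mass on the top token than high temperature:
print(softmax_with_temperature(logits, 0.2)[0] > softmax_with_temperature(logits, 1.5)[0])  # True
```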
Intended Use
This model is intended for exploration and research. It was fine-tuned to explore how LLMs can support accurate question answering with direct quotes, which can be particularly helpful for pulling precise references from complex policy texts such as the GDPR or the EU AI Act.
At this stage, it's an experimental prototype: for testing, demoing, or simply seeing how far citation-style prompting can go.
- It's not production-ready, and should only be used in experimental and research contexts.
- It does not provide legal advice, and its outputs should always be verified by a human in professional settings.
We're curious where this can go, and happy to hear from you with feedback and ideas for improvement.
License & Attribution
- This model is a derivative of meta-llama/Meta-Llama-3.1-8B-Instruct
- Licensed under the LLaMA 3.1 Community License
- Redistributed with required attribution: Built with LLaMA
Interested in using this model or adapting it for your use case? Contact us.