|
---
license: llama3.1
datasets:
- TheFinAI/Fino1_Reasoning_Path_FinQA
language:
- en
base_model:
- TheFinAI/Fino1-8B
tags:
- Llama
- conversational
- finance
---
|
# Fino1-8B Quantized Models |
|
|
|
This repository contains Q4_KM and Q5_KM quantized versions of [TheFinAI/Fino1-8B](https://huggingface.co/TheFinAI/Fino1-8B), a financial reasoning model based on Llama 3.1 8B Instruct. These quantized variants maintain the model's financial reasoning capabilities while providing significant memory and speed improvements. |
|
|
|
Discover our full range of quantized language models by visiting our [SandLogic Lexicon HuggingFace](https://huggingface.co/SandLogicTechnologies). To learn more about our company and services, check out our website at [SandLogic](https://www.sandlogic.com/). |
|
|
|
## Model Details |
|
|
|
### Base Information |
|
- **Original Model**: Fino1-8B
- **Quantized Versions**:
  - Q4_KM (4-bit quantization)
  - Q5_KM (5-bit quantization)
- **Base Architecture**: Llama 3.1 8B Instruct
- **Primary Focus**: Financial reasoning tasks
- **Paper**: [arxiv.org/abs/2502.08127](https://arxiv.org/abs/2502.08127)
|
|
|
|
|
## Financial Capabilities
|
|
|
Both quantized versions maintain the original model's strengths in:

- Financial mathematical reasoning
- Structured financial question answering
- FinQA dataset-based problems
- Step-by-step financial calculations
- Financial document analysis
|
### Quantization Benefits |
|
|
|
#### Q4_KM Version
- Model size: 4.92 GB (75% reduction)
- Optimal for resource-constrained environments
- Faster inference speed
- Suitable for rapid financial calculations

#### Q5_KM Version
- Model size: 5.73 GB (69% reduction)
- Better quality preservation
- Balanced performance-size trade-off
- Recommended for precision-critical financial applications
## Usage
|
```bash
pip install llama-cpp-python
```
|
Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support. |
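For instance, the llama-cpp-python installation docs describe enabling the CUDA backend by setting `CMAKE_ARGS` at install time (other backends such as Metal use different flags; see the documentation linked above):

```shell
# Reinstall llama-cpp-python with the CUDA backend enabled
# (requires a working CUDA toolkit on the host)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```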
|
|
|
|
|
```python
from llama_cpp import Llama

llm = Llama(
    model_path="model/path/",
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,       # Uncomment to increase the context window
)

# Example of a reasoning task
output = llm(
    """Q: A company's revenue grew from $100,000 to $150,000 in one year.
Calculate the percentage growth rate. A: """,
    max_tokens=256,
    stop=["Q:", "\n\n"],
    echo=False,
)

print(output["choices"][0]["text"])
```
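For reference, the arithmetic the model is expected to reproduce in this example can be checked directly:

```python
# Expected answer for the example prompt:
# growth rate = (new - old) / old * 100
old_revenue = 100_000
new_revenue = 150_000
growth_pct = (new_revenue - old_revenue) / old_revenue * 100
print(f"{growth_pct:.1f}%")  # → 50.0%
```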
|
|
|
## Training Details |
|
|
|
### Original Model Training |
|
- **Dataset**: TheFinAI/Fino1_Reasoning_Path_FinQA
- **Methods**: SFT (Supervised Fine-Tuning) and RF
- **Hardware**: 4x H100 GPUs
- **Configuration**:
  - Batch Size: 16
  - Learning Rate: 2e-5
  - Epochs: 3
  - Optimizer: AdamW
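As an illustration only, a comparable SFT configuration with the hyperparameters above might be sketched using Hugging Face `transformers` (the output path and per-device batch split here are hypothetical, not the authors' actual training script):

```python
# Hypothetical sketch of the reported hyperparameters via
# transformers.TrainingArguments; not the original training code.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="fino1-sft",         # hypothetical output path
    per_device_train_batch_size=4,  # 4 per GPU x 4 H100s = global batch of 16
    learning_rate=2e-5,
    num_train_epochs=3,
    optim="adamw_torch",            # AdamW optimizer
)
```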
|
|
|
|
|
|
|
|