---
license: other
license_name: fair-noncommercial-research-license
license_link: https://huggingface.co/facebook/blt/blob/main/LICENSE
extra_gated_fields:
  First Name: text
  Last Name: text
  Date of birth: date_picker
  Country: country
  Affiliation: text
  I accept the terms and conditions: checkbox
  geo: ip_location
language:
- en
tags:
- facebook
- meta-pytorch
- blt
---
# Byte Latent Transformer (BLT)
This repository contains the model weights for our paper: "Byte Latent Transformer: Patches Scale Better Than Tokens"
- [Paper Link](https://dl.fbaipublicfiles.com/blt/BLT__Patches_Scale_Better_Than_Tokens.pdf)
- [HF Paper Link](https://huggingface.co/papers/2412.09871)
## Abstract
We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that,
for the first time, matches tokenization-based LLM performance at scale, with significant improvements
in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve
as the primary units of computation. Patches are segmented dynamically based on the entropy of the
next byte, allocating more compute and model capacity where there is more data complexity. The BLT
architecture includes new attention mechanisms to maximize the information flow between byte and
patch hidden representations and a new type of byte-sequence memory. We present the first scaling
study of byte-level models up to 8B parameters and 8T training bytes, showing for the first time
that we can train a model end-to-end at scale from bytes with no tokenization or other preprocessing.
Scaling trends reveal training and inference efficiency benefits from dynamically selecting very long
patches on average, along with qualitative improvements in reasoning and long-tail generalization
from modeling byte sequences.
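
To make the entropy-based patching concrete, here is a minimal illustrative sketch, not the repository's implementation: given next-byte probability distributions from a small byte-level entropy model, a new patch starts whenever the next-byte entropy crosses a global threshold. The function name, threshold value, and input shapes here are assumptions for illustration only.

```python
import torch

def entropy_patch_boundaries(next_byte_probs: torch.Tensor, threshold: float) -> list[int]:
    """Return the start indices of patches: a new patch begins whenever the
    entropy of the next-byte distribution exceeds `threshold`.

    next_byte_probs: (seq_len, 256) tensor of next-byte probability
    distributions, e.g. produced by a small byte-level entropy model.
    """
    # Shannon entropy (in nats) of each position's next-byte distribution.
    entropy = -(next_byte_probs * next_byte_probs.clamp_min(1e-12).log()).sum(dim=-1)
    boundaries = [0]  # the first byte always opens a patch
    for i in range(1, next_byte_probs.size(0)):
        if entropy[i].item() > threshold:
            boundaries.append(i)
    return boundaries

# Toy usage: random distributions stand in for a real entropy model's output.
probs = torch.softmax(torch.randn(32, 256), dim=-1)
print(entropy_patch_boundaries(probs, threshold=4.0))
```

Low-entropy (predictable) stretches of bytes are absorbed into long patches, while high-entropy positions open new patches, which is how compute is concentrated where the data is complex.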
To run the model, see the README here: https://github.com/facebookresearch/blt
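
For convenience, a minimal sketch of fetching the checkpoint files with the standard `huggingface_hub` client; the repo id comes from the links below, and the actual run instructions live in the GitHub README:

```python
from huggingface_hub import snapshot_download

# Download the BLT 1B weights to the local Hugging Face cache
# and return the path to the downloaded snapshot.
local_dir = snapshot_download(repo_id="facebook/blt-1b")
print(local_dir)
```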
## Links
- Code: https://github.com/facebookresearch/blt
- BLT 1B Weights: https://huggingface.co/facebook/blt-1b
- BLT 7B Weights: https://huggingface.co/facebook/blt-7b
- BLT Weight Collection: https://huggingface.co/collections/facebook/blt-6801263d4ac1704702a192a6