AWS Trainium & Inferentia documentation

NeuronX Text-generation-inference for AWS Inferentia2

Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs).

The Neuron backend allows TGI to be deployed on AWS Trainium and Inferentia chips.
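To give a rough idea of what serving a model through TGI looks like from the client side, here is a minimal sketch that posts a prompt to a TGI server's `/generate` route using only the Python standard library. The address `http://127.0.0.1:8080` and the helper names `build_generate_request` and `generate` are assumptions made for illustration; adjust them to match your own deployment.

```python
import json
import urllib.request

# Assumption: a TGI server is already running and reachable at this address.
TGI_URL = "http://127.0.0.1:8080"


def build_generate_request(prompt, max_new_tokens=20, base_url=TGI_URL):
    """Build a POST request for TGI's /generate route (hypothetical helper)."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_new_tokens=20, base_url=TGI_URL):
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(prompt, max_new_tokens, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["generated_text"]
```

With a server running, `generate("What is Deep Learning?")` would return the model's completion as a string. Building the request is kept separate from sending it so the payload can be inspected without network access.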