Add BOS token to config

#8

The tokenizer already has the BOS token <|startoftext|> in its vocabulary, but the configuration does not set it, so it is never used. This causes issues with several downstream libraries that expect a BOS token to be defined. This PR sets it in the config.
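For illustration, the change amounts to roughly the following. This is a minimal sketch, not the actual diff of this PR: it assumes the configuration is a JSON file named `tokenizer_config.json` with a top-level `bos_token` field, and the helper names `set_bos_token` and `demo`, as well as the placeholder `eos_token` entry, are hypothetical.

```python
import json
import tempfile
from pathlib import Path

BOS_TOKEN = "<|startoftext|>"  # the token named in this PR


def set_bos_token(config_path: Path, bos_token: str = BOS_TOKEN) -> dict:
    """Set the `bos_token` field in a tokenizer config JSON file.

    Assumes the config is a flat JSON object; returns the updated config.
    """
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["bos_token"] = bos_token
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config


def demo() -> dict:
    """Run the helper against a throwaway config file and reload the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "tokenizer_config.json"  # assumed file name
        # Hypothetical starting config with no BOS token set.
        path.write_text(json.dumps({"eos_token": "<|endoftext|>"}), encoding="utf-8")
        set_bos_token(path)
        return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    print(demo())
```

The helper leaves every other field in the file untouched, so it only adds or overwrites the one key that downstream libraries look for.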

Ready to merge
This branch is ready to get merged automatically.