"Bidirectional attention"

#1 opened by olivierdehaene (HF staff)

Hello,

It seems that, compared to the 7B variant, this model has bidirectional attention turned off. Is this expected?
See this line, where is_causal is set to True in this variant, versus this line, where is_causal is set to False in the 7B variant.
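
For context, here is a minimal PyTorch sketch (not this repo's code) of what the flag controls: with is_causal=True, scaled_dot_product_attention applies a lower-triangular mask so each token only attends to earlier positions, while is_causal=False gives full bidirectional attention, which is what embedding models typically need.

```python
import torch
import torch.nn.functional as F

# Toy shapes: (batch, heads, seq_len, head_dim).
q = torch.randn(1, 1, 4, 8)
k = torch.randn(1, 1, 4, 8)
v = torch.randn(1, 1, 4, 8)

# Causal: a lower-triangular mask restricts token i to positions <= i.
causal = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Bidirectional: every token attends to the full sequence.
bidir = F.scaled_dot_product_attention(q, k, v, is_causal=False)

# Only the last position sees the whole sequence under the causal mask,
# so only that row matches the bidirectional output.
print(torch.allclose(causal[..., -1, :], bidir[..., -1, :]))  # True
print(torch.allclose(causal, bidir))                          # False (in general)
```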

Alibaba-NLP org

Thanks for your careful observation! Bidirectional attention is indeed used in this model. The code has now been updated.

olivierdehaene changed discussion status to closed
olivierdehaene changed discussion status to open

@zyznull, it seems the code wasn't fully updated: is_causal is still set to True by default in the model's forward, which is the main entry point for Transformers and SentenceTransformers.
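
Until that's fixed, one possible caller-side workaround is to pin the flag after loading. This is only a sketch: it assumes the remote-code forward really does accept an is_causal keyword as described above, and the model id below is a placeholder for this repo.

```python
import functools
from transformers import AutoModel

# Placeholder model id; substitute this repo's actual id.
model = AutoModel.from_pretrained("Alibaba-NLP/<this-model>", trust_remote_code=True)

# Shadow the bound forward with a partial that pins is_causal=False, so
# every call routed through Transformers / SentenceTransformers uses
# bidirectional attention. Assumes forward accepts this keyword.
model.forward = functools.partial(model.forward, is_causal=False)
```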
