significant extra memory usage compared to the other 27b
2
#4 opened 2 days ago by FlorinAndrei

Is it possible to fine-tune gemma 3 in a context beyond 131k?
#3 opened 16 days ago by dophys

fp16 merge from this model checkpoint
#1 opened about 1 month ago by Jongsim
