Downscaling the `Q_q` and `W_k` matrices for repeated layers in franken-merges
#4 opened about 1 year ago by jukofyork (14 comments)

Guidance on GPU VRAM Split?
#3 opened over 1 year ago by nmitchko (5 comments)

Performance
#2 opened over 1 year ago by KnutJaegersberg (13 comments)
