89,301,000 parameter attention_ii, z_lossed model trained for 6250 steps at batchsize:4*32, device_batchsize:32
8a69386 (verified), SQCU committed on Feb 1