pgptlformer-tinystories / re-pqt-rmsXrmsx2x2-ATTNII-791967c5-5c59-4a5f-a2c5-07772bcf65ab.txt
89,301,000 parameter attention_ii, z_lossed model trained for 6250 steps at batchsize:4*32, device_batchsize:32
8a69386 verified
585 kB