Hi Michał,
Thank you for your work and for sharing your scripts! I'm currently trying to reproduce the results, but my model does not seem to have converged after the 10,000 steps specified in your scripts. Could you clarify how many steps you used to train the released checkpoints?
Thank you for your help!