yobbo 7 hours ago
Looks very nice, but I can't find numerical gradient checks, which are helpful for verifying that the backward pass is correct:
https://github.com/markusheimerl/gpt/blob/main/transformer/a...
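For anyone unfamiliar with the idea, here is a minimal sketch of a numerical gradient check. It is not taken from the linked repo: the tiny tanh model, the data, and every function name are made up for illustration. It compares a hand-derived gradient against central finite differences and reports the largest relative error.

```python
import math

# Hypothetical toy problem (not from the repo): pred = tanh(w . x), MSE loss.
X = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7], [-0.2, 0.8, 0.1]]
Y = [0.1, -0.4, 0.9]
W = [0.2, -0.1, 0.05]

def forward(w, xi):
    """Prediction for a single input row."""
    return math.tanh(sum(wj * xj for wj, xj in zip(w, xi)))

def loss(w):
    """Mean squared error over the toy dataset."""
    return sum((forward(w, xi) - yi) ** 2 for xi, yi in zip(X, Y)) / len(X)

def analytic_grad(w):
    """Hand-derived backward pass: dL/dw via the chain rule."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, Y):
        pred = forward(w, xi)
        # d(loss)/d(pre-activation); tanh'(z) = 1 - tanh(z)^2
        dz = 2.0 * (pred - yi) * (1.0 - pred * pred) / len(X)
        for j, xj in enumerate(xi):
            grad[j] += dz * xj
    return grad

def numerical_grad(f, w, eps=1e-5):
    """Central differences: (f(w + eps*e_j) - f(w - eps*e_j)) / (2*eps)."""
    grad = []
    for j in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grad.append((f(w_plus) - f(w_minus)) / (2.0 * eps))
    return grad

def max_rel_error(a, b):
    """Largest relative difference between two gradient vectors."""
    return max(abs(ai - bi) / max(abs(ai) + abs(bi), 1e-12)
               for ai, bi in zip(a, b))

analytic = analytic_grad(W)
numeric = numerical_grad(loss, W)
err = max_rel_error(analytic, numeric)
print("max relative error:", err)
```

In double precision a correct backward pass usually gives a very small relative error here, while a bug such as a dropped activation derivative tends to show up as an error many orders of magnitude larger.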
markusheimerl (op) 4 hours ago
I deleted the numerical checks a while back, after confirming that the backward pass is correct, to keep the code base lean. Running https://github.com/markusheimerl/gpt/blob/main/transformer/a... is also somewhat of a confirmation that the backward pass is correct, since an analytically incorrect backward pass can't fit synthetic data perfectly.
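A rough sketch of that synthetic-data argument, on a toy linear model rather than the repo's code (the data, the names, and the deliberately injected sign bug are all assumptions for illustration): generate targets from known weights, train by gradient descent, and see whether the loss reaches roughly zero.

```python
# Hypothetical synthetic data: targets come from known weights (2.0, -3.0).
XS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-1.0, 0.5)]
YS = [2.0 * x0 - 3.0 * x1 for x0, x1 in XS]

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def mse(w):
    """Mean squared error on the synthetic dataset."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)

def grad(w, buggy=False):
    """Gradient of the MSE; buggy=True flips the sign of one component."""
    g = [0.0, 0.0]
    n = len(XS)
    for x, y in zip(XS, YS):
        err = predict(w, x) - y
        g[0] += 2.0 * err * x[0] / n
        g[1] += 2.0 * err * x[1] / n
    if buggy:
        g[1] = -g[1]  # injected backward-pass bug, for illustration only
    return g

def train(buggy=False, lr=0.1, steps=200):
    """Plain gradient descent from zero weights; returns the final loss."""
    w = [0.0, 0.0]
    for _ in range(steps):
        g = grad(w, buggy=buggy)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return mse(w)

correct_loss = train(buggy=False)
buggy_loss = train(buggy=True)
print("final loss, correct gradient:", correct_loss)
print("final loss, buggy gradient:  ", buggy_loss)
```

With the correct gradient the loss should drop to essentially zero, while the sign-flipped version moves away from the solution. This only demonstrates one kind of bug; the sketch makes no claim about which errors a fit test catches in general.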