Let’s Build the GPT Tokenizer: A Complete Guide to Tokenization in LLMs
Eighteen months ago, Andrej Karpathy set a challenge: “Can you take my 2h13m tokenizer video and translate the video into the format of a book chapter”. We’ve done it, and the chapter is below, with key pieces of code inlined and images from the video at key points (each hyperlinked to the video timestamp). It’s a great video for learning this key piece of how LLMs work, and this new text version is great too.


