Sunday, May 19, 2024

LLM Part 3: Decoding

Tada!  Here's Part 3 of the large language model series!

Are you excited for the next chapter of our LLM 101 series?  So far we've covered tokenization and encoding.  If you need a refresher, have a look at the pages below:
  1. LLM Part 1: Tokenization
  2. LLM Part 2: Encoding
In short, we've learnt how to break text down into pieces and assign a number to each piece.  This is how the model "knows" what you're asking it.  But how do these models answer back in a way humans can understand?  This is where decoding comes in.
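
For a quick recap of Parts 1 and 2, here's roughly what tokenization and encoding look like in code.  This is only a sketch: it assumes the Hugging Face transformers library and the GPT-2 tokenizer purely for illustration, and the notebook may use a different tokenizer or library.

    # Recap of Parts 1 and 2: tokenize text, then encode each token as a number.
    # Assumes the Hugging Face GPT-2 tokenizer, used here only for illustration.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    text = "How do language models read text?"

    # Part 1: tokenization - split the text into pieces (tokens)
    tokens = tokenizer.tokenize(text)
    print(tokens)      # a list of token strings

    # Part 2: encoding - map each token to its numeric ID
    input_ids = tokenizer.encode(text)
    print(input_ids)   # a list of integers, one per token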

If encoding turns text into numbers, decoding is the reverse: it turns numbers back into text.  Here's a PDF of a Jupyter Notebook that lets us revise the previous concepts and shows what decoding looks like in code.
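
Since the notebook itself is in the PDF, here's a minimal sketch of what decoding looks like in code.  Again, it assumes the Hugging Face GPT-2 tokenizer from the recap above, so treat the exact names and model as illustrative rather than the notebook's actual code.

    # Decoding: turn a list of token IDs back into readable text.
    # Continues the encoding sketch above (same GPT-2 tokenizer, illustration only).
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    text = "Decoding turns numbers back into text."
    input_ids = tokenizer.encode(text)           # text -> numbers (encoding)

    decoded_text = tokenizer.decode(input_ids)   # numbers -> text (decoding)
    print(decoded_text)

    # You can also inspect the individual tokens behind each ID:
    print(tokenizer.convert_ids_to_tokens(input_ids))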


Now that you've had a look at the PDF, we've covered tokenization (breaking text down into pieces), encoding (converting those pieces into numbers), and decoding (converting numbers back into text).

You should now be able to:
  • Type a text-based input into the model
  • Retrieve text back from the model (a small end-to-end sketch follows below)
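
To make that round trip concrete, here's a small sketch that encodes a prompt, lets a model generate a continuation, and decodes the result back into text.  It assumes the small GPT-2 model loaded through Hugging Face transformers; your own setup and model may well differ.

    # End-to-end round trip: text in -> numbers -> model -> numbers -> text out.
    # Sketch only: assumes the small GPT-2 model from Hugging Face transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Large language models are"

    # Encode: turn the prompt into token IDs the model can read
    inputs = tokenizer(prompt, return_tensors="pt")

    # The model predicts new token IDs that continue the prompt
    output_ids = model.generate(**inputs, max_new_tokens=20)

    # Decode: turn the generated IDs back into human-readable text
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))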

I know it doesn't sound like you're doing much at the moment, but nailing these concepts will help you build and use LLMs in the long run.  Next time, we'll be asking the model to perform simple tasks using Python: namely summarization, paraphrasing, and translation.

Ending

How have you found the LLM series so far?  Hopefully it's been helpful in understanding these basic concepts.  I remember being pretty lost when I first started studying LLMs, so I'm aiming to make learning about LLMs more accessible for everyone!!! It may be easier to ask ChatGPT to generate answers (especially since some of the earlier versions are available for free now), but I find that being able to work with LLMs through code is very satisfying. 💪

