[#Script #Coding] Create a Large Language Model from Scratch with Python – Tutorial

Posted on Saturday August 26, 2023 by Eric Brooks

Spread the love

Create a Large Language Model from Scratch with Python – Tutorial

By freeCodeCamp.org
Published: Aug 25, 2023

“
Learn how to build your own large language model, from scratch. This course goes into the data handling, math, and transformers behind large language models. You will use Python.

âœï¸ Course developed by @elliotarledge

ðŸ’» Code and course resources: https://github.com/Infatoshi/fcc-intro-to-llms

âï¸ Contents âï¸
(0:00:00) Intro
(0:03:25) Install Libraries
(0:06:24) Pylzma build tools
(0:08:58) Jupyter Notebook
(0:12:11) Download wizard of oz
(0:14:51) Experimenting with text file
(0:17:58) Character-level tokenizer
(0:19:44) Types of tokenizers
(0:20:58) Tensors instead of Arrays
(0:22:37) Linear Algebra heads up
(0:23:29) Train and validation splits
(0:25:30) Premise of Bigram Model
(0:26:41) Inputs and Targets
(0:29:29) Inputs and Targets Implementation
(0:30:10) Batch size hyperparameter
(0:32:13) Switching from CPU to CUDA
(0:33:28) PyTorch Overview
(0:42:49) CPU vs GPU performance in PyTorch
(0:47:49) More PyTorch Functions
(1:06:03) Embedding Vectors
(1:11:33) Embedding Implementation
(1:13:06) Dot Product and Matrix Multiplication
(1:25:42) Matmul Implementation
(1:26:56) Int vs Float
(1:29:52) Recap and get_batch
(1:35:07) nnModule subclass
(1:37:05) Gradient Descent
(1:50:53) Logits and Reshaping
(1:59:28) Generate function and giving the model some context
(2:03:58) Logits Dimensionality
(2:05:17) Training loop + Optimizer + Zerograd explanation
(2:13:56) Optimizers Overview
(2:17:04) Applications of Optimizers
(2:18:11) Loss reporting + Train VS Eval mode
(2:32:54) Normalization Overview
(2:35:45) ReLU, Sigmoid, Tanh Activations
(2:45:15) Transformer and Self-Attention
(2:46:55) Transformer Architecture
(3:17:54) Building a GPT, not Transformer model
(3:19:46) Self-Attention Deep Dive
(3:25:05) GPT architecture
(3:27:07) Switching to Macbook
(3:31:42) Implementing Positional Encoding
(3:36:57) GPTLanguageModel initalization
(3:40:52) GPTLanguageModel forward pass
(3:46:56) Standard Deviation for model parameters
(4:00:50) Transformer Blocks
(4:04:54) FeedForward network
(4:07:53) Multi-head Attention
(4:12:49) Dot product attention
(4:19:43) Why we scale by 1/sqrt(dk)
(4:26:45) Sequential VS ModuleList Processing
(4:30:47) Overview Hyperparameters
(4:32:14) Fixing errors, refining
(4:34:01) Begin training
(4:35:46) OpenWebText download and Survey of LLMs paper
(4:37:56) How the dataloader/batch getter will have to change
(4:41:20) Extract corpus with winrar
(4:43:44) Python data extractor
(4:49:23) Adjusting for train and val splits
(4:57:55) Adding dataloader
(4:59:04) Training on OpenWebText
(5:02:22) Training works well, model loading/saving
(5:04:18) Pickling
(5:05:32) Fixing errors + GPU Memory in task manager
(5:14:05) Command line argument parsing
(5:18:11) Porting code to script
(5:22:04) Prompt: Completion feature + more errors
(5:24:23) nnModule inheritance + generation cropping
(5:27:54) Pretraining vs Finetuning
(5:33:07) R&D pointers
(5:44:38) Outro

ðŸŽ‰ Thanks to our Champion and Sponsor supporters:
ðŸ‘¾ davthecoder
ðŸ‘¾ jedi-or-sith
ðŸ‘¾ å—å®®åƒå½±
ðŸ‘¾ Agustín Kussrow
ðŸ‘¾ Nattira Maneerat
ðŸ‘¾ Heather Wcislo
ðŸ‘¾ Serhiy Kalinets
ðŸ‘¾ Justin Hual
ðŸ‘¾ Otis Morgan

—

Learn to code for free and get a developer job: https://www.freecodecamp.org

Read hundreds of articles on programming: https://freecodecamp.org/news

[READ MORE]

”

Spread the love

Proudly powered by WordPress

EricBrooks.Com^® is licensed under a Creative Commons License.

Disclaimer: The views expressed herein are solely those of Eric Brooks. They do not necessarily reflect those of his employers, friends, contacts, family, or even his pets (though my cat, Puddy, seems to agree with me on many key issues.). In accordance to my terms of use, you hereby acknowledge my right to psychoanalyze you, practice accupuncture, and mock you incessantly with every visit. As the user, you also acknowledge that the author has been legally declared a "Problem Adult" by the Commonwealth of Pennsylvania, and is therefore not responsible for any of his actions. ALSO, the political views and products advertised on this site may/may not reflect the views of Puddy or myself, so please don't take them as an endorsement. We just need to eat.

Connect