🧠 Building LLMs from Scratch – A 30-Day Journey

This repository guides you through the process of building a GPT-style Large Language Model (LLM) from scratch using PyTorch. The structure and approach are inspired by the book Build a Large Language Model (From Scratch) by Sebastian Raschka.


📘 Reference Book

Build a Large Language Model (From Scratch) by Sebastian Raschka (Manning Publications).

🗓️ Weekly Curriculum Overview

🔹 Week 1: Foundations of Language Models

  • Set up the environment and tools.
  • Learn about tokenization, embeddings, and the idea of a "language model".
  • Encode input/output sequences and build basic forward models.
  • Understand unidirectional processing and causal language modeling (a minimal sketch follows this list).
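
To make the causal-LM idea concrete, here is a minimal, self-contained sketch; the character-level vocabulary and variable names are illustrative, not the repo's code:

```python
# Minimal sketch: character-level tokenization and the input/target
# shift behind causal language modeling. Names here are illustrative.
text = "hello world"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}  # char -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> char

ids = [stoi[ch] for ch in text]

# A causal LM predicts the next token at every position, so inputs and
# targets are the same sequence offset by one.
x = ids[:-1]  # what the model sees
y = ids[1:]   # what it must predict
print("".join(itos[i] for i in x))  # "hello worl"
print("".join(itos[i] for i in y))  # "ello world"
```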

🔹 Week 2: Building the Transformer Decoder

  • Explore Transformer components: attention, multi-head attention, and positional encoding.
  • Implement residual connections, normalization, and feedforward layers.
  • Build a GPT-style decoder-only transformer architecture (a single-head attention sketch follows this list).
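
As a taste of what Week 2 builds toward, here is a hedged sketch of single-head causal self-attention in PyTorch; the notebooks implement the full multi-head version, and the function and weight names below are assumptions, not the repo's API:

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention with a causal mask.

    x: (batch, seq_len, d_model); w_q/w_k/w_v: (d_model, d_head).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.transpose(-2, -1)) / (k.shape[-1] ** 0.5)
    # Causal mask: position t may attend only to positions <= t.
    T = x.shape[1]
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

x = torch.randn(2, 8, 32)                             # 2 sequences of 8 tokens
w_q, w_k, w_v = (torch.randn(32, 16) for _ in range(3))
print(causal_self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([2, 8, 16])
```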

🔹 Week 3: Training and Dataset Handling

  • Load and preprocess datasets like TinyShakespeare.
  • Implement batch creation, context windows, and training routines (see the sketch after this list).
  • Use cross-entropy loss, optimizers, and learning rate schedulers.
  • Monitor perplexity and improve generalization.
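
The sketch below shows the random-offset batching pattern commonly used with TinyShakespeare-style corpora, plus the cross-entropy training step and the perplexity relation; get_batch and the tensor shapes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def get_batch(data, block_size, batch_size):
    """Sample random context windows from a 1-D LongTensor of token ids."""
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])  # shifted by one
    return x, y

data = torch.randint(0, 65, (10_000,))  # stand-in for encoded TinyShakespeare
xb, yb = get_batch(data, block_size=64, batch_size=8)

# One training step, assuming model(xb) returns logits of shape
# (batch, block_size, vocab_size):
#   logits = model(xb)
#   loss = F.cross_entropy(logits.view(-1, logits.size(-1)), yb.view(-1))
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
# Perplexity is simply the exponential of the loss: ppl = loss.exp()
```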

🔹 Week 4: Text Generation and Deployment

  • Generate text using greedy, top-k, top-p, and temperature sampling (a sampling sketch follows this list).
  • Evaluate and tune generation.
  • Export and convert model for Hugging Face compatibility.
  • Deploy via Hugging Face Hub and Gradio Space.
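
To illustrate two of those strategies, here is a hedged sketch of temperature plus top-k sampling over a single logits vector; the function name and defaults are illustrative, and the notebooks also cover greedy and top-p decoding:

```python
import torch
import torch.nn.functional as F

def sample_next(logits, temperature=1.0, top_k=50):
    """Sample one token id from a (vocab_size,) logits vector."""
    logits = logits / temperature               # <1 sharpens, >1 flattens
    if top_k is not None:
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[-1]] = float("-inf")  # keep only the k best tokens
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

next_id = sample_next(torch.randn(65), temperature=0.8, top_k=10)
```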

Build Models

FaseehGPT is an advanced pipeline for training a GPT-style language model designed specifically for the Arabic language; see the FaseehGPT project for details.

🛠️ Getting Started

Prerequisites

  • Python 3.8+
  • PyTorch
  • NumPy
  • Matplotlib
  • JupyterLab or Jupyter Notebook
  • Hugging Face libraries: transformers, datasets, huggingface_hub
  • gradio for deployment

Installation

git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
cd Building-LLMs-from-scratch
pip install -r requirements.txt
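
After installing, a quick sanity check that PyTorch is importable (version and GPU availability will vary by machine):

```python
import torch
print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True only if a CUDA GPU is usable
```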

📁 Project Structure

Building-LLMs-from-scratch/
├── notebooks/            # Weekly learning notebooks
├── models/               # Model architectures & checkpoints
├── data/                 # Preprocessing and datasets
├── hf_deploy/            # Hugging Face config & deployment scripts
├── theoretical/          # Podcast & theoretical discussions
├── utils/                # Helper scripts
├── requirements.txt
└── README.md

🚀 Hugging Face Deployment

This project includes:

  • Scripts to convert the model for 🤗 Transformers compatibility
  • Scripts to upload the model to the Hugging Face Hub
  • A Gradio app for launching an interactive demo on Hugging Face Spaces

You’ll find detailed instructions inside the hf_deploy/ folder.
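
For orientation, here is a hedged sketch of the two deployment steps using the official huggingface_hub and gradio APIs; the repo id, folder path, and generate function are placeholders, not this project's actual values:

```python
from huggingface_hub import HfApi
import gradio as gr

# Step 1: upload a converted model directory to the Hub.
api = HfApi()
api.upload_folder(
    folder_path="hf_deploy/model",       # placeholder local export directory
    repo_id="your-username/your-model",  # placeholder Hub repo id
    repo_type="model",
)

# Step 2: a minimal Gradio demo suitable for a Space.
def generate(prompt: str) -> str:
    return prompt  # placeholder: call your model's generation code here

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```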


📚 Resources


📄 License

MIT License — see the LICENSE file for details.
