Cre4T3Tiv3/unsloth-llama3-alpaca-lora

Demo GIF

Instruction-tuned LoRA adapter for LLaMA 3 8B using QLoRA + Alpaca-style prompts, trained with Unsloth.



Overview

This repo hosts the training, evaluation, and inference pipeline for:

Cre4T3Tiv3/unsloth-llama3-alpaca-lora

A LoRA adapter trained with 4-bit QLoRA on roughly 2K Alpaca-style instruction examples (alpaca-cleaned plus a handful of custom grounded prompts).
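
The adapter expects Alpaca-style prompts. Below is a minimal sketch of that template, assuming the standard instruction/response layout; the repo's training code is the source of truth:

# Illustrative Alpaca-style template; matches the inference example further below.
ALPACA_PROMPT = """### Instruction:
{instruction}

### Response:
"""

prompt = ALPACA_PROMPT.format(instruction="Explain LoRA fine-tuning in simple terms.")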

Core Stack

  • Base Model: unsloth/llama-3-8b-bnb-4bit
  • Adapter Format: LoRA (merged post-training)
  • Training Framework: Unsloth + HuggingFace PEFT
  • Training Infra: A100 (40GB), 4-bit quantization

Intended Use

This adapter is purpose-built for:

  • Instruction-following LLM tasks
  • Low-resource, local inference (4-bit, merged LoRA)
  • Agentic tools and CLI assistants
  • Educational demos (fine-tuning, PEFT, Unsloth)
  • Quick deployment in QLoRA-aware stacks

Limitations

  • Trained on ~2K samples plus 3 custom prompts
  • Single-run fine-tune only
  • Not optimized for context lengths beyond 2K tokens
  • 4-bit quantization may reduce output fidelity
  • Hallucinations are possible; previously hallucinated QLoRA definitions have been corrected and are checked by the eval script
  • Not production-grade for factual QA or other critical workflows

Evaluation

This repo includes an eval_adapter.py script that:

  • Checks for hallucination patterns (e.g. false QLoRA definitions)
  • Computes keyword overlap per instruction (≥4/6 threshold)
  • Outputs JSON summary (eval_results.json) with full logs

Run make eval to validate adapter behavior.
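
As a rough illustration of the keyword-overlap check, here is a sketch with a hypothetical keyword list; the real patterns and checks live in eval_adapter.py:

import json

# Sketch only: each instruction has a small set of expected keywords, and a
# response passes when at least 4 of the 6 appear (the threshold noted above).
def keyword_overlap(response, keywords, threshold=4):
    text = response.lower()
    hits = [kw for kw in keywords if kw.lower() in text]
    return {"hits": len(hits), "total": len(keywords), "passed": len(hits) >= threshold}

# Hypothetical keywords for an instruction about LoRA fine-tuning.
expected = ["low-rank", "adapter", "frozen", "parameters", "matrices", "fine-tuning"]
result = keyword_overlap(
    "LoRA adds small low-rank adapter matrices while the base parameters stay frozen.",
    expected,
)
print(json.dumps(result, indent=2))  # hits: 5 of 6 keywords -> passed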


Training Configuration

Parameter        Value
---------------  -------------------------------
Base Model       unsloth/llama-3-8b-bnb-4bit
Adapter Format   LoRA (merged)
LoRA r           16
LoRA alpha       16
LoRA dropout     0.05
Epochs           2
Examples         ~2K (alpaca-cleaned + grounded)
Precision        4-bit (bnb)
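
For reference, a minimal sketch of how these hyperparameters map onto Unsloth's FastLanguageModel API. The target_modules list and max_seq_length are assumptions; the repo's training code (run via make train) is authoritative:

# Sketch only: mirrors the table above, not necessarily the exact training script.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # base model
    max_seq_length=2048,                       # assumption; adapter is not tuned for longer contexts
    load_in_4bit=True,                         # 4-bit (bnb) precision
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                      # LoRA r
    lora_alpha=16,                             # LoRA alpha
    lora_dropout=0.05,                         # LoRA dropout
    target_modules=[                           # assumption: standard LLaMA projection layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)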

Usage

make install   # Create .venv and install with uv
make train     # Train LoRA adapter
make eval      # Evaluate output quality
make run       # Run quick inference

Hugging Face Login

export HUGGINGFACE_TOKEN=hf_xxx
make login
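
Alternatively, a programmatic login from Python (assumes the huggingface_hub package is installed):

import os
from huggingface_hub import login

# Uses the same token exported above.
login(token=os.environ["HUGGINGFACE_TOKEN"])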

Local Inference (Python)

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE = "unsloth/llama-3-8b-bnb-4bit"
ADAPTER = "Cre4T3Tiv3/unsloth-llama3-alpaca-lora"

# Load the 4-bit base model (requires a CUDA GPU with bitsandbytes installed).
base_model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto", load_in_4bit=True)

# Attach the LoRA adapter and merge its weights into the base model.
model = PeftModel.from_pretrained(base_model, ADAPTER).merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Alpaca-style prompt: instruction followed by an empty response slot.
prompt = "### Instruction:\nExplain LoRA fine-tuning in simple terms.\n\n### Response:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Demo Space

🖥 Try the model live via Hugging Face Spaces:

Launch Demo → unsloth-llama3-alpaca-demo


Maintainer

Built with ❤️ by @Cre4T3Tiv3 at ByteStack Labs


Citation

If you use this adapter or its training methodology, please consider citing:

@software{unsloth-llama3-alpaca-lora,
  author = {Jesse Moses, Cre4T3Tiv3},
  title = {Unsloth LoRA Adapter for LLaMA 3 (8B)},
  year = {2025},
  url = {https://huggingface.co/Cre4T3Tiv3/unsloth-llama3-alpaca-lora},
}

License

Apache 2.0

