Cre4T3Tiv3/unsloth-llama3-alpaca-lora

Demo GIF

Instruction-tuned LoRA adapter for LLaMA 3 8B using QLoRA + Alpaca-style prompts, trained with Unsloth.



Overview

This repo hosts the training, evaluation, and inference pipeline for:

Cre4T3Tiv3/unsloth-llama3-alpaca-lora

A LoRA adapter trained with 4-bit QLoRA on roughly 2K Alpaca-style instruction examples (alpaca-cleaned plus a handful of custom grounded prompts).
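
The adapter expects Alpaca-style prompts. Below is a minimal sketch of that template, assuming the standard instruction/response layout; the repo's training code is the source of truth:

# Illustrative Alpaca-style template; matches the inference example further below.
ALPACA_PROMPT = """### Instruction:
{instruction}

### Response:
"""

prompt = ALPACA_PROMPT.format(instruction="Explain LoRA fine-tuning in simple terms.")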

Core Stack

  • Base Model: unsloth/llama-3-8b-bnb-4bit
  • Adapter Format: LoRA (merged post-training)
  • Training Framework: Unsloth + HuggingFace PEFT
  • Training Infra: A100 (40GB), 4-bit quantization

Intended Use

This adapter is purpose-built for:

  • Instruction-following LLM tasks
  • Low-resource, local inference (4-bit, merged LoRA)
  • Agentic tools and CLI assistants
  • Educational demos (fine-tuning, PEFT, Unsloth)
  • Quick deployment in QLoRA-aware stacks

Limitations

  • Trained on ~2K samples plus 3 custom prompts
  • Single-run fine-tune only
  • Not optimized for context lengths beyond 2K tokens
  • 4-bit quantization may reduce output fidelity
  • Hallucinations are possible; previously hallucinated QLoRA definitions have been corrected and are checked by the eval script
  • Not production-grade for factual QA or other critical workflows

Evaluation

This repo includes an eval_adapter.py script that:

  • Checks for hallucination patterns (e.g. false QLoRA definitions)
  • Computes keyword overlap per instruction (≥4/6 threshold)
  • Outputs JSON summary (eval_results.json) with full logs

Run make eval to validate adapter behavior.
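
As a rough illustration of the keyword-overlap check, here is a sketch with a hypothetical keyword list; the real patterns and checks live in eval_adapter.py:

import json

# Sketch only: each instruction has a small set of expected keywords, and a
# response passes when at least 4 of the 6 appear (the threshold noted above).
def keyword_overlap(response, keywords, threshold=4):
    text = response.lower()
    hits = [kw for kw in keywords if kw.lower() in text]
    return {"hits": len(hits), "total": len(keywords), "passed": len(hits) >= threshold}

# Hypothetical keywords for an instruction about LoRA fine-tuning.
expected = ["low-rank", "adapter", "frozen", "parameters", "matrices", "fine-tuning"]
result = keyword_overlap(
    "LoRA adds small low-rank adapter matrices while the base parameters stay frozen.",
    expected,
)
print(json.dumps(result, indent=2))  # hits: 5 of 6 keywords -> passed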


Training Configuration

Parameter        Value
---------------  -------------------------------
Base Model       unsloth/llama-3-8b-bnb-4bit
Adapter Format   LoRA (merged)
LoRA r           16
LoRA alpha       16
LoRA dropout     0.05
Epochs           2
Examples         ~2K (alpaca-cleaned + grounded)
Precision        4-bit (bnb)
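
For reference, a minimal sketch of how these hyperparameters map onto Unsloth's FastLanguageModel API. The target_modules list and max_seq_length are assumptions; the repo's training code (run via make train) is authoritative:

# Sketch only: mirrors the table above, not necessarily the exact training script.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # base model
    max_seq_length=2048,                       # assumption; adapter is not tuned for longer contexts
    load_in_4bit=True,                         # 4-bit (bnb) precision
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                      # LoRA r
    lora_alpha=16,                             # LoRA alpha
    lora_dropout=0.05,                         # LoRA dropout
    target_modules=[                           # assumption: standard LLaMA projection layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)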

Usage

make install   # Create .venv and install with uv
make train     # Train LoRA adapter
make eval      # Evaluate output quality
make run       # Run quick inference

Hugging Face Login

export HUGGINGFACE_TOKEN=hf_xxx
make login
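
Alternatively, a programmatic login from Python (assumes the huggingface_hub package is installed):

import os
from huggingface_hub import login

# Uses the same token exported above.
login(token=os.environ["HUGGINGFACE_TOKEN"])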

Local Inference (Python)

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE = "unsloth/llama-3-8b-bnb-4bit"
ADAPTER = "Cre4T3Tiv3/unsloth-llama3-alpaca-lora"

# Load the 4-bit base model (requires a CUDA GPU with bitsandbytes installed).
base_model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto", load_in_4bit=True)

# Attach the LoRA adapter and merge its weights into the base model.
model = PeftModel.from_pretrained(base_model, ADAPTER).merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Alpaca-style prompt: instruction followed by an empty response slot.
prompt = "### Instruction:\nExplain LoRA fine-tuning in simple terms.\n\n### Response:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Demo Space

🖥 Try the model live via Hugging Face Spaces:

Launch Demo → unsloth-llama3-alpaca-demo


Maintainer

Built with ❤️ by @Cre4T3Tiv3 at ByteStack Labs


Citation

If you use this adapter or its training methodology, please consider citing:

@software{unsloth-llama3-alpaca-lora,
  author = {Jesse Moses, Cre4T3Tiv3},
  title = {Unsloth LoRA Adapter for LLaMA 3 (8B)},
  year = {2025},
  url = {https://huggingface.co/Cre4T3Tiv3/unsloth-llama3-alpaca-lora},
}

License

Apache 2.0

