LEGIONHETO
A framework for fine-tuning large language models (LLMs), with built-in support for the major open model architectures listed under Supported Models.
Features
- All Major Architectures: Llama, Mistral, Qwen, DeepSeek, Phi, Gemma, Command, Falcon, Yi, and more
- Custom Optimizations: 8-bit AdamW with block-wise quantization
- Memory Efficient: Flash Attention 2, gradient checkpointing, DeepSeek Multi-head Latent Attention (MLA)
- Multiple Trainers: SFT, DPO, and ORPO training methods
- Model Merging: SLERP, TIES, and DARE algorithms
- Export: GGUF format with multiple quantization types
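As an illustration of the model-merging feature listed above, here is a minimal sketch of SLERP (spherical linear interpolation) applied to two flattened weight vectors. It is not LEGIONHETO's implementation: the function name `slerp`, its signature, and the `eps` fallback threshold are assumptions made for this example, and a real merge would apply the same idea tensor by tensor.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Interpolate between weight vectors v0 and v1 along the arc between them.

    t=0 returns v0, t=1 returns v1. Hypothetical helper for illustration only.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    if abs(sin_theta) < eps:
        # Vectors are (anti-)parallel: the SLERP weights are undefined,
        # so use plain linear interpolation instead.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Halfway between two orthogonal unit vectors stays on the unit circle.
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike plain averaging, the interpolated vector keeps its norm when both inputs have the same norm, which is the usual motivation for choosing SLERP over a linear blend.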
Quick Start
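from legionheto import LegionHetoModel, SFTTrainer
# Load the base model by its Hugging Face model ID
model = LegionHetoModel("meta-llama/Llama-2-7b-hf")
# Attach LoRA adapters with rank 16 and alpha 32
model.setup_lora(r=16, alpha=32)
# `dataset` is your own training dataset and must be defined beforehand
trainer = SFTTrainer(model, dataset, output_dir="./output")
trainer.train()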
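To make the `r` and `alpha` arguments in the Quick Start more concrete, the sketch below shows the standard LoRA formulation, in which a frozen weight `W` is augmented by a low-rank product `B @ A` scaled by `alpha / r`. Whether `setup_lora` follows exactly this convention is an assumption; the helper names `matmul`, `lora_effective_weight`, and `lora_param_count` are hypothetical and exist only for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, lora_a, lora_b, r, alpha):
    """Return W + (alpha / r) * B @ A for a single linear layer.

    w:      d_out x d_in frozen base weight
    lora_a: r x d_in trainable matrix
    lora_b: d_out x r trainable matrix
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # d_out x d_in low-rank update
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters added by one LoRA adapter of rank r."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    # With r=16 and alpha=32 the low-rank update is scaled by 32 / 16 = 2.
    # A 4096 x 4096 layer has 16,777,216 weights; its rank-16 adapter
    # trains only 16 * (4096 + 4096) of them.
    print(lora_param_count(4096, 4096, 16))
```

A larger `r` gives the adapter more capacity at the cost of more trainable parameters, while `alpha` controls how strongly the learned update is applied relative to the frozen weights.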
Installation
For Flash Attention support:
Supported Models
LEGIONHETO automatically detects the model architecture and applies architecture-specific settings for:
- Llama / Llama 2 / Llama 3
- Mistral / Mixtral
- Qwen / Qwen2
- DeepSeek
- Phi / Phi-3
- Gemma / Gemma 2
- Command (Cohere)
- Falcon
- Yi
- StableLM
- MPT
- GPT-2 / GPT-NeoX / GPT-J
- BLOOM
- Mamba
- Granite
- Nemotron
- OpenELM
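One plausible way to detect an architecture like those listed above is to read the `model_type` field that Hugging Face checkpoints store in `config.json`. The sketch below is a guess at the general approach, not LEGIONHETO's actual detection code: the table `FAMILY_BY_MODEL_TYPE` and the function `detect_family` are hypothetical, and the table covers only a subset of the list.

```python
# Hypothetical lookup from Hugging Face `model_type` strings to the family
# names used in the list above. Only a subset is shown.
FAMILY_BY_MODEL_TYPE = {
    "llama": "Llama",
    "mistral": "Mistral",
    "mixtral": "Mixtral",
    "qwen2": "Qwen2",
    "gemma": "Gemma",
    "gemma2": "Gemma 2",
    "phi3": "Phi-3",
    "falcon": "Falcon",
    "gpt2": "GPT-2",
    "gpt_neox": "GPT-NeoX",
    "bloom": "BLOOM",
    "mamba": "Mamba",
}


def detect_family(config):
    """Return the model family for a parsed config.json dictionary.

    Raises ValueError when the model type is missing or not in the table.
    """
    model_type = str(config.get("model_type", "")).lower()
    if model_type not in FAMILY_BY_MODEL_TYPE:
        raise ValueError(f"Unsupported or missing model_type: {model_type!r}")
    return FAMILY_BY_MODEL_TYPE[model_type]


if __name__ == "__main__":
    print(detect_family({"model_type": "llama"}))
```

Keying on `model_type` rather than on the repository name avoids misclassifying fine-tuned checkpoints whose names do not mention the base architecture.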
License
MIT License