LEGIONHETO

A framework for fine-tuning large language models (LLMs), with built-in support for the major architectures listed under Supported Models.

Features

All Major Architectures: Llama, Mistral, Qwen, DeepSeek, Phi, Gemma, Command, Falcon, Yi, and more
Custom Optimizations: 8-bit AdamW with block-wise quantization
Memory Efficient: Flash Attention 2, gradient checkpointing, DeepSeek MLA (Multi-head Latent Attention)
Multiple Trainers: SFT, DPO, and ORPO training methods
Model Merging: SLERP, TIES, and DARE algorithms
Export: GGUF format with multiple quantization types
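
Two of the techniques named above can be illustrated in isolation. The following is a minimal, standard-library sketch of block-wise 8-bit quantization (one absmax scale per block, linear mapping) and of SLERP between two weight vectors. It is not LEGIONHETO's implementation: the function names, the block_size default, and the linear int8 mapping are assumptions made for illustration, and the library's actual quantization scheme and merge code may differ.

```python
import math


def quantize_blockwise(values, block_size=4):
    """Map floats to int8-range codes, storing one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block)
        # An all-zero block gets scale 1.0 to avoid dividing by zero.
        scale = absmax / 127.0 if absmax > 0 else 1.0
        scales.append(scale)
        codes.extend(round(v / scale) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size=4):
    """Recover approximate floats from codes and per-block scales."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


def slerp(a, b, t):
    """Spherical linear interpolation between two equal-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    lerp = [(1 - t) * x + t * y for x, y in zip(a, b)]
    if norm_a == 0 or norm_b == 0:
        return lerp
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    # Nearly parallel (or antiparallel) vectors: fall back to plain interpolation.
    if abs(sin_omega) < 1e-6:
        return lerp
    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]
```

Keeping one scale per block confines the effect of a large outlier to its own block, so small values elsewhere keep their precision. In the SLERP sketch, t=0 returns the first vector and t=1 the second; values in between follow the arc between them rather than the straight line.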

Quick Start

from legionheto import LegionHetoModel, SFTTrainer

model = LegionHetoModel("meta-llama/Llama-2-7b-hf")
model.setup_lora(r=16, alpha=32)

# `dataset` is your training dataset; it must be loaded before this point (not shown here)
trainer = SFTTrainer(model, dataset, output_dir="./output")
trainer.train()
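
The r and alpha arguments passed to setup_lora above correspond to the rank and scaling factor of a LoRA adapter. As a minimal sketch of what those two numbers control, the code below computes a LoRA-adapted linear layer in plain Python, assuming the conventional alpha / r scaling from the original LoRA formulation. Whether LEGIONHETO uses exactly this convention is not stated in this README, and the helper names (matvec, lora_forward) are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    scaling = alpha / r  # assumed convention: update is scaled by alpha / r
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scaling * d for b, d in zip(base, low_rank)]


# Tiny example with d_in = d_out = 2 and rank r = 1.
W = [[1, 0], [0, 1]]
A = [[1, 1]]
B = [[0.5], [0.0]]
y = lora_forward(W, A, B, x=[1, 2], r=1, alpha=2)
```

With r=16 and alpha=32 as in the Quick Start, the scaling factor under this convention would be 2.0. Only A and B are trained, so the number of trainable parameters per layer grows with r rather than with the full size of W.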

Installation

pip install legionheto

For Flash Attention support:

pip install "legionheto[flash-attn]"
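
To confirm that Flash Attention is importable after installation, a quick check like the following may help. It is a sketch that assumes the extra installs the upstream flash-attn package, whose importable module is named flash_attn; this README does not state what the extra pulls in, and the helper name flash_attn_available is hypothetical.

```python
import importlib.util


def flash_attn_available():
    """Return True if a module named `flash_attn` can be found on this system."""
    # find_spec locates the module without importing it, so this is cheap
    # and does not require a GPU to be present.
    return importlib.util.find_spec("flash_attn") is not None


if __name__ == "__main__":
    status = "found" if flash_attn_available() else "not found"
    print(f"flash_attn module: {status}")
```

This only checks that the module can be located; it does not verify that the installed build matches your GPU or CUDA version.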

Supported Models

LEGIONHETO detects the model architecture automatically and configures settings for the following model families:

Llama / Llama 2 / Llama 3
Mistral / Mixtral
Qwen / Qwen2
DeepSeek
Phi / Phi-3
Gemma / Gemma 2
Command (Cohere)
Falcon
Yi
StableLM
MPT
GPT-2 / GPT-NeoX / GPT-J
BLOOM
Mamba
Granite
Nemotron
OpenELM

License

MIT License