LEGIONHETO

A framework for fine-tuning large language models (LLMs), with support for the major open-weight architectures listed under Supported Models.

Features

  • All Major Architectures: Llama, Mistral, Qwen, DeepSeek, Phi, Gemma, Command, Falcon, Yi, and more
  • Custom Optimizations: 8-bit AdamW with block-wise quantization
  • Memory Efficient: Flash Attention 2, gradient checkpointing, DeepSeek MLA
  • Multiple Trainers: SFT, DPO, and ORPO training methods
  • Model Merging: SLERP, TIES, and DARE algorithms
  • Export: GGUF format with multiple quantization types
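
To make "block-wise quantization" concrete, here is a minimal pure-Python sketch of the general idea: a tensor is split into fixed-size blocks, and each block is stored as 8-bit codes plus one scale. This is an illustration under simplifying assumptions (linear absmax scaling, a tiny block size), not LEGIONHETO's actual implementation; the function names `quantize_blockwise` and `dequantize_blockwise` are hypothetical.

```python
from typing import List, Tuple


def quantize_blockwise(values: List[float], block_size: int = 4) -> Tuple[List[int], List[float]]:
    """Quantize floats to signed 8-bit codes with one absmax scale per block."""
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block)
        # Map the largest magnitude in the block to 127; avoid dividing by zero.
        scale = absmax / 127.0 if absmax > 0 else 1.0
        scales.append(scale)
        codes.extend(round(v / scale) for v in block)
    return codes, scales


def dequantize_blockwise(codes: List[int], scales: List[float], block_size: int = 4) -> List[float]:
    """Recover approximate floats by multiplying each code by its block's scale."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


# Two blocks with very different magnitudes: each gets its own scale,
# so the small values in the first block are not crushed by the large ones.
values = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0, 25.0, 0.0]
codes, scales = quantize_blockwise(values, block_size=4)
restored = dequantize_blockwise(codes, scales, block_size=4)
```

Because each block carries its own scale, an outlier only degrades the precision of the block it lives in, which is the usual motivation for applying this to optimizer state.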

Quick Start

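from legionheto import LegionHetoModel, SFTTrainer

# Load the base model and attach LoRA adapters (rank 16, alpha 32)
model = LegionHetoModel("meta-llama/Llama-2-7b-hf")
model.setup_lora(r=16, alpha=32)

# `dataset` is a placeholder for your own training dataset, prepared beforehand
trainer = SFTTrainer(model, dataset, output_dir="./output")
trainer.train()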
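
For readers unfamiliar with the `r` and `alpha` arguments in the Quick Start, the sketch below shows how they are used in standard LoRA: a frozen weight matrix receives a low-rank update `(alpha / r) * B @ A`. It assumes `setup_lora` follows that standard formulation, which this page does not confirm; the helpers `matmul` and `lora_delta` are hypothetical and written in plain Python so the example runs without dependencies.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(A, B, r, alpha):
    """Return (alpha / r) * B @ A, the low-rank update added to a frozen weight."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


# Toy sizes; real layers are far larger, which is where the savings come from.
d_out, d_in, r, alpha = 4, 6, 2, 4
rng = random.Random(0)

# Standard LoRA initialization: A is small random noise, B starts at zero,
# so the adapted model is initially identical to the base model.
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]  # shape r x d_in
B = [[0.0] * r for _ in range(d_out)]                              # shape d_out x r

delta = lora_delta(A, B, r, alpha)

full_params = d_out * d_in        # parameters in the frozen weight
lora_params = r * (d_in + d_out)  # trainable parameters in A and B
```

Under this formulation, `r=16, alpha=32` gives a scale factor of 2.0, and only `A` and `B` are trained.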

Installation

pip install legionheto

For Flash Attention support:

pip install "legionheto[flash-attn]"
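
After installing the extra, a quick way to confirm that Flash Attention is importable is to look for the package from Python. This sketch assumes the extra pulls in the `flash-attn` distribution, whose import name is `flash_attn`; the helper `has_flash_attn` is hypothetical and not part of LEGIONHETO.

```python
import importlib.util


def has_flash_attn() -> bool:
    """Return True if the `flash_attn` package can be found in this environment."""
    return importlib.util.find_spec("flash_attn") is not None


print("Flash Attention available:", has_flash_attn())
```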

Supported Models

LEGIONHETO automatically detects and configures optimal settings for:

  • Llama / Llama 2 / Llama 3
  • Mistral / Mixtral
  • Qwen / Qwen2
  • DeepSeek
  • Phi / Phi-3
  • Gemma / Gemma 2
  • Command (Cohere)
  • Falcon
  • Yi
  • StableLM
  • MPT
  • GPT-2 / GPT-NeoX / GPT-J
  • BLOOM
  • Mamba
  • Granite
  • Nemotron
  • OpenELM

License

MIT License