Modular Python framework for building LLMs with INL Dynamics stability
Complexity Framework is a complete toolkit for building transformer architectures with built-in training stability. Its core feature is velocity tracking (INL Dynamics), which prevents the training explosions that otherwise appear after 400k+ steps:
```python
from complexity.api import INLDynamics

# CRITICAL: beta in [0, 2], NOT [0, inf)!
dynamics = INLDynamics(
    hidden_size=768,
    beta_max=2.0,       # Clamp beta for stability
    velocity_max=10.0,  # Limit velocity magnitude
)

h_next, v_next = dynamics(hidden_states, velocity)
```
The bug we fixed: an unclamped softplus lets beta grow without bound, which produces NaN losses after ~400k steps. Clamping beta to [0, 2] keeps training stable.
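For intuition, here is a minimal sketch of the clamped parameterization (a toy stand-in, not the library's internals; `raw_beta` is an illustrative placeholder for the learned parameter):

```python
import torch
import torch.nn.functional as F

beta_max = 2.0
raw_beta = torch.randn(768)  # toy stand-in for the learned, unconstrained parameter

# Unclamped: softplus alone maps to [0, inf), so beta can drift upward over
# hundreds of thousands of steps until activations overflow to NaN.
beta_unbounded = F.softplus(raw_beta)

# Clamped: bounding the output keeps beta in [0, beta_max] no matter how far
# raw_beta drifts, which is what keeps training stable past 400k steps.
beta = torch.clamp(F.softplus(raw_beta), max=beta_max)
```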
INL Dynamics recovers from loss spikes thanks to velocity damping. With the beta-clamping fix, training remains stable past 400k steps, where it previously exploded.
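To illustrate why damping plus clamping absorbs spikes, here is a toy, heavily simplified velocity update in this spirit (the first-order form and the damping coefficient are assumptions for illustration; the actual INLDynamics equations may differ):

```python
import torch

velocity_max = 10.0
damping = 0.9  # illustrative damping coefficient, not a library default

def toy_velocity_step(hidden_states, velocity, update):
    # Damp the running velocity, mix in the new update, and clamp its magnitude
    # so a single loss spike cannot launch the hidden state toward infinity.
    v_next = damping * velocity + (1.0 - damping) * update
    v_next = v_next.clamp(-velocity_max, velocity_max)
    h_next = hidden_states + v_next
    return h_next, v_next

h = torch.zeros(2, 8, 768)
v = torch.zeros_like(h)
spike = 100.0 * torch.randn_like(h)    # simulate a loss-spike-sized update
h, v = toy_velocity_step(h, v, spike)  # v stays bounded, so h stays finite
```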
Install with pip:

```bash
pip install complexity-framework
```
```python
from complexity.api import (
    # Building blocks
    Attention, MLP, RMSNorm, RoPE, INLDynamics,
    # Optimizations
    CUDA, Efficient,
    # O(N) architectures
    Architecture, Mamba, RWKV,
)

# Flash Attention
attn = CUDA.flash(hidden_size=4096, num_heads=32)

# INL Dynamics (training stability)
dynamics = INLDynamics(hidden_size=768, beta_max=2.0)
h, velocity = dynamics(hidden_states, velocity)

# Small-budget model
model = Efficient.tiny_llm(vocab_size=32000)  # ~125M params
```
| Module | Description |
|---|---|
| Core | Attention (GQA/MHA/MQA), MLP (SwiGLU/GeGLU/MoE), Position (RoPE/YaRN/ALiBi) |
| INL Dynamics | Velocity tracking for training stability |
| CUDA/Triton | Flash Attention, Sliding Window, Sparse, Linear |
| Efficient | Quantization, Mixed Precision, Small Models |
| O(N) Architectures | Mamba, RWKV, RetNet |
| Multimodal | Vision, Audio, Fusion |
Mixture-of-Experts MLPs can be built via the factory or instantiated directly:

```python
from complexity.api import MLP, TokenRoutedMLP

# Via factory
moe = MLP.moe(hidden_size=4096, num_experts=8, top_k=2)

# Direct
moe = TokenRoutedMLP(
    hidden_size=4096,
    num_experts=8,
    top_k=2,
)

output, aux_loss = moe(hidden_states)
```
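A sketch of how the router's auxiliary loss is typically folded into training (assuming `aux_loss` is a scalar tensor; the 0.01 weight and the stand-in task loss are illustrative choices, not library defaults):

```python
import torch
from complexity.api import TokenRoutedMLP

moe = TokenRoutedMLP(hidden_size=4096, num_experts=8, top_k=2)
hidden_states = torch.randn(2, 16, 4096)   # (batch, seq, hidden) toy input

output, aux_loss = moe(hidden_states)

task_loss = output.pow(2).mean()           # stand-in for the real task loss
loss = task_loss + 0.01 * aux_loss         # small weight on the load-balancing loss
loss.backward()
```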
Pre-configured small models and memory optimizations:

```python
from complexity.api import Efficient

# Pre-configured models
model = Efficient.nano_llm(vocab_size=32000)   # ~10M params
model = Efficient.micro_llm(vocab_size=32000)  # ~30M params
model = Efficient.tiny_llm(vocab_size=32000)   # ~125M params
model = Efficient.small_llm(vocab_size=32000)  # ~350M params

# Memory optimizations
Efficient.enable_checkpointing(model)
model, optimizer, scaler = Efficient.mixed_precision(model, optimizer)
```
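A sketch of one training step with these helpers, assuming the returned `scaler` follows the `torch.cuda.amp.GradScaler` interface (an assumption, not confirmed by the docs; `compute_loss` is a hypothetical helper):

```python
import torch
from complexity.api import Efficient

model = Efficient.tiny_llm(vocab_size=32000)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

Efficient.enable_checkpointing(model)
model, optimizer, scaler = Efficient.mixed_precision(model, optimizer)

def train_step(batch):
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = compute_loss(model, batch)  # compute_loss: hypothetical helper
    scaler.scale(loss).backward()          # scale to avoid fp16 underflow
    scaler.step(optimizer)                 # unscales, skips step on inf/nan
    scaler.update()
    return loss.item()
```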
For very long sequences:
```python
from complexity.api import Architecture

model = Architecture.mamba(hidden_size=768, num_layers=12)
model = Architecture.rwkv(hidden_size=768, num_layers=12)
model = Architecture.retnet(hidden_size=768, num_layers=12)
```
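For a sense of scale, a toy forward pass over a long sequence (the input convention here, pre-embedded hidden states of shape `(batch, seq, hidden_size)`, is an assumption for illustration, not confirmed API):

```python
import torch
from complexity.api import Architecture

model = Architecture.mamba(hidden_size=768, num_layers=12)

# O(N) backbones keep per-token cost roughly constant, so compute grows
# linearly with sequence length rather than quadratically.
x = torch.randn(1, 16384, 768)  # a long toy sequence of hidden states
y = model(x)                    # assumed forward convention
```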
License: CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0).
If you use Complexity Framework in your work, please cite:

```bibtex
@software{complexity_framework_2024,
  title  = {Complexity Framework: Modular LLM Building Blocks with INL Dynamics},
  author = {Complexity-ML},
  year   = {2024},
  url    = {https://github.com/Complexity-ML/complexity-framework}
}
```
Build stable LLMs. Train with confidence.