Saba LLM Auto-Train
Train smarter, not harder
Automated fine-tuning pipelines that turn base LLMs into production-ready models. Every stage, from dataset preparation to deployment, is automated and benchmarked against the baseline.
Real Performance Gains
Measured improvements from our internal fine-tuning pipeline on Gemma 4 2B
Faster Inference
67.4% latency reduction from baseline to fine-tuned
Higher Throughput
2.51x the baseline's tokens per second
Success Rate
Zero failed inferences in benchmark runs
gemma4:e2b
Stock Gemma 4 2B via Ollama
saba-gemma4-2b
Custom fine-tuned by SabaTech Auto-Train
From Base to Production
A four-step pipeline that automates fine-tuning from raw data to a deployed model
Dataset Prep
Collect, clean, and format training data. Automated deduplication, filtering, and quality scoring.
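As a rough illustration of what deduplication, filtering, and quality scoring can look like, here is a minimal Python sketch. It assumes each training example is a dict with `prompt` and `response` keys; the `normalize`, `quality_score`, and `prepare` helpers and the length-based score are hypothetical stand-ins, not the Auto-Train implementation.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return re.sub(r"\s+", " ", text.strip().lower())


def quality_score(example: dict) -> float:
    """Toy heuristic (assumption): response length, capped at 200 chars, scaled to 0..1."""
    return min(len(example["response"].strip()), 200) / 200


def prepare(examples: list[dict], min_score: float = 0.1) -> list[dict]:
    """Drop exact duplicates (after normalization), then drop low-scoring examples."""
    seen = set()
    kept = []
    for ex in examples:
        fingerprint = normalize(ex["prompt"]) + "\x00" + normalize(ex["response"])
        key = hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()
        if key in seen:
            continue  # duplicate of an earlier example
        seen.add(key)
        score = quality_score(ex)
        if score < min_score:
            continue  # filtered out as too low quality
        kept.append({**ex, "score": score})
    return kept
```

A production pipeline would likely add near-duplicate detection and a learned or rule-based scorer; the hash-based check here only catches copies that match after normalization.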
Fine-Tuning
LoRA/QLoRA training on the base model, with hyperparameter optimization, early stopping, and checkpointing.
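The early-stopping and checkpointing logic mentioned here can be sketched independently of any training framework. The function below is an assumption-laden illustration: it takes a list of per-epoch evaluation losses that a real training loop would produce, and the `patience` and `min_delta` parameters are hypothetical names, not Auto-Train settings.

```python
def early_stop(eval_losses, patience=2, min_delta=0.0):
    """Return (best_epoch, best_loss, last_epoch_run) for a sequence of eval losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    bad_epochs = 0
    for epoch, loss in enumerate(eval_losses):
        if loss < best_loss - min_delta:
            # A real loop would save a checkpoint of the adapter weights here.
            best_loss, best_epoch = loss, epoch
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, best_loss, epoch
    return best_epoch, best_loss, len(eval_losses) - 1
```

For example, with losses `[1.0, 0.8, 0.7, 0.72, 0.71, 0.6]` and `patience=2`, the loop stops at epoch 4 and keeps the checkpoint from epoch 2, so the later improvement at epoch 5 is never seen. That trade-off is why patience is usually tuned rather than fixed.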
Evaluation
Automated benchmark runs against the baseline, comparing latency, throughput, and quality metrics.
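One way to express the baseline comparison is as three derived numbers: latency reduction, throughput ratio, and success rate. The sketch below assumes each benchmark run has already been summarized into a `RunStats` record; that type, its field names, and `compare` are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    mean_latency_s: float  # average seconds per request
    tokens_per_s: float    # average generation throughput
    succeeded: int         # requests that returned a valid response
    total: int             # requests attempted


def compare(baseline: RunStats, candidate: RunStats) -> dict:
    """Summarize how the candidate model performs relative to the baseline."""
    return {
        "latency_reduction_pct": 100 * (1 - candidate.mean_latency_s / baseline.mean_latency_s),
        "throughput_ratio": candidate.tokens_per_s / baseline.tokens_per_s,
        "success_rate_pct": 100 * candidate.succeeded / candidate.total,
    }
```

With placeholder inputs (not measured results) of 10.0 s and 20 tokens/s for the baseline versus 4.0 s and 50 tokens/s for the candidate, this reports a 60% latency reduction and a 2.5x throughput ratio. Quality metrics would need a separate, task-specific scorer.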
Deployment
Export to GGUF for serving with Ollama. Push to production with a zero-downtime model swap.
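For the Ollama side of the export, a model is typically registered from a Modelfile that points at the GGUF file. The sketch below only generates that file's text; the GGUF path, the temperature value, and the `build_modelfile` and `write_modelfile` helpers are assumptions for illustration, and the zero-downtime swap itself is not covered.

```python
from pathlib import Path


def build_modelfile(gguf_path: str, temperature: float = 0.7) -> str:
    """Return Modelfile text that loads a local GGUF file (path is hypothetical)."""
    return f"FROM {gguf_path}\nPARAMETER temperature {temperature}\n"


def write_modelfile(directory: str, gguf_path: str) -> Path:
    """Write the Modelfile into `directory` and return its path."""
    target = Path(directory) / "Modelfile"
    target.write_text(build_modelfile(gguf_path), encoding="utf-8")
    return target
```

Once the file exists, running `ollama create saba-gemma4-2b -f Modelfile` in that directory would register the model under the name used on this page.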
Built With
Powered by industry-leading tools and models
Gemma 4 2B
Base model from Google DeepMind
Unsloth
2x faster training, 50% less memory
Ollama
Local model serving & inference
Axolotl
Config-driven fine-tuning framework
llama.cpp
GGUF quantization & inference
OpenCode
Automated pipeline orchestration
SabaTech Internal: Gemma 4 2B Fine-Tune
How we used our own pipeline to build a faster, cheaper inference model
SabaTech (Internal)
Replacing llama.cpp Qwen with a custom fine-tuned Gemma 4 2B
We applied our own Auto-Train pipeline to fine-tune Gemma 4 2B for our internal agent workloads. The goal was to reduce inference latency and cost while maintaining quality. The stock model (gemma4:e2b via Ollama) served as the baseline. After automated LoRA training, evaluation, and GGUF quantization, saba-gemma4-2b delivered a 67.4% reduction in latency and 2.51x the baseline throughput, all within a single automated pipeline run.
"Within 2 hours of pipeline execution, we had a production-ready model with 67% better performance than stock. Zero manual intervention."
Ready to optimize your models?
Get a free consultation on setting up an automated fine-tuning pipeline for your use case.