I will build custom llm and slm with qlora
AI Engineer and Full Stack Developer: Expert in Scalable AI Solutions!
Over deze dienst
Fine-tune a custom LLM that knows YOUR domain not the whole internet.
I'm Raihan, an AI/ML engineer & CTO at ClarioScope AI. I train small language models from scratch (ORCH 350M3B, MedLLM, ILMA Lang) and fine-tune open LLMs with QLoRA on your real data.
What you get: Fine-tuned LLM/SLM on your dataset Llama, Mistral, Qwen, Gemma, Phi LoRA / QLoRA / full fine-tune (I pick what fits your data & budget) Dataset cleaning, formatting + synthetic data generation Evaluation report vs the base model (perplexity, accuracy) Inference-ready: Hugging Face, GGUF for Ollama, or an API endpoint Clean PyTorch code + documentation
Why me, not a $90 gig? Most "fine-tuning" gigs just wrap the OpenAI API. I build real SLMs from scratch so I choose the right base model & LoRA rank and ship a model that actually beats the base. Portfolio: raihan-js.github.io
️Process: Free scoping chat data prep training evaluation vs base delivery + handoff.
Your data stays private. Full weights + commercial-use rights available.
Message me your use case first for an accurate quote. Let's build it right!
Klanten waar ik mee heb gewerkt
GNatural Products
All Natural Skincare
I designed and developed Full WordPress Website for this client.
okt 2020
Mijn portfolio
Andere Data science en ML diensten die ik aanbied
Veelgestelde vragen
Do you train models from scratch or only fine-tune existing ones?
Both. I've trained the ORCH series (350M–3B) and MedLLM from scratch, and I fine-tune open LLMs daily. For most use cases, QLoRA fine-tuning a strong base (Llama 3.1, Mistral, Qwen) gives 80–90% of the benefit at a fraction of the cost — I'll recommend honestly based on your data.
Which base models can you fine-tune?
All major open-source LLMs: Llama 3.1/3.2 (1B–13B), Mistral 7B / Mixtral, Qwen 2.5, Gemma 2, Phi-3, DeepSeek, and Code Llama / Code Qwen. I can also fine-tune OpenAI (GPT-4o-mini, GPT-4.1) and Gemini via their tuning APIs.
How much training data do I need?
For LoRA/QLoRA, as few as 500 high-quality examples can work; 2,000–10,000 is the sweet spot. Have less? I generate synthetic data for you (Standard & Premium). Training a small model from scratch needs a substantial corpus — we'll confirm on the scoping call.
What hardware is used, and who pays for compute?
I use Runpod / Vast.ai (A100 / H100 GPUs). Compute for standard runs is included in all packages. For very large datasets or long pre-training, GPU cost may be billed at-cost as a small extra — always agreed upfront (typically $20–$120).
Will my data and trained model stay private?
Yes. Your data is used only for your project and never reused. You receive the full weights, code, and commercial-use rights (included in Premium; +$180 on Basic/Standard).
Can you deploy the model so my app can call it via API?
Yes — Premium includes a FastAPI + Docker container with an OpenAI-compatible endpoint, so your existing code just swaps the base URL. Standard buyers can add deployment for +$250.
What's the difference between fine-tuning and RAG?
Fine-tuning changes the model's behavior and knowledge in its weights. RAG retrieves answers from your documents at query time. Need RAG instead? I offer that as a separate gig — or message me and I'll tell you which one actually fits your goal.
Why should I hire you over a cheaper fine-tuning gig?
Most low-priced gigs are thin wrappers around the OpenAI API. I'm a CTO who trains real SLMs from scratch (portfolio: raihan-js.github.io) — so I'll tell you when fine-tuning is the wrong answer, pick the right base model and LoRA config, and deliver a model that measurably beats the base.

