cortexforge_ai

IMRAN ULLAH

@cortexforge_ai

Building intelligent AI systems with NLP and Vision

Pakistan
English, Korean, Spanish, Urdu
About me
I am a Senior AI/ML Engineer. I am new here but bring years of enterprise experience designing deep learning architectures. I build multi-agent systems with agent2agent and MCP workflows. For NLP and vision, I create smart systems, hybrid RAG, and OCR pipelines using Qwen3, YOLOv12, and SAM3. I specialize in synthetic dataset generation and model fine-tuning using PEFT, LoRA, QLoRA, DoRA, and Unsloth. I apply the latest reinforcement learning and preference-optimization methods such as RLHF, DPO, ORPO, GRPO, and Dr. GRPO. I optimize deployments using lightning-fast inference frameworks like vLLM, SGLang, TGI, ONNX, and TensorFlow...

Skills

Average response time: 1 hour

View my services

AI technology consulting
I will do local LLM deployment on-premise using vLLM, SGLang, Ollama, and llama.cpp
AI model fine-tuning
I will fine-tune LLM, VLM, Whisper, and Stable Diffusion models using Unsloth, LoRA, and RLHF post-training

Work experience

Upwork

Machine Learning Engineer

Upwork • Full-time

Feb 2024 - Nov 2025 (1 yr 9 mos)

At Grinda AI, I worked remotely as a Machine Learning Engineer. My primary role was leading the end-to-end lifecycle of large language models tailored for high-throughput banking environments. Because financial institutions require strict data privacy and cost efficiency, I led the technical initiatives to fine-tune and deploy multi-billion-parameter models entirely on-premise. A major part of my job was solving data scarcity. For a Korean banking client, I generated a custom synthetic dataset of over one million samples using GPT-4 and Claude. I used this synthetic data to fine-tune the Qwen2.5 32B model with QLoRA on multi-GPU clusters using DDP and FSDP. Beyond model training, I was responsible for production inference optimization. I deployed these fine-tuned financial models using vLLM and SGLang, and engineered the on-premise infrastructure to handle over 4,000 concurrent requests while optimizing GPU memory usage. I also designed evaluation pipelines using Ragas and custom frameworks to continuously benchmark our models for accuracy, latency, and financial-domain compliance. Additionally, my role expanded into low-resource speech AI. I fine-tuned OpenAI Whisper models for the Kazakh language, which achieved a 25 percent Word Error Rate and significantly outperformed the baseline models for audio transcription.