May 2025 - Present
Machine Learning Engineer Intern @ Vosyn AI
Role Overview:
Fine-tuning the DeepSeek 8B language model using LoRA, QLoRA, and 8-bit quantization for Arabic–English and English–Japanese translation.
Utilizing PyTorch and Hugging Face Transformers for efficient multi-GPU training and checkpoint management.
Building event-driven deployment pipelines on Google Cloud:
Model hosted on Vertex AI Endpoints.
Inference via Google Kubernetes Engine (GKE) and Cloud Run.
Triggers linked to Cloud Storage bucket uploads.
Parallelizing inference using ThreadPoolExecutor, batch chunking, and caching strategies to support scalable translation of large files (e.g., SRTs).
Currently deploying a low-latency speech-to-speech (S2S) translation system that integrates a Node.js backend with a Django microservice, containerized with Docker and served on Cloud Run for scalability.
Upgrading to Compute Engine to maintain a monolithic architecture for real-time synchronization and tighter coordination across backend components via REST APIs and gRPC.
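The batch chunking, thread-pool parallelism, and caching mentioned above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: `translate_line` is a hypothetical stand-in for a call to the hosted model, and the batch size and worker count are assumed values.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


@lru_cache(maxsize=4096)
def translate_line(text):
    """Hypothetical stand-in for a request to the hosted translation model."""
    return text.upper()


def translate_batch(batch):
    # Repeated subtitle lines are answered from the cache, not the model.
    return [translate_line(text) for text in batch]


def translate_blocks(blocks, batch_size=5, max_workers=4):
    """Translate subtitle blocks in parallel while preserving their order."""
    batches = chunk(blocks, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order the batches were submitted.
        results = list(pool.map(translate_batch, batches))
    return [line for batch in results for line in batch]
```

Threads suit this workload because each batch spends most of its time waiting on network I/O, so the calls overlap even under the GIL.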
Impact Highlights:
BLEU improvement: 0.00 → 16.06 (JA→EN), 3.08 → 34.56 (AR→EN).
BLEURT boost: +36%.
Pipeline latency reduced from 2 minutes to 30–40 seconds for 25-block SRT files.
Tech Stack:
DeepSeek 8B, LoRA, QLoRA, 8-bit quantization, PyTorch, Hugging Face, Vertex AI, GKE, Cloud Run, Cloud Storage, Docker, GitHub Actions, Python, Node.js, Django, CUDA, GCP.
Growth & Takeaways:
Strengthened expertise in efficient LLM tuning and deployment orchestration using event-driven cloud infrastructure.
Improved understanding of event-based architectures, LLM I/O bottlenecks, and efficient parallel compute management.
Gained deeper insight into MLOps orchestration, parallel processing, and low-latency, real-time inference optimization.
Jun 2025 - Present
Instructor Assistant, Upward Bound @ CU Boulder
Role Overview:
Supporting a programming fundamentals course with 15+ students.
Assisting lab sessions on Python, covering core topics like:
Data types, control flow, and function design
File I/O operations
Debugging strategies and exception handling
Providing personalised assistance via Jupyter Notebooks and live feedback during labs.
Tracking student progress through performance metrics, surveys, and assignment reviews.
Impact Highlights:
Increased student comprehension by 40% via personalised guidance and iterative feedback.
Helped multiple students move from basic to intermediate proficiency in Python within one semester.
Tech Stack:
Python, Terminal, VS Code, Excel, Google Classroom.
Growth & Takeaways:
Improved ability to explain abstract programming concepts through analogies and live examples.
Learned how to tailor technical communication to varied learning styles and skill levels.
May 2023 - July 2023
Machine Learning Research Intern @ DRDO INMAS
Role Overview:
Designed and implemented a real-time video acquisition pipeline using:
Arduino UNO for hardware trigger generation
OpenCV and Python for real-time video capture and frame processing
Achieved hardware-software synchronisation through serial communication and multithreaded signal handling.
Applied advanced computer vision techniques:
Gaussian blur, adaptive thresholding, contour detection
Region of Interest (ROI) segmentation for feature extraction
Enhanced the visual analytics pipeline for researchers performing object-based study of frames in lab settings.
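The hardware-software synchronisation described above can be illustrated with a small producer-consumer sketch: one thread reads trigger signals while the main loop grabs a frame per trigger. This is an assumed structure, not the original implementation. `read_line` and `grab_frame` are hypothetical callables standing in for a serial-port read and a camera capture, and the `"STOP"` token is invented for the example.

```python
import queue
import threading


def trigger_reader(read_line, triggers, stop_token="STOP"):
    """Read lines from a trigger source and queue them for the capture loop."""
    while True:
        line = read_line()
        if line == stop_token:
            triggers.put(None)  # Sentinel: tells the consumer to finish.
            return
        triggers.put(line)


def capture_on_trigger(triggers, grab_frame):
    """Grab one frame for each queued trigger until the sentinel arrives."""
    frames = []
    while True:
        trigger = triggers.get()  # Blocks until the reader queues something.
        if trigger is None:
            return frames
        frames.append((trigger, grab_frame()))


def run_acquisition(read_line, grab_frame):
    """Run the reader thread alongside the capture loop and collect frames."""
    triggers = queue.Queue()
    reader = threading.Thread(target=trigger_reader, args=(read_line, triggers))
    reader.start()
    frames = capture_on_trigger(triggers, grab_frame)
    reader.join()
    return frames
```

Passing triggers through a thread-safe queue keeps the blocking serial read off the capture loop, so a slow trigger source does not stall frame processing.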
Impact Highlights:
Latency improvement: 73% faster retrieval.
Improved the robustness of experimental imaging pipelines for extended-duration trials.
Tech Stack:
Arduino, OpenCV, Python, Serial Protocols, Multithreading, USB Camera Interface, NumPy, Matplotlib, CNN.
Growth & Takeaways:
Developed real-time embedded-vision system skills.
Learned performance tuning and multithreaded control at the edge device level.
Dec 2022 - Feb 2023
Machine Learning Research Intern @ DRDO INMAS
Role Overview:
Built CNN models using TensorFlow and Keras to detect COVID-19 from chest X-ray images.
Preprocessed 700+ real medical images using:
Noise filtering
Data augmentation (flips, rotations, scaling)
Designed both binary and multi-class pipelines to analyze diagnostic trade-offs.
Tuned hyperparameters using early stopping, dropout, and L2 regularization to avoid overfitting.
Generated heatmaps for model interpretability, aiding physician validation of predictions.
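The early-stopping rule mentioned above can be shown in plain Python, independent of any framework. This is a minimal sketch of the general technique; the function name, `patience`, and `min_delta` defaults are illustrative assumptions rather than the settings used in the project.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value by more than `min_delta` for `patience` consecutive epochs.
    If that never happens, the index of the last epoch is returned.
    """
    best = float("inf")
    stale = 0  # Consecutive epochs without a meaningful improvement.
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1
```

Stopping on validation loss rather than training loss is what makes this a guard against overfitting: training loss usually keeps falling after the model has stopped generalising.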
Impact Highlights:
Binary model achieved 93.75% accuracy vs. 71% for multi-class.
Produced visual activation maps for clinical feedback and explainability.
Tech Stack:
TensorFlow, Keras, Python, Scikit-learn, Matplotlib, Grad-CAM, NumPy, OpenCV, Jupyter.
Growth & Takeaways:
Strengthened understanding of medical AI workflows and data bias handling.
Learned how to align ML pipelines with real-world diagnostic requirements.