Nauman Mustafa

About Me

Experience

7+ years in industry

6 years ML Engineering, 2 years vibe coding

Core Skills

Python, PyTorch, Docker, FastAPI, Google Cloud, Prompt Engineering

Technical Solutions

OCR, Computer Vision, NLP, Transformers, Training & Fine-tuning LLMs

Industry Domains

Retail: Receipt Scanning

Software Testing: Visual Testing, E2E Test Generation, Source Code Analysis

Personal Work/Writings

Sep 2023

MLUI Mobile: Autify OCR vs. Google OCR

Comprehensive performance comparison of Autify's in-house OCR system against Google Cloud OCR and EasyOCR, with detailed methodology, results analysis, and sample evaluations. Autify OCR achieved 91% accuracy on mobile screenshot text recognition.

Work Done in Autify

Aug 2023

Token Compression: Reducing Attention Waste?

Explored using LLMs to compress multiple tokens into a single token for more efficient transformers. Demonstrated that a 2048-dimensional hidden state can compress ~8 tokens losslessly using a two-stage encode-decode architecture with LoRA fine-tuning.

Apr 2023

Long Pythia

Explored context-length extension at a time when context windows of 2048 or 4096 tokens were the norm.

Nov 2021

Machine Learning Features in Autify for Mobile

Comprehensive overview of AI-powered mobile testing features including Visual Regression Testing (VRT), Visual Self-Healing algorithms, and the upcoming Visual App Explorer (VAX) for autonomous app navigation and bug discovery.

Work Done in Autify

Jul 2021

Solving Automated App Navigation: A Use-case

Detailed exploration of behavior cloning techniques for automated mobile app navigation, comparing regression vs heatmap approaches, and demonstrating how U²Net successfully models uncertainty in tap location prediction.

Work Done in Autify
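
The regression vs heatmap trade-off mentioned above can be illustrated with a toy sketch. This is not the U²Net model from the write-up; the tap coordinates, grid size, and function names are made up for illustration. It only shows why a single regressed point struggles when several tap targets are plausible, while a heatmap keeps each mode separate.

```python
from collections import Counter

# Hypothetical demonstrations: on the same screen, users tapped one of two buttons.
# Coordinates are (x, y) pixels; the values are invented for this example.
taps = [(45, 305), (47, 303), (46, 306), (285, 305), (284, 307)]


def regression_prediction(points):
    """The single point that minimises squared error is the mean of the targets."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def heatmap_prediction(points, cell=20):
    """Bin taps into a coarse grid and return the centre of the densest cell."""
    counts = Counter((x // cell, y // cell) for x, y in points)
    (cx, cy), _ = counts.most_common(1)[0]
    return (cx * cell + cell / 2, cy * cell + cell / 2)


reg = regression_prediction(taps)
heat = heatmap_prediction(taps)
```

On these made-up taps, the regressed point lands near x = 141, in the empty space between the two buttons, while the densest heatmap cell is centred at (50.0, 310.0), on the first button.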

Jan 2021

Applying Modern Deep Learning in Autify

Comprehensive overview of deep learning applications in software testing, including visual regression detection, genetic algorithm optimization, graph neural networks for HTML analysis, and reinforcement learning for intelligent test discovery.

Work Done in Autify

Apr 2020

Getting the Most Out of Pre-trained Models

Deep dive into pre-trained NLP models like GPT-2 and T5, exploring their capabilities for text generation, question answering, summarization, and transfer learning applications. Originally published on Toptal.

Published in Toptal

Jun 2019

Professional Experience

Autify

Sr. Machine Learning Engineer

May 2020 - June 2025 Tokyo, Japan

Jan 2025

Bootstrapped project Lexa

Generates PRDs from source code
Vibe-coded the entire desktop app in SwiftUI using Cursor
Vibe-migrated the desktop app to Electron for cross-platform support

Jan 2024

Bootstrapped project Genesis

Designed Prompts for the following:
- Generate Test Cases from PRD
- Generate Test Scenarios from Test Cases
- Generate Playwright code from Test Scenarios using Agentic AI
Built the backend and frontend using LLMs

July 2023

Developed Step Suggestion Chrome Extension

Wrote the frontend in Preact
Wrote the backend in Python/FastAPI
Developed the algorithm to slim down the HTML
Designed the prompt to provide next step suggestions based on current page
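
A minimal sketch of what HTML slimming for an LLM prompt might look like. The actual algorithm is not described here, so everything below is an assumption: the dropped tags, the attribute allow-list, and the names `HtmlSlimmer` and `slim_html` are hypothetical. It uses only the standard-library `html.parser`.

```python
import html
from html.parser import HTMLParser

# Assumptions for illustration only: which subtrees to drop and which attributes to keep.
DROP_TAGS = {"script", "style", "svg", "noscript"}
KEEP_ATTRS = {"id", "name", "type", "href", "placeholder", "aria-label", "role"}
VOID_TAGS = {"input", "br", "img", "hr", "meta", "link"}


class HtmlSlimmer(HTMLParser):
    """Re-emits markup without dropped subtrees or non-allow-listed attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped subtree

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = "".join(
            f' {k}="{html.escape(v, quote=True)}"'
            for k, v in attrs
            if k in KEEP_ATTRS and v is not None
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        # Void elements have no closing tag, so none is emitted for them.
        if not self.skip_depth and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.out.append(html.escape(text, quote=False))


def slim_html(markup):
    parser = HtmlSlimmer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

The allow-list keeps attributes that help identify an element (ids, labels, roles) and drops styling and event handlers, which add tokens without telling the model much about what the user can do next.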

Jan 2022

Core Deep Learning Work

Developed MLUI, an ensemble of ML models for parsing screenshots of mobile apps
- A custom OCR model based on an encoder-decoder transformer architecture, trained on synthetic data
- RetinaNet fine-tuned on the Rico dataset plus an internal dataset for detecting UI elements
- A post-processing model to merge the outputs of the above two models
Deployed MLUI at scale on GKE using cost-effective GPU and queue optimizations
Explored reinforcement learning for automated testing

Jan 2021

Built the baseline ML Infrastructure at Autify

Set up Google Cloud GKE for ML Deployment
Set up pipeline for VRT (Visual Regression Testing) algorithm deployment at scale on GKE
Fine-tuned Adobe Semantic Web Segmentation model
Explored Visual Self-Healing using Template Matching

May 2020

Early Exploration of Application of ML on Web Apps

Implemented OpenCV-based web screenshot comparison
Fine-tuned BERT on HTML data
Optimized the weights of an internal feature extraction algorithm using evolutionary algorithms
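
As a rough illustration of evolutionary weight optimization, here is a minimal (1+λ) evolution strategy in plain Python. The internal feature extraction algorithm and its real fitness measure are not shown here, so `toy_fitness`, `TARGET`, and all hyperparameters are hypothetical stand-ins.

```python
import random


def evolve_weights(fitness, n_weights, generations=200, offspring=20, sigma=0.3, seed=0):
    """(1+lambda) evolution strategy: keep the parent unless a mutated child scores higher."""
    rng = random.Random(seed)
    parent = [rng.uniform(-1, 1) for _ in range(n_weights)]
    parent_score = fitness(parent)
    for _ in range(generations):
        # Mutate the parent with Gaussian noise to produce candidate weight vectors.
        children = [[w + rng.gauss(0, sigma) for w in parent] for _ in range(offspring)]
        best = max(children, key=fitness)
        best_score = fitness(best)
        if best_score > parent_score:
            parent, parent_score = best, best_score
        sigma *= 0.99  # shrink the mutation step slowly to refine the search
    return parent, parent_score


# Hypothetical fitness: higher is better, maximal when the weights match TARGET.
TARGET = [0.2, 0.5, 0.3]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


best_weights, best_score = evolve_weights(toy_fitness, n_weights=3)
```

Because the method only needs a fitness score per candidate, it suits weights inside a non-differentiable pipeline where gradient-based training is not an option.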

Other Writings

Recent Advancements in AI

This Icon Does Not Exist — GAN for Icon Generation

Deploy ML on Cloud Run

Cloud Run — Future Tech

Deep Learning in Cloud
