Nauman Mustafa

AI Systems Engineer (prev. Sr. Machine Learning Engineer)

7+ years building ML-backed products, with work across OCR, computer vision, NLP, transformers, LLM workflows, and software testing systems.


About Me

Industry: 7+ years

Product engineering and applied ML delivery.

ML engineering: 6 years

Training, deployment, OCR, CV, NLP, and model pipelines.

AI-assisted delivery: 2 years

Agentic coding, prompt systems, and Playwright generation.

Core Skills

  • Python
  • PyTorch
  • Docker
  • FastAPI
  • Google Cloud
  • Prompt engineering
  • Playwright workflows

Technical Solutions

  • OCR and document understanding
  • Computer vision systems
  • LLM workflows and agents
  • Model deployment and infrastructure
  • Source-code and UI analysis

Industry Domains

  • Retail receipts: Receipt scanning and OCR extraction.
  • Software testing: E2E test generation, step suggestion, and self-healing.
  • Visual testing: Screenshot comparison and regression workflows.
  • Mobile UI understanding: Screen parsing, element detection, and action support.

Professional Experience

Autify

Sr. Machine Learning Engineer · May 2020 - June 2025 · Tokyo, Japan

Bootstrapped project Lexa

  • Generated PRDs from source code.
  • Vibe-coded the first desktop app using Cursor and SwiftUI.
  • Migrated the desktop app to Electron for multiple platforms.

Bootstrapped project Genesis

  • Designed prompts to generate test cases from PRDs.
  • Designed prompts to generate test scenarios from test cases.
  • Generated Playwright code from scenarios using agentic AI.
  • Built backend and frontend workflows with LLM assistance.

Developed Step Suggestion Chrome Extension

  • Wrote frontend code in Preact.
  • Wrote backend services in Python and FastAPI.
  • Developed the algorithm to slim down HTML before model input.
  • Designed prompts for next-step suggestions based on the current page.

Core Deep Learning Work

  • Developed MLUI, an ensemble of ML models for parsing mobile app screenshots.
  • Built a custom OCR model based on an encoder-decoder transformer architecture trained on synthetic data.
  • Fine-tuned RetinaNet on RICO and internal data for detecting UI elements.
  • Built post-processing to merge model outputs and deployed MLUI at scale on GKE with cost-effective GPU and queue optimizations.
  • Explored reinforcement learning for automated testing.

Built the Baseline ML Infrastructure at Autify

  • Set up Google Kubernetes Engine (GKE) on Google Cloud for ML deployment.
  • Set up deployment pipelines for visual regression testing at scale on GKE.
  • Fine-tuned Adobe semantic web segmentation models.
  • Explored visual self-healing using template matching.

Early Exploration of ML on Web Apps

  • Implemented OpenCV-based web screenshot comparison.
  • Fine-tuned BERT on HTML data.
  • Optimized internal feature extraction weights with evolutionary algorithms.