Nauman Mustafa: Writing Archive

Writing on AI systems, OCR, LLMs, and product delivery

This archive organizes the site by topic rather than chronology, so the strongest internal links are explicit and older essays are easier to read in context.

Some posts from 2019–2021 are preserved as historical snapshots. They remain useful, but they reflect the state of the field at the time they were written.

AI Product Delivery

Field reports on shipping with coding agents, choosing stacks, and keeping quality high while moving fast.

LLM Systems + Research Notes

Experiments and essays on context length, token efficiency, prompting, and practical LLM engineering.

Token Compression

An experiment on compressing multiple tokens into one representation to reduce attention waste.
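As a rough illustration of the idea, one simple way to merge several tokens into a single representation is to mean-pool consecutive embeddings; this is a hypothetical sketch of the general technique, not the method used in the experiment, and the function name and group size are invented for the example:

```python
import numpy as np

def compress_tokens(embeddings: np.ndarray, group: int = 4) -> np.ndarray:
    """Mean-pool consecutive token embeddings, one output vector per group.

    embeddings: array of shape (seq_len, dim).
    Returns an array of shape (ceil(seq_len / group), dim).
    """
    seq_len, dim = embeddings.shape
    pad = (-seq_len) % group  # zero-pad so seq_len divides evenly
    if pad:
        embeddings = np.vstack([embeddings, np.zeros((pad, dim))])
    return embeddings.reshape(-1, group, dim).mean(axis=1)

# Self-attention cost grows with seq_len**2, so shortening a 1024-token
# sequence by a factor of 4 cuts the quadratic term by roughly 16x.
x = np.random.randn(1024, 64)
print(compress_tokens(x).shape)  # (256, 64)
```

The point of any such scheme is the same: fewer positions entering attention means quadratically less compute, at the cost of some information lost in the merge.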

Long Pythia

Extending transformer context length and documenting the tradeoffs that show up in practice.
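For context, one widely used family of context-extension tricks rescales rotary position embeddings so longer sequences map back into the trained position range. The sketch below shows plain position interpolation under that assumption; the essay's actual approach may differ, and the function name is invented for the example:

```python
import numpy as np

def rope_angles(positions, dim: int, base: float = 10000.0,
                scale: float = 1.0) -> np.ndarray:
    """Rotary-embedding angles with optional position interpolation.

    scale < 1 squeezes new, longer positions into the range the model
    was trained on, one common way to extend context length.
    Returns an array of shape (len(positions), dim // 2).
    """
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    return np.outer(np.asarray(positions) * scale, inv_freq)

# Model trained to 2048 positions, extended to 8192:
# scaling by 2048/8192 keeps every angle inside the trained range.
angles = rope_angles(np.arange(8192), dim=64, scale=2048 / 8192)
print(angles.shape)  # (8192, 32)
```

The tradeoff documented in practice is that squeezing positions together blurs nearby tokens, so extended models often need some fine-tuning to recover short-context quality.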

OCR + Computer Vision

Mobile UI parsing, OCR evaluation, and machine learning features that shipped into real testing products.

Cloud + ML Infrastructure

Practical notes on deploying models, choosing infrastructure, and understanding cloud tradeoffs.

Cloud Run

A high-level introduction to why serverless containers mattered for ML application delivery.

Deep Learning in Cloud

A cost-oriented comparison of the cloud options that were available for deep learning at the time.

Historical Snapshots

Older essays still worth keeping online, best read as time-bound snapshots rather than current guidance.