Deep Learning in the Cloud

A cost-focused comparison of cloud GPU options for deep learning across AWS, Paperspace, Colab, and Google Cloud preemptible machines.

Free baseline: Colab T4
Best value: T4 at $0.35/hr
High-end: V100 at $0.80/hr
Tradeoff: Preemptible

Cheap but High Quality

Google Colab is a free platform that provides GPUs for training. It is excellent for experiments, but not ideal for larger models such as BigGAN or large language models. The annoying part is that you have to keep copying checkpoints to Google Drive and restore them whenever the runtime disappears.
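The Drive round trip can be scripted. Here is a minimal sketch, assuming Drive is already mounted inside the Colab runtime (commonly under /content/drive/MyDrive) and that checkpoints are ordinary files. The function names, the `ckpt_*.pt` naming pattern, and the zero-padded step numbers are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def backup_checkpoint(local_file, drive_dir):
    """Copy one checkpoint file into the (assumed) mounted Drive folder."""
    drive_dir = Path(drive_dir)
    drive_dir.mkdir(parents=True, exist_ok=True)
    target = drive_dir / Path(local_file).name
    tmp = target.with_name(target.name + ".tmp")
    shutil.copy2(local_file, tmp)  # copy under a temporary name first
    tmp.replace(target)            # then rename, so a half-written copy is never restored
    return target


def restore_latest(drive_dir, local_dir, pattern="ckpt_*.pt"):
    """Copy the newest checkpoint back to local disk, or return None if there is none.

    Assumes zero-padded step numbers in the file names, so that
    alphabetical order matches training order.
    """
    candidates = sorted(Path(drive_dir).glob(pattern))
    if not candidates:
        return None
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    newest = candidates[-1]
    return Path(shutil.copy2(newest, local_dir / newest.name))
```

Calling `backup_checkpoint` after each save and `restore_latest` at the top of the notebook is enough to survive a runtime reset, at the cost of one extra copy per checkpoint.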

I had a GTX 1060 laptop, which was fine for prototyping models and training scripts. Once the scripts were ready, though, the laptop was not enough for bigger runs. That pushed me to compare cloud GPU options.

Amazon Web Services (AWS)

AWS was the obvious first place to look because it had the trust, maturity, and breadth of services. The problem was pricing. For the kind of training I wanted to do, GPU cost mattered more than the surrounding cloud ecosystem.

A p2.xlarge was about $0.90/hour, but its K80 GPU was weak for modern deep learning. A p3.2xlarge with a V100 was around $3.00/hour on demand, or roughly $1.20/hour as a spot instance.

Paperspace

My search continued because some models needed less compute than a V100, and even $1.20/hour was still expensive for hobby use.

Paperspace had lower-end GPUs at better prices: a P4000 (close to a desktop GTX 1070 in performance) for about $0.51/hour, a P5000 for around $0.78/hour, a P6000 for around $1.10/hour, and a V100 for around $2.30/hour.

I used Paperspace for a while, but the upfront storage cost was a drag because I only needed a few hours of training in a typical month. That kept the search going.
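The drag from a fixed storage charge is easy to quantify, because the fee gets spread over however many GPU hours you actually use. A rough sketch follows, using the P4000 rate quoted above and a placeholder storage fee of $5 per month. That fee is an assumed figure for illustration, not Paperspace's actual price.

```python
def effective_hourly_cost(gpu_rate, monthly_fixed_fee, hours_per_month):
    """GPU rate plus the fixed monthly fee spread over the hours actually used."""
    if hours_per_month <= 0:
        raise ValueError("hours_per_month must be positive")
    return gpu_rate + monthly_fixed_fee / hours_per_month


P4000_RATE = 0.51   # $/hour, from the text
STORAGE_FEE = 5.00  # $/month, hypothetical placeholder

for hours in (4, 20, 100):
    cost = effective_hourly_cost(P4000_RATE, STORAGE_FEE, hours)
    print(f"{hours:3d} h/month -> ${cost:.2f} per GPU hour")
```

With these placeholder numbers, four hours a month comes to about $1.76 per GPU hour, more than three times the sticker rate, while 100 hours comes to about $0.56.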

Google Cloud Platform (GCP)

Google Cloud was compelling because the preemptible prices were low enough for hobby deep learning work. A T4 at around $0.35/hour and a V100 at around $0.80/hour changed the economics.

The other useful difference from Colab was persistence. A preemptible machine can still be shut down at any time, but the data on its persistent disk stays intact. As long as you save checkpoints regularly, an interruption is annoying rather than destructive.
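In practice, "save checkpoints regularly" means the training script should always try to resume from disk when it starts. Below is a minimal sketch of that pattern using only the standard library. The JSON state, the `train` function, and the `stop_after` argument (which simulates a preemption) are all illustrative stand-ins; a real script would save model and optimizer state with its framework's own tools.

```python
import json
import os
from pathlib import Path


def save_state(state, path):
    """Write the state to a temp file, then atomically swap it into place."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, path)  # a crash mid-write leaves the previous checkpoint intact


def load_state(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    return {"step": 0, "loss_sum": 0.0}


def train(path, total_steps, save_every=10, stop_after=None):
    """Run or resume training; stop_after simulates a preemption."""
    state = load_state(path)
    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated preemption: work since the last save is lost
        state["step"] += 1
        state["loss_sum"] += 1.0 / state["step"]  # stand-in for a real training step
        if state["step"] % save_every == 0:
            save_state(state, path)
    save_state(state, path)
    return state
```

If the first run is interrupted at step 37 with `save_every=10`, the next call to `train` picks up from step 30, so the most that can be lost is one save interval of work.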

Platform | GPU / machine | Approx. price at the time | My note
AWS | p2.xlarge / K80 | $0.90/hour | Cheap, but weak for deep learning.
AWS | p3.2xlarge / V100 | $3.00/hour | Powerful, but expensive.
AWS Spot | p3.2xlarge / V100 | $1.20/hour | Better, but still high for hobby use.
Paperspace | P4000 | $0.51/hour | Close to desktop GTX 1070 performance.
Paperspace | P5000 / P6000 | $0.78-$1.10/hour | Good mid-range options without spot pricing.
Google Cloud | T4 preemptible | $0.35/hour | The most attractive value for my workload.
Google Cloud | V100 preemptible | $0.80/hour | High-end compute at a much lower price.
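To see what these rates mean for an actual run, the prices from the table above can be multiplied out for a fixed job length. This is a rough sketch: it leaves out the P5000/P6000 row because that row is a price range. It also assumes the job takes the same number of hours on every GPU, which is not true in practice, since a V100 will finish sooner than a K80. Storage and network charges are ignored as well.

```python
# Approximate hourly prices quoted in the table (USD, at the time of writing).
PRICES = {
    "AWS p2.xlarge (K80)": 0.90,
    "AWS p3.2xlarge (V100)": 3.00,
    "AWS Spot p3.2xlarge (V100)": 1.20,
    "Paperspace P4000": 0.51,
    "Google Cloud T4 preemptible": 0.35,
    "Google Cloud V100 preemptible": 0.80,
}


def job_costs(hours, prices=PRICES):
    """Return (option, total cost) pairs for a job of the given length, cheapest first."""
    costs = {name: round(rate * hours, 2) for name, rate in prices.items()}
    return sorted(costs.items(), key=lambda item: item[1])


for name, cost in job_costs(10):
    print(f"{name:32s} ${cost:6.2f}")
```

For a 10-hour run, the preemptible T4 comes to $3.50 and the preemptible V100 to $8.00, against $12.00 for the AWS spot V100 and $30.00 on demand.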

Key Takeaway

Google Cloud offered the best cost/performance story for this workload at the time, especially with preemptible instances. Colab remained the easiest free baseline, but preemptible GCP machines made longer or heavier experiments more practical.