Machine Learning on Local or Cloud-based NVidia or Apple GPUs

Introduction

This blog details various configurations for running machine learning software for LLM and general AI applications on a variety of hardware, including NVidia professional workstation GPUs (local or cloud) and local Apple ARM hardware.
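As a minimal sketch of how the same training code can target all of these backends (assuming PyTorch is installed; the pick_device helper below is illustrative, not part of this wiki), device selection at startup can look like this:

```python
# Minimal device-selection sketch (assumes PyTorch >= 1.12 for MPS support).
# pick_device is an illustrative helper, not part of this wiki's tooling.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():           # NVidia GPU, local or cloud
        return torch.device("cuda")
    if torch.backends.mps.is_available():   # Apple Silicon (ARM) GPU
        return torch.device("mps")
    return torch.device("cpu")              # fallback

print(f"Running on: {pick_device()}")
```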

Performance

Batch size variations among GPUs (shorter time per iteration is better). Notice that a dual-GPU setup is performant only at large batch sizes above 1024, and that performance correlates with GPU core count - in this case 32768 cores for the dual RTX-4090.

[Figure: time per iteration vs. batch size across the GPU configurations listed below]
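One way to reproduce this kind of measurement is to time a fixed number of training iterations at each batch size, wrapping the model in DataParallel when more than one GPU is present. The sketch below assumes PyTorch; the model, layer sizes, and batch sizes are illustrative toys, not the exact benchmark behind the chart.

```python
# Toy batch-size timing benchmark (assumes PyTorch; model/sizes are illustrative).
import time
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # split each batch across all visible GPUs

loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.01)

def sync():
    if torch.cuda.is_available():
        torch.cuda.synchronize()     # wait for queued GPU work so timings are accurate

for batch_size in (256, 512, 1024, 2048, 4096):
    x = torch.randn(batch_size, 4096, device=device)
    y = torch.randint(0, 10, (batch_size,), device=device)
    sync()
    start = time.time()
    for _ in range(10):              # average over 10 iterations
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    sync()
    print(f"batch {batch_size}: {(time.time() - start) / 10:.3f} s/iteration")
```

Because DataParallel splits each batch along the batch dimension, small batches leave each card underutilized and the inter-GPU overhead dominates, which is consistent with the crossover above batch size 1024 in the chart.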

Quickstart

Setup

Architecture

DevOps

Example ML Systems

2023 Lenovo P1 Gen 6 : i7-13800H 64G and NVidia RTX-3500 Ada AD104 5120 cores 12G 192-bit VRAM

2020 Lenovo P17 Gen 1 : Xeon W-10855M 128G and NVidia Quadro RTX-5000 TU104 Turing 3072 cores 16G 256-bit VRAM

2023 Custom : i9-13900K 192G and Dual NVidia RTX-4090 MSI Suprim Liquid X

2023 Custom : i9-13900K 128G and Dual NVidia RTX-A4500 with NVidia RTX-A4000

2021 Lenovo X1 Carbon gen 9 : Intel GPU

Google Cloud Workstation : NVidia L4 GPU

Google Pixel 6 : Google TPU

Links

PMLE Training

PMLE Notes

Hardware

AD102 RTX-4090 Ada Consumer

  • 24GB, 384-bit, 1008 GB/s, 16384 cores, 76B transistors, 1344 GTexel/s

AD104 RTX-3500 Ada Mobile Workstation P1Gen6 2023

  • 12GB, 192-bit, 432 GB/s, 5120 cores, 35B transistors, 319 GTexel/s

RTX-A4500 Ampere Workstation 2021

  • 20GB

RTX-A4000 Ampere Workstation 2021

  • 16GB

TU104 RTX-5000 Turing Mobile Workstation P17Gen1 2020

  • 16GB
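
The VRAM sizes listed above can be cross-checked on any of the CUDA machines; the following is a minimal sketch assuming PyTorch with CUDA support:

```python
# Sketch: enumerate detected NVidia GPUs and their memory (assumes PyTorch + CUDA).
import torch

for i in range(torch.cuda.device_count()):
    p = torch.cuda.get_device_properties(i)
    # multi_processor_count is SMs; CUDA cores per SM depend on the architecture
    print(f"GPU {i}: {p.name}, {p.total_memory / 2**30:.1f} GiB VRAM, "
          f"{p.multi_processor_count} SMs")
```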