Hi, I’m Will Paik. Welcome to The Login Node.
I’m an HPC Performance Engineer specializing in optimizing large-scale GPU clusters for AI/ML workloads. In supercomputing, there’s always a natural tension between system administrators (“Keep it stable!”) and researchers (“Run it faster!”). My job is to find the technical sweet spot that makes both happy.
During the day I work on production HPC infrastructure for AI research. Outside of work, I'm building a mini-supercomputer from consumer hardware and documenting every step here.
CORE STACK: Slurm, Linux, Docker/Apptainer, PyTorch Distributed, Ansible
What You’ll Find Here
The Login Node is an HPC and ML infrastructure engineering blog aimed at people who want to understand how the underlying systems actually work – not just how to submit a job and wait.
Content is organized into three series:
My Home Cluster
| Role | Hardware | Specs |
|---|---|---|
| Login Node | Lenovo IdeaPad 1 | Ryzen 5 7520U, 8GB RAM |
| Management | Lenovo ThinkCentre M715q | Ryzen 5 2400GE, 16GB RAM |
| Visualization | Lenovo ThinkCentre M715q | Ryzen 5 2400GE, 16GB RAM |
| Worker Nodes (x2) | Lenovo ThinkCentre M715q | Ryzen 5 2400GE, 16GB RAM |
| GPU Node | HP Envy TE01 | Core i7-10700F, 32GB RAM, GTX 1660 Super |
| Storage | (via Management) | 1TB NVMe SSD (NFS) |
| Network | Gigabit Managed Switch | 8-port, VLAN support |
Software stack: Rocky Linux 10, Slurm 25, Ansible, Apptainer, Prometheus + Grafana (in progress)
Background
I hold a PhD in Aerospace Engineering from Penn State with a minor in Computational Science, and spent 8 years there supporting 500+ researchers before moving to Northeastern University. My astrodynamics background informs how I think about large-scale optimization problems; I now apply that thinking to GPU clusters instead of spacecraft trajectories.
For the full professional history, see the Career page.