NLP is going to the cloud! The next generation of LLMs will have a trillion parameters; you’ll never run those locally! Why would you buy a desktop ML rig?

The crypto market crashed, and GPU prices fell below MSRP for the first time in a while. I made an impulse purchase of two RTX 3090s. That’s pretty much it.

That’s not much justification, and this was far from a disposable-income purchase. But it really did start with an impulse buy. I think shallow justification can work for or against you, and this time I let it motivate me to do something new.

After staring at the 3090s for a bit, I realized there were a few more good reasons:

Sunk cost (the good kind). Cognitive biases like the sunk cost fallacy usually get a bad rap. At least in this context, the sunk cost bias can be a useful motivator. It’s hard to wake up, look at a brand new machine, and not want to put it to use. Language models are getting bigger, and a lot of the action in NLP is happening on cloud infra. But when you pay per token for a cloud API, there is a financial disincentive to run more experiments. With local compute, I feel a little bit bad if I don’t run more experiments.

Work for the meticulous muscle. Working with code has changed my psychology. Building something in software, you can start with whatever you know best, and build out the less-certain components later. Very few other disciplines work this way. You can’t build the 14th floor of a skyscraper before building the foundation. This relative freedom from constraint is one of my favorite parts of working with code, but it has a downside. In “the real world” I’m uncomfortably aware of how order-sensitive and irreversible most tasks are.

Every so often I think I can cure myself with exposure therapy. Yeah, I want to be a meticulous craftsman. Yeah, I want to do something that requires attention to detail at each of a dozen irreversible steps.

This rarely goes as planned. I usually realize halfway through that I like software a lot more. If I’m being honest, this was another one of those times. But I will count it as having worked some important mental muscle or something. No pain no gain.

Work while sleeping. Now I can use my laptop to watch Netflix in the evening without worrying about stealing cycles from experiment runs. There is something very pleasing about knowing that your computer is working for you even while you sleep. There’s always something the computer could be doing to help you make progress. Often this is a reasonable experiment where you have expectations about the outcome. But sometimes you finish those, and then what? Having resources that would otherwise sit idle encourages me to be more creative when I’ve exhausted the obvious ways to put the computer to work. Sometimes it’s a long training run for no reason other than to see what happens.

Become more knowledgeable. Breadth vs. depth is a kind of exploration-exploitation tradeoff: broad encyclopedic knowledge comes in handy, but in unpredictable ways. Increasing depth of knowledge in an area of expertise has more predictable rewards, but it’s hard to be creative without some crossover ideas from other domains. This project gave me a chance to learn a little about areas adjacent to my main interests.

I learned something about compute hardware. What is a PCIe lane, for example? I hadn’t thought much about the connection between computer architecture in the abstract and the reality of a desktop computer before this project.
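For anyone wondering the same thing: each lane is an independent serial link between the CPU and the card, and the bandwidth arithmetic is simple enough to do in a few lines. Here’s a back-of-envelope sketch, assuming PCIe 4.0 (16 GT/s per lane with 128b/130b encoding, which is what the 3090 uses) and ignoring protocol overhead beyond the encoding:

```python
# Back-of-envelope PCIe 4.0 bandwidth, per direction.
TRANSFERS_PER_SEC = 16e9         # PCIe 4.0: 16 gigatransfers/s per lane
ENCODING_EFFICIENCY = 128 / 130  # usable bits per raw bit on the wire

def pcie4_gb_per_s(lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s for a PCIe 4.0 link."""
    bits_per_sec = TRANSFERS_PER_SEC * ENCODING_EFFICIENCY * lanes
    return bits_per_sec / 8 / 1e9

for lanes in (4, 8, 16):
    print(f"x{lanes}: ~{pcie4_gb_per_s(lanes):.1f} GB/s")
# x4: ~7.9 GB/s   x8: ~15.8 GB/s   x16: ~31.5 GB/s
```

The practical upshot is that consumer CPUs expose only a limited number of lanes, so a two-GPU box usually ends up running each card at x8 rather than x16.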

Having my own hardware has also encouraged me to learn more about quantization and other topics related to accelerating LLM training and inference. “How many LLM parameters can you fit into 48 gigs of VRAM?” is a much more interesting question when you are the proud owner of 48 gigs of VRAM.
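The back-of-envelope version of that question, counting only the weights and ignoring activations, KV cache, and the GB vs. GiB distinction, looks like this:

```python
# Rough ceiling on parameter count for a given VRAM budget, weights only
# (no activations, KV cache, or optimizer state).
VRAM_GB = 48  # two 3090s at 24 GB each

def max_params_billions(vram_gb: float, bytes_per_param: float) -> float:
    """How many billions of parameters fit if each takes bytes_per_param."""
    return vram_gb / bytes_per_param

for precision, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{precision}: ~{max_params_billions(VRAM_GB, bytes_per_param):.0f}B parameters")
# fp16: ~24B   int8: ~48B   int4: ~96B
```

Real limits are lower once the framework needs working memory on top of the weights, but the arithmetic makes it obvious why 4-bit quantization gets so much attention.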

Build pics and some LLM inference results coming soon…