Is it practical to use #localLLM? It appears the limiting factor is memory. You either need a lot (32-64 GB) of regular RAM (cheap) or spectacularly expensive VRAM on a GPU, e.g. $4500 for a random 32 GB GPU card (!!!).
That leaves:
- APIs (must follow ToS, risk of getting flagged as bad & having your account closed)
- renting a cloud machine (high setup costs, maximum friction)
- running on ordinary CPUs with low memory (stupid LLM, glacial text generation)
- buying a new computer & tuning a model for that CPU
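The memory numbers above come down to simple arithmetic: a model's weight footprint is roughly parameter count times bytes per parameter. A hedged back-of-the-envelope sketch (weights only, ignoring KV cache and runtime overhead; the function name is my own):

```python
def model_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Rough weight-only memory footprint in GB."""
    return params_billion * bytes_per_param  # billions of params × bytes each = GB

# A 7B model at fp16 (2 bytes/param) fits in cheap RAM or a midrange GPU:
print(model_memory_gb(7, 2))     # 14.0 GB
# 4-bit quantization (0.5 bytes/param) shrinks it further:
print(model_memory_gb(7, 0.5))   # 3.5 GB
# A 70B model even at 4-bit is ~35 GB — hence the 32-64 GB figure:
print(model_memory_gb(70, 0.5))  # 35.0 GB
```

So the big models that are actually smart need tens of GB resident, which is why regular RAM (slow) vs. VRAM (expensive) is the whole tradeoff.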
Self-replies
Funny story about getting incorrectly flagged... years ago I searched for #VisualSourceSafe, clicked a result, and the company's net nanny said, "No Matt, no porn at work!"
I mean, it wasn't completely wrong, #VSS was obscene.