The Pendulum Might Be Swinging Back to On-Premises

For years the story was simple. Move everything to the cloud, pay for what you use, never buy a server again. I bought into a lot of that, and for plenty of workloads it still holds up. But lately I’ve been having a different conversation with MSP owners, and it keeps circling back to one thing: the cost of running AI is climbing, nobody is quite sure where it stops, and it’s no longer a fringe worry — it comes up in nearly every planning chat I sit in on.

The bill that grows while you sleep

Here’s what I’m seeing. A client switches on an AI feature, the team loves it, usage goes up, and three months later the invoice has quietly doubled. Tokens — the units these models bill against — get consumed every time someone asks a question, summarises a thread, or drafts a reply. Individually they cost almost nothing. At scale, across a busy business, they add up fast.
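To make that concrete, here is a rough sketch of how I'd estimate a month's metered spend. Every figure in it is a placeholder I made up for illustration (headcount, requests per day, tokens per request, price per million tokens), not a real vendor rate, so swap in the numbers from your own invoice.

```python
def monthly_token_cost(users, requests_per_user_per_day, tokens_per_request,
                       price_per_million_tokens, working_days=22):
    """Estimate one month's metered AI spend from simple usage assumptions."""
    total_tokens = (users * requests_per_user_per_day
                    * tokens_per_request * working_days)
    return total_tokens / 1_000_000 * price_per_million_tokens


# Hypothetical inputs: 50 staff, 2,000 tokens per request, 10.0 per million tokens.
baseline = monthly_token_cost(50, 20, 2_000, 10.0)  # 20 requests per person per day
doubled = monthly_token_cost(50, 40, 2_000, 10.0)   # same team, twice the usage

print(f"Baseline month: {baseline:.2f}")
print(f"After adoption doubles: {doubled:.2f}")
```

Nothing about the price changed between the two calls. Only the usage did, and that alone is enough to double the bill.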

The trouble is the meter never sleeps. A traditional software licence is a known number you can budget around. Token-based AI is a tap that’s always running, and the more useful it becomes, the more it costs. I’ve watched owners realise that the productivity win they were celebrating has a recurring price tag attached that grows in lockstep with their success.

Why bringing it home is back on the table

This is where on-premises starts creeping back into the conversation. Hardware that runs capable models locally is getting cheaper and far more practical. For a business with predictable, high-volume AI work — document processing, internal search, summarising the same kinds of records all day — a one-off box in the comms room can start to look smarter than an open-ended monthly bill.
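The comparison I have in mind is a simple break-even calculation, sketched below. The function name and all the figures are my own assumptions for illustration. A real model would also need to account for power, support, staff time and hardware refresh, which I've lumped into a single monthly running cost here.

```python
import math


def break_even_months(hardware_cost, monthly_running_cost, monthly_cloud_cost):
    """Whole months until a one-off purchase beats the metered bill.

    Returns None when the local box never pays for itself, i.e. when
    running it costs as much as (or more than) the cloud bill it replaces.
    """
    monthly_saving = monthly_cloud_cost - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical figures: a 12,000 box costing 200 a month to run,
# replacing a 1,200 a month metered bill.
print(break_even_months(12_000, 200, 1_200))

# A light workload where the cloud bill is smaller than the running cost.
print(break_even_months(12_000, 200, 150))
```

With those made-up numbers the box pays for itself in a year, while the light workload never does. That's the point: the answer depends entirely on volume, which is why the high-volume, predictable jobs are the ones worth modelling first.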

I’m not saying everyone rips out the cloud. That would be daft. But the calculation has shifted. Five years ago, running your own AI infrastructure was exotic and expensive. Now it’s a line item a serious MSP can actually model for a client and stand behind, and that’s a service opportunity worth taking seriously.

The hybrid answer most businesses will land on

In practice I think most organisations end up somewhere in the middle, and that’s fine. Microsoft 365 Copilot is a good example of where the cloud stays. When someone asks Copilot in Outlook to draft a reply, or pulls a summary out of a long Teams meeting, that’s woven so tightly into the service that dragging it on-premises makes no sense. You’re paying for the integration, not just the tokens.

But the heavy, repetitive, high-volume jobs, the ones that chew through tokens by the million, are exactly the ones worth questioning. Does that bulk processing need to sit in a metered cloud, or could it run on a local model and feed the results back into SharePoint or a Power Automate flow? That’s the kind of question I think every MSP should be asking on behalf of their clients this year.

Where I’ve landed

The pendulum has swung hard towards the cloud for a decade, and it’s earned its place. But cost has a way of correcting fashion. When the meter starts hurting, people look for the off switch — and sometimes the off switch is a server you own. Watch your AI usage like any other variable cost. Know which workloads belong in Copilot and the cloud, and which ones might be cheaper closer to home.
