21 Sep 2021
The cloud is a great consumption model for many use cases, and is particularly useful for AI startups with limited capital budgets – primarily those who want to spend their money on the research and development that leads to a product.
In many respects, the cloud has certainly proven to be a great way for the hyperscalers to get others to fund their vast IT systems, and to obtain capacity at a much lower unit cost than they would have paid supporting only their core businesses. Think of Amazon's IT costs – the unit costs for compute, storage, and networking, as well as all of the advanced software the online retailer would have had to create anyway to scale its business – without Amazon Web Services. In a real sense, AWS customers subsidise Amazon's IT, and the same holds true for the cloud customers of Google, Microsoft, Alibaba, Baidu, and Tencent.
One of the early misconceptions of the cloud – and one that mysteriously persists to this day – is that it offers cheaper infrastructure than customers can build themselves. The cloud certainly is easier to pitch to upper management, thanks to the appearance of infinite scale (which is also a false impression, as anyone who has tried to run a large-scale simulation or model can attest to), a utility pricing model that keeps IT costs off the balance sheet, a wide variety of compute, and a reasonable set of tools to manage infrastructure and monitor costs.
But that doesn’t mean those costs are transparent, and it certainly does not mean that cloud costs are lower than they would be if customers laid out the capital and bought infrastructure, depreciated it over time, and installed it in their own data centre - or within a colocation facility managed by experts.
Every distributed computing system has its own needs and consequent quirks, and the economics of cloud capacity rented on an hourly basis, versus on-premises gear acquired by customers and hosted in a colo, comes down to specifics.
However, the evidence suggests that startups with workloads that are going to consume cloudy infrastructure for more than half of any reasonable span of time, or on a sustained basis, will save money in the long run by building a cloud outpost (based on utility-priced equipment obtained from one of the big OEMs like Hewlett Packard Enterprise, Dell, or others) and running it either in their own data centre or in a colocation facility.
And for startups with a heavy HPC and AI component – machine and deep learning, genomics and bioinformatics, quant and financial services, and so on – that are located in urban areas where it is difficult to put together a small data centre that can scale as the business takes off, the best way, and sometimes the only practical way, to deliver IT infrastructure is to put it into a colocation facility.
Let’s not forget performance benefits either. Web-scale infrastructure, in essence, comprises virtual compute, virtual storage, and virtual networking scattered around a 100,000-server data centre, or multiple data centres in a cloud region. This is a sub-optimal platform for running AI applications, and it simply cannot deliver the deterministic performance that HPC customers expect and need.
For those AI startups that are lucky enough to be located in the UK Innovation Corridor between London and Cambridge, Kao Data has built a colocation facility that can give customers the comfort of something that feels like a data centre they can control, without all of the grief of securing real estate, power, and cooling. At the same time it allows for utility-style pricing for the facility and services (rack, power, cooling, network), complementing the cloud-like pricing available from the major OEMs through the likes of HPE GreenLake or Dell APEX.
NVIDIA does not yet offer utility pricing on its DGX AI supercomputers, but I suspect it will before too long. Incidentally, the Cambridge-1 supercomputer, built by NVIDIA for quantum computing, bioinformatics, genomics, and AI healthcare research, is located at the Kao Data facility, not in an NVIDIA data centre that the company surely could have built had it wanted to.
To prove out the idea that colocation of owned or subscription-priced IT gear can be significantly cheaper than renting cloud capacity over a long term and utilising it heavily, Kao Data has done the math: comparing the cost of renting AI systems on the Amazon Web Services cloud, using both on-demand and reserved instances, against buying DGX machines from NVIDIA and putting them into a colocation Technology Suite run by Kao Data.
Importantly, the scenarios priced out by Kao Data take into account the utilisation rate of cloud capacity over two and three year periods, and range from 10 percent of the time to 100 percent of the time – showing that for AI startups, colocation infrastructure is far more affordable than cloud capacity of equal performance once it is being used more than half the time.
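The break-even logic behind such scenarios can be sketched in a few lines of code. The prices below are illustrative placeholders, not figures from the Kao Data whitepaper; the point is the shape of the comparison – cloud spend scales with utilisation, while the capital cost of owned, colocated gear is paid regardless of how hard it is driven:

```python
# Break-even sketch: renting cloud capacity vs. buying hardware and colocating it.
# All prices are hypothetical placeholders, not whitepaper figures.

HOURS_PER_YEAR = 24 * 365


def cloud_cost(hourly_rate, utilisation, years):
    """Total spend renting cloud capacity only for the hours it is used."""
    return hourly_rate * HOURS_PER_YEAR * utilisation * years


def colo_cost(capex, monthly_colo_fee, years):
    """Total spend buying hardware outright and hosting it in a colo.

    The capital cost is sunk up front and does not vary with utilisation;
    only the recurring rack/power/cooling/network fee accrues over time.
    """
    return capex + monthly_colo_fee * 12 * years


if __name__ == "__main__":
    rate = 30.0        # $/hour for a rented multi-GPU instance (placeholder)
    capex = 300_000.0  # purchase price of an equivalent system (placeholder)
    fee = 3_000.0      # monthly colo fee for rack, power, cooling, network (placeholder)
    years = 3

    for utilisation in (0.10, 0.25, 0.50, 0.75, 1.00):
        cloud = cloud_cost(rate, utilisation, years)
        colo = colo_cost(capex, fee, years)
        winner = "colo" if colo < cloud else "cloud"
        print(f"{utilisation:>4.0%}: cloud ${cloud:>11,.0f}  colo ${colo:>11,.0f}  -> {winner}")
```

With these placeholder numbers the crossover lands a little above 50 percent utilisation, which matches the intuition in the analysis: light, bursty use favours renting, while sustained use favours owning and colocating.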
Check out the details for yourself - download the whitepaper, published by The Next Platform here.