Professional Documents
Culture Documents
2023 State of AI Infrastructure Survey
2023 State of AI Infrastructure Survey
2023 State of AI Infrastructure Survey
All rights reserved to Run:ai. No part of this content may be used without express permission of Run:ai. www.run.ai
eBook | Jan 2023
Table of Contents
Introduction and Key Findings 3
Demographics 17
About Run:ai 19
2
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Introduction
and Key Findings
Introduction
The artificial intelligence (AI) industry has grown the rapidly evolving industry, it is akin to a sort of
rapidly in recent years, as has the need for more technological “Wild West”, with no real best practices
advanced and scalable infrastructure to support for enterprises to follow as they get AI into production.
its development and deployment. The global AI As they begin to invest more heavily in AI, there’s a lot
Infrastructure Market was valued at $23.50 billion in riding on how they decide to build their infrastructure
2021 and is expected to reach $422.55 billion by 2029, and service their practitioners.
at a forecasted CAGR of 43.50% between 2022-2029.
This is the second ‘State of AI Infrastructure’ survey
One of the main drivers of progress in the AI we are running, due to all the new activity in the
infrastructure market has been increasing awareness industry and new AI companies in the AI space,
among enterprises of how AI can enhance their we’re keen to see what’s changed. We’re particularly
operational efficiency, attract new business and grow interested in new insights into how organizations
new revenue streams, while reducing costs through are approaching the build of their AI infrastructure,
the automation of process flows. Other drivers include why they are building it, how they are building it,
the adoption of smart manufacturing processes using what are the main challenges they face, and how the
AI, blockchain and IoT technologies, the increased abundance of different tools has affected getting AI
investment by GPU/CPU manufacturers in the into production. We hope that the insights from this
development of compute-intensive chips, and the survey will be helpful to those who both build and use
rising popularity of chatbots, like OpenAI’s recently AI infrastructure.
launched ChatGPT, for example.
3
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Methodology
To get more insight into the current state of AI US and Western EU ranging in size between under
Infrastructure, we commissioned a survey of 450 200 and over 10,000 employees. The respondents
Data, Engineering, AI and ML (Machine Learning) were recruited through a global B2B research panel
professionals from a variety of industries. This report and invited via email to complete the survey, with all
was administered online by Global Surveyz Research, responses collected during the second half of 2022.
an independent global research firm. The survey is The average amount of time spent on the survey
based on responses from a mix of Data Scientists, was five minutes and fifty seconds. The answers to
Researchers, Heads of AI, Heads of Deep Learning, the majority of the non-numerical questions were
IT Directors, VPs IT, Systems Architects, ML Platform randomized, in order to prevent order bias in the
Engineers and MLOps, from companies across the answers.
4
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Key Findings
1
Data has been overtaken by
Infrastructure and Compute as the
main challenges for AI development
A whopping 88% of survey respondents admitted wait for resources, etc.) – chosen by 54% and 43% of
to having AI development challenges (Figure 1), respondents respectively as their main challenges.
which is telling in itself. But it’s also interesting This year, Data ranked as the third biggest challenge
to note that Data, which was ranked by 61% of in AI development (41%). The fact that infrastructure
respondents in last year’s survey as their main and compute-related challenges are now the top
challenge in AI development, was overtaken this concern for companies reinforces the importance of
year by infrastructure (i.e., the different platforms building the right foundation, for the right stack, to get
and tools that comprise “the stack”), and compute the most out of their compute.
(i.e., getting access to GPU resources, not having to
2
The more GPUs, the bigger the
reliance on multiple third-party tools
As organizations scale and require more GPUs, the from 29% reliance in companies with less than 50
more complex it has become to build the right AI GPUs, to 50% reliance in companies with more
infrastructure to get the right amount of compute than 100 GPUs (Figure 2). What would make more
to all of the different workloads, tools, and end users. sense, is a more open, middleware approach, where
80% of companies are now using third-party tools, organizations can use different tools that run on the
and the more GPUs they require, the bigger their same infrastructure, so that they are not locked into
reliance on multiple third-party platforms, increasing one end-to-end platform.
5
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Key Findings
3
On-demand access to GPU compute
is still very low, with 89% of companies
facing resource allocation issues
regularly
Only 28% of the respondents have on-demand access lacking. So, it’s no wonder that 89% of respondents
to GPU compute (Figure 8). When asked how GPUs face allocation issues regularly (Figure 10) – even
are assigned when not available via on-demand, though some of them (58%) claim to have somewhat
51% indicated they are using a ticketing system automatic access – with 40% facing those GPU/
4
(Figure 9), suggesting that on-demand access is still Compute resource allocation issues weekly.
5
91% of companies are planning to
grow their GPU capacity or other AI
infrastructure in the next 12 months
The vast majority (91%) of companies are planning they can actually get value out of it, so this result
to grow their GPU capacity or other AI infrastructure is a resounding testament to the fact that most
by an average of 23% in the next 12 months (Figure companies see huge potential and value in continued
4) despite the uncertainty of the current economic investment in AI.
climate. Organizations won’t invest in AI unless
6
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Survey
Report Findings
34%
This reinforces the importance of building the right
foundation, for the right stack, to get the most out of Defining business goals
12%
7
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
7% 8% 8% 3%
12%
17%
29% < 50 GPUs 47% 30% 51-100 GPUs 50% 50% 100+ GPUs 39%
8
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
With the majority of respondents still using tools Figure 3: Tools used to optimize GPU allocation between users
that are not enterprise-grade, these tools will require
attention, especially as their organizations scale and
need to take more AI into production.
9
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
10
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Aspects of AI Infrastructure
Monitoring, observability and explainability
Planned for Implementation
50%
(within 6-12 months)
Model depolyment and serving
When asked what aspect of AI infrastructure they
44%
are looking to implement in the next 6-12 months,
respondents indicated that the top aspects relate Orchestration and pipelines
to challenges in production, including monitoring, 34%
observability, and explainability (50%), model
deployment and serving (44%), and orchestration and Data versioning and lineage
11
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
2%
12
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
On-Demand Access to
GPU Compute
When asked about availability of on demand access No automatic
Automatically
access
to GPU compute, only 28% of the respondents said accessed
Only 11% said they rarely run GPU allocation issues, 13%
have allocation issues daily, and 40% face allocation
issues weekly.
Figure 8: Availability of on-demand access to GPU compute.
18%
51% 40%
31%
24%
Assigned
statically to 13% 13% 11%
specific users/
jobs
Figure 9: GPUs assignment w/o on-demand access. Figure 10: Frequency of GPU/compute resource allocation issues.
13
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
46%
No automatic access
78%
14
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
15
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
23%
Privacy / legal issue
8%
16
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
Demographics
Country, Department, Role, Job Seniority
37%
16% 24%
France 50%
18% 26%
DevOps
United Data /
Kingdom Data Science
33% 7%
MLOps / DevOps
25% 35%
Head of an AI / Deep Learning Intiative
17%
5%
Platforms Engineer
31%
3%
VP / Head
CTO
1% Director
Amount of People
42% 42%
40%
20% 22%
16%
9%
6% 4%
2%
<200 201-500 501-1K 1,001-5K 5,001-10K >10K <10 GPUs 11 - 50 51 - 100 100+
18
The 2023 State of AI Infrastructure Survey
eBook | Jan 2023
About Run:ai
Run:ai's Atlas Platform brings cloud-like simplicity to implementation, increase team productivity, and
AI resource management - providing researchers gain full utilization of expensive GPUs. Using run:ai,
with on-demand access to pooled resources for any companies streamline development, management,
AI workload. An innovative cloud-native operating and scaling of AI applications across any
system - which includes a workload-aware scheduler infrastructure, including on-premises, edge and cloud.
and an abstraction layer - helps IT simplify AI