MissingLink.ai TAM

Part
01
of nine

Number of GPU Machines Globally - Historical

Overall GPU sales largely decreased from 2014 to 2017. In 2014, 460 million units were sold, followed by 397 million units in 2015, 420 million units in 2016, and 408.4 million units in 2017.

The estimated number of GPUs sold for use in AI-related functions was 6.44 million units in 2014 (market share estimated at 14%), 5.955 million units in 2015 (market share estimated at 15%), 6.31 million units in 2016 (market share estimated at 16.5%), and 7.7628 million units in 2017 (exact market share value was available).

To answer your question, I report data on the number of GPUs sold between 2014 and 2017 below. I could not find specific information on what volume of sales were destined for, or used in AI-related functions. However, I have provided global figures and estimations based on research and news articles on the subject.

OVERVIEW OF GPUS IN AI COMPUTING
Pioneered in 2007 by NVIDIA, graphics processing units (GPUs) were originally used for gaming and complex simulations, but now play an important role alongside central processing units (CPUs) in accelerating deep learning, analytics, and various other engineering applications. GPUs facilitate parallel computation, which can vastly improve the time and cost-efficiency of training neural nets. Researchers at Stanford built a GPU-accelerated system to train deep nets for US$33,000, compared to Google's US$5 billion CPU cluster system built before the deep learning boom.

GPUs fall into two categories: discrete or integrated. Integrated GPUs are already onboard the PC, while discrete GPUs are on separate or stand-alone cards. They are also developed and sold as either server GPUs or gaming GPUs. Server GPUs tend to be more compact and are not permitted to run at higher speeds, to prevent overheating.

More people are buying gaming GPUs for their data centers to run and develop artificial intelligence software. GPUs marketed for gaming are sometimes used because they are cheaper and can run at higher speeds than server-grade chips. However, server GPUs serve machine learning purposes better, as gaming GPUs are more limited in memory. The largest data center operators, including Google, Microsoft and Facebook, are also increasing their purchases of server GPUs.

Research and development on GPU hardware and its applications is active, and industry players can expect progress in terms of GPU acceleration capabilities and flexibility in the near future.

OVERALL GPU SHIPMENT VOLUME
Information was publicly available on the number of GPUs shipped across both PC and notebook, discrete and integrated. The information in this section is derived from data collected by Jon Peddie Research (JPR), a research and consulting firm for graphics and multimedia.

Three companies dominate the global GPU market: Intel, AMD and Nvidia. Combining the figures for both desktop and notebook GPUs, overall GPU shipments from 2014 to 2017 were as follows:

2014 - 460 million units
(230 million integrated notebook + 130 million integrated desktop + 50 million discrete notebook + 50 million discrete desktop)

2015 - 397 million units
(200 million integrated notebook + 110 million integrated desktop + 42 million discrete notebook + 45 million discrete desktop)

2016 - 420 million units
(200 million integrated notebook + 120 million integrated desktop + 50 million discrete notebook + 50 million discrete desktop)

2017 - 408.4 million units

For 2017, figures were not readily available. I determined the sales volume for 2017 from the 2016 absolute figures and the reported year-on-year growth rates, using the following calculation:

GPU sales for notebooks shipped in 2017 = (200 million + 50 million) * (1 - 6%) = 250 million - 15 million = 235 million units

GPU sales for PCs shipped in 2017 = (120 million + 50 million) * 102% = 170 million * 102% = 173.4 million units

Therefore, the combined GPU sales from PCs and notebooks shipped in 2017 was 408.4 million units.
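The 2017 estimate can be reproduced with a short script. This is a minimal sketch: the variable names are illustrative, and the inputs are the 2016 shipment figures and the growth rates (6% decline for notebooks, 2% increase for desktop PCs) quoted in this section.

```python
# 2016 shipments in millions of units (integrated + discrete), as listed above.
notebooks_2016 = 200 + 50
desktops_2016 = 120 + 50

# Year-on-year changes quoted in the text: notebooks down 6%, desktops up 2%.
notebooks_2017 = notebooks_2016 * (1 - 0.06)
desktops_2017 = desktops_2016 * (1 + 0.02)
total_2017 = notebooks_2017 + desktops_2017

print(f"Notebooks 2017: {notebooks_2017:.1f} million")
print(f"Desktops 2017:  {desktops_2017:.1f} million")
print(f"Total 2017:     {total_2017:.1f} million")
```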

According to Jon Peddie Research, 2017 was an amazing year for GPU development driven by games, eSports, AI, cryptocurrency mining, and simulations.

WHAT PROPORTION OF GPUS ARE PURCHASED FOR AI-RELATED FUNCTIONS?
Nvidia was said to have a monopoly over the market for discrete GPUs used for machine learning purposes. All the major clouds with GPU support (Amazon Web Services, Azure, and Google Cloud) are overwhelmingly Nvidia-powered. In a September 2017 report, Susquehanna Financial Group analyst Christopher Rolland estimated that as much as 10% of Nvidia's reported gaming business is being used for data center infrastructure. Using this information together with JPR’s sales volume and market share data, I estimated the volume of GPU sales used in data-center machine learning (and, with less certainty, this can be extrapolated to AI in general).

The estimation can be made according to the following formula:
Estimated number of GPUs sold for use in AI-related functions in a given year = total GPU sales volume (as calculated above) * Nvidia Discrete Desktop GPU market share in that year (from JPR's market share report; used as a proxy for GPU market share for AI-related purposes, as integrated GPUs are not commonly used in AI-related computing) * 0.1 (speculated proportion of Nvidia’s gaming business used for AI-related purposes)

The Q3 market share was used in the calculation, as Q3 is the period in which the market typically experiences the strongest sales.

Using this formula, the breakdown of estimated number of GPUs sold for use in AI-related functions by year is as follows:

2014 (market share estimated at 14%) - 6.44 million units
2015 (market share estimated at 15%) - 5.955 million units
2016 (market share estimated at 16.5%) - 6.31 million units
2017 (exact market share value was available) - 7.7628 million units
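The estimation formula can be sketched as a small function. The function and parameter names are hypothetical. It is shown only for 2014 and 2015, whose inputs (total shipments and estimated market share) are all stated in this section, and the 10% factor is the Susquehanna estimate quoted above.

```python
def ai_gpu_estimate(total_units_m, nvidia_share, ai_fraction=0.10):
    """Estimated GPUs (millions) sold for AI-related use in one year.

    total_units_m: total GPU shipments that year, in millions of units
    nvidia_share:  Nvidia discrete desktop GPU market share (0 to 1)
    ai_fraction:   assumed share of Nvidia gaming GPUs used for AI (0.10 in the text)
    """
    return total_units_m * nvidia_share * ai_fraction


# Inputs taken from the figures in this section.
inputs = {2014: (460, 0.14), 2015: (397, 0.15)}
for year, (total, share) in inputs.items():
    print(year, f"{ai_gpu_estimate(total, share):.3f} million units")
```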

More detailed calculations can be made with access to figures in JPR’s full Market Watch report (which has a selling price of US$2500).

Little information on future sales projections was found, though most of the articles online expect GPU sales to experience a steady increase.

AMD plans to compete with Nvidia in the market for GPUs used in artificial intelligence and cryptocurrency mining; however, the source article emphasises the strong foothold Nvidia has in the market.

CONCLUSION
There is no rigorous pre-compiled information on the volume of GPUs sold and used globally for AI-related processing purposes. Estimations of the number of machines sold for this purpose seem to hover around 6 million units per year, but this is expected to increase, based on GPU sales trends and reports in the media.
Part
02
of nine

Number of GPU Machines Globally - Future

Overview

The overall GPU market can be broken down into six major categories. These are Desktop PC, Mobile PC, Tablet, Console, Handheld, and Smartphone (Slide 6).

In the year 2017, 2.1 billion GPUs were sold of which 17% were used in PCs and workstations. Also, the average GPUs per PC has grown from 1.2 GPUs per PC in 2001 to 1.44 GPUs per PC in 2017.

As our study focuses on GPU machines used for deep learning and machine learning, we have only considered the data for Desktop PC and Mobile PC (or Notebook PC).

Using the historical data from Jon Peddie Research (Slide 9) and the formula for compound annual growth rate (CAGR), we have calculated the future figures for GPU machines. The expected CAGR is -3.89%, and the expected number of GPU machines is 377.3 million in 2019, 362.6 million in 2020, 348.5 million in 2021, 334.9 million in 2022, and 321.9 million in 2023.

We will now detail the methodology followed for our calculations.

Methodology

From the Jon Peddie Research, combining the figures for both desktop and notebook GPUs, overall GPU shipments from 2014 to 2016 were as follows (Slide 9):

Year 2014 - 460 million units
(230 million Notebook integrated + 130 million Desktop integrated + 50 million Notebook discrete + 50 million Desktop discrete)

Year 2015 - 397 million units
(200 million Notebook integrated + 110 million Desktop integrated + 42 million Notebook discrete + 45 million Desktop discrete)

Year 2016 - 420 million units
(200 million Notebook integrated + 120 million Desktop integrated + 50 million Notebook discrete + 50 million Desktop discrete)

For 2017, figures were not readily available. The sales volume for 2017 was calculated using absolute figures (of 2016) and growth figures:

GPU sales of Notebooks shipped in 2016 = 200 million + 50 million = 250 million. Also, there is an annual decrease of 6% in the year 2017. Hence, notebooks shipped in 2017 = 250 * (1.00 - 0.06) = 235 million units

GPU sales for Desktops shipped in 2016 = 120 million + 50 million = 170 million. Also, there is an annual increase of 2% in the year 2017. Hence, Desktops shipped in 2017 = 170 * (1.00 + 0.02) = 173.4 million units

Therefore, the combined GPU machine sales of Desktops and Notebooks shipped in 2017 was 408.4 million units.

Now, we will calculate CAGR.
CAGR = ((End value / Initial value) ^ (1 / No. of years)) - 1
Plugging in the numbers, we get:
CAGR = ((408.4 / 460) ^ (1/3)) - 1 = -3.89%

Hence, the CAGR from 2014 to 2017 = -3.89%

Using this CAGR we will now calculate the number of GPU machines in future.

Hence,
Total number of machines in 2018 = 408.4 * (1 - 3.89%) = 392.5 million GPU machines
Total number of machines in 2019 = 392.5 * (1 - 3.89%) = 377.3 million GPU machines
Total number of machines in 2020 = 377.3 * (1 - 3.89%) = 362.6 million GPU machines
Total number of machines in 2021 = 362.6 * (1 - 3.89%) = 348.5 million GPU machines
Total number of machines in 2022 = 348.5 * (1 - 3.89%) = 334.9 million GPU machines
Total number of machines in 2023 = 334.9 * (1 - 3.89%) = 321.9 million GPU machines
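The CAGR and the year-by-year projection can be reproduced with the sketch below. The helper names `cagr` and `project` are hypothetical. The script carries the unrounded growth rate forward, so intermediate values may differ in the last decimal from figures computed with the rounded -3.89%.

```python
def cagr(begin_value, end_value, periods):
    """Compound annual growth rate over the given number of periods."""
    return (end_value / begin_value) ** (1 / periods) - 1


def project(last_value, rate, first_year, last_year):
    """Compound last_value forward by rate for each year in the range."""
    values = {}
    value = last_value
    for year in range(first_year, last_year + 1):
        value *= 1 + rate
        values[year] = value
    return values


# 2014 and 2017 shipment figures (millions of units), three annual periods apart.
rate = cagr(460.0, 408.4, 3)
forecast = project(408.4, rate, 2018, 2023)

print(f"CAGR: {rate:.2%}")
for year, units in forecast.items():
    print(year, f"{units:.1f} million GPU machines")
```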

These calculations are also shown in this excel sheet.

Conclusion

Three companies dominate the global GPU market: Intel, AMD, and Nvidia. We have used historical data from Jon Peddie Research and also considered the previous request. Based on these figures, we have calculated CAGR and estimated the future figures of GPU machines in the year 2018 to 2023. The CAGR is expected to be -3.89% and thus there are predicted to be 392.5 million machines in 2018, 377.3 million machines in 2019, 362.6 million machines in 2020, 348.5 million machines in 2021, 334.9 million machines in 2022, and 321.9 million machines in 2023.

Part
03
of nine

GPU Machines - Cloud vs. Non-cloud

While there is no pre-existing information to fully answer your question, we've used the available data to pull together key findings:
Sales and cloud use of GPUs are driven by emerging solutions from startups. Uses of cloud GPUs include machine learning training, simulations, financial analysis and other high-performance compute (HPC) use cases.

Below you'll find an outline of our research methodology to better understand why information you've requested is publicly unavailable, as well as a deep dive into our findings.

Methodology

Applications that access GPUs via the cloud have only come to market in the last few years. Market leaders in cloud technology such as Google and AWS, and GPU producers such as Intel and Nvidia, are only beginning to be part of solutions. At the moment, there are no surveys, data collections, or quantitative data on the subject for triangulation. We can only see the years in which GPU cloud services were incorporated and potentially came to market, and evidence that these solutions are expanding.

We scanned information starting with major cloud service providers such as AWS, Google, Microsoft, and IBM. Then we searched for any relevant data from GPU providers themselves, like Nvidia, Intel, and AMD. Having found no data from industry leaders, we searched news sources, which led us to discover a few mentions of trends and emerging companies in this industry. Unfortunately, this did not reveal any statistics or data that directly answered your question. We found a few sources that discussed the future of GPU and cloud integration and that suggest it is a rising trend.

Helpful Findings


Drivers of the GPU cloud computing technology may include "machine learning training and inference, geophysical data processing, simulation, seismic analysis, molecular modeling, financial analysis and other high-performance compute (HPC) use cases."

Nvidia is seeing increased revenue and profits thanks to the Tesla V100. The unit is described as "the GPU at the highest end dedicated for deep learning workloads. Another area boosting its revenues is the rise of high performance computing and AI." Those watching the industry expect use of this technology to increase over the coming years. "At NASDAQ's 37th Investor Conference, Micron's (MU) CFO, Ernie Maddock, stated.... cloud demand is expected to increase 40%." While this doesn't account for GPU use exclusively, the emerging companies on the scene provide some supporting evidence.

Third-Party Solutions


Companies are only just introducing their GPU cloud-based products to market after initial research and funding. Paperspace is making GPU cloud solutions commercially viable. Partnered with GPU suppliers Nvidia and Intel, the company is looking to disrupt VMware's long-standing monopoly on the remote desktop space. The company received seed funding in 2015, meaning its product was only recently brought to market. Teradici is creating ways for existing graphics software to be accessed by its users via the cloud, which may open up a new payment model. Meanwhile, the most obvious application, gaming, is being developed by companies like Parsec. Because of the novelty of the space, data on how many units are used in these types of services is nonexistent.

TensorFlow, an open-source library, is actively used in many of Google's products and enables services like Google Cloud Machine Learning Engine (GCMLE).

Major Industry Holders

Lastly, we see major companies opening up their own use and marketing the GPU cloud capabilities. Google is offering hourly use of GPU attached services with the "Nvidia K80 and Nvidia P100 GPUs to Preemptible VMs for $0.22 and $0.73 per GPU hour." This is a game changer for "distributed, fault-tolerant workloads that don't continuously require any single instance" and reportedly undercuts many of the other cloud computing options available today.


Conclusion

The adoption of this service by both industry leaders and small-scale disruptors indicates that GPU availability through the cloud will be increasingly adopted in coming years. While there is no quantitative data to triangulate a percentage or unit number, the emerging applications and news articles point towards more services continuing to emerge on the market.

Part
04
of nine

GPU Machines - AI

With the vast requirements of AI and Machine Learning, GPU-based machines are far preferable to CPU machines. This report hopes to outline how many GPU-based AI machines are being used globally. In addition, we will be projecting these figures through to 2023.

Past, present and future data

Number of GPU units being used for AI purposes, Historical
(Excludes 2014 as this was the first year of market penetration — Please see below for calculations)

2015: 72,289
2016: 77,062
2017: 188,679
2018: 375,085

Number of GPU units being used for AI purposes, Projected

2019: 566,116
2020: 854,439
2021: 1,289,604
2022: 1,946,399
2023: 2,937,700

Note: As the market grows, Nvidia will no longer be the sole indicator and these numbers may become quite conservative

Assumptions

Nvidia is the clear market leader in AI GPU architecture, with 90% of the market share. For this reason, we can use per-segment revenue data available through Nvidia to give an accurate overview of the market as a whole.

Note: Intel and Google are both building competitive chips, however until this point they have not captured the market share necessary to affect our numbers.

We have assumed a flat, average per-unit price for the equivalent of an Nvidia Tesla (the primary chip for AI and machine learning), calculated based on the attached pricing table.

# of Units = (data center revenue) / ($4399)

Growth was calculated via CAGR to stay consistent with previous requests.

CAGR = (EV / BV)^(1/n) - 1
EV = End Value
BV = Beginning Value
n = # of Periods

Effective CAGR: 50.93%

Calculations

The total revenue for 2015 from Data Center sales can be found by adding the quarterly sales. Therefore, 2015 Data Center sales = $57 million + $83 million + $90 million + $88 million = $318 million.

The average price of a GPU can be found by averaging the prices of multiple GPUs. Therefore, average GPU price = ($4,599 + $1,599 + $4,499 + $5,099 + $2,299 + $4,599 + $5,699 + $1,899 + $5,699 + $7,999) / 10 = $4,399.

The number of GPUs sold can be found by dividing the total sales by the price of a single GPU. Therefore, the number of GPUs sold in 2015 = $318,000,000 / $4,399 = 72,289.

Similar calculations can be done for 2016 using the quarterly sales, which amount to $339 million. Therefore, the number of GPUs sold in 2016 was $339,000,000 / $4,399 = 77,062.

Similar calculations can be done for 2017 using the quarterly sales, which amount to $830 million. Therefore, the number of GPUs sold in 2017 was $830,000,000 / $4,399 = 188,679.

Similar calculations can be done for 2018 using the quarterly sales; however, sales data for 2018 is only available for the first two quarters. Hence, we have projected the sales for the entire year by doubling the first-half total. The total sales for 2018 therefore amount to $1,650 million, and the number of GPUs sold in 2018 is estimated at $1,650,000,000 / $4,399 = 375,085.

For projection of the number of GPUs in 2019 through 2023, we have used the CAGR of the last 4 years, which can be calculated using this method.
Therefore, CAGR = [(375,085 / 72,289)^(1/4)]-1 = 50.93%.

Therefore, GPU sales in 2019 = 375,085 * 1.5093 = 566,116.
GPU sales in 2020 = 375,085 * 1.5093^2 = 854,439.
GPU sales in 2021 = 375,085 * 1.5093^3 = 1,289,604.
GPU sales in 2022 = 375,085 * 1.5093^4 = 1,946,399.
GPU sales in 2023 = 375,085 * 1.5093^5 = 2,937,700.
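The unit estimates and projections above can be reproduced with the following sketch. Variable names are illustrative. Unit counts are rounded down to whole GPUs and the growth factor is rounded to 1.5093, which matches the figures listed. The period count of 4 follows the text. Counting 2015 to 2018 as three annual intervals instead would give a higher rate.

```python
# GPU list prices used for the flat average, as given in the calculation above.
prices = [4599, 1599, 4499, 5099, 2299, 4599, 5699, 1899, 5699, 7999]
avg_price = sum(prices) // len(prices)  # 4399

# Annual data center revenue in US$ (2018 is first-half revenue doubled).
revenue = {2015: 318_000_000, 2016: 339_000_000,
           2017: 830_000_000, 2018: 1_650_000_000}

# Whole units sold per year, rounded down.
units = {year: rev // avg_price for year, rev in revenue.items()}

# CAGR with n = 4 periods, as used in the text.
periods = 4
rate = (units[2018] / units[2015]) ** (1 / periods) - 1
growth = round(1 + rate, 4)  # growth factor rounded to four decimals

# Project 2019 through 2023 from the 2018 estimate.
forecast = {2018 + k: round(units[2018] * growth ** k) for k in range(1, 6)}

print(units)
print(f"CAGR: {rate:.2%}")
print(forecast)
```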

Conclusion

In short, there will be approximately 2.94 million GPU processing units being used for AI and machine learning by 2023, about 40 times the 2015 figure.
Part
05
of nine

AI Storage - Historical

While there is no pre-existing information to fully answer your question, we've used the available data to pull together key findings:
The estimated amount of data stored for cognitive systems (artificial intelligence, machine learning, and natural language processing) is:

2018: 19,240 petabytes
2017: 14,000 petabytes
2016: 11,570 petabytes
2015: 8,970 petabytes
2014: 5,070 petabytes

Below you'll find an outline of our research methodology to better understand why information you've requested is publicly unavailable, as well as a deep dive into our findings.

METHODOLOGY

We have reviewed industry data reports, as well as statistics databases, in search of the requested information. In addition, we looked for academic papers that may explore this topic. Finding no sources that directly answered the question, we searched for data points that would allow us to triangulate the figure. Little data was available specific to the AI segment of data storage, but we found a few figures that allowed us to provide some very rough estimations. We were not able to find any information regarding data storage vendors for Google or Amazon, despite searching company websites and financial reports.

FINDINGS AND CALCULATIONS

Data storage needs are growing exponentially. In a 2014 report, IDC estimated "the world's data will amount to 44 zettabytes by 2020." A 2017 IDC report now estimates data will grow to 163 zettabytes by 2025. In a report by Seagate, it states that IDC has estimated that the amount of global data that will be "subject to data analysis" by cognitive systems (machine learning, artificial intelligence (AI), and natural language processing) will "grow by a factor of 50 to 5.2ZB in 2025; the amount of analyzed data that is “touched” by cognitive systems will grow by a factor of 100 to 1.4ZB in 2025."

For the purposes of this research, we will assume that the amount of data anticipated or existing equals the amount of storage needed. A zettabyte (ZB) equals 1 million petabytes (PB). So, using the IDC information, we can estimate that the amount of data subject to analysis by AI and other cognitive systems in 2017 is around 104,000 petabytes (5.2 million PB / 50). Likewise, the amount of data "touched" by cognitive systems in 2017 would be approximately 14,000 petabytes (1.4 million PB / 100).

Data compiled by Statista offers figures for the global data storage supply over several years. Using the previously triangulated figures for the amount of data touched by cognitive systems, we can very roughly triangulate how much of that data may have been attributable to AI over the years.

In 2017, the data supply was estimated to be 10,800 exabytes (or 10,800,000 petabytes). If we use the estimated 14,000 petabytes of data stored for cognitive systems (i.e. AI and machine learning) and apply it to the 2017 data supply, we get 0.13% (14,000 / 10,800,000).

Now, if we apply 0.13% to the global data supply for 2014-2016, we get the following estimated figures:

2014: The global data supply was 3,900,000 PB; therefore, the number attributable to AI is estimated to be 5,070 petabytes (3,900,000 x .0013).

2015: The global data supply was 6,900,000 PB; therefore, the number attributable to AI is estimated to be 8,970 petabytes (6,900,000 x .0013).

2016: The global data supply was 8,900,000 PB; therefore, the number attributable to AI is estimated to be 11,570 petabytes (8,900,000 x .0013).

2018: The global data supply is expected to reach 14,800,000 PB; therefore, the number attributable to AI is estimated to be 19,240 petabytes (14,800,000 x .0013).
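The triangulation above can be sketched as follows. Variable names are illustrative. The 2017 share is rounded to 0.13% (0.0013) before being applied to the other years, as in the text.

```python
PB_PER_ZB = 1_000_000  # 1 zettabyte = 1 million petabytes

# IDC: data "touched" by cognitive systems grows 100x to 1.4 ZB by 2025.
touched_2017_pb = 1_400_000 / 100  # 14,000 PB

# Statista: global data storage supply in 2017, in petabytes.
supply_2017_pb = 10_800_000
share = round(touched_2017_pb / supply_2017_pb, 4)  # 0.0013, i.e. 0.13%

# Apply the 2017 share to the global supply of the other years (petabytes).
supply_pb = {2014: 3_900_000, 2015: 6_900_000,
             2016: 8_900_000, 2018: 14_800_000}
estimates_pb = {year: round(total * share) for year, total in supply_pb.items()}

print(f"AI share of 2017 supply: {share:.2%}")
print(estimates_pb)
```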

CONCLUSION

In conclusion, using available data, we have estimated the historical amount of data storage used for artificial intelligence to be approximately:

2018: 19,240 petabytes
2017: 14,000 petabytes
2016: 11,570 petabytes
2015: 8,970 petabytes
2014: 5,070 petabytes
Part
06
of nine

AI Storage - Future

Artificial Intelligence (AI) data storage is expected to reach 442,749.63 petabytes by 2023, from 14,000 petabytes in 2017, at an expected CAGR of 77.83%.

The above figure was derived from a study conducted by International Data Corporation (IDC) and sponsored by Seagate. According to this study, the data used by cognitive systems (artificial intelligence, natural language processing, and machine learning) will grow 100-fold by 2025, to 1.4 zettabytes.

AI GROWTH RATE AND EXPECTED FUTURE DATA STORAGE

According to an IDC report released in 2017, the amount of the global datasphere subject to data analysis is projected to grow by a factor of 50 (that is, to 50x its current amount) to 5.2 zettabytes in 2025. The amount of analyzed data that is directly used or "touched" by cognitive/artificial intelligence (AI) systems will grow by a factor of 100 (to 100x its current amount) to 1.4 zettabytes in 2025.
This means that in 2017, the data touched by AI systems was around 0.014 zettabytes (equivalent to 14,000 petabytes).

Year 2017 amount = year 2025 amount / 100
1.4 ZB / 100 = 0.014 ZB (or 14,000 petabytes)
1 zettabyte = 1,000,000 petabytes

If in 2017 the data amount for AI storage was 14,000 petabytes and is expected to reach 1,400,000 petabytes by 2025, then CAGR from 2017 to 2025 is calculated to have a value of 77.83%.

CAGR = (future_value / present_value)^(1 / number of years) - 1, or by simply using a CAGR calculator;

CAGR = (1,400,000/14,000)^(1/8) - 1
CAGR = 77.83%

Using the calculated 77.83% CAGR and the 2017 value of 14,000 petabytes as the present value, we are able to calculate the value for each year from 2019 onward using the below formula, or by simply using a future value (FV) calculator;

FV = present_value * (1 + CAGR)^number of years

2019 = 44,272.91 petabytes
2020 = 78,730.52 petabytes
2021 = 140,006.48 petabytes
2022 = 248,973.53 petabytes
2023 = 442,749.63 petabytes
2024 = 783,341.67 petabytes
2025 = 1,400,129.68 petabytes (1.4 zettabytes)
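The CAGR and future-value steps can be sketched in a few lines. The function name `future_value` is hypothetical. The rate is rounded to 77.83% before compounding, which is why the 2025 value lands slightly above 1,400,000 petabytes.

```python
present_pb = 14_000      # 2017 value in petabytes
future_pb = 1_400_000    # 2025 value in petabytes (1.4 ZB)
years = 8                # 2017 to 2025

rate = (future_pb / present_pb) ** (1 / years) - 1
rounded_rate = round(rate, 4)  # 0.7783, i.e. 77.83%


def future_value(present_value, annual_rate, periods):
    """FV = present_value * (1 + CAGR) ^ number of years."""
    return present_value * (1 + annual_rate) ** periods


print(f"CAGR: {rounded_rate:.2%}")
for year in range(2019, 2026):
    value = future_value(present_pb, rounded_rate, year - 2017)
    print(year, f"{value:,.2f} petabytes")
```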

AI DATA STORAGE GLOBAL SHARE

According to a Statista report, the annual global data storage supply (all industries) for 2017 to 2020 is as shown below. Note that I have already converted the data to petabytes (1 exabyte = 1,000 petabytes).

2017 = 10,800,000 petabytes
2018 = 14,800,000 petabytes
2019 = 19,800,000 petabytes
2020 = 24,800,000 petabytes

Using a CAGR calculator, global data storage (all industries) is growing at 31.93% annually. If we use the 31.93% CAGR to project through 2023 with the FV formula (FV = present_value * (1 + CAGR)^number of years), we get the following values:

2021 = 32,718,640 petabytes
2022 = 43,165,702 petabytes
2023 = 56,948,510 petabytes

From the above figures and using the earlier computed value of AI data storage for years 2019 to 2023, we are able to determine AI data storage percent share against global data storage.

2019 = (44,272.91/19,800,000)*100 = 0.22%
2020 = (78,730.52/24,800,000)*100 = 0.32%
2021 = (140,006.48/32,718,640)*100 = 0.43%
2022 = (248,973.53/43,165,702)*100 = 0.58%
2023 = (442,749.63/56,948,510)*100 = 0.78%

We can also calculate the CAGR of the AI global data storage percent share with the formula:
CAGR = (future_value / present_value)^(1 / number of years) - 1, or by simply using a CAGR calculator

CAGR = (0.78/0.22)^(1/4) - 1 = 37.22%

Hence, Artificial intelligence (AI) global data storage percent share is growing at an average of 37.22% annually.
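The share and share-growth calculations can be reproduced with this sketch. Dictionary names are illustrative. Shares are rounded to two decimals before the CAGR is taken, as in the figures above.

```python
# AI data storage (petabytes), from the future-value calculation above.
ai_pb = {2019: 44_272.91, 2020: 78_730.52, 2021: 140_006.48,
         2022: 248_973.53, 2023: 442_749.63}

# Global data storage supply (petabytes): Statista figures, then 31.93% growth.
global_pb = {2019: 19_800_000, 2020: 24_800_000, 2021: 32_718_640,
             2022: 43_165_702, 2023: 56_948_510}

# AI share of global storage, in percent, rounded to two decimals.
share_pct = {year: round(ai_pb[year] / global_pb[year] * 100, 2)
             for year in ai_pb}

# CAGR of the share over the four periods from 2019 to 2023.
share_cagr = (share_pct[2023] / share_pct[2019]) ** (1 / 4) - 1

print(share_pct)
print(f"Share CAGR: {share_cagr:.2%}")
```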

CONCLUSION

While AI data storage is expected to grow at 77.83% annually, its percent share is also growing at an average of 37% annually. This staggering growth of data storage in AI applications is to be expected: according to Datamation, AI (in relation to machine learning) is "experiencing an undeniable boom". The technology, media, and telecommunications sector is set to intensify its use of machine learning; the number of pilot projects will double in 2018 compared with 2017, and is set to double again in 2020. Additionally, IDC predicted that spending on cognitive and artificial intelligence systems, including machine learning solutions, will grow at about 50.1% annually through 2021.
Part
07
of nine

AI Storage Percentage Vs. Everything Else

Artificial intelligence currently (2018) occupies a relatively small 0.057% of the overall digital storage available worldwide. However, this is set to grow substantially through to 2025, both in terms of percentage of overall storage and absolute amount of storage allocation. This data was triangulated using different sources, and a deep dive of how we arrived at these figures is available below.

METHODOLOGY AND RESULTS
In researching this request, we used the two previously completed Wonder research reports for the following projects: "AI Storage - Historical" and "AI Storage - Future". Both reports used triangulated data, as there was no pre-compiled information available on the project. To provide consistency in reporting, we used the results of these reports to calculate the percentage of storage devoted to AI compared to other data.
The estimated amount of data both past and present stored for cognitive systems (artificial intelligence, machine learning, and natural language processing) as obtained from the two previous reports is:
2025 = 1,400,129.68 petabytes (1.4 zettabytes)
2024 = 783,341.67 petabytes
2023 = 442,749.63 petabytes
2022 = 248,973.53 petabytes
2021 = 140,006.48 petabytes
2020 = 78,730.52 petabytes
2019 = 44,272.91 petabytes
2018 = 19,240 petabytes
2017 = 14,000 petabytes
2016 = 11,570 petabytes
2015 = 8,970 petabytes
2014 = 5,070 petabytes

It is evident here that the data storage needs for AI are accelerating exponentially, with the most growth still to come as AI technology matures.

According to IDC’s Data Age 2025 study (sponsored by Seagate and published in April 2017), the total amount of data stored in the world for various years is as follows:
2014 = 12 zettabytes
2015 = 16 zettabytes
2016 = 19 zettabytes
2017 = 23 zettabytes
2018 = 34 zettabytes
2019 = 42 zettabytes
2020 = 53 zettabytes
2021 = 65 zettabytes
2022 = 82 zettabytes
2023 = 104 zettabytes
2024 = 130 zettabytes
2025 = 160 zettabytes

It is important to note that this information is for all data stored worldwide, not just that stored on Seagate's data banks. Therefore, we can use this data to estimate the percentage of global storage that will be used by AI.
To determine this percentage as compared to total amount of data stored in the world we performed the following calculations (process of converting zettabytes to petabytes not shown below for clarity):

Format:
Year (Total amount of data stored for cognitive systems "AI" / Total amount of data stored globally for that particular year) * 100
2014: (5,070 petabytes / 12 zettabytes) * 100 = 0.042%
2015: (8,970 petabytes / 16 zettabytes) * 100 = 0.056%
2016: (11,570 petabytes / 19 zettabytes) * 100 = 0.061%
2017: (14,000 petabytes / 23 zettabytes) * 100 = 0.061%
2018: (19,240 petabytes / 34 zettabytes) * 100 = 0.057%
2019: (44,272.91 petabytes / 42 zettabytes) * 100 = 0.105%
2020: (78,730.52 petabytes / 53 zettabytes) * 100 = 0.149%
2021: (140,006.48 petabytes / 65 zettabytes) * 100 = 0.215%
2022: (248,973.53 petabytes / 82 zettabytes) * 100 = 0.304%
2023: (442,749.63 petabytes / 104 zettabytes) * 100 = 0.426%
2024: (783,341.67 petabytes / 130 zettabytes) * 100 = 0.603%
2025: (1.4 zettabytes / 160 zettabytes) * 100 = 0.875%
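The unit conversion omitted from the calculations above can be made explicit with a small helper. The function name `ai_share_pct` is hypothetical, and the conversion used is 1 zettabyte = 1,000,000 petabytes, as stated in the earlier reports.

```python
PB_PER_ZB = 1_000_000  # 1 zettabyte = 1 million petabytes


def ai_share_pct(ai_petabytes, total_zettabytes):
    """AI storage as a percentage of all data stored worldwide."""
    return ai_petabytes / (total_zettabytes * PB_PER_ZB) * 100


# (AI storage in PB, total stored data in ZB) for a few of the years above.
samples = {2014: (5_070, 12), 2018: (19_240, 34), 2025: (1_400_000, 160)}
for year, (ai_pb, total_zb) in samples.items():
    print(year, f"{ai_share_pct(ai_pb, total_zb):.3f}%")
```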

LIMITATIONS
This report relies on the data contained in the Wonder reports "AI Storage - Historical" and "AI Storage - Future" which contained triangulations to determine how much data storage AI requires. Therefore, this report will necessarily encounter the same limitation as these reports - namely that the data is only an estimate.

SUMMARY
By 2025, AI will take up approximately 0.875% of the world's 160 zettabytes of data storage, up from 0.057% of the current 34 zettabytes. It is clear from this data that the percentage of total space taken up by AI is increasing at a rapid rate, despite the large growth in total storage space available. This reflects the vast amounts of storage space needed as AI technology becomes more complex. More complex processing and better machine learning require more data and, as we can see, the volume of this data (and the storage needed to contain it) is going to be immense.
Part
08
of nine

Deep Learning and Machine Learning - US Individuals

While there are no preexisting reports available in the public domain regarding number of people working in the AI industry specific for machine learning or deep learning category in the United States, we were able to come up with an estimated number from 2014 to 2017 and a forecast number until 2023 using available statistics and reports from reputable industry reports and articles. For the purpose of this research, AI companies are also sometimes called AI startups.

In 2017, the United States had over 1,000 AI companies employing an estimated 144,356 people. Of these, the machine learning category employed about 60,630, a figure forecast to reach 711,455 by 2023. Below you will find the detailed calculations, triangulation, and assumptions behind these figures.

AVERAGE NUMBER OF EMPLOYEES AT AI COMPANIES

The average number of employees at AI companies in the United States was derived from the employee counts of the leading AI companies operating in the U.S., as reported by Fortune. These companies (only those located in the United States) and their employee counts are listed below.

Drawbridge — 143 employees
Persado — 227 employees
InsideSales — 362 employees
nuTonomy — 100 employees
Nauto — 50 employees
Zoox — 104 employees
CrowdFlower — 200 employees
RapidMiner — 88 employees
Tamr — 99 employees
Versive/Context Relevant — 60 employees
DataRobot — 149 employees
Paxata — 250 employees
Trifacta — 123 employees
Dataminr — 274 employees
BloomReach — 203 employees
MindMeld — 24 employees
x.ai — 43 employees
Numenta — 23 employees
H2O.ai — 72 employees
CognitiveScale — 143 employees

The average number of employees across these 20 companies is 136.85.

2,737/20 = 136.85 (average employees for AI companies in the U.S.)
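The averaging step can be reproduced with a short Python sketch. The dictionary simply restates the employee counts listed above; the variable names are illustrative and not part of the original research.

```python
# Employee counts as listed above (U.S.-based companies from the Fortune list).
employees = {
    "Drawbridge": 143, "Persado": 227, "InsideSales": 362, "nuTonomy": 100,
    "Nauto": 50, "Zoox": 104, "CrowdFlower": 200, "RapidMiner": 88,
    "Tamr": 99, "Versive/Context Relevant": 60, "DataRobot": 149,
    "Paxata": 250, "Trifacta": 123, "Dataminr": 274, "BloomReach": 203,
    "MindMeld": 24, "x.ai": 43, "Numenta": 23, "H2O.ai": 72,
    "CognitiveScale": 143,
}

# Sum the counts and divide by the number of companies.
total_employees = sum(employees.values())
average_employees = total_employees / len(employees)

print(total_employees)               # 2737
print(round(average_employees, 2))   # 136.85
```

Because the sample consists of only 20 leading companies, this average is best read as a rough proxy rather than a measured industry mean.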

AI COMPANIES AND ESTIMATED NUMBER OF PEOPLE WORKING IN AI IN THE UNITED STATES

According to industry statistics released by Tech Emergence in 2017, North America (including Canada) has about 1,500 companies operating in the AI sector. The report stated that less than 1% of all mid-to-large enterprises across all sectors in the U.S. are adopting artificial intelligence. The software and information technology industry leads with an adoption rate of about 32%, followed by internet services at 8.78% and telecommunications at 4.19%.

Venture Scanner, an analyst- and technology-powered startup research firm, releases quarterly counts of AI companies globally:

Q4 2017: there are about 2,029 AI companies from over 70 countries globally in all categories.
Q3 2017: AI startups measured at 1,965 companies in all categories across 70 countries. During this time, the U.S. has over 1000 companies.
Q2 2017: about 1,844 Artificial Intelligence companies in all categories across 70 countries
Q1 2017: 1,700 Artificial Intelligence companies globally.
Q4 2016: 1,503 Artificial Intelligence companies
Q3 2016: 1,287 Artificial Intelligence companies
Q2 2016: 1,139 Artificial Intelligence companies. The U.S. has 600 companies at this time.

From the above figures, the United States' share was 51% in 2017 and 53% in 2016, for an average share of 52%.

2017 share = (1000/1,965)*100 = 51%
2016 share = (600/1,139)*100 = 53%
Average = (51%+53%)/2 = 52%
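As a minimal sketch, the share calculation above can be written as follows. The helper name `percent_share` is hypothetical, and the U.S. count of "over 1000" is treated as exactly 1,000, which is an assumption carried over from the calculation above.

```python
def percent_share(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole percent."""
    return round(part / whole * 100)

# Q3 2017: U.S. companies (assumed to be exactly 1,000) vs. the global count.
share_2017 = percent_share(1000, 1965)
# Q2 2016: 600 U.S. companies vs. the global count.
share_2016 = percent_share(600, 1139)

# Simple average of the two rounded shares.
average_share = (share_2017 + share_2016) / 2

print(share_2017, share_2016, average_share)   # 51 53 52.0
```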

Using a CAGR calculator, the quarterly growth in the number of AI companies from Q2 2016 through Q4 2017 works out to 10.10%.
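The quarterly rate can also be checked without an online calculator. The sketch below applies the standard compound growth formula to the Venture Scanner counts, assuming six quarterly periods between Q2 2016 and Q4 2017; the function name is illustrative.

```python
def compound_growth_rate(start, end, periods):
    """Compound growth rate per period: (end / start) ** (1 / periods) - 1."""
    return (end / start) ** (1 / periods) - 1

# Q2 2016 (1,139 companies) to Q4 2017 (2,029 companies) spans 6 quarters.
quarterly_growth = compound_growth_rate(1139, 2029, 6)

print(f"{quarterly_growth:.2%}")   # 10.10%
```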

Starting from the 2016 numbers and applying the 10.10% quarterly growth rate backward gives estimates for 2015, 2014, and 2013:

Q2 2016 = 1,139 AI companies
Q1 2016 = 1,139/1.101 = 1,034 companies; the same method gives each year-end number:
End of 2015 = 939
End of 2014 = 639
End of 2013 = 474

If there were 474 AI companies at the end of 2013 and 2,029 at the end of 2017, the calculated CAGR (using a CAGR calculator) from 2013 to 2017 is 50.74%. Applying this rate to project 2018 through 2023:

2018 = 2,029 * 1.5074 = 3,059 AI companies
2019 = 3,059 * 1.5074 = 4,610
2020 = 4,610 * 1.5074 = 6,950
2021 = 6,950 * 1.5074 = 10,476
2022 = 10,476 * 1.5074 = 15,792
2023 = 15,792 * 1.5074 = 23,804
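The projection above is a plain compounding loop. The sketch below takes the 50.74% annual rate as given in this report and carries the unrounded value forward each year, rounding only for display; that rounding choice is an assumption, although it reproduces the figures listed above.

```python
ANNUAL_GROWTH = 0.5074   # annual growth rate as stated in the report
companies = 2029         # global AI companies at the end of 2017

projection = {}
for year in range(2018, 2024):
    # Compound the unrounded running value, then round for reporting.
    companies *= 1 + ANNUAL_GROWTH
    projection[year] = round(companies)

for year, count in projection.items():
    print(year, count)
```

Holding a growth rate of this size constant for six years is a strong assumption, so the later years should be treated as increasingly uncertain.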

We identified earlier that the average U.S. share of the company count is 52% and that the average number of employees per AI company is 136.85. Assuming both remain constant from 2013 through 2023, the estimated number of employees for each year is:

2013 = 474 * 52% * 136.85 = 33,731 people or employees
2014 = 639 * 52% * 136.85 = 45,473 people
2015 = 939 * 52% * 136.85 = 66,821
2016 = 1,503 * 52% * 136.85 = 106,933
2017 = 2,029 * 52% * 136.85 = 144,356
2018 = 3,059 * 52% * 136.85 = 217,684
2019 = 4,610 * 52% * 136.85 = 328,056
2020 = 6,950 * 52% * 136.85 = 494,575
2021 = 10,476 * 52% * 136.85 = 745,329
2022 = 15,792 * 52% * 136.85 = 1,123,790
2023 = 23,804 * 52% * 136.85 = 1,693,940
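The per-year estimate is a single multiplication, sketched below with a hypothetical helper function. It assumes the rounded inputs of 52% and 136.85 and rounds to the nearest person, so its results can differ slightly from some of the listed figures where intermediate values were rounded or truncated differently.

```python
US_SHARE = 0.52          # average U.S. share of AI companies (from above)
AVG_EMPLOYEES = 136.85   # average employees per AI company (from above)

def estimate_us_ai_employees(global_companies):
    """Estimate U.S. AI employees from a global AI company count."""
    return round(global_companies * US_SHARE * AVG_EMPLOYEES)

print(estimate_us_ai_employees(474))     # 2013 estimate: 33731
print(estimate_us_ai_employees(23804))   # 2023 estimate: 1693940
```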

According to LinkedIn's database, there are currently 124,200 employees in the United States working in an AI-related industry. Our 2017 estimate of 144,356 employees is reasonably close to this figure.

AI AND MACHINE LEARNING/DEEP LEARNING

The numbers calculated above apply to all AI categories, which include machine learning applications, machine learning platforms, smart robots, recommendations, CV platforms, CV applications, NLP, virtual assistants, speech recognition, gesture control, video recognition, context computing, and speech-to-speech.

According to Venture Scanner, the machine learning category leads the industry in both company count and funding. In 2017, machine learning applications and machine learning platforms accounted for about 669 and 200 companies respectively, or about 42% of the total AI industry. Applying this percentage to the employee estimates above gives the number of people working specifically on machine learning applications and platforms each year.

2013 = 33,731 people * 42% = 14,167 people; the same formula gives the figures for 2014 to 2023:
2014 = 19,099
2015 = 28,065
2016 = 44,912
2017 = 60,630
2018 = 91,427
2019 = 137,784
2020 = 207,722
2021 = 313,038
2022 = 471,992
2023 = 711,455
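The machine learning figures follow from applying the 42% share to each year's total. The sketch below uses the total employee estimates exactly as listed earlier in this report and integer arithmetic with round-half-up, which is an assumption about how the original figures were rounded; the names are illustrative.

```python
ML_SHARE_PERCENT = 42   # machine learning share of AI companies (from above)

# Estimated U.S. AI employees per year, as listed earlier in this report.
us_ai_employees = {
    2013: 33731, 2014: 45473, 2015: 66821, 2016: 106933, 2017: 144356,
    2018: 217684, 2019: 328056, 2020: 494575, 2021: 745329,
    2022: 1123790, 2023: 1693940,
}

def ml_employees(total):
    """Apply the ML share using integer math, rounding half up."""
    return (total * ML_SHARE_PERCENT + 50) // 100

ml_estimates = {year: ml_employees(n) for year, n in us_ai_employees.items()}

for year, count in ml_estimates.items():
    print(year, count)
```

Integer arithmetic is used here so that values ending in exactly .5 round the same way on every platform.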

CONCLUSION

In summary, based on the top 20 AI companies in the United States, the average number of employees per company is 136.85. About 52% of AI companies are located in the United States (about 1,055 companies in 2017). The number of AI companies has grown by more than 50% annually since 2013. Machine learning applications and platforms account for about 42% of the industry by company count. The estimated number of people working in the machine learning category of the AI industry in the United States is forecast to reach 711,455 by 2023, up from about 19,099 in 2014.
Part
09
of nine

Deep Learning and Machine Learning - US Companies

The AI market is booming, and as a result the number of companies producing AI solutions, particularly startups, is growing rapidly. In 2017, an estimated 1,500 established companies and 860 startups were developing AI, for a total of 2,360 companies. The number of AI startups grew 21% from 2016. In addition, the global market for AI as measured by revenue is expected to grow at a CAGR of 45.4-57.2% over the next 5 to 7 years.

MARKET OVERVIEW

The global AI market was valued at $641.9 million in direct revenue in 2016 and is expected to grow to $35.9 billion by 2025, growing at an incredible CAGR of 57.2%. A slightly more pessimistic estimate is that it will grow at a CAGR of "only" 45.4% and reach $19.5 billion by 2022. McKinsey estimates that in 2016, companies invested $26-39 billion in AI, $20-30 billion of which went to tech giants, and $6-9 billion were invested in startups.

ESTABLISHED COMPANIES DEVELOPING AI

Aman Naimat, the founder of Spiderbook, surveyed over 500,000 companies around the world in 2016 and found that only 1,500 companies in North America were working on AI. Of these, 967 were still at the lab project phase, 494 were building applications, and only 87 were already using AI strategically to direct their business.
The key manufacturers, as profiled in one market report, are Google Inc., IBM Corp., Microsoft Corporation, IPsoft, Rocket Fuel Inc., Qlik Technologies Inc., MicroStrategy, Inc., Brighterion, Inc., 24/7 Customer, Inc., and Next IT Corp. This list is similar to, but not identical with, Naimat's top 18 list:
1. Google
2. Facebook
3. Rocket Fuel
4. IBM
5. Amazon
6. Yahoo
7. Intel
8. Microsoft
9. Deloitte
10. MITRE
11. Baidu
12. LinkedIn
13. Apple
14. Cylance
15. Lockheed Martin
16. NASA
17. Sentient Corporation
18. Electronic Arts
Naimat's report also provides a list of top companies investing in AI by industry, if this is desired. However, despite a thorough search, we were unable to find any industry expert offering a prediction on how many more established companies would start developing AI in the future.

AI STARTUPS

At the end of 2017, CB Insights released its list of the top 100 AI startups (including 11 unicorns), which between them had raised $11.7 billion in funding. This is well above the $6-9 billion invested in all AI startups in 2016. The funding of the top 100 startups ranged from $3 million to $3.1 billion, which suggests that many of those that didn't make the list have funding of less than $3 million each.
However, CB Insights points out that they started with a list of over 2,000 AI startups. This is a 21% increase from the 1,650 AI startup candidates that they reviewed in 2016. They did not disclose how many of these startups were based in the US. It is perhaps significant that while China's three tech giants, Baidu, Didi, and Tencent, "have all set up their own AI research labs," they are also investing heavily in US AI startups: "Research firm CB Insights found that Chinese participation in funding rounds for American startups came close to US $10bn in value last year, while recent figures indicate that Chinese companies have invested in 51 US artificial intelligence companies to the tune of US $700m." In other words, even countries with their own AI development are investing in US startups.
However, we can determine an approximate number of US AI startups with a triangulation. According to Crunchbase, 1,339 of the 3,127 AI startups currently being tracked are headquartered in the US, or about 42.8%. If we assume for the sake of calculation that an AI startup is likely to do most of its work in its country of origin, then about 860 of the over 2,000 companies reviewed by CB Insights in 2017 were US-based, as were about 706 of the 1,650 reviewed in 2016.
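The triangulation above can be sketched in a few lines. The code treats "over 2,000" as exactly 2,000 and uses the share rounded to 42.8%, both of which are simplifying assumptions, and the variable names are illustrative.

```python
# Crunchbase: U.S.-headquartered AI startups out of all tracked AI startups.
us_share = round(1339 / 3127, 3)   # 0.428, i.e. about 42.8%

# Apply the share to the CB Insights candidate pools.
us_startups_2017 = us_share * 2000   # "over 2,000" assumed to be exactly 2,000
us_startups_2016 = us_share * 1650

print(us_share)                  # 0.428
print(round(us_startups_2017))   # 856, reported above as "about 860"
print(round(us_startups_2016))   # 706
```

The result depends on the assumption that the Crunchbase share also holds for the CB Insights candidate pool, which the sources do not confirm.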
As with the number of established companies, we were unable to find a prediction by an industry expert on how many new startups will enter the AI field in the future. While we could attempt a back-of-the-envelope calculation based on the 21% growth between 2016 and 2017, we believe that extrapolation from a single data point would be untrustworthy and little more than a wild guess.

CONCLUSION

From the available information, we triangulate that 1,500 established companies and 860 startups, a total of 2,360 companies, were developing AI in 2017. The number of startups is up 21% from 2016. We cannot be certain that this rate of growth is consistent due to a lack of additional data points. The global market for AI as measured by revenue is expected to grow at a CAGR of 45.4-57.2% over the next 5 to 7 years, but this of course does not indicate that the number of companies involved will grow at the same pace.
Sources