Why does machine learning technology need GPUs?

Why do so many chip makers like Nvidia and AMD invest in GPUs for machine learning instead of CPUs?

The following article by journalist Michael Byrne of the Motherboard website explains why machine learning technology needs GPUs:

In the summer of last year, I had the opportunity to attend a GPU conference organized by Nvidia. Although it is famous as a gaming event, the main topic discussed there was in fact GPUs for machine learning.

Nvidia is not alone: another chip maker, AMD, has introduced a new GPU line aimed at machine learning at the ongoing CES 2018 event. Machine learning is a field of artificial intelligence that researches and builds techniques allowing computers to "learn" automatically from available data in order to solve various problems. The two may seem unrelated, so why is everyone from Nvidia to AMD jumping into GPU research for machine learning? Why don't they choose the CPU? The main answer is the statistical matrix.

Picture 1 of Why does machine learning technology need GPUs?
Nvidia CEO Jen-Hsun Huang introducing the new GPU for machine learning technology.

First, we need to understand that there is no magic that makes computers "learn". The whole secret of machine learning is mathematics, more precisely statistics. Starting from a large amount of initial data, machine learning analyzes and fits complex equations with many terms, then optimizes them to produce accurate, reliable predictions. Although this sounds simple, it is genuinely a difficult area of artificial intelligence research.

To better understand this optimization, imagine the relationship between cause and effect. Say it is cold outside. Why do we feel that way? Many things around us hint at it: the feel of the surrounding air, whether the sky is cloudy or sunny, how humid it is, and which season it currently is. So even if we don't know the exact temperature, we can almost predict whether it is cold outside or not.

Of course, not all data carries the same importance when predicting the outside temperature. Seasonal data, for example, may be ten times more important than any other input, while air-humidity data may matter only a third as much as altitude data. The task in machine learning is to perform a series of observations, collect the data, and determine the importance of each specific input. We then feed these data into an optimized equation and make predictions.

The optimized equation is usually referred to as a model. A model is a way to simulate the relationship between the data and make predictions. The mathematical problem in machine learning is finding that model, which means working out how important each piece of data is compared with the others.

Each input in the model is attached to a weight indicating its importance. For example, in the model below, the temperature factor has the greatest importance, followed by season and air pressure:

Picture 2 of Why does machine learning technology need GPUs?
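A weighted model like the one just described can be sketched in a few lines. The weights and input values below are made-up illustrative numbers, not the ones from the article's figure:

```python
# Hypothetical weights indicating each input's importance
# (illustrative values only -- not taken from the article's figure).
weights = {"temperature": 0.6, "season": 0.3, "air_pressure": 0.1}

def predict(observation):
    """Weighted sum: each input value scaled by its weight, then added up."""
    return sum(weights[name] * value for name, value in observation.items())

# One observation, with every factor normalized to a 0..1 scale.
obs = {"temperature": 0.2, "season": 0.9, "air_pressure": 0.5}
score = predict(obs)  # 0.6*0.2 + 0.3*0.9 + 0.1*0.5 = 0.44
```

The larger a factor's weight, the more a change in that factor moves the prediction, which is exactly what "importance" means here.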

For the weights to be meaningful, we have to make observations many times. Predicting the outside temperature is only a toy task; a real machine learning model needs millions of different observations and must adjust the weights many times over to optimize its predictions.
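Repeatedly adjusting the weights can be sketched with plain gradient descent on synthetic data. The "true" relationship below (y = 2·x1 + 3·x2) is invented for the demonstration; the model starts with zero weights and edits them a little on every observation until they approach the true importances:

```python
import random

random.seed(0)
# Synthetic observations: the hidden "true" relationship is y = 2*x1 + 3*x2.
data = [(x1, x2, 2 * x1 + 3 * x2)
        for x1, x2 in ((random.random(), random.random()) for _ in range(200))]

w1, w2 = 0.0, 0.0   # initial guesses for the weights
lr = 0.1            # learning rate: how big each adjustment is

for _ in range(500):            # many passes over the observations
    for x1, x2, y in data:
        err = (w1 * x1 + w2 * x2) - y   # how wrong the current weights are
        w1 -= lr * err * x1             # nudge each weight to reduce the error
        w2 -= lr * err * x2

# After training, w1 and w2 approach the true importances 2 and 3.
```

Each pass shrinks the prediction error a little; with enough observations and passes the weights settle on values that reflect each input's real importance.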

In practice, no one implements a machine learning model by handling each observation one at a time. Instead, we use statistical matrices in which each row is one observation and each column is one input variable (season, temperature, air pressure, and so on). Through these matrices, computers can capture the relationships between the data.
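The rows-as-observations, columns-as-inputs layout can be shown with a small matrix and a single matrix-vector multiplication that produces a prediction for every observation at once. All values here are invented for illustration:

```python
# Each row is one observation; each column is one input variable
# (hypothetical order: season, temperature, air pressure).
X = [
    [0.9, 0.2, 0.5],   # observation 1
    [0.1, 0.8, 0.4],   # observation 2
    [0.5, 0.5, 0.6],   # observation 3
]
w = [0.3, 0.6, 0.1]    # one weight per column

def matvec(matrix, vector):
    """Matrix-vector product: one weighted-sum prediction per row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

predictions = matvec(X, w)   # three predictions from one matrix operation
```

Instead of three separate weighted sums, the whole batch of observations goes through a single matrix operation; at scale this is what makes the computation regular enough to accelerate.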

Machine learning technology does not use just a single statistical matrix. To be most effective, it needs to combine many matrices together, and computing with a large number of matrices is very complicated. Software engineers therefore devised ways to run these computations on the graphics processor, where each matrix entry can be handled much like a pixel. This multi-matrix computation is the main reason why machine learning technology needs GPUs.
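Combining matrices stage by stage can be sketched as chained matrix multiplications. The two weight matrices below are hypothetical; the point is that each output cell is an independent sum of products, which is why a GPU can compute many of them at once, much as it shades independent pixels:

```python
def matmul(A, B):
    """Plain-Python matrix multiply. Every output cell is independent of the
    others, so on a GPU they could all be computed in parallel."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Two hypothetical weight matrices applied one after the other --
# the kind of chained matrix combination the article describes.
W1 = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]   # 3 inputs -> 2 intermediate values
W2 = [[0.6], [0.4]]                          # 2 intermediate values -> 1 output
X  = [[0.9, 0.2, 0.5]]                       # one observation, three inputs

hidden = matmul(X, W1)        # first matrix combination
output = matmul(hidden, W2)   # second combination, built on the first
```

Real models chain far larger matrices through many more stages, but the structure is the same: the output of one matrix operation feeds the next.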

So what is the difference between computation on a GPU and ordinary computation on a CPU? The answer is the ability to run many calculations in parallel at once. Computation on a CPU (central processing unit) typically takes place sequentially: each calculation depends on the one before it, and the next calculation must wait until the previous one is done. For work like this, adding more cores does little to increase the CPU's effective computing power.
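The contrast can be illustrated with two loops in plain Python (which only mimics the idea, since ordinary Python runs on the CPU): a running total is inherently sequential, because each step needs the previous result, while an elementwise operation is independent per element and could be spread across many cores at once:

```python
values = list(range(8))

# Sequential: step i cannot start until step i-1 has finished,
# so extra cores would sit idle.
running = []
total = 0
for v in values:
    total += v
    running.append(total)

# Parallel-friendly: every element is processed independently of the others,
# so each one could be handled by a separate core at the same time.
scaled = [v * 2 for v in values]
```

Matrix operations are dominated by the second kind of work, which is why throwing thousands of GPU cores at them pays off.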

Picture 3 of Why does machine learning technology need GPUs?
AMD's new 7nm Vega GPU landed like a "bomb" on the market for machine learning GPUs.

The calculations in a GPU, however, are completely different: they are performed in parallel. Adding more cores therefore gives the GPU tremendous computational power for building and combining the statistical matrices in machine learning models. Investing in GPUs is the right step for chip makers, because the technical limits on this kind of computation are easier to push past than on CPUs. As machine learning technology grows and demands ever more computational power, the importance of the GPU will become even more evident.