Apple’s brutal M1 Ultra processor compared to the other chips in the family

“The M1 Ultra marks a before and after in the history of Apple chips that will shock the PC industry again. Connecting two M1 Max chips with the UltraFusion packaging architecture allows us to take Apple chips to new heights.” This statement by Johny Srouji, one of the heads of Apple’s hardware technology department, leaves no room for doubt about the confidence the company has in this chip.

As Srouji explains, the M1 Ultra processor is the result of interconnecting two M1 Max chips, which suggests that Apple probably had it on its roadmap when this family of microprocessors took its first steps. In any case, the foundations of the four chips we are going to examine are the same: they share the same design philosophy, the same microarchitecture and the same manufacturing technology.

The M1 Ultra, M1 Max, M1 Pro and M1 processors are produced by the Taiwanese semiconductor manufacturer TSMC using 5 nm photolithography. The most ambitious of these chips, the Ultra model presented yesterday by Apple, packs no fewer than 114 billion transistors, a staggering number next to which the original M1 processor’s 16 billion pales.

Like the other members of the M1 family, the new Ultra chip uses two different types of cores in order to balance performance, heat dissipation and power consumption at runtime according to the demands of the threads being processed at any given instant.

An M1 Max processor incorporates ten general-purpose CPU cores, of which eight are high-performance and the remaining two are high-efficiency. From here, and keeping in mind that an M1 Ultra chip is the result of joining two M1 Max chips, we can easily work out its figures: twenty CPU cores, of which sixteen are high-performance and the remaining four are high-efficiency.
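The doubling described above can be written out as a minimal Python sketch. The `CpuConfig` class and its `fused` method are hypothetical names introduced only for this illustration; the core counts are the ones quoted in the paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuConfig:
    """Core counts of a chip (hypothetical helper, not an Apple API)."""
    performance_cores: int
    efficiency_cores: int

    @property
    def total_cores(self) -> int:
        return self.performance_cores + self.efficiency_cores

    def fused(self, dies: int) -> "CpuConfig":
        """Core counts of a package built from `dies` identical dies."""
        return CpuConfig(self.performance_cores * dies,
                         self.efficiency_cores * dies)


# Figures quoted in the article for the M1 Max.
M1_MAX = CpuConfig(performance_cores=8, efficiency_cores=2)

# The M1 Ultra joins two M1 Max dies.
M1_ULTRA = M1_MAX.fused(dies=2)

print(M1_ULTRA.total_cores, M1_ULTRA.performance_cores, M1_ULTRA.efficiency_cores)
# prints: 20 16 4
```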

A closer look at the specifications of the M1 Ultra processor is enough to realize that it is mainly intended for workstations. After all, the new Mac Studio, the first computer to integrate it, is a machine with a clear professional vocation despite its slim, compact design, especially the version equipped with an M1 Ultra processor (it can also be configured with an M1 Max).

Apple M1 Ultra, M1 Max, M1 Pro and M1: technical specifications

| Specification | M1 Ultra | M1 Max | M1 Pro | M1 |
|---|---|---|---|---|
| Photolithography | 5 nm | 5 nm | 5 nm | 5 nm |
| Number of transistors | 114 billion | 57 billion | 33.7 billion | 16 billion |
| Manufacturer | TSMC | TSMC | TSMC | TSMC |
| Number of CPU cores | 20 | 10 | 10 / 8 | 8 |
| High-performance (HP) cores | 16 | 8 | 8 / 6 | 4 |
| High-efficiency (HE) cores | 4 | 2 | 2 | 4 |
| Instruction cache (HP) | 192 KB | 192 KB | 192 KB | 192 KB |
| Data cache (HP) | 128 KB | 128 KB | 128 KB | 128 KB |
| Shared level 2 cache (HP) | 48 MB | 24 MB | 24 MB | 12 MB |
| Instruction cache (HE) | 128 KB | 128 KB | 128 KB | 128 KB |
| Data cache (HE) | 64 KB | 64 KB | 64 KB | 64 KB |
| Shared level 2 cache (HE) | 8 MB | 4 MB | 4 MB | 4 MB |
| Number of graphics cores | 64 | 32 / 24 | 16 / 14 | 8 |
| Execution units | 8192 | 4096 | 2048 | 1024 |
| FP32 | 20.8 TFLOPS | 10.4 TFLOPS | 5.2 TFLOPS | 2.6 TFLOPS |
| Maximum main memory | 128 GB | 64 GB | 32 GB | 16 GB |
| Memory technology | LPDDR5-6400 | LPDDR5-6400 | LPDDR5-6400 | LPDDR4X-4266 |
| Memory bandwidth | 800 GB/s | 400 GB/s | 200 GB/s | 68 GB/s |
| Neural Engine (NE) cores | 32 | 16 | 16 | 16 |
| Operations per second (NE) | 22 trillion | 11 trillion | 11 trillion | 11 trillion |

Scalability: this is the best asset of the M1 processors

The table above shows very clearly the scalability offered by the architecture implemented in this family of microprocessors.

The high-performance and high-efficiency cores are essentially identical across these chips, so Apple engineers can vary their number, the size of the shared level 2 and level 3 caches, the number of graphics cores, and the memory interface to balance the performance and power consumption of these chips relatively easily.
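The scaling can be checked against the GPU figures in the specifications listed earlier. The sketch below is only a rough consistency check: it uses the graphics core counts, execution units and FP32 figures of the fully enabled chips, and it assumes that each execution unit completes one fused multiply-add (two floating-point operations) per cycle, which the article does not state. Under that assumption, every chip has the same number of units per graphics core and the same implied clock, so only the number of cores changes.

```python
# (graphics cores, execution units, FP32 TFLOPS) of the fully enabled chips,
# taken from the specifications listed in the article.
GPU_SPECS = {
    "M1 Ultra": (64, 8192, 20.8),
    "M1 Max": (32, 4096, 10.4),
    "M1 Pro": (16, 2048, 5.2),
    "M1": (8, 1024, 2.6),
}

# Assumption (not from the article): one fused multiply-add per unit per cycle.
FLOPS_PER_UNIT_PER_CYCLE = 2


def units_per_core(cores: int, units: int) -> int:
    """Execution units contained in each graphics core."""
    return units // cores


def implied_clock_ghz(units: int, tflops: float) -> float:
    """Clock frequency that would yield `tflops` with `units` execution units."""
    return tflops * 1e12 / (units * FLOPS_PER_UNIT_PER_CYCLE) / 1e9


for name, (cores, units, tflops) in GPU_SPECS.items():
    print(f"{name}: {units_per_core(cores, units)} units per core, "
          f"about {implied_clock_ghz(units, tflops):.2f} GHz implied")
```

Each line reports 128 units per core and an implied clock of about 1.27 GHz, which is consistent with the idea that the family scales by replicating the same building blocks.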

The graphics logic built into the M1 Pro and, especially, the M1 Max is very capable, but that of the M1 Ultra plays in another league. It is the result, once again, of scaling the hardware of the other chips in the family, but what it offers is impressive, at least according to the information we currently have, which is what Apple has provided.

In theory, the brute force of the M1 Ultra’s graphics logic is perfectly comparable to that of the most ambitious graphics processors that NVIDIA and AMD currently have in their catalogs. In some respects it is even more capable. Its 20.8 TFLOPS (FP32) are convincing, but it is its texture fill rate of 660 Gtexels/s and its pixel fill rate of 330 Gpixels/s that impress the most.

As a reference we can look at the figures of an NVIDIA GeForce RTX 3080 Ti GPU. Its texture rate is 532.8 Gtexels/s, and its pixel rate is 186.5 Gpixels/s. These figures give only a partial view of a graphics processor’s performance, because many other GPU subsystems need to be considered, but they serve to put the numbers of the M1 Ultra’s graphics logic into context.
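To put the comparison in relative terms, the quoted rates can be divided one by the other. This is a small sketch using only the figures cited above; the dictionary keys and the `advantage` function are names chosen for this example, and the result says nothing about real-world rendering performance.

```python
# Theoretical fill rates quoted in the article.
M1_ULTRA_GPU = {"texture_gtexels_s": 660.0, "pixel_gpixels_s": 330.0}
RTX_3080_TI = {"texture_gtexels_s": 532.8, "pixel_gpixels_s": 186.5}


def advantage(a: dict, b: dict) -> dict:
    """Ratio of each rate of `a` to the same rate of `b`."""
    return {metric: a[metric] / b[metric] for metric in a}


for metric, ratio in advantage(M1_ULTRA_GPU, RTX_3080_TI).items():
    print(f"{metric}: {ratio:.2f}x")
# prints:
# texture_gtexels_s: 1.24x
# pixel_gpixels_s: 1.77x
```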

[Image: Apple M1 Ultra chip]

In the photograph below we can see the physical interface Apple uses to connect the two M1 Max chips that make up an M1 Ultra processor. The heart of the UltraFusion packaging architecture, the innovation that allows the two M1 Max dies to work in perfect synchrony, is a high-performance link that bundles more than 10,000 conductors.

This communication path can reach a theoretical maximum bandwidth of 2.5 TB/s, a truly extraordinary figure. Bandwidth alone is not enough, however: it is imperative that the cores and other functional units of each M1 Max can communicate with minimal latency. Otherwise the overall throughput of the M1 Ultra processor would suffer when running multi-threaded applications whose threads are scheduled on cores on either side of the high-performance link.

It is clear that what Apple engineers aimed for when designing the UltraFusion packaging architecture is that, in practice, an M1 Ultra processor is not perceived as being made up of two M1 Max chips. Minimizing latency is crucial here, but it is also very important to give the operating system the illusion that beneath it sits a single CPU equipped with 20 cores. In this way, developers do not have to worry about the peculiarities of the M1 chip on which their software will run.
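From the software side, this means an application can simply ask the operating system how many CPUs it exposes instead of encoding any knowledge of the chip’s layout. The following is a minimal Python sketch of that idea; `worker_count` and `square` are hypothetical helpers, and the value returned by `os.cpu_count()` depends on the machine running the code (on an M1 Ultra system one would expect it to reflect the 20 cores described above, though that is an assumption, not something verified here).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count() -> int:
    """Number of workers to use, based on the CPUs the OS reports."""
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


def square(n: int) -> int:
    return n * n


# The pool is sized from what the OS reports; which physical core (or die)
# runs each thread is left entirely to the operating system's scheduler.
with ThreadPoolExecutor(max_workers=worker_count()) as pool:
    results = list(pool.map(square, range(8)))

print(results)
# prints: [0, 1, 4, 9, 16, 25, 36, 49]
```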

[Image: UltraFusion interconnect]

Before we look at the figures Apple has published to describe the performance per watt of the M1 Ultra processor, it is worth taking a brief look at the main memory this chip can work with. Like the M1 Pro and M1 Max chips, the M1 Ultra works in tandem with LPDDR5-6400 memory, although the M1 Ultra supports a unified main memory with a maximum capacity of 128 GB. The theoretical maximum bandwidth of this subsystem is, according to Apple, 800 GB/s.
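A theoretical peak bandwidth of this kind is normally the product of the memory transfer rate and the width of the memory bus. The sketch below applies that formula. The bus widths are an assumption on our part (they are widely reported figures, not numbers given in this article), while the transfer rates come from the memory technology named in the specifications. The results land close to the rounded figures Apple quotes.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: int, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s.

    transfer_rate_mt_s: millions of transfers per second (e.g. 6400 for LPDDR5-6400)
    bus_width_bits: total width of the memory interface in bits
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * bytes_per_transfer / 1000  # MB/s -> GB/s


# (transfer rate in MT/s, bus width in bits).
# The bus widths are assumptions, not figures stated in the article.
MEMORY_CONFIGS = {
    "M1 Ultra": (6400, 1024),
    "M1 Max": (6400, 512),
    "M1 Pro": (6400, 256),
    "M1": (4266, 128),
}

for chip, (rate, width) in MEMORY_CONFIGS.items():
    print(f"{chip}: {peak_bandwidth_gb_s(rate, width):.1f} GB/s")
# prints:
# M1 Ultra: 819.2 GB/s
# M1 Max: 409.6 GB/s
# M1 Pro: 204.8 GB/s
# M1: 68.3 GB/s
```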

The logic responsible for running artificial intelligence algorithms is, again, the result of linking two M1 Max chips. The Neural Engine of the M1 Ultra processor incorporates 32 cores (compared with the 16 in each M1 Max chip), which allows it to carry out a formidable 22 trillion operations per second. That is 22 × 10^12 operations: trillions in the English short-scale sense.

The performance per watt ratio that Apple promises us is fabulous

We come to the most controversial part of the presentation Apple gave yesterday. The efficiency of the M1 family processors is beyond doubt, judging by the results we have obtained in our performance tests so far, but the figures Apple offers to describe the performance per watt of the M1 Ultra chip are striking.

On the next slide we can see that, according to Apple, an M1 Ultra processor almost doubles the performance of a PC equipped with an Intel Core i9-12900K CPU and DDR5 memory when both draw 60 watts. When we have the opportunity to review Apple’s first computer equipped with an M1 Ultra processor, we will see if, indeed, its performance per watt is as attractive as Apple claims.

[Image: Apple M1 Ultra CPU Performance 02]

The next slide features the same contenders as the previous chart: the M1 Ultra processor in one corner of the ring and, in the other, a PC equipped with an Intel Core i9-12900K CPU and DDR5 memory. What this second graph shows is that, again according to Apple, the M1 Ultra matches the relative performance of the PC while drawing about 40 watts, whereas the PC draws close to 160 watts.

It is clear that Apple is an interested party, and we do not know in detail under what conditions and with what software these graphs were produced, but there is no doubt that these figures help to set very high expectations for the M1 Ultra chip. It may not be long before we can confirm whether these numbers accurately reflect reality, but, again, they are very promising.

[Image: Apple M1 Ultra CPU Performance 01]

The last graph produced by Apple seeks to deepen the wound it intends to inflict on its competitors. The Cupertino company claims that a computer with an M1 Ultra chip matches the relative performance of a PC equipped with an Intel Core i9-12900K processor, DDR5 memory and nothing less than a graphics card with an NVIDIA GeForce RTX 3090 GPU while drawing just over 100 watts, whereas the PC clearly exceeds 300 watts.

It is striking, to say the least, that Apple has chosen to include in this slide a graphics card as ambitious as the GeForce RTX 3090, which on its own, according to NVIDIA, draws 350 watts under load. Apple’s intention, however, is probably to showcase the capability of the graphics logic integrated in its new M1 Ultra processor. It will be very interesting to see for ourselves whether it really lives up to expectations when the first computer equipped with this chip falls into our hands.
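The two equal-performance comparisons Apple describes can be turned into an implied performance-per-watt advantage with simple arithmetic: if two systems deliver the same performance, the ratio of their efficiencies is the inverse of the ratio of their power draws. The sketch below uses the approximate wattages mentioned above ("about 40" versus "close to 160", and "just over 100" versus "more than 300"), so the results are rough estimates derived from Apple’s own charts, not measurements. The `efficiency_ratio` function is a name chosen for this example.

```python
def efficiency_ratio(watts_a: float, watts_b: float) -> float:
    """How many times more performance per watt system A delivers than
    system B when both deliver the same performance."""
    return watts_b / watts_a


# Approximate power figures read from Apple's slides, as quoted in the article.
cpu_ratio = efficiency_ratio(watts_a=40, watts_b=160)   # M1 Ultra vs Core i9-12900K PC
gpu_ratio = efficiency_ratio(watts_a=100, watts_b=300)  # M1 Ultra vs PC with RTX 3090

print(f"CPU comparison: about {cpu_ratio:.0f}x the performance per watt")
print(f"GPU comparison: about {gpu_ratio:.0f}x the performance per watt")
# prints:
# CPU comparison: about 4x the performance per watt
# GPU comparison: about 3x the performance per watt
```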

[Image: Apple M1 Ultra GPU Performance 01]

More information | Apple
