Trying to understand… ‘AI Drives Need For Optical Interconnects In Data Centers’


Welcome to “Trying to understand…”, a series devoted to me trying to understand a piece of media (usually a blog post or technical article) until I feel confident that I generally understand it.

Introduction

I never took an electrical-engineering class in college. In fact, I only had a dim awareness that the field existed at all. Once I jumped the digital-analog gap (thanks MIT) and started exploring the elemental order of electricity, I realized I had a lot to learn. I’ve found the website SemiEngineering to be a great resource for exposing myself to high-level concepts in the electrical engineering and chip industry.

One article I’ve come back to a couple times is ‘AI Drives Need For Optical Interconnects In Data Centers.’ Even after a couple reads, I was still confused:

  • At what scale of distance (kilometers, meters, nanometers) were optic cables poised to overtake electrical wires?
  • What exactly are the differences in the chip diagrams (see above) and why do they yield significant performance differences?

The article contains enough diagrams that the message should be obvious, yet it felt like I was missing a common vocabulary. I needed more knowledge to translate between the diagrams and the text, and to understand the distinctions between “solution” and “package” and “chip.”

At the same time, it seemed a good example of real-world electrical engineering: dealing with practical topics like heat and micro-delays in data transfer at scale, limited by physical and economic constraints. So, I decided to deep-dive the topic and understand exactly what was going on! Here’s what I found:

Datacenters

I previously worked at AWS for a couple of years, so I’m familiar with data centers at a high level. The article makes a distinction between hyperscaler and enterprise data centers.

Layout of a data center diagram Wait, you can right-click -> ‘open image in new tab’ and then zoom in for additional detail?!? Wild.

Hyperscaler data centers are (what I would describe as) data centers as a service, such as AWS or Azure. They are built to allow easy virtual partitioning, networking, and configuration. Because all of these changes can happen virtually, the physical data center itself can take a standard (homogeneous) layout. This setup facilitates the use of optic cables, especially single-mode fiber, which must be precisely cut and measured.

Enterprise data centers, by contrast, are data centers built for a specific company or purpose. One might imagine a video game company that runs its own game servers (of course, any workload on an enterprise data center can be run on a hyperscaler data center and vice versa). Enterprise data centers are usually more heterogeneous, as their layout depends on the specific compute tasks they perform. Because their layout is less standardized and more physically dependent, physically flexible solutions such as copper cables or multi-mode fiber have traditionally been used. However, single-mode fiber offers much better throughput.

One crucial component of an optical link is the transceiver, which sits at each end of the cable and converts between electrical and optical signals. However, these transceivers are currently hand-assembled from many discrete parts, somewhat reminiscent of semiconductors before the integrated circuit. From the article:

One of the core components of the transceiver is a semiconductor laser, and then there are some ICs that drive that laser. But there are a lot of small mechanical components needed in order to hold different mechanicals down. For example, to get fiber from the laser to the front of the module, requires that you have strain relief so you can operate under tough environmental conditions. The 10 million or so of these that are purchased every year by the hyperscalers are almost all manually assembled at factories throughout Asia…There is an absolute need for the siliconization of optics.

For historical context, circuits were originally built by manually wiring discrete semiconductor components together in a pattern. Innovations in processes and materials engineering eventually allowed circuits to be carved into a single block of silicon, or “siliconized,” rather than built from many discrete parts. This manufacturing change greatly increased production output and decreased mechanical failures. These single-piece circuits, called “integrated circuits,” exploded onto the scene.

As hyperscaler data centers continue to grow, and as performance benefits start outweighing physical concerns for enterprise data centers, the optic cable market will continue to grow. Manufacturers will need to innovate by “siliconizing” optical components, which may reduce physical limitations and in turn further drive demand for optics.

Takeaway 1: Enterprise data centers entering the optic cable game increases demand. Current optical transceivers are assembled by hand, which creates a manufacturing bottleneck and increases mechanical failures. As such, manufacturers are going to need new techniques to scale the production of optical circuits.

One way to make optics more physically reliable is to embed the hardware on chips themselves. But why stop there? To understand where we’re headed, we have to dive into the chips themselves.

Understanding ICs and packaging solutions

An integrated circuit (IC) is a complex circuit constructed on a single piece of semiconductor material, usually silicon. ICs provide scalable, reliable logic blocks. Many of them are patterned onto a large silicon wafer, which is then diced (cut) into individual chips.

However, these ICs are easily damaged and vulnerable to physical forces (dust, vibration, air moisture, etc.). The ICs are suspended in a safe medium and packaged in a plastic or ceramic container. Metal wires connect the vulnerable silicon chip to the outside of the packaging. If you’ve ever looked at a circuit board, chances are the “large black squares” you see on the board are the casings with ICs inside. More effort has gone into packaging than one might expect. Entire companies have made their fortunes in this subdomain (ASE comes to mind).

chip process diagram

Moore’s law has held from the 1960s to the present. As such, physical space on the board has represented more and more compute potential, and has become more valuable. This pushed packaging innovations that shrank the package and the board space needed to connect the chip, such as dual in-line packaging, then flat packs. Most modern packaging uses a solder ball grid array (BGA): metallic balls that electrically connect the bottom of the chip to the board. If you’ve ever looked at a chip diagram and wondered “Why the hell are there balls there??” as I have, this is why. It’s just a way to provide input to/output from the chip without taking up more space than strictly required.
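To see why the balls help, here’s a toy Python comparison of how many connections fit on a package when pins can only sit along the edges versus when solder balls cover the whole underside. The package size and pitch are my own illustrative assumptions, not anything from the article:

```python
# Toy comparison of package I/O counts: pins along the edges only
# (dual in-line / flat-pack style) versus a full grid of solder balls
# underneath (BGA). Package size and pitch are illustrative assumptions.

def perimeter_pins(side_mm: float, pitch_mm: float) -> int:
    """Pins spaced along all four edges of a square package."""
    per_edge = int(side_mm // pitch_mm)
    return 4 * per_edge - 4  # don't double-count the four corners

def bga_balls(side_mm: float, pitch_mm: float) -> int:
    """Solder balls arranged in a full grid under a square package."""
    per_row = int(side_mm // pitch_mm)
    return per_row * per_row

side, pitch = 20.0, 1.0  # 20 mm square package, 1 mm pitch (assumed)
print(f"edge pins only: {perimeter_pins(side, pitch)}")  # 76
print(f"BGA balls:      {bga_balls(side, pitch)}")       # 400
```

Same footprint, several times the connections in this toy case, which is why ball grids became the default as chips needed ever more I/O.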

Asianometry has a great video on semiconductor packaging. Would recommend.

Understanding 2.5D and 3D chips

So now we have packaged chips: a signal leaves one chip through its package lead wires, travels across the board, and enters another chip through its lead wires. The extra physical distance caused by chips sitting in separate packages, known as “packaging delay,” adds latency and heat/electricity usage. Sometimes the simplest solution is also the best one: put chips that communicate often in the same physical package.
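To get a feel for the time cost of that distance, here’s a back-of-the-envelope sketch. The distances and the assumed signal speed (roughly half the speed of light for a trace on a circuit board) are my own illustrative numbers, not figures from the article:

```python
# Rough one-way wire delay for two chips talking across a board versus two
# dies sharing a single package. Distances and signal speed are illustrative
# assumptions, not measurements from the article.

C = 3.0e8           # speed of light in vacuum, m/s
V_SIGNAL = 0.5 * C  # rough signal speed on a circuit-board trace (~0.5c)

def one_way_delay_ns(distance_m: float) -> float:
    """Time of flight for a signal over the given distance, in nanoseconds."""
    return distance_m / V_SIGNAL * 1e9

board_hop = 0.10     # ~10 cm of trace between two separately packaged chips
package_hop = 0.002  # ~2 mm between two dies inside the same package

print(f"board-level hop: {one_way_delay_ns(board_hop):.3f} ns")    # ~0.667 ns
print(f"in-package hop:  {one_way_delay_ns(package_hop):.4f} ns")  # ~0.0133 ns
```

Time of flight is only part of the story: driving a signal off the package and across the board also costs extra power (and heat), which is the other half of why co-packaging helps.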

The first solution is 2.5D chips, which I really like because of the fractal nature of the name. In 2.5D chips, multiple dies or ICs are placed side by side inside one package (often on a shared interposer), reducing the distance signals have to travel. They’re sort of like chips within chips (not to be confused with chiplets, or maybe they should be, still not sure about that one), hence the fractal name.

3D chips take things a half step further, actually stacking multiple dies/ICs on top of one another to reduce distance and latency even more. The downsides of such physical proximity can be heat (there is less surface area per unit volume for it to dissipate through) and the difficulty of accessing parts of the chip for testing (for quality-control purposes).

chip types diagram

Asianometry also has a good video on 2.5D and 3D chips.

It might be about this point where you say to yourself: “Wait, is this the profound innovation? Putting things closer together and maybe stacking a couple things on top of each other?” Kinda, yea. But we also take for granted cheap, hyper-advanced chips made using materials with impurities of literally 1 in 10 billion or less. Just because it’s simple doesn’t mean it’s easy.

We can look at the diagrams below and better understand what’s going on:

other image

We are transitioning from

  • (a, top left) pluggable optics, which sit far away from the chip and are prone to mechanical failure

to, more recently,

  • (a, top right) on-board optics

and, in the near future, to

  • (a, bottom left) optics co-packaged (in a 2.5D or 3D way) with the IC.

You might then ask: Why not package all the chips in one package, then? Wouldn’t that completely solve packaging delay? My first answer would be: “Shut up, nerd.” My second answer would be: “I’m not sure, but my best guess is that chip design sits at the intersection of technical, physical, and economic limitations, and there’s an economic incentive to split chips into modular designs. That way, a company can sell its chips to many customers at scale rather than packaging each chip according to specific needs. Companies like Apple or AWS, which are large enough to provide scale on their own, have their own systems-on-chip, which do exactly the co-packaging you’re talking about.”

In fact, companies are now pushing for optical communication between multiple parts of the chip itself. Two of the companies mentioned in the article, Lightmatter and Celestial AI, are examples of this approach. Using our new understanding of chip vs. packaging, we can examine Celestial AI’s diagram below and see the different layers/scales of communication.

celestial optics diagram

Takeaway 2: By leveraging 2.5D and 3D chip design improvements, optics can be used for package-to-package communication, chip-to-chip communication within a package, and even communication between parts of the same chip.

Understanding Optics

There’s little need to understand optics itself to understand the rest of the article. Still, it’s worth mentioning a few properties of optics that make it useful for communication.

Light travels quickly and, more importantly, with much less loss than an electrical signal, which loses energy by creating heat in the wire. These properties of light have been known for a while. Bell Labs, the research arm of AT&T, developed and deployed optical fiber, glass “wires” that guide light, as early as the 1980s. It was originally used for cross-country phone calls, which had the best cost/reward payoff. As the technology advanced, however, it became practical for shorter and shorter distances.
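As a rough illustration of the loss gap, here’s a small sketch. Both attenuation figures are ballpark assumptions on my part (single-mode fiber is commonly quoted around 0.2 dB/km; high-speed electrical cables lose on the order of a dB per meter), not numbers from the article:

```python
# How much signal power survives a given distance, for assumed per-meter
# losses. The figures are ballpark assumptions: ~0.2 dB/km for single-mode
# fiber, ~1 dB/m for a high-speed electrical cable.

def surviving_fraction(loss_db_per_m: float, distance_m: float) -> float:
    """Fraction of launched power remaining after the given distance."""
    total_loss_db = loss_db_per_m * distance_m
    return 10 ** (-total_loss_db / 10)

FIBER_LOSS = 0.2 / 1000  # dB per meter
COPPER_LOSS = 1.0        # dB per meter (assumed)

for dist in (3, 100, 2000):  # roughly rack, row, and campus scale, in meters
    fiber = surviving_fraction(FIBER_LOSS, dist)
    copper = surviving_fraction(COPPER_LOSS, dist)
    print(f"{dist:>4} m: fiber keeps {fiber:.1%}, copper keeps {copper:.1e}")
```

The exact numbers vary enormously with cable type and signal frequency, but the shape is the point: electrical loss compounds brutally with distance, which is why fiber won long-haul first and keeps creeping toward shorter links.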

In addition, light can carry more “bits” of information than electricity. Light has more transmissible physical properties, such as wavelength (color) or polarization, that can each encode “bits” of information, so you can achieve a higher transfer rate. Would recommend reading this explainer on how data is transferred over the wire and then continuing.

Seriously, it’s quite good.
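To make the “more bits” idea concrete, here’s a minimal sketch of the wavelength trick described above: several colors of light share one fiber, each carrying its own data stream. The channel counts and per-wavelength rates are made-up round numbers, not specs from the article:

```python
# Aggregate throughput of a fiber carrying several independent wavelengths,
# each modulated with its own data stream. Channel counts and per-channel
# rates are illustrative, not taken from the article.

def aggregate_gbps(num_wavelengths: int, gbps_per_wavelength: float) -> float:
    """Total fiber throughput: channels multiplied by per-channel rate."""
    return num_wavelengths * gbps_per_wavelength

print(aggregate_gbps(1, 100))  # one wavelength:    100 Gb/s
print(aggregate_gbps(4, 100))  # four wavelengths:  400 Gb/s
print(aggregate_gbps(8, 100))  # eight wavelengths: 800 Gb/s
```

A copper link, by contrast, generally has to add more wires or faster signaling to scale its capacity.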

Further learning, since I don’t have time to get into the nuances of optics:

Software standards

So far, we’ve only talked about hardware changes. But where there are hardware changes, there are software changes.

For networking, Ethernet seems to be sticking around. However, most intra-computer and chip-to-chip communication currently uses PCIe. There’s an open question of how closely hardware vendors will stick to the existing PCIe interface versus defining new standards that better exploit the benefits of optics. There’s currently a strong incentive to interface with PCIe, which is what Lightmatter does, because it allows integration with current hardware. As more companies target optical computing, they might drag PCIe standard updates along with them.
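For a sense of the electrical baseline that optics gets compared against, here’s a quick per-lane PCIe bandwidth calculation. The transfer rates and 128b/130b encoding are the standard published figures for PCIe 3.0 through 5.0, though treat the output as approximate:

```python
# Approximate usable PCIe bandwidth per lane and for a x16 slot.
# 8/16/32 GT/s and 128b/130b encoding are the published figures for
# PCIe 3.0, 4.0, and 5.0.

TRANSFER_RATE_GT_S = {"3.0": 8, "4.0": 16, "5.0": 32}  # per lane
ENCODING_EFFICIENCY = 128 / 130                        # 128b/130b line code

def lane_gb_per_s(gen: str) -> float:
    """One-direction usable bandwidth of a single lane, in gigabytes/second."""
    return TRANSFER_RATE_GT_S[gen] * ENCODING_EFFICIENCY / 8  # bits -> bytes

for gen in TRANSFER_RATE_GT_S:
    x1 = lane_gb_per_s(gen)
    print(f"PCIe {gen}: ~{x1:.2f} GB/s per lane, ~{16 * x1:.1f} GB/s for x16")
```

Matching these numbers over PCIe is what lets a company like Lightmatter slot into existing systems today; whether future optical links keep that interface or define something new is the open question the article raises.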

Conclusion

Looking back at the introduction of the article, we find

Optical communication has been in use for several decades, starting with long-haul communications, and evolving from there to connect external storage to server racks, and then to servers within those racks. More recently, it is being eyed for communication between different modules and advanced packages and ultimately it could end up as a way of speeding up data movement inside of advanced packages with minimal heat and power.

We might summarize it to our former self as

Optical communication is already used for inter-server communication. Soon it will be used for inter-chip communication, and eventually for intra-chip communication.

It’s always humbling when you realize the answer was in front of you the whole time. Still, it takes understanding to extract the key sections of text and to parse their meaning. I will say the article could do a better job of separating different insights into different sections. Then again, the author was writing for an audience more familiar with the technical terms. Oh well, confusion is the sweat of learning :)

Leftover bits

  • If you think about a data center as a large computer (see The Datacenter as a Computer), optics is already being used for a lot of that computer.

  • If you don’t know what a backplane is, the best image I could find to demonstrate it was this image.