Intel has proposed a new standard, Compute Express Link (CXL), for linking multiple GPUs together, and it may massively alter the economics of compute.
The concept of running multiple GPUs simultaneously and distributing tasks between them has a long history, reaching back at least to Scan-Line Interleave (SLI) on 3dfx cards in the late 90s, which alternated responsibility for individual scan lines between two (or, rarely, more) cards.
Multiple cards can be linked together in one system today, but they generally need to be identical models, and for various reasons it is very challenging to synchronize them and balance the load between them. Cards may need to shift rapidly between different shader or geometry tasks, and in practice, faster cards often have to pause for a laggard to catch up.
CXL may change that – by improving the load-balancing mechanics and, crucially, by enabling tasks to be split dynamically and asymmetrically between (perhaps radically) different cores or dies, even ones that were designed for classic PCIe, in a bizarre mixed marriage of protocols.
If Intel and its partners can pull this off in practice, it could make computing significantly more modular. Instead of replacing one powerful card with another, one might realistically keep the old card(s) in the system to assist with less demanding tasks, in a way that doesn't compromise the performance of the newer GPU. CXL also offers cards a more direct, unsegmented form of memory access and addressing that can bypass the CPU as a bottleneck.
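As a rough illustration of what such an asymmetric split might look like, here is a minimal sketch of a proportional scheduler. The device names and throughput figures are invented for the example, and nothing here reflects an actual CXL API – the point is only that work can be divided in proportion to each card's measured speed rather than evenly.

```python
# Hypothetical sketch: splitting a workload asymmetrically across
# devices of different speeds, in proportion to measured throughput.
# Device names and rates are illustrative assumptions, not a CXL API.

def split_work(total_items, throughputs):
    """Assign each device a share of the work proportional to its
    measured throughput (items/second), so all finish at roughly
    the same time."""
    total_rate = sum(throughputs.values())
    shares = {}
    assigned = 0
    devices = list(throughputs)
    for dev in devices[:-1]:
        n = round(total_items * throughputs[dev] / total_rate)
        shares[dev] = n
        assigned += n
    # Give any rounding remainder to the last device.
    shares[devices[-1]] = total_items - assigned
    return shares

# Example: one new flagship GPU alongside two older, slower cards.
rates = {"gpu_new": 90.0, "gpu_old_1": 20.0, "gpu_old_2": 10.0}
print(split_work(1200, rates))
# → {'gpu_new': 900, 'gpu_old_1': 200, 'gpu_old_2': 100}
```

Because the shares track throughput, an old card contributing only 10% of the aggregate rate still takes 10% of the work off the flagship's plate – which is the modularity argument in a nutshell.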
Potentially, this could radically alter the economics of GPU-heavy tasks such as various forms of crypto and A.I. With longer effective lifetimes, GPUs would be held onto for longer, with correspondingly less e-waste.
It could also reshape computing more generally, making it far easier to integrate older, deprecated hardware (a last-generation smartphone, for example) or dedicated clusters into a decentralized compute cloud.