Okay, I admit it. SMP is cool. More processors in the machine makes coolness rise exponentially, not just additively.
So why don't we have a bunch of big SMP machines running the enterprise? Well, actually, we do. While 90% of servers are at most 2-way x86 machines, they represent only about 50% of server revenue as of Q4 2006. The other 50% of the revenue comes from big SMP beasts that cost quite a bit more but represent far fewer machines. So big SMP is not only there, it is quite alive and kicking.
Why is it, though, that we have come to the point where the ratio of physical machines is 9:1 in favor of small servers?
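To put a rough number on "quite a bit more", here is a back-of-the-envelope sketch in Python. It assumes the remaining 10% of machines are all the big SMP boxes, which is a simplification on my part, and the variable names are just for illustration.

```python
# Shares taken from the figures above: 90% of machines are small servers,
# and each class brings in about 50% of server revenue.
small_unit_share = 0.90
small_revenue_share = 0.50

# Assumption: everything that is not a small server is a big SMP machine.
big_unit_share = 1 - small_unit_share
big_revenue_share = 1 - small_revenue_share

# Revenue share divided by unit share gives relative revenue per machine.
small_per_machine = small_revenue_share / small_unit_share
big_per_machine = big_revenue_share / big_unit_share

ratio = big_per_machine / small_per_machine
print(f"Average big SMP machine brings in about {ratio:.0f}x the revenue of a small server")
```

Under those assumptions, the average big box accounts for roughly nine times the revenue of the average small one, which is just the 9:1 machine ratio turned on its head.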
Ahhh... here is where a little IT experience goes a long way. :-)
IT, like any organization, has all of the pros and cons of being run by humans. One of the cons is that humans, for the most part, prefer small, succinct solutions to point problems. This makes the problem easier to comprehend and the solution more malleable. A server dedicated to DNS, for instance, is easy to digest. I know exactly what expertise I need to run the server, and the interaction between the environment and the application (DNS) is clear and well defined. There aren't other applications creating complexity. It is this same logic that has created the market for network appliances - one application with one hardware/OS combination with one owner that understands the whole system inside out. Thus, there is only one place to point the finger when things run amok, and the blame cannot be passed along to others.
Big integrated "God boxes" by comparison are a bit of an uphill climb. They require that the administrator truly grok all of the elements and their interactions. The adoption of a God box happens in one of three situations: (1) The interaction between the elements is so complex that it is clear to the administrator he will never comprehend it and thus an integrated solution with one vendor is preferred, (2) The interaction between the elements is so well understood that the benefits of a highly integrated solution can be seen, or (3) There is a cost benefit that is so astounding that the administrator would be foolish to overlook it.
Integrated fax/printer/scanners are an example of this - the elements and their interactions are well understood and there is a significant cost benefit to using it. This makes an administrator overlook the risk of losing all three functions if one of them should go bad. Integrated networking systems are in a similar boat where there is a combination of cost benefit as well as simplification of an otherwise intensely complex system. By abstracting the complexity to within one system, there is now one vendor to point at.
In the world of systems administration, such integrations from vendors never really emerged. A support contract with Sun, for instance, guarantees that they'll replace/fix hardware and provide software bug fixes, but configuration hiccups are my own problem. If I choose to run many services on a server at once with a complex interaction, I'm responsible for their configuration and management.
In the early 90s, with no appliance vendors to speak of, the solution to de-risking complex configurations in Unix servers was simple - buy a few small servers instead of one large one. Each small server could then provide a single function, thus reducing complexity. If there is a problem with the software, there is no question of its configuration interacting poorly with other applications. The rise of x86-based servers running Linux and Windows drove the transition home. Today, only those applications which mandate large SMP systems get them.
With servers doing fewer things and servers getting more powerful, the market for virtualization was created. We want to keep the compartmentalization of the application but gain the benefit of running multiple applications on a single CPU. As a result, the non-SMP to SMP ratio is likely to remain high and possibly get higher.
So amongst the SMP crowd, is there opportunity to eat into that market? Possibly... There are two approaches to further removing the need for large SMP systems: (1) SOA-ification of applications, and (2) Creating virtual SMP clusters with commodity x86 hardware.
Let's start with item 2. Historically, creating virtual SMP machines has been a tough sell. The technology has been around since the early 80s in the form of MOSIX. Efforts around distributed shared memory in the early 90s furthered the process. Unfortunately, these efforts largely stayed with the academics. That is, until Qlusters came around in the early 2000s. Part of the team that started OpenMOSIX created a company around their effort so that they could sell a commercially supported implementation of MOSIX. Early adopters were in the HPC crowd that needed the massive scalability of x86 clusters but didn't want to have to adapt their applications to use clustering software like MPI or PVM. With MOSIX, they just wrote their program as if they had infinite processor and memory space - a much easier proposition, especially for people that weren't programmers by nature (e.g., scientists and mathematicians). However, none of these projects ever really took off in the enterprise. This is most likely due to the fact that large ISVs never supported the configurations.
Another issue with virtual SMP clusters is that of host-to-host latency. If you have an application that needs to do a lot of random memory accesses, getting memory from another host becomes a very expensive part of the equation. Low-latency fabrics like LLE and InfiniBand help with this, but add significantly to the cost of the overall solution.
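A quick way to see why this hurts is to treat average memory access time as a weighted mix of local and remote fetches. The Python sketch below does just that. The function name and every latency figure in it are illustrative assumptions of mine, not measurements of any particular fabric or product.

```python
def average_access_ns(remote_fraction, local_ns, remote_ns):
    """Weighted average access time when some fraction of accesses go to another host."""
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical figures, chosen only to show the shape of the problem.
LOCAL_NS = 100              # assumed local memory access
COMMODITY_NS = 50_000       # assumed fetch from another host over a commodity network
LOW_LATENCY_NS = 2_000      # assumed fetch over a low-latency fabric

for remote_fraction in (0.0, 0.01, 0.10):
    commodity = average_access_ns(remote_fraction, LOCAL_NS, COMMODITY_NS)
    low_latency = average_access_ns(remote_fraction, LOCAL_NS, LOW_LATENCY_NS)
    print(f"{remote_fraction:>5.0%} remote: "
          f"commodity {commodity:,.0f} ns, low-latency {low_latency:,.0f} ns")
```

Even with made-up numbers, the pattern is the point: a small fraction of remote accesses dominates the average, and a faster fabric shrinks the penalty without making it go away.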
Item 1 by comparison is gaining serious momentum, especially with the ISVs. I've written about this before -- the use of SOA is partially the creation of Microsoft with their push on the .NET side as well as the overall industry making a move to XML for everything. The great thing about SOA is that it compartmentalizes things in a way that classic system administrators like - one server doing one thing really well.
Which wins in the end? Well, SOA is without a doubt going to be a big part of the solution, but I'm not ready to write off virtual SMP yet. I think there are some startups doing some neat work here, and there is the VMware card as well, since they are well poised to extend their virtual SMP model across multiple physical hosts.
Let the games continue...