Thursday, May 31, 2007

Scaling big web sites

Peter Van Dijck's Guide to Ease blog contains a nice list of presentations from a few big Web 2.0 properties. Definitely an interesting read, especially given that I've worked with some of the largest properties during my time at Citrix/NetScaler (e.g., Google, eBay, Amazon, MSN, etc.) and as a result have typically seen this discussion from the networking side. Most of the presentations listed take the view from the application developer side, which is indeed a different beast.

Wednesday, May 30, 2007

It's Like the BMW 5-Series...

xkcd had an amusing comic strip the other day referring to Godwin's Law. In case you're not familiar with the law, it states:

As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches one.

In the spirit of Godwin's Law, I'd like to propose Steve's Law:

As a marketing discussion grows longer or involves an increasing number of people outside of marketing, the probability of a comparison to some aspect of a car approaches one.

I've lost count of the number of times someone has tried to compare the realities of selling a complex piece of networking gear to an (occasionally) educated enterprise customer to some aspect of a car. I understand how this happens -- it's an easy parallel to try and draw and most likely everyone in the room will be able to follow it. Ditto with using the iPod as an example.

The problem with doing such a comparison is that the analogies are usually bogus. No matter how hard you try, you can't make a firewall the ultimate driving machine.

In general, I'm loath to draw comparisons between consumer-focused products and their marketing campaigns and selling high tech to enterprises. The two are completely unlike one another. And don't give me that line about enterprise buyers being a subset of the consumer space. That isn't true and you know it.

To start with, a consumer is generally making a purchase that only impacts himself or possibly his family. With rare exception (e.g., a house), a consumer is typically spending an inconsequential amount of money. If the purchase is wrong, he may be pissed off, but it is unlikely you're going to be changing his life in a measurable way. By comparison, an enterprise buyer is likely to affect a significant number of users and their ability to get work done. A bad buy not only costs the company hundreds of thousands of dollars in equipment, but possibly millions of dollars of lost productivity. Screw up badly enough and your job is on the line.

In addition to impact comes education. Someone looking to commit a few million dollars to a storage area network is going to do homework on the topic first. He's going to get a handle on the market, learn who's who, learn about the technology itself, and possibly even sit down and assess what features really matter to his business. When selling to this person, you assume a level of education, and given the potential price tag, you spend the time making sure they understand the ins and outs of what you're pitching. This is not the same as going to Consumer Reports, seeing what they picked for the best spatula, and then heading down to the local Spatula City.

The differences go on. And really, if you're still not sold, you haven't read this far anyway and are probably working on an email to me about how your pre-chasm-crossed-blue-ocean-long-tail-enabled-technology is so much like the BMW family line that you're using the same product numbers.

There are of course valid exceptions to Steve's Law. There are times when a good consumer-based example highlights an element of a marketing campaign that you're trying to explain. For example, in a positioning exercise where everyone in the room may be struggling with the very idea of what marketing positioning is, there are some excellent consumer-based examples worth highlighting. They give a baseline to point out what a market position is, why it matters, and why it is not the same thing as a tagline.

Just don't try to show how 7-Up being the un-cola is the same thing as iSCSI being the un-Fibre Channel.

Ugh, I had to control my gag reflex just typing that...

Friday, May 25, 2007

DKIM Correction

Courtesy of some comments on yesterday's DKIM post, it turns out my commentary doesn't matter: the CNET article was wrong, and I missed that fact while poking around DKIM's web site.

Richi Jennings's blog explains the essence of DKIM quite nicely. To summarize, DomainKeys Identified Mail (DKIM) is simply there to protect against forgeries. This means there exists an easy way to know whether an email that claims to come from paypal.com really is coming from PayPal. Of course, this does nothing to protect users from getting email from similar-to-paypal.com, nor is it meant to. DKIM is one part of what is to become a larger solution for establishing a scalable web of trust that does not require complex end-user interaction, unlike PGP or S/MIME.
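
To see the moving parts, here's a minimal sketch of the sign-and-verify flow, assuming the third-party dkimpy Python package. The domain, selector, and key file are made up, and verification only passes if the matching public key is really published in DNS.

```python
# A minimal sketch of the DKIM flow, assuming the third-party dkimpy
# package (pip install dkimpy). The domain, selector, and key file are
# hypothetical.
import dkim

message = (
    b"From: billing@example.com\r\n"
    b"To: customer@example.org\r\n"
    b"Subject: Your statement\r\n"
    b"\r\n"
    b"Hello, your statement is attached.\r\n"
)

with open("example.com.dkim.key", "rb") as f:
    private_key = f.read()

# The sending mail server adds a DKIM-Signature header covering the body
# and selected headers, signed with the domain's private key.
signature = dkim.sign(
    message,
    selector=b"mail",
    domain=b"example.com",
    privkey=private_key,
)
signed_message = signature + message

# The receiving server fetches the public key from DNS at
# mail._domainkey.example.com and checks the signature. A passing check
# only proves the mail really came from example.com; it says nothing
# about mail from a look-alike domain.
print(dkim.verify(signed_message))
```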

Interesting? Kind of. Long term usefulness? Well, I'm not sold on it yet.

Wednesday, May 23, 2007

Why DKIM Will Fail

CNET reports that the DomainKeys Identified Mail (DKIM) project just got preliminary approval from the IETF. The usual suspects in the open source crowd (e.g., Sendmail, Postfix) have added support, and a few closed source guys have too. In addition to software support, the big names in email providers (AOL, Yahoo, etc.) have added support on their end.

Notably missing is Microsoft, which is significant given that most of the corporate world runs on Exchange.

Here are two reasons why DKIM is destined to become yet another irrelevant standard:

  1. Without Microsoft's backing, DKIM will lack the wide scale adoption necessary to be effective.

  2. Spammers will simply sign their spam.

On the first point, you have to stop and think about how email as a market is segmented because raw numbers alone are misleading. Email is broken down into two major categories: enterprise email and everything else. The everything else bucket includes the big email providers like Yahoo, AOL, and Hotmail, as well as countless hosting providers. User for user, the everything else bucket is *huge*, but largely consumer based. While they can make the C2C and B2C worlds move, they can't make the B2B world move. Thus the kicker: enterprise email is dollar for dollar more significant.

The people willing to pay the hosting providers for services tend to be businesses that need services above basic email and are willing to pay for them. These businesses need to communicate with everyone, and unless all the other enterprise users start using DKIM, Yahoo and friends will need to continue allowing non-DKIM email. After all, hosting providers are not going to anger their highest margin customers.

As for the second point, spammers have long adopted the hit and run approach to spamming. There are two approaches to working around DKIM. Approach 1: Get a legitimate domain with a legitimate signature, spam until the signature is not trusted. Ditch the domain and get a new signature. Approach 2: Leverage botnets to send email from users that have trusted accounts through trusted email servers. Continue until either the user is not trusted and the ISP shuts them down or the user cleans up their machine.

Either way, spammers will get a signature and users will continue to see spam in their inbox.

At this point, I believe that spamming is going to be an indefinite problem, much like how junk mail is an indefinite problem. The most effective approach to blocking spam will continue to be content-based filtering, and the arms race between content filtering technology and spammers will continue. Anti-spam companies know this -- despite DKIM and countless other "stop them at their source" projects, anti-spam's long-term prospects continue to look good.
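
To make the content-filtering point a little more concrete, here's a toy sketch of the Bayesian word scoring most filters build on. This is not SpamAssassin's implementation (that's a large rule set combined with a Bayes engine), just the core idea.

```python
# A toy sketch of Bayesian word scoring, the core idea behind most
# content-based spam filters. Illustration only, not SpamAssassin.
import math
from collections import Counter

spam_corpus = ["cheap meds online now", "win cash now", "cheap cheap offer"]
ham_corpus = ["meeting moved to noon", "quarterly numbers attached", "lunch tomorrow"]

def word_counts(corpus):
    counts = Counter()
    for message in corpus:
        counts.update(message.lower().split())
    return counts

spam_counts, ham_counts = word_counts(spam_corpus), word_counts(ham_corpus)
spam_total, ham_total = sum(spam_counts.values()), sum(ham_counts.values())

def spam_score(message):
    # Sum per-word log-likelihood ratios, with a small smoothing term so
    # words we've never seen don't blow up the math.
    score = 0.0
    for word in message.lower().split():
        p_spam = (spam_counts[word] + 1) / (spam_total + 2)
        p_ham = (ham_counts[word] + 1) / (ham_total + 2)
        score += math.log(p_spam / p_ham)
    return score  # > 0 leans spam, < 0 leans ham

print(spam_score("cheap meds now"))     # positive: looks spammy
print(spam_score("quarterly meeting"))  # negative: looks legitimate
```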

Time to go update SpamAssassin...

Friday, May 18, 2007

Random Aside: C2C at DMC 2004

It's Friday afternoon and time for a quick break from the norm...

As I've mentioned before, I DJ as a hobby. You can even hear some of my mixes, if you're so inclined. One of the stranger side effects of having practiced this skill for 13 years now is that I tend to listen to music with an ear towards production, through an electronica filter. I hear layers. I hear loops. I hear segments of music as if they were to be assembled and post-processed. I can even fall out of phase with a beat, which is a very odd thing to experience. It makes listening to albums like Brothers Gonna Work It Out by the Chemical Brothers quite fun.

That said, hearing a live album with all of the little warts that come with it is always a good time. Hearing a live DJ do something creative -- even better. Now this video is just stunning...

C2C, the winners of the 2004 DMC competition, have their video posted over at YouTube. This is absolutely incredible. Five turntables, four mixers, and four DJs, assembling a unique song by piecing together elements of other songs in real time. The individual elements are very simple, short segments of various other songs in their elemental form (e.g., just a piano piece, just a drum loop, just a vocal sample, etc.), so they usually have three to five turntables adding to the final song at any given moment. Even if you don't regularly listen to electronica (or its many sub-genres), it's worth giving this five minute video a spin. It's true artistry with musical influence from many other genres, including some (really) old school Bollywood.

Thursday, May 17, 2007

A Quick Security Reminder

Salon has a nice little article titled The secret Iraq documents my 8-year-old found. How exactly did the author's 8-year-old find these documents? If you guessed Microsoft Word's mark-up feature, you would have guessed correctly.

Over the last several years, all kinds of little jewels like this have cropped up. I used to pull up old versions of people's resumes by simply viewing Word documents with the Unix strings command. A previous employer changed their negotiation tactic with an OEM after clicking through undo a few times. And then there was that contract I rewrote for RSA... That was one of my favorites. Their legal department must have been steaming as they redlined the terms I added for charging them every time they used one of my new and improved contracts.
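
For the curious, the strings trick is trivial to reproduce. Here's a rough Python equivalent that dumps every printable run out of a binary Word file; the file name is made up, and there's nothing Word-specific about it.

```python
# A rough Python equivalent of running the Unix strings command against a
# binary Word file: dump every run of printable ASCII, which is often
# enough to expose "deleted" text and leftover revisions. The file name
# below is made up.
import re
import sys

def printable_strings(path, min_len=6):
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    with open(path, "rb") as f:
        data = f.read()
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "resume.doc"
    for text in printable_strings(path):
        print(text)
```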

In case you haven't figured out my point yet... don't send Word documents around when you don't want to risk someone playing with them. Use PDF when you can. If you absolutely have to send a Word document, cut the entire document and paste it into a new document when you're done to get rid of all the fast-save, undo, and markup metadata.

Wednesday, May 16, 2007

What is SOA?

Big news. Gartner predicts that the Service Oriented Architecture (SOA) market will explode.

I'd provide a link to the article, but really... Is anyone surprised that Gartner is predicting the explosive growth of anything?

Okay, let's be fair here. The problem isn't whether Gartner is right, although in this case I happen to agree with their assessment. The problem is the definition of SOA. If you poke at it enough, you'll quickly find that SOA as a technology expands far and wide. Anything with an XML interface is going to fall under the definition of being SOA enabled as far as the vendor is concerned. This little detail makes the market definition tricky.

Taking the market analysis hat off for a minute, the bottom line is that for an enterprise looking to break its application architecture up into bite-sized modules that communicate via web services, SOA is the hot tip. You want to know why? Dust off the history book, it's time for a quick read...

When application development started back in the 1960s, there was really no significant network to communicate over. Application software was written as monolithic blobs of code, and modularity was enforced at an architectural level. As applications became increasingly unwieldy and the size of servers grew, the justification for object oriented programming came with it. Modularity beyond objects took the form of inter-process communication on large SMP servers. Networks, being relatively slow, weren't an ideal choice for carrying that communication. Unless you could afford an extremely low latency specialty network, it just wasn't a practical solution.

Now come around to the mid-90s. The 100Mbps Ethernet standard had taken shape and the cost of network infrastructure for it was relatively affordable. The Parallel Virtual Machine (PVM) libraries and the Message Passing Interface (MPI) standard emerged in 1989 and 1994 respectively, making development of distributed software much easier. The result was that cluster computing had a genuine shot at taking down the requirement for large and expensive SMP machines for the first time. It also meant that inter-process communication had finally standardized in a cross-platform way.
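
If you've never seen what MPI-style message passing looks like, here's a minimal sketch using the mpi4py bindings (a modern Python convenience; the original libraries are C and Fortran). It assumes an MPI runtime is installed and the script is launched with something like mpirun -np 2.

```python
# A minimal taste of MPI-style message passing, sketched with the mpi4py
# bindings. Assumes an MPI runtime; run with: mpirun -np 2 python demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Process 0 ships a chunk of work to process 1 and waits for the answer.
    comm.send({"task": "sum", "data": [1, 2, 3]}, dest=1, tag=0)
    result = comm.recv(source=1, tag=1)
    print("result from rank 1:", result)
elif rank == 1:
    work = comm.recv(source=0, tag=0)
    comm.send(sum(work["data"]), dest=0, tag=1)
```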

The problem with MPI and PVM, however, is that they stayed largely aimed at the High Performance Computing (HPC) space. Furthermore, while it is possible to use them in an object oriented context, they don't easily lend themselves to that model. Now pause this development for just a second and hop over to the XML crowd.

The mid-to-late 90s saw the resurrection of SGML in the simpler form of XML after HTML had become the black sheep of the SGML children. 1998 saw the first development of SOAP as an XML-based inter-process communication mechanism. SOAP really started seeing the light of day when Microsoft made a big push for their .Net architecture around 2002/2003.
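
To ground what "SOAP as an inter-process mechanism" means in practice: a SOAP call is just an XML envelope POSTed over HTTP. Here's a bare-bones sketch; the endpoint, namespace, and GetQuote operation are all made up, and real deployments typically generate this plumbing from a WSDL rather than writing it by hand.

```python
# A bare-bones sketch of what a SOAP call amounts to: an XML envelope
# POSTed over HTTP. The endpoint, namespace, and GetQuote operation are
# hypothetical.
import http.client

envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetQuote xmlns="http://example.com/stockservice">
      <symbol>CSCO</symbol>
    </GetQuote>
  </soap:Body>
</soap:Envelope>"""

conn = http.client.HTTPConnection("soap.example.com", 80)
conn.request(
    "POST",
    "/services/quotes",
    body=envelope,
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": "http://example.com/stockservice/GetQuote",
    },
)
response = conn.getresponse()
print(response.status, response.read().decode())  # the reply is another XML envelope
conn.close()
```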

So here you have it... Applications are getting big. Networks are fast and relatively cheap. Big iron SMP machines are getting more expensive while clusters of cheap PCs running server class operating systems are readily available. The web movement showed that breaking up applications into horizontally scalable solutions is not only doable, it is preferable. But enterprise applications are still missing their version of MPI, the piece that makes machine-to-machine communication really click.

Microsoft may have dropped the ball with the .Net marketing campaign, but it did get enough developers on the .Net bandwagon that critical mass was reached. Finding a .Net capable programmer who understands how to leverage web services for inter-process communication means visiting your local HR department, not hunting down some obscure HPC deployment doing atomic bomb simulations. Lucky for us marketing types, IBM picked up the ball and did a stellar job of coining SOA and making it fully buzzword compliant with CIO visibility.

So going back to the original question... What exactly does and should the "SOA market" constitute? Really, I don't think it matters. What matters is that SOA, or more accurately, SOAP, is the key enabler to make big applications easier to modularize across clusters of small (virtual?) machines in a datacenter. Over the next 10 years, I believe that this has the potential to put some serious hurt on the SMP market (10% of server units, 50% of server revenue). SOA may be fully buzzword compliant, but I genuinely think it's going to change the way applications are written.

Of course, if you really need a market number... Cite me. Rising Edge Consulting puts the SOA CAGR at 134%. I'll be happy to discuss my findings. Did I mention I was a consultant for hire? Contact me, and we'll discuss your needs.

Oh, and tube socks... Even bigger. I have a whole team of researchers working on that.

Monday, May 14, 2007

The Death of InfiniBand

I've always wanted to predict the death of something.

Today is my lucky day - I predict the death of InfiniBand.

Yeah, sure. I'm not exactly going out on a limb here. Consolidation onto Ethernet has been an ongoing saga in networking as one LAN technology after another dies and is replaced with Ethernet. FDDI, ATM, Token Ring, and even early AppleTalk. The list goes on...

However, there have been some things that even Ethernet hasn't mastered yet, and low latency is one of them. The market for low latency networking has continued amongst the High Performance Computing (HPC) crowd, where researchers who can't brute force their way through gigabit Ethernet or afford big SMP machines end up with either InfiniBand or Myrinet.

The problem with the HPC space, however, is that it isn't very big to start with and the growth rate just isn't there. As is traditionally the case, the real growth opportunity lies in the enterprise. The challenge with the enterprise for high speed interconnects is that they aren't ready to blow off their investment in Ethernet. As a result, Myrinet has already announced a 10G Ethernet product. They've made their position pretty clear: we aren't going to beat'm, so we'll join'm. They'll still offer their uber-low-latency products, but administrators will have a choice.

Mellanox, the InfiniBand poster child, may be pushing their InfiniBand story in the press; however, they recently backed a 40G Ethernet proposal. I'm not a genius, but if that doesn't hint at a roadmap, I don't know what does.

In the end, the only player that will be left for the foreseeable future will be Fibre Channel (FC). This isn't because FC has the magic touch that will make it invincible to Ethernet, but it is because FC has a significant installed base that isn't going to toss their investment so quickly. But... If FC over Ethernet (FCoE) takes off as expected, it is only a matter of time before straight up FC goes the way of SNA.

Does all this death create a significant opportunity for low latency Ethernet? That remains to be seen. To date, I have only seen low latency claims achieved through RDMA and I don't buy that applications are going to change to accommodate the protocol shift. Does intra-datacenter latency really matter then? Will people put money behind reduced latency Ethernet that doesn't require changes?

Good questions. I believe that the answer is yes. But that's for another blog entry...

Thursday, May 03, 2007

Vyatta gets round B

I just caught wind that Vyatta has received round B financing to the tune of $11M, bringing its total raised to $18.5M. The latest round comes courtesy of JP Morgan Partners, Comcast Interactive Capital, ComVentures, and ArrowPath Venture Partners.

Vyatta, in case you haven't heard, does open source routing. I've written about this before. In summary, I'm suspicious of the open source on a stick business model.

Here's my problem with Vyatta. The assertion is that because the development cost is lower, they are going to be able to offer their products for cheaper than other vendors. There are a few problems here... The features that go into a low end Cisco (or similar) router are relatively simple and have long since been paid for by their total router sales. Furthermore, the large number of units sold by the established routing vendors makes their COGS very low. So if Cisco's engineering cost is relatively low because the bigger products do most of the work and the low end teams only need to package and refactor, they're going to see the same (if not greater) cost benefits as an open source router vendor.

Take a Cisco 2801 branch office router with the VoIP software. Street price is around $2,500. A Juniper J2300 has a street price of around $1,500. A Vyatta appliance with software and support is $2,200. The Vyatta appliance has nowhere near the feature set of the 2801 and no promise of those additional features anytime soon. The software alone goes for between $600 and $1,500 depending on support. Assuming an ASP of $1,000, they have to sell 1,000 units just to make $1M. For a startup, moving 1,000 units of anything is not easy, especially when the "big" vendors are selling for around the same price, possibly cheaper (e.g., the J2300 vs. the enterprise Vyatta appliance).

In the end, the average user doesn't care if the source is open. The casual IT administrator can't do anything with a pile of C code on an operating system that he isn't familiar with. It's just gotta work, and at the cost of a PC, if it breaks or the vendor is being flaky, it's often cheaper and easier to simply move to another vendor than to fight the problem.

I of course wish Vyatta the best of luck. However, at this point at least, I don't see a bright future for them.