Sunday, July 13, 2003

Distributed Computing Economics

Via Slashdot | Distributed Computing Economics: Jim Gray has a great article about the economics of distributed computing, Distributed Computing Economics. It lays out how to quantify the benefits of distributed computing and which kinds of problems are a good fit for it. It cuts through a lot of the hype created around Grid Computing and On-Demand and puts them into perspective.

There is also what I would call a companion article, Mailing Disks is Faster than Uploading Data, that lays out the economics of data transfer. Taken together, these articles make a very good case for designing systems that move as little data around as possible and for centralizing the storage of data as a service in the network.
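
To make the data-transfer argument concrete, here is a rough sketch of the arithmetic in Python. The function names, the link speed, and the shipping time are my own illustrative assumptions, not figures taken from either article; plug in your own numbers.

```python
def network_transfer_hours(size_gb: float, bandwidth_mbps: float) -> float:
    """Hours needed to push size_gb (decimal gigabytes) through a link
    that sustains bandwidth_mbps megabits per second."""
    bits = size_gb * 8 * 1000**3                  # gigabytes -> bits
    seconds = bits / (bandwidth_mbps * 1000**2)   # megabits/s -> bits/s
    return seconds / 3600


def faster_option(size_gb: float, bandwidth_mbps: float, shipping_hours: float) -> str:
    """Return "network" or "ship", whichever delivers the data sooner.

    shipping_hours is a hypothetical door-to-door courier time and is
    assumed not to depend on how much data is on the disks.
    """
    upload_hours = network_transfer_hours(size_gb, bandwidth_mbps)
    return "network" if upload_hours <= shipping_hours else "ship"


# Illustrative values only: a 10 Mbit/s link and an overnight (24 h) courier.
print(faster_option(1, 10, 24))
print(faster_option(1000, 10, 24))
```

With these assumed values, a single gigabyte is better sent over the wire, while a terabyte would tie up the link for more than nine days. The crossover point moves with the link speed, but the shape of the trade-off stays the same: the more data you have to move, the worse the network looks.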


With cheap storage we are getting incredibly lazy and creating data everywhere. Our sysadmin regularly sends me e-mail about the bloat in my mailbox. Enterprises have databases everywhere, and applications tend to be very database-centric.


Most enterprise applications store significantly more information than they actually need. The major impact of this is a data synchronization problem. There are many companies building and selling solutions to synchronize this data, but they are solving the wrong problem. We are designing applications wrong, mainly because we do not realize the impact of having data everywhere; after all, storage is cheap. The real issue, though, is that managing and synchronizing that data is very expensive. Solutions like SForce are delivering storage as a service. This forces people to think in new terms: since storage is essentially free, what they are charging for is management. The economies of scale they can bring to managing data reliably are enormous. Over time, I assume they will also provide the visibility and management tools that allow you to understand the flow of your data, as it is now becoming part of your services network.


The bloat in my in-box is mainly due to office documents. The reason I get the original and not a link is that the sender wants to send me a snapshot, i.e. the document at that point in time. Because the file system has no versioning (where is VMS when we need it?), they cannot send a link to a specific version, so they send the actual document. One company that has a really cool approach to this problem is Its the Content. They have the potential to change how information workers interact and to significantly reduce the amount of content in the workspace. [Disclaimer: I have an advisory relationship with ITC.]


The fact that it is cheap to build distributed systems and share information does not remove the need to consider the fundamental economics and the complexity they introduce. The companies that understand this will be able to take advantage of the services network; the others will become mired in complexity.