Friday, February 27, 2009

Scalable Data Stores

This week I came across two great articles about scalable data repositories with MySQL. The first article is from Bret Taylor, where he explains how they use MySQL at FriendFeed. The second one is from Jurriaan Persyn, who discusses database sharding at Netlog. Both are fantastic reads. These are two examples of creative approaches to handling massive scalability requirements. They join a number of other projects like Cassandra and Voldemort looking for ways to scale out to meet the requirements of the most demanding applications. A lot of these approaches today might seem like best practices more than bleeding edge, although I say this with reservation because these techniques are still not easily embraced by the majority of the corporate world.

However, what truly amazes me is the true genius of companies like Inktomi, Yahoo and Google that came to the same conclusions 10 years ago. They had the vision to understand the challenges ahead and the courage to follow the path less traveled. They did not settle for suboptimal solutions using commercially available technology. Instead they chose to innovate: writing their own file system, their own distributed storage and their own algorithms. True genius, the courageous kind.  
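To make the idea of sharding a bit more concrete, here is a minimal sketch in Python. It is not how FriendFeed or Netlog do it (their articles describe their own schemes); it only illustrates the basic idea of routing each key to one of several databases. The shard names and the shard_for function are made up for illustration.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to database hosts.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(user_id, shards=SHARDS):
    """Pick a shard by hashing the key, so the same user always maps to the same shard."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]
```

A simple modulo scheme like this one reshuffles most keys when the number of shards changes, which is one reason real systems often prefer a lookup directory or consistent hashing.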

Wednesday, February 25, 2009

thoughts

A few weeks ago I started looking at Salesforce's Cloud development platform. Initially I was a bit skeptical because I had never thought of Salesforce as a platform provider. I have used the SFA and CRM applications many times and, although I have always liked their simplicity and performance, I did not know what to expect from the underlying infrastructure as a development platform.

I must admit that I have been very pleasantly surprised. Salesforce is well known for its aggressive marketing but the quality and depth of their technology should not be overlooked. However what has impressed me the most is their pragmatic approach to software development and the quality of their documentation and developer support.

What do I mean by a pragmatic approach to software development? I mean that Salesforce has developed technology that is truly focused on solving a business problem. As you look around their API, data models, tools and configuration options, it is easy to see what specific problem they address. You will not find technology for technology's sake.

The quality of their documentation and developer support is quite remarkable. Salesforce truly makes it easy for developers to get started. Their support turnaround time is terrific and their user and developer communities are active and vibrant. I will discuss their platform in a future post but as of now I consider it a leading Cloud platform that should be carefully considered by anyone looking to build an application on the Cloud.

Monday, February 23, 2009

Open Data Mashups

A few days ago the smart fellows from JuiceAnalytics published an interesting Treemap using data from a draft of the economic stimulus bill. I am a big fan of Treemaps and I think they executed very well. 
This exercise reminded me of several comments on Open Government. I believe it was President Obama who a few weeks ago was talking about taking data from several government agencies and offering it to the public like Open Source. His theory was that citizens would take it upon themselves to analyze the data and hold their government accountable.
I find the concept fascinating. Maybe we could take the tax returns of individuals in power and pass them through a TurboTax Web Service to detect irregularities. At least it would help narrow down the field of candidates for cabinet positions.

Sunday, February 22, 2009

Weekly roundup

This is the Cloud Computing announcement that I found most interesting this past week:
  • Juniper and IBM partner for Cloud management. The Cloud ecosystem continues to grow but now the emphasis is on the networking operational processes. The effort and awareness generated by these giants will continue to accelerate Cloud developments and deployments. However I get a bit nervous when too much attention is centered around private clouds. A few cases will justify the investment in a private Cloud but for most cases I think it defeats the purpose. In my mind embracing public Clouds securely will create a more open world.

Tuesday, February 17, 2009

The Cloud and the RDB

One of the promises of the Cloud is limitless disk storage. This storage is often delivered by massively distributed data engines. The scalability and simplicity of these engines come with some compromises, which tend to surprise people with relational database backgrounds. Tony Blain at RWW does a fantastic job explaining the differences, threats and opportunities. This is the architecture area of Cloud Computing where I find the most concerns and confusion. A superficial understanding of the application requirements can lead to performance bottlenecks or costly architecture mistakes. Some applications will be able to use key-value data stores, others might need to deploy a partitioned row/column relational database, while others might need a combination. Tony does a great job outlining the high level differences. The decision you make can affect your Cloud partners, your team's skill set and your development approach. When looking at data repositories for your cloud application choose the deep dive; it will be time well spent.
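As a rough illustration of the trade-off, here is a small Python sketch that answers the same question against a key-value style store and against a relational table. Everything in it is an assumption for the sake of the example: a plain dict stands in for a distributed key-value engine, an in-memory SQLite database stands in for a relational one, and the sample users and function names are made up.

```python
import sqlite3

# Key-value style: the whole record is stored and fetched under a single key.
kv_store = {
    "user:1": {"name": "Ana", "city": "Austin"},
    "user:2": {"name": "Raj", "city": "Boston"},
}


def kv_users_in_city(store, city):
    """Without a secondary index, filtering means scanning every value."""
    return sorted(v["name"] for v in store.values() if v["city"] == city)


# Relational style: rows and columns, queried declaratively with SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
db.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ana", "Austin"), (2, "Raj", "Boston")],
)


def sql_users_in_city(conn, city):
    """The database decides how to find the matching rows."""
    rows = conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]
```

Lookups by key are trivial in the first model, while ad hoc questions across records are where the relational model earns its keep; which one matters more depends on the application.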

Monday, February 16, 2009

Cloud Success Story

Cindy Waxer wrote a nice article for Fortune Small Business about the experience of a couple of companies using Cloud Computing for mission critical applications. It makes a lot of sense that the majority of early adopters tend to be medium or small size companies. It could be that large corporations are more skeptical, that they have resources to spare or simply that job security stands in the way of innovation.

Cloud Computing Paper

This is an interesting paper about Cloud Computing from the University of California at Berkeley. A lot of its content is focused on AMZN's AWS but I think it does a very good job at explaining the high level opportunities for Cloud Computing as well as the challenges it faces. Don't expect deep technical analysis of architecture, protocols or patterns. Instead you will find solid examples that will satisfy a broad audience. Given all of the fragmented content on Cloud Computing available online, I consider this guide a very solid introduction to Cloud Computing's past, present and future.

Wednesday, February 11, 2009

Weekly roundup

These are the headlines that caught my attention this past week:
  • IBM launches AMZN AMIs. Glad to see IBM move in this direction. Customers will still need to have their own licenses but the provisioning and setup should be much faster. This would have been very handy for a former customer of mine in the automotive industry. We had a 45 day project delay because the production server was not ready.
  • Experian launches QAS Pro On Demand, address verification and standardization on demand. I am a big fan of on demand data quality/enhancement services. In house solutions could be very expensive and cumbersome to put together. 
  • YHOO announces pricing structure for BOSS. I am all for this because nothing can be free forever. I hope other vendors follow suit and charge users for extended use of their services. I also wish everybody (e.g. AMZN AWS) offered a limited set of services at no charge.
  • The New York Times "big" API is now available: 2.8 million articles from 1981 to date.

Tuesday, February 10, 2009

Reading Radar

I came across the wonderful Reading Radar mashup by John Herren. It is a great example of the power of mashups: simplicity and elegance. Of course, John's neat packaging helps enhance the overall experience.

I really liked this mashup for 3 reasons: time to market, maintainability and business angle. Time to market: per John's blog the mashup was very simple to put together, maybe a handful of days. Granted John is very talented but the end product would be well worth 2 or 3 weeks of effort, which is still fairly quick. Maintainability: I love the fact that the content is managed by AMZN and the NYTimes. With an automatic update this application requires no administration overhead. Finally business angle: by leveraging AMZN's associate program John now has the opportunity to profit from this application with minimum ongoing costs (e.g. hosting).

This example helps to underline the tremendous potential of the programmable web. A few dozen reusable services can potentially power thousands of applications and if these applications expose their own APIs then they become building blocks on their own to create millions of new services. I believe this formidable domino effect will create a new virtual marketplace with almost unlimited economic power.

Saturday, February 7, 2009

Weekly Roundup

These are the headlines that caught my attention this past week:
  • Sun announces Cloud Computing Service. This is a somewhat late entry from a struggling company. I look forward to more details; I think they face an uphill battle against more established players like AMZN and GOOG and against more recent offerings from IBM and MSFT.
  • Mosso puts pressure on the pricing of cloud storage and content delivery, directly challenging Amazon. I don't know how these services compare but I am sure Mosso will not be the last vendor to go after AMZN with aggressive pricing. I have never been a big fan of competing on price in a market that is still so young. I would rather see additional functionality be the key differentiator. Particularly when there are so many opportunities to innovate. I hope this is not a sign that the Mosso team is running low on ideas.

Wednesday, February 4, 2009

Cloud Availability

Recently I was talking to an IT executive about Cloud Computing. He was concerned about vendor reliability and SLAs. This is a recurring topic among Cloud skeptics. I often hear them quote the Gmail outages from a few months ago.

Downtime tolerance certainly varies by application and there is a school of thought that proposes that applications with high availability requirements will never move to the Cloud. 

"Independent Applications" that control all of their components should have no problems. Services like Amazon's EC2 provide so much flexibility and control that administrators and developers can feel right at home. Traditional recovery and redundancy approaches can be put in place but with the benefit of having limitless (virtual) resources that can be launched on demand. These are the key ingredients to satisfy even the most demanding environments.

Applications that rely on external services face a bigger challenge. These applications will be as strong as their weakest link. Whether they use public services or services from partners, these Mashups or composite applications will have to develop new patterns to handle exceptions and recovery, maybe through data caching if timeliness requirements permit or maybe by re-routing to secondary providers. Whatever the approach might be, it seems that the first order of priority is working with IT leaders to ensure a smooth migration for the "Independent Applications".
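The caching and re-routing ideas for composite applications can be sketched in a few lines of Python. This is only a sketch under assumptions: each provider is modeled as a plain function that either returns a value or raises the made-up ProviderError, the cache is a simple dict, and fetch_with_fallback and max_age_seconds are names invented for this example.

```python
import time


class ProviderError(Exception):
    """Raised by a provider callable when the external service fails."""


def fetch_with_fallback(key, providers, cache, max_age_seconds, now=time.time):
    """Try each provider in order; if all fail, serve a fresh-enough cached value."""
    for provider in providers:
        try:
            value = provider(key)
        except ProviderError:
            continue  # re-route to the next (secondary) provider
        cache[key] = (value, now())  # remember the last good answer
        return value
    if key in cache:
        value, stored_at = cache[key]
        if now() - stored_at <= max_age_seconds:
            return value  # stale, but within the timeliness requirement
    raise ProviderError(f"no provider or usable cached value for {key!r}")
```

The max_age_seconds parameter is where the timeliness requirement lives: a stock quote might tolerate a few seconds of staleness while a book cover image could be served from cache for days.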

Monday, February 2, 2009

Cloud Computing Definition

William Hurley's post about the need to have a common definition for Cloud Computing made me think for a couple of days; particularly this quote:
"What matters is whether or not the community can get together, collaborate on a definition, and support that definition."
To me Cloud Computing is the most disruptive technology of the past 10 years (wireless communications was the previous one) and contrary to popular wisdom, the lack of a common definition will not hamper its evolution nor its adoption.

For many years the Internet was just as difficult to define but its clarity and effectiveness took the world by storm. Cloud Computing will be the same. The Cloud will continue to find its way into our everyday life, at home and at work. Call it Gmail, AWS, Twitter, GAE or Virtualization. Individuals and businesses of any size will be convinced by the Cloud's simplicity, efficiency and economics.

While Vendors, Press and Analysts might struggle a bit, the Cloud will continue to impact the way we communicate and process information. We are witnessing a technology tsunami. It is here to stay and it needs no formal definition.