Scalable web architectures

Performance and availability matter too...


Amazon launches CloudFront
November 18 2008

Amazon has finally opened the doors of its new CDN (Content Delivery Network), called CloudFront. Instead of building a completely new product, it has expanded its S3 network to include content replication for lower-latency content delivery. By not reinventing the way data is uploaded to the CDN, Amazon has seriously cut the cost for end users of trying out this technology. Most of the CDNs I’ve investigated do very well with static content that needs to be refreshed periodically.

There is at least one service from Akamai, WAA (Web Application Accelerator), which seems to understand the importance of accelerating extremely dynamic content using intelligent routing and points of presence closer to the end user. WAA doesn’t put the content closer to the end user, but provides an extremely efficient conduit for this traffic, in which Akamai controls both ends of the network by placing a POP in front of both the client and the server. By doing this Akamai can take control of

Scaling Early: Feedjit
November 10 2007

Mark Maunder from Feedjit gave a presentation about scaling early. He focuses on some of the key operational issues related to the web server and server caching, which I found very interesting.

MySQL on HDFS
November 4 2007

A short, thought-provoking post by Mark Callaghan about running MySQL over HDFS. It’s probably not ideal, but it’s an interesting thought regardless.

Scaling Technorati - 100 million blogs indexed every day
October 25 2007

Indexing 100 million blogs with over 10 billion objects, and with a user base that is doubling every six months, Technorati has an edge over most blog search engines. But they are much more than search, as any Technorati user can tell you. I recommend you read John Newton’s interview with David Sifry, which I found fascinating. Here are the highlights from the interview if you don’t have time to read the whole thing:

  • Current status of Technorati
    • 1 terabyte a day added to its content storage
    • 100 million blogs
    • 10 billion objects
    • 0.5 billion photos and videos
    • Data doubling every six months
    • Users doubling every six months
  • The first version was supposed to be for tracking temporal information on a low budget.
    • That version put everything in a relational database, which was fine since the index sizes were smaller than physical memory
    • It worked fine till about 20 million blogs
  • The next generation took advantage of parallelism.
    • Data
Scalability stories for Oct 22, 2007
October 22 2007