A legal botnet – Billion $$ Startup Idea ?

Idea: Content Delivery Network powered by ‘opted-in’ desktops – i.e. a legal botnet CDN.

This post was inspired by this comment on Hacker News.

Explanation: Today's CDNs are powered by data centers scattered across the globe, in many cases with edge networks where blade servers cache the most frequently requested content (e.g. content requested in the last 15 minutes). The advantage of this model over downloading from a centralized server is that your request is served by the server closest to you, which results in faster download times.

The problem is that today's CDNs still have only a handful of these data centers around the world. Some people estimate that Amazon (the great Amazon) has data centers in only about 14 cities. I know that many CDNs have arrangements with major ISPs to house local edge servers, so that users of those ISPs can reach the data much more quickly. That is a step in the right direction in terms of reducing the number of network hops to the source file, but it can’t match the distribution network of an illegal P2P service.

Suppose when you wanted to download a large file, though, you could download it from everybody in your neighborhood - at the same time. This drastically changes the equation. That’s what P2P networks and clients kind of allow you to do. Even some games allow you to search for the closest server (hosted by an individual) to your IP.
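To make the "download from everybody at the same time" idea concrete, here is a minimal sketch in Python. It assumes the combined rate is simply the sum of the peers' upload rates, capped by the downloader's own downstream link, and it ignores protocol overhead and peers dropping out. The function names are made up for illustration, and the 1 Mbit/s upload and 24 Mbit/s download figures are only example inputs.

```python
def aggregate_rate_mbit(peer_upload_mbit, downstream_cap_mbit):
    """Combined download rate (Mbit/s) when pulling from several peers at once.

    Assumption: rates add up linearly until the downloader's own
    downstream link becomes the bottleneck.
    """
    return min(sum(peer_upload_mbit), downstream_cap_mbit)


def download_seconds(file_size_gb, peer_upload_mbit, downstream_cap_mbit):
    """Seconds needed to fetch a file of file_size_gb (decimal GB)."""
    rate = aggregate_rate_mbit(peer_upload_mbit, downstream_cap_mbit)
    if rate <= 0:
        raise ValueError("need at least one peer with a positive upload rate")
    file_size_mbit = file_size_gb * 8 * 1000  # GB -> megabits
    return file_size_mbit / rate


# One neighbor uploading at 1 Mbit/s versus 24 neighbors at 1 Mbit/s each,
# with a 24 Mbit/s downstream link on the receiving end.
print(download_seconds(1, [1.0], 24.0))
print(download_seconds(1, [1.0] * 24, 24.0))
```

Past the point where the peers' combined upload matches your downstream link, adding more peers no longer speeds up the download under these assumptions.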

So, back to the billion $$ idea: create a CDN that distributes content (you would have to decide whether or not the content is encrypted, and weigh the trade-offs) across a large network of peers all over the world. Think of a legal botnet. Not a spyware botnet, where users are tricked into installing ‘Anti-virus 2010’, but a network that people register to join and are paid per GB of disk storage and bandwidth used.

Let’s look at some math.

From the company’s perspective – using Amazon’s price sheet as benchmark (as of Jan 8, 2010):

  • Charge $0.15/GB for storage
  • Charge $0.17/GB for bandwidth

For calculation’s sake, we are only going to use these flat figures. The same calculations can also be applied to a graduated scale of bandwidth used and storage consumed – for the user.

  • Pay a peer $0.01/GB for storage
  • Pay a peer $0.02/GB for bandwidth




The main issue we can see with this model is that with a small disk allocation and a slow internet pipe, the user takes home a pittance (approximately $4.83 per month for a 1 Mbit/s connection with 50 GB of storage space). However, with a faster connection and more storage space, a user can earn two or three times the monthly cost of their internet connection (with prices falling so rapidly, especially in the US). With a 50 Mbit/s connection and 50 GB of storage, a user can take home approximately $220 per month. Assuming 50 Mbit/s will cost $50 throughout the US, following the lead of a town in Minneapolis, they earn more than four times the cost of their broadband connection, assuming the CDN uses their connection for the full amount of time they allocate (and also assuming their computer is left running 24×7).

Note: The Discount Factor in the spreadsheet model above reflects the share of time the user will allow the CDN to use their computer and bandwidth. So a 0.70 discount factor means that the CDN will have access to the user’s computer and bandwidth 70% of the time.
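For anyone who wants to tinker without the spreadsheet, here is a rough Python sketch of the payout arithmetic. It is not the spreadsheet itself. It assumes a 30-day month, decimal gigabytes, a link that is fully saturated whenever the CDN has access, a Discount Factor applied to bandwidth only, and a quoted link speed that is available for upload. The function names are made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def monthly_transfer_gb(link_mbit, discount_factor):
    """GB a peer can serve per month if the link is saturated
    whenever the CDN has access (discount_factor = share of time)."""
    megabits = link_mbit * SECONDS_PER_MONTH * discount_factor
    return megabits / 8 / 1000  # megabits -> megabytes -> decimal GB


def monthly_payout(link_mbit, storage_gb, discount_factor=0.70,
                   pay_storage=0.01, pay_bandwidth=0.02):
    """Dollars paid to one peer per month at the post's peer rates
    ($0.01/GB storage, $0.02/GB bandwidth)."""
    bandwidth_pay = monthly_transfer_gb(link_mbit, discount_factor) * pay_bandwidth
    storage_pay = storage_gb * pay_storage  # assumption: storage is paid in full
    return bandwidth_pay + storage_pay


for mbit in (1, 50):
    print(f"{mbit} Mbit/s, 50 GB: ${monthly_payout(mbit, 50):.2f} per month")
```

Under these assumptions the two cases come out at about $5 and $227 per month, in the same ballpark as the $4.83 and $220 quoted above. The gap comes from assumptions in the spreadsheet that are not reproduced here.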

Just for a quick comparison with the large CDNs (in terms of their cost structure), here is something to think about. As far as I can tell, the largest line-item expense for the ‘non-fibre-owning CDNs’ is the cost of bandwidth. Looking at Limelight Networks (LLNW), we see that their Cost of Revenue was approximately 64%. An unscientific comparison with the model above shows our Cost of Revenue being 6.7% for storage ($0.01 paid per $0.15 charged) and 11.8% for bandwidth ($0.02 paid per $0.17 charged). Simply adding the two gives approximately 18.5% of revenues, which is a conservative ceiling, since a blended figure weighted by the revenue mix would fall between 6.7% and 11.8%. Even if our model inputs were doubled ($0.02 for storage, $0.04 for bandwidth), the summed cost of revenues would be approximately 37%, i.e. roughly 40% lower than the traditional CDN’s 64%.
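The cost-of-revenue comparison can be sketched in a few lines of Python. The per-GB rates are the ones from the post. The function names and the example storage and bandwidth volumes are made up for illustration. The blended figure assumes that revenue and cost both scale linearly with GB stored and GB transferred.

```python
def cost_ratio(pay_rate, charge_rate):
    """Cost of revenue for one product line: what we pay per GB
    divided by what we charge per GB."""
    return pay_rate / charge_rate


def blended_cost_of_revenue(storage_gb, bandwidth_gb,
                            pay_storage=0.01, pay_bandwidth=0.02,
                            charge_storage=0.15, charge_bandwidth=0.17):
    """Total peer payouts divided by total revenue for a given usage mix."""
    revenue = storage_gb * charge_storage + bandwidth_gb * charge_bandwidth
    cost = storage_gb * pay_storage + bandwidth_gb * pay_bandwidth
    return cost / revenue


print(f"storage:   {cost_ratio(0.01, 0.15):.1%}")
print(f"bandwidth: {cost_ratio(0.02, 0.17):.1%}")
# Hypothetical mix: 50 GB stored, 11,340 GB transferred in a month.
print(f"blended:   {blended_cost_of_revenue(50, 11340):.1%}")
```

Because the blended ratio is a revenue-weighted average of the two per-line ratios, it always lands between them, so adding the two percentages together overstates the cost rather than understating it.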

The CDN business is a multi-billion dollar business. With a model like this, scaling becomes more of a software problem than a hardware problem, which is a beautiful place to be. You scale on the backs of ISPs, and users are responsible for maintaining their machines. In short, it’s Google’s AdSense business model implemented as a CDN.

Please feel free to tinker with the numbers, and leave comments for how I can improve the model.

Disclaimer: This post assumes a number of variables, all of which are not fully fleshed out – due to the difficulty in fully conveying the complete picture in a simple blog post.


5 Responses to “A legal botnet – Billion $$ Startup Idea ?”

  • Marc Gayle Says:

    @Kevin…thanks for the positive contribution to the dialogue. Care to be specific with your criticisms about the idea?

    @Jason You are right that all requests go through the ISP (or local DNS server) first, before going to my next-door neighbor. However, the point I was making was not about Yahoo specifically, but about all web content. Not every content host will be close to your ISP or use a CDN (e.g. LLNW or Akamai) that is close to your ISP. I agree that on the surface, based on current implementations, it might seem impossible to get a nice large data experience with what seem like smaller upstreams. However, there are many projects that have started doing just that – http://www.bitlet.org/video is a service that streams video (using BitTorrent) through other people’s upstream. It plays fine for me. It could do with a little work, and yes, it does require a Java applet to be installed, but the video quality and watching experience are good.

    I never said it could be done with the toolset we currently have, but I think it can be done with a little ingenuity.

  • Jason Says:

    “when done properly, is that the latency from you to me (when we live side by side) is much lower than you and yahoo.com”

    Here is the problem – that’s simply not true. Given the best case scenario of us living next to each other on the same ISP, and that the consumer is 30ms from the ISP’s backbone it will take 60ms to get to you. This is as opposed to around 30ms for me to get to Yahoo now (because they are so close to my ISP’s backbone, probably even in the same building as they use Akamai?).

    The limitation is the speed of light, not “technological advancement”.

    “Large data experiences” are even worse – most consumer connections are asymmetric. I have a 24mbit/s line, but I can only upload at 1mbit/s.

  • Kevin Says:

    Wow, this is a steaming pile written by someone entirely ignorant on the topic. You don’t know anything at all about CDNs, do you?

  • Marc Gayle Says:

    Jason that is one of the main concerns that all ISPs & CDNs have. I do agree that not all connections will be valid – i.e. not all peers will be used all the time. But that is a decision the network (i.e. the software) will make as needed.

    In other words, the latency of your connection is only the time it takes for packets to reach a prespecified network point and back – e.g. a DNS server or a popular website. However, the beautiful thing with the P2P CDN, when done properly, is that the latency from you to me (when we live side by side) is much lower than you and yahoo.com. So if the network can optimize to choose the lowest-latency path (which is what most P2P protocols & many other applications do now), that problem will be solved.

    Also…I am not sure that all the work that Amazon is doing is resulting in dramatically reduced latency all the time – http://chrismeller.com/2009/10/amazon-cloudfront-vs-rackspace-cloudfiles-cdn-performance/

    We have reached the stage in technological advancement where latency is no longer a factor of the internet connections, but rather of the distance between the hops. The premise of my thought is that if we could reduce the total number of hops to a very granular amount (i.e. in an ideal world, every time someone wants a piece of content on the CDN, they are downloading it from their 50 closest peers), then that would solve the latency issue, and large data experiences (like HD video, etc.) would be much more pleasant than they are today (although, let the truth be told, today it isn’t TOOO unpleasant in comparison to where we are coming from).

    Hope that makes sense.

  • Jason Says:

    You’ve not mentioned one of the main things CDNs trade on – low latency. Consumer connections have relatively high latency (25-30ms on my consumer ADSL line, less on cable). Amazon’s 14 datacentres will be hugely connected on many major hubs in an attempt to reduce latency.
