I enjoy listening to This American Life, even during the time of year when Ira Glass is doing the annual ask-for-money routine. Like most years, I'll probably make a small contribution because I like the stories and the work they do. However, the last episode I listened to opened with two interesting statistics:
- Over 400,000 people download the podcast every week.
- The bandwidth costs add up to $130,000 per year.
My knee-jerk reaction was that this had to be a mistake. Or rather, someone somewhere along the line was making a huge mistake, resulting in this exorbitant bandwidth bill. Surely they could save money if they fixed their infrastructure, right?
As a professional web developer, I felt like the cost of running a highly successful podcast was something I should know. Making guesses about systems you don't know much about is folly, but it didn't stop me — I decided to do a little research and make some back-of-the-envelope calculations to see if I came within the same ballpark. To make a reasonably educated guess I had to figure out:
- How are they currently hosting their content? Are they using a storage service like Amazon S3?
- How are the feeds being delivered?
- How big is the average episode and how many have they done?
- How many downloads are occurring per month?
After poking around their website I deduced the following:
- They were using Feedburner to aggregate their podcast. Why does this matter? Assuming 400,000 people download the show every week, that's 400,000 podcast subscribers pinging somebody's server. The difference between having their server or Feedburner's server bear the burden seems significant.
- They did not appear to be using any specialized storage service to host their mp3s. My thought was switching over to something like Amazon S3 would be the thing that could save them money here, but we'll explore this notion in a moment.
- The average episode is approximately 30 MB.
- They have, at the time of this writing, 437 original episodes which appear to all be available online.
- At 400,000 downloads per week and 4.348 weeks in a month, that's about 1,739,200 downloads every month.
- Knowing the average number of downloads and average size of a podcast we have the most important number: the monthly bandwidth. That turns out to be about 52.2 terabytes every month.
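For anyone who wants to check the arithmetic in that list, here is a minimal sketch in Python. The inputs are the figures quoted above; the function names are my own, and I'm assuming decimal units (1 TB = 1,000,000 MB), which is what the 52.2 figure implies.

```python
# Figures quoted in the post; the function names are illustrative only.
DOWNLOADS_PER_WEEK = 400_000
WEEKS_PER_MONTH = 4.348
EPISODE_SIZE_MB = 30  # approximate average episode size


def monthly_downloads(per_week: int = DOWNLOADS_PER_WEEK,
                      weeks_per_month: float = WEEKS_PER_MONTH) -> float:
    """Downloads per month, given a weekly download count."""
    return per_week * weeks_per_month


def monthly_bandwidth_tb(downloads: float,
                         episode_mb: float = EPISODE_SIZE_MB) -> float:
    """Monthly transfer in decimal terabytes (1 TB = 1,000,000 MB)."""
    return downloads * episode_mb / 1_000_000


if __name__ == "__main__":
    downloads = monthly_downloads()
    print(f"{downloads:,.0f} downloads per month")
    print(f"{monthly_bandwidth_tb(downloads):.1f} TB per month")
```

This only covers episode downloads. It ignores feed requests, partial downloads and web streaming, so treat it as a rough floor rather than a measurement.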
Holy cow. That is a lot of bandwidth. Still, I pulled up the Amazon Web Services calculator to guess how much it would cost to host their podcast. To be safe, I rounded the numbers up: an even 2 million downloads per month, storage for 450 podcasts (13.5 GB), just in case some of them were longer than others, and 60 terabytes a month in bandwidth.
Another part of the Amazon S3 equation is how much you upload to their servers every month. This was a bit more difficult to guess, but I figured it couldn't be much, seeing as they do about one episode per week and an episode is only about 30 MB. I entered 1 GB, again to be on the safe side.
The complete shot in the dark was how many PUT requests they would need. Because I'm a fairly heavy S3 user (daily work backups, a few online projects and this blog), I went with my own number and added a zero, putting it at 220,000. That may well overshoot, but at only $0.01 per 1,000 PUT requests it's not a significant factor.
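The padded inputs can be sanity-checked the same way. This is a rough sketch using only the numbers mentioned above; the helper names are hypothetical, and the PUT price is the $0.01 per 1,000 requests quoted in this post, not something pulled from a current price list.

```python
# Padded inputs from the post; helper names are made up for illustration.
EPISODES = 450
EPISODE_SIZE_MB = 30
PADDED_DOWNLOADS_PER_MONTH = 2_000_000
PUT_REQUESTS_PER_MONTH = 220_000
PUT_PRICE_PER_1000 = 0.01  # dollars, as quoted in the post


def storage_gb(episodes: int = EPISODES,
               episode_mb: float = EPISODE_SIZE_MB) -> float:
    """Total archive size in decimal gigabytes (1 GB = 1,000 MB)."""
    return episodes * episode_mb / 1_000


def padded_bandwidth_tb(downloads: int = PADDED_DOWNLOADS_PER_MONTH,
                        episode_mb: float = EPISODE_SIZE_MB) -> float:
    """Monthly transfer in decimal terabytes for the rounded-up download count."""
    return downloads * episode_mb / 1_000_000


def put_request_cost(requests: int = PUT_REQUESTS_PER_MONTH,
                     price_per_1000: float = PUT_PRICE_PER_1000) -> float:
    """Monthly cost in dollars of the guessed PUT request volume."""
    return requests / 1_000 * price_per_1000


if __name__ == "__main__":
    print(f"Storage: {storage_gb():.1f} GB")
    print(f"Bandwidth: {padded_bandwidth_tb():.0f} TB per month")
    print(f"PUT requests: ${put_request_cost():.2f} per month")
```

Even with the extra zero, the PUT requests come to a couple of dollars a month, so nearly all of the estimate is outbound bandwidth.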
The resulting cost, as of this writing, was approximately $7,000 per month or $84,000 per year. That doesn't include the cost of hosting their current Drupal site, which likely receives similar traffic and requires a solid hosting solution that would add at least a thousand or two per year to the total.
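To put that estimate next to the $130,000 figure, here is a small sketch of the comparison. The per-gigabyte rate it prints is only what my $7,000 estimate implies when spread over the padded 60 TB; it is not a price taken from Amazon, and the function names are my own.

```python
# Numbers from the post; the blended rate is derived, not an official price.
CLAIMED_ANNUAL_COST = 130_000    # dollars per year, as stated on the show
ESTIMATED_MONTHLY_COST = 7_000   # dollars per month, my calculator result
PADDED_BANDWIDTH_TB = 60         # rounded-up monthly transfer


def annual_cost(monthly_cost: float) -> float:
    """Twelve months of a flat monthly bill."""
    return monthly_cost * 12


def implied_rate_per_gb(monthly_cost: float, bandwidth_tb: float) -> float:
    """Blended dollars per decimal gigabyte implied by the estimate."""
    return monthly_cost / (bandwidth_tb * 1_000)


def annual_gap(claimed: float, estimated: float) -> float:
    """Difference between the stated yearly cost and the estimate."""
    return claimed - estimated


if __name__ == "__main__":
    estimate = annual_cost(ESTIMATED_MONTHLY_COST)
    gap = annual_gap(CLAIMED_ANNUAL_COST, estimate)
    rate = implied_rate_per_gb(ESTIMATED_MONTHLY_COST, PADDED_BANDWIDTH_TB)
    print(f"Estimated annual cost: ${estimate:,.0f}")
    print(f"Gap to stated cost: ${gap:,.0f} ({gap / CLAIMED_ANNUAL_COST:.0%})")
    print(f"Implied blended rate: ${rate:.3f} per GB")
```

The gap works out to $46,000 a year, which is the part of the bill my download-only estimate doesn't account for.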
Still, that seems like a significant savings compared to $130,000 per year, so I poked around their website a bit more to see what I was missing. I noticed that, in addition to making their podcast available in the usual ways and for download, they offered a streaming online version. I didn't do the math to figure out the additional cost, but I know that running an Amazon CloudFront distribution in addition to the S3 usage, when you're talking that kind of bandwidth, could add up in a hurry.
So, in conclusion, the $130,000 a year number is completely believable. I would be curious to know how the bandwidth usage breaks down, namely what percentage of people are listening via the web stream. If I were them, I would consider cutting that option to save costs, especially if it's costing thousands per month.
All in all, it was an interesting thought exercise in determining the cost of running a popular podcast.
Also, if anyone smarter than me would like to chime in and point out any gross oversights or assumptions I've made here, I'd be more than happy to hear about it!