Digg and Revision3 are not my first Internet businesses, but they are entirely different from my previous ventures in how success is measured by everyone around us. This is the first time I've been hostage to outside metrics companies, with which I usually have no relationship, for telling others how well I'm doing. That annoys me, of course, but "welcome to the web," analysts, reporters, and my investors tell me.
Equinix is a public company, so it was well established (and reinforced by SOX) that outside consulting companies, acting as independent auditors, would determine the accuracy of our reports. Even at a private company, I can still hire consulting firms to come in and do the same thing where revenue or valuation is concerned. Still, for the most important metric of all, usage (generally viewed through page views and unique visitors), there are no audit options. I have to rely on companies that have performed no technical audit of what is going on, providing their viewpoints through flashy websites and questionable panel methods. Doesn't anyone else see this as strange?
Well, before I get too deep into that, remember the purpose of publishing these numbers is ultimately twofold: One, to tell the public how well we're doing. Two, to tell the advertisers how many people will see their ads, to help establish a price/market for our ad inventory.
The funny thing about the first one is that page views and unique visitors aren't a perfect view into success, because they don't mean the same thing for every site. A blog with 20,000 visitors may consider itself hyper-successful; it really depends on the site's purpose and the niche quality of the site. For example, if I were to do e-commerce selling rare vacuum tubes, and most of the specialists who order those things used my web site, I'd have penetrated my market perfectly. If Digg were only a tech site, my market would be considerably smaller than it is, etc. Nevertheless, it is true that measuring change over time, or in our case, growth quarter over quarter (month over month tends to be a bit too granular because of seasonal changes), is a pretty good indicator of success.
The second issue is typically measured through page views, though that's not exactly what advertisers want. It is true that the greater the page views, the greater the ad impressions. But what advertisers ultimately want to know is how many ad impressions they will push, and even more important, how well targeted those ads are. You can't really get that data from page views; you have to do more direct research. A great way is via the ad networks themselves, who know better than anyone how many ads are being served. Honestly, I find when I speak with advertisers that they trust themselves the most: they do test campaigns, measure results, and know for certain the effect of their ads on a particular audience.
So now we're back to the issue of how these statistics are measured. I've read countless articles over the past two years on the problem with external panel-based measurement, in publications such as BusinessWeek and Fortune. We've all heard about it: the trust in these metric companies is gone. At this point, I've had conversations with just about every major media company online, and they all agree that panel-based measurement doesn't work, particularly for niche-targeted web sites. The problem is fundamental: Let's say you're a targeted, niche site (this is not about Digg, this is just a made-up example). If 5% of the general population, tens of millions of people, are your audience, and only 5% of a typical panel is made up of your audience, you know you'll show up in less than 5% of its results. Further, since it's such a small portion of the sample base, that 5% isn't really a diverse representation of those tens of millions. You need a panel designed for your site, which is a fairly subjective number with an ever-changing audience. It's simply not the right way to do it. It can measure what your demographic is fairly well, which is important, but it can't really measure usage.
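To put a rough number on the "small portion of the sample base" point, here is a minimal sketch of the sampling noise alone. It assumes a panel drawn perfectly at random, which is generous, and it ignores the representativeness problem entirely. The function name, the panel size of 10,000, and the audience shares are made up for illustration; they are not any metrics company's real figures.

```python
import math

def panel_relative_error(panel_size, audience_share):
    """Return (expected panelists in your audience, relative standard error).

    Assumes each panelist independently belongs to the site's audience with
    probability `audience_share` (a simple binomial model; hypothetical).
    """
    expected = panel_size * audience_share
    # For a binomial count, std dev / mean = sqrt((1 - p) / (n * p)).
    relative_se = math.sqrt((1 - audience_share) / expected)
    return expected, relative_se

# Hypothetical 10,000-person panel, with audiences of 5% and 0.1% of the population.
for share in (0.05, 0.001):
    expected, rel_se = panel_relative_error(10_000, share)
    print(f"share={share:.3%}: ~{expected:.0f} panelists, +/-{rel_se:.0%} relative error")
```

Under these assumptions, the smaller the niche, the fewer panelists stand in for it and the noisier the estimate gets, and that is before asking whether those few panelists look anything like the real audience.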
The issue is with credibility. ComScore and Nielsen/Netratings, largely for historical reasons, are assigned a certain trust level regardless of the open outcry over the failure of panel-based reporting. The alternatives are young or flawed themselves. One metric company I spoke with claimed to be more accurate by sampling aggregate ISP backbone pipes. I could go into the technical and statistical reasons why this sounds good on paper but doesn't work, but during their presentation to us, I didn't need to, because they revealed they didn't target any business traffic, so they missed people surfing the web at work. These guys want credibility through an alternative approach, which is awesome, but they can't break through the age and tradition of using comScore. Also, considering so many people hit Digg from work, I wasn't thrilled that this demographic wasn't important to them.
These panel methods or sampling methods by definition would need panels that fit the profile of your userbase, or would have to somehow adjust for your niche in their modeling. To date, I haven't seen anyone come close to accuracy using these tricks, so we can't trust these numbers when doing our own market research. When I see a panel repeatedly match the real numbers, then I'll reconsider my opinion.
The alternative approach, championed by Quantcast, is to use panel-based methods for the masses, and for those who subscribe (for free, I might add), they'll measure using the accurate pixel-based method (where they put a pixel on each page that they can track directly). It's not a bad approach, but having any panel-based results, then setting them side by side with the direct measurements, drops the credibility of the direct measurements. Still, I like these guys as far as external services go.
Then, of course, there are the infamous toolbar-based metric systems. About the only websites these are good for are the ones that everyone on earth uses, because very few Digg users are the type of people who would want someone watching what they were doing. The more popular Digg becomes, the less likely these are correct. Maybe... Unless going more mainstream means more users willing to use toolbars, but I doubt it.
Anyway, the absolute best and most accurate way to know how well a website is doing is to just plug right in. Just as auditors come into the offices of publicly traded companies to check everyone's accounting, so could the same auditors come in and compare WebSideStory or Omniture statistics, ensure they are correctly configured, and give their stamp of approval. What will it take to move to this more accurate method of reporting? It's already happening... A number of us websites are starting to work together to plan these audits, because we're tired of inaccurate numbers.
For example, Digg did 18.5 million unique visitors in July 2007, as measured by WSS. Remember, WSS uses a pixel and is a third-party service, so we're not talking about "internal logs." This also doesn't take into account RSS feeds (which are important for measuring success, but ignored by most making comparisons) or the Digg buttons syndicated all over the planet. Even an article in Fortune, making similar points to this blog entry, got it wrong, citing the number as 10.5 million. [Editor's note: 10.5M is correct for U.S. visitors only, my bad.] ComScore still says 4 million. We Digg employees look at each other and just shake our heads. Essentially, the numbers being traded around about how many people visit Digg are completely wrong. Stop the insanity.
At least with websites like Digg there is some common metric as defined by a browser-based ad impression. For Revision3, anyone looking at the website is missing the point: Revision3's success is not measured by how well the website is doing, but rather by how many ad impressions are viewed when people watch the episodes. 70% or more of the people watching Revision3's shows do not watch them on the website (something the folks there are working on making more attractive, by the way), but rather receive the shows via RSS, such as with iTunes, and thus skip the web altogether. How do we measure these impressions? Right now, the common method is to measure full downloads that repeat (Revision3 tracks over 1M downloads a month, for example). The RSS readers stop downloading if you don't watch, so there is some trust in a repeat download.
PodTrac is a great start, and Revision3 continues to experiment with them, as they also measure downloads and views. However, as with panel-based measurements of websites, the devil is in the details. Some media players download files in chunks, and thus show up as multiple "hits" on a download server. Revision3 takes this into account by dividing the total number of bits downloaded by the file sizes in question. Most tracking services measure based on how often a particular URL, built into the enclosure, is used to establish a download, but this count will increase artificially with these weird players. I'm not sure if PodTrac takes this into account (it may), but I'll let the wizards at Revision3 do the analysis themselves by comparing the numbers.
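The bits-divided-by-file-size idea can be sketched in a few lines. This is only an illustration of the arithmetic, not Revision3's actual code: the function name, the (path, bytes sent) log format, and the file name and sizes are all assumptions made up for the example.

```python
from collections import defaultdict

def estimate_downloads(log_entries, file_sizes):
    """Estimate full downloads per file as total bytes served / file size.

    log_entries: iterable of (path, bytes_sent) tuples, one per server hit
                 (hypothetical format; real logs would need parsing first).
    file_sizes:  dict mapping path -> full file size in bytes.
    """
    bytes_by_file = defaultdict(int)
    hits_by_file = defaultdict(int)
    for path, bytes_sent in log_entries:
        bytes_by_file[path] += bytes_sent
        hits_by_file[path] += 1

    estimates = {}
    for path, total_bytes in bytes_by_file.items():
        size = file_sizes.get(path)
        if not size:
            continue  # unknown or zero-length file: nothing sensible to divide by
        estimates[path] = {
            "hits": hits_by_file[path],         # what a naive URL counter reports
            "downloads": total_bytes / size,    # bytes-based estimate
        }
    return estimates

# One player fetches the whole 100-byte file; another fetches it in four chunks.
entries = [("ep1.mp4", 100)] + [("ep1.mp4", 25)] * 4
print(estimate_downloads(entries, {"ep1.mp4": 100}))
```

In this made-up case a hit counter would report five downloads where the byte count supports two. Note that the same division also counts abandoned partial downloads as fractions of a download, which may or may not be what you want.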
[Recently, Revision3's "The GigaOM Show" interviewed execs from Quantcast and Hitwise. Interesting to hear it from their own perspectives.]
Eventually, the promise of some of the technologies involved is that regardless of how you watch it, there can be some tracking capabilities built within the media itself. Don't kid yourself, however, with the myriad of formats and players, that day hasn't come yet.
Like Digg, Revision3 and other video companies could do well with consulting companies doing third-party audits. This way, everyone would conform to the same standard, and no one would second-guess the numbers.
Panel-based measurement is a quick fix for an impatient audience. While we all want an automated and universal way to deal with this problem, the truth, if you want it, is going to take more work. The good news is, I can tell you directly: This is work we're willing to do.
[Editor's Note: For a great summary of various web analytics packages and their various methods of tracking, as well as a great analysis and comparison of the packages, check out Jim Sterne's 2007 Web Analytics Shootout.]