- Avg. SEO traffic percentage: 7.68% (previously reported at 0.76%)
- Avg. SEO revenue position: 1.85 (previously reported at position 8.2)
- Landing pages for the (not provided) segment, and for queries that are retained, tend to line up. In other words, we don't see (not provided) driving traffic predominantly to unique or outlier landing pages. No real surprise there. Because of this, making assumptions about the distribution of queries for those landing pages becomes easier. We can, for example, project the query distribution of our known landing page URLs onto the (not provided) landing page URLs to re-create the (admittedly pseudo) data. The caveat is that this won't account for outlier terms, notably long-tail varieties, that are infrequent, hard to predict, and often occur only once. That's the cost.
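A minimal sketch of that projection, with entirely hypothetical queries and visit counts: take the known query distribution for a landing page and spread its (not provided) visits across it proportionally.

```python
# Hypothetical visit counts by known query for a single landing page
known_queries = {
    "bounce rate vs exit rate": 60,
    "what is bounce rate": 30,
    "exit rate definition": 10,
}
not_provided_visits = 50  # (not provided) visits to the same landing page

# Project the known distribution onto the (not provided) pool
total_known = sum(known_queries.values())
estimated = {
    query: round(not_provided_visits * visits / total_known)
    for query, visits in known_queries.items()
}
print(estimated)
# {'bounce rate vs exit rate': 30, 'what is bounce rate': 15, 'exit rate definition': 5}
```

As noted above, single-occurrence long-tail terms won't be recovered this way; the projection only redistributes queries that already appear in the known segment.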
- Paid search can be a fruitful area to explore. For campaigns that are driving both SEO and PPC traffic to the same set of URLs, managers can run reports to pull these URLs and line them up together. Then, using a search query report, the actual query a user entered to trigger a PPC ad can be matched to a URL. These queries can be "back filled" to re-create data for the lost (not provided) segment on the same URL. It's certainly not apples to apples, but it's something.
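A rough sketch of that back-fill idea, with hypothetical URLs and queries: line up rows from a paid-search query report against the SEO landing pages that carry (not provided) traffic, and keep the matching queries as candidates.

```python
# Hypothetical rows from a paid-search query report: (landing URL, query, clicks)
ppc_query_report = [
    ("/blog/bounce-rate", "bounce rate vs exit rate", 40),
    ("/blog/bounce-rate", "what is a good bounce rate", 25),
    ("/pricing", "analytics tool pricing", 12),
]

# SEO landing pages observed under the (not provided) segment
not_provided_urls = {"/blog/bounce-rate"}

# "Back fill": collect the PPC queries that fired ads on the same URLs,
# as candidate queries for the lost organic segment
backfill = {}
for url, query, clicks in ppc_query_report:
    if url in not_provided_urls:
        backfill.setdefault(url, []).append((query, clicks))
print(backfill)
```

This isn't apples to apples (paid and organic query mixes differ), but it does tie real user queries to the same URLs.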
- Probably the most elegant solution for (not provided) is simply to report on the segment by landing page URL. Look at quality metrics such as time on site, average pageviews, bounce rate, and conversion or revenue numbers, and compare them to known segments. Then, when called upon for a deeper analysis, run a few of the techniques here to derive pseudo (but approximate) query data for the lost terms.
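That comparison can be sketched as a quick check, with hypothetical metric values: if the (not provided) segment behaves like known organic traffic on a landing page, its quality metrics should sit close to the known segment's.

```python
# Hypothetical quality metrics for one landing page, by segment
known_organic = {"bounce_rate": 0.43, "avg_pageviews": 2.1, "conv_rate": 0.019}
not_provided = {"bounce_rate": 0.41, "avg_pageviews": 2.3, "conv_rate": 0.021}

# Flag metrics where (not provided) diverges from the known segment
# (the 15% tolerance is an arbitrary illustrative threshold)
flags = {}
for metric, known_value in known_organic.items():
    delta = not_provided[metric] - known_value
    flags[metric] = "in line" if abs(delta) / known_value < 0.15 else "diverges"
print(flags)
```

When the metrics line up, treating (not provided) as "more of the same" is a reasonable default; when they diverge, that's the cue to dig in with the projection techniques above.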
"If you want to take things a step further you can apply the distribution of the clustered keywords against the pool of (not provided) traffic. First you reduce the denominator by subtracting the (not provided) traffic from the total. In this instance that's 208 – 88 which is 120.

"Even without any clustering you can take the first keyword (bounce rate vs. exit rate) and determine that it comprises 20% of the remaining traffic (24/120). You can then apply that 20% to the (not provided) traffic (88) and conclude that approximately 18 visits to (not provided) are comprised of that specific keyword."

What About Google Webmaster Console?

Finally, a word about Google Webmaster Tools. Others have discussed (and Google themselves have stated) that Search Query reports run here can be a useful replacement for (not provided). And it should be noted that this is actual query data rather than an estimate: Google passes query information for its logged-in users to AdWords and Webmaster Tools. That's the good news.
*Just a sample: GWT data is not a replacement for analytics.*

The unfortunate bad news is that Search Query reports from GWT are famously unreliable. First, they are limited to a rolling 30-day window, so unless you're manually exporting the reports each month (no API is offered), you're only seeing a snapshot of the data. Second, the data itself is limited to the top 1,000 terms. Finally, and probably most importantly, it's not a replacement for analytics. As Matt Cutts has correctly stated:
"Please don't make the argument that the data in our webmaster console is equivalent to the data that websites can currently find in their server logs, because that's not the case."

It's nice to have some data available in GWT, but it's not a solution.

What about you? What are you doing about this problem?

Special thanks go out to Jamey Barlow, Jody O'Donnell, Cara Pettersen, and the rest of the RKG SEO team for help writing this post!

UPDATE: Ben Goodsell of RKG has a recent piece on SearchEngineWatch covering this topic: 5 Stages of Coping With Lost Search Query Data. Entertaining and spot on!