In a previous article, I showed how to set up an account structure for easy slicing and analysis across variations of keyword themes, product types, regions, etc. This time around, I’m going to show you how to aggregate that data, compare the various slices, and make sure that the results are meaningful enough to take action on.
Other than your base search engine performance data export and some version of the comparison tool I’ve added here, you need only one more thing to make something with impact to share: a statistical significance assessor. I’ve included a link, here, to my favorite. It’s from Chad Summerhill’s blog, and I’ve found it useful for years.
In the example I’ll be using to illustrate how I go about this, the ad group naming convention was set up to show locations, then keyword spin modifiers based on variations of how to refer to the given product. The actuals have had their names changed to protect the innocent. I’ll focus on the spin modifiers, as I’m trying to assess the general efficacy of certain wordings in a particular region (which was part of the naming convention of the campaigns, and has been pre-filtered for this study). Also, for this example, let us say that conversion rate is the only metric being optimized, independent of any other concerns.
First, I ran a text-to-columns on the ad group names, then pivoted the results. The fields that I’ve chosen for this example analysis are “Modifier Theme”, “Impressions”, “Clicks”, and “Initial Conversions”, as seen below. Please note that when I refer to conversion rate in the rest of this article, I mean initial conversion rate.
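For readers who would rather script this step than do it in a spreadsheet, here is a minimal sketch of the same text-to-columns and pivot. The ad group names, the ` | ` delimiter, the function name `pivot_by_theme`, and every number below are made up for illustration; they are not the data from this study.

```python
from collections import defaultdict

# Hypothetical export rows: (ad group name, impressions, clicks, initial conversions).
# The names, the " | " delimiter, and the counts are invented for this sketch.
rows = [
    ("Northeast | A", 1200, 60, 6),
    ("Northeast | B", 800, 32, 2),
    ("Midwest | A", 900, 45, 5),
    ("Midwest | B", 400, 12, 1),
]

def pivot_by_theme(rows, delimiter=" | "):
    """Split each ad group name and sum the metrics per modifier theme."""
    totals = defaultdict(lambda: [0, 0, 0])
    for name, impressions, clicks, conversions in rows:
        # "Text-to-columns": the modifier theme is assumed to be the last piece.
        theme = name.split(delimiter)[-1].strip()
        totals[theme][0] += impressions
        totals[theme][1] += clicks
        totals[theme][2] += conversions
    return {theme: tuple(values) for theme, values in totals.items()}

summary = pivot_by_theme(rows)
for theme in sorted(summary):
    impressions, clicks, conversions = summary[theme]
    print(theme, impressions, clicks, conversions)
```

If your naming convention uses a different separator or puts the modifier somewhere other than the end, adjust the `delimiter` argument and the index accordingly.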
There are already many things that can be seen, due to the nature of the spread of impressions, but that is for another analysis. The good news is that because the data is uneven, it is a good example of making sense of a situation where we can’t just pick the winner based on the highest conversion number.
Next, I will plug the top 6 data-heavy items in this slice into the tool I’m sharing with you, to help compare the data in the first step, before letting the math do its thing. Theme “F” has no data and theme “G” has too few impressions and clicks to bother with, so I’ll leave those out and factor that performance into the next steps of testing. One copy and paste-special-values into cells B4 through E9 later, we should see this:
It looks like theme “G” is the clear winner, if we just compare the conversion rates to the other themes directly, but should we then consider the other themes the “losers” in this scenario? Let’s find out by taking columns T through V in the above sheet and plugging them into the test validity spreadsheet that Chad provides, set at a confidence level that I trust (for me, given my physics background, that would be 95%). I will start with theme “G” compared to the rest of the data combined.
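If you want to sanity-check this kind of one-versus-the-rest comparison outside of a spreadsheet, a pooled two-proportion z-test is one common way to do it. The sketch below is my assumption about a reasonable test, not necessarily the math inside Chad’s spreadsheet, and the function names and the counts in the usage line are hypothetical.

```python
import math

def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two-sided p-value) for rate A versus rate B, using a pooled rate."""
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    if std_err == 0:
        # Both rates are all-zero or all-one: there is no difference to test.
        return 0.0, 1.0
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def verdict(z, p_value, alpha=0.05):
    """Label A as winner or loser only if significant at 1 - alpha confidence."""
    if p_value >= alpha:
        return "inconclusive"
    return "winner" if z > 0 else "loser"

# Hypothetical counts: one theme's conversions and clicks versus everything else combined.
z, p = two_proportion_z_test(50, 1000, 20, 1000)
print(round(z, 2), verdict(z, p))
```

The same function works for CTR (clicks over impressions) or conversions per impression; only the successes and trials you pass in change. With small counts, the normal approximation behind this test gets shaky, which is one more reason to leave thin slices out.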
Theme “G” is the winner in CTR and in conversions per impression (the spreadsheet automatically indicates “winners” and “losers” by highlighting the appropriate cells green or red, respectively), but not in actual conversion rate. That cell is neither green nor red, because the difference from the rest of the themes’ performance is not great enough, given the total data at hand, to be statistically significant at a confidence level of 95%. So, while it would register as better at lower confidence levels, it might not be the sole theme to focus on. Let’s see if perhaps we can confidently (teehee) eliminate other variations as losers (understanding that I will actually rework ad copy and landing pages for those keywords, and not just pause them, unless there were an efficiency priority looming). Let’s check a little further down the scale of conversion rates, popping in numbers for theme “C”.
Now, as the person optimizing for efficiency, if I were stressed for time, I would pause out the keywords which were a part of this theme of spins, knowing that these are worse, in general, in terms of the click through rates and initial conversion rates, freeing up budget and improving the general quality of the account. Or, at least, that’s where this might stop if I were not working for a client or my company. Someone is going to ask me about why I’d take this action, and what the data says.
It is at this point that I will provide some depth to what these tools are saying, without going into the actual math or the many, many statistical schools of thought on significance. In short, what these tools tell us is that we can be confident that, based on the historical data we’ve entered, theme “C” is a lesser performer in CTR and ICR at least 95% of the time in the past, as an indicator of its nature. We can also say that we are confident that at least 95% of the time in the past, theme “G” has been a winner in CTR, but that it’s too close to call whether theme “G” has been the best converter per click, as an indicator of its nature.
Always be sure to avoid making promises about the future, and know how to feel confident about the way you interpret the past.