Tag Archive | "Analysis"

How to Do a Competitor Analysis for SEO

Posted by John.Reinesch

Competitive analysis is a key part of the beginning stages of an SEO campaign. Far too often, I see organizations skip this important step and jump right into keyword mapping, optimizing content, or link building. But understanding who our competitors are and seeing where they stand can lead to a far more comprehensive understanding of what our goals should be and reveal gaps or blind spots.

By the end of this analysis, you will understand who is winning organic visibility in the industry, what keywords are valuable, and which backlink strategies are working best, all of which can then be utilized to gain and grow your own site’s organic traffic.

Why competitive analysis is important

SEO competitive analysis is critical because it gives data about which tactics are working in the industry we are in and what we will need to do to start improving our keyword rankings. The insights gained from this analysis help us understand which tasks we should prioritize and it shapes the way we build out our campaigns. By seeing where our competitors are strongest and weakest, we can determine how difficult it will be to outperform them and the amount of resources that it will take to do so.

Identify your competitors

The first step in this process is determining the top four competitors we want to use for this analysis. I like to use a mixture of direct business competitors (typically provided by my clients) and online search competitors, which can differ from whom a business identifies as its main competitors. Usually, this discrepancy is due to local business competitors versus those who are paying for online search ads. While your client may be concerned about the similar business down the street, their actual online competitor may be a business from a neighboring town or another state.

To find search competitors, I simply enter my own domain name into SEMrush, scroll down to the “Organic Competitors” section, and click “View Full Report.”

The main metrics I use to help me choose competitors are common keywords and total traffic. Once I’ve chosen my competitors for analysis, I open up the Google Sheets Competitor Analysis Template to the “Audit Data” tab and fill in the names and URLs of my competitors in rows 2 and 3.

Use the Google Sheets Competitor Analysis Template

A clear, defined process is critical not only for getting repeated results, but to scale efforts as you start doing this for multiple clients. We created our Competitor Analysis Template so that we can follow a strategic process and focus more on analyzing the results rather than figuring out what to look for anew each time.

In the Google Sheets Template, I’ve provided you with the data points that we’ll be collecting, the tools you’ll need to do so, and then bucketed the metrics based on similar themes. The data we’re trying to collect relates to SEO metrics like domain authority, how much traffic the competition is getting, which keywords are driving that traffic, and the depth of competitors’ backlink profiles. I have built in a few heatmaps for key metrics to help you visualize who’s the strongest at a glance.

This template is meant to serve as a base that you can alter depending on your client’s specific needs and which metrics you feel are the most actionable or relevant.

Backlink gap analysis

A backlink gap analysis aims to tell us which websites are linking to our competitors, but not to us. This is vital data because it allows us to close the gap between our competitors’ backlink profiles and start boosting our own ranking authority by getting links from websites that already link to competitors. Websites that link to multiple competitors (especially when it is more than three competitors) have a much higher success rate for us when we start reaching out to them and creating content for guest posts.
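The core of the gap report is a set comparison. Here's a rough sketch in Python (the domain names and link sets are invented stand-ins for the exported CSV data):

```python
from collections import Counter

# Hypothetical sketch of a backlink gap analysis; domains are invented.
competitor_links = {
    "competitor-a.com": {"blog.example.com", "news.example.org", "guide.example.net"},
    "competitor-b.com": {"blog.example.com", "news.example.org"},
    "competitor-c.com": {"blog.example.com", "forum.example.io"},
}
our_links = {"forum.example.io"}

# Count, for each linking domain, how many competitors it links to.
link_counts = Counter(
    domain for links in competitor_links.values() for domain in links
)

# The "gap": domains linking to at least one competitor but not to us,
# sorted so domains linking to the most competitors come first.
gap = sorted(
    (d for d in link_counts if d not in our_links),
    key=lambda d: -link_counts[d],
)
print(gap)
```

Sorting by the shared-competitor count surfaces the highest-probability outreach targets (the sites linking to three or more competitors) at the top of the list.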

In order to generate this report, you need to head over to the Moz Open Site Explorer tool and input the first competitor’s domain name. Next, click “Linking Domains” on the left side navigation and then click “Request CSV” to get the needed data.

Next, head to the SEO Competitor Analysis Template, select the “Backlink Import – Competitor 1” tab, and paste in the content of the CSV file. It should look like this:

Repeat this process for competitors 2–4 and then for your own website in the corresponding tabs marked in red.

Once you have all your data in the correct import tabs, the “Backlink Gap Analysis” report tab will populate. The result is a highly actionable report that shows where your competitors are getting their backlinks from, which ones they share in common, and which ones you don’t currently have.

It’s also a good practice to hide all of the “Import” tabs marked in red after you paste the data into them, so the final report has a cleaner look. To do this, just right-click on the tabs and select “Hide Sheet,” so the report only shows the tabs marked in blue and green.

For our clients, we typically gain a few backlinks at the beginning of an SEO campaign just from this data alone. It also serves as a long-term guide for link building in the months to come as getting links from high-authority sites takes time and resources. The main benefit is that we have a starting point full of low-hanging fruit from which to base our initial outreach.

Keyword gap analysis

Keyword gap analysis is the process of determining which keywords your competitors rank well for that your own website does not. From there, we reverse-engineer why the competition is ranking well and then look at how we can also rank for those keywords. Often, it could be reworking metadata, adjusting site architecture, revamping an existing piece of content, creating a brand-new piece of content specific to a theme of keywords, or building links to your content containing these desirable keywords.
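The underlying logic mirrors the backlink gap: find keywords where a competitor has a position and we don't, then prioritize by search volume. A minimal sketch (the keywords, positions, and volumes below are invented):

```python
# Hypothetical keyword gap sketch; each mapping is
# keyword -> (rank position, monthly search volume).
competitor_ranks = {
    "competitor-a.com": {"running shoes": (3, 9000), "trail shoes": (8, 2400)},
    "competitor-b.com": {"running shoes": (5, 9000), "shoe repair": (2, 700)},
}
our_ranks = {"shoe repair": (4, 700)}

# Keywords at least one competitor ranks for that we do not,
# ordered by search volume (highest first).
gap = {}
for ranks in competitor_ranks.values():
    for kw, (pos, volume) in ranks.items():
        if kw not in our_ranks:
            gap[kw] = max(gap.get(kw, 0), volume)

priorities = sorted(gap, key=lambda kw: -gap[kw])
print(priorities)
```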

To create this report, follow a process similar to the one for the backlink gap analysis; only the data source changes. Go to SEMrush again and input your first competitor’s domain name. Then, click on the “Organic Research” positions report in the left-side navigation menu and click on “Export” on the right.

Once you download the CSV file, paste the content into the “Keyword Import – Competitor 1” tab and then repeat the process for competitors 2–4 and your own website.

The final report will now populate on the “Keyword Gap Analysis” tab marked in green. It should look like the one below:

This data gives us a starting point to build out complex keyword mapping strategy documents that set the tone for our client campaigns. Rather than just starting keyword research by guessing what we think is relevant, we have hundreds of keywords to start with that we know are relevant to the industry. Our keyword research process then aims to dive deeper into these topics to determine the type of content needed to rank well.

This report also helps drive our editorial calendar, since we often find keywords and topics where we need to create new content to compete with our competitors. We take this a step further during our content planning process, analyzing the content the competitors have created that is already ranking well and using that as a base to figure out how we can do it better. We try to take some of the best ideas from all of the competitors ranking well to then make a more complete resource on the topic.

Using key insights from the audit to drive your SEO strategy

It is critically important to not just create this report, but also to start taking action based on the data that you have collected. On the first tab of the spreadsheet template, we write in insights from our analysis and then use those insights to drive our campaign strategy.

Some examples of typical insights from this document would be the average number of referring domains that our competitors have and how that relates to our own backlink profile. If we are ahead of our competitors regarding backlinks, content creation might be the focal point of the campaign. If we are behind our competitors in regards to backlinks, we know that we need to start a link building campaign as soon as possible.
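That first insight reduces to a simple comparison. A minimal sketch, with invented referring-domain counts standing in for the audit data:

```python
# Hypothetical referring-domain counts for the four competitors.
referring_domains = {"competitor-a.com": 420, "competitor-b.com": 310,
                     "competitor-c.com": 275, "competitor-d.com": 190}
our_referring_domains = 240

# If we trail the competitor average, link building becomes the priority;
# otherwise, content creation takes focus.
average = sum(referring_domains.values()) / len(referring_domains)
focus = "content creation" if our_referring_domains >= average else "link building"
print(round(average), focus)
```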

Another insight we gain is which competitors are most aggressive in PPC and which keywords they are bidding on. Often, the keywords that they are bidding on have high commercial intent and would be great keywords to target organically and provide a lift to our conversions.

Start implementing competitive analyses into your workflow

Competitive analyses for SEO are not something that should be overlooked when planning a digital marketing strategy. This process can help you strategically build unique and complex SEO campaigns based on readily available data and the demand of your market. This analysis will instantly put you ahead of competitors who are following cookie-cutter SEO programs and not diving deep into their industry. Start implementing this process as soon as you can and adjust it based on what is important to your own business or client’s business.

Don’t forget to make a copy of the spreadsheet template here:

Get the Competitive Analysis Template

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Moz Blog

Posted in IM News | Comments Off

The SEO Competitive Analysis Checklist

Posted by zeehj

The SEO case for competitive analyses

“We need more links!” “I read that user experience (UX) matters more than everything else in SEO, so we should focus solely on UX split tests.” “We just need more keywords on these pages.”

If you dropped a quarter on the sidewalk, but had no light to look for it, would you walk to the next block with a street light to retrieve it? The obvious answer is no, yet many marketers get tunnel vision when it comes to where their efforts should be focused.

1942 June 3, Florence Morning News, Mutt and Jeff Comic Strip, Page 7, Florence, South Carolina. (NewspaperArchive)

Which is why I’m sharing a checklist with you today that will allow you to compare your website to your search competitors, and identify your site’s strengths, weaknesses, and potential opportunities based on ranking factors we know are important.

If you’re unconvinced that good SEO is really just digital marketing, I’ll let AJ Kohn persuade you otherwise. As any good SEO (or even keyword research newbie) knows, it’s crucial to understand the effort involved in ranking for a specific term before you begin optimizing for it.

It’s easy to get frustrated when stakeholders ask how to rank for a specific term, and solely focus on content to create, or on-page optimizations they can make. Why? Because we’ve known for a while that there are myriad factors that play into search engine rank. Depending on the competitive search landscape, there may not be any amount of “optimizing” that you can do in order to rank for a specific term.

The story that I’ve been able to tell my clients is one of hidden opportunity, but the only way to expose these undiscovered gems is to broaden your SEO perspective beyond search engine results page (SERP) position and best practices. And the place to begin is with a competitive analysis.

Competitive analyses help you evaluate your competition’s strategies to determine their strengths and weaknesses relative to your brand. When it comes to digital marketing and SEO, however, there are so many ranking factors and best practices to consider that it can be hard to know where to begin. Which is why my colleague, Ben Estes, created a competitive analysis checklist (not dissimilar to his wildly popular technical audit checklist) that I’ve souped up for the Moz community.

This checklist is broken out into sections that reflect key elements from our Balanced Digital Scorecard. As previously mentioned, this checklist is to help you identify opportunities (and possibly areas not worth your time and budget). But this competitive analysis is not prescriptive in and of itself. It should be used as its name suggests: to analyze what your competition’s “edge” is.


Choosing competitors

Before you begin, you’ll need to identify six brands to compare your website against. These should be your search competitors (who else is ranking for terms that you’re ranking for, or would like to rank for?) in addition to a business competitor (or two). Don’t know who your search competition is? You can use SEMRush and Searchmetrics to identify them, and if you want to be extra thorough you can use this Moz post as a guide.

Sample sets of pages

For each site, you’ll need to select five URLs to serve as your sample set. These are the pages you will review and evaluate against the competitive analysis items. When selecting a sample set, I always include:

  • The brand’s homepage,
  • Two “product” pages (or an equivalent),
  • One to two “browse” pages, and
  • A page that serves as a hub for news/informative content.

Make sure the sites have equivalent pages to one another, for a fair comparison.


Scoring

The scoring options for each checklist item range from zero to four, and are determined relative to each competitor’s performance. This means that a score of two represents average performance in that category.

For example, if each sample set has one unique H1 tag per page, then each competitor would get a score of two for H1s appear technically optimized. However, if a site breaks one (or more) of the below requirements, then it should receive a score of zero or one:

  1. One or more pages within sample set contains more than one H1 tag on it, and/or
  2. H1 tags are duplicated across a brand’s sample set of pages.
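As a rough illustration of that scoring rule, here's a minimal Python sketch (the `h1_score` helper and the sample HTML strings are invented for this example; it uses a simple regex rather than a real HTML parser, and collapses the zero-or-one judgment call to a flat zero):

```python
import re

def h1_score(sample_pages):
    """Score a brand's sample set for 'H1s appear technically optimized':
    2 (average) if every page has exactly one H1 and no H1 text repeats
    across the set; 0 otherwise. The 3-4 range is left to analyst judgment."""
    h1s_per_page = [re.findall(r"<h1[^>]*>(.*?)</h1>", html, re.I | re.S)
                    for html in sample_pages]
    one_per_page = all(len(h1s) == 1 for h1s in h1s_per_page)
    texts = [h1s[0].strip().lower() for h1s in h1s_per_page if h1s]
    no_duplicates = len(texts) == len(set(texts))
    return 2 if one_per_page and no_duplicates else 0

pages = ["<html><h1>Home</h1></html>",
         "<html><h1>Products</h1></html>",
         "<html><h1>Home</h1><h1>Again</h1></html>"]  # two H1s: fails
print(h1_score(pages))
```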


Platform (technical optimization)

Title tags appear technically optimized. This measurement should be as quantitative as possible, and refer only to technical SEO rather than its written quality. Evaluate the sampled pages based on:

  • Only one title tag per page,
  • The title tag being correctly placed within the head tags of the page, and
  • Few to no extraneous tags within the title (e.g. ideally no inline CSS, and few to no span tags).

H1s appear technically optimized. Like with the title tags, this is another quantitative measure: make sure the H1 tags on your sample pages are sound by technical SEO standards (and not based on writing quality). You should look for:

  • Only one H1 tag per page, and
  • Few to no extraneous tags within the tag (e.g. ideally no inline CSS, and few to no span tags).

Internal linking allows indexation of content. Observe the internal outlinks on your sample pages, apart from the sites’ navigation and footer links. This line item serves to check that the domains are consolidating their crawl budgets by linking to discoverable, indexable content on their websites. Here is an easy-to-use Chrome plugin from fellow Distiller Dom Woodman to see whether the pages are indexable.

To get a score of “2” or more, your sample pages should link to pages that:

  • Produce 200 status codes (for all, or nearly all), and
  • Have no more than ~300 outlinks per page (including the navigation and footer links).
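Those two criteria can be checked programmatically from crawl data. A hedged sketch (the `internal_linking_score` helper and sample numbers are invented; a 95% threshold is my reading of "all, or nearly all"):

```python
def internal_linking_score(pages):
    """pages: list of dicts with 'outlink_statuses' (HTTP codes returned by
    the page's internal outlinks) and 'total_outlinks' (including nav and
    footer links). Returns 2 (average) if both criteria hold, else 1."""
    ok = all(
        sum(1 for s in p["outlink_statuses"] if s == 200)
        >= 0.95 * len(p["outlink_statuses"])
        and p["total_outlinks"] <= 300
        for p in pages
    )
    return 2 if ok else 1

# Invented crawl results for two sample pages.
sample = [{"outlink_statuses": [200] * 40, "total_outlinks": 180},
          {"outlink_statuses": [200] * 38 + [404, 301], "total_outlinks": 220}]
print(internal_linking_score(sample))
```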

Schema markup present. This is an easy check. Using Google’s Structured Data Testing Tool, look to see whether these pages have any schema markup implemented, and if so, whether it is correct. In order to receive a score of “2” here, your sampled pages need:

  • To have schema markup present, and
  • Be error-free.

Quality of schema is definitely important, and can make the difference of a brand receiving a score of “3” or “4.” Elements to keep in mind are: Organization or Website markup on every sample page, customized markup like BlogPosting or Article on editorial content, and Product markup on product pages.

There is a “home” for newly published content. A hub for new content can be the site’s blog, or a news section. For instance, Distilled’s “home for newly published content” is the Resources section. While this line item may seem like a binary (score of “0” if you don’t have a dedicated section for new content, or score of “2” if you do), there are nuances that can bring each brand’s score up or down. For example:

  • Is the home for new content unclear, or difficult to find? Approach this exercise as though you are a new visitor to the site.
  • Does there appear to be more than one “home” of new content?
  • If there is a content hub, is it apparent that this is for newly published pieces?

We’re not obviously messing up technical SEO. This score is based partly on each brand’s performance on the preceding line items (mainly Title tags appear technically optimized through Schema markup present).

It would be unreasonable to run a full technical audit of each competitor, but take into account your own site’s technical SEO performance if you know there are outstanding technical issues to be addressed. In addition to the previous checklist items, I also like to use these Chrome extensions from Ayima: Page Insights and Redirect Path. These can provide quick checks for common technical SEO errors.


Content (quality and keyword targeting)

Title tags appear optimized (editorially). Here is where we can add more context to the overall quality of the sample pages’ titles. Even if they are technically optimized, the titles may not be optimized for distinctiveness or written quality. Note that we are not evaluating keyword targeting, but rather making a holistic (and broad) evaluation of how each competitor’s site approaches SEO factors. Evaluate each page’s title for uniqueness, distinctiveness from its H1, and overall written quality.

H1s appear optimized (editorially). The same rules that apply to titles for editorial quality also apply to H1 tags. Review each sampled page’s H1 for:

  • A unique H1 tag per page (language in H1 tags does not repeat),
  • H1 tags that are discrete from their page’s title, and
  • H1s represent the content on the page.

Internal linking supports organic content. Here you must look for internal outlinks outside of each site’s header and footer links. This evaluation is not based on the number of unique internal links on each sampled page, but rather on the quality of the pages to which our brands are linking.

While “organic content” is a broad term (and invariably differs by business vertical), here are some guidelines:

  • Look for links to informative pages like tutorials, guides, research, or even think pieces.
    • The blog posts on Moz (including this very one) are good examples of organic content.
  • Internal links should naturally continue the user’s journey, so look for topical progression in each site’s internal links.
  • Links to service pages, products, RSVP, or email subscription forms are not examples of organic content.
  • Make sure the internal links vary. If sampled pages are repeatedly linking to the same resources, this will only benefit those few pages.
    • This doesn’t mean that you should penalize a brand for linking to the same resource two, three, or even four times over. Use your best judgment when observing the sampled pages’ linking strategies.

Appropriate informational content. You can use the found “organic content” from your sample sets (and the samples themselves) to review whether the site is producing appropriate informational content.

What does that mean, exactly?

  • The content produced obviously fits within the site’s business vertical, area of expertise, or cause.
    • Example: Moz’s SEO and Inbound Marketing Blog is an appropriate fit for an SEO company.
  • The content on the site isn’t overly self-promotional, resulting in an average user not trusting this domain to produce unbiased information.
    • Example: If Distilled produced a list of “Best Digital Marketing Agencies,” it’s highly unlikely that users would find it trustworthy given our inherent bias!

Quality of content. Highly subjective, yes, but remember: you’re comparing brands against each other. Here’s what you need to evaluate here:

  • Are “informative” pages discussing complex topics under 400 words?
  • Do you want to read the content?
  • Largely, do the pages seem well-written and full of valuable information?
    • Conversely, are the sites littered with “listicles,” or full of generic info you can find in millions of other places online?

Quality of images/video. Also highly subjective (but again, compare your site to your competitors, and be brutally honest). Judge each site’s media items based on:

  • Resolution (do the images or videos appear to be high quality? Grainy?),
  • Whether they are unique (do the images or videos appear to be from stock resources?),
  • Whether the photos or videos are repeated on multiple sample pages.

Audience (engagement and sharing of content)

Number of linking root domains. This factor is exclusively based on the total number of dofollow linking root domains (LRDs) to each domain (not total backlinks).

You can pull this number from Moz’s Open Site Explorer (OSE) or from Ahrefs. Since this measurement is only for the total number of LRDs to each competitor, you don’t need to graph them. However, you will have an opportunity to display the sheer quantity of links by their domain authority in the next checklist item.

Quality of linking root domains. Here is where we get to the quality of each site’s LRDs. Using the same LRD data you exported from either Moz’s OSE or Ahrefs, you can bucket each brand’s LRDs by domain authority and count the total LRDs by DA. Log these into this third sheet, and you’ll have a graph that illustrates their overall LRD quality (and will help you grade each domain).
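The bucketing step is straightforward to sketch in Python (the DA values below are invented; the `bucket_lrds` helper is a hypothetical stand-in for the spreadsheet's graph logic):

```python
from collections import Counter

def bucket_lrds(da_values, width=10):
    """Group linking root domains by Domain Authority range and count
    how many fall in each bucket (e.g. 20-29, 30-39, ...)."""
    buckets = Counter((da // width) * width for da in da_values)
    return {f"{b}-{b + width - 1}": n for b, n in sorted(buckets.items())}

# Invented DA values for one competitor's linking root domains.
lrd_das = [12, 18, 23, 27, 29, 34, 41, 44, 58, 71]
print(bucket_lrds(lrd_das))
```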

Other people talk about our content. I like to use BuzzSumo for this checklist item. BuzzSumo allows you to see what sites have written about a particular topic or company. You can even refine your search to include or exclude certain terms as necessary.

You’ll need to set a timeframe to collect this information. Set this to the past year to account for seasonality.

Actively promoting content. Using BuzzSumo again, you can alter your search to find how many of each domain’s URLs have been shared on social networks. While this isn’t an explicit ranking factor, strong social media marketing is correlated with good SEO. Keep the timeframe to one year, same as above.

Creating content explicitly for organic acquisition. This line item may seem similar to Appropriate informational content, but its purpose is to examine whether the competitors create pages to target keywords users are searching for.

Plug the same URLs from your found “organic content” into SEMrush, and note whether they are ranking for non-branded keywords. You can grade the competitors on whether (and how many of) the sampled pages are ranking for any non-branded terms, and weight them based on their relative rank positions.


Conversion

You should treat this section as a UX exercise. Visit each competitor’s sampled URLs as though they are your landing pages from search. Is it clear what the calls to action are? What is the next logical step in your user journey? Does it feel like you’re getting the right information, in the right order, as you click through?

Clear CTAs on site. Of your sample pages, examine what the calls to action (CTAs) are. This is largely UX-based, so use your best judgment when evaluating whether they seem easy to understand. For inspiration, take a look at these examples of CTAs.

Conversions appropriate to several funnel steps. This checklist item asks you to determine whether the funnel steps towards conversion feel like the correct “next step” from the user’s standpoint.

Even if you are not a UX specialist, you can assess each site as though you are a first-time user. Document areas on the pages where you feel frustrated or confused (and where you don’t). User behavior is a ranking signal, so while this is a qualitative measurement, it can help you understand the UX of each site.

CTAs match user intent inferred from content. Here is where you’ll evaluate whether the CTAs match the user intent from the content as well as the CTA language. For instance, if a CTA prompts a user to click “for more information,” and takes them to a subscription page, the visitor will most likely be confused or irritated (and, in reality, will probably leave the site).

This analysis should help you holistically identify areas of opportunity available in your search landscape, without having to guess which “best practice” you should test next. Once you’ve started this competitive analysis, trends among the competition will emerge, and expose niches where your site can improve and potentially outpace your competition.

Kick off your own SEO competitive analysis and comment below on how it goes! If this process is your jam, or you’d like to argue with it, come see me speak about these competitive analyses and the campaigns they’ve inspired at SearchLove London. Bonus? If you use that link, you’ll get £50 off your tickets.


Moz Blog


Competitive analysis: Making your auction insights work for you

Columnist Amy Bishop shares tips for identifying actionable takeaways from your AdWords auction insights data.

The post Competitive analysis: Making your auction insights work for you appeared first on Search Engine Land.

Please visit Search Engine Land for the full article.

Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing


Link building: Preliminary research and analysis

Columnist Andrew Dennis outlines the research necessary to provide a solid foundation for your link-building strategy.

The post Link building: Preliminary research and analysis appeared first on Search Engine Land.


One Formula to Rule Them All: SEO Data Analysis Made Easy in Excel

Posted by Jeremy_Gottlieb

Working in SEO, I always find myself poring over data and looking for ways to expedite the analysis process. Analyzing data can often be tedious, mind-numbing, and boring work, so anything that can be done to speed up finding that needle in the haystack is almost always a good idea. A few months ago, I began using a formula in Excel to categorize data and I’m constantly finding new ways to use it.

It took a little bit of time and practice to remember the formula, to understand how it works and how to troubleshoot it if it breaks, but the time and energy put into learning it have been dwarfed by the rewards I’ve seen from employing it successfully. If you take the time to learn this formula, I promise that it will be worth it — you’ll easily be able to cut down thousands (or more) of rows in Excel into bite-sized chunks for easy insight-pulling and data presentation.

Without further ado, I present to you:

=if(isnumber(search("string 1",[beginning cell])),"Category 1",if(isnumber(search("string 2",[beginning cell])),"Category 2","Other"))

I apologize if I’ve confused you already. I’ll dive into the formula deeper, explaining its meaning and providing 3 different use cases for how it can help you speed up your work.

Use Case #1: Keyword research

When I’m doing keyword research for a client and I’m staring down a list (likely thousands of rows long) of potential keywords to analyze and their search volumes, I try to lump similar ones together to see patterns of similarity. At Distilled (we’re hiring, btw!), I might use a tool like Brightedge or SEMrush to see the queries a website has visibility for. Additionally, I could just put a topic into Google Keyword Planner and receive an output of similar terms per Google. Export your results in a CSV file and you’ll have your starting point for data analysis. You might even wonder how the formula I mentioned before could even be useful because Google Keyword Planner provides an “Ad Group” column, so one should easily be able to know how to divide up the provided keywords.

Problem is, the output is often divided up between “Seed Keywords” and “Keyword Ideas”, neither of which is helpful for segmenting keyword cohorts. The screenshot above captures the queries and search volumes around related terms for “workout supplements” (note the “Seed Keyword” in cell A2 compared to all others.)

But what if I want to break down this entire list (681 queries, obviously all not shown in the screenshot) to find out how many queries include the word “supplement?” Or perhaps I want to know how many contain “muscle”; I can do that too.

The first thing I’m going to do is remove column A (Ad group) because it’s completely useless. I’m then going to add a column to the right of our search volume column and label it “Category.” At this point we’ll come up with our initial ideas for categorization, so let’s go with “supplement” and “muscle.” In cell C2 we’ll type the formula:

=if(isnumber(search("supplement",A2)),"Supplement",if(isnumber(search("muscle",A2)),"Muscle","Other"))

Translated, this formula says: Search cell A2 and if “supplement” is found, return the category “Supplement.” If “supplement” is not found, look for “muscle,” and if that is found, return “Muscle” as the category. If neither “supplement” nor “muscle” is found, return “Other” as the category.
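If the nesting ever gets unwieldy, the same first-match-wins logic is easy to express outside Excel. A Python equivalent (the `categorize` helper is my own illustration, not part of the template):

```python
def categorize(keyword, rules, default="Other"):
    """Return the category of the FIRST rule whose substring appears in the
    keyword, mirroring the nested-IF formula's left-to-right evaluation."""
    for substring, category in rules:
        if substring in keyword.lower():
            return category
    return default

rules = [("supplement", "Supplement"), ("muscle", "Muscle")]
print(categorize("muscle supplements", rules))  # first match wins
print(categorize("best protein powder", rules))
```

Note that, exactly like the formula, "muscle supplements" lands in "Supplement" because that rule is checked first.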

I can continue to add specifications to the formula as I see fit; “other” would just keep getting pushed back as other strings get searched for. The screenshot below shows this formula in action:

The real power of this formula is that it can be used across the entire dataset, removing the need for someone to manually go through and categorize each keyword. Double-clicking on the bottom-right corner of cell C2 (where our sheet now says Supplement) will apply the formula to all cells in column C, as long as there’s a value next to it in column B (this is a rule of Excel, not the formula). The screenshot below shows the effects of applying the formula to all of the data. Notice how the formula has changed from analyzing cell A2 to cell A19 within cell C19, where the formula is being applied.

“Muscle” isn’t listed as a category in the screenshot, but it is listed as a category later in the dataset. I also need to point out a deficiency in the formula at this point. Where a particular query includes more than one of the strings we’re trying to categorize for, it will return a category for the first positive string match it finds. Row 29 is a good example of this. In this particular scenario, the query is “muscle supplements,” but because the formula looks for “supplement” before it looks for “muscle,” and it found a positive match in “supplement,” it categorizes the cell as “Supplement.”

In the cells where neither “supplement” nor “muscle” were found, it returns “other.” At this point, we add a filter to the data set and can filter out all “muscle” and “supplement” queries to reveal exactly what makes up “other.”

Looking at this list, queries containing “protein” seem to be a sizable percentage of the list, so we can add that as a category as well. From here we can add in a pivot table and sort by search volume and count of keywords. Click here to learn more about pivot tables.

From here we can gain a perspective of where we should be targeting our efforts and where we need to focus more. “Other,” at this point, is still too large a category, so I’d go in and refine it further to create more categories to find out how we can make this even more actionable.

Use Case #2: Disavow work

Google claims that a new Penguin update is “getting closer and closer,” but the actual release date is still unknown. What is known is that monitoring your backlink profile for spammy and manipulative links is a pretty smart idea. I recommend being proactive and analyzing opportunities to disavow certain links if you think they could be a potential liability. My colleague Sergey Stefoglo recently wrote a piece on how to do a backlink audit in 30 minutes, but if you plan on manually inspecting your referring domains (and you should), this categorization formula can help.

Depending on the size of your site, you could potentially be dealing with thousands or millions of linking root domains, so you’d need to start somewhere and cut your list down. One way is to sort the domains by some sort of metric (I often use trust flow from Majestic). I use the formula to look for common words that are associated with spammy domains like “submit,” “seo,” “directory,” “free,” “drugs,” and “articles,” though there are certainly many more (“.xyz” is another I’ve seen frequently). The formula finds any of the specified queries within your list of linking root domains, allowing you to quickly identify those as spam and add them to your disavow list. The screenshot below shows a sample site’s link profile sorted by “Spam,” using the filters above as criteria and then by ascending order of trust flow. The formula used in this case is slightly longer than our previous example, but follows the same pattern.


In many cases, your link profile will have spammy links that come from legitimate-sounding domains. This formula won’t be able to filter out all of the spam, but it often helps remove at least some of the domains from your list. Also, it’s possible that some of the domains now flagged as spam by the formula may actually be legitimate websites. You should always analyze the output of this formula just to make sure it’s worked properly. Again, it serves as a starting point for your disavow work and can hopefully cut down on some of the domains, but it is by no means the only thing you should be looking at.
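The same spam-string matching can be sketched in Python. The spam strings come from the list above; the domains and trust flow values are hypothetical stand-ins for an exported link profile.

```python
# Flag linking root domains containing common spam-associated strings,
# then sort flagged domains by ascending trust flow (hypothetical values)
# so the weakest candidates surface first for manual review.
SPAM_STRINGS = ["submit", "seo", "directory", "free", "drugs", "articles", ".xyz"]

def is_spammy(domain):
    d = domain.lower()
    return any(s in d for s in SPAM_STRINGS)

# domain -> trust flow (illustrative)
domains = {"free-seo-directory.xyz": 4, "example.edu": 42, "articlesubmitpro.com": 7}
flagged = sorted((d for d in domains if is_spammy(d)), key=domains.get)
print(flagged)
```

As with the spreadsheet version, this is only a starting point: every flagged domain should still be inspected by hand before it goes into a disavow file.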

Use Case #3: Parsing Analytics

Another really cool use case for this categorization formula is data analysis from Google Analytics. For my clients, I’m often analyzing information about traffic to a client’s site from organic channels. I’ll change the displayed number of results from 10 to 2,500 and export the data. Once exported, I may want to know which types of pages tend to get the most traffic, convert at the highest rate, bring in the most money, or the opposite of all of these.

As each client's site is different, you'd be looking for different things on each site. Ideally, the site will have an established subfolder structure like example.com/blog/article-1, example.com/supplements/product-1, or example.com/toys/gadget-1. With these common features in the URLs, you can label pages however you'd like, perhaps "blog," "supplements," or "toys," and use this categorization to break down which types of pages work best and where improvements can be made.
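The subfolder labeling can be sketched in Python; the URLs below are hypothetical examples of the structure described above.

```python
# Label URLs by their first-level subfolder, so
# "example.com/blog/article-1" is categorized as "blog", etc.
from urllib.parse import urlparse

def page_category(url):
    segments = urlparse(url).path.strip("/").split("/")
    return segments[0] if segments[0] else "other"

urls = ["https://example.com/blog/article-1",
        "https://example.com/supplements/product-1"]
print([page_category(u) for u in urls])  # ['blog', 'supplements']
```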

For one client, I exported their data from Google Search Console and broke out their pages by “comparison,” “reviews,” “alternatives,” and “other.” From this, I was able to identify where we could possibly improve, establish what was working, and have more concrete data to show the client.


Categorization will not solve any SEO or digital marketing problems for you, but it can make data analysis much faster and visually compelling. The faster you can identify opportunities, the more time you’ll actually have for making recommendations and an impact for your business or client.

This formula is so versatile that it can be used for nearly anything. I hope that you find clever ways for it to make your data analysis easier and less tedious. As each site is different, it’s impossible to say exactly which strings you should be looking for in any given scenario, but if you can take away from this post an understanding of the power of this formula and how to re-create it, you’ll find quite quickly it can be used for more tasks than you can dream up. Please comment or share your ideas for how to use this formula in the comments section below or at my Twitter handle, @mr_jeremyg.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Moz Blog

Posted in IM News | Comments Off

Using Term Frequency Analysis to Measure Your Content Quality

Posted by EricEnge

It’s time to look at your content differently—time to start understanding just how good it really is. I am not simply talking about titles, keyword usage, and meta descriptions. I am talking about the entire page experience. In today’s post, I am going to introduce the general concept of content quality analysis, why it should matter to you, and how to use term frequency (TF) analysis to gather ideas on how to improve your content.

TF analysis is usually combined with inverse document frequency analysis (collectively TF-IDF analysis). TF-IDF analysis has been a staple concept of information retrieval science for a long time. You can read more about TF-IDF and other search science concepts in Cyrus Shepard's excellent article here.

For purposes of today’s post, I am going to show you how you can use TF analysis to get clues as to what Google is valuing in the content of sites that currently outrank you. But first, let’s get oriented.

Conceptualizing page quality

Start by asking yourself if your page provides a quality experience to people who visit it. For example, if a search engine sends 100 people to your page, how many of them will be happy? Seventy percent? Thirty percent? Less? What if your competitor’s page gets a higher percentage of happy users than yours does? Does that feel like an “uh-oh”?

Let’s think about this with a specific example in mind. What if you ran a golf club site, and 100 people come to your page after searching on a phrase like “golf clubs.” What are the kinds of things they may be looking for?

Here are some things they might want:

  1. A way to buy golf clubs on your site (you would need to see a shopping cart of some sort).
  2. The ability to select specific brands, perhaps by links to other pages about those brands of golf clubs.
  3. Information on how to pick the club that is best for them.
  4. The ability to select specific types of clubs (drivers, putters, irons, etc.). Again, this may be via links to other pages.
  5. A site search box.
  6. Pricing info.
  7. Info on shipping costs.
  8. Expert analysis comparing different golf club brands.
  9. End user reviews of your company so they can determine if they want to do business with you.
  10. How your return policy works.
  11. How they can file a complaint.
  12. Information about your company. Perhaps an “about us” page.
  13. A link to a privacy policy page.
  14. Whether or not you have been “in the news” recently.
  15. Trust symbols that show that you are a reputable organization.
  16. A way to access pages to buy different products, such as golf balls or tees.
  17. Information about specific golf courses.
  18. Tips on how to improve their golf game.

This is really only a partial list, and the specifics of your site can certainly vary for any number of reasons from what I laid out above. So how do you figure out what it is that people really want? You could pull in data from a number of sources. For example, using data from your site search box can be invaluable. You can do user testing on your site. You can conduct surveys. These are all good sources of data.

You can also look at your analytics data to see what pages get visited the most. Just be careful how you use that data. For example, if most of your traffic is from search, this data will be biased by incoming search traffic, and hence what Google chooses to rank. In addition, you may only have a small percentage of the visitors to your site going to your privacy policy, but chances are good that there are significantly more users than that who notice whether or not you have a privacy policy. Many of these will be satisfied just to see that you have one and won’t actually go check it out.

Whatever you do, it’s worth using many of these methods to determine what users want from the pages of your site and then using the resulting information to improve your overall site experience.

Is Google using this type of info as a ranking factor?

At some level, they clearly are. Google and Bing have evolved far beyond the initial TF-IDF concepts, but we can still use those concepts to better understand our own content.

The first major indication we had that Google was performing content quality analysis was the release of the Panda algorithm in February 2011. More recently, we know that on April 21 Google will release an algorithm that makes the mobile-friendliness of a website a ranking factor. Pure and simple, this algo is about the user experience with a page.

Exactly how Google is performing these measurements is not known, but what we do know is their intent. They want to make their search engine look good, largely because it helps them make more money. Sending users to pages that make them happy will do that. Google has every incentive to improve the quality of their search results in as many ways as they can.

Ultimately, we don't actually know what Google is measuring and using. It may be that the only SEO impact of providing pages that satisfy a very high percentage of users is an indirect one: so many people like your site that it gets written about and linked to more often, earns tons of social shares, and gets great engagement, so Google sees the other signals it uses as ranking factors, and this is why your rankings improve.

But, do I care if the impact is a direct one or an indirect one? Well, NO.

Using TF analysis to evaluate your page

TF-IDF analysis is more about relevance than content quality, but we can still use various precepts from it to help us understand our own content quality. One way to do this is to compare the results of a TF analysis of all the keywords on your page with those pages that currently outrank you in the search results. In this section, I am going to outline the basic concepts for how you can do this. In the next section I will show you a process that you can use with publicly available tools and a spreadsheet.

The simplest form of TF analysis is to count the number of uses of each keyword on a page. However, the problem with that is that a page using a keyword 10 times will be seen as 10 times more valuable than a page that uses a keyword only once. For that reason, we dampen the calculations. I have seen two methods for doing this, as follows:

term frequency calculation

The first method relies on dividing the number of repetitions of a keyword by the count for the most popular word on the entire page. Basically, what this does is eliminate the inherent advantage that longer documents might otherwise have over shorter ones. The second method dampens the total impact in a different way, by taking the log base 10 for the actual keyword count. Both of these achieve the effect of still valuing incremental uses of a keyword, but dampening it substantially. I prefer to use method 1, but you can use either method for our purposes here.
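Both dampening methods can be sketched in Python. For method 2, I use the common 1 + log10(count) variant so that a single occurrence still scores above zero; the sample page text is illustrative.

```python
import math
from collections import Counter

def tf_scores(text, method=1):
    """Dampened term frequency for each word on a page.
    Method 1: raw count divided by the count of the most popular word.
    Method 2: 1 + log10(raw count), the common variant, so a single
    occurrence still scores above zero."""
    counts = Counter(text.lower().split())
    top = max(counts.values())
    if method == 1:
        return {w: c / top for w, c in counts.items()}
    return {w: 1 + math.log10(c) for w, c in counts.items()}

page = "golf clubs golf clubs golf drivers putters"
print(tf_scores(page)["golf"])   # 1.0 (most popular word on the page)
print(tf_scores(page)["clubs"])  # ~0.67
```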

Once you have the TF calculated for every different keyword found on your page, you can then start to do the same analysis for pages that outrank you for a given search term. If you were to do this for five competing pages, the result might look something like this:

term frequency spreadsheet

I will show you how to set up the spreadsheet later, but for now, let’s do the fun part, which is to figure out how to analyze the results. Here are some of the things to look for:

  1. Are there any highly related words that all or most of your competitors are using that you don’t use at all?
  2. Are there any such words that you use significantly less, on average, than your competitors?
  3. Also look for words that you use significantly more than competitors.

You can then tag these words for further analysis. Once you are done, your spreadsheet may now look like this:

second stage term frequency analysis spreadsheet

To make the screenshot above fit on the screen and keep it legible, I eliminated some columns you saw in my first spreadsheet. However, I did a sample analysis for the movie "Woman in Gold." You can see the full spreadsheet of calculations here. Note that we used an automated approach to marking some items as "Low Ratio," "High Ratio," or "All Competitors Have, Client Does Not."
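One plausible way to automate that kind of flagging is sketched below. The 2x ratio thresholds and the TF values are my own assumptions for illustration, not the exact rules used in the linked spreadsheet.

```python
# Flag terms where competitor TF usage diverges from the client page.
# Thresholds (2x above/below the competitor average) are assumptions.
def flag_term(client_tf, competitor_tfs):
    avg = sum(competitor_tfs) / len(competitor_tfs)
    if client_tf == 0 and all(t > 0 for t in competitor_tfs):
        return "All Competitors Have, Client Does Not"
    if avg > 0 and client_tf < avg / 2:
        return "Low Ratio"
    if client_tf > avg * 2:
        return "High Ratio"
    return ""

print(flag_term(0.0, [0.2, 0.3, 0.25]))  # All Competitors Have, Client Does Not
print(flag_term(0.9, [0.2, 0.3, 0.1]))   # High Ratio
```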

None of these flags by themselves have meaning, so you now need to put all of this into context. In our example, the following words probably have no significance at all: “get”, “you”, “top”, “see”, “we”, “all”, “but”, and other words of this type. These are just very basic English language words.

But, we can see other things of note relating to the target page (a.k.a. the client page):

  1. It's missing any mention of actor Ryan Reynolds.
  2. It's missing any mention of actor Helen Mirren.
  3. The page has no reviews.
  4. Words like "family" and "story" are not mentioned.
  5. "Austrian" and "maria altmann" are not used at all.
  6. The phrase "woman in gold" and the words "billing" and "info" are used proportionally more than they are on the other pages.

Note that the last item is only visible if you open the spreadsheet. The issues above could well be significant, as the lead actors, reviews, and story details are the kinds of indications that a page has in-depth content. We see that the competing pages that rank include details of the story, so that's an indication that this is what Google (and users) are looking for. The fact that the main key phrase and the word "billing" are used to a proportionally high degree also makes the page seem a bit spammy.

In fact, if you look at the information closely, you can see that the target page is quite thin in overall content. So much so, that it almost looks like a doorway page. In fact, it looks like it was put together by the movie studio itself, just not very well, as it presents little in the way of a home page experience that would cause it to rank for the name of the movie!

In the many different times I have done an analysis using these methods, I’ve been able to make many different types of observations about pages. A few of the more interesting ones include:

  1. A page that had no privacy policy, yet was taking personally identifiable info from users.
  2. A major lack of important synonyms that would indicate a real depth of available content.
  3. Comparatively low Domain Authority competitors ranking with in-depth content.

These types of observations are interesting and valuable, but it's important to stress that you shouldn't be overly mechanical about this. The value in this type of analysis is that it gives you a technical way to compare the content on your page with that of your competitors. This type of analysis should be used in combination with other methods that you use for evaluating that same page. I'll address this some more in the summary section below.

How do you execute this for yourself?

The full spreadsheet contains all the formulas, so all you need to do is link in the keyword count data. I have tried this with two different keyword density tools: the one from Searchmetrics and this one from motoricerca.info.

I am not endorsing these tools, and I have no financial interest in either one—they just seemed to work fairly well for the process I outlined above. To provide the data in the right format, please do the following:

  1. Run all the URLs you are testing through the keyword density tool.
  2. Copy and paste all the one word, two word, and three word results into a tab on the spreadsheet.
  3. Sort them all so you get total word counts aligned by position as I have shown in the linked spreadsheet.
  4. Set up the formulas as I did in the demo spreadsheet (you can just use the demo spreadsheet).
  5. Then do your analysis!

This may sound a bit tedious (and it is), but it has worked very well for us at STC.
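The copy-and-paste alignment in steps 2 and 3 can equally be sketched in Python. The pages and text below are illustrative stand-ins for a keyword density tool's export, not real data.

```python
from collections import Counter

# Align per-page keyword counts into one table with a column per page,
# the same shape as the demo spreadsheet's keyword-count tab.
pages = {
    "yourpage": "woman in gold movie tickets",
    "competitor1": "woman in gold review gold",
}
counts = {p: Counter(t.split()) for p, t in pages.items()}
terms = sorted(set().union(*counts.values()))
table = {t: [counts[p][t] for p in pages] for t in terms}
print(table["gold"])  # counts of "gold" on each page, in page order
```

From a table like this, the TF and ratio formulas can be applied column by column.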


You can also use usability groups and a number of other methods to figure out what users are really looking for on your site. However, what this method does is give us a look at what Google has chosen to rank highest in its search results. Don't treat this as some sort of magic formula where you mechanically tweak the content to get better metrics in this analysis.

Instead, use this as a method for slicing into your content to better see it the way a machine might see it. It can yield some surprising (and wonderful) insights!


Moz Blog

Posted in IM News | Comments Off

Advanced Content Analysis in Google Analytics

Posted by Jeffalytics

We analyze the performance of our content every day. Sometimes it’s subconscious, like when we check the number of tweets we get from a new blog post. Other times, we make more conscious efforts, like reviewing performance metrics in Google Analytics. 

This feedback—both formal and anecdotal—informs what we do next. It influences future blog posts and validates our strategies. Reviewing content performance on a regular basis has been key to the growth of many online publishers. We should all be taking note of these successes as we build our own content marketing efforts. 

Paying attention to which of your content efforts are working well is the cornerstone to data-driven marketing. The companies that make these investments can produce tremendous results. For an in-depth analysis on the importance of being data driven, here are two recent articles that inspired me:

These articles show how taking a data-driven approach to producing content can produce great results: exponential traffic and revenue growth, in these cases. 

I don’t know about you, but exponential traffic sounds pretty great to me! 

But we will never get there without taking a methodical and data-driven approach to our efforts. We will never get there if we are only counting page views. 

It’s time to take things to the next level!

Using Google Analytics Content Groupings and Dimensions to inform our content strategy

For many of us, Google Analytics is the tool of choice for analyzing website performance. It's free, easy to use, and extremely powerful. But because it is free and easy, most users never explore the more advanced features of the product. 

One of the more advanced features that you have at your disposal is content grouping. Content grouping allows you to gather your content into common themes to create a more meaningful analysis of your data. 

For example, you can group your blog posts by the type of content that they represent. This grouping is helpful if you cover many topics on your website or sell many products. 

This is something that I have been doing for years on my own site. It helps me understand which topics resonate the most with readers. It also helps me understand which topics drive organic search visitors. 

In the past, I would have to do this in a manual fashion. It involved exporting data into Excel and grouping content by the presence of certain words in the page URL. This was an ugly manual process that I would not wish on anyone. 

With content grouping in Google Analytics, we can get a view of this data with little effort involved. Here is a screenshot of traffic performance by content groups, based on common topics that I cover on my blog.

Content Groupings for Jeffalytics

This simple screenshot is quite revealing. It shows which topics resonate the most, as well as content deficiencies. And these reports get even more valuable once you start to segment your data. More on this shortly.

Configuring content groupings in Google Analytics

Content Grouping Options

Before we can get into deep analysis of our content, it makes sense for us to talk about how we can configure this report in Google Analytics. 

There are three ways to set up this feature. The easiest way to do it is by creating rules to define your groups. Rules work like advanced segments in Google Analytics. Set the criteria for your groupings and Google Analytics will do the rest of the work. 

Note that these rules work only on the page URL, page title or screen name (for apps). 

Here is an example of how to configure groupings matching words found in your page URLs. 

Content Groupings by Rules

The definitions work as a waterfall. If a page URL/title fits your first definition, it is excluded from each later definition. For this reason, we want to be specific with our first rules and leave the more general, "catch-all" rules for the end. 

Notice how I used a regular expression to define what makes up a PPC Page. The pipe (|) symbol serves as an "or" statement in the expression. You can also use the "or" statement on the right, but this can get unwieldy fast. 

For long regular expressions, use the extraction method for content grouping. This works wonders for complex regular expressions with several criteria to classify posts.
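The waterfall behavior can be sketched in Python with `re.search`. The rules and URL below are illustrative, not the exact definitions from the screenshot.

```python
import re

# Waterfall matching: the first rule whose pattern matches the page URL
# wins, and later rules never see that page, which is why specific rules
# go first and the catch-all goes last.
RULES = [
    ("PPC Page", r"ppc|pay-per-click"),  # pipe (|) acts as "or"
    ("Blog", r"/blog/"),
    ("Other", r".*"),  # catch-all rule, always last
]

def content_group(url):
    for name, pattern in RULES:
        if re.search(pattern, url):
            return name
    return "(not set)"

print(content_group("https://example.com/blog/ppc-tips"))  # PPC Page
```

Note that a blog post about PPC lands in "PPC Page," not "Blog," because the PPC rule is evaluated first.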

Using code to define your content groupings

The above options use the data that you already send to Google Analytics with each page view (page URL and page title). While this works well if we have search friendly URLs and titles, it is also limiting in our ability to perform analysis. 

If you would like to analyze beyond words in your content, then you will need to use code to push this data into Google Analytics. 

While this sounds daunting, it is not too bad. I was able to get this code working in less than 30 minutes to provide a proof of concept. 

What are some groupings that you might want to use for measuring content performance? 

How about the length of your content? Many of us have seen studies on the importance of the length of our content. Is it worthwhile to write longer articles, or is that just a “best practice” that does not apply to your site? 

Let’s measure it! 

How about the date that you published your content? If you put the date of your post in the URL, you can use rules to build these content groupings. I don’t include the publishing year in my URL, so I would need code to get this done. 

Here is how I configured Google Analytics to track word counts and publishing year of articles. 

First, you set a new definition for your content grouping in the admin section. I selected indexes 4 & 5 to avoid any potential conflicts.

Tracking Code for Content Groupings

As soon as you have defined your grouping, Google will give you code snippets to use for tracking in Google Analytics. There is code for Classic and Universal Analytics. 

I use Google Tag Manager on my website, so I pushed data into the system using the data layer functionality.

My code looked like this for tracking word count, word count range and year published:

Data Layer Variables for Custom Content Groupings 

We trigger this code on every page of my website using native functions from WordPress. If you are using Google Tag Manager and WordPress, I would be more than happy to provide you with the code that I used to build this data layer. 
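One plausible implementation of the word-count-range variable that feeds the data layer is sketched below in Python. The range boundaries and cap are assumptions for illustration; only the idea of bucketing an exact word count into a labeled range comes from the setup described here.

```python
# Bucket an exact word count into a range label for content grouping.
# Step size and cap are illustrative assumptions.
def word_count_range(words, step=200, cap=2000):
    if words >= cap:
        return f"{cap}+"
    low = (words // step) * step
    return f"{low}-{low + step}"

print(word_count_range(150))   # 0-200
print(word_count_range(1234))  # 1200-1400
```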

Next, I created a macro in Tag Manager to recognize these variables.
Data Layer Variable Google Analytics

I gave a default value of 0-200: in the event that a word count is unavailable from WordPress, it will list 0-200 words. Then, in my Universal Analytics tag, I set content groups in the tag configuration options. My indexes correspond to the groups we set in the Google Analytics interface. The words in the {{}} brackets represent the macros we defined above.

Universal Analytics Tag

Setting Content Groupings in Universal Analytics

After publishing, every page load will send content grouping data into Google Analytics. Pretty awesome! 

Once your definitions are in place, you will see your groups listed in the admin section of Google Analytics. You can define up to 5 unique content groups per view.

Naming the Content Groupings

For even more on the topic of setting up content groupings, here is an awesome article by Michael King on content groupings for the user journey.

Viewing this data in Google Analytics

Once your definitions are in place, Google Analytics will start to push this data into your account. Note that these definitions do not work retroactively—only on data moving forward. Unfortunately that means that you will need to wait a few days for meaningful analysis of this data. 

But when the data starts to come in, it’s beautiful! 

Let’s start with the content grouping definitions for post topic type. I have had these in place for a while, so this data is already providing meaningful insights. Here is what we start to see when looking at website visits by topic type. 

content grouping to analyze content ideas

While WordPress pages drive the most traffic, they have relatively low value per page view. This does not count any affiliate revenue, but it is indicative of the traffic brought in by this topic. High traffic volume, low value. 

This high traffic volume, low page value metric helps me draw two conclusions:

  1. I need a better call to action and offer for WordPress content. I can’t write about this topic without having an action for visitors to take. I may need to invest in some sort of premium content for this topic.
  2. As I plan my content strategy, it may not make a lot of sense to focus on WordPress if I cannot find a way to get more value out of the visits. It is clear that Google Analytics content is more valuable for me.

By grouping my content into themes, I now have a fresh perspective on the effectiveness of my content. Instead of choosing the topic on my mind on any given day, I may benefit by only writing about Google Analytics. 

This level of insight is not possible without content grouping. Content grouping is incredible when you have this data tied into the goals you have already set up with Google Analytics.

Checking in on our code-driven content groupings

As you can see, content grouping provides excellent insights into your content strategy performance. If you have thousands of articles on your website, content groupings will help you sift through the noise and go right to the signal. 

You can gain insight into other aspects of your content strategy through this same method. Let’s check in on the groupings that we set up through code earlier in this article. Please note that this is a proof of concept with only a small amount of data to support the groupings. Over time, your picture will start to become more valuable as you see conversions rolling into your account. 

How many page views are we getting for the content we produced over the past 4 years? This is easy to view with our content groupings.
Blog post visits by year

This is a traffic pattern that I had assumed in my mind (I wrote much more in 2013 than in 2014). Now I have the numbers to prove it. 

What about by word count? 

Not surprisingly, lower word count pages (like the homepage) are getting the most traffic.

Word Count

This data will get even more interesting over time.

Applying segmentation to our content groupings

We have grouped our content by length of the article and when it was published. Now we can measure how these factors impact our organic search traffic. We can do this a few ways. My preferred method is to look at the medium of organic search and then use a secondary dimension of content group. 

Organic Search by Word Range

Again, we see that our shorter articles are driving the most search traffic. This is for two reasons: 1) the default content range is 0-200, so this bucket includes articles with no word count defined by WordPress; and 2) it includes our home page, which often ranks for branded search results. 

If granular keyword data were still available in Google Analytics, we would be able to segment brand/non-brand traffic. But alas. 

We can do this same analysis by year as well.

Organic Search by Year

Notice that the current year is receiving the most organic traffic. I can only assume that this is again due to branded traffic. 

Content grouping makes everything better!

Where else does content grouping make Google Analytics data shine?

Many of your favorite Google Analytics reports get better with content groupings. The behavior flow report comes to life with your content groupings.

Behavior Flow 

We no longer need to look at this report with several branches of data hidden from view. Now you can see how people visit your site based on your pre-defined content groupings.

Behavior Flow Report

Custom Reports 

You can also use custom reports to combine several fields together. For example, try to view organic visits by the year you wrote the content and the topics into a single report. 

Google Organic by Year by Topic

You can also start to add your conversion data and understand the value of the content that you have produced over the years.

Several years ago I wrote a post about investing in SEO for YouMoz. The basic premise is that SEO investment does not fit into normal budget constraints. For example, you may budget for all your SEO efforts in 2015, but there is a revenue impact of these efforts for years to come. 

A custom report by post year can help you better understand the continued return on your SEO investment over the years.

What other content groupings make sense to explore?

Once we start grouping our content for analysis, many possibilities become available. Here are a few more ideas for what we can measure for content groupings:

  • Grouping by social share counts. How do share counts affect traffic and conversions? I have done a proof of concept with social shares in the past and the data is revealing.
  • Grouping by external links using the Mozscape API. Push this into your data layer and you can start to analyze how links may be impacting your content performance.
  • Grouping by any on page metadata for your post. We included word count here, but we can also include title length, keyword usage, etc.
  • Grouping by targeted keyword. Use a custom field from WordPress (or your CMS) to push this into your data layer for content grouping.
  • More specific date based grouping. Instead of grouping by year, group by month or week to see how strategies take hold more quickly.
  • Grouping by author of content. Which authors drive the most traffic and revenue?
  • Grouping by department of company. Are certain departments producing better content? 

You can measure pretty much anything with content grouping. The only real limitations are your imagination and Google's current limit of 5 content groups in each view. You can even get around that limit by using multiple views if you want.

What type of questions can we answer with content groupings?

With content groupings in place, we can answer more business questions than standard content reports. Here are a few business questions I can start to answer with the content groupings we have already discussed.

  • Is our content marketing hitting the mark?
  • Are we making progress toward our goals with our recent content marketing?
  • Did our SEO investment mature like we thought it would?
  • Has our new focus on converting visitors affected overall revenue significantly?

Through content grouping, we can find answers within our pre-defined points of analysis. We no longer have to look at individual posts and pages to find answers. 

We provide the taxonomy that works for our business. Then we use this taxonomy to show how visitors reached our website through acquisition reports. We see how they performed on the site through conversion reports. 

Now Google Analytics starts to think a lot more like our business. It uses our own words to describe content within a structure we define. Plus, we have the tremendous processing power of Google Analytics to handle our queries.

Bonus: Use custom dimensions to make these reports even more useful

If you were paying close attention to the data layer variables I showed earlier in the post, you may have noticed a third variable: the exact word count for each page. I added this variable to the data layer as I was starting to analyze the content groupings, since some analysis becomes easier with the exact word count available in Google Analytics. 

In Google Tag Manager, I set a custom dimension of Word Count using my third data layer variable. Now, I can view post topic by word count of the article in Google Analytics. 
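The post doesn’t show how the word-count value itself is generated before it lands in the data layer, but it is typically computed from the post’s HTML body. A minimal sketch (plain Python, with a made-up snippet of page content) of one way to do it:

```python
import re

def word_count(html):
    """Strip HTML tags, then count whitespace-separated words."""
    text = re.sub(r"<[^>]+>", " ", html)
    return len(text.split())

# Hypothetical post body for illustration.
body = "<p>Google Analytics starts to <em>think</em> like our business.</p>"
print(word_count(body))  # 8
```

A value like this, computed once when the page is rendered, is what gets pushed into the data layer and then mapped to the Word Count custom dimension in Google Tag Manager.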

Is Word Count useful as a secondary dimension? Definitely! There are many times when you need an exact number available to conduct analysis.

You can add up to 20 custom dimensions per web property in Google Analytics. It only works with the Universal Analytics version.

What type of content analysis are you going to do now?

Groupings are like a cheat-code for content marketers to take their analysis to the next level. You get to push your own data into Google Analytics. You get to use your own definitions within the tool. 

There are really no limits to what you can measure. What is it going to be? I would love to hear your ideas in the comments section.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Moz Blog

Posted in IM News | Comments Off

Using Analytical Analysis to Help Improve Conversions

The secret to getting the most success out of your initiatives typically lies in reaching for the biggest hammer in your marketing tool bag: analytics.
Search Engine Watch – Latest

Posted in IM News | Comments Off

Fast SEO Competitive Analysis

Careful keyword research is a time-consuming, often overlooked, but yet foundational aspect of creating a fantastic SEO strategy.
Search Engine Watch – Latest

Posted in IM News | Comments Off

A New Analysis of Google SERPs Across Search Volume and Site Type

Posted by Matt Peters

At Moz, we have been following up on our 2013 Search Engine Ranking Factors study by continuing to analyze interesting aspects of the data. One of our most frequently asked questions is, “Do you see any systematic differences in Google’s search results across search volume or topic category?” By design, our main study used a broad keyword set across all search volumes and industries to capture Google’s overall search algorithm. As a result, we weren’t able to answer this question since it requires segmenting the data into different buckets. In this post, I’ll do just that and dig into the data in an attempt to answer this question.

Our approach

We used a subset of the data from our 2013 Ranking Factors study, focusing on a few of the most important factors. In the main study, we collected the top 50 search results for about 15,000 keywords from Google, along with more than 100 different factors. These included links, anchor text, on-page factors, and social signals, among others. Then, for each factor we computed the mean Spearman correlation between the factor and search position. Here’s a great graphic from Rand that helps illustrate how to interpret the correlations:

In general, a higher correlation means that the factor is more closely related to higher rankings than a factor with a lower correlation. It doesn’t necessarily mean that there is causation!

In addition to search results and factors, we collected the categories from AdWords (e.g. “Home and Garden”) and the monthly US (local) search volume. This allows us to examine correlations across these different segments.

Search volume

First up is search volume. We segmented each keyword into one of three buckets depending on the average local (US) monthly search volume from AdWords: less than 5,000 searches per month, 5,000-15,000 searches per month, and more than 15,000 searches per month.
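The bucketing step can be sketched in a few lines. This is a minimal illustration (plain Python; the thresholds come from the text, while the bucket labels and sample keywords/volumes are my own):

```python
def volume_bucket(monthly_searches):
    """Assign a keyword to one of the three search-volume buckets
    described above: <5,000, 5,000-15,000, and >15,000 per month."""
    if monthly_searches < 5000:
        return "low"
    elif monthly_searches <= 15000:
        return "mid"
    else:
        return "high"

# Hypothetical keywords with made-up monthly US search volumes.
keywords = {"home goods online": 480, "seo tools": 9900, "cheap flights": 1_000_000}
print({kw: volume_bucket(vol) for kw, vol in keywords.items()})
# {'home goods online': 'low', 'seo tools': 'mid', 'cheap flights': 'high'}
```

Once every keyword carries a bucket label, the per-bucket medians and mean correlations below are just group-by aggregations over the labeled data.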

To begin exploring the data, here is the median page and domain authority in each bucket, along with the total percentage of results with a domain name exactly matching the keyword:

Not too surprisingly, we see the overall page authority, domain authority and the exact match domain (EMD) percentage all increase with search volume. This is presumably because higher-volume queries are targeted by larger, more authoritative sites.

Now, an overall higher page authority for high-volume queries doesn’t necessarily mean that the correlation with search position will be larger. The correlation measures the extent to which page authority (or any other factor) can predict the ordering. As an example, consider two three-result SERPs: one with page authorities of 90, 92, and 88 for the first three positions, and another with values of 30, 20, and 10. The first SERP has higher values overall, but a lower correlation. To examine how these factors impact search ordering, we can compute the mean Spearman correlation in each bucket:
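The two toy SERPs above can be checked directly. Here is a minimal sketch (plain Python; Spearman computed by hand as the Pearson correlation of ranks, with the sign flipped so that a factor that predicts better positions comes out positive — the study’s exact sign convention is an assumption on my part):

```python
def ranks(values):
    """1-based ranks; no tie handling, since these toy examples have no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman_vs_position(factor_values):
    """Rank correlation between a factor and SERP order, sign flipped so
    'higher factor at better positions' is positive."""
    positions = list(range(1, len(factor_values) + 1))
    return -pearson(ranks(positions), ranks(factor_values))

print(round(spearman_vs_position([90, 92, 88]), 3))  # 0.5  - high values, weaker ordering
print(round(spearman_vs_position([30, 20, 10]), 3))  # 1.0  - lower values, perfect ordering
```

The second SERP has much lower page authorities, yet its correlation is perfect, because the values decrease monotonically with position — which is exactly the distinction the paragraph above is drawing.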

And for those who prefer a chart:

From left to right, the table lists link-related factors (page authority, domain authority, and exact match anchor text); a brand-related factor (number of domain mentions in the last 30 days from Fresh Web Explorer); social factors (number of Google +1s, Facebook shares, and tweets); and keyword-related factors (keyword usage on the page, in the title, and EMD).

Looking at the data, we can see a few interesting things:

  1. The correlations increase noticeably with search volume for link, brand, and social media factors.
  2. The correlations are mostly constant for keyword-related factors (keyword usage on the page or in the domain name).

Primarily, point #1 says that these factors do a better job of predicting rank as search volume increases. We’d expect to see a larger discrepancy in the link or social metrics throughout the SERPs of higher-volume queries than of lower-volume queries. One corollary is that SERPs from lower-volume queries are more heavily influenced by factors that aren’t represented in the table (e.g. positive or negative user signals).

One implication of point #2 is that Google’s keyword-document relevance algorithm is the same for high- and low-volume queries. That is, their method for determining what a page is about doesn’t depend on the query’s popularity.

We can make this more concrete by considering two different queries and SERPs: one high volume (“cheap flights” with more than 1 million searches per month), and one low-volume (“home goods online” with less than 500 searches per month). For reference, here are the top results for each search, with the page and domain authority from the MozBar:

Above: Google SERP for “cheap flights”

Above: Google SERP for “home goods online”

When a user enters a query, Google first determines which of the many pages in its index are relevant to the query, then ranks the results. A popular query will likely have several relevant pages (or more) with many links, since they are targeted by marketers. In this case, Google should have plenty of signals to determine ranking. A relevant page with high page authority? Check, put it in the top 10. On the other hand, pages in the dark corners of the internet with relatively few links are likely most relevant to low-volume queries. In the low-volume case, since the link signals aren’t as clear, Google is forced to rely more heavily on other signals to determine ranking, and the correlations decrease. This example oversimplifies the complexity of the algorithm, but provides some intuitive understanding of the data.

Site category

We can repeat the analysis for the different AdWords categories. First, the median page and domain authority and EMD percentage:

And the mean Spearman correlations:

Overall, the trends are similar to search volume, with significant differences in the link correlations, and smaller differences in the keyword-related correlations. The explanation for these results is similar to the one above for search volume. The industries with the largest link and social correlations — “Health” and “Travel & Tourism” — tend to have broad-based queries targeted by lots of sites. On the other hand, the industries near the bottom of the table — “Apparel,” “Dining & Nightlife,” and “Retailers & General Merchandise” — all tend to have specific or local intent queries that are likely to be relevant to specific product pages or smaller sites.


In this post, we have explored how a few individual ranking factors vary across search volume and keyword category. Correlations of link- and social-related metrics increase with search volume, but correlations of keyword-related factors (usage on the page and in the domain name) are constant across search volume. Taken together, this suggests that Google is using the same query-document relevance algorithm for both head and tail queries, but that link metrics predict SERPs from popular queries better than from tail queries. We see something similar across site categories, with the largest differences in the link-related correlations. Industries like “Health” that have broad, informational queries have higher correlations than industries like “Apparel” that tend to have queries with specific product intent.

Moz Blog

Posted in IM News | Comments Off