Tag Archive | "*Comprehensive*"

A Comprehensive Analysis of the New Domain Authority

Posted by rjonesx.

Moz’s Domain Authority is requested over 1,000,000,000 times per year, it’s referenced millions of times on the web, and it has become a veritable household name among search engine optimizers for a variety of use cases, from determining the success of a link building campaign to qualifying domains for purchase. With the launch of Moz’s entirely new, improved, and much larger link index, we recognized the opportunity to revisit Domain Authority with the same rigor as we did keyword volume years ago (which ushered in the era of clickstream-modeled keyword data).

What follows is a rigorous treatment of the new Domain Authority metric. What I will not do in this piece is rehash the debate over whether Domain Authority matters or what its proper use cases are. I have addressed those questions before and will do so at length in a later post. Rather, I intend to spend the following paragraphs addressing the new Domain Authority metric from multiple directions.

Correlations between DA and SERP rankings

The most important component of Domain Authority is how well it correlates with search results. But first, let’s get the correlation-versus-causation objection out of the way: Domain Authority does not cause search rankings. It is not a ranking factor. Domain Authority predicts the likelihood that one domain will outrank another. That being said, its usefulness as a metric is tied in large part to this value. The stronger the correlation, the more valuable Domain Authority is for predicting rankings.


Determining the “correlation” between a metric and SERP rankings has been accomplished in many different ways over the years. Should we compare against the “true first page,” top 10, top 20, top 50 or top 100? How many SERPs do we need to collect in order for our results to be statistically significant? It’s important that I outline the methodology for reproducibility and for any comments or concerns on the techniques used. For the purposes of this study, I chose to use the “true first page.” This means that the SERPs were collected using only the keyword with no additional parameters. I chose to use this particular data set for a number of reasons:

  • The true first page is what most users experience, thus the predictive power of Domain Authority will be focused on what users see.
  • By not using any special parameters, we’re likely to get Google’s typical results. 
  • By not extending beyond the true first page, we’re likely to avoid manually penalized sites (which can distort the correlations with links).
  • We did NOT use the same training set, or the same training set size, as we used for this correlation study. That is to say, we trained on the top 10 but are reporting correlations on the true first page. This prevents the results from being overly biased toward our model.

I randomly selected 16,000 keywords from the United States keyword corpus for Keyword Explorer and collected the true first page for each (completely different from the keywords used in the training set). I extracted the URLs, and I also chose to remove consecutive duplicate domains (i.e., when the same domain occurred in back-to-back positions). For a time, Google clustered domains together in the SERPs under certain circumstances. It was easy to spot these clusters, as the second and later listings were indented. No such indentations are present any longer, but we can’t be certain that Google never groups domains. If it does group domains, that would throw off the correlation, because it would be the grouping, not the traditional link-based algorithm, doing the work.
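The deduplication step above, collapsing back-to-back results from the same domain, can be sketched in a few lines. This is a minimal illustration with made-up URLs, not Moz's actual collection pipeline:

```python
from urllib.parse import urlparse

def dedupe_consecutive_domains(urls):
    """Drop results whose domain matches the immediately preceding result,
    mirroring the removal of back-to-back duplicate domains described above."""
    deduped, prev = [], None
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain != prev:
            deduped.append(url)
        prev = domain
    return deduped

# Made-up SERP for illustration:
serp = [
    "https://a.com/page1",
    "https://a.com/page2",   # back-to-back duplicate of a.com: dropped
    "https://b.com/post",
    "https://a.com/page3",   # a.com again, but not consecutive: kept
]
print(dedupe_consecutive_domains(serp))
# → ['https://a.com/page1', 'https://b.com/post', 'https://a.com/page3']
```

Note that only consecutive repeats are removed; a domain reappearing later in the SERP is kept, just as a re-listed domain would be in the study.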

I collected the Domain Authority (Moz), Citation Flow and Trust Flow (Majestic), and Domain Rank (Ahrefs) for each domain and calculated the mean Spearman correlation coefficient for each SERP. I then averaged the coefficients for each metric.
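For reproducibility, the per-SERP calculation can be sketched as follows: compute the Spearman correlation between each result's position and its metric score, then average the coefficients across SERPs. The scores below are invented for illustration; Moz's actual data and implementation differ:

```python
def ranks(values):
    """1-based average ranks, with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of tied positions i..j, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# One hypothetical SERP: positions 1..5 and a made-up DA score per result.
positions = [1, 2, 3, 4, 5]
da_scores = [80, 72, 75, 40, 35]
coef = spearman(positions, da_scores)   # negative: higher DA, better (lower) position

# Repeat per SERP, then average the coefficients for each metric:
per_serp = [coef, -0.7, -0.85]          # stand-in values for other SERPs
mean_coef = sum(per_serp) / len(per_serp)
```

A negative coefficient is expected here because position 1 is the best rank, which is why the signs are inverted in the graph for readability.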


Moz’s new Domain Authority has the strongest correlations with SERPs of the competing strength-of-domain link-based metrics in the industry. The sign (-/+) has been inverted in the graph for readability, although the actual coefficients are negative (and should be).

Moz’s Domain Authority scored ~.12, or roughly 6% stronger than the next-best competitor (Domain Rank by Ahrefs). Domain Authority performed 35% better than Citation Flow and 18% better than Trust Flow. This isn’t surprising, in that Domain Authority is trained to predict rankings while our competitors’ strength-of-domain metrics are not. It shouldn’t be taken as a negative that our competitors’ strength-of-domain metrics don’t correlate as strongly as Moz’s Domain Authority; rather, it simply reflects the intrinsic differences between the metrics. That being said, if you want a metric that best predicts rankings at the domain level, Domain Authority is that metric.

Note: At first blush, Domain Authority’s improvements over the competition are, frankly, underwhelming. The truth is that we could quite easily increase the correlation further, but doing so would risk over-fitting and compromising a secondary goal of Domain Authority…

Handling link manipulation

Historically, Domain Authority has focused on a single goal: maximizing the predictive capacity of the metric. All we wanted were the highest correlations. However, Domain Authority has become, for better or worse, synonymous with “domain value” in many sectors, such as among link buyers and domainers. Consequently, as bizarre as it may sound, Domain Authority has itself been targeted with spam in order to bolster the score and sell domains at a higher price. While these crude link manipulation techniques didn’t work well in Google, they were sufficient to inflate Domain Authority. We decided to rein that in.

Data sets

The first thing we did was compile a series of data sets corresponding to industries we wished to impact, knowing that Domain Authority was regularly manipulated in these circles.

  • Random domains
  • Moz customers
  • Blog comment spam
  • Low-quality auction domains
  • Mid-quality auction domains
  • High-quality auction domains
  • Known link sellers
  • Known link buyers
  • Domainer network
  • Link network

While it would be my preference to release all the data sets, I’ve chosen not to, so as not to “out” any website in particular. Instead, I opted to provide these data sets to a number of search engine marketers for validation. The only data set not offered for outside validation was Moz customers, for obvious reasons.


For each of the above data sets, I collected both the old and new Domain Authority scores. This was all conducted on February 28th in order to have parity across tests. I then calculated the relative difference between the old DA and new DA within each group. Finally, I compared the various data set results against one another to confirm that the model addresses the various methods of inflating Domain Authority.
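The per-group comparison can be sketched like this. The scores here are hypothetical stand-ins; the actual domain lists and DA values are Moz's:

```python
def mean_relative_change(old_scores, new_scores):
    """Average per-domain change from old DA to new DA, as a percentage."""
    changes = [(new - old) / old * 100
               for old, new in zip(old_scores, new_scores)]
    return sum(changes) / len(changes)

# Hypothetical old/new DA pairs for one group of domains:
old_da = [40, 50, 60]
new_da = [37, 47, 56]
print(round(mean_relative_change(old_da, new_da), 1))  # → -6.7
```

Running this per data set, then comparing the group averages, is what produces the per-category percentages reported below.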


In the above graph, blue represents the Old Average Domain Authority for that data set and orange represents the New Average Domain Authority for that same data set. One immediately noticeable feature is that every category drops. Even random domains drop. This is a re-centering of the Domain Authority score and should cause no alarm to webmasters. There is, on average, a 6% reduction in Domain Authority for randomly selected domains from the web. Thus, if your Domain Authority drops a few points, you are well within the range of normal. Now, let’s look at the various data sets individually.

Random domains: -6.1%

Using the same methodology for finding random domains that we use for collecting comparative link statistics, I selected 1,000 domains and determined that there is, on average, a 6.1% drop in Domain Authority. It’s important that webmasters recognize this, as the shift is likely to affect most sites and is nothing to worry about.

Moz customers: -7.4%

Of immediate interest to Moz is how our own customers perform in relation to the random set of domains. On average, the Domain Authority of Moz customers lowered by 7.4%. This is very close to the random set of URLs and indicates that most Moz customers are likely not using techniques to manipulate DA to any large degree.  

Link buyers: -15.9%

Surprisingly, link buyers lost only 15.9% of their Domain Authority. In retrospect, this seems reasonable. First, we looked specifically at link buyers from blog networks, which aren’t as spammy as many other techniques. Second, most of the sites paying for links are also optimizing their site’s content, which means the sites do rank, sometimes quite well, in Google. Because Domain Authority trains against actual rankings, it’s reasonable to expect that the link buyers data set would not be impacted as heavily as other techniques, because the neural network learns that some link buying patterns actually work.

Comment spammers: -34%

Here’s where the fun starts. The neural network behind Domain Authority was able to drop comment spammers’ average DA by 34%. I was particularly pleased with this one because of all the types of link manipulation addressed by Domain Authority, comment spam is, in my honest opinion, no better than vandalism. Hopefully this will have a positive impact on decreasing comment spam — every little bit counts. 

Link sellers: -56%

I was actually quite surprised, at first, that link sellers on average dropped 56% in Domain Authority. I knew that link sellers often participated in link schemes (normally interlinking their own blog networks to build up DA) so that they can charge higher prices. However, it didn’t occur to me that link sellers would be easier to pick out because they explicitly do not optimize their own sites beyond links. Consequently, link sellers tend to have inflated, bogus link profiles and flimsy content, which means they tend to not rank in Google. If they don’t rank, then the neural network behind Domain Authority is likely to pick up on the trend. It will be interesting to see how the market responds to such a dramatic change in Domain Authority.

High-quality auction domains: -61%

One of the features I’m most proud of in Domain Authority is that it addressed link manipulation in line with our intuitions about quality. I created three different data sets out of one larger data set (auction domains), using qualifiers like price, TLD, and archive.org status to label each domain as high-quality, mid-quality, or low-quality. In theory, if the neural network does its job correctly, we should see the high-quality domains impacted the least and the low-quality domains impacted the most. This is exactly the pattern the new model produced. High-quality auction domains dropped an average of 61% in Domain Authority. That seems really high for “high-quality” auction domains, but even a cursory glance at the backlink profiles of domains up for sale in the $10K+ range shows clear link manipulation. The domainer industry, especially the domainer-for-SEO industry, is rife with spam.
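The labeling step described above can be sketched as a simple rule over the three qualifiers mentioned (price, TLD, and archive.org status). The thresholds and rules below are purely illustrative assumptions, not Moz's actual criteria:

```python
def label_auction_domain(price_usd, tld, in_archive):
    """Assign a rough quality label to an auction domain using price, TLD,
    and archive.org history. Thresholds are illustrative, not Moz's rules."""
    if price_usd >= 10_000 and tld in {"com", "org", "net"} and in_archive:
        return "high-quality"
    if price_usd >= 1_000 and in_archive:
        return "mid-quality"
    return "low-quality"

print(label_auction_domain(15_000, "com", True))   # → high-quality
print(label_auction_domain(2_500, "info", True))   # → mid-quality
print(label_auction_domain(50, "xyz", False))      # → low-quality
```

The point of the split is simply to create three buckets with an expected quality ordering, so the model's per-bucket impact can be checked against intuition.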

Link network: -79%

There is one network on the web that troubles me more than any other. I won’t name it, but it’s particularly pernicious because the sites in this network all link to the top 1,000,000 sites on the web. If your site is in the top 1,000,000 on the web, you’ll likely see hundreds of root linking domains from this network no matter which link index you look at (Moz, Majestic, or Ahrefs). You can imagine my delight to see that it drops roughly 79% in Domain Authority, and rightfully so, as the vast majority of these sites have been banned by Google.

Mid-quality auction domains: -95%

Continuing with the pattern regarding the quality of auction domains, you can see that “mid-quality” auction domains dropped nearly 95% in Domain Authority. This is huge. Bear in mind that these drastic drops are not combined with losses in correlation with SERPs; rather, the neural network is learning to distinguish between backlink profiles far more effectively, separating the wheat from the chaff. 

Domainer networks: -97%

If you spend any time looking at dropped domains, you have probably come upon a domainer network where there are a series of sites enumerated and all linking to one another. For example, the first site might be sbt001.com, then sbt002.com, and so on and so forth for thousands of domains. While it’s obvious for humans to look at this and see a pattern, Domain Authority needed to learn that these techniques do not correlate with rankings. The new Domain Authority does just that, dropping the domainer networks we analyzed on average by 97%.

Low-quality auction domains: -98%

Finally, the worst offenders — low-quality auction domains — dropped 98% on average. Domain Authority just can’t be fooled in the way it has in the past. You have to acquire good links in the right proportions (in accordance with a natural model and sites that already rank) if you wish to have a strong Domain Authority score. 

What does this mean?

For most webmasters, this means very little. Your Domain Authority might drop a little bit, but so will your competitors’. For search engine optimizers, especially consultants and agencies, it means quite a bit. The inventories of known link sellers will probably diminish dramatically overnight. High DA links will become far more rare. The same is true of those trying to construct private blog networks (PBNs). Of course, Domain Authority doesn’t cause rankings so it won’t impact your current rank, but it should give consultants and agencies a much smarter metric for assessing quality.

What are the best use cases for DA?

  • Compare changes in your Domain Authority with your competitors. If you drop significantly more, or increase significantly more, it could indicate that there are important differences in your link profile.
  • Compare changes in your Domain Authority over time. The new Domain Authority will update historically as well, so you can track your DA. If your DA is decreasing over time, especially relative to your competitors, you probably need to get started on outreach.
  • Assess link quality when looking to acquire dropped or auction domains. Those looking to acquire dropped or auction domains now have a much more powerful tool in their hands for assessing quality. Of course, DA should not be the primary metric for assessing the quality of a link or a domain, but it certainly should be in every webmaster’s toolkit.

What should we expect going forward?

We aren’t going to rest. An important philosophical shift has taken place at Moz with regards to Domain Authority. In the past, we believed it was best to keep Domain Authority static, rarely updating the model, in order to give users an apples-to-apples comparison. Over time, though, this meant that Domain Authority would become less relevant. Given the rapidity with which Google updates its results and algorithms, the new Domain Authority will be far more agile as we give it new features, retrain it more frequently, and respond to algorithmic changes from Google. We hope you like it.

Be sure to join us on Thursday, March 14th at 10am PT for our upcoming webinar discussing strategies and use cases for the new Domain Authority.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Moz Blog

Posted in IM News | Comments Off

Take a Deep Dive into Comprehensive Content Marketing Strategy

This week was all about the bigger picture of content marketing strategy. “Great content” is a wonderful start, but you need the strategic context that pulls it all together. Whether you’re a pro or just getting started, the posts and podcasts below will give you a framework to make your project really strong. On Monday,
Read More…

The post Take a Deep Dive into Comprehensive Content Marketing Strategy appeared first on Copyblogger.



How to Beat Your Competitor’s Rankings with More *Comprehensive* Content – Whiteboard Friday

Posted by randfish

Longer, more thorough documents tend to do better in the search results. We know that’s true, but why? And is there a way we can use that knowledge to our advantage? In today’s Whiteboard Friday, Rand explains how Google may be weighting content comprehensiveness and outlines his three-step methodology for gaining an edge over your competitors when it comes to meeting searchers’ needs.


Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat about, well, something I’ve noticed, something we’ve noticed here at Moz, which is that there seems to be this extra weight that Google is putting right now on what I’m going to call content comprehensiveness, the degree to which a piece of content answers all of a searcher’s potential questions. I think this is one of the reasons that we keep seeing statistics like word length and document length being well-correlated with higher rankings, and why longer documents tend to do better in search results. I’m going to break this down.

Broad ranking inputs

On the broad ranking inputs, when Googlebot is over here and sort of considering like: Which URL should I rank? Someone searched for best time to apply for jobs, and what am I going to put in here? They tend to look at a bunch of stuff. Domain authority and page-level link authority and keyword targeting, for sure. Topic authority, the domain, and load speed and freshness and da, da, da.

But these four, all of which are sort of related:

  1. Searcher engagement and satisfaction, so the degree to which when people land on that page they have a good experience, they don’t bounce back to the search results and click another result.
  2. The diversity and uniqueness of that content compared to everything else in the results.
  3. The raw content quality, which I think Google has probably lots of things they use to measure content quality, including engagement and satisfaction, so these might overlap.
  4. And then comprehensiveness.

It’s sort of this right mix of these three things, like the depth, the trustworthiness, and the value that the content provides seems to really speak to this. It’s something we’ve been seeing like Google kind of overweighting right now, especially over the last 12 to 18 months. There seems to be this confluence of queries, where this very comprehensive content comes up in ranking positions that we wouldn’t ordinarily expect. It throws off things around link metrics and keyword targeting metrics, and sometimes SEOs go, “What is going on there?”

So, in particular, we see this happening with informational- and research-focused queries, with product and brand comparison type queries, like “best stereo” or “best noise cancelling headphones,” so those types of things. Broad questions, implicit or explicit questions that have complex or multifaceted answers to them. So probably, yes, you would see this type of very comprehensive content ranking better, and, in fact, I did some of these queries. So for things like “job application best practices,” “gender bias in hiring,” “résumé examples,” these are broad questions, informational/research focus, product comparison stuff.

Then, not so much, you would not see these in things like “job application for Walmart,” which literally just takes you to Walmart’s job application page, which is not a particularly comprehensive format. The comprehensive stuff ranks vastly below that. “Gender bias definition,” which takes you to a short page with the definition, and “résumé template Google Docs,” which takes you to Google Docs’ résumé template. These are almost more navigational or more short-format answers in what they’re doing.

How to be more comprehensive than the competition

So if you want to nail this, if you identify that your queries are not in this bucket, but they are in this bucket, you probably want to try and aim for some of this content comprehensiveness. To do that, I’ve got kind of a three-step methodology. It is not easy, it is hard, and it is going to take a lot of work. I don’t mean to oversimplify. But if you do this, you tend to be able to beat out even much more powerful websites for the queries you’re going after.

1. Identify ALL the questions inherent in the search query term/phrase:

First off, you need to identify all the questions that are inherent in the searcher’s query. Those could be explicit or implicit, meaning they’re implied or they’re obvious. They could be dependent on the person’s background, the searcher’s background, which means you need to identify: Who are all the types of people searching for this, and what differences do they have? We may need different types of content to serve different folks, and there needs to be some bifurcation or segmentation on the page to help them get there.

Same thing on their purpose. So some people who are searching for “job application best practices” may be employers. Some people may be job applicants. Some may be employees. Some may be people who are starting companies. Some may be HR directors. You need to provide that background for all of them.

One of the ways to do this, to get all the questions, truly all the questions, is to survey. You can survey your users or your community, or you can go through a third-party system. For example, Oli Gardner from Unbounce was very kind and did this for Moz recently, asking about customer confusion, objections, and issues. He ran the tests through UsabilityHub, which you can use as well. You can also use Q&A sites like Quora, or social media sites like Twitter, LinkedIn, or Facebook, if you’re trying to gather some of this data informally.

2. Gather information your competition cannot/would not get:

Once you have all these questions, you need to assemble the information that answers all of these types of questions, hopefully in a way that your competition cannot or would not get. So that means things like:

  • Proprietary data
  • Competitive landscape information, which many folks won’t provide because they’re only willing to talk about themselves, not how they relate to others.
  • Industry and community opinions, which most folks are not willing to go out and get, especially if they’re bigger.
  • Aggregated or uniquely processed metrics. One of the most salient recent examples, from the election that just passed, is sites like FiveThirtyEight, the Upshot, or the Huffington Post, which built models based on other people’s data that they aggregated and included.
  • Information put together in visual, audio, or interactive mediums.

3. Assemble in formats others don’t/can’t/won’t use:

Now that you have this competitive advantage, in terms of the content, and you have all of the questions, you can assemble this stuff in formats that other people don’t or won’t create or use.

  • That could be things like guides that require extraordinary amounts of work. “The Beginner’s Guide to SEO” is a good example from Moz, but there are many, many others in all sorts of fields.
  • Highly customized formats that have these interactive or visual components that other people are generally unwilling to invest the effort in to create.
  • Free-to-download or free-to-access reports and data that other people would charge for or put behind paywalls.
  • Non-transactional or non-call-to-action-focused formats. For example, a lot of the times when you do stuff in this job search arena, you see folks who are trying to promote their service or their product, and therefore they want to have you input something before they give you anything back. If you do that for free, you can often overwhelm the comprehensiveness of what anyone else in the space is doing.

This process, like I said, not easy, but can be a true competitive advantage, especially if you’re willing to take on these individual key phrases and terms in a way that your competition just can’t or won’t.

I’d love to hear if you’ve got any examples of these, if you’ve tried it before. If you do use this process, please feel free to post the results in the comments. We’d love to check it out. We’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com
