Tag Archive | "Analysis"

A Comprehensive Analysis of the New Domain Authority

Posted by rjonesx.

Moz’s Domain Authority is requested over 1,000,000,000 times per year, it’s referenced millions of times on the web, and it has become a veritable household name among search engine optimizers for a variety of use cases, from determining the success of a link building campaign to qualifying domains for purchase. With the launch of Moz’s entirely new, improved, and much larger link index, we recognized the opportunity to revisit Domain Authority with the same rigor as we did keyword volume years ago (which ushered in the era of clickstream-modeled keyword data).

What follows is a rigorous treatment of the new Domain Authority metric. What I will not do in this piece is rehash the debate over whether Domain Authority matters or what its proper use cases are. I have addressed those before and will do so again at length in a later post. Rather, I intend to spend the following paragraphs addressing the new Domain Authority metric from multiple directions.

Correlations between DA and SERP rankings

The most important component of Domain Authority is how well it correlates with search results. But first, let’s get the correlation-versus-causation objection out of the way: Domain Authority does not cause search rankings. It is not a ranking factor. Domain Authority predicts the likelihood that one domain will outrank another. That being said, its usefulness as a metric is tied in large part to this value. The stronger the correlation, the more valuable Domain Authority is for predicting rankings.

Methodology

Determining the “correlation” between a metric and SERP rankings has been accomplished in many different ways over the years. Should we compare against the “true first page,” top 10, top 20, top 50 or top 100? How many SERPs do we need to collect in order for our results to be statistically significant? It’s important that I outline the methodology for reproducibility and for any comments or concerns on the techniques used. For the purposes of this study, I chose to use the “true first page.” This means that the SERPs were collected using only the keyword with no additional parameters. I chose to use this particular data set for a number of reasons:

  • The true first page is what most users experience, thus the predictive power of Domain Authority will be focused on what users see.
  • By not using any special parameters, we’re likely to get Google’s typical results. 
  • By not extending beyond the true first page, we’re likely to avoid manually penalized sites (which can impact the correlations with links.)
  • We did NOT use the same training set (or training set size) for this correlation study as we did to build the model. That is to say, we trained on the top 10 but are reporting correlations on the true first page. This protects us from reporting a result that is overly biased towards our own model. 

I randomly selected 16,000 keywords from the United States keyword corpus for Keyword Explorer. I then collected the true first page for all of these keywords (completely different from those used in the training set). I extracted the URLs and also chose to remove duplicate domains (i.e., cases where the same domain occurred one after another). For a time, Google clustered domains together in the SERPs under certain circumstances. It was easy to spot these clusters, as the second and later listings were indented. No such indentations are present any longer, but we can’t be certain that Google never groups domains. If they do group domains, it would throw off the correlation, because it’s the grouping and not the traditional link-based algorithm doing the work.

I collected the Domain Authority (Moz), Citation Flow and Trust Flow (Majestic), and Domain Rank (Ahrefs) for each domain and calculated the mean Spearman correlation coefficient for each SERP. I then averaged the coefficients for each metric.
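To make the correlation step concrete, here is a minimal sketch in Python of how one could reproduce it, assuming each SERP is already an ordered list of results carrying the domain and its metric scores. The dictionary keys and the scipy-based calculation are illustrative assumptions, not Moz’s production pipeline.

    from itertools import groupby
    from statistics import mean
    from scipy.stats import spearmanr

    def dedupe_consecutive(results):
        """Collapse runs of the same domain appearing one after another in a SERP."""
        return [next(group) for _, group in groupby(results, key=lambda r: r["domain"])]

    def serp_correlation(results, metric):
        """Spearman correlation between rank position and a domain metric for one SERP."""
        deduped = dedupe_consecutive(results)
        positions = list(range(1, len(deduped) + 1))
        scores = [r[metric] for r in deduped]
        rho, _p = spearmanr(positions, scores)
        return rho

    def mean_correlation(serps, metric):
        """Average the per-SERP coefficients for one metric, skipping degenerate SERPs."""
        coefficients = (serp_correlation(results, metric) for results in serps.values())
        return mean(c for c in coefficients if c == c)  # c == c drops NaNs from constant-score SERPs

    # serps = {keyword: [{"domain": "example.com", "domain_authority": 56, ...}, ...]} in ranked order
    # print(mean_correlation(serps, "domain_authority"))

Note that the raw coefficients come out negative (position 1 is the smallest rank number, so a higher metric score correlates with a lower position number), which is consistent with the sign inversion mentioned below.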

Outcome

Moz’s new Domain Authority has the strongest correlations with SERPs of the competing strength-of-domain link-based metrics in the industry. The sign (-/+) has been inverted in the graph for readability, although the actual coefficients are negative (and should be).

Moz’s Domain Authority scored ~.12, or roughly 6% stronger than the next best competitor (Domain Rank by Ahrefs). Domain Authority performed 35% better than Citation Flow and 18% better than Trust Flow. This isn’t surprising, in that Domain Authority is trained to predict rankings while our competitors’ strength-of-domain metrics are not. It shouldn’t be taken as a negative that our competitors’ strength-of-domain metrics don’t correlate as strongly as Moz’s Domain Authority — rather, it’s simply exemplary of the intrinsic differences between the metrics. That being said, if you want a metric that best predicts rankings at the domain level, Domain Authority is that metric.

Note: At first blush, Domain Authority’s improvements over the competition are, frankly, underwhelming. The truth is that we could quite easily increase the correlation further, but doing so would risk over-fitting and compromising a secondary goal of Domain Authority…

Handling link manipulation

Historically, Domain Authority has focused on only one thing: maximizing the predictive capacity of the metric. All we wanted were the highest correlations. However, Domain Authority has become, for better or worse, synonymous with “domain value” in many sectors, such as among link buyers and domainers. Subsequently, as bizarre as it may sound, Domain Authority has itself been targeted with spam in order to bolster the score and sell domains at a higher price. While these crude link manipulation techniques didn’t work so well in Google, they were sufficient to increase Domain Authority. We decided to rein that in.

Data sets

The first thing we did was compile a series of data sets that corresponded with industries we wished to impact, knowing that Domain Authority was regularly manipulated in these circles.

  • Random domains
  • Moz customers
  • Blog comment spam
  • Low-quality auction domains
  • Mid-quality auction domains
  • High-quality auction domains
  • Known link sellers
  • Known link buyers
  • Domainer network
  • Link network

While it would be my preference to release all the data sets, I’ve chosen not to in order to not “out” any website in particular. Instead, I opted to provide these data sets to a number of search engine marketers for validation. The only data set not offered for outside validation was Moz customers, for obvious reasons.

Methodology

For each of the above data sets, I collected both the old and new Domain Authority scores. This was all done on February 28th in order to have parity across tests. I then calculated the relative difference between the old DA and new DA within each group. Finally, I compared the various data set results against one another to confirm that the model addresses the various methods of inflating Domain Authority.
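As a rough sketch of that comparison step (the group names, scores, and layout below are invented purely for illustration), the per-group calculation is simply a mean relative difference between old and new scores:

    def relative_change(pairs):
        """Mean relative difference between new and old Domain Authority for one data set."""
        changes = [(new - old) / old for old, new in pairs if old]
        return sum(changes) / len(changes)

    datasets = {
        # (old DA, new DA) pairs; hypothetical scores purely for illustration
        "random_domains": [(35.0, 32.8), (48.0, 45.1)],
        "comment_spam": [(41.0, 27.0), (55.0, 36.5)],
    }

    for name, pairs in datasets.items():
        print(f"{name}: {relative_change(pairs):+.1%}")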

Results

In the above graph, blue represents the Old Average Domain Authority for that data set and orange represents the New Average Domain Authority for that same data set. One immediately noticeable feature is that every category drops. Even random domains drop. This is a re-centering of the Domain Authority score and should cause no alarm to webmasters. There is, on average, a 6% reduction in Domain Authority for randomly selected domains from the web. Thus, if your Domain Authority drops a few points, you are well within the range of normal. Now, let’s look at the various data sets individually.



Random domains: -6.1%

Using the same methodology for finding random domains that we use for collecting comparative link statistics, I selected 1,000 domains and determined that there is, on average, a 6.1% drop in Domain Authority. It’s important that webmasters recognize this, as the shift is likely to affect most sites and is nothing to worry about.

Moz customers: -7.4%

Of immediate interest to Moz is how our own customers perform in relation to the random set of domains. On average, the Domain Authority of Moz customers lowered by 7.4%. This is very close to the random set of URLs and indicates that most Moz customers are likely not using techniques to manipulate DA to any large degree.  

Link buyers: -15.9%

Surprisingly, link buyers only lost 15.9% of their Domain Authority. In retrospect, this seems reasonable. First, we looked specifically at link buyers from blog networks, which aren’t as spammy as many other techniques. Second, most of the sites paying for links are also optimizing their site’s content, which means the sites do rank, sometimes quite well, in Google. Because Domain Authority trains against actual rankings, it’s reasonable to expect that the link buyers data set would not be impacted as highly as other techniques because the neural network learns that some link buying patterns actually work. 

Comment spammers: -34%

Here’s where the fun starts. The neural network behind Domain Authority was able to drop comment spammers’ average DA by 34%. I was particularly pleased with this one because of all the types of link manipulation addressed by Domain Authority, comment spam is, in my honest opinion, no better than vandalism. Hopefully this will have a positive impact on decreasing comment spam — every little bit counts. 

Link sellers: -56%

I was actually quite surprised, at first, that link sellers on average dropped 56% in Domain Authority. I knew that link sellers often participated in link schemes (normally interlinking their own blog networks to build up DA) so that they can charge higher prices. However, it didn’t occur to me that link sellers would be easier to pick out because they explicitly do not optimize their own sites beyond links. Subsequently, link sellers tend to have inflated, bogus link profiles and flimsy content, which means they tend to not rank in Google. If they don’t rank, then the neural network behind Domain Authority is likely to pick up on the trend. It will be interesting to see how the market responds to such a dramatic change in Domain Authority.

High-quality auction domains: -61%

One of the features that I’m most proud of in regard to Domain Authority is that it effectively addressed link manipulation in order of our intuition regarding quality. I created three different data sets out of one larger data set (auction domains), where I used qualifiers like price, TLD, and archive.org status to label each domain as high-quality, mid-quality, or low-quality. In theory, if the neural network does its job correctly, we should see the high-quality domains impacted the least and the low-quality domains impacted the most. This is exactly the pattern rendered by the new model. High-quality auction domains dropped an average of 61% in Domain Authority. That seems really high for “high-quality” auction domains, but even a cursory glance at the backlink profiles of domains for sale in the $10K+ range shows clear link manipulation. The domainer industry, especially the domainer-for-SEO industry, is rife with spam. 

Link network: -79%

There is one network on the web that troubles me more than any other. I won’t name it, but it’s particularly pernicious because the sites in this network all link to the top 1,000,000 sites on the web. If your site is in the top 1,000,000 on the web, you’ll likely see hundreds of root linking domains from this network no matter which link index you look at (Moz, Majestic, or Ahrefs). You can imagine my delight to see that it drops roughly 79% in Domain Authority, and rightfully so, as the vast majority of these sites have been banned by Google.

Mid-quality auction domains: -95%

Continuing with the pattern regarding the quality of auction domains, you can see that “mid-quality” auction domains dropped nearly 95% in Domain Authority. This is huge. Bear in mind that these drastic drops are not combined with losses in correlation with SERPs; rather, the neural network is learning to distinguish between backlink profiles far more effectively, separating the wheat from the chaff. 

Domainer networks: -97%

If you spend any time looking at dropped domains, you have probably come upon a domainer network where there are a series of sites enumerated and all linking to one another. For example, the first site might be sbt001.com, then sbt002.com, and so on and so forth for thousands of domains. While it’s obvious for humans to look at this and see a pattern, Domain Authority needed to learn that these techniques do not correlate with rankings. The new Domain Authority does just that, dropping the domainer networks we analyzed on average by 97%.

Low-quality auction domains: -98%

Finally, the worst offenders — low-quality auction domains — dropped 98% on average. Domain Authority just can’t be fooled in the way it has in the past. You have to acquire good links in the right proportions (in accordance with a natural model and sites that already rank) if you wish to have a strong Domain Authority score. 

What does this mean?

For most webmasters, this means very little. Your Domain Authority might drop a little bit, but so will your competitors’. For search engine optimizers, especially consultants and agencies, it means quite a bit. The inventories of known link sellers will probably diminish dramatically overnight. High DA links will become far more rare. The same is true of those trying to construct private blog networks (PBNs). Of course, Domain Authority doesn’t cause rankings so it won’t impact your current rank, but it should give consultants and agencies a much smarter metric for assessing quality.

What are the best use cases for DA?

  • Compare changes in your Domain Authority with your competitors. If you drop significantly more, or increase significantly more, it could indicate that there are important differences in your link profile.
  • Compare changes in your Domain Authority over time. The new Domain Authority will update historically as well, so you can track your DA. If your DA is decreasing over time, especially relative to your competitors, you probably need to get started on outreach.
  • Assess link quality when looking to acquire dropped or auction domains. Those looking to acquire dropped or auction domains now have a much more powerful tool in their hands for assessing quality. Of course, DA should not be the primary metric for assessing the quality of a link or a domain, but it certainly should be in every webmaster’s toolkit.

What should we expect going forward?

We aren’t going to rest. An important philosophical shift has taken place at Moz with regards to Domain Authority. In the past, we believed it was best to keep Domain Authority static, rarely updating the model, in order to give users an apples-to-apples comparison. Over time, though, this meant that Domain Authority would become less relevant. Given the rapidity with which Google updates its results and algorithms, the new Domain Authority will be far more agile as we give it new features, retrain it more frequently, and respond to algorithmic changes from Google. We hope you like it.


Be sure to join us on Thursday, March 14th at 10am PT for our upcoming webinar discussing strategies & use cases for the new Domain Authority.




Moz Blog


SEO Channel Context: An Analysis of Growth Opportunities

Posted by Branko_Kral

Too often do you see SEO analyses and decisions being made without considering the context of the marketing channel mix. Equally as often do you see large budgets being poured into paid ads in ways that seem to forget there’s a whole lot to gain from catering to popular search demand.

Both instances can lead to leaky conversion funnels and missed opportunity for long term traffic flows. But this article will show you a case of an SEO context analysis we used to determine the importance and role of SEO.

This analysis was one of our deliverables for a marketing agency client who hired us to inform their SEO decisions. We then turned it into a report template for you to get inspired by and duplicate.

Case description

The included charts show real, live data. You can see the whole SEO channel context analysis in this Data Studio SEO report template.

The traffic analyzed is that of a monetizing blog, whose marketing team also happens to be one of the most fun to work for. For the sake of this case study, we’re giving them a spectacular undercover name — “The Broze Fellaz.”

For context, this blog started off with content for the first two years before they launched their flagship product. Now, they sell a catalogue of products highly relevant to their content and, thanks to one of the most entertaining Shark Tank episodes ever aired, they have acquired investments and a highly engaged niche community.

As you’ll see below, organic search is their biggest channel in many ways. Facebook also runs both as organic and paid and the team spends many an hour inside the platform. Email has elaborate automated flows that strive to leverage subscribers that come from the stellar content on the website. We therefore chose the three — organic Search, Facebook, and email — as a combination that would yield a comprehensive analysis with insights we can easily act on.

Ingredients for the SEO analysis

This analysis is a result of a long-term retainer relationship with “The Broze Fellaz” as our ongoing analytics client. A great deal was required in order for data-driven action to happen, but we assure you, it’s all doable.

From the analysis best practice drawer, we used:

  • 2 cups of relevant channels for context and analysis via comparison.
  • 3 cups of different touch points to identify channel roles — bringing in traffic, generating opt-ins, closing sales, etc.
  • 5 heads of open-minded lettuce and readiness to change current status quo, for a team that can execute.
  • 457 oz of focus-on-finding what is going on with organic search, why it is going on, and what we can do about it (otherwise, we’d end up with another scorecard export).
  • Imperial units used in arbitrary numbers that are hard to imagine and thus feel very large.
  • 1 to 2 heads of your analyst brain, baked into the analysis. You’re not making an automated report — even a HubSpot intern can do that. You’re being a human and you’re analyzing. You’re making human analysis. This helps avoid having your job stolen by a robot.
  • Full tray of Data Studio visualizations that appeal to the eye.
  • Sprinkles of benchmarks, for highlighting significance of performance differences.

From the measurement setup and stack toolbox, we used:

  • Google Analytics with tailored channel definitions, enhanced e-commerce and Search Console integration.
  • Event tracking for opt-ins and adjusted bounce rate via MashMetrics GTM setup framework.
  • UTM routine for social and email traffic implemented via Google Sheets & UTM.io (a scripted version of the same routine is sketched after this list).
  • Google Data Studio. This is my favorite visualization tool. Despite its flaws and gaps (as it’s still in beta) I say it is better than its paid counterparts, and it keeps getting better. For data sources, we used the native connectors for Google Analytics and Google Sheets, then Facebook community connectors by Supermetrics.
  • Keyword Hero. Thanks to semantic algorithms and data aggregation, you are indeed able to see 95 percent of your organic search queries (check out Onpage Hero, too, you’ll be amazed).
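Since the UTM routine above is only mentioned in passing, here is a minimal scripted equivalent of what such a routine produces, assuming a simple source/medium/campaign convention. The URL and parameter values are placeholders, not the client's actual tags.

    from urllib.parse import urlencode, urlparse, urlunparse

    def tag_url(url, source, medium, campaign, content=None):
        """Append UTM parameters to a landing page URL, mirroring a Sheets-based tagging routine."""
        params = {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
        if content:
            params["utm_content"] = content
        parts = urlparse(url)
        query = "&".join(filter(None, [parts.query, urlencode(params)]))
        return urlunparse(parts._replace(query=query))

    print(tag_url("https://example.com/blog/post", "facebook", "social", "spring_launch"))
    # https://example.com/blog/post?utm_source=facebook&utm_medium=social&utm_campaign=spring_launch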

Inspiration for my approach comes from Lea Pica, Avinash, the Google Data Studio newsletter, and Chris Penn, along with our dear clients and the questions they have us answer for them.

Ready? Let’s dive in.

Analysis of the client’s SEO in the context of their channel mix

1) Insight: Before the visit

What’s going on and why is it happening?

Organic search traffic volume blows the other channels out of the water. This is normal for sites with quality regular content; yet, the difference is stark considering the active effort that goes into Facebook and email campaigns.

The CTR of organic search is up to par with Facebook. That’s a lot to say when comparing an organic channel to a channel with a high level of targeting control.

It looks like email flows are the clear winner in terms of CTR to the website, which has a highly engaged community of users who return fairly often and advocate passionately. It also has a product and content that’s incredibly relevant to their users, which few other companies appear to be good at.

A high CTR on search engine results pages often indicates that organic search may support funnel stages beyond just the top.

As well, email flows are sent to a very warm audience — interested users who went through a double opt-in. It is to be expected for this CTR to be high.

What’s been done already?

There’s an active effort and budget allocation being put towards Facebook Ads and email automation. A content plan has been put in place and is being executed diligently.

What we recommend next

  1. Approach SEO in a way as systematic as what you do for Facebook and email flows.
  2. Optimize meta titles and descriptions via testing tools such as Sanity Check. The organic search CTR may become consistently higher than that of Facebook ads.
  3. Assuming you’ve worked on improving CTR for Facebook ads, have the same person work on the meta text and titles. Most likely, there’ll be patterns you can replicate from social to SEO.
  4. Run a technical audit and optimize accordingly. Knowing that you haven’t done that in a long time, and seeing how much traffic you get anyway, there’ll be quick, big wins to enjoy.

Results we expect

You can easily increase the organic CTR by at least 5 percent. You could also clean up the technical state of your site in the eyes of crawlers. You’ll then see faster indexing by search engines when you publish new content and increased impressions for existing content. As a result, you may enjoy a major spike within a month.

2) Insight: Engagement and opt-ins during the visit

With over 70 percent of traffic coming to this website from organic search, the metrics in this analysis will be heavily skewed towards organic search. So, comparing the rate for organic search to the site-wide rate is sometimes conclusive and other times not.

Adjusted bounce rate — via GTM events in the measurement framework used, we do not count a visit as a bounce if the visit lasts 45 seconds or longer. We prefer this approach because such an adjusted bounce rate is much more actionable for content sites. Users who find what they were searching for often read the page they land on for several minutes without clicking to another page. However, this is still a memorable visit for the user. Further, staying on the landing page for a while, or keeping the page open in a browser tab, are both good indicators for distinguishing quality, interested traffic, from all traffic.
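That adjusted-bounce definition is easy to express against exported session-level data. The sketch below is not the GTM implementation itself; it simply applies the same rule (a single-pageview visit under 45 seconds counts as a bounce) to a hypothetical table of sessions:

    import pandas as pd

    # Hypothetical export: one row per session, with pageview count and time on site in seconds
    sessions = pd.DataFrame({
        "channel":   ["Organic Search", "Organic Search", "Facebook", "Email"],
        "pageviews": [1, 1, 1, 3],
        "seconds":   [120, 20, 10, 300],
    })

    # Standard bounce: single-pageview session. Adjusted: also require under 45 seconds on site.
    sessions["bounce"] = sessions["pageviews"].eq(1)
    sessions["adjusted_bounce"] = sessions["bounce"] & sessions["seconds"].lt(45)

    print(sessions.groupby("channel")[["bounce", "adjusted_bounce"]].mean())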

We included all Facebook traffic here, not just paid. We know from the client’s data that the majority is from paid content; they have a solid UTM routine in place. But due to boosted posts, we’ve experienced big inaccuracies when splitting paid and organic Facebook for the purposes of channel attribution.

What’s going on and why is it happening?

It looks like organic search has a bounce rate worse than the email flows — that’s to be expected and not actionable, considering that the emails are only sent to recent visitors who have gone through a double opt-in. What is meaningful, however, is that organic has a better bounce rate than Facebook. It is safe to say that organic search visitors will be more likely to remember the website than the Facebook visitors.

Opt-in rates for Facebook are right above site average, and those for organic search are right below, while organic is bringing in a majority of email opt-ins despite its lower opt-in rate.

Google’s algorithms and the draw of the content on this website are doing better at winning users’ attention than the detailed targeting applied on Facebook. The organic traffic will have a higher likelihood of remembering the website and coming back. Across all of our clients, we find that organic search can be a great retargeting channel, particularly if you consider that the site will come up higher in search results for its recent visitors.

What’s been done already?

The Facebook ad campaigns of “The Broze Fellaz” have been built and optimized for driving content opt-ins. Site content that ranks in organic search is less intentional than that.

Opt-in placements have been tested on some of the biggest organic traffic magnets.

Thorough, creative and consistent content calendars have been in place as a foundation for all channels.

What we recommend next

  1. It’s great to keep using organic search as a way to introduce new users to the site. Now, you can try to be more intentional about using it for driving opt-ins. It’s already serving both of the stages of the funnel.
  2. Test and optimize opt-in placements on more traffic magnets.
  3. Test and optimize opt-in copy for top 10 traffic magnets.
  4. Once your opt-in rates have improved, focus on growing the channel. Supplement the content work with a 3-month sprint of an extensive SEO project.
  5. Assign Google Analytics goal values to non-e-commerce actions on your site. The current opt-ins have different roles and levels of importance and there’s also a handful of other actions people can take that lead to marketing results down the road. Analyzing goal values will help you create better flows toward pre-purchase actions.
  6. Facebook campaigns seem to be at a point where you can pour more budget into them and expect proportionate increase in opt-in count.

Results we expect

Growth in your opt-ins from Facebook should be proportionate to increase in budget, with a near-immediate effect. At the same time, it’s fairly realistic to bring the opt-in rate of organic search closer to site average.

3) Insight: Closing the deal

For channel attribution with money involved, you want to make sure that your Google Analytics channel definitions, view filters, and UTMs are in top shape.

What’s going on and why is it happening?

Transaction rate, as well as per session value, is higher for organic search than it is for Facebook (paid and organic combined).

Organic search contributes far more last-click revenue than Facebook and email combined. For their relatively low volume of traffic, email flows are outstanding in the volume of revenue they bring in.

Thanks to the integration of Keyword Hero with Google Analytics for this client, we can see that about 30 percent of organic search visits are from branded keywords, which tends to drive the transaction rate up.

So, why is this happening? Most of the product on the site is highly relevant to the information people search for on Google.

Multi-channel reports in Google Analytics also show that people often discover the site in organic search, then come back by typing in the URL or clicking a bookmark. That makes organic a source of conversions where, very often, no other channels are even needed.

We can conclude that Facebook posts and campaigns of this client are built to drive content opt-ins, not e-commerce transactions. Email flows are built specifically to close sales.

What’s been done already?

There is dedicated staff for Facebook campaigns and posts, as well as a thorough system dedicated to automated email flows.

A consistent content routine is in place, with experienced staff at the helm. A piece has been published every week for the last few years, with the content calendar filled with ready-to-publish content for the next few months. The community is highly engaged, reading times are high, comment count soaring, and usefulness of content outstanding. This, along with partnerships with influencers, helps “The Broze Fellaz” take up half of the first page on the SERP for several lucrative topics. They’ve been achieving this even without a comprehensive SEO project. Content seems to be king indeed.

Google Shopping has been tried. The campaign looked promising but didn’t yield incremental sales. There’s much more search demand for informational queries than there is for product.

What we recommend next

  1. Organic traffic is ready to grow. If there is no budget left, resource allocation should be considered. In paid search, you can often simply increase budgets. Here, with stellar content already performing well, a comprehensive SEO project is begging for your attention. Focus can be put into structure and technical aspects, as well as content that better caters to search demand. Think optimizing the site’s information architecture, interlinking content for a cornerstone structure, log analysis and technical cleanup, meta text testing for CTR gains that would also lead to ranking gains, strategic ranking for long-tail topics, and intentional growth of the backlink profile.
  2. A three- or six-month intensive sprint of comprehensive SEO work would be appropriate.

Results we expect

Increasing last click revenue from organic search and direct by 25 percent would lead to a gain as high as all of the current revenue from automated email flows. Considering how large the growth has been already, this gain is more than achievable in 3–6 months.

Wrapping it up

Organic search presence of “The Broze Fellaz” should continue to play the number-one role in bringing new people to the site and bringing people back to the site. Doing so supports sales that happen with the contribution of other channels, e.g. email flows. The analysis also points out that organic search is effective at playing the role of the last-click channel for transactions, oftentimes without the help of other channels.

We’ve worked with this client for a few years, and, based on our knowledge of their marketing focus, this analysis points us to a confident conclusion that a dedicated, comprehensive SEO project will lead to high incremental growth.

Your turn

In drawing analytical conclusions and acting on them, there’s always more than one way to shoe a horse. Let us know what conclusions you would’ve drawn instead. Copy the layout of our SEO Channel Context Comparison analysis template and show us what it helped you do for your SEO efforts — create a similar analysis for a paid or owned channel in your mix. Whether it’s comments below, tweeting our way, or sending a smoke signal, we’ll be all ears. And eyes.



Moz Blog


Log File Analysis 101 – Whiteboard Friday

Posted by BritneyMuller

Log file analysis can provide some of the most detailed insights about what Googlebot is doing on your site, but it can be an intimidating subject. In this week’s Whiteboard Friday, Britney Muller breaks down log file analysis to make it a little more accessible to SEOs everywhere.

Click on the whiteboard image above to open a high-resolution version in a new tab!

Video Transcription

Hey, Moz fans. Welcome to another edition of Whiteboard Friday. Today we’re going over all things log file analysis, which is so incredibly important because it really tells you the ins and outs of what Googlebot is doing on your sites.

So I’m going to walk you through the three primary areas, the first being the types of logs that you might see from a particular site, what that looks like, what that information means. The second being how to analyze that data and how to get insights, and then the third being how to use that to optimize your pages and your site.

For a primer on what log file analysis is and its application in SEO, check out our article: How to Use Server Log Analysis for Technical SEO

1. Types

So let’s get right into it. There are three primary types of logs, the primary one being Apache. But you’ll also see W3C, elastic load balancing, which you might see a lot with things like Kibana. But you also will likely come across some custom log files. So for those larger sites, that’s not uncommon. I know Moz has a custom log file system. Fastly is a custom type setup. So just be aware that those are out there.

Log data

So what are you going to see in these logs? The data that comes in is primarily in these colored ones here.

So you will hopefully for sure see:

  • the request server IP;
  • the timestamp, meaning the date and time that this request was made;
  • the URL requested, so what page are they visiting;
  • the HTTP status code, was it a 200, did it resolve, was it a 301 redirect;
  • the user agent, and for us SEOs we’re just looking at the Googlebot user agents.

So log files traditionally house all data, all visits from individuals and traffic, but we want to analyze the Googlebot traffic. The method (GET/POST), time taken, client IP, and referrer are sometimes included as well. So what this looks like, it’s kind of like glibbery gloop.

It’s a word I just made up, and it just looks like that. It’s just like bleh. What is that? It looks crazy. It’s a new language. But essentially you’ll likely see that IP, so that red IP address, that timestamp, which will commonly look like that, that method (get/post), which I don’t completely understand or necessarily need to use in some of the analysis, but it’s good to be aware of all these things, the URL requested, that status code, all of these things here.
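If you want to see what that looks like once it’s untangled, here is a minimal Python sketch that parses an Apache-style combined log line into the fields listed above and keeps only the requests whose user agent claims to be Googlebot. The regex assumes the standard combined format; custom setups (like the Moz or Fastly logs mentioned earlier) would need their own pattern.

    import re

    # Apache combined log format: IP, timestamp, "METHOD URL PROTOCOL", status, bytes, referrer, user agent
    LINE = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) \S+ '
        r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
    )

    def googlebot_hits(path):
        """Yield the parsed fields for every request whose user agent contains 'Googlebot'."""
        with open(path) as log:
            for line in log:
                match = LINE.match(line)
                if match and "Googlebot" in match.group("user_agent"):
                    yield match.groupdict()

    # for hit in googlebot_hits("access.log"):
    #     print(hit["timestamp"], hit["status"], hit["url"])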

2. Analyzing

So what are you going to do with that data? How do we use it? So there’s a number of tools that are really great for doing some of the heavy lifting for you. Screaming Frog Log File Analyzer is great. I’ve used it a lot. I really, really like it. But you have to have your log files in a specific type of format for them to use it.

Splunk is also a great resource. Sumo Logic and I know there’s a bunch of others. If you’re working with really large sites, like I have in the past, you’re going to run into problems here because it’s not going to be in a common log file. So what you can do is to manually do some of this yourself, which I know sounds a little bit crazy.

Manual Excel analysis

But hang in there. Trust me, it’s fun and super interesting. So what I’ve done in the past is I will import a CSV log file into Excel, and I will use the Text Import Wizard and you can basically delineate what the separators are for this craziness. So whether it be a space or a comma or a quote, you can sort of break those up so that each of those live within their own columns. I wouldn’t worry about having extra blank columns, but you can separate those. From there, what you would do is just create pivot tables. So I can link to a resource on how you can easily do that.

Top pages

But essentially what you can look at in Excel is: Okay, what are the top pages that Googlebot hits by frequency? What are those top pages by the number of times it’s requested?

Top folders

You can also look at the top folder requests, which is really interesting and really important. On top of that, you can also look into: What are the most common Googlebot types that are hitting your site? Is it Googlebot mobile? Is it Googlebot images? Are they hitting the correct resources? Super important. You can also do a pivot table with status codes and look at that. I like to apply some of these purple things to the top pages and top folders reports. So now you’re getting some insights into: Okay, how did some of these top pages resolve? What are the top folders looking like?
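For reference, the same pivot-table questions can be answered in a few lines of pandas once the log is parsed into columns. This reuses the googlebot_hits parser sketched in the previous section and assumes columns named url and status; adjust the names to your own export.

    import pandas as pd

    hits = pd.DataFrame(list(googlebot_hits("access.log")))

    hits["folder"] = hits["url"].str.split("/").str[1]      # first path segment
    top_pages = hits["url"].value_counts().head(20)         # pages by Googlebot request count
    top_folders = hits["folder"].value_counts().head(20)    # folders by request count
    status_mix = pd.crosstab(hits["url"], hits["status"])   # how each page resolves

    print(top_pages)
    print(status_mix.loc[top_pages.index])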

You can also do that for Googlebot IPs. This is the best hack I have found with log file analysis. I will create a pivot table just with Googlebot IPs, this right here. So I will usually get, sometimes it’s a bunch of them, but I’ll get all the unique ones, and I can go to terminal on your computer, on most standard computers.

I tried to draw it. It looks like that. But all you do is you type in “host” and then you put in that IP address. You can do it in your terminal with this IP address, and you will see it resolve to a google.com or googlebot.com hostname. That verifies that it’s indeed Googlebot and not some other crawler spoofing Google. So that’s something that these tools tend to automatically take care of, but there are ways to do it manually too, which is just good to be aware of.
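If you have a long list of IPs to verify, the same “host” check can be scripted. Below is a small sketch using Python’s standard library; Google’s crawler IPs reverse-resolve to googlebot.com or google.com hostnames, and forward-confirming the hostname guards against spoofed reverse DNS. The sample IP is from a published Googlebot range.

    import socket

    def is_real_googlebot(ip):
        """Reverse-resolve an IP and confirm the hostname belongs to Google's crawl infrastructure."""
        try:
            hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        except OSError:
            return False
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward-confirm: the hostname should resolve back to the same IP
        return ip in socket.gethostbyname_ex(hostname)[2]

    print(is_real_googlebot("66.249.66.1"))  # expect True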

3. Optimize pages and crawl budget

All right, so how do you optimize for this data and really start to enhance your crawl budget? When I say “crawl budget,” it primarily is just meaning the number of times that Googlebot is coming to your site and the number of pages that they typically crawl. So what is that with? What does that crawl budget look like, and how can you make it more efficient?

  • Server error awareness: So server error awareness is a really important one. It’s good to keep an eye on an increase in 500 errors on some of your pages.
  • 404s: Valid? Referrer?: Another thing to take a look at is all the 400s that Googlebot is finding. It’s so important to see: Okay, is that 400 request, is it a valid 400? Does that page not exist? Or is it a page that should exist and no longer does, but you could maybe fix? If there is an error there or if it shouldn’t be there, what is the referrer? How is Googlebot finding that, and how can you start to clean some of those things up?
  • Isolate 301s and fix frequently hit 301 chains: 301s, so a lot of questions about 301s in these log files. The best trick that I’ve sort of discovered, and I know other people have discovered, is to isolate and fix the most frequently hit 301 chains. So you can do that in a pivot table. It’s actually a lot easier to do this when you have kind of paired it up with crawl data, because now you have some more insights into that chain. What you can do is you can look at the most frequently hit 301s and see: Are there any easy, quick fixes for that chain? Is there something you can remove and quickly resolve to just be like a one hop or a two hop?
  • Mobile first: You can keep an eye on mobile first. If your site has gone mobile first, you can dig into that, into the logs and evaluate what that looks like. Interestingly, the Googlebot is still going to look like this compatible Googlebot 2.0. However, it’s going to have all of the mobile implications in the parentheses before it. So I’m sure these tools can automatically know that. But if you’re doing some of the stuff manually, it’s good to be aware of what that looks like.
  • Missed content: So what’s really important is to take a look at: What’s Googlebot finding and crawling, and what are they just completely missing? So the easiest way to do that is to cross-compare with your site map (a quick way to script that comparison is sketched after this list). It’s a really great way to take a look at what might be missed and why and how can you maybe reprioritize that data in the site map or integrate it into navigation if at all possible.
  • Compare frequency of hits to traffic: This was an awesome tip I got on Twitter, and I can’t remember who said it. They said compare frequency of Googlebot hits to traffic. I thought that was brilliant, because one, not only do you see a potential correlation, but you can also see where you might want to increase crawl traffic or crawls on a specific, high-traffic page. Really interesting to kind of take a look at that.
  • URL parameters: Take a look at if Googlebot is hitting any URLs with the parameter strings. You don’t want that. It’s typically just duplicate content or something that can be assigned in Google Search Console with the parameter section. So any e-commerce out there, definitely check that out and kind of get that all straightened out.
  • Evaluate days, weeks, months: You can evaluate days, weeks, and months that it’s hit. So is there a spike every Wednesday? Is there a spike every month? It’s kind of interesting to know, not totally critical.
  • Evaluate speed and external resources: You can evaluate the speed of the requests and if there’s any external resources that can potentially be cleaned up and speed up the crawling process a bit.
  • Optimize navigation and internal links: You also want to optimize that navigation, like I said earlier, and use that meta noindex.
  • Meta noindex and robots.txt disallow: So if there are things that you don’t want in the index and if there are things that you don’t want to be crawled from your robots.txt, you can add all those things and start to help some of this stuff out as well.
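For the missed-content check a few bullets up, here is a minimal sketch of the sitemap cross-comparison: pull the paths Googlebot actually requested out of the logs (reusing the googlebot_hits parser from earlier) and diff them against the URLs you want crawled. The sitemap parsing is deliberately naive and the domain is a placeholder.

    import re
    import urllib.request

    def sitemap_urls(sitemap_url):
        """Very naive <loc> extraction from an XML sitemap; fine for a quick comparison."""
        xml = urllib.request.urlopen(sitemap_url).read().decode("utf-8")
        return set(re.findall(r"<loc>(.*?)</loc>", xml))

    def crawled_paths(log_path):
        """Paths Googlebot requested, according to the log parser sketched earlier."""
        return {hit["url"] for hit in googlebot_hits(log_path)}

    wanted = {url.replace("https://example.com", "") for url in sitemap_urls("https://example.com/sitemap.xml")}
    seen = crawled_paths("access.log")

    print("In sitemap but never crawled:", sorted(wanted - seen)[:20])
    print("Crawled but not in sitemap:", sorted(seen - wanted)[:20])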

Reevaluate

Lastly, it’s really helpful to connect the crawl data with some of this data. So if you’re using something like Screaming Frog or DeepCrawl, they allow these integrations with different server log files, and it gives you more insight. From there, you just want to reevaluate. So you want to kind of continue this cycle over and over again.

You want to look at what’s going on, have some of your efforts worked, is it being cleaned up, and go from there. So I hope this helps. I know it was a lot, but I want it to be sort of a broad overview of log file analysis. I look forward to all of your questions and comments below. I will see you again soon on another Whiteboard Friday. Thanks.

Video transcription by Speechpad.com



Moz Blog


NEW On-Demand Crawl: Quick Insights for Sales, Prospecting, & Competitive Analysis

Posted by Dr-Pete

In June of 2017, Moz launched our entirely rebuilt Site Crawl, helping you dive deep into crawl issues and technical SEO problems, fix those issues in your Moz Pro Campaigns (tracked websites), and monitor weekly for new issues. Many times, though, you need quick insights outside of a Campaign context, whether you’re analyzing a prospect site before a sales call or trying to assess the competition.

For years, Moz had a lab tool called Crawl Test. The bad news is that Crawl Test never made it to prime-time and suffered from some neglect. The good news is that I’m happy to announce the full launch (as of August 2018) of On-Demand Crawl, an entirely new crawl tool built on the engine that powers Site Crawl, but with a UI designed around quick insights for prospecting and competitive analysis.

While you don’t need a Campaign to run a crawl, you do need to be logged into your Moz Pro subscription. If you don’t have a subscription, you can sign up for a free trial and give it a whirl.

How can you put On-Demand Crawl to work? Let’s walk through a short example together.


All you need is a domain

Getting started is easy. From the “Moz Pro” menu, find “On-Demand Crawl” under “Research Tools”:

Just enter a root domain or subdomain in the box at the top and click the blue button to kick off a crawl. While I don’t want to pick on anyone, I’ve decided to use a real site. Our recent analysis of the August 1st Google update identified some sites that were hit hard, and I’ve picked one (lilluna.com) from that list.

Please note that Moz is not affiliated with Lil’ Luna in any way. For the most part, it seems to be a decent site with reasonably good content. Let’s pretend, just for this post, that you’re looking to help this site out and determine if they’d be a good fit for your SEO services. You’ve got a call scheduled and need to spot-check for any major problems so that you can go into that call as informed as possible.

On-Demand Crawls aren’t instantaneous (crawling is a big job), but they’ll generally finish between a few minutes and an hour. We know these are time-sensitive situations. You’ll soon receive an email that looks like this:

The email includes the number of URLs crawled (On-Demand will currently crawl up to 3,000 URLs), the total issues found, and a summary table of crawl issues by category. Click on the [View Report] link to dive into the full crawl data.


Assess critical issues quickly

We’ve designed On-Demand Crawl to assist your own human intelligence. You’ll see some basic stats at the top, but then immediately move into a graph of your top issues by count. The graph only displays issues that occur at least once on your site – you can click “See More” to show all of the issues that On-Demand Crawl tracks (the top two bars have been truncated)…

Issues are also color-coded by category. Some items are warnings, and whether they matter depends a lot on context. Other issues, like “Critical Errors” (in red), almost always demand attention. So, let’s check out those 404 errors. Scroll down and you’ll see a list of “Pages Crawled” with filters. You’re going to select “4xx” in the “Status Codes” dropdown…

You can then pretty easily spot-check these URLs and find out that they do, in fact, seem to be returning 404 errors. Some appear to be legitimate content that has either internal or external links (or both). So, within a few minutes, you’ve already found something useful.

Let’s look at those yellow “Meta Noindex” errors next. This is a tricky one, because you can’t easily determine intent. An intentional Meta Noindex may be fine. An unintentional one (or hundreds of unintentional ones) could be blocking crawlers and causing serious harm. Here, you’ll filter by issue type…

Like the top graph, issues appear in order of prevalence. You can also filter by all pages that have issues (any issues) or pages that have no issues. Here’s a sample of what you get back (the full table also includes status code, issue count, and an option to view all issues)…

Notice the “?s=” common to all of these URLs. Clicking on a few, you can see that these are internal search pages. These URLs have no particular SEO value, and the Meta Noindex is likely intentional. Good technical SEO is also about avoiding false alarms because you lack internal knowledge of a site. On-Demand Crawl helps you semi-automate and summarize insights to put your human intelligence to work quickly.


Dive deeper with exports

Let’s go back to those 404s. Ideally, you’d like to know where those URLs are showing up. We can’t fit everything into one screen, but if you scroll up to the “All Issues” graph you’ll see an “Export CSV” option…

The export will honor any filters set in the page list, so let’s re-apply that “4xx” filter and pull the data. Your export should download almost immediately. The full export contains a wealth of information, but I’ve zeroed in on just what’s critical for this particular case…

Now, you know not only what pages are missing, but exactly where they link from internally, and can easily pass along suggested fixes to the customer or prospect. Some of these turn out to be link-heavy pages that could probably benefit from some clean-up or updating (if newer recipes are a good fit).

Let’s try another one. You’ve got 8 duplicate content errors. Potentially thin content could fit theories about the August 1st update, so this is worth digging into. If you filter by “Duplicate Content” issues, you’ll see the following message…

The 8 duplicate issues actually represent 18 pages, and the table returns all 18 affected pages. In some cases, the duplicates will be obvious from the title and/or URL, but in this case there’s a bit of mystery, so let’s pull that export file. In this case, there’s a column called “Duplicate Content Group,” and sorting by it reveals something like the following (there’s a lot more data in the original export file)…

I’ve renamed “Duplicate Content Group” to just “Group” and included the word count (“Words”), which could be useful for verifying true duplicates. Look at group #7 – it turns out that these “Weekly Menu Plan” pages are very image heavy and have a common block of text before any unique text. While not 100% duplicated, these otherwise valuable pages could easily look like thin content to Google and represent a broader problem.
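A sketch of that export triage in pandas, assuming column headers along the lines of “URL,” “Duplicate Content Group,” and “Word Count” (the exact headers and filename in your export may differ):

    import pandas as pd

    crawl = pd.read_csv("on-demand-crawl-export.csv")   # hypothetical filename
    dupes = crawl[crawl["Duplicate Content Group"].notna()].copy()
    dupes = dupes.rename(columns={"Duplicate Content Group": "Group", "Word Count": "Words"})

    # Group the duplicates together and sort so thin, near-identical pages surface first
    report = dupes.sort_values(["Group", "Words"])[["Group", "Words", "URL"]]
    print(report.to_string(index=False))
    print(dupes.groupby("Group")["Words"].agg(["count", "mean"]))  # size and thinness of each group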


Real insights in real-time

Not counting the time spent writing the blog post, running this crawl and diving in took less than an hour, and even that small amount of time spent uncovered more potential issues than what I could cover in this post. In less than an hour, you can walk into a client meeting or sales call with in-depth knowledge of any domain.

Keep in mind that many of these features also exist in our Site Crawl tool. If you’re looking for long-term, campaign insights, use Site Crawl (if you just need to update your data, use our “Recrawl” feature). If you’re looking for quick, one-time insights, check out On-Demand Crawl. Standard Pro users currently get 5 On-Demand Crawls per month (with limits increasing at higher tiers).

Your On-Demand Crawls are currently stored for 90 days. When you re-enter the feature, you’ll see a table of all of your recent crawls (the image below has been truncated):

Click on any row to go back to see the crawl data for that domain. If you get the sale and decide to move forward, congratulations! You can port that domain directly into a Moz campaign.

We hope you’ll try On-Demand Crawl out and let us know what you think. We’d love to hear your case studies, whether it’s sales, competitive analysis, or just trying to solve the mysteries of a Google update.



Moz Blog


Best of 2017: MarketingSherpa’s most popular content about email, customer-first marketing, and competitive analysis

To give you that little extra oomph before we cross the line into 2018, here’s a look at some of our readers’ favorite content from the MarketingSherpa Blog this past year.
MarketingSherpa Blog


Don’t Be Fooled by Data: 4 Data Analysis Pitfalls & How to Avoid Them

Posted by Tom.Capper

Digital marketing is a proudly data-driven field. Yet, as SEOs especially, we often have such incomplete or questionable data to work with that we end up jumping to the wrong conclusions in our attempts to substantiate our arguments or quantify our issues and opportunities.

In this post, I’m going to outline 4 data analysis pitfalls that are endemic in our industry, and how to avoid them.

1. Jumping to conclusions

Earlier this year, I conducted a ranking factor study around brand awareness, and I posted this caveat:

“…the fact that Domain Authority (or branded search volume, or anything else) is positively correlated with rankings could indicate that any or all of the following is likely:

  • Links cause sites to rank well
  • Ranking well causes sites to get links
  • Some third factor (e.g. reputation or age of site) causes sites to get both links and rankings”
    ~ Me

However, I want to go into this in a bit more depth and give you a framework for analyzing these yourself, because it still comes up a lot. Take, for example, this recent study by Stone Temple, which you may have seen in the Moz Top 10 or Rand’s tweets, or this excellent article discussing SEMRush’s recent direct traffic findings. To be absolutely clear, I’m not criticizing either of the studies, but I do want to draw attention to how we might interpret them.

Firstly, we do tend to suffer a little confirmation bias — we’re all too eager to call out the cliché “correlation vs. causation” distinction when we see successful sites that are keyword-stuffed, but all too approving when we see studies doing the same with something we think is or was effective, like links.

Secondly, we fail to critically analyze the potential mechanisms. The options aren’t just causation or coincidence.

Before you jump to a conclusion based on a correlation, you’re obliged to consider various possibilities:

  • Complete coincidence
  • Reverse causation
  • Joint causation
  • Linearity
  • Broad applicability

If those don’t make any sense, then that’s fair enough — they’re jargon. Let’s go through an example:

Before I warn you not to eat cheese because you may die in your bedsheets, I’m obliged to check that it isn’t any of the following:

  • Complete coincidence - Is it possible that so many datasets were compared, that some were bound to be similar? Why, that’s exactly what Tyler Vigen did! Yes, this is possible.
  • Reverse causation - Is it possible that we have this the wrong way around? For example, perhaps your relatives, in mourning for your bedsheet-related death, eat cheese in large quantities to comfort themselves? This seems pretty unlikely, so let’s give it a pass. No, this is very unlikely.
  • Joint causation - Is it possible that some third factor is behind both of these? Maybe increasing affluence makes you healthier (so you don’t die of things like malnutrition), and also causes you to eat more cheese? This seems very plausible. Yes, this is possible.
  • Linearity - Are we comparing two linear trends? A linear trend is a steady rate of growth or decline. Any two statistics which are both roughly linear over time will be very well correlated. In the graph above, both our statistics are trending linearly upwards. If the graph was drawn with different scales, they might look completely unrelated, like this, but because they both have a steady rate, they’d still be very well correlated. Yes, this looks likely.
  • Broad applicability - Is it possible that this relationship only exists in certain niche scenarios, or, at least, not in my niche scenario? Perhaps, for example, cheese does this to some people, and that’s been enough to create this correlation, because there are so few bedsheet-tangling fatalities otherwise? Yes, this seems possible.

So we have 4 “Yes” answers and one “No” answer from those 5 checks.

If your example doesn’t get 5 “No” answers from those 5 checks, it’s a fail, and you don’t get to say that the study has established either a ranking factor or a fatal side effect of cheese consumption.
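The linearity point is worth seeing in numbers: any two series that both trend steadily in one direction will correlate almost perfectly, whatever they measure. A tiny, self-contained illustration with invented figures:

    import numpy as np
    from scipy.stats import pearsonr

    years = np.arange(2000, 2010)
    cheese_kg_per_capita = 13.0 + 0.3 * (years - 2000)   # a steady upward trend (invented numbers)
    bedsheet_deaths = 320 + 25 * (years - 2000)           # a different steady upward trend

    r, _p = pearsonr(cheese_kg_per_capita, bedsheet_deaths)
    print(round(r, 4))  # 1.0: perfectly correlated, yet the series share nothing but a linear trend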

A similar process should apply to case studies, which are another form of correlation — the correlation between you making a change, and something good (or bad!) happening. For example, ask:

  • Have I ruled out other factors (e.g. external demand, seasonality, competitors making mistakes)?
  • Did I increase traffic by doing the thing I tried to do, or did I accidentally improve some other factor at the same time?
  • Did this work because of the unique circumstance of the particular client/project?

This is particularly challenging for SEOs, because we rarely have data of this quality, but I’d suggest an additional pair of questions to help you navigate this minefield:

  • If I were Google, would I do this?
  • If I were Google, could I do this?

Direct traffic as a ranking factor passes the “could” test, but only barely — Google could use data from Chrome, Android, or ISPs, but it’d be sketchy. It doesn’t really pass the “would” test, though — it’d be far easier for Google to use branded search traffic, which would answer the same questions you might try to answer by comparing direct traffic levels (e.g. how popular is this website?).

2. Missing the context

If I told you that my traffic was up 20% week on week today, what would you say? Congratulations?

What if it was up 20% this time last year?

What if I told you it had been up 20% year on year, up until recently?

It’s funny how a little context can completely change this. This is another problem with case studies and their evil inverted twin, traffic drop analyses.

If we really want to understand whether to be surprised at something, positively or negatively, we need to compare it to our expectations, and then figure out what deviation from our expectations is “normal.” If this is starting to sound like statistics, that’s because it is statistics — indeed, I wrote about a statistical approach to measuring change way back in 2015.

If you want to be lazy, though, a good rule of thumb is to zoom out, and add in those previous years. And if someone shows you data that is suspiciously zoomed in, you might want to take it with a pinch of salt.

3. Trusting our tools

Would you make a multi-million dollar business decision based on a number that your competitor could manipulate at will? Well, chances are you do, and the number can be found in Google Analytics. I’ve covered this extensively in other places, but there are some major problems with most analytics platforms around:

  • How easy they are to manipulate externally
  • How arbitrarily they group hits into sessions
  • How vulnerable they are to ad blockers
  • How they perform under sampling, and how obvious they make this

For example, did you know that the Google Analytics API v3 can heavily sample data whilst telling you that the data is unsampled, above a certain amount of traffic (roughly 500,000 sessions in the date range)? Neither did I, until we ran into it whilst building Distilled ODN.

Similar problems exist with many “Search Analytics” tools. My colleague Sam Nemzer has written a bunch about this — did you know that most rank tracking platforms report completely different rankings? Or how about the fact that the keywords grouped by Google (and thus tools like SEMRush and STAT, too) are not equivalent, and don’t necessarily have the volumes quoted?

It’s important to understand the strengths and weaknesses of tools that we use, so that we can at least know when they’re directionally accurate (as in, their insights guide you in the right direction), even if not perfectly accurate. All I can really recommend here is that skilling up in SEO (or any other digital channel) necessarily means understanding the mechanics behind your measurement platforms — which is why all new starts at Distilled end up learning how to do analytics audits.

One of the most common solutions to the root problem is combining multiple data sources, but…

4. Combining data sources

There are numerous platforms out there that will “defeat (not provided)” by bringing together data from two or more of:

  • Analytics
  • Search Console
  • AdWords
  • Rank tracking

The problems here are that, firstly, these platforms do not have equivalent definitions, and secondly, ironically, (not provided) tends to break them.

Let’s deal with definitions first, with an example — say we’re looking at the traffic a landing page gets from a given channel:

  • In Search Console, these are reported as clicks, and can be vulnerable to heavy, invisible sampling when multiple dimensions (e.g. keyword and page) or filters are combined.
  • In Google Analytics, these are reported using last non-direct click, meaning that your organic traffic includes a bunch of direct sessions, time-outs that resumed mid-session, etc. That’s without getting into dark traffic, ad blockers, etc.
  • In AdWords, most reporting uses last AdWords click, and conversions may be defined differently. In addition, keyword volumes are bundled, as referenced above.
  • Rank tracking is location specific, and inconsistent, as referenced above.

Fine, though — it may not be precise, but you can at least get to some directionally useful data given these limitations. However, about that “(not provided)”…

Most of your landing pages get traffic from more than one keyword. It’s very likely that some of these keywords convert better than others, particularly if they are branded, meaning that even the most thorough click-through rate model isn’t going to help you. So how do you know which keywords are valuable?

The best answer is to generalize from AdWords data for those keywords, but it’s very unlikely that you have analytics data for all those combinations of keyword and landing page. Essentially, the tools that report on this make the very bold assumption that a given page converts identically for all keywords. Some are more transparent about this than others.

Again, this isn’t to say that those tools aren’t valuable — they just need to be understood carefully. The only way you could reliably fill in these blanks created by “not provided” would be to spend a ton on paid search to get decent volume, conversion rate, and bounce rate estimates for all your keywords, and even then, you’ve not fixed the inconsistent definitions issues.

Bonus peeve: Average rank

I still see this way too often. Three questions:

  1. Do you care more about losing rankings for ten very low-volume queries (10 searches a month or less) than for one high-volume query (millions plus)? If the answer isn’t “yes, I absolutely care more about the ten low-volume queries,” then this metric isn’t for you, and you should consider a visibility metric based on click-through rate estimates (a minimal sketch follows this list).
  2. When you start ranking at 100 for a keyword you didn’t rank for before, does this make you unhappy? If the answer isn’t “yes, I hate ranking for new keywords,” then this metric isn’t for you — because that will lower your average rank. You could of course treat all non-ranking keywords as position 100, as some tools allow, but is a drop of 2 average rank positions really the best way to express that 1/50 of your landing pages have been de-indexed? Again, use a visibility metric, please.
  3. Do you like comparing your performance with your competitors? If the answer isn’t “no, of course not,” then this metric isn’t for you — your competitors may have more or fewer branded keywords or long-tail rankings, and these will skew the comparison. Again, use a visibility metric.
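
For the avoidance of doubt, here’s a minimal sketch of the kind of visibility metric I mean: estimated clicks as search volume multiplied by a click-through rate for the ranking position, summed across keywords. The CTR figures below are illustrative placeholders, not a published curve — swap in whichever CTR model you trust.

```python
# Illustrative placeholder CTRs by ranking position — not a published curve.
CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
                   6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.02}

def visibility(rankings):
    """rankings: list of (monthly_search_volume, position) tuples."""
    return sum(volume * CTR_BY_POSITION.get(position, 0.0)
               for volume, position in rankings)

# One high-volume keyword at position 3 dwarfs ten tiny keywords at position 1,
# which is exactly the behaviour average rank fails to capture.
print(visibility([(1_000_000, 3)]))  # 100000.0
print(visibility([(10, 1)] * 10))    # 30.0
```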

Conclusion

Hopefully, you’ve found this useful. To summarize the main takeaways:

  • Critically analyse correlations & case studies by seeing if you can explain them as coincidences, as reverse causation, as joint causation, through reference to a third mutually relevant factor, or through niche applicability.
  • Don’t look at changes in traffic without looking at the context — what would you have forecasted for this period, and with what margin of error?
  • Remember that the tools we use have limitations, and do your research on how that impacts the numbers they show. “How has this number been produced?” is an important component in “What does this number mean?”
  • If you end up combining data from multiple tools, remember to work out the relationship between them — treat this information as directional rather than precise.

Let me know what data analysis fallacies bug you, in the comments below.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog


How to Do a Competitor Analysis for SEO

Posted by John.Reinesch

Competitive analysis is a key part of the beginning stages of an SEO campaign. Far too often, I see organizations skip this important step and get right into keyword mapping, optimizing content, or link building. But understanding who our competitors are and seeing where they stand can lead to a far more comprehensive understanding of what our goals should be, and it can reveal gaps or blind spots.

By the end of this analysis, you will understand who is winning organic visibility in the industry, what keywords are valuable, and which backlink strategies are working best, all of which can then be utilized to gain and grow your own site’s organic traffic.

Why competitive analysis is important

SEO competitive analysis is critical because it gives us data about which tactics are working in our industry and what we will need to do to start improving our keyword rankings. The insights gained from this analysis help us understand which tasks we should prioritize, and they shape the way we build out our campaigns. By seeing where our competitors are strongest and weakest, we can determine how difficult it will be to outperform them and how many resources it will take to do so.

Identify your competitors

The first step in this process is determining the top four competitors that we want to use for this analysis. I like to use a mixture of direct business competitors (typically provided by my clients) and online search competitors, which can differ from whom a business identifies as its main competitors. Usually, this discrepancy is due to local business competitors versus those who are paying for online search ads. While your client may be concerned about the similar business down the street, their actual online competitor may be a business from a neighboring town or another state.

To find search competitors, I simply enter my own domain name into SEMrush, scroll down to the “Organic Competitors” section, and click “View Full Report.”

The main metrics I use to help me choose competitors are common keywords and total traffic. Once I’ve chosen my competitors for analysis, I open up the Google Sheets Competitor Analysis Template to the “Audit Data” tab and fill in the names and URLs of my competitors in rows 2 and 3.

Use the Google Sheets Competitor Analysis Template

A clear, defined process is critical not only for getting repeatable results, but also for scaling your efforts as you start doing this for multiple clients. We created our Competitor Analysis Template so that we can follow a strategic process and focus more on analyzing the results rather than figuring out what to look for anew each time.

In the Google Sheets Template, I’ve provided you with the data points that we’ll be collecting, the tools you’ll need to do so, and then bucketed the metrics based on similar themes. The data we’re trying to collect relates to SEO metrics like domain authority, how much traffic the competition is getting, which keywords are driving that traffic, and the depth of competitors’ backlink profiles. I have built in a few heatmaps for key metrics to help you visualize who’s the strongest at a glance.

This template is meant to serve as a base that you can alter depending on your client’s specific needs and which metrics you feel are the most actionable or relevant.

Backlink gap analysis

A backlink gap analysis aims to tell us which websites are linking to our competitors, but not to us. This is vital data because it allows us to close the gap between our competitors’ backlink profiles and start boosting our own ranking authority by getting links from websites that already link to competitors. Websites that link to multiple competitors (especially when it is more than three competitors) have a much higher success rate for us when we start reaching out to them and creating content for guest posts.

In order to generate this report, you need to head over to the Moz Open Site Explorer tool and input the first competitor’s domain name. Next, click “Linking Domains” on the left side navigation and then click “Request CSV” to get the needed data.

Next, head to the SEO Competitor Analysis Template, select the “Backlink Import – Competitor 1” tab, and paste in the content of the CSV file. It should look like this:

Repeat this process for competitors 2–4 and then for your own website in the corresponding tabs marked in red.

Once you have all your data in the correct import tabs, the “Backlink Gap Analysis” report tab will populate. The result is a highly actionable report that shows where your competitors are getting their backlinks from, which ones they share in common, and which ones you don’t currently have.

It’s also a good practice to hide all of the “Import” tabs marked in red after you paste the data into them, so the final report has a cleaner look. To do this, just right-click on the tabs and select “Hide Sheet,” so the report only shows the tabs marked in blue and green.

For our clients, we typically gain a few backlinks at the beginning of an SEO campaign just from this data alone. It also serves as a long-term guide for link building in the months to come as getting links from high-authority sites takes time and resources. The main benefit is that we have a starting point full of low-hanging fruit from which to base our initial outreach.
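
The spreadsheet does this gap calculation for you, but if you’d rather script it, here’s a minimal sketch of the same set logic in Python. The file names and the “Root Domain” column name are placeholders — adjust them to match whatever your link tool actually exports.

```python
import csv
from collections import Counter

# Placeholder column name — rename to match your linking-domains export.
def linking_domains(path, column="Root Domain"):
    with open(path, newline="") as f:
        return {row[column].strip().lower() for row in csv.DictReader(f)}

# Placeholder file names for the four competitor exports and our own.
competitors = [linking_domains(f"competitor{i}_linking_domains.csv") for i in range(1, 5)]
ours = linking_domains("our_site_linking_domains.csv")

# Count how many competitors each domain links to, then keep the gap:
# domains linking to two or more competitors, but not to us.
counts = Counter(domain for comp in competitors for domain in comp)
gap = sorted(d for d, n in counts.items() if n >= 2 and d not in ours)

print(f"{len(gap)} domains link to 2+ competitors but not to us")
```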

Keyword gap analysis

Keyword gap analysis is the process of determining which keywords your competitors rank well for that your own website does not. From there, we reverse-engineer why the competition is ranking well and then look at how we can also rank for those keywords. Often, the fix is reworking metadata, adjusting site architecture, revamping an existing piece of content, creating a brand-new piece of content specific to a theme of keywords, or building links to your content containing these desirable keywords.

To create this report, follow a process similar to the backlink gap analysis; only the data source changes. Go to SEMrush again and input your first competitor’s domain name. Then, click on the “Organic Research” positions report in the left-side navigation menu and click “Export” on the right.

Once you download the CSV file, paste the content into the “Keyword Import – Competitor 1” tab and then repeat the process for competitors 2–4 and your own website.

The final report will now populate on the “Keyword Gap Analysis” tab marked in green. It should look like the one below:

This data gives us a starting point to build out complex keyword mapping strategy documents that set the tone for our client campaigns. Rather than just starting keyword research by guessing what we think is relevant, we have hundreds of keywords to start with that we know are relevant to the industry. Our keyword research process then aims to dive deeper into these topics to determine the type of content needed to rank well.

This report also helps drive our editorial calendar, since we often find keywords and topics where we need to create new content to compete with our competitors. We take this a step further during our content planning process, analyzing the content the competitors have created that is already ranking well and using that as a base to figure out how we can do it better. We try to take some of the best ideas from all of the competitors ranking well to then make a more complete resource on the topic.
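
As with the backlink gap, the template does the work, but the underlying logic is simple enough to script if you prefer. Here’s a minimal sketch with pandas; the “Keyword” and “Position” column names are assumptions, so rename them to match your actual export.

```python
import pandas as pd

# Placeholder file and column names — adjust to match your organic positions exports.
competitor = pd.read_csv("competitor1_organic_positions.csv")
our_site = pd.read_csv("our_site_organic_positions.csv")

# Keywords where the competitor ranks in the top 20 but we don't rank at all.
our_keywords = set(our_site["Keyword"].str.lower())
gap = competitor[
    (competitor["Position"] <= 20)
    & (~competitor["Keyword"].str.lower().isin(our_keywords))
]

gap.sort_values("Position").to_csv("keyword_gap_competitor1.csv", index=False)
print(f"{len(gap)} keywords where competitor 1 ranks top 20 and we don't")
```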

Using key insights from the audit to drive your SEO strategy

It is critically important to not just create this report, but also to start taking action based on the data that you have collected. On the first tab of the spreadsheet template, we write in insights from our analysis and then use those insights to drive our campaign strategy.

Some examples of typical insights from this document would be the average number of referring domains that our competitors have and how that relates to our own backlink profile. If we are ahead of our competitors regarding backlinks, content creation might be the focal point of the campaign. If we are behind our competitors in regards to backlinks, we know that we need to start a link building campaign as soon as possible.

Another insight we gain is which competitors are most aggressive in PPC and which keywords they are bidding on. Often, the keywords that they are bidding on have high commercial intent and would be great keywords to target organically and provide a lift to our conversions.

Start implementing competitive analyses into your workflow

Competitive analyses for SEO are not something that should be overlooked when planning a digital marketing strategy. This process can help you strategically build unique and complex SEO campaigns based on readily available data and the demand of your market. This analysis will instantly put you ahead of competitors who are following cookie-cutter SEO programs and not diving deep into their industry. Start implementing this process as soon as you can and adjust it based on what is important to your own business or client’s business.

Don’t forget to make a copy of the spreadsheet template here:

Get the Competitive Analysis Template



Moz Blog


The SEO Competitive Analysis Checklist

Posted by zeehj

The SEO case for competitive analyses

“We need more links!” “I read that user experience (UX) matters more than everything else in SEO, so we should focus solely on UX split tests.” “We just need more keywords on these pages.”

If you dropped a quarter on the sidewalk, but had no light to look for it, would you walk to the next block with a street light to retrieve it? The obvious answer is no, yet many marketers get tunnel vision when it comes to where their efforts should be focused.

[Image: Mutt and Jeff comic strip — Florence Morning News, June 3, 1942, page 7, Florence, South Carolina (NewspaperArchive)]

That’s why I’m sharing a checklist with you today that will allow you to compare your website to your search competitors and identify your site’s strengths, weaknesses, and potential opportunities based on ranking factors we know are important.

If you’re unconvinced that good SEO is really just digital marketing, I’ll let AJ Kohn persuade you otherwise. As any good SEO (or even keyword research newbie) knows, it’s crucial to understand the effort involved in ranking for a specific term before you begin optimizing for it.

It’s easy to get frustrated when stakeholders ask how to rank for a specific term, and solely focus on content to create, or on-page optimizations they can make. Why? Because we’ve known for a while that there are myriad factors that play into search engine rank. Depending on the competitive search landscape, there may not be any amount of “optimizing” that you can do in order to rank for a specific term.

The story that I’ve been able to tell my clients is one of hidden opportunity, but the only way to expose these undiscovered gems is to broaden your SEO perspective beyond search engine results page (SERP) position and best practices. And the place to begin is with a competitive analysis.

Competitive analyses help you evaluate your competition’s strategies to determine their strengths and weaknesses relative to your brand. When it comes to digital marketing and SEO, however, there are so many ranking factors and best practices to consider that it can be hard to know where to begin. That’s why my colleague, Ben Estes, created a competitive analysis checklist (not dissimilar to his wildly popular technical audit checklist) that I’ve souped up for the Moz community.

This checklist is broken out into sections that reflect key elements from our Balanced Digital Scorecard. As previously mentioned, this checklist is to help you identify opportunities (and possibly areas not worth your time and budget). But this competitive analysis is not prescriptive in and of itself. It should be used as its name suggests: to analyze what your competition’s “edge” is.

Methodology

Choosing competitors

Before you begin, you’ll need to identify six brands to compare your website against. These should be your search competitors (who else is ranking for terms that you’re ranking for, or would like to rank for?) in addition to a business competitor (or two). Don’t know who your search competition is? You can use SEMRush and Searchmetrics to identify them, and if you want to be extra thorough you can use this Moz post as a guide.

Sample sets of pages

For each site, you’ll need to select five URLs to serve as your sample set. These are the pages you will review and evaluate against the competitive analysis items. When selecting a sample set, I always include:

  • The brand’s homepage,
  • Two “product” pages (or an equivalent),
  • One to two “browse” pages, and
  • A page that serves as a hub for news/informative content.

Make sure the sites have equivalent pages to one another, for a fair comparison.

Scoring

The scoring options for each checklist item range from zero to four, and are determined relative to each competitor’s performance. This means that a score of two serves as the average performance in that category.

For example, if each sample set has one unique H1 tag per page, then each competitor would get a score of two for H1s appear technically optimized. However, if a site breaks one (or more) of the requirements below, it should receive a score of zero or one:

  1. One or more pages within sample set contains more than one H1 tag on it, and/or
  2. H1 tags are duplicated across a brand’s sample set of pages.

Checklist

Platform (technical optimization)

Title tags appear technically optimized. This measurement should be as quantitative as possible, and refer only to technical SEO rather than its written quality. Evaluate the sampled pages based on:

  • Only one title tag per page,
  • The title tag being correctly placed within the head tags of the page, and
  • Few to no extraneous tags within the title (e.g. ideally no inline CSS, and few to no span tags).

H1s appear technically optimized. As with the title tags, this is another quantitative measure: make sure the H1 tags on your sample pages are sound by technical SEO standards (and not based on writing quality). You should look for the following (a quick scripted check covering both titles and H1s is sketched after this list):

  • Only one H1 tag per page, and
  • Few to no extraneous tags within the tag (e.g. ideally no inline CSS, and few to no span tags).
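
Here’s a minimal sketch of how you might automate the purely technical parts of the title and H1 checks across your sample pages, using requests and BeautifulSoup. The URLs are placeholders, and the editorial side of the evaluation still needs a human.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URLs — swap in your actual sample set for each competitor.
sample_pages = [
    "https://www.example.com/",
    "https://www.example.com/product-a",
]

for url in sample_pages:
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    # Technical checks only: exactly one <title> inside <head>, exactly one <h1>.
    titles_in_head = soup.head.find_all("title") if soup.head else []
    h1s = soup.find_all("h1")
    print(f"{url}: {len(titles_in_head)} title tag(s) in <head>, {len(h1s)} h1 tag(s)")
```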

Internal linking allows indexation of content. Observe the internal outlinks on your sample pages, apart from the sites’ navigation and footer links. This line item serves to check that the domains are consolidating their crawl budgets by linking to discoverable, indexable content on their websites. Here is an easy-to-use Chrome plugin from fellow Distiller Dom Woodman to see whether the pages are indexable.

To get a score of “2” or more, your sample pages should link to pages that meet the following criteria (a scripted spot-check is sketched after this list):

  • Produce 200 status codes (for all, or nearly all), and
  • Have no more than ~300 outlinks per page (including the navigation and footer links).
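
Here’s a minimal sketch of that spot-check: count a sampled page’s outlinks and check that a sample of its internal links resolves to 200s. It makes no attempt to separate navigation and footer links, so compare the total against the ~300 figure above, and treat the URL as a placeholder.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

page = "https://www.example.com/product-a"  # placeholder URL
soup = BeautifulSoup(requests.get(page, timeout=10).text, "html.parser")

# Count all outlinks (nav and footer included), then isolate internal ones.
links = [urljoin(page, a["href"]) for a in soup.find_all("a", href=True)]
internal = [l for l in links if urlparse(l).netloc == urlparse(page).netloc]
print(f"{len(links)} outlinks total ({len(internal)} internal)")

# Spot-check a sample of internal links rather than hammering the whole site.
for url in internal[:25]:
    status = requests.head(url, allow_redirects=True, timeout=10).status_code
    if status != 200:
        print(f"  {status} -> {url}")
```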

Schema markup present. This is an easy check. Using Google’s Structured Data Testing Tool, look to see whether these pages have any schema markup implemented, and if so, whether it is correct. In order to receive a score of “2” here, your sampled pages need:

  • To have schema markup present, and
  • Be error-free.

Quality of schema is definitely important, and can make the difference between a brand receiving a score of “3” or a “4.” Elements to keep in mind are: Organization or Website markup on every sample page, customized markup like BlogPosting or Article on editorial content, and Product markup on product pages.

There is a “home” for newly published content. A hub for new content can be the site’s blog, or a news section. For instance, Distilled’s “home for newly published content” is the Resources section. While this line item may seem like a binary (score of “0” if you don’t have a dedicated section for new content, or score of “2” if you do), there are nuances that can bring each brand’s score up or down. For example:

  • Is the home for new content unclear, or difficult to find? Approach this exercise as though you are a new visitor to the site.
  • Does there appear to be more than one “home” of new content?
  • If there is a content hub, is it apparent that this is for newly published pieces?

We’re not obviously messing up technical SEO. This score is based partly on each brand’s performance on the preceding line items (mainly Title tags appear technically optimized through Schema markup present).

It would be unreasonable to run a full technical audit of each competitor, but take into account your own site’s technical SEO performance if you know there are outstanding technical issues to be addressed. In addition to the previous checklist items, I also like to use these Chrome extensions from Ayima: Page Insights and Redirect Path. These can provide quick checks for common technical SEO errors.

Content

Title tags appear optimized (editorially). Here is where we can add more context to the overall quality of the sample pages’ titles. Even if they are technically optimized, the titles may not be optimized for distinctiveness or written quality. Note that we are not evaluating keyword targeting, but rather making a holistic (and broad) evaluation of how each competitor’s site approaches SEO factors. Evaluate each page’s title on those editorial qualities — distinctiveness and overall written quality — in the same spirit as the H1 criteria below.

H1s appear optimized (editorially). The same rules that apply to titles for editorial quality also apply to H1 tags. Review each sampled page’s H1 for:

  • A unique H1 tag per page (language in H1 tags does not repeat),
  • H1 tags that are discrete from their page’s title, and
  • H1s that represent the content on the page.

Internal linking supports organic content. Here you must look for internal outlinks outside of each site’s header and footer links. This evaluation is not based on the number of unique internal links on each sampled page, but rather on the quality of the pages to which our brands are linking.

While “organic content” is a broad term (and invariably differs by business vertical), here are some guidelines:

  • Look for links to informative pages like tutorials, guides, research, or even think pieces.
    • The blog posts on Moz (including this very one) are good examples of organic content.
  • Internal links should naturally continue the user’s journey, so look for topical progression in each site’s internal links.
  • Links to service pages, products, RSVP, or email subscription forms are not examples of organic content.
  • Make sure the internal links vary. If sampled pages are repeatedly linking to the same resources, this will only benefit those few pages.
    • This doesn’t mean that you should penalize a brand for linking to the same resource two, three, or even four times over. Use your best judgment when observing the sampled pages’ linking strategies.

Appropriate informational content. You can use the found “organic content” from your sample sets (and the samples themselves) to review whether the site is producing appropriate informational content.

What does that mean, exactly?

  • The content produced obviously fits within the site’s business vertical, area of expertise, or cause.
    • Example: Moz’s SEO and Inbound Marketing Blog is an appropriate fit for an SEO company.
  • The content on the site isn’t overly self-promotional, resulting in an average user not trusting this domain to produce unbiased information.
    • Example: If Distilled produced a list of “Best Digital Marketing Agencies,” it’s highly unlikely that users would find it trustworthy given our inherent bias!

Quality of content. Highly subjective, yes, but remember: you’re comparing brands against each other. Here’s what you need to evaluate here:

  • Are “informative” pages discussing complex topics under 400 words?
  • Do you want to read the content?
  • Largely, do the pages seem well-written and full of valuable information?
    • Conversely, are the sites littered with “listicles,” or full of generic info you can find in millions of other places online?

Quality of images/video. Also highly subjective (but again, compare your site to your competitors, and be brutally honest). Judge each site’s media items based on:

  • Resolution (do the images or videos appear to be high quality? Grainy?),
  • Whether they are unique (do the images or videos appear to be from stock resources?),
  • Whether the photos or videos are repeated on multiple sample pages.

Audience (engagement and sharing of content)

Number of linking root domains. This factor is exclusively based on the total number of dofollow linking root domains (LRDs) to each domain (not total backlinks).

You can pull this number from Moz’s Open Site Explorer (OSE) or from Ahrefs. Since this measurement is only the total number of LRDs to each competitor, you don’t need to graph them. However, you will have an opportunity to display the sheer quantity of links by their domain authority in the next checklist item.

Quality of linking root domains. Here is where we get to the quality of each site’s LRDs. Using the same LRD data you exported from either Moz’s OSE or Ahrefs, you can bucket each brand’s LRDs by domain authority and count the total LRDs by DA. Log these into this third sheet, and you’ll have a graph that illustrates their overall LRD quality (and will help you grade each domain).
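
If you’d rather do the bucketing programmatically before logging it in the sheet, here’s a minimal sketch with pandas. The “Domain Authority” column name (and the file name) are assumptions — rename them to match your actual LRD export.

```python
import pandas as pd

# Placeholder file and column names — adjust to match your LRD export.
lrds = pd.read_csv("competitor1_linking_domains.csv")

# Bucket linking root domains into DA bands of ten and count each band.
bins = range(0, 101, 10)
labels = [f"{b}-{b + 10}" for b in bins[:-1]]
lrds["DA bucket"] = pd.cut(lrds["Domain Authority"], bins=bins,
                           labels=labels, include_lowest=True)

print(lrds["DA bucket"].value_counts().sort_index())
```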

Other people talk about our content. I like to use BuzzSumo for this checklist item. BuzzSumo allows you to see what sites have written about a particular topic or company. You can even refine your search to include or exclude certain terms as necessary.

You’ll need to set a timeframe to collect this information. Set this to the past year to account for seasonality.

Actively promoting content. Using BuzzSumo again, you can alter your search to find how many of each domain’s URLs have been shared on social networks. While this isn’t an explicit ranking factor, strong social media marketing is correlated with good SEO. Keep the timeframe to one year, same as above.

Creating content explicitly for organic acquisition. This line item may seem similar to Appropriate informational content, but its purpose is to examine whether the competitors create pages to target keywords users are searching for.

Plug the same URLs from your found “organic content” into SEMrush, and note whether they are ranking for non-branded keywords. You can grade the competitors on whether (and how many of) the sampled pages are ranking for any non-branded terms, and weight them based on their relative rank positions.

Conversion

You should treat this section as a UX exercise. Visit each competitor’s sampled URLs as though they are your landing page from search. Is it clear what the calls to action are? What is the next logical step in your user journey? Does it feel like you’re getting the right information, in the right order as you click through?

Clear CTAs on site. Of your sample pages, examine what the calls to action (CTAs) are. This is largely UX-based, so use your best judgment when evaluating whether they seem easy to understand. For inspiration, take a look at these examples of CTAs.

Conversions appropriate to several funnel steps. This checklist item asks you to determine whether the funnel steps towards conversion feel like the correct “next step” from the user’s standpoint.

Even if you are not a UX specialist, you can assess each site as though you are a first-time user. Document areas on the pages where you feel frustrated or confused (and where you don’t). User behavior is a ranking signal, so while this is a qualitative measurement, it can help you understand the UX for each site.

CTAs match user intent inferred from content. Here is where you’ll evaluate whether the CTAs match the user intent from the content as well as the CTA language. For instance, if a CTA prompts a user to click “for more information,” and takes them to a subscription page, the visitor will most likely be confused or irritated (and, in reality, will probably leave the site).


This analysis should help you holistically identify areas of opportunity available in your search landscape, without having to guess which “best practice” you should test next. Once you’ve started this competitive analysis, trends among the competition will emerge, and expose niches where your site can improve and potentially outpace your competition.

Kick off your own SEO competitive analysis and comment below on how it goes! If this process is your jam, or you’d like to argue with it, come see me speak about these competitive analyses and the campaigns they’ve inspired at SearchLove London. Bonus? If you use that link, you’ll get £50 off your tickets.



Moz Blog


Competitive analysis: Making your auction insights work for you

Columnist Amy Bishop shares tips for identifying actionable takeaways from your AdWords auction insights data.

The post Competitive analysis: Making your auction insights work for you appeared first on Search Engine Land.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing


Link building: Preliminary research and analysis

Columnist Andrew Dennis outlines the research necessary to provide a solid foundation for your link-building strategy.

The post Link building: Preliminary research and analysis appeared first on Search Engine Land.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing

