Tag Archive | "Crawl"

Search Buzz Video Recap: Google Search Algorithm Update, Crawl Spikes, Mobile Indexing, Movie Listings, Maps & More

This week in search, I covered a possible Google search ranking update that touched down on September 4th or 5th. Before that, we saw a spike in crawling activity from Googlebot. I also posted the monthly Google Webmaster Report…


Search Engine Roundtable


Internal Linking & Mobile First: Large Site Crawl Paths in 2018 & Beyond

Posted by Tom.Capper

By now, you’ve probably heard as much as you can bear about mobile-first indexing. For me, though, one topic has been conspicuously missing from the discussion: the impact on internal linking and on previous internal linking best practices.

In the past, there have been a few popular methods for providing crawl paths for search engines — bulky main navigations, HTML sitemap-style pages that exist purely for internal linking, or blocks of links at the bottom of indexed pages. Larger sites have typically used at least two or often three of these methods. I’ll explain in this post why all of these are now looking pretty shaky, and what I suggest you do about it.

Quick refresher: WTF are “internal linking” & “mobile-first,” Tom?

Internal linking is and always has been a vital component of SEO — it’s easy to forget in all the noise about external link building that some of our most powerful tools to affect the link graph are right under our noses. If you’re looking to brush up on internal linking in general, it’s a topic that gets pretty complex pretty quickly, but there are a couple of resources I can recommend to get started:

I’ve also written in the past that links may be mattering less and less as a ranking factor for the most competitive terms, and though that may be true, they’re still the primary way you qualify for that competition.

A great example I’ve seen recently of what happens if you don’t have comprehensive internal linking is eflorist.co.uk. (Disclaimer: eFlorist is not a client or prospective client of Distilled, nor are any other sites mentioned in this post)

eFlorist has local landing pages for all sorts of locations, targeting queries like “Flower delivery in [town].” However, even though these pages are indexed, they’re not linked to internally. As a result, if you search for something like “flower delivery in London,” despite eFlorist having a page targeted at this specific query (which can be found pretty much only through use of advanced search operators), they end up ranking on page 2 with their “flowers under £30” category page:

¯\_(ツ)_/¯

If you’re looking for a reminder of what mobile-first indexing is and why it matters, these are a couple of good posts to bring you up to speed:

In short, though, Google is increasingly looking at pages as they appear on mobile for all the things it was previously using desktop pages for — namely, establishing ranking factors, the link graph, and SEO directives. You may well have already seen an alert from Google Search Console telling you your site has been moved over to primarily mobile indexing, but if not, it’s likely not far off.

Get to the point: What am I doing wrong?

If you have more than a handful of landing pages on your site, you’ve probably given some thought in the past to how Google can find them and how to make sure they get a good chunk of your site’s link equity. A rule of thumb often used by SEOs is how many clicks a landing page is from the homepage, also known as “crawl depth.”
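If you want to put a rough number on crawl depth for your own site, a simple breadth-first crawl from the homepage is enough. Here's a minimal sketch (the start URL and page cap are placeholders, and it only follows plain HTML links):

```python
# Minimal sketch: measure "crawl depth" (clicks from the homepage) with a
# breadth-first crawl. Assumes `requests` and `beautifulsoup4` are installed;
# the start URL and page limit are placeholders, not anything from the post.
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START = "https://www.example.com/"   # hypothetical homepage
MAX_PAGES = 500                      # keep the crawl polite and bounded

def crawl_depths(start=START, max_pages=MAX_PAGES):
    host = urlparse(start).netloc
    depths = {start: 0}
    queue = deque([start])
    while queue and len(depths) < max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue
        for a in BeautifulSoup(html, "html.parser").find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            # stay on the same host; BFS means the first discovery is the shallowest path
            if urlparse(link).netloc == host and link not in depths:
                depths[link] = depths[url] + 1
                queue.append(link)
    return depths

if __name__ == "__main__":
    for page, depth in sorted(crawl_depths().items(), key=lambda kv: kv[1])[:20]:
        print(depth, page)
```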

Mobile-first indexing impacts this on two fronts:

  1. Some of your links aren’t present on mobile (as is common), so your internal linking simply won’t work in a world where Google goes primarily by the mobile version of your page
  2. If your links are visible on mobile, they may be hideous or overwhelming to users, given the reduced on-screen real estate vs. desktop

If you don’t believe me on the first point, check out this Twitter conversation between Will Critchlow and John Mueller:

In particular, that section I’ve underlined in red should be of concern — it’s unclear how much time we have, but sooner or later, if your internal linking on the mobile version of your site doesn’t cut it from an SEO perspective, neither does your site.
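If you want to check the first point for yourself, comparing the links served to a desktop user agent against those served to a mobile one is a reasonable first pass. A rough sketch (it only catches differences in the served HTML, such as dynamic serving or separate mobile URLs, not links hidden with CSS or removed by JavaScript; the URL and user-agent strings are placeholders):

```python
# Rough sketch: compare which links appear in the HTML served to a desktop
# user agent vs. a mobile one. This only catches dynamic serving or separate
# mobile HTML; links hidden via CSS or removed client-side won't show up here.
import requests
from bs4 import BeautifulSoup

DESKTOP_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
MOBILE_UA = "Mozilla/5.0 (Linux; Android 10; Pixel 3) AppleWebKit/537.36 Mobile"

def links(url, user_agent):
    html = requests.get(url, headers={"User-Agent": user_agent}, timeout=10).text
    return {a["href"] for a in BeautifulSoup(html, "html.parser").find_all("a", href=True)}

def compare(url):
    desktop, mobile = links(url, DESKTOP_UA), links(url, MOBILE_UA)
    print(f"Links only in the desktop HTML ({len(desktop - mobile)}):")
    for href in sorted(desktop - mobile):
        print("  ", href)

compare("https://www.example.com/")  # hypothetical page to check
```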

And for the links that do remain visible, an internal linking structure that can be rationalized on desktop can quickly look overbearing on mobile. Check out this example from Expedia.co.uk’s “flights to London” landing page:

Many of these links are part of the site-wide footer, but they vary according to what page you’re on. For example, on the “flights to Australia” page, you get different links, allowing a tree-like structure of internal linking. This is a common tactic for larger sites.

In this example, there’s more unstructured linking both above and below the section screenshotted. For what it’s worth, although it isn’t pretty, I don’t think this is terrible, but it’s also not the sort of thing I can be particularly proud of when I go to explain to a client’s UX team why I’ve asked them to ruin their beautiful page design for SEO reasons.

I mentioned earlier that there are three main methods of establishing crawl paths on large sites: bulky main navigations, HTML-sitemap-style pages that exist purely for internal linking, or blocks of links at the bottom of indexed pages. I’ll now go through these in turn, and take a look at where they stand in 2018.

1. Bulky main navigations: Fail to scale

The most extreme example I was able to find of this is from Monoprice.com, with a huge 711 links in the sitewide top-nav:

Here’s how it looks on mobile:

This is actually fairly usable, but you have to consider the implications of having this many links on every page of your site — this isn’t going to concentrate equity where you need it most. In addition, you’re potentially asking customers to do a lot of work in terms of finding their way around such a comprehensive navigation.

I don’t think mobile-first indexing changes the picture here much; it’s more that this was never the answer in the first place for sites above a certain size. Many sites have tens of thousands (or more) of landing pages to worry about, not hundreds. So simply using the main navigation is not a realistic option, let alone an optimal one, for creating crawl paths and distributing equity in a proportionate or targeted way.

2. HTML sitemaps: Ruined by the counterintuitive equivalence of noindex,follow & noindex,nofollow

This is a slightly less common technique these days, but still used reasonably widely. Take this example from Auto Trader UK:

The idea is that this page is linked to from Auto Trader’s footer, and allows link equity to flow through into deeper parts of the site.

However, there’s a complication: in an ideal world, this page would be “noindex,follow.” It turns out, though, that over time, Google ends up treating “noindex,follow” like “noindex,nofollow.” It’s not 100% clear what John Mueller meant by this, but it does make sense that, given the low crawl priority of “noindex” pages, Google could eventually stop crawling them altogether, causing them to behave in effect like “noindex,nofollow.” Anecdotally, this is also how third-party crawlers like Moz and Majestic behave, and it’s how I’ve seen Google behave with test pages on my personal site.

That means that at best, Google won’t discover new links you add to your HTML sitemaps, and at worst, it won’t pass equity through them either. The jury is still out on this worst case scenario, but it’s not an ideal situation in either case.

So, you have to index your HTML sitemaps. For a large site, this means you’re indexing potentially dozens or hundreds of pages that are just lists of links. It is a viable option, but if you care about the quality and quantity of pages you’re allowing into Google’s index, it might not be an option you’re so keen on.
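Whichever way you go, it's worth confirming what robots directives those sitemap pages actually serve. A quick sketch (the URLs are placeholders, and it ignores the X-Robots-Tag header variant):

```python
# Quick sketch: report the robots meta directives a list of pages serves, so you
# can confirm whether an HTML sitemap page is indexable, "noindex,follow", etc.
# URLs are placeholders; note this ignores the X-Robots-Tag HTTP header variant.
import requests
from bs4 import BeautifulSoup

PAGES = [
    "https://www.example.com/sitemap-page-1",   # hypothetical HTML sitemap pages
    "https://www.example.com/sitemap-page-2",
]

for url in PAGES:
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    tag = soup.find("meta", attrs={"name": "robots"})
    directive = tag["content"] if tag and tag.has_attr("content") else "no robots meta (defaults to index,follow)"
    print(url, "->", directive)
```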

3. Link blocks on landing pages: Good, bad, and ugly, all at the same time

I already mentioned that example from Expedia above, but here’s another extreme example from the Kayak.co.uk homepage:

Example 1

Example 2

It’s no coincidence that both these sites come from the travel search vertical, where having to sustain a massive number of indexed pages is a major challenge. Just like their competitor, Kayak have perhaps gone overboard in the sheer quantity here, but they’ve taken it an interesting step further — notice that the links are hidden behind dropdowns.

This is something Bridget Randolph covered in the post I mentioned above, and I agree so much that I’m just going to quote her verbatim:

Note that with mobile-first indexing, content which is collapsed or hidden in tabs, etc. due to space limitations will not be treated differently than visible content (as it may have been previously), since this type of screen real estate management is actually a mobile best practice.

Combined with a more sensible quantity of internal linking, and taking advantage of the significant height of many mobile landing pages (i.e., this needn’t be visible above the fold), this is probably the most broadly applicable method for deep internal linking at your disposal going forward. As always, though, we need to be careful as SEOs not to see a working tactic and rush to push it to its limits — usability and moderation are still important, just as with overburdened main navigations.

Summary: Bite the on-page linking bullet, but present it well

Overall, the most scalable method for getting large numbers of pages crawled, indexed, and ranking on your site is going to be on-page linking — simply because you already have a large number of pages to place the links on, and in all likelihood a natural “tree” structure, by the very nature of the problem.

Top navigations and HTML sitemaps have their place, but lack the scalability or finesse to deal with this situation, especially given what we now know about Google’s treatment of “noindex,follow” tags.

However, the more we emphasize mobile experience, while simultaneously relying on this method, the more we need to be careful about how we present it. In the past, as SEOs, we might have been fairly nervous about placing on-page links behind tabs or dropdowns, just because it felt like deceiving Google. And on desktop, that might be true, but on mobile, this is increasingly going to become best practice, and we have to trust Google to understand that.

All that said, I’d love to hear your strategies for grappling with this — let me know in the comments below!

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


Moz Blog


NEW On-Demand Crawl: Quick Insights for Sales, Prospecting, & Competitive Analysis

Posted by Dr-Pete

In June of 2017, Moz launched our entirely rebuilt Site Crawl, helping you dive deep into crawl issues and technical SEO problems, fix those issues in your Moz Pro Campaigns (tracked websites), and monitor weekly for new issues. Many times, though, you need quick insights outside of a Campaign context, whether you’re analyzing a prospect site before a sales call or trying to assess the competition.

For years, Moz had a lab tool called Crawl Test. The bad news is that Crawl Test never made it to prime-time and suffered from some neglect. The good news is that I’m happy to announce the full launch (as of August 2018) of On-Demand Crawl, an entirely new crawl tool built on the engine that powers Site Crawl, but with a UI designed around quick insights for prospecting and competitive analysis.

While you don’t need a Campaign to run a crawl, you do need to be logged into your Moz Pro subscription. If you don’t have a subscription, you can sign up for a free trial and give it a whirl.

How can you put On-Demand Crawl to work? Let’s walk through a short example together.


All you need is a domain

Getting started is easy. From the “Moz Pro” menu, find “On-Demand Crawl” under “Research Tools”:

Just enter a root domain or subdomain in the box at the top and click the blue button to kick off a crawl. While I don’t want to pick on anyone, I’ve decided to use a real site. Our recent analysis of the August 1st Google update identified some sites that were hit hard, and I’ve picked one (lilluna.com) from that list.

Please note that Moz is not affiliated with Lil’ Luna in any way. For the most part, it seems to be a decent site with reasonably good content. Let’s pretend, just for this post, that you’re looking to help this site out and determine if they’d be a good fit for your SEO services. You’ve got a call scheduled and need to spot-check for any major problems so that you can go into that call as informed as possible.

On-Demand Crawls aren’t instantaneous (crawling is a big job), but they’ll generally take anywhere from a few minutes to an hour to finish. We know these are time-sensitive situations. You’ll soon receive an email that looks like this:

The email includes the number of URLs crawled (On-Demand will currently crawl up to 3,000 URLs), the total issues found, and a summary table of crawl issues by category. Click on the [View Report] link to dive into the full crawl data.


Assess critical issues quickly

We’ve designed On-Demand Crawl to assist your own human intelligence. You’ll see some basic stats at the top, but then immediately move into a graph of your top issues by count. The graph only displays issues that occur at least once on your site – you can click “See More” to show all of the issues that On-Demand Crawl tracks (the top two bars have been truncated)…

Issues are also color-coded by category. Some items are warnings, and whether they matter depends a lot on context. Other issues, like “Critical Errors” (in red), almost always demand attention. So, let’s check out those 404 errors. Scroll down and you’ll see a list of “Pages Crawled” with filters. You’re going to select “4xx” in the “Status Codes” dropdown…

You can then pretty easily spot-check these URLs and find out that they do, in fact, seem to be returning 404 errors. Some appear to be legitimate content that has either internal or external links (or both). So, within a few minutes, you’ve already found something useful.
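If you'd rather script that spot-check, a few lines against the exported crawl data will do it. A rough sketch, assuming a CSV export with "URL" and "Status Code" columns (the filename and column names are assumptions about the export format):

```python
# Sketch: read a crawl export, keep the 4xx rows, and re-request each URL to
# confirm it still errors. The CSV filename and the "URL" / "Status Code"
# column names are assumptions about the export format.
import csv
import requests

with open("crawl-export.csv", newline="", encoding="utf-8") as f:
    rows = [r for r in csv.DictReader(f) if r.get("Status Code", "").startswith("4")]

for row in rows:
    try:
        live_status = requests.head(row["URL"], allow_redirects=True, timeout=10).status_code
    except requests.RequestException:
        live_status = "request failed"
    print(row["URL"], "crawled as", row["Status Code"], "| now returns", live_status)
```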

Let’s look at those yellow “Meta Noindex” errors next. This is a tricky one, because you can’t easily determine intent. An intentional Meta Noindex may be fine. An unintentional one (or hundreds of unintentional ones) could be blocking crawlers and causing serious harm. Here, you’ll filter by issue type…

Like the top graph, issues appear in order of prevalence. You can also filter by all pages that have issues (any issues) or pages that have no issues. Here’s a sample of what you get back (the full table also includes status code, issue count, and an option to view all issues)…

Notice the “?s=” common to all of these URLs. Clicking on a few, you can see that these are internal search pages. These URLs have no particular SEO value, and the Meta Noindex is likely intentional. Good technical SEO is also about avoiding false alarms because you lack internal knowledge of a site. On-Demand Crawl helps you semi-automate and summarize insights to put your human intelligence to work quickly.


Dive deeper with exports

Let’s go back to those 404s. Ideally, you’d like to know where those URLs are showing up. We can’t fit everything into one screen, but if you scroll up to the “All Issues” graph you’ll see an “Export CSV” option…

The export will honor any filters set in the page list, so let’s re-apply that “4xx” filter and pull the data. Your export should download almost immediately. The full export contains a wealth of information, but I’ve zeroed in on just what’s critical for this particular case…

Now, you know not only what pages are missing, but exactly where they link from internally, and can easily pass along suggested fixes to the customer or prospect. Some of these turn out to be link-heavy pages that could probably benefit from some clean-up or updating (if newer recipes are a good fit).

Let’s try another one. You’ve got 8 duplicate content errors. Potentially thin content could fit theories about the August 1st update, so this is worth digging into. If you filter by “Duplicate Content” issues, you’ll see the following message…

The 8 duplicate issues actually represent 18 pages, and the table returns all 18 affected pages. In some cases, the duplicates will be obvious from the title and/or URL, but in this case there’s a bit of mystery, so let’s pull that export file. In this case, there’s a column called “Duplicate Content Group,” and sorting by it reveals something like the following (there’s a lot more data in the original export file)…

I’ve renamed “Duplicate Content Group” to just “Group” and included the word count (“Words”), which could be useful for verifying true duplicates. Look at group #7 – it turns out that these “Weekly Menu Plan” pages are very image heavy and have a common block of text before any unique text. While not 100% duplicated, these otherwise valuable pages could easily look like thin content to Google and represent a broader problem.
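If you prefer to do that grouping programmatically, here's a minimal sketch that summarizes the export by group and word count (the filename and column names are assumptions about the export format):

```python
# Sketch: summarize a duplicate-content export by group, using word count as a
# sanity check on how "duplicate" each cluster really is. The column names
# ("Duplicate Content Group", "URL", "Word Count") are assumptions about the export.
import csv
from collections import defaultdict

groups = defaultdict(list)
with open("duplicate-content-export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        groups[row["Duplicate Content Group"]].append(row)

for group_id, pages in sorted(groups.items()):
    words = [int(p.get("Word Count", 0) or 0) for p in pages]
    print(f"Group {group_id}: {len(pages)} pages, word counts {words}")
    for p in pages:
        print("   ", p["URL"])
```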


Real insights in real-time

Not counting the time spent writing this blog post, running the crawl and diving in took less than an hour, and even that small amount of time uncovered more potential issues than I could cover in this post. In less than an hour, you can walk into a client meeting or sales call with in-depth knowledge of any domain.

Keep in mind that many of these features also exist in our Site Crawl tool. If you’re looking for long-term, campaign insights, use Site Crawl (if you just need to update your data, use our “Recrawl” feature). If you’re looking for quick, one-time insights, check out On-Demand Crawl. Standard Pro users currently get 5 On-Demand Crawls per month (with limits increasing at higher tiers).

Your On-Demand Crawls are currently stored for 90 days. When you re-enter the feature, you’ll see a table of all of your recent crawls (the image below has been truncated):

Click on any row to go back to see the crawl data for that domain. If you get the sale and decide to move forward, congratulations! You can port that domain directly into a Moz campaign.

We hope you’ll try On-Demand Crawl out and let us know what you think. We’d love to hear your case studies, whether it’s sales, competitive analysis, or just trying to solve the mysteries of a Google update.



Moz Blog


Announcing 5 NEW Feature Upgrades to Moz Pro’s Site Crawl, Including Pixel-Length Title Data

Posted by Dr-Pete

While Moz is hard at work on some major new product features (we’re hoping for two more big launches in 2017), we’re also working hard to iterate on recent advances. I’m happy to announce that, based on your thoughtful feedback, and our own ever-growing wish lists, we’ve recently launched five upgrades to Site Crawl.

1. Mark Issues as Fixed

It’s fine to ignore issues that don’t matter to your site or business, but many of you asked for a way to audit fixes or just let us know that you’ve made a fix prior to our next data update. So, from any issues page, you can now select items and “Mark as fixed” (screens below edited for content).

Fixed items will immediately be highlighted and, like Ignored issues, can be easily restored…

Unlike the “Ignore” feature, we’ll also monitor these issues for you and warn you if they reappear. In a perfect world, you’d fix an issue once and be done, but we all know that real web development just doesn’t work out that way.

2. View/Ignore/Fix More Issues

When we launched the “Ignore” feature, many of you were very happy (it was, frankly, long overdue), until you realized you could only ignore issues in chunks of 25 at a time. We have heard you loud and clear (seriously, Carl, stop calling) and have taken two steps. First, you can now view, ignore, and fix issues 100 at a time. This is the default – no action or extra clicks required.

3. Ignore Issues by Type

Second, you can now ignore entire issue types. Let’s say, for example, that Moz.com intentionally has 33,000 Meta Noindex tags. We really don’t need to be reminded of that every week. So, once we make sure none of those are unintentional, we can go to the top of the issue page and click “Ignore Issue Type”:

Look for this in the upper-right of any individual issue page. Just like individual issues, you can easily track all of your ignored issues and start paying attention to them again at any time. We just want to help you clear out the noise so that you can focus on what really matters to you.

4. Pixel-length Title Data

For years now, we’ve known that Google cuts display titles by pixel length. We’ve provided research on this subject and have built our popular title tag checker around pixel length, but providing this data at product scale proved to be challenging. I’m happy to say that we’ve finally overcome those challenges, and “Pixel Length” has replaced Character Length in our title tag diagnostics.

Google currently uses a 600-pixel container, but you may notice that you receive warnings below that length. Because Google needs to make space for the “…” (among other considerations), our research has shown that the true cut-off point Google uses is closer to 570 pixels. Site Crawl reflects our latest research on the subject.
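If you want to sanity-check pixel lengths outside the product, you can estimate them with any font-metrics library. A rough sketch using Pillow (the font file, the 20px size, and the exact metrics Google uses are assumptions; treat the output as an estimate):

```python
# Rough sketch: estimate a title's rendered pixel width and flag likely truncation.
# The 570px threshold mirrors the research above, but the font file and the 20px
# size are assumptions about Google's rendering; treat results as estimates only.
from PIL import ImageFont

CUTOFF_PX = 570
font = ImageFont.truetype("arial.ttf", 20)  # assumes an Arial TTF is available locally

def check_title(title: str) -> None:
    width = font.getlength(title)
    status = "likely truncated" if width > CUTOFF_PX else "fits"
    print(f"{width:6.0f}px  {status}  {title}")

check_title("Announcing 5 NEW Feature Upgrades to Moz Pro's Site Crawl, Including Pixel-Length Title Data")
check_title("Short title")
```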

As with other issues, you can export the full data to CSV, to sort and filter as desired:

Looks like we’ve got some work to do when it comes to brevity. Long title tags aren’t always a bad thing, but this data will help you much better understand how and when Google may be cutting off your display titles in SERPs and decide whether you want to address it in specific cases.

5. Full Issue List Export

When we rebuilt Site Crawl, we were thrilled to provide data and exports on all pages crawled. Unfortunately, we took away the export of all issues (choosing to divide those up into major issue types). Some of you had clearly come to rely on the all issues export, and so we’ve re-added that functionality. You can find it next to “All Issues” on the main “Site Crawl Overview” page:

We hope you’ll try out all of the new features and report back as we continue to improve on our Site Crawl engine and UI over the coming year. We’d love to hear what’s working for you and what kind of results you’re seeing as you fix your most pressing technical SEO issues.

Find and fix site issues now



Moz Blog


New Site Crawl: Rebuilt to Find More Issues on More Pages, Faster Than Ever!

Posted by Dr-Pete

First, the good news — as of today, all Moz Pro customers have access to the new version of Site Crawl, our entirely rebuilt deep site crawler and technical SEO auditing platform. The bad news? There isn’t any. It’s bigger, better, faster, and you won’t pay an extra dime for it.

A moment of humility, though — if you’ve used our existing site crawl, you know it hasn’t always lived up to your expectations. Truth is, it hasn’t lived up to ours, either. Over a year ago, we set out to rebuild the back end crawler, but we realized quickly that what we wanted was an entirely re-imagined crawler, front and back, with the best features we could offer. Today, we launch the first version of that new crawler.

Code name: Aardwolf

The back end is entirely new. Our completely rebuilt “Aardwolf” engine crawls twice as fast, while digging much deeper. For larger accounts, it can support up to ten parallel crawlers, for actual speeds of up to 20X the old crawler. Aardwolf also fully supports SNI sites (including Cloudflare), correcting a major shortcoming of our old crawler.

View/search *all* URLs

One major limitation of our old crawler was that you could only see pages with known issues. Click on “All Crawled Pages” in the new crawler, and you’ll be brought to a list of every URL we crawled on your site during the last crawl cycle:

You can sort this list by status code, total issues, Page Authority (PA), or crawl depth. You can also filter by URL, status codes, or whether or not the page has known issues. For example, let’s say I just wanted to see all of the pages crawled for Moz.com in the “/blog” directory…

I just click the [+], select “URL,” enter “/blog,” and I’m on my way.

Do you prefer to slice and dice the data on your own? You can export your entire crawl to CSV, with additional data including per-page fetch times and redirect targets.

Recrawl your site immediately

Sometimes, you just can’t wait a week for a new crawl. Maybe you relaunched your site or made major changes, and you have to know quickly if those changes are working. No problem, just click “Recrawl my site” from the top of any page in the Site Crawl section, and you’ll be on your way…

Starting at our Medium tier, you’ll get 10 recrawls per month, in addition to your automatic weekly crawls. When the stakes are high or you’re under tight deadlines for client reviews, we understand that waiting just isn’t an option. Recrawl allows you to verify that your fixes were successful and refresh your crawl report.

Ignore individual issues

As many customers have reminded us over the years, technical SEO is not a one-size-fits-all task, and what’s critical for one site is barely a nuisance for another. For example, let’s say I don’t care about a handful of overly dynamic URLs (for many sites, it’s a minor issue). With the new Site Crawl, I can just select those issues and then “Ignore” them (see the green arrow for location):

If you make a mistake, no worries — you can manage and restore ignored issues. We’ll also keep tracking any new issues that pop up over time. Just because you don’t care about something today doesn’t mean you won’t need to know about it a month from now.

Fix duplicate content

Under “Content Issues,” we’ve launched an entirely new duplicate content detection engine and a better, cleaner UI for navigating that content. Duplicate content is now automatically clustered, and we do our best to consistently detect the “parent” page. Here’s a sample from Moz.com:

You can view duplicates by the total number of affected pages, PA, and crawl depth, and you can filter by URL. Click on the arrow (far-right column) for all of the pages in the cluster (shown in the screenshot). Click anywhere in the current table row to get a full profile, including the source page we found that link on.
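For what it's worth, the underlying idea is easy to play with on your own. The sketch below is emphatically not our engine, just a toy illustration: it clusters pages whose extracted text hashes identically and picks the shortest URL in each cluster as a stand-in "parent" (real near-duplicate detection needs fuzzier comparison):

```python
# Toy illustration only (not Moz's detection engine): group pages whose main text
# hashes identically and pick the shortest URL in each cluster as a crude "parent".
import hashlib
from collections import defaultdict

import requests
from bs4 import BeautifulSoup

def text_fingerprint(url: str) -> str:
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    text = " ".join(soup.get_text().split()).lower()
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def cluster(urls):
    clusters = defaultdict(list)
    for url in urls:
        clusters[text_fingerprint(url)].append(url)
    for pages in clusters.values():
        if len(pages) > 1:
            parent = min(pages, key=len)   # crude parent heuristic: shortest URL
            print("Parent:", parent, "| duplicates:", [u for u in pages if u != parent])

cluster(["https://www.example.com/a", "https://www.example.com/a?ref=1"])  # placeholders
```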

Prioritize quickly & tactically

Prioritizing technical SEO problems requires deep knowledge of a site. In the past, in the interest of simplicity, I fear that we’ve misled some of you. We attempted to give every issue a set priority (high, medium, or low), when the difficult reality is that what’s a major problem on one site may be deliberate and useful on another.

With the new Site Crawl, we decided to categorize crawl issues tactically, using five buckets:

  • Critical Crawler Issues
  • Crawler Warnings
  • Redirect Issues
  • Metadata Issues
  • Content Issues

Hopefully, you can already guess what some of these contain. Critical Crawler Issues still reflect issues that matter first to most sites, such as 5XX errors and redirects to 404s. Crawler Warnings represent issues that might be very important for some sites, but require more context, such as meta NOINDEX.

Prioritization often depends on scope, too. All else being equal, one 500 error may be more important than one duplicate page, but 10,000 duplicate pages is a different matter. Go to the bottom of the Site Crawl Overview Page, and we’ve attempted to balance priority and scope to target your top three issues to fix:

Moving forward, we’re going to be launching more intelligent prioritization, including grouping issues by folder and adding data visualization of your known issues. Prioritization is a difficult task and one we haven’t helped you do as well as we could. We’re going to do our best to change that.

Dive in & tell us what you think!

All existing customers should have access to the new Site Crawl as of earlier this morning. Even better, we’ve been crawling existing campaigns with the Aardwolf engine for a couple of weeks, so you’ll have history available from day one! Stay tuned for a blog post tomorrow on effectively prioritizing Site Crawl issues, and be sure to register for the upcoming webinar.



Moz Blog


Site Crawl, Day 1: Where Do You Start?

Posted by Dr-Pete

When you’re faced with the many thousands of potential issues a large site can have, where do you start? This is the question we tried to tackle when we rebuilt Site Crawl. The answer depends almost entirely on your site and can require deep knowledge of its history and goals, but I’d like to outline a process that can help you cut through the noise and get started.

Simplistic can be dangerous

Previously, we at Moz tried to label every issue as either high, medium, or low priority. This simplistic approach can be appealing, even comforting, and you may be wondering why we moved away from it. This was a very conscious decision, and it boils down to a couple of problems.

First, prioritization depends a lot on your intent. Misinterpreting your intent can lead to bad advice that ranges from confusing to outright catastrophic. Let’s say, for example, that we hired a brand-new SEO at Moz and they saw the following issue count pop up:

Almost 35,000 NOINDEX tags?! WHAT ABOUT THE CHILDREN?!!

If that new SEO then rushed to remove those tags, they’d be doing a lot of damage, not realizing that the vast majority of those directives are intentional. We can make our systems smarter, but they can’t read your mind, so we want to be cautious about false alarms.

Second, bucketing issues by priority doesn’t do much to help you understand the nature of those problems or how to go about fixing them. We now categorize Site Crawl issues into one of five descriptive types:

  • Critical Crawler Issues
  • Crawler Warnings
  • Redirect Issues
  • Metadata Issues
  • Content Issues

Categorizing by type allows you to be more tactical. The issues in our new “Redirect” category, for example, are going to have much more in common, which means they potentially have common fixes. Ultimately, helping you find problems is just step one. We want to do a better job at helping you fix them.

1. Start with Critical Crawler Issues

That’s not to say everything is subjective. Some problems block crawlers (not just ours, but search engines) from getting to your pages at all. We’ve grouped these “Critical Crawler Issues” into our first category, and they currently include 5XX errors, 4XX errors, and redirects to 4XX. If you have a sudden uptick in 5XX errors, you need to know, and almost no one intentionally redirects to a 404.

You’ll see Critical Crawler Issues highlighted throughout the Site Crawl interface:

Look for the red alert icon to spot critical issues quickly. Address these problems first. If a page can’t be crawled, then every other crawler issue is moot.

2. Balance issues with prevalence

When it comes to solving your technical SEO issues, we also have to balance severity with quantity. Knowing nothing else about your site, I would say that a 404 error is probably worth addressing before duplicate content — but what if you have eleven 404s and 17,843 duplicate pages? Your priorities suddenly look very different.

At the bottom of the Site Crawl home, check out “Moz Recommends Fixing”:

We’ve already done some of the math for you, weighting urgency by how prevalent the issue is. This does require some assumptions about prioritization, but if your time is limited, we hope it at least gives you a quick starting point to solve a couple of critical issues.
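To make the idea concrete, here's a minimal sketch of that kind of weighting: multiply a severity weight by the issue count and sort. The weights and the scoring formula are illustrative only, not the actual math behind the recommendations:

```python
# Minimal sketch of "urgency weighted by prevalence": multiply a hand-assigned
# severity weight by the issue count and sort. The weights and the formula are
# made-up illustrations, not Moz's actual scoring.
SEVERITY = {
    "5xx Error": 10,
    "4xx Error": 8,
    "Redirect to 4xx": 8,
    "Duplicate Content": 4,
    "Missing Meta Description": 1,
}

# Counts borrowed from the examples in this post.
issue_counts = {"4xx Error": 11, "Duplicate Content": 17843, "Missing Meta Description": 916}

ranked = sorted(issue_counts.items(), key=lambda kv: SEVERITY.get(kv[0], 1) * kv[1], reverse=True)
for issue, count in ranked:
    print(f"{issue}: {count} pages, score {SEVERITY.get(issue, 1) * count}")
```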

3. Solve multi-page issues

There’s another advantage to tackling issues with high counts. In many cases, you might be able to solve issues on hundreds (or even thousands) of pages with a single fix. This is where a more tactical approach can save you a lot of time and money.

Let’s say, for example, that I want to dig into my 916 pages on Moz.com missing meta descriptions. I immediately notice that some of these pages are blog post categories. So, I filter by URL:

I can quickly see that these pages account for 392 of my missing descriptions — a whopping 43% of them. If I’m concerned about this problem, then it’s likely that I could solve it with a fairly simple CMS change, wiping out hundreds of issues with a few lines of code.

In the near future, we hope to do some of this analysis for you, but if filtering isn’t doing the job, you can also export any list of issues to CSV. Then, pivot and filter to your heart’s content.
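As an example of that kind of pivoting, a short script can group missing-description pages by their first URL path segment to surface template-level fixes (the filename and the "URL" column are assumptions about the export):

```python
# Sketch: group pages missing meta descriptions by their first URL path segment,
# to spot template-level fixes (e.g. all blog category pages). Assumes a CSV
# export with a "URL" column; the filename is a placeholder.
import csv
from collections import Counter
from urllib.parse import urlparse

prefixes = Counter()
with open("missing-descriptions-export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        path = urlparse(row["URL"]).path
        first_segment = "/" + (path.strip("/").split("/")[0] if path.strip("/") else "")
        prefixes[first_segment] += 1

total = sum(prefixes.values()) or 1
for prefix, count in prefixes.most_common(10):
    print(f"{prefix}: {count} pages ({count / total:.0%})")
```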

4. Dive into pages by PA & crawl depth

If you can’t easily spot clear patterns, or if you’ve solved some of those big issues, what next? Fixing thousands of problems one URL at a time is only worthwhile if you know those URLs are important.

Fortunately, you can now sort by Page Authority (PA) and Crawl Depth in Site Crawl. PA is our own internal metric of ranking ability (primarily powered by link equity), and Crawl Depth is the distance of a page from the homepage:

Here, I can see that there’s a redirect chain in one of our MozBar URLs, which is a very high-authority page. That’s probably one worth fixing, even if it isn’t part of an obvious, larger group.
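If you want to trace a chain like that by hand, a small script that follows redirects hop by hop works fine. A rough sketch (the URL is a placeholder):

```python
# Sketch: walk a URL's redirect chain hop by hop to see how long it is and where
# it ends, the kind of check you'd run on a high-PA page flagged above.
import requests

def redirect_chain(url: str, max_hops: int = 10):
    chain = [url]
    for _ in range(max_hops):
        resp = requests.get(chain[-1], allow_redirects=False, timeout=10)
        if resp.status_code in (301, 302, 303, 307, 308) and "Location" in resp.headers:
            chain.append(requests.compat.urljoin(chain[-1], resp.headers["Location"]))
        else:
            return chain, resp.status_code
    return chain, "too many hops"

chain, final_status = redirect_chain("https://www.example.com/old-page")  # placeholder URL
print(" ->\n".join(chain), "\nfinal status:", final_status)
```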

5. Watch for spikes in new issues

Finally, as time goes on, you’ll also want to be alert to new issues, especially if they appear in large numbers. This could indicate a sudden and potentially damaging change. Site Crawl now makes tracking new issues easy, including alert icons, graphs, and a quick summary of new issues by category:

Any crawl is going to uncover some new pages (the content machine never rests), but if you’re suddenly seeing hundreds of new issues of a single type, it’s important to dig in quickly and make sure nothing’s wrong. In a perfect world, the SEO team would always know what changes other people and teams made to the site, but we all know it’s not a perfect world.

I hope this gives you at least a few ideas for how to quickly dive into your site’s technical SEO issues. If you’re an existing customer, you already have access to Moz’s new Site Crawl and all of the features discussed in this post.



Moz Blog


Screen Shot: Reduce Load Time & Increase Google Crawl Budget

We’ve been talking a lot about Google crawl budget recently, but here is a tweet that Google’s John Mueller retweeted from Rasmus Sørensen on Twitter, in which he said he was able to reduce the load time on his server by 50%, which resulted in his crawl budget “exploding…


Search Engine Roundtable


Google’s Gary Illyes On Crawl Budget, Scheduling & Host Load

During the Stone Temple Consulting Q&A with Gary Illyes on YouTube the other week, he spoke about crawl budget and how Google doesn’t necessarily crawl based on crawl budget. He said internally…


Search Engine Roundtable


Google Webmaster Tools Crawl Errors: How To Get Detailed Data From the API

Earlier this week, I wrote about my disappointment that granular data (the number of URLs reported, the specifics of the errors…) was removed from Google Webmaster Tools. However, as I’ve been talking with Google, I’ve discovered that much of this detail is still available via the…



Please visit Search Engine Land for the full article.




Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing


Help Google Crawl Your Site More Effectively, But Use Caution

Google has introduced some changes to Webmaster Tools – in particular, handling of URLs with parameters. “URL Parameters helps you control which URLs on your site should be crawled by Googlebot, depending on the parameters that appear in these URLs,” …


WebProNews


