Tag Archive | "URLs"

Bing Ads recommends updating final ad URLs to HTTPS if supported

With the Chrome update this month, now is a good time to change your final ad URLs if your site supports the HTTPS protocol.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing


Google Doesn’t Decode MD5 Hash URLs For Keywords

I love seeing Google SEO-related questions I’ve never seen before, and this one I had never seen before. Does Google decode MD5 hash values in URLs to extract hidden keywords within them for image search or web search? The answer is no, but I am sure that is obvious to most of you…


Search Engine Roundtable


SEO Best Practices for Canonical URLs + the Rel=Canonical Tag – Whiteboard Friday

Posted by randfish

If you’ve ever had any questions about the canonical tag, well, have we got the Whiteboard Friday for you. In today’s episode, Rand defines what rel=canonical means and its intended purpose, when it’s recommended you use it, how to use it, and sticky situations to avoid.

SEO best practices for canonical URLs

Click on the whiteboard image above to open a high-resolution version in a new tab!

Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week, we’re going to chat about some SEO best practices for canonicalization and use of the rel=canonical tag.

Before we do that, I think it pays to talk about what a canonical URL is, because a canonical URL doesn’t just refer to a page on which we’re using the rel=canonical tag. Canonicalization has, in fact, been around much longer than the rel=canonical tag itself, which came out in 2009, and a canonical URL can mean a bunch of different things.

What is a “canonical” URL?

So first off, what we’re trying to say is: this URL is the one that we want Google and the other search engines to index and to rank. The other URLs, the ones that have similar content, serve a similar purpose, or are perhaps exact duplicates that exist for some reason, should all tell the search engines, “No, no, this guy over here is the one you want.”

So, for example, I’ve got a canonical URL, ABC.com/a.

Then I have a duplicate of that for some reason. Maybe it’s a historical artifact or a problem in my site architecture. Maybe I intentionally did it. Maybe I’m doing it for some sort of tracking or testing purposes. But that URL is at ABC.com/b.

Then I have this other version, ABC.com/a?ref=twitter. What’s going on there? Well, that’s a URL parameter. The URL parameter doesn’t change the content. The content is exactly the same as A, but I really don’t want Google to get confused and rank this version, which can happen, by the way. You’ll sometimes see URLs that are not the original version, that have some weird URL parameter, ranking in Google. Sometimes the parameterized version gets more links than the original because it’s the one that got shared on Twitter, and so that’s the one everybody picked up and copied and pasted and linked to. That’s all fine and well, so long as we canonicalize it.

Or this one, it’s a print version. It’s ABC.com/aprint.html. So, in all of these cases, what I want to do is I want to tell Google, “Don’t index this one. Index this one. Don’t index this one. Index this one. Don’t index this one. Index this one.”

I can do that using this: the link rel=canonical, with the href telling Google, “This is the page.” You put this in the &lt;head&gt; of any document and Google will know, “Aha, this is a copy or a clone or a duplicate of this other one. I should canonicalize all of my ranking signals, and I should make sure that this other version ranks.”
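If you want to check how a page has implemented this, here is a minimal sketch, assuming the `requests` and `beautifulsoup4` packages are installed and using the hypothetical ABC.com URLs from the whiteboard example:

```python
# A minimal sketch (not Moz's tooling) for checking which URL a page declares
# as canonical. The example URL is hypothetical.
import requests
from bs4 import BeautifulSoup

def get_canonical(url):
    """Return the href of the page's rel=canonical link, or None if absent."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("link", rel="canonical")
    return link.get("href") if link else None

print(get_canonical("https://www.abc.com/a?ref=twitter"))
# If the tag is set up as described above, this would print "https://www.abc.com/a"
```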

By the way, you can be self-referential. So it is perfectly fine for ABC.com/a to go ahead and use this as well, pointing to itself. That way, in the event that someone you’ve never even met decides to tack some weird question-mark parameter onto your URL and point that at you, you’re still telling Google, “Hey, guess what? This is the original version.”

Great. So since I don’t want Google to be confused, I can use this canonicalization process to do it. The rel=canonical tag is a great way to go. By the way, FYI, it can be used cross-domain. So, for example, if I republish the content on A at something like a Medium.com/@RandFish, which is, I think, my Medium account, /a, guess what? I can put in a cross-domain rel=canonical telling them, “This one over here.” Now, even if Google crawls this other website, they are going to know that this is the original version. Pretty darn cool.

Different ways to canonicalize multiple URLs

There are different ways to canonicalize multiple URLs.

1. Rel=canonical.

As I mentioned, rel=canonical isn’t the only option. It’s one of the most strongly recommended, and that’s why I’m putting it at number one. But there are other ways to do it, and sometimes we want to apply some of those other ones. There are also not-recommended ways to do it, and I’m going to discuss those as well.

2. 301 redirect.

The 301 redirect, this is basically a status code telling Google, “Hey, you know what? I’m going to take /b, I’m going to point it to /a. It was a mistake to ever have /b. I don’t want anyone visiting it. I don’t want it clogging up my web analytics with visit data. You know what? Let’s just 301 redirect that old URL over to this new one, over to the right one.”
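Purely for illustration, here is a minimal sketch of that status code in application code, assuming a Flask app and the /a and /b paths from the example; in practice this is usually done with a rewrite rule in your web server or CMS rather than in Python:

```python
# A sketch of a 301 from the non-canonical /b to the canonical /a.
from flask import Flask, redirect

app = Flask(__name__)

@app.route("/a")
def canonical_page():
    return "This is the canonical page."

@app.route("/b")
def old_page():
    # Permanently send /b to /a so visitors, links, and ranking signals consolidate.
    return redirect("/a", code=301)

if __name__ == "__main__":
    app.run()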

3. Passive parameters in Google Search Console.

Some parts of me like this, some parts of me don’t. I think for very complex websites with tons of URL parameters and a ton of URLs, it can be just an incredible pain sometimes to go to your web dev team and say like, “Hey, we got to clean up all these URL parameters. I need you to add the rel=canonical tag to all these different kinds of pages, and here’s what they should point to. Here’s the logic to do it.” They’re like, “Yeah, guess what? SEO is not a priority for us for the next six months, so you’re going to have to deal with it.”

Probably lots of SEOs out there have heard that from their web dev teams. Well, guess what? You can do an end run around it, and this is a fine way to do that in the short term. Log in to the Google Search Console account that’s connected to your website. Make sure you’re verified. Then you can basically tell Google, through the URL Parameters section, to make certain kinds of parameters passive.

So, for example, you have sessionid=blah, blah, blah. You can set that to be passive. You can set it to be passive on certain kinds of URLs. You can set it to be passive on all types of URLs. That helps tell Google, “Hey, guess what? Whenever you see this URL parameter, just treat it like it doesn’t exist at all.” That can be a helpful way to canonicalize.
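Search Console handles this on Google’s side through its UI, so there is nothing to deploy. But as a rough illustration of what “treat this parameter like it doesn’t exist” means, here is a sketch that collapses URLs differing only by passive parameters; `sessionid` comes from the example above, and `ref` is an assumed second one:

```python
# A sketch of parameter-insensitive URL grouping, not Google's implementation.
from urllib.parse import urlparse, urlencode, urlunparse, parse_qsl

PASSIVE_PARAMS = {"sessionid", "ref"}

def strip_passive(url):
    """Drop passive parameters so equivalent URLs collapse to one form."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in PASSIVE_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

print(strip_passive("https://abc.com/a?ref=twitter&sessionid=123"))
# -> https://abc.com/a  (both variants collapse to the same URL)
```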

4. Use location hashes.

So let’s say that my goal with /b was basically to have exactly the same content as /a but with one slight difference, which was I was going to take a block of content about a subsection of the topic and place that at the top. So A has the section about whiteboard pens at the top, but B puts the section about whiteboard pens toward the bottom, and they put the section about whiteboards themselves up at the top. Well, it’s the same content, same search intent behind it. I’m doing the same thing.

Well, guess what? You can use the hash in the URL. So it’s /a#b, also called a fragment URL, and that will jump someone to that specific section on the page. You can see this at, for example, Moz.com/about/jobs. I think if you plug in #listings, it will take you right to the job listings. Instead of reading about what it’s like to work here, you can just get directly to the list of jobs themselves. Now, Google considers that all one URL. So they’re not going to rank them differently. They don’t get indexed differently. They’re essentially canonicalized to the same URL.
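As a quick illustration of why the fragment doesn’t create a separate URL: the part after the # never even reaches the server, and Python’s standard library can split it off.

```python
# Everything before the # is the one URL that gets indexed; the fragment is
# purely a client-side jump target.
from urllib.parse import urldefrag

url, fragment = urldefrag("https://moz.com/about/jobs#listings")
print(url)       # https://moz.com/about/jobs
print(fragment)  # listings
```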

NOT RECOMMENDED

I do not recommend…

5. Blocking Google from crawling one URL but not the other version.

Because guess what? Even if you use robots.txt and you block Googlebot’s spider and you send them away and they can’t reach it because you said robots.txt disallow /b, Google will not know that /b and /a have the same content on them. How could they?

They can’t crawl it. So they can’t see anything that’s here. It’s invisible to them. Therefore, they’ll have no idea that any ranking signals, any links that happen to point there, any engagement signals, any content signals, whatever ranking signals that might have helped A rank better, they can’t see them. If you canonicalize in one of these ways, now you’re telling Google, yes, B is the same as A, combine their forces, give me all the ranking ability.

6. I would also not recommend blocking indexation.

So you might say, “Ah, well Rand, I’ll use the meta robots noindex tag, so that way Google can crawl it, they can see that the content is the same, but I won’t allow them to index it.” Guess what? Same problem. They can see that the content is the same, but unless Google is smart enough to automatically canonicalize, which I would not trust them on (I would always trust yourself first), you are essentially, again, preventing them from combining the ranking signals of B into A, and that combination is something you really want.

7. I would not recommend using the 302, the 307, or any other 30x other than the 301.

This is the guy you want. It is a permanent redirect. It is the one most likely to be successful for canonicalization. Google has said, “We often treat 301s and 302s similarly,” but the exception to that rule is that a 301 is probably better for canonicalization. Guess what we’re trying to do? Canonicalize!

8. Don’t 40x the non-canonical version.

So don’t take /b and be like, “Oh, okay, that’s not the version we want anymore. We’ll 404 it.” Don’t 404 it when you could 301. If you send it over here with a 301, or you use the rel=canonical in your header, you take all the signals and point them to A. If you 404 /b instead, all of the signals from B are gone. That’s a sad and terrible thing. You don’t want to do that either.

The only time I might do this is if the page is very new or it was just an error. You don’t think it has any ranking signals, and you’ve got a bunch of other problems. You don’t want to deal with having to maintain the URL and the redirect long term. Fine. But if this was a real URL and real people visited it and real people linked to it, guess what? You need to redirect it because you want to save those signals.

When to canonicalize URLs

Last but not least, when should we canonicalize URLs versus not?

I. If the content is extremely similar or exactly duplicate.

Well, if it is the case that the content is either extremely similar or exactly duplicate on two different URLs, two or more URLs, you should always collapse and canonicalize those to a single one.

II. If the content is serving the same (or nearly the same) searcher intent (even if the KW targets vary somewhat).

If the content is not duplicate, maybe you have two pages that are completely unique about whiteboard pens and whiteboards, but even though the content is unique, meaning the phrasing and the sentence structures are different, that does not mean that you shouldn’t canonicalize.

For example, this Whiteboard Friday about using the rel=canonical, about canonicalization is going to replace an old version from 2009. We are going to take that old version and we are going to use the rel=canonical. Why are we going to use the rel=canonical? So that you can still access the old one if for some reason you want to see the version that we originally came out with in 2009. But we definitely don’t want people visiting that one, and we want to tell Google, “Hey, the most up-to-date one, the new one, the best one is this new version that you’re watching right now.” I know this is slightly meta, but that is a perfectly reasonable use.

What I’m trying to aim at is searcher intent. So if the content is serving the same or nearly the same searcher intent, even if the keyword targeting is slightly different, you want to canonicalize those multiple versions. Google is going to do a much better job of ranking a single piece of content that has lots of good ranking signals for many, many keywords that are related to it, rather than splitting up your link equity and your other ranking signal equity across many, many pages that all target slightly different variations. Plus, it’s a pain in the butt to come up with all that different content. You would be best served by the very best content in one place.

III. If you’re republishing or refreshing or updating old content.

Like the Whiteboard Friday example I just used, you should use the rel=canonical in most cases. There are some exceptions. If you want to maintain the old version, but you’d like its ranking signals to flow to the new version, you can take the content from the old version and republish it at /a-old. Then publish the new version at /a and have that version be the canonical one, while the old version lives on at the /a-old URL you’ve just created. So for republishing, refreshing, or updating old content, canonicalization is generally the way to go, and you can preserve the old version if you want.

IV. If content, a product, an event, etc. is no longer available and there’s a near best match on another URL.

If you have content that is expiring, a piece of content, a product, an event, something like that that’s going away, it’s no longer available and there’s a next best version, the version that you think is most likely to solve the searcher’s problems and that they’re probably looking for anyway, you can canonicalize in that case, usually with a 301 rather than with a rel=canonical, because you don’t want someone visiting the old page where nothing is available. You want both searchers and engines to get redirected to the new version, so good idea to essentially 301 at that point.

Okay, folks. Look forward to your questions about rel=canonicals, canonical URLs, and canonicalization in general in SEO. And we’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com



Moz Blog


Structuring URLs for Easy Data Gathering and Maximum Efficiency

Posted by Dom-Woodman

Imagine you work for an e-commerce company.

Wouldn’t it be useful to know the total organic sessions and conversions to all of your products? Every week?

If you have access to some analytics for an e-commerce company, try and generate that report now. Give it 5 minutes.

Done?

Or did that quick question turn out to be deceptively complicated? Did you fall into a rabbit hole of scraping and estimations?

Not being able to easily answer that question — and others like it — is costing you thousands every year.

Let’s jump back a step

Every online business, whether it’s a property portal or an e-commerce store, will likely have spent hours and hours agonizing over decisions about how their website should look, feel, and be constructed.

The biggest decision is usually this: What will we build our website with? And from there, there are hundreds of decisions, all the way down to what categories should we have on our blog?

Each of these decisions will generate future costs and opportunities, shaping how the business operates.

Somewhere in this process, a URL structure will be decided on. Hopefully it will be logical, but the context in which it’s created is different from how it ends up being used.

As a business grows, the desire for more information and better analytics grows. We hire data analysts and pay agencies thousands of dollars to go out, gather this data, and wrangle it into a useful format so that smart business decisions can be made.

It’s too late. You’ve already wasted £1000s a year.

It’s already too late; by this point, you’ve already created hours and hours of extra work for the people who have to analyze your data and thousands will be wasted.

All because no one structured the URLs with data gathering in mind.

How about an example?

Let’s go back to the problem we talked about at the start, but go through the whole story. An e-commerce company goes to an agency and asks them to get total organic sessions to all of their product pages. They want to measure performance over time.

Now this company was very diligent when they made their site. They’d read Moz and hired an SEO agency when they designed their website and so they’d read this piece of advice: products need to sit at the root. (E.g. mysite.com/white-t-shirt.)

Apparently a lot of websites read this piece of advice, because with minimal searching you can find plenty of sites whose ranking product pages sit at the root: Appleyard Flowers, Game, Tesco Direct.

At one level it makes sense: a product might be in multiple categories (LCD & 42” TVs, for example), so you want to avoid duplicate content. Plus, if you changed the categories, you wouldn’t want to have to redirect all the products.

But from a data gathering point of view, this is awful. Why? There is now no way in Google Analytics to select all the products unless we had the foresight to set up something earlier, like a custom dimension or content grouping. There is nothing that separates the product URLs from any other URL we might have at the root.

How could our hypothetical data analyst get the data at this point?

They might have to crawl all the pages on the site so they can pick them out with an HTML footprint (a particular piece of HTML on a page that identifies the template), or get an internal list from whoever owns the data in the organization. Once they’ve got all the product URLs, they’ll then have to match this data to the Google Analytics in Excel, probably with a VLOOKUP or, if the data set is too large, a database.
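As a rough sketch of what that matching step looks like in practice, here is the join done in pandas rather than with a VLOOKUP; the file and column names are hypothetical stand-ins for your own exports:

```python
# The "expensive" route: join a crawled list of product URLs against a
# Google Analytics landing-page export. Adjust names to match your data.
import pandas as pd

products = pd.read_csv("crawled_product_urls.csv")  # column: "url"
ga = pd.read_csv("ga_landing_pages.csv")            # columns: "landing_page", "sessions", "transactions"

report = ga.merge(products, left_on="landing_page", right_on="url", how="inner")
print(report[["sessions", "transactions"]].sum())
```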

Shoot. This is starting to sound quite expensive.

And of course, if you want to do this analysis regularly, that list will constantly change. The range of products being sold will change. So it will need to be a scheduled scrape or an automated report. If we go the scraping route, we could do this, but scheduled crawling isn’t possible with Screaming Frog, so we’re either spending regular time running Screaming Frog ourselves or paying for a cloud crawler that we can schedule. If we go the other route, we could have a dev build us an internal automated report to go to, once we can get the resource internally.

Wow, now this is really expensive: a couple days’ worth of dev time, or a recurring job for your SEO consultant or data analyst each week.

This could’ve been a couple of clicks on a default report.

If we have the foresight to put all the products in a folder called /products/, this entire lengthy process becomes one step:

Load the landing pages report in Google Analytics and filter for URLs beginning with /products/.
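With the /products/ folder in place, the same report is a one-line filter. A sketch, using the same hypothetical export as before:

```python
# No crawling, no matching: just filter landing pages by the folder prefix.
import pandas as pd

ga = pd.read_csv("ga_landing_pages.csv")  # columns: "landing_page", "sessions", "transactions"

products = ga[ga["landing_page"].str.startswith("/products/")]
print(products[["sessions", "transactions"]].sum())
```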

Congratulations — you’ve just cut a couple days off your agency fee, saved valuable dev time, or gained the ability to fire your second data analyst because your first is now so damn efficient (sorry, second analysts).

As a data analyst or SEO consultant, you continually bump into these kinds of issues, which suck up time and turn quick tasks into endless chores.

What is unique about a URL?

For most analytics services, it’s the main piece of information you can use to identify the page. Google Analytics, Google Search Console, log files, all of these only have access to the URL most of the time and in some cases that’s all you’ll get — you can never change this.

The vast majority of site analyses require working with templates and generalizing across groups of similar pages. You need to work with templates, and you need to be able to do this by URL.

It’s crucial.

There’s a Jeff Bezos saying that’s appropriate here:

“There are two types of decisions. Type 1 decisions are not reversible, and you have to be very careful making them. Type 2 decisions are like walking through a door — if you don’t like the decision, you can always go back.”

Setting URLs is very much a Type 1 decision. As anyone in SEO knows, you really don’t want to be constantly changing URLs; it causes a lot of problems, so when they’re being set up we need to take our time.

How should you set up your URLs?

How do you pick good URL patterns?

First, let’s define a good pattern. A good pattern is something which we can use to easily select a template of URLs, ideally using contains rather than any complicated regex.

This usually means we’re talking about adding folders, because they’re easiest to find with just a contains filter, e.g. /products/, /blogs/, etc.

We also want to keep things human-readable when possible, so we need to bear that in mind when choosing our folders.

So where should we add folders to our URLs?

I always ask the following two questions:

  • Will I need to group the pages in this template together?
    • If a set of pages needs grouping I need to put them in the same folder, so we can identify this by URL.
  • Are there crucial sub-groupings for this set of pages? If there are, are they mutually exclusive and how often might they change?
    • If there are common groupings I may want to make, then I should consider putting this in the URL, unless those data groupings are liable to change.

Let’s look at a couple examples.

Firstly, back to our product example: let’s suppose we’re setting up product URLs for a fashion e-commerce store.

Will I need to group the products together? Yes, almost certainly. There clearly needs to be a way of grouping in the URL. We should put them in a /product/ folder.

Within this template, how might I need to group these URLs together? The most plausible grouping for products is the product category. Let’s take a black midi dress.

What about putting “little black dress” or “midi” as a category? Well, are they mutually exclusive? Our dress could fit in the “little black dress” category and the “midi dress” category, so that’s probably not something we should add as a folder in the URL.

What about moving up a level and using “dress” as a category? Now that is far more suitable, if we could reasonably split all our products into:

  • Dresses
  • Tops
  • Skirts
  • Trousers
  • Jeans

And if we were happy with having jeans and trousers separate then this might indeed be an excellent fit that would allow us to easily measure the performance of each top-level category. These also seem relatively unlikely to change and, as long as we’re happy having this type of hierarchy at the top (as opposed to, say, “season,” for example), it makes a lot of sense.

What are some common URL patterns people should use?

Product pages

We’ve banged on about this enough and gone through the example above. Stick your products in a /products/ folder.

Articles

Applying the same rules we talked about to articles, two things jump out. The first is top-level categorization.

For example, adding in the following folders would allow you to easily measure the top-level performance of articles:

  • Travel
  • Sports
  • News

You should, of course, be keeping them all in a /blog/ or /guides/ etc. folder too, because you won’t want to group just by category.

Here’s an example of all 3:

  • A bad blog article URL: example.com/this-is-an-article-name/
  • A better blog article URL: example.com/blog/this-is-an-article-name/
  • An even better blog article URL: example.com/blog/sports/this-is-an-article-name

The second, which obeys all our rules, is author grouping, which may be well suited to editorial sites with a large number of authors that they want performance stats on.

Location grouping

Many types of websites often have category pages per location. For example:

  • Cars for sale in Manchester – /for-sale/vehicles/manchester
  • Cars for sale in Birmingham – /for-sale/vehicles/birmingham

However, there are many different levels of location granularity. For example, here are 4 different URLs, each a more specific location within the one above it (sorry to all our non-UK readers — just run with me here).

  • Cars for sale in Suffolk – /for-sale/vehicles/suffolk
  • Cars for sale in Ipswich – /for-sale/vehicles/ipswich
  • Cars for sale in Ipswich center – /for-sale/vehicles/ipswich-center
  • Cars for sale on Lancaster road – /for-sale/vehicles/lancaster-road

Obviously every site will have different levels of location granularity, but a grouping that’s often missing is the level of granularity itself in the URL. For example:

  • Cars for sale in Suffolk – /for-sale/vehicles/county/suffolk
  • Cars for sale in Ipswich – /for-sale/vehicles/town/ipswich
  • Cars for sale in Ipswich center – /for-sale/vehicles/area/ipswich-center
  • Cars for sale on Lancaster road – /for-sale/vehicles/street/lancaster-road

This could even just be numbers (although this is less ideal, because the numbers aren’t human-readable):

  • Cars for sale in Suffolk – /for-sale/vehicles/04/suffolk
  • Cars for sale in Ipswich – /for-sale/vehicles/03/ipswich
  • Cars for sale in Ipswich center – /for-sale/vehicles/02/ipswich-center
  • Cars for sale on Lancaster road – /for-sale/vehicles/01/lancaster-road

This makes it very easy to assess and measure the performance of each layer so you can understand if it’s necessary, or if perhaps you’ve aggregated too much.
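As a sketch of that assessment: with the granularity folder in the URL, you can aggregate performance per level straight from a landing-page export (hypothetical file and column names again):

```python
# Pull the granularity level straight out of the URL path and aggregate by it.
import pandas as pd

ga = pd.read_csv("ga_landing_pages.csv")  # columns: "landing_page", "sessions"

# /for-sale/vehicles/<granularity>/<location> -> take the third path segment
ga["granularity"] = ga["landing_page"].str.split("/").str[3]
print(ga.groupby("granularity")["sessions"].sum().sort_values(ascending=False))
```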

What other good (or bad) examples of this has the community come across? Let’s hear it!



Moz Blog


SearchCap: Google AMP URLs, Google Maps traffic & Super Bowl search ads

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.

The post SearchCap: Google AMP URLs, Google Maps traffic & Super Bowl search ads appeared first on Search Engine Land.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing


SearchCap: Google & Hillary Clinton, SEO URLs & AdWords explained

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.

The post SearchCap: Google & Hillary Clinton, SEO URLs & AdWords explained appeared first on Search Engine Land.



Please visit Search Engine Land for the full article.


Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing


Google: We Will Shorten Your URLs For You

It is not uncommon to find Google indexing and ranking URLs that have extra tracking parameters added to the end of the URL. It is not uncommon to see the Google Analytics UTM tracking parameters as one of those cases…


Search Engine Roundtable


Google: Don’t Forget To Upgrade Your URLs By This Date

Back in February, Google announced the launch of new “Upgraded URLs” for AdWords Ads, which enable advertisers to spend less time managing tracking updates while reducing crawl and load times for their websites. They also come with new ValueTrack parameters, which provide new insights about ads. The deadline to upgrade your URLs is July 1, and Google is reminding advertisers …

The post Google: Don’t Forget To Upgrade Your URLs By This Date appeared first on WebProNews.


WebProNews


Should I Use Relative or Absolute URLs? – Whiteboard Friday

Posted by RuthBurrReedy

It was once commonplace for developers to code relative URLs into a site. There are a number of reasons why that might not be the best idea for SEO, and in today’s Whiteboard Friday, Ruth Burr Reedy is here to tell you all about why.

Relative vs Absolute URLs Whiteboard

For reference, here’s a still of this week’s whiteboard. Click on it to open a high resolution image in a new tab!

Let’s discuss some non-philosophical absolutes and relatives

Howdy, Moz fans. My name is Ruth Burr Reedy. You may recognize me from such projects as when I used to be the Head of SEO at Moz. I’m now the Senior SEO Manager at BigWing Interactive in Oklahoma City. Today we’re going to talk about relative versus absolute URLs and why they are important.

At any given time, your website can have several different configurations that might be causing duplicate content issues. You could have just a standard http://www.example.com. That’s a pretty standard format for a website.

But the main sources that we see of domain level duplicate content are when the non-www.example.com does not redirect to the www or vice-versa, and when the HTTPS versions of your URLs are not forced to resolve to HTTP versions or, again, vice-versa. What this can mean is if all of these scenarios are true, if all four of these URLs resolve without being forced to resolve to a canonical version, you can, in essence, have four versions of your website out on the Internet. This may or may not be a problem.

It’s not ideal for a couple of reasons. Number one, some people think that duplicate content is going to give you a penalty. It won’t: duplicate content is not going to get your website penalized in the same way that you might see a spammy link penalty from Penguin. There’s no actual penalty involved. You won’t be punished for having duplicate content.

The problem with duplicate content is that you’re basically relying on Google to figure out what the real version of your website is. Google is seeing the URL from all four versions of your website. They’re going to try to figure out which URL is the real URL and just rank that one. The problem with that is you’re basically leaving that decision up to Google when it’s something that you could take control of for yourself.

There are a couple of other reasons that we’ll go into a little bit later for why duplicate content can be a problem. But in short, duplicate content is no good.

However, just having these URLs not resolve to each other may or may not be a huge problem. When it really becomes a serious issue is when that problem is combined with injudicious use of relative URLs in internal links. So let’s talk a little bit about the difference between a relative URL and an absolute URL when it comes to internal linking.

With an absolute URL, you are putting the entire web address of the page that you are linking to in the link. You’re putting your full domain, everything in the link, including /page. That’s an absolute URL.

However, when coding a website, it’s a fairly common web development practice to instead code internal links with what’s called a relative URL. A relative URL is just /page. Basically what that does is it relies on your browser to understand, “Okay, this link is pointing to a page that’s on the same domain that we’re already on. I’m just going to assume that that is the case and go there.”
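You can see exactly what the browser does with Python’s standard library: the same relative href resolves to different absolute URLs depending on which version of the domain the page was loaded from. The example.com URLs here are just placeholders.

```python
# A relative link inherits the protocol and host of the page it sits on.
from urllib.parse import urljoin

print(urljoin("https://www.example.com/some-page", "/page"))
# -> https://www.example.com/page
print(urljoin("http://example.com/some-page", "/page"))
# -> http://example.com/page   (same markup, different resulting URL)
```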

There are a couple of really good reasons to code relative URLs

1) It is much easier and faster to code.

When you are a web developer and you’re building a site with thousands of pages, coding relative versus absolute URLs is a way to be more efficient. You’ll see it happen a lot.

2) Staging environments

Another reason why you might see relative versus absolute URLs is that some content management systems — and SharePoint is a great example of this — have a staging environment that’s on its own domain. Instead of being example.com, it will be examplestaging.com. The entire website will basically be replicated on that staging domain. Having relative versus absolute URLs means that the same website can exist on staging and on production, or the live accessible version of your website, without having to go back in and recode all of those URLs. Again, it’s more efficient for your web development team. Those are really perfectly valid reasons to do those things. So don’t yell at your web dev team if they’ve coded relative URLs, because from their perspective it is a better solution.

Relative URLs will also cause your page to load slightly faster. However, in my experience, the SEO benefits of having absolute versus relative URLs in your website far outweigh the teeny-tiny bit longer that it will take the page to load. It’s very negligible. If you have a really, really long page load time, there’s going to be a whole boatload of things that you can change that will make a bigger difference than coding your URLs as relative versus absolute.

Page load time, in my opinion, not a concern here. However, it is something that your web dev team may bring up with you when you try to address with them the fact that, from an SEO perspective, coding your website with relative versus absolute URLs, especially in the nav, is not a good solution.

There are even better reasons to use absolute URLs

1) Scrapers

If you have all of your internal links as relative URLs, it would be very, very, very easy for a scraper to simply scrape your whole website and put it up on a new domain, and the whole website would just work. That sucks for you, and it’s great for that scraper. But unless you are out there doing public services for scrapers, for some reason, that’s probably not something that you want happening with your beautiful, hardworking, handcrafted website. That’s one reason. There is a scraper risk.

2) Preventing duplicate content issues

But the other reason why it’s very important to have absolute versus relative URLs is that it really mitigates the duplicate content risk that comes up when you don’t have all of these versions of your website resolving to one version. Google could potentially enter your site on any one of these four pages, which are all the same page to you but four different pages to Google. They’re the same domain to you, but they are four different domains to Google.

But they could enter your site, and if all of your URLs are relative, they can then crawl and index your entire domain in whatever format they entered on. Whereas if you have absolute links coded, even if Google enters your site on the www. version and that resolves, once they crawl to another page that you’ve coded without the www., Google is not going to assume that all of the other internal link juice and all of the other pages on your website live at the www. version. That really cuts down on the number of different versions of each page of your website. If you have relative URLs throughout and you haven’t fixed this problem, you basically have four different websites.

Again, it’s not always a huge issue. Duplicate content, it’s not ideal. However, Google has gotten pretty good at figuring out what the real version of your website is.

You do want to think about internal linking when you’re thinking about this. If you have basically four different versions of any URL that anybody could just copy and paste when they want to link to you or when they want to share something that you’ve built, you’re diluting your inbound links by four, which is not great. You’d basically have to build four times as many links in order to get the same authority. So that’s one reason.

3) Crawl Budget

The other reason why it’s pretty important not to do this is crawl budget. I’m going to point it out like this instead.

When we talk about crawl budget, basically what that is, is this: every time Google crawls your website, there is a finite depth they will crawl to and a finite number of URLs they will crawl before they decide, “Okay, I’m done.” That’s based on a few different things. Your site authority is one of them. Your actual PageRank, not toolbar PageRank, but how good Google actually thinks your website is, is a big part of that. But also how complex your site is, how often it’s updated, things like that are also going to contribute to how often and how deep Google is going to crawl your site.

It’s important to remember when we think about crawl budget that, for Google, crawl budget costs actual dollars. One of Google’s biggest expenditures as a company is the money and the bandwidth it takes to crawl and index the Web. All of that energy that goes into crawling and indexing the Web lives on servers. That bandwidth comes from servers, and that means that using bandwidth costs Google actual, real dollars.

So Google is incentivized to crawl as efficiently as possible, because when they crawl inefficiently, it costs them money. If your site is not efficient to crawl, Google is going to save itself some money by crawling it less frequently and crawling fewer pages per crawl. That can mean that if you have a site that’s updated frequently, your site may not be updated in the index as frequently as you’re updating it. It may also mean that Google, while it’s crawling and indexing, may be crawling and indexing a version of your website that isn’t the version that you really want it to crawl and index.

So having four different versions of your website, all of which are completely crawlable to the last page, because you’ve got relative URLs and you haven’t fixed this duplicate content problem, means that Google has to spend four times as much money in order to really crawl and understand your website. Over time they’re going to do that less and less frequently, especially if you don’t have a really high authority website. If you’re a small website, if you’re just starting out, if you’ve only got a medium number of inbound links, over time you’re going to see your crawl rate and frequency impacted, and that’s bad. We don’t want that. We want Google to come back all the time, see all our pages. They’re beautiful. Put them up in the index. Rank them well. That’s what we want. So that’s what we should do.

There are a couple of ways to fix your relative versus absolute URLs problem

1) Fix what is happening on the server side of your website

You have to make sure that you are forcing all of these different versions of your domain to resolve to one version of your domain. For me, I’m pretty agnostic as to which version you pick. You should probably already have a pretty good idea of which version of your website is the real version, whether that’s www, non-www, HTTPS, or HTTP. From my view, what’s most important is that all four of these versions resolve to one version.

From an SEO standpoint, there is evidence to suggest, and Google has certainly said, that HTTPS is a little bit better than HTTP. From a URL length perspective, I like to not have the www. in there, because it doesn’t really do anything. It just makes your URLs four characters longer. If you don’t know which one to pick, I would pick this one: HTTPS, no W’s. But whichever one you pick, what’s really most important is that all of them resolve to one version. You can do that on the server side, and that’s usually pretty easy for your dev team to fix once you tell them that it needs to happen.
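As an illustration of the server-side fix, here is a minimal sketch that 301s every request onto one assumed preferred version (HTTPS, no www) using a Flask hook. In reality this usually lives in your web server, CDN, or CMS configuration rather than application code, and the host name is a placeholder.

```python
# Collapse all four host/scheme variants onto one preferred version with a 301.
from flask import Flask, redirect, request

app = Flask(__name__)
CANONICAL_HOST = "example.com"  # hypothetical preferred version: HTTPS, no www

@app.before_request
def force_canonical_host():
    # Note: behind a proxy or load balancer, request.scheme may need
    # werkzeug's ProxyFix middleware to report HTTPS correctly.
    if request.host.lower() != CANONICAL_HOST or request.scheme != "https":
        target = f"https://{CANONICAL_HOST}{request.full_path.rstrip('?')}"
        return redirect(target, code=301)

@app.route("/")
def home():
    return "Canonical home page."
```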

2) Fix your internal links

Great. So you fixed it on your server side. Now you need to fix your internal links, and you need to recode them from relative to absolute. This is something that your dev team is not going to want to do, because it is time consuming and, from a web dev perspective, not that important. However, you should use resources like this Whiteboard Friday to explain to them that, from an SEO perspective, both from the scraper risk and from a duplicate content standpoint, having those absolute URLs is a high priority and something that should get done.

You’ll need to fix those, especially in your navigational elements. But once you’ve got your nav fixed, also pull out your database or run a Screaming Frog crawl or however you want to discover internal links that aren’t part of your nav, and make sure you’re updating those to be absolute as well.
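To help with that discovery step, here is a rough sketch that pulls a page’s anchor hrefs and flags the relative ones; Screaming Frog or your database export will do the same job, this just shows the idea. It assumes the `requests` and `beautifulsoup4` packages, and the URL is hypothetical.

```python
# Flag hrefs that are relative rather than absolute on a given page.
import requests
from bs4 import BeautifulSoup

def relative_links(page_url):
    soup = BeautifulSoup(requests.get(page_url, timeout=10).text, "html.parser")
    hrefs = [a.get("href", "") for a in soup.find_all("a")]
    # Anything not starting with a scheme, protocol-relative //, #, or mailto:
    # is a relative link that should probably be recoded as absolute.
    return [h for h in hrefs
            if h and not h.startswith(("http://", "https://", "//", "#", "mailto:"))]

for href in relative_links("https://example.com/"):
    print(href)
```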

Then you’ll do some education with everybody who touches your website saying, “Hey, when you link internally, make sure you’re using the absolute URL and make sure it’s in our preferred format,” because that’s really going to give you the most bang for your buck per internal link. So do some education. Fix your internal links.

Sometimes your dev team is going to say, “No, we can’t do that. We’re not going to recode the whole nav. It’s not a good use of our time,” and sometimes they are right. The dev team has more important things to do. That’s okay.

3) Canonicalize it!

If you can’t get your internal links fixed or if they’re not going to get fixed anytime in the near future, a stopgap or a Band-Aid that you can kind of put on this problem is to canonicalize all of your pages. As you’re changing your server to force all of these different versions of your domain to resolve to one, at the same time you should be implementing the canonical tag on all of the pages of your website to self-canonicalize. On every page, you have a canonical page tag saying, “This page right here that they were already on is the canonical version of this page.” Or if there’s another page that’s the canonical version, then obviously you point to that instead.

But having each page self-canonicalize will mitigate both the risk of duplicate content internally and some of the risk posed by scrapers, because when they scrape your website and slap it up somewhere else, those canonical tags will often stay in place, and that lets Google know this is not the real version of the website.

In conclusion, relative links, not as good. Absolute links, those are the way to go. Make sure that you’re fixing these very common domain level duplicate content problems. If your dev team tries to tell you that they don’t want to do this, just tell them I sent you. Thanks guys.

Video transcription by Speechpad.com



Moz Blog


Thanksgiving roundup: YouTube URLs, Vine gets pushy and more

Since this Friday is a holiday (Black Friday is a legal holiday, isn’t it?), I thought I’d do my weekly tidbit roundup a little early.

Putting the YOU back in YouTube

YouTube is making it possible to change your YouTube user name to something more fitting for your brand. This is huge because so many people didn’t give a lot of thought to the name they created when they signed up. My YouTube name, TVCyn, was created when I was working as a TV reporter. Now, it’s a little off the mark and not intuitive for those who know me by my full name or often-used nickname.

Unfortunately, I don’t qualify for a change because only channels with more than 500 followers are eligible and of course, you can only pick a name that isn’t already in use.

Vine gets pushy

Worried about missing a terrific Vine video? Worry no more. Vine just added push notifications for favorites.

Click the star in the corner of any Vine vidder profile to add them to your favorites list. Then, when that vidder uploads a new Vine, you’ll get an in-app and mobile feed notification.

If you use Vine for your business, this new feature could help you get more views, provided you’re creating work interesting enough to favorite in the first place.

Twitter peeks at your apps

Kurt Wagner of Re/Code was kind enough to point out an addition to the Twitter Security and Privacy page. As of the most recent app update, Twitter will begin collecting data about the other apps you’re using on your mobile device. The support page isn’t coy about the reason behind the move; they want more information about you so they can serve up properly targeted ads.

If you don’t want Twitter looking at your apps, you have to opt-out and you’ll find instructions for doing that on this support page. Figuring most people who know about it won’t bother to opt-out and figuring that even more people won’t even know what’s happening, Twitter should end up with a big pile of helpful data in a short period of time. More data means better targeting and that means better ROIs for marketers.

Good Read

I’ve been hanging on to this one all week, thinking I’m going to write a post about it but it didn’t happen. So, here’s my Thanksgiving gift to all of you: The Magic Behind Unboxing on YouTube from Think Google. Check it out.

That’s it for me. Have a wonderful Thanksgiving if you’re in the US. If you’re not in the US, have a great Thursday. I’ll see you back here next week to talk about the crazy Black Friday numbers.

Marketing Pilgrim – Internet News and Opinion

