Every site owner’s dream (and goal) is to rank as high as possible for their target audience. Every search engine’s dream (and goal) is to make sure that the most relevant websites rank the highest. But how is that relevancy built and interpreted? Does it have anything to do with what you say? That much we know to be true. Does it have anything to do with how (or where) you say it? That’s what we set out to discover.
With this in mind, we conducted in-depth research on a set of 34k keywords, which we analyzed back and forth to find out how much a keyword’s occurrence in a title, URL, domain, subdomain and URI matters for rankings.
TL;DR – This is quite a large study. If you don’t have time to read it all, you can browse through the main take-aways.
Do all these characteristics matter? And if yes, how much? Even if these choices won’t ultimately make a significant change in your rankings, they are bound to make you more relevant to your intended target audience. There are also other experimental studies, which we refer to below, showing that a generic, keyword-similar URL or domain can make a big difference when it comes to the success of online marketing campaigns. Below you can find a list with the main conclusions of this research.
- The Methodological Approach – How We Did the Research
- Using a Keyword in the URL Will Make a Site Appear More Relevant for Google.
- Keyword Appearance in Title Makes a Clear Difference Between Ranking 1st or 2nd.
- Using a Keyword in the URL Can Get You Closer to Ranking 1st.
- You Have a Chance of Ranking Better If Your Keyword Is Part of the Domain Name.
- Similarities Between Keywords and Subdomains Can’t Predict Any Change in the Rankings.
- If Your Domain Is an Exact Match for a Keyword, You Can Rank 1st Easier.
- Keyword Appearing in Both Title-Domain Boasts Impressive Results for the 1st and the 2nd Positions.
- The More Concise the URL, The Greater the Chance to Rank Better.
- SEO Friendly URLs Can Lead to Higher Perceived Relevance.
- Other Relevant Studies
- Take Aways
So what should you do? If possible, try to include a relevant keyword in the title, domain or URL of your website. Our study shows a moderate effect, suggesting a possible correlation between the presence of a keyword in one of these and a higher rank in the search engine results list for that keyword. Better still, you can try to use keywords in all three, or at least a combination of two of them. If that is not possible, then the domain should be your highest priority. Don’t bother with the subdomain, though; that really seems to make no difference. And to truly maximize your chances of ranking as high as possible, you’d do well to keep the URL concise. In fact, if you could ensure that URLs to your pages never go over 50-60 characters, that would be perfect.
Does Google Count a Keyword in the Title, URL or URI When Ranking a Page?
Google says no, but various experiments and observations say yes.
Using a keyword in the URL might simply make a site appear more relevant.
Sure, Google will ultimately find it even if the keyword is buried deep inside the content on the site, but judging by this test, it looks like things might be slightly better if said keyword is in a more visible place (such as a URL or URI, for instance). As a straight-up observation, that makes sense, but it’s important to keep things in perspective: experimental conclusions need to be backed up by quantitative analysis.
The Methodological Approach – How We Did the Research
First, let’s clarify some methodological assumptions. Understanding the methodology is important, because this is a complex and unconventional problem to unfold, so the statistical tools we used are not quite conventional either. It’s also important to know that all the research was conducted over the months of May and June 2016.
We used a similarity algorithm based on the concept of the Levenshtein distance, a string metric measuring the difference between two sequences. It is probably the best algorithm to use when dealing with short strings of text. In our case, one of these strings was the keyword, while the other was the domain / title / URL. The Levenshtein distance can be roughly understood as the minimum number of single-character edits required to change one string into the other. When normalized, this distance can be expressed on a 0 to 100 scale (the value is the share of the string that needs no editing, taken as a percentage: ($str1_length - $levenshtein_distance) / $str1_length).
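The normalized score described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function names are hypothetical, and it assumes that `$str1_length` stands for the length of the longer of the two strings, which keeps the result between 0 and 100.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(keyword: str, target: str) -> float:
    """Similarity on a 0-100 scale (assumption: divide by the longer length)."""
    longest = max(len(keyword), len(target))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (longest - levenshtein(keyword, target)) / longest


# One edit (the space) separates these two 17- and 16-character strings.
print(round(similarity("electric bicycles", "electricbicycles"), 1))  # 94.1
```

Whether the keyword was lowercased or stripped of spaces before the comparison is not stated in the article, so the sketch compares the strings exactly as given.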
We applied this metric to no fewer than 34,368 keywords. For each keyword we identified the first 20 positions in Google and checked the corresponding URLs. We then applied a Levenshtein-based analysis for the various comparisons we wanted to make, in order to test the probability that there might be a relation between the keyword and certain elements (URL, URI, title, domain, subdomain etc.). For each rank, this resulted in a score on the normalized scale, which we then interpreted according to the following thresholds: <30 means a small effect, between 30 and 60 means a moderate effect, and >60 means a large effect.
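To make the thresholds concrete, here is a minimal sketch of how scores could be bucketed and counted per ranking position. The function names and the `(rank, score)` input shape are hypothetical, the sample values are invented for illustration, and treating both 30 and 60 as part of the moderate band is an assumption, since the article does not say which band the boundary values fall into.

```python
from collections import Counter


def effect_size(score: float) -> str:
    """Map a 0-100 similarity score to the three bands used in the article."""
    if score < 30:
        return "small"
    if score <= 60:  # assumption: 30 and 60 both count as moderate
        return "moderate"
    return "large"


def tally_by_rank(results):
    """Count effect bands per ranking position.

    `results` is an iterable of (rank, score) pairs (hypothetical shape).
    """
    counts = {}
    for rank, score in results:
        counts.setdefault(rank, Counter())[effect_size(score)] += 1
    return counts


# Made-up scores for illustration only; these are not data from the study.
sample = [(1, 85.0), (1, 42.5), (2, 12.0), (2, 61.0), (2, 30.0)]
tally = tally_by_rank(sample)
print(tally[1]["large"], tally[2]["small"])  # 1 1
```

Charts like the ones discussed in the following sections would then plot, for each rank from 1 to 20, how many pages fall into each band.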
Do Keywords in Titles Have Any Ranking Importance?
What you get in the above chart is the distribution of the 3 levels of similarity (low, medium and high) among ranking positions 1 through 20. It’s tricky to talk about correlations when the differences seem quite subtle. For instance, we can plainly see that the largest number of instances with high similarity scores is for the 1st ranked websites. We can also see that there is an overall decreasing trend from the 1st to the 20th position (even though the relation is not quite linear).
It becomes clear, however, on closer inspection, that the interpretation is a bit more complex. For instance, the range between the 2nd and the 20th rank is 508, which would theoretically average to a difference of about 28 between any two ranks. That average difference, when compared to the smallest number of cases with a high similarity score, represents less than 2.5% of it. While that in itself is not a deal-breaker, it paints a pretty ambiguous picture when coupled with the lack of a linear relationship between high similarity scores and ranking positions.
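The "about 28" figure follows from spreading the range evenly over the gaps between ranks 2 and 20. A quick check, using a hypothetical helper name:

```python
def average_step(value_range: float, first_rank: int, last_rank: int) -> float:
    """Average difference between adjacent ranks, given the overall range."""
    gaps = last_rank - first_rank  # ranks 2..20 have 18 gaps between them
    return value_range / gaps


step = average_step(508, first_rank=2, last_rank=20)
print(round(step, 1))  # 28.2
```

The smallest high-score count itself is not quoted in the text, so the "less than 2.5%" comparison cannot be reproduced from the article alone.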
There is, however, one thing that seems quite clear. Regardless of whether a keyword in the title makes the difference between ranking 3rd or 4th, 6th or 7th, 10th or 11th, it certainly makes a difference between ranking 1st or 2nd.
Keyword presence in titles makes a clear difference between ranking 1st or 2nd.
The difference between them, in terms of websites with a high similarity score, is 666 – bigger than the whole range between rank 2 and 20. This difference also represents close to 29% of the total number of websites with a high similarity score for the 1st rank.
One thing to remember, though, is that while this relation is pretty solid for distinguishing between the number 1 rank and the ones that follow (and there’s a somewhat linear relationship among the first 7 rank positions), there are also some irregularities. For instance, the first break in linearity is between the 2nd and 3rd ranks, signaling a somewhat strange relationship. It’s also a bit weird that other than the first 4 positions, the only place to find more than 8,000 pages with keywords appearing in the title is in the last 5 positions. Of course, that doesn’t necessarily overlap with the effect of that presence, but it does suggest that ensuring a keyword appears in the title might not get you as far as you might hope.
Does Google Count a Keyword in the URL/URI When Ranking a Page?
The previous analysis holds for the similarity between keyword and URL as well. Using the keyword in the URL can make a big impact in getting you closer to ranking 1st (all other factors being equal). There are 3 times as many pages with high similarity scores between keyword and URL for ranking 1st (725) as there are for ranking 2nd (234). Even though the number of high scores is much lower than for the keyword-title correlation, the range across ranks 2 to 20 is comparatively much higher (156). This amounts to double the lowest number of pages with a high similarity score, and two thirds of the second highest such number. It also corresponds to an average difference between ranks of 8.6 (for ranks 2nd to 20th), or 11% of the smallest number of pages with a high score.
There is also more of a linear relationship, at least partly. In fact, for the first nine ranking positions, the linearity is perfect. However, it gets entirely unpredictable after the 10th rank.
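The article does not define "linear" formally. One plausible reading, and it is only an assumption, is that the number of high-score pages drops at every step from one rank to the next. Under that reading, the length of the streak can be measured like this (the function name and the sample counts are made up for illustration):

```python
def decreasing_streak(counts):
    """Number of leading positions over which the counts strictly decrease."""
    if not counts:
        return 0
    streak = 1
    for prev, cur in zip(counts, counts[1:]):
        if cur >= prev:
            break  # the first rise or tie ends the streak
        streak += 1
    return streak


# Invented counts for ranks 1..6; these are not figures from the study.
example = [50, 40, 35, 30, 32, 28]
print(decreasing_streak(example))  # 4
```

A streak of 9 on the real data would correspond to the "first nine ranking positions" mentioned above.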
In terms of just presence, the keyword appears in the URL in most cases for the 1st position. The numbers go down in a pretty linear fashion afterwards, all the way down to the 9th position. However, the same weird relation exists towards the 20th rank: the numbers pick up for the last few positions. Still, taking everything else into account, it does seem there is a stronger link between keyword in the URL and rank than between keyword in the Title and rank. So, if you have to choose, go for the URL over the Title.
When it comes to the relation between keyword and URI, several things are different from the previous data sets. First of all, there’s not that much of a difference between the number of pages with a high similarity score for the 1st and 2nd rank. For Title, the difference was almost 29% of the number of sites for rank 1. For URL, it was 2/3 of the number of sites for rank 1. For URI, that difference between 1st (3863) and 2nd (3719) represents less than 4% of the number of sites for rank 1. This would suggest a much smaller predictive effect. What’s weird, then, is that the relationship between keyword in URI and rank is linear for ranks 1 through 13, which is the best streak so far.
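The percentages for URL and URI can be checked directly from the counts quoted in the text. The helper name below is hypothetical; the title figure is left out because the rank 1 count for titles is not stated.

```python
def relative_drop(first: int, second: int) -> float:
    """Drop from rank 1 to rank 2 as a percentage of the rank 1 count."""
    return 100.0 * (first - second) / first


url_drop = relative_drop(725, 234)    # keyword vs URL
uri_drop = relative_drop(3863, 3719)  # keyword vs URI
print(round(url_drop, 1), round(uri_drop, 1))  # 67.7 3.7
```

So the drop is roughly two thirds for the URL and under 4% for the URI, which matches the figures above.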
Using the keyword in the URL can make a big impact in getting you closer to ranking 1st.
When we try to break down the relevance of the URI by the number of keywords in the URI per position, things get a bit strange. The first position has the lowest number of keywords in the URI, despite the fact that the relation between presence and similarity effect seems to be quite strong, based on the above. So what’s the explanation? Well, most likely, a valid interpretation would be that even though there are fewer instances of keyword in the URI that ranked 1st, in those instances the similarity effect is quite strong, whereas for the pages that ranked lower, the similarity isn’t as high in most instances.
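The article never spells out what separates "URL" from "URI" in these comparisons. A reasonable guess, flagged here as an assumption, is that the URL is the full address while the URI is the part after the host name. Under that assumption the two can be separated with Python's standard library (the function name and the example address are hypothetical):

```python
from urllib.parse import urlsplit


def split_url(url: str):
    """Return (hostname, uri), where uri is the path plus any query string."""
    parts = urlsplit(url)
    uri = parts.path
    if parts.query:
        uri += "?" + parts.query
    return parts.hostname, uri


host, uri = split_url("https://www.example.com/electric-bicycles/reviews?page=2")
print(host)  # www.example.com
print(uri)   # /electric-bicycles/reviews?page=2
```

Comparing a keyword against the URI alone ignores whatever the domain contributes, which may be one reason the URL and URI charts behave differently.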
Should You Choose to Place Keywords in Domains? Will That Help With Rankings?
When it comes to domain, things are a bit more similar to title and URL. There is a large discrepancy between the number of high scores for rank 1 and rank 2 (43% of the number for the 1st rank). There is a linear relationship for results corresponding to ranks 1 through 7. The range of the numbers for ranks 2 through 20 is considerable (bigger than the smallest number, 57% of the largest number). The average difference between the numbers corresponding to any 2 ranks is 58 (slightly higher than 7% of the lowest number). All in all, a small to moderate effect that holds up best for the first few rank positions.
You do have a higher chance of ranking better if your keyword is part of the domain name.
The effect seems to be most consistently supported by numbers when it comes to domain. It’s not just that linearity is visible for the first few positions when looking at keyword presence in the domain name; it’s also that the number of keywords appearing in the domain for the 1st position is markedly higher than for any other position. This seems to reinforce the idea that the effect exists and that statistically, you actually have a higher chance of ranking better if your keyword is part of the domain name.
Does Using a Keyword in the Subdomain Affect the Rankings?
Here is where the cookie crumbles. For the keyword-subdomain relation, there is absolutely no semblance of a linear relationship. Numbers of pages with high scores seem to follow no intuitive rule. What’s more, the largest number of high similarity scores doesn’t belong to the 1st rank. There are 2 other ranks with larger numbers (3rd and 8th). All high score numbers fall within the 70-99 range (there is no outlier, like the one corresponding to the number 1 rank in most other charts). When averaged, the difference between any 2 ranks amounts to a meager 1.52, just barely over 2% of the smallest number of high scores. Quite clearly then, there is likely nothing to gain from trying to cram a keyword in the subdomain (though it still remains quite a good tactic when it comes to domains).
Even though some similarities between keywords and subdomains are high, they can’t really predict any movement in the rankings.
The erratic nature of the relationship between keywords and subdomains can be explained by the fact that keywords are entirely missing from subdomains. This means that effect sizes for this comparison need to be considered in the context of zero exact matches. This explains why, even though some similarities are considered to be high, they can’t really predict any movement in the rankings. So does this mean you are free to try including a keyword in a subdomain? Absolutely. Is there any reason to believe it will actually help? Not really.
Is There an Exact Match Domain Ranking Benefit?
Intuitively, the advantages of having an exact match domain (EMD) in terms of ranking should be self-evident. If it really does make a difference (even a tiny one) to have a keyword in the domain, or title, or URL, compared to having it just in the content, then having an EMD should mean striking gold. But because of that, it also stands to reason that more people would try to abuse it, so it’s also more likely that an EMD is a potential flag for spam. As long as the content on such a site is clean and of high quality, however, it’s definitely worth going for an exact match.
Two experimental studies strongly support the hypothesis that, all other things being equal, generic domains outperform non-generic domains when it comes to online marketing. One pitted ElectricBicycles.co.uk against YourBikes.co.uk and InAHurry.co.uk, while another pitted DivorceLawyer.com against VladimirLaw.com. Of course, it’s important to clarify that both experiment settings revolved around ad campaigns and that the edge of EMD sites over non-EMD sites was in terms of Click-Through-Rate. Although this is clearly a different metric than the one used to measure how Google’s algorithms might evaluate their relevance, it does speak towards the larger case for EMD over non-EMD domains.
If your domain is an exact match for a keyword, chances are that you’re a lock for ranking 1st.
What does our analysis say specifically about EMD? If your domain is an exact match for a keyword, chances are that you’re a lock for ranking 1st. Or at least there’s a really high chance. There is a strong linear relation at work here, but more importantly, there is a very large (and significant) difference between the 1st and 2nd rank in terms of the number of pages which manage an exact match domain. In fact, there’s a huge difference between the number of EMDs that ranked 1st (79) and pretty much all of the following 19 ranks combined (which make up just 48). That’s more than 50% larger.
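For readers who want to test their own domains, here is a minimal sketch of an exact match check. It is not the method used in the study: the function name is hypothetical, and it naively assumes the host has no subdomain other than `www`, so that the first label is the domain name.

```python
from urllib.parse import urlsplit


def is_exact_match_domain(keyword: str, url: str) -> bool:
    """True when the keyword, minus spaces and case, equals the first host label."""
    host = urlsplit(url).hostname or ""  # hostname is returned in lowercase
    if host.startswith("www."):
        host = host[4:]
    label = host.split(".")[0]  # naive: assumes no subdomain other than www
    normalized = "".join(keyword.lower().split())
    return bool(label) and label == normalized


print(is_exact_match_domain("electric bicycles", "http://ElectricBicycles.co.uk"))  # True
print(is_exact_match_domain("divorce lawyer", "http://VladimirLaw.com"))            # False

# Figures quoted above: 79 EMDs at rank 1 versus 48 across ranks 2-20.
print(round(100 * (79 - 48) / 48))  # 65
```

The last line shows that 79 is about 65% larger than 48, consistent with the "more than 50% larger" statement.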
What Is the Winning Ranking Formula?
So now that we know at least several instances of keyword use make both intuitive and statistical sense, what would happen if we were to have intersections of these situations? Would their powers combined yield proportionately larger gains?
One thing to notice is that there’s no intersection of a keyword being used for the URI, URL, title and domain. “But wait, isn’t more better?” Well, no. Firstly, because, as we’ve already seen, not everything has the potential to lead to a good outcome. So combining those things doesn’t lead to “better.” Secondly, because some things simply might not be practical. You can absolutely avoid 99% of risks in life by building an atomic bunker and living there for the rest of your existence. But that would be unnecessary. This is where we go back to being 100% in agreement with Matt Cutts – think about your users and build for them (as opposed to building for search engines).
Keyword appearing in both title-domain boasts impressive results for the 1st and the 2nd position.
In terms of smaller scale intersections, title-domain and title-URL are quite similar. For title-domain, the high similarity scores for the 1st rank are more than 2.5 times as numerous as the ones for the 2nd rank. That’s a pretty encouraging finding. The downside is that linearity is pretty finicky and breaks down quickly (right after the 3rd position). The bottom line, however, is that using a keyword for the title and domain combination has a pretty high chance of landing you on number 1 (though it might not do anything for you in terms of a steady evolution through the ranks, everything else being equal).
The title-URL combo boasts a similarly impressive difference between the number of high scores for the 1st position (201) and that same number for the 2nd position (62). That’s more than 3 times as many. There’s also a pretty abrupt linearity in place, breaking off shortly after the 5th position. Again, there’s a pretty high chance that this combination can put you into the pole position, but it can’t really be explained in terms of universality and sure-fire recipes.
What Title and URL Length Should You Choose in Order to Rank Higher?
There seems to be little difference between the lengths of URLs ranked in the top 20. Sure, the more concise the URL, the greater the chance to be higher up, but really, as long as the length falls between 50 and 60 characters, you’re probably in a good spot (all other things being equal).
The more concise the URL, the greater the chance to be higher up.
Of course, having a concise URL is helpful in all sorts of ways beyond the search results hierarchy. For instance, even if most casual users don’t bother remembering full URLs nowadays, it can still count as an advantage if you make it easy on the brain. Think about EMDs, for instance: instinctively, you are more likely to go with the URL that most resembles your thought process (“I need to find an electric bicycle... oh, look: a site named just that”). The extra effort you can put in is to make sure that not only the homepage, but pretty much any page on your website has its URL fit in a neat number of characters (that, if too large to be easily remembered, can at least be short enough to be easily recognized).
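Checking a list of pages against the 50-60 character guideline takes only a few lines. The sketch below uses a hypothetical function name and placeholder URLs, and it assumes the limit applies to the full URL including the scheme, which the article does not specify.

```python
MAX_URL_LENGTH = 60  # upper end of the 50-60 character guideline


def too_long(urls, limit: int = MAX_URL_LENGTH):
    """Return the URLs whose full length exceeds `limit` characters."""
    return [u for u in urls if len(u) > limit]


# Hypothetical URLs on a placeholder domain.
pages = [
    "https://example.com/electric-bicycles",
    "https://example.com/blog/2016/06/everything-you-need-to-know-about-electric-bicycles",
]
for url in too_long(pages):
    print(len(url), url)
```

Only the second address is reported, since the first stays well under the limit.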
Are SEO Friendly URLs a Must?
Matt Cutts actually answered a very similar question a few years back: is it better to have a keyword in the path or the filename? The answer is straightforward (excessive optimization in the title probably doesn’t really matter) and common sense (you really have to think of your URL through the lens of your potential user). However, Cutts allows for some ambiguity (perhaps strategically), by suggesting that the answer to this question is something better suited to doing “some experimenting and see what works for you.” So we took his advice and went out to churn some data.
Obviously, there’s another advantage to having SEO Friendly URLs: a lot of times they are also user-friendly URLs, meaning they are easy to think of and share with others.
SEO Friendly URLs can ultimately lead to higher perceived relevance, leading back to an edge from an SEO perspective.
It is easy to see why that would be desirable, especially in a paradigm that extends the notion of net neutrality to the fairness with which all results of a search are covered. Perhaps Google really is dreaming of a world where search results are judged and indexed based not on the keywords they use, but on the quality of their content. But the reality of things – at least for now – might be somewhat different.
Other Relevant Studies
There are other articles based on observation and even some experimental designs trying to tackle this question. But what our current study does (in a way that other writings on this subject do not) is to use quantitative analysis to try and predict your chances of success in deploying various strategies related to keywords for the purpose of ranking as high as possible. Should you try to include your favorite keywords in every possible corner of your site architecture? Or is it enough to drop them in the content, because the search engines are smart enough to distinguish a bluff from an actual quality piece of content?
Google is – most likely – smart enough, nowadays, that it will find your website even if it’s not optimized to have important keywords in the title, domain, or even URL. In fact, excessive optimization might be taken as a strong spam indicator, if not backed up by quality content. But if you aren’t trying to fool search engines, strategically placing relevant keywords in the title, domain and/or URL is likely to give you a slight edge over your competition (all other things being equal).
Including the keyword in the title, domain or URL/URI of the website is not a surefire way to get to the top of search engine results. However, if all other things (in terms of SEO and content quality) are equal, it might be the thing that sets you apart.
If possible, try to be consistent and use keywords that are relevant to you in more than one place. Use them in the title, in the domain and in the URL as well. Of course, you need to be aware that this used to be a “tactic” in the old days of primitive SEO, so doing that might also make you more prone to suspicion from the search engines. If you can’t afford the luxury (or don’t want to take the risk) of placing the keyword everywhere, the domain is your best bet. It turned out to have the most predictably linear relationship between keyword presence and rank (as well as keyword presence similarity effect and rank).
Last but not least, remember that most search engines are trying to think like the users (or sometimes even for the users). You’re not creating content for search engines, but for people. If it’s right for them, it will be right for search engines as well. There are many reasons to have a short URL, for instance, and SEO is just one of them. Besides, relevance in the users’ minds and relevance in a search engine’s algorithms are probably more interconnected than most of us realize.
Who did this research
- Razvan Gavrilas
Researched & Audited the Analysis
- Cornelia Cozmiuc
Researched & Wrote the Paper
- Ionut Astratiei
Performed the Crawlings and Data Validations