You may have noticed that when you use Google to search for a certain answer to a question, a lot of times you actually get some sort of answer and not just a series of search results. For instance, if you’ve just recently made a terrible decision about changing your go-to barber’s shop and would like to find out how soon the damage will fade away, you might ask Google “how fast does hair grow”, to which the famed oracle of wisdom will give an actual answer, in what we will refer to as an answer box. Of course, it hasn’t come up with the answer on its own, but knowing what to look for is pretty cool in itself. Answer boxes can contain text, graphs, maps, images or even videos.
Being in Google’s answer box does you a big favor: you appear on Google’s first page not once, but twice. So it’s not just showing up in the results, it’s showing up in the answer box, which has a different status in the mind of the searcher. Being there can win you a lot of SERP “landscape”, especially if you own some important keywords in your niche, as is the case with Tesla cars, which are right now synonymous with electric cars. We will come back to this example a bit later on.
Let’s take a look at the example below and see how Google made the website thekitchn.com feel like it won the lottery. It is not the first site that Google recommends for the query “how to boil eggs”, but it gets something even better: it sits above all the results, in the answer box. While preparing their breakfast, “netizens” from all over the world can read a list of helpful instructions right in the answer box, generously filled with information by thekitchn.com. Now, imagine how great it would be if, for a specific query, the information in the answer box were cited from your site. Imagine the impact that would have on your rankings and authority. And even if it seems a hard goal to accomplish, it’s not impossible. Understanding the way Google makes use of the answer box is essential to reaching this goal, and reading this article might be the first step towards achieving it.
Yet, how does Google know what to look for? The answer – or at least part of it – is that:
Google actually “gets” you. It knows “what you’re saying”.
Thanks to the Google Hummingbird algorithm, it’s a lot savvier when it comes to interpreting things in context. And if it doesn’t have enough information to figure out what you actually want, like when we’re simply searching for “hummingbird”, it displays the most likely result, with an answer box containing text and images, but also a disambiguation box, right below the answer one, just in case you were thinking about something else. This is all pretty smart and very much related to the knowledge graph concept Google has been toying around with for a while.
Why are Google’s Answer Boxes so Important?
From an SEO perspective, because you can get a lot of SERP “landscape” if you own the answer box for the important keywords in your niche.
Being number 1 in searches no longer means just one thing. Sure, being number 1 in the results for a search is still good, but so is being the site from which Google takes the answer for its answer box. Or being the source of the first image in an image block result. Or being number 1 in the paid shopping results box. Or having a highly rated product show up among the first results. You’ve probably got it: the SERP landscape has changed, and being the source for an answer box can get you serious traction.
How does Google Extract the Answers?
According to some of Google’s patents, the algorithm assumes that one element of the search query will be an entity and the other will be a certain attribute of that entity. For instance, if I were to search “Ukraine’s GDP”, “Ukraine” would be my entity and “GDP” would be the attribute. This is what the algorithm would identify as an “E’s A” type of query. It works similarly with a series of other phrasings, such as “what is the A of E”, “who is the A of E” (e.g. “who is the husband of angelina jolie”), “the A of E” or “who is the E’s A”. Google has a database of entities along with the attributes people are likely to look for. This means that you won’t find an answer for every question, but you should find answers for the most popular questions.
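To make the entity/attribute idea more concrete, here is a minimal sketch that splits a query into an entity and an attribute using the phrasings listed above. The regular expressions and the function name `parse_fact_query` are our own illustrative assumptions; they are not taken from Google’s patents or code, and a real system would rely on an entity database rather than on string patterns alone.

```python
import re

# Hypothetical patterns modelled on the phrasings listed above; they are
# illustrative only and are not taken from Google's patents or code.
PATTERNS = [
    # "what is the A of E", "who is the A of E", "the A of E"
    re.compile(r"^(?:(?:what|who) is )?the (?P<attribute>.+?) of (?P<entity>.+)$"),
    # "who is E's A", "who is the E's A"
    re.compile(r"^(?:what|who) is (?:the )?(?P<entity>.+?)['’]s (?P<attribute>.+)$"),
    # "E's A"
    re.compile(r"^(?P<entity>.+?)['’]s (?P<attribute>.+)$"),
]


def parse_fact_query(query):
    """Return (entity, attribute) if the query fits a known pattern, else None."""
    text = query.strip().lower().rstrip("?")
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("entity"), match.group("attribute")
    return None


print(parse_fact_query("Ukraine's GDP"))
print(parse_fact_query("who is the husband of angelina jolie"))
print(parse_fact_query("hummingbird"))
```

A query such as “hummingbird” matches none of the patterns, which mirrors the case where there is no clear entity/attribute pair to answer.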
Similar patents are in place for searches about specific facts. Google has patents for extracting entities from queries (US80058524 B2), returning factual answers in response to queries (US8655866 B1) and retrieving information items (US8719244 B1). And although they look hard to digest, we will try to simplify the process and explain them.
The key elements in these are selecting the representative phrase / search fragment to then match with information and determining whether a certain fragment of the search query is, in fact, a fact query. Google then identifies potential results and ranks them according to an unknown scoring system, presenting you with the highest scoring results in the answer box.
So when we asked the classical question “why is the sky blue”, we got an answer not from Wikipedia, as one might expect, but from a page on the Department of Mathematics, University of California, Riverside website. At least one of the things that this site did well to get the highest score was having the question in the title of an article and the answer in the very first paragraph. Belonging to a .edu domain probably helped as well.
Another thing that caught our attention, and that we thought was interesting enough to share with you, is that for certain queries attribute extraction is based on user location. In the screenshot below, we can see that the search query “how much does a passport cost” returns a different answer box with a USA IP than the same query with a UK IP. We cannot extrapolate from this situation and treat it as a general rule about how Google extracts answers, but it’s important to mention that Google seems to take the user’s location into consideration.
Furthermore, we’ve found another interesting example that reinforces this point. If I search for a conversion from Colombian pesos into dollars (“transformar pesos colombianos a dolares”), Google takes my IP into consideration and displays a result adapted to that knowledge. Only now it has adapted the result so much that it doesn’t really answer my question. It is indeed evidence of the highly advanced technology that Google uses, but as long as it doesn’t provide an answer to my query, it is not much help.
So the answer box algorithm understands almost all of the 5 Ws: “why”, “who”, “what”, “when” and “where”. It doesn’t always get the right interpretation in context, though. For instance, if you’ve become nostalgic and you’re wondering “when the iPhone was released”, you’ll get June 29, 2007 as the date the iPhone was originally introduced, which is correct. Same if your question is about the latest model. If, however, you’ve suddenly become a more practical, future-oriented person and want to know “when the iphone 6 is released”, you’ll only be reminded of the June 29, 2007 date for the introduction of the first iPhone. That’s not very helpful, but maybe Google doesn’t know the exact release date, so it’s understandable. Only that’s not the issue here. If you were to search “when is the new iphone to be released”, you’d actually get useful results about release date speculations for the iPhone 6 (which is the new one). Does Google not know that the newest version of the iPhone will be the iPhone 6? It is entirely possible. Further testing the search engine by inquiring about the iPhone 8 or iPhone 10 seems to greatly confuse the big G and causes it to return incorrect or misguided results.
There are a lot of questions that simply go unanswered. While the algorithm is quite impressive, it’s not fully reliable yet. Which is probably why we can assume that Google presents you with an answer if the answer is clear and specific – and therefore if the questions are specific enough. If you’re wondering “how much gold is worth”, you’ll get some search results but no preferred answer. If, however, you’re asking “how much gold is worth per ounce”, then you’ll at least get an estimate in the answer box. The same goes for the price of milk. If you’re simply asking “how much milk costs”, you’ll get a bunch of results, but no preferred answer in the answer box. This changes, however, if you ask “how much does a gallon of milk cost” as this time you get a preferred answer from a page on a news organization website.
Where do all these answers come from? A variety of places, really. Two factors seem important:
- that the page selected for the answer contains the question in a very similar (if not exact) form, along with the answer, at a short distance from the question (repeating at least some of the words from the question) and
- that the page selected for the answer belongs to a trustworthy website. So most of the time, if it’s not Wikipedia, it will be a site that Google can consider an unbiased third party, as is the case with a lot of “.edu” sites or news organization websites.
That, of course, changes if you are purposefully looking for specific brand information, in which case the most trustworthy information might come straight from that brand’s site.
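The first of the two factors, a page repeating the words of the question close to the answer, can be approximated with a simple word-overlap check. The sketch below is only a rough heuristic of our own: the function names and the `window` size are arbitrary assumptions, and it says nothing about how Google actually scores pages or measures trust.

```python
import re


def words(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def question_overlap(query, page_text, window=40):
    """Return the best share of query words found in any window of the page.

    `window` is the number of consecutive page words examined at a time; it is
    an arbitrary illustrative value, not a known Google parameter.
    """
    query_words = set(words(query))
    page_words = words(page_text)
    if not query_words or not page_words:
        return 0.0
    best = 0.0
    # Slide a fixed-size window over the page and keep the best overlap.
    for start in range(0, max(1, len(page_words) - window + 1)):
        chunk = set(page_words[start:start + window])
        best = max(best, len(query_words & chunk) / len(query_words))
    return best


page = "Why is the sky blue? The sky is blue because air scatters blue light."
print(question_overlap("why is the sky blue", page))
```

A score close to 1.0 means that nearly every word of the question appears within a short stretch of the page, which is the pattern seen in the examples above.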
Topical Authority Sites vs Generic Authority Sites
It’s reasonable to assume that when it can’t find its answers on an already trusted-and-verified site, Google will try to take some extra precautions by looking for authority sites or sites that it can trust. And here is where Panda 4.0, the Topical Authority Content Update of 2014 joins the game. Let’s look at the examples below and try to understand the way Google is making use of Topical Authority sites and the Generic ones.
If we try to figure out what a quantum computer or back pain is, Google will quote from a generic site (Wikipedia), giving us an overall idea about the computation device or the unpleasant yet so common back pain. But what happens when we try to find some info about a headache or sciatica (a condition akin to back pain)? Apparently, we are led to more specialized sites, to pages that Google considers to be high authority pages. It is well known that Google’s goal is to return the best possible results, matching not just the exact query but the intent of the user behind it, and boosting sites that cover a topic in more depth than a single article about the problem.
There’s a special note to make here, namely that medical terms seem to be filtered. For example, for searches on sensitive terms, you are very likely to be instructed by Google to “Consult a doctor if you have a medical concern”. It is possible that the big G has a self-medication monitoring system, giving different results in the answer box according to the severity of the health condition. Furthermore, the size of a business card or the date of the next solar eclipse are treated as specific questions that need to be answered by topical sites with high authority in the field. At the risk of redundancy, I bring Panda 4.0 back into the discussion, as it seems again that sites which bring “additional value” to the web are considered topical authorities and promoted as such, whether in the SERP or in answer boxes.
So it seems that for more specific queries, content-based Topical Authority Sites are quoted more often in answer boxes than sites that only cover the topic briefly (even if the site covering the topic briefly has a lot of generic authority). More articles written on the same topic increase the chances of the site being treated as a “Topical Authority Content Site” on that specific topic and being listed in the answer box. But what Google considers to be high-quality content remains a legitimate question.
Commercial Queries and the Google Answer Box
If you’re interested in finding out “how much a passport costs”, be advised that the information in the answer box comes from the website of the UK government. We’re not sure whether this is because of the language settings, the IP, the fact that this was the most popular query, or some other factors as well, but even though the source may vary depending on the language settings, it will very likely remain governmental.
You can also get an answer for “how much a gallon of milk costs”, though since that price is much more likely to fluctuate, what you actually get in the answer box is information about how the price has changed recently. It seems at least somewhat reasonable to suspect that you will only get an answer box when a significant increase or decrease has taken place. A similar result emerges when you search for the price of a stamp, and this is probably the same for a series of other generic products.
Not all unbranded products cost the same though. Even though a medical procedure may be performed roughly the same everywhere, it doesn’t mean it costs the same. If your brand or organization name is entirely associated with an issue, you’re also likely to get an extra advantage from the answer boxes. Looking for “how much an abortion costs” returns an answer from the Compass Care website, which is specifically dedicated to such issues. For them, showing up in the answer box might be more valuable than showing up first in the search results.
So, how did Compass Care manage to climb above the results list, all the way up to the answer box? As we take a look at their website, we begin to understand a bit about Google’s algorithm on this matter. First of all, our question of interest (how much does an abortion cost?) is clearly written on their website. This way, they make Google’s job much easier.
Secondly, the given definition is not just a random sentence but one that cites an authority in the field, Planned Parenthood, a non-profit organization providing reproductive health and maternal health services. Also, the page links back to the same authority, Planned Parenthood. We can conclude that all these factors are good enough to convince the search engine that the page is worthy of the answer box. If you’re wondering why Google didn’t take the definition directly from the Planned Parenthood site, don’t forget that different answer boxes may be listed for the same search query for different IPs. Thus, from our research, the answer boxes for the query “how much an abortion costs” might differ, but they all cite the same high authority, Planned Parenthood.
But did Google understand that this is a non-profit organization we are talking about? Or are the references to high authority sites what makes them appear in the answer box? Two other aspects come to mind. One is that it really is very difficult to be completely on a “blank page” with Google, and that it’s hard to tell what background information may remain available at a certain time, even beyond private browsing. The other aspect is that the answer box function comes with a “Feedback” option at the bottom right of the box, meaning that the battle to stay relevant in the answer box is potentially just as fierce as the battle to stay relevant in the traditional search results.
The landscape is fairly clear for generic products. What about products of really big brands? As discussed earlier, it actually makes sense to go to the source when someone is looking for specific information. At least in terms of pricing, the brand owner’s page probably contains the most unbiased information. If you’re giving in to the electric car craze and want to find out just “how much a tesla costs”, you will get your answer right at the top, in a special box. You will actually get a price estimate in the answer box, alongside prices for electric car models from other manufacturers, most likely because Google gathered that you’re not looking just into a Tesla, but into electric cars more generally. The pricing information is provided for the searched entity and not for the brand itself. Furthermore, all the sites that it takes the information from are third party sites, magazines and news websites, not the official seller.
Let’s take a look at the examples below and try to figure out how Google treats the “pricing” issue in the answer box. Whether we are looking for the price of an iphone 5, a google ad or garcinia cambogia weight loss pills, we can get an idea of the item’s price, not through a direct marketing and selling approach but from third party authorities. What is Google actually doing here? It provides users with a good experience by trying to answer their question but, at the same time, it keeps them away from the official advertising sites of these products. None of the sites we are sent to sells the products in question or has a direct connection to the marketing strategy of these items.
Even when we ask Google how much a Google ad costs, we are not directed to the AdWords page as expected, but to an unbiased blog post that talks about this matter. This “politically correct” attitude from Google gives us a better understanding of the saying “more Catholic than the Pope himself”. Yet Google doesn’t have the price list for all products, as it gets confused when asked “how much a nexus 5 costs” (note that Nexus is a Google product) or “how much a macbook costs” (a very well-known product).
There is, however, a lot of individual commercial pricing data for various products which will not show up in an answer box. Take an obvious example: “how much is an Audi”. Even though it’s a very big brand, you won’t find your answer in any box at the top of the results page. Things change a bit when you’re inquiring about specific models, but in weird and inconsistent ways. You get answers for “how much is an Audi R8”, but not for “how much is an Audi A1”. Continuing with car-related queries, things seem to be equally unequal. There is an answer box for the Mercedes C63 AMG, but none for the Mercedes S55 or A180. In both cases, the information comes from automobile magazine websites and refers to models that have only recently been rolled out. So unless the product is fairly new and has only a handful of authorized sellers, it’s more likely that you won’t get your answer in a box.
Failures of the Google Answer Box
Of course, you could make the case that not getting an answer box might be preferable to getting one with wrong or misleading information. And you might be right. Let’s look at a couple of examples from our very own field. If you’d like to know, for instance, “what is a SEO company”, the information you’ll get in the definition is about SEO in general. Close enough in this case, but not what I asked.
Things get worse down the road if we ask “what is seo company india”. That, you’ll quibble, doesn’t make much sense now, does it? How could we possibly expect to get an answer box for that? Well, we didn’t. But we did get the box anyway. A nonsensical definition from a shady website. In fairness, the definition was nonsensical because it kind of looked like keyword stuffing. And the site looked shady because… well, it was.
That wasn’t very fair, we’ll admit, but let’s try a query that does make sense: “what is a seo keyword”. Not only does this make sense, but there should be plenty of reasonable, unbiased definitions lying around. And yet Google picked another slew of stuffed keywords from a shady website. There’s probably a bit of irony in there.
One last try: “what is a seo tool”. Again we get an incorrect definition (this one at least is an actual sentence, but not the answer to our query) from a somewhat shady site – or, at the very least, one whose presence in the answer box is shady.
There are other examples as well, from a variety of fields, which highlight the fact that the Feedback button at the bottom right of the answer box is not merely a courtesy, but a potentially crucial tool, in a community-effort kind of way, for the development and improvement of the answer box.
Some Really Interesting Google Answer Boxes
If you haven’t done it yet, you will probably do it after you read this: ask Google “who is ….” (to be completed with your name). As I try to confront Google with some existential problems, trying to figure out who I am, I receive all kinds of references related to my name, even pictures and some YouTube videos. Yet, no answer box for me. And this is understandable; not even famous pop artists like Michael Jackson or Madonna have an answer box inserted among the page results. The knowledge graph does provide us with valuable information about these artists on the right of the result list, but not in an answer box.
Yet, there are some “who is ….” questions for which Google offers the answer right at the top of the SERP. One of these answers is for the query “Who is Matt Cutts”. Note that the head of Google’s spam department is not only listed in an answer box; the generated answer is also taken directly from Cutts’s blog, offering us a very friendly and personal presentation of the well-known engineer. If the same pattern applied, I am curious what the answer box for “who is Vladimir Putin” or “who is Barack Obama” would look like.
There are several situations when the Google answer box doesn’t seem to have an “inspirational” source at all. It just looks like the search engine already “knows” and is so sure about the truthfulness of a certain piece of information that it doesn’t feel the need to provide a link to where that data comes from. This is what happens when we try to find out the date of the marriage between Barack Obama and Michelle Obama. Google “knows” this information and presents it in an answer box like an undeniable truth, without any other explanations.
Anytime the information is not this specific, but rather general, Google will most often use Wikipedia to extract an answer. That’s what you’ll get, for instance, for queries such as “when did the Titanic sink” or “what is the average IQ”.
But as soon as the information gets a bit more specific, the “rule” seems to be that Google extracts data from trusted topical authority sites, if there are any for the niche to which the question belongs. Which is why some queries are much more likely to land you on a specialized site’s page to get you an answer. Other times it’s rather the phrasing of the query that is beyond the scope of Wikipedia. Searching “how to drive stick shift”, for example, prompts an answer box from a website dedicated to instruction lists for DIY projects. This answer box looks pretty complex, offering not just a general overview of the situation but practical steps on how to handle a car with manual transmission. Yet, we can only hope that people are not searching for this query while they are actually driving…
If, however, you were to look for something even more specialized, such as “how to reset an iphone to factory settings”, then the answer box would pick up information from a specialized topical authority site, which in this case is the support page from Apple. Once again, a very handy list of the exact steps that need to be done is generated in the answer box.
Although “why we yawn” looks like a pretty generic question, in answering this, Google goes to a children’s health website. Most likely because it’s topical and trustworthy and, if we go to the cited website we will see that the page contains almost the exact same question and the answer right after.
How You Could Optimize for Google’s Answer Boxes
So let’s get to the most important question, which unfortunately will not yield an answer box: how can you try to get in on the race to the answer box (without being one of the shady/failure examples)? There are three steps you need to focus on, mainly:
1. Identify the Keywords that could return answer boxes
You’ve seen from the patents but also from the practical examples that there are a few words and phrases that Google associates with “answer-able” queries: “when did”, “where is”, “how much”, “what/who is”, etc. Identify definitions that don’t have answer boxes yet and try to see if you can fill any void. You can even try to find out which specific wording is more likely to be considered by Google. For example, should your page provide an answer for “how much X is” or for “how much does X cost”? Though similar in meaning, they might be treated differently by the search engine. A great tool to generate suggestions is keywordtool.io, a free online keyword research instrument that uses Google Autocomplete to generate suggestions for several keywords.
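When comparing wordings by hand, it can help to generate the candidate phrasings for a topic up front and then check each one in the search results. The sketch below does just that; the template list is our own illustrative assumption based on the phrases mentioned above, not an official set of answer box triggers, and it does not query Google or any keyword tool.

```python
# Question templates based on the phrasings mentioned above; the list is an
# illustrative assumption, not an official set of triggers.
TEMPLATES = [
    "how much is {topic}",
    "how much does {topic} cost",
    "what is {topic}",
    "where is {topic}",
]


def candidate_queries(topic, templates=TEMPLATES):
    """Return one candidate query per template for the given topic."""
    topic = topic.strip().lower()
    return [template.format(topic=topic) for template in templates]


for query in candidate_queries("a gallon of milk"):
    print(query)
```

Each printed phrasing can then be searched manually to see which ones already trigger an answer box and which ones are still open.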
2. Analyze the keywords
Test your conclusions to see if you’ve made the right choice. See what other users are looking for, and see what autocomplete you’re likely to trigger by using one phrase or another. The devil is in the details, but that’s where the road to the answer box might be as well. Look at sites that are already in the answer boxes for definitions of various terms and see what they did right and what they did wrong.
3. Create Dedicated Landing Pages
Build dedicated landing pages where you include clear definitions and cite the best references for the terms and concepts. Remember the lessons you’ve seen so far: the landing page has to have the question in the title and the answer in the very first paragraph, or at any rate, the question and answer have to be in very close proximity. Quoting trustworthy sources if you don’t feel you’re one yet can also help: get your information first-hand from educational, topical authority or non-profit sources. Just because you’re not the originator of a certain piece of data doesn’t mean you can’t be the provider of the best or most complete version of that data.
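The two layout lessons (question in the title, answer in the first paragraph) are easy to check automatically before publishing a page. Here is a minimal sketch using Python’s standard `html.parser` module. The class and function names, as well as the `max_words` threshold, are hypothetical choices of ours; the check only verifies that the title contains the question and that a short first paragraph exists, not that the paragraph really answers the question.

```python
from html.parser import HTMLParser


class TitleAndFirstParagraph(HTMLParser):
    """Collect the text of <title> and of the first <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.first_paragraph = ""
        self._current = None          # tag currently being captured, if any
        self._paragraph_done = False  # True once the first <p> has closed

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "p" and not self._paragraph_done:
            self._current = "p"

    def handle_endtag(self, tag):
        if tag == "p" and self._current == "p":
            self._paragraph_done = True
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.first_paragraph += data


def check_landing_page(html, question, max_words=60):
    """Report whether the page follows the two layout lessons above.

    `max_words` is an arbitrary illustrative limit for a "short" paragraph.
    """
    parser = TitleAndFirstParagraph()
    parser.feed(html)
    parser.close()
    paragraph_words = parser.first_paragraph.split()
    return {
        "question_in_title": question.lower() in parser.title.lower(),
        "short_first_paragraph": 0 < len(paragraph_words) <= max_words,
    }


page = (
    "<html><head><title>Why is the sky blue?</title></head>"
    "<body><p>The sky is blue because air scatters blue light.</p></body></html>"
)
print(check_landing_page(page, "why is the sky blue"))
```

A page that fails either check is a candidate for restructuring before you start building awareness and links for it.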
In addition, make use of the tools and principles you generally rely on: build awareness of each page, bring the social spotlight in and work on creating organic links to expand its popularity. Building on the old adage, “you have to spend money to make money”, just keep in mind that with the answer box, you have to have traffic to make more traffic. And most importantly? Don’t be spammy, and provide value to the world wide web. You want to be in our correct examples category, not in the misleading one, otherwise the cutting blade of the Feedback button may take you down sooner than you’ve risen to the top.
A pretty famous computer scientist said that “the question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” The SERP landscape has changed a lot, that is for sure, mostly because of rapid technological progress, Artificial Intelligence included, of course. Artificial intelligence suggests that the line between intelligent machines and people blurs most when a puree is made of that identity. No matter how much we’d like everything to be standardized, results in the search industry are not based on exact rules.
The answer box is indeed part of what Google has described: much of the work on language, speech, translation and visual processing relies on Machine Learning and AI, which raises deep scientific and engineering challenges. Contrary to much of current theory and practice, the statistics of the data shift very rapidly, the features of interest change as well, and the volume of data often precludes the use of standard single-machine training algorithms. It is a learning process we are facing now, and in the end the human who programmed the machine does not know the exact output of the program because of the diverse training phases of the algorithms. Every day, Google search finds more “humane” ways to interact with its users and provide more direct answers. With all the strings attached, answer boxes are a whole new level in the search industry, and understanding them shouldn’t be just a recreational hobby but a must if you want your site to be known and ranked as well as possible.