Google’s internal documentation on how their search engine works has been leaked online. The documents are causing quite a stir, as they contradict a few statements Google has made in the past about the way they “rank” sites in the search results.
For example: subdomains are treated as separate websites by the crawler and do impact your site’s ranking. Also, newly registered domains are penalised and ranked lower than more established ones. Domains that mimic search queries (like mens-luxury-watches.com or big-purple-car.com) are also penalised.
One particularly troubling reveal is that Google ranks results by user-centric click signals. In other words: heavily advertised websites and websites that generate a lot of traffic are ranked higher, contrary to what Google has stated numerous times in the past. This means that sites from big brands always show up higher in the list, no matter how hard a smaller brand tries to optimise its website or how much authority it has. As today’s episode of Techlinked puts it: “It’s like Mr. Beast showing up higher in the search results than your own doctor, just because Mr. Beast has more Twitter followers.”
Google has in the past also stated that data from Chrome (their browser) is in no way used to determine a page’s ranking. Guess what? They do use that data - including from people who have opted out of allowing Google to use browsing data.
The documents were accidentally leaked by a Google employee, who included them in his GitHub repo. He tried to delete the docs soon after, but they’d already been backed up by third-party document management services, which still have them available.
The leak is quite big news - SEO specialists and website owners should read the documents, as reading between the lines shows you exactly how to get traffic to your site.
I’m not linking to the docs themselves, as I’m not certain of the legal implications of doing so for both me and Realmac (as the owner of this platform), but you can (oh I love irony) find them using Google Search just fine…
Google has confirmed the authenticity of the documents to The Verge, but followed that with a statement that one should not jump to conclusions as “the pages are out of context”. Just how 2,500 pages of text can lack context was not explained though. Google also states that the documents are for an older version of their systems.
Cheers,
Erwin
Sources: