Jun 4, 2018

Website Pages Titles Included in Google Search Queries – Studying Such Google Searches and Hoping That My Website Will Not Be Unjustifiably De-indexed


Search engines such as Google present their results as a list of page addresses ordered by relevance. Relevance is calculated by an algorithm that is usually confidential but depends on known ranking factors. These include the character of the domain (whether it is reliable, public, or famous, and whether its name contains the search terms, i.e. the key terms), the character of the page content (whether it is long, recent, free of forbidden features, and whether it contains the search terms and links), the number of links to the site from other websites, the number and duration of visits to the page, and others. Some factors may reduce the ranking or even lead to de-indexing (exclusion from regular crawling), including gibberish content, forbidden search engine optimization (SEO) activities, etc. (see, for example, this).

I believe that one of the most important factors in Google ranking is the presence of the search terms in the page title, preferably in the expected order. Permitted SEO activities should therefore include putting the search terms in the title: compress your message into as few words as possible and put it there. One question is how many other pages will contain your search terms, since these pages may address the same problem you do; another is how many of them will manage to rank better than yours. Google now limits a search query to 32 words, but people often use 2 to 7 words in their searches (such as “Justin Bieber” or “Justin Bieber concerting in London next year”). The question, then, is how many words your message should have so that it fits as a whole into a usual search query. My advice is 6 to 8 words, preferably embedded in a broader title; my present title carries an 8-term message (WEBSITE PAGES TITLES INCLUDED IN GOOGLE SEARCH QUERIES) built into a title of 22 terms.

I have tested a search model: I took a page with a 19-word title (one of the pages on my Czech website) and ran search queries made of the first 1 to 19 words of that title. See below.
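The way such a series of queries is built from a title can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool I used (I typed the queries by hand); the function name `prefix_queries` is my own label, and splitting the title on whitespace is an assumption about what counts as one word.

```python
def prefix_queries(title: str) -> list[str]:
    """Return the queries made of the first 1, 2, ..., N words of a title.

    Assumption: words are separated by whitespace; punctuation stays
    attached to the word it follows.
    """
    words = title.split()
    return [" ".join(words[:n]) for n in range(1, len(words) + 1)]


if __name__ == "__main__":
    # The 8-term message from the title of this post, used as sample input.
    message = "WEBSITE PAGES TITLES INCLUDED IN GOOGLE SEARCH QUERIES"
    for n, query in enumerate(prefix_queries(message), start=1):
        print(n, query)
```

For a 19-word title the same function would give 19 queries, from the single first word up to the complete title.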


It is interesting to see how the number of search results changes as the number of words in the query decreases, and at which point the watched website drops out of the relevant results. A similar situation usually arises for any search query, regardless of subject and language. In our case, the literal translation of the title is “About [the] dead [speak only] well – about [the] living [speak] badly – and about whom [to speak the] truth? Bare facts from [the] life [of] Vaclav Havel and his family.”

When the exact title is used as the search query, we expect the website carrying that title to hold a high position in the list of results, provided the title is long enough and the website is not blacklisted or de-indexed. On the other hand, when the query consists of only a few words, there are surely many pages containing the same words, and the website in question can hardly keep its high position. All this is expected and obvious; the only question is what “long enough” means. In our example, “long enough” is between 6 and 10 words: 10 or more words put the website in 1st position in the list, and 6 words put it on the first page of the Google results, which shows ten results per page (an unexplained fluctuation occurred for 9 terms). I have noticed that in many searches “long enough” is 6 words. It may sometimes be fewer when unusual terms are used.
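One way to make “long enough” precise is to look for the shortest query length from which the website never again falls below a chosen position. The sketch below does that. The function name `stable_threshold` is hypothetical, and the positions in `example_ranks` are placeholder values invented for illustration only; they are not my measured data, although I chose them to agree with the pattern described in this post.

```python
from typing import Optional


def stable_threshold(ranks: dict[int, Optional[int]], max_rank: int) -> Optional[int]:
    """Smallest tested query length n such that every tested length >= n
    placed the website at position max_rank or better.

    ranks maps query length -> position in the results (None = not found).
    Returns None if even the longest query did not reach max_rank.
    """
    threshold = None
    # Walk from the longest query down and stop at the first failure.
    for n in sorted(ranks, reverse=True):
        rank = ranks[n]
        if rank is None or rank > max_rank:
            break
        threshold = n
    return threshold


# Placeholder positions, invented for illustration; NOT measured data.
example_ranks = {4: None, 5: 60, 6: 8, 7: 1, 8: 1, 9: 4, 10: 1, 11: 1}

if __name__ == "__main__":
    print(stable_threshold(example_ranks, max_rank=1))   # stable 1st position
    print(stable_threshold(example_ranks, max_rank=10))  # stable first page
```

With these placeholder values the website holds 1st position reliably only from 10 words on, because of the dip at 9 words, while it stays on the first page from 6 words on.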

The number of results does not change smoothly or monotonically as the title is shortened; the jumps are affected by the Google algorithm, by the subject of the title, and by the character of the word deleted in each step (its frequency in the vocabulary, etc.). Going upward from 1 search term, the one-term query returns 16 billion items (“o”, which means “about” in Czech, has meanings in other languages as well). The total number of websites is nowadays about 2 billion, and each has many pages, which might explain the 16 billion figure, even though Google probably gives only an extrapolation and the real number of sites containing the word “o” in their text is lower. Including a second term in the query reduces the number of results 1000 times, because the second term exists only in Czech. Adding a further 1, 2, or 3 terms, which are rather banal and expected in the Czech context, causes only insubstantial fluctuations in the number of items found; only the addition of a term that is not quite expected (“badly” or “evil-like”) reduces the number of results 10 times (from 5 to 6 terms). Adding a further 1 to 4 words with relatively banal and expected meanings in the context again causes some fluctuations, but only the addition of an unexpected word (“bare”) leads to a 10-fold reduction in the websites found (from 10 to 11 terms). Broadening the query further by 1 to 5 terms, which are quite relevant, results in rather wide fluctuations within the same order of magnitude (from 12 to 16 terms), and when the last 1 to 3 not very relevant terms are added, the number of items found is reduced 100-fold and remains nearly constant (from 17 to 19 terms). The changes in the number of items found between 11 and 19 terms cannot be simply explained; they depend on how the Google algorithm determines the importance of the terms in the title.
One would expect the number of items found to decrease monotonically with the increasing length of the query, but the Google algorithm is more complex, and the number sometimes jumps up when another term is added. As observed above, the watched website starts to appear among the first 100 websites when the first 5 words of its title are included in the search query, appears on the first page when 6 terms are included, and becomes the 1st one when 7 terms are included. However, hardly explainable fluctuations can be seen even in the placing of the website, as happened when 9 terms were used. It must be emphasized that neither the numbers of items nor the website placings are quite reproducible when the searches are repeated at different times.
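Whether a series of result counts really decreases with every added term can be checked mechanically. The sketch below lists the query lengths at which the count went up instead of down. The function name `upward_jumps` is hypothetical. In `example_counts`, only the first two values follow the figures given in this post (16 billion for one term, about 1000 times fewer for two); the rest are placeholders invented for illustration, not measured data.

```python
def upward_jumps(counts: dict[int, int]) -> list[int]:
    """Return the query lengths whose result count is higher than the
    count for the next shorter tested query.

    counts maps query length (in words) -> reported number of results.
    An empty list means the counts never increased with added terms.
    """
    lengths = sorted(counts)
    return [
        longer
        for shorter, longer in zip(lengths, lengths[1:])
        if counts[longer] > counts[shorter]
    ]


# Lengths 1 and 2 follow the post; lengths 3 to 6 are invented placeholders.
example_counts = {
    1: 16_000_000_000,
    2: 16_000_000,
    3: 12_000_000,
    4: 15_000_000,
    5: 11_000_000,
    6: 1_100_000,
}

if __name__ == "__main__":
    print(upward_jumps(example_counts))
```

In this placeholder series the only upward jump is at 4 terms, where the count rises although a term was added.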

Surprisingly, the website you are now reading was penalized during the period in which I performed searches similar to the one shown in the above scheme: its Google ranking was substantially reduced, and it was even de-indexed. The rapid succession of changing queries I placed in Google Search was probably perceived by Google algorithms as spam activity, even though it did not include the activities typically defined as spam, which usually involve modifying one’s website (see Google patent US 8,924,380), something I have never done. I am hopeful that my website will be freed of unjustified penalties, and that the present post will be ranked as high as possible.
