Only a small part of the information on the web is within reach of search engines. The rest lies in the so-called invisible web, or deep web, whose contents are not indexed by search engines. The deep web includes services that require registration, such as licensed academic databases.
Original photo: Uwe Kils
Search results are personalized according to, for example, your language, location, and search history. Two people can get very different results for the same search. If you want to search without personalization, clear your browser's cache and cookies, or browse privately in incognito mode.
Search engine results change quickly, because web content is updated continuously and the criteria used by search robots change. If you want to be sure of finding a specific web page again in the future, archive it on your own computer.
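One way to archive a page on your own computer is to save a dated copy of its HTML. The following is only a minimal sketch using Python's standard library; the function names, the `web_archive` folder, and the file-naming scheme are assumptions made for illustration, not a recommended standard.

```python
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def snapshot_name(url, when):
    """Build a file name from the host name and a UTC timestamp."""
    # Replace ':' (used before a port number) so the name is valid on all systems.
    host = (urlparse(url).netloc or "page").replace(":", "_")
    return f"{host}_{when.strftime('%Y%m%dT%H%M%SZ')}.html"


def save_snapshot(url, content, folder, when=None):
    """Write the page bytes to a dated file and return the file's path."""
    when = when or datetime.now(timezone.utc)
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / snapshot_name(url, when)
    target.write_bytes(content)
    return target


def archive_page(url, folder="web_archive"):
    """Download a page and store a dated local copy (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return save_snapshot(url, response.read(), folder)
```

Calling `archive_page("https://example.org/")` (a placeholder address) would store one file in the `web_archive` folder. The sketch saves only the HTML of a single page; images, style sheets, and linked pages are not included.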
Search engines modify your search slightly without you noticing: similar words are included in the search, and spell checkers substitute the most common spelling of a word, whether or not you spelled it correctly. This can complicate your search, especially when you want to search for a specific spelling of a word. Google offers a verbatim search (Make a search -> Search tools -> All results -> Verbatim), which searches for the words exactly as you typed them, without autocorrection.
Automatic search algorithms determine the order in which search results are presented. The complex ranking logic is based on, for example, the popularity of the sites and the number of links they contain. Wikipedia usually ranks highly, since its articles contain many links and many web pages link back to them. The best-known ranking algorithm is Google's PageRank; its basic idea has been published, but the exact principles of Google's current ranking are a closely guarded trade secret. Search algorithms are tested on users and continually revised, which partly explains why two similar searches can produce different results.
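The link-based part of such ranking can be illustrated with a simplified version of the published PageRank idea: a page scores highly when many pages, especially high-scoring ones, link to it. The sketch below is not Google's actual ranking, which is secret and uses many other signals. The four-page "web", the damping value of 0.85, and the number of iterations are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by incoming links.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share, as if a reader
        # occasionally jumped to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to the encyclopedia,
# and nothing links to the shop.
toy_web = {
    "encyclopedia": ["blog", "news"],
    "blog": ["encyclopedia"],
    "news": ["encyclopedia"],
    "shop": ["encyclopedia"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy web the encyclopedia page comes out on top because every other page links to it, while the shop page, which receives no links, ends up last.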
No search engine, not even Google, will find all information. In fact, search engines index only about 16 % of web content; the rest (the so-called invisible web or deep web) is beyond their reach. Some academic resources are also beyond the reach of search engines, since access to licensed resources is restricted to users with the appropriate rights.
The purpose of search engines is not solely to serve their users. Most of them are commercial ventures whose income depends on the ads that appear above the search results, on the clicks those ads receive, and on the user data collected.