The gentle art of the web

For thousands of years, adventurous souls have gone looking for adventure and a taste of the exotic. In the past ten years, the more timid members of that small group have been led to the Internet. In the 1990s, opportunities to buy into this special medium reached an all-time high, and every techie had a space of their own. However, not every website on the Internet is self-respecting, and some are shameless.
What is decentralization? Take the Internet: anyone can set up web pages and link to other web pages. That is decentralized. Anyone can let a search engine see these pages. That is centralized. A search engine can also add blogs: that is Google + Blogger, which is now both a publisher and a search engine, and so has more power. Distributed things are harder to manage and use. Centralized ones can ultimately be monetized relatively easily. (If you are curious, I am writing about "platforms" in detail.)
A search engine is a program for finding information on the Internet. The results of a user's search query are presented as a list on a web page. Each result is a link to a web page that contains information relevant to the query. The information may be a web page, an audio or video file, or a multimedia document. A web search engine works by storing information in a database. To gather this information, it crawls each link on a given website. Google is currently considered the most powerful and most widely used search engine. It is a large, general-purpose search engine that crawls and indexes millions of web pages each day. It is well suited to information retrieval, but it is not sufficient for complicated queries that require additional knowledge.
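As a rough illustration of the "store information in a database, then answer queries with a list of links" idea, here is a minimal sketch of an in-memory inverted index. The page texts, the example.com URLs, and the function names build_index and search are all hypothetical, chosen only for illustration; a real search engine adds ranking and much more.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    # Start from the pages matching the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


# Hypothetical pages standing in for crawled content.
pages = {
    "https://example.com/a": "search engines crawl the web",
    "https://example.com/b": "a crawler follows links on the web",
}
index = build_index(pages)
print(search(index, "web"))
print(search(index, "crawl web"))
```

The query is matched word by word, so a multi-word query returns only the pages that contain all of its words.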
A web search engine works by storing information about many web pages, which it retrieves from the HTML itself. These pages are retrieved by a web crawler (also called a spider), an automated web browser that follows every link on the site. Pages can be excluded using robots.txt. The contents of each page are then analyzed to decide how it should be indexed (e.g. words are extracted from titles, headings, or special fields called meta tags). Data about web pages is stored in an index database for use in later queries. A query can be a single word. The purpose of the index is to make it possible to find information as quickly as possible. Some search engines (such as Google) store all or part of the source page (called a cache) as well as information about the page, while others (e.g. AltaVista) store every word of every page they find.
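The crawl step described above can be sketched with Python's standard library alone. This is only an illustration under stated assumptions: the HTML snippet, the robots.txt rules, the example.com URLs, the "ExampleBot" user agent, and the names PageParser and allowed_links are all made up for the example, and nothing is fetched from the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class PageParser(HTMLParser):
    """Collect the title, named meta tags, and link targets of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"]] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def allowed_links(base_url, html, robots, agent="ExampleBot"):
    """Parse a page and return the parser plus the links robots.txt permits."""
    parser = PageParser()
    parser.feed(html)
    absolute = [urljoin(base_url, href) for href in parser.links]
    return parser, [url for url in absolute if robots.can_fetch(agent, url)]


# Hypothetical robots.txt rules, parsed from text instead of being downloaded.
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])

# Hypothetical page content.
html = (
    '<html><head><title>Home</title>'
    '<meta name="keywords" content="web, search"></head>'
    '<body><a href="/about.html">About</a>'
    '<a href="/private/data.html">Private</a></body></html>'
)
page, to_visit = allowed_links("https://example.com/index.html", html, robots)
print(page.title, page.meta, to_visit)
```

A crawler built along these lines would add the permitted links to a queue of pages to visit next, and pass the extracted title and meta fields to the indexing step.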