Though the workings of different search engines vary, they all
perform some basic functions. Essentially, all search engines work in an
orderly fashion, performing three main operations: web crawling, indexing
and searching.
Web Crawling: Before it can give you the
information you want, a search engine has to find it. To do so, search
engines, which are essentially websites that respond to search queries,
“crawl” the web. Web crawling, or spidering, is one of the basic functions of
a search engine. It is carried out by a bot, referred to as a web crawler or
spider, that retrieves information from web pages. A web crawler is a computer
program that navigates the internet in an automated, organized manner and
retrieves information directly from each page. It records every link on each
page it visits and analyses the content for indexing. Websites and URLs are
not the only entries that spiders identify; they also take note of the words
within pages and where they were found, though the means and techniques vary
from search engine to search engine.
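The crawling loop described above can be sketched roughly as follows. This is only an illustration, not how any particular search engine works: the FAKE_WEB dictionary, the example URLs and the names LinkAndTextParser and crawl are all made up for the example, and a real crawler would fetch pages over HTTP rather than from memory.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML); stands in for real HTTP fetches.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">B</a> search engines crawl',
    "http://b.example/": '<a href="http://a.example/">A</a> '
                         '<a href="http://c.example/">C</a> spiders index pages',
    "http://c.example/": "no links here",
}


class LinkAndTextParser(HTMLParser):
    """Collects link targets and lower-cased words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every anchor tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Record the words in the text, in the order they appear.
        self.words.extend(data.lower().split())


def crawl(start_url, fetch=FAKE_WEB.get):
    """Breadth-first crawl; returns {url: (links, words)} for each page reached."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # unknown URL in the fake web: skip it
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()  # flush any text still buffered by the parser
        pages[url] = (parser.links, parser.words)
        for link in parser.links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return pages


pages = crawl("http://a.example/")
for url, (links, words) in pages.items():
    print(url, links, words)
```

The set of seen URLs is what keeps the spider from looping forever when two pages link to each other, as the first two pages do here.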
Indexing: Once the spiders have collected all
the information, it needs to be stored, compiled and organized so that it is
accessible to users and available for searching. To be more than a website
that lists links, search engines often store additional information, including
how frequently each word occurs on a page and the importance given to it (for
ranking). Some search engines also assign a “weight” to each entry, measuring
the differing value of words as they appear on a certain page. All of this
information is then encoded compactly and stored, ready for indexing. An
efficient means of building an index is a hash table, which applies a formula
(a hash function) to give each word a numeric value; each entry consists of
that hash number and a pointer to the actual data. This arrangement makes the
indexing and storage system effective for quick search results, even for
complicated searches.
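As a rough illustration of such an index, the sketch below maps each word to the pages it occurs on and its positions there. It relies on Python's built-in dict, which is itself a hash table, so the hashing step is handled by the language. The PAGES data and the names build_index and frequency are invented for the example; real engines use far more elaborate structures and weighting schemes.

```python
from collections import defaultdict

# Hypothetical crawled pages (page id -> text), made up for illustration.
PAGES = {
    "page1": "search engines crawl the web",
    "page2": "spiders crawl pages and index pages",
}


def build_index(pages):
    """Map each word to {page id: [positions where the word occurs]}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(page_id, []).append(position)
    return index


def frequency(index, word, page_id):
    """Number of times `word` occurs on `page_id` (0 if it never does)."""
    return len(index.get(word, {}).get(page_id, []))


index = build_index(PAGES)
print(index["crawl"])                        # pages and positions for "crawl"
print(frequency(index, "pages", "page2"))    # how often "pages" is on page2
```

Storing positions rather than a bare count keeps both pieces of information the text mentions: how often a word occurs and where it was found.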
Searching: This is the point at which the user first
interacts with the search engine. Step one, build a query; step two, submit
it. Once the query is submitted, the search engine processes it and extracts
the matching information from the index.
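A minimal sketch of this lookup step might look like the following. It assumes a tiny prebuilt index of occurrence counts and a simple rule (keep pages that contain every query word, ordered by total occurrences); the INDEX data and the search function are hypothetical, and actual engines rank results with much richer signals.

```python
# Hypothetical prebuilt index: word -> {page id: occurrence count}.
INDEX = {
    "search": {"page1": 2, "page3": 1},
    "engine": {"page1": 1, "page2": 1, "page3": 3},
    "crawl": {"page2": 4},
}


def search(query, index=INDEX):
    """Return pages containing every query word, most occurrences first."""
    words = query.lower().split()
    if not words:
        return []
    # Look up the index entry for each word (empty if the word is unknown).
    postings = [index.get(word, {}) for word in words]
    # Keep only the pages that appear in every entry.
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= set(posting)
    # Score each page by its total occurrence count across the query words.
    scored = [(sum(p[page] for p in postings), page) for page in matching]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [page for _, page in scored]


print(search("search engine"))  # ['page3', 'page1']
```

Because the index is keyed by word, answering the query takes a handful of dictionary lookups rather than a scan of every page.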