Common search engine principles
Most people who work with computers have used a search engine. A search
engine is a program that searches documents on the internet for specific
keywords and then returns a list of the documents in which those keywords
are found.
Several programs work together to make a search engine function. Before any
query can be answered, the search engine sends out a spider, whose job is to
download as many documents as possible so that they can later be matched
against keywords. The spider can be compared to a web browser; the difference
is that a browser renders a page visually, while the spider has no visual
components and works directly with the HTML code of the page.
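
To illustrate how a spider works with HTML code rather than a rendered page, here is a minimal sketch in Python using the standard library's html.parser module. The names TextExtractor, extract_text and SAMPLE_PAGE are made up for this example, and a real spider would first download the page over the network, which is left out here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text of a page, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text found in an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page content; a real spider would fetch this from a URL.
SAMPLE_PAGE = (
    "<html><head><title>Demo</title><style>p {color: red}</style></head>"
    "<body><h1>Search engines</h1><p>Spiders read <b>HTML</b>.</p></body></html>"
)

page_text = extract_text(SAMPLE_PAGE)
```

The spider never draws the heading or the bold text; it only sees tags and the text between them, which is all the later stages need.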
The next program is the crawler, which finds the links on each page. It is
the crawler that tells the spider where to go next. By following these
links, the crawler can discover documents that the search engine did not yet
know about. Finally, the indexer analyzes each page and each part of the
page, such as headers, text, and special HTML tags. These three programs,
the spider, the crawler, and the indexer, together make up the common search
engine principle.
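
The way a crawler follows links to discover new documents can be sketched roughly as follows. This is only an illustration under stated assumptions: FAKE_WEB is a hypothetical four-page site held in memory instead of the real web, the example.test addresses are invented, and the breadth-first order is one possible choice, not necessarily what any particular search engine does.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


# Hypothetical site standing in for the web (assumption for this sketch).
FAKE_WEB = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.test/a.html": '<a href="/">home</a> <a href="/c.html">C</a>',
    "http://example.test/b.html": "<p>no links here</p>",
    "http://example.test/c.html": '<a href="/a.html">back to A</a>',
}


def crawl(start_url, fetch):
    """Visit pages breadth-first; fetch(url) returns HTML or None."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be downloaded
        order.append(url)
        extractor = LinkExtractor(url)
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order


visited = crawl("http://example.test/", FAKE_WEB.get)
```

Note that c.html is never linked from the start page; it is found only because the crawler follows the link on a.html, which is how previously unknown documents get discovered.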
Some crawlers are more thorough than ordinary ones. These crawlers perform a
deep crawl of a website to reach as many of its pages as possible, so that
more of them can be matched against the keywords entered into the search
engine. Deep crawlers can also gather pages that were never submitted to the
search engine. Even so, it is generally better to search on larger search
engines: the larger the search engine, the more pages it lists. Some search
engines can follow frame links and some cannot, so it is better to rely on
search engines that follow frame links, because they can cover your web
pages more completely.
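
As a rough illustration of what following frame links involves, the sketch below collects the src addresses of frame and iframe tags, which a crawler would otherwise miss if it only looked at ordinary links. The class name FrameLinkExtractor, the sample page and the example.test address are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameLinkExtractor(HTMLParser):
    """Collects the absolute URLs referenced by <frame> and <iframe> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frame_links = []

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.frame_links.append(urljoin(self.base_url, src))


# Hypothetical frameset page: the real content lives in the framed documents.
FRAMESET_PAGE = (
    '<frameset cols="20%,80%">'
    '<frame src="menu.html"><frame src="content.html">'
    "</frameset>"
)

extractor = FrameLinkExtractor("http://example.test/site/index.html")
extractor.feed(FRAMESET_PAGE)
frame_links = extractor.frame_links
```

A crawler that ignores these tags would index only the nearly empty frameset page and never reach menu.html or content.html.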
Crawlers are found in crawler-based search engines, where the listings are
created automatically. There are also human-powered directories, whose
listings depend on human editors. Hybrid search engines combine both
approaches: some of their listings are created automatically, while others
depend on humans.
Search engines have a large database that stores the downloaded and
processed pages. It is usually referred to as the index of the search
engine. Next in line among the common search engine principles is the
results engine. This is the program that extracts the matching pages from
the database and ranks them, so that the pages that best match the user's
query come first. The order is determined by the search engine's ranking
algorithms.
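
The index and the results engine can be sketched together in a few lines. This is a deliberately simplified illustration: the index maps each word to the pages that contain it, and the ranking simply counts how often the query words occur on each page. Real ranking algorithms are far more elaborate and are not described in this article; the page names and the functions tokenize, build_index and search are all assumptions for this example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> Counter of {page: occurrences}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Rank pages by the total number of occurrences of the query words."""
    scores = Counter()
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by page name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical processed pages (text already extracted from the HTML).
PAGES = {
    "page1": "search engine spider crawler",
    "page2": "search engine index and search results",
    "page3": "cooking recipes",
}

INDEX = build_index(PAGES)
RESULTS = search(INDEX, "search engine")
```

With these sample pages, page2 ranks above page1 because the word "search" appears on it twice, and page3 is not returned at all because it contains neither query word.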
The next important part of the search engine is the web server, which is
responsible for all interaction between the user and the other search engine
components. The web server serves an HTML page with an input field, in which
the user types the exact query or information he or she is searching for.
The web server's other task is to display the search results for that query
in the form of an HTML page.
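
A very small version of such a web server can be sketched with Python's standard http.server module. Everything specific here is an assumption for illustration: the lookup function is a hypothetical stand-in for the results engine, the q parameter name and the example.test addresses are invented, and a production search front end would look quite different.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical documents; a real server would ask the results engine instead.
DOCUMENTS = {
    "http://example.test/a.html": "how a spider downloads pages",
    "http://example.test/b.html": "how the indexer analyzes headers",
}


def lookup(query):
    """Stand-in for the results engine: pages whose text contains the query."""
    needle = query.lower()
    return sorted(url for url, text in DOCUMENTS.items() if needle in text)


def render_page(query, results):
    """Build the HTML page: an input field plus the list of results."""
    items = "".join(f'<li><a href="{escape(u)}">{escape(u)}</a></li>' for u in results)
    return (
        "<html><body>"
        '<form method="get" action="/">'
        f'<input type="text" name="q" value="{escape(query)}">'
        '<input type="submit" value="Search">'
        "</form>"
        f"<ul>{items}</ul>"
        "</body></html>"
    )


class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the query from the URL, e.g. /?q=spider
        params = parse_qs(urlparse(self.path).query)
        query = params.get("q", [""])[0]
        results = lookup(query) if query else []
        body = render_page(query, results).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Start the server; call run() and open http://localhost:8000/ to try it."""
    HTTPServer(("localhost", port), SearchHandler).serve_forever()
```

The query is escaped before it is written back into the page, so that text typed by the user is shown as text and not interpreted as HTML.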
By understanding the common search engine principles and organizing your
website according to them, you can improve your chances of placing your
website near the top of search engine listings.