A: On the Internet, a search engine has three parts:
1. A spider (also called a "crawler" or a "bot"), which visits every page, or a representative sample of pages, on every searchable web site, reads it, and then follows the hypertext links on those pages to reach the other pages linked from that site.
2. A catalog or index, which is created by programs that compile the pages read from those web sites.
3. A program which receives your search request, compares it to the entries in the index, and returns the results to you.
An alternative to using a search engine is to explore a structured directory of topics. Yahoo, which also lets you use its search engine, is the most widely-used directory on the Web. A number of Web portal sites offer both the search engine and directory approaches to finding information.
Not all search engines are created equal, but all of them have a few basic components that are essential to their use. Some components are more visible than others to the average user, but all of them must work in tandem to create a high-performance search tool. Three basic actions have to be performed for a search engine to be useful: gather information, analyze information, and display information. The main difference between the major search engines is how these tasks are performed and how often they are performed.
Gathering information
Spiders are the programs that search engines use to collect information about web sites on the Internet. These programs traverse the World Wide Web, gather the content of web sites, and store that information for later processing.
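As a rough illustration of what "gathering the content" of a single page can mean, here is a minimal sketch using only Python's standard library. The names `PageReader` and `read_page` are made up for this example, and a real spider does far more (fetching over the network, skipping scripts and styles, respecting robots.txt).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageReader(HTMLParser):
    """Collects the text and the outgoing links of one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Every <a href="..."> is a link the spider could follow later.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        # Keep non-empty text; this simplification also keeps script text.
        if data.strip():
            self.text_parts.append(data.strip())


def read_page(base_url, html):
    """Return the stored record for one page: its URL, text, and links."""
    reader = PageReader(base_url)
    reader.feed(html)
    reader.close()
    return {
        "url": base_url,
        "text": " ".join(reader.text_parts),
        "links": reader.links,
    }
```

The returned record is the kind of thing a spider might hand over for later processing; the exact fields are an assumption of this sketch.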
There are two basic ways that spiders can find your web site. You can tell the search engine about your web site, or let it find your site on its own. Typically, a search engine has a place on its web site where you can suggest a site. After a site has been suggested, the search engine's spider will visit that web site to collect information about it. Spiders also follow the links on each web site to find linked sites to visit. This is how a spider will find your site by itself. The more web sites that link to your site, the more likely a spider is to find your site without you telling it your site's URL.
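The two discovery routes described above can be sketched as a breadth-first walk that starts from the submitted URLs and then follows links. This is only a sketch: `crawl`, `get_links`, and the tiny in-memory `WEB` dictionary are hypothetical stand-ins for real page fetching.

```python
from collections import deque


def crawl(seed_urls, get_links, max_pages=100):
    """Visit submitted URLs first, then every page reachable through links."""
    seeds = list(dict.fromkeys(seed_urls))  # drop duplicate submissions
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


# A made-up web: only a.example was submitted; d.example has no inbound links.
WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://c.example/"],
    "http://c.example/": ["http://a.example/"],
    "http://d.example/": [],
}

order = crawl(["http://a.example/"], lambda url: WEB.get(url, []))
print(order)
# ['http://a.example/', 'http://b.example/', 'http://c.example/']
```

In this toy web, `d.example` is never visited because nobody links to it and nobody submitted it, which is the situation the paragraph above warns about.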
Usually, search engine spiders will revisit your site when you submit your URL again, when the spider finds a link to your site, or after a specified amount of time has passed since its last visit. Depending on the number of web sites that the spider needs to visit and the resources that the spider has at its disposal, it can take days or months for a spider to visit or revisit your web site.
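The three revisit triggers can be written down as a small decision function. This is a sketch under stated assumptions: the function name, its parameters, and the 30-day interval are all invented for illustration, and no particular search engine is known to use these values.

```python
from datetime import datetime, timedelta

# Assumed value for illustration only; real engines choose their own schedule.
REVISIT_INTERVAL = timedelta(days=30)


def should_revisit(last_visit, now, resubmitted=False, new_link_found=False,
                   interval=REVISIT_INTERVAL):
    """Return True if any of the three revisit triggers applies."""
    if resubmitted or new_link_found:
        return True
    # Otherwise revisit only once enough time has passed since the last visit.
    return now - last_visit >= interval
```

For example, `should_revisit(datetime(2010, 1, 1), datetime(2010, 1, 10))` is false under the assumed interval, but becomes true if `resubmitted=True` is passed.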
Displaying information
Search engines take a search request from a user and display a list of web pages that relate to that topic. The pages returned give clues to the algorithm used to analyze the web pages in the search engine's index. When a search engine displays the file size of a web page, or a percentage next to the web site, that information can help you figure out how to optimize your web pages for that search engine. Some search engines return results in order of relevance; others mix up the results to make sure the pages returned come from different sites. No matter how a search engine displays the information requested by a user, this result is typically the first impression of your web site. It is important to follow any guidelines that search engines give and to research how each search engine analyzes web pages, so that you not only get a good ranking in the search results but also an accurate description of your site.
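To make the index and the two display styles concrete, here is a minimal sketch. The scoring rule (counting how often the query words appear on a page) is an assumption chosen for simplicity and is not the algorithm of any real search engine; `build_index`, `search`, and `one_per_site` are hypothetical names.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_index(pages):
    """Map each word to the pages containing it, with a per-page count."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Return (url, score) pairs, highest score first (ties sorted by URL)."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def one_per_site(results):
    """Keep only the best-ranked page from each site (host name)."""
    seen_hosts = set()
    diverse = []
    for url, score in results:
        host = urlparse(url).netloc
        if host not in seen_hosts:
            seen_hosts.add(host)
            diverse.append((url, score))
    return diverse
```

Calling `search` alone gives the pure relevance ordering, while passing its output through `one_per_site` gives the "different sites" style of listing described above.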
Thursday, February 11, 2010