Dated: May 14, 2010
Related Categories: Search Engine Optimization
By Mans Gibler
Each search engine has something called an algorithm: the formula it uses to evaluate web pages and determine their relevance and value when crawling them for possible inclusion in its index. A crawler is the robot that browses these pages on the search engine's behalf. Let's take a look at how the algorithm, PageRank, backlinks, hyperlinks, and various other variables can benefit your website.
GOOGLE Algorithm Is Key
Google has a comprehensive and highly developed technology, a straightforward interface and a wide-ranging array of search tools which enable the users to easily access a variety of information online.
Google users can browse the web and find information in various languages, retrieve maps, stock quotes, and news, search for a long-lost friend using the phonebook listings Google offers for all US cities, and surf the three billion-odd web pages on the internet!
Google boasts the world’s largest archive of Usenet messages, dating all the way back to 1981. Google’s technology can be accessed from any conventional desktop PC as well as from various wireless platforms such as WAP and i-mode phones, handheld devices, and other internet-equipped gadgets.
Page Rank Based On Popularity
The web search technology offered by Google is often the technology of choice of the world’s leading portals and websites. It has also benefited the advertisers with its unique advertising program that does not hamper the web surfing experience of its users but still brings revenues to the advertisers.
When you search for a particular keyword or phrase, most search engines return a list of pages ordered by the number of times the keyword or phrase appears on each one. Google’s web search technology instead combines its own PageRank technology with hypertext-matching analysis, making millions of instantaneous calculations with no human intervention. Google’s architecture is also designed to scale as the internet itself expands.
PageRank technology produces an objective measurement of a web page's significance by solving an equation of more than 500 million variables and 3 billion terms. Unlike some other search engines, Google does not simply count links; it uses the extensive link structure of the web as an organizational tool. A link from Page A to Page B is interpreted as a vote for Page B cast by Page A.
Back Links Are Considered Popularity Votes
Essentially, Google gauges the importance of a page by the number of such ‘votes’ it receives. Google also weighs the importance of the pages that cast the votes: pages that are themselves highly ranked make the pages they link to more important in turn. Note that this process involves no human intervention whatsoever; Google uses the inherent intelligence of the web's link structure to determine the ranking and importance of every page.
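The voting scheme described above can be sketched in a few lines of Python. This is a minimal illustration of the iterative idea, not Google's actual implementation: the damping factor of 0.85 comes from the original PageRank paper, and the tiny four-page link graph is invented for the example.

```python
# Minimal PageRank sketch: each link is a "vote", and votes from
# highly ranked pages are worth more. Illustrative only.

def pagerank(links, damping=0.85, iterations=40):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with equal ranks
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            # a page splits its "voting power" among the pages it links to
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
print(pagerank(graph))  # C gathers the most votes, D (no inlinks) the fewest
```

Running it shows why incoming links from important pages matter: page C, linked to by three pages, ends up with the highest rank, while page D, which nobody links to, sinks to the bottom.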
Unlike its conventional counterparts, Google is a hypertext-based search engine. It analyzes the full content of each web page, factoring in fonts, subdivisions, and the exact position of every term on the page, and it also evaluates the content of neighboring pages. This refusal to disregard any part of a page pays off, enabling Google to return results that closely match user queries.
Google has a very simple 3-step procedure in handling a query submitted in its search box:
- When the query is submitted and the enter key is pressed, the web server sends the query to the index servers. An index server is exactly what its name suggests: it holds an index, much like the index of a book, that shows where pages containing the queried term are located.
- After this, the query proceeds to the doc servers, and these servers actually retrieve the stored documents. Page descriptions or “snippets” are then generated to suitably describe each search result.
- These results are then returned to the user, normally in less than a second!
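The index-server / doc-server split in the three steps above can be sketched with a toy inverted index. The three documents and the snippet logic are invented for illustration; a real index server holds billions of postings, not a Python dict.

```python
# Toy version of the index-server / doc-server split:
# the index maps words to document ids (like a book's index),
# the "doc server" (here, a dict) holds the actual text for snippets.

documents = {
    1: "google ranks pages using pagerank and link analysis",
    2: "an inverted index maps each word to the pages containing it",
    3: "crawlers follow hyperlinks from page to page",
}

# Build the inverted index: word -> set of document ids.
index = {}
for doc_id, text in documents.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

def search(query):
    """Intersect posting sets, then fetch short snippets for each hit."""
    words = query.lower().split()
    if not words:
        return []
    ids = set.intersection(*(index.get(w, set()) for w in words))
    return [(doc_id, documents[doc_id][:40] + "...") for doc_id in sorted(ids)]

print(search("pages pagerank"))
```

Only document 1 contains both query words, so the intersection of the two posting sets leaves a single result, which is then dressed up with a snippet, just as the article describes.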
Approximately once a month, Google updates its index by recalculating the PageRank of each web page it has crawled. The period during the update is known as the Google dance.
Do You Know The GOOGLE Dance?
The Algorithm Shuffle
Because of the nature of PageRank, the calculations need to be performed about 40 times and, because the index is so large, they take several days to complete. During this period the search results fluctuate, sometimes minute by minute. It is because of these fluctuations that the term "Google dance" was coined. The dance usually takes place sometime during the last third of each month.
Google has two other searchable servers besides www.google.com: www2.google.com and www3.google.com. Most of the time the results on all three servers are the same, but during the dance they differ, and those shifting results on www2 and www3 are part of the monthly update. For the rest of the month, occasional fluctuations in the search results should not be confused with the actual dance; they are due to Google's fresh crawl and to what is known as "Everflux".
For most of the dance, the rankings visible on www2 and www3 are the new rankings that will transfer to www when the dance is over. Even though the calculations are run about 40 times, the final rankings can be seen very early on, because after the first few iterations the calculated figures converge close to their final values.
You can see this with the Page Rank Calculator by checking the Data box and performing some calculations. After the first few iterations, the search results on www2 and www3 may still change, but only slightly.
During the dance, the results from www2 and www3 will sometimes show on the www server, but only briefly. Also, new results on www2 and www3 can disappear for short periods. At the end of the dance, the results on www will match those on www2 and www3.
GOOGLE Dance Tool
This Google Dance Tool allows you to check your rankings on all three servers, www, www2, and www3, and on all 9 datacenters simultaneously.
The Google Web Directory combines Google's search technology with the Netscape Open Directory Project, making it possible to search the internet organized by topic. Google displays directory pages in order of their PageRank, and it searches not only the titles and descriptions of websites but the entire content of sites within a related category, ultimately delivering a more comprehensive search to users.
Submitting your URL to Google
Google is primarily a fully automatic search engine with no human intervention involved in the search process. It uses robots known as ‘spiders’ to crawl the web on a regular basis, looking for updates and new websites to include in the Google index. This robot software follows hyperlinks from site to site. Google does not require you to submit your URL for inclusion in the index, as the spiders do so automatically; however, you can submit a URL manually by going to the Google website and clicking the related link. Importantly, Google does not accept payment of any sort for site submission or for improving the PageRank of your website. Also, submitting your site through the Google website does not guarantee listing in the index.
Sometimes, a webmaster might program the server in such a way that it returns different content to Google than it returns to regular users, which is often done to misrepresent search engine rankings. This process is referred to as cloaking as it conceals the actual website and returns distorted web pages to search engines crawling the site. This can mislead users about what they'll find when they click on a search result. Google highly disapproves of any such practice and might place a ban on the website which is found guilty of cloaking.
Here are some of the important tips and tricks that can be employed while dealing with Google.
- A website should have a crystal-clear hierarchy and text links and should be easy to navigate.
- A site map helps users find their way around your site; if the site map has more than 100 links, it is advisable to break it into several pages to avoid clutter.
- Come up with essential and precise keywords and make sure that your website features relevant and informative content.
- The Google crawler does not recognize text embedded in images, so for important names, keywords, or links, stick with plain text.
- The TITLE and ALT tags should be descriptive and accurate and the website should have no broken links or incorrect HTML.
- Dynamic pages (the URL consisting of a ‘?’ character) should be kept to a minimum as not every search engine spider is able to crawl them.
- The robots.txt file on your web server should be current and should not block the Googlebot crawler. This file tells crawlers which directories can or cannot be crawled.
- When making a site, do not cheat your users, i.e. those people who will surf your website. Do not provide them with irrelevant content or present them with any fraudulent schemes.
- Avoid tricks or link schemes designed to increase your site's ranking.
- Do not employ hidden text or hidden links.
- Google frowns upon cloaking, so it is advisable to avoid it.
- Automated queries should not be sent to Google.
- Avoid stuffing pages with irrelevant words and content. Also don't create multiple pages, sub-domains, or domains with significantly duplicate content.
- Avoid "doorway" pages created just for search engines or other "cookie cutter" approaches such as affiliate programs with hardly any original content.
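The robots.txt advice above can be checked programmatically. Python's standard library ships a robots.txt parser, `urllib.robotparser`, which applies the same rules a crawler like Googlebot would. The rules below are an invented example, not taken from any real site.

```python
# Checking which paths a robots.txt file allows the Googlebot crawler
# to fetch, using Python's standard urllib.robotparser.
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: Googlebot",
    "Disallow: /private/",
    "",
    "User-agent: *",
    "Disallow: /tmp/",
]

rp = RobotFileParser()
rp.parse(rules)  # parse() accepts an iterable of robots.txt lines

print(rp.can_fetch("Googlebot", "/index.html"))      # True  (allowed)
print(rp.can_fetch("Googlebot", "/private/a.html"))  # False (blocked)
```

A quick check like this before going live helps ensure you have not accidentally blocked the Googlebot crawler from directories you want indexed.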
Also, consider technical factors. If a site has a slow connection, it might time-out for the crawler. Very complex pages, too, may time out before the crawler can harvest the text.
If you have a hierarchy of directories at your site, put the most important information high, not deep. Some search engines presume that the higher you place the information, the more important it is, and crawlers may not venture deeper than three to five directory levels.
Above all, remember the obvious: full-text search engines index text. You may well be tempted to use fancy and expensive design techniques that either block search engine crawlers or leave your pages with very little plain text that can be indexed. Don’t fall prey to that temptation.
Ranking Rules Of Thumb
The simple rule of thumb is that content counts, and that content near the top of a page counts for more than content at the end. In particular, the HTML title and the first couple of lines of text are the most important parts of your pages. If the words and phrases that match a query appear in the HTML title or first couple of lines of text of one of your pages, chances are very good that the page will appear high in the list of search results.
A crawler/spider search engine can base its ranking both on static factors (a computation of a page's value independent of any particular query) and on query-dependent factors. Static factors favor:
- Long pages, which are rich in meaningful text (not randomly generated letters and words).
- Pages that serve as good hubs, with lots of links to pages with related content (topic similarity, rather than random meaningless links such as those generated by link-exchange programs or intended to create a false impression of "popularity").
- The connectivity of pages, including not just how many links there are to a page but where the links come from: the number of distinct domains and the "quality" ranking of those particular sites. This is calculated for the site and also for individual pages. A site or a page is "good" if many pages at many different sites point to it, and especially if many "good" sites point to it.
- The level of the directory in which the page is found. Higher is considered more important. If a page is buried too deep, the crawler simply won't go that far and will never find it.
These static factors are recomputed about once a week, and new good pages slowly percolate upward in the rankings. Note that there are advantages to having a simple address and sticking to it, so others can build links to it and so you know it stays in the index. The query-dependent factors that matter most are:
- The HTML title.
- The first lines of text.
- Query words and phrases appearing early in a page rather than late.
- Meta tags, which are treated as ordinary words in the text, like words that appear early in the text (unless the meta tags are patently unrelated to the content of the page itself, in which case the page will be penalized).
- Words mentioned in the "anchor" text associated with hyperlinks to your pages. (E.g., if lots of good sites link to your site with anchor text "mesothelioma cancer" and the query is "mesothelioma cancer," chances are good that you will appear high in the list of matches.)
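The list above (title, early text, meta tags, anchor text) is exactly what a crawler pulls out of the raw HTML. The sketch below uses Python's standard `html.parser` to extract those signals from a sample page; the page itself is invented for illustration.

```python
# Extracting the ranking signals a crawler sees: the HTML title,
# meta tags, and the anchor text of links. Sample page is invented.
from html.parser import HTMLParser

SAMPLE = """
<html><head>
<title>Mesothelioma Cancer Resources</title>
<meta name="description" content="Information about mesothelioma cancer.">
</head><body>
<a href="/treatment.html">mesothelioma treatment</a>
</body></html>
"""

class PageExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = ""
        self.metas = {}      # meta name -> content
        self.anchors = []    # anchor text of each link
        self._in_title = False
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.metas[attrs["name"]] = attrs.get("content", "")
        elif tag == "a":
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_anchor:
            self.anchors.append(data)

p = PageExtractor()
p.feed(SAMPLE)
print(p.title)    # Mesothelioma Cancer Resources
print(p.anchors)  # ['mesothelioma treatment']
```

Note how the anchor text "mesothelioma treatment" is a signal about the *linked* page, which is why descriptive link text from good sites is so valuable.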
Blanket Policy On Doorway Pages And Cloaking
Many search engines are opposed to doorway pages and cloaking. They consider doorway and cloaked pages to be spam and encourage people to use other avenues to increase the relevancy of their pages. We’ll talk about doorway pages and cloaking a bit later.
Meta Tags (Ask.Com As An Example)
Though meta tags are indexed and treated as regular text, Ask.com claims it doesn't give them priority over HTML titles and other text. You should still use meta tags in all your pages, though some webmasters claim their doorway pages for Ask.com rank better without them. If you do use meta tags, keep your description tag to no more than 150 characters and your keywords tag to no more than 1,024 characters.
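The two length limits just mentioned (150 characters for the description, 1,024 for keywords) are easy to enforce with a small check. The function and sample values below are invented for illustration.

```python
# Validate meta tag lengths against the limits mentioned above.
MAX_DESCRIPTION = 150
MAX_KEYWORDS = 1024

def check_meta(description, keywords):
    """Return a list of human-readable problems (empty means OK)."""
    problems = []
    if len(description) > MAX_DESCRIPTION:
        problems.append(
            f"description is {len(description)} chars (max {MAX_DESCRIPTION})")
    if len(keywords) > MAX_KEYWORDS:
        problems.append(
            f"keywords is {len(keywords)} chars (max {MAX_KEYWORDS})")
    return problems

print(check_meta("A short, relevant description.", "seo, google, pagerank"))  # []
print(check_meta("x" * 200, "seo"))  # flags the over-long description
```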
Keywords In The URL And File Names
It's generally believed that Ask.com gives some weight to keywords in filenames and URL names. If you're creating a file, try to name it with keywords.
Keywords In The ALT Tags
Ask.com indexes ALT tags, so if you use images on your site, make sure to add them. ALT tags should contain more than the image's description; they should include keywords, especially if the image is near the top of the page. ALT tags are explained later.
There's been some debate about how long doorway pages for AltaVista should be. Some webmasters say short pages rank higher, while others argue that long pages are the way to go. According to AltaVista's help section, it prefers long and informative pages. We've found that pages with 600-900 words are most likely to rank well.
AltaVista has the ability to index frames, but it sometimes indexes and links to pages intended only as navigation. To keep this from happening to you, submit a frame-free site map containing the pages that you want indexed. You may also want to include a "robots.txt" file to prohibit AltaVista from indexing certain pages.