Monday 9 August 2010

Wseas Transactions

New Subscription to Wseas Transactions

The following information was submitted:

Transactions: WSEAS TRANSACTIONS ON COMPUTERS
Transactions ID Number: 52-302
Full Name: Mukesh Kumar
Position: Assistant Professor
Age: ON
Sex: Male
Address: UIET, Panjab University, Chandigarh
Country: INDIA
Tel:
Tel prefix:
Fax:
E-mail address: mukesh_rai9@yahoo.com
Other E-mails: mukesh_rai9@pu.ac.in
Title of the Paper: A HYBRID REVISIT POLICY FOR WEB SEARCH
Authors as they appear in the Paper: Vipul Sharma, Mukesh Kumar, Renu Vig
Email addresses of all the authors: vipul_85cse@yahoo.co.in, mukesh_rai9@yahoo.com, renuvig@hotmail.com
Number of paper pages: 10
Abstract: A crawler is a program that retrieves and stores pages from the Web, commonly for a Web search engine. A crawler often has to download hundreds of millions of pages in a short period of time and has to constantly monitor and refresh the downloaded pages. Once the crawler has downloaded a significant number of pages, it has to start revisiting them in order to refresh the downloaded collection. Due to resource constraints, search engines usually have difficulty keeping the entire local repository synchronized with the Web. Given the size of the Web today and these inherent resource constraints, re-crawling too frequently wastes bandwidth, while re-crawling too infrequently lowers the quality of the search engine. In this paper a hybrid approach is built by which a Web crawler keeps the retrieved pages "fresh" in the local collection. Towards this goal, the concepts of page rank and age of a Web page are used. A higher page rank means that more users are visiting that Web page and that the page has higher link popularity. The age of a Web page is a measure that indicates how outdated the local copy is. Using these two parameters, a hybrid approach is proposed that can identify important pages at an early stage of a crawl, so that the crawler revisits these important pages with higher priority.
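The abstract does not give the formula that combines the two parameters, so the following is only a minimal sketch of how a revisit order could be derived from page rank and age. The weighted sum, the `alpha` weight, the normalisation, the use of time since the last crawl as a stand-in for age, and all names (`Page`, `revisit_priority`, `revisit_order`) are assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    page_rank: float     # link-popularity score, assumed to be computed elsewhere
    last_crawled: float  # time of the last download (hypothetical unit: hours)


def revisit_priority(page, now, max_rank, max_age, alpha=0.5):
    """Hypothetical hybrid score: weighted mix of normalised rank and age."""
    # Assumption: time since the last crawl approximates how outdated the copy is.
    age = max(now - page.last_crawled, 0.0)
    rank_part = page.page_rank / max_rank if max_rank > 0 else 0.0
    age_part = age / max_age if max_age > 0 else 0.0
    return alpha * rank_part + (1 - alpha) * age_part


def revisit_order(pages, now, alpha=0.5):
    """Return pages sorted so the highest-priority page is revisited first."""
    max_rank = max((p.page_rank for p in pages), default=0.0)
    max_age = max((max(now - p.last_crawled, 0.0) for p in pages), default=0.0)
    return sorted(
        pages,
        key=lambda p: revisit_priority(p, now, max_rank, max_age, alpha),
        reverse=True,
    )
```

With `alpha=1.0` the order depends on page rank alone, and with `alpha=0.0` on age alone; intermediate values trade link popularity against staleness.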
Keywords: Revisit Policy, Search Engines, Web Crawler, WWW, Page rank
EXTENSION of the file: .doc
Special (Invited) Session:
Organizer of the Session:
How Did you learn about congress:
IP ADDRESS: 124.124.224.105