Web Spider Performance and Data Structure Analysis


Semantic Web and Web Services, 2012, United States Of America, 01 July 2012, pp.73-77

  • Publication Type: Conference Paper / Full Text
  • Country: United States Of America
  • Page Numbers: pp.73-77


The aim of this study is to evaluate the performance of a web spider, a component that almost all search engines use for web crawling. During a crawl, the spider needs a data structure to record the pages it has visited and the keywords extracted from them. The paper first examines the candidate data structures for a web spider in detail and critiques each in terms of time and memory efficiency. The candidates are then narrowed to tree variations only, and one representative tree is selected from each tree data structure family. Finally, a search engine is implemented together with the selected tree from each family, and the performance of each alternative is benchmarked.
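
To make the role of such a data structure concrete, the following is a minimal sketch of a tree that records visited pages and their keywords. The abstract does not name the trees that were evaluated, so this example assumes a plain unbalanced binary search tree keyed by URL; the names `Node`, `VisitedTree`, `add`, and `find` are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Optional, Set


class Node:
    """One visited page: the URL is the key, the keywords are the payload."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.keywords: Set[str] = set()
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


class VisitedTree:
    """Unbalanced binary search tree ordered by URL string comparison.

    Illustrative assumption only; a balanced variant would avoid the
    worst case where sorted insertions degrade the tree into a list.
    """

    def __init__(self) -> None:
        self.root: Optional[Node] = None
        self.size = 0

    def add(self, url: str, keywords: Iterable[str]) -> bool:
        """Record a page. Return True if the URL had not been seen before."""
        if self.root is None:
            self.root = Node(url)
            self.root.keywords.update(keywords)
            self.size += 1
            return True
        node = self.root
        while True:
            if url == node.url:
                # Already visited: merge any newly extracted keywords.
                node.keywords.update(keywords)
                return False
            side = "left" if url < node.url else "right"
            child = getattr(node, side)
            if child is None:
                child = Node(url)
                child.keywords.update(keywords)
                setattr(node, side, child)
                self.size += 1
                return True
            node = child

    def find(self, url: str) -> Optional[Node]:
        """Return the node for a visited URL, or None if it was never added."""
        node = self.root
        while node is not None:
            if url == node.url:
                return node
            node = node.left if url < node.url else node.right
        return None
```

A spider built on this sketch would call `add` for every fetched page and skip any URL for which `add` returns False. Lookup and insertion cost is proportional to the tree height, which is why the choice of tree family matters for both time and memory.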