Distributed Component-Based Crawler for AJAX Applications

Raj, Suryansh and Krishna, Rajashree and Nayak, Ashalatha (2018) Distributed Component-Based Crawler for AJAX Applications. In: International Conference on Advances in Electronics, Computer and Communications (ICAECC-2018), 09/02/2018, Reva University, Bangalore, Karnataka.


Crawling web applications is important both for indexing websites and for testing them for vulnerabilities. Research on crawling traditional websites has made significant progress, and many software suites can carry out deep crawls of large traditional websites in limited time. Modern AJAX (Asynchronous JavaScript and XML) based websites, however, cannot be crawled by traditional crawlers. The area remains open to research, and many open-source software suites are being developed. However, the suites developed so far still face state space explosion, poor time efficiency and incomplete content coverage. This research work aims to develop a distributed component-based crawler for deterministic AJAX applications that reduces state space explosion, improves time efficiency and provides complete content coverage. The solution combines three approaches. First, a component-based approach reduces state space explosion. Second, a distributed-crawling approach processes events concurrently to improve time efficiency. Third, a Breadth First Search (BFS) strategy provides complete content coverage.
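
The following is a minimal illustrative sketch, not the authors' implementation, of how a BFS crawl might fire the events of each level concurrently. It assumes a hypothetical toy application (`TOY_APP`) in place of a real browser, uses a local thread pool as a stand-in for distributed crawler nodes, and does not model the component-based state reduction. The names `TOY_APP`, `fire_event` and `crawl_bfs` are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy application: each state maps event names to the state
# reached by firing that event. A real crawler would drive a browser instead.
TOY_APP = {
    "home": {"open_menu": "menu", "open_news": "news"},
    "menu": {"close_menu": "home", "open_about": "about"},
    "news": {"next_page": "news_2"},
    "news_2": {"prev_page": "news"},
    "about": {},
}


def fire_event(state, event):
    """Return the state reached by firing `event` in `state` (deterministic)."""
    return TOY_APP[state][event]


def crawl_bfs(start, workers=4):
    """Explore states level by level, firing each level's events concurrently."""
    visited = {start}
    order = [start]
    frontier = [start]
    # Assumption: threads stand in for the distributed crawler nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # Collect every (state, event) pair on the current BFS level.
            tasks = [(s, e) for s in frontier for e in sorted(TOY_APP[s])]
            # map() yields results in task order, so the crawl is reproducible.
            targets = pool.map(lambda t: fire_event(*t), tasks)
            frontier = []
            for target in targets:
                if target not in visited:
                    visited.add(target)
                    order.append(target)
                    frontier.append(target)
    return order


if __name__ == "__main__":
    print(crawl_bfs("home"))
```

Because every event on a level is fired before the next level begins, each reachable state is discovered at its shortest event distance from the start state, which is the property a BFS strategy relies on for coverage.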

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: AJAX crawler; distributed component-based crawler; concurrent crawler; state space explosion
Subjects: Engineering > MIT Manipal > Computer Science and Engineering
Depositing User: MIT Library
Date Deposited: 09 Jan 2019 05:34
Last Modified: 09 Jan 2019 05:34
URI: http://eprints.manipal.edu/id/eprint/152766
