Easyspider is a Free Open-source Self-hosted Distributed Web Crawler
Easy Spider is a distributed web crawler project that was started in 2006. It is written in Perl and designed to crawl web pages, send the crawled data to a server, and generate XML files from that data. It runs on both Windows and Linux.
The project uses a client/server architecture in which the client side collects the data and stores it on the server. This is particularly useful for large data sets, since it prevents the client side from becoming overloaded. Additionally, the server can be reached remotely, so users can access their data from any location.
By separating crawling from storage, Easy Spider aims to make gathering and storing crawl data easier and more efficient.
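To make the client/server split described above more concrete, here is a minimal sketch in Python (not Easy Spider's own Perl code). It assumes a client that extracts a page title and links, and a server that stores the submitted records and exports them as XML. The names `PageParser`, `crawl_page`, `CrawlServer`, and the XML element names are all hypothetical, chosen only for illustration; Easy Spider's actual protocol and XML schema may differ. Everything runs in one process here, whereas a real setup would send records over the network.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class PageParser(HTMLParser):
    """Collects the <title> text and all href links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl_page(url, html):
    """Client side (hypothetical): turn one fetched page into a plain record."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    return {"url": url, "title": parser.title.strip(), "links": parser.links}


class CrawlServer:
    """Server side (hypothetical): stores client records and exports XML."""

    def __init__(self):
        self.records = []

    def submit(self, record):
        # In a real deployment this would be a network call from the client.
        self.records.append(record)

    def to_xml(self):
        # The element names below are an assumed layout, not Easy Spider's schema.
        root = ET.Element("crawl")
        for rec in self.records:
            page = ET.SubElement(root, "page", url=rec["url"])
            ET.SubElement(page, "title").text = rec["title"]
            for link in rec["links"]:
                ET.SubElement(page, "link").text = link
        return ET.tostring(root, encoding="unicode")


# Hypothetical sample page; a real client would fetch this over HTTP.
sample_html = (
    "<html><head><title>Home</title></head>"
    '<body><a href="/about">About</a></body></html>'
)
server = CrawlServer()
server.submit(crawl_page("http://example.com/", sample_html))
xml_output = server.to_xml()
print(xml_output)
```

Because the client only parses pages and hands records to the server, the accumulated data lives in one place, which is the property the architecture relies on for large data sets.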
Features
- Client/Server Distributed Crawling
- Config File Support
- PDF, DOC, XLS, PPT Extraction Support
License
- GNU Library or Lesser General Public License version 2.0 (LGPLv2)