All About Programming: Crawler4j: Open-source Web Crawler for Java

September 17, 2013

A few weeks ago I was given the task of reading values from an e-commerce website. The idea was simple: given a link, the application should parse the HTML content, extract a specific value, and store it. I decided to use a crawler instead, and started looking for open-source Java solutions that are quick to implement. I eventually came across crawler4j, which proved to be simple yet very efficient right away. Below I show the implementation that fits my needs: simply store all available links within a given domain, ...
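The excerpt stops before the author's crawler4j code, so that implementation is not reproduced here. As a rough illustration of the underlying idea only (collect the links on a page and keep those inside a given domain), here is a minimal sketch in plain Java. It does not use crawler4j; the class name `DomainLinkCollector`, the `collect` method, and the example URLs are hypothetical, and a regex stands in for a real HTML parser.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of crawler4j and not the author's implementation.
class DomainLinkCollector {
    // Naive href matcher; a real crawler would use a proper HTML parser.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the distinct absolute links found in html whose host is
    // the given domain or one of its subdomains, in order of appearance.
    static Set<String> collect(String html, String pageUrl, String domain) {
        Set<String> links = new LinkedHashSet<>();
        URI base = URI.create(pageUrl);
        String wanted = domain.toLowerCase();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1).trim();
            int hash = href.indexOf('#');
            if (hash >= 0) {
                href = href.substring(0, hash); // drop the fragment part
            }
            if (href.isEmpty()) {
                continue; // pure in-page anchor such as "#top"
            }
            URI resolved;
            try {
                resolved = base.resolve(href).normalize(); // make relative links absolute
            } catch (IllegalArgumentException e) {
                continue; // skip malformed hrefs
            }
            String host = resolved.getHost(); // null for mailto:, javascript:, etc.
            if (host == null) {
                continue;
            }
            host = host.toLowerCase();
            if (host.equals(wanted) || host.endsWith("." + wanted)) {
                links.add(resolved.toString());
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/products/1\">A</a> <a href=\"http://other.org/x\">B</a>";
        // Prints only the links that stay within example.com.
        for (String link : collect(html, "http://example.com/index.html", "example.com")) {
            System.out.println(link);
        }
    }
}
```

In a crawler library, this kind of domain check would typically live in whatever hook decides which URLs to follow, while the storing step would run once per fetched page; the sketch folds both into a single method for brevity.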

Read full article from Crawler4j: Open-source Web Crawler for Java

