I'm closing this issue because, with the new status code handling, broken links can now be handled properly. See the example here: https://code.google.com/p/crawler4j/source/browse/src/test/java/edu/uci/ics/crawler4j/examples/statushandler/ -Yasser
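For readers who cannot open the linked example, here is a minimal sketch of the general idea: sort URLs by the HTTP status code reported for them, so that dead links are recorded and temporarily failing ones can be recrawled. It is not taken from crawler4j. `BrokenLinkTracker`, `onStatus`, the example URLs, and the choice of which codes count as "broken" or "retryable" are all assumptions made for illustration; in a real crawler you would call `onStatus` from whatever status callback the library exposes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, NOT part of crawler4j: decides what to do with a URL
// based on the HTTP status code that was reported for it.
class BrokenLinkTracker {
    // URLs that look permanently dead (assumed here: 404 and 410).
    private final List<String> brokenLinks = new ArrayList<>();
    // URLs that failed in a way that may be temporary and are worth recrawling
    // (assumed here: 408, 429, and any code of 500 or above).
    private final Deque<String> retryQueue = new ArrayDeque<>();

    // Intended to be called once per fetched URL with its status code.
    public void onStatus(String url, int statusCode) {
        if (statusCode == 404 || statusCode == 410) {
            brokenLinks.add(url);
        } else if (statusCode == 408 || statusCode == 429 || statusCode >= 500) {
            retryQueue.add(url);
        }
        // Any other code (for example 200 or a redirect) needs no action here.
    }

    public List<String> brokenLinks() {
        return brokenLinks;
    }

    public Deque<String> retryQueue() {
        return retryQueue;
    }

    public static void main(String[] args) {
        BrokenLinkTracker tracker = new BrokenLinkTracker();
        tracker.onStatus("http://example.com/ok", 200);
        tracker.onStatus("http://example.com/missing", 404);
        tracker.onStatus("http://example.com/flaky", 503);
        System.out.println("Broken: " + tracker.brokenLinks());
        System.out.println("Retry:  " + tracker.retryQueue());
    }
}
```

Keeping the two collections separate means the retry queue can be fed back to the crawler as new seeds, while the broken list can simply be reported.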
Read full article from Issue 107 - crawler4j - Recrawl Not Fetched Links - Open Source Web Crawler for Java - Google Project Hosting