Problem
You have an HTML document that contains relative URLs, which you need to resolve to absolute URLs.
Solution
- Make sure you specify a base URI when parsing the document (which is implicit when loading from a URL), and
- Use the abs: attribute prefix to resolve an absolute URL from an attribute:
Document doc = Jsoup.connect("http://jsoup.org").get();
Element link = doc.select("a").first();
String relHref = link.attr("href"); // == "/"
String absHref = link.attr("abs:href"); // "http://jsoup.org/"
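To show what resolving against a base URI means, here is a minimal sketch that uses only the JDK's java.net.URI. It is a stand-in for illustration, not jsoup's own implementation, and edge cases may differ. The class name AbsUrlSketch and the resolve helper are hypothetical. The sketch returns an empty string when no absolute URL can be formed, which is the documented behaviour of jsoup's absUrl for a missing or unresolvable attribute.

```java
import java.net.URI;

// Hypothetical helper: illustrates base-URI resolution with the JDK only.
// This is NOT jsoup code; jsoup performs its own resolution internally.
final class AbsUrlSketch {

    // Resolve href against baseUri; return "" if no absolute URL results.
    static String resolve(String baseUri, String href) {
        try {
            URI resolved = URI.create(baseUri).resolve(href);
            // Without a scheme the result is still relative, so report failure.
            return resolved.isAbsolute() ? resolved.toString() : "";
        } catch (IllegalArgumentException e) {
            // Thrown by URI.create / resolve on syntactically invalid input.
            return "";
        }
    }

    public static void main(String[] args) {
        // Root-relative link, as in the snippet above.
        System.out.println(resolve("http://jsoup.org", "/"));
        // Path-relative link resolved against the URL of the containing page.
        System.out.println(resolve(
            "http://jsoup.org/cookbook/extracting-data/working-with-urls",
            "../input/"));
        // No base URI: nothing to resolve against, so the result is empty.
        System.out.println(resolve("", "/").isEmpty());
    }
}
```

The last call shows why the base URI matters: when a document is parsed from a string with no base URI, a relative href has nothing to resolve against.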
Read full article from Working with URLs: jsoup Java HTML parser