You can change the OutputSettings to control how entities such as &nbsp; are escaped in the output:
Example:
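final String html = ...;
// OutputSettings is the nested class org.jsoup.nodes.Document.OutputSettings
Document.OutputSettings settings = new Document.OutputSettings();
settings.escapeMode(Entities.EscapeMode.xhtml);
// In newer jsoup releases Whitelist has been renamed to Safelist
String cleanHtml = Jsoup.clean(html, "", Whitelist.relaxed(), settings);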
This is also possible with a Document parsed by Jsoup:
Document doc = Jsoup.parse(...);
doc.outputSettings().escapeMode(Entities.EscapeMode.xhtml);
// ...
Edit:
Removing tags:
doc.select("p:matchesOwn((?is)\u00a0)").remove();
Please note: the character after (?is) is not an ordinary blank but char #160 (= nbsp). It is written here as the escape \u00a0 so that it survives copy and paste. This will remove all p tags whose own text is only a &nbsp;. If you want to do the same for all other tags, you can replace p: with *:.
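To see why char #160 has to be matched explicitly, here is a minimal sketch using only the Java standard library (no jsoup). The class name NbspDemo and its helper methods are hypothetical and exist only for this illustration. It shows that a no-break space is a different character from an ordinary blank, and that neither Java's default \s class nor String.trim() treats it as whitespace.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are assumptions made for illustration.
class NbspDemo {
    // U+00A0 (decimal 160) is the no-break space that "&nbsp;" decodes to.
    static final char NBSP = '\u00a0';

    // True if the text consists solely of one or more no-break spaces.
    static boolean isOnlyNbsp(String text) {
        return Pattern.matches("\u00a0+", text);
    }

    // True if Java's default \s class matches the whole text.
    static boolean matchesDefaultWhitespace(String text) {
        return Pattern.matches("\\s+", text);
    }

    public static void main(String[] args) {
        String nbsp = String.valueOf(NBSP);
        System.out.println((int) NBSP);                     // 160
        System.out.println(isOnlyNbsp(nbsp));               // true
        System.out.println(isOnlyNbsp(" "));                // false
        System.out.println(matchesDefaultWhitespace(nbsp)); // false
        System.out.println(nbsp.trim().isEmpty());          // false
    }
}
```

This is why a pattern written with an ordinary blank, or with \s alone, would miss text that holds only a no-break space.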
Read full article from java - alternative of JSoup or how to clean whitespaces - Stack Overflow