The newly added RegEx constructs to support the named capturing group are:
(1) (?<NAME>X) to define a named group "NAME"
(2) \k<NAME> to backreference a named group "NAME"
(3) ${NAME} to reference a captured group in the matcher's replacement string
(4) group(String NAME) to return the input subsequence captured by the given named group
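Construct (2) is not exercised by the examples in this post, so here is a minimal sketch of a named backreference. The class name, method name, and sample sentence are made up for illustration; only the `(?<word>...)` and `\k<word>` syntax and `Matcher.group(String)` come from the list above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
final class NamedBackrefDemo {
    // (?<word>\w+) captures a word; \k<word> requires the same text to appear again.
    private static final Pattern REPEATED =
            Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");

    /** Returns the first immediately repeated word, or null if there is none. */
    static String firstRepeatedWord(String text) {
        Matcher m = REPEATED.matcher(text);
        return m.find() ? m.group("word") : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test"));
    }
}
```

The `\b` anchors keep the pattern from matching the "is" inside "this".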
With these new constructs, you can now write something like:
String pStr = "0x(?<bytes>\\p{XDigit}{1,4})\\s++u\\+(?<char>\\p{XDigit}{4})(?:\\s++)?";
// INPUTTEXT is the string to match, for example "0x1234 u+5678"
Matcher m = Pattern.compile(pStr).matcher(INPUTTEXT);
if (m.matches()) {
    // look the groups up by name instead of by index
    int bs = Integer.parseInt(m.group("bytes"), 16);
    int c = Integer.parseInt(m.group("char"), 16);
    System.out.printf("[%x] -> [%04x]%n", bs, c);
}
or
System.out.println("0x1234 u+5678".replaceFirst(pStr, "u+${char} 0x${bytes}"));
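For anyone who wants to run the two snippets as-is, here is a self-contained sketch of both. The class and method names are hypothetical, the input string is taken from the replaceFirst example, and the replacement string uses the `${name}` form that `Matcher.appendReplacement` documents in released JDK 7.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the two snippets from the post.
final class NamedGroupDemo {
    static final String P_STR =
            "0x(?<bytes>\\p{XDigit}{1,4})\\s++u\\+(?<char>\\p{XDigit}{4})(?:\\s++)?";

    /** Formats both hex values, or returns null if the whole input does not match. */
    static String describe(String input) {
        Matcher m = Pattern.compile(P_STR).matcher(input);
        if (!m.matches()) {
            return null;
        }
        int bs = Integer.parseInt(m.group("bytes"), 16);
        int c = Integer.parseInt(m.group("char"), 16);
        return String.format("[%x] -> [%04x]", bs, c);
    }

    /** Swaps the two parts using named references in the replacement string. */
    static String swap(String input) {
        return input.replaceFirst(P_STR, "u+${char} 0x${bytes}");
    }

    public static void main(String[] args) {
        System.out.println(describe("0x1234 u+5678"));
        System.out.println(swap("0x1234 u+5678"));
    }
}
```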
OK, the examples above just show how to use these new constructs; that does not necessarily mean they are "better" :-) They are easier for me to understand, though.
The method group(String name) is NOT added to the MatchResult interface, for compatibility reasons (personally I don't think it's a big deal; my guess is that the majority of RegEx users can live with the Matcher class alone, so compatibility weighs more here. Let me know otherwise).
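In practice, this means code that passes a MatchResult around has to fall back on group numbers, at least on the JDK 7 API described here. A minimal sketch, with a made-up class name, pattern, and input; it relies only on the fact that named groups are still numbered left to right like ordinary capturing groups:

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
final class MatchResultDemo {
    private static final Pattern KEY_VALUE =
            Pattern.compile("(?<key>\\w+)=(?<value>\\w+)");

    /** Returns {key, value} for the first key=value pair, or null if none is found. */
    static String[] keyAndValue(String input) {
        Matcher m = KEY_VALUE.matcher(input);
        if (!m.find()) {
            return null;
        }
        String key = m.group("key");              // lookup by name on the Matcher
        MatchResult snapshot = m.toMatchResult(); // immutable snapshot of this match
        String value = snapshot.group(2);         // "value" is also group number 2
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        String[] kv = keyAndValue("mode=fast");
        System.out.println(kv[0] + " -> " + kv[1]);
    }
}
```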
Read full article from Named Capturing Group in JDK7 RegEx (Xueming Shen's Oracle Blog)