Matcher (java.util.regex.Matcher)








The java.util.regex.Matcher class is used to search through a text for multiple occurrences of a regular expression. You can also use a Matcher to search for the same regular expression in different texts.
The Matcher class has a lot of useful methods. For a full list, see the official JavaDoc for the Matcher class. I will cover the core methods here.

Java Matcher Example

Here is a quick Java Matcher example so you can get an idea of how the Matcher works:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String text =
        "This is the text to be searched " +
        "for occurrences of the http:// pattern.";

    String patternString = ".*http://.*";

    Pattern pattern = Pattern.compile(patternString);
    Matcher matcher = pattern.matcher(text);

    // true only if the pattern matches the entire text
    boolean matches = matcher.matches();
First a Pattern is created, and from that a Matcher. Then the matches() method is called, which returns true if the pattern matches the text, and false if not.
You can do much more with the Matcher class. The following sections cover its core methods.

Creating a Matcher

Creating a Matcher is done via the matcher() method in the Pattern class. Here is an example:
    String text =
        "This is the text to be searched " +
        "for occurrences of the http:// pattern.";

    String patternString = ".*http://.*";

    Pattern pattern = Pattern.compile(patternString);
    Matcher matcher = pattern.matcher(text);
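As mentioned in the introduction, a Matcher can also be applied to different texts. Here is a minimal sketch of one way to do that, using the reset(CharSequence) method of Matcher to point an existing Matcher at a new text. The class name MatcherReuseExample, the helper matchBoth() and the sample texts are made up for illustration only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherReuseExample {

    // Compiled once; a Pattern is immutable and can be reused freely.
    private static final Pattern HTTP_PATTERN = Pattern.compile(".*http://.*");

    // Hypothetical helper: checks two texts using a single Matcher instance.
    static boolean[] matchBoth(String firstText, String secondText) {
        Matcher matcher = HTTP_PATTERN.matcher(firstText);
        boolean first = matcher.matches();

        // reset(CharSequence) makes the same Matcher work on a new input text.
        matcher.reset(secondText);
        boolean second = matcher.matches();

        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] results = matchBoth(
                "Find the http:// pattern here.",
                "No URL scheme in this text.");
        System.out.println("first  = " + results[0]);
        System.out.println("second = " + results[1]);
    }
}
```

Calling pattern.matcher(otherText) again to get a fresh Matcher works as well. Reusing one Matcher via reset() simply avoids creating a new object per text.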

matches()

The matches() method in the Matcher class matches the regular expression against the whole text passed to the Pattern.matcher() method, when the Matcher was created. Here is an example:
boolean matches = matcher.matches();  
If the regular expression matches the whole text, then the matches() method returns true. If not, the matches() method returns false.
You cannot use the matches() method to search for multiple occurrences of a regular expression in a text. For that, you need to use the find(), start() and end() methods.
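To illustrate the "whole text" requirement, here is a small sketch that runs two regular expressions against the same text. The class name MatchesWholeTextExample and the helper matchesWhole() are hypothetical names used only for this illustration:

```java
import java.util.regex.Pattern;

public class MatchesWholeTextExample {

    // Hypothetical helper: true only if the regex matches the entire text.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "for occurrences of the http:// pattern.";

        // "http://" occurs inside the text, but does not cover all of it.
        System.out.println(matchesWhole("http://", text));

        // The leading and trailing ".*" let the regex cover the whole text.
        System.out.println(matchesWhole(".*http://.*", text));
    }
}
```

The first call prints false and the second prints true, even though both regular expressions "find" the http:// part. That difference is why matches() is not suited to searching for occurrences inside a larger text.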

lookingAt()

The lookingAt() method works like the matches() method with one major difference. The lookingAt() method only matches the regular expression against the beginning of the text, whereas matches() matches the regular expression against the whole text. In other words, if the regular expression matches the beginning of a text but not the whole text, lookingAt() will return true, whereas matches() will return false.
Here is an example:
    String text =
        "This is the text to be searched " +
        "for occurrences of the http:// pattern.";

    String patternString = "This is the";

    Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(text);

    System.out.println("lookingAt = " + matcher.lookingAt());
    System.out.println("matches   = " + matcher.matches());
This example matches the regular expression "This is the" (compiled with Pattern.CASE_INSENSITIVE) against both the beginning of the text and the whole text. Matching the regular expression against the beginning of the text (lookingAt()) returns true.
Matching the regular expression against the whole text (matches()) returns false, because the text has more characters than the regular expression accounts for. For matches() to return true, the regular expression must cover the entire text, with no extra characters before or after the matched part.

find() + start() + end()

The find() method searches for occurrences of the regular expression in the text passed to the Pattern.matcher(text) method when the Matcher was created. If multiple matches can be found in the text, the first call to find() finds the first match, and each subsequent call to find() moves to the next match. When no further match exists, find() returns false.
The methods start() and end() give the indexes into the text where the found match starts and ends. Note that end() returns the index of the character just after the end of the matching section. Thus, you can use the return values of start() and end() directly inside a String.substring() call.
Here is an example:
    String text =
        "This is the text which is to be searched " +
        "for occurrences of the word 'is'.";

    String patternString = "is";

    Pattern pattern = Pattern.compile(patternString);
    Matcher matcher = pattern.matcher(text);

    int count = 0;
    while (matcher.find()) {
        count++;
        System.out.println("found: " + count + " : "
                + matcher.start() + " - " + matcher.end());
    }
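The remark about String.substring() can be sketched as follows. This is only an illustration: the class name FindSubstringExample, the helper findAll(), the regular expression and the sample text are all made up for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindSubstringExample {

    // Hypothetical helper: collects the text of every match of regex in text.
    static List<String> findAll(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> found = new ArrayList<>();

        while (matcher.find()) {
            // end() is exclusive, so it fits substring(begin, end) directly.
            found.add(text.substring(matcher.start(), matcher.end()));
        }
        return found;
    }

    public static void main(String[] args) {
        // "t\\w+" matches a lowercase 't' followed by one or more word characters.
        System.out.println(findAll("t\\w+", "This is the text"));
        // prints [the, text]
    }
}
```

The Matcher class also has a group() method that returns the matched text of the most recent match, so the substring() call is mainly useful when you need the indexes anyway.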
