java - How can I split a text into sentences using the Stanford parser? - Stack Overflow



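Take a look at the DocumentPreprocessor class, which reads text from a Reader and iterates over it one sentence at a time, each sentence being a list of tokens. Below is a short snippet. There may be other ways to do what you want.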

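    import java.io.Reader;
    import java.io.StringReader;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;

    import edu.stanford.nlp.ling.HasWord;
    import edu.stanford.nlp.process.DocumentPreprocessor;

    // Text to split into sentences.
    String paragraph = "My first sentence. My second sentence.";
    Reader reader = new StringReader(paragraph);
    // DocumentPreprocessor tokenizes the input and yields one sentence per iteration.
    DocumentPreprocessor dp = new DocumentPreprocessor(reader);

    List<String> sentenceList = new LinkedList<String>();
    Iterator<List<HasWord>> it = dp.iterator();
    while (it.hasNext()) {
        StringBuilder sentenceSb = new StringBuilder();
        List<HasWord> sentence = it.next();
        for (HasWord token : sentence) {
            // Put a space between tokens, but not before the first one.
            // (Checking > 0 rather than > 1 keeps this correct when the
            // first token is a single character, such as "I".)
            if (sentenceSb.length() > 0) {
                sentenceSb.append(" ");
            }
            sentenceSb.append(token);
        }
        sentenceList.add(sentenceSb.toString());
    }

    // Print each sentence on its own line.
    for (String sentence : sentenceList) {
        System.out.println(sentence);
    }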
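As one of the "other ways" mentioned above, here is a minimal sketch that uses only the JDK's java.text.BreakIterator instead of the Stanford library. This is a swapped-in technique, not part of the Stanford parser. The class name JdkSentenceSplitter and the method splitSentences are hypothetical names chosen for illustration. It relies on simple locale rules for sentence boundaries, so it may well handle abbreviations and unusual punctuation less accurately than DocumentPreprocessor.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class (not part of Stanford NLP): splits text into
// sentences using the JDK's built-in sentence BreakIterator.
public class JdkSentenceSplitter {

    public static List<String> splitSentences(String text) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.US);
        boundaries.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        // Walk consecutive boundary positions; each [start, end) span is one sentence.
        for (int end = boundaries.next(); end != BreakIterator.DONE; end = boundaries.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
            start = end;
        }
        return sentences;
    }

    public static void main(String[] args) {
        String paragraph = "My first sentence. My second sentence.";
        for (String sentence : splitSentences(paragraph)) {
            System.out.println(sentence);
        }
    }
}
```

Unlike the DocumentPreprocessor snippet, this version returns the original substrings rather than rebuilding each sentence from tokens, so spacing around punctuation is left as it was in the input.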

Read full article from java - How can I split a text into sentences using the Stanford parser? - Stack Overflow

