efficiency - Why does Java's String class not implement a more efficient indexOf()? - Programmers Stack Exchange



Efficiency" is all about tradeoffs, and the "best" algorithm will depend on many factors. In the case of indexOf(), one of those factors is the expected size of strings.

The JDK's algorithm is based on simple indexed reference into existing character arrays. The Knuth-Morris-Pratt that you reference needs to create a new int[] that's the same size as the input string. For Boyer-Moore, you need several external tables, at least one of which is two-dimensional (I think; I've never implemented B-M).

So the question becomes: are allocating the additional objects and building lookup tables offset by the increased performance of the algorithm? Remember, we're not talking about a change from O(N2) to O(N), but simply a reduction in the number of steps taken for each N.

And I would expect that the JDK designers said something like "for strings less than X characters, the simple approach is faster, we don't expect regular use of strings longer than that, and people who do use longer strings will know how to optimize their searches."


Read full article from efficiency - Why does Java's String class not implement a more efficient indexOf()? - Programmers Stack Exchange


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts