Find anagrams of a string in another string - Algorithms and Problem SolvingAlgorithms and Problem Solving



Find anagrams of a string in another string - Algorithms and Problem SolvingAlgorithms and Problem Solving

Given a string (haystack) and a pattern (needle). Find all the anagrams of the pattern that are contained in the string.

For example: str = "abateatas" and pattern = "tea". Then output should be the anagrams of "tea" contained as a substring in str. So, the desired output = {"ate", "eat", "tea"}.

A quick bruteforce solution would be to find all the anagram of the pattern and then check which are contained as a substring of str. This is a costly string matching operation including exponential cost for generating all the anagrams.

Sliding Histogram Approach – O(n) time O(1) space
Notice that when we find an anagram starting from i we might potentially find another starting from i+1 if character count distribution remains same. For example, text = "abaab" and pattern = "aba". We can see both of the strings have same character count distribution i.e. histogram ([a=2,b=1,c=0,..]) for first m characters, where m=length of pattern. If we remove first character 'a' and add 4th character 'a' then histogram remains same ([a=2,b=1,c=0,..]).

So, we can create a fixed length (=size of the pattern) sliding window histogram for characters in text string and check if the window equals to the pattern histogram at each position of the text. We keep sliding the window to right one position and check if the window equals to patten histogram. This is linear time algorithm if the alphabet size of the strings is finite (for example 256 character set). Below is the implementation of the idea in O(n) time and O(1) space.


Read full article from Find anagrams of a string in another string - Algorithms and Problem SolvingAlgorithms and Problem Solving


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts