Given a string (haystack) and a pattern (needle). Find all the anagrams of the pattern that are contained in the string.
For example: str = "abateatas" and pattern = "tea". Then output should be the anagrams of "tea" contained as a substring in str. So, the desired output = {"ate", "eat", "tea"}.
A quick bruteforce solution would be to find all the anagram of the pattern and then check which are contained as a substring of str. This is a costly string matching operation including exponential cost for generating all the anagrams.
Sliding Histogram Approach – O(n) time O(1) space
Notice that when we find an anagram starting from i we might potentially find another starting from i+1 if character count distribution remains same. For example, text = "abaab" and pattern = "aba". We can see both of the strings have same character count distribution i.e. histogram ([a=2,b=1,c=0,..]) for first m characters, where m=length of pattern. If we remove first character 'a' and add 4th character 'a' then histogram remains same ([a=2,b=1,c=0,..]).
So, we can create a fixed length (=size of the pattern) sliding window histogram for characters in text string and check if the window equals to the pattern histogram at each position of the text. We keep sliding the window to right one position and check if the window equals to patten histogram. This is linear time algorithm if the alphabet size of the strings is finite (for example 256 character set). Below is the implementation of the idea in O(n) time and O(1) space.
Read full article from Find anagrams of a string in another string - Algorithms and Problem SolvingAlgorithms and Problem Solving
No comments:
Post a Comment