Buttercola: Twitter: Anagram



Buttercola: Twitter: Anagram

Given two string a and b, determine if we can use characters in a to compose the string b, each character in string a can be used once and only once.

Understand the problem:
有点像Anagram的题目。不过问题可能是如果String a 和 b 都很大的话,无法Fit 到内存里面怎么办?

如果假设String a 和 b 都只是含有a - z 这样26 个字符的话,可以先建立一 long[26] 的 数组,然后从硬盘中把String a load进来,一次load 一个character, 然后在相应的bucket里面统计词频, i.e, counter 加1。然后在把 String b 从硬盘里面load 进来,一次一个character, 然后去相应的bucket 里面counter - 1。最后再扫一遍buckets, 每个bucket 里面的counter 应该为0。

这样下来,space complexity 就是 26 * sizeof(long) = 26 * 8 byte = 208 byte = 0.2 kb, 远远能装下内存。

Read full article from Buttercola: Twitter: Anagram


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts