崔添翼 § 翼若垂天之云 › 构造Suffix Array的DC3算法



崔添翼 § 翼若垂天之云 › 构造Suffix Array的DC3算法

DC3算法的基本过程是这样的:首先构造出来一个superstring,这个串的每个字母都是原串的字母拼接上后面的两个字母,但这个串仅包含原串中下标模3不为0的位置。(递归地)求出这个串的SA,设为SA12。显然SA12中串的排列顺序与最终结果SA的排列顺序是相同的,只是SA12包含的串只有原串的2/3。

然后利用这个SA12和原串在线性时间内求出没有包含在SA12中的1/3的串的SA(也就是下标模3为0的那些串),设为SA0。求SA0的方法是:串的初始顺序是它们后面那个串在SA12中的顺序,然后再依据第一个字母进行一次RadixSort就可以了。也就是说SA0中的顺序可以仅用两个关键字来确定:第一关键字是第一个字母,第二关键字是后面的那个串在SA12中的位置。


Read full article from 崔添翼 § 翼若垂天之云 › 构造Suffix Array的DC3算法


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts