Word Cloud Data (Practice Interview Question) | Interview Cake




To build a word cloud, you'll need data. Write code that takes a long string and builds its word cloud data in a dictionary.

A hash table (also called a hash, hash map, map, unordered map or dictionary) is a data structure that pairs keys to values.

  light_bulb_to_hours_of_light = {
      'incandescent': 1200,
      'compact fluorescent': 10000,
      'LED': 50000
  }

Hash tables:

  • take O(1) time on average for insertions and lookups
  • are unordered (the keys are not guaranteed to stay in the same order)
  • can use many types of objects as keys (commonly strings)
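As a minimal sketch of the properties listed above, here is how insertions, lookups, and non-string keys look with a Python dictionary. The variable names and sample values are hypothetical, chosen only for illustration.

```python
# Hypothetical sample data: strings as keys, counts as values.
word_counts = {}
word_counts['eggs'] = 2        # insertion, O(1) on average
word_counts['milk'] = 1
word_counts['eggs'] += 1       # lookup followed by an update

# Any hashable object can serve as a key, not only strings.
mixed_keys = {
    ('row', 3): 'tuple key',
    42: 'integer key',
    'LED': 'string key',
}

eggs_count = word_counts['eggs']            # direct lookup
has_flour = 'flour' in word_counts          # membership test
flour_count = word_counts.get('flour', 0)   # lookup with a default
```

Using `get` with a default avoids a `KeyError` for words that have not been seen yet, which is handy when counting.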

Hash tables can be thought of as arrays, if you think of array indices as keys!

In fact, hash tables are built on arrays. So if you ever want to use a hash table but know your keys will be sequential integers (like 1..100), you can probably save time and space by just using an array instead.
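To illustrate the point about sequential integer keys, the sketch below stores the same data both ways. The names `counts_as_dict`, `counts_as_list`, and `increment` are hypothetical, and the only assumption is that keys run from 1 to 100.

```python
# Hypothetical data: one counter per integer key in 1..100.
counts_as_dict = {key: 0 for key in range(1, 101)}

# The same data as a list: index i holds the value for key i + 1,
# so no hashing is needed to find a slot.
counts_as_list = [0] * 100


def increment(key):
    """Add one to the counter for `key` in both structures."""
    counts_as_dict[key] += 1
    counts_as_list[key - 1] += 1


increment(1)
increment(100)
increment(100)
```

The list version skips the hashing step entirely and stores only the values, which is where the time and space savings come from.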

Note: hash tables have an average case insertion and lookup cost of O(1). In industry, we often confuse the average-case cost with worst case cost, but they're not really the same. Because of hash collisions and rebalancing, a hash table insertion or lookup can cost as much as O(n) time in the worst case. But usually in industry we assume hashing and resizing algorithms are clever enough that collisions are rare and cheap.
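One way to see the worst case is to force every key to collide. The sketch below uses a hypothetical `CollidingKey` class whose hash is deliberately constant, so the dictionary has to fall back on equality checks to tell keys apart. Results stay correct, but each operation may have to compare against many existing keys instead of jumping straight to the right slot.

```python
class CollidingKey:
    """Hypothetical key type that makes every instance collide."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Deliberately bad: all keys share one hash value.
        return 1

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


# Insert 200 colliding keys; every insert must be distinguished
# from the earlier keys by equality comparison.
table = {CollidingKey(str(i)): i for i in range(200)}

looked_up = table[CollidingKey('150')]
```

Built-in string keys have well-distributed hashes, so in practice this situation is rare, which is why the average case is the one usually quoted.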

In the word cloud dictionary, the keys are words and the values are the number of times the words occurred.
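As a starting point, a naive baseline might split on whitespace and count each piece. This is only a sketch: the function name `count_words` is hypothetical, and it deliberately ignores punctuation and capitalization, so "eggs," and "eggs" would be counted as different words.

```python
def count_words(text):
    """Count whitespace-separated tokens in `text` (naive baseline)."""
    counts = {}
    for word in text.split():
        # get() returns 0 the first time a word is seen.
        counts[word] = counts.get(word, 0) + 1
    return counts


baseline = count_words('add milk then add flour')
```

Each word costs one average-case O(1) dictionary update, so the whole pass is linear in the length of the input.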

Think about capitalized words. For example, look at these sentences:

  'After beating the eggs, Dana read the next step:'
  'Add milk and eggs, then add flour and sugar.'

What do we want to do with "After", "Dana", and "add"? In this example, your final dictionary should include one "Add" or "add" with a value of 2. Make reasonable (not necessarily perfect) decisions about cases like "After" and "Dana".
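One possible policy, sketched below, is to fold a capitalized word into its lowercase form only when the lowercase form also appears somewhere in the text. This is an assumption for illustration, not necessarily the intended solution: the function name `build_word_cloud` and the regular expression that defines a "word" are both hypothetical choices. Under this policy "Add" merges into "add", while "After" and "Dana" are left capitalized because no lowercase version occurs.

```python
import re


def build_word_cloud(text):
    """Count words, merging capitalized forms into lowercase ones
    when the lowercase form also occurs in the text."""
    # Assumption: a word is a run of letters, optionally with
    # internal apostrophes (e.g. "don't").
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)

    # First pass: count every word exactly as written.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1

    # Second pass: fold "Add" into "add" if "add" was also seen.
    merged = {}
    for word, count in counts.items():
        lower = word.lower()
        key = lower if (word != lower and lower in counts) else word
        merged[key] = merged.get(key, 0) + count
    return merged


sample = ('After beating the eggs, Dana read the next step: '
          'Add milk and eggs, then add flour and sugar.')
cloud = build_word_cloud(sample)
```

The trade-off is that a word seen only at the start of a sentence, like "After", keeps its capital even though it is not a proper noun. Lowercasing everything would fix that case but would also turn "Dana" into "dana".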

