1. Given a multi-set of numbers (integers), determine if there is a majority element. E.g. S = {4, 1, 3, 4, 4, 3, 3, 4, 4, 1, 4} has 4 in majority S = {4, 4, 4, 3, 1, 1, 1, 3, 2, 2, 2} has no majority In a sequence x1, x2, ..., xn, define the multiplicity of x to be its frequency. A number is in majority if its multiplicity is greater than n/2. (For example, imagine each xi represents a vote; ints are ids of candidates.) One simple solution is to sort; all equal number are consecutive, so can count. Takes O(n log n) time. A more complicated solution uses median---if there is a majority, it must be the median. It can be found in O(n) time, but complex. A much better, simpler linear time solution exists. Observation: Pick any two elements xi and xj. If xi \neq xj, then eliminate BOTH xi and xj from the list. Then, the majority in the original list is still a majority in the reduced list. ALGORITHM: Use two variables C (candidate) and M (multiplicity). Scan elements in their given order. When considering xi, inductively, C is the only candidate for majority among {x1, x2, ..., x_i-1} and M is its count excluding the times C was eliminated. When xi is considered, if C is zero (before comparison), put xi = C and M = 1. else if xi = C, increment M; if xi \neq C, decrement M; At the end, C is the only possible candidate for majority. We make a second pass to "confirm" that. O(n) time. 2. Extension to K elements, each appearing at least n/(k+1) times. Maintain k candidates and counters. When considering xi, if xi equals one of the cands, increment it; if xi different and some counter available, set xi as that cand, and M = 1; if xi different from all, and all cands full; decrement them all. 3. A hot application problem in Data Stream context---heavy hitters in routers, top k popular items, etc... One important attribute of this scheme is that it uses O(1) memory---independent of the stream size. ------------------------------------------------------------------ 4. Celebrity Problem. Among n people, a celebrity is defined as someone who is known to everyone, but who knows no one. Graph model: Person = node. If A knows B, draw a directed edge from A to B. Celebrity is a node with out-degree = 0; in-deg = n-1 (sink). A graph can have at most 1 sink/celebrity. The problem is to identify the celebrity, if one exists, by asking only questions of the following form: "Excuse me, do you know person x?" There are n(n-1)/2 pairs of persons, so potentially we need to ask n(n-1) questions. Turns out one can solve the problem without even looking at all the input! 5. Induction: First attempt. With 2 people, the game is simple. Two questions. For the general case, take out person n, and inductively solve the problem for the remaining (n-1). 3 possible outcomes: the celebrity is among first n-1; the celebrity is the nth guy; no celebrity at all. Case 1 is easiest---just need to check that the nth person knows the celebrity, and the celebrity does not know him. The other two cases are tricky. To determine if nth person is celebrity, we need to ask 2(n-1) questions. But if we ask so many questions in the nth step, the total complexity of the algorithm will be n(n-1) questions. 5. The correct solution is based on the following observation: For any two players A and B, if A knows B, then A cannot be a celebrity; if A does not know B, then B cannot be a celebrity. Thus, with 1 question, we can eliminate one of two candidates. Algorithm: Pick any two players, A and B. Based on the outcome of the question, eliminate the non-celeb, and inductively solve the remaining n-1 people. If the remainder has no celebrity, then done. Otherwise, ask 2 questions to make sure that the celebrity is known to the person eliminated in first round; and that celebrity does not know that person. Thus, the total number of questions is at most 3(n-1). Stack Implementation: Take top two people; with one question, eliminate one of them; repeat, until one person remains. Now go back, and ask 2(n-1) questions, comparing this celeb with (n-1) losers. ------------------------------------------------------------------ 6. Missing Numbers You are processing a stream of n numbers (presented in arbitrary order); each number is presented exactly once. You have limited constant amount of memory---say, a few words. Suppose you are told that EXACTLY one of the numbers from the set {1, 2, ..., n} is missing. Can you identify it? Solution: Keep track of the running sum. Subtract is from n(n+1)/2. How about if 2 numbers are missing? You could try storing s = n(n+1)/2 - \sum xi, and p = n! - \pi xi, giving two equations in two unknowns. But one can solve the same problem with far fewer bits: s = n(n+1)/2 - \sum xi and ss = n(n+1)(2n+1)/6 - \sum xi^2 In general, we could store kth power sums.
Subscribe to:
Post Comments (Atom)
Labels
Algorithm
(219)
Lucene
(130)
LeetCode
(97)
Database
(36)
Data Structure
(33)
text mining
(28)
Solr
(27)
java
(27)
Mathematical Algorithm
(26)
Difficult Algorithm
(25)
Logic Thinking
(23)
Puzzles
(23)
Bit Algorithms
(22)
Math
(21)
List
(20)
Dynamic Programming
(19)
Linux
(19)
Tree
(18)
Machine Learning
(15)
EPI
(11)
Queue
(11)
Smart Algorithm
(11)
Operating System
(9)
Java Basic
(8)
Recursive Algorithm
(8)
Stack
(8)
Eclipse
(7)
Scala
(7)
Tika
(7)
J2EE
(6)
Monitoring
(6)
Trie
(6)
Concurrency
(5)
Geometry Algorithm
(5)
Greedy Algorithm
(5)
Mahout
(5)
MySQL
(5)
xpost
(5)
C
(4)
Interview
(4)
Vi
(4)
regular expression
(4)
to-do
(4)
C++
(3)
Chrome
(3)
Divide and Conquer
(3)
Graph Algorithm
(3)
Permutation
(3)
Powershell
(3)
Random
(3)
Segment Tree
(3)
UIMA
(3)
Union-Find
(3)
Video
(3)
Virtualization
(3)
Windows
(3)
XML
(3)
Advanced Data Structure
(2)
Android
(2)
Bash
(2)
Classic Algorithm
(2)
Debugging
(2)
Design Pattern
(2)
Google
(2)
Hadoop
(2)
Java Collections
(2)
Markov Chains
(2)
Probabilities
(2)
Shell
(2)
Site
(2)
Web Development
(2)
Workplace
(2)
angularjs
(2)
.Net
(1)
Amazon Interview
(1)
Android Studio
(1)
Array
(1)
Boilerpipe
(1)
Book Notes
(1)
ChromeOS
(1)
Chromebook
(1)
Codility
(1)
Desgin
(1)
Design
(1)
Divide and Conqure
(1)
GAE
(1)
Google Interview
(1)
Great Stuff
(1)
Hash
(1)
High Tech Companies
(1)
Improving
(1)
LifeTips
(1)
Maven
(1)
Network
(1)
Performance
(1)
Programming
(1)
Resources
(1)
Sampling
(1)
Sed
(1)
Smart Thinking
(1)
Sort
(1)
Spark
(1)
Stanford NLP
(1)
System Design
(1)
Trove
(1)
VIP
(1)
tools
(1)
Popular Posts
-
13 Caves and a Thief Puzzle - Brain Teasers - Google Interview Puzzle There are 13 caves arranged in a circle. There is a thief in one of t...
-
【新提醒】我与Google的虐恋timeline //这是一篇流水账【一亩三分地论坛找工求职版】 - Powered by Discuz! 楼主15年12月毕业,抱着不知道从哪里来的谜一般的自信,对 找工 作毫不着急。结果大家也知道的,去年年底到现在的找工形势都不太好。 毕业后...
-
【新提醒】Google内部换组经历 + AMA【一亩三分地求职版】 - 可能要升职,升职后的第一个rating一般都会默认拿meet。换组之后的第一个rating也一般都会拿meet。所以这个时候留下或走对rating没有区别。很多人会在这个时候换个自己更喜欢的组。LZ准备从SE...
-
Puzzles, Maths and Algorithms: Math Magician A magician has one hundred cards numbered 1 to 100. He puts them into three boxes, a red one, a...
-
【新提醒】Offer中的猫腻――RSU和Option怎么交税【一亩三分地论坛抖包袱版】 - Powered by Discuz! 很多刚毕业的朋友在 找工 作的时候特别关注薪水,往往因为基本工资的一点点差距而做很后悔的决定。事实上,硅谷、湾区的主流观点早就认为收入中股票期权为王,...
-
【新提醒】典型有计划有目标找工经历,最后如愿去了Google【一亩三分地论坛找工求职版】 - Powered by Discuz! master 一年半项目毕业, gpa 3.6 ,修了一些 distributedsy stem 方向的课。 lc 刷了 3 遍, cc150 两遍...
-
How many degrees (if any) are there in the angle between the hour and minute hands of a clock when the time is a quarter past three? each ...
-
Bloomberg 电面+Onsite - 一亩三分地论坛 - Powered by Discuz! 国人+白人: . more info on 1point3acres.com 聊简历。。 URL 背后机制: DNS IP lookup, Load b...
-
【新提醒】Data scientist @ Google,Apple【一亩三分地论坛面经版】 - Powered by Discuz! Apple则是同学内推。面了两轮电话面试,发现工作内容实在跟预想差太多(貌似就是query, generate summary statisti...
-
Amazon OA2 Work Simulation | Keep Walking 目前两大原则: requirement排在第一, deadline第二 有manager出现的选项无脑选manager,manager就是一个组的代言人+保护伞,大哥自己人。 注意事项:...
No comments:
Post a Comment