1. Given a multi-set of numbers (integers), determine if there is a majority element. E.g. S = {4, 1, 3, 4, 4, 3, 3, 4, 4, 1, 4} has 4 in majority S = {4, 4, 4, 3, 1, 1, 1, 3, 2, 2, 2} has no majority In a sequence x1, x2, ..., xn, define the multiplicity of x to be its frequency. A number is in majority if its multiplicity is greater than n/2. (For example, imagine each xi represents a vote; ints are ids of candidates.) One simple solution is to sort; all equal number are consecutive, so can count. Takes O(n log n) time. A more complicated solution uses median---if there is a majority, it must be the median. It can be found in O(n) time, but complex. A much better, simpler linear time solution exists. Observation: Pick any two elements xi and xj. If xi \neq xj, then eliminate BOTH xi and xj from the list. Then, the majority in the original list is still a majority in the reduced list. ALGORITHM: Use two variables C (candidate) and M (multiplicity). Scan elements in their given order. When considering xi, inductively, C is the only candidate for majority among {x1, x2, ..., x_i-1} and M is its count excluding the times C was eliminated. When xi is considered, if C is zero (before comparison), put xi = C and M = 1. else if xi = C, increment M; if xi \neq C, decrement M; At the end, C is the only possible candidate for majority. We make a second pass to "confirm" that. O(n) time. 2. Extension to K elements, each appearing at least n/(k+1) times. Maintain k candidates and counters. When considering xi, if xi equals one of the cands, increment it; if xi different and some counter available, set xi as that cand, and M = 1; if xi different from all, and all cands full; decrement them all. 3. A hot application problem in Data Stream context---heavy hitters in routers, top k popular items, etc... One important attribute of this scheme is that it uses O(1) memory---independent of the stream size. ------------------------------------------------------------------ 4. Celebrity Problem. Among n people, a celebrity is defined as someone who is known to everyone, but who knows no one. Graph model: Person = node. If A knows B, draw a directed edge from A to B. Celebrity is a node with out-degree = 0; in-deg = n-1 (sink). A graph can have at most 1 sink/celebrity. The problem is to identify the celebrity, if one exists, by asking only questions of the following form: "Excuse me, do you know person x?" There are n(n-1)/2 pairs of persons, so potentially we need to ask n(n-1) questions. Turns out one can solve the problem without even looking at all the input! 5. Induction: First attempt. With 2 people, the game is simple. Two questions. For the general case, take out person n, and inductively solve the problem for the remaining (n-1). 3 possible outcomes: the celebrity is among first n-1; the celebrity is the nth guy; no celebrity at all. Case 1 is easiest---just need to check that the nth person knows the celebrity, and the celebrity does not know him. The other two cases are tricky. To determine if nth person is celebrity, we need to ask 2(n-1) questions. But if we ask so many questions in the nth step, the total complexity of the algorithm will be n(n-1) questions. 5. The correct solution is based on the following observation: For any two players A and B, if A knows B, then A cannot be a celebrity; if A does not know B, then B cannot be a celebrity. Thus, with 1 question, we can eliminate one of two candidates. Algorithm: Pick any two players, A and B. Based on the outcome of the question, eliminate the non-celeb, and inductively solve the remaining n-1 people. If the remainder has no celebrity, then done. Otherwise, ask 2 questions to make sure that the celebrity is known to the person eliminated in first round; and that celebrity does not know that person. Thus, the total number of questions is at most 3(n-1). Stack Implementation: Take top two people; with one question, eliminate one of them; repeat, until one person remains. Now go back, and ask 2(n-1) questions, comparing this celeb with (n-1) losers. ------------------------------------------------------------------ 6. Missing Numbers You are processing a stream of n numbers (presented in arbitrary order); each number is presented exactly once. You have limited constant amount of memory---say, a few words. Suppose you are told that EXACTLY one of the numbers from the set {1, 2, ..., n} is missing. Can you identify it? Solution: Keep track of the running sum. Subtract is from n(n+1)/2. How about if 2 numbers are missing? You could try storing s = n(n+1)/2 - \sum xi, and p = n! - \pi xi, giving two equations in two unknowns. But one can solve the same problem with far fewer bits: s = n(n+1)/2 - \sum xi and ss = n(n+1)(2n+1)/6 - \sum xi^2 In general, we could store kth power sums.
Subscribe to:
Post Comments (Atom)
Labels
Algorithm
(219)
Lucene
(130)
LeetCode
(97)
Database
(36)
Data Structure
(33)
text mining
(28)
Solr
(27)
java
(27)
Mathematical Algorithm
(26)
Difficult Algorithm
(25)
Logic Thinking
(23)
Puzzles
(23)
Bit Algorithms
(22)
Math
(21)
List
(20)
Dynamic Programming
(19)
Linux
(19)
Tree
(18)
Machine Learning
(15)
EPI
(11)
Queue
(11)
Smart Algorithm
(11)
Operating System
(9)
Java Basic
(8)
Recursive Algorithm
(8)
Stack
(8)
Eclipse
(7)
Scala
(7)
Tika
(7)
J2EE
(6)
Monitoring
(6)
Trie
(6)
Concurrency
(5)
Geometry Algorithm
(5)
Greedy Algorithm
(5)
Mahout
(5)
MySQL
(5)
xpost
(5)
C
(4)
Interview
(4)
Vi
(4)
regular expression
(4)
to-do
(4)
C++
(3)
Chrome
(3)
Divide and Conquer
(3)
Graph Algorithm
(3)
Permutation
(3)
Powershell
(3)
Random
(3)
Segment Tree
(3)
UIMA
(3)
Union-Find
(3)
Video
(3)
Virtualization
(3)
Windows
(3)
XML
(3)
Advanced Data Structure
(2)
Android
(2)
Bash
(2)
Classic Algorithm
(2)
Debugging
(2)
Design Pattern
(2)
Google
(2)
Hadoop
(2)
Java Collections
(2)
Markov Chains
(2)
Probabilities
(2)
Shell
(2)
Site
(2)
Web Development
(2)
Workplace
(2)
angularjs
(2)
.Net
(1)
Amazon Interview
(1)
Android Studio
(1)
Array
(1)
Boilerpipe
(1)
Book Notes
(1)
ChromeOS
(1)
Chromebook
(1)
Codility
(1)
Desgin
(1)
Design
(1)
Divide and Conqure
(1)
GAE
(1)
Google Interview
(1)
Great Stuff
(1)
Hash
(1)
High Tech Companies
(1)
Improving
(1)
LifeTips
(1)
Maven
(1)
Network
(1)
Performance
(1)
Programming
(1)
Resources
(1)
Sampling
(1)
Sed
(1)
Smart Thinking
(1)
Sort
(1)
Spark
(1)
Stanford NLP
(1)
System Design
(1)
Trove
(1)
VIP
(1)
tools
(1)
Popular Posts
-
A.L. Christensen / Getty Images / Flickr Open Want to avoid giving the impression you lack confidence and authority? Avoid these words Thi...
-
【新提醒】找人内推以前先看这个帖子!内推人建议你如何写简历!【一亩三分地论坛内推版】 - Powered by Discuz! 1. 简历要靠谱。 的确很多人的简历typo一堆,或者言辞过于浮夸,或者跟职位完全不match,或者你18个月后才能上班但是你非要现在申请。 2....
-
【新提醒】我与Google的虐恋timeline //这是一篇流水账【一亩三分地论坛找工求职版】 - Powered by Discuz! 楼主15年12月毕业,抱着不知道从哪里来的谜一般的自信,对 找工 作毫不着急。结果大家也知道的,去年年底到现在的找工形势都不太好。 毕业后...
-
Count number of ways to divide a number in 4 parts - GeeksforGeeks Count number of ways to divide a number in 4 parts ...
-
[BetterExplained]遇到问题为什么应该自己动手 – 刘未鹏 | Mind Hacks 我们在生活中总是在不停地试图做最优经济决策,只不过很多时候我们为适应远古社会而进化的大脑未必适用于现代工业社会(《Mean Genes》,《进化心理学》,《How We Decid...
-
经过Step1-500题训练,接下来可以开始Step2-500题,包括POJ训练计划的298题和SGU前两章200题。需要1-1年半时间继续提高解决问题和编码实现能力,加油ACMer!任重道远 =700) window.open('http://cnc.qzs.qq.co...
-
byoungd/English-level-up-tips-for-Chinese: 可能是让你受益匪浅的英语进阶指南 不久前,备考托福的女神问了我一个问题:如何高效学习英语? 在我思考如何回答这个问题时,想到了在大四上一学期我考过 26 门课的经验(其中重修 19 门,当前学...
-
Build windows service from java application with procrun | Jörg Lenhard's Blog Recently, I needed to adjust a Java application to be abl...
-
【新提醒】关于Google SETI和SWE的碎碎念【一亩三分地论坛找工求职版】 - Powered by Discuz! 最近看到地里有好几个帖子问关于G家SETI的事情,楼主当初面之前也云里雾里的,感觉这方面信息比较缺,现在专门开一帖来回馈地里。 如果你拿到SETI,基本上...
-
【新提醒】雅虎/Verizon 电面【一亩三分地论坛面经版】 - Powered by Discuz! 昨天面了一通领英今天转战雅虎。中午十二点面完,傍晚就收到了现场面试的邀请,看来是真缺人啊。贴面经之前实在忍不住唠嗑一下,虽然很多人已经不再考虑雅虎了,我也只是抱着试一试的态度让...
No comments:
Post a Comment