The Mediator Pattern



The Mediator Pattern

The American Heritage Dictionary defines the word mediator as "one that mediates, especially one that reconciles differences between disputants." In this sense, the mediator pattern usually implements a single object that becomes a shared resource for all of the different pieces of an application. It's a higher-level version of pub/sub: it is commonly used to communicate across the different features of an application, in contrast to pub/sub being used within one feature to communicate among the individual pieces of that same feature.

"Understanding the similarities and differences between an event aggregator and mediator is important for semantic reasons."
- Derick Bailey

This article is part of a series called JavaScript Design Patterns.

Advantages

  • Reduces the communication relationship from "many-to-many" to "many-to-one"
  • Helps us pinpoint dependencies
  • Excellent at decoupling objects which often promotes smaller, reusable components

Read full article from The Mediator Pattern


The Facade Pattern



The Facade Pattern

The purpose of the facade pattern is to conceal the underlying complexity of the code by using an anonymous function as an extra layer. Internal subroutines are never exposed but rather invoked through a facade, which makes this pattern secure in that it never exposes anything to the developers working with it. The facade pattern is both extremely interesting and very useful for adding an extra layer of security to your already minified code. It is especially useful when coupled with the revealing module pattern.

This article is part of a series called JavaScript Design Patterns.

Advantages

  • Enhances security for your web application
  • Works well in combination with other patterns
  • Easy to implement
  • Makes it easy to patch internals
  • Provides a simpler public interface
  • Proven useful for other major libraries such as jQuery

Disadvantages

There aren't any real drawbacks as it provides a unified interface to a set of interfaces in a subsystem. As a result, you aren't forced to make any unwanted compromises, thus a win-win. One possible note worth mentioning is that a developer must decide whether the implicit cost of implementation is really worth the abstraction (though this is generally a small footprint).


Read full article from The Facade Pattern


DATA ANALYSIS: 10 popular Linux commands for Hadoop



DATA ANALYSIS: 10 popular Linux commands for Hadoop

1. sort
A good practice when running Hadoop is to always test the map/reduce programs on the local machine before submitting the time-consuming map/reduce jobs to the cluster. The sort command simulates the sort-and-shuffle step required by the map/reduce process. For example, I can run the piped commands below to verify whether the Python code has any bugs.
./mapper.py | sort | ./reducer.py
2. tail
Interestingly, the Hadoop FS shell only supports the tail command and not the head command, so I can only grab the last lines of the data stored in Hadoop.
hadoop fs -tail data/web.log.9
3. sed
Since the FS shell doesn't provide the head command, the alternative is to use the sed command, which actually has more flexible options.
hadoop fs -cat data/web.log.9 | sed '1,+5!d'
4. stat
The stat command tells me when the file was last modified.
hadoop fs -stat data/web.log.9
5. awk
The commands that the FS shell supports usually have very few options. For example, the du command under the FS shell does not support the -sh option to aggregate the disk usage of subdirectories. In this case, I have to turn to the awk command to get what I need.
hadoop fs -du data | awk '{sum+=$1} END {print sum}'
6. wc
One of the most basic things to know about a file stored in Hadoop is its total number of lines.
hadoop fs -cat data/web.log.9 | wc -l
7. cut
The cut command is convenient for selecting specific columns from a file. For example, I can count the lines for each unique group taken from character positions 5 through 14.
hadoop fs -cat data/web.log.9 | cut -c 5-14 | uniq -c
8. getmerge
The great thing about the getmerge command is that I can fetch the entire map/reduce output to the local file system as a single file.
hadoop fs -getmerge result result_merged.txt
9. grep
I can start a mapper-only job with the grep command from the Bash shell to search for the lines that contain the keywords I am interested in.
hadoop jar $STREAMING -D mapred.reduce.tasks=0 -input data -output result -mapper "bash -c 'grep -e Texas'"
10. at and crontab
The at and crontab commands are my favorites for scheduling a job on Hadoop. For example, I can use the commands below to clean up the map/reduce results at night.
at 0212
at> hadoop fs -rmr result

Read full article from DATA ANALYSIS: 10 popular Linux commands for Hadoop


So it took me almost the full 45 minute Google interview to write this. - John Corser



So it took me almost the full 45 minute Google interview to write this. - John Corser

I'm a bit frustrated with myself. I had the opportunity to do my 45 minute google hangout interview for a job as a software developer at Google (2nd round interview), and the interviewer opened with an EASY problem. The problem was this:

Write a method that takes an array of integers between 0 and 100, and returns a string showing the missing numbers. For example: [0,1,3,50] => "2,4-49,51-99"

As any careful programmer would, I asked questions first. Can I assume this array is already in sorted order? Yes. Can I assume there will be no duplicates? Yes. And can I assume all inputs will be between 0-99, or should I handle outside cases? Don't worry about that stuff, just create the string :).

Awesome, so it sounded like an easy problem. I spent the next 45 minutes stumbling over myself, stopping to "think" and generally feeling like I was messing this up big time, before FINALLY hitting my stride at the end of the interview and coming up with this.
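The author's final code isn't included in this excerpt. For reference, here is one possible solution sketched in Java (my own, not the code written in the interview):

// Walk the sorted array and emit each gap either as a single number or as a "start-end" range,
// assuming (as agreed in the interview) sorted input, no duplicates, and values in 0-99.
public class MissingNumbers {
    public static String missing(int[] sorted) {
        StringBuilder sb = new StringBuilder();
        int next = 0;                                           // smallest value not yet covered
        for (int i = 0; i <= sorted.length; i++) {
            int upper = (i < sorted.length) ? sorted[i] : 100;  // exclusive upper bound of the current gap
            if (next < upper) {                                 // there is a gap [next, upper - 1]
                if (sb.length() > 0) sb.append(',');
                sb.append(next == upper - 1 ? String.valueOf(next) : next + "-" + (upper - 1));
            }
            if (i < sorted.length) next = sorted[i] + 1;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(missing(new int[]{0, 1, 3, 50}));    // prints 2,4-49,51-99
    }
}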


Read full article from So it took me almost the full 45 minute Google interview to write this. - John Corser


Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 7, Fallacy 7: Transport cost is zero - Jianshu



Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 7, Fallacy 7: Transport cost is zero - Jianshu

When explaining this fallacy, Arnon Rotem-Gal-Oz pointed out that it needs to be looked at from the following two angles.

First, you need to consider the cost of moving data between the application and the network interface. Besides the cost of bandwidth and latency, serializing and deserializing the data also affects performance. In its WWDC 2010 session 117 talk on server-driven user experiences, Apple compared the size and load time of several transport formats: XML, JSON, and plist. The sizes of the different formats were roughly the same, but the parsing times differed greatly: parsing XML took 812 ms, JSON 416 ms, an ASCII plist 140 ms, and a binary stream only 19 ms.

Second, you need to consider the cost of maintaining the network service and its infrastructure. How much load does one user put on the server? What are the characteristics of that load? How many requests can the server handle concurrently, and how much does it cost to scale it out? And how much does using your app cost your users (some pay their carrier by traffic volume, others have a flat monthly plan)?


Read full article from Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 7, Fallacy 7: Transport cost is zero - Jianshu


Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 8, Fallacy 8: The network is homogeneous - Jianshu



Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 8, Fallacy 8: The network is homogeneous - Jianshu

Compared with web development, mobile devices are always full of surprises. For any given app, most users may indeed be on similar network configurations. Unfortunately, that assumption will cause problems in some situations.

As with fallacy 6, not all networks have the same configuration. For example, some Wi-Fi networks allow peer-to-peer communication between devices and some do not. Letting a mobile app talk to other devices (say, to a desktop application) can therefore be very difficult, even when they are on the same network. TN2152, "some strategies for transferring files", briefly summarizes a few techniques for communicating between devices and with remote services.

A web service may initially be built just for an iOS app, and opening it up to apps on other devices later should not be too hard. The service may then need to offer multiple data formats, and clients can choose the format they want to receive (XML, JSON), whichever is most convenient. Binary and property-list formats are hard to parse across many kinds of devices. For a service built around a central server this is not a big deal, because the server can hand each client the format it needs and can interpret the different kinds of data. But for apps that need peer-to-peer communication (for example via GameKit), where an iOS app talks to an Android app, or different versions of the same app talk to each other, this becomes a real headache.


Read full article from Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 8, Fallacy 8: The network is homogeneous - Jianshu


Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 6, Fallacy 6: There is one administrator - Jianshu



Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 6, Fallacy 6: There is one administrator - Jianshu

As a developer you control when to release a new version of the app or the server, but nobody controls how many kinds of devices are out there running your app. Users update the app whenever they like, or perhaps never, so you end up supporting many versions at the same time. For network-based apps this is a big headache: you have to handle requests against different versions of different API endpoints simultaneously. In my view it is well worth introducing API versioning from the very beginning. You may not need to support multiple API versions right away, but having such a policy in place, for example moving from api.example.com/1/ to api.example.com/2/ when you need to change existing calls, makes shipping a new app much simpler.

In addition, to support different API versions, the app needs to be able to upgrade its local data. If an app upgrade changes the data format (the data may live on disk, in Core Data, or in the keychain), you need a plan for migrating the user's existing data (at the very least, the upgraded app must not throw when reading old data). You also need to consider a user jumping straight from version 1.0 to 3.0, skipping 2.0 entirely.

Users are the administrators of their own devices: they can impose whatever restrictions they like on the devices your app runs on. Only some of them will let you access location data; only some will allow push notifications. Some will use parental controls to block their children from using Safari, the App Store, or other apps.

Original article: http://blog.carbonfive.com/2010/12/03/iphone-distributed-computing-fallacy-6-there-is-one-administrator/

Translator's summary:

1. The author's main point is that even though you are the app's developer, you don't actually control when the app gets updated, nor the environment it runs in; that is what he means by "there is more than one administrator". In effect, every user is an administrator of your app.

2. I have first-hand experience with the local-upgrade issue the author mentions. I once worked on an instant-messaging tool that stored data locally; later we had to support not just text but also voice and images... The old data structures had to be extended, so the new version had to migrate the old data format on first launch. Lesson learned, the hard way.


Read full article from Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 6, Fallacy 6: There is one administrator - Jianshu


Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 4, Fallacy 4: The network is secure - Jianshu



Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 4, Fallacy 4: The network is secure - Jianshu

Fallacy 4: The network is secure.

Whenever network services are involved, developers have to think about network security from both the design side and the business-requirements side, and iOS is no exception. All the basic attack types apply to network services: session hijacking, credential theft, code injection, and so on. Network security is a complex discipline; for now let's look at a few things specific to iOS apps.

We can only trust a user's device as much as we trust the user (translator's note: the point being that users are laypeople who don't know how to protect their own information). Anyone who installs the app can dig into its contents (translator's note: image resources, private keys, any file stored on the phone) and can easily capture the network traffic the phone sends. Never store sensitive information (network credentials, certificates, encryption keys, and so on) on the device, and never expose it to the user.

The UDID (translator's note: the device's unique identifier; the API was deprecated after iOS 5, after which people invented pseudo-UDIDs or used the MAC address as a unique device serial) is actually a very weak user-authentication mechanism. The UDID is certainly a convenient way to obtain a unique identifier without requiring the user to create an account and log in. I once used the UDID to identify users temporarily in an app, but it is not a long-term solution. The UDID comes from the client and can be tampered with: an attacker can deliberately impersonate another user, or it may change unintentionally, for example when the user upgrades the device's operating system or switches to a new device. In short, a UDID identifies a device, not a user.

Mobile devices frequently join public networks, and those public access points may well be staked out by malicious attackers. Any data your app sends can be intercepted by someone on the same network. If the data is sensitive and must travel over an encrypted connection, Apple's Advanced URLConnection sample shows how to use basic HTTP authorization and authentication when establishing a connection (including self-signed certificates, if you want to ship your own). In the simplest case, NSURLConnection fully supports HTTPS requests; all you have to do is implement two delegate methods.

The original article provides a snippet for establishing an HTTPS request; see it there.

Original article: http://blog.carbonfive.com/2010/11/29/iphone-distributed-computing-fallacy-4-the-network-is-secure/

Some experience from the translator:

1. Don't store sensitive information on the phone. On Android every file directory is public, and anyone who gets hold of the phone can read the files, so never store passwords in a file without encrypting them. iOS looks safer, but on a jailbroken iPhone every directory is readable too.

2. Never store passwords in plain text. A couple of years ago many sites leaked their databases and it turned out the passwords were stored in the clear, even at CSDN. Remember: passwords must be hashed (the original says MD5) before being stored in the database.

3. A new colleague once couldn't understand how network data could possibly be intercepted, so I demonstrated capturing and unpacking traffic with Wireshark. Even HTTPS is not unbreakable: Fiddler can decrypt HTTPS traffic, though it requires installing a certificate on the phone. The scary part is that an attacker can learn the shape of your service API by decrypting the HTTPS traffic.

4. It's not only our own app that calls the service API; in essence the API is exposed to the public, and anyone can call it.

As for Alipay payment requests, the data is signed on the server side; the benefit is that the signing key never has to be stored on the client.


Read full article from Rethinking App Design via the "Eight Fallacies of Distributed Computing", Part 4, Fallacy 4: The network is secure - Jianshu


松果果 - 简书



松果果 - 简书


Read full article from 松果果 - 简书


    Please turn off your waiting-to-die mode! - Jianshu



    Please turn off your waiting-to-die mode! - Jianshu

    The cost of waiting: physical and mental exhaustion; not closing the deal; hurting your other business.

    The cost of pushing through: peace of mind and an early end to the agony; a real chance of success; and if it truly fails, you can focus on new deals.

    Rather than burn through your energy and stamina in waiting-to-die mode, just give it a try! She walked to the restroom, heart pounding, made the call, and was delighted to hear the other party agree readily; they even teased her: why are you only asking now? We thought you had found someone else.

    Once you fall into waiting-to-die mode, the best choice is to act and switch into push-through mode. Pushing through may bring short-term pain, but waiting usually brings a larger, permanent loss.

    Neil Fiore, the author of The Now Habit (《战胜拖拉》), writes in the book:

    "Our real pain comes from the continuous anxiety caused by delay, from the guilt over the poor quality of the work we finish at the last moment, and from the deep regret of losing so many of life's opportunities."


    Read full article from Please turn off your waiting-to-die mode! - Jianshu


    What is Bridge Method and Type Erasure in Java Generics? | So Many Word



    What is Bridge Method and Type Erasure in Java Generics? | So Many Word

    Bridge methods and type erasure in Java generics are advanced concepts, important for experienced Java professionals. They are fairly low-level topics, down at the level of Java bytecode compilation, so it is necessary to have a good understanding of Java generics and their implementation before going deeper into them. Bridge methods and type erasure are really interesting topics in Java: after reading this article, you may be surprised at how generic classes work internally. These topics also matter for experienced Java professionals who want a solid understanding of what lies behind Java generics, because most interviewers expect you to understand the JVM's internal execution and Java bytecode compilation. If you are really interested, let us talk about bridge methods and type erasure.
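    As a quick illustration (my own sketch, not code from the article): after erasure Node.setData has the raw signature setData(Object), so the compiler emits a synthetic bridge method in the subclass to keep overriding working, and reflection can reveal it:

    import java.lang.reflect.Method;

    class Node<T> {
        public T data;
        public void setData(T data) { this.data = data; }
    }

    class StringNode extends Node<String> {
        // After type erasure, Node.setData becomes setData(Object). To preserve
        // overriding, the compiler also generates a synthetic bridge method
        // setData(Object) in StringNode that casts its argument and delegates here.
        @Override
        public void setData(String data) {
            System.out.println("StringNode.setData: " + data);
            super.setData(data);
        }
    }

    public class BridgeDemo {
        public static void main(String[] args) {
            // List the methods the compiler actually emitted for StringNode:
            // both setData(String) and the synthetic setData(Object) bridge show up.
            for (Method m : StringNode.class.getDeclaredMethods()) {
                System.out.println(m + (m.isBridge() ? "   <-- bridge method" : ""));
            }

            // Calling through the erased supertype still dispatches to the override,
            // via the bridge method.
            Node<String> node = new StringNode();
            node.setData("hello");
        }
    }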

    Read full article from What is Bridge Method and Type Erasure in Java Generics? | So Many Word


    How to decide pool size for Thread Pools? | So Many Word



    How to decide pool size for Thread Pools? | So Many Word

    A thread pool is a really important topic to understand if you are developing an application designed around multithreading. A thread pool is a pool of live worker threads; a worker thread accepts a task, completes it, and comes back to the pool to accept another task. This way the application can reuse existing threads many times instead of creating a new thread every time. By using a thread pool in your application, you can boost its performance and utilize resources efficiently.

    But in this article we are not talking about how to create a thread pool or how a thread pool works; for that you can read my other blog post about thread pools. Right now we are only discussing how to decide the pool size for a thread pool, that is, which factors determine how many threads we should create in the pool for better time and resource utilization.

    Most of the time people don't think about the pool size for their thread pool and create however many threads they feel like without giving it much thought. But believe me, if you don't know how to decide the number of threads in the pool, that can be dangerous and can badly hurt your performance and memory management. Therefore, in this article we will discuss which factors thread pool size depends on and how an appropriate pool size affects the performance of your application.

    There are a few factors, described below, on which thread pool size depends:

    Available processors: This is the first and most important factor. The whole point of creating a thread pool, or making a program multithreaded, is to utilize all available resources, such as processors and unused CPU cycles. If your pool size is less than the number of available processors in your system, you are not using all available processors and not utilizing resources fully. On the other side, if your pool size is greater than the number of available processors, you are creating more threads than the pool can keep busy: it is not possible to give CPU cycles to every thread, so you waste memory on threads that will never be utilized, and the thread pool carries the extra overhead of scheduling those threads. Therefore, the ideal pool size is the number of available processors (AP) in your system, or AP + 1. Here is an example of how to get the number of processors available in your system using Java.

    int poolSize  = Runtime.getRuntime().availableProcessors();

    OR

    int poolSize  = Runtime.getRuntime().availableProcessors() + 1;

    This is the ideal pool size if your multithreaded task is pure computation, where threads never block or wait on I/O.

    But if your task also involves blocking or waiting, there is another way to determine the pool size for such tasks, explained in the next section.

    Behavior of tasks: It is important to understand the behavior of the task you want to perform inside a thread. If you have different categories of tasks with different behaviors, choose your thread pool size according to that behavior. If your task is simple computation and involves no blocking or waiting, you can use the number of available processors (AP) or AP + 1, as mentioned in the section above.

    But for tasks that also include I/O or other blocking operations, you need a larger pool, since not all of the threads will be schedulable at all times; some will be waiting. In order to size the pool properly, you must estimate the ratio of waiting time to compute time for your tasks. This estimate need not be precise and can be obtained through profiling or instrumentation. Alternatively, the size of the thread pool can be tuned by running the application with several different pool sizes under a benchmark load and observing the level of CPU utilization.

    The optimal pool size for keeping the processors at the desired CPU utilization is:

    N = number of processors available in the system.

    U = target CPU utilization; 0 <= U <= 1.

    W/C = ratio of wait time to compute time.

    The number of threads for the thread pool can then be evaluated with this formula:

    Number of Threads = N * U * (1 + W/C)

    This way you can estimate a suitable thread pool size.
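    As a small illustration (my own sketch, not from the article), the formula can be wired directly to Runtime.availableProcessors():

    public class PoolSizeEstimator {

        // targetUtilization:   desired CPU utilization U, between 0 and 1
        // waitToComputeRatio:  W/C, how long a task waits (I/O, locks) relative to how long it computes
        public static int estimatePoolSize(double targetUtilization, double waitToComputeRatio) {
            int n = Runtime.getRuntime().availableProcessors();
            return (int) Math.max(1, Math.round(n * targetUtilization * (1 + waitToComputeRatio)));
        }

        public static void main(String[] args) {
            // CPU-bound tasks: W/C is ~0, so the estimate is roughly N.
            System.out.println("CPU-bound : " + estimatePoolSize(1.0, 0.0));
            // I/O-heavy tasks that wait about four times as long as they compute.
            System.out.println("I/O-heavy : " + estimatePoolSize(0.8, 4.0));
        }
    }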

    Amdahl's Law: In application development there are many tasks that cannot be performed entirely concurrently; some parts must be performed sequentially. It is therefore important to understand what proportion of a task can be executed concurrently and how much speedup you would gain by making that portion concurrent. Amdahl's law is very useful for determining how much speedup you can get by splitting a task into parallel and sequential parts.

    According to Amdahl's law, if P is the proportion of the task that can be executed in parallel, then the maximum speedup you can get with N processors (threads) is:

    Speedup = 1 / ((1 - P) + P / N)   (Amdahl's law)

    By applying Amdahl's law you can determine how many threads will actually give you a better result under concurrency.
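    As a quick worked example (numbers assumed purely for illustration): if P = 0.9 and N = 8 threads, the maximum speedup is 1 / ((1 - 0.9) + 0.9 / 8) = 1 / 0.2125 ≈ 4.7, so even a task that is 90% parallelizable cannot get anywhere near an 8x speedup on 8 processors.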


    Read full article from How to decide pool size for Thread Pools? | So Many Word


    yuanhsh's blog: All about URL shorteners



    yuanhsh's blog: All about URL shorteners

    All about URL shorteners

    Internet applications don't always need to be full of features or cover all aspects of your Internet life to be successful. Sometimes it's ok to be simple and just focus on providing a single feature. It doesn't even need to be earth-shatteringly important—it should be just useful enough for its target users. The archetypical and probably most extreme example of this is the URL shortening application or URL shortener.
    This service offers a very simple but surprisingly useful feature. It provides a shorter URL that represents a normally longer URL. When a user goes to the short URL, he will be redirected to the original URL. For this simple feature, top three most popular URL shortening services (TinyURL, bit.ly, and is.gd) collectively had about 11 million unique visitors, 110 million page views and a reach of about one percent of the Internet in June 2009. In 2008, the most popular URL shortener at that time, TinyURL, was made one of Time Magazine's Top 50 Best Websites.
    The idea to shorten long and unwieldy URLs into shorter, more manageable ones has been around for some time. One of the earlier attempts to make it a public service is Make A Shorter Link (MASL), which appeared around July 2001. MASL did just that, though the usefulness was debatable as the domain name was long and the shortened URL could potentially be longer than the original.

    Read full article from yuanhsh's blog: All about URL shorteners


    努橙刷题编: "Layers" of Cache



    努橙刷题编: "Layers" of Cache

    "Layers" of Cache

    There are in fact many layers of cache in a modern PC. This does not even include looking at caches included on some peripherals, such as hard disks. Each layer is closer to the processor and faster than the layer below it. Each layer also caches the layers below it, due to its increased speed relative to the lower levels:

    Level                   Devices Cached
    Level 1 Cache           Level 2 Cache, System RAM, Hard Disk / CD-ROM
    Level 2 Cache           System RAM, Hard Disk / CD-ROM
    System RAM              Hard Disk / CD-ROM
    Hard Disk / CD-ROM      --



    What happens in general terms is this. The processor requests a piece of information. The first place it looks is in the level 1 cache, since it is the fastest. If it finds it there (called a hit on the cache), great; it uses it with no performance delay. If not, it's a miss and the level 2 cache is searched. If it finds it there (level 2 "hit"), it is able to carry on with relatively little delay. Otherwise, it must issue a request to read it from the system RAM. The system RAM may in turn either have the information available or have to get it from the still slower hard disk or CD-ROM. The mechanics of how the processor (really the chipset controlling the cache and memory) "looks" for the information in these various places is discussed here.
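    As a toy illustration of that cascade (my own sketch, nothing from the article; the maps simply stand in for the physical layers):

    import java.util.HashMap;
    import java.util.Map;

    public class LayeredLookup {
        private final Map<String, String> l1 = new HashMap<>();   // fastest, smallest layer
        private final Map<String, String> l2 = new HashMap<>();
        private final Map<String, String> ram = new HashMap<>();

        public String read(String address) {
            String v = l1.get(address);
            if (v != null) return v;                               // L1 hit: no delay
            v = l2.get(address);
            if (v != null) { l1.put(address, v); return v; }       // L2 hit: small delay, promote to L1
            v = ram.get(address);
            if (v != null) { l2.put(address, v); l1.put(address, v); return v; }  // RAM hit
            v = readFromDisk(address);                             // slowest path of all
            ram.put(address, v); l2.put(address, v); l1.put(address, v);
            return v;
        }

        private String readFromDisk(String address) {
            return "data@" + address;                              // stand-in for a ~10 ms disk access
        }
    }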

    It is important to realize just how slow some of these devices are compared to the processor. Even the fastest hard disks have an access time measuring around 10 milliseconds. If it has to wait 10 milliseconds, a 200 MHz processor will waste 2 million clock cycles! And CD-ROMs are generally at least 10 times slower. This is why using caches to avoid accesses to these slow devices is so crucial.

    Caching actually goes even beyond the level of the hardware. For example, your web browser uses caching itself, in fact, two levels of caching! Since loading a web page over the Internet is very slow for most people, the browser will hold recently-accessed pages to save it having to re-access them. It checks first in its memory cache and then in its disk cache to see if it already has a copy of the page you want. Only if it does not find the page will it actually go to the Internet to retrieve it.

    reference: http://www.pcguide.com/ref/mbsys/cache/layers.htm

    Read full article from 努橙刷题编: "Layers" of Cache


    Apple no longer dominates the tablet market it created - Business Insider



    Apple no longer dominates the tablet market it created - Business Insider

    When the iPad first came out in spring 2010, tablets were a rarity. Of course, Microsoft and some of its PC partners had experimented with PCs in a touch-screen slate format, but those tablets never took off. A year later, Apple was selling about 10 million iPads per quarter, and other manufacturers raced to capitalize on the trend. For a couple of years, Apple was the tablet market. As this chart from Statista based on IDC data shows, things have changed. Samsung has come close to matching the iPad's market share for the last year, and nearly 60% of tablets sold now come from a group of smaller players, with nobody consistently dominating that group. (Lenovo is currently the biggest of that group with about a 6% share.) Meanwhile, the overall tablet market has been in decline for the last couple of years; earlier this week, IDC reported that Q2 tablet sales were down 7% from a year ago.

    Read full article from Apple no longer dominates the tablet market it created - Business Insider


    Code Zone: Design a meeting scheduler



    Code Zone: Design a meeting scheduler

    Design a meeting scheduler:
    How you will identify conflicting meetings?
    How you will design data structure and supporting algorithm so that System will suggest Best Available Meeting time considering the fact that many folks on the invite might be in different geography.

    Useful Discussion:
    ------------------
    http://www.careercup.com/question?id=5720778041458688

    Read full article from Code Zone: Design a meeting scheduler


    Introduction to Algorithms, Exercise 9.3-8 - GoAgent - cnblogs



    Introduction to Algorithms, Exercise 9.3-8 - GoAgent - cnblogs

    Problem Description:

    Let X[1...n] and Y[1...n] be two arrays, each containing n numbers already in sorted order.

    Give an O(lgn)-time algorithm to find the median of all 2n elements in array X and Y.

    Problem description (translation):

    X and Y are two sorted arrays, each of length n. We want to find the median of all 2n elements of X and Y in O(lg n) time.

    (Note: the median here is defined as the middle element after sorting when the count is odd, or, when the count is even, the one of the two middle elements with the smaller index.)

    Generalizing the problem:

    The original exercise assumes the two arrays have equal length and asks for the median.

    Let us modify it: suppose the two arrays have lengths m and n, not necessarily equal, and we want to find the i-th smallest of the m + n elements in O(lg m + lg n) time.

    We solve the general case first and then treat the original exercise as a special case.

    Solution:

    As in the figure above (not reproduced in this excerpt), take the two sorted arrays (assumed monotonically non-decreasing), and let aMid = (aBeg + aEnd) / 2 be the index of the middle element of array a.

    {a1} is the set of elements a[aBeg..aMid], and {a2} is the set a[aMid+1..aEnd]. {b1} and {b2} are defined the same way for array b.

    Suppose a[aMid] <= b[bMid] (the case a[aMid] > b[bMid] is symmetric). Then:

    {a1} <= {a2} and {a1} <= {b2}

    That is, for any element of {a1}, at least (aEnd - aMid) + (bEnd - bMid) + 1 elements of a and b are not smaller than it.

    So for any element of {a1}, the number of elements of a and b that are smaller than it is at most

    [(aEnd - aBeg + 1) + (bEnd - bBeg + 1)] - [(aEnd - aMid) + (bEnd - bMid) + 1]

    = (aMid - aBeg) + (bMid - bBeg) + 1 ...... (1)

    Similarly, {b2} >= {b1} and {b2} >= {a1}.

    For any element of {b2}, the number of elements of a and b that are not larger than it is at least (bMid - bBeg) + (aMid - aBeg) + 1

    = (aMid - aBeg) + (bMid - bBeg) + 1 ...... (2)

    From (1) and (2):

    If i <= (aEnd - aMid) + (bEnd - bMid) + 1, then the i-th smallest element cannot be in {b2}.

    In that case we only need to keep looking for the i-th element within {a1}, {a2} and {b1}.

    (When i == (aEnd - aMid) + (bEnd - bMid) + 1, the i-th element can equal a[aMid] or b[bMid] only if a[aMid] == b[bMid]; in that case, although b[bMid] is in {b2}, a[aMid] is not, so the conclusion "the i-th element is not in {b2}" still stands.)

    If i > (aEnd - aMid) + (bEnd - bMid) + 1, then the i-th smallest element cannot be in {a1}.

    In that case we only need to look for the (i - (aMid - aBeg + 1))-th element within {a2}, {b1} and {b2}.

    The same reasoning applies when a[aMid] > b[bMid], with the roles of arrays a and b swapped.

    By the analysis above, every recursive step discards half of one of the two arrays, until the recursion terminates.

    The recursion terminates when one of the arrays becomes empty, at which point we simply pick the i-th element directly from the other array.

    Therefore the time complexity of this algorithm is O(lg m + lg n).
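    A compact Java sketch of this divide-and-conquer (my own code, not the post's; it uses the standard variant that discards about k/2 candidates from one array per step, which gives the same logarithmic behaviour):

    public class KthOfTwoSorted {

        // k is 1-based: the k-th smallest among a[aOff..] and b[bOff..] combined.
        static int kth(int[] a, int aOff, int[] b, int bOff, int k) {
            if (aOff >= a.length) return b[bOff + k - 1];        // a exhausted
            if (bOff >= b.length) return a[aOff + k - 1];        // b exhausted
            if (k == 1) return Math.min(a[aOff], b[bOff]);
            int half = k / 2;
            int aKey = (aOff + half - 1 < a.length) ? a[aOff + half - 1] : Integer.MAX_VALUE;
            int bKey = (bOff + half - 1 < b.length) ? b[bOff + half - 1] : Integer.MAX_VALUE;
            return (aKey < bKey)
                    ? kth(a, aOff + half, b, bOff, k - half)     // the k-th cannot sit in a's first half-block
                    : kth(a, aOff, b, bOff + half, k - half);
        }

        // Median as defined in the post: for an even total, the lower of the two middles.
        static int median(int[] x, int[] y) {
            int total = x.length + y.length;
            return kth(x, 0, y, 0, (total + 1) / 2);
        }

        public static void main(String[] args) {
            System.out.println(median(new int[]{1, 3, 5, 7}, new int[]{2, 4, 6, 8}));  // 4
        }
    }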


    Read full article from Introduction to Algorithms, Exercise 9.3-8 - GoAgent - cnblogs



    rightwayman's blog: [Algo] Exercise 9.3



    rightwayman's blog: [Algo] Exercise 9.3

    Exercise 9.3-1
    In the algorithm SELECT, the input elements are divided into groups of 5. Will the algorithm work in linear time if they are divided into groups of 7? Argue that SELECT does not run in linear time if groups of 3 are used.
    solution:
    (1) Groups of 7 are used.
    The number of elements greater than the median of medians is at least
    4 * (ceil((1/2) * ceil(n/7)) - 2) >= 2n/7 - 8,
    so the recursive call is on at most 5n/7 + 8 elements. The recurrence can be rewritten as
    T(n) <= T(ceil(n/7)) + T(5n/7 + 8) + O(n).
    The process of proving is similar to the textbook, and in the end we can show that T(n) = O(n).

    (2) Groups of 3 are used.
    The number of elements greater than the median of medians is at least
    2 * (ceil((1/2) * ceil(n/3)) - 2) >= n/3 - 4,
    so the recursive call is on at most 2n/3 + 4 elements. The recurrence can be rewritten as
    T(n) <= T(ceil(n/3)) + T(2n/3 + 4) + O(n).
    Before we derive the following steps, we can glance at the recurrence and notice that the arguments on the right-hand side, ceil(n/3) and 2n/3 + 4, already add up to at least n. Suppose that T(n) <= cn for some positive constant c. By substitution, we have
    T(n) <= c*ceil(n/3) + c*(2n/3 + 4) + an <= cn + 5c + an, where an stands for the O(n) term.
    Since the leftover an + 5c term is positive and cannot be absorbed, we cannot derive that T(n) <= cn. Hence, SELECT does not run in linear time if groups of 3 are used.

    Read full article from rightwayman's blog: [Algo] Exercise 9.3


    !hack & redesign: Fun with Guava's ListeningExecutorService



    !hack & redesign: Fun with Guava's ListeningExecutorService

    The Guava API has the advantage of allowing one to register a callback on a ListenableFuture and to apply transformations to futures, resulting in a chain of non-blocking operations, very much like Scala's Futures. Non-blocking concurrency will greatly limit resource usage, if used properly, since there will be no idle threads blocking while waiting on the results of other threads' computations.

    Just for illustration purposes, an example of using Guava's future transformation capabilities to implement fully non-blocking asynchronous computations follows. Enjoy!
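    The article's own example is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of my own (it assumes a reasonably recent Guava, where transform and addCallback take an explicit Executor):

    import com.google.common.util.concurrent.*;
    import java.util.concurrent.Executors;

    public class ListenableFutureDemo {
        public static void main(String[] args) {
            ListeningExecutorService pool =
                    MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(2));

            ListenableFuture<Integer> userId = pool.submit(() -> 42);            // pretend fetch of an id
            ListenableFuture<String> userName = Futures.transform(               // derive a name from it, non-blocking
                    userId, id -> "user-" + id, MoreExecutors.directExecutor());

            Futures.addCallback(userName, new FutureCallback<String>() {         // react when the chain completes
                @Override public void onSuccess(String name) { System.out.println("Loaded " + name); }
                @Override public void onFailure(Throwable t) { t.printStackTrace(); }
            }, MoreExecutors.directExecutor());

            pool.shutdown();
        }
    }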

    Read full article from !hack & redesign: Fun with Guava's ListeningExecutorService


    Longest Consecutive 1s in Binary - My Blog - ITeye



    Longest Consecutive 1s in Binary - My Blog - ITeye


    Read full article from Longest Consecutive 1s in Binary - My Blog - ITeye


    Google Code Jam 2009 Qualification Round, Problem B: Watersheds - xbl1986's column - CSDN.NET



    Google Code Jam 2009 Qualification Round, Problem B: Watersheds - xbl1986's column - CSDN.NET

    Watersheds

    Problem

    Geologists sometimes divide an area of land into different regions based on where rainfall flows down to. These regions are called drainage basins.

    Given an elevation map (a 2-dimensional array of altitudes), label the map such that locations in the same drainage basin have the same label, subject to the following rules.

    • From each cell, water flows down to at most one of its 4 neighboring cells.
    • For each cell, if none of its 4 neighboring cells has a lower altitude than the current cell's, then the water does not flow, and the current cell is called a sink.
    • Otherwise, water flows from the current cell to the neighbor with the lowest altitude.
    • In case of a tie, water will choose the first direction with the lowest altitude from this list: North, West, East, South. (Translator's note: for example, if a 9 has both a 2 and a 3 next to it, the water flows to the 2, not the 3.)

    Every cell that drains directly or indirectly to the same sink is part of the same drainage basin. Each basin is labeled by a unique lower-case letter, in such a way that, when the rows of the map are concatenated from top to bottom, the resulting string is lexicographically smallest. (In particular, the basin of the most North-Western cell is always labeled 'a'.)

    (Translator's note: this long sentence is hard to parse, and Google Translate's rendering has its own flavor. My reading was that water flows from high ground down to a sink along the shortest path; if one path passes through 4 cells and another through 9, take the 4-cell one.)

    Input

    The first line of the input file will contain the number of maps, T. T maps will follow, each starting with two integers on a line -- H and W -- the height and width of the map, in cells. The next H lines will each contain a row of the map, from north to south, each containing W integers, from west to east, specifying the altitudes of the cells.

    Output

    For each test case, output 1+H lines. The first line must be of the form

    Case #X:

    where X is the test case number, starting from 1. The next H lines must list the basin labels for each of the cells, in the same order as they appear in the input.

    Limits

    T ≤ 100;

    Small dataset

    1 ≤ H, W ≤ 10;
    0 ≤ altitudes < 10.
    There will be at most two basins.

    Large dataset

    1 ≤ H, W ≤ 100;
    0 ≤ altitudes < 10,000.
    There will be at most 26 basins.

    Sample

    Input

    5
    3 3
    9 6 3
    5 9 6
    3 5 9
    1 10
    0 1 2 3 4 5 6 7 8 7
    2 3
    7 6 7
    7 6 7
    5 5
    1 2 3 4 5
    2 9 3 9 6
    3 3 0 8 7
    4 9 8 9 8
    5 6 7 8 9
    2 13
    8 8 8 8 8 8 8 8 8 8 8 8 8
    8 8 8 8 8 8 8 8 8 8 8 8 8

    Output

    Case #1:
    a b b        (Translator's note: the 9s are the divide; water flows away from the upper-left-to-lower-right diagonal, the north-west side is labeled 'a' first, then 'b'; two basins.)
    a a b
    a a a
    Case #2:
    a a a a a a a a a b  (Translator's note: the 8 is the divide, with water flowing to its left and right; two basins.)
    Case #3:
    a a a   (Translator's note: this one is harder. In the row "7 6 7" each 7 can only flow to the 6, the 6 cannot flow anywhere and becomes a sink, and nothing in the first row can flow into the second row.)
    b b b   (Translator's note: the second row is identical and cannot flow into the first row, so the two rows get 'a' and 'b' separately. A bit counter-intuitive.)
    Case #4:
    a a a a a   (Translator's note: my reading of this case is in the translated statement above; long sentences really are hard to understand!)
    a a b b a
    a b b b a
    a b b b a
    a a a a a
    Case #5:
    a b c d e f g h i j k l m  (Translator's note: every cell has the same altitude, so each cell is its own sink and gets its own label, running through the whole alphabet.)
    n o p q r s t u v w x y z

    Notes

    In Case #1, the upper-right and lower-left corners are sinks. Water from the diagonal flows towards the lower-left because of the lower altitude (5 versus 6).

    /*-----------------------------------------------------------------------------*/

    (Translator's comments:) From A's experience, figuring out what is actually going on is very important... extremely, extremely important!

    I went through the examples once more, explained each of the given examples, and then corrected some parts of the translation.

    I basically understand the examples now, but I have no idea about the solution yet... OK, I'll keep thinking about it.


    Read full article from Google Code Jam 2009 Qualification Round, Problem B: Watersheds - xbl1986's column - CSDN.NET


    How to do well at Google Code Jam: Solutions to Code Jam Problems



    How to do well at Google Code Jam: Solutions to Code Jam Problems

    Here is a list of solutions to Code Jam problems in python, using the wrapper code that I published in my previous post

    2013 Round 2

    1. Ticket Swapping
    2. Many Prizes
    3. Erdős–Szekeres
    4. Multiplayer Pong

    2012 Round 2

    1. Swinging Wild
    2. Aerobics
    3. Mountain View
    4. Descending in the Dark
    2012 Round 1C
    1. Diamond Inheritance
    2. Out of Gas
    3. Box Factory
    2012 Round 1B
    1. Safety in Numbers
    2. Tide Goes In, Tide Goes Out
    3. Equal Sums
    2012 Round 1A
    1. Password Problem
    2. Kingdom Rush
    3. Cruise Control
    2012 Qualification Round

    1. Speaking in Tongues
    2. Dancing with the Googlers
    3. Recycled Numbers
    4. Hall of Mirrors

    2011 Round 2
    1. Airport Walkways
    2. Spinning Blade
    3. Expensive Dinner
    4. A.I. War
    2011 Qualification Round
    1. Bot Trust
    2. Magicka
    3. Candy Splitting
    4. GoroSort
    2010 Qualification Round
    2009 Qualification Round
    1. Alien Language (input file does not contain a first line of data containing solely the number of problem cases)
    2. Watersheds
    3. Welcome to Code Jam

    Read full article from How to do well at Google Code Jam: Solutions to Code Jam Problems


    Wilan: Google Code Jam 2009 Qualification Round



    Wilan: Google Code Jam 2009 Qualification Round

    Overview:
    The GCJ qualification round started smoothly until the input downloading system stopped working. This wasn't crucial in the qualification round, as participants could just move on to the next problem and submit all the problems at the same time once the system started working again. The number of competitors came to almost 10,000 international participants, many of whom were able to solve at least 1 problem. The problems themselves required no specialised algorithmic or mathematical knowledge, so they proved to be accessible to most participants. A score of 33 was required to progress to Round 1, so any combination of a correct small input and a large input solution was sufficient to qualify.

    Links:

    Read full article from Wilan: Google Code Jam 2009 Qualification Round


    Alien Language, Watersheds and Welcome to Code Jam - Qualification Round 2009 solutions | Code Jam Daemon



    Alien Language, Watersheds and Welcome to Code Jam - Qualification Round 2009 solutions | Code Jam Daemon

    Next in the choice problem list in Practice and Learn is Code Jam 2008, Round 1A: Minimum Scalar Product, that we skip since the problem itself is a bit too easy and the whole round is a bit too hard for now. So let's go to Google Code Jam 2009, Qualification Round.

    First off, in this qualification round you only need to "score at least 33 points to advance to round 1" as you can read above the scoreboard. This means that all you need is to solve the small and large inputs of just one problem. Now start the timer and go practice. Hopefully it'll take less than the 24 hours allowed for the round!

    As this is a qualification round the problems are relatively simple, but they are a good opportunity to learn a few useful thinking and coding patterns.

    Alien Language

    Alien Language is a pretty straightforward pattern matching problem. Brute force is clearly out of the question as it scales with 26 ^ D and the limit for D is 5000. There are a number of clever data structures and algorithms that good programmers could employ to reduce the complexity, but one of the advantages of using a high level language in a coding competition is that you have tons of high level libraries and you can happily cheat in cases like this one. Let's harness the power of the python re module.
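    To make the trick concrete, here is a tiny sketch of the same idea in Java (the post itself uses Python's re; the word list below is made up for illustration): each token like (ab) becomes the character class [ab], and the resulting pattern is matched against every known word.

    import java.util.Arrays;
    import java.util.List;
    import java.util.regex.Pattern;

    public class AlienLanguage {
        // Turn an alien pattern such as "(ab)(bd)(ca)" into the regex "[ab][bd][ca]"
        // and count how many dictionary words it matches.
        static long countMatches(String alienPattern, List<String> words) {
            Pattern p = Pattern.compile(alienPattern.replace('(', '[').replace(')', ']'));
            return words.stream().filter(w -> p.matcher(w).matches()).count();
        }

        public static void main(String[] args) {
            List<String> words = Arrays.asList("abc", "bcd", "ada");
            System.out.println(countMatches("(ab)(bd)(ca)", words));   // 2: "abc" and "ada"
        }
    }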

    Read full article from Alien Language, Watersheds and Welcome to Code Jam - Qualification Round 2009 solutions | Code Jam Daemon


    Daily post: http://www.mitbbs.com/article_t/JobHunting/33015301.html



    Daily post: http://www.mitbbs.com/article_t/JobHunting/33015301.html

    Five hours in total, seven interviewers (including the opening HR session and a one-hour lunch).
    Round one: HR warms you up for 30 minutes and sets expectations up front:
       1. Amazon's base salary is capped at 160K; nobody in the company has a base above that, not even CEO Jeff.
       2. They generally do not make SDE III offers. Amazon SDE II covers roughly the range from MSFT SDE II to
    half of MSFT Senior SDE (levels 60, 61, 62).

    #1
    Design a drone scheduling / delivery system.

    #2
    Median of an integer stream (a two-heap sketch follows after this list).
    http://www.ardendertat.com/2011/11/03/programming-interview-que

    #3
    Design an elevator scheduling system that minimizes everyone's wait time.

    #4
    Given DLL package dependencies, print the installation sequence.
    First use DFS to find the end nodes of the graph, then work upward from the end nodes to the nodes with no remaining children...
    etc.

    50% of the time was spent on behavioral questions.
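    For question #2, a standard two-heap approach works: keep the smaller half in a max-heap and the larger half in a min-heap. A sketch of my own (not from the post or the linked article):

    import java.util.Collections;
    import java.util.PriorityQueue;

    public class StreamMedian {
        private final PriorityQueue<Integer> lower = new PriorityQueue<>(Collections.reverseOrder()); // max-heap: smaller half
        private final PriorityQueue<Integer> upper = new PriorityQueue<>();                           // min-heap: larger half

        public void add(int x) {
            lower.offer(x);
            upper.offer(lower.poll());                 // keep every element of lower <= every element of upper
            if (upper.size() > lower.size())           // rebalance so lower holds the extra element, if any
                lower.offer(upper.poll());
        }

        public double median() {
            return lower.size() > upper.size()
                    ? lower.peek()
                    : (lower.peek() + upper.peek()) / 2.0;
        }

        public static void main(String[] args) {
            StreamMedian m = new StreamMedian();
            for (int x : new int[]{5, 2, 8, 1}) m.add(x);
            System.out.println(m.median());            // 3.5
        }
    }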

    Read full article from Daily post: http://www.mitbbs.com/article_t/JobHunting/33015301.html


    面经网



    面经网

    Just finished a Google phone screen; here is the report.

    The interviewer was a white woman who jumped straight into the problems without asking anything else.

    1.1. Given a number n: if it is divisible by 3, print Func3(n); if it is divisible by 5, print Func5(n); if it is divisible by both 3 and 5, print Func3(Func5(n)). I wrote it with the simplest if-else.
    1.2. Follow-up: make the code generic, e.g. for {3, 5, 7}: if n is divisible by 3 and 7, print Func3(Func7(n)); if it is divisible by 3, 5 and 7, print Func3(Func5(Func7(n))), and so on.

    2. Big integer + 1, where the big integer is represented as a list of characters. This is the original LeetCode problem; I almost introduced a bug while writing it, noticed it while explaining to the interviewer, and fixed it right away.

    3. Design a phone-book system with three operations: look up a number, boolean isTaken(); add a number, void takeNumber(); and return an unused number that is not yet in the book, number giveMeANumber(). I first proposed a HashMap, which makes the first two operations O(1) but the third O(n). The interviewer asked whether the third could be improved; I suggested a BST where every node also records how many available numbers remain under it, so giveMeANumber() just walks down nodes whose count is > 0. That makes all three operations O(lg n), as sketched below.
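    A rough Java sketch of that counted-tree idea (my own illustration, using a digit trie over fixed-length numbers, which is one possible reading of the answer; the number length is assumed):

    public class PhoneDirectory {
        private static final int DIGITS = 4;                        // assumed fixed number length

        private static class Node {
            Node[] child = new Node[10];
            long free;                                              // free numbers remaining in this subtree
            Node(long free) { this.free = free; }
        }

        private final Node root = new Node(pow10(DIGITS));

        public boolean isTaken(String number) {
            Node cur = root;
            for (char c : number.toCharArray()) {
                cur = cur.child[c - '0'];
                if (cur == null) return false;                      // subtree never touched => still free
            }
            return cur.free == 0;
        }

        public void takeNumber(String number) {
            if (isTaken(number)) return;
            Node cur = root;
            cur.free--;
            for (int i = 0; i < DIGITS; i++) {
                int d = number.charAt(i) - '0';
                if (cur.child[d] == null) cur.child[d] = new Node(pow10(DIGITS - i - 1));
                cur = cur.child[d];
                cur.free--;
            }
        }

        public String giveMeANumber() {
            if (root.free == 0) return null;                        // every number is taken
            StringBuilder sb = new StringBuilder();
            Node cur = root;
            for (int i = 0; i < DIGITS && cur != null; i++) {
                for (int d = 0; d <= 9; d++) {
                    Node next = cur.child[d];
                    long freeBelow = (next == null) ? pow10(DIGITS - i - 1) : next.free;
                    if (freeBelow > 0) { sb.append((char) ('0' + d)); cur = next; break; }
                }
            }
            while (sb.length() < DIGITS) sb.append('0');            // untouched subtree: everything below is free
            return sb.toString();
        }

        private static long pow10(int e) { long r = 1; while (e-- > 0) r *= 10; return r; }
    }

    The allocator above hands back the smallest currently free number; marking it as used is then a separate takeNumber() call, matching the three-method interface described in the question.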


    Read full article from 面经网


    Google interview experiences - proudmore's column - CSDN.NET



    Google interview experiences - proudmore's column - CSDN.NET

    input string: "+++--++-+"
    Game rules: each player in turn may flip an adjacent "++" to "--".
    Part 1: generate all possible moves.
    Part 2: determine whether this player can win, i.e. leave the opponent with no consecutive "++" to flip.

    2.
    1. First write linked-list reversal; then the second part: rearrange the list as a1->an->a2->an-1->...
    2. The first question was an easy DP problem on the level of Path Sum. The second: given a few line segments on a canvas, how do you draw them fastest (moving the pen wastes time)?
    Solution: treat each segment as a vertex and the pen moves between endpoints as edges, turning it into a TSP problem.
    3. Given a string, can it be rearranged into a new string in which no two adjacent characters are equal?
    Solution: O(n!)
    4. Resume questions first, then another easy DP problem on the level of Path Sum.
    5. Minesweeper.
    Solution:
    Place the mines and set every other counter to 0, then increment the counters around each mine.
    Click: if it is a mine, you lose; if it is empty, show the counters and run DFS to open the connected region.
    Marking a flag: nothing happens.
    Two layers: UI + matrix.

    3. Design a phone-book system with three operations: look up a number, boolean isTaken(); add a number, void takeNumber(); and return an unused number that is not yet in the book, number giveMeANumber(). I first proposed a HashMap, which makes the first two operations O(1) but the third O(n). The interviewer asked whether the third could be improved; I suggested a BST where every node also records how many available numbers remain under it, so giveMeANumber() just walks down nodes whose count is > 0. That makes all three operations O(lg n).

    1. You need to back up a very large file from server A to server B; the network is slow and the file changes very little. Use rsync: delta updates.

    5.
    1. Word abbreviation,
    e.g. between => b5n, friend => f4d
    Follow-up: implement
    bool checkDuplicate(string[] dict, string word).
    E.g. {feed}, feed => false; {door}, deer => true; {dare}, deer => false
    Return true if some word in dict has the same abbreviation as the input word.
    Follow-up: if it will be called many times, use a HashMap<abbreviation, List<String>>.
    2. Given a list of words, find two strings S & T such that:
    a. S & T have no common character.
    b. S.length() * T.length() is maximized
    Follow up: how to optimize and speed up your algorithm

    6.
    Four onsite rounds:
    Design a random queue (a frequently reported question!); follow-up, design a weighted random queue. This round involved a lot of complexity analysis.
    Round two, another frequently reported question: represent an integer as a sum of squares using the fewest terms.
    Round three, three LeetCode variants: return the minimum depth of a binary tree; the house-robber style problem, i.e. in an array you may not take two adjacent numbers and want to maximize the sum; and find the minimum of a rotated sorted array. A rather pleasant round.
    Round four: the problem was essentially to design an RMQ (http://en.wikipedia.org/wiki/Range_minimum_query) (I only found out afterwards that this was RMQ).
    Solution: http://dongxicheng.org/structure/lca-rmq/
    O(n lg n) preprocessing space and time, O(1) per query.

    Round one: given an array of strings, find a pair (str1, str2) such that (1) the two strings share no character and (2) the product of their lengths is maximal. I told the interviewer that O(n^2) is the intuitive approach; after I wrote it, he asked me to improve it. Later I thought of sorting by length first and then moving two pointers, which should be much better.
    Round two leaned toward C++ fundamentals and concurrency: implement memcpy, and implement a few methods of a bank class, all simple, but you need to understand locking for multithreaded callers. Finally, implement a routine that, on each call, runs for 5 seconds, looping and incrementing a counter the whole time; the follow-up was how to cope with a system administrator awkwardly happening to change the system clock during exactly that window.

    7.
    1. Sort two strings with special cases, e.g. 'ch' sorts after 'j' and before 'k'.
    2. The phone book again: use a trie plus counts, and allocate a new number by walking down branches whose count is > 0.
    3. string[]: find two strings with no common letter and the maximum product of lengths.
    Solution: use a BitSet(26) per word, setting the bit (c - 'a') for each character; two words share no letter iff their bitsets do not intersect.
    Maintain a HashMap<bitset, List<String>> together with top1 and top2; whenever a list's size passes top2 while updating the map, update top1 and top2.
    Or simply encode each word as a2b3c4d1... (a sketch of the bitmask idea follows below.)

    Read full article from Google interview experiences - proudmore's column - CSDN.NET


    Zenifits interview notes | Hello World



    Zenifits interview notes | Hello World

    1. Longest chain: similar to Word Ladder. Deleting any one letter from a word forms a link in a chain if the new word appears in the given dictionary: old word -> new word -> newer word. Return the length (an int) of the longest chain. For example, given vector w = {a, ba, bca, bda, bdca}, the longest chain has length 4: bdca -> bda -> ba -> a. The idea is the same as Word Ladder: sort the set first and start from the longest words (or build a HashMap with word length as the key and a list of words as the value, which also lets you start from the longest). Then, for each word, try deleting every letter and check whether the result is in the dictionary; if it is, increment max and remove the word from the dictionary (removal is fine when you process words from longest to shortest, because that word has already been counted in the longest list). Finally return max. (A sketch of this idea follows at the end of this excerpt.)

    2. N-Queens variant: for each queen, compute the maximum number of threats, where a threat means another queen sits within the squares this queen can move to. P.S. If two queens threaten you from the same direction, only the nearer one counts. Since no two queens appear in the same row or column (guaranteed by the input), only the diagonals and anti-diagonals need to be considered. For example, the board is:

    Onsite questions:
    1. Design a service that returns a unique integer for each request. One option is timestamp + machine id + thread id + counter.
    This one is a LeetCode-style problem; watch the boundary conditions. Discuss the general case first, then the edge cases, and pay attention to the relationship between start, end and k: start and end are coordinates, while k counts from start.
    brain teaser
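    A sketch of the longest-chain idea from question 1 (my own code; it is a bottom-up variant that processes words from shortest to longest instead of deleting from the dictionary, but the recurrence is the same):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class LongestChain {
        public static int longestChain(String[] words) {
            Set<String> dict = new HashSet<>(Arrays.asList(words));
            List<String> sorted = new ArrayList<>(dict);
            sorted.sort((a, b) -> a.length() - b.length());         // shortest words first

            Map<String, Integer> best = new HashMap<>();            // best chain length ending at each word
            int max = 0;
            for (String w : sorted) {
                int len = 1;                                         // the word on its own
                for (int i = 0; i < w.length(); i++) {
                    String shorter = w.substring(0, i) + w.substring(i + 1);
                    Integer prev = best.get(shorter);                // only present if shorter is in the dictionary
                    if (prev != null) len = Math.max(len, prev + 1);
                }
                best.put(w, len);
                max = Math.max(max, len);
            }
            return max;
        }

        public static void main(String[] args) {
            System.out.println(longestChain(new String[]{"a", "ba", "bca", "bda", "bdca"}));  // 4
        }
    }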

    Read full article from Zenifits interview notes | Hello World


    Google Interview - Flip Game - My Blog - ITeye



    Google Interview - Flip Game - My Blog - ITeye


    Read full article from Google Interview - Flip Game - My Blog - ITeye


    Google Interview - Determine whether any two people are related by blood - My Blog - ITeye



    Google Interview - Determine whether any two people are related by blood - My Blog - ITeye


    Read full article from Google Interview - Determine whether any two people are related by blood - My Blog - ITeye


    Problem A: Snapper Chain Solution (Google Code Jam Qualification Round 2010) | LeetCode



    Problem A: Snapper Chain Solution (Google Code Jam Qualification Round 2010) | LeetCode

    Read the question here from GCJ Qualification Round 2010:
    » Problem A: Snapper Chain

    Once you understand how the snapper works, it is easy. This problem can be solved in various ways, but the main observations are:

    Let k be the number of times you snap your finger, n be the light's position, and assume 0 = OFF and 1 = ON.

    The configuration of the first five snappers (with the first snapper on the far left side) as k increases are:
    00000 k = 0
    10000 k = 1
    01000 k = 2
    11000 k = 3
    00100 k = 4
    10100 k = 5
    01100 k = 6
    11100 k = 7
    00010 k = 8


    Read full article from Problem A: Snapper Chain Solution (Google Code Jam Qualification Round 2010) | LeetCode


    Google Codejam 2010 Solutions - Qualification - Zero Wind :: Jamie Wong



    Google Codejam 2010 Solutions - Qualification - Zero Wind :: Jamie Wong

    Problem 1: Snapper Chain

    This is the one I screwed up, because I somehow overlooked the fact that each snap was simply equivalent to a binary increment of the snappers' states. Instead, I did a foolish simulation, which did actually get me through the small test case. But anyway, the real solution is O(1).

    For interest, here's what the snapper chain looks like in the first 30 iterations. The left side, labeled "P:" shows which of the snappers are powered, and the right side, labeled "S:" shows the current state of the snappers, with "#" being ON and "." being OFF in both cases. I did this so it would be easier to see the pattern. The first row is the initial set up, before any snaps.


    Read full article from Google Codejam 2010 Solutions - Qualification - Zero Wind :: Jamie Wong


    Compsci 149s, Fall 2011, Syllabus



    Compsci 149s, Fall 2011, Syllabus

    Compsci 149s, Fall 2011, Syllabus

    Syllabus

    Problems are roughly sorted by difficulty. Problems are usually related to material covered during the lecture; however, there are usually one or two unrelated problems thrown in as well. Read this page to figure out how to test your solutions. Submit your solutions via email to hewner at cs dot duke dot edu. Problems turned in late will receive reduced credit.

    Contents:





    Week 0 (August 29th): Intro/Greedy

    Topics:

    Assignment:

    Due 5PM Monday, September 5th: Do at least two problems from the current problem set, and one problem from next week's set.



    Week 1 (September 5th): Graphs 1

    Topics:

    Assignment:

    Due 5PM Monday, September 12th: Do at least two problems from the Week 1, and one problem from the Week 2 Set. For those new to graph theory, we recommend KingdomXCitiesandVillagesAnother.



    Week 2 (September 12th): Graphs 2

    Topics:

    Assignment:





    Week 3 (September 19th): In-Class Practice / TopCoder SRM 519

    Topics:

    • During this class period we will not focus on a specific topic, but we will break into groups to work on the problem set.
    • The class will be followed by SRM 519. Problems from the SRM will count towards problems solved for this problem set.

    Assignment:





    Week 4 (September 26th): Graphs 3

    Topics:

    Competitions:

    Assignment:

    Due 5PM Monday, October 3rd: Do at least two problems from this week's set, and one problem from next week's set. We suggest looking at the Fibonacci and combination problems.



    Week 5 (October 3rd): Dynamic Programming 1

    Topics:

    • Intro to dynamic programming
    • (optional) TopCoder dynamic programming introduction: Part 3
    • (Notes to be posted.)

    Competitions:

    Assignment:





    Week 6 (October 10th): Fall Break / Max Flow (Optional)

    Topics:

    Competitions:

    Assignment:

    Enjoy fall break! If you want to do a few more problems, we suggest either:
    1. Working on more of the dynamic programming problems above, if you are not so familiar with dynamic programming.
    2. For more advanced students, solving some of the max flow problems below would be helpful.




    Week 7 (October 17th): Dynamic Programming 2 / Math 1

    Topics:

    • More dynamic programming
    • Bitmasking
    • Probability
    • Notes

    Competitions:

    • ACC Scrimmage - October 23rd, 2011, 1PM to 6PM (Problem set here.)

    Assignment:





    Week 8 (October 24th): Math 2

    Topics:

    • BigInteger
    • Useful mathematical algorithms from the library
    • Dot/cross products
    • Polygon area
    • Convex hull
    • (Notes to be posted.)

    Competitions:

    • TopCoder SRM 522 - October 26th, 2011, from 9PM to 11:30PM
    • ACC Scrimmage 2 - October 30th, 2011, 1PM to 6PM

    Assignment:





    Week 9 (October 31st):

    Competitions:





    Week 10 (November 7th): Intro to Ants

    Notes:

    Assignment:

    • Final Ants project requirements (due last day of class) on blackboard
    • Choose your Ants team
    • Get the ants environment set up (you can start here)
    • Write ants that can at least search for food (if you're writing in Java or Python, go through the whole tutorial)
    • Submit your code/team members via blackboard

    Week 11 (November 14th):

    Notes:





    Week 12 (November 21st):





    Week 13 (November 28th):

    Course Evaluation for non-registered students

    Notes:

    Assignment:

    • Submit final Ants project by SUNDAY NIGHT (before 8am on Monday)




    Read full article from Compsci 149s, Fall 2011, Syllabus

