Search a 2D Matrix

go-plus

Install go-plus: run apm install go-plus, or open Atom, go to Preferences > Install, search for go-plus, and install it.

Overview

This package adds extra Atom functionality for the Go language:

Autocomplete using gocode (you must have the autocomplete-plus package activated for this to work)
Formatting source using gofmt, goimports, or goreturns
Code quality inspection using go vet
Linting using golint
Syntax checking using go build and go test
Display of test coverage using go test -coverprofile
Go to definition using godef

Read full article from go-plus

Beyond the Web: 10 surprising Node.js projects | InfoWorld

In only a few years since its original release, Node.js -- the server-side JavaScript engine -- has gone from being a curiosity derived from Google Chromium's V8 to a full-blown platform in its own right. Engineers for high-traffic websites like PayPal have written exuberantly about how Node.js makes it easy to create fast-moving Web frameworks.

But Node.js isn't just a Web stack -- it's a technology coming into its own in many different respects. Here's a collection of projects that use Node.js for server monitoring, for media streaming, for remote control, for desktop and mobile apps, and in other ways that don't confine it to the usual role of Web server -- or even to the server at all.

Read full article from Beyond the Web: 10 surprising Node.js projects | InfoWorld

五年过去了・还记得当年的老罗吗？

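I haven't written anything on Zhihu for a long time, but I often learn things here, so thank you, Zhihu. Today is Zhihu's fifth anniversary. Congratulations.

Moved from the training industry to running a technology company

In 2012 I simply "followed my heart", did what I had always wanted to do, and started a technology company. The three and a half years that followed were the most exhausting of my life, physically and mentally: I faced large-scale criticism, mockery, and slander, lost half my hair, saw my gallstones double in size, and gained twenty percent in body weight. But compared with the endless joy, satisfaction, sense of achievement, and unbelievably warm support and encouragement I received, all of that does not amount to a fart... or is merely a fart (note the careful word choice appropriate to posting on Zhihu #sunglasses#). Barring accidents, the whole rest of my life, apart from family, will be this company. Being able to spend every day on a great cause that I love and that benefits the people, while also taking good care of my family along the way: nothing could be happier than that.

Provoked a big thug while running my first company

To be precise, I did not "provoke" a big thug so much as "engage with" one. Engaging was great fun, but it was a real waste of time. I failed to learn the lesson, and while running my second company I engaged with a small-time thug as well. Time is the most precious thing when running a business. From now on, if a thug provokes me and I have not yet retired, I will note it all down in a little ledger and deal with it after I retire. In principle, truly serious thugs must not be let off; forgiving what should not be forgiven is what is called "a profound moral degradation".

While defending my consumer rights, led many business people to think I was a big thug

When I pursued my complaint against a certain kitchen-appliance company, most media and internet users who understood the background strongly approved of what I did. But to many business people who were ignorant, dull, opinionated without having investigated, and of dubious values, that kitchen-appliance company was "actually pretty good", while this person called "Lao Luo" must be "a big thug you don't want to cross". This caused me some awkwardness later when running my business, but on balance it did more good than harm: 1. They are businesspeople after all, and will still do business with me when there is business to be done; 2. I have neither power nor influence, so having unprincipled businesspeople fear me is a good thing that saves a lot of trouble.

Inexplicably kept up an extremely loose friendship with an idle young man and, by a strange twist, benefited from it

We met online and actually rarely saw each other. I helped him out a few times, but only in trivial ways, and when we did occasionally meet we just traded a couple of idle jokes, or a single one when busy. Yet both times I started a business (I have only run two companies), this young man, whom our whole shared circle of friends regarded as fun-loving, a show-off, and seriously unreliable, casually did me an extremely important favor. Absent a better example, this must be the legendary "benefactor fated to appear in your life". I have a few similar examples in my social experience, and I gradually came to understand why most business people are superstitious... although I will never put a metal toad next to a wood-carved Guan Yu. The last time I went home, a friend even suggested I burn a real phone as an offering to my late father. I said that even if JD.com cut the price of the Jianguo (坚果) phone by 200 yuan to 699 (which, in fact, they have just done!), greatly lowering the cost of burning a phone, I still would not do something so environmentally unfriendly.

With a seemingly hopeless start, conned a large group of kind, upright, smart, talented idealists, young and middle-aged, into starting a company together

Smartisan's (锤子科技) first few software engineers had almost all come from non-Android backgrounds; most read the first line of Android code in their lives only after joining. They came to Smartisan from very good internet or chip companies, and almost none of them believed in Smartisan's prospects, but at the time I was smugly pleased with myself like an idiot and knew nothing of this. Later, during the company's hardest period, some of the less emotionally stable among them often shared their negativity in group chats. I said: if you had so damn little faith in the company's future, why did you come in the first place? They said that they had listened to my recordings back in school and thought of me as the teacher who opened their eyes; seeing me foolishly set out on a suicidal venture, they figured they would spend a year or so helping me get it out of my system, since people who live off their technical skills never have to worry about finding work anywhere. Who knew this lousy company would drag on so damn long without going under... Huh? Why does the company have more and more top people? Could it actually succeed?

View the original on Zhihu (2,734 comments)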

Read full article from 五年过去了・还记得当年的老罗吗？

极分享：高质分享+专业互助=没有难做的软件+没有不得已的加班

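Open Source Rookies of the Year

Thousands of new open source projects appear every year, but only a few truly capture our attention. Some grow because they build on currently popular technologies, while others genuinely open up a new field. Many open source projects start out to solve a production problem; others are ambitious projects launched jointly by like-minded developers around the world.

Since 2009, the open source software management company Black Duck has run the annual Open Source Rookies of the Year program, with selection based on activity on the Open Hub site (formerly Ohloh). This year we are pleased to report the ten winners of the 2015 Open Source Rookies and two honorable mentions, chosen from thousands of open source projects. Selection uses a weighted scoring system based on project activity, pace of delivery, and several other factors.

Open source has become an engine of industry innovation. This year, for example, open source projects related to Docker containers sprang up around the world, which reflects the technology areas that interest enterprises most. Finally, the projects introduced below offer a view of what open source developers worldwide are thinking about, which may soon point the way forward.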

Read full article from 极分享：高质分享+专业互助=没有难做的软件+没有不得已的加班

领跑全球安全行业，为什么是以色列？ - FreeBuf.COM | 关注黑客与极客

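In 2008 Scott Thompson, then president of PayPal, met a young Israeli who claimed he could solve the problem of online payment fraud. It sounded like a fantasy, but the young man said calmly: "Our idea is simple. I believe there are only two kinds of people in this world, good people and bad people. The way to fight fraud is to tell the two apart on the network."

The young man came from an Israeli security company, Fraud Sciences, which focused on anti-fraud detection services.

As a "test", Fraud Sciences analyzed 100,000 PayPal transaction records supplied by Thompson. A few months later they delivered a startling result: the analysis from this Israeli company of only fifty people was 17% more accurate than PayPal's own. Using comparatively little data, they predicted more accurately who was a good customer and who was a bad one.

What followed is well known: PayPal acquired the Israeli security company, then only five years old, for US$169 million.

This was only the beginning. The Israeli security industry had just set sail; will Silicon Valley be its next harbor?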

Read full article from 领跑全球安全行业，为什么是以色列？ - FreeBuf.COM | 关注黑客与极客

ssh登录之忽略known_hosts文件 | Reeoo's Blog

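In the end I still could not push my code, which was frustrating.
When I got home I searched Baidu and found the cause:
ssh records the public key of every computer you have connected to in ~/.ssh/known_hosts. The next time you connect to the same computer, OpenSSH checks the public key. If the key differs, OpenSSH issues a warning to protect you from attacks such as DNS hijacking.
Cause: a machine has several Linux systems installed and switches between them often. These systems share one IP address, so after the first login the host's ssh information is recorded in the local ~/.ssh/known_hosts file. After switching systems, connecting to that host over ssh triggers a conflict warning, and the entry in known_hosts has to be deleted or edited by hand.

I had not switched systems, though. Never mind; the page offered two fixes:

Method 1: manually delete or edit the entry in known_hosts;
Method 2: edit the configuration file "~/.ssh/config", add these two lines, and restart the server.
StrictHostKeyChecking no
UserKnownHostsFile /dev/null

I tried the first method and it did not work well, so I turned to the second; after adding the lines, it worked.

Pros and cons:

Method 1: the file contents must be deleted by hand each time, and some automation scripts cannot run (they fail at ssh login), but it is more secure;
Method 2: ssh login ignores known_hosts, but it is less secure;
These methods may not be very safe, but this will have to do for now, otherwise I could not push my code at all. -_-…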


Read full article from ssh登录之忽略known_hosts文件 | Reeoo's Blog

If null is bad, why do modern languages implement it? - Programmers Stack Exchange

I'm sure designers of languages like Java or C# knew issues related to existence of null references

Of course.

Also implementing an option type isn't really much more complex than null references.

I beg to differ! The design considerations that went into nullable value types in C# 2 were complex, controversial and difficult. They took the design teams of both the languages and the runtime many months of debate, implementation of prototypes, and so on, and in fact the semantics of nullable boxing were changed very very close to shipping C# 2.0, which was very controversial.

Why did they decide to include it anyway?

All design is a process of choosing amongst many subtly and grossly incompatible goals; I can only give a brief sketch of just a few of the factors that would be considered:

Orthogonality of language features is generally considered a good thing. C# has nullable value types, non-nullable value types, and nullable reference types. Non-nullable reference types don't exist, which makes the type system non-orthogonal.
Familiarity to existing users of C, C++ and Java is important.
Easy interoperability with COM is important.
Easy interoperability with all other .NET languages is important.
Easy interoperability with databases is important.
Consistency of semantics is important; if we have reference TheKingOfFrance equal to null does that always mean "there is no King of France right now", or can it also mean "There definitely is a King of France; I just don't know who it is right now"? or can it mean "the very notion of having a King in France is nonsensical, so don't even ask the question!"? Null can mean all of these things and more in C#, and all these concepts are useful.
Performance cost is important.
Being amenable to static analysis is important.
Consistency of the type system is important; can we always know that a non-nullable reference is never under any circumstances observed to be invalid? What about in the constructor of an object with a non-nullable field of reference type? What about in the finalizer of such an object, where the object is finalized because the code that was supposed to fill in the reference threw an exception? A type system that lies to you about its guarantees is dangerous.
And what about consistency of semantics? Null values propagate when used, but null references throw exceptions when used. That's inconsistent; is that inconsistency justified by some benefit?
Can we implement the feature without breaking other features? What other possible future features does the feature preclude?
You go to war with the army you have, not the one you'd like. Remember, C# 1.0 did not have generics, so talking about Maybe<T> as an alternative is a complete non-starter. Should .NET have slipped for two years while the runtime team added generics, solely to eliminate null references?
What about consistency of the type system? You can say Nullable<T> for any value type -- no, wait, that's a lie. You can't say Nullable<Nullable<T>>. Should you be able to? If so, what are its desired semantics? Is it worthwhile making the entire type system have a special case in it just for this feature?

And so on. These decisions are complex.

Read full article from If null is bad, why do modern languages implement it? - Programmers Stack Exchange

极分享：高质分享+专业互助=没有难做的软件+没有不得已的加班

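Like all good projects, this one came about to solve a real problem.

As Site Reliability Engineers, we are responsible for managing Google's infrastructure. We deal with a large number of internal services that need load balancing for scalability and reliability. In 2012 we had two different platforms providing load balancing, both of which posed management and stability challenges to varying degrees. To ease these problems, our team set out to find a replacement load balancing platform.

After evaluating a number of platforms, including existing open source projects, we could not find one that met all of our requirements, so we decided to develop a reliable and scalable load balancing platform ourselves. The requirements were not particularly complex: we needed to handle traffic for unicast and anycast virtual IPs (VIPs), perform load balancing with NAT and DSR (also known as DR), and run health checks against the backends. Above all, we needed a platform that was easy to manage and could deploy configuration changes automatically.

One of the two existing platforms was built on Linux LVS, which provided the necessary load balancing at the network layer. This had proven successful, so we chose to keep it in the new platform. We settled several design decisions early in the project. The first was to use Go, because it offers powerful concurrency primitives (goroutines and channels) and a convenient inter-process communication mechanism (net/rpc). The second was to implement a modular, multi-process architecture. The third was that, on encountering an unknown state, a process should simply abort and terminate, ideally leading to failover and/or self-recovery.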

Read full article from 极分享：高质分享+专业互助=没有难做的软件+没有不得已的加班

什么样的程序员适合去创业公司 - 程序视界，漫谈程序人生，原创，有趣，有料，有能量 - 博客频道 - CSDN.NET

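2. Risks that startups pose to programmers

There is a TV series called "Beijingers in New York" (《北京人在纽约》), directed by Zheng Xiaolong and Feng Xiaogang and starring Jiang Wen and Wang Ji. A lavish lineup: the men handsome and deep, the women beautiful and poised. It was a huge hit at the time.

One line from the series became especially popular: if you love him, send him to New York, for it is heaven; if you hate him, send him to New York, for it is hell.

We can adapt it for startups and programmers: "If you love him, send him to a startup, for it is heaven; if you hate him, send him to a startup, for it is hell."

OK, whether it is good or bad depends entirely on how each person feels about it. So the list of risks and warnings below is for reference only.

Startups have a small chance of success, 1% or lower.
Startups take a long time to pay off. Take the stock and options everyone is so fond of: they can only be cashed in when the company goes public or raises another round. Taking IPOs as an example, V1 (第一视频) was founded in 2005 and listed in 2006, which is rocket speed; KongZhong (空中网) was also fairly quick, founded in 2002 and listed on NASDAQ in 2004, taking 2 years and 2 months; Jumei (聚美优品) was founded in 2010 and listed in 2014, taking 4 years and 2 months... And those are the fast ones. Alibaba took more than a decade to go public, and many companies have no hope of listing at all, playing out the line "He died before his campaign succeeded, and heroes ever after weep for him". Most of the companies programmers join are of this last kind.
Not everyone at a startup gets shares or options... you know how it is. Even if you pick one of the 1% and survive the long march to a payout, you may end up with nothing to show for it...
Working hours at startups are irregular. OK, this is extremely common: all kinds of overtime. The married work overtime until their families fall apart, those with girlfriends work overtime until they are single, and the single work overtime until they have no friends...
Your role is unclear. You know how it goes: grow fast, be full-stack, do the work of ten, fill in wherever someone is needed. If you lack adaptability, you may end up bewildered or feeling pulled in every direction at once.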

Read full article from 什么样的程序员适合去创业公司 - 程序视界，漫谈程序人生，原创，有趣，有料，有能量 - 博客频道 - CSDN.NET

极分享：高质分享+专业互助=没有难做的软件+没有不得已的加班

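Valuation methods commonly used in the internet industry include P/E (price-to-earnings), P/S (price-to-sales), P/GMV (market value over transaction volume), P/orders, P/users, and so on. Which method to use when has long been debated in the industry. This article tries to find the relationships among these methods and offer some constructive views.

Let us first look at the funding history of a hypothetical social networking company:

Angel round: the company is founded by a serial entrepreneur and receives angel investment at the start.
Series A: one year later the company closes Series A. MAU (monthly active users) has reached 500,000, ARPU (average revenue per user) is 0 yuan, and revenue is 0.
Series A+: user numbers grow rapidly after Series A, and six months later the company closes Series A+. MAU has reached 5 million and ARPU is 1 yuan. The company now has some revenue (5 million yuan) because it has begun monetizing a small amount of traffic through advertising.
Series B: one year later the company closes Series B. MAU has reached 15 million, ARPU is 5 yuan, and revenue has reached 75 million yuan. ARPU keeps rising because the company has found effective monetization through advertising, games, and similar channels.
Series C: one year later the company closes Series C. MAU is 30 million, ARPU is 10 yuan, and monetization through advertising, games, e-commerce, memberships, and other channels is flourishing. Revenue reaches 300 million yuan, and the company has become profitable; assuming a 20% net margin, net profit is 60 million yuan.
IPO: from then on the company sustains steady 30-50% annual growth in revenue and profit, and lists one year after Series C.

This is the funding history of a typical successful internet company: founded by a serial entrepreneur, backed by well-known VCs in every round, and listed about five years after founding. We can see shades of Momo (陌陌) and other internet companies in it. How is the valuation in each round worked out?
Let us make some further assumptions and go through the rounds in reverse chronological order:
IPO: after listing, the public market gives the company a P/E of 50. Careful, professional readers will immediately see that the stock no longer offers much investment value, since PEG > 1 (P/E divided by growth). Apparently the best time to invest was during the private rounds, and the VCs and PEs took all the money. ^_^

At Series C, different investors valued the company in different ways: some at 50x P/E, some at 10x P/S, some at 100 yuan per monthly active user, yet every one arrived at 3 billion yuan. Check the arithmetic if you doubt it. Each method has its logic: 50x earnings for a company planning to list on ChiNext (创业板) is reasonable; 10x sales for a typical internet company is common in the United States; and US$15-20 per user follows from the valuations of Facebook, Twitter, and similar companies, with a discount applied.

At Series B, the investors' methods began to diverge. One investor valued only by P/E and offered 50x earnings, but the company had no profit, so its valuation came to 0. Another valued by P/S at 10x sales, giving 10 × 75 million = 750 million yuan. A third valued by P/MAU at 100 yuan per MAU, giving 100 yuan × 15 million users = 1.5 billion yuan. The gap between methods is striking: at this stage P/E no longer works, and P/S and P/MAU both still apply but differ by a factor of two. Suppose the company settled on 1 billion yuan, a value between 750 million and 1.5 billion, and accepted the VC investment.

At Series A, both P/E and P/S fail, but at 100 yuan per user the company can still be valued at 100 yuan × 5 million users = 500 million yuan (the 5 million users here is the MAU figure given above for Series A+). Few VCs could understand the company at this stage, and most had many reservations, but the company chose a highly capable VC that dared to value by P/MAU and firmly believed the company would generate revenue, and it accepted investment at a 500 million yuan valuation.

At the angel round the company had no users, revenue, or profit, so P/E, P/S, and P/MAU all fail. How was it valued? The company needed a few million yuan to get started, and because the founder was a well-known entrepreneur the VCs were willing to put in a bit more, so say 20 million yuan, for a stake that is neither too small nor too large, 20%, closing at a valuation of 100 million yuan.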
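The Series C and Series B arithmetic can be checked with a short sketch. The multiples (50x P/E, 10x P/S, 100 yuan per MAU) and the company figures come from the text; the function and variable names are hypothetical, chosen only for this illustration.

```python
# Multiples used in the article's example.
PE_MULTIPLE = 50       # price-to-earnings
PS_MULTIPLE = 10       # price-to-sales
YUAN_PER_MAU = 100     # valuation per monthly active user


def valuations(mau, revenue, net_profit):
    """Return the valuation in yuan under each of the three methods."""
    return {
        "P/E": PE_MULTIPLE * net_profit,
        "P/S": PS_MULTIPLE * revenue,
        "P/MAU": YUAN_PER_MAU * mau,
    }


# Figures from the funding history (users and yuan).
series_c = valuations(mau=30_000_000, revenue=300_000_000, net_profit=60_000_000)
series_b = valuations(mau=15_000_000, revenue=75_000_000, net_profit=0)

for name, result in (("Series C", series_c), ("Series B", series_b)):
    for method, value in result.items():
        # Print in units of 100 million yuan, the unit the article reasons in.
        print(f"{name} {method}: {value / 1e8:.1f}")
```

At Series C all three methods agree, while at Series B the P/E method collapses to zero and the other two differ by a factor of two, which is the divergence the article describes.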

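To summarize: this internet company's angel-round valuation method was a gut call; Series A used P/MAU; Series B used P/MAU and P/S; Series C used P/MAU, P/S, and P/E. Perhaps years after listing, when the internet company has turned into a traditional company, people will value it by P/B (price-to-book). Think back: do most funding rounds not go like this?

For internet companies, P/MAU has the widest coverage as a valuation system and P/E the narrowest. I will call this coverage the "order" of a valuation system. P/MAU is a low-order system with the highest tolerance; P/E is a high-order system that demands the most of a company.

Different valuation methods converge. Consider the formula: net profit (E, earnings) = revenue (S, sales) - costs and expenses = users (MAU) × revenue per user (ARPU) - costs and expenses. Generally, if a company has no E, investors can still invest on S; if it has no S, they can still invest on MAU, but ultimately they expect traffic to turn into revenue and revenue into profit. Startups are at different stages: some are racing to grow their user base, some are racking their brains to monetize traffic, and some think every day about how to turn a profit. In the end, though, every company is judged on profit, and at that point the valuation methods of different orders converge.

Why companies that are doing well die at Series B or Series C: some companies have a large user base that never converts into revenue. If at the next round (say Series B) investors insist on the higher-order P/S system, the valuation works out to 0 and the company cannot raise money, hence death at Series B. Other companies have decent revenue but no profit in sight. If at the next round (say Series C) they face PE firms that value only on net profit, the P/E valuation is 0 and the company cannot raise money, hence death at Series C.

The range over which each valuation system is used shifts with the economic cycle. In a bull market, valuation systems shift later: this explains why, over the past two years, many companies that had never earned a net profit still closed Series C, D, or even E rounds, funded by traditional PE firms, who had moved down an order and started using the lower-order P/S tool. In a bear market, valuation systems shift earlier: this explains why, since the second half of this year, some companies with healthy revenue and user growth have been unable to raise money, or have had to merge to keep each other warm, because even many VCs now demand profit and have shelved the low-order valuation systems.

Secondary-market policy has a clear steering effect. Why has China long had few RMB VCs? Part of the reason is that China's public capital markets recognize only P/E, the high-order valuation system. Look at the ChiNext listing rules: "(1) profitable for two consecutive years, with cumulative net profit of no less than 10 million yuan... or (2) net profit of no less than 5 million yuan and operating revenue of no less than 50 million yuan in the most recent year...". A company must have that much profit to list and to have value on the secondary market, which is a very high bar. If a company has only users or only revenue, then even with 1 billion users or 10 billion yuan in revenue, without profit its valuation is 0. That is why there are few RMB VCs and many PE firms: they answered the government's call to use only the P/E tool, since otherwise there is no exit channel. US and Hong Kong listings, by contrast, have P/S-based tests, and a company that reaches a certain scale can go public. If a company could be valued on P/S alone for a good while after listing (eventually it may still have to be valued on P/E), that would connect most stages of a company's development and make valuation at every round go more smoothly.

That concludes this look at how the valuation systems relate and how to use them. I hope founders and investors among the readers can apply these principles as they move between bull and bear markets and between funding rounds, and I hope legislators among the readers will recognize the power of valuation systems of every order and improve the rules so that they steer innovation.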

Read full article from 极分享：高质分享+专业互助=没有难做的软件+没有不得已的加班

Tiger's leetcode solution: 330. Patching Array ！！！！！！！！

Read full article from Tiger's leetcode solution: 330. Patching Array ！！！！！！！！

Patching Array - Algorithms and Problem SolvingAlgorithms and Problem Solving

This is a Leetcode problem. Example 1: Return 1. Combinations of nums are [1], [3], [1,3], which form possible sums of 1, 3, 4. Now if we add/patch 2 to nums, the combinations are [1], [2], [3], [1,3], [2,3], [1,2,3]. The possible sums are 1, 2, 3, 4, 5, 6, which now cover the range [1, 6]. So we only need 1 patch. Example 2: Return 2. Example 3: Return 0. Note that we must be able to construct all numbers from 1 to n. A naive take on this problem would be to go from 1 to n and add every number missing from the array. But this is certainly not optimal, because we don't need all the consecutive numbers in order to form another number by adding them up. For example, take A=[1,2,4]. We don't need 3 in the list, because we can make 3 by adding 1 and 2. We don't need 5 either. But if we want to make 8 then we have to add 8 to the list. Once we have added 8, i.e. A'=[1,2,4,8], we don't need to add any more elements to cover everything up to 15 (why?). Basically,
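The excerpt breaks off before giving a solution, so what follows is not the article's code. It is a minimal sketch of the usual greedy approach, assuming nums is sorted in ascending order; the names min_patches and reach are hypothetical. The idea: if every value in [1, reach) can already be formed, then either the next array element is at most reach and extends the range for free, or reach itself must be patched in, which doubles the range.

```python
def min_patches(nums, n):
    """Return the fewest numbers to add so every value in [1, n] is a subset sum.

    Assumes nums is sorted in ascending order and contains positive integers.
    """
    patches = 0
    reach = 1          # every value in [1, reach) can currently be formed
    i = 0
    while reach <= n:
        if i < len(nums) and nums[i] <= reach:
            # Using nums[i] extends the formable range to [1, reach + nums[i]).
            reach += nums[i]
            i += 1
        else:
            # reach itself cannot be formed: patch it in, doubling the range.
            reach += reach
            patches += 1
    return patches


print(min_patches([1, 3], 6))        # the case worked through in Example 1
print(min_patches([1, 2, 4], 15))    # the A=[1,2,4] discussion: only 8 is needed
```

With A'=[1,2,4,8] the formable range is [1, 16), which is why nothing more is needed up to 15.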

Read full article from Patching Array - Algorithms and Problem SolvingAlgorithms and Problem Solving

Jsoup代码解读之一-概述 - ImportNew

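Today I came across a main-content extractor written in Python, happily reimplemented it in Java, and put it into webmagic, only to discover that Jsoup already has one... I feel thoroughly unreliable. Never mind; time to settle down and learn from something good.

Jsoup is the go-to choice in the Java world for HTML parsing and filtering. It supports parsing HTML into a DOM tree, selection with CSS selectors, and HTML filtering, and it also ships with its own HTTP downloader. Starting today I will write a series reading through the Jsoup source code, and I will try to be more thorough than in my earlier posts.

Overview

Jsoup's code is quite concise: 53 classes in total, with no third-party dependencies. Compared with SAXON, whose final distribution is 9.8 MB, it is truly small and lean.

jsoup
├── examples # examples: one converts HTML to plain text, one extracts all link URLs
├── helper # utility classes for reading data, handling connections, and string conversion
├── nodes # DOM node definitions
├── parser # parses HTML and converts it into a DOM tree
├── safety # security: whitelist and HTML filtering
└── select # selectors: CSS selectors and NodeVisitor-style traversal

Usage

The entry point to Jsoup is the Jsoup class. The examples package provides two examples that parse HTML and then operate on DOM elements using CSS selectors and NodeVisitor respectively.

Here is the ListLinks example, showing how to call Jsoup:

// print() and trim() are helper methods defined in the ListLinks example class
public static void main(String[] args) throws IOException {
    Validate.isTrue(args.length == 1, "usage: supply url to fetch");
    String url = args[0];
    print("Fetching %s...", url);

    // Download the URL and parse it into an HTML DOM
    Document doc = Jsoup.connect(url).get();
    // select() picks elements; the argument is a CSS selector expression
    Elements links = doc.select("a[href]");

    print("\nLinks: (%d)", links.size());
    for (Element link : links) {
        // The abs: prefix yields the absolute URL
        print(" * a: <%s> (%s)", link.attr("abs:href"), trim(link.text(), 35));
    }
}

Jsoup uses its own DOM code. Elements, Element, and so on share names and concepts with the Java XML API org.w3c.dom, but they have no relationship at the code level. In other words, you cannot operate on Jsoup results with the XML APIs. That is exactly what lets Jsoup drop some of the cumbersome XML APIs and keep its code simpler.

Another way is to traverse the DOM tree with a NodeVisitor, which is useful when analyzing or rewriting an entire HTML document:

public interface NodeVisitor {
    // Called when a node is first reached
    public void head(Node node, int depth);

    // Called when a node is left (after all of its children have been visited)
    public void tail(Node node, int depth);
}

The HtmlToPlainText example shows how to use a NodeVisitor to traverse the DOM tree, convert HTML to plain text, and replace tags that need a line break with \n:

public static void main(String... args) throws IOException {
    Validate.isTrue(args.length == 1, "usage: supply url to fetch");
    String url = args[0];

    // fetch the specified URL and parse to a HTML DOM
    Document doc = Jsoup.connect(url).get();

    HtmlToPlainText formatter = new HtmlToPlainText();
    String plainText = formatter.getPlainText(doc);
    System.out.println(plainText);
}

public String getPlainText(Element element) {
    // Define a custom NodeVisitor: FormattingVisitor
    FormattingVisitor formatter = new FormattingVisitor();
    // Wrap the FormattingVisitor in a NodeTraversor
    NodeTraversor traversor = new NodeTraversor(formatter);
    // Run the traversal
    traversor.traverse(element);
    return formatter.toString();
}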
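To make the order of the head and tail callbacks concrete, here is a small language-neutral sketch in Python. It is not Jsoup's API: Node, traverse, and the toy tree are hypothetical stand-ins that only mirror the idea of calling head on entering a node and tail after all of its children have been visited.

```python
class Node:
    """Toy tree node (hypothetical; stands in for a DOM node)."""
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def traverse(node, head, tail, depth=0):
    """Depth-first walk that calls head on entry and tail on exit."""
    head(node, depth)
    for child in node.children:
        traverse(child, head, tail, depth + 1)
    tail(node, depth)


events = []
tree = Node("html", [Node("body", [Node("p"), Node("a")])])
traverse(
    tree,
    head=lambda n, d: events.append(("head", n.name, d)),
    tail=lambda n, d: events.append(("tail", n.name, d)),
)

for kind, name, depth in events:
    print("  " * depth + f"{kind} {name}")
```

A formatter built this way can emit text in head and append line breaks in tail, which is roughly the role the article assigns to FormattingVisitor.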

The next installment will start analyzing the Jsoup code from the DOM structure.

Read full article from Jsoup代码解读之一-概述 - ImportNew

Redis 设计与实现 — Redis 设计与实现

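Welcome to the companion website for "Redis Design and Implementation" (《Redis 设计与实现》)!

The book gives a comprehensive and complete account of how Redis works internally. It covers the implementation of most of Redis's single-machine features and all of its multi-machine features, and presents the core data structures and key algorithmic ideas behind them. By reading it, readers can quickly and effectively learn Redis's internal structure and operation, and so learn to use Redis more efficiently.

You can get the latest news about the book by visiting this site or by following the author on Weibo, Twitter, and Douban.

To buy the book, visit JD.com, china-pub, Amazon, or Dangdang; Kindle, Duokan, and Douban Read editions are also on sale.

Contents and features

The book covers:

The underlying data structures behind the five key types: string, hash, list, set, and sorted set.
Redis's object handling mechanism and how the database is implemented.
How transactions are implemented.
How publish/subscribe is implemented.
How Lua scripting is implemented.
How the SORT command is implemented.
How bit-manipulation commands such as BITOP and BITCOUNT are implemented.
How the slow log is implemented.
How RDB persistence and AOF persistence are implemented.
How the Redis event handler is implemented.
How the Redis server and client are implemented.
How the three multi-machine features, replication, Sentinel, and cluster, are implemented.

Features of the book:

Plenty of diagrams and tables to help readers understand the material.
A focus on high-level design rather than low-level implementation code, so readers can understand Redis internals without spending time studying the code.
Redis source code annotated in Chinese, for readers who want to study further.

Table of contents and sample chapters

The book has 388 pages, in 4 parts and 24 chapters.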

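Introduction
    Version notes
    Chapter layout
    Recommended reading approach
    Writing conventions
    Companion website

Part 1: Data structures and objects

Simple dynamic strings
    Definition of SDS
    Differences between SDS and C strings
    SDS API
    Key points
    References

Linked lists
    Implementation of lists and list nodes
    List and list node API
    Key points

Dictionaries
    Implementation of dictionaries
    Hash algorithm
    Resolving key collisions
    rehash
    Incremental rehash
    Dictionary API
    Key points

Skip lists
    Implementation of skip lists
    Skip list API
    Key points

Integer sets
    Implementation of integer sets
    Upgrade
    Benefits of upgrading
    Downgrade
    Integer set API
    Key points

Ziplists (compressed lists)
    Structure of a ziplist
    Structure of a ziplist node
    Cascading updates
    Ziplist API
    Key points

Objects
    Object types and encodings
    String objects
    List objects
    Hash objects
    Set objects
    Sorted set objects
    Type checking and command polymorphism
    Memory reclamation
    Object sharing
    Object idle time
    Key points

Part 2: Implementation of the single-machine database

Databases
    Databases in the server
    Switching databases
    Database keyspace
    Setting a key's time to live or expiration time
    Expired-key deletion strategies
    Redis's expired-key deletion strategy
    How AOF, RDB, and replication handle expired keys
    Database notifications
    Key points

RDB persistence
    Creating and loading RDB files
    Automatic periodic saving
    RDB file structure
    Analyzing an RDB file
    Key points

AOF persistence
    Implementation of AOF persistence
    Loading an AOF file and restoring data
    AOF rewrite
    Key points

Events
    File events
    Time events
    Event scheduling and execution
    Key points
    References

Clients
    Client attributes
    Creating and closing clients
    Key points

Server
    Execution of a command request
    The serverCron function
    Initializing the server
    Key points

Part 3: Implementation of the multi-machine database

Replication
    Implementation of old-style replication
    Shortcomings of old-style replication
    Implementation of new-style replication
    Implementation of partial resynchronization
    Implementation of the PSYNC command
    Implementation of replication
    Heartbeat detection
    Key points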

Read full article from Redis 设计与实现 — Redis 设计与实现

Check if leaf traversal of two Binary Trees is same? - GeeksforGeeks

Leaf traversal is sequence of leaves traversed from left to right. The problem is to check if leaf traversals of two given Binary Trees are same or not.

Expected time complexity O(n). Expected auxiliary space O(h1 + h2) where h1 and h2 are heights of two Binary Trees.

Examples:

Input: Roots of below Binary Trees
         1
        / \
       2   3
      /   / \
     4   6   7

         0
        / \
       5   8
        \ / \
        4 6  7
Output: same
Leaf order traversal of both trees is 4 6 7

Input: Roots of below Binary Trees
         0
        / \
       1   2
      / \
     8   9

         1
        / \
       4   3
        \ / \
        8 2  9
Output: Not Same
Leaf traversals of two trees are different.
For first, it is 8 9 2 and for second it is 8 2 9

We strongly recommend you to minimize your browser and try this yourself first.
A Simple Solution is to traverse the first tree and store its leaves from left to right in an array. Then traverse the other tree and store its leaves in another array. Finally compare the two arrays. If both arrays are the same, return true.

The above solution requires O(m+n) extra space where m and n are nodes in first and second tree respectively.

How to check with O(h1 + h2) space?
The idea is to use iterative traversal. Traverse both trees simultaneously, look for a leaf node in both trees and compare the found leaves. All leaves must match.

Algorithm:

1. Create empty stacks stack1 and stack2
   for iterative traversals of tree1 and tree2

2. insert (root of tree1) in stack1
   insert (root of tree2) in stack2

3. Store current leaf nodes of tree1 and tree2
   temp1 = (root of tree1)
   temp2 = (root of tree2)

4. Traverse both trees using stacks
   while (stack1 or stack2 is not empty)
   {
       // Means excess leaves in one tree
       if (one of the stacks is empty)
           return false

       // get next leaf node in tree1
       temp1 = stack1.pop()
       while (temp1 is not leaf node)
       {
           push right child of temp1 to stack1 (if present)
           push left child of temp1 to stack1 (if present)
           temp1 = stack1.pop()
       }

       // get next leaf node in tree2
       temp2 = stack2.pop()
       while (temp2 is not leaf node)
       {
           push right child of temp2 to stack2 (if present)
           push left child of temp2 to stack2 (if present)
           temp2 = stack2.pop()
       }

       // If leaves do not match return false
       if (temp1 != temp2)
           return false
   }

5. If all leaves matched, return true
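The algorithm above can be sketched in Python as follows. This is one possible rendering, not the article's own code; Node, next_leaf, and same_leaf_traversal are hypothetical names. Each stack holds at most one pending sibling per level, so the extra space stays within O(h1 + h2).

```python
class Node:
    """Minimal binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def next_leaf(stack):
    """Pop from the stack until a leaf is found; return None if none remain."""
    while stack:
        node = stack.pop()
        if node.left is None and node.right is None:
            return node
        # Push right first so the left subtree is processed first.
        if node.right is not None:
            stack.append(node.right)
        if node.left is not None:
            stack.append(node.left)
    return None


def same_leaf_traversal(root1, root2):
    """Compare the left-to-right leaf sequences of two trees."""
    stack1 = [root1] if root1 is not None else []
    stack2 = [root2] if root2 is not None else []
    while True:
        leaf1 = next_leaf(stack1)
        leaf2 = next_leaf(stack2)
        if leaf1 is None and leaf2 is None:
            return True      # both trees ran out of leaves together
        if leaf1 is None or leaf2 is None:
            return False     # one tree has extra leaves
        if leaf1.val != leaf2.val:
            return False


# The two example pairs from the text.
a = Node(1, Node(2, Node(4)), Node(3, Node(6), Node(7)))
b = Node(0, Node(5, None, Node(4)), Node(8, Node(6), Node(7)))
c = Node(0, Node(1, Node(8), Node(9)), Node(2))
d = Node(1, Node(4, None, Node(8)), Node(3, Node(2), Node(9)))

print(same_leaf_traversal(a, b))
print(same_leaf_traversal(c, d))
```

The sketch compares leaf values rather than node identity, which is what the examples in the text imply.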

Read full article from Check if leaf traversal of two Binary Trees is same? - GeeksforGeeks

Search a 2D Matrix | LeetCode题解

Search a 2D Matrix

Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:

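Integers in each row are sorted from left to right.
The first integer of each row is greater than the last integer of the previous row.

For example,

Consider the following matrix:

[
  [1,   3,  5,  7],
  [10, 11, 16, 20],
  [23, 30, 34, 50]
]

Problem translation: given a matrix and a target value, write an efficient algorithm that determines whether the value exists in the matrix. The matrix has the following properties.

In each row, values are sorted in ascending order from left to right.
The first value of each row is greater than the last value of the previous row (so each column is also sorted in ascending order from top to bottom).

Analysis: a brute-force solution with two nested loops finds the answer in O(m·n) time. Note, however, that the problem gives two properties of the matrix, and both help greatly in improving the algorithm's time complexity. First we give an O(m+n) solution: start at the top-right element and, using the increasing and decreasing pattern along its row and column, decide whether the target value exists in the matrix.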
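The article's own solution is cut off in this excerpt, so the following is only a minimal Python sketch of the top-right search described above; search_matrix is a hypothetical name. Each comparison discards either one column (the current value is too large) or one row (it is too small), giving at most m + n steps.

```python
def search_matrix(matrix, target):
    """Staircase search from the top-right corner of a row/column-sorted matrix."""
    if not matrix or not matrix[0]:
        return False
    row, col = 0, len(matrix[0]) - 1
    while row < len(matrix) and col >= 0:
        value = matrix[row][col]
        if value == target:
            return True
        if value > target:
            col -= 1   # everything below in this column is even larger
        else:
            row += 1   # everything to the left in this row is even smaller
    return False


example = [
    [1, 3, 5, 7],
    [10, 11, 16, 20],
    [23, 30, 34, 50],
]
print(search_matrix(example, 3))
print(search_matrix(example, 13))
```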

class Solution {

Read full article from Search a 2D Matrix | LeetCode题解

喜刷刷: [LeetCode] Search a 2D Matrix

[LeetCode] Search a 2D Matrix

Write an efficient algorithm that searches for a value in an m x n matrix. This matrix has the following properties:

Integers in each row are sorted from left to right.
The first integer of each row is greater than the last integer of the previous row.

For example,

Consider the following matrix:

[
  [1,   3,  5,  7],
  [10, 11, 16, 20],
  [23, 30, 34, 50]
]

Given target = 3, return true.

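Approach: search for the row index and the column index separately.

Look for the pattern in the elements on each of the four boundaries (left, right, top, bottom):

1. Left boundary: A[i, 0] is the minimum of row i ~ row m-1.

When target < A[i, 0], continue searching in row 0 ~ row i-1. Otherwise nothing can be concluded, because row 0 ~ row i-1 may also contain values greater than A[i, 0].

2. Right boundary: A[i, n-1] is the maximum of row 0 ~ row i.

When target > A[i, n-1], continue searching in row i+1 ~ row m-1. Likewise, nothing can be concluded otherwise.

3. Top boundary: A[0, j] is the minimum of col j ~ col n-1.

When target < A[0, j], continue searching in col 0 ~ col j-1. Otherwise nothing can be concluded.

4. Bottom boundary: A[m-1, j] is the maximum of col 0 ~ col j.

When target > A[m-1, j], continue searching in col j+1 ~ col n-1. Otherwise nothing can be concluded.

For every boundary point there is always one comparison outcome that cannot rule out any region, whereas in a search we want each comparison with target to shrink the search area. The trick is that each of the matrix's four corners (i, j) lies on two boundaries at once. We want a pair of boundary conditions such that both target < A[i, j] and target > A[i, j] shrink the search range. Looking at the four boundaries above, the combination of 1 and 4, or of 2 and 3, qualifies.

Combination 1 and 4 corresponds to the bottom-left corner: when target < A[i, j], search upward; otherwise search to the right.

Combination 2 and 3 corresponds to the top-right corner: when target < A[i, j], search to the left; otherwise search downward.

Taking the bottom-left corner of combination 1 and 4 as the starting point, the solution is as follows: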
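The original solution is not included in this excerpt. As a stand-in, here is a minimal Python sketch of the bottom-left variant; search_from_bottom_left is a hypothetical name and the code is not the author's.

```python
def search_from_bottom_left(matrix, target):
    """Start at the bottom-left corner; move up or right on each comparison."""
    if not matrix or not matrix[0]:
        return False
    i, j = len(matrix) - 1, 0
    while i >= 0 and j < len(matrix[0]):
        if matrix[i][j] == target:
            return True
        if target < matrix[i][j]:
            i -= 1   # boundary 1: target can only be in the rows above
        else:
            j += 1   # boundary 4: target can only be in the columns to the right
    return False


grid = [
    [1, 3, 5, 7],
    [10, 11, 16, 20],
    [23, 30, 34, 50],
]
print(search_from_bottom_left(grid, 3))
```

Starting from the top-right corner works symmetrically, with "left" and "down" taking the place of "up" and "right".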

Read full article from 喜刷刷: [LeetCode] Search a 2D Matrix

2 remaining jail escapees arrested in San Francisco, showed no major resistance, police say - LA Times

Photo: the two Orange County jail escapees arrested, Hossein Nayeri (left) and Jonathan Tieu.

The two Orange County jail escapees who remained at large after a daring escape eight days ago were arrested in San Francisco after a citizen noticed a van matching the description of the one they had allegedly stolen parked in a lot near a Whole Foods Market, officials said Saturday. Authorities said police were attending to an unrelated medical emergency when a man flagged down officers. The man told them he suspected that a white van parked near the market at Haight and Stanyan streets was "the one wanted in the Orange County jail escape," said San Francisco police Officer Grace Gatpandan. Officers began looking in the immediate area for the man -- later identified as Hossein Nayeri -- and found him near Waller and Stanyan streets, Gatpandan said. Police chased and arrested him after a short pursuit. When the officers went back to the van,

Read full article from 2 remaining jail escapees arrested in San Francisco, showed no major resistance, police say - LA Times

Wal-Mart Closures Bring Out the Amazon Sellers - Digits - WSJ

前端自动化之神器 ― Gulp--花满楼

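Node.js not only brought JavaScript to the server, it also set off a wave of automation on the front end and drove a historic shift in front-end work. Today let us learn about a front-end automation power tool: Gulp.

No discussion of automation is complete without Grunt. This predecessor has a mature community and several thousand ready-made plugins that you can install and configure by following the docs (see the companion article on Grunt, "前端自动化之利剑――Grunt"). Gulp arrived hoping to take the best of Grunt and replace it as the most popular JavaScript build tool. Gulp prefers code over configuration, keeping simple things simple and making complex things manageable.

Differences from Grunt:

Streams: Gulp is a stream-based build system that prefers code over configuration.
Plugins: Gulp plugins are more focused, each with a single function, sticking to one plugin doing one thing.
Code over configuration: maintaining Gulp feels like writing code, and since Gulp follows the CommonJS convention, it is no different from writing a Node program.
No intermediate files are produced.

Gulp's advantage is that it processes files as streams, connecting multiple tasks and operations through pipes, so there is only one round of I/O and the flow is clearer and cleaner. Gulp does away with intermediate files and writes only the final output to disk, which makes the whole process faster.

Next, create a Gulpfile.js in the project root. We will complete the following tasks:

Lossless image compression
Compiling and minifying Sass files
Minifying and concatenating JS files
Watching all of the above tasks for changes

In cmd, go to the project root and install Gulp and the plugins needed: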

Read full article from 前端自动化之神器 ― Gulp--花满楼

Git 2.7: a Major New Release with Many New Features and Improvements

Two months after the release of version 2.6, Git 2.7 has been announced, bringing many new features as well as performance improvements.

Here is a selection of the major changes that Git 2.7 includes:

git remote supports a get-url subcommand that will show the URL for a given remote.
git rebase added a new command line option, --no-autostash, that will override the rebase.autostash configuration variable.
git worktree supports a list subcommand to display a repository's worktrees and their associated branches. Worktrees are a feature first included in Git 2.5 to make it easier to work on multiple branches of the same repository.
git bisect has been made to work nicely when used concurrently on multiple worktrees. Additionally, the command now supports old and new subcommands to make its use less confusing than with the previous bad and good subcommands. bisect is useful when hunting for a hard to identify state change that produced some undesired effect. It allows developers to mark a good/old commit and a bad/new commit so a binary search can be carried through those commits looking for the one that broke things.
git submodule supports a new configuration option, push.recurseSubmodules, to help developers avoid pushing changes to the main module without first pushing their changed submodules. The same effect could be obtained by using the --recurse-submodules=on-demand option on the command line, but the new push.recurseSubmodules option can make that behaviour the default.
git stash supports a new configuration option, stash.showPatch, to make it always show the actual patch instead of a list of the paths of affected files. This behaviour could be obtained in Git 2.6 by using the -p flag on the command line.
On the performance front, progress has been made to rewrite git submodule in C.
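The binary search that bisect performs between an old/good commit and a new/bad commit can be illustrated with a short Python sketch. This is a conceptual model only, not how Git implements bisect (real history is a graph rather than a list); first_bad, the commit names, and the is_bad predicate are all hypothetical.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is known good/old and commits[-1] is known bad/new.
    """
    lo, hi = 0, len(commits) - 1   # lo: known old, hi: known new
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # the change happened at mid or earlier
        else:
            lo = mid               # the change happened after mid
    return commits[hi]


history = [f"c{i}" for i in range(10)]      # c0 is oldest, c9 is newest
broken_since = 6                            # pretend c6 introduced the change
result = first_bad(history, lambda c: int(c[1:]) >= broken_since)
print(result)
```

Each test halves the remaining range, so about log2(n) commits need to be checked rather than all n.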

Read full article from Git 2.7: a Major New Release with Many New Features and Improvements

Finding the process that is using a certain port in Linux: lsof and netstat - Sigmainfy - Technical blog to learn, share and to inspire


Use the lsof command:

lsof -i tcp:9527

will give you the list of processes using tcp port 9527.

Alternatively,

sudo netstat -nlp

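will list all listening sockets (-l) with numeric addresses and ports (-n), along with the PID and name of the process that owns each one (-p).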

Read full article from Finding the process that is using a certain port in Linux: lsof and netstat – Sigmainfy – Technical blog to learn, share and to inspire

大数据技术大合集：Hadoop家族、Cloudera系列、spark、storm.. (转) - Sigmainfy - Technical blog to learn, share and to inspire

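Everyone in big data knows Hadoop, but all sorts of other technologies keep coming into view: Spark, Storm, Impala, faster than we can keep up. To help architect big data projects well, this article sorts them out so that engineers, project managers, and architects can choose suitable technologies, understand how the various big data technologies relate, and choose a suitable language.

We can read this article with the following questions in mind:
1. What technologies does Hadoop include?
2. What is Cloudera's relationship to Hadoop, what products does it have, and what are their features?
3. How is Spark related to Hadoop?
4. How is Storm related to Hadoop?

The Hadoop family

Founder: Doug Cutting

The Hadoop family consists of the following subprojects:

Hadoop Common:

The lowest-level module in the Hadoop stack, providing utilities to the other Hadoop subprojects, such as configuration file and logging operations.

HDFS:

The primary distributed storage system for Hadoop applications. An HDFS cluster contains a NameNode (master node), which manages all of the file system's metadata, and DataNodes (data nodes, of which there can be many), which store the actual data. HDFS is designed for massive data, so whereas traditional file systems are optimized for large numbers of small files, HDFS is optimized for access to and storage of smaller numbers of large files.

MapReduce:

A software framework for easily writing parallel applications that process massive (terabyte-scale) data, reliably and fault-tolerantly, across large clusters of tens of thousands of nodes (commodity hardware).

Hive:

Apache Hive is a data warehouse system for Hadoop. It facilitates data summarization (mapping structured data files to a database table), ad hoc queries, and analysis of large datasets stored in Hadoop-compatible systems. Hive provides full SQL query capability through the HiveQL language, and when expressing a piece of logic in that language becomes inefficient or cumbersome, HiveQL also lets traditional Map/Reduce programmers plug in their own custom Mappers and Reducers. Hive is similar to CloudBase: a set of software built on the Hadoop distributed computing platform that provides data warehouse SQL functionality, simplifying summarization and ad hoc querying of the massive data stored in Hadoop.

Pig:

Apache Pig is a platform for analyzing large datasets. It consists of a high-level language for expressing data analysis programs and the infrastructure for evaluating them. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which enables them to handle very large datasets. Pig's infrastructure layer contains a compiler that produces Map-Reduce jobs. Pig's language layer currently consists of a native language, Pig Latin, designed for ease of programming and extensibility.

Pig is an SQL-like language, a high-level query language built on MapReduce that compiles operations into the Map and Reduce phases of the MapReduce model and lets users define their own functions. It was developed by Yahoo's grid computing team as another clone of a Google project, Sawzall.

HBase:

Apache HBase is the Hadoop database, a distributed, scalable big data store. It provides random, real-time read/write access to large datasets and is optimized for very large tables on clusters of commodity servers: tens of billions of rows by tens of millions of columns. At its core it is an open source implementation of the Google Bigtable paper, a distributed column-oriented store. Just as Bigtable builds on the distributed storage provided by GFS (Google File System), HBase provides Bigtable-like capabilities on top of HDFS for Apache Hadoop.

ZooKeeper:

ZooKeeper is an open source implementation of Google's Chubby. It is a reliable coordination system for large distributed systems, providing configuration maintenance, naming, distributed synchronization, group services, and more. ZooKeeper's goal is to encapsulate complex, error-prone key services and offer users a simple, easy-to-use interface on top of an efficient and stable system.

Avro:

Avro is an RPC project led by Doug Cutting, somewhat similar to Google's protobuf and Facebook's Thrift. Avro is intended to serve as Hadoop's RPC layer in the future, making Hadoop's RPC module faster and its data structures more compact.

Sqoop:

Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database into Hadoop's HDFS and export data from HDFS into a relational database.

Mahout:

Apache Mahout is a scalable machine learning and data mining library. Mahout currently supports four main use cases:

Recommendation mining: collects user actions and uses them to recommend things users might like.
Clustering: takes documents and groups related documents together.
Classification: learns from existing categorized documents, finds similar features in documents, and assigns unlabeled documents to the correct category.
Frequent itemset mining: takes a set of item groups and identifies which individual items usually appear together.

Cassandra:

Apache Cassandra is a high-performance, linearly scalable, highly available database that runs on commodity hardware or cloud infrastructure, making it a solid platform for mission-critical data. Cassandra's replication across data centers is best in class, giving users lower latency and more reliable disaster recovery. With strong support for log-structured updates, denormalization, and materialized views, along with powerful built-in caching, Cassandra's data model offers convenient secondary indexes (column indexes).

Chukwa:

Apache Chukwa is an open source data collection system for monitoring large distributed systems. Built on the HDFS and Map/Reduce frameworks, it inherits Hadoop's scalability and robustness. Chukwa also includes a flexible and powerful toolkit for displaying, monitoring, and analyzing results to make the best use of the collected data.

Ambari:

Apache Ambari is a web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters, with support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig, and Sqoop. Ambari also provides a dashboard for cluster health, such as heatmaps, and the ability to view MapReduce, Pig, and Hive applications and diagnose their performance characteristics in a user-friendly interface.

HCatalog:

Apache HCatalog is a table and storage management service for data created with Hadoop. It:

Provides a shared schema and data type mechanism.
Provides a table abstraction so that users need not be concerned with how or where their data is stored.
Provides interoperability across data processing tools such as Pig, MapReduce, and Hive.

Chukwa:

Chukwa is a Hadoop-based monitoring system for large clusters, contributed by Yahoo.

Cloudera products: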

Read full article from 大数据技术大合集：Hadoop家族、Cloudera系列、spark、storm.. (转) - Sigmainfy - Technical blog to learn, share and to inspire

Absolute Value for MIN_VALUE in Java: Overflow Issue - Sigmainfy - Technical blog to learn, share and to inspire


Integer.MIN_VALUE is -2147483648, but the highest value a 32 bit integer can contain is +2147483647. Attempting to represent +2147483648 in a 32 bit int will effectively “roll over” to -2147483648. This is because, when using signed integers, the two’s complement binary representations of +2147483648 and -2147483648 are identical. This is not a problem, however, as +2147483648 is considered out of range.
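The wrap-around can be reproduced outside Java. Python integers do not overflow, so the sketch below simulates 32-bit two's complement arithmetic by masking; to_int32 and abs_int32 are hypothetical helpers written for this illustration, and the result mirrors the documented behaviour of Java's Math.abs(Integer.MIN_VALUE), which returns Integer.MIN_VALUE.

```python
INT_MIN = -2**31        # -2147483648, the value of Integer.MIN_VALUE
INT_MAX = 2**31 - 1     # +2147483647, the value of Integer.MAX_VALUE


def to_int32(x):
    """Wrap an arbitrary Python int to a signed 32-bit two's complement value."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x >= 0x8000_0000 else x


def abs_int32(x):
    """Mimic a 32-bit abs: negate when negative, then wrap to 32 bits."""
    return to_int32(-x) if x < 0 else x


print(abs_int32(-5))
print(abs_int32(INT_MIN))     # negating INT_MIN wraps back to INT_MIN
print(to_int32(INT_MAX + 1))  # +2147483648 has the same bit pattern as INT_MIN
```

Code that relies on abs returning a non-negative value, for example to index into a hash table, typically widens to a 64-bit type first or handles MIN_VALUE as a special case.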

Read full article from Absolute Value for MIN_VALUE in Java: Overflow Issue – Sigmainfy – Technical blog to learn, share and to inspire

朋友圈模糊了？咱有 mitmproxy >> Topics >> 中国软件匠艺小组


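Because it is implemented in Python, installation is fairly simple: if you have pip installed you can just run pip install mitmproxy. Most people who use robotframework probably already have pip. On my Mac, however, the pip install got stuck on a library called pyparsing; as I recall, version 2.0.1 could not be found, and searching pypi.python.org showed the latest version of the library was 2.0.7. In the end I had to download the source and install pyparsing by hand, then rerun the pip install of mitmproxy. The demonstration below was done on a Mac; for other platforms, see the official website (you will need a way around the firewall). For Mac there is also a prepackaged prebuilt binary available.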

Read full article from 朋友圈模糊了？咱有 mitmproxy » Topics » 中国软件匠艺小组

Find Missing Number - Algorithms and Problem SolvingAlgorithms and Problem Solving

1. Given a set of positive numbers less than or equal to N, where one number is missing. Find the missing number efficiently.
2. Given a set of positive numbers less than or equal to N, where two numbers are missing. Find the missing numbers efficiently.
3. Given a sequence of positive numbers less than or equal to N, where one number is repeated and another is missing. Find the repeated and the missing numbers efficiently.
4. Given a sequence of integers (positive and negative). Find the first missing positive number in the sequence.

Solutions should use no more than O(n) time and constant space.

For example,

1. A=[2,1,5,8,6,7,3,10,9] and N=10 then, 4 is missing.
2. A=[2,1,5,8,6,7,3,9] and N=10 then, 4 and 10 are missing.
3. A=[2,1,5,8,3,6,7,3,10] and N=10 then, 3 is repeating and 9 is missing.
4. A=[1,2,0] then the first missing positive is 3; A=[3,4,-1,1] then the first missing positive is 2.

Single Number Missing
A trivial approach would be to sort the array and loop over indices 0 to N-1, checking whether index i contains the number i+1. This takes constant space but O(n lg n) time. We could use a counting sort instead, but that still takes O(n+k) time and O(k) space. We need O(n) time and constant space. How?

It's rather simple if we apply some elementary mathematics. We know that the input set contains positive numbers less than or equal to N. If no number were missing, the sum of all the numbers would be N*(N+1)/2, the sum of the numbers 1 to N. If one number is missing, the sum S of the given numbers is less than the expected sum N*(N+1)/2, and the difference (N*(N+1)/2 – S) is the missing number. This is an O(n) time and O(1) space algorithm.
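The summation idea can be sketched in a few lines. This is a minimal illustration, not the article's own implementation; the class and method names are hypothetical, and it assumes the input holds each of 1..N exactly once except for the single missing value. The sums are kept in long so that N*(N+1)/2 does not overflow for large N.

```java
final class MissingNumber {

    // Returns the one value in 1..n that does not appear in a.
    static int findMissing(int[] a, int n) {
        long expected = (long) n * ((long) n + 1) / 2; // sum of 1..n
        long sum = 0;
        for (int v : a) {
            sum += v;
        }
        return (int) (expected - sum);
    }

    public static void main(String[] args) {
        int[] a = {2, 1, 5, 8, 6, 7, 3, 10, 9};
        System.out.println(findMissing(a, 10)); // 4
    }
}
```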

Read full article from Find Missing Number - Algorithms and Problem Solving

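Google is hiring software engineers

Responsibilities:

We are looking for the most innovative engineers to design and develop Google's international products and architecture in search, advertising, and other areas, and to build complex, large-scale applications that affect billions of users worldwide.

Minimum qualifications:

Bachelor of Science degree in computer science or a similar technical field (or equivalent practical experience).
Software development experience in one or more general-purpose programming languages.
Work experience in at least two of the following areas: web application development, Unix/Linux environments, mobile application development, distributed and parallel systems, machine learning, information retrieval, natural language processing, networking, large software system development, security software development.
Strong working ability and good English communication skills (written and spoken).

Preferred qualifications:

Master's or PhD degree in engineering, computer science, or another related technical field, or continuing education in these fields, or relevant work experience.
Experience in one or more general-purpose programming languages, including but not limited to: Java, C/C++, C#, Objective C, Python, JavaScript, or Go.
Experience developing large-scale mobile applications (iOS or Android).
Experience building developer-facing tools and services, or work on building a developer ecosystem.
A passion for exploring the unknown and an innovative spirit in solving the toughest technical problems.

Area:

Google is a company with a spirit of technical innovation. It is now, and that will not change. We require employees to have wide-ranging technical skills, to be ready to take on the toughest technical problems, and to be able to make an impact on billions of users. At Google, engineers are responsible not only for innovating in search technology but also, in their daily work, for developing and maintaining large-scale, scalable storage solutions used by developers worldwide, large applications, and entirely new platforms. Google engineers keep changing the world with one technical achievement after another. We hope you will join our team and make the world a better place.

How to apply:

Read full article from Google is hiring software engineers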

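I wasn't even a computer science student, so how did I end up in a CS master's program? - mindwind - 博客园

Once I had this idea and decided to take the graduate entrance exam, I gave up job hunting, but there was not much time left. My target was clear: the computer science department. Back then there were still publicly funded graduate places, and I figured that getting one would save some money. But things rarely turn out as hoped. My score was neither high nor low: not enough for the computer science department, but enough for software engineering, so the university reassigned everyone who fell just short of computer science to software engineering. Fate plays tricks. With no time to lose I could only go. The tuition was even higher, 20,000 a year, and I gritted my teeth and accepted it. Also, the software engineering master's (this was the program's first cohort) took only two years, whereas computer science took three. At graduation, a two-year master's was nicknamed a "mini master's" (小硕).

Looking back now, I was lucky: studying for only two years really did save me a year. Because software engineering was a new major, as an undergraduate I had gulped down in two years the courses that computer science students took in four, and it felt like indigestion. In graduate school I spent a year shoring up fundamentals and organizing what I knew into a system. The second year had almost no classes, so that we could intern at a company and then find a thesis topic (software engineering was positioned toward applications, so topics were practical rather than purely theoretical). So I went to intern at 从兴电子, part of the 立信 group next to the university. They were building an OA system at the time, and of course I was not up to anything core. I helped organize internal documents, learned about the technology stack a real company used, and drew an intern's wage of 1,000 a month; in the words of the 小道 articles, it was pure "subsidizing society". After a few months I found it dull, and since I did not expect to stay at that company anyway, I went back to school to read more, found an advisor, and picked a more theoretical direction for some research. In truth, most of what goes into many graduate students' final theses is cobbled together. If you ask what I really learned in this period, it was probably not knowledge but how to learn. For that alone graduate school was not wasted, and it has benefited me ever since.

In the blink of an eye it has been ten years since I finished my master's, and I have taken part in my company's campus recruiting for the last five. Regrettably, eight out of ten of the candidates I interview hold master's degrees; people looking for work with only a bachelor's degree now seem like a rare species. I wonder why the share of students going on to graduate school is so high now. When I took the exam it was two in ten, and it was certainly not my first choice at the time. In fact, when I hire students I lean toward those with a bachelor's degree: someone with a bachelor's who has worked for three years and is even a little driven will be stronger than someone fresh out of a master's program. And if the point of graduate school is a better-paid job: I am not sure how large the gap in starting salary between master's and bachelor's graduates is today, but when Huawei recruited at my school back then, master's graduates got 1k more, and at Tencent 20,000 more per year, which averages out to only a bit over 1k more a month. Going to graduate school for the sake of starting salary usually costs more than it gains. Three years is not short.

That said, I should not write off graduate study in computer science altogether. Here is a positive example of someone who did a computer science graduate degree; he not only did a master's but went on to a PhD. In 2006 he was an undergraduate in computer science at Rice University, in 2009 he went to Stanford for a master's in computer science, and then he continued to a PhD. Along the way he did short internships at companies such as Facebook. Judging by his résumé, he finished his PhD in 2014 and now works on computing infrastructure at salesforce. His name is Diego Ongaro, first author of the Raft distributed consensus paper "In Search of an Understandable Consensus Algorithm" (which I happen to be reading lately). His résumé can be found on Linkedin, and I think his graduate study was well worth it.

Read full article from I wasn't even a computer science student, so how did I end up in a CS master's program? - mindwind - 博客园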

Safe, non-toxic way to get rid of roaches? | Go Ask Alice!

Using roach baits. Roach baits, those little plastic domes or discs filled with gels, pastes, and dusts, contain poisoned food that a roach will carry back to its nest, where it will die. If other roaches eat the dead roach (which is likely), they will die of the poison too. Although not much poison is needed in this method, it's a good idea to put baits in a place where people won't touch them.

Read full article from Safe, non-toxic way to get rid of roaches? | Go Ask Alice!

Pest control and newborn babies

ANSWER: Depending on the application, it would be best to remain out of the house for a few hours after spraying, especially if you are concerned.

Pesticides designed for control of household insects are formulated at low concentration levels. The various formulations are intended for particular areas of the home. Dusts and aerosols are intended for application in cracks and crevices, where they are effective against pests without leaving any exposed residue.

Baits intended to eliminate ants and/or roaches are either injected into cracks and crevices or put into plastic bait stations. The stations are placed out of the reach of children.

Insecticides that are intended for application onto exposed surfaces have very specific directions on their labels. The directions have instructions for mixing and application. The directions also require that people, including their children, stay off of the treated surface until the treatment has dried.

Read full article from Pest control and newborn babies

Tiger's leetcode solution: 326. Power of Three

Read full article from Tiger's leetcode solution: 326. Power of Three

How to swap or exchange objects in Java? - GeeksforGeeks

How to swap objects in Java?
Let's say we have a class called "Car" with some attributes, and we create two objects of Car, say car1 and car2. How do we exchange the data of car1 and car2?

A Simple Solution is to swap members. For example, if the class Car has only one integer attribute say "no" (car number), we can swap cars by simply swapping the members of two cars.
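The member-swapping idea might look like the sketch below. The Car class with its single field "no" follows the description above; the wrapper class SwapCars and the method name swap are hypothetical. Note that this exchanges the contents of the two objects, not the references held by the caller, since Java passes object references by value.

```java
final class SwapCars {

    // A car with a single integer attribute, as described in the text.
    static final class Car {
        int no;

        Car(int no) {
            this.no = no;
        }
    }

    // Swap the data of the two cars by swapping their members.
    static void swap(Car a, Car b) {
        int tmp = a.no;
        a.no = b.no;
        b.no = tmp;
    }

    public static void main(String[] args) {
        Car car1 = new Car(1);
        Car car2 = new Car(2);
        swap(car1, car2);
        System.out.println(car1.no + " " + car2.no); // 2 1
    }
}
```

This approach needs one swap per field, so it becomes tedious as the class grows and does not work for members the swapping code cannot access.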

Read full article from How to swap or exchange objects in Java? - GeeksforGeeks

Min Sum Path in a Triangle - Algorithms and Problem Solving

Given a triangle, find the minimum path sum from top to bottom. Each step you may move to adjacent numbers on the row below.

For example, given the following triangle

         [2],
        [3,4],
       [6,5,7],
      [4,1,8,3]

The minimum path sum from top to bottom is 11 (i.e., 2 + 3 + 5 + 1 = 11).

But for the following triangle –

         [2],
        [5,4],
       [5,5,7],
      [1,4,8,3]

The minimum path sum from top to bottom is 13 (i.e., 2 + 5 + 5 + 1 = 13).

The problem somewhat resembles a tree, where we look for the minimum-sum path from the root to a leaf. But if we look carefully, we notice that this is a simple dynamic programming problem, because adjacent nodes share children and the subproblems overlap. At each level we need to choose the node that yields the minimum total sum, using the following relation:

dp[level][i] = triangle[level][i] + min{dp[next_level][i], dp[next_level][i+1]}

Notice that if we start from the top and greedily pick the smaller adjacent number at each level, we may get a wrong result: in example 2 this returns 15 = [2, 4, 5, 4], while the actual minimum path is 13 = [2, 5, 5, 1]. So we need to compute the dp bottom-up, starting from the last row. That is,

dp[level][i] = triangle[level][i] + min{dp[level+1][i], dp[level+1][i+1]}

Below is the implementation of this approach that runs in O(n^2) time and takes O(n^2) space.

//O(n^2) time and O(n^2) space for dp table
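The article's own implementation is not reproduced here. A minimal sketch of the bottom-up recurrence, assuming the triangle is given as a jagged int array where row k has k+1 entries, could look like this (the class and method names are hypothetical):

```java
final class TriangleMinPath {

    // O(n^2) time and O(n^2) space for the dp table.
    static int minPathSum(int[][] triangle) {
        int n = triangle.length;
        if (n == 0) {
            return 0;
        }
        int[][] dp = new int[n][n];
        // Base case: the last row is its own best sum.
        for (int i = 0; i < n; i++) {
            dp[n - 1][i] = triangle[n - 1][i];
        }
        // Fill the table from the second-to-last row up to the top.
        for (int level = n - 2; level >= 0; level--) {
            for (int i = 0; i <= level; i++) {
                dp[level][i] = triangle[level][i]
                        + Math.min(dp[level + 1][i], dp[level + 1][i + 1]);
            }
        }
        return dp[0][0];
    }

    public static void main(String[] args) {
        int[][] first = {{2}, {3, 4}, {6, 5, 7}, {4, 1, 8, 3}};
        int[][] second = {{2}, {5, 4}, {5, 5, 7}, {1, 4, 8, 3}};
        System.out.println(minPathSum(first));  // 11
        System.out.println(minPathSum(second)); // 13
    }
}
```

Since each row depends only on the row below it, the table could be reduced to a single array of length n if space matters.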

Read full article from Min Sum Path in a Triangle - Algorithms and Problem Solving

Find Next Greater Number Using Given Set Of Digits - Algorithms and Problem Solving

Given a set S of digits [0-9] and a number n, find the smallest integer larger than n (ceiling) using only digits from the given set S. You can use a digit as many times as you want.

For example, d=[1, 2, 4, 8] and n=8753 then return 8811. For, d=[0, 1, 8, 3] and n=8821 then return 8830. For d=[0, 1, 8, 3] and n=8310 then return 8311.

As we know, two numbers are equal when all their digits are the same in their respective positions. The smallest number greater than a number n must have at least one digit greater than the digit of n at the same position, and all the positions to the right of it must contain the smallest digit possible.

For example, with d=[0, 1, 8, 3] and n=8821, the numbers match as long as the digits match going from the MSB towards the LSB; here the matching prefix is [88]. If the digit at some position is not in the given set, for example 2 at position 2 of n (counting from 0), then we replace it with the next higher digit from d, which in this case is 3. Once we have placed a higher digit, the remaining positions towards the LSB should be as small as possible, so they are filled with the smallest digit in d, which is 0. So the final answer is 8830.

The question is how to get the next higher digit. We can simply sort the digit array and use binary search to get the ceiling (the smallest digit greater than or equal to the key) from the array. However, there is a special case when all the digits of the given number are contained in the digit set. In that case we would end up generating the input number itself as the next greater number (why?). How do we solve this? If we do not find a higher digit in the whole scan, then we need to replace the LSB of the number with the smallest digit that is strictly higher than the current digit at that position (why?).
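One way to put these steps together is sketched below. It is not the article's implementation, and it adds two assumptions that the text above does not spell out: when no strictly higher digit exists at the chosen position, the scan backs up to the previous position, and when no position works at all, the answer gets one extra digit. It also assumes n is non-negative and the result fits in a long. The names NextGreaterFromDigits, nextGreater and strictlyHigher are hypothetical, and a linear scan stands in for the ceiling lookup because the set has at most ten digits.

```java
import java.util.Arrays;

final class NextGreaterFromDigits {

    // Smallest value in the sorted array strictly greater than x, or -1 if none.
    static int strictlyHigher(int[] sorted, int x) {
        for (int v : sorted) {
            if (v > x) {
                return v;
            }
        }
        return -1;
    }

    static long nextGreater(int[] digits, long n) {
        int[] d = digits.clone();
        Arrays.sort(d);
        char[] s = Long.toString(n).toCharArray();

        // First position (from the MSB) whose digit is not in the set.
        int firstBad = s.length;
        for (int i = 0; i < s.length; i++) {
            if (Arrays.binarySearch(d, s[i] - '0') < 0) {
                firstBad = i;
                break;
            }
        }

        // Rightmost usable position: keep the prefix, raise one digit,
        // then fill the rest with the smallest digit.
        for (int i = Math.min(firstBad, s.length - 1); i >= 0; i--) {
            int higher = strictlyHigher(d, s[i] - '0');
            if (higher >= 0) {
                StringBuilder sb = new StringBuilder();
                sb.append(s, 0, i).append(higher);
                for (int k = i + 1; k < s.length; k++) {
                    sb.append(d[0]);
                }
                return Long.parseLong(sb.toString());
            }
        }

        // No same-length answer: smallest non-zero digit, then all smallest digits.
        int lead = strictlyHigher(d, 0);
        if (lead < 0) {
            throw new IllegalArgumentException("no larger number can be formed");
        }
        StringBuilder sb = new StringBuilder().append(lead);
        for (int k = 0; k < s.length; k++) {
            sb.append(d[0]);
        }
        return Long.parseLong(sb.toString());
    }

    public static void main(String[] args) {
        System.out.println(nextGreater(new int[] {1, 2, 4, 8}, 8753)); // 8811
        System.out.println(nextGreater(new int[] {0, 1, 8, 3}, 8821)); // 8830
        System.out.println(nextGreater(new int[] {0, 1, 8, 3}, 8310)); // 8311
    }
}
```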

Read full article from Find Next Greater Number Using Given Set Of Digits - Algorithms and Problem Solving

Min subarray (or sublist) to sort to make an unsorted array (or list) sorted - Algorithms and Problem Solving

Given a list of unsorted numbers, find the min window (or min sublist, or min subarray) of the input such that if the sublist is sorted, the whole list is sorted too.

For example, given the array a={1,2,3,5,4,4,3,3,7,8,9}, the min subarray to sort in order to make the complete array sorted is {5,4,4,3,3}. More examples: for a={1,2,3,5,6,4,2,3,3,7,8,9} the min subarray is {3,5,6,4,2,3,3}; for a={1,2,3,5,6,4,3,3,7,8,9,2} the min subarray is {3,5,6,4,3,3,7,8,9,2}, etc.

At first the problem looks very complicated. But if we work through an example, we will see that it is one of the simpler problems. Just believe in your intuition!

For example, a={1,2,3,5,4,4,3,3,7,8,9}. Note that 4 comes after 5, breaking the ascending order, so 5 and 4 must be contained in the resulting min subarray. Now believe in your intuition. How did you figure out that 5,4,4,3,3 is the answer? Because our brain fixed its attention on two numbers: 5 and the rightmost 3. These two numbers are related. How? 5 tells us the first number to be included in the min list. It also tells us that the number to the left of 5, i.e. 3, is the minimum number that may appear in the min list (why?).

We have identified the start of the min list. But where does it end? We know that 3 is the minimum that may appear in the list. This means the min list should contain all numbers greater than or equal to 3, starting from 5. So if we can somehow identify the max number that may appear in this min list, then the min list ends just before the first position that holds a value higher than that max (why?).

What is the maximum number that can appear in the list? Since 3 is the min, we can locate the rightmost 3 and take the maximum among the elements between the left boundary, i.e. 5, and the rightmost min, i.e. the rightmost 3. This maximum tells us where the right boundary of the min array is: it is the first number greater than the max to the right of the rightmost min. In the current example it is 7. Then all numbers from 5 up to 7 (excluding 7) constitute the resulting min subarray.

Below is the implementation of the above idea. The algorithm runs in O(n) time and O(1) space.
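The article's implementation is not reproduced here. As a stand-in, the sketch below uses a common two-pass formulation that also runs in O(n) time and O(1) space: a left-to-right pass with a running maximum finds the right boundary, and a right-to-left pass with a running minimum finds the left boundary. This is a swapped-in technique rather than a line-by-line rendering of the steps above, and the names MinSubarrayToSort and minWindow are hypothetical.

```java
final class MinSubarrayToSort {

    // Returns {start, end} (inclusive indices) of the shortest subarray that,
    // once sorted, leaves the whole array sorted; returns an empty array if
    // the input is already sorted.
    static int[] minWindow(int[] a) {
        int start = -1;
        int end = -1;

        // Any element smaller than the running max is out of place,
        // so the window must extend at least to it.
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < a.length; i++) {
            if (a[i] < max) {
                end = i;
            } else {
                max = a[i];
            }
        }

        // Any element larger than the running min (seen from the right)
        // is out of place, so the window must start at or before it.
        int min = Integer.MAX_VALUE;
        for (int i = a.length - 1; i >= 0; i--) {
            if (a[i] > min) {
                start = i;
            } else {
                min = a[i];
            }
        }

        return end < 0 ? new int[0] : new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 5, 4, 4, 3, 3, 7, 8, 9};
        int[] w = minWindow(a);
        System.out.println(w[0] + ".." + w[1]); // 3..7, i.e. {5,4,4,3,3}
    }
}
```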

Read full article from Min subarray (or sublist) to sort to make an unsorted array (or list) sorted - Algorithms and Problem Solving

ArrayIndexOutOfBoundsException from Java Path.normalize() for Empty Path - Sigmainfy - Technical blog to learn, share and to inspire

This post explains why and how an ArrayIndexOutOfBoundsException is thrown from Java's Path.normalize() for an empty path, looking directly at the UnixPath source code.

ArrayIndexOutOfBoundsException from Java Path.normalize() for Empty Path

Well, what happens is that the following simple one-line statement throws an ArrayIndexOutOfBoundsException:

    import java.nio.file.Paths;

    public class App {
        public static void main(String[] args) {
            // Normalizing the empty path is the call that throws.
            Paths.get("").normalize();
            System.out.println("Hello World!");
        }
    }
    /*
    The following exception is thrown:

    Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
        at sun.nio.fs.UnixPath.normalize(UnixPath.java:508)
        at net.tech-wonderland.app.App.main(App.java:11)
    */

Read full article from ArrayIndexOutOfBoundsException from Java Path.normalize() for Empty Path – Sigmainfy – Technical blog to learn, share and to inspire

Facebook bans private gun sales

In March 2014, Facebook took steps to limit sales of firearms including blocking minors from seeing posts about gun sales and trades and by requiring Facebook pages primarily used to promote the private sale of regulated goods and services to include language that reminds users to comply with laws and regulation.

At the time, an investigation by the technology blog VentureBeat found that adults and children were connecting on Facebook pages devoted to guns to buy, sell and trade firearms, sometimes in violation of federal and state gun laws.

Federal law enforcement sources told VentureBeat that Facebook, Instagram and other social media services were "emerging threats for unlawful gun transactions in the United States."

Moms Demand Action for Gun Sense in America and Mayors Against Illegal Guns, the group backed by former New York Mayor Michael Bloomberg, had stepped up their demands for changes to Facebook's policies after gun arrests were tied to sales of guns facilitated through Facebook.

"Over the last two years, more and more people have been using Facebook to discover products and to buy and sell things to one another. We are continuing to develop, test, and launch new products to make this experience even better for people and are updating our regulated goods policies to reflect this evolution," Monika Bickert, Facebook's head of product policy, said in an emailed statement.

Follow USA TODAY senior technology writer Jessica Guynn @jguynn

Read full article from Facebook bans private gun sales

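A Programmer's Daily Life: The Daily Stand-up | 编程派 | Coding Python

The daily stand-up is the most important team activity in day-to-day agile development. The whole team must take part, and the team is encouraged to sync up every day. In less experienced teams, however, the stand-up can, for various reasons, gradually turn into a formality and fail to do its job. This installment of the comic "A Programmer's Daily Life" depicts exactly that situation.

Read full article from A Programmer's Daily Life: The Daily Stand-up | 编程派 | Coding Python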
