Hystrix指标窗口实现原理 - 简书



Hystrix指标窗口实现原理 - 简书

Hystrix is a circuit-breaker middleware that supports fast-fail behavior and falling back to an alternative path. It decides whether to trip the circuit based on the failure ratio measured over a sliding window. There are many ways to implement a sliding window, and many ways to count metrics; a common one is to maintain counters with atomic increments and decrements on an AtomicInteger. The specific alternatives are not explored here.

Hystrix is built on RxJava, so how can RxJava be used to aggregate metrics and implement the sliding window? This post is not meant as a tutorial on how to use RxJava; it mainly explains the approach Hystrix takes to accomplish this.
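
As a concrete reference for the AtomicInteger-style counting mentioned above, here is a minimal, hypothetical sketch of a bucketed sliding window; it is not Hystrix's actual RxJava-based implementation, just the plain-Java shape of the idea:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Minimal sketch: ten one-second buckets; error percentage is computed over
 *  the buckets that are still inside the window. Not Hystrix's real code. */
class SlidingWindowMetrics {
    private static final int BUCKETS = 10;              // 10 x 1s = 10s window
    private final AtomicInteger[] success = new AtomicInteger[BUCKETS];
    private final AtomicInteger[] failure = new AtomicInteger[BUCKETS];
    private final long[] bucketStart = new long[BUCKETS];

    SlidingWindowMetrics() {
        for (int i = 0; i < BUCKETS; i++) {
            success[i] = new AtomicInteger();
            failure[i] = new AtomicInteger();
        }
    }

    private int bucketFor(long now) {
        int idx = (int) ((now / 1000) % BUCKETS);
        synchronized (bucketStart) {
            long slotStart = (now / 1000) * 1000;
            if (bucketStart[idx] != slotStart) {         // bucket is stale: reset it
                bucketStart[idx] = slotStart;
                success[idx].set(0);
                failure[idx].set(0);
            }
        }
        return idx;
    }

    void markSuccess() { success[bucketFor(System.currentTimeMillis())].incrementAndGet(); }
    void markFailure() { failure[bucketFor(System.currentTimeMillis())].incrementAndGet(); }

    /** Failure ratio over the whole window, 0.0 when there is no traffic. */
    double errorPercentage() {
        long ok = 0, bad = 0;
        long now = System.currentTimeMillis();
        for (int i = 0; i < BUCKETS; i++) {
            if (now - bucketStart[i] < BUCKETS * 1000L) { // still inside the window
                ok += success[i].get();
                bad += failure[i].get();
            }
        }
        long total = ok + bad;
        return total == 0 ? 0.0 : (double) bad / total;
    }
}
```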


Read full article from Hystrix指标窗口实现原理 - 简书


wxyyxc1992/Awesome-Coder: Interactive MindMap, RoadMap(Learning Path/Interview Questions), xCompass, Weekly for Developer, to Learn Everything in ITCS 程序员的技术视野、知识管理与职业规划,提高个人与团队的研发效能



wxyyxc1992/Awesome-Coder: Interactive MindMap, RoadMap(Learning Path/Interview Questions), xCompass, Weekly for Developer, to Learn Everything in ITCS 程序员的技术视野、知识管理与职业规划,提高个人与团队的研发效能

In an era of information explosion and lifelong, fragmented learning, one of the problems we face is how to learn effectively: how to balance breadth and depth, truly retain what we learn, and improve our development effectiveness. In the author's humble view, cultivating technical ability breaks down into three areas: breadth of knowledge, programming ability, and depth of knowledge. The Awesome-Coder series is dedicated to improving developers' learning efficiency and practical development effectiveness; it consists of the Knowledge MindMap, RoadMap, Awesome Links, Awesome CheatSheet, Awesome CS Books Warehouse, and coding-snippets modules.

Breadth of knowledge is the ability to choose an appropriate solution for a real problem; broadly speaking, it also reflects one's horizons and perspective. It is not tied to any particular technical direction or industry domain, but requires some familiarity with traditional and popular languages, tools, frameworks, libraries, and services: the ability to weigh the pros and cons of each option and describe the underlying principles at a high level. Building and maintaining breadth rests on a large volume of reading and the ability to consolidate knowledge. The author habitually uses spare moments to browse HN, Reddit, Medium, and Twitter for news and articles, and maintaining Frontend Weekly, a weekly reading list, and a frontend development digest also forces continuous reading and investigation. On the other hand, the author firmly believes that only a knowledge map matching one's own way of thinking can effectively consolidate knowledge, clarify its boundaries, and support continuous exploration. From the start, the author has worked on building a personal MindMap, IT technology map, and knowledge architecture, maintaining and refreshing them over the years; meanwhile, good material discovered through daily reading, study, and practice is filed, according to the domains defined in that knowledge map, into Awesome Links: Guide to Galaxy so it can be retrieved quickly.


Read full article from wxyyxc1992/Awesome-Coder: Interactive MindMap, RoadMap(Learning Path/Interview Questions), xCompass, Weekly for Developer, to Learn Everything in ITCS 程序员的技术视野、知识管理与职业规划,提高个人与团队的研发效能


What's the difference between HEAD^ and HEAD~ in Git? - Stack Overflow



What's the difference between HEAD^ and HEAD~ in Git? - Stack Overflow

HEAD^ means the first parent of the tip of the current branch.

Remember that git commits can have more than one parent. HEAD^ is short for HEAD^1, and you can also address HEAD^2 and so on as appropriate.

You can get to parents of any commit, not just HEAD. You can also move back through generations: for example, master~2 means the grandparent of the tip of the master branch, favoring the first parent in cases of ambiguity. These specifiers can be chained arbitrarily, e.g., topic~3^2.


Read full article from What's the difference between HEAD^ and HEAD~ in Git? - Stack Overflow


万万没想到,分布式存储系统的一致性是......



万万没想到,分布式存储系统的一致性是......

  • Bounded staleness

    • Guarantees that what you read is at most K versions behind the latest version

    • A sliding window is maintained; outside the window, bounded staleness guarantees a global ordering of operations. In addition, monotonic reads are guaranteed within a region.

  • Session consistency

    • Within a session, monotonic reads, monotonic writes, and read-your-own-writes are guaranteed; nothing is guaranteed across sessions

    • Session consistency keeps the version information of reads and writes in the client session and passes it along between replicas

    • Both read and write latency are very low under session consistency

  • Consistent prefix

    • Consistent prefix guarantees that, in the absence of further writes, all replicas eventually converge

    • Consistent prefix guarantees that reads never observe out-of-order writes. For example, if the writes were executed in the order `A, B, C`, a client can only see `A`, `A, B`, or `A, B, C`, never `A, C` or `B, A, C` and so on.

    • Monotonic reads are guaranteed within each session

  • Eventual consistency

    • Eventual consistency guarantees that, in the absence of further writes, all replicas eventually converge

    • Eventual consistency is a very weak guarantee; a client may read data older than what an earlier read returned


    Read full article from 万万没想到,分布式存储系统的一致性是......


    分布式系统学习思路 | Charles的技术博客



    分布式系统学习思路 | Charles的技术博客

The author has recently been preparing to study distributed systems. This post lays out a path for learning them; it has not yet been validated in practice and will probably need continual adjustment, so treat it only as a reference.

Distributed systems generally fall into a few broad categories, such as distributed key/value stores, distributed file systems, and distributed databases. When studying these kinds of systems, the knowledge and skills to master should include computer science fundamentals, papers on distributed algorithms and protocols, papers on distributed system design paradigms, open-source distributed systems as case studies, and building the relevant components ("wheels") yourself.


    Read full article from 分布式系统学习思路 | Charles的技术博客


    求第K个数的问题 | 四火的唠叨



    求第K个数的问题 | 四火的唠叨

I suppose this is another classic question, asked to death by now. If the data is too big to fit on one machine, many people immediately think: map-reduce! Run a map on every machine to find its own largest k elements, then send them to one machine for a reduce that finds the final k-th (if there are many machines, this aggregation can be done in several levels).

But that answer quietly assumes a condition that makes the problem much easier to handle: k is small.

Suppose there really are a lot of numbers, spread across several machines, and k is also very large. Then even gathering those k numbers onto one machine to search through is infeasible.

Now the problem gets more complicated, and there are different ways to handle it. One approach: sort the data on each machine with some method (say an external sort based on repeated merging), then pick a number as a pivot (a guess at the answer) and determine, on every machine, the position of this pivot within that machine's sorted data. Let machine[i] be the rank of the pivot on machine i; add up the machine[i] values to get sum and compare it with k:

• If sum == k, the number has been found;
• If sum < k, the answer lies, on every machine, in the segment from machine[i] to the end;
• If sum > k, the answer lies, on every machine, in the segment from the beginning up to machine[i].

Recurse in this way.

Of course the method still leaves plenty of room for improvement; for instance, the up-front external sort need not be carried out in full and can be optimized away, entirely or in part, using the ideas introduced in the article.
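
A minimal single-process sketch of the rank-and-compare idea above, with each sorted int[] standing in for one machine's data. Instead of recursing into shrinking segments it binary-searches over the value range, but the key step, summing the pivot's rank across all "machines", is the same (all names here are illustrative):

```java
import java.util.Arrays;

class DistributedKthSketch {
    /** Rank of pivot on one "machine": how many elements are strictly less than pivot. */
    private static int rank(int[] sorted, long pivot) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < pivot) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Find the k-th smallest (1-based) across all machines by guessing pivots. */
    static long kth(int[][] machines, long k) {
        long lo = Long.MAX_VALUE, hi = Long.MIN_VALUE;
        for (int[] m : machines) {               // value range to guess pivots from
            lo = Math.min(lo, m[0]);
            hi = Math.max(hi, m[m.length - 1]);
        }
        while (lo < hi) {
            long pivot = lo + (hi - lo) / 2;
            long sum = 0;
            for (int[] m : machines) sum += rank(m, pivot + 1); // elements <= pivot
            if (sum >= k) hi = pivot; else lo = pivot + 1;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[][] machines = { {1, 4, 7, 10}, {2, 3, 8}, {5, 6, 9} };
        for (int[] m : machines) Arrays.sort(m);
        System.out.println(kth(machines, 5));    // prints 5
    }
}
```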


    Read full article from 求第K个数的问题 | 四火的唠叨


    每天数百亿用户行为数据,美团点评怎么实现秒级转化分析?



    每天数百亿用户行为数据,美团点评怎么实现秒级转化分析?

User behavior analysis is a very important part of data analytics; it plays a major role in counting active users, analyzing retention and conversion rates, improving product experience, and driving user growth. Meituan-Dianping collects tens of billions of user behavior log entries every day, and performing fast, flexible analysis of user behavior over such a massive data set is a huge challenge. To address it, we proposed and implemented a user behavior analysis solution for massive data that cuts the time of a single analysis from hours to seconds, greatly improving the analysis experience and the productivity of analysts.


    Read full article from 每天数百亿用户行为数据,美团点评怎么实现秒级转化分析?


    Testing Your Shell Scripts, with Bats – Tim Perry – Medium



    Testing Your Shell Scripts, with Bats – Tim Perry – Medium

    Shell scripting isn't easy though. Many of the tools and techniques you might be used to aren't nearly as effective or well-used on the command line. Testing is a good example: in most languages, there's a clearly agreed basic approach to testing, and most projects have at least a few tests sprinkled around (though often not as many as they'd like).


    Read full article from Testing Your Shell Scripts, with Bats – Tim Perry – Medium


    redis_replication_readme



    redis_replication_readme

The Redis startup flow:

• 1. Load the configuration;
• 2. Initialize the sri structures for the Redis master, slaves, and sentinels;
• 3. Register the timer event serverCron.


    Read full article from redis_replication_readme


    Itweet & Boot



    Itweet & Boot

A reference guide for deploying enterprise big-data platforms. By now I have designed and delivered dozens of clusters, ranging in size from a handful of machines to more than a thousand. This is mostly a record of conventions and hands-on experience, written down so it is not forgotten; today we talk about cluster hardware configuration.


    Read full article from Itweet & Boot


    一个可伸缩的Java线程池的遗嘱执行人 - 个人文章 - SegmentFault 思否



    一个可伸缩的Java线程池的遗嘱执行人 - 个人文章 - SegmentFault 思否

Ideally, what we expect from any thread-pool executor is the following (a plain ThreadPoolExecutor configured along these lines is sketched after the list):

• A set of threads created up front (the core pool size) to handle the load.
• If the load increases, more threads should be created to handle it, up to a maximum number of threads (the maximum pool size).
• If the number of threads would have to grow beyond the maximum pool size, tasks should be queued instead.
• If a bounded queue is used and the queue fills up, some rejection policy should kick in.
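
For comparison, here is a plain java.util.concurrent.ThreadPoolExecutor wired up with those four knobs. Note that the stock executor only creates threads beyond the core size once the queue is full, which is precisely the behavior a "scalable" executor tries to change; the numbers below are placeholders:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedExecutorDemo {
    public static void main(String[] args) {
        // 4 core threads, up to 8 threads, a bounded queue of 100 tasks,
        // and CallerRunsPolicy as the rejection policy once everything is full.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 8,
                60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(100),
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < 500; i++) {
            final int task = i;
            pool.execute(() -> System.out.println("task " + task
                    + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
    }
}
```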


    Read full article from 一个可伸缩的Java线程池的遗嘱执行人 - 个人文章 - SegmentFault 思否


    Redis RedLock 完美的分布式锁么? | 犀利豆的博客



    Redis RedLock 完美的分布式锁么? | 犀利豆的博客

Suddenly I sensed that things might not be so simple, so I clicked through and read carefully, and discovered a remarkable world. I calmed down and studied Martin's criticism of RedLock, and the rebuttal from antirez, RedLock's author.

Martin's criticism

Martin opens by asking: what do we want a lock for in the first place? Two reasons:

1. Efficiency: use a lock to make sure a task is not needlessly executed twice, for example a very expensive computation.
2. Correctness: use a lock to make sure a task follows the proper sequence of steps, and to prevent two nodes from operating on the same data at the same time and causing conflicting files or lost data.

For the first reason we can tolerate some lock failures: even if two nodes do end up doing the same work, the cost to the system is just some extra computation, nothing more. In that case a single Redis instance already solves the problem nicely; there is no need to use RedLock and maintain that many Redis instances, which only raises the system's maintenance cost.


    Read full article from Redis RedLock 完美的分布式锁么? | 犀利豆的博客


    那些教科书上看不到的排序算法



    那些教科书上看不到的排序算法

To make their algorithms lecture more lively, several computer science professors at TU Braunschweig in Germany produced a series of algorithm illustrations drawn in the style of IKEA assembly instructions, which is quite original. The site is https://idea-instructions.com


One of the algorithms mentioned there is Bogo sort. Bogo is a person's name; whether someone called Bogo actually "invented" the algorithm and lent it his name is unclear. If it were up to me, I would rather never have known that a sorting algorithm could be written this way, because it is so "stupid" that people also call it "stupid sort" or "monkey sort"; by comparison, calling it "Bogo sort" is almost merciful.
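
For the curious, Bogo sort fits in a few lines; this little sketch (mine, not from the article) just shuffles until the list happens to be sorted, so keep the input tiny:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BogoSort {
    static boolean isSorted(List<Integer> xs) {
        for (int i = 1; i < xs.size(); i++)
            if (xs.get(i - 1) > xs.get(i)) return false;
        return true;
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(3, 1, 2);
        while (!isSorted(xs)) Collections.shuffle(xs);  // shuffle until sorted by luck
        System.out.println(xs);                          // [1, 2, 3]
    }
}
```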


    Read full article from 那些教科书上看不到的排序算法


    kevinmost/junit-retry-rule: A simple @Rule for JUnit 4 to retry tests



    kevinmost/junit-retry-rule: A simple @Rule for JUnit 4 to retry tests

    In your test's class, add the RetryRule as a test rule, by creating an instance of it as a public field, annotated with @Rule.

    @Rule public final RetryRule retry = new RetryRule();

    Then, for any test that needs to implement retrying logic, annotate the method with @Retry. Optional parameters on this annotation let you change defaults such as the number of retries, the timeout length, etc.

@Test
@Retry(times = 5) // runs test up to 5 times, instead of the default 3 times
public void myFlakyTest() throws Exception {
    obj.doUnreliableThing();
}


    Read full article from kevinmost/junit-retry-rule: A simple @Rule for JUnit 4 to retry tests


    Retry JUnit failed tests immediately | Automation Rhapsody



    Retry JUnit failed tests immediately | Automation Rhapsody

    There are mainly three approaches to make JUnit retry failed tests.

    • Maven Surefire or Failsafe plugins – follow plugin name links for more details how to use and configure plugins
• JUnit rules – code listed in the current post can be used as a rule. See more about rules in the Use JUnit rules to debug failed API tests post. The problem is that the @Rule annotation works for test methods only; in order to have retry logic in @BeforeClass, a @ClassRule object should be instantiated instead (a minimal sketch of such a rule follows this list).
• JUnit custom runner – this post is dedicated to creating your own JUnit retry runner and running tests with it.
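
As a rough illustration of the rule-based approach, here is one possible shape of a retry TestRule; it is a generic sketch rather than the code from either article, and the retry count is a made-up constructor parameter:

```java
import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

/** Re-runs a failing test up to maxAttempts times before giving up (assumes maxAttempts >= 1). */
public class RetryRule implements TestRule {
    private final int maxAttempts;

    public RetryRule(int maxAttempts) { this.maxAttempts = maxAttempts; }

    @Override
    public Statement apply(Statement base, Description description) {
        return new Statement() {
            @Override
            public void evaluate() throws Throwable {
                Throwable last = null;
                for (int i = 0; i < maxAttempts; i++) {
                    try {
                        base.evaluate();    // run the test body
                        return;             // passed, stop retrying
                    } catch (Throwable t) {
                        last = t;
                        System.err.println(description + ": attempt " + (i + 1) + " failed");
                    }
                }
                throw last;                 // all attempts failed
            }
        };
    }
}
```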


    Read full article from Retry JUnit failed tests immediately | Automation Rhapsody


    阅读开源框架,遍览Java嵌套类的用法 | 斑斓



    阅读开源框架,遍览Java嵌套类的用法 | 斑斓

To the outside world a Java class presents a single face, but the forms encapsulated inside a class can be rich and varied. Nested classes play an extremely important role here: they add structure to the class hierarchy and allow flexible, fine-grained control over access to a class's internals, letting us strike a reasonable balance between openness and encapsulation, between the public interface and the internal implementation.

Nested classes can play this balancing role in design because of their special relationship with the enclosing class: every member of the enclosing class is fully open to the nested class, including private members, while the nested class can be viewed as a cohesive, self-contained class inside the enclosing class's boundary that the enclosing class calls into. A nested class is therefore really a further level of encapsulation of the class's internals. Defining nested classes does not shrink the overall size of the class definition, but because they express different levels of encapsulation, a relatively large class gains a clear layered structure instead of turning into a jumble of too many members.
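
A tiny illustrative sketch (not from the article) of the most common case, a private static nested class used purely as an implementation detail of its enclosing class:

```java
import java.util.HashMap;
import java.util.Map;

public class Registry {
    private final Map<String, Entry> entries = new HashMap<>();

    /** Static nested class: invisible outside Registry because it is private,
     *  yet the outer and nested classes may freely touch each other's private members. */
    private static final class Entry {
        final String value;
        long hits;
        Entry(String value) { this.value = value; }
    }

    public void put(String key, String value) { entries.put(key, new Entry(value)); }

    public String get(String key) {
        Entry e = entries.get(key);
        if (e == null) return null;
        e.hits++;                    // outer class reaches into the nested class directly
        return e.value;
    }
}
```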


    Read full article from 阅读开源框架,遍览Java嵌套类的用法 | 斑斓


    SSM(十七) MQ应用 | crossoverJie's Blog



    SSM(十七) MQ应用 | crossoverJie's Blog

The reason for writing this post is an earlier article about an abnormal Kafka consumption problem, where at the time I had to fall back on a temporary workaround.

In the end the root cause was simply not knowing Kafka well enough; combined with the needs of day-to-day work, I have since spent some time going through Kafka material.

When to use MQ

Talking about Kafka inevitably brings up MQ: Kafka is one kind of message queue, a piece of basic middleware used heavily in internet projects.

A technology arises to meet some need, usually in scenarios like the following (a producer-side sketch follows the list):

• Cross-process communication is needed: system B needs system A's output as its input.
• System A's output rate far exceeds system B's processing capacity.
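
As a minimal illustration of the producer side of that decoupling, here is a generic Kafka producer sketch; it is not code from the article, and the broker address and topic name are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");              // placeholder broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // System A publishes events at its own pace; system B consumes them later.
            producer.send(new ProducerRecord<>("order-events", "order-42", "CREATED"));
        }
    }
}
```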


    Read full article from SSM(十七) MQ应用 | crossoverJie's Blog


    GitHub - crossoverJie/Java-Interview: 👨‍🎓 Java related: basic, concurrent, algorithm



    GitHub - crossoverJie/Java-Interview: 👨‍🎓 Java related: basic, concurrent, algorithm

Everyday Java development knowledge, still being expanded.

Mostly Java fundamentals, underlying principles, and detailed algorithm explanations, plus some higher-level application design, including quite a few real interview questions from big companies.

If this helps you, please give it a Star; if you have questions, feel free to open an Issue, and if you have good ideas, please send a PR.


    Read full article from GitHub - crossoverJie/Java-Interview: 👨‍🎓 Java related: basic, concurrent, algorithm


    Kubernetes中的Pause容器究竟是什么? - jimmysong.io|宋净超的博客|Cloud Native|Big Data



    Kubernetes中的Pause容器究竟是什么? - jimmysong.io|宋净超的博客|Cloud Native|Big Data

The pause container maps its internal port 80 to port 8880 on the host. Once the pause container has set up the network namespace on the host, the nginx container joins that namespace: notice that nginx was started with --net=container:pause. The ghost container joins the same network namespace as well, so the three containers share networking and can talk to each other directly over localhost. --ipc=container:pause --pid=container:pause puts the three containers in the same IPC and PID namespaces, with pause as the init process; at this point we can enter the ghost container and look at its processes.


    Read full article from Kubernetes中的Pause容器究竟是什么? - jimmysong.io|宋净超的博客|Cloud Native|Big Data


    10个用Console来Debug的高级技巧 | Fundebug博客



    10个用Console来Debug的高级技巧 | Fundebug博客

Translator's note: we tend to confine ourselves to the circle of knowledge we are familiar with, but we should occasionally stretch it, pick up some uncommon yet useful tricks, and enlarge our comfort zone.


    Read full article from 10个用Console来Debug的高级技巧 | Fundebug博客


    Failing to see how ambassador pattern enhances modularity / simplicty of container architecture in Docker - Stack Overflow



    Failing to see how ambassador pattern enhances modularity / simplicty of container architecture in Docker - Stack Overflow

    I fail to see how implementing the ambassador pattern would help us simplify / modularize the design of our container architecture.

    Let's say that I have a database container db on host A and is used by a program db-client which sits on host B, which are connected via ambassador containers db-ambassador and db-foreign-ambassador over a network:

    [host A (db) --> (db-ambassador)] <- ... -> [host B (db-forgn-ambsdr) --> (db-client)]  

    Connections between containers in the same machine, e.g. db to db-ambassador, and db-foreign-ambassador to db-client are done via Docker's --link parameter while db-ambassador and db-foreign-ambassador talks over the network.

But, --link is just a fancy way of inserting ip addresses, ports and other info from one container to another. When a container fails, the other container which is linked to it does not get notified, nor will it know the new IP address of the crashing container when it restarts. In short, if a container which is linked to another went dead, the link is also dead.

    To consider my example, lets say that db crashed and restarts, thus get assigned to a different IP. db-ambassador would have to be restarted too, in order to update the link between them... Except you shouldn't. If db-ambassador is restarted, the IP would have changed too, and foreign-db-ambassador won't know where to reach it at the new IP address.


    Read full article from Failing to see how ambassador pattern enhances modularity / simplicty of container architecture in Docker - Stack Overflow


    阿里云Redis开发规范



    阿里云Redis开发规范

1. Key naming

(1) [Recommended]: readability and manageability

Prefix keys with the business name (or database name) to prevent key collisions, and separate the parts with colons, for example business-name:table-name:id
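
A trivial helper in that spirit, with made-up names, just to show the convention in code:

```java
public final class RedisKeys {
    /** Builds keys like "order:user:42": business (or database) name as prefix, colon-separated. */
    public static String key(String business, String table, Object id) {
        return business + ":" + table + ":" + id;
    }
    // usage (hypothetical Jedis client): jedis.set(RedisKeys.key("order", "user", 42), json);
}
```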


    Read full article from 阿里云Redis开发规范


    Terminal State: Bad Data Handbook Review



    Terminal State: Bad Data Handbook Review

    Bad Data Handbook from O'Reilly is a collection of essays and articles by different authors having as common theme data, or "bad"  data to be precise. The "badness" of the data in this case is more of a perceived quality, rather than an inherent one. Arguably, data can be surprising, unpredictable, defective or deficient but rarely thoroughly bad.

    The different chapters are generally well written and they can be read in any order. The book contains a wide range of interesting situations, from machine learning war stories, to data quality issues, to modelling and processing concerns. To be clear, this book is not a programming guide but it is full of practical advice and recommendations.

    Read full article from Terminal State: Bad Data Handbook Review


    我为什么拒绝谷歌的招聘人员 | BitTiger



    我为什么拒绝谷歌的招聘人员 | BitTiger

The bureaucratic machinery of these companies means they interview hundreds of candidates every month. To get those candidates on the hook, a squad of professionally trained monkeys (recruiters) sends enthusiastic emails to everyone like me. They need to screen candidates, but they are lazy and unimaginative, so they grab some random coders to interview us, and those coders only know how to ask the most convoluted questions they can.

I am certainly not saying that people who pass these tests are not good programmers, nor am I trying to stress what a great programmer I am; the fact is, I failed the interview. I also admit that this kind of screening is a perfectly good procedure. What I want to stress is that the interview process bore no resemblance to the email I originally received.

If the email had said "we are looking for an algorithms expert", there would have been no further contact and no time wasted on either side, because I am obviously not an algorithms expert. Asking me any binary tree traversal question is pointless: I simply do not know the answer, and I will have no interest in learning it in the future. My professional passion lies elsewhere, such as object-oriented design.

    Read full article from 我为什么拒绝谷歌的招聘人员 | BitTiger


    macos - How do I restart redis that I installed with brew? - Super User



    macos - How do I restart redis that I installed with brew? - Super User

    As of Dec-7-2015 You can use brew services.

You need to brew tap homebrew/services and then the following will work as expected:

install: brew install redis

start: brew services start redis

stop: brew services stop redis

restart: brew services restart redis


    Read full article from macos - How do I restart redis that I installed with brew? - Super User


    京东MySQL数据库Docker化最佳实践 | 岭南六少 - 一朵在LAMP架构下挣扎的云



    京东MySQL数据库Docker化最佳实践 | 岭南六少 - 一朵在LAMP架构下挣扎的云

The road to running JD.com's MySQL databases on Docker: from the early days of treading on thin ice to today's large-scale deployment covering more than 70% of instances. Below I walk through the stages of that journey.


    Read full article from 京东MySQL数据库Docker化最佳实践 | 岭南六少 - 一朵在LAMP架构下挣扎的云


    林仕鼎:火车票系统后面的架构设计 | 岭南六少 - 一朵在LAMP架构下挣扎的云



    林仕鼎:火车票系统后面的架构设计 | 岭南六少 - 一朵在LAMP架构下挣扎的云

The title is a bit of clickbait; once again I do not intend to discuss a concrete design for the train ticket system. My view is that designing a solution without detailed data, the business processes, and a model of the internal systems easily produces something that does not fit. That said, a few friends with more experience have already offered designs; they are good and practical, and anyone who wants to solve a similar problem directly can simply refer to them.
What I want to do is describe, in general terms, the issues that have to be considered when designing an online system, which is probably only useful to readers with some experience. If you just want to see how to write code to solve a specific problem, this will feel like empty talk, so feel free to come back later. If you keep growing along the technical path, you will gradually realize that writing some code or solving one specific problem is not the hardest part. I also list some technical terms, not to show off, but to define and state things more precisely.
Building an online service that supports a certain number of users is not hard, but doing it well requires effort on two fronts: business logic and system architecture. On the business logic side, the main thing to strengthen is support for rapid development and feature iteration. Implementing features according to requirements (from customers or product managers) is only the foundation; an online service inevitably goes through a large number of feature upgrades, and the main difficulty is how to support all kinds of strange upgrade requests at minimal cost. The answer is sensible functional abstraction: extract common components and shared flows and reuse them as much as possible. We can also call this the maintainability problem of software architecture, which is in fact the most important problem in traditional enterprise development as well.

    Read full article from 林仕鼎:火车票系统后面的架构设计 | 岭南六少 - 一朵在LAMP架构下挣扎的云


    The mythical 10x programmer -



    The mythical 10x programmer -

    A 10x programmer is, in the mythology of programming, a programmer that can do ten times the work of another normal programmer, where for normal programmer we can imagine one good at doing its work, but without the magical abilities of the 10x programmer. Actually to better characterize the "normal programmer" it is better to say that it represents the one having the average programming output, among the programmers that are professionals in this discipline. The programming community is extremely polarized about the existence or not of such a beast: who says there is no such a thing as the 10x programmer, who says it actually does not just exist, but there are even 100x programmers if you know where to look for.

    Read full article from The mythical 10x programmer -


    极分享:高质分享+专业互助=没有难做的软件+没有不得已的加班



    极分享:高质分享+专业互助=没有难做的软件+没有不得已的加班

When a social app first launches, traffic is small and a single server can handle all of the access load and data storage. But internet applications spread virally: an app can become a hit overnight, with traffic and data volume exploding in a short time, suddenly facing hundreds of millions of page views per day, millions of new and active users, and traffic spiking to hundreds of megabits per second. A simple backend deployment cannot support this; servers respond slowly or time out, and during peak hours the service effectively collapses, becoming completely unusable and destroying the user experience. This article uses a real case to share how a social application can build a highly scalable backend system.

    Read full article from 极分享:高质分享+专业互助=没有难做的软件+没有不得已的加班


    Alan Cooper on Working Backwards for Better Product Design



    Alan Cooper on Working Backwards for Better Product Design

    At the Agile India conference, design expert Alan Cooper gave a keynote talk on Working Backwards in which he described the approach to design and innovation which has been the basis of his process and practice over the last 26+ years.  The approach has three key elements which he explained:

    • Know your user and their goals
    • See possible solutions
    • See the Big Picture

    He explored each of these themes and looked at the implications of the way we currently build and release products, often without regard to the broader impact they have on society as a whole.  

He stated that it takes as much work to make a failing product as it does to create a successful one, and the difference is not what work is done but where that work starts from. Working backwards starts with the goals and outcomes rather than requirements and constraints. It identifies the compelling vision which aligns with customer outcomes and results in a successful product.


    Read full article from Alan Cooper on Working Backwards for Better Product Design


    API Evangelist



    API Evangelist

    I am profiling APIs as part of my partnership with Streamdata.io, and my continued API Stack work. As part of my work, I am creating OpenAPI, Postman Collections, and APIs.json indexes for APIs in a variety of business sectors, and as I'm finishing up the profile for ParallelDots machine learning APIs, I am struck (again) by the importance of tags within OpenAPI definitions when it comes to defining what any API does, and something that will have significant effects on the growing machine learning, and artificial intelligence space.


    Read full article from API Evangelist


    java Disruptor工作原理,谁能用一个比喻形容下? - 知乎



    java Disruptor工作原理,谁能用一个比喻形容下? - 知乎

First, Disruptor is especially suited to multi-threaded applications that are highly latency-sensitive. If the app is not time-sensitive, an ArrayBlockingQueue is perfectly fine instead of Disruptor. Likewise, if you go to great lengths to win back 30 milliseconds only to have a database connection eat up a full second, there is no point either. So being clear about the applicable scenarios matters a lot.

Second, the technique really is cool. The coolest part is not the ring buffer but the idea of using CPU instructions directly for CAS. The ring buffer is an engineering-level optimization that is friendlier to CPU branch prediction, i.e. what we call cache friendly. Another benefit nobody else mentioned is replay, which is very handy for daily regression tests.

Two practical examples off the top of my head. One is receiving real-time Reuters market data and redistributing it to other processes or threads. Another: when an algo model decides to place an order or pull an order from the market, the instruction must be sent to the market as fast as possible. A more complex one: chaining several ring buffers into a small producer-consumer pipeline; not many people use that, so I will not go on about it.

    Read full article from java Disruptor工作原理,谁能用一个比喻形容下? - 知乎


    Netflix OSS: Batch Requests with Ruby on Rails and Ember.js



    Netflix OSS: Batch Requests with Ruby on Rails and Ember.js

    Batching allows you to pass several operations in a single HTTP request. How do we make a Batch request from Ember UI and process it on a Rails backend? Ember Batch Request and Batch Request API to the rescue. A JSON array of HTTP requests are created on the UI using an Ember add-on and then processed sequentially or in parallel on the backend API through the Rails Middleware.


    Read full article from Netflix OSS: Batch Requests with Ruby on Rails and Ember.js


    Innovating Faster on Personalization Algorithms at Netflix Using Interleaving



    Innovating Faster on Personalization Algorithms at Netflix Using Interleaving

    The Netflix experience is powered by a family of ranking algorithms, each optimized for a different purpose. For instance, the Top Picks row on the homepage makes recommendations based on a personalized ranking of videos, and the Trending Now row also incorporates recent popularity trends. These algorithms, along with many others, are used together to construct personalized homepages for over 100 million members.


    Read full article from Innovating Faster on Personalization Algorithms at Netflix Using Interleaving


    A comprehensive guide to design systems - InVision Blog



    A comprehensive guide to design systems - InVision Blog

Companies like Airbnb, Uber, and IBM have changed the ways they design digital products by incorporating their own unique design systems. By utilizing a collection of repeatable components and a set of standards guiding the use of those components, each of these companies has been able to change the pace of creation and innovation within their teams.

    Many organizations have what they consider to be a design system, but these collections typically amount to no more than a group of elements and code snippets. While a style guide or pattern library can be a starting point for a design system, they are not the only components. Let's dig into the fundamentals of design systems, plan how you can build and implement one in your organization, and explore several examples of organizations that are using design systems to drive success.


    Read full article from A comprehensive guide to design systems - InVision Blog


    1分钟了解Leader-Follower线程模型_架构师之路_传送门



    1分钟了解Leader-Follower线程模型_架构师之路_传送门

The figure above shows the state transitions of the L/F (Leader/Follower) threading model; there are six key points:

(1) A thread is in one of three states: leading, processing, or following.

(2) Suppose there are N threads in total; exactly one is the leading thread (waiting for a task), x are processing threads (doing work), and the remaining N-1-x are following threads (idle).

(3) There is a single lock; whoever grabs it becomes the leader.

(4) When an event/task arrives, the leading thread handles it and thereby moves into the processing state; when it finishes, it becomes a follower again.

(5) Once leadership is given up, the followers try to grab the lock; whoever succeeds becomes the leader, the rest remain followers.

(6) Followers do nothing but contend for the lock, striving to become the leader.


Advantage: no message queue is needed.


Applicable scenario: the threads can finish each task very quickly.


Some people say: "under high concurrency, the L/F lock easily becomes the system bottleneck, and a message queue has to be introduced to solve it."


That view is wrong. A message queue is itself a critical resource and still needs a lock to guarantee mutual exclusion; the lock contention merely moves from the leader election to the message queue, and at that point the queue serves only as a buffer.


The fundamental solution is to reduce lock granularity (for example, by using multiple queues).
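
A minimal sketch of the hand-off described above. The "event source" is stubbed with a blocking queue purely for demonstration; in a real Leader/Follower design the leader would wait directly on I/O (for example a selector) rather than on a shared queue, which is exactly the "no message queue needed" point made above:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

/** N threads share one lock; the lock holder is the leader, waits for the next
 *  task, hands off leadership, then processes the task as a "processing" thread. */
public class LeaderFollowerPool {
    private final ReentrantLock leaderLock = new ReentrantLock();
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>(); // stand-in event source

    public void submit(Runnable task) { events.add(task); }

    public void start(int threads) {
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                while (true) {
                    leaderLock.lock();          // follower -> leader
                    Runnable task;
                    try {
                        task = events.take();   // the leader waits for the next event
                    } catch (InterruptedException e) {
                        return;
                    } finally {
                        leaderLock.unlock();    // give up leadership before processing
                    }
                    task.run();                 // processing state, then back to following
                }
            }).start();
        }
    }
}
```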


    Read full article from 1分钟了解Leader-Follower线程模型_架构师之路_传送门


    Phoenix系列:二级索引(1) - nick hao - 博客园



    Phoenix系列:二级索引(1) - nick hao - 博客园

Phoenix uses HBase as its backing store. With HBase we normally rely on the lexicographically ordered RowKey for fast access; beyond that we can also search with custom Filters, but those are based on full table scans. The secondary indexes Phoenix provides are another option for fast lookups or batch retrieval in HBase that avoids full table scans. The examples below use the following table for testing:


    Read full article from Phoenix系列:二级索引(1) - nick hao - 博客园


    Hystrix入门与分析(二):依赖隔离之线程池隔离 - nick hao - 博客园



    Hystrix入门与分析(二):依赖隔离之线程池隔离 - nick hao - 博客园

Dependency isolation is the core purpose of Hystrix. Dependency isolation is really resource isolation: the resources used to call each dependency are partitioned off and controlled and scheduled centrally. Why isolate resources at all? Mainly for the following reasons:

1. Resources are allocated sensibly and the control over that allocation is handed to the user; a failure in one dependency does not affect calls to other dependencies, and their access to resources is unaffected.

2. It becomes easy to attach call policies per dependency, such as timeouts and circuit breaking.

3. Capping the resources devoted to a dependency also protects the downstream service, preventing a flood of concurrent requests from overwhelming an already struggling dependency or, worse, triggering a cascading avalanche.

4. Wrapping dependency calls makes them easier to monitor and analyze, as with hystrix-dashboard.


Hystrix provides two isolation strategies: thread pool isolation and semaphore isolation. As shown in the figure below, with thread pool isolation Hystrix can create a thread pool per dependency, isolating its resources from those of other dependencies while also limiting its concurrency and the growth of blocked calls. Each dependency can be allocated resources (here, threads) according to its weight, and trouble in any one dependency does not affect the resources used by the others.
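
For reference, this is roughly what a thread-pool-isolated command looks like with the Hystrix API; the group and pool names and the sizes are placeholders, and the remote call is stubbed out:

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixThreadPoolKey;
import com.netflix.hystrix.HystrixThreadPoolProperties;

/** Each command group gets its own thread pool; a slow "UserService" cannot
 *  exhaust the threads used for calls to other dependencies. */
public class GetUserCommand extends HystrixCommand<String> {
    private final long userId;

    public GetUserCommand(long userId) {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("UserService"))
                .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey("UserServicePool"))
                .andThreadPoolPropertiesDefaults(
                        HystrixThreadPoolProperties.Setter()
                                .withCoreSize(10)          // threads reserved for this dependency
                                .withMaxQueueSize(20)));
        this.userId = userId;
    }

    @Override
    protected String run() {
        // runs on the dedicated thread pool, subject to timeouts and circuit breaking
        return remoteUserService(userId);
    }

    @Override
    protected String getFallback() {
        return "default-user";   // served when the pool is saturated or the call fails
    }

    private String remoteUserService(long userId) { return "user-" + userId; } // stub
}
```

Usage is simply `new GetUserCommand(42).execute()`; the calling thread blocks while the work runs on the isolated pool.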


    Read full article from Hystrix入门与分析(二):依赖隔离之线程池隔离 - nick hao - 博客园


    线程安全的HashMap · 系统设计(System Design)



    线程安全的HashMap · 系统设计(System Design)

• Open addressing: all elements live in one one-dimensional array, and on a collision an element jumps forward according to some rule. Suppose element x originally belongs at position hash(x) % m (m is the array length); then on the i-th collision its position becomes H_i = [hash(x) + d_i] % m, where d_i is the offset added for the i-th collision. The offset d_i is usually computed in one of three ways:

  1. Linear probing: very simple. If the slot is already taken, move forward one position, i.e. d_i = i, for i = 1, 2, 3, ...

     The biggest advantage of this scheme is that it is fast and friendly to the CPU cache; its drawback is just as obvious. Suppose three elements x1, x2, x3 all hash to the same value p. x1 arrives first, finds slot p empty, and is placed there. x2 arrives, looks at p, hits its first collision, probes one slot forward to p+1, finds it empty, and is placed there. Likewise, placing x3 requires probing p, p+1, and p+2. In other words, colliding elements pile up during probing, dragging efficiency down; this phenomenon is called clustering.

  2. Quadratic probing: d_i = a*i^2 + b*i + c, where a, b, c are coefficients; d_i is a quadratic function of i, hence the name. Its performance sits between linear probing and double hashing.

  3. Double hashing: the offset d_i is computed by a second hash function hash2(x), with d_i = (hash2(x) % (m-1) + 1) * i.

• Chaining: allocate a fixed-length array whose every slot points to a bucket (which can be a singly or doubly linked list); hash each element, take the hash modulo the array length to find its bucket, and insert it there. Colliding elements end up in the same bucket.

java.util.HashMap in both JDK 7 and JDK 8 uses chaining.

How do we turn a chaining-based HashMap into a thread-safe one? There are several approaches:

• Mark all public methods synchronized. This amounts to one global lock that every operation must acquire first; java.util.Hashtable does exactly this, and its performance is poor.
• Since the buckets are logically independent of each other, give each bucket its own lock; two threads touching different buckets then no longer fight over the same lock. Concurrency is better than with a single global lock, but having that many locks is itself a sizable overhead.
• Lock striping. The second approach spreads the lock pressure across the buckets and is workable in theory, but with, say, 10,000 buckets you would have to create 10,000 ReentrantLock instances, which is expensive. Instead, divide all the buckets evenly into 16 parts, each part called a segment, with one lock per segment; the number of locks drops to 16. java.util.concurrent.ConcurrentHashMap in JDK 7 follows this idea (a minimal sketch appears after this list).
• In JDK 8, ConcurrentHashMap changed substantially again: on top of lock striping it makes heavy use of CAS instructions, and the underlying storage has a small optimization: when a bucket's list grows too long (more than 8 entries by default), the list is converted into a red-black tree. Reads and writes on a long list are slow, and switching to a red-black tree improves performance. The JDK 8 ConcurrentHashMap is more than 6,000 lines of code, versus only about 1,500 in JDK 7.
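
A minimal sketch of the lock-striping idea referenced in the list above; it is nowhere near the real ConcurrentHashMap, but it shows how 16 segment locks spread the contention:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** 16 segments, each guarded by its own lock, so threads hitting different
 *  segments do not contend. Illustrative only, not ConcurrentHashMap's design in full. */
public class StripedMap<K, V> {
    private static final int SEGMENTS = 16;
    private final ReentrantLock[] locks = new ReentrantLock[SEGMENTS];
    @SuppressWarnings("unchecked")
    private final Map<K, V>[] segments = new Map[SEGMENTS];

    public StripedMap() {
        for (int i = 0; i < SEGMENTS; i++) {
            locks[i] = new ReentrantLock();
            segments[i] = new HashMap<>();
        }
    }

    private int segmentFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % SEGMENTS;
    }

    public V put(K key, V value) {
        int s = segmentFor(key);
        locks[s].lock();
        try {
            return segments[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }

    public V get(Object key) {
        int s = segmentFor(key);
        locks[s].lock();
        try {
            return segments[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }
}
```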


    Read full article from 线程安全的HashMap · 系统设计(System Design)


    躁动的季节里非典型跳槽指南,03篇 | 容易被忽视的简历问题



    躁动的季节里非典型跳槽指南,03篇 | 容易被忽视的简历问题

To be clear up front: it really does matter. It matters for whether you receive an interview invitation at all, and it matters because you may already have lost 10 points of impression before the interview even starts. Do not shrug off those 10 points, because the whole interview is made up of evaluation points like this, accumulated 10 points at a time.

And do not tell yourself that some idiot interviewer simply failed to appreciate you, or that some idiot company did not deserve you and another one will. An interview is a relatively serious affair; looking only at the résumé, if you do not take yourself seriously, how can you expect others to respect and appreciate you?

So yes, it really matters.

02 The résumé's storefront: layout

When you pick up a résumé, the first thing that registers is its overall impression: is it bloated, is it a dizzying mess, is it logically confused, or is it a pleasure to read? It all starts with the layout.

The product lead X sitting next to me once said: out of a hundred résumés, one pass over the layout leaves only a dozen or so; if someone cannot even put a résumé together, what product could they possibly build?

That is a bit blunt, but not wrong. For product and design people, producing a pleasing résumé is part of their basic craft; they have simply moved part of the evaluation earlier. Engineers are not held to quite that standard, but a résumé with a clean layout and clear logic always leaves the interviewer with a good impression.


    Read full article from 躁动的季节里非典型跳槽指南,03篇 | 容易被忽视的简历问题


    Facebook 为什么不用 Cassandra 了? - 知乎



    Facebook 为什么不用 Cassandra 了? - 知乎

NoSQL products such as Cassandra and MongoDB do have big advantages in scalability and performance, but I think the biggest problem right now is still stability and operability. For a relatively "heavy" product like Cassandra, the more intelligent it is, the more effort and experience it takes to truly master it.

Facebook originally developed Cassandra for Inbox Search, but the later Messages system used HBase. Facebook's explanation was that Cassandra's eventual consistency model was not a good fit for the Messages system, while HBase has a simpler consistency model; there were other reasons too, such as HBase being more mature with more success stories. Twitter and Digg both very publicly adopted Cassandra and later abandoned it, although Twitter still uses Cassandra for some projects, just no longer for the main Tweet storage.

    Read full article from Facebook 为什么不用 Cassandra 了? - 知乎


    How do leases work with Memcached and McRouter? · Issue #175 · facebook/mcrouter



    How do leases work with Memcached and McRouter? · Issue #175 · facebook/mcrouter

    What we really need is data consistency in our cache layer. Since CAS ops are point-to-point, they won't work for us since we use replicated pools. As the white paper states, the lease feature addresses staleness and thundering herd. That's exactly what we need.

    1. Would it be possible to publish Facebook's memcached fork? The lease feature is basically incomplete without that. This would be the fastest path to resolution for us.
    2. Would it be possible to send a PR to @dormando with your custom memcached fork? That way it can make it into open-source memcached and the community can benefit. Is there any reason why Facebook didn't do a PR for this?
    3. If you are unable to publish the memcached custom fork, our fallback option would be to implement our own version of acquiring a token for optimistic locking. In essence, that defeats the purpose of leveraging leases through mcrouter. It would potentially incur an additional cache roundtrip.


    Read full article from How do leases work with Memcached and McRouter? · Issue #175 · facebook/mcrouter


    工先利其器:流行的代码与敏捷开发工具选择指南 - 21CTO | 十条



    工先利其器:流行的代码与敏捷开发工具选择指南 - 21CTO | 十条

This is the best project and team management tool out there! It gives us a one-stop agile project management product that can be used to share files, notes, to-do lists, and more within a team.

You can also schedule timelines for yourself and the team. It removes the old problem of juggling a mix of applications; BaseCamp has become the go-to platform for all project management needs.


    Read full article from 工先利其器:流行的代码与敏捷开发工具选择指南 - 21CTO | 十条


    躁动的季节里非典型跳槽指南,02篇 | 技术面试三要素



    躁动的季节里非典型跳槽指南,02篇 | 技术面试三要素

The skills discussed here are not limited to which frameworks a candidate knows or which language they have mastered; rather, leaving project experience aside, they are the relatively "hard" abilities: languages, frameworks, and tools all count, forming a fairly general body of skills.

There are generally two ways to assess this. The first is that while the candidate walks through their project experience, the interviewer picks the right moment to dig into technical details, extending from the project into the technology itself and gauging the candidate's technical strength in that context.

The second is to examine in detail the technical scope and abilities claimed on the candidate's résumé; written tests and similar stages can also be considered important ways of assessing this dimension. Written tests sometimes contain open-ended questions, but they still mainly measure hard skills.

Projects can be embellished, and it is relatively easy to spin a self-consistent story around them, but skill points are different: if you have truly mastered them you can handle whatever "difficulties" the interviewer throws at you, and for the hiring side they are a key indicator of the candidate's real technical strength. Projects change, so some project experience may never be applicable again, but hard skills are universal currency.

On this skills dimension, especially for candidates in the first stage described in the first article of this series, projects often do not cover that much ground, so what the candidate learns outside of work matters a great deal. The knowledge and skills you pick up with extra effort early in your career will inevitably earn you bonus points in later interviews.


    Read full article from 躁动的季节里非典型跳槽指南,02篇 | 技术面试三要素


    以太坊合约分析之远程购买



    以太坊合约分析之远程购买

Alipay acts as a credit intermediary in a transaction, protecting users from losing both their money and the goods. Although Alipay mainly protects the buyer's interests, it charges merchants rent for their storefronts, and since the wool comes from the sheep's back, merchants quietly pass that cost back to users.

Now that we have Ethereum, Alipay's role as a credit intermediary can be replaced by a blockchain contract. Let us look at the following simple purchase contract code.


    Read full article from 以太坊合约分析之远程购买


    坑人无数的Java面试题之ArrayList



    坑人无数的Java面试题之ArrayList

ArrayList may be the simplest data structure in Java; even a non-Java programmer probably knows it, because every language has something similar. Yet in countless interviews I have found that not everyone has truly mastered this simple structure. Take the questions below: not many people can answer more than half of them. Care to try?

Are inserts and deletes in an ArrayList necessarily slow?
It depends on how far the element is from the end of the array. An ArrayList actually works quite well as a stack: push and pop at the end involve no element moves at all.

How does iterating an ArrayList compare with iterating a LinkedList?

Iterating an ArrayList is much faster than iterating a LinkedList. The ArrayList's biggest advantage is memory contiguity: the CPU's cache pulls in contiguous chunks of memory, which greatly reduces the cost of reading memory.

How does an ArrayList grow?

After growing, an ArrayList's capacity is 1.5 times what it was before. When the ArrayList is large this wastes quite a bit of space and can even cause an OutOfMemoryError. Growing also requires copying the array, which is costly. So try hard to avoid growth by supplying an initial capacity estimate, to keep resizing from hurting performance.

Why is the default array size 10?

Honestly, I never found the real reason. The story goes that Sun's engineers surveyed a large body of widely used code and concluded that arrays of length 10 were the most common and most efficient; others say it was just a number picked at random, that 8 or 12 would make no difference, and 10 simply felt round and complete.

Is ArrayList thread safe?

Of course not. The thread-safe array container is Vector, whose implementation is simple: slap synchronized on every method and be done with it. You can also skip Vector and wrap an ordinary ArrayList with Collections.synchronizedList to get a thread-safe container; the principle is the same as Vector, a layer of synchronized around every method.

Is an array a good fit for a queue?

A queue is usually FIFO. Using an ArrayList as a queue means appending at the tail of the array and removing at the head, or the other way around; either way one of the two operations has to shift the array's contents, which is expensive.

That answer is wrong!

ArrayList is indeed a poor fit for a queue, but an array is an excellent one. For example, ArrayBlockingQueue is internally a ring queue: a fixed-length queue implemented on top of a fixed-length array. The famous open-source Disruptor library likewise builds an ultra-high-performance queue on a ring array; the details are complicated and not explained here, but in short it uses two offsets to mark the array's read position and write position, wrapping back to the start of the array when they run past the end, which works precisely because the array has a fixed length.
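
To make the ring-queue point concrete, here is a minimal single-threaded sketch (not ArrayBlockingQueue's or Disruptor's actual code) where two ever-increasing offsets mark the read and write positions:

```java
/** Fixed-capacity ring queue: no element is ever shifted; the offsets wrap
 *  around the array because the capacity never changes. Single-threaded sketch. */
public class RingQueue<E> {
    private final Object[] items;
    private long readPos;   // total number of elements taken so far
    private long writePos;  // total number of elements added so far

    public RingQueue(int capacity) { items = new Object[capacity]; }

    public boolean offer(E e) {
        if (writePos - readPos == items.length) return false;   // full
        items[(int) (writePos++ % items.length)] = e;
        return true;
    }

    @SuppressWarnings("unchecked")
    public E poll() {
        if (writePos == readPos) return null;                   // empty
        return (E) items[(int) (readPos++ % items.length)];
    }
}
```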


    Read full article from 坑人无数的Java面试题之ArrayList


    Fundamentals of System Design — Part 2 – Hacker Noon



    Fundamentals of System Design — Part 2 – Hacker Noon

    Applications need to have permanent storage for user or applications specific data. In memory data structures like linked list, arrays are optimised for access by CPU via pointers. Permanent storage is optimised for read/write access by clients/processes connecting to database server. A very important aspect of permanent persistence is data modelling. I will devote this post on how to choose a good data model for your application.


    Read full article from Fundamentals of System Design — Part 2 – Hacker Noon


    Fundamentals of System Design — Part 1 – Hacker Noon



    Fundamentals of System Design — Part 1 – Hacker Noon

    While designing systems, there are three primary concerns that should be addressed — reliability, scalability and maintainability. These terms are tossed around quite frequently and in this post I want to provide expositions for each of them.


    Read full article from Fundamentals of System Design — Part 1 – Hacker Noon


    记录一次壮烈牺牲的阿里巴巴面试! - Java后端技术 | 十条



    记录一次壮烈牺牲的阿里巴巴面试! - Java后端技术 | 十条

First, the interviewer gave me two minutes to introduce myself. I had assumed I could talk on and on, presenting myself from every angle as if to a blind date. I was done in less than thirty seconds.

After five seconds of silence, he said "mm."

I could feel the awkward-but-polite smile piling up on my face.

3. Recent project experience

Then he asked what projects I had been working on recently and what I had done during graduate school.
Time to show off! From the origins of Java to the evolution of Spring to how we use Jenkins, with dva+antd thrown in for good measure; after half a minute I again had nothing left to say.

He listened very patiently to my incoherent rambling, then got down to business.

4. Spring

Interviewer: I see you have used Spring. Tell me, why do we use Spring at all?
Me: (because everyone says it is good) First, Spring is a large framework that packages up many mature features, so we do not have to reinvent the wheel. Second, it manages dependencies through IoC, so we no longer have to instantiate objects ourselves.
Interviewer: (I knew you would bring up IoC) Then explain IoC.
Me: IoC is inversion of control: using Java reflection, instance creation is handed over to Spring, and Spring manages the instances through configuration files.
Interviewer: But we could just use the factory pattern; a factory can also manage instance creation. Why do we need Spring?
Me: Uh... because... it is convenient? (Seeing the interviewer's face freeze, I decided to change the subject to ease the awkwardness.) Besides, Spring's IoC uses the singleton pattern.


    Read full article from 记录一次壮烈牺牲的阿里巴巴面试! - Java后端技术 | 十条


    Git: Rollback (or Undo) a Pull from an External Repository To Return To A Previous Stable Commit State « OmegaMan's Musings



    Git: Rollback (or Undo) a Pull from an External Repository To Return To A Previous Stable Commit State « OmegaMan's Musings

    In the case where one pulls from a repository such as GitHub and something breaks, one may want to undo that pull. Here are the steps to rollback to a previous version using Git.

    Note if you have any work in the local working directory done after the pull, it will be lost using this method.

    Our goal is to move to the Head to the last snapshot before the pull and return the Zen to us.

    Steps

    1. Find the SHA-1 version using reflog. The reflog is interactive and one uses a q to exit out.


    Read full article from Git: Rollback (or Undo) a Pull from an External Repository To Return To A Previous Stable Commit State « OmegaMan's Musings


    When Is the Best Time of Year to Buy Large Appliances?



    When Is the Best Time of Year to Buy Large Appliances?

    The best time to buy most major appliances is during the months of September and October. During these two months, manufacturers unveil their latest models. This means that the previous year's models must be discounted in order to make room for the new models that will hit stores in the winter.

    2. May
    The exception to point number one is refrigerators. Unlike the other major appliances, most manufacturers roll out their new models of refrigerators in the summer. This means that last year's models get discounted during the spring.

    3. January
    Some stores hang on to older inventory while they make the transition from last year's models to the next. But once the new year hits, all of last year's remaining models must be discounted even further. Though better deals may be available at this time, selection will be limited.


    Read full article from When Is the Best Time of Year to Buy Large Appliances?


    Ashwin Jayaprakash's Blog: Holiday 2017 reading



    Ashwin Jayaprakash's Blog: Holiday 2017 reading

    Hi folks, a belated Merry Christmas and a Happy New Year! Here's a big list of tech videos and articles that I've been bookmarking for a while.


    Read full article from Ashwin Jayaprakash's Blog: Holiday 2017 reading


    Ashwin Jayaprakash's Blog: Analyzing large Java heap dumps when Eclipse Memory Analyzer (MAT) UI fails



    Ashwin Jayaprakash's Blog: Analyzing large Java heap dumps when Eclipse Memory Analyzer (MAT) UI fails

    If you find yourself trying to analyze a big heap dump (20-30GB) downloaded from your production server to your staging/test machines.. only to find out that X-over-SSH is too slow then this article is for you.

    As of Nov 2013, we have 2 options - Eclipse MAT and a hidden gem called Bheapsampler.

    Option 1:
Eclipse Memory Analyzer is obviously the best tool for this job. However, trying to get the UI to run remotely is very painful. Launching Eclipse and updating the UI is an extra load on the JVM that is already busy analyzing a 30G heap dump. Fortunately, there is a script that comes with MAT to parse the heap dump and generate HTML reports without ever having to launch Eclipse! It's just that the command line option is not well advertised.

    Read full article from Ashwin Jayaprakash's Blog: Analyzing large Java heap dumps when Eclipse Memory Analyzer (MAT) UI fails


    [case9]频繁GC (Allocation Failure)及young gc时间过长分析 - code-craft - SegmentFault 思否



    [case9]频繁GC (Allocation Failure)及young gc时间过长分析 - code-craft - SegmentFault 思否

From the analysis above, the young generation looks a bit too large and young GC takes too long; moreover, the survivor space is basically empty after every young GC, which means new objects are created quickly and die quickly, so the survivor space as originally sized never serves its purpose. It is therefore worth shrinking the young generation, or trying G1 instead.

A few points about -XX:+PrintTenuringDistribution that are worth making explicit:

• Which region's object age distribution does it print? (the survivor space)
• Is it printed before or after the GC? (after the GC)
• When a new object reaches the survivor space for the first time, is its age 0 or 1?
An object's age is the number of minor GCs it has survived. A newly allocated object has age 0; after the first minor GC, if it has not been collected, its age becomes 1 and, having just survived its first minor GC, it moves into the survivor space. So an object entering the survivor space for the first time has age 1.
• Dynamic adjustment of the promotion threshold (new threshold):
If the total size across the age buckets exceeds the desired survivor size, the survivor space has overflowed (is full), and the threshold is then recalculated.


    Read full article from [case9]频繁GC (Allocation Failure)及young gc时间过长分析 - code-craft - SegmentFault 思否


    How do I check what version of Python is running my script? - Stack Overflow



    How do I check what version of Python is running my script? - Stack Overflow

    From the command line (note the capital 'V'):

    python -V

    This is documented in 'man python'.


    Read full article from How do I check what version of Python is running my script? - Stack Overflow


    How to get IP address of running docker container - Stack Overflow



    How to get IP address of running docker container - Stack Overflow

The IP will be localhost (since we are dealing with your computer. But if you want you can also type 127.0.0.1) and the port that you set in the container. For example, if you run Docker with the port option like this -p 3003:3000

    Then it means that NodeJS should be listening on port 3000, while the exposed port will be 3003. Exposed to your local machine.

    Thus you could type in the browser:


    Read full article from How to get IP address of running docker container - Stack Overflow


    OS X: About the application firewall - Apple Support



    OS X: About the application firewall - Apple Support

    Use these steps to enable the application firewall:

    1. Choose System Preferences from the Apple menu.
    2. Click Security or Security & Privacy.
    3. Click the Firewall tab.
    4. Unlock the pane by clicking the lock in the lower-left corner and enter the administrator username and password.
    5. Click "Turn On Firewall" or "Start" to enable the firewall.
    6. Click Advanced to customize the firewall configuration.


    Read full article from OS X: About the application firewall - Apple Support


    Fatal Error: com.docker.osx.hyperkit.linux failed to start · Issue #1071 · docker/for-mac



    Fatal Error: com.docker.osx.hyperkit.linux failed to start · Issue #1071 · docker/for-mac

    Docker for Mac beta 34+ added extra integrity checks for the qcow2 file on VM start. Your logs contain the error

    block device raised exception: (Failure "Found two references to cluster 139760: 118254.39256 and 84.43344")  

    Unfortunately this means that -- sometime in the past -- two groups of sectors in the virtual disk have become mapped to the same physical sector. It's unsafe to write to the file in this state. Unfortunately the only option is to use "Whale menu" -> "Preferences..." -> "Reset" -> "Reset to factory defaults". Unfortunately your containers and images will need to be rebuilt.

    I suspect this may have happened when we enabled "online compaction" feature to reclaim space in the qcow2 file. The online feature is now disabled so I believe the problem should be fixed.

I apologize for the inconvenience. By the way beta 34 hotfix 1 was just released -- it has a bug fix relating to importing containers from toolbox. I recommend running the hotfix if you plan to import from toolbox.


    Read full article from Fatal Error: com.docker.osx.hyperkit.linux failed to start · Issue #1071 · docker/for-mac


    谈谈分布式系统



    谈谈分布式系统

Half a year ago, a wealth-management product launched by a Japanese ronin nobody had ever met suddenly became a hit in Qixia Town. Word was that if you bought some and held it, within a few months you could buy out the Tongfu Inn, maybe even the Longmen Escort Agency. Our Xiao Liu's seventh grand-uncle was shouting about selling the ancestral home to go all in. The whole thing smelled fishy to me, and he is family after all; I could not let the old man get scammed, so leave it to me. With my peerless wisdom and a tongue worthy of the Prime Minister, I talked him out of it. Within half a year, his whole family had cut off all contact with us, never to see us again. – from "The Wushuang Diary", 2018.1.1


    Read full article from 谈谈分布式系统


    Big List of Scalabilty Conferences - High Scalability -



    Big List of Scalabilty Conferences - High Scalability -

    Which of the many conferences should you attend? I get this question a lot, so I compiled a list. The list isn't life, it's not a top 10, and it won't say if a conference is naughty or nice, but they are conferences I know about, have attended, or referenced in an article. By no means is this list exhaustive. If you know of a conference people should consider attending, please add them in the comments. If you have an opinion about a particular conference, please comment on that too.


    Read full article from Big List of Scalabilty Conferences - High Scalability -


    Installation



    Installation

    To add the plugin, start the IDE and find the Plugins dialog. Browse Repositories, choose Category: Build, and find the Error-prone plugin. Right-click and choose "Download and install". The IDE will restart after you've exited these dialogs.

    To enable Error Prone, choose Settings | Compiler | Java Compiler | Use compiler: Javac with error-prone and also make sure Settings | Compiler | Use external build is NOT selected.

    Eclipse

    Ideally, you should find out about failed Error Prone checks as you code in eclipse, thanks to the continuous compilation by ECJ (eclipse compiler for Java). But this is an architectural challenge, as Error Prone currently relies heavily on the com.sun.* APIs for accessing the AST and symbol table.

    For now, Eclipse users should use the Findbugs eclipse plugin instead, as it catches many of the same issues.


    Read full article from Installation


    The Architecture That Helps Stripe Move Faster | Software Development Conference QCon New York



    The Architecture That Helps Stripe Move Faster | Software Development Conference QCon New York

    The last story has to do with an infrastructure migration. This is something that we did last November. We migrated our entire infrastructure both from EC2 Classic, AWS's legacy networking environment, into VPC and also across regions from one region to another and so, we did this and an entire data center migration. We did it in the course of about 4 or 6 hours with no user visible downtime or any real latency impact. The important part of that story is that the way that we were able to do this effectively and without impacting our users was that we looked for points of super high leverage. We tried to find as many shared pieces of infrastructure as we could and solved problems at those layers to try and minimize the impact on the rest of the organization and sort of the amount of blockers that we had to do work through in order to make the migration move forward.

    Read full article from The Architecture That Helps Stripe Move Faster | Software Development Conference QCon New York


    APIs as infrastructure: future-proofing Stripe with versioning



    APIs as infrastructure: future-proofing Stripe with versioning

    When it comes to APIs, change isn't popular. While software developers are used to iterating quickly and often, API developers lose that flexibility as soon as even one user starts consuming their interface. Many of us are familiar with how the Unix operating system evolved. In 1994, The Unix-Haters Handbook was published containing a long list of missives about the software—everything from overly-cryptic command names that were optimized for Teletype machines, to irreversible file deletion, to unintuitive programs with far too many options. Over twenty years later, an overwhelming majority of these complaints are still valid even across the dozens of modern derivatives. Unix had become so widely used that changing its behavior would have challenging implications. For better or worse, it established a contract with its users that defined how Unix interfaces behave.


    Read full article from APIs as infrastructure: future-proofing Stripe with versioning


    github/scientist: A Ruby library for carefully refactoring critical paths.



    github/scientist: A Ruby library for carefully refactoring critical paths.

    Let's pretend you're changing the way you handle permissions in a large web app. Tests can help guide your refactoring, but you really want to compare the current and refactored behaviors under load.


    Read full article from github/scientist: A Ruby library for carefully refactoring critical paths.


    Don’t Alter Table. Do Copy and Rename – Research And Development Blog



    Don't Alter Table. Do Copy and Rename – Research And Development Blog

In some cases a MySQL MyISAM table structure needs to be altered. This includes adding, removing and changing table columns (or indexes) and even re-ordering the MySQL table.
In these cases, for performance and safety reasons, it is wise to avoid altering the current working MySQL table and adopt the Copy and Rename approach.

The Copy and Rename approach consists of the following steps (a JDBC sketch of them follows the list):

    • Create similar temporary table but with the requested change
    • Disable the temporary table keys,
    • Copy the rows from the original table to the temporary table
    • Enable the temporary table keys,
    • Backup the original table and rename the temporary table to have the original table name
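
A hedged JDBC sketch of those steps; the connection settings, table and column names are placeholders, and DISABLE/ENABLE KEYS applies to MyISAM tables as the article assumes:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CopyAndRename {
    public static void main(String[] args) throws Exception {
        try (Connection c = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/app", "user", "secret");
             Statement st = c.createStatement()) {

            st.execute("CREATE TABLE orders_new LIKE orders");                 // same structure
            st.execute("ALTER TABLE orders_new ADD COLUMN note VARCHAR(255)"); // the requested change
            st.execute("ALTER TABLE orders_new DISABLE KEYS");                 // faster bulk copy (MyISAM)
            st.execute("INSERT INTO orders_new (id, amount) SELECT id, amount FROM orders");
            st.execute("ALTER TABLE orders_new ENABLE KEYS");
            st.execute("RENAME TABLE orders TO orders_backup, orders_new TO orders"); // atomic swap
        }
    }
}
```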


    Read full article from Don't Alter Table. Do Copy and Rename – Research And Development Blog


    Online migrations at scale



    Online migrations at scale

    Engineering teams face a common challenge when building software: they eventually need to redesign the data models they use to support clean abstractions and more complex features. In production environments, this might mean migrating millions of active objects and refactoring thousands of lines of code.

    Stripe users expect availability and consistency from our API. This means that when we do migrations, we need to be extra careful: objects stored in our systems need to have accurate values, and Stripe's services need to remain available at all times.


    Read full article from Online migrations at scale


    Service discovery at Stripe



    Service discovery at Stripe

    With so many new technologies coming out every year (like Kubernetes or Habitat), it's easy to become so entangled in our excitement about the future that we forget to pay homage to the tools that have been quietly supporting our production environments. One such tool we've been using at Stripe for several years now is Consul. Consul helps discover services (that is, it helps us navigate the thousands of servers we run with various services running on them and tells us which ones are up and available for use). This effective and practical architectural choice wasn't flashy or entirely novel, but has served us dutifully in our continued mission to provide reliable service to our users around the world.


    Cracking Java Interviews (Java, Algorithm, Data structure, multi-threading, Spring and Hibernate): How will you create a thread safe table backed unique sequence generator in spring hibernate?



    Cracking Java Interviews (Java, Algorithm, Data structure, multi-threading, Spring and Hibernate): How will you create a thread safe table backed unique sequence generator in spring hibernate?

    This is a common scenario where you want to generate a database controlled unique sequence for your application business requirement i.e order number generator, id generator, claim number generator etc. Considering that your application may have distributed architecture with multiple JVMs and a single database, we do not have option of using JVM level synchronization to ensure thread safety of sequence. The only option is to use database level concurrency control to achieve thread-safety (to resolve issues like lost updates, etc.)
Hibernate (and even JPA 2.0) offers two approaches to handle concurrency at the database level:
1. Pessimistic approach – all the calls to increment the sequence in the table will be serialized in this case, thus no two threads can update the same table row at the same time. This approach suits our scenario since thread contention is high for a sequence generator.
2. Optimistic approach – a version field is introduced to the database table, which will be incremented every time a thread updates the sequence value. This does not require any locking at the table-row level, thus scalability is high using this approach provided most threads just read the shared value and only few of them write to it.
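
A minimal JPA 2.0 sketch of the pessimistic approach; the entity, field and sequence names are made up, and the method is assumed to run inside a transaction (for example a Spring @Transactional method):

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.LockModeType;

@Entity
class SequenceRow {
    @Id String name;      // e.g. "ORDER_NO"
    long nextValue;
}

class SequenceGenerator {
    /** The pessimistic lock (a SELECT ... FOR UPDATE under the hood) serializes
     *  concurrent callers, so every JVM talking to the same database gets a unique value. */
    long next(EntityManager em, String name) {
        SequenceRow row = em.find(SequenceRow.class, name, LockModeType.PESSIMISTIC_WRITE);
        long value = row.nextValue;
        row.nextValue = value + 1;   // flushed, and the row lock released, at commit
        return value;
    }
}
```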

    Read full article from Cracking Java Interviews (Java, Algorithm, Data structure, multi-threading, Spring and Hibernate): How will you create a thread safe table backed unique sequence generator in spring hibernate?


    Streaming Similarity Search for Fraud Detection – Smyte Blog – Medium



    Streaming Similarity Search for Fraud Detection – Smyte Blog – Medium

    There are a large number of bad actor pseudonyms on the internet, but there are a relatively small number of actual bad actors. What do I mean by this? Let us consider a spammer. The person or company that produces the spam is the actual bad actor. The spammer assumes a number of pseudonyms, distinguished perhaps by email address and/or login name. If we look closely at the behavior of these actors, we can see patterns. Do they use the same credit card number? Did they create their accounts from the same machine or around the same time? Etc. If so, then we need only identify one pseudonym and then find other pseudonyms that exhibited similar behavior.


    Read full article from Streaming Similarity Search for Fraud Detection – Smyte Blog – Medium


    Api Blocking - xiaobaoqiu Blog



    Api Blocking - xiaobaoqiu Blog

Rate limiting at the API level is one of the three classic tools for keeping a system stable (caching, rate limiting, and degradation).
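
As one concrete way to implement such a limit (a generic sketch, not necessarily the mechanism the article uses), Guava's token-bucket-style RateLimiter can cap the request rate per process:

```java
import com.google.common.util.concurrent.RateLimiter;

public class ApiRateLimitDemo {
    // Allow at most 100 requests per second in this process.
    private static final RateLimiter LIMITER = RateLimiter.create(100.0);

    static String handleRequest(String payload) {
        if (!LIMITER.tryAcquire()) {          // no permit available right now
            return "429 Too Many Requests";   // degrade instead of overloading the backend
        }
        return "200 OK: processed " + payload;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) System.out.println(handleRequest("req-" + i));
    }
}
```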


    Read full article from Api Blocking - xiaobaoqiu Blog


    去分库分表的亿级数据NewSQL实践之旅



    去分库分表的亿级数据NewSQL实践之旅

So the characteristics above match the problems we currently face in database application design very well. After the evaluation we started planning deployment and testing, and here are some observations on backup, monitoring, failure recovery, and scaling:


• The official tool suite based on mydumper and myloader takes a logical-backup approach, which is very convenient for migration;

• Monitoring is done through the built-in reporting to Prometheus, and scripts can bridge it to our own monitoring system;

• The stateful KV layer uses the Raft protocol for high availability, and its leader election satisfies the need for automatic failure recovery;

• The stateless TiDB-Server query layer can be fronted by LVS or HAProxy to handle failover;

• Region splitting in the KV layer lets the cluster scale out transparently to the application.


After testing we found that the most important aspects of database operations are all covered end to end, but every gain has its cost. The measured conclusions were:


• TiDB has a distributed architecture with network and multi-replica overhead, so in the Sysbench OLTP test the read performance of a single server is lower than MySQL's; however, with large data volumes performance does not degrade noticeably compared with MySQL, and after evaluation its elastic scaling can meet the business's peak demand;


    Read full article from 去分库分表的亿级数据NewSQL实践之旅


    Scalability Principles



    Scalability Principles

    One way to increase the amount of work that an application does is to decrease the time taken for individual work units to complete. For example, decreasing the amount of time required to process a user request means that you are able to handle more user requests in the same amount of time. Here are some examples of where this principle is appropriate and some possible realisation strategies.


    Read full article from Scalability Principles


    The WhatsApp Architecture Facebook Bought For $19 Billion - High Scalability -



    The WhatsApp Architecture Facebook Bought For $19 Billion - High Scalability -

    What has hundreds of nodes, thousands of cores, hundreds of terabytes of RAM, and hopes to serve the billions of smartphones that will soon be a reality around the globe? The Erlang/FreeBSD-based server infrastructure at WhatsApp. We've faced many challenges in meeting the ever-growing demand for our messaging services, but as we continue to push the envelope on size (>8000 cores) and speed (>70M Erlang messages per second) of our serving system.


    Read full article from The WhatsApp Architecture Facebook Bought For $19 Billion - High Scalability -


    Machine Learning Crash Course  |  Google Developers



    Machine Learning Crash Course  |  Google Developers

    Machine Learning Crash Course features a series of lessons with video lectures, real-world case studies, and hands-on practice exercises.

    Read full article from Machine Learning Crash Course  |  Google Developers


    Understanding UnboundLocalError in Python - Eli Bendersky's website



    Understanding UnboundLocalError in Python - Eli Bendersky's website

    If you're closely following the Python tag on StackOverflow, you'll notice that the same question comes up at least once a week. The question goes on like this:


    Read full article from Understanding UnboundLocalError in Python - Eli Bendersky's website


    Service Discovery in a Microservices Architecture - NGINX



    Service Discovery in a Microservices Architecture - NGINX

    Let's imagine that you are writing some code that invokes a service that has a REST API or Thrift API. In order to make a request, your code needs to know the network location (IP address and port) of a service instance. In a traditional application running on physical hardware, the network locations of service instances are relatively static. For example, your code can read the network locations from a configuration file that is occasionally updated.


    Read full article from Service Discovery in a Microservices Architecture - NGINX


    浅析海量用户的分布式系统设计(1) - 云+社区 - 腾讯云



    浅析海量用户的分布式系统设计(1) - 云+社区 - 腾讯云

We often hear how impressive the backend of some internet application is, QQ, WeChat, Taobao and so on. So what exactly makes the server side of an internet application impressive? Why does massive user traffic make a server-side system more complex? This article starts from the very basics and explores the fundamental concepts behind server-side technology.

Carrying capacity is why distributed systems exist

When an internet business wins over a mass audience, the most obvious technical problem it hits is that the servers become extremely busy. When 10 million users visit your site every day, no single machine can carry the load, no matter what server hardware you use. So when internet engineers tackle server-side problems, they have to think about how to use multiple servers to provide the same service, and that is where the term "distributed system" comes from.


    Read full article from 浅析海量用户的分布式系统设计(1) - 云+社区 - 腾讯云


    为什么LinkedIn放弃MySQL slowlog,转用基于网络层的慢查询分析器? - 高可用架构 | 十条



    为什么LinkedIn放弃MySQL slowlog,转用基于网络层的慢查询分析器? - 高可用架构 | 十条

For performance reasons, we did not use the slow query log. You can set a threshold on query time and record every query that crosses it in a file for later analysis. The drawback is that this cannot capture all queries: setting the threshold to 0 would capture everything, but in practice it does not work, because logging millions of queries to a file causes massive I/O and drastically reduces system throughput. So the slow query log was a non-starter.

The next option we considered was MySQL Performance Schema, which can be used to monitor the MySQL server's internals at a low level (available since MySQL 5.5.3). It provides a way to inspect the server's internal execution at runtime. The main drawback of this method is that enabling or disabling performance_schema requires restarting the database. You can enable Performance Schema and then turn off all the callers, which adds roughly 8% overhead; enabling all the callers adds roughly 20-25%. Analyzing Performance Schema is also quite complex; to ease this, MySQL introduced the sys schema in version 5.7.7. But to look at historical data we would still have to dump the data from Performance Schema to another server.

    Read full article from 为什么LinkedIn放弃MySQL slowlog,转用基于网络层的慢查询分析器? - 高可用架构 | 十条


    程序员必读书单 1.0 | lucida



    程序员必读书单 1.0 | lucida

This article distills the key knowledge a programmer needs into 19 key concepts in three broad categories, and for each concept gives introductory books, must-read books, and further reading. The goal is to be the best and most complete programmer reading list.


    Read full article from 程序员必读书单 1.0 | lucida

