Graph and its representations - GeeksforGeeks



Graph and its representations - GeeksforGeeks

The following two are the most commonly used representations of a graph:
1. Adjacency Matrix
2. Adjacency List
There are other representations as well, such as the Incidence Matrix and Incidence List. The choice of graph representation is situation-specific: it depends on the type of operations to be performed and on ease of use.
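
As a minimal sketch of the two representations named above, the following builds both for a small undirected graph. The function names and the sample edge list are illustrative assumptions, not taken from the article.

```python
def build_adjacency_matrix(num_vertices, edges):
    """Return an n x n matrix of 0/1 entries for an undirected graph."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected: mirror the edge
    return matrix


def build_adjacency_list(num_vertices, edges):
    """Return one neighbor list per vertex for an undirected graph."""
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency


# Hypothetical sample graph with 4 vertices.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3)]
MATRIX = build_adjacency_matrix(4, EDGES)
ADJ = build_adjacency_list(4, EDGES)
```

The matrix answers "is there an edge between u and v?" in constant time but always uses n x n cells, while the list uses space proportional to the number of edges and makes iterating over a vertex's neighbors cheap.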

Read full article from Graph and its representations - GeeksforGeeks


8 technologies that will change the way you do everything - MarketWatch



8 technologies that will change the way you do everything - MarketWatch

NEW YORK (MarketWatch): It's well known that technology grows exponentially, but sometimes its development unfolds right before your eyes and you don't even realize it. In 2015, new technologies and innovations will hit the market in the artificial intelligence, robotics, augmented reality, Internet of Things and 3-D printing spaces that could pave the way for a major shift in society. These technologies have already started to be integrated into society, changing the way we drive our cars, operate our homes, do our jobs, communicate and consume. But with new innovations on the horizon, you may want to keep an eye on these next year.

Internet of Things: automated homes

The Internet of Things finally became dinner table conversation (well, sort of) in 2014 thanks to Google Inc.,

Read full article from 8 technologies that will change the way you do everything - MarketWatch


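Driving analytics company says ride-hailing app services in San Francisco are safer_NetEase Tech



Driving analytics company says ride-hailing app services in San Francisco are safer_NetEase Tech

NetEase Tech, December 26: According to foreign media reports, the San Francisco driving analytics company Zendrive has published a research report saying that, in the San Francisco area, drivers working for ride-hailing apps such as Uber, Sidecar, and Lyft have safer driving habits than drivers working for taxi companies.


Read full article from Driving analytics company says ride-hailing app services in San Francisco are safer_NetEase Tech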


gearman [Gearman Job Server]



gearman [Gearman Job Server]

Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates. A few strong points about Gearman:


Read full article from gearman [Gearman Job Server]


Neil Fraser: Writing



Neil Fraser: Writing

Computer Science

Differential Synchronization
An architecture for collaborative text editing.
Diff and Patch
Improving the algorithms driving the central tools for version control.
Cursor Preservation
Techniques for preserving cursor locations in collaborative editors.
Exception Handling
The problems and benefits of programming exception handlers.
Virtual Host Insecurity
A technical document illustrating some of the security pitfalls of virtual hosting.
Neural Network Follies
Illustrates two of the pitfalls of neural networks.

Read full article from Neil Fraser: Writing


Neil Fraser: Writing: Differential Synchronization



Neil Fraser: Writing: Differential Synchronization

Differential synchronization is a symmetrical algorithm employing an unending cycle of background difference (diff) and patch operations. There is no requirement that "the chickens stop moving so we can count them" which plagues server-side three-way merges.


Read full article from Neil Fraser: Writing: Differential Synchronization


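A fairly comprehensive introduction to Paxos, the easiest for Chinese readers to understand - dellme99's column - Blog Channel - CSDN.NET



A fairly comprehensive introduction to Paxos, the easiest for Chinese readers to understand - dellme99's column - Blog Channel - CSDN.NET

All of our descriptions assume that the reader is already familiar with Lamport's paxos-simple paper, so the various concepts are not explained again. State machine: consistency in a state machine emphasizes that, after the same sequence of commands is executed on state machines that started from identical initial states, their states must agree with one another, that is, sequential consistency. Consistency in the Paxos algorithm refers to exactly this case, and we discuss this scenario further below. Readers may come to a deeper understanding of this description. Clearly, to give each piece of data a unique number, each round of voting can produce only one piece of data; otherwise the vote would be meaningless. All of Paxos's effort goes into ensuring that one round of voting produces only one piece of data. Going a step further, we call the data being voted on the Value; the core and essence of the Paxos algorithm is ensuring that each round of voting produces only one Value. P1: An acceptor must accept the first proposal that it receives. At first glance this condition is obvious: since there was no value before, an acceptor should naturally accept the first proposal. On reflection, though, P1 seems rather loose. Is it just a simple description of the problem, or a mathematically rigorous necessary condition? These doubts come down to two questions: (1) What is this condition essentially guaranteeing? (2) What happens with the second proposal? How to solve the problem that a majority cannot be formed under P1 How the second proposal is chosen So constraint P2 appears: P2: If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v.

Read full article from A fairly comprehensive introduction to Paxos, the easiest for Chinese readers to understand - dellme99's column - Blog Channel - CSDN.NET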


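Paxos Algorithm 1: The Theory Behind the Algorithm - 老码农's column - Blog Channel - CSDN.NET



Paxos Algorithm 1: The Theory Behind the Algorithm - 老码农's column - Blog Channel - CSDN.NET

All of our descriptions assume that the reader is already familiar with Lamport's paxos-simple paper, so the various concepts are not explained again. State machine: consistency in a state machine emphasizes that, after the same sequence of commands is executed on state machines that started from identical initial states, their states must agree with one another, that is, sequential consistency. Consistency in the Paxos algorithm refers to exactly this case, and we discuss this scenario further below. Readers may come to a deeper understanding of this description. Clearly, to give each piece of data a unique number, each round of voting can produce only one piece of data; otherwise the vote would be meaningless. All of Paxos's effort goes into ensuring that one round of voting produces only one piece of data. Going a step further, we call the data being voted on the Value; the core and essence of the Paxos algorithm is ensuring that each round of voting produces only one Value. P1: An acceptor must accept the first proposal that it receives. At first glance this condition is obvious: since there was no value before, an acceptor should naturally accept the first proposal. On reflection, though, P1 seems rather loose. Is it just a simple description of the problem, or a mathematically rigorous necessary condition? These doubts come down to two questions: (1) What is this condition essentially guaranteeing? (2) What happens with the second proposal? How to solve the problem that a majority cannot be formed under P1 How the second proposal is chosen So constraint P2 appears: P2: If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v.

Read full article from Paxos Algorithm 1: The Theory Behind the Algorithm - 老码农's column - Blog Channel - CSDN.NET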


An illustrated guide to the distributed consensus protocol Paxos - loop in codes



An illustrated guide to the distributed consensus protocol Paxos - loop in codes

The Paxos protocol/algorithm is one of the more important protocols in distributed systems. How important is it? Understanding how the algorithm operates is essentially enough to apply it in engineering practice, and understanding that process is also comparatively much easier.

Algorithm

Phase 1
(a) A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors.
(b) If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.

Phase 2
(a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.
(b) If an acceptor receives an accept request for a proposal numbered n,
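
The two phases can be sketched as a single-process simulation. This is an illustration under stated assumptions, not the article's code: the `Acceptor` class and `propose` function are hypothetical names, messages are plain method calls, nothing is persisted, and the rule used for Phase 2(b) (accept unless a higher-numbered prepare has already been answered) follows Lamport's "Paxos Made Simple" because the excerpt above is cut off at that step.

```python
class Acceptor:
    """In-memory acceptor; a real one must persist this state."""

    def __init__(self):
        self.promised_n = -1        # highest prepare number answered so far
        self.accepted_n = -1        # number of the highest accepted proposal
        self.accepted_value = None  # value of that proposal, if any

    def on_prepare(self, n):
        """Phase 1(b): promise, and report the highest accepted proposal."""
        if n > self.promised_n:
            self.promised_n = n
            return (self.accepted_n, self.accepted_value)
        return None  # stale prepare request: no promise

    def on_accept(self, n, value):
        """Phase 2(b): accept unless a higher-numbered prepare was answered."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run both phases; return the accepted value, or None on failure."""
    majority = len(acceptors) // 2 + 1

    # Phase 1(a): send prepare(n) and collect promises.
    promised = []
    for acceptor in acceptors:
        reply = acceptor.on_prepare(n)
        if reply is not None:
            promised.append((acceptor, reply))
    if len(promised) < majority:
        return None

    # Phase 2(a): adopt the value of the highest-numbered reported proposal.
    prior_n, prior_value = max(
        (reply for _, reply in promised), key=lambda r: r[0]
    )
    if prior_n >= 0:
        value = prior_value

    # Send accept(n, value) only to the acceptors that promised.
    accepted = sum(1 for acceptor, _ in promised if acceptor.on_accept(n, value))
    return value if accepted >= majority else None
```

With three fresh acceptors, `propose(acceptors, 1, "A")` succeeds with "A"; a later `propose(acceptors, 2, "B")` is forced to re-propose "A", and a stale `propose(acceptors, 1, "C")` gets no promises.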

Read full article from An illustrated guide to the distributed consensus protocol Paxos - loop in codes


Consensus Protocols: Paxos : Paper Trail



Consensus Protocols: Paxos : Paper Trail

You can't really read two articles about distributed systems today without someone mentioning the Paxos algorithm. Google use it in Chubby, Yahoo use it, or something a bit like it, in ZooKeeper and it seems that it's considered the ne plus ultra of consensus algorithms. It also comes with a reputation as being fantastically difficult to understand: a subtle, complex algorithm that is only properly appreciated by a select few. This is kind of true and not true at the same time. Paxos is an algorithm whose entire behaviour is subtly difficult to grasp. However, the algorithm itself is fairly intuitive, and certainly relatively simple. In this article I'll describe how basic Paxos operates, with reference to previous articles on two-phase and three-phase commit. I've included a bibliography at the end, for those who want plenty more detail.

Why another consensus algorithm?

If you read my previous article on three-phase commit,

Read full article from Consensus Protocols: Paxos : Paper Trail


Why Vector Clocks Are Hard | Basho Technologies



Why Vector Clocks Are Hard | Basho Technologies

April 5, 2010

A couple of months ago, Bryan wrote about vector clocks on this blog. The title of the post was "Why Vector Clocks are Easy"; anyone who read the post would realize that he meant that they're easy for a client to use when talking to a system that implements them. For that reason, there is no reason to fear or avoid using a service that exposes the existence of vector clocks in its API. Of course, actually implementing such a system is not easy. Two of the hardest things are deciding what an actor is (i.e. where the incrementing and resolution is, and what parties get their own field in the vector) and how to keep vclocks from growing without bound over time. In Bryan's example the parties that actually proposed changes ("clients") were the actors in the vector clocks. This is the model that vector clocks are designed for and work well with, but it has a drawback. The width of the vectors will grow proportionally with the number of clients.

Read full article from Why Vector Clocks Are Hard | Basho Technologies


Why Vector Clocks are Easy | Basho Technologies



Why Vector Clocks are Easy | Basho Technologies

January 29, 2010

Vector clocks are confusing the first time you're introduced to them. It's not clear what their benefits are, nor how it is you derive said benefits. Indeed, each Riak developer has had his own set of false starts in making them behave. The truth, though, is that vector clocks are actually very simple, and a couple of quick rules will get you all the power you need to use them effectively. The simple rule is: assign each of your actors an ID, then make sure you include that ID and the last vector clock you saw for a given value whenever you store a modification. The rest of this post will explain why and how to follow that simple rule. First, I'll explain how vector clocks work with a very simple example, and then show how to use them easily in Riak.

Vector Clocks by Example

We've all had this problem: Alice, Ben, Cathy, and Dave are planning to meet next week for dinner. The planning starts with Alice suggesting they meet on Wednesday. Later,
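
The simple rule (one ID per actor, plus the last clock seen) can be sketched with a vector clock stored as a plain dictionary from actor ID to counter. This is an illustrative assumption, not Riak's implementation or API; the function names are hypothetical, and what Ben and Cathy do in the usage below is invented for the example because the excerpt stops before that part of the story.

```python
def increment(clock, actor_id):
    """Return a copy of clock with actor_id's counter advanced by one."""
    updated = dict(clock)
    updated[actor_id] = updated.get(actor_id, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def merge(a, b):
    """Entrywise maximum: the clock of a value that reconciles a and b."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in set(a) | set(b)}


def compare(a, b):
    """Classify a relative to b: equal, after, before, or concurrent."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "after"
    if b_ge_a:
        return "before"
    return "concurrent"


# Alice makes the first suggestion; Ben and Cathy (hypothetically) each
# reply after seeing only Alice's version.
alice = increment({}, "alice")
ben = increment(alice, "ben")
cathy = increment(alice, "cathy")
```

Here `compare(ben, alice)` is "after", so Ben's value simply supersedes Alice's, while `compare(ben, cathy)` is "concurrent", which signals a conflict that someone has to resolve and then store under `merge(ben, cathy)` with their own counter incremented.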

Read full article from Why Vector Clocks are Easy | Basho Technologies


java - Implementation of Vector Clocks - Stack Overflow



java - Implementation of Vector Clocks - Stack Overflow

For my code, which runs on different devices, I need to determine the ordering of messages sent between those devices. I would therefore like to use vector clocks, since I have read that vector clocks allow events to be ordered.

Is there any established framework or public API that I can use for that? Or a reference implementation? Or do I have to code it from scratch?


Read full article from java - Implementation of Vector Clocks - Stack Overflow


Alexey Ragozin: Coherence 101 - EntryProcessor traffic amplification



Alexey Ragozin: Coherence 101 - EntryProcessor traffic amplification

Oracle Coherence data grid has a powerful tool for in-place data manipulation: EntryProcessor. Using an entry processor you can get reasonable atomicity guarantees without locks or transactions (and without the drastic performance cost associated with them). One good example of an entry processor is the built-in ConditionalPut processor, which verifies a condition before overwriting a value. This, in turn, can be used to implement optimistic locking and other patterns. ConditionalPut accepts only one value, but a ConditionalPutAll processor is also available. ConditionalPutAll accepts a map of keys and values. Using it, we can update multiple cache entries with a single call to the NamedCache API. But there is one caveat. We have placed the values for all keys in a single map instance inside the entry processor object. On the other side, in a distributed cache the keys are distributed across different processes.

Read full article from Alexey Ragozin: Coherence 101 - EntryProcessor traffic amplification


TwitterServer -- TwitterServer 1.9.0 documentation



TwitterServer — TwitterServer 1.9.0 documentation

Quick-start

Maven repository (id, name, URL): twttr, twttr, http://maven.twttr.com/
Maven dependency (groupId, artifactId, version): com.twitter, twitter-server_2.10, 1.9.0

or, with sbt:

NB: You only need to add the maven.twttr.com common maven.twttr.com, which adds Metrics, requires a twitter common library.

First we'll need to import a few things into our namespace.

    import com.twitter.finagle.{Http, Service}
    import com.twitter.io.Charsets
    import com.twitter.server.TwitterServer
    import com.twitter.util.{Await, Future}
    import org.jboss.netty.buffer.ChannelBuffers.copiedBuffer
    import org.jboss.netty.handler.codec.http._

TwitterServer defines its own version of the standard main com.twitter.server.TwitterServer method (no arguments). In this example, we use Finagle to start an HTTP server on port 8888. The service bound to this port is a simple hello service.

Read full article from TwitterServer — TwitterServer 1.9.0 documentation


Pragmatic Programming Techniques: Architecture Design



Pragmatic Programming Techniques: Architecture Design

Sunday, August 17, 2014

"Lambda Architecture" (introduced by Nathan Marz) has gained a lot of traction recently. Fundamentally, it is a set of design patterns for dealing with the batch and real-time data processing workflows that fuel many organizations' business operations. Although I don't see that any novel ideas have been introduced, it is the first time these principles have been outlined in such a clear and unambiguous manner. In this post, I'd like to summarize the key principles of the Lambda architecture, focusing more on the underlying design principles and less on the choice of implementation technologies, where my preferences may differ from Nathan's. One important distinction of the Lambda architecture is that it has a clear separation between the batch processing pipeline (i.e., the Batch Layer) and the real-time processing pipeline (i.e., the Real-time Layer). Such separation provides a means to localize and isolate the complexity of handling data updates. To handle real-time query,

Read full article from Pragmatic Programming Techniques: Architecture Design


Introduction to Architecting Systems for Scale - Irrational Exuberance



Introduction to Architecting Systems for Scale - Irrational Exuberance

Few computer science or software development programs attempt to teach the building blocks of scalable systems. Instead, system architecture is usually picked up on the job by working through the pain of a growing product or by working with engineers who have already learned through that suffering process. In this post I'll attempt to document some of the scalability architecture lessons I've learned while working on systems at Yahoo! and Digg. I've attempted to maintain a color convention for diagrams in this post: green represents an external request from an external client (an HTTP request from a browser, etc), blue represents your code running in some container (a Django app running on mod_wsgi, a Python script listening to RabbitMQ, etc), and red represents a piece of infrastructure (MySQL, Redis, RabbitMQ, etc).

Load Balancing: Scalability & Redundancy

The ideal system increases capacity linearly with adding hardware. In such a system, if you have one machine and add another,

Read full article from Introduction to Architecting Systems for Scale - Irrational Exuberance


Pragmatic Programming Techniques: Scalable System Design Patterns



Pragmatic Programming Techniques: Scalable System Design Patterns

Friday, October 15, 2010

Looking back 2.5 years after my previous post on scalable system design techniques, I've observed the emergence of a set of commonly used design patterns. Here is my attempt to capture and share them.

Load Balancer

In this model, there is a dispatcher that determines which worker instance will handle the request based on different policies. The application should ideally be "stateless" so that any worker instance can handle the request. This pattern is deployed in almost every medium to large web site setup.

Scatter and Gather

In this model, the dispatcher multicasts the request to all workers in the pool. Each worker computes a local result and sends it back to the dispatcher, which consolidates the results into a single response and sends it back to the client. This pattern is used in search engines such as Yahoo and Google to handle users' keyword search requests, etc.

Result Cache

In this model,
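
The first two patterns can be sketched in a few lines. This is an illustration only: workers are modeled as plain in-process callables rather than networked services, round robin is just one possible dispatch policy, and all names (`RoundRobinBalancer`, `scatter_gather`, the sample shards) are assumptions made for the example.

```python
import itertools


class RoundRobinBalancer:
    """Dispatcher that hands each request to the next worker in turn."""

    def __init__(self, workers):
        self._cycle = itertools.cycle(workers)

    def dispatch(self, request):
        worker = next(self._cycle)
        return worker(request)


def scatter_gather(workers, request, consolidate):
    """Send the request to every worker and merge the partial results."""
    partials = [worker(request) for worker in workers]
    return consolidate(partials)


def make_echo_worker(name):
    """Stateless worker that reports which instance handled the request."""
    return lambda request: (name, request)


def make_search_worker(shard):
    """Worker that searches only its own shard of documents."""
    return lambda keyword: [doc for doc in shard if keyword in doc]


def flatten_sorted(partials):
    return sorted(doc for part in partials for doc in part)


# Hypothetical data: two interchangeable workers, three search shards.
BALANCER = RoundRobinBalancer([make_echo_worker("w0"), make_echo_worker("w1")])
SHARDS = [["red apple", "green pear"], ["apple pie"], ["plum jam"]]
SEARCHERS = [make_search_worker(shard) for shard in SHARDS]
```

Because the echo workers hold no state, it does not matter which one the balancer picks. In the scatter-gather case every worker must be asked, since each one holds only part of the data; a real dispatcher would issue those calls in parallel rather than in a loop.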

Read full article from Pragmatic Programming Techniques: Scalable System Design Patterns


14 Productivity Hacks to Improve Work Happiness | Betterment.com



14 Productivity Hacks to Improve Work Happiness | Betterment.com

This article, written by Catherine New, originally appeared on Betterment.

Automation and efficiency: We take these things seriously at Betterment, not only as core product and investing values, but also as a way to get our work done during the day and live happier lives. By being more efficient, we allow more time for the good stuff: ideating new products, creative brainstorming, lunches together as a team. We asked around the office for people's best tips for improving productivity, and here's what they said. (Note: While we use several different types of software mentioned below, this is not an endorsement.)

Inbox Zero

"I use Inbox Pause to block, batch, and deliver my email four times daily. That way, I can focus on my highest priority tasks, rather than the most recent requests.

Read full article from 14 Productivity Hacks to Improve Work Happiness | Betterment.com


Scalability for Dummies - Part 1: Clones



Scalability for Dummies - Part 1: Clones

Just recently I was asked what it would take to make a web service massively scalable. My answer was lengthy, and it may be interesting to other people as well, so I am sharing it here on my blog, split into parts to make it easier to read. New parts are released on a regular basis. Have fun, and your comments are always welcome! You can (soon) find the other parts of the series "Scalability for Dummies" here.

Part 1 - Clones

Public servers of a scalable web service are hidden behind a load balancer. This load balancer evenly distributes load (requests from your users) onto your group/cluster of application servers. That means that if, for example, user Steve interacts with your service, his first request may be served by server 2, his second by server 9, and his third perhaps by server 2 again. Steve should always get the same results back, regardless of which server he "landed on".

Read full article from Scalability for Dummies - Part 1: Clones


Numbers Everyone Should Know | Everything is Data



Numbers Everyone Should Know | Everything is Data

When you're designing a performance-sensitive computer system, it is important to have an intuition for the relative costs of different operations. How much does a network I/O cost, compared to a disk I/O, a load from DRAM, or an L2 cache hit? How much computation does it make sense to trade for a reduction in I/O? What is the relative cost of random vs. sequential I/O? For a given workload, what is the bottleneck resource? When designing a system, you rarely have enough time to completely build two alternative designs to compare their performance. This makes two skills useful:

  • Back-of-the-envelope analysis. This essentially means developing an intuition for the performance of different alternate designs, so that you can reject possible designs out-of-hand, or choose which alternatives to consider more carefully.
  • Microbenchmarking. If you can identify the bottleneck operation for a given resource,

Read full article from Numbers Everyone Should Know | Everything is Data


Design a feed system of stock information | Runhe Tian Coding Practice



Design a feed system of stock information | Runhe Tian Coding Practice

Let's assume we have some scripts which are scheduled to get the data via FTP at the end of the day. Where do we store the data? How do we store the data in such a way that we can do various analyses of it?

  • Proposal #1
    Keep the data in text files. This would be very difficult to manage and update, as well as very hard to query. Keeping unorganized text files would lead to a very inefficient data model.
  • Proposal #2
    We could use a database. This provides the following benefits:

    • Logical storage of data.
    • Facilitates an easy way of doing query processing over the data.

    Example: return all stocks having open > N AND closing price < M.
    Advantages:

    • Makes the maintenance easy once installed properly.
    • Roll back, backing up data, and security could be provided using standard database features. We don't have to "reinvent the wheel."
  • Proposal #3
    If requirements are not that broad and we just want to do a simple analysis and distribute the data, then XML could be another good option.
    Our data has fixed format and fixed size: company_name, open, high, low, closing price. The XML could look like this:
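
To make Proposals #2 and #3 concrete, here is a small sketch that stores the five fields in an in-memory SQLite database, runs the "open > N AND closing price < M" query, and exports the same rows as XML. The table name, column names, element names, and sample quotes are all assumptions made for illustration; the original article's XML layout is not reproduced here.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical end-of-day quotes: company_name, open, high, low, close.
ROWS = [
    ("ACME", 10.0, 12.5, 9.5, 11.0),
    ("GLOBEX", 50.0, 51.0, 44.0, 45.0),
    ("INITECH", 30.0, 31.0, 29.0, 30.5),
]
FIELDS = ("company_name", "open_price", "high_price", "low_price", "close_price")


def load(rows):
    """Create an in-memory database and insert the quotes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quotes (company_name TEXT, open_price REAL, "
        "high_price REAL, low_price REAL, close_price REAL)"
    )
    conn.executemany("INSERT INTO quotes VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def opened_above_closed_below(conn, n, m):
    """Return names of stocks with open > n and close < m."""
    cursor = conn.execute(
        "SELECT company_name FROM quotes "
        "WHERE open_price > ? AND close_price < ? ORDER BY company_name",
        (n, m),
    )
    return [row[0] for row in cursor]


def to_xml(rows):
    """Serialize the quotes as one <stock> element per row."""
    root = ET.Element("stocks")
    for row in rows:
        stock = ET.SubElement(root, "stock")
        for field, value in zip(FIELDS, row):
            ET.SubElement(stock, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


CONN = load(ROWS)
XML_TEXT = to_xml(ROWS)
```

The database route keeps ad hoc queries cheap to write, while the XML string is easy to hand to other consumers but has to be parsed again before it can be filtered.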

Read full article from Design a feed system of stock information | Runhe Tian Coding Practice


Most Popular Developer Posts Of 2014 | Lifehacker Australia



Most Popular Developer Posts Of 2014 | Lifehacker Australia


Read full article from Most Popular Developer Posts Of 2014 | Lifehacker Australia


Life Is Short, Use Python | (R news & tutorials)



Life Is Short, Use Python | (R news & tutorials)

November 24, 2010

Life is short, use Python

I started to play with Python two weeks ago because of R's limitations in handling large data. A friend of mine suggested I try Python since I had to do data massaging frequently: "Python is the best choice, trust me," he said. Although I was unwilling to learn yet another piece of software, I couldn't bear the low efficiency of R (or of my work) on large data. My learning curve went like this: an excellent free CSV splitter --> MySQL + the RMySQL package --> several R packages including bigmemory and ff. But to be honest, none of them satisfied me, either because of the limitations of the method (slow and prone to malfunction) or of my own computer (short of memory). After nearly two weeks I am shocked by Python's extreme power and easy-to-use design; dealing with a 10GB CSV has never been so easy.

Read full article from Life Is Short, Use Python | (R news & tutorials)


How to avoid getting into infinite loops when designing a web crawler | Runhe Tian Coding Practice



How to avoid getting into infinite loops when designing a web crawler | Runhe Tian Coding Practice

First, how does the crawler get into a loop? The answer is very simple: when we re-parse an already parsed page. This would mean that we revisit all the links found in that page, and this would continue in a circular fashion.
Be careful about what the interviewer considers the "same" page. Is it the URL or the content? One could easily get redirected to a previously crawled page.
So how do we stop visiting an already visited page? The web is a graph-based structure, and we commonly use DFS (depth first search) and BFS (breadth first search) for traversing graphs. We can mark already visited pages the same way that we would in a BFS/DFS.
We can easily prove that this algorithm terminates in every case. Each step of the algorithm parses only a new page, never an already visited one. So, if we start with N unvisited pages, every step reduces the number of unvisited pages by 1 (from N to N-1, and so on), and the algorithm stops after at most N steps.
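
The marking idea can be sketched as a BFS over an in-memory link graph. This is a sketch under assumptions: `get_links` stands in for fetching and parsing a page, pages are identified by their URL string only (so redirects and duplicate content are not handled), and the sample graph is hypothetical.

```python
from collections import deque


def crawl(start_url, get_links):
    """Visit every reachable page exactly once, in BFS order."""
    visited = {start_url}      # pages already queued or parsed
    order = []                 # pages in the order they were parsed
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in visited:   # skip pages seen before
                visited.add(link)
                queue.append(link)
    return order


# Hypothetical link graph containing cycles and a self-link.
LINKS = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "d"],
    "d": ["d"],
}
ORDER = crawl("a", lambda url: LINKS.get(url, []))
```

Each page enters the queue at most once because it is added to `visited` at the moment it is queued, which is why the loop runs at most N times for N reachable pages even though the graph has cycles.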

Read full article from How to avoid getting into infinite loops when designing a web crawler | Runhe Tian Coding Practice


Java HashMap Performance | Martin Klewitz



Java HashMap Performance | Martin Klewitz

Recently I was wondering whether there are alternatives to the Java Collections implementations and whether they perform better than the java.util package. The performance of HashMaps was of particular interest. At the company I work at we keep lots of business data entries in memory, and those entries are stored in HashMaps. In several articles I read that other implementations are faster than java.util.HashMap, so I conducted a performance check.

Alternatives (Maven groupId, artifactId, version):
  • net.sf.trove4j, trove4j, 3.0.3
  • com.google.guava, guava, 15.0
  • commons-collections, commons-collections, 3.2.

Read full article from Java HashMap Performance | Martin Klewitz


Enhance Collection Performance with this Treasure Trove - O'Reilly Media



Enhance Collection Performance with this Treasure Trove - O'Reilly Media

06/12/2002

Ever use the Collections API? How can you say no?! The Collections framework is probably one of the best API sets available to a Java programmer. Since many parts of our applications use a HashMap, ArrayList, or LinkedList at some point, enhancing the performance of these guys can do a lot for us. Eric D. Friedman wrote a high performance set of collections called Trove. Trove allows you to plug in their versions of certain containers (HashMap, LinkedList), and use them just like you did with the standard versions. There are also ways to utilize primitive collections to gain even more performance. Don't you love open source? In this article, we will discuss:

  • Using Trove classes
  • Using a Factory to allow you to switch between Trove and JDK collections
  • Trove's Primitive collections

Using Trove Classes

Trove is ridiculously simple to work with. You can simply download the code and set up your CLASSPATH to include the lib/trove.jar file. To use the Trove classes,

Read full article from Enhance Collection Performance with this Treasure Trove - O'Reilly Media


Java and Eclipse Plugin Development: UseCompressedOops flag with java



Java and Eclipse Plugin Development: UseCompressedOops flag with java

Compressed oops is supported and enabled by default in Java SE 6u23 and later. In Java SE 7, use of compressed oops is the default for 64-bit JVM processes when -Xmx isn't specified and for values of -Xmx less than 32 gigabytes. For JDK 6 before the 6u23 release, use the -XX:+UseCompressedOops flag with the java command to enable the feature.

In summary:
Java versions before 6u23: enable it with the -XX:+UseCompressedOops flag.
Java 6u23 and later: enabled by default.
JDK 7 (64-bit): enabled by default when -Xmx is not specified or is less than 32 GB.

Read full article from Java and Eclipse Plugin Development: UseCompressedOops flag with java


CompressedOops - HotSpot Internals for OpenJDK - Oracle Wiki



CompressedOops - HotSpot Internals for OpenJDK - Oracle Wiki

What's an oop?

An "oop", or "ordinary object pointer" in HotSpot parlance, is a managed pointer to an object. It is normally the same size as a native machine pointer. A managed pointer is carefully tracked by the Java application and GC subsystem, so that storage for unused objects can be reclaimed. This process can also involve relocating (copying) objects which are in use, so that storage can be compacted. The term "oop" is traditional to certain VMs that derive from Smalltalk and Self, including the following:

  • Self (a prototype-based relative of Smalltalk): https://github.com/russellallen/self/blob/master/vm/src/any/objects/oop.hh
  • Strongtalk (a Smalltalk implementation): http://code.google.com/p/strongtalk/wiki/VMTypesForSmalltalkObjects
  • V8: http://code.google.com/p/v8/source/browse/trunk/src/objects.h (mentions "smi" but not "oop")

(In some of these systems, the term "smi" refers to a special non-oop word, a pseudo-pointer, which encodes a small, 30-bit integer.

Read full article from CompressedOops - HotSpot Internals for OpenJDK - Oracle Wiki


What is -XX:+UseCompressedOops in 64 bit JVM



What is -XX:+UseCompressedOops in 64 bit JVM

Why should you use -XX:+UseCompressedOops

The -XX:+UseCompressedOops JVM option neutralizes the penalty imposed by a 64-bit JVM. By using -XX:+UseCompressedOops you get the benefit of both a 64-bit JVM, in terms of a larger Java heap, and a 32-bit JVM, in terms of compact OOPs, which results in better performance because the CPU cache is used more effectively than with larger, space-inefficient 64-bit OOPs. Since better application performance is directly tied to better CPU cache utilization, -XX:+UseCompressedOops lets you get the most out of your available CPU registers, along with the additional CPU registers provided by some platforms such as AMD x64. Some people may argue that expanding 32-bit compressed OOPs into 64-bit pointers may slow things down, but that shouldn't be a problem with modern high-end processors.

Read full article from What is -XX:+UseCompressedOops in 64 bit JVM


java - What does the UseCompressedOops JVM flag do and when should I use it? - Stack Overflow



java - What does the UseCompressedOops JVM flag do and when should I use it? - Stack Overflow

Most HotSpot JVMs released in the last year have had it on by default. This option allows references to be 32-bit in a 64-bit JVM while still addressing close to 32 GB of heap (more than plain 32-bit pointers can). (You can have near unlimited off-heap memory as well.) This can save a significant amount of memory and potentially improve performance.

If you want to use this option I suggest you update to a version which has it on by default as there may have been a good reason, such as bugs, why it wasn't enabled previously. Try Java 6 update 23 or Java 7 update 5.

In short, don't turn it on, use a version which has it on by default.
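
The "close to 32 GB" figure can be checked with a little arithmetic. The sketch below assumes that objects are aligned on 8-byte boundaries (HotSpot's usual default, which the answer above does not state), so a 32-bit compressed reference can be treated as an object index that is multiplied by 8 to recover the address. The function name is hypothetical.

```python
GIB = 2 ** 30


def addressable_heap_bytes(reference_bits=32, object_alignment_bytes=8):
    """Heap size reachable when each reference counts aligned slots."""
    return (2 ** reference_bits) * object_alignment_bytes


COMPRESSED_LIMIT_GIB = addressable_heap_bytes() // GIB
PLAIN_32_BIT_LIMIT_GIB = addressable_heap_bytes(object_alignment_bytes=1) // GIB
```

With these assumptions a compressed reference reaches 2^32 x 8 bytes = 32 GiB, compared with 4 GiB for a byte-addressed 32-bit pointer, which matches the "-Xmx less than 32 gigabytes" condition quoted earlier on this page.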


Read full article from java - What does the UseCompressedOops JVM flag do and when should I use it? - Stack Overflow


Jeff Dean facts: How a Google programmer became the Chuck Norris of the Internet.



Jeff Dean facts: How a Google programmer became the Chuck Norris of the Internet.

Jan. 23 2013 10:20 AM

The Optimizer: How Google's Jeff Dean became the Chuck Norris of the Internet. Will Oremus is Slate's senior technology writer.

"The speed of light in a vacuum used to be about 35 mph. Then Jeff Dean spent a weekend optimizing physics." (Jeff Dean Facts)

Jeff Dean facts aren't, well, true. But the fact that someone went to the trouble to make up Chuck Norris-esque exploits about Dean is remarkable. That's because Jeff Dean is a software engineer, and software engineers are not like Chuck Norris. For one thing, they're not lone rangers: software development is an inherently collaborative enterprise. For another, they rarely shoot cowboys with an Uzi. Nevertheless, on April Fool's Day 2007, some admiring young Google engineers saw fit to bestow upon Jeff Dean the honor of a website extolling his programming achievements. For instance: Compilers don't warn Jeff Dean. Jeff Dean warns compilers. Jeff Dean writes directly in binary.

Read full article from Jeff Dean facts: How a Google programmer became the Chuck Norris of the Internet.


Reducing memory consumption by 20x - Plumbr



Reducing memory consumption by 20x – Plumbr

May 30, 2013 by Nikita Salnikov-Tarnovski

This is going to be another story sharing our recent experience with memory-related problems. The case is extracted from a recent customer support case, where we faced a badly behaving application repeatedly dying with OutOfMemoryError messages in production. After running the application with Plumbr attached we were sure we were not facing a memory leak this time. But something was still terribly wrong. The symptoms were discovered by one of our experimental features monitoring the overhead on certain data structures. It gave us a signal pinpointing towards one particular location in the source code. In order to protect the privacy of the customer we have recreated the case using a synthetic sample, at the same time keeping it technically equivalent to the original problem. Feel free to download the source code. We found ourselves staring at a set of objects loaded from an external source.

Read full article from Reducing memory consumption by 20x – Plumbr


性能优化:Trove集合库 - 小毛的胡思乱想



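Performance optimization: the Trove collections library - 小毛的胡思乱想
Trove essentially re-implements the JDK collection classes for primitive types such as int. Common classes include TIntList, TIntObjectMap, TObjectIntMap and TIntSet. As you can imagine, maintaining Trove is a considerable amount of work.
Trove also provides open-addressing implementations of Map, Set and LinkedList; see the approach described in "Enhance Collection Performance with this Treasure Trove".
Trove discourages the JDK's entryXX style of iteration and uses forEach callbacks instead. The code looks cleaner, and there is a memory advantage as well, because the entryXX approach has to create a new array.
  • Custom hash strategies
In the JDK collection classes there is sometimes no way to customize the hashing strategy, for example for String. Trove lets you supply a custom hash strategy, so you can optimize for the characteristics of your data.
  • Primitive types used directly instead of wrapper types
Autoboxing, introduced in JDK 5, lets us temporarily ignore the difference between primitive types and wrapper types. Autoboxing is only syntactic sugar, though, and does not improve efficiency. Using primitive types in place of wrapper types clearly takes less memory and runs more efficiently. Trove provides an equivalent collection class for each combination of primitive types.
  • Open addressing instead of separate chaining
Most of the JDK's hash-based collection classes are implemented with separate chaining, which needs a bucket table plus a linked-list node for each element. Trove uses open addressing. Although it has to keep enough free slots (load factor below 0.5), it needs no linked-list nodes, so overall it uses less memory and is somewhat faster.
  • HashSet is no longer backed by an internal HashMap
The JDK's HashSet is implemented by wrapping an internal HashMap, so the space for the value is simply wasted. Trove's THashSet and its primitive-type HashSets do not work this way; they store elements directly in an open-addressed table.
  • Arrays of prime length
To avoid hash collisions as far as possible, Trove not only keeps the load factor small but also uses arrays whose length is a prime number. See gnu.trove.impl.PrimeFinder for details.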
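
To make the open-addressing and primitive-type points concrete, here is a minimal sketch of an int set that stores keys directly in an `int[]` and resolves collisions by linear probing. It is not Trove's implementation: the class name `IntOpenHashSet`, the minimum capacity of 4 and the growth rule are assumptions chosen for illustration, and it does not use prime-length arrays.

```java
// Hypothetical sketch, not Trove's code: an open-addressing set of primitive ints.
final class IntOpenHashSet {
    private int[] keys;       // keys are stored unboxed, directly in the table
    private boolean[] used;   // marks which slots are occupied
    private int size;

    IntOpenHashSet(int capacity) {
        int c = Math.max(4, capacity);   // assumed minimum so probing always finds a free slot
        keys = new int[c];
        used = new boolean[c];
    }

    boolean add(int key) {
        if ((size + 1) * 2 > keys.length) resize();   // keep the load factor at or below 0.5
        int i = indexFor(key, keys.length);
        while (used[i]) {
            if (keys[i] == key) return false;         // already present
            i = (i + 1) % keys.length;                // linear probing: try the next slot
        }
        keys[i] = key;
        used[i] = true;
        size++;
        return true;
    }

    boolean contains(int key) {
        int i = indexFor(key, keys.length);
        while (used[i]) {
            if (keys[i] == key) return true;
            i = (i + 1) % keys.length;
        }
        return false;                                 // hit an empty slot: key is absent
    }

    int size() { return size; }

    private static int indexFor(int key, int length) {
        return (key & 0x7fffffff) % length;           // clear the sign bit before taking the modulus
    }

    private void resize() {
        int[] oldKeys = keys;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2 + 1];
        used = new boolean[keys.length];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) add(oldKeys[i]);          // re-insert into the larger table
        }
    }
}
```

No node objects and no `Integer` boxes are allocated per element, which is where the memory saving described above comes from.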

Read full article from Performance optimization: the Trove collections library - 小毛的胡思乱想

collections - Is it possible in java make something like Comparator but for implementing custom equals() and hashCode() - Stack Overflow



collections - Is it possible in java make something like Comparator but for implementing custom equals() and hashCode() - Stack Overflow

Yes it is possible to do such a thing. But it won't allow you to put your objects into a HashMap, HashSet, etc. That's because the standard collection classes expect key objects to provide the equals and hashCode methods. (That's the way they are designed to work ...)

Alternatives:

  1. Implement a wrapper class that holds an instance of the real class, and provides its own implementation of equals and hashCode.

  2. Implement your own hashtable-based classes which can use a "hashable" object to provide equals and hashcode functionality.

  3. Bite the bullet and implement equals and hashCode overrides on the relevant classes.

In fact, the 3rd option is probably the best, because your codebase most likely needs to be using a consistent notion of what it means for these objects to be equal. There are other things that suggest that your code needs an overhaul. For instance, the fact that it is currently using an array of objects instead of a Set implementation to represent what is apparently supposed to be a set.

On the other hand, maybe there was/is some real (or imagined) performance reason for the current implementation; e.g. reduction of memory usage. In that case, you should probably write a bunch of helper methods for doing operations like concatenating 2 sets represented as arrays.
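
As an illustration of alternative 1, here is a minimal sketch of a wrapper class that supplies its own `equals` and `hashCode`. The class name `CaseInsensitiveKey` and the choice of case-insensitive string equality are assumptions made for the example; the question's actual classes are not shown in the excerpt.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical wrapper: two keys are equal when their strings match ignoring case.
final class CaseInsensitiveKey {
    private final String original;
    private final String normalized;   // single normalized form used by both equals and hashCode

    CaseInsensitiveKey(String value) {
        this.original = value;
        this.normalized = value.toLowerCase(Locale.ROOT);
    }

    String value() { return original; }

    @Override
    public boolean equals(Object other) {
        return other instanceof CaseInsensitiveKey
                && normalized.equals(((CaseInsensitiveKey) other).normalized);
    }

    @Override
    public int hashCode() {
        return normalized.hashCode();   // consistent with equals by construction
    }

    public static void main(String[] args) {
        Set<CaseInsensitiveKey> set = new HashSet<>();
        set.add(new CaseInsensitiveKey("Hello"));
        set.add(new CaseInsensitiveKey("HELLO"));   // treated as a duplicate
        System.out.println(set.size());
    }
}
```

The wrapped objects can then go into a standard HashSet or HashMap without touching the original class.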


Read full article from collections - Is it possible in java make something like Comparator but for implementing custom equals() and hashCode() - Stack Overflow


The 3 things you should know about hashCode() « EclipseSource Blog



The 3 things you should know about hashCode() « EclipseSource Blog

The 3 things you should know about hashCode() In Java, every object has a method hashCode that is simple to understand but still it’s sometimes forgotten or misused. Here are three things to keep in mind to avoid the common pitfalls. An object’s hash code allows algorithms and data structures to put objects into compartments, just like letter types in a printer’s type case. The printer puts all “A” types into the compartment for “A”, and he looks for an “A” only in this one compartment. This simple system lets him find types much faster than searching in an unsorted drawer. That’s also the idea of hash-based collections, such as HashMap and HashSet. The hashCode contract The contract is explained in the hashCode method’s JavaDoc. It can be roughly summarized with this statement: Objects that are equal must have the same hash code within a running process Please note that this does not imply the following common misconceptions: Unequal objects must have different hash codes – WRONG!
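
A small sketch of why that misconception is wrong: the strings "Aa" and "BB" are unequal, yet `String.hashCode()` gives both the same value. The class and method names below are made up for the demonstration.

```java
final class HashCodeContractDemo {
    // True when two strings are unequal but still share a hash code.
    static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112 and 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println(collides("Aa", "BB"));
    }
}
```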

Read full article from The 3 things you should know about hashCode() « EclipseSource Blog


java - How do HashTables deal with collisions? - Stack Overflow



java - How do HashTables deal with collisions? - Stack Overflow

When you talk about "Hash Table will place a new entry into the 'next available' bucket if the new Key entry collides with another.", you are describing the open addressing strategy of hash table collision resolution.


There are several strategies a hash table can use to resolve collisions.

The first major family of methods requires that the keys (or pointers to them) be stored in the table, together with the associated values. It includes:

  • Separate chaining
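
For reference, separate chaining can be sketched in a few lines: each bucket holds a list, and colliding keys simply share that list. The class `ChainedMap` below is a hypothetical illustration with a fixed bucket count, not how `java.util.Hashtable` or `HashMap` is actually written.

```java
import java.util.LinkedList;

// Hypothetical fixed-size hash map that resolves collisions by separate chaining.
final class ChainedMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedMap(int bucketCount) {
        buckets = new LinkedList[bucketCount];
        for (int i = 0; i < bucketCount; i++) buckets[i] = new LinkedList<>();
    }

    private LinkedList<Entry<K, V>> bucketFor(K key) {
        return buckets[Math.floorMod(key.hashCode(), buckets.length)];
    }

    void put(K key, V value) {
        LinkedList<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (e.key.equals(key)) { e.value = value; return; }   // overwrite an existing key
        }
        bucket.add(new Entry<>(key, value));                      // a collision just extends the chain
    }

    V get(K key) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```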

Read full article from java - How do HashTables deal with collisions? - Stack Overflow


Similar-document detection: an introduction to simHash and a Java implementation - leejun_2005's personal page - OSChina (开源中国社区)



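Similar-document detection: an introduction to simHash and a Java implementation - leejun_2005's personal page - OSChina (开源中国社区)

The signatures produced by Google's simhash algorithm meet the requirements above. Surprisingly, the algorithm is not at all obscure; the idea behind it is clean and elegant.

1. Introduction to the Simhash algorithm

The input to simhash is a vector and the output is an f-bit signature. For ease of exposition, assume the input is the set of features of a document, each feature carrying a weight. For example, the features can be the words in the document, and the weight of a word can be the number of times it occurs. The simhash algorithm is as follows:

  1. Initialize an f-dimensional vector V to 0, and an f-bit binary number S to 0.
  2. For each feature, use a conventional hash algorithm to produce an f-bit signature b. For i = 1 to f: if bit i of b is 1, add the feature's weight to element i of V; otherwise, subtract the feature's weight from element i of V.
  3. If element i of V is greater than 0, set bit i of S to 1; otherwise set it to 0.
  4. Output S as the signature.

Once the geometric meaning of the algorithm is clear, it looks intuitively reasonable. But why can the closeness of the resulting signatures measure the similarity of the original documents? This calls for a clear line of reasoning and a proof. The paper by Charikar, the inventor of simhash [2], gives neither the concrete simhash algorithm nor a proof; what follows is the proof sketch I arrived at myself. Simhash evolved from the random hyperplane hash algorithm, which is very simple: for an n-dimensional vector v, to obtain an f-bit signature (f<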
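
The four steps of the algorithm can be sketched as follows, with f fixed at 64. This is not the article's Java implementation (which is not part of the excerpt): the class name `SimHash`, the FNV-1a-style feature hash and the `Map<String, Integer>` input are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical 64-bit simhash sketch over weighted string features.
final class SimHash {
    // FNV-1a-style 64-bit hash over chars; any well-mixed f-bit hash could be substituted.
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }

    static long signature(Map<String, Integer> weightedFeatures) {
        int[] v = new int[64];                               // step 1: V starts at 0
        for (Map.Entry<String, Integer> e : weightedFeatures.entrySet()) {
            long b = hash64(e.getKey());                     // step 2: f-bit signature of the feature
            int w = e.getValue();
            for (int i = 0; i < 64; i++) {
                if (((b >>> i) & 1L) == 1L) v[i] += w;       // bit set: add the weight
                else v[i] -= w;                              // bit clear: subtract the weight
            }
        }
        long s = 0L;
        for (int i = 0; i < 64; i++) {
            if (v[i] > 0) s |= (1L << i);                    // step 3: keep the sign of each element
        }
        return s;                                            // step 4: S is the signature
    }

    // Signatures are compared by counting the bit positions where they differ.
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```

With a single feature, the signature equals that feature's own hash, which is a quick sanity check of steps 2 and 3.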

Read full article from Similar-document detection: an introduction to simHash and a Java implementation - leejun_2005's personal page - OSChina (开源中国社区)


The convergence of domains and intelligent systems: SimHash: Hash-based Similarity Detection



The convergence of domains and intelligent systems: SimHash: Hash-based Similarity Detection

"SimHash: Hash-based Similarity Detection", Sadowski, Levin, 2007.

This paper outlines a hash algorithm that can be used for similarity detection. Most hash algorithms are designed to produce few collisions, so the hash values of similar strings can differ widely. This hash sets out to achieve the opposite: more collisions, with similar strings producing hash keys that are similar if not identical.

If you cannot use term frequency and need a numeric representation of a string for statistical processing, what process can be used? Using an integer-based hash is one way to achieve this, though in my opinion it is not the most sophisticated of approaches.

An implementation of this algorithm showed that it is a reasonable approach for hashing strings with the intent to determine similarity. I found minor issues which I will address by altering the algorithm.

Read full article from The convergence of domains and intelligent systems: SimHash: Hash-based Similarity Detection


How to detect duplicate documents in billions of urls | Runhe Tian Coding Practice



How to detect duplicate documents in billions of urls | Runhe Tian Coding Practice

Based on the above two observations we can derive an algorithm which is as follows:

  1. Iterate through the pages and compute the hash value of each one.
  2. Check if the hash value is in the hash table. If it is, throw out the url as a duplicate. If it is not, then keep the url and insert it into the hash table.

This algorithm will provide us a list of unique urls. But wait, can this fit on one computer?

  • How much space does each page take up in the hash table?
    • Each page hashes to a four byte value.
    • Each url is an average of 30 characters, so that's another 30 bytes at least.
    • Each url takes up roughly 34 bytes.
  • 34 bytes * 1 billion = 31.6 gigabytes. We're going to have trouble holding that all in memory!

What do we do?

  • We could split this up into files. We'll have to deal with the file loading / unloading—ugh.
  • We could hash to disk. Size wouldn't be a problem, but access time might. A hash table on disk would require a random access read for each check and a write to store a viewed url. This could take milliseconds waiting for seek and rotational latencies. Elevator algorithms could eliminate random bouncing from track to track.
  • Or, we could split this up across machines, and deal with network latency. Let's go with this solution, and assume we have n machines.
    • First, we hash the document to get a hash value v.
    • v % n tells us which machine this document's hash table can be found on.
    • v / n is the value in the hash table that is located on its machine.
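
The routing rule in the last two bullets can be sketched as below. The class name `ShardRouter` is hypothetical, and the four-byte hash is treated as an unsigned value so that v % n is never negative; that treatment is an assumption, since the excerpt does not say how the hash is interpreted.

```java
// Hypothetical router for splitting one logical hash table across n machines.
final class ShardRouter {
    private final int machines;

    ShardRouter(int machines) {
        if (machines <= 0) throw new IllegalArgumentException("need at least one machine");
        this.machines = machines;
    }

    // v % n: which machine holds the entry for this hash value.
    int machineFor(int hash) {
        return (int) (Integer.toUnsignedLong(hash) % machines);
    }

    // v / n: the key stored in that machine's local hash table.
    long localKeyFor(int hash) {
        return Integer.toUnsignedLong(hash) / machines;
    }
}
```

The pair (machine, local key) identifies the original hash value uniquely, because v = localKey * n + machine.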

Read full article from How to detect duplicate documents in billions of urls | Runhe Tian Coding Practice


Algo Ramblings: LRUCache in java



Algo Ramblings: LRUCache in java

The simplest way to create a cache using LinkedHashMap is to extend it. The constructor takes as an argument the maximum number of entries we want a Cache object to hold. The superclass constructor has three arguments: the initial capacity of the map, the load factor, and a boolean argument that tells the LinkedHashMap constructor to keep entries in access order instead of the default insertion order.
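
A minimal sketch of that approach follows. The class name `LruCache` and the initial capacity of 16 with load factor 0.75 are assumptions; the eviction hook `removeEldestEntry` is the standard LinkedHashMap extension point for this purpose.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an LRU cache built by extending LinkedHashMap in access order.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true);   // initial capacity, load factor, accessOrder = true
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, inserting "a" and "b", reading "a", then inserting "c" evicts "b", because "a" was used more recently. Note that this class is not thread-safe on its own.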


Read full article from Algo Ramblings: LRUCache in java


Algorithm Implementation/Trees/B+ tree - Wikibooks, open books for an open world



Algorithm Implementation/Trees/B+ tree - Wikibooks, open books for an open world

In computer science, a B+ tree is a type of tree data structure . It represents sorted data in a way that allows for efficient insertion and removal of elements. It is a dynamic, multilevel index with maximum and minimum bounds on the number of keys in each node. A B+ tree is a variation on a B-tree . In a B+ tree, in contrast to a B tree, all data are saved in the leaves. Internal nodes contain only keys and tree pointers. All leaves are at the same lowest level. Leaf nodes are also linked together as a linked list to make range queries easy. The maximum number of keys in a record is called the order of the B+ tree. The minimum number of keys per record is 1/2 of the maximum number of keys. For example, if the order of a B+ tree is n, each node (except for the root) must have between n/2 and n keys. The number of keys that may be indexed using a B+ tree is a function of the order of the tree and its height. For a n-order B+ tree with a height of h:

Read full article from Algorithm Implementation/Trees/B+ tree - Wikibooks, open books for an open world


B-Tree | Set 1 (Introduction) - GeeksforGeeks



B-Tree | Set 1 (Introduction) - GeeksforGeeks

The main idea of using B-Trees is to reduce the number of disk accesses. Most of the tree operations (search, insert, delete, max, min, ..etc ) require O(h) disk accesses where h is height of the tree. B-tree is a fat tree. Height of B-Trees is kept low by putting maximum possible keys in a B-Tree node. Generally, a B-Tree node size is kept equal to the disk block size.

Read full article from B-Tree | Set 1 (Introduction) - GeeksforGeeks


Database System Technology



Database System Technology

Memory hierarchy and their access characteristics In this tutorial , we walk through an experiment with page based secondary storage data access. Storage and maintenance of relational data on secondary storage devices using heap file with pages. PL1: First Assignment: Implement a heap file to store relational data in pages on disk. star Concurrency Control & Recovery 2. Tentative Schedule Chapters refer to the course textbook. Articles refer to the online article in Section 1 above. Week Reading No tutorial this week 2 Memory Hierarchy and Data Layout Articles and introduction to practical learning PL1 3 Ch 12, 14 Query Evaluation: join 4 Ch 1, 8-14 All Articles through Disk-based Sorting Test 1 in EX 300 TAs will be available during the tutorial to answer PL1 questions in BA7230. Introduction to PL2 Answer PL2 questions All Articles through Disk-based Sorting Test 2 in EX 300 PL2 due (before lecture) 9 Ch 18 Recovery Late PL2's due (before lecture). NOTE:

Read full article from Database System Technology


algorithm - Solving T(n)=4T(n/2)+n^2 - Stack Overflow



algorithm - Solving T(n)=4T(n/2)+n^2 - Stack Overflow

T(n) = 4T(n/2) + n^2 = n^2 + 4[4T(n/4) + n^2/4] = 2n^2 + 16T(n/4) = ... = k*n^2 + 4^k T(n/2^k) = ...

The process stops when 2^k reaches n. ==> k = log2(n).

==> T(n) = O(n^2 log n).
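
The final substitution can be written out explicitly, assuming T(1) is a constant. The same bound also follows from the Master theorem with a = 4, b = 2 and f(n) = n^2, since n^(log_b a) = n^2 matches f(n).

```latex
\begin{align*}
T(n) &= k\,n^2 + 4^k\,T\!\left(\frac{n}{2^k}\right) \\
     &= n^2 \log_2 n + 4^{\log_2 n}\,T(1) && (k = \log_2 n) \\
     &= n^2 \log_2 n + n^2\,T(1) \\
     &= O(n^2 \log n)
\end{align*}
```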


Read full article from algorithm - Solving T(n)=4T(n/2)+n^2 - Stack Overflow


Does Java pass by reference or pass by value? | JavaWorld



Does Java pass by reference or pass by value? | JavaWorld

"Java manipulates objects 'by reference,' but it passes object references to methods 'by value.'" As a result, you cannot write a standard swap method to swap objects.
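
A short sketch of what this means in practice: reassigning a parameter inside a method has no effect on the caller, while mutating the object the parameter refers to does. The class and method names are made up for the example.

```java
final class PassByValueDemo {
    // Swaps only the method's local copies of the two references.
    static void swap(StringBuilder a, StringBuilder b) {
        StringBuilder tmp = a;
        a = b;
        b = tmp;
    }

    // Mutates the object that the copied reference points to; the caller sees this.
    static void mutate(StringBuilder a) {
        a.append("!");
    }

    public static void main(String[] args) {
        StringBuilder x = new StringBuilder("x");
        StringBuilder y = new StringBuilder("y");
        swap(x, y);
        System.out.println(x + " " + y);   // the caller's variables are unchanged
        mutate(x);
        System.out.println(x);             // the shared object was modified
    }
}
```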

Read full article from Does Java pass by reference or pass by value? | JavaWorld


What Are The Best-Kept Secrets Of Great Programmers? - Forbes



What Are The Best-Kept Secrets Of Great Programmers? - Forbes

Connect 1. Never reveal all that you know. OK, seriously this time. I think there are really a few things that distinguish great programmers. Know the concepts.  Solving a problem via memory or pattern recognition is much faster than solving it by reason alone.  If you've solved a similar problem before, you'll be able to recall that solution intuitively.  Failing that, if you at least keep up with current research and projects related to your own you'll have a much better idea where to turn for inspiration.  Solving a problem "automatically" might seem like magic to others, but it's really an application of "practice practice practice" as Miguel Paraz suggests. Know the tools.  This is not an end in itself, but a way to maintain "flow" while programming.  Every time you have to think about how to make your editor or version-control system or debugger do what you want, it bumps you out of your higher-level thought process.  These "micro-interruptions" are small,

Read full article from What Are The Best-Kept Secrets Of Great Programmers? - Forbes


15 Characteristics of a Good Programmer



15 Characteristics of a Good Programmer

December 16, 2014 Business leaders are often challenged to find talented, experienced programming staff, especially if salaries must fit within certain budget guidelines. The fact that most of a programmer's work is conducted in front of a screen makes the hiring process even more complicated. Over the past few months I've been hiring a bit of tech talent for my latest startup Hostt.com .  This has been a big challenge as I live in the heart of Silicon Valley and talent is hard to persuade to leave big companies with large paychecks to work for a new, hip startup. With everything I've been going through, I decided to write a post about some of the characteristics in the best programming talent that I've been able to attract to my startup. Beyond knowing the programming languages necessary to do the job, there are certain requirements that are essential in hiring the right programmer.

Read full article from 15 Characteristics of a Good Programmer


15 Linux lsof Command Examples (Identify Open Files)



15 Linux lsof Command Examples (Identify Open Files)

by Lakshmanan Ganapathy on August 29, 2012 lsof stands for List Open Files. It is easy to remember lsof command if you think of it as "ls + of", where ls stands for list, and of stands for open files. It is a command line utility which is used to list the information about the files that are opened by various processes. In unix, everything is a file, ( pipes, sockets, directories, devices, etc.). So by using lsof, you can get the information about any opened files. 1. Introduction to lsof Simply typing lsof will provide a list of all open files belonging to all active processes. # lsof COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME init 1 root cwd DIR 8,1 4096 2 / init 1 root txt REG 8,1 124704 917562 /sbin/init init 1 root 0u CHR 1,3 0t0 4369 /dev/null init 1 root 1u CHR 1,

Read full article from 15 Linux lsof Command Examples (Identify Open Files)


privileges - lsof: WARNING: can't stat() fuse.gvfsd-fuse file system - Unix & Linux Stack Exchange



privileges - lsof: WARNING: can't stat() fuse.gvfsd-fuse file system - Unix & Linux Stack Exchange

FUSE and its access rights

lsof by default checks all mounted file systems including FUSE - file systems implemented in user space which have special access rights in Linux.

As you can see in this answer on Ask Ubuntu a mounted GVFS file system (special case of FUSE) is normally accessible only to the user which mounted it (the owner of gvfsd-fuse). Even root cannot access it. To override this restriction it is possible to use mount options allow_root and allow_other. The option must be also enabled in the FUSE daemon which is described for example in this answer ...but in your case you do not need to (and should not) change the access rights.

Excluding file systems from lsof

In your case lsof does not need to check the GVFS file systems so you can exclude the stat() calls on them using the -e option (or you can just ignore the warning):

lsof -e /run/user/1000/gvfs  

Checking certain files by lsof

You are using lsof to get information about all processes running on your system and only then you filter the complete output using grep. If you want to check just certain files and the related processes use the -f option without a value directly following it then specify a list of files after the "end of options" separator --. This will be considerably faster.

lsof -e /run/user/1000/gvfs -f -- /tmp/report.csv  

General solution

To exclude all mounted file systems on which stat() fails you can run something like this (in bash):

x=(); for a in $(mount | cut -d' ' -f3); do test -e "$a" || x+=("-e$a"); done
lsof "${x[@]}" -f -- /tmp/report.csv

Read full article from privileges - lsof: WARNING: can't stat() fuse.gvfsd-fuse file system - Unix & Linux Stack Exchange


GVFS - Wikipedia, the free encyclopedia



GVFS - Wikipedia, the free encyclopedia

GVFS is the virtual filesystem for the GNOME desktop, which allows users easy access to remote data via SFTP, FTP, WebDAV, SMB, and local data via Udev integration, OBEX, MTP and others.[1]

Attached resources are exposed via a URI syntax, for example smb://server01/gamedata or ftp://username:password@ftp.example.net/public_html, but are also mounted in the traditional manner under ~/.gvfs/ or /run/user/$USERNAME/gvfs or $XDG_RUNTIME_DIR/gvfs directory[2][3] to make them available to applications using standard POSIX commands and I/O.


Read full article from GVFS - Wikipedia, the free encyclopedia


fogus: 10 Technical Papers Every Programmer Should Read (At Least Twice)



fogus: 10 Technical Papers Every Programmer Should Read (At Least Twice)

This is the second entry in a series on programmer enrichment. Inspired by a fabulous post by Michael Feathers along a similar vein, I've composed this post as a sequel to the original. That is, while I agree almost wholly with Mr. Feathers' choices, I tend to think that his choices are design-oriented and/or philosophical. In no way do I disparage that approach; instead I think that there is room for another list that is more technical in nature, but the question remains, where to go next? In this post I will offer some guidance based on my own readings. The papers chosen herein are not intended to act as a C.S. hall of fame, but instead hope to accomplish the following:

  • All papers are freely available online (i.e. not pay-walled)
  • They are technical (at times highly so)
  • They cover a wide range of topics
  • They form the basis of knowledge that every great programmer should know,

Read full article from fogus: 10 Technical Papers Every Programmer Should Read (At Least Twice)


You Too May Be A Victim Of Developaralysis | TechCrunch



You Too May Be A Victim Of Developaralysis | TechCrunch

Posted Dear developers: Do you feel insecure because you're only fluent in a mere eight programming languages used across three families of devices? Does exposure to yet another JavaScript framework make you shudder and wince? Have you postponed a pet project because you couldn't figure out which cloud platform would be best for it? You too may suffer from Developaralysis. Be afraid. There is no cure. The panoply of options available to developers today is ridiculous. We're choking on a cornucopia. Over the last few years I've been paid to write Java, Objective-C, C, C++, Python, Ruby, JavaScript, PHP (sorry) backed by various flavors of SQL/key-value/document datastores (MySQL, PostgreSQL, MongoDB, BigTable, Redis, Memcached, etc.) Do I feel good about this? Good God, no. Mostly I just feel guilty that I haven't done anything with Erlang, Clojure, Rust, Go, C#, Scala, Haskell, Julia, Scheme, Swift, or OCaml. I'm a victim of Developaralysis:

Read full article from You Too May Be A Victim Of Developaralysis | TechCrunch


Program to validate an IP address - GeeksforGeeks



Program to validate an IP address - GeeksforGeeks

Write a program to validate an IPv4 address. According to Wikipedia, IPv4 addresses are canonically represented in dot-decimal notation, which consists of four decimal numbers, each ranging from 0 to 255, separated by dots, e.g., 172.16.254.1

Following are the steps to check whether a given string is a valid IPv4 address or not:

step 1) Parse the string with "." as delimiter using the "strtok()" function, e.g. ptr = strtok(str, DELIM);
step 2)
    a) If ptr contains any character which is not a digit then return 0
    b) Convert "ptr" to a decimal number, say 'NUM'
    c) If NUM is not in the range 0-255 return 0
    d) If NUM is in the range 0-255 and ptr is non-NULL, increment "dot_counter" by 1
    e) If ptr is NULL goto step 3, else goto step 1
step 3) If dot_counter != 3 return 0, else return 1.

// Program to check if a given string is valid IPv4 address or not
#include #include #include
#define DELIM "."
/* return 1 if string contain only digits,
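
Since the C listing is cut off in this excerpt, here is a minimal Java sketch of the same checks (four dot-separated parts, digits only, each in the range 0-255). It is not a port of the original program: the class name `Ipv4Validator` is hypothetical, it uses `String.split` instead of `strtok()`, and it accepts leading zeros such as "01", which is an assumption about the intended behaviour.

```java
// Hypothetical IPv4 dot-decimal validator following the steps described above.
final class Ipv4Validator {
    static boolean isValid(String s) {
        if (s == null) return false;
        String[] parts = s.split("\\.", -1);       // limit -1 keeps empty trailing parts
        if (parts.length != 4) return false;       // exactly three dots
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) return false;
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') return false;   // digits only
            }
            if (Integer.parseInt(part) > 255) return false;
        }
        return true;
    }
}
```

Capping each part at three characters keeps `Integer.parseInt` from overflowing on very long digit strings.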

Read full article from Program to validate an IP address - GeeksforGeeks


How to Implement Forward DNS Look Up Cache? - GeeksforGeeks



How to Implement Forward DNS Look Up Cache? - GeeksforGeeks

How to Implement Forward DNS Look Up Cache? We have discussed implementation of Reverse DNS Look Up Cache . Forward DNS look up is getting IP address for a given domain name typed in the web browser. The cache should do the following operations : 1. Add a mapping from URL to IP address 2. Find IP address for a given URL. There are a few changes from reverse DNS look up cache that we need to incorporate. 1. Instead of [0-9] and (.) dot we need to take care of [A-Z], [a-z] and (.) dot. As most of the domain name contains only lowercase characters we can assume that there will be [a-z] and (.) 27 children for each trie node. 2. When we type www.google.in and google.in the browser takes us to the same page. So, we need to add a domain name into trie for the words after www(.). Similarly while searching for a domain name corresponding IP address remove the www(.) if the user has provided it. This is left as an exercise and for simplicity we have taken care of www. also.
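
The trie described here can be sketched as follows, with 27 children per node (a-z plus the dot) and a leading "www." stripped on both insert and lookup. The class name `DnsCache` and its method names are assumptions, and the article's own implementation is in C and not shown in this excerpt.

```java
import java.util.Locale;

// Hypothetical forward DNS cache: a trie keyed by domain name, storing an IP string.
final class DnsCache {
    private static final int ALPHABET = 27;   // 'a'..'z' plus '.'

    private static final class Node {
        final Node[] children = new Node[ALPHABET];
        String ip;                             // non-null only where a domain name ends
    }

    private final Node root = new Node();

    private static int indexOf(char c) {
        if (c == '.') return 26;
        if (c >= 'a' && c <= 'z') return c - 'a';
        throw new IllegalArgumentException("unsupported character: " + c);
    }

    // Lower-case the name and drop a leading "www." so both forms share one entry.
    private static String normalize(String domain) {
        String d = domain.toLowerCase(Locale.ROOT);
        return d.startsWith("www.") ? d.substring(4) : d;
    }

    void put(String domain, String ip) {
        Node node = root;
        for (char c : normalize(domain).toCharArray()) {
            int i = indexOf(c);
            if (node.children[i] == null) node.children[i] = new Node();
            node = node.children[i];
        }
        node.ip = ip;
    }

    String lookup(String domain) {
        Node node = root;
        for (char c : normalize(domain).toCharArray()) {
            node = node.children[indexOf(c)];
            if (node == null) return null;     // no such path in the trie
        }
        return node.ip;                        // null if this is only a prefix of a stored name
    }
}
```

Digits and hyphens in domain names are rejected here, matching the article's simplifying assumption of [a-z] and the dot only.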

Read full article from How to Implement Forward DNS Look Up Cache? - GeeksforGeeks


Diagonal Sum of a Binary Tree - GeeksforGeeks



Diagonal Sum of a Binary Tree - GeeksforGeeks

Consider lines of slope -1 passing between nodes (dotted lines in the diagram of the original article). The diagonal sum in a binary tree is the sum of the data of all nodes lying between these lines. Given a binary tree, print all diagonal sums. For the input tree shown in the original article, the output should be 9, 19, 42: 9 is the sum of 1, 3 and 5; 19 is the sum of 2, 6, 4 and 7; 42 is the sum of 9, 10, 11 and 12.

Algorithm: The idea is to keep track of the vertical distance from the top diagonal passing through the root. We increment the vertical distance as we go down to the next diagonal.

  1. Add the root with vertical distance 0 to the queue.
  2. Process the sum of the right child, the right child of the right child, and so on.
  3. Add the left child of the current node to the queue for later processing. The vertical distance of the left child is the vertical distance of the current node plus 1.
  4. Keep doing steps 2 and 3 until the queue is empty.
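
A sketch of the queue-based idea in Java: walk each node's chain of right children (same diagonal) while queuing every left child for the next diagonal. Instead of storing a vertical distance per node, this variant keeps one queue per diagonal, which is a small departure from the steps as written. The `Node` class and method names are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class DiagonalSum {
    static final class Node {
        final int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Returns the diagonal sums, starting with the diagonal that passes through the root.
    static List<Integer> diagonalSums(Node root) {
        List<Integer> sums = new ArrayList<>();
        if (root == null) return sums;
        ArrayDeque<Node> current = new ArrayDeque<>();
        current.add(root);
        while (!current.isEmpty()) {
            ArrayDeque<Node> next = new ArrayDeque<>();
            int sum = 0;
            for (Node start : current) {
                // Following right pointers stays on the same diagonal.
                for (Node n = start; n != null; n = n.right) {
                    sum += n.data;
                    if (n.left != null) next.add(n.left);   // left children start the next diagonal
                }
            }
            sums.add(sum);
            current = next;
        }
        return sums;
    }
}
```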

Read full article from Diagonal Sum of a Binary Tree - GeeksforGeeks


Longest Even Length Substring such that Sum of First and Second Half is same - GeeksforGeeks



Longest Even Length Substring such that Sum of First and Second Half is same - GeeksforGeeks

Given a string 'str' of digits, find the length of the longest substring of 'str' such that the length of the substring is 2k digits and the sum of the left k digits is equal to the sum of the right k digits.

Examples:

Input: str = "123123"
Output: 6
The complete string is of even length and the sum of the first and second half digits is the same.

Input: str = "1538023"
Output: 4
The longest substring with the same first and second half sum is "5380".

A simple solution is to check every substring of even length. The following is a C-based implementation of the simple approach.

// A simple C based program to find length of longest even length
// substring with same sum of digits in left and right
#include #include
int findLength(char *str) {
    int n = strlen(str);
    int maxlen = 0; // Initialize result
    // Choose starting point of every substring
    for (int i=0; i
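
Because the C listing is truncated in this excerpt, here is a Java sketch of the simple approach it describes: try every even-length substring and compare the digit sums of its two halves. The class name `EvenHalfSum` is an assumption; the method name follows the `findLength` of the original. The running time is O(n^3).

```java
final class EvenHalfSum {
    // Length of the longest even-length substring whose two halves have equal digit sums.
    static int findLength(String str) {
        int n = str.length();
        int maxLen = 0;
        for (int i = 0; i < n; i++) {                       // starting point of the substring
            for (int len = 2; i + len <= n; len += 2) {     // even lengths only
                int half = len / 2;
                int left = 0;
                int right = 0;
                for (int k = 0; k < half; k++) {
                    left += str.charAt(i + k) - '0';
                    right += str.charAt(i + half + k) - '0';
                }
                if (left == right && len > maxLen) maxLen = len;
            }
        }
        return maxLen;
    }
}
```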

Read full article from Longest Even Length Substring such that Sum of First and Second Half is same - GeeksforGeeks


Nuts & Bolts Problem (Lock & Key problem) - GeeksforGeeks



Nuts & Bolts Problem (Lock & Key problem) - GeeksforGeeks

Nuts & Bolts Problem (Lock & Key problem) Given a set of n nuts of different sizes and n bolts of different sizes. There is a one-one mapping between nuts and bolts. Match nuts and bolts efficiently. Constraint: Comparison of a nut to another nut or a bolt to another bolt is not allowed. It means nut can only be compared with bolt and bolt can only be compared with nut to see which one is bigger/smaller. Other way of asking this problem is, given a box with locks and keys where one lock can be opened by one key in the box. We need to match the pair. Brute force Way: Start with the first bolt and compare it with each nut until we find a match. In the worst case we require n comparisons. Doing this for all bolts gives us O(n^2) complexity. Quick Sort Way: We can use quick sort technique to solve this. We represent nuts and bolts in character array for understanding the logic. Nuts represented as array of character char nuts[] = {'@', '#', '$', '%', '^', '&'} char bolts[] = {'$', '%',
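
The quick sort approach can be sketched as follows: pick a bolt as pivot to partition the nuts, then use the matching nut to partition the bolts, and recurse on both sides. Nuts are only ever compared with a bolt and bolts only with a nut. The class name and the example arrays used with it are assumptions, since the article's listing is cut off in this excerpt.

```java
final class NutsAndBolts {
    // Rearranges both arrays so that nuts[i] matches bolts[i] for every i in [low, high].
    static void match(char[] nuts, char[] bolts, int low, int high) {
        if (low >= high) return;
        int pivot = partition(nuts, low, high, bolts[high]);   // partition nuts around a bolt
        partition(bolts, low, high, nuts[pivot]);              // partition bolts around the matching nut
        match(nuts, bolts, low, pivot - 1);
        match(nuts, bolts, pivot + 1, high);
    }

    // Partition using a pivot value taken from the other array; returns the pivot's final index.
    private static int partition(char[] arr, int low, int high, char pivot) {
        int i = low;
        for (int j = low; j < high; j++) {
            if (arr[j] < pivot) {
                swap(arr, i, j);
                i++;
            } else if (arr[j] == pivot) {
                swap(arr, j, high);   // park the matching element at the end
                j--;                  // re-examine the element swapped into position j
            }
        }
        swap(arr, i, high);
        return i;
    }

    private static void swap(char[] arr, int a, int b) {
        char tmp = arr[a];
        arr[a] = arr[b];
        arr[b] = tmp;
    }
}
```

As with quick sort, the expected running time is O(n log n) with a worst case of O(n^2), assuming all sizes are distinct.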

Read full article from Nuts & Bolts Problem (Lock & Key problem) - GeeksforGeeks


All the New Stuff in Android 5.0 Lollipop



All the New Stuff in Android 5.0 Lollipop

1. 2. A Brand New, Unified Design The new version of Android comes with a brand new interface, built on its new Material Design principles . The interface features brighter colors, smoother animations, and tools for developers to build apps that look the same on Android and the web. Google is giving developers tools to create apps that have a consistent appearance not just across phones and tablets, but wearables, cars, and everywhere else you can get Google. A New Notification System Android Lollipop has a new approach to notifications. In Jelly Bean, Google allowed developers to add expanded information and functionality to notifications. Now, users can access their entire notification list directly from the lock screen. You can swipe down to expand the notification panel to get more information. You can also swipe up to unlock the device. Google also introduced heads-up notifications that allow developers to add a small box that appears above full-screen apps that users can expand,

Read full article from All the New Stuff in Android 5.0 Lollipop


Android 5.0 "Lollipop" Feature Recap - The Best New Features | Droid Life



Android 5.0 “Lollipop” Feature Recap – The Best New Features | Droid Life

Android 5.0 “Lollipop” Feature Recap – The Best New Features Android 5.0 “ Lollipop ” is without a doubt one of the biggest (if not the) Android releases to date. There are 5,000 new APIs for developers to take advantage of, a brand new design language that will re-shape the way we look at and use apps going forward, and a couple of dozen new forward-facing features that you and I can take advantage of from day one. Because it has been such a massive release, we have spent the better part of the last couple of months diving through the biggest features that we wouldn’t want anyone to miss out on. From a mini tour of 5.0 to the new power of Android Beam to screen pinning to multiple account setup on phones to the way Chrome acts within the app switcher, we have tried to cover almost all of it.  Most of our coverage of each individual feature has been through videos, all of which you will find below, but some were better explained through words and pictures.

Read full article from Android 5.0 “Lollipop” Feature Recap – The Best New Features | Droid Life


Singleton Design Pattern - An Introspection w/ Best Practices | Javalobby



Singleton Design Pattern – An Introspection w/ Best Practices | Javalobby

02.13.2013 Singleton is one of the Gang of Four design patterns and is categorized under creational design patterns. In this article we are going to take a deeper look into the usage of the Singleton pattern. It is one of the simplest design patterns in terms of modelling, but it is also one of the most controversial in terms of complexity of usage. In Java, the Singleton pattern ensures that only one instance of a class is created in the Java Virtual Machine. It is used to provide a global point of access to the object. In practice, Singletons are used for logging, caches, thread pools, configuration settings and device driver objects. The pattern is often used in conjunction with the Factory design pattern. It is also used in the Service Locator JEE pattern. Structure: Static member: This contains the instance of the singleton class. Private constructor:

Read full article from Singleton Design Pattern – An Introspection w/ Best Practices | Javalobby


Why Enum Singleton are better in Java



Why Enum Singleton are better in Java

Singleton using Enum in Java
This is the way we generally declare an Enum Singleton. It may contain instance variables and instance methods, but for the sake of simplicity I haven't used any. Just beware that if you are using any instance method, then you need to ensure the thread-safety of that method if it affects the state of the object. By default, creation of the Enum instance is thread-safe, but any other method on the Enum is the programmer's responsibility.

/**
* Singleton pattern example using Java Enum
*/
public enum EasySingleton{
    INSTANCE;
}

You can access it by EasySingleton.INSTANCE, much easier than calling the getInstance() method on a Singleton.
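
As a sketch of the caveat about instance methods, the enum below carries mutable state and keeps its methods thread-safe by delegating to an `AtomicInteger`. The name `CounterSingleton` and its methods are hypothetical and not part of the original article.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical enum singleton with state; thread-safety of the methods is our own job.
enum CounterSingleton {
    INSTANCE;

    private final AtomicInteger hits = new AtomicInteger();

    // Safe to call from several threads because AtomicInteger updates atomically.
    int increment() {
        return hits.incrementAndGet();
    }

    int current() {
        return hits.get();
    }
}
```

Callers use it as `CounterSingleton.INSTANCE.increment()`.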

Read full article from Why Enum Singleton are better in Java


Different ways to write singleton in Java - Stack Overflow



Different ways to write singleton in Java - Stack Overflow

Initialization on Demand Holder (IODH) idiom which requires very little code and has zero synchronization overhead. Zero, as in even faster than volatile. IODH requires the same number of lines of code as plain old synchronization, and it's faster than DCL!

IODH utilizes lazy class initialization. The JVM won't execute a class's static initializer until you actually touch something in the class. This applies to static nested classes, too. In the following example, the JLS guarantees the JVM will not initialize instance until someone calls getInstance():

static class SingletonHolder {
    static Singleton instance = new Singleton();
}

public static Singleton getInstance() {
    return SingletonHolder.instance;
}
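
For context, a complete class built around this idiom might look like the sketch below. The private constructor, the `final` modifiers and the `INSTANCE` field name are additions for the sake of a compilable example and are not part of the quoted answer.

```java
// Sketch of a singleton using the Initialization on Demand Holder idiom.
final class Singleton {
    private Singleton() {
        // private: no other class can instantiate this one
    }

    // Not initialized until getInstance() first touches it; class initialization
    // is guaranteed by the JVM to happen once and to be thread-safe.
    private static final class SingletonHolder {
        static final Singleton INSTANCE = new Singleton();
    }

    static Singleton getInstance() {
        return SingletonHolder.INSTANCE;
    }
}
```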

Read full article from Different ways to write singleton in Java - Stack Overflow


Java Singleton Design Pattern Best Practices with Examples



Java Singleton Design Pattern Best Practices with Examples

Singleton Pattern Singleton pattern restricts the instantiation of a class and ensures that only one instance of the class exists in the java virtual machine. The singleton class must provide a global access point to get the instance of the class. Singleton pattern is used for logging , drivers objects, caching and thread pool . Singleton design pattern is also used in other design patterns like Abstract Factory , Builder , Prototype , Facade etc. Singleton design pattern is used in core java classes also, for example java.lang.Runtime Java Singleton Pattern To implement Singleton pattern, we have different approaches but all of them have following common concepts. Private constructor to restrict instantiation of the class from other classes. Private static variable of the same class that is the only instance of the class. Public static method that returns the instance of the class, this is the global access point for outer world to get the instance of the singleton class.

Read full article from Java Singleton Design Pattern Best Practices with Examples


What Your Business Can Learn about Leveraging Big Data From Netflix, Eloqua and the 2008 Election



December 19, 2014

For many organizations, big data is the engine of their success. Netflix, Eloqua and Obama's 2008 winning campaign for the presidency provide key lessons for entrepreneurs looking to harness the power of big data. Each of these businesses used big data to get closer to its customers -- and to develop a successful strategy. In show business, a sector traditionally ruled by executives making decisions based on gut instinct, Netflix brought big data to the table when making the initial decision to invest in what eventually became a huge hit, House of Cards. When pitched the show that other producers had passed on, Netflix consulted its viewership data, according to David Carr of The New York Times.

Read full article from What Your Business Can Learn about Leveraging Big Data From Netflix, Eloqua and the 2008 Election


Converting array to list in Java - Stack Overflow



In your example, it is because you can't have a List of a primitive type. In other words, List<int> is not possible. You can, however, have a List<Integer>.

Integer[] spam = new Integer[] { 1, 2, 3 };
Arrays.asList(spam);

That works as expected.
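The difference can be seen side by side in the sketch below, which assumes the array in the question was an int[]. The class name ArrayToListDemo and the helper boxedList are hypothetical; the stream-based conversion is one possible alternative (Java 8 or later) and is not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch; class and method names are hypothetical.
class ArrayToListDemo {
    // Boxes each int into an Integer and collects the result into a list.
    static List<Integer> boxedList(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int[] primitives = { 1, 2, 3 };
        Integer[] boxed = { 1, 2, 3 };

        // With an int[], the whole array becomes the single list element.
        List<int[]> wrapped = Arrays.asList(primitives);
        System.out.println("int[] via asList, size: " + wrapped.size());

        // With an Integer[], each element becomes a list element.
        List<Integer> fromBoxed = Arrays.asList(boxed);
        System.out.println("Integer[] via asList: " + fromBoxed);

        // Converting the primitive array explicitly.
        System.out.println("int[] via stream: " + boxedList(primitives));
    }
}
```

Note that the list returned by Arrays.asList is a fixed-size view backed by the array, so add and remove throw UnsupportedOperationException.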


Read full article from Converting array to list in Java - Stack Overflow


HCL announces global partnership with MongoDB to broaden solution offerings - Business Today



PTI, Bangalore. Last updated: December 18, 2014, 19:42 IST

HCL Infosystems on Thursday announced a global partnership with next-generation database MongoDB that will allow the company to broaden its solution offerings in the emerging big data segment by developing services around MongoDB, the Noida-based IT company said in a release. HCL Infosystems is a subsidiary of HCL Technologies. "We are very excited to form an alliance with MongoDB. It's a great opportunity to bolster our offerings and build on our legacy experience in the Indian market," APS Bedi, President-Enterprise Business, HCL Infotech, said.

Read full article from HCL announces global partnership with MongoDB to broaden solution offerings - Business Today


Burner phone? There's an app for that, and it's earning millions of dollars | The Verge



December 18, 2014

It turns out there is a big market for a smarter, more adaptable approach to identity on your phone.

While the initial thesis was that people would want to create and quickly dispose of these numbers, over the last two years Burner's internal data has told a very different story. "It's not a one time, one and done thing," says Cohn. Instead of drug dealers looking to hide their activity, many of their customers were lawyers, cops, teachers, and taxi drivers looking to create some separation between their personal and professional lives without the hassle or expense of having a second phone. Solving that problem has become a big business for Burner. Over the last year the startup has been among the top grossing apps in the utility category for both Android and iOS,

Read full article from Burner phone? There's an app for that, and it's earning millions of dollars | The Verge


HttpClient: Target host must not be null, or set in parameters | H3x.no - Tor Henning Uelands blog



If you have the following code failing:

HttpGet httpget = new HttpGet("www.host.com");

Then the error is pretty easy to solve:
The problem is that you have not added a protocol to the URL, so change it to:

HttpGet httpget = new HttpGet("http://www.host.com");

And then it will work as wanted.
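Why the protocol matters can be illustrated with the standard java.net.URI class alone. This sketch does not use HttpClient; it assumes that HttpClient derives the target host from the parsed URI, which is consistent with the error message but is not confirmed by the post. The class name TargetHostCheck and the method hasTargetHost are hypothetical.

```java
import java.net.URI;

// Illustrative sketch using only java.net.URI; names are hypothetical.
class TargetHostCheck {
    // True when the URL parses to a URI that carries a host component.
    static boolean hasTargetHost(String url) {
        return URI.create(url).getHost() != null;
    }

    public static void main(String[] args) {
        // Without a scheme, the text is parsed as a relative path: no host.
        System.out.println("www.host.com        -> " + hasTargetHost("www.host.com"));
        // With a scheme, the authority part is recognized as the host.
        System.out.println("http://www.host.com -> " + hasTargetHost("http://www.host.com"));
    }
}
```

A check like this could be run before building the request to fail early with a clearer message.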


Read full article from HttpClient: Target host must not be null, or set in parameters | H3x.no – Tor Henning Uelands blog


solr - Accessing a core's default handler through SolrJ using setQueryType - Stack Overflow



You can use the qt parameter in Solr to select a specific request handler, even the default /select handler. See CoreQueryParameters for details on how this works. SolrJ supports this through the setRequestHandler method on the SolrQuery class.

Read full article from solr - Accessing a core's default handler through SolrJ using setQueryType - Stack Overflow


Query by Slice, Parallel Execute, and Join: A Thread Pool Pattern in Java | Java.net



January 31, 2008

[...] sets in small chunks with forward and backward navigability. Pagination can be done with custom code or with commercial, off-the-shelf (COTS) libraries. Nevertheless, many of these frameworks first bring the full dataset to the business, presentation, or client tier and then page it into small batches. This may not be the best possible solution; for one thing, such approaches consume huge amounts of memory. This article will first show you how to effectively utilize ROWNUM to implement "true pagination": querying data in slices. Of course, you may also want to do some business processing on the fetched data. If you have millions of rows to be processed, you may want to process them in parallel to fully utilize the available processing power. In Java we use threads to do this, but with the advent of Java SE 5, we also have a means to reuse the threads created.
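The slice, parallel execute and join steps can be sketched with the java.util.concurrent thread pool. This is only an illustration under stated assumptions, not the article's code: fetchSlice is a hypothetical stand-in for a ROWNUM-bounded query and simply generates row ids in memory, the "business processing" is a plain sum, and lambdas require Java 8 rather than the Java SE 5 the article targets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch; all names are hypothetical.
class SliceAndJoin {
    // Stand-in for a slice query: returns the row ids in [from, to).
    static List<Integer> fetchSlice(int from, int to) {
        List<Integer> rows = new ArrayList<>();
        for (int i = from; i < to; i++) {
            rows.add(i);
        }
        return rows;
    }

    // Splits the rows into slices, processes them on a pool, joins the results.
    static long processAll(int totalRows, int sliceSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int start = 0; start < totalRows; start += sliceSize) {
                final int from = start;
                final int to = Math.min(start + sliceSize, totalRows);
                tasks.add(() -> {
                    long sum = 0;
                    for (int row : fetchSlice(from, to)) {
                        sum += row; // placeholder for real per-row processing
                    }
                    return sum;
                });
            }
            long total = 0;
            // invokeAll blocks until every slice is done: this is the join.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("slice task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(1000, 100, 4)); // prints 499500
    }
}
```

The fixed-size pool reuses its worker threads across slices, so the number of threads stays bounded no matter how many slices are queued.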

Read full article from Query by Slice, Parallel Execute, and Join: A Thread Pool Pattern in Java | Java.net


Data mining tops LinkedIn's list of the 'hottest skills of 2014'




Read full article from Data mining tops LinkedIn's list of the 'hottest skills of 2014'


The best workplace app of 2014 - MarketWatch




Read full article from The best workplace app of 2014 - MarketWatch


Article Page | TheStreet



By Chris Ciaccia - 12/18/14 - 9:03 AM EST

NEW YORK (TheStreet) -- Earlier this week, I took a look at some tech predictions that I think are likely to come true in 2015. Considering that the world of technology creeps into or dominates much of our lives (nearly every company these days thinks they're a tech company), there's a need for a few more predictions going into the end of the year. Here's my outlook for 2015, including a few possible major moves in the enterprise space, some news about drones (yay drones!!) and a potential IPO for the company everyone loves to hate. Without further introduction, here are the next five predictions for 2015, starting with one I'm pretty sure about.

Facebook Launches Facebook For Work

It seems like every company is going after the enterprise -- Apple teamed up with IBM, Microsoft is making a renewed push to make Office available everywhere, while Google is attacking Microsoft, pushing its Google Apps suite even harder. Enter Facebook in 2015.

Read full article from Article Page | TheStreet


A Dropbox Challenge - Solving The Subset Sum Problem Via Dynamic Programming



Lately I've slowly been trying to grok the fullness of dynamic programming. It is an algorithmic technique that the vast majority of developers never master, which is unfortunate since it can help you come up with viable solutions for seemingly intractable problems. The issue with dynamic programming (besides the totally misleading name) is that it can be very difficult to see how to apply it to a particular problem, and even when you do, it is a real pain to get it right. Anyway, I don't want to expound on this, I have something more interesting in mind.

The Dropbox Challenges

I was surfing the web the other day and in the course of my random wanderings I ended up at the Dropbox programming challenges page. Apparently, the Dropbox guys have posted up some coding challenges for people who want to apply to work there (and everyone else, I guess, since it's on the internet and all :)).

Read full article from A Dropbox Challenge - Solving The Subset Sum Problem Via Dynamic Programming


Real-time Data Mining with Spark - Steven Skelton's Blog



There are 2 new principles at the vanguard of today's technology:
1. Reactive UX. As the world's population spends an increasing portion of their lives electronically, it's becoming more and more important for businesses to capture the online audience. Web 2.0 is now over a decade old: the age of the static website is gone. UI advancements of HTML5, CSS, and a new breed of high performance JavaScript engines are bringing native app experiences to the browser.
2. Big Data analytics. Business needs have increased in complexity beyond simple BI aggregates. To separate one business from the rest, it's becoming increasingly important to find the needle in a growing haystack.

Today's web users expect a Reactive UX, just as today's business analysts expect Big Data functionality. One of today's hottest fields for R&D lies in their intersection. There are few software packages optimized for this purpose; perhaps the best originated in UC Berkeley's AMPLab, and it's called Spark.

Read full article from Real-time Data Mining with Spark - Steven Skelton's Blog

