Wilan: Google Code Jam Round 1C



Overview:
This was the last opportunity for qualifiers to advance into Round 2. Congratulations to everyone who has made it to Round 2! Bad luck to those that didn't, there's always next year!

This round featured a set of interesting problems. Problem A was the easy problem of the set, and most competitors were able to solve the small input, though there were quite a few incorrect submissions on the large case. Problem B proved rather interesting and had multiple correct approaches: one required a little mathematical manipulation to derive a short and safe implementation, while the other risked precision issues but was based on a simpler concept that needed very little mathematics. Problem C was a fairly subtle DP problem in disguise, but this did not deter the top competitors from solving all three problems in under 40 minutes.

Links:
Google Code Jam Statistics: http://www.go-hero.net/jam/
Google Code Jam Scoreboard: http://code.google.com/codejam/contest/scoreboard?c=189252
Google Code Jam Unofficial Scoreboard/Online Source Viewer (Thanks to Zibada on TC): http://zibada.ru/gcj/2009/round1a.html, http://zibada.ru/gcj/2009/round1b.html, http://zibada.ru/gcj/2009/round1c.html

Problem A: All Your Base
Category: Greedy
URL: http://code.google.com/codejam/contest/dashboard?c=189252#s=p0

Read full article from Wilan: Google Code Jam Round 1C


Casual Hacking Series: Four Google Code Jam Problems - Jecvay Notes



Minimum Scalar Product (2008 Round1A A)

Coming up with the greedy approach for this problem is not hard; a glance at the sample cases suggests it. Proving it correct takes some more thought. The key idea of the proof is to establish the simple case by contradiction, then extend it to the general case by mathematical induction.
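A minimal Java sketch of that greedy pairing (sort one vector ascending and the other descending, then sum the products); the method name and signature are illustrative:

    // Greedy: pair the smallest entries of one vector with the largest of the other.
    static long minScalarProduct(long[] a, long[] b) {
        java.util.Arrays.sort(a);               // ascending
        java.util.Arrays.sort(b);               // ascending
        long sum = 0;
        int n = a.length;
        for (int i = 0; i < n; i++) {
            sum += a[i] * b[n - 1 - i];         // smallest of a against largest of b
        }
        return sum;
    }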

Crazy Rows (2009 Round2 A)

Among all candidate rows that can be moved up to become the first row, pick the one with the lowest cost; then handle the second row, the third row, ..., the N-th row in the same way (a sketch follows below).

Preprocessing is O(N^2), and selecting plus moving is O(N^2), so the overall complexity is O(N^2).

Because there are multiple test cases, I forgot to memset my arrays and kept getting WA for a long time.
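A hedged Java sketch of this greedy for Crazy Rows, assuming last1[i] has already been precomputed as the column of the last 1 in row i (-1 if the row is all zeros); names are illustrative:

    // For each target position i, find the nearest row j >= i whose last 1 is at
    // column <= i, then bubble it up with j - i adjacent swaps.
    static int minSwaps(int[] last1) {
        int n = last1.length, swaps = 0;
        for (int i = 0; i < n; i++) {
            int j = i;
            while (last1[j] > i) j++;        // a feasible row exists for solvable inputs
            swaps += j - i;
            for (int k = j; k > i; k--) {    // simulate the adjacent swaps
                int tmp = last1[k]; last1[k] = last1[k - 1]; last1[k - 1] = tmp;
            }
        }
        return swaps;
    }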

 

Bribe the Prisoners (2009 Round 1C C)

The problem: P cells, and Q prisoners to be released.
My first thought was to solve it recursively, something like dfs(a, b): enumerate i with a < i < b, and set dfs(a, b) = dfs(a, i) + dfs(i, b) + cost[i], or something along those lines. Since this would recompute the same subproblems, memoized search can be used.

 

The book uses dynamic programming. I find converting a recursive formulation into DP quite entertaining. Some recursions are not so easy to convert, for example when the leaves of the solution tree sit at uneven depths, or when the branches of the solution tree grow in a rather messy way.

 

So I think checking whether the recursion boundary is "tidy" is a good way to judge whether a recursion can be converted to DP. Defining the DP state is the key step, since it determines how the recursion boundary is drawn.

 

The book defines the state as follows:

dp[i][j]: the minimum total number of gold coins needed to release every prisoner in the contiguous segment strictly between prisoner A[i] and prisoner A[j] (excluding both endpoints).

Clearly the "leaf nodes" are dp[i][i+1] = 0.

The goal is to compute dp[0][Q+1].

Extending one level up from the leaf nodes gives exactly what the state transition equation expresses:

dp[i][j] = A[j] - A[i] - 2 + min{dp[i][k] + dp[k][j]}, for i < k < j.
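A minimal Java sketch of this DP, assuming A[0] = 0 and A[Q+1] = P + 1 are added as sentinels around the sorted prisoner positions; method and variable names are illustrative:

    // dp[i][j]: minimum gold to free all prisoners strictly between A[i] and A[j].
    static int bribePrisoners(int P, int[] prisoners) {
        int Q = prisoners.length;
        int[] A = new int[Q + 2];
        for (int i = 0; i < Q; i++) A[i + 1] = prisoners[i];
        A[0] = 0;
        A[Q + 1] = P + 1;

        int[][] dp = new int[Q + 2][Q + 2];          // width-1 gaps stay 0: the "leaf nodes"
        for (int w = 2; w <= Q + 1; w++) {           // widen the gap j - i step by step
            for (int i = 0; i + w <= Q + 1; i++) {
                int j = i + w;
                int best = Integer.MAX_VALUE;
                for (int k = i + 1; k < j; k++) {
                    best = Math.min(best, dp[i][k] + dp[k][j]);
                }
                dp[i][j] = best + (A[j] - A[i] - 2);
            }
        }
        return dp[0][Q + 1];                         // the target state from the text
    }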

 

 

What are Birds? (APAC Semifinal 2008 C)

An interesting gambling problem,

and another example of turning a search into DP.

Not writing this one up; off to watch a movie.


Read full article from Casual Hacking Series: Four Google Code Jam Problems - Jecvay Notes


Retrieve, View or Display Wireless (WEP or WPA WiFi) Network Security Key or Password in Windows 7 « My Digital Life



How to Recover, Retrieve, Show or Display Network Security Key for WEP, WPA or WPA2 Protected Secure Wireless Network in Windows 7

  1. Go to Control Panel -> Network and Internet -> Network and Sharing Center.
  2. Click on Manage wireless networks on the left pane.

    Manage Wireless Network

  3. A list of wireless networks that the PC has previously connected to, with saved passwords or security keys, is displayed. Click to highlight the wireless network connection whose network security key you want to view, then right-click on it and select Properties.

    Wireless Network Properties

  4. Click on Security tab.
  5. The network security key for the wireless network is hidden behind asterisks by default. Instead of using utilities to reveal the original characters of the password or key hidden behind the asterisks, just click the Show characters option to display the actual network security key on screen.

    Wireless Network Security Key


Read full article from Retrieve, View or Display Wireless (WEP or WPA WiFi) Network Security Key or Password in Windows 7 « My Digital Life


POJ 2796: Feel Good - ShineCheng - 博客园



Given an array, define the value of an interval as (the sum of the interval) * (the minimum element in the interval). Find the maximum such value over all intervals, and output the left and right endpoints of any interval that achieves it.

Approach:

Maintain a monotonic stack of minimums.
For each element:
If it gets popped, it can extend to the right at most up to this point.
On pushing, if the push does not cause any element to be popped, then its own position is the leftmost position it can extend to; if it does cause pops, then the leftmost position reachable by the last element popped is the leftmost position this element can extend to.
When two elements are equal: I do not pop; I only update the element's position (in the code this shows up as updating the sum that this element extends to on the left).

When an element is popped we can compute a sum, and that sum is exactly how far its interval extends to the right. The sum it extends to on the left has already been computed, so the value of the maximal interval in which this element is the minimum is then known.
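A hedged Java sketch of the same idea, using explicit left/right boundary arrays plus a prefix-sum array rather than tracking partial sums during the pops as the article does; it returns only the best value, and all names are illustrative:

    // For each element, find the widest interval in which it is the minimum,
    // then score that interval as (interval sum) * (element value).
    static long feelGood(long[] a) {
        int n = a.length;
        long[] prefix = new long[n + 1];
        for (int i = 0; i < n; i++) prefix[i + 1] = prefix[i] + a[i];

        int[] left = new int[n], right = new int[n], stack = new int[n];
        int top = 0;
        for (int i = 0; i < n; i++) {                 // leftmost index a[i] can reach
            while (top > 0 && a[stack[top - 1]] >= a[i]) top--;
            left[i] = (top == 0) ? 0 : stack[top - 1] + 1;
            stack[top++] = i;
        }
        top = 0;
        for (int i = n - 1; i >= 0; i--) {            // rightmost index a[i] can reach
            while (top > 0 && a[stack[top - 1]] >= a[i]) top--;
            right[i] = (top == 0) ? n - 1 : stack[top - 1] - 1;
            stack[top++] = i;
        }
        long best = Long.MIN_VALUE;
        for (int i = 0; i < n; i++) {
            best = Math.max(best, a[i] * (prefix[right[i] + 1] - prefix[left[i]]));
        }
        return best;
    }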


Read full article from POJ 2796: Feel Good - ShineCheng - 博客园


Maximal Contiguous Rectangle Area Series: POJ 2082, 2559, 2796, 3494, 1964, 3250 - gaofen100 - ITeye技术网站



The function computes the area of the largest contiguous rectangle.
At its core it applies stack-style thinking.
A stack is first-in, last-out; when writing, it only keeps a top marker for the stack top: each push does top++, while the existing entries of stack[] do not change — top merely acts as a pointer.
The left[] and right[] arrays of this function play the same role as top: they act as pointers.
What left[i]/right[i] ends up holding is the leftmost/rightmost position of the run of bars at least as tall as bar i (which gives the overlapping-subproblem property needed for DP), and that position is exactly what the "pointer" refers to.

This is the key code; the exercises below build on it.
POJ PKU 2559
Apply the function above directly.

POJ PKU 3494
2-D largest rectangle of 1s: add a recurrence over consecutive rows, and note the array type must be __int64:
if (str)
    a[i][j] = a[i-1][j] + 1;

POJ PKU 1964
Same as 3494.

POJ PKU 2796
Same as 2559, except the "base" is the sum of high[left[i]] through high[right[i]];
while reading the input you can precompute a prefix-sum array: h1[i] += h1[i - 1] + h[i];

POJ PKU 3250
A stack, but thinking in reverse: later cows are seen by earlier cows.

POJ PKU 2082
Same as 2796; the statement is nasty, but it becomes clear once you draw a picture.


Read full article from Maximal Contiguous Rectangle Area Series: POJ 2082, 2559, 2796, 3494, 1964, 3250 - gaofen100 - ITeye技术网站


poj 2082 Terrible Sets(dp) | acmj1991



Problem: a row of rectangles sits flush against the x-axis, packed side by side; given each rectangle's width and height, find the largest rectangle they can form.
Approach: simple DP. The array R[i] records how many consecutive rectangles to the right of rectangle i are at least as tall as it; L[i] does the same on the left.

Read full article from poj 2082 Terrible Sets(dp) | acmj1991


POJ 2559 Monotonic Stack (单调栈): Histogram _人人IT网



The problem is at http://go.rritw.com/poj.org/problem?id=2559

A good friend explained this method to me; I wrote the code based on my own understanding, so I am sharing the algorithm and the code here in the hope of learning together with everyone.


Problem summary:

We are given a histogram in which every item has width 1 and the heights come from the input. Write a program to find the rectangle of maximum area inside this histogram.

For example:

7 2 1 4 5 1 3 3

The 7 says the histogram has 7 values, namely 2 1 4 5 1 3 3; the corresponding histogram is shown below, and the rectangle of maximum area found at the end is the one shown in the figure on the right.



Analysis:

With brute-force enumeration: suppose the item currently being enumerated is i = 0, i.e. height = 2.

We use two more variables j and k to search to the left and to the right for the first index whose height is less than height; this gives us the rectangle that uses item i as its height.

For example:

i = 0, j = -1, k = 1, and the final area is (k - j - 1) * height = 2

i = 1, j = -1, k = 7, and the area is (k - j - 1) * height = 7;

...

i = 2, j = 1, k = 4, and the area is (k - j - 1) * height = 8

Enumerating all such rectangles, we take the maximum area at the end.

However, the time complexity of this program is O(n^2).

Can we compute this area in just a single pass instead?

Observation:

When we scan the first height H1 = 2, we can record its starting position 1; since we do not yet know how far to the right it will extend, we keep scanning.

When we reach the second item H2 = 1, because it is smaller than the previous one, we know the rectangle with H1 as its height has ended, and we compute its area.

At the same time we gain a new height, H2. Where should the starting position of the rectangle with H2 as its height be? Since H1 is taller than H2, that starting position is naturally the position where H1 sits.


To simulate the above process, we introduce a monotonic stack.

First we define the data we keep for each item:

struct Node
{
    int height;
    int startPosition;
};

It describes a height together with the starting position of that height.

Then we organize these items into a stack that is monotonic in height. Let's see how it works.

To avoid having to handle an empty stack, we first push an item (0, 0) onto the bottom of the stack.

The data:

 2 1 4 5 1 3 3

Initialize like this:

(0, 0)

I = 1

When we scan (2, 1), since height 2 is greater than the top of the stack, we push it:

(0, 0), (2, 1)

I = 2:

When we scan 1, since 1 is less than the top height 2, we decide that the height on top of the stack cannot extend any further to the right, so we pop it.

At this point the scan has reached i = 2;

the area is (i - H1.startIndex) * H1.height = (2 - 1) * 2 = 2;

so we obtain a rectangle of area 2.

At the same time we notice that the current height 1 can extend back to the index where H1 was, so we push (1, 1), and the stack becomes

(0, 0), (1, 1) — since (2, 1) can no longer extend to the right, it has already been popped.


i = 3

(0, 0), (1, 1), (4, 3)

i = 4

(0, 0), (1, 1), (4, 3), (5, 4)

i = 5

Now the current height is less than the top height, so the top can no longer extend to the right; we pop it and obtain the area (i - H5.startIndex) * H5.height = (5 - 4) * 5 = 5.

After popping that element, (4, 3) is also taller than 1, so we pop it as well and obtain area 8 in the same way.

At this point our stack is

(0, 0), (1, 1)

i = 6

(0, 0), (1, 1), (3, 6)

i = 7

(0, 0), (1, 1), (3, 6)

i = 8

The last step is a bit special: we must pop every remaining element, because the heights still in the stack have survived to the very end, and we have to check the rectangles they form.

We can pretend that at position 8 the height is 0 (the minimum):

popping (3, 6) gives area (8 - 6) * 3 = 6,

popping (1, 1) gives area (8 - 1) * 1 = 7.
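A compact Java sketch of the simulation walked through above, using two parallel arrays as the stack of (height, start position) pairs and a sentinel height 0 to flush the stack at the end; names are illustrative:

    // Largest rectangle in a histogram with a monotonic stack, O(n).
    static long largestRectangle(int[] h) {
        int n = h.length;
        int[] heightSt = new int[n + 1];
        int[] startSt = new int[n + 1];
        int top = 0;                                    // stack size
        long best = 0;
        for (int i = 0; i <= n; i++) {
            int cur = (i < n) ? h[i] : 0;               // sentinel 0 at the end
            int start = i;
            while (top > 0 && heightSt[top - 1] > cur) {
                top--;                                  // this height stops extending at i
                best = Math.max(best, (long) heightSt[top] * (i - startSt[top]));
                start = startSt[top];                   // cur inherits the popped start
            }
            if (top == 0 || heightSt[top - 1] < cur) {  // equal heights are not re-pushed
                heightSt[top] = cur;
                startSt[top] = start;
                top++;
            }
        }
        return best;
    }

Running it on the sample 2 1 4 5 1 3 3 reproduces the areas 2, 5, 8, 6, 7 popped above and returns 8.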


Read full article from POJ 2559 Monotonic Stack (单调栈): Histogram _人人IT网


poj 2559 DP - 推酷



My idea: the largest rectangle must contain some unit-width bar that is the shortest bar inside it, so use two arrays to record, for histogram[i], the index of the first bar shorter than it on each side, leftLowerId[i] and rightLowerId[i]. Then the largest rectangle that uses histogram[i] as its height has area (rightLowerId[i] - leftLowerId[i] - 1) * histogram[i].

  When computing leftLowerId and rightLowerId, use DP to speed things up. Take rightLowerId as an example: scan to the right and stop once you find a bar shorter than histogram[i]; whenever you meet a bar j that is at least as tall as histogram[i], jump directly to the first bar shorter than histogram[j], namely rightLowerId[j].
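A small Java sketch of that jumping DP, assuming the heights fit in int and the answer in long; leftLowerId/rightLowerId follow the article's naming:

    // Nearest strictly-lower bar on each side, found by jumping through
    // previously computed answers instead of stepping one bar at a time.
    static long largestRectangleDp(int[] histogram) {
        int n = histogram.length;
        int[] leftLowerId = new int[n];
        int[] rightLowerId = new int[n];
        for (int i = 0; i < n; i++) {
            int j = i - 1;
            while (j >= 0 && histogram[j] >= histogram[i]) j = leftLowerId[j];
            leftLowerId[i] = j;                       // -1 means no lower bar on the left
        }
        for (int i = n - 1; i >= 0; i--) {
            int j = i + 1;
            while (j < n && histogram[j] >= histogram[i]) j = rightLowerId[j];
            rightLowerId[i] = j;                      // n means no lower bar on the right
        }
        long best = 0;
        for (int i = 0; i < n; i++) {
            best = Math.max(best,
                    (long) histogram[i] * (rightLowerId[i] - leftLowerId[i] - 1));
        }
        return best;
    }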


Read full article from poj 2559 DP - 推酷


[POJ 2559] | Eurce's Blog



Problem: the input gives n (<= 100,000) followed by n integer values h[i] (<= 10^9); each value, in order, describes a rectangle of width 1 and height h[i], with adjacent rectangles touching seamlessly. Find the maximum area of a rectangle inside this histogram whose sides are parallel to the coordinate axes.

My first idea was to binary-search the area and check with RMQ, which turned out not to work... After reading the discuss board I saw it can be squeezed through with a stack, so I wrote that up roughly.

The basic idea: for each position i, find the longest stretches to its left and to its right in which every height is at least h[i]; call their widths wl[i] and wr[i]. The largest rectangle that uses h[i] as its height then has area h[i]*(wl[i]+wr[i]-1). Computing this for all n positions and taking the maximum gives the answer.

As for the implementation, to compute wl[i] and wr[i] we maintain a stack stk that records, when the scan reaches i, the non-decreasing sequence of h values seen so far. For example, with the sample "7 2 1 4 5 1 3 3", when i = 3 and h[i] = 5 the stack holds 1, 4, 5, and we also record the positions of those values. With this stack we can find the value stk[l] that is just barely no smaller than h[i], together with its position si[l], and from that compute wl[i] and wr[i]. Since the stack is ordered we can binary search, so computing wl[i] and wr[i] costs O(lg n), and the overall complexity is O(n lg n).

A few more details: for example, always keep a maximal value at the top of the stack, so that a failed binary search returns the top position, which simplifies the handling; also, the result needs long long.


Read full article from [POJ 2559] | Eurce's Blog


POJ 2559 Largest Rectangle in a Histogram - 白~ - 博客园



Solution 1:

Approach:

  1. The basic idea is to find, for each bar, its left and right boundary (that is, how far the run of bars at least as tall as it extends on each side), but brute-forcing this for every bar would time out.
  2. So we can DP it: if the bar just to the left of its current left boundary is at least as tall as the bar itself, then that bar's left boundary becomes our bar's left boundary too. Dizzy yet? Look at the code:
    tmp = i;   // start with the left boundary set to the bar itself
    while (arr[i] <= arr[tmp-1] && tmp > 1)
        tmp = left[tmp-1];

    The right boundary is handled symmetrically.

  3. One more point: the tmp > 1 condition must not be left out; it means that once the left boundary reaches the leftmost position, that is where it stays.

Read full article from POJ 2559 Largest Rectangle in a Histogram - 白~ - 博客园


poj 3468 Binary Indexed Tree Solution (树状数组解法) - The time is passing - ITeye技术网站



I. Algorithm

    A binary indexed tree is naturally suited to maintaining array prefix sums dynamically: each update changes the value of one element, and a query can only return a prefix sum of the array. This problem, however, asks for the sum over an arbitrary interval, and it must also support adding a value to every element of an interval in one batch. What to do? In fact, the problem can still be reduced to computing prefix sums of arrays.

 

    First, consider the update operation update(s, t, d), which adds d to every element of A[s]...A[t]. Introduce an array delta[i] that represents a common increment applied to A[i]...A[n], where n is the size of the array. The update can then be expressed as:

1) set delta[s] = delta[s] + d, meaning A[s]...A[n] all increase by d; but this adds d too much to A[t+1]...A[n], so

2) set delta[t+1] = delta[t+1] - d, meaning A[t+1]...A[n] all decrease by d.

    Next, look at the query operation query(s, t), which asks for the sum of A[s]...A[t]. Reduce it to prefix sums: let sum[i] = A[1]+...+A[i], so that

                            A[s]+...+A[t] = sum[t] - sum[s-1].

How do we compute the prefix sum sum[x]? It consists of two parts: the original sum of the array, and the accumulated increments inside the range. Keep the original values of A in an array org; the contribution of delta[i] to sum[x] is delta[i]*(x+1-i), so

                            sum[x] = org[1]+...+org[x] + delta[1]*x + delta[2]*(x-1) + delta[3]*(x-2)+...+delta[x]*1
                                   = org[1]+...+org[x] + sigma(delta[i]*(x+1-i))
                                   = sigma(org[i]) + (x+1)*sigma(delta[i]) - sigma(delta[i]*i),   1 <= i <= x

This is just the prefix sums of three arrays: org[i], delta[i], and delta[i]*i. The prefix sum of org[i] never changes and can be computed once up front; the prefix sums of delta[i] and delta[i]*i keep changing and can be maintained with two binary indexed trees.

 

    The binary indexed tree solution is much faster than a plain segment tree. If the long long variables are changed to __int64 and the solution is submitted as C, it runs in 1047 ms, ranking 22nd; oddly, submitting the long long version with gcc is much slower.

 

 

II. Code

 

C code:

#include <stdio.h>

#define DEBUG

#ifdef DEBUG
#define debug(...) printf( __VA_ARGS__)
#else
#define debug(...)
#endif
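The listing above stops after its preprocessor header, so here is a minimal Java sketch of the two-tree technique derived in the algorithm section (range update, range query); class and method names are illustrative and the arrays are 1-based:

    // sum(1..x) = baseSum(x) + (x + 1) * prefix(delta, x) - prefix(deltaI, x),
    // where deltaI maintains delta[i] * i, exactly as in the derivation.
    class RangeBIT {
        final int n;
        final long[] base;      // prefix sums of the original array org[]
        final long[] delta;     // BIT over delta[i]
        final long[] deltaI;    // BIT over delta[i] * i

        RangeBIT(long[] org) {  // org is 1-based: org[1..n]
            n = org.length - 1;
            base = new long[n + 1];
            delta = new long[n + 2];
            deltaI = new long[n + 2];
            for (int i = 1; i <= n; i++) base[i] = base[i - 1] + org[i];
        }

        private void add(long[] t, int i, long v) {
            for (; i <= n; i += i & -i) t[i] += v;
        }
        private long prefix(long[] t, int i) {
            long s = 0;
            for (; i > 0; i -= i & -i) s += t[i];
            return s;
        }

        void update(int s, int t, long d) {          // add d to A[s..t]
            add(delta, s, d);       add(deltaI, s, d * s);
            add(delta, t + 1, -d);  add(deltaI, t + 1, -d * (t + 1));
        }
        private long sum(int x) {                    // A[1] + ... + A[x]
            return base[x] + (x + 1) * prefix(delta, x) - prefix(deltaI, x);
        }
        long query(int s, int t) {                   // A[s] + ... + A[t]
            return sum(t) - sum(s - 1);
        }
    }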

Basic Binary Indexed Tree (English version) - Codeforces



Advantages of BIT:

            - Use less memory than RMQ

            - Easy to code

            - Can be used in many problems about number sequence

            - Runtime: O(logN)

 

Disadvantage: BIT is hard to understand (so that's why I'm writing this :D)

 


 

Main content:

            

BIT consists of two operations in an array A[1..N] of numbers

            SET (index, value) : To add value to A[index] (or A[index] += value)

            GET (index) :  To sum up A[1]..A[index ] (or Get A[1] + … + A[index])

 

          

Details:

            We know that a natural number can be expressed as a sum of the powers of 2, for example:

        

            Example 1:

                         22 = 16 + 4 + 2
                            = 2^4 + 2^2 + 2^1

 

            Applying this idea to BIT, we are going to express the sum of A[1]..A[n] as a sum of sub arrays, each of which has 2^k elements

 


How to code:

 

S(i,j) is the sum of A[i]..A[j] (or S(i,j) = A[i] + A[i+1] + … + A[j]). So the number 22 from Example 1, expressed as a sum of powers of 2, breaks down like this:

 

            S(1,22) = S(1,16) + S(17,20) + S(21,22)

To get the positions of sub arrays, use this formula:  i - i AND (-i) + 1.

Demo.:

            22 - (22 AND (-22)) = 20

            20 - (20 AND (-20)) = 16 

            16 - (16 AND (-16)) = 0

Thus, the structure of BIT T[] will be:

             T[i]: Sum of  S(i - i AND (-i) + 1 , i )


How does GET work?

 

Notice: GET (index) sums up A[1] → A[index].

 

Idea: To get the sum of A[1].. A[index], here we start with A[index] first. Parent of A[index] can be reached by using the following formula: i – i AND (-i)


Pseudo-code:

GET (T,i)

1 s ← 0

2 while i > 0 do

3          s ← s + T[i]

4          i ← i – i and (-i)

5 return s


 

How does SET work?

 

Notice: SET (index,value) adds value units to A[index]

Idea: To increase the value of A[index] by value, we increase every sub array that CONTAINS A[index]

Example: We want to add 1 to A[9]. So we also have to add 1 to the sub arrays that contain A[9]. Look at the image: here T[10], T[12] and T[16] all contain A[9].

So how do we find which array contains A[9]?

Use this formula: i + i and (-i)! 

With this relationship, the traversal of the BIT is the opposite of the one in the GET procedure:


 Pseudo-code

 

SET (T,i,v)

1 while i ≤ size[T] do

2          T[i] ← T[i] + v

3          i ← i + i AND (-i) 
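The pseudocode above translates almost line for line into Java; a small hedged sketch, with a range-sum helper composed from GET (1-based tree array T, illustrative names):

    // Direct translation of the GET/SET pseudocode above.
    static long get(long[] T, int i) {              // A[1] + ... + A[i]
        long s = 0;
        for (; i > 0; i -= i & -i) s += T[i];
        return s;
    }

    static void set(long[] T, int i, long v) {      // A[i] += v
        for (; i < T.length; i += i & -i) T[i] += v;
    }

    static long rangeSum(long[] T, int l, int r) {  // A[l] + ... + A[r]
        return get(T, r) - get(T, l - 1);
    }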

Read full article from Basic Binary Indexed Tree (English version) - Codeforces


algorithm - BIT:Using a binary indexed tree? - Stack Overflow



I found that this answer by @templatetypedef explains the intuition behind a binary indexed tree very clearly. The answer:

Intuitively, you can think of a binary indexed tree as a compressed representation of a binary tree that is itself an optimization of a standard array representation. This answer goes into one possible derivation.

Let's suppose, for example, that you want to store cumulative frequencies for a total of 7 different elements. You could start off by writing out seven buckets into which the numbers will be distributed:

[   ] [   ] [   ] [   ] [   ] [   ] [   ]
  1     2     3     4     5     6     7

Now, let's suppose that the cumulative frequencies look something like this:

[ 5 ] [ 6 ] [14 ] [25 ] [77 ] [105] [105]
  1     2     3     4     5     6     7

Using this version of the array, you can increment the cumulative frequency of any element by increasing the value of the number stored at that spot, then incrementing the frequencies of everything that come afterwards. For example, to increase the cumulative frequency of 3 by 7, we could add 7 to each element in the array at or after position 3, as shown here:

[ 5 ] [ 6 ] [21 ] [32 ] [84 ] [112] [112]
  1     2     3     4     5     6     7

The problem with this is that it takes O(n) time to do this, which is pretty slow if n is large.

One way that we can think about improving this operation would be to change what we store in the buckets. Rather than storing the cumulative frequency up to the given point, you can instead think of just storing the amount that the current frequency has increased relative to the previous bucket. For example, in our case, we would rewrite the above buckets as follows:

Before:  [ 5 ] [ 6 ] [21 ] [32 ] [84 ] [112] [112]
           1     2     3     4     5     6     7

After:   [ +5] [ +1] [+15] [+11] [+52] [+28] [ +0]
           1     2     3     4     5     6     7

Now, we can increment the frequency within a bucket in time O(1) by just adding the appropriate amount to that bucket. However, the total cost of doing a lookup now becomes O(n), since we have to recompute the total in the bucket by summing up the values in all smaller buckets.

The first major insight we need to get from here to a binary indexed tree is the following: rather than continuously recomputing the sum of the array elements that precede a particular element, what if we were to precompute the total sum of all the elements before specific points in the sequence? If we could do that, then we could figure out the cumulative sum at a point by just summing up the right combination of these precomputed sums.

One way to do this is to change the representation from being an array of buckets to being a binary tree of nodes. Each node will be annotated with a value that represents the cumulative sum of all the nodes to the left of that given node. For example, suppose we construct the following binary tree from these nodes:

              4
            /   \
          2       6
         / \     / \
        1   3   5   7

Now, we can augment each node by storing the cumulative sum of all the values including that node and its left subtree. For example, given our values, we would store the following:

Before:  [ +5] [ +1] [+15] [+11] [+52] [+28] [ +0]
           1     2     3     4     5     6     7

After:
                 4
               [+32]
              /     \
           2           6
         [ +6]       [+80]
         /   \       /   \
        1     3     5     7
      [ +5] [+15] [+52] [ +0]

Given this tree structure, it's easy to determine the cumulative sum up to a point. The idea is the following: we maintain a counter, initially 0, then do a normal binary search until we find the node in question. As we do so, we also do the following: any time we move right, we add the current value to the counter.

For example, suppose we want to look up the sum for 3. To do so, we do the following:

  • Start at the root (4). Counter is 0.
  • Go left to node (2). Counter is 0.
  • Go right to node (3). Counter is 0 + 6 = 6.
  • Find node (3). Counter is 6 + 15 = 21.

You could imagine also running this process in reverse: starting at a given node, initialize the counter to that node's value, then walk up the tree to the root. Any time you follow a right child link upward, add in the value at the node you arrive at. For example, to find the frequency for 3, we could do the following:

  • Start at node (3). Counter is 15.
  • Go upward to node (2). Counter is 15 + 6 = 21.
  • Go upward to node (1). Counter is 21.

To increment the frequency of a node (and, implicitly, the frequencies of all nodes that come after it), we need to update the set of nodes in the tree that include that node in its left subtree. To do this, we do the following: increment the frequency for that node, then start walking up to the root of the tree. Any time you follow a link that takes you up as a left child, increment the frequency of the node you encounter by adding in the current value.

For example, to increment the frequency of node 1 by five, we would do the following:

                 4
               [+32]
              /     \
           2           6
         [ +6]       [+80]
         /   \       /   \
      > 1     3     5     7
      [ +5] [+15] [+52] [ +0]

Starting at node 1, increment its frequency by 5 to get

                 4
               [+32]
              /     \
           2           6
         [ +6]       [+80]
         /   \       /   \
      > 1     3     5     7
      [+10] [+15] [+52] [ +0]

Now, go to its parent:

                 4
               [+32]
              /     \
         > 2           6
         [ +6]       [+80]
         /   \       /   \
        1     3     5     7
      [+10] [+15] [+52] [ +0]

We followed a left child link upward, so we increment this node's frequency as well:

                 4
               [+32]
              /     \
         > 2           6
         [+11]       [+80]
         /   \       /   \
        1     3     5     7
      [+10] [+15] [+52] [ +0]

We now go to its parent:

               > 4
               [+32]
              /     \
           2           6
         [+11]       [+80]
         /   \       /   \
        1     3     5     7
      [+10] [+15] [+52] [ +0]

That was a left child link, so we increment this node as well:

                 4
               [+37]
              /     \
           2           6
         [+11]       [+80]
         /   \       /   \
        1     3     5     7
      [+10] [+15] [+52] [ +0]

And now we're done!

The final step is to convert from this to a binary indexed tree, and this is where we get to do some fun things with binary numbers. Let's rewrite each bucket index in this tree in binary:

                100
               [+37]
              /     \
          010         110
         [+11]       [+80]
         /   \       /   \
       001   011   101   111
      [+10] [+15] [+52] [ +0]

Here, we can make a very, very cool observation. Take any of these binary numbers and find the very last 1 that was set in the number, then drop that bit off, along with all the bits that come after it. You are now left with the following:

              (empty)
               [+37]
              /     \
           0           1
         [+11]       [+80]
         /   \       /   \
        00    01    10    11
      [+10] [+15] [+52] [ +0]

Here is a really, really cool observation: if you treat 0 to mean "left" and 1 to mean "right," the remaining bits on each number spell out exactly how to start at the root and then walk down to that number. For example, node 5 has binary pattern 101. The last 1 is the final bit, so we drop that to get 10. Indeed, if you start at the root, go right (1), then go left (0), you end up at node 5!

The reason that this is significant is that our lookup and update operations depend on the access path from the node back up to the root and whether we're following left or right child links. For example, during a lookup, we just care about the right links we follow. During an update, we just care about the left links we follow. This binary indexed tree does all of this super efficiently by just using the bits in the index.

The key trick is the following property of this perfect binary tree:

Given node n, the next node on the access path back up to the root in which we go right is given by taking the binary representation of n and removing the last 1.

For example, take a look at the access path for node 7, which is 111. The nodes on the access path to the root that we take that involve following a right pointer upward is

  • Node 7: 111
  • Node 6: 110
  • Node 4: 100

All of these are right links. If we take the access path for node 3, which is 011, and look at the nodes where we go right, we get

  • Node 3: 011
  • Node 2: 010
  • (Node 4: 100, which follows a left link)

This means that we can very, very efficiently compute the cumulative sum up to a node as follows:

  • Write out node n in binary.
  • Set the counter to 0.
  • Repeat the following while n ≠ 0:
    • Add in the value at node n.
    • Remove the rightmost (lowest-order) 1 bit from n.

Similarly, let's think about how we would do an update step. To do this, we would want to follow the access path back up to the root, updating all nodes where we followed a left link upward. We can do this by essentially doing the above algorithm, but switching all 1's to 0's and 0's to 1's.

The final step in the binary indexed tree is to note that because of this bitwise trickery, we don't even need to have the tree stored explicitly anymore. We can just store all the nodes in an array of length n, then use the bitwise twiddling techniques to navigate the tree implicitly. In fact, that's exactly what the binary indexed tree does - it stores the nodes in an array, then uses these bitwise tricks to efficiently simulate walking upward in this tree.
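A self-contained Java sketch of that flat-array form, seeded with the walkthrough's deltas (+5, +1, +15, +11, +52, +28, +0) so the hand-computed cumulative sum at index 3 can be checked; class and method names are illustrative:

    // Fenwick tree stored flat in an array; the bit tricks replace explicit tree walks.
    class Fenwick {
        final long[] t;                             // 1-based node array
        Fenwick(int n) { t = new long[n + 1]; }
        void update(int i, long v) {                // climb by adding the lowest set bit
            for (; i < t.length; i += i & -i) t[i] += v;
        }
        long query(int i) {                         // climb by removing the lowest set bit
            long s = 0;
            for (; i > 0; i -= i & -i) s += t[i];
            return s;
        }
        public static void main(String[] args) {
            long[] deltas = {5, 1, 15, 11, 52, 28, 0};
            Fenwick f = new Fenwick(deltas.length);
            for (int i = 0; i < deltas.length; i++) f.update(i + 1, deltas[i]);
            System.out.println(f.query(3));         // prints 21, matching the walkthrough
        }
    }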


Read full article from algorithm - BIT:Using a binary indexed tree? - Stack Overflow


Bit Indexed Tree (Fenwick Tree) | Gautam's personal blog



Bit Indexed Tree or Fenwick Tree is a complex data structure that is used to calculate the cumulative frequency of a data set.  It was invented by Peter Fenwick, hence the name Fenwick Tree.

To understand the purpose of this type of data structure, we first have to understand the problem it is trying to solve. Let us take the example of web statistics for a blog or web site. Say there are 8 pages, and in an array we track the number of visits to each page. We make two kinds of queries on this data:

1. Update or Add new frequency to a page.

add(int index, int freq)

2. Get the the cumulative sum of frequencies from 1 to a given index,

sum(int index)

First naive approach could be to just store the frequencies as shown below,

INDEX 1 2 3 4 5 6 7 8
FREQUENCY 4 2 0 5 1 3 8 1

In this approach the add query would be O(1) and sum query would be O(n).

Another naive approach could be to keep track of the cumulative sum for each add or update,

INDEX 1 2 3 4 5 6 7 8
FREQUENCY 4 2 0 5 1 3 8 1
CUM SUM 4 6 6 11 12 15 23 24

In the table above, the first row shows the page indexes (8 pages in total). The 2nd row shows the number of hits on each page (the frequency) at a particular time, and the 3rd row shows the cumulative hits (sum2 = freq1 + freq2, sum3 = sum2 + freq3, etc.).

Now if we need to change the frequency of the 6th page (index 6) from 3 to 9, we just update the 6th cell of the 2nd row to 9. But updating the 3rd row is not as easy: we have to update every cell of the 3rd row from the 6th onwards.

In this case add would be O(n) and sum would be O(1). Not good enough.

Basic Concept

Each integer can be represented as sum of powers of two.

5 = 2^2 + 2^0

In the same way, cumulative frequency can be represented as sum of sets of subfrequencies. In our case, each set contains some successive number of non-overlapping frequencies.

We can create a bit indexed tree in a one dimensional array from the original array which contains the frequency of each index.

Original ARRAY

 index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
frequency 1 0 2 1 1 3 0 4 2 5 2 2 3 1 0 2

TRANSFORM TO FENWICK TREE

 index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Fenwick tree 1 1 2 4 1 4 0 12 2 7 2 11 3 4 0 29

The table below shows the concept in detail,

First row is the index, 2nd row shows the frequency and 3rd row shows the cumulative sum.  The fourth row shows the values for each of the index of the Fenwick Tree. The fifth row explains the concept of the tree, that the element in the first index is responsible for 1 position. Element in 2nd index is responsible for two positions (1 and 2), Element in the 3rd position is responsible for 1 position (3) only. Like that the element in 4th position is responsible for 4 positions (1 to 4).

 index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
frequency 1 0 2 1 1 3 0 4 2 5 2 2 3 1 0 2
cumulative sum 1 1 3 4 5 8 8 12 14 19 21 23 26 27 27 29
Fenwick tree 1 1 2 4 1 4 0 12 2 7 2 11 3 4 0 29
responsible for 1 1..2 3 1..4 5 5..6 7 1..8 9 9..10 11 9..12 13 13..14 15 1..16

The diagram below shows the responsibility of each index in the Fenwick Tree.

Suppose we are looking for cumulative frequency of index 13 (for the first 13 elements). In binary notation, 13 is equal to 1101. Accordingly, we will calculate c[1101] = tree[1101] + tree[1100] + tree[1000].

So basically we need to find the last non-zero bit in the binary representation of the index. A very detailed explanation is provided in the original TopCoder article; it is a must-read.

Here is java code to build the Fenwick Tree from a frequency set,

    public static int[] buildFenwickTree(int[] A) {
        int[] fenwickTree = new int[A.length];
        for (int i = 1; i <= A.length; i++) {
            int idx = i;
            do {
                fenwickTree[idx - 1] += A[i - 1];   // tree stored 0-based, indices 1-based
                idx += (idx & -idx);
            } while (idx <= A.length && idx > 0);
        }
        return fenwickTree;
    }

Read cumulative frequency
If we want to read the cumulative frequency for some integer idx, we add tree[idx] to the sum, subtract the last bit of idx from itself (equivalently: remove the last set bit, i.e. change it to zero), and repeat this while idx is greater than zero.

    public static int read(int idx) {
        int sum = 0;
        while (idx > 0) {
            sum = sum + tree[idx];
            int x = idx & -idx;
            idx = idx - x;
        }
        return sum;
    }

Similarly to update a frequency following method could be used,

    public static void update(int idx, int val) {
        while (idx < tree.length) {
            tree[idx] += val;
            idx += (idx & -idx);
        }
    }

If we want the actual frequency at an index, we could keep the original array and simply return the value at that index, but that costs extra space. The actual frequency can also be computed efficiently from the Fenwick Tree itself; the detailed explanation is in the original TopCoder tutorial.


Read full article from Bit Indexed Tree (Fenwick Tree) | Gautam's personal blog


bufferedBytes!: Binary Indexed Tree- Fenwick Tree



INTRODUCTION

The Binary Indexed Tree, or the Fenwick Tree as it is alternatively known, is a very useful data structure with wide uses in day-to-day applications. BIT is very useful for solving queries of a particular type. Suppose we have an array of elements and we need to perform two types of queries frequently:
  1. Change the value of any element in the list.
  2. Get the sum of all the elements in any range within the given list
The naive solution has time complexity O(1) for query 1 and O(n) for query 2. Suppose we make m queries; the worst case (when all queries are of type 2) then has time complexity O(n * m). Binary Indexed Trees are easy to code and have worst-case time complexity O(m log n), i.e. each query of type 2 takes at most O(log n).


Now, let us treat the elements of an array as frequencies, e.g. arr[1]=5, arr[2]=6, etc. They can hold any values; the term frequency is just used as a notational convention.

Let us consider an initial array of size N (any valid size within integer range). We can easily construct another array of the same size N to hold the cumulative values, e.g.




Responsibility

The underlying working principle of the Binary Indexed Tree relies on a responsibility array: an array of size >= N in which each index holds a specific value (called the responsibility sum).
Let us declare int res[N];

Basic Idea

We know that each integer can be represented by sum of Powers of 2  e.g.
  • 9=8+1   ----------> 1001(Binary Notation)
  • 37=32+4+1 ----------> 00100101(Binary Notation)
  • 15=8+4+2+1 ----------> 1111(Binary Notation)
We can see that the number of bits in the binary notation of 15 is ceil(log2(15)) = 4, and similarly for 37 and 9:
for 37, ceil(log2(37)) = 6
for 9, ceil(log2(9)) = 4

So the number of binary digits of any number N is about log2(N), and presto, this is exactly the property that BIT makes use of.

Similarly, instead of storing the cumulative frequencies for the entire array, we can store in particular indexes the sum of some sub-frequencies (not the whole range from the start up to that index); the purpose of this will become clear in a short while.

  • Let idx be some index(not value) of the array of size N (therefore,0<=idx<=N)
  • Let r be the position (counting from the right, starting at 0) of the last 1 in the binary notation of idx (see the example below); therefore 0 <= r <= log2(N).
The responsibility range is (idx - 2^r + 1) to idx (inclusive), and the responsibility array holds the sum of the frequency array over all indexes in that range,
e.g.

Let us look at an example: take idx = 12 (1100), hence r = 2. The range of indexes of the frequency array covered by res[12] is [12 - 2^2 + 1, 12], i.e. [9, 12]; hence
res[12] = frequency[9] + frequency[10] + frequency[11] + frequency[12]





Reading Cumulative value at any position. 

 In order to read the cumulative value at any index, e.g. 13, we just keep removing the last 1 bit (i.e. 2^r) from idx while the value is > 0,

e.g.

13th index (1101) = res[13] + res[12] + res[8]
                    1101    + 1100    + 1000   (at each iteration we remove the last 1 bit)
We can also check against the table above, e.g. the cumulative value at the 6th index (0110) = res[6] + res[4] = 9 + 11 = 20,
i.e. 0110 = 6
     0100 = 4

Reading Logic


Now we need a fast way of finding r at each step. This can easily be done using
val = (num & -num); where val is the isolated lowest set bit of num (i.e. 2^r) and num is the original number.

The proof is as follows:

 Let num be the integer whose last 1 bit (the r-th bit) we want to isolate. In binary notation num can be represented as a1b, where a represents the binary digits before the last 1 and b represents the zeroes after it.

The integer -num is equal to (a1b)¯ + 1 = a¯0b¯ + 1. Since b consists of all zeroes, b¯ consists of all ones. So we have
-num = (a1b)¯ + 1
     = a¯0b¯ + 1
     = a¯0(0...0)¯ + 1
     = a¯0(1...1) + 1
     = a¯1(0...0)
     = a¯1b.
Now we can easily isolate the last 1, using the bitwise operator AND (& in C++ and Java) on num and -num:

        a1b
   &   a¯1b
   -----------------
   = (0...0)1(0...0)

Below is the code for querying the cumulative value at any index.


The number of iterations in this function is number of  bits in idx, which is at most log N.
Time complexity: O(log N).
Code length: Up to ten lines.



int read(int idx){
        int sum = 0;
        while (idx > 0){
                sum += res[idx];
                idx -= (idx & -idx);
        }
        return sum;
}
So, in order to get the sum between two positions a and b, we compute read(b) - read(a-1) (we use a-1 so that the value at position a itself is included), and to get the actual value at any index b we just compute read(b) - read(b-1).



Change frequency at some position and update tree

The idea is to update the tree frequency at every index that is responsible for the frequency whose value we are changing. When reading the cumulative frequency at some index we kept removing the last bit and continuing; when changing some frequency by val in the tree, we increment the value at the current index by val (the starting index is always the one whose frequency is changed), add the last bit to the index, and continue while the index is less than or equal to N. Function in C:



void update(int idx, int val){
        while (idx <= N){
                res[idx] += val;
                idx += (idx & -idx);
        }
}



Change frequency at some given range with the same value and update the array



void updateRange(int start, int end, int value) {
        update(start, value);
        update(end + 1, -value);
}




When incrementing the frequencies in the range [start..end], we just increment the difference at index start and decrement the difference at index end+1. This is the same as saying: increment all frequencies in the range [start..infinity] and decrement all frequencies in the range [end+1..infinity].


The function above shows how to initialise the BIT in an array called res[], where N = number of elements + 1 and arr is the initial frequency array.

Read full article from bufferedBytes!: Binary Indexed Tree- Fenwick Tree

