INTRODUCTION
The Binary Indexed Tree or the Fenwick Tree as it is alternatively known is a very useful data structure that has wide uses in various day to day applications. BIT is very useful for solving queries of a particular type, Let us suppose we have an array of elements and we need to do two types of queries frequently.
- Change the value of any element in the list.
- Get the sum of all the elements in any range within the given list
The naive solution has time complexity of O(1) for query 1 and O(n) for query 2. Suppose we make m queries. The worst case (when all queries are 2) has time complexity O(n * m). Binary Indexed Trees are easy to code and have worst time complexity O(m log n). i.e each query of the type 2 takes at most O(log n)
Now, Let us consider the elements of an array as frequencies e.g. arr[1]=5,arr[2]=6 etc, They can have any values, the term frequency is just meant to use as a notation.
Let us consider an initial array of size N(can be any valid size within Integer Range).We can easily construct another array of similar size N with values which is used to hold the cumulative values e.g.
Let us see an example here,let idx=12,(1100),hence r=2 and thus res[12] , the range of indexes of the frequency array covered by it is , [12-(2^2)+1 to 12] i.e [9-12] hence ,
res[12]=(frequency[9]+frequency[10]+frequency[11]+frequency[12])
e.g.
13th index(1101) = res[13]+res[12]+res[8 ]
1101 + 1100 +1000(here we are removing the last bit having 1 at each iteration)
We can also check from the above table e.g. cumulative value of 6th index (0110)=res[6]+res[4]=9+11=20
i.e 0110 =6
0100 =4
Reading Logic
The number of iterations in this function is number of bits in idx, which is at most log N.
Time complexity: O(log N).
Code length: Up to ten lines.
So,in order to get the sum between say two points a,b, we do read(b)-read(a-1) [we have used a-1 to include the ath value as well] and for getting the actual position at any index b just read(b)-read(b-1)
When incrementing frequencies in range [start..end], we just increment difference at index start and decrement difference at index end+1. This is also like saying: increment all frequencies in range [a..infinity] and decrement all frequencies in range [b+1..infinity].
Above function shows how to initialise the BIT in an array called res[] and N=number of elements + 1and arr is the initial frequency array
Now, Let us consider the elements of an array as frequencies e.g. arr[1]=5,arr[2]=6 etc, They can have any values, the term frequency is just meant to use as a notation.
Let us consider an initial array of size N(can be any valid size within Integer Range).We can easily construct another array of similar size N with values which is used to hold the cumulative values e.g.
Responsibility
The underlying working principle of the Binary Indexed Tree is achieved by the help of responsibility array. It's an array of size>=N size where each index holds some specific values (called the responsibility sum).
Let us declare int res[N];
Basic Idea
We know that each integer can be represented by sum of Powers of 2 e.g.
- 9=8+1 ----------> 1001(Binary Notation)
- 37=32+4+1 ----------> 00100101(Binary Notation)
- 15=8+4+2+1 ----------> 1111(Binary Notation)
We can see that the maximum number of 1 in the binary notation of 15 is ceil(log215)=4 ,similarly for 37 and 9.
i.e for 37 ceil(log237)=6
for 9 ceil(log29)=4
So, number of digits of any number in 2's power form would be log2N Presto, this is what exactly the property being made use in BIT.
Similarly, instead of storing the cumulative frequencies for the entire array we can store the (sum of some sub frequencies(Not the entire values from 0 to that index)) in particular indexes, the purpose of which will be a bit clear in a short while.
e.g.
i.e for 37 ceil(log237)=6
for 9 ceil(log29)=4
So, number of digits of any number in 2's power form would be log2N Presto, this is what exactly the property being made use in BIT.
Similarly, instead of storing the cumulative frequencies for the entire array we can store the (sum of some sub frequencies(Not the entire values from 0 to that index)) in particular indexes, the purpose of which will be a bit clear in a short while.
- Let idx be some index(not value) of the array of size N (therefore,0<=idx<=N)
- Let r be the position of the last occurrence of 1 in the binary notation of idx from left to right.(See the example below) (therefore, 1<=r<=log2N (
e.g.
Let us see an example here,let idx=12,(1100),hence r=2 and thus res[12] , the range of indexes of the frequency array covered by it is , [12-(2^2)+1 to 12] i.e [9-12] hence ,
res[12]=(frequency[9]+frequency[10]+frequency[11]+frequency[12])
Reading Cumulative value at any position.
In order to read the cumulative value at any index e.g 13 we just need to remove the last one bit(i.e r) each time from the idx till the value is >0e.g.
13th index(1101) = res[13]+res[12]+res[8 ]
1101 + 1100 +1000(here we are removing the last bit having 1 at each iteration)
We can also check from the above table e.g. cumulative value of 6th index (0110)=res[6]+res[4]=9+11=20
i.e 0110 =6
0100 =4
Reading Logic
Now we need to find a fast way of finding the r at each step. This can be easily done using
Val=(num & -num); where val is the value obtained after making rth bit 0 and num is the initial number
Val=(num & -num); where val is the value obtained after making rth bit 0 and num is the initial number
The proof is as follows:
Let num be the integer whose last digit(means r) we want to isolate. In binary notation num can be represented as a1b, where a represents binary digits before the last digit and b represents zeroes after the last digit.
Integer num is equal to (a1b)¯ + 1 = a¯0b¯ + 1. b consists of all zeroes, so b¯ consists of all ones. Finally we have
-num = (a1b)¯ + 1
= a¯0b¯ + 1
= a¯0(0...0)¯ + 1
= a¯0(1...1) + 1
= a¯1(0...0)
= a¯1b.
= a¯0b¯ + 1
= a¯0(0...0)¯ + 1
= a¯0(1...1) + 1
= a¯1(0...0)
= a¯1b.
Now, we can easily isolate the last digit, using bitwise operator AND (in C++, Java it is &) with num and -num:
a1b
& a¯1b
--------------------
= (0...0)1(0...0)
& a¯1b
--------------------
= (0...0)1(0...0)
Below is the code for querying the cumulative value at any index.
The number of iterations in this function is number of bits in idx, which is at most log N.
Time complexity: O(log N).
Code length: Up to ten lines.
So,in order to get the sum between say two points a,b, we do read(b)-read(a-1) [we have used a-1 to include the ath value as well] and for getting the actual position at any index b just read(b)-read(b-1)
Change frequency at some position and update tree
The concept is to update tree frequency at all indexes which are responsible for frequency whose value we are changing. In reading cumulative frequency at some index, we were removing the last bit and going on. In changing some frequency val in tree, we should increment value at the current index (the starting index is always the one whose frequency is changed) for val, add the last digit to index and go on while the index is less than or equal to N. Function in C.
Change frequency at some given range with the same value and update the array
When incrementing frequencies in range [start..end], we just increment difference at index start and decrement difference at index end+1. This is also like saying: increment all frequencies in range [a..infinity] and decrement all frequencies in range [b+1..infinity].
Above function shows how to initialise the BIT in an array called res[] and N=number of elements + 1and arr is the initial frequency array
Read full article from bufferedBytes!: Binary Indexed Tree- Fenwick Tree
No comments:
Post a Comment