Put simply you give a hash function an item of data x and it returns a number h(x). You can use this number for all sorts of things but in general a hash function has a number of desirable properties. The first is that h(x), the hash value should be smaller in the sense it takes less storage than the original data x. The second is that if you have a a set of data items then h(x) should spread the data items out over the range of possible hash values i.e. the relationship between x and h(x) should not be too regular.
Typically hash functions are used for storage, store x in Data[h(x)], or for security, if h(x) changes then x has been changed.
If you are using a hash function for security then you need a higher grade of hash function - a cryptographic hash. The hash function described in the example can be extended to a cryptographic hash but in its current form it is suitable for use as a storage hash.
Notice that as a hash function "condenses" down an item of data to a smaller number of possible hash values it is not unique. That is, for any hash function there will be usually quite a lot of other data values, y say, for which h(y)=h(x). This is often called a collision and it is perfectly natural for a hash function and all hashing algorithms have to deal with the situation in one way or another.
Read full article from Universal Hashing
No comments:
Post a Comment