Standard Deviation and Variance



Standard Deviation
The Standard Deviation is a measure of how spread out numbers are.
Its symbol is σ (the Greek letter sigma).
The formula is easy: it is the square root of the Variance. So now you ask, "What is the Variance?"
Variance
The Variance is the average of the squared differences from the Mean. To calculate it:
  • Work out the Mean (the simple average of the numbers).
  • Then for each number: subtract the Mean and square the result (the squared difference).
  • Then work out the average of those squared differences.
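    Example: the heights of 5 dogs, measured in mm. After working out the Mean height, we calculate each dog's difference from the Mean.
    To calculate the Variance, take each difference, square it, and then average the result. The squared differences add up to 108,520, so:
    Variance: σ² = 108,520 / 5 = 21,704
    So, the Variance is 21,704.
    And the Standard Deviation is just the square root of the Variance, so:
    Standard Deviation: σ = √21,704 = 147.32... ≈ 147 (to the nearest mm)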
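The calculation above can be sketched in a few lines of Python. The excerpt does not list the five heights, so the values in `heights_mm` are an assumption, chosen so that the results match the 21,704 and 147 mm quoted in the text. The function name `population_variance` is also just illustrative.

```python
import math

# ASSUMPTION: the five heights (in mm) are not given in the text; these
# values are illustrative and reproduce the quoted Variance of 21,704.
heights_mm = [600, 470, 170, 430, 300]


def population_variance(values):
    """Average of the squared differences from the Mean (divide by N)."""
    mean = sum(values) / len(values)
    squared_diffs = [(v - mean) ** 2 for v in values]
    return sum(squared_diffs) / len(values)


variance = population_variance(heights_mm)
std_dev = math.sqrt(variance)

print(variance)        # 21704.0
print(round(std_dev))  # 147
```

Each line of the function corresponds to one step: work out the Mean, square each difference from it, then average those squares.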
    The good thing about the Standard Deviation is that it is useful: it lets us see which heights are within one Standard Deviation (147 mm) of the Mean, and which are not.
    So, using the Standard Deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small.
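As a rough illustration of that idea, the sketch below labels each height as "normal" when it lies within one Standard Deviation of the Mean, and as "extra large" or "extra small" otherwise. The heights are assumed values (the excerpt does not list them), and the `classify` helper and its labels are hypothetical names used only for this example. `statistics.mean` and `statistics.pstdev` come from the Python standard library.

```python
import statistics

# ASSUMPTION: illustrative heights in mm, not taken from the text.
heights_mm = [600, 470, 170, 430, 300]

mean = statistics.mean(heights_mm)   # the Mean height
sd = statistics.pstdev(heights_mm)   # population Standard Deviation (divide by N)


def classify(height, mean, sd):
    """Label a height relative to the band [mean - sd, mean + sd]."""
    if height > mean + sd:
        return "extra large"
    if height < mean - sd:
        return "extra small"
    return "normal"


labels = {h: classify(h, mean, sd) for h in heights_mm}
for height, label in labels.items():
    print(f"{height} mm: {label}")
```

With these assumed heights, only the tallest and the shortest dog fall outside the band.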
    So far the 5 dogs have been treated as the whole Population. But if the data is a Sample (a selection taken from a bigger Population), then the calculation changes!
    When you have "N" data values that are:
    • The Population: divide by N when calculating Variance (like we did)
    • A Sample: divide by N-1 when calculating Variance
    All other calculations stay the same, including how we calculated the mean.
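Written as formulas, the two cases might look like this. The symbols μ (Population Mean), x̄ (Sample Mean) and s (Sample Standard Deviation) are conventional choices assumed here; the text itself only introduces σ.

```latex
% Population: divide by N
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Sample: divide by N - 1
s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2
```

In both cases the Mean is computed the same way (the sum of the values divided by N), and the Standard Deviation is the square root of the Variance.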
    Example: if our 5 dogs were just a sample of a bigger population of dogs, we would divide by 4 instead of 5 like this:
    Sample Variance = 108,520 / 4 = 27,130
    Sample Standard Deviation = √27,130 = 164.71... ≈ 165 (to the nearest mm)
    Think of it as a "correction" when your data is only a sample.
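A minimal sketch of the two divisors side by side, again using assumed heights (the excerpt does not list them) that reproduce the sum of squared differences of 108,520. The `variance` helper and its `sample` flag are hypothetical names for this example. The last two lines cross-check against `statistics.pvariance` and `statistics.variance` from the Python standard library, which divide by N and N-1 respectively.

```python
import math
import statistics

# ASSUMPTION: illustrative heights in mm, not taken from the text.
heights_mm = [600, 470, 170, 430, 300]


def variance(values, sample=False):
    """Variance with divisor N (Population) or N-1 (Sample)."""
    n = len(values)
    mean = sum(values) / n  # the Mean is the same in both cases
    sum_squared_diffs = sum((v - mean) ** 2 for v in values)
    return sum_squared_diffs / (n - 1 if sample else n)


pop_var = variance(heights_mm)                 # 108,520 / 5
samp_var = variance(heights_mm, sample=True)   # 108,520 / 4

print(pop_var, round(math.sqrt(pop_var)))      # 21704.0 147
print(samp_var, round(math.sqrt(samp_var)))    # 27130.0 165

# Cross-check against the standard library.
assert math.isclose(pop_var, statistics.pvariance(heights_mm))
assert math.isclose(samp_var, statistics.variance(heights_mm))
```

Dividing by the smaller number N-1 always gives a slightly larger result, which is the "correction" in action.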
    And notice: the Standard Deviation is bigger when the differences are more spread out ... just what we want!
