Skewness & Kurtosis In R


 

Hi there. This post is on the statistics topics of skewness and kurtosis. For simplicity, math formulas for sample skewness and sample kurtosis are not presented.

 

Skewness

 

Skewness is a measure of asymmetry and there are three types.

 

Zero skewness is what a symmetric distribution has. For a symmetric distribution with a single peak, the mean, median and mode are all equal. The best-known distribution that is symmetric about its mean is the normal distribution.

A skewness above zero implies a positive or right skew. This is typically when the mode is less than the median, which is less than the mean.

 

Mode < Median < Mean

 

A skewness below zero implies a negative or left skew of the frequency distribution. Typically, the mean is less than the median, which is less than the mode.

 

Mean < Median < Mode

 

Notice how the median sits in the middle of each of the inequalities above, hence the word median. Because of this, a quick check is to compare just the mean and the mode.
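As a quick illustration of the two inequalities, here is a minimal sketch on simulated data. The exponential sample, the seed and the variable names are assumptions made for this example and are not part of the grades data used later in the post.

```r
# Simulated right-skewed sample: exponential draws pile up near zero
# and leave a long tail to the right.
set.seed(42)  # arbitrary seed, only for reproducibility
right_skewed <- rexp(10000, rate = 1)

# Mirroring the values around zero flips the tail to the left.
left_skewed <- -right_skewed

# Right skew: the long right tail pulls the mean above the median.
mean(right_skewed) > median(right_skewed)

# Left skew: the long left tail pulls the mean below the median.
mean(left_skewed) < median(left_skewed)
```

Both comparisons should come back TRUE for a sample this large. The mode is left out because base R has no built-in function for it.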

The visual below summarizes skewness:

 

Image from: http://www.managedfuturesinvesting.com/managed-futures/news/aisource-news/2015/10/13/what-is-skewness

 

Kurtosis


 

Kurtosis is commonly described as a measure of the peakedness of a probability/frequency distribution, although it is driven largely by how heavy the tails are. A diagram will be given below to help visualize this concept.

There are three types of kurtosis, along with the concept of excess kurtosis. Excess kurtosis is given by the formula:

 

Excess Kurtosis = Kurtosis - 3

 

The (famous) normal distribution, or Gaussian, has a kurtosis of three, or an excess kurtosis of zero. We also say that the normal distribution is mesokurtic.

A leptokurtic distribution has a higher peak than a normal distribution. The kurtosis is above three and the excess kurtosis is above 0.

A platykurtic distribution has a lower, flatter peak than a normal distribution. The kurtosis is below three and the excess kurtosis is below 0.
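The three types can be checked numerically with a small simulation. This is only a sketch: pearson_kurtosis() is a hypothetical helper written for this example (the fourth central moment divided by the squared second central moment), and the choice of uniform and Laplace samples as the platykurtic and leptokurtic cases is an assumption, not something taken from the post.

```r
# Hypothetical helper (not from the post): moment-based kurtosis,
# i.e. fourth central moment over the squared second central moment.
pearson_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

set.seed(1)  # arbitrary seed, only for reproducibility
n <- 100000

normal_sample  <- rnorm(n)            # mesokurtic: kurtosis near 3
uniform_sample <- runif(n)            # platykurtic: kurtosis near 1.8
laplace_sample <- rexp(n) - rexp(n)   # leptokurtic: kurtosis near 6

kurt <- c(
  normal  = pearson_kurtosis(normal_sample),
  uniform = pearson_kurtosis(uniform_sample),
  laplace = pearson_kurtosis(laplace_sample)
)

kurt       # kurtosis
kurt - 3   # excess kurtosis
```

The difference of two independent exponential draws follows a Laplace distribution, which has a sharper peak and heavier tails than the normal. The printed values vary slightly with the seed, so only their rough size matters here.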

The following diagram is a great visual which summarizes the concept of kurtosis.

 

Image Link

 

Example In R


 

In the R example, the focus is on (fake) test scores from a math test.

 

grades <- c(68, 92, 55, 80, 81, 40, 39, 51, 88, 65, 67, 59, 77, 91, 84, 70, 62, 68, 79, 81, 89, 94, 91, 55, 57, 48, 20, 83, 72, 71, 70, 89, 91, 93, 66, 55, 52, 67, 78, 33, 46, 80)

hist(grades)

 

hist_summary.JPG

 

You can find the mean and median of the sample data in R with mean() and median().

 

mean(grades)

## [1] 68.97619

median(grades)

## [1] 70

 

To find the skewness or kurtosis of a sample dataset, you can use the skewness() and kurtosis() functions from the moments package in R.

 

# Load moments package for skewness & kurtosis:
# Use install.packages("moments") to install the package.

library(moments)

skewness(grades)

## [1] -0.6095399

kurtosis(grades)

## [1] 2.791072

 

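The skewness of the sample data is negative, which makes sense because the median (70) is larger than the mean (about 68.98). The kurtosis of about 2.79 is slightly below three, so the sample is mildly platykurtic.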
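If you are curious what the two functions compute, the results can be cross-checked in base R. This is a sketch that assumes the moment-based definitions with divisor n rather than n - 1. The names d, m2, m3, m4, skew_manual and kurt_manual are made up for this example.

```r
grades <- c(68, 92, 55, 80, 81, 40, 39, 51, 88, 65, 67, 59, 77, 91, 84,
            70, 62, 68, 79, 81, 89, 94, 91, 55, 57, 48, 20, 83, 72, 71,
            70, 89, 91, 93, 66, 55, 52, 67, 78, 33, 46, 80)

# Deviations from the sample mean.
d <- grades - mean(grades)

# Central moments with divisor n (an assumption about the convention,
# chosen because it reproduces the skewness() and kurtosis() output).
m2 <- mean(d^2)
m3 <- mean(d^3)
m4 <- mean(d^4)

skew_manual <- m3 / m2^(3/2)   # third moment scaled by the spread
kurt_manual <- m4 / m2^2       # fourth moment scaled by the spread

skew_manual       # about -0.61
kurt_manual       # about 2.79
kurt_manual - 3   # excess kurtosis, about -0.21
```

The negative excess kurtosis is another way of saying the grades are slightly platykurtic.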

