Table 8.1: Common distributions and their names in R.
Distribution | R-name | Distribution | R-name |
Normal | norm | Weibull | weibull |
Exponential | exp | Binomial | binom |
Log-normal | lnorm | Negative binomial | nbinom |
Logistic | logis | χ² | chisq |
Geometric | geom | Uniform | unif |
Poisson | pois | Gamma | gamma |
t | t | Cauchy | cauchy |
F | f | Hypergeometric | hyper |
Beta | beta | | |
As all distributions work in a very similar way, we use the normal distribution to show how the logic works.
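The shared naming pattern can be sketched as follows: each R-name from Table 8.1 is combined with a prefix, d (density or probability mass), p (cumulative probability), q (quantile) or r (random draws). The example below uses the exponential and Poisson distributions; the parameter values (rate = 2, lambda = 3) and the seed are arbitrary choices for illustration only.

```r
# Prefix + R-name from Table 8.1 gives the function name.
# rate = 2 and lambda = 3 are arbitrary illustration values.
d_exp <- dexp(1, rate = 2)      # density of the exponential distribution at x = 1
p_exp <- pexp(1, rate = 2)      # P(X <= 1)
q_exp <- qexp(0.5, rate = 2)    # median: the value with cumulative probability 0.5

set.seed(1)                     # fixed seed so the draws are reproducible
r_exp <- rexp(5, rate = 2)      # five random draws

d_pois <- dpois(2, lambda = 3)  # P(X = 2) for a Poisson variable
p_pois <- ppois(2, lambda = 3)  # P(X <= 2)
```

For discrete distributions such as the Poisson, the d-function returns a probability rather than a density.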
8.4.1 Normal Distribution
One of the most widely used distributions is the Gaussian, or Normal, distribution. Its probability density function is bell-shaped and centred on the mean of the distribution. Because the curve is symmetric, 50% of the values lie to the left of the mean and the other 50% lie to the right of it.
The Normal Distribution in R
R has four built-in functions to work with the normal distribution. They are described below.
dnorm(x, mean, sd): The height of the probability density function at x
pnorm(x, mean, sd): The cumulative distribution function (the probability that an observation is lower than or equal to x)
qnorm(p, mean, sd): The quantile function, which gives the number whose cumulative probability equals the given probability p
rnorm(n, mean, sd): Generates n normally distributed random numbers,
with
x: A vector of numbers
p: A vector of probabilities
n: The number of observations (sample size)
mean: The mean of the distribution (default is 0)
sd: The standard deviation of the distribution (default is 1).
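As a minimal sketch of how the four functions relate to each other, the example below evaluates each of them for a normal distribution. The values mean = 10 and sd = 3, the variable names, and the seed are illustrative choices, not prescribed by the text.

```r
mu    <- 10   # illustrative mean
sigma <- 3    # illustrative standard deviation

# dnorm: height of the density; at the mean it equals 1 / (sigma * sqrt(2 * pi))
height_at_mean <- dnorm(mu, mean = mu, sd = sigma)

# pnorm: cumulative probability; half of the mass lies below the mean
p_below_mean <- pnorm(mu, mean = mu, sd = sigma)

# qnorm: inverse of pnorm; the value below which 97.5% of the mass lies
q975 <- qnorm(0.975, mean = mu, sd = sigma)
back <- pnorm(q975, mean = mu, sd = sigma)   # recovers the probability 0.975

# rnorm: random draws (seed fixed for reproducibility)
set.seed(42)
draws <- rnorm(1000, mean = mu, sd = sigma)
```

Since qnorm() and pnorm() are inverses of each other, applying one after the other returns the starting value up to floating-point precision.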
Illustrating the Normal Distribution
In the following example we generate data with the random generator function rnorm() and then compare the histogram of that data with the ideal probability density function of the Normal distribution. The output of the following code is shown in Figure 8.1.
Figure 8.1: A comparison between a set of random numbers drawn from the normal distribution (khaki) and the theoretical shape of the normal distribution in blue.
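obs <- rnorm(600, 10, 3)                          # 600 draws with mean 10 and sd 3
hist(obs, col = "khaki3", freq = FALSE)           # freq = FALSE plots densities, not counts
x <- seq(from = 0, to = 20, by = 0.001)           # grid from 0 to 20, symmetric around the mean of 10
lines(x, dnorm(x, 10, 3), col = "blue", lwd = 4)  # overlay the theoretical density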
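The visual comparison can be complemented by a numerical one. The sketch below, using the same mean and standard deviation, compares the share of simulated values within one standard deviation of the mean with the share that pnorm() predicts. The seed and the variable names are assumptions made for this illustration.

```r
set.seed(2024)                        # arbitrary seed, for reproducibility only
obs <- rnorm(600, mean = 10, sd = 3)  # same sample size and parameters as above

# Share of simulated values in the interval (mean - sd, mean + sd) = (7, 13)
emp_within_1sd <- mean(obs > 7 & obs < 13)

# Theoretical probability of the same interval
theo_within_1sd <- pnorm(13, mean = 10, sd = 3) - pnorm(7, mean = 10, sd = 3)

print(c(empirical = emp_within_1sd, theoretical = theo_within_1sd))
```

The theoretical share is about 68.3%. With only 600 observations the empirical share will typically deviate from it by a few percentage points.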
Case Study: Returns on the Stock Exchange
In this simple illustration, we compare the returns of the S&P 500 index to the Normal distribution. The output of the following code is shown in Figure 8.2.
Figure 8.2: The same plot for the returns of the SP500 index seems acceptable, though there are outliers (where the normal distribution converges fast to zero).
library(MASS)  # provides the SP500 data set
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
##     select
hist(SP500, col = "khaki3", freq = FALSE, border = "khaki3")
x <- seq(from = -5, to = 5, by = 0.001)