# R - Histograms

Histograms are graphs that represent the distribution of data using bars. Each bar (called a "bin") represents one interval of data. Smaller intervals mean more bins are used in the graph, while larger intervals mean fewer bins are used. The height of each bar corresponds to how many data values fall within that bin's interval.

Constructing Histograms in R

For a set of data x, the basic command to create a histogram is:

hist(x)

R Code:

x = c(10, 10, 12, 12, 8, 4, 7, 8, 9, 10)
hist(x)

Output: [histogram of x]
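To see the numbers behind the bars, the histogram can be computed without drawing it. The sketch below assumes the same x as in the example; the variable name h is only illustrative. hist() returns an object whose breaks and counts components hold the bin edges and the number of values in each bin.

```r
# Same data as in the example
x <- c(10, 10, 12, 12, 8, 4, 7, 8, 9, 10)

# plot = FALSE computes the histogram without drawing it
h <- hist(x, plot = FALSE)

h$breaks  # edges of the bins
h$counts  # number of values falling in each bin

# Every value lands in exactly one bin, so the counts add up to length(x)
sum(h$counts) == length(x)
```

Printing h$counts next to h$breaks is a quick way to check that a plot shows what you expect.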

Bins

To suggest the number of bins in your histogram (R may adjust the number to get round breakpoints):

hist(x, breaks=numberofbins)

R Code:

x = c(10, 10, 12, 12, 8, 4, 7, 8, 9, 10)
hist(x, breaks = 8)

Output: [histogram of x with narrower bins]
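If an exact set of bins is needed, breaks also accepts a vector of breakpoints, which R uses as given. A minimal sketch, assuming the same x as in the example; the names h_suggested and h_exact are illustrative. The vector has to cover the full range of x.

```r
x <- c(10, 10, 12, 12, 8, 4, 7, 8, 9, 10)

# A single number is only a suggestion; check how many bins R actually used
h_suggested <- hist(x, breaks = 8, plot = FALSE)
length(h_suggested$counts)

# A vector of breakpoints is used as given: edges 4, 5, ..., 12 make 8 bins
h_exact <- hist(x, breaks = seq(4, 12, by = 1), plot = FALSE)
length(h_exact$counts)

# Draw the histogram with the exact breakpoints
hist(x, breaks = seq(4, 12, by = 1))
```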

To change the y-axis from frequency to probability density:

hist(x, prob=T)

R Code:

a = rnorm(1000, 100, 5)
hist(a, prob = T)

Output: [histogram of a with density on the y-axis]
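On the density scale, a bar's height is not itself a probability. Its area (height times bin width) is the share of the data in that bin, so the areas add up to 1. The sketch below checks this, assuming simulated data like a in the example; set.seed() and the names h and bin_widths are additions for reproducibility and illustration.

```r
set.seed(1)                 # assumed seed, only to make the draw repeatable
a <- rnorm(1000, 100, 5)    # 1000 values, mean 100, standard deviation 5

# The density values are stored in the histogram object
h <- hist(a, plot = FALSE)
bin_widths <- diff(h$breaks)

# Total area of the bars on the density scale
total_area <- sum(h$density * bin_widths)
all.equal(total_area, 1)

# Draw the histogram on the density scale
hist(a, prob = TRUE)
```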

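To overlay a smoothed density curve on a histogram drawn with prob=T, where bw sets the smoothing bandwidth:

lines(density(x, bw=number))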

R Code:

hist(a, prob = T)
lines(density(a, bw=1.5), col = "blue")

Output: [histogram of a with a blue density curve]
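The bw value controls how smooth the curve is: a small bandwidth follows the data closely, while a large one flattens the curve. A sketch comparing two bandwidths on the same histogram, assuming simulated data like a in the example; the seed, the bandwidths 0.5 and 3, and the colors are illustrative choices.

```r
set.seed(1)                 # assumed seed, only to make the draw repeatable
a <- rnorm(1000, 100, 5)

# Density-scaled histogram so the curves share the same y-axis
hist(a, prob = TRUE)

# Two kernel density estimates with different bandwidths
d_narrow <- density(a, bw = 0.5)  # wiggly, follows the data closely
d_wide <- density(a, bw = 3)      # smoother, flatter

lines(d_narrow, col = "red")
lines(d_wide, col = "blue")
legend("topright", legend = c("bw = 0.5", "bw = 3"),
       col = c("red", "blue"), lty = 1)
```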

Example 1:

RStudio Code:

hist(Orange$circumference, xlab = "Circumference (mm)", ylab = "Number of Trees", main = "Circumference of Orange Trees", col = "orange", breaks = 10)

Output: [orange histogram of tree circumferences]

Example 2:

RStudio Code:

x = c(25, 25, 25, 25, 25, 20, 45, 45, 30, 50, 15, 30, 30, 30, 30, 25, 50, 25, 25, 30, 25, 20, 30, 30, 15)
hist(x, xlab = "Time (minutes)", ylab = "Number of People", main = "Time it Takes to Get to School", col = "cyan1", breaks = 7)

Output: [cyan histogram of travel times]

Example 3:

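RStudio Code (assumes a is a numeric vector of ages):

hist(a, xlab = "Age (years)", prob = T, main = "Ages of High School Seniors", col = "violet")
lines(density(a, bw = 0.10), col = "navy", lwd = 4)

Output: [violet density histogram of ages with a navy density curve]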