Pie Charts In R With ggplot2

in #programming, 9 years ago (edited)

Hi there. This post is about creating pie charts in R with the ggplot2 data visualization package. You will learn to make pie charts that look something like this.

pieChart.png

Loading Libraries Into R


To start off, load the ggplot2 package and the dplyr (data manipulation) package into R. (If a package is not installed yet, run install.packages("package_name") first.)

# Pie Charts In R:

# Example: Made Up Survey Data (Favourite Subject):

# Reference: 
# http://www.sthda.com/english/wiki/ggplot2-pie-chart-quick-start-guide-r-software-and-data-visualization
# http://stackoverflow.com/questions/20442693/how-to-use-ggplot2-to-generate-a-pie-graph

library(ggplot2)
library(dplyr)

Survey Results Dataset


As an example, I made up a (fake) sample data set from a survey in which students selected their favourite subject.


subjects <- c("Biology", "Business", "Physics", "Chemistry", "History", "Math")

counts <- c(9, 16, 7, 10, 10, 8)

subjects_table <- data.frame(subjects, counts) # Create data frame

It is a good idea to check your work along the way to make sure things are okay.

output01.PNG
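A quick check could look something like the sketch below. It rebuilds the same made-up data frame so that it runs on its own; the particular inspection calls (printing, str() and dim()) are my choice for illustration and are not necessarily what the screenshot above shows.

```r
# Rebuild the made-up survey data so this snippet is self-contained
subjects <- c("Biology", "Business", "Physics", "Chemistry", "History", "Math")
counts <- c(9, 16, 7, 10, 10, 8)
subjects_table <- data.frame(subjects, counts)

subjects_table       # print the whole table (it only has six rows)
str(subjects_table)  # column types and number of observations
dim(subjects_table)  # number of rows and columns
```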

The number of survey participants (sample size) can be determined by taking the sum of the counts from the second column.

# Total Counts In Survey:

subjects_total <- sum(subjects_table[,2])

subjects_total

output02.PNG

Setting Up Pie Chart Labels


This next section is about setting up labels for the pie chart. You could produce a pie chart without any labels, but it would not be informative for the viewer. Labels with just percentages are not good enough either, since percentages can be misleading without the underlying counts. I have decided to create labels with both percentages and their associated counts to make things clear for the viewer.

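(The %>% pipe operator, which dplyr re-exports from the magrittr package, passes subjects_table to mutate() as its first argument, so subjects_table %>% mutate(...) is shorthand for mutate(subjects_table, ...).)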

output03.PNG

  • I turn each of the subjects into a factor. Subject is a categorical variable.
  • The cumulative column is a running total of the counts from the top of the table to the bottom.
  • Midpoint is for positioning the labels in the middle of each pie piece.
  • The labels column contains the percentage and the count in brackets. Percentages are the count divided by the number of people in the survey (subjects_total variable).
  • The paste0() command is useful for combining variables and text characters.
  • The mutate() function from dplyr adds new columns (or overwrites existing ones) based on the expressions you specify.
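The steps in the list above can be sketched roughly as follows. This is a reconstruction from the description, not a copy of the code in the screenshot: the reversed factor levels, the one-decimal rounding and the exact label format are my assumptions.

```r
library(dplyr)

# Made-up survey data from earlier in the post
subjects_table <- data.frame(
  subjects = c("Biology", "Business", "Physics", "Chemistry", "History", "Math"),
  counts = c(9, 16, 7, 10, 10, 8),
  stringsAsFactors = FALSE
)
subjects_total <- sum(subjects_table$counts)

subjects_table <- subjects_table %>%
  mutate(
    # Categorical variable; reversing the level order is an assumption made
    # so that the first table row ends up at the bottom of a stacked bar
    subjects = factor(subjects, levels = rev(subjects)),
    # Running total of the counts from the top of the table to the bottom
    cumulative = cumsum(counts),
    # Halfway point of each slice along the running total
    midpoint = cumulative - counts / 2,
    # Percentage followed by the count in brackets (assumed format)
    labels = paste0(round(counts / subjects_total * 100, 1), "% (", counts, ")")
  )

subjects_table
```

Computing the percentage from subjects_total keeps the labels tied to the sample size, so a reader can always recover the raw count behind each slice.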

Plotting The Survey Results


ggplot2 has no dedicated pie chart geom, so a pie chart starts out as a stacked bar graph from geom_bar() after the initial ggplot() call. To convert this bar graph into a circular pie chart, add coord_polar(theta = "y", start = 0) after geom_bar().
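As a minimal sketch of that two-step idea (without any labels or styling), the code below first builds a single stacked bar and then bends it into a circle. The object names bar_plot and pie_plot are hypothetical, and using x = "" with width = 1 is one common way to get a single full-width bar, assumed here rather than taken from the original code.

```r
library(ggplot2)

# Made-up survey data from earlier in the post
subjects_table <- data.frame(
  subjects = c("Biology", "Business", "Physics", "Chemistry", "History", "Math"),
  counts = c(9, 16, 7, 10, 10, 8)
)

# Step 1: one stacked bar; x = "" puts every subject into the same bar
bar_plot <- ggplot(subjects_table, aes(x = "", y = counts, fill = subjects)) +
  geom_bar(stat = "identity", width = 1)

# Step 2: map the y axis (the counts) onto the angle of a circle
pie_plot <- bar_plot + coord_polar(theta = "y", start = 0)

pie_plot
```

Printing bar_plot before pie_plot is a handy way to see that each slice's angle is simply the height of its segment in the stacked bar.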

The rest of the ggplot code below deals with colours, labels, the placement of labels and modifying the aesthetics.

  • scale_fill_manual() is for the colours
  • labs() is for the title, the legend and labelling the axes
  • geom_text() is for placing the labels which were created earlier
  • The theme() code section is for centering the title and changing the title's size and font.
  • HTML (hex) colour codes can be used as well. As an example, I use #AD7366 to get a chocolate-like colour.
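Putting the pieces from the list together, a full chart could be sketched like this. Only #AD7366 comes from the post; the other colours, the title text, the legend name, the label offset x = 1.2 and the theme settings are placeholders I picked for illustration. Placing the labels with y = midpoint assumes a ggplot2 version (2.2.0 or later) that stacks the first factor level on top, which is why the factor levels are reversed.

```r
library(ggplot2)
library(dplyr)

# Made-up survey data plus the label columns described earlier
subjects_table <- data.frame(
  subjects = c("Biology", "Business", "Physics", "Chemistry", "History", "Math"),
  counts = c(9, 16, 7, 10, 10, 8),
  stringsAsFactors = FALSE
) %>%
  mutate(
    subjects = factor(subjects, levels = rev(subjects)),
    midpoint = cumsum(counts) - counts / 2,
    labels = paste0(round(counts / sum(counts) * 100, 1), "% (", counts, ")")
  )

pie_chart <- ggplot(subjects_table, aes(x = "", y = counts, fill = subjects)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar(theta = "y", start = 0) +
  # One colour per subject, in factor-level order (placeholder palette)
  scale_fill_manual(values = c("orange", "gold", "skyblue", "seagreen",
                               "tomato", "#AD7366")) +
  # Title, legend name and blank axis labels (placeholder text)
  labs(x = "", y = "", title = "Favourite Subjects Survey", fill = "Subject") +
  # Labels created earlier; x = 1.2 nudges them slightly outward
  geom_text(aes(x = 1.2, y = midpoint, label = labels), colour = "black") +
  # Centre the title and change its size and font face; hide axis clutter
  theme(
    plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
    axis.text = element_blank(),
    axis.ticks = element_blank()
  )

pie_chart
```

If the labels do not line up with their slices on your ggplot2 version, dropping the y = midpoint mapping and using position = position_stack(vjust = 0.5) inside geom_text() is an alternative that lets ggplot2 compute the slice centres itself.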

output04.PNG

pieChart.png


Edit (June 28, 2017): Fixed a typo and added an explanation on the mutate() function.
