Pie Charts In R With ggplot2
Hi there. This post is about creating pie charts in R with the ggplot2 data visualization package. You will learn to make a pie chart whose slices are labelled with both percentages and counts.
Loading Libraries Into R
To start off, load the ggplot2 package and the dplyr (data manipulation) package into R. (If a package is not installed yet, run install.packages("package_name") first.)
# Pie Charts In R:
# Example: Made Up Survey Data (Favourite Subject):
# Reference:
# http://www.sthda.com/english/wiki/ggplot2-pie-chart-quick-start-guide-r-software-and-data-visualization
# http://stackoverflow.com/questions/20442693/how-to-use-ggplot2-to-generate-a-pie-graph
library(ggplot2)
library(dplyr)
Survey Results Dataset
As an example, I made up a (fake) sample data set from a survey in which students each selected their favourite subject.
subjects <- c("Biology", "Business", "Physics", "Chemistry", "History", "Math")
counts <- c(9, 16, 7, 10, 10, 8)
subjects_table <- data.frame(subjects, counts) # Create data frame
It is a good idea to check your work along the way to make sure things are okay.
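One quick way to do that check is sketched below using base R functions only. The data is re-created inside the snippet so that it runs on its own.

```r
# Re-create the made-up survey data so this snippet is self-contained
subjects <- c("Biology", "Business", "Physics", "Chemistry", "History", "Math")
counts <- c(9, 16, 7, 10, 10, 8)
subjects_table <- data.frame(subjects, counts)

subjects_table       # Print the whole table (it only has six rows)
str(subjects_table)  # Column types: subjects is text (or a factor in older R), counts is numeric
dim(subjects_table)  # Number of rows and columns
```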
The number of survey participants (sample size) can be determined by taking the sum of the counts from the second column.
# Total Counts In Survey:
subjects_total <- sum(subjects_table[,2])
subjects_total
Setting Up Pie Chart Labels
This next section is about setting up labels for the pie chart. You could produce a pie chart without any labels, but it would not be informative for the viewer. Labels with just percentages are not good enough either, since a percentage on its own hides how many responses it is based on. I have decided to create labels with both percentages and their associated counts to make this clear for the viewer.
(The %>% pipe operator, which dplyr makes available, passes subjects_table to mutate() as its first argument, so subjects_table %>% mutate(...) is equivalent to mutate(subjects_table, ...).)
- I turn each of the subjects into a factor. Subject is a categorical variable.
- The cumulative column is a running total of the counts from the top of the table to the bottom.
- Midpoint is for positioning the labels in the middle of each pie piece.
- The labels column contains the percentage and the count in brackets. Percentages are from the count divided by the number of people in the survey (subjects_total variable).
- The paste0() command is useful for combining variables and text characters.
- The mutate() function from dplyr allows the user to add columns depending on what is specified.
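The steps in the list above can be sketched as a single mutate() call. This is a minimal version and not necessarily the exact code used for the original chart: the column names cumulative, midpoint and labels follow the descriptions above, while rounding to one decimal place and reversing the factor levels are my assumptions. The reversal relies on ggplot2 stacking the last factor level first, so that the running total lines up with the slices.

```r
library(dplyr)

# Re-create the made-up survey data so this snippet is self-contained
subjects <- c("Biology", "Business", "Physics", "Chemistry", "History", "Math")
counts <- c(9, 16, 7, 10, 10, 8)
subjects_table <- data.frame(subjects, counts)
subjects_total <- sum(subjects_table$counts)

subjects_table <- subjects_table %>%
  mutate(
    # Categorical variable; levels reversed (assumption) so the first row is stacked first
    subjects = factor(subjects, levels = rev(as.character(subjects))),
    # Running total of the counts from the top of the table to the bottom
    cumulative = cumsum(counts),
    # Halfway point of each slice, used later to place its label
    midpoint = cumulative - counts / 2,
    # Percentage (rounded to one decimal, an assumption) followed by the count in brackets
    labels = paste0(round(counts / subjects_total * 100, 1), "% (", counts, ")")
  )

subjects_table
```

Printing the table afterwards is an easy way to confirm that the last value in the cumulative column matches subjects_total.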
Plotting The Survey Results
When it comes to creating pie charts in R and ggplot2 you need to start with a bar graph from geom_bar() after the initial ggplot() function. To convert this bar graph into a circular pie chart you would use coord_polar(theta = "y", start = 0) on top of geom_bar().
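As a rough illustration of that conversion, the sketch below builds the stacked bar first and then adds coord_polar(). The object names bar_plot and pie_plot are made up for this example, and the empty string for x is simply one common way to get a single bar.

```r
library(ggplot2)

# Re-create the made-up survey data so this snippet is self-contained
subjects_table <- data.frame(
  subjects = c("Biology", "Business", "Physics", "Chemistry", "History", "Math"),
  counts = c(9, 16, 7, 10, 10, 8)
)

# Step 1: a single stacked bar with one segment per subject
bar_plot <- ggplot(subjects_table, aes(x = "", y = counts, fill = subjects)) +
  geom_bar(stat = "identity", width = 1)

# Step 2: wrap the y axis around a circle, which turns the bar into a pie
pie_plot <- bar_plot + coord_polar(theta = "y", start = 0)

pie_plot
```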
The rest of the ggplot code below deals with colours, labels, the placement of labels and modifying the aesthetics.
- scale_fill_manual() is for the colours.
- labs() is for the title, the legend and labelling the axes.
- geom_text() is for placing the labels which were created earlier.
- The theme() code section is for centering the title and changing the title's size and font.
- HTML colour codes can be used as well. As an example I use #AD7366 to get a chocolate-like colour.
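Putting the pieces from the list above together, a full chart could look something like the sketch below. Treat it as one possible version: apart from #AD7366, the colour codes, the title text, the label position x = 1.2 and the title size are illustrative choices of mine, and the name survey_pie is made up for the example.

```r
library(ggplot2)
library(dplyr)

# Re-create the survey data and the label columns so this snippet is self-contained
subjects_table <- data.frame(
  subjects = c("Biology", "Business", "Physics", "Chemistry", "History", "Math"),
  counts = c(9, 16, 7, 10, 10, 8)
) %>%
  mutate(
    subjects = factor(subjects, levels = rev(as.character(subjects))),
    midpoint = cumsum(counts) - counts / 2,
    labels = paste0(round(counts / sum(counts) * 100, 1), "% (", counts, ")")
  )

# Illustrative palette: one HTML colour code per subject (only #AD7366 is from the post)
pie_colours <- c("#AD7366", "#F0E442", "#56B4E9", "#009E73", "#E69F00", "#CC79A7")

survey_pie <- ggplot(subjects_table, aes(x = "", y = counts, fill = subjects)) +
  geom_bar(stat = "identity", width = 1) +            # stacked bar
  coord_polar(theta = "y", start = 0) +               # bar becomes a pie
  scale_fill_manual(values = pie_colours) +           # slice colours
  labs(x = "", y = "", fill = "Subjects",
       title = "Favourite Subjects In The Survey") +  # title, legend and axis labels
  geom_text(aes(x = 1.2, y = midpoint, label = labels),
            colour = "black", fontface = "bold") +    # labels at each slice's midpoint
  theme(plot.title = element_text(hjust = 0.5, size = 20, face = "bold"))  # centred title

survey_pie
```

If the labels overlap on the smaller slices, changing the x value inside geom_text() moves them closer to or further from the centre.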
Edit (June 28, 2017): Fixed a typo and added an explanation on the mutate() function.
References & Resources
- R Graphics Cookbook by Winston Chang
- http://www.sthda.com/english/wiki/ggplot2-pie-chart-quick-start-guide-r-software-and-data-visualization
- http://stackoverflow.com/questions/20442693/how-to-use-ggplot2-to-generate-a-pie-graph
- http://stackoverflow.com/questions/41338757/adding-percentage-labels-on-pie-chart-in-r
- http://stackoverflow.com/questions/7145826/how-to-format-a-number-as-percentage-in-r