Graphing in R
R has very flexible built-in graphing capabilities, allowing you to create publication-quality graphs.
In this section:
1) Basic plotting commands
2) Saving and copying plots as files
3) Scatterplot options (axes, titles, etc.)
4) Plot parameters (symbols, colors, sizes, etc.)
5) Histogram options
6) Adding lines, points, text, arrows (error bars), and polygons
7) Creating a legend
8) Making multi-panel graphs
The basic plotting command is:
plot(x,y) #x and y are the two numbers, vector variables, or data frame columns to plot
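This will most often plot an x-y scatterplot, but if the x variable is categorical (i.e., a factor, such as a set of names) R will automatically plot a box-and-whisker plot. In R 4.0 and later, text columns are not converted to factors automatically, so convert them with factor() first.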
WARNING: the plot() command overwrites any existing data in the plot window!
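As a minimal sketch of the two behaviors described above, the following uses small made-up vectors (len, wid, and species are hypothetical names, not data from this guide):

```r
# Hypothetical example data
len <- c(4.1, 5.3, 6.0, 4.8, 5.5, 6.2)
wid <- c(2.0, 2.6, 3.1, 2.3, 2.7, 3.0)
species <- factor(c("a", "a", "b", "b", "c", "c"))

plot(len, wid)      # numeric x: x-y scatterplot
plot(species, len)  # factor x: box-and-whisker plot (replaces the scatterplot)
```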
Other commands include:
barplot(x) #make a barplot
boxplot(x) #make a box-and-whisker plot
hist(x) #make a histogram
stem(x) #make a stem-and-leaf plot
pie(x) #make a pie diagram
With the exception of stem(), which prints to the console, these also overwrite the existing plot, so there are two options for keeping a graph.
1) To open a new plot window, use:
windows() #on a PC
quartz() #on a Mac
dev.new() #on any platform
2) Copy your graph (as a metafile or bitmap) by right-clicking on the graph (works on PC) or save it as a file:
To save as a postscript, pdf, or png file:
postscript("filename") #opens postscript device driver to save as file. Can also use pdf(…) and png(…)
plot(x,y) #creates plot in file (will not display in R)
dev.off() #turns off device driver and finalizes the file
Or to save a graph already plotted in the active window:
dev.print(device=pdf,"filename") #can use device=postscript, device=png too
Files will be saved in the R working directory. You can also use menu commands to save open graphs.
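The open-plot-close sequence might look like the sketch below. The file name example_plot.pdf and the use of tempdir() are assumptions made so the example does not write into your working directory:

```r
# Hypothetical output path in the session's temporary directory
out <- file.path(tempdir(), "example_plot.pdf")

pdf(out, width = 6, height = 4)  # open the pdf device (size in inches)
plot(1:10, (1:10)^2)             # drawn into the file, not on screen
dev.off()                        # close the device and finalize the file

file.exists(out)                 # check that the file was written
```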
The default scatterplot plots points but you can graph lines or both (or neither):
plot(x,y,type="p") #default option, plots points
plot(x,y,type="l") #connects points by lines but doesn’t plot point symbols
plot(x,y,type="b") #plots point symbols connected by lines
plot(x,y,type="o") #plots point symbols connected by lines, points on top of lines
plot(x,y,type="n") #plots axes only, no symbols
You can also plot data points but no axes, using:
plot(x,y,axes=F) #plots points without axes or the surrounding box
To change the range of the axes, use:
plot(x,y,xlim=c(m,n),ylim=c(p,q)) #sets x-axis range to m (minimum) and n (maximum) and y-axis range to p (minimum) and q (maximum).
You can make axis labels more descriptive (default is to use variable names):
plot(x,y,xlab="Length (mm)",ylab="Width (mm)") #changes x- and y-axis labels
You can also add a title (the default value is no title):
plot(x,y,main="Graph Title") #adds title to graph
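The options above can be combined in a single call. A short sketch with made-up data (the values of x and y are hypothetical):

```r
# Hypothetical example data
x <- 1:10
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2)

plot(x, y,
     type = "b",                      # points connected by lines
     xlim = c(0, 12), ylim = c(0, 25),  # axis ranges
     xlab = "Length (mm)", ylab = "Width (mm)",
     main = "Graph Title")
```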
The following commands are useful for changing plot symbols, colors, and sizes. For the full suite of options, use par().
Symbols: Default plot character is an open circle (pch=1), but you can choose from 26 different symbols (pch=0 through 25):
plot(x,y,pch=16) #changes plot character to closed circle
Symbols 21-25 differ from the others in allowing different outline and fill colors.
You can also use any single character for the plot symbol:
plot(x,y,pch="a") #changes plot character to the lowercase letter a
If you have a data frame with multiple categories (e.g., length measurements for multiple species), you can assign different symbol types to each category:
plot(x,y,pch=c(1:5)[variablename$category]) #uses symbol types 1 through 5, depending on the level of the category column (which must be a factor; convert it with factor() if it is stored as text)
Colors: Default color is black, but you can choose from 657 different colors (see colors()).
plot(x,y,col="red") #changes symbol color to red
When using character types 21-25, you can vary the outline and background (fill) colors:
plot(x,y,pch=21,col="red",bg="blue") #changes symbol outline to red, fill to blue
plot(x,y,col.axis="red") #changes axis values to red
plot(x,y,col.lab="red") #changes axis labels to red
plot(x,y,col.main="red") #changes plot title to red
Colors can also be varied by category using the same syntax as used for symbols
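To illustrate assigning symbols and colors by category, here is a small sketch. The data frame dat and its columns are hypothetical; indexing a vector of symbols or colors by a factor picks one entry per level:

```r
# Hypothetical data frame with a factor column
dat <- data.frame(
  length  = c(4.1, 5.3, 6.0, 4.8, 5.5, 6.2),
  width   = c(2.0, 2.6, 3.1, 2.3, 2.7, 3.0),
  species = factor(c("a", "a", "b", "b", "c", "c"))
)

sym  <- c(1, 16, 17)[dat$species]                # one symbol per level
cols <- c("black", "red", "blue")[dat$species]   # one color per level

plot(dat$length, dat$width, pch = sym, col = cols)
```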
Point Size: Default point size is 1, but it can also be changed:
plot(x,y,cex=1.5) #points are 150% of the default size
plot(x,y,cex=0.75) #points are 75% of the default size
Line Width: Default line width (for lines and point outlines) is 1, but can be changed:
plot(x,y,lwd=2) #point outlines are twice the default thickness
Many of the above graphical parameters apply to histograms as well, but with some differences:
hist(x,col="red") #changes histogram fill to red
hist(x,border="red") #changes histogram border to red
hist(x,lwd=2) #just changes the width of axes, not the histogram!
To change the width of the bins:
hist(x,breaks=n) #plots histogram with approximately n bins (R treats n as a suggestion and picks round breakpoints)
hist(x,breaks=c(a,b,c,d)) #plots histogram with breakpoints between bins specified by vector of numbers a,b,c,d. Range of breaks must include all data values.
Default bins are "right-closed" (and "left-open") so that the bin from 0 to 10 includes all values greater than (but not equal to) 0 and all values less than or equal to 10. The lowest data value is still counted in the first bin because include.lowest=TRUE by default. This can be changed:
hist(x,right=F) #changes bins to be "left-closed" and "right-open"
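How a value that falls exactly on a breakpoint is counted can be checked without drawing anything, using plot=FALSE. The vector vals below is made up for illustration:

```r
# Hypothetical data with two values exactly on the breakpoint 10
vals <- c(0, 5, 10, 10, 15, 20)

h_default <- hist(vals, breaks = c(0, 10, 20), plot = FALSE)
h_flipped <- hist(vals, breaks = c(0, 10, 20), right = FALSE, plot = FALSE)

h_default$counts  # 4 2: the 10s are counted in the 0-10 bin
h_flipped$counts  # 2 4: the 10s are counted in the 10-20 bin
```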
Adding Lines to a Plot
You can add horizontal, vertical, and sloped lines to the plot with the abline() command:
abline(h=ycoordinate) #adds horizontal line at specified y-coordinate
abline(v=xcoordinate) #adds vertical line at specified x-coordinate
abline(intercept,slope) #adds line with specified intercept and slope
abline() will add a line that crosses the whole plot. To add a line segment:
lines(c(x0,x1),c(y0,y1)) #adds line segment from (x0,y0) to (x1,y1)
segments(x0,y0,x1,y1) #adds line segment from (x0,y0) to (x1,y1)
You can add multi-segment lines from specified points or based on variables
lines(variablename$x,variablename$y) #adds lines with points based on coordinates in variables
Changing line properties:
abline(h=ycoordinate,lty=2) #changes line type to dashed lines
Line types include 1 (solid), 2 (dashed), 3 (dotted), 4 (dash-dot), 5 (larger dash), 6 (dash-smaller dash).
You can change the line color (col) and width (lwd).
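A brief sketch combining the line commands above on an empty plot (all coordinates are arbitrary example values):

```r
plot(1:10, 1:10, type = "n")       # axes only, so the added lines are easy to see
abline(h = 5, lty = 2)             # dashed horizontal line at y = 5
abline(v = 5, lty = 3, col = "blue")  # dotted vertical line at x = 5
abline(0, 1, lwd = 2)              # intercept 0, slope 1, double width
segments(2, 8, 4, 8, col = "red")  # single segment from (2,8) to (4,8)
lines(c(6, 7, 8), c(2, 4, 2))      # multi-segment line through three points
```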
Adding Points to a Plot
Use points(x,y) with all of the regular plotting subcommands (col, pch, cex, etc.)
points(), lines(), text(), arrows(), and polygon() do not erase the previous plot, so they are used to add data to an existing graph.
Adding Text to a Plot
You can also add text labels to a plot, using the text() command:
text(x,y,"text") #adds text centered at specified x, y coordinates
Text can also be added from a variable:
text(variablename$x,variablename$y,variablename$text) #adds text values from data frame at given x and y coordinates for each point
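Text color (col) and size (cex) can be modified, as can the font: 1 (plain, the default), 2 (bold), 3 (italic), and 4 (bold italic) work on all devices, while higher font numbers are device-dependent (on the Windows graphics device, for example, additional numbered fonts are available).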
text(x,y,"text",font=2) #changes font to bold
Changing the font of the axis labels, title, etc. requires use of font commands as part of the primary plot() function:
plot(x,y,font.axis=3,font.lab=3,font.main=3) #changes axis values, labels, and title to italics
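As an illustration of labeling points from a data frame, consider the sketch below. The data frame pts is hypothetical, and pos=3 (an argument of text() not covered above) places each label just above its point:

```r
# Hypothetical points with a text label for each
pts <- data.frame(x = c(1, 2, 3), y = c(2, 4, 3),
                  label = c("A", "B", "C"), stringsAsFactors = FALSE)

plot(pts$x, pts$y, xlim = c(0, 4), ylim = c(0, 5), font.lab = 3)  # italic axis labels
text(pts$x, pts$y, pts$label, pos = 3, col = "blue", cex = 1.2, font = 2)  # bold labels
```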
Adding Arrows to a Plot (Error Bars)
You can add arrows to a plot, but the most useful function of this command is to generate error bars.
The basic use of arrows() is:
arrows(x0,y0,x1,y1) #draws an arrow from (x0,y0) to (x1,y1)
The length (default 0.25 inches), angle (default 30 degrees), and position (default at x1 end – code 2) of the arrowhead can all be changed:
arrows(x,avg-stdev,x,avg+stdev,length=0.1,angle=90,code=3) #avg and stdev are vectors of means and standard deviations; draws line with flat bars (angle=90) at both ends (code=3) of error bar. Code 1 draws arrowhead at x0 end of line.
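A minimal error-bar sketch, assuming the means and standard deviations have already been computed (grp, avg, and stdev are made-up values):

```r
# Hypothetical group means and standard deviations
grp   <- 1:3
avg   <- c(5.0, 7.5, 6.2)
stdev <- c(0.8, 1.1, 0.5)

# Set ylim so the full error bars fit inside the plot
plot(grp, avg, pch = 16, xlim = c(0.5, 3.5),
     ylim = range(c(avg - stdev, avg + stdev)),
     xlab = "Group", ylab = "Mean")
arrows(grp, avg - stdev, grp, avg + stdev, length = 0.1, angle = 90, code = 3)
```

Setting ylim from the bar endpoints matters because plot() otherwise scales the y-axis to the means alone and the bars can be clipped.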
Adding Polygons to a Plot
The command is polygon(x,y), where x and y are vector variables containing the coordinates of each vertex.
To change the polygon colors:
polygon(x,y,border="red") #border color is red but polygon is not filled
polygon(x,y,col="red") #fill color is red and border remains black
The line type and width can be changed using lty and lwd, respectively.
Rectangles can also be plotted using the simpler rect() command:
rect(xleft,ybottom,xright,ytop) #plots rectangle with vertexes at (xleft,ybottom), (xright,ybottom), (xright,ytop), (xleft,ytop)
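For instance, a filled triangle and a dashed rectangle can be added to an empty plot as follows (the coordinates are arbitrary example values):

```r
plot(0:10, 0:10, type = "n")   # empty plot to draw on
polygon(c(2, 5, 8), c(2, 8, 2), col = "red", border = "black")  # filled triangle
rect(1, 1, 3, 3, border = "blue", lty = 2)  # unfilled rectangle with dashed border
```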
Adding Legend to a Plot
The final step in creating a plot is to add a legend explaining the symbols, colors, or line types, using the legend() command.
You can specify the position of the legend by (x,y) coordinates or using preset positions:
legend(x,y,c("name1","name2")) #adds legend with its top-left corner at (x,y)
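The preset legend positions are "topleft", "top", "topright", "left", "center", "right", "bottomleft", "bottom", and "bottomright":
legend("bottomright",c("name1","name2")) #adds legend in the bottom right corner.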
See ?legend for more parameters, such as displaying the legend without a border (bty="n"), changing the size (cex), background color of legend box (bg), style of box (box.lty, box.lwd, box.col), color of legend text (text.col), etc.
The syntax to match legend symbols/colors/lines to plot symbols/colors/lines is:
legend(x,y,c("name1","name2"),pch=c(1,16)) #matches names to symbols in order (first name=first symbol, etc.)
Instead of manually typing the names, you can use the structure of factor variables to generate the list. If colors or symbols were assigned to points using this syntax:
plot(x,y,pch=c(1:5)[variablename$category]) #symbol types 1 through 5 assigned by category
the default is to assign the five symbols in alphabetical order to the five names present in the "category" column. The "category" column must be a factor variable (text columns are converted automatically only in R versions before 4.0; otherwise use factor()), so each unique name is a level. Therefore, you can use levels(variablename$category) to generate the list of names in the correct order:
legend(x,y,levels(variablename$category),pch=c(1:5)) #matches levels of category column (in alphabetical order) to symbols 1 through 5
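Putting the pieces together, a sketch of a plot whose legend is generated from factor levels might look like this (the data frame dat is hypothetical):

```r
# Hypothetical data frame with a factor column
dat <- data.frame(
  length  = c(4.1, 5.3, 6.0, 4.8, 5.5, 6.2),
  width   = c(2.0, 2.6, 3.1, 2.3, 2.7, 3.0),
  species = factor(c("a", "a", "b", "b", "c", "c"))
)
symbols_used <- c(1, 16, 17)  # one symbol per level, in level order

plot(dat$length, dat$width, pch = symbols_used[dat$species])
legend("bottomright", levels(dat$species), pch = symbols_used, bty = "n")
```

Because both the plot and the legend index the same symbols_used vector in level order, the legend entries stay matched to the points.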
Making Multi-Panel Graphs
The R plotting window can be set up to create multi-panel graphs.
First, create a blank plotting window with the specified parameters:
par(mfrow=c(nr,nc)) #sets up plot window with nr rows and nc columns, fills by rows
par(mfcol=c(nr,nc)) #sets up plot window with nr rows and nc columns, fills by columns
Next, create the required number of graphs (nr×nc) using regular plot() commands.
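A short sketch of a 2-by-2 panel follows. The vector vals is made up, and saving and restoring the old settings with par(old) is a common convention rather than something this guide requires:

```r
old <- par(mfrow = c(2, 2))   # 2 rows x 2 columns, filled by rows; keep old settings
vals <- c(3, 5, 2, 8, 6, 7, 4, 5, 6, 9)  # hypothetical data

plot(vals, main = "plot()")        # top left
hist(vals, main = "hist()")        # top right
boxplot(vals, main = "boxplot()")  # bottom left
barplot(vals, main = "barplot()")  # bottom right

par(old)                      # restore the single-panel layout
```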
There are two ways to create more complex graphing layouts, using the layout() command and par(fig=…).
layout() allows you to specify different sub-plot sizes (column and row widths can be specified) and arrangements (requiring a matrix-style input):
layout(matrix(c(1,1,2,3),ncol=2,byrow=T)) #creates plot window with the numbers and position of sub-plots specified by matrix
The matrix essentially creates a picture showing the position of numbered sub-plots (in the command above, the numbers 1,1,2,3 are placed into a two-column matrix (ncol=2) by rows (byrow=T)):
     [,1] [,2]
[1,]    1    1
[2,]    2    3
Column and row widths can be specified after the command specifying the matrix:
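layout(matrix(c(2,0,1,3),ncol=2,byrow=T),widths=c(3,1),heights=c(1,3),respect=T) #creates a layout with a large main panel (1) at bottom left, a short wide panel (2) above it, a tall narrow panel (3) to its right, and an empty top-right cell (0)
#The argument names widths= and heights= can be omitted if the values are given in this order.
#respect=T makes a unit of column width equal to a unit of row height, so panel proportions stay fixed (otherwise dimensions vary with plot window size). As the fourth argument, it can be given as just T.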
If the layout() command is assigned to a variable, you can see the layout using:
layout.show(variable) #opens plot window with sub-plots outlined and numbered
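For example, the layout with unequal panel sizes can be previewed as follows (m and n are hypothetical variable names; layout() returns the number of sub-plots):

```r
m <- matrix(c(2, 0, 1, 3), ncol = 2, byrow = TRUE)  # 0 leaves a cell empty
n <- layout(m, widths = c(3, 1), heights = c(1, 3), respect = TRUE)
layout.show(n)  # outlines and numbers the n sub-plots
```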
It may be necessary to edit the margins of each window to reduce white space between plots (if desired), using:
par(mar=c(b,l,t,r)) #adjusts bottom, left, top, and right margins to specified size (measured in lines, default is c(5.1,4.1,4.1,2.1))
For even more flexibility, you can use the graphical parameters (par()) to specify x and y coordinates for each sub-plot:
par(fig=c(x1,x2,y1,y2)) #creates sub-plot window from x1 to x2 and y1 to y2 (on a scale of 0 (left edge/bottom) to 1 (right edge/top))
par(fig=c(x1,x2,y1,y2),new=T) #The new=T command is required to add sub-plots to an existing plot. It may take trial and error to find suitable coordinates.
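One common use of par(fig=…) is an inset plot. The sketch below is only one possible arrangement; the coordinates and the reduced margins (mar) are assumptions chosen so the inset fits in the upper-left corner:

```r
xs <- 1:10
plot(xs, xs^2, main = "Main plot")  # full-size plot first

# Inset in the upper-left region; small margins so it fits
par(fig = c(0.1, 0.55, 0.45, 0.85), new = TRUE, mar = c(2, 2, 1, 1))
plot(xs, log(xs), type = "l", xlab = "", ylab = "")

# Reset to a full-size figure region and default margins
par(fig = c(0, 1, 0, 1), mar = c(5.1, 4.1, 4.1, 2.1))
```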