Graphing in R

 

R has very flexible built-in graphing capabilities, allowing you to create publication-quality graphs.

 

In this section:

 

1) Basic plotting commands

2) Saving and copying plots as files

3) Scatterplot options (axes, titles, etc.)

4) Plot parameters (symbols, colors, sizes, etc.)

5) Histogram options

6) Adding lines, points, text, arrows (error bars), and polygons

7) Creating a legend

8) Making multi-panel graphs

 

 

The basic plotting command is:

 

plot(x,y) #x and y are the two numbers, vector variables, or data frame columns to plot

 

This will most often produce an x-y scatterplot, but if the x variable is a factor (a categorical variable, e.g., a set of group names), R will automatically draw a box-and-whisker plot instead.
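
As a minimal sketch of this behavior, the example below calls plot() once with a numeric x and once with a factor x. The data and the names species, len, and wid are made up for illustration.

```r
# Hypothetical example data: lengths and widths for three made-up species
set.seed(1)
species <- factor(rep(c("A", "B", "C"), each = 10))
len <- rnorm(30, mean = 50, sd = 5)
wid <- 0.4 * len + rnorm(30, sd = 1)

plot(len, wid)      # both arguments numeric: x-y scatterplot
plot(species, len)  # x is a factor: one box-and-whisker plot per level
```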

 

WARNING: the plot() command replaces whatever is currently drawn in the plot window!

 

Other commands include:

 

barplot(x) #make a barplot

 

boxplot(x) #make a box-and-whisker plot

 

hist(x) #make a histogram

 

stem(x) #make a stem-and-leaf plot

 

pie(x) #make a pie diagram

 

 

These commands also replace the existing plot, so to keep an earlier graph you have two options.

 

1) To open a new plot window, use:

 

windows() #on a PC

quartz() #on a Mac

dev.new() #on any platform

 

2) Copy your graph (as a metafile or bitmap) by right-clicking on the graph (works on PC) or save it as a file:

 

To save as a postscript, pdf, or png file:

 

postscript("filename") #opens postscript device driver to save as file.  Can also use pdf(…) and png(…)

plot(x,y) #creates plot in file (will not display in R)

dev.off() #turns off device driver and finalizes the file

 

Or to save a graph already plotted in the active window:

 

dev.print(device=pdf,"filename") #can use device=postscript too; device=png also works but needs a width or height to be given

 

Files will be saved in the R working directory.  You can also use menu commands to save open graphs.
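
The save-to-file steps above can be sketched as follows. The file name is hypothetical, and the file is written to R's temporary directory rather than the working directory so that the example leaves no clutter behind.

```r
# Hypothetical output file in the session's temporary directory
out_file <- file.path(tempdir(), "example_plot.pdf")

pdf(out_file, width = 6, height = 4)  # open the pdf device (size in inches)
plot(1:10, (1:10)^2)                  # drawn into the file, not on screen
dev.off()                             # close the device and finalize the file

file.exists(out_file)                 # check that the file was written
```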

 

 

Scatterplot Options

 

The default scatterplot plots points but you can graph lines or both (or neither):

 

plot(x,y,type="p") #default option, plots points

plot(x,y,type="l") #connects points by lines but doesn’t plot point symbols

plot(x,y,type="b") #plots point symbols connected by lines

plot(x,y,type="o") #plots point symbols connected by lines, points on top of lines

plot(x,y,type="n") #plots axes only, no symbols

 

You can also plot data points but no axes, using:

 

plot(x,y,axes=FALSE) #plots points without axes or the surrounding box

 

To change the range of the axes, use:

 

plot(x,y,xlim=c(m,n),ylim=c(p,q)) #sets x-axis range to m (minimum) and n (maximum) and y-axis range to p (minimum) and q (maximum).

 

You can make axis labels more descriptive (default is to use variable names):

 

plot(x,y,xlab="Length (mm)",ylab="Width (mm)") #changes x- and y-axis labels

 

You can also add a title (the default value is no title):

 

plot(x,y,main="Graph Title") #adds title to graph
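
These options can be combined in a single call. The sketch below uses made-up data and arbitrary axis ranges purely for illustration.

```r
# Made-up example data
x <- 1:10
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9)

plot(x, y,
     type = "b",                        # point symbols connected by lines
     xlim = c(0, 12), ylim = c(0, 25),  # axis ranges
     xlab = "Length (mm)",              # x-axis label
     ylab = "Width (mm)",               # y-axis label
     main = "Graph Title")              # plot title
```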

 

 

The following options are useful for changing plot symbols, colors, and sizes.  For the full suite of graphical parameters, see ?par.

 

Symbols: The default plot character is an open circle (pch=1), but you can choose from 26 different symbols (pch values 0 through 25):

 

plot(x,y,pch=16) #changes plot character to closed circle

 

[Figure: the numbered plot symbols. Symbols 21-25 differ from the others in allowing different outline and fill colors.]
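
If a symbol chart is not to hand, one can be drawn in R itself. The following is only a sketch: the grid arrangement and the names pch_values, col_pos, and row_pos are arbitrary choices made for this example.

```r
# Draw every numbered plot symbol (0 through 25) in a grid, labeled by number
pch_values <- 0:25
col_pos <- pch_values %% 7    # grid column for each symbol
row_pos <- pch_values %/% 7   # grid row for each symbol

plot(col_pos, row_pos, pch = pch_values, cex = 2,
     col = "black", bg = "grey",          # bg only affects symbols 21-25
     xlim = c(-0.5, 6.5), ylim = c(4, -1),  # reversed y-axis: row 0 at the top
     axes = FALSE, xlab = "", ylab = "")
text(col_pos, row_pos, labels = pch_values, pos = 1, offset = 1)  # number below each symbol
```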

 

You can also use any single character for the plot symbol:

 

plot(x,y,pch="a") #changes plot character to the lowercase letter a

 

If you have a data frame with multiple categories (e.g., length measurements for multiple species), you can assign different symbol types to each category:

 

plot(x,y,pch=c(1:5)[variablename$category]) #uses symbol types 1 through 5, depending on the level of the category column (the column must be a factor; convert a text column with as.factor() first)

 

 

Colors: Default color is black, but you can choose from 657 different colors (see colors()).

 

plot(x,y,col="red") #changes symbol color to red

 

When using plot characters 21-25, you can set the outline and fill colors separately:

 

plot(x,y,pch=21,col="red",bg="blue") #changes symbol outline to red, fill to blue

 

Other options:

 

plot(x,y,col.axis="red") #changes axis values to red

plot(x,y,col.lab="red") #changes axis labels to red

plot(x,y,col.main="red") #changes plot title to red

 

Colors can also be varied by category using the same syntax as for symbols, e.g., col=c("red","blue")[variablename$category].
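
A minimal sketch of assigning symbols and colors by category follows. The data frame measurements and its columns are hypothetical. The species column is converted to a factor explicitly, because indexing with a plain text column would not select by category.

```r
# Hypothetical data frame: three made-up species, three specimens each
measurements <- data.frame(
  length  = c(10, 12, 11, 20, 22, 21, 30, 33, 31),
  width   = c(4, 5, 4.5, 8, 9, 8.5, 12, 13, 12.5),
  species = rep(c("alpha", "beta", "gamma"), each = 3)
)
measurements$species <- as.factor(measurements$species)  # levels: alpha, beta, gamma

symbol_set <- c(1, 16, 17)                # one symbol per level, in level order
color_set  <- c("black", "red", "blue")   # one color per level, in level order

# Indexing by a factor uses its integer codes, so each row gets its level's symbol and color
plot(measurements$length, measurements$width,
     pch = symbol_set[measurements$species],
     col = color_set[measurements$species])
```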

 

 

Point Size: Default point size is 1, but it can also be changed:

 

plot(x,y,cex=1.5) #points are 150% of the default size

plot(x,y,cex=0.75) #points are 75% of the default size

 

 

Line Width: Default line width (for lines and point outlines) is 1, but can be changed:

 

plot(x,y,lwd=2) #point outlines are twice the default thickness

 

 

Histograms

 

Many of the above graphical parameters apply to histograms as well, but with some differences:

 

hist(x,col="red") #changes histogram fill to red

hist(x,border="red") #changes histogram border to red

hist(x,lwd=2) #just changes the width of axes, not the histogram!

 

To change the width of the bins:

 

hist(x,breaks=n) #plots histogram with approximately n bins (R treats n as a suggestion and picks "pretty" breakpoints)

 

hist(x,breaks=c(a,b,c,d)) #plots histogram with breakpoints between bins specified by vector of numbers a,b,c,d.  Range of breaks must include all data values.

 

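Default bins are “right-closed” (and “left-open”) so that the bin from 0 to 10 includes all values greater than (but not equal to) zero and all values less than or equal to 10.  The only exception is the first bin, which also includes the lowest breakpoint.  This can be changed:



hist(x,right=FALSE) #changes bins to be “left-closed” and “right-open”, so the bin from 0 to 10 includes 0 but not 10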
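
The effect of the right argument is easiest to see by counting values that fall exactly on a breakpoint. The sketch below uses made-up values and plot=FALSE so that hist() returns the bin counts without drawing anything.

```r
# Made-up values; two of them sit exactly on the breakpoint at 10
values <- c(0, 5, 10, 10, 15, 20)
bin_breaks <- c(0, 10, 20)

# Default (right = TRUE): the 10s are counted in the 0-10 bin
counts_default <- hist(values, breaks = bin_breaks, plot = FALSE)$counts
counts_default   # 4 2

# right = FALSE: the 10s are counted in the 10-20 bin
counts_left <- hist(values, breaks = bin_breaks, right = FALSE, plot = FALSE)$counts
counts_left      # 2 4
```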

 

 

Adding Lines to a Plot

 

You can add horizontal, vertical, and sloped lines to the plot with the abline() command:

 

abline(h=ycoordinate) #adds horizontal line at specified y-coordinate

 

abline(v=xcoordinate) #adds vertical line at specified x-coordinate

 

abline(intercept,slope) #adds line with specified intercept and slope

 

abline() will add a line that crosses the whole plot.  To add a line segment:

 

lines(c(x0,x1),c(y0,y1)) #adds line segment from (x0,y0) to (x1,y1)

 

or,

 

segments(x0,y0,x1,y1) #adds line segment from (x0,y0) to (x1,y1)

 

You can add multi-segment lines from specified points or based on variables:

 

lines(variablename$x,variablename$y) #adds lines with points based on coordinates in variables

 

Changing line properties:

 

abline(h=ycoordinate,lty=2) #changes line type to dashed lines

 

Line types include 1 (solid), 2 (dashed), 3 (dotted), 4 (dash-dot), 5 (larger dash), 6 (dash-smaller dash).

 

You can change the line color (col) and width (lwd).
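
A short sketch combining these line commands on one plot follows. The data and the line positions are made up for illustration.

```r
# Made-up example data
x <- 1:10
y <- c(2.3, 4.1, 5.8, 8.4, 9.9, 12.2, 13.7, 16.5, 18.1, 19.6)

plot(x, y)
abline(h = mean(y), lty = 2)            # dashed horizontal line at the mean of y
abline(v = 5, lty = 3, col = "blue")    # dotted blue vertical line at x = 5
abline(0, 2, col = "red", lwd = 2)      # thick red line with intercept 0 and slope 2
segments(2, 15, 4, 15)                  # short segment from (2,15) to (4,15)
lines(c(6, 8, 10), c(5, 8, 5), lty = 4) # multi-segment dash-dot line through three points
```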

 

 

Adding Points to a Plot

 

Use points(x,y) with all of the regular plotting options (col, pch, cex, etc.).



points(), lines(), text(), arrows(), and polygon() do not erase the previous plot, so they are used to add data to an existing graph.

 

 

Adding Text to a Plot

 

You can also add text labels to a plot, using the text() command:

 

text(x,y,"text") #adds text centered at specified x, y coordinates

 

Text can also be added from a variable:

 

text(variablename$x,variablename$y,variablename$text) #adds text values from data frame at given x and y coordinates for each point

 

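Text color (col) and size (cex) can be modified, as can the font.  Font values 1 (plain), 2 (bold), 3 (italic), and 4 (bold italic) work on most graphics devices; higher values depend on the graphics device and platform (here there are 18 fonts to choose from: 1-17 and 19; font 18 seems to be the same as font 1).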

 

 

text(x,y,"text",font=2) #changes font to bold

 

Changing the font of the axis labels, title, etc. requires use of font commands as part of the primary plot() function:

 

plot(x,y,font.axis=3,font.lab=3,font.main=3) #changes axis values, labels, and title to italics
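
As an illustration of labeling points from a data frame, here is a minimal sketch. The data frame sites and its columns are hypothetical, and pos=3 (place the label above the point) is an extra text() argument not covered above.

```r
# Hypothetical data frame of three labeled points
sites <- data.frame(
  x     = c(1, 2, 3),
  y     = c(2, 4, 3),
  label = c("Site A", "Site B", "Site C")
)

plot(sites$x, sites$y, pch = 16,
     xlim = c(0, 4), ylim = c(1, 5),
     font.lab = 3)                         # italic axis labels
text(sites$x, sites$y, sites$label,
     pos = 3,                              # draw each label above its point
     font = 2, col = "blue", cex = 0.9)    # bold, blue, slightly smaller text
```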

 

 

Adding Arrows to a Plot (Error Bars)

 

You can add arrows to a plot, but the most useful function of this command is to generate error bars.

 

The basic use of arrows() is:

 

arrows(x0,y0,x1,y1) #draws an arrow from (x0,y0) to (x1,y1)

 

The length (default 0.25 inches), angle (default 30 degrees), and position (default at x1 end – code 2) of the arrowhead can all be changed:

 

arrows(x,means-sds,x,means+sds,length=0.1,angle=90,code=3) #draws line with flat bars (angle=90) at both ends (code=3) of error bar, where means and sds are vectors holding the mean and standard deviation for each x.  Code 1 draws arrowhead at x0 end of line.
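
A self-contained sketch of error bars drawn with arrows() follows. The group means and standard deviations are made-up numbers, and the custom x-axis (xaxt="n" plus axis()) is an addition for this example.

```r
# Made-up summary statistics for three groups
group_id <- 1:3
means <- c(5.2, 7.1, 6.3)
sds   <- c(0.8, 1.1, 0.5)

plot(group_id, means, pch = 16,
     xlim = c(0.5, 3.5),
     ylim = range(means - sds, means + sds),  # leave room for the bars
     xaxt = "n", xlab = "Group", ylab = "Mean")
axis(1, at = group_id, labels = c("A", "B", "C"))  # group names on the x-axis

# Vertical error bars: flat caps (angle = 90) at both ends (code = 3)
arrows(group_id, means - sds, group_id, means + sds,
       length = 0.1, angle = 90, code = 3)
```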

 

 

Adding Polygons to a Plot

 

The command is polygon(x,y), where x and y are vector variables containing the coordinates of each vertex.

 

To change the polygon colors:

 

polygon(x,y,border="red") #border color is red but polygon is not filled

 

polygon(x,y,col="red") #fill color is red and border remains black

 

The line type and width can be changed using lty and lwd, respectively.

 

Rectangles can also be plotted using the simpler rect() command:

 

rect(xleft,ybottom,xright,ytop) #plots rectangle with vertices at (xleft,ybottom), (xright,ybottom), (xright,ytop), (xleft,ytop)
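
The sketch below draws one polygon and one rectangle on an empty plot. The coordinates and colors are arbitrary values chosen for illustration.

```r
# Empty plot (type = "n") that only sets up the coordinate system
plot(c(0, 10), c(0, 10), type = "n", xlab = "x", ylab = "y")

# Triangle with vertices (1,1), (5,1), (3,6): filled light blue with a thick red border
tri_x <- c(1, 5, 3)
tri_y <- c(1, 1, 6)
polygon(tri_x, tri_y, col = "lightblue", border = "red", lwd = 2)

# Grey rectangle from (6,2) to (9,8) with a dashed border
rect(6, 2, 9, 8, col = "grey", lty = 2)
```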

 

 

Adding Legend to a Plot

 

The final step in creating a plot is to add a legend explaining the symbols, colors, or line types, using the legend() command.

 

You can specify the position of the legend by (x,y) coordinates or using preset positions:

 

legend(x,y,c("name1","name2")) #adds legend with its top-left corner at (x,y)

 

The preset legend positions are:

 

legend("bottomright",c("name1","name2")) #adds legend in the bottom right corner. Also "bottom", "bottomleft", "left", "topleft", "top", "topright", "right" and "center".

 

See ?legend for more parameters, such as displaying the legend without a border (bty="n"), changing the size (cex), background color of legend box (bg), style of box (box.lty, box.lwd, box.col), color of legend text (text.col), etc.

 

The syntax to match legend symbols/colors/lines to plot symbols/colors/lines is:

 

legend(x,y,c("name1","name2"),pch=c(1,16)) #matches names to symbols in order (first name=first symbol, etc.)

 

Instead of manually typing the names, you can use the structure of factor variables to generate the list.  If colors or symbols were assigned to points using this syntax:

 

plot(x,y,pch=c(1:5)[variablename$category])

 

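R assigns the five symbols to the five levels of the “category” column in level order, which is alphabetical by default.  This requires the “category” column to be a factor: text columns were converted to factors automatically when data frames were created in R versions before 4.0, but in newer versions you must convert them yourself with as.factor() or factor().  Each unique name is then a level, so you can use levels(variablename$category) to generate the list of names in the correct order: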

 

legend(x,y,levels(variablename$category),pch=c(1:5)) #matches levels of category column (in alphabetical order) to symbols 1 through 5
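
Putting the pieces together, the following sketch plots points by category and builds the matching legend from the factor levels. The data frame specimens and its columns are hypothetical.

```r
# Hypothetical data: two made-up species
specimens <- data.frame(
  length  = c(10, 12, 11, 20, 22, 21),
  width   = c(4, 5, 4.5, 8, 9, 8.5),
  species = factor(rep(c("alpha", "beta"), each = 3))
)

symbol_set <- c(1, 16)           # one symbol per factor level, in level order
color_set  <- c("red", "blue")   # one color per factor level, in level order

plot(specimens$length, specimens$width,
     pch = symbol_set[specimens$species],
     col = color_set[specimens$species],
     xlab = "Length (mm)", ylab = "Width (mm)")

# levels() returns the names in the same order used to pick symbols and colors
legend("topleft", legend = levels(specimens$species),
       pch = symbol_set, col = color_set, bty = "n")
```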

 

 

Making Multi-Panel Graphs

 

The R plotting window can be set up to create multi-panel graphs.

 

First, divide the plotting window into a grid of panels:

 

par(mfrow=c(nr,nc)) #divides the plot window into nr rows and nc columns, filled by rows (opens a window if none is open)

par(mfcol=c(nr,nc)) #divides the plot window into nr rows and nc columns, filled by columns



Next, create the required number of graphs (nr×nc) using regular plot() commands; each new graph fills the next panel.
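
A minimal sketch of a 2-by-2 panel of graphs follows, using made-up random data. Saving the value returned by par() and restoring it afterwards is an optional extra step that returns the window to a single panel.

```r
set.seed(42)
values <- rnorm(100)              # made-up data

old_par <- par(mfrow = c(2, 2))   # 2 rows x 2 columns, filled by rows

plot(values)                      # panel 1 (top left)
hist(values)                      # panel 2 (top right)
boxplot(values)                   # panel 3 (bottom left)
plot(sort(values), type = "l")    # panel 4 (bottom right)

par(old_par)                      # restore the previous single-panel setting
```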

 

There are two ways to create more complex graphing layouts, using the layout() command and par(fig=…).

 

layout() allows you to specify different sub-plot sizes (column and row widths can be specified) and arrangements (requiring a matrix-style input):

 

layout(matrix(c(1,1,2,3),ncol=2,byrow=T)) #creates plot window with the numbers and position of sub-plots specified by matrix

 

The matrix essentially creates a picture showing the position of numbered sub-plots (in the command above, the numbers 1,1,2,3 are placed into a two-column matrix (ncol=2) by rows (byrow=T)):

 

     [,1] [,2]

[1,]    1    1

[2,]    2    3

 

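Column widths and row heights can be specified after the matrix:



layout(matrix(c(2,0,1,3),ncol=2,byrow=T),widths=c(3,1),heights=c(1,3),respect=T) #creates a 2-by-2 layout with sub-plot 2 at top left, an empty cell (0) at top right, sub-plot 1 at bottom left, and sub-plot 3 at bottom right.  The left column is three times as wide as the right column, and the bottom row is three times as tall as the top row.

#The argument names are widths= and heights=; the names can be omitted if the values are given in this order after the matrix.

#“respect=T” makes one unit of column width equal to one unit of row height, so the panel proportions stay fixed (otherwise dimensions vary with plot window size).  Can just use “T” if the widths and heights are also given without names.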

 

 

If the layout() command is assigned to a variable, you can see the layout using:

 

layout.show(variable) #opens plot window with sub-plots outlined and numbered

 

 

It may be necessary to edit the margins of each window to reduce white space between plots (if desired), using:

 

par(mar=c(b,l,t,r)) #adjusts bottom, left, top, and right margins to specified size (measured in lines of text, default is c(5,4,4,2)+0.1)
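
One possible use of the layout above is a scatterplot with histograms along its top and right edges. This is only a sketch: the data are random, the margin values were chosen by eye, and drawing the marginal histograms with barplot() is just one way to do it.

```r
set.seed(7)
x <- rnorm(200)   # made-up data
y <- rnorm(200)

# Sub-plot 1 = bottom left, 2 = top left, 3 = bottom right, 0 = empty cell
n_plots <- layout(matrix(c(2, 0, 1, 3), ncol = 2, byrow = TRUE),
                  widths = c(3, 1), heights = c(1, 3), respect = TRUE)

x_hist <- hist(x, plot = FALSE)   # compute bin counts without plotting
y_hist <- hist(y, plot = FALSE)

par(mar = c(4, 4, 1, 1))          # sub-plot 1: the scatterplot
plot(x, y)

par(mar = c(0, 4, 1, 1))          # sub-plot 2: histogram of x along the top
barplot(x_hist$counts, space = 0, axes = FALSE)

par(mar = c(4, 0, 1, 1))          # sub-plot 3: sideways histogram of y on the right
barplot(y_hist$counts, space = 0, axes = FALSE, horiz = TRUE)

layout(1)                         # return to a single-panel layout
```

The bars are not aligned exactly with the scatterplot axes in this sketch; it only shows how the panels are filled in numbered order.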

 

 

For even more flexibility, you can use the graphical parameters (par()) to specify x and y coordinates for each sub-plot:

 

par(fig=c(x1,x2,y1,y2)) #creates sub-plot window from x1 to x2 and y1 to y2, on a scale of 0 (left edge/bottom) to 1 (right edge/top)

 

par(fig=c(x1,x2,y1,y2),new=T) #The new=T argument is required to add a sub-plot to an existing plot without erasing it.  It may take trial and error to find suitable coordinates.
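
As a sketch of par(fig=...), the example below draws a small inset plot in the upper right corner of a main plot. The data are random, and the inset coordinates and margins are values found by trial and error for this example.

```r
set.seed(3)
values <- rnorm(200)                   # made-up data

old_par <- par(no.readonly = TRUE)     # save current settings so they can be restored

par(fig = c(0, 1, 0, 1))               # main plot uses the whole window
hist(values, main = "Main plot")

# Inset: right 38% of the width, top 43% of the height, with small margins
par(fig = c(0.6, 0.98, 0.55, 0.98), new = TRUE, mar = c(2, 2, 1, 1))
boxplot(values, horizontal = TRUE)

par(old_par)                           # restore the saved settings
```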