This document lists and illustrates a number of basic plot functions. To replicate the example scripts, please load the restructured data set AirPassengers (see help(AirPassengers)) in R:
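The loading step itself is not reproduced here. As a minimal sketch, the examples below only require a data frame called dat with the columns Year, Month, and AirPassengers. The construction shown is an assumption based on how dat is used later, not necessarily how the original data frame was built:

```r
# AirPassengers is a monthly time series (1949-1960) that ships with R.
data(AirPassengers)

# Assumed layout: one row per month, ordered by year and then by month.
dat <- data.frame(
  Year          = rep(1949:1960, each = 12),
  Month         = rep(1:12, times = 12),
  AirPassengers = as.numeric(AirPassengers)
)

head(dat)
```

Any other way of restructuring the time series works equally well, as long as the three column names match.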
The function plot sets up a plotting grid. The function optionally adds data in the form of a line or points, depending on the argument type. If you would just like to set up an empty grid and add data later, you could use the argument type='n'. Here are some examples:
dat$Time <- dat$Year+dat$Month/12
# by default, type is set to 'p': points
plot(x=dat$Time, y=dat$AirPassengers)
# ... but lines may be more insightful here:
plot(x=dat$Time, y=dat$AirPassengers, type='l',
main="Title", xlab="Year", ylab="AirPassengers")
# we could also set up an empty plot, to add points or lines later:
plot(x=c(), y=c(), type='n',
xlim=range(dat$Time), ylim=range(dat$AirPassengers),
main="Title", xlab="Year", ylab="AirPassengers",
bty='n') # <- no box around plot
# the package plotfunctions provides a wrapper for doing this:
library(plotfunctions)
# note that we set the *range* rather than actual values here
emptyPlot( range(dat$Time), range(dat$AirPassengers))
See help(par) for graphical parameters that can be added to most plotting functions to change the layout and colors.
The function points is used to add points to a plot grid. The argument pch specifies the plotting symbol.
emptyPlot( range(dat$Time), range(dat$AirPassengers))
points(dat$Time, dat$AirPassengers,
pch=16, col=alpha(2)) # <- alpha() from package plotfunctions
Various types of points:
par(cex=1.1)
emptyPlot(c(0,6), c(0,6), axes=FALSE, bty='o')
for(i in 1:5){
for(j in 1:5){
points(i, j, pch=(i-1)*5+j, bg=alpha(2))
text(i,j,labels=(i-1)*5+j, cex=.5, pos=3, col='red')
}
}
The function lines is used to add lines to a plot grid. The argument lwd specifies the line width, and lty the line type.
emptyPlot( range(dat$Time), range(dat$AirPassengers))
points(dat$Time, dat$AirPassengers, pch=16, col=alpha(2))
lines(dat$Time, dat$AirPassengers, lwd=1, lty=5)
# different types:
emptyPlot( range(dat$Time), range(dat$AirPassengers))
lines(dat$Time, dat$AirPassengers, lwd=1, type='b', pch=21)
emptyPlot( range(dat$Time), range(dat$AirPassengers))
lines(dat$Time, dat$AirPassengers, lwd=1, type='o', pch=21)
emptyPlot( range(dat$Time), range(dat$AirPassengers))
lines(dat$Time, dat$AirPassengers, lwd=1, type='h')
Various types of lines:
emptyPlot(c(0,6), c(0,7), axes=FALSE, bty='o')
for(i in 1:6){
abline(h=i, lty=i)
text(1,i,labels=i, cex=.5, pos=3, col='red')
}
R allows for specifying 8 colors as numbers, which are recycled when higher values are used:
par(cex=1.1)
emptyPlot(c(0,4), c(0,4), axes=FALSE, bty='o')
for(i in 1:4){
for(j in 1:3){
points(i, j, pch=15, cex=2, col=(i-1)*3+j)
text(i,j,labels=(i-1)*3+j, cex=.5, pos=3, col='red')
}
}
In addition, R accepts 657 predefined color names, such as ‘red’, ‘black’, ‘cornflowerblue’, and ‘limegreen’, to name a few examples. All color names can be listed with the function colors().
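As a small sketch of how these two color systems can be inspected (the search pattern "blue" is only an illustration):

```r
# Number of predefined color names known to R
length(colors())

# A few of the names that contain "blue"
head(grep("blue", colors(), value = TRUE))

# Numeric colors index the current palette; reset it to the default first
palette("default")
length(palette())
palette()
```

Numbers above the palette length wrap around, which is why color 9 looks the same as color 1 in the grid above.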
Axis labels and titles can be provided to most generic plot functions, such as plot(), barplot(), and emptyPlot(), using the following arguments:
main: adds an overall title for the plot;
sub: adds a subtitle for the plot;
xlab, ylab: add a title for the x-axis or y-axis, respectively.
Alternatively, the function title() is available for adding labels and titles afterwards. The following two code snippets result in the same plot:
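The original snippets are not reproduced here. A minimal sketch of two equivalent calls, using made-up data x and y (both are assumptions for illustration), could look like this:

```r
x <- 1:10
y <- x^2

# Snippet 1: labels passed directly to plot()
plot(x, y, main = "Title", xlab = "x", ylab = "y squared")

# Snippet 2: suppress all annotation with ann=FALSE, then add it with title()
plot(x, y, ann = FALSE)
title(main = "Title", xlab = "x", ylab = "y squared")
```

The second form is convenient when the labels depend on something that is computed after the plot has been set up.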
Setting the argument axes to FALSE will exclude the axes. This is useful when you would like to specify your own axes: the function axis() will add an axis at a specific side of the plot.
In R, each side of the plot has a number: 1 for the bottom, 2 for the left, 3 for the top, and 4 for the right. Therefore, we can use the number 1 for adding an axis to the bottom: axis(1).
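The side numbering can be made visible with a small base R sketch that writes each number into the corresponding margin (the helper vector side_names is introduced here for illustration only):

```r
# Empty plot region without axes, with a box to mark the plot region
plot(0.5, 0.5, type = "n", xlim = c(0, 1), ylim = c(0, 1),
     axes = FALSE, xlab = "", ylab = "")
box()

# Label each margin with its side number
side_names <- c("bottom", "left", "top", "right")
for (s in 1:4) {
  mtext(sprintf("side %d (%s)", s, side_names[s]), side = s, line = 0.5)
}
```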
The function axis() gives more control over the layout of the axes. For example, the arguments at and labels specify the tick marks and their labels, and the arguments pos (value measured in coordinates) and line (the number of lines into the margin) set how far into the plot region or into the margin the axis is drawn.
Another useful argument is las (see help(par) for more information):
las=0 presents the axis labels always parallel to the axis (default);
las=1 presents the axis labels always horizontal;
las=2 presents the axis labels always perpendicular to the axis;
las=3 presents the axis labels always vertical.
emptyPlot(1,1,axes=FALSE)
# axis at bottom, range and labels automatically:
axis(1)
# axis at left, horizontal labels:
axis(2, las=1)
# axis at top, perpendicular and customized labels:
axis(3, at=c(.25,.75), labels=c("position 1", "position 2"), las=2)
# axis at right, no labels:
axis(4, labels = FALSE)
# different positions and layout:
axis(4, at=c(0.25,1), labels=c(1,6), pos=0.8, las=1,
col.ticks=2, col=2, col.axis=2, font=2, lwd=2)
axis(4, at=c(0.25,1), labels=FALSE, pos=-0.25, las=1,
col.ticks=2, col=2, col.axis=2, font=2, lwd=2)
for(i in c(-1:3)){
axis(4, at=c(0.25,1), labels=FALSE, line=i, las=1,
col.ticks=2, col=2, col.axis=2, lwd=0.5)
}
With the argument bty the border box around the plot region is adjusted. The argument accepts the following options:
bty="o": closed box (the default),
bty="l": lines at sides 1 and 2,
bty="7": lines at sides 3 and 4,
bty="c": lines at sides 1, 2, 3,
bty="u": lines at sides 2, 1, 4,
bty="]": lines at sides 3, 4, 1,
bty="n": no box.
The function box() draws a box around the plot region and accepts the same values to determine the style.
Besides these basic building blocks, there is also a range of functions for making special plots. Here we illustrate barplot, hist, density, and boxplot.
For illustrating the use of the basic R function barplot, we first calculate the average number of air passengers per month. For more information on aggregation see aggregation.
avg1 <- tapply(dat$AirPassengers, list(dat$Month), mean)
sd1 <- tapply(dat$AirPassengers, list(dat$Month), sd)
Basic use of barplot:
But we could also add error bars (which are illustrated in more detail below). The important point here is that the output b saves the x-positions of the bars.
## [,1]
## [1,] 0.7
## [2,] 1.9
## [3,] 3.1
## [4,] 4.3
## [5,] 5.5
## [6,] 6.7
## [7,] 7.9
## [8,] 9.1
## [9,] 10.3
## [10,] 11.5
## [11,] 12.7
## [12,] 13.9
A nice feature of barplot is that it can also handle two-dimensional data.
avg2 <- with(dat[dat$Year %in% c(1950, 1960),],
matrix(AirPassengers, ncol=12, nrow=2, byrow=TRUE))
rownames(avg2) <- c("1950", "1960")
colnames(avg2) <- 1:12
avg2
## 1 2 3 4 5 6 7 8 9 10 11 12
## 1950 115 126 141 135 125 149 170 170 158 133 114 140
## 1960 417 391 419 461 472 535 622 606 508 461 390 432
There are different variations of how to visualize two dimensional data. Below we show 3 examples.
par(mfrow=c(1,3), cex=1.1)
# PLOT 1
# stacked (default)
barplot(avg2, main="PLOT 1")
# PLOT 2
# beside
barplot(avg2,beside=TRUE, main="PLOT 2")
# PLOT 3
# transposed dimensions
avg3 <- t(avg2)
dim(avg2)
## [1] 2 12
dim(avg3)
## [1] 12 2
Histograms present the distribution of numerical data. Each bar represents the number of observations in that particular interval. The binned data can optionally be returned.
… with the following information returned:
## $breaks
## [1] 100 150 200 250 300 350 400 450 500 550 600 650
##
## $counts
## [1] 24 24 21 13 21 13 13 8 4 1 2
##
## $density
## [1] 0.0033333333 0.0033333333 0.0029166667 0.0018055556 0.0029166667
## [6] 0.0018055556 0.0018055556 0.0011111111 0.0005555556 0.0001388889
## [11] 0.0002777778
##
## $mids
## [1] 125 175 225 275 325 375 425 475 525 575 625
##
## $xname
## [1] "dat$AirPassengers"
##
## $equidist
## [1] TRUE
##
## attr(,"class")
## [1] "histogram"
Options include the arguments xlim, to set the range of the x-axis, and breaks, to control the boundaries (and thereby the size) of the intervals.
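A minimal sketch of both points: storing the returned object without plotting, and setting the intervals by hand. The break points and axis range are arbitrary choices for illustration:

```r
passengers <- as.numeric(AirPassengers)

# plot=FALSE returns the binned data without drawing anything
h <- hist(passengers, plot = FALSE)
h$breaks
h$counts

# Narrower intervals of width 25 and a wider x-axis
hist(passengers, breaks = seq(100, 650, by = 25), xlim = c(0, 700),
     main = "Narrower intervals", xlab = "AirPassengers")
```

Note that the values passed to breaks must span the full range of the data.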
Boxplots (i.e., box-and-whisker plots) summarize the distribution of numerical data in five values, namely the median, the lower and upper hinge of the box (i.e., approximately the first and third quartile), and the extremes of the lower and the upper whisker. The box represents the IQR (inter-quartile range, first to third quartile of the data), and the whiskers extend to the most extreme data point that is no further than 1.5 times the length of the box (i.e., 1.5*IQR) away from the box. Data points outside this range are considered outliers and plotted as points.
boxplot(AirPassengers ~ Year, data=dat,
col="gray", main="Grouped data", las=1,
xlab="Year", ylab="AirPassengers")
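The five summary values can also be inspected directly. The sketch below rebuilds the yearly data so that it runs on its own, and uses plot=FALSE to return the statistics instead of drawing the plot:

```r
dat <- data.frame(
  Year          = rep(1949:1960, each = 12),
  AirPassengers = as.numeric(AirPassengers)
)

bp <- boxplot(AirPassengers ~ Year, data = dat, plot = FALSE)

# One column per year; rows are: lower whisker, lower hinge, median,
# upper hinge, upper whisker
bp$stats[, 1]

# Data points beyond the whiskers (if any) and the group they belong to
bp$out
bp$group
```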
A contour plot is a graphical representation of a 3-dimensional surface that indicates the value of z with contour lines. For a given value of z, a line is drawn connecting the x and y coordinates where that z value is measured.
x <- unique(dat$Year)
y <- unique(dat$Month)
z <- matrix(dat$AirPassengers, nrow=length(x), ncol=length(y), byrow=TRUE) # rows: years (x), columns: months (y)
contour(x, y, z,
main="AirPassengers",
xlab="Year", ylab="Month")
An image is a graphical representation of a 3-dimensional surface that indicates the value of z with a color spectrum.
x <- unique(dat$Year)
y <- unique(dat$Month)
z <- matrix(dat$AirPassengers, nrow=length(x), ncol=length(y), byrow=TRUE) # rows: years (x), columns: months (y)
image(x, y, z, col=terrain.colors(50),
zlim=c(0,700),
main="AirPassengers",
xlab="Year", ylab="Month")
gradientLegend(valRange = c(0,700),color = terrain.colors(50),
side=3,pos=0.125,inside = FALSE)
Generally, the functions image() and contour() are combined.
For example, the function plotsurface() combines image and contour for visualizing the contents of a data frame.
library(plotfunctions)
plotsurface(dat, view=c("Year", "Month"), predictor = "AirPassengers", color=terrain.colors(50), col=1, zlim=c(0,700))
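The same combination can be sketched in base R without any additional package, by drawing the image first and then overlaying the contour lines with add=TRUE. The variable names years and months are introduced here for illustration, and the matrix assumes the data are ordered by year and then by month:

```r
years  <- 1949:1960
months <- 1:12

# Rows index the x-values (years), columns the y-values (months)
z <- matrix(as.numeric(AirPassengers),
            nrow = length(years), ncol = length(months), byrow = TRUE)

image(years, months, z, col = terrain.colors(50), zlim = c(0, 700),
      main = "AirPassengers", xlab = "Year", ylab = "Month")
contour(years, months, z, add = TRUE)
```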
The function legend adds a legend in the specified location within the plot region. Possible location values include:
‘topleft’ | ‘top’ | ‘topright’ |
‘left’ | ‘center’ | ‘right’ |
‘bottomleft’ | ‘bottom’ | ‘bottomright’ |
emptyPlot(1,1, bty='o')
legend('topright', legend=c("A", "B", "C"),
col=c(2, 3, 4), lwd=c(1,2,3), lty=c(1,2,3),
pch=c(16, NA, NA), merge=TRUE)
legend('center', legend=c("A", "B", "C"),
col=c(2, 3, 4), lwd=c(1,2,3), lty=c(1,2,3),
pch=c(16, NA, NA), merge=TRUE)
The function legend_margin works the same, but instead plots the legend in the plot margins. So the position is specified relative to the figure panel rather than relative to the plot region.
emptyPlot(1,1, bty='o')
legend_margin('topright', legend=c("A", "B", "C"),
col=c(2, 3, 4), lwd=c(1,2,3), lty=c(1,2,3),
pch=c(16, NA, NA), merge=TRUE)
legend('center', legend=c("A", "B", "C"),
col=c(2, 3, 4), lwd=c(1,2,3), lty=c(1,2,3),
pch=c(16, NA, NA), merge=TRUE)
The function text is useful for adding labels to plots. The function expects an x- and y-position and a label to plot. In addition, other parameters specify the layout:
pos: pos=1 below coordinates (x,y); pos=2 left of coordinates (x,y); pos=3 above coordinates (x,y); pos=4 right of coordinates (x,y).
adj: horizontal alignment; the two extremes are adj=0, which causes the text to start at the position (left-aligned), and adj=1, which causes the text to end at the position (right-aligned).
font: font=1 normal font, font=2 bold, font=3 italics, and font=4 bold italics.
emptyPlot(1,1, las=1, v=0.5)
for(i in seq(0,1, by=0.2)){
text(0.5, i, labels=sprintf("### adj=%01.1f ###", i), adj=i, col=alpha(2), xpd=TRUE)
}
points(0.2,0.2, pch=16)
for(i in 1:4){
text(0.2,0.2, labels=sprintf("pos=%d",i), pos=i)
}
The function mtext is used for adding labels into the margins of the plot.
emptyPlot(1,1, axes=FALSE)
axis(1, at=c(0,1))
axis(2, at=c(0,1))
box(lty=3)
for(i in 1:4){
for(j in -2:4){
mtext(sprintf("%d | line=%d",i, j), side=i, line=j, col=j+3)
}
}
mtext(c(0,1), side=1, line=2, at=c(0,1),col=2 )
mtext(c(0,1), side=2, line=2, at=c(0,1),col=2 )
Segments (straight lines) and arrows are useful for highlighting aspects in the plots. The functions segments() and arrows() work in a similar way and expect the following information:
x0: x-coordinate starting point;
y0: y-coordinate starting point;
x1: x-coordinate end point;
y1: y-coordinate end point.
In addition, the following arguments of arrows() determine the arrowhead:
code: code=1 draws an arrowhead at the starting point, code=2 draws an arrowhead at the end point, code=3 draws arrowheads at both ends.
length: length of the edges of the arrowhead in inches.
angle: angle from the shaft of the arrow to the edge of the arrowhead.
Finally, the generic layout options such as col (color), lwd (line width), and lty (line style) apply.
emptyPlot(1,1)
# draw arrows of length=1
x2y <- function(x, l=1){
y = sqrt(l^2-x^2)
return(y)
}
x <- c(0.2,0.3)
y <- x2y(x)
segments(x0=x, y0=c(0,0),
x1=x, y1=y,
lty=3, col=2)
arrows(x0=c(0,0), y0=c(0,0),
x1=x, y1=y,
length=.15,
lwd=c(1,2), col=c(1,4))
Error bars can be added using arrows. The function errorBars is a wrapper around arrows.
For plotting symmetric error bars, the function expects a series of x-coordinates (or y-coordinates when horiz=TRUE), a series of means, and a series of confidence intervals (arguments x, mean, and ci).
For plotting asymmetric error bars, the optional argument ci.l represents the value for the lower confidence intervals, whereas the argument ci is taken as the value for the upper confidence band.
Note that the values for ci and ci.l in any case will be added to the mean.
library(plyr)
library(plotfunctions)
# calculate means and sd and quantiles:
means <- ddply(dat, "Year", summarise,
mean = mean(AirPassengers),
sd = sd(AirPassengers),
min = min(AirPassengers),
max = max(AirPassengers))
# Two different plots:
# PLOT 1:
b <- barplot(means$mean, names.arg = means$Year, beside = TRUE,
ylim=c(0,600), las=2,
main="AirPassengers")
with(means, errorBars(b, mean, sd))
text(min(b), 600, labels=expression("" %+-% "1SD"),
adj=0, xpd=TRUE)
# PLOT 2:
emptyPlot(c(0,600), range(means$Year),
ylab="Year", xlab="AirPassengers", las=1)
with(means, {
errorBars(Year, mean, ci=mean-min, ci.l=max-mean, horiz = TRUE, length=.05)
points(mean, Year, pch=15)
})
legend("bottomright", legend=c("mean", "min-max"),
lwd=c(0, 1), pch=c(15, NA),
merge=TRUE, bty='n')
The function plot_image() can add images (png, jpg, gif, or matrices) to a plot, as background or at specific coordinates. See help(plot_image) for more information and examples.
library(plotfunctions)
# see Volcano example at help(image)
# create image object:
myimg <- list(image=volcano-min(volcano), col=topo.colors(max(volcano)-min(volcano)))
# create empty plot window:
emptyPlot(1,1, main="Volcano images")
# add image topleft corner:
plot_image(img=myimg, xrange=c(0,.25), yrange=c(.75,1), add=TRUE)