R seminar series #10

Last update on April 30, 2015.

String operations in R

Checking or changing strings in complex data sets by hand is often very time consuming, but R has a rich set of functions for managing strings (e.g. sentences or paragraphs). In this session we will focus on the stringr package and introduce a few of its key functions. You will need:



install.packages("stringr") # if you don't have it already
library(stringr) # load the package so that str_count() and str_replace() are available

The example data we will use can be downloaded here!
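If you cannot get hold of the file, a small stand-in data frame lets you follow along. This is only a sketch: the rows and the `SiteID` column are invented for illustration, and the only thing taken from the seminar code is the column name `VegStrataDescr`.

```r
# Hypothetical stand-in for textExample.csv; every value here is made up
text.dat <- data.frame(
  SiteID = 1:3,  # invented identifier column
  VegStrataDescr = c(
    "Open Acacia woodland with scattered grass tussocks",
    "Dense grass cover, grass height 30 cm, no tree layer",
    "Tall tree canopy over acacia shrubs"
  ),
  stringsAsFactors = FALSE  # keep the descriptions as character strings
)
head(text.dat)
```

With this object in place you can skip the read.csv() line and run the rest of the code as it is.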

# Sample code from the introductory session

text.dat <- read.csv("textExample.csv", stringsAsFactors = FALSE) # or give the full path to where you saved the downloaded file; stringsAsFactors = FALSE keeps text columns as character strings
head(text.dat) # to have a look at the start of the data...

# Let's count the number of times a certain string appears in the vegetation strata description
# (matching is case-sensitive, so the pattern "acacia" does not count "Acacia")
text.dat$acacia <- str_count(text.dat$VegStrataDescr, pattern = "acacia")
text.dat$grass <- str_count(text.dat$VegStrataDescr, pattern = "grass")
text.dat$tree <- str_count(text.dat$VegStrataDescr, pattern = "tree")

# You may want to try this with other terms as well...

# We often need to change from lower case to UPPER CASE or the other way around. tolower() and toupper() are useful functions for this.
text.dat$VegStrataDescr <- tolower(text.dat$VegStrataDescr) # Converts the whole string to lower case

# Let's copy the description column into a new column and do some string replacement there
text.dat$NewStrataDescr <- text.dat$VegStrataDescr
# str_replace() replaces only the first match in each string; str_replace_all() replaces every match
text.dat$NewStrataDescr <- str_replace(text.dat$NewStrataDescr, pattern = "grass", replacement = "grassland") # we will look at regular expressions, and base R's grep(), in a later seminar.
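To see what these functions do without the full data set, here is a small sketch on two invented strings (the vector `descr` and the result names are made up for this example). It shows that str_count() is case-sensitive, and how str_replace() differs from str_replace_all().

```r
library(stringr)

# Invented example strings, not taken from textExample.csv
descr <- c("Acacia and acacia over grass", "grass, more grass, grassland")

# Case-sensitive count: "Acacia" with a capital A is not matched
n.exact <- str_count(descr, pattern = "acacia")           # 1 0

# Lower-casing first makes the count ignore case
n.lower <- str_count(tolower(descr), pattern = "acacia")  # 2 0

# str_replace() changes only the first match in each string
first.only <- str_replace(descr, pattern = "grass", replacement = "grassland")
# second element: "grassland, more grass, grassland"

# str_replace_all() changes every match, including the "grass" inside "grassland"
all.matches <- str_replace_all(descr, pattern = "grass", replacement = "grassland")
# second element: "grassland, more grassland, grasslandland"
```

The last result shows why a plain pattern can be too greedy: a word that already contains the pattern gets changed as well, so it is worth checking a few rows after any replacement.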
