R: Indexing & Subsetting Vectors


I. Indexing Vectors

R operates on named data structures. The simplest such structure is the numeric vector, which is a single entity consisting of a collection of numbers. As an example, let's create a simple vector named x consisting of four numbers, namely 13, 21, 23.4, and 7.4. To create the vector and assign the data to the variable name x, combine the numbers with the c() function: x <- c(13, 21, 23.4, 7.4).
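
As a minimal sketch of the step described above, the following creates the four-number vector and inspects it. The use of `<-` for assignment and the call to length() are illustrative choices, not part of the original text; `=` would also work for the assignment.

```r
# Combine four numbers into a numeric vector and assign it to the name x
x <- c(13, 21, 23.4, 7.4)

# Print the vector
x
# [1] 13.0 21.0 23.4  7.4

# Number of elements in the vector
length(x)
# [1] 4
```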


II. Index Extraction

Elements can be extracted from a vector using numeric indexing. As shown below, specific elements are extracted by referencing the position of a single element or the positions of multiple elements. A negative index excludes the element at that position.

> ### Create a variable x, with numbers 11 through 20
> x=11:20
> x
 [1] 11 12 13 14 15 16 17 18 19 20
> 
> # the second element of x
> x[2]
[1] 12
> 
> # the second through fifth elements of x, inclusive
> x[2:5]
[1] 12 13 14 15
> 
> # all except the second element
> x[-2]
[1] 11 13 14 15 16 17 18 19 20
> 
> # Keep all values except 1st, 3rd, and 5th numbers
> x[-c(1,3,5)]
[1] 12 14 16 17 18 19 20
> 
> # Keep only the 1st, 3rd, and 5th numbers
> x[c(1,3,5)]
[1] 11 13 15
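
Because the values in 11:20 are consecutive, it can be easy to confuse a position with a value. The sketch below repeats the same kinds of index extraction on a vector whose values are not in sequence. The name y is hypothetical, chosen so that x is left unchanged.

```r
# Hypothetical example vector with non-sequential values
y <- c(13, 21, 23.4, 7.4)

# The index is a position, not a value
second <- y[2]              # 21

# First and fourth elements
first_fourth <- y[c(1, 4)]  # 13.0 7.4

# Everything except the first element
without_first <- y[-1]      # 21.0 23.4 7.4
```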

III. Conditional Extraction

With index extraction, elements are extracted by location. With conditional extraction, elements are extracted if they meet specific conditions. For example, a vector may contain the ages of all the students in a high school. To extract only the ages that are less than 15, conditional extraction is a good approach.
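
The student-age example can be sketched as follows. The ages are made-up illustrative data, and the names ages and young are hypothetical.

```r
# Hypothetical ages of students (illustrative data only)
ages <- c(14, 16, 15, 13, 17, 14, 18)

# The condition produces a logical vector: TRUE where the age is less than 15
ages < 15
# [1]  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE

# Indexing with that logical vector keeps only the TRUE positions
young <- ages[ages < 15]
young
# [1] 14 13 14
```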

> # all numbers greater than 12 AND less than 15
> x[x>12 & x<15]
[1] 13 14
> 
> # all numbers less than 12 OR greater than 15
> x[x<12 | x>15]
[1] 11 16 17 18 19 20
> 
> # all numbers not equal to 15
> x[x!=15]
[1] 11 12 13 14 16 17 18 19 20
> 
> # all numbers greater than or equal to 18
> x[x>=18]
[1] 18 19 20
> 
> # Which x values are greater than or equal to 13
> x>=13
 [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Operator    Description
>           greater than
<           less than
>=          greater than or equal to
<=          less than or equal to
==          equal to
!=          not equal to
!x          not x
x | y       x OR y
x & y       x AND y
isTRUE(x)   test if x is TRUE
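
The operators not used in the examples above (`!`, `==`, `<=`, and isTRUE()) can be sketched in the same way. This assumes x holds 11:20, as in the earlier transcript. The result names are hypothetical.

```r
x <- 11:20   # same values as the x used above

# ! negates a logical vector: keep the values that are NOT greater than 15
not_above_15 <- x[!(x > 15)]   # 11 12 13 14 15

# <= gives the same result here
at_most_15 <- x[x <= 15]       # 11 12 13 14 15

# == tests equality element by element
equal_13 <- x[x == 13]         # 13

# isTRUE() returns TRUE only for a single (length-one) TRUE value
single <- isTRUE(x[1] == 11)   # TRUE
whole  <- isTRUE(x > 15)       # FALSE, because x > 15 has ten elements
```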

IV. The subset Function

The subset function is similar to conditional indexing. It returns the subset of a vector that meets a defined condition or group of conditions.

> v = 10:20
> v
 [1] 10 11 12 13 14 15 16 17 18 19 20
> 
> # ALL ELEMENTS OF v GREATER THAN 14
> subset(v, v > 14)
[1] 15 16 17 18 19 20
> 
> # ALL ELEMENTS OF v LESS THAN 14 OR GREATER THAN OR EQUAL TO 18
> subset(v, v < 14 | v >= 18)
[1] 10 11 12 13 18 19 20
> 
> # ALL ELEMENTS OF v GREATER THAN 14 AND LESS THAN 18
> subset(v, v > 14 & v < 18)
[1] 15 16 17
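
To make the similarity to conditional indexing concrete, the sketch below applies one condition both ways and compares the results. It assumes a vector with no missing values. The names by_index and by_subset are hypothetical.

```r
v <- 10:20

# Conditional indexing with square brackets
by_index <- v[v > 14 & v < 18]

# The same condition passed to subset()
by_subset <- subset(v, v > 14 & v < 18)

by_subset
# [1] 15 16 17

# Both approaches return the same vector here
identical(by_index, by_subset)
# [1] TRUE
```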

 

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.