
R: Indexing & Subsetting Vectors

I. Indexing Vectors

R operates on named data structures. The simplest such structure is the numeric vector, a single entity consisting of a collection of numbers. As an example, let's create a simple vector named x consisting of four numbers: 13, 21, 23.4, and 7.4. To create the vector and assign the data to the variable name x, combine the values with the c() function.
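
A minimal sketch of the step just described. The original listing is not reproduced here, so the exact call is an assumption: it combines the four values with c() and binds the result to x.

```r
# Combine the four numbers into one numeric vector and bind it to the name x.
# Assumed form: "<-" is the conventional assignment operator; "=" also works.
x <- c(13, 21, 23.4, 7.4)

x          # print the vector
length(x)  # number of elements: 4
class(x)   # storage class: "numeric"
```

Typing the name of a vector on its own line prints its contents, which is a quick way to confirm the assignment worked.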

II. Index Extraction

Elements can be extracted from a vector using numeric indexing. As shown below, an element is selected by giving its position inside square brackets, and several elements are selected by giving a vector of positions. Positions are counted from 1, and a negative position drops that element instead of keeping it.

> ### Create a variable x, with numbers 11 through 20
> x=11:20
> x
 [1] 11 12 13 14 15 16 17 18 19 20
>
> # the second element of x
> x[2]
[1] 12
>
> # the second through fifth elements of x, inclusive
> x[2:5]
[1] 12 13 14 15
>
> # all except the second element
> x[-2]
[1] 11 13 14 15 16 17 18 19 20
>
> # Keep all values except 1st, 3rd, and 5th numbers
> x[-c(1,3,5)]
[1] 12 14 16 17 18 19 20
>
> # Keep only the 1st, 3rd, and 5th numbers
> x[c(1,3,5)]
[1] 11 13 15
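
The same positional rules can be checked in a script rather than at the prompt. The sketch below rebuilds x as 11:20; the names idx, kept, dropped, last, and beyond are hypothetical. The last two assignments show edge cases that the transcript does not cover.

```r
x <- 11:20

idx <- c(1, 3, 5)           # hypothetical helper: positions to keep or drop
kept    <- x[idx]           # elements at positions 1, 3, and 5
dropped <- x[-idx]          # everything except positions 1, 3, and 5

last   <- x[length(x)]      # last element, whatever the vector's length
beyond <- x[length(x) + 1]  # a position past the end gives NA, not an error
```

Storing the positions in a variable keeps the "keep" and "drop" forms in sync, since the only difference between them is the minus sign.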

III. Conditional Extraction

With index extraction, elements are selected by their location. With conditional extraction, elements are selected because they meet a specific condition. For example, a vector may contain the ages of all students in a high school. To extract only the ages below 15, conditional extraction is the natural tool.

> # all numbers greater than 12 AND less than 15
> x[x>12 & x<15]
[1] 13 14
>
> # all numbers less than 12 OR greater than 15
> x[x<12 | x>15]
[1] 11 16 17 18 19 20
>
> # all numbers not equal to 15
> x[x!=15]
[1] 11 12 13 14 16 17 18 19 20
>
> # all numbers greater than or equal to 18
> x[x>=18]
[1] 18 19 20
>
> # Which x values are greater than or equal to 13
> x>=13
 [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Operator     Description
>            greater than
<            less than
>=           greater than or equal to
<=           less than or equal to
==           equal to
!=           not equal to
!x           not x
x | y        x OR y
x & y        x AND y
isTRUE(x)    test if x is TRUE
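
To make the student-age example concrete, here is a small sketch. The ages vector is made-up data for illustration only, and the variable names are hypothetical. It also shows that a condition first produces a logical vector, which is then used as the index.

```r
# Hypothetical ages of students (illustrative data only)
ages <- c(14, 16, 15, 13, 17, 14, 18)

under15   <- ages < 15       # logical vector: TRUE where the condition holds
young     <- ages[under15]   # keep only the ages below 15: 14 13 14
positions <- which(under15)  # positions of the matching elements: 1 4 6
n_young   <- sum(under15)    # TRUE counts as 1, so this counts matches: 3
```

Keeping the logical vector in its own variable makes it easy to reuse the same condition for extracting, locating, and counting.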

IV. The subset Function

The subset function works much like conditional indexing. It returns the subset of a vector whose elements satisfy a given condition or combination of conditions.

> v = 10:20
> v
 [1] 10 11 12 13 14 15 16 17 18 19 20
>
> # ALL ELEMENTS OF v GREATER THAN 14
> subset(v, v > 14)
[1] 15 16 17 18 19 20
>
> # ALL ELEMENTS OF v LESS THAN 14 OR GREATER THAN OR EQUAL TO 18
> subset(v, v < 14 | v >= 18)
[1] 10 11 12 13 18 19 20
>
> # ALL ELEMENTS OF v GREATER THAN 14 AND LESS THAN 18
> subset(v, v > 14 & v < 18)
[1] 15 16 17
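
The sketch below compares subset() with the equivalent bracket expression. For a vector without missing values the two agree. The vector w, which contains an NA, is a made-up example showing one practical difference: subset() drops elements where the condition evaluates to NA, while bracket indexing returns NA for them.

```r
v <- 10:20

# Same condition, two spellings: the results are identical here
a <- subset(v, v > 14 & v < 18)
b <- v[v > 14 & v < 18]
same <- identical(a, b)            # TRUE

# Hypothetical vector with a missing value
w <- c(12, NA, 16, 19)
with_subset   <- subset(w, w > 14) # 16 19      (the NA is dropped)
with_brackets <- w[w > 14]         # NA 16 19   (the NA is carried through)
```

If missing values are possible in the data, this difference is worth keeping in mind when choosing between the two forms.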