Data types in R programming language
In this tutorial, we are going to learn about the data types in the R programming language: vectors, lists, matrices, arrays, factors, and data frames.
Submitted by Ayush Sharma, on November 08, 2018
In any programming language, you use variables to store information. A variable is a name for a reserved memory location that holds a value, so creating a variable reserves some space in memory.
Unlike languages such as C, C++, and Java, R does not require variables to be declared with a data type. A variable in R is assigned an R-object, and the data type of that R-object becomes the data type of the variable. There are different types of R-objects. The frequently used ones are:
- Vectors
- Lists
- Matrices
- Arrays
- Factors
- Data Frames
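To illustrate that a variable takes its type from the R-object assigned to it, here is a minimal sketch. The variable name x and its values are purely illustrative and are not part of the original tutorial.

```r
# The same variable name can be bound to objects of different types.
x <- 42L              # an integer vector of length 1
print(class(x))       # "integer"

x <- "forty-two"      # rebinding x to a character vector
print(class(x))       # "character"

x <- c(TRUE, FALSE)   # rebinding x to a logical vector
print(class(x))       # "logical"
```

No declaration is needed at any point: class() simply reports the type of whatever object x currently refers to.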
1) Vectors
To create a vector with more than one element, use the c() function, which combines the elements into a vector.
# Create a vector.
apple <- c('red','green',"yellow")
print(apple)
# Get the class of the vector.
print(class(apple))
When we execute the above code, it produces the following result
[1] "red" "green" "yellow"
[1] "character"
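As a small follow-up sketch, the lines below show how to inspect a vector and how a single value is itself a vector of length one, which is why c() is only needed when combining several elements. The variable name single is an illustrative assumption.

```r
# Recreate the vector from the example above.
apple <- c('red', 'green', "yellow")

print(length(apple))   # 3
print(apple[2])        # "green" (indexing starts at 1)

# A single value is already a vector of length 1, so c() is optional here.
single <- "red"
print(is.vector(single))             # TRUE
print(length(single))                # 1
print(identical(single, c("red")))   # TRUE
```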
2) Lists
A list is an R-object that can contain elements of many different types, such as vectors, functions, and even other lists.
# Create a list.
list1 <- list(c(2,5,3),21.3,sin)
# Print the list.
print(list1)
When we execute the above code, it produces the following result
[[1]]
[1] 2 5 3

[[2]]
[1] 21.3

[[3]]
function (x)  .Primitive("sin")
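The following sketch shows one way to pull individual elements back out of such a list. It reuses list1 from the example above; the variable name first is an illustrative assumption.

```r
# Recreate the list from the example above.
list1 <- list(c(2, 5, 3), 21.3, sin)

# Double brackets extract the element itself.
first <- list1[[1]]
print(first)                 # 2 5 3
print(list1[[1]][2])         # 5

# Single brackets return a sub-list, not the element.
print(class(list1[1]))       # "list"

# The third element is the sin function, so it can be called directly.
print(list1[[3]](0))         # 0

# Each element keeps its own type.
print(sapply(list1, class))  # "numeric" "numeric" "function"
```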
3) Matrices
A matrix is a two-dimensional rectangular data set. It can be created by passing a vector to the matrix() function.
# Create a matrix.
M <- matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)
When we execute the code, it produces the following result
     [,1] [,2] [,3]
[1,] "a"  "a"  "b" 
[2,] "c"  "b"  "a" 
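To make the effect of byrow clearer, the sketch below builds the same matrix twice, once filled row by row and once with the default column-by-column fill. The names values, M_row, and M_col are illustrative assumptions.

```r
values <- c('a', 'a', 'b', 'c', 'b', 'a')

# byrow = TRUE fills the matrix one row at a time.
M_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

# The default (byrow = FALSE) fills one column at a time.
M_col <- matrix(values, nrow = 2, ncol = 3)

print(dim(M_row))    # 2 3
print(M_row[2, 1])   # "c"
print(M_col[2, 1])   # "a"
print(M_col)
```

Elements are addressed as [row, column], so the same input vector lands in different cells depending on the fill order.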
4) Arrays
While matrices are limited to two dimensions, arrays can have any number of dimensions. The array() function takes a dim (dimension) argument that sets the size of each dimension. The example below creates a 3x3x2 array, that is, two 3x3 matrices; the two input values are recycled to fill all 18 positions.
# Create an array.
a <- array(c('green','yellow'),dim = c(3,3,2))
print(a)
When we execute the above code, it produces the following result
, , 1

     [,1]     [,2]     [,3]    
[1,] "green"  "yellow" "green" 
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green" 

, , 2

     [,1]     [,2]     [,3]    
[1,] "yellow" "green"  "yellow"
[2,] "green"  "yellow" "green" 
[3,] "yellow" "green"  "yellow"
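The short sketch below shows how the dimensions of this array can be inspected and how a single element or a whole 3x3 slice can be extracted. The variable name slice2 is an illustrative assumption.

```r
# Recreate the array from the example above.
a <- array(c('green', 'yellow'), dim = c(3, 3, 2))

print(dim(a))              # 3 3 2
print(length(a))           # 18 (the two values are recycled)

# Index order is [row, column, slice].
print(a[1, 2, 1])          # "yellow"

# Leaving the first two indices empty returns a whole 3x3 slice.
slice2 <- a[, , 2]
print(dim(slice2))         # 3 3
print(is.matrix(slice2))   # TRUE
```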
5) Factors
Factors are R-objects created from a vector. A factor stores the vector together with the distinct values of its elements as labels (levels). The labels are always character, regardless of whether the input vector is numeric, character, or logical. Factors are useful in statistical modeling.
Factors are created using the factor() function. The nlevels() function gives the count of levels in the factor.
# Create a vector.
apple_colors <- c('green','green','yellow','red','red','red','green')
# Create a factor object.
factor_apple <- factor(apple_colors)
# Print the factor.
print(factor_apple)
print(nlevels(factor_apple))
When we execute the above code, it produces the following result
[1] green  green  yellow red    red    red    green 
Levels: green red yellow
[1] 3
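To show what a factor stores beyond the raw strings, the sketch below lists the levels, the integer codes that point into those levels, and a count per level. It reuses the names from the example above and relies only on base R functions.

```r
# Recreate the factor from the example above.
apple_colors <- c('green', 'green', 'yellow', 'red', 'red', 'red', 'green')
factor_apple <- factor(apple_colors)

# The distinct values, sorted, become the levels.
print(levels(factor_apple))       # "green" "red" "yellow"

# Internally each element is an integer code indexing into the levels.
print(as.integer(factor_apple))   # 1 1 3 2 2 2 1

# table() counts how often each level occurs.
print(table(factor_apple))        # green: 3, red: 3, yellow: 1
```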
6) Data Frames
Data frames are tabular data objects. Unlike a matrix, each column of a data frame can hold a different mode of data: the first column can be numeric, the second character, and the third logical. A data frame is a list of vectors of equal length.
Data frames are created using the data.frame() function.
# Create the data frame.
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26)
)
print(BMI)
When we execute the above code, it produces the following result
  gender height weight Age
1   Male  152.0     81  42
2   Male  171.5     93  38
3 Female  165.0     78  26
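As a brief sketch of working with this data frame, the lines below check the type of each column, select one column, and filter rows. The stringsAsFactors = FALSE argument is an addition not present in the original example; it is assumed here so that gender stays a character column on R versions before 4.0 as well.

```r
# Recreate the data frame from the example above.
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age    = c(42, 38, 26),
  stringsAsFactors = FALSE   # assumption: keep gender as character on any R version
)

# Each column keeps its own type.
print(sapply(BMI, class))

print(dim(BMI))                      # 3 4 (rows, columns)
print(BMI$height)                    # 152.0 171.5 165.0

# Rows can be filtered with a logical condition.
print(BMI[BMI$Age > 30, "gender"])   # "Male" "Male"

# A data frame is a list of equal-length column vectors.
print(is.list(BMI))                  # TRUE
```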
Note: The examples for each data type are drawn from a book and various websites.