Data Reshaping in R Programming Language
In this tutorial, we will learn about data reshaping in the R programming language, with examples.
Submitted by Bhavya Sri Khandrika, on May 02, 2020
Data reshaping is concerned with how data is organized, usually by rearranging its rows and columns. Before data can be processed in R, one of the first steps is to bring the input into the form of a data frame. Data frames are convenient to modify, and a programmer can easily extract data from their rows and columns. The difficulty arises when the data arrives in a format other than a data frame.
For such cases, R provides several functions that help. As part of data reshaping, these functions merge data, split it, and turn rows into columns and vice versa, so that one eventually obtains a well-formed data frame.
Transpose of a Matrix in R
R allows its users to calculate the transpose of a matrix. To do so, one needs to know the syntax of the transpose function, t().
Syntax:
The syntax for finding the transpose of a matrix,
t(Matrix/data frame)
This function accepts not only a matrix but also a data frame as input, so you can also transpose data frames. Note that the result is always a matrix, even when the input is a data frame.
Example:
# create a 6 x 2 matrix from the declared values, filled row by row
B <- matrix(seq.int(2.4, 1.3, -0.1), nrow = 6, byrow = TRUE)
print(B)
# now print the matrix after transposing it
print("Matrix after transposing")
d <- t(B)
print(d)
Output
[,1] [,2]
[1,] 2.4 2.3
[2,] 2.2 2.1
[3,] 2.0 1.9
[4,] 1.8 1.7
[5,] 1.6 1.5
[6,] 1.4 1.3
[1] "Matrix after transposing"
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 2.4 2.2 2.0 1.8 1.6 1.4
[2,] 2.3 2.1 1.9 1.7 1.5 1.3
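Since t() also accepts a data frame, the following minimal sketch may help show what comes back in that case. The scores data frame and its values are hypothetical and used only for illustration. If the columns had mixed types, the transposed matrix would be coerced to a single type, typically character.

```r
# Hypothetical marks table, used only for illustration
scores <- data.frame(
  maths   = c(90, 75, 82),
  physics = c(85, 60, 78),
  row.names = c("Siva", "Ram", "Swathi")
)

flipped <- t(scores)          # rows become columns and vice versa
print(is.matrix(flipped))     # t() returns a matrix, not a data frame
print(flipped)

# Convert back when data frame behaviour is needed
flipped_df <- as.data.frame(flipped)
print(is.data.frame(flipped_df))
```

Converting back with as.data.frame() is one reasonable choice when later steps expect a data frame.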
Joining the rows and the columns in the data frame
R lets users join any number of vectors into a single tabular object. Two main functions do this: cbind(), which combines its arguments as columns, and rbind(), which combines them as rows.
A further benefit of rbind() is that it can also combine two data frames into a single one. Combining data frames in this way is often required in applications that need to access the data from both of the individual data frames.
Syntax:
The syntax of the functions discussed above, cbind() and rbind(),
cbind(vector1, vector2, vector3, ..., vectorN)
rbind(dataframe1, dataframe2, dataframe3, ..., dataframeN)
The code below shows the core concept of joining vectors as columns.
Student_Name <- c("Siva","Bhargav","Ram","Swathi")
roll_no <- c("D18EI002","D18EC052","D18EE036","D18CE016")
Grades <- c('A','S','C','B')
# Combining the vectors column-wise; for character vectors
# cbind() returns a character matrix, not a data frame
info <- cbind(Student_Name, roll_no, Grades)
# Printing the result
print(info)
Output
Student_Name roll_no Grades
[1,] "Siva" "D18EI002" "A"
[2,] "Bhargav" "D18EC052" "S"
[3,] "Ram" "D18EE036" "C"
[4,] "Swathi" "D18CE016" "B"
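The [1,] row labels and quoted strings in the output above indicate that cbind() on plain vectors produces a character matrix. The sketch below contrasts that with data.frame(), which builds a data frame directly from the same vectors. The Marks column and its values are hypothetical and added only to show cbind() extending an existing data frame.

```r
Student_Name <- c("Siva", "Bhargav", "Ram", "Swathi")
roll_no <- c("D18EI002", "D18EC052", "D18EE036", "D18CE016")
Grades <- c("A", "S", "C", "B")

# cbind() on vectors gives a character matrix
as_matrix <- cbind(Student_Name, roll_no, Grades)
print(is.matrix(as_matrix))

# data.frame() gives a data frame with one column per vector
as_frame <- data.frame(Student_Name, roll_no, Grades,
                       stringsAsFactors = FALSE)
print(is.data.frame(as_frame))

# cbind() keeps the data frame class when one argument is a data frame
# (Marks is a hypothetical column)
with_marks <- cbind(as_frame, Marks = c(82, 95, 61, 74))
print(with_marks)
```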
As part of data reshaping, the user can create two separate data sets and eventually combine them into a single one. Let us now create a second data frame, which will later be appended to the first one shown above.
# Creating another data frame with the same column names
new.stu_data <- data.frame(
  Student_Name = c("Lasya","Sowmya"),
  roll_no = c("D18CE052","D18EI016"),
  Grades = c('S','A'),
  stringsAsFactors=FALSE
)
# Print a header for the second data frame
cat("****___ The Second DF___ ****\n")
# Print the contents of the second data frame
print(new.stu_data)
Output
****___ The Second DF___ ****
Student_Name roll_no Grades
1 Lasya D18CE052 S
2 Sowmya D18EI016 A
# Complete program: this is the first data set considered
Student_Name <- c("Siva","Bhargav","Ram","Swathi")
roll_no <- c("D18EI002","D18EC052","D18EE036","D18CE016")
Grades <- c('A','S','C','B')
# Combining the vectors column-wise (the result is a character matrix)
info <- cbind(Student_Name, roll_no, Grades)
# Printing the result
print(info)
# The second data set: a data frame with the same column names
new.stu_data <- data.frame(
  Student_Name = c("Lasya","Sowmya"),
  roll_no = c("D18CE052","D18EI016"),
  Grades = c('S','A'),
  stringsAsFactors=FALSE
)
# Print a header for the second data frame
cat("****___ The Second DF___ ****\n")
# Print the contents of the second data frame
print(new.stu_data)
# Combining the rows of both; because one argument is a data frame,
# rbind() returns a data frame
all.info <- rbind(info, new.stu_data)
# Printing a header.
cat("# # # The combined data frame\n")
# Printing the result.
print(all.info)
Output
Student_Name roll_no Grades
[1,] "Siva" "D18EI002" "A"
[2,] "Bhargav" "D18EC052" "S"
[3,] "Ram" "D18EE036" "C"
[4,] "Swathi" "D18CE016" "B"
****___ The Second DF___ ****
Student_Name roll_no Grades
1 Lasya D18CE052 S
2 Sowmya D18EI016 A
# # # The combined data frame
Student_Name roll_no Grades
1 Siva D18EI002 A
2 Bhargav D18EC052 S
3 Ram D18EE036 C
4 Swathi D18CE016 B
5 Lasya D18CE052 S
6 Sowmya D18EI016 A
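When rbind() combines data frames, the column names have to agree. The following sketch, which uses small hypothetical data frames, shows what happens when one column name is misspelled and one possible way to repair it before combining. tryCatch() is used here only so that the error message can be printed without stopping the script.

```r
# Two hypothetical data frames; the second has a misspelled column name
first <- data.frame(Student_Name = c("Siva", "Ram"),
                    Grades = c("A", "C"),
                    stringsAsFactors = FALSE)
second <- data.frame(Stydent_Name = "Lasya",
                     Grades = "S",
                     stringsAsFactors = FALSE)

# rbind() refuses to combine data frames whose column names differ
result <- tryCatch(rbind(first, second),
                   error = function(e) conditionMessage(e))
print(result)  # prints the error message instead of a data frame

# Rename the offending column, then combine
names(second)[names(second) == "Stydent_Name"] <- "Student_Name"
combined <- rbind(first, second)
print(combined)
```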
Now let us learn how to merge data frames.
R provides the merge() function, which merges two data frames on one or more key columns. There is one constraint to keep in mind: the key columns must be present in both data frames. By default, merge() joins on the columns whose names are identical in the two data frames; when the names differ, the by.x and by.y arguments name the key columns of each data frame.
Example to illustrate the above concept
Consider the diabetes data collected from Pima Indian women, which is included in the 'MASS' library as the data frames Pima.te and Pima.tr. Our task is to merge the two data sets on blood pressure (bp) and body mass index (bmi). The merge keeps the rows whose bp and bmi values match in both data frames, and the remaining columns of each data frame appear side by side with the suffixes .x and .y.
Consider the following code, in which the library MASS is loaded:
library(MASS)
# Merge the two data sets on the key columns bp and bmi
merging_pima <- merge(x = Pima.te, y = Pima.tr,
  by.x = c("bp", "bmi"),
  by.y = c("bp", "bmi")
)
print(merging_pima)
# Number of rows in the merged result
nrow(merging_pima)
Output
bp bmi npreg.x glu.x skin.x ped.x age.x type.x npreg.y glu.y skin.y ped.y
1 60 33.8 1 117 23 0.466 27 No 2 125 20 0.088
2 64 29.7 2 75 24 0.370 33 No 2 100 23 0.368
3 64 31.2 5 189 33 0.583 29 Yes 3 158 13 0.295
4 64 33.2 4 117 27 0.230 24 No 1 96 27 0.289
5 66 38.1 3 115 39 0.150 28 No 1 114 36 0.289
6 68 38.5 2 100 25 0.324 26 No 7 129 49 0.439
7 70 27.4 1 116 28 0.204 21 No 0 124 20 0.254
8 70 33.1 4 91 32 0.446 22 No 9 123 44 0.374
9 70 35.4 9 124 33 0.282 34 No 6 134 23 0.542
10 72 25.6 1 157 21 0.123 24 No 4 99 17 0.294
11 72 37.7 5 95 33 0.370 27 No 6 103 32 0.324
12 74 25.9 9 134 33 0.460 81 No 8 126 38 0.162
13 74 25.9 1 95 21 0.673 36 No 8 126 38 0.162
14 78 27.6 5 88 30 0.258 37 No 6 125 31 0.565
15 78 27.6 10 122 31 0.512 45 No 6 125 31 0.565
16 78 39.4 2 112 50 0.175 24 No 4 112 40 0.236
17 88 34.5 1 117 24 0.403 40 Yes 4 127 11 0.598
age.y type.y
1 31 No
2 21 No
3 24 No
4 21 No
5 21 No
6 43 Yes
7 36 Yes
8 40 No
9 29 Yes
10 28 No
11 55 No
12 39 No
13 39 No
14 49 Yes
15 49 Yes
16 38 No
17 28 No
[1] 17
The above example uses only data sets from the MASS package, which is distributed with R as a recommended package.
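The Pima example merges on key columns that have the same names in both data frames. As a complementary sketch, the hypothetical data frames below use different key names (roll_no and roll), which is where by.x and by.y become necessary. The all = TRUE argument is also shown, since it keeps the rows that have no match in the other data frame and fills the gaps with NA.

```r
# Hypothetical data frames with differently named key columns
students <- data.frame(roll_no = c("D18EI002", "D18EC052", "D18EE036"),
                       Student_Name = c("Siva", "Bhargav", "Ram"),
                       stringsAsFactors = FALSE)
grades <- data.frame(roll = c("D18EC052", "D18EE036", "D18CE016"),
                     Grades = c("S", "C", "B"),
                     stringsAsFactors = FALSE)

# Default merge: only rows whose keys appear in both data frames
inner <- merge(students, grades, by.x = "roll_no", by.y = "roll")
print(inner)

# all = TRUE: keep unmatched rows from both sides, filled with NA
outer <- merge(students, grades, by.x = "roll_no", by.y = "roll",
               all = TRUE)
print(outer)
```

The key column in the result takes its name from the first data frame, here roll_no.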