How to merge multiple DataFrames on columns?
Given multiple DataFrames, we have to merge them on columns.
By Pranit Sharma Last updated : September 20, 2023
A DataFrame is a two-dimensional data structure in pandas, made up of rows, columns, and the data they hold. A DataFrame can be created from Python dictionaries or lists, but in practice data is usually imported from CSV files and loaded into DataFrames.
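The three ways of creating a DataFrame mentioned above can be sketched as follows. The sample names and ages are made up for illustration, and the CSV content is read from an in-memory string rather than a real file so that the snippet runs on its own.

```python
import io
import pandas as pd

# From a dictionary: keys become column names (illustrative data)
from_dict = pd.DataFrame({"Name": ["Amit", "Esha"], "Age": [23, 21]})

# From a list of rows: column names are passed separately
from_list = pd.DataFrame([["Amit", 23], ["Esha", 21]], columns=["Name", "Age"])

# From CSV text: io.StringIO stands in for a file on disk here;
# with a real file you would pass its path to pd.read_csv instead
csv_text = "Name,Age\nAmit,23\nEsha,21\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_dict)
print(from_list)
print(from_csv)
```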
Problem statement
Given multiple DataFrames, we have to merge them on columns.
Merging multiple DataFrames on columns
The information we need is often spread across two or more DataFrames. The simplest way to deal with this is to join those DataFrames into a single DataFrame.
To do this, we use the pandas.merge() function. Each call combines two DataFrames, so merging three or more DataFrames means calling it repeatedly and passing each result into the next merge.
Syntax:
pandas.merge(
    left,
    right,
    how='inner',
    on=None,
    left_on=None,
    right_on=None,
    left_index=False,
    right_index=False,
    sort=False,
    suffixes=('_x', '_y'),
    copy=True,
    indicator=False,
    validate=None
)
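As a minimal sketch of how the `on` and `how` parameters from the syntax above behave, the snippet below merges two small DataFrames on a shared key column. The DataFrames, the `emp_id` column, and all values are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical sample data: both frames share the key column "emp_id"
employees = pd.DataFrame({"emp_id": [1, 2, 3],
                          "name": ["Amit", "Bhairav", "Chirag"]})
salaries = pd.DataFrame({"emp_id": [2, 3, 4],
                         "salary": [50000, 60000, 70000]})

# how="inner" (the default): keep only emp_id values found in both frames
inner = pd.merge(employees, salaries, on="emp_id", how="inner")

# how="outer": keep every emp_id from either frame; missing cells become NaN
outer = pd.merge(employees, salaries, on="emp_id", how="outer")

print(inner)
print(outer)
```

With `how="inner"` only the keys 2 and 3 survive, while `how="outer"` keeps all four keys and fills the gaps with NaN.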
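Let us understand with the help of an example. The program below creates three DataFrames, each with a single column called Name, and joins them side by side by aligning their rows on the index. Because the first two DataFrames share the column name Name, the first merge renames those columns to Name_x and Name_y using the default suffixes.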
Python program to merge multiple DataFrames on columns
# Importing pandas package
import pandas as pd
# Creating three dictionaries
dict1 = {'Name':['Amit Sharma','Bhairav Pandey','Chirag Bharadwaj','Divyansh Chaturvedi','Esha Dubey']}
dict2 = {'Name':['Jatin prajapati','Rahul Shakya','Gaurav Dixit','Pooja Sharma','Mukesh Jha']}
dict3 = {'Name':['Ram Manohar','Sheetal Bhadoriya','Anand singh','Ritesh Arya','Aman Gupta']}
# Creating a DataFrame from each dictionary
df1 = pd.DataFrame(dict1)
df2 = pd.DataFrame(dict2)
df3 = pd.DataFrame(dict3)
# Display DataFrame
print("DataFrame1:\n",df1,"\n")
print("DataFrame2:\n",df2,"\n")
print("DataFrame3:\n",df3,"\n")
# Joining the first two DataFrames, then joining that result with the third.
# left_index=True and right_index=True align rows by index label,
# not by the values in the Name column
result = pd.merge(df1,df2,left_index=True, right_index=True)
result = pd.merge(result,df3,left_index=True, right_index=True)
# Display Result
pd.set_option('display.max_columns', 3)
print("Merged DataFrames:\n",result)
Output
The output of the above program is:
DataFrame1:
                    Name
0          Amit Sharma
1       Bhairav Pandey
2     Chirag Bharadwaj
3  Divyansh Chaturvedi
4           Esha Dubey

DataFrame2:
                Name
0  Jatin prajapati
1     Rahul Shakya
2     Gaurav Dixit
3     Pooja Sharma
4       Mukesh Jha

DataFrame3:
                  Name
0        Ram Manohar
1  Sheetal Bhadoriya
2        Anand singh
3        Ritesh Arya
4         Aman Gupta

Merged DataFrames:
                 Name_x           Name_y               Name
0          Amit Sharma  Jatin prajapati        Ram Manohar
1       Bhairav Pandey     Rahul Shakya  Sheetal Bhadoriya
2     Chirag Bharadwaj     Gaurav Dixit        Anand singh
3  Divyansh Chaturvedi     Pooja Sharma        Ritesh Arya
4           Esha Dubey       Mukesh Jha         Aman Gupta
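The program above aligns rows by index. When the DataFrames instead share a key column, one possible approach is to chain pandas.merge() with functools.reduce so that any number of DataFrames can be merged on that column. This is only a sketch: the `ID`, `Age`, and `City` columns and their values are assumptions made for illustration and are not part of the example above.

```python
from functools import reduce
import pandas as pd

# Hypothetical data: every frame carries the shared key column "ID"
df_names = pd.DataFrame({"ID": [1, 2, 3],
                         "Name": ["Amit Sharma", "Bhairav Pandey", "Esha Dubey"]})
df_ages = pd.DataFrame({"ID": [3, 1, 2], "Age": [25, 23, 21]})
df_cities = pd.DataFrame({"ID": [2, 3, 1], "City": ["Pune", "Delhi", "Agra"]})

frames = [df_names, df_ages, df_cities]

# reduce() merges the first two frames, then merges that result with the
# next frame, and so on, always matching rows on the "ID" column
merged = reduce(
    lambda left, right: pd.merge(left, right, on="ID", how="inner"),
    frames,
)

print(merged)
```

Rows are matched by the value in `ID`, so the differing row order of the input DataFrames does not matter. Since the non-key column names are all distinct, no `_x` or `_y` suffixes are added.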