Python - Join or merge with overwrite in pandas
Given two pandas DataFrames, we have to join or merge them so that overlapping values in the first are overwritten by the second.
Submitted by Pranit Sharma, on August 20, 2022
Pandas is a Python library that allows us to perform complex manipulations of data effectively and efficiently. In pandas, we mostly deal with a dataset in the form of a DataFrame. DataFrames are 2-dimensional labeled data structures that consist of rows, columns, and data.
Problem statement
We have a DataFrame df1 and we want to merge in df2 (another DataFrame), which may have fewer or more columns and an overlapping index. Wherever df2 has a non-null value at an index label and column label that also exist in df1, we want that value to overwrite the corresponding value in df1.
Join or merge with overwrite in pandas
There are several approaches to solve this problem. We can use the join() or merge() methods, or the df.combine_first() method. Here, however, we are going to use the df.update() method, which overwrites the matching values of df1 in place in a single call.
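To make the difference between the approaches concrete, here is a minimal sketch comparing combine_first() with update(). The frames left and right are made-up illustration data, not part of the original example. The sketch assumes the goal is to give right's values priority in both cases.

```python
import pandas as pd

# Hypothetical sample data for illustration only
left = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]}, index=[0, 1, 2])
right = pd.DataFrame({"a": [50.0, 99.0], "c": [7.0, 8.0]}, index=[1, 3])

# combine_first(): right's non-null values win and gaps are filled from left.
# It returns a new DataFrame covering the union of both indexes and columns.
merged = right.combine_first(left)

# update(): modifies a copy of left in place and keeps only left's rows and columns.
updated = left.copy()
updated.update(right)

print(merged)
print(updated)
```

If the rows and columns that exist only in the second DataFrame must be kept, combine_first() is likely the better fit. If only the existing cells of the first DataFrame should change, update() is enough.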
Pandas DataFrame update() method
This method modifies a DataFrame in place using the non-NA values from another DataFrame (or an object that can be converted to one). The two are aligned on their index and column labels, and the method returns None.
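Because update() works in place, a common slip is to assign its return value back to the variable. The short sketch below, using made-up frames named target and patch, shows what the call returns and that values are matched by index label rather than by position.

```python
import pandas as pd

# Hypothetical sample data for illustration only
target = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
patch = pd.DataFrame({"x": [100.0]}, index=[1])  # one row, labeled 1

# update() changes target directly; its return value is None
result = target.update(patch)

print(result)
print(target)  # the row labeled 1 now holds the new x value
```

Writing target = target.update(patch) would therefore replace the DataFrame with None.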
Syntax:
DataFrame.update(
other,
join='left',
overwrite=True,
filter_func=None,
errors='ignore'
)
Parameter(s):
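- other: Another DataFrame, or an object that can be converted to a DataFrame. It should share at least one index label and one column label with the original.
- join: Only 'left' is supported, so the index and columns of the original DataFrame are kept.
- overwrite: A bool. If True (the default), values in the original are overwritten with the non-NA values from other. If False, only values that are NA in the original are updated.
- filter_func: An optional callable that takes a 1-D array of the original's values and returns a boolean array marking which values may be replaced.
- errors: Either 'ignore' (the default) or 'raise'. If 'raise', a ValueError is raised when both DataFrames have non-NA data at the same location.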
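The effect of these parameters can be sketched on a single column. The frames base and new below are made-up illustration data, and each call runs on a fresh copy so the cases do not affect one another.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only
base = pd.DataFrame({"a": [1.0, np.nan, 3.0]})
new = pd.DataFrame({"a": [10.0, 20.0, np.nan]})

# overwrite=False: only positions that are NA in the caller are filled
filled = base.copy()
filled.update(new, overwrite=False)

# filter_func: only caller values for which the function returns True are replaced
filtered = base.copy()
filtered.update(new, filter_func=lambda values: values < 2)

# errors="raise": refuse to update when both frames hold non-NA data in the same place
try:
    base.copy().update(new, errors="raise")
    overlap_detected = False
except ValueError:
    overlap_detected = True

print(filled)
print(filtered)
print(overlap_detected)
```

In this sketch, overwrite=False keeps the existing 1.0 and fills only the missing value, while the filter_func call replaces only the 1.0 because it is the sole value below 2.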
Let us understand with the help of an example.
Python program to join or merge with overwrite in pandas
# Importing pandas package
import pandas as pd
# Import numpy package
import numpy as np
# Creating DataFrames
df1 = pd.DataFrame([[np.nan, 20, 20], [10, 40, np.nan],
[np.nan, 50, np.nan]])
df2 = pd.DataFrame([[60, np.nan, 70], [80, 90, 100]],
index=[1, 2])
# Display Original DataFrames
print("Created DataFrame 1:\n",df1,"\n")
print("Created DataFrame 2:\n",df2,"\n")
# Using df.update
df1.update(df2)
# Display result
print("Result:\n",df1)
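Output
After the update, row 0 of df1 is unchanged because df2 has no row with index 0. In row 1, columns 0 and 2 take the values 60 and 70 from df2, while column 1 keeps its value of 40 because df2 holds NaN there. In row 2, all three values are replaced with 80, 90, and 100.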