Pandas DataFrame concat / update ('upsert')
Learn how to concatenate or update ('upsert') rows in a Pandas DataFrame.
Submitted by Pranit Sharma, on December 06, 2022
Pandas is a Python library for manipulating and analyzing tabular data. Its central data structure is the DataFrame, a 2-dimensional labeled structure made up of rows, columns, and the data they hold.
Problem statement
Suppose we are given two DataFrames with the same index and column structure, and we need an elegant way to append all the rows from the second DataFrame to the first. Where the same index value appears in both DataFrames, the row from the second DataFrame should be used.
DataFrame concat / update ('upsert')
This operation is similar to an 'upsert' in SQL systems, that is, a combination of update and insert, in the sense that each row from the second DataFrame is either:
- Used to update an existing row in the first DataFrame, if the key already exists in the first DataFrame.
- Inserted at the end of the first DataFrame, if the key does not already exist.
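The two cases above can be told apart by checking which index labels of the second DataFrame already occur in the first. The following is a minimal sketch of that check; the frames `existing` and `incoming`, the column `value`, and the index name `key` are hypothetical names chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; "key" is an assumed index name for illustration
existing = pd.DataFrame({"value": [1, 2]}, index=pd.Index(["a", "b"], name="key"))
incoming = pd.DataFrame({"value": [20, 30]}, index=pd.Index(["b", "c"], name="key"))

# True where the incoming key already exists in the first DataFrame
is_update = incoming.index.isin(existing.index)

# Keys that would update an existing row vs. keys that would be inserted
keys_to_update = incoming.index[is_update].tolist()
keys_to_insert = incoming.index[~is_update].tolist()

print("Update:", keys_to_update)
print("Insert:", keys_to_insert)
```

Here the key `b` is present in both frames, so it falls into the update case, while `c` is new and falls into the insert case.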
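The solution to this problem has two steps. First, concatenate the first DataFrame with only the new rows of the second DataFrame, i.e., the rows whose index does not appear in the first DataFrame. Then call update() on the result, which overwrites the matching cells with the non-NA values from the second DataFrame.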
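The two steps could be wrapped in a small helper so that the original DataFrame is left untouched. This is only a sketch under stated assumptions: the function name `upsert` and the sample frames `left` and `right` are hypothetical and not part of pandas.

```python
import pandas as pd

def upsert(target: pd.DataFrame, source: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame with rows of `source` updated in or appended to `target`.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Rows of source whose index label is not yet present in target
    new_rows = source[~source.index.isin(target.index)]
    # Append the new rows; pd.concat returns a new object
    result = pd.concat([target, new_rows])
    # Overwrite matching cells in place with the non-NA values from source
    result.update(source)
    return result

left = pd.DataFrame({"qty": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"qty": [20, 30]}, index=["b", "c"])

merged = upsert(left, right)
print(merged)
```

Because `update()` skips NA values, a missing value in the second DataFrame does not erase an existing value in the first. The dtype of an updated column may differ between pandas versions.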
Let us understand this with the help of an example.
Python program for Pandas DataFrame concat / update ('upsert')
# Importing pandas package
import pandas as pd
# Creating DataFrames, using the first column (0) as the index
df1 = pd.DataFrame([['test', 1, True], ['test2', 2, True]]).set_index(0)
df2 = pd.DataFrame([['test2', 4], ['test3', 3]]).set_index(0)
# Display original DataFrames
print("Original DataFrame 1:\n", df1, "\n")
print("Original DataFrame 2:\n", df2, "\n")
# Insert step: append the rows of df2 whose index is not already in df1
df1 = pd.concat([df1, df2[~df2.index.isin(df1.index)]])
# Update step: overwrite matching cells of df1 with the non-NA values of df2
df1.update(df2)
# Display result
print("Result:\n", df1)
Output
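The result of the above program is a DataFrame with three rows. The row test is unchanged (1 in column 1, True in column 2). The row test2 has column 1 updated from 2 to 4 and keeps True in column 2. The row test3 is appended with 3 in column 1 and NaN in column 2, because df2 has no column 2.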
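When both DataFrames share exactly the same columns and the whole row from the second DataFrame should win, an alternative to the concat-then-update program is to concatenate everything and keep only the last occurrence of each index label. The following is a minimal sketch of that variant; the frames `old` and `new` and their columns are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample frames with an identical column structure
old = pd.DataFrame({"qty": [1, 2], "ok": [True, True]}, index=["a", "b"])
new = pd.DataFrame({"qty": [20, 30], "ok": [False, True]}, index=["b", "c"])

# Stack both frames; the label "b" now appears twice
stacked = pd.concat([old, new])

# Keep only the last occurrence of each index label, so rows from `new` win
result = stacked[~stacked.index.duplicated(keep="last")]
print(result)
```

Unlike `update()`, this variant replaces entire rows, so NA values in the second DataFrame also overwrite existing values. An updated row takes the position it has in the second DataFrame's block rather than its original position.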