Reverse a get dummies encoding in pandas
Given a pandas DataFrame, we have to reverse its get_dummies encoding.
Submitted by Pranit Sharma, on November 15, 2022
Pandas is a special tool that allows us to perform complex manipulations of data effectively and efficiently. Inside pandas, we mostly deal with a dataset in the form of DataFrame. DataFrames are 2-dimensional data structures in pandas. DataFrames consist of rows, columns, and data.
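Dummy (indicator) columns represent categorical data as 0/1 variables: each category gets its own column, which holds 1 in the rows that belong to that category and 0 everywhere else. They are widely used in data analysis, where many methods expect numeric input rather than category labels.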
Problem statement
We are given a DataFrame with multiple columns. One column, X, holds arbitrary values, and every other column holds only 0 or 1.
For each value of X, we need to find which columns contain 1 in the same row.
Reversing a get dummies encoding
Dummy columns are created with the pandas.get_dummies() method, which converts the categorical data passed to it into indicator columns, one for each distinct category.
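As a quick illustration of the forward direction, here is a minimal sketch of pandas.get_dummies() applied to a single categorical column. The sample values and the variable names (colors, dummies) are made up for this illustration and do not come from the example used later in this article.

```python
import pandas as pd

# Hypothetical sample data: one categorical column
colors = pd.Series(["red", "green", "red", "blue"], name="color")

# One indicator column per distinct category.
# astype(int) normalizes the result to 0/1, because the default
# dtype of the dummy columns differs between pandas versions.
dummies = pd.get_dummies(colors).astype(int)

print(dummies)
```

Each row of the result has exactly one 1, in the column named after that row's category.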
To reverse the encoding, we set column X as the index of the DataFrame, keep only the cells that are equal to 1, and use the stack() method to turn those cells into rows. After this, we reset the index and drop the leftover column of values, which leaves one row for each pair of an X value and a column name that had 1.
Let us understand with the help of an example,
Python code to reverse a get dummies encoding in pandas
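# Importing pandas package
import pandas as pd

# Creating a dictionary
d = {
    'X': [100, 101, 102, 104],
    '1': [1, 0, 0, 1],
    '2': [0, 1, 1, 0],
    '3': [1, 0, 0, 1],
    '4': [0, 1, 1, 0]
}

# Creating DataFrame
df = pd.DataFrame(d)

# Display dataframe
print('Original DataFrame:\n', df, '\n')

# Use X as the index so that only the indicator columns are stacked
df.set_index('X', inplace=True)

# Keep only the cells equal to 1 (all other cells become NaN),
# stack the columns into rows, and drop the NaN rows explicitly,
# since whether stack() drops them by itself may depend on the pandas version.
# Finally, drop the leftover column of values, which is named 0.
res = df[df == 1].stack().dropna().reset_index().drop(0, axis=1)

# Display Result
print("Result:\n", res)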
Output
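The result contains one row for each pair of an X value and a column that holds 1 in that row: 100 is paired with columns 1 and 3, 101 with columns 2 and 4, 102 with columns 2 and 4, and 104 with columns 1 and 3.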
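The same reversal can also be sketched by stacking first and filtering afterwards, which avoids relying on how stack() treats NaN values. This is an alternative to the program above, not a replacement for it. The names pairs, grouped, 'column' and 'value' are chosen here purely for illustration.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({
    'X': [100, 101, 102, 104],
    '1': [1, 0, 0, 1],
    '2': [0, 1, 1, 0],
    '3': [1, 0, 0, 1],
    '4': [0, 1, 1, 0],
})

# Stack every indicator cell into a row, indexed by (X, column name)
stacked = df.set_index('X').stack()

# Keep only the entries equal to 1, then turn the index back into columns
pairs = stacked[stacked == 1].reset_index()
pairs.columns = ['X', 'column', 'value']  # illustrative column names
pairs = pairs.drop(columns='value')

# Optionally collect the matching column names into one list per X value
grouped = pairs.groupby('X')['column'].agg(list)

print(pairs)
print(grouped)
```

Grouping at the end is useful when a row can hold more than one 1, as it does in this data, because each X value then maps to a list of column names rather than to a single label.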