Stop Pandas from converting int to float due to an insertion in another column
Learn how to stop pandas from converting int to float due to an insertion in another column.
Submitted by Pranit Sharma, on September 25, 2022
pandas is a Python library that allows us to perform complex manipulations of data effectively and efficiently. In pandas, we mostly deal with datasets in the form of DataFrames. A DataFrame is a 2-dimensional data structure that consists of rows, columns, and data.
Problem statement
In simple words, we are given a DataFrame with two columns, which hold int and string values respectively.
If we insert a NaN value into the int column, pandas converts the int values to float values, which is expected because NaN is itself a float. However, if we insert a NaN value into the string column while adding rows one at a time, pandas also converts the int values to float values. In other words, it recasts one column because of an insertion in another column.
We need a way to stop pandas from recasting the int values to float. A simple way to do this is to change the data type of the int column to object, so that its values are not converted into float automatically by pandas.
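To see why a missing value forces the cast in the first place, here is a minimal sketch on a single Series rather than a DataFrame. The variable names ints and with_nan are illustrative only and do not come from the examples in this article.

```python
import numpy as np
import pandas as pd

# A column of whole numbers is stored with an integer dtype
ints = pd.Series([10, 20, 30])
print(ints.dtype)

# NaN is itself a float, so a default integer column cannot hold it;
# pandas falls back to a float dtype for the whole column
with_nan = pd.Series([10, 20, np.nan])
print(with_nan.dtype)
```

Printing the dtype of a column like this is a quick way to check whether a recast has happened.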
Converting int to float due to an insertion in another column
Let us understand with the help of an example,
Example
# Importing pandas package
import pandas as pd
# Importing numpy package
import numpy as np
# Creating a dictionary
d = {
'int':[],
'string':[]
}
# Creating a DataFrame
df = pd.DataFrame(d)
# Adding rows one at a time
df.loc[0] = [10, 'A']
df.loc[1] = [20, 'B']
df.loc[2] = [30, 'C']
# This row has a missing value in the string column
df.loc[3] = [40, np.nan]
# Display the DataFrame
print("Created DataFrame:\n", df, "\n")
Running this program shows the values of the int column as floats, even though the NaN was inserted into the string column. To avoid this, we will make the data type of the int column object.
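As a point of comparison, and assuming all rows are known up front (the example above adds them one at a time), the same data can be passed to the constructor in a single call. This is only a sketch, and the name df_bulk is illustrative. Here each column's dtype is inferred from its own values, so the NaN in the string column does not affect the int column.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the example: every row is supplied at construction time
df_bulk = pd.DataFrame({
    "int": [10, 20, 30, 40],
    "string": ["A", "B", "C", np.nan],
})

# Each column's dtype is inferred from its own values only
print(df_bulk.dtypes)
```

This does not help when rows really do arrive one by one, which is the case the rest of this article deals with.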
Stop Pandas from converting int to float due to an insertion in another column
For this purpose, you can define the dtype of each column by using the pd.Series() constructor before assigning the column values. Consider the example given below,
Example
# Importing pandas package
import pandas as pd
# Importing numpy package
import numpy as np
# Creating a DataFrame
df = pd.DataFrame()
# Defining columns
df["int"] = pd.Series([], dtype=object)
df["str"] = pd.Series([], dtype=str)
# Adding rows one at a time (the 'int' values are written as strings here)
df.loc[0] = ['10', 'A']
df.loc[1] = ['20', 'B']
df.loc[2] = ['30', 'C']
# This row has a missing value in the str column
df.loc[3] = ['40', np.nan]
# Display the DataFrame
print("Created DataFrame:\n", df, "\n")
In the output of the above program, the values in the int column are no longer shown as floats.
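The idea behind the object dtype can be sketched on a single Series. This is an illustration under the assumption that the values are passed directly to pd.Series() with dtype=object, and the name s is hypothetical. An object column stores each value as the Python object it was given, so integers and a NaN can sit side by side without a cast.

```python
import numpy as np
import pandas as pd

# With dtype=object, pandas stores each value as the Python object it was given
s = pd.Series([10, 20, 30, np.nan], dtype=object)

print(s.dtype)          # dtype of the column
print(type(s.iloc[0]))  # type of the first stored value
print(type(s.iloc[3]))  # type of the missing value
```

The trade-off is that an object column is generally slower for arithmetic than a numeric column, so it is best suited to cases where keeping the original values matters more than speed.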