Pandas replacing strings in dataframe with numbers
Given a pandas dataframe, we have to replace strings with numbers.
By Pranit Sharma Last updated : September 30, 2023
Pandas is a Python library that allows us to perform complex manipulations of data effectively and efficiently. In pandas, we mostly work with datasets in the form of a DataFrame, a 2-dimensional data structure that consists of rows, columns, and data.
Problem statement
Suppose we have a DataFrame in which some columns hold numerical values, but their data type is string. Since these are numerical values, they should support numerical operations, but because they are stored as strings, such operations either fail or give unexpected results.
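To make the problem concrete, here is a minimal sketch. The Series names (ages, doubled, numeric_ages) are chosen for illustration only and do not come from the article's program. It shows that adding a string column to itself concatenates the text instead of adding the numbers.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings
ages = pd.Series(['19', '21', '23', '30'])

# "+" on strings concatenates, so '19' + '19' becomes '1919'
doubled = ages + ages
print(doubled.iloc[0])

# After conversion, "+" performs real arithmetic
numeric_ages = pd.to_numeric(ages)
numeric_doubled = numeric_ages + numeric_ages
print(numeric_doubled.iloc[0])
```

The first print shows concatenated text, while the second shows a true numeric sum, which is why the conversion is needed before any calculation.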
Replace strings with numbers in pandas dataframe
In this case, we will convert these strings to numerical values using the pandas.to_numeric() function. This function converts the given argument (a scalar, list, tuple, 1-d array, or Series) to a numeric type and returns the converted result.
The syntax of the pandas.to_numeric() function is:
pandas.to_numeric(
    arg,
    errors='raise',
    downcast=None
)
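The errors and downcast parameters from the syntax above can be illustrated with a short sketch. The sample values, including the 'unknown' entry, are made up for this example. With errors='coerce', values that cannot be parsed become NaN instead of raising an error, and with downcast='integer', pandas picks the smallest integer type that can hold the data.

```python
import pandas as pd

# Hypothetical data with one value that is not a number
raw = pd.Series(['19', '21', 'unknown', '30'])

# errors='coerce' replaces unparseable values with NaN
# (the default, errors='raise', would raise a ValueError here)
coerced = pd.to_numeric(raw, errors='coerce')
print(coerced)

# downcast='integer' requests the smallest suitable integer dtype
clean = pd.Series(['19', '21', '23', '30'])
small = pd.to_numeric(clean, downcast='integer')
print(small.dtype)
```

Coercing is useful when a column is mostly numeric but contains a few stray entries; the resulting NaN values can then be inspected or filled separately.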
Let us understand with the help of an example,
Python code to replace strings in dataframe with numbers
# Importing pandas package
import pandas as pd
# Creating a dictionary
d1 = {
'Name':['Ram','Shyam','Seeta','Geeta'],
'Age':['19','21','23','30'],
'School':['BVN','DPS','DPA','APS']
}
# Creating DataFrame
df = pd.DataFrame(d1)
# Display the DataFrame
print("Original DataFrame:\n",df,"\n")
# Check dtype of Age column
print("Data type of Age column is: ",df['Age'].dtype,"\n")
# Converting Age to a numeric data type
# (to_numeric returns a new Series, so the result must be assigned back)
df['Age']=pd.to_numeric(df['Age'])
# Checking dtype again
print("Modified data type of Age column is: ",df['Age'].dtype)
Output
The output of the above program is:
Reference: pandas.to_numeric()
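When several columns hold numbers as strings, the same conversion can be applied to all of them at once. The following is a minimal sketch, assuming a DataFrame with a hypothetical Marks column added for illustration; it passes pd.to_numeric to DataFrame.apply so that each selected column is converted separately.

```python
import pandas as pd

# Hypothetical DataFrame; 'Marks' is not part of the original example
df = pd.DataFrame({
    'Name': ['Ram', 'Shyam', 'Seeta', 'Geeta'],
    'Age': ['19', '21', '23', '30'],
    'Marks': ['88.5', '92', '79', '65.25']
})

# Columns that contain numbers stored as strings
numeric_cols = ['Age', 'Marks']

# apply() calls pd.to_numeric once per selected column
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

print(df.dtypes)
```

Listing the columns explicitly keeps text columns such as Name untouched, since pd.to_numeric would raise an error on values like 'Ram' under the default errors='raise'.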