Python - Pandas applying regex to replace values
Given a Pandas DataFrame, we have to apply regex to replace values.
By Pranit Sharma Last updated : September 26, 2023
Pandas is a Python library for manipulating and analyzing data effectively and efficiently. In pandas, we mostly work with data in the form of a DataFrame, a 2-dimensional labeled data structure that consists of rows, columns, and data.
Regex (Regular Expression)
A regex, or regular expression, is a sequence of ordinary and special characters that defines a search pattern. With such a pattern we can search, filter, and replace values in the rows of a pandas DataFrame.
Example
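- 'K.*' : Matches the letter 'K' followed by any characters. When the pattern is matched from the start of each string (for example with str.match()), it selects all the records which start with the letter 'K'.
- 'A.*' : Matches the letter 'A' followed by any characters. Matched from the start of each string, it selects all the records which start with the letter 'A'.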
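The two patterns above can be tried on a small DataFrame. This is only a minimal sketch: the Name column and its values are hypothetical sample data, not part of the original example. It also shows why anchoring matters: str.match() always starts at the beginning of the string, while str.contains() searches anywhere unless the pattern begins with '^'.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
names = pd.DataFrame({'Name': ['Kiran', 'Amit', 'Anika', 'Rakesh']})

# str.match() anchors the pattern at the start of each string,
# so 'K.*' keeps only the values that begin with 'K'
starts_with_k = names[names['Name'].str.match(r'K.*')]

# str.contains() searches anywhere in the string, so an explicit
# '^' anchor is needed to get the same "starts with" behaviour
starts_with_a = names[names['Name'].str.contains(r'^A.*')]

k_names = starts_with_k['Name'].tolist()
a_names = starts_with_a['Name'].tolist()
print(k_names)
print(a_names)
```

Regex matching is case-sensitive by default, so 'Rakesh' and 'Anika' are not selected by 'K.*' even though they contain a lowercase 'k'.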
Replacing values using regex
For this purpose, we will use the DataFrame[col].str.replace() method with regex=True. Its first argument is the regex pattern to match and its second argument is the replacement string.
Let us understand with the help of an example.
Python program for applying regex to replace values
# Importing pandas package
import pandas as pd
# Creating a dictionary
d = {'Col':['$1100,000*','$40000 string created']}
# Creating a dataframe
df = pd.DataFrame(d)
# Display Dataframe
print("DataFrame :\n",df,"\n")
# Removing every run of non-digit characters (\D+) and converting to int
df['Col'] = df['Col'].str.replace(r'\D+', '', regex=True).astype('int')
# Display modified DataFrame
print("Modified DataFrame:\n",df)
Output
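The above program first prints the original DataFrame and then the modified DataFrame, in which the Col column holds the integer values 1100000 and 40000.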
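One case the program above does not cover is a value with no digits at all. After the replacement such a value becomes an empty string, which astype('int') cannot convert. The following is a minimal sketch of one possible way to handle this, assuming that a missing value (NaN) is an acceptable result for those rows. The 'n/a' entry is hypothetical sample data, and pd.to_numeric() is swapped in here in place of astype('int').

```python
import pandas as pd

# Hypothetical data: the last value contains no digits at all
s = pd.Series(['$1100,000*', '$40000 string created', 'n/a'])

# Strip every run of non-digit characters, as in the program above
digits = s.str.replace(r'\D+', '', regex=True)

# 'n/a' is now an empty string; pd.to_numeric() with errors='coerce'
# turns values it cannot parse into NaN instead of raising an error
cleaned = pd.to_numeric(digits, errors='coerce')
print(cleaned)
```

Because NaN is a floating-point value, the resulting Series has a float dtype rather than an integer one.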