Best way to count the number of rows with missing values in a pandas DataFrame

Given a pandas DataFrame, we want a quick and reliable way to count the number of rows that contain missing values. By Pranit Sharma Last updated : October 05, 2023

pandas is a Python library for manipulating and analyzing data efficiently. In pandas, we mostly work with a dataset in the form of a DataFrame, a 2-dimensional data structure that consists of rows, columns, and data.

When you create a DataFrame or import a CSV file, some cells may contain NaN values. NaN stands for "Not a Number" and is how pandas marks a missing value in a cell. To deal with this type of data, you can either remove the affected rows (if the number of missing values is low) or handle the missing values in some other way. In both cases, it helps to first count the NaN values or the non-NaN values.
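As a minimal sketch of counting NaN and non-NaN values, the snippet below uses a small made-up DataFrame (the column names and values are assumptions chosen for illustration). It relies on `isna().sum()` for the missing cells per column and `count()` for the non-missing cells per column.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a known pattern of missing cells
df = pd.DataFrame({
    "one": [1.0, np.nan, 3.0, np.nan],
    "two": [np.nan, 2.0, 3.0, 4.0],
    "three": [1.0, 2.0, 3.0, 4.0],
})

# Number of NaN cells in each column
nan_per_column = df.isna().sum()

# Number of non-NaN cells in each column (count() skips missing values)
non_nan_per_column = df.count()

print("NaN values per column:\n", nan_per_column, "\n")
print("Non-NaN values per column:\n", non_nan_per_column)
```

For every column, the two counts add up to the number of rows in the DataFrame.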

Problem statement

We are given a DataFrame with multiple columns, and each column contains NaN values at some positions.

Counting the number of rows with missing values

There are two different things we can count: the total number of NaN values present in the DataFrame, or the number of rows that contain at least one missing value.
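The difference between these two counts can be seen in a short sketch. The DataFrame below is made up for illustration; one of its rows has two missing cells, so the number of NaN cells is larger than the number of affected rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row has two missing cells
df = pd.DataFrame({
    "one": [1.0, np.nan, 3.0, np.nan],
    "two": [np.nan, 2.0, 3.0, np.nan],
    "three": [1.0, 2.0, 3.0, 4.0],
})

# Total number of NaN cells across the whole DataFrame
total_nan_cells = int(df.isna().sum().sum())

# Number of rows that contain at least one NaN
rows_with_nan = int(df.isna().any(axis=1).sum())

print("Total NaN cells:", total_nan_cells)
print("Rows with at least one NaN:", rows_with_nan)
```

Here `any(axis=1)` collapses each row to a single boolean, so a row is counted once no matter how many of its cells are missing.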

One of the quickest ways to count the rows with missing values is to subtract the number of rows returned by the dropna() method, which by default drops every row containing at least one NaN, from the total number of rows.

Let us understand this with the help of an example.

Python program to count the number of rows with missing values

# Importing pandas package
import pandas as pd

# Importing randn from numpy to generate random values
from numpy.random import randn

# Creating DataFrame
df = pd.DataFrame(randn(5, 3), index=['a', 'c', 'e', 'f', 'h'],
               columns=['one', 'two', 'three'])

# Display dataframe
print('Original DataFrame:\n',df,'\n')

# Reindexing: the new labels 'b', 'd' and 'g' become rows filled with NaN
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])

# Subtracting the rows that survive dropna() from the total number of rows
res = df.shape[0] - df.dropna().shape[0]

# Display Result
print("Rows with missing values:\n",res)

Output

The output of the above program is:

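[Output screenshot: the original DataFrame followed by the count of rows with missing values]

The random values differ on every run, but the reported count is 3, because reindexing adds three rows ('b', 'd' and 'g') that contain only NaN.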
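As a cross-check on the dropna() subtraction used in the program above, the same count can also be obtained from a boolean mask. This is a sketch of an alternative, not part of the original program; the seeded generator and the variable names are assumptions made so that the example is repeatable.

```python
import numpy as np
import pandas as pd

# Seeded generator so the random values are repeatable (an assumption for this sketch)
rng = np.random.default_rng(0)

df = pd.DataFrame(rng.standard_normal((5, 3)),
                  index=['a', 'c', 'e', 'f', 'h'],
                  columns=['one', 'two', 'three'])

# Reindexing adds the rows 'b', 'd' and 'g', filled with NaN
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])

# True for every row that has at least one missing value
mask = df.isna().any(axis=1)

count_from_mask = int(mask.sum())
count_from_dropna = len(df) - len(df.dropna())

# Index labels of the rows with missing values
labels = mask[mask].index.tolist()

print("Count from mask:", count_from_mask)
print("Count from dropna():", count_from_dropna)
print("Rows with missing values:", labels)
```

The mask has the extra benefit of telling you which rows are affected, for example through `df[mask]`, while the dropna() subtraction only gives the count.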

Copyright © 2024 www.includehelp.com. All rights reserved.