Pandas dataframe select row by max value in group
Learn how to select the row with the maximum value in each group of a pandas DataFrame.
Submitted by Pranit Sharma, on November 24, 2022
Pandas is a Python library for manipulating and analyzing data. In pandas, we mostly work with data in the form of a DataFrame, a 2-dimensional labeled data structure made up of rows, columns, and the data they hold.
Problem statement
Suppose we are given a DataFrame with multiple columns. We need to group its rows by one or more columns and, for each group, return only the single row that holds the maximum value of another column.
Select row by max value in group
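To select the row with the maximum value in each group, we group the DataFrame by the relevant columns with groupby() and call the idxmax() method on the column of interest. This method returns the index label of the maximum value in each group, and passing those labels to df.loc[] returns the corresponding rows.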
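The two steps described above can be sketched separately to show what idxmax() hands back before the rows are selected. This is a minimal illustration, not the article's own program: the `scores` DataFrame and its `team`, `player`, and `points` columns are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data (not from the article): points scored per player
scores = pd.DataFrame({
    "team": ["x", "x", "y", "y"],
    "player": ["ann", "bob", "cy", "dee"],
    "points": [10, 25, 40, 15],
})

# Step 1: for each team, get the index label of the row with the most points
best_idx = scores.groupby("team")["points"].idxmax()
print(best_idx.tolist())  # [1, 2]

# Step 2: use those labels to pull the full rows from the original DataFrame
best_rows = scores.loc[best_idx]
print(best_rows)
```

Because idxmax() returns index labels rather than positions, the lookup uses loc and not iloc. If a group has several rows tied for the maximum, idxmax() reports the first one.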
Let us understand this with the help of an example.
Python program to select row by max value in group
# Importing pandas package
import pandas as pd
# Importing numpy package
import numpy as np
# Creating a dictionary
d = {
'A':[1,2,3,4,5,6],
'B':[3000,3000,6000,6000,1000,1000],
'C':[200,np.nan,100,np.nan,500,np.nan]
}
# Creating a DataFrame
df = pd.DataFrame(d)
# Display DataFrame
print("Original DataFrame:\n",df,"\n")
# Grouping by 'B' and selecting the row with the max 'C' in each group
# (idxmax() skips NaN values by default)
res = df.loc[df.groupby('B')['C'].idxmax()]
# Display result
print("Result:\n",res)