Random Sample of a subset of a dataframe in Pandas
Learn how to create a random sample of a subset of a DataFrame in Python pandas.
By Pranit Sharma Last updated : October 03, 2023
Pandas is a Python library that allows us to perform complex manipulations of data effectively and efficiently. In pandas, we mostly work with data in the form of a DataFrame, a 2-dimensional data structure made up of rows, columns, and data.
Problem statement
Suppose we are given a DataFrame with a large number of entries, and we need to draw a random sample of rows from it or from a subset of it.
Random Sample of a subset of a dataframe
For this purpose, we will use the pandas.DataFrame.sample() method. It returns a random sample of items from an axis of the object (rows by default for a DataFrame).
Syntax:
DataFrame.sample(
n=None,
frac=None,
replace=False,
weights=None,
random_state=None,
axis=None,
ignore_index=False
)
Parameter(s):
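- n: Number of items from the axis to return. Cannot be used together with frac; defaults to 1 if frac is None.
- frac: Fraction of axis items to return. Cannot be used together with n.
- replace: bool value. If True, sampling is done with replacement, so the same row can be selected more than once.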
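The parameters above can be sketched as follows. This is a minimal illustration, assuming a small made-up DataFrame; random_state is set only so that repeated runs give the same draw.

```python
import pandas as pd

# Small illustrative DataFrame (made-up values)
df = pd.DataFrame({"A": [1, 3, 5, 7], "B": [2, 4, 6, 8]})

# n: pick exactly 2 rows
two_rows = df.sample(n=2, random_state=0)

# frac: pick half of the rows (0.5 * 4 = 2 rows)
half = df.sample(frac=0.5, random_state=0)

# replace=True: rows may repeat, so n can be larger than len(df)
with_repeats = df.sample(n=6, replace=True, random_state=0)

print(two_rows)
print(half)
print(with_repeats)
```

Without replace=True, asking for more rows than the DataFrame contains raises a ValueError.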
Let us understand with the help of an example.
Python program to create random sample of a subset of a dataframe
# Importing pandas package
import pandas as pd
# Creating a list
l = [[1, 2], [3, 4], [5, 6], [7, 8]]
# Creating a DataFrame
df = pd.DataFrame(l,columns=['A','B'])
# Display original DataFrame
print("Original Dataframe:\n",df,"\n")
# Getting a random sample of 2 rows
res = df.sample(n=2)
# Display this random sample
print("Sample of subset:\n",res,"\n")
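Output
The output of the above program shows the original DataFrame followed by two of its rows chosen at random. Because the selection is random, the sampled rows can differ from run to run unless random_state is set.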
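The program above samples from the whole DataFrame. To sample from a subset instead, one option is to filter the rows first and then call sample() on the filtered result. The sketch below assumes a hypothetical condition (A greater than 1) to define the subset; the names subset, sample and rest are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5, 7], "B": [2, 4, 6, 8]})

# Hypothetical subset condition: keep only rows where A is greater than 1
subset = df[df["A"] > 1]

# Draw 2 random rows from the subset; random_state makes the draw repeatable
sample = subset.sample(n=2, random_state=42)

# Rows of the original DataFrame that were not sampled
rest = df.drop(sample.index)

print("Sample of subset:\n", sample, "\n")
print("Remaining rows:\n", rest, "\n")
```

Because sample() keeps the original index labels, df.drop(sample.index) is a simple way to split the data into the sampled rows and everything else.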