Split a large pandas DataFrame
Given a Pandas DataFrame, we have to split it.
By Pranit Sharma Last updated : September 23, 2023
pandas is a Python library for manipulating and analyzing data effectively and efficiently. In pandas, we mostly work with a dataset in the form of a DataFrame. A DataFrame is a 2-dimensional data structure that consists of rows, columns, and the data they hold.
Problem statement
Given a pandas DataFrame, we have to split it into several smaller DataFrames.
Splitting a DataFrame
Splitting a DataFrame means breaking it into multiple smaller DataFrames, which makes a large dataset easier to inspect and to process piece by piece. To split a DataFrame, we can use numpy.array_split(), a function from the NumPy package. It is generally used for splitting an array into multiple sub-arrays, but it can also be applied to a DataFrame, in which case it splits along the rows by default. Unlike numpy.split(), it does not require the number of rows to be evenly divisible by the number of parts.
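To illustrate how numpy.array_split() sizes its parts when the length is not evenly divisible, here is a minimal sketch on a plain NumPy array. The variable names (values, parts, sizes) are chosen only for this illustration.

```python
import numpy as np

# Seven elements cannot be divided evenly into three parts
values = np.arange(7)

# array_split allows uneven parts: the leading parts get the extra elements
parts = np.array_split(values, 3)

# Collect the length of each part
sizes = [len(part) for part in parts]
print(sizes)  # [3, 2, 2]
```

The same sizing rule should apply to rows when a DataFrame is passed in place of the array.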
Note
To work with pandas, we first need to import the pandas package. The syntax is:
import pandas as pd
Let us understand with the help of an example.
Python program to split a large pandas DataFrame
# Importing pandas package
import pandas as pd
# Importing numpy package
import numpy as np
# Create a DataFrame
df = pd.DataFrame({
'A':[39,40,32,45,89,102293],
'B':[40,39,22,54,22,0],
'C':[42,44,20,49,30,110],
'D':[30,34,43,56,44,86],
'E':[76,67,45,56,55,45]
})
# Display original DataFrame
print("Original DataFrame:\n",df,"\n")
# Splitting the DataFrame into 3 parts along the rows
result = np.array_split(df, 3)
# Display result (a list of 3 DataFrames)
print("Split DataFrame:\n",result,"\n")
Output
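The output of the above program is the original DataFrame followed by a list of three DataFrames, each containing two consecutive rows of the original.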
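As a complementary sketch, a large DataFrame can also be split by a fixed number of rows per part using plain pandas slicing with iloc, instead of a fixed number of parts. This is not the method shown in the article: split_by_rows and rows_per_chunk are hypothetical names introduced only for this illustration. It may be useful because some recent pandas versions may emit a deprecation warning when a DataFrame is passed to numpy.array_split().

```python
import pandas as pd

def split_by_rows(frame, rows_per_chunk):
    """Hypothetical helper: slice `frame` into consecutive blocks of rows."""
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be at least 1")
    # Step through the row positions and take one positional slice per step
    return [
        frame.iloc[start:start + rows_per_chunk]
        for start in range(0, len(frame), rows_per_chunk)
    ]

# A reduced version of the example DataFrame
df = pd.DataFrame({
    'A': [39, 40, 32, 45, 89, 102293],
    'B': [40, 39, 22, 54, 22, 0],
})

# Six rows in blocks of four: the last block holds the remaining two rows
chunks = split_by_rows(df, 4)
for chunk in chunks:
    print(chunk.shape)
```

Each part keeps its original index labels, so pd.concat(chunks) reassembles the original DataFrame.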