How to estimate how much memory a Pandas DataFrame will need?
Given a Pandas DataFrame, we have to estimate how much memory it will need.
By Pranit Sharma Last updated : September 22, 2023
Pandas is a Python library for manipulating and analyzing data efficiently. Inside pandas, we mostly deal with a dataset in the form of a DataFrame. DataFrames are 2-dimensional data structures that consist of rows, columns, and the data.
A DataFrame can be created from Python dictionaries or lists, but in the real world, CSV files are usually imported and then converted into DataFrames. Here, we are going to estimate how much memory a DataFrame occupies.
When loading a CSV file, it helps to know how much memory the resulting DataFrame will take. Pandas can report an estimate, in bytes, of the memory that a DataFrame consumes.
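One possible way to get a rough figure before loading a large file is to read only a sample of rows, measure it, and scale up. The sketch below is not part of the original example: the in-memory csv_text is hypothetical data standing in for a real file, and the 100-row sample size is an arbitrary assumption. The result is only an approximation, since later rows may differ from the sample.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "Players,Format,Runs\n" + "\n".join(
    f"Player{i},ODI,{1000 + i}" for i in range(1000)
)

# Load only the first 100 rows as a sample (sample size is an assumption).
sample = pd.read_csv(io.StringIO(csv_text), nrows=100)

# Average bytes per row in the sample, excluding the index.
# deep=True also measures the contents of string values.
bytes_per_row = sample.memory_usage(index=False, deep=True).sum() / len(sample)

# Count the data rows without parsing them (subtract one header line).
total_rows = sum(1 for _ in io.StringIO(csv_text)) - 1

# Scale the sample measurement up to the whole file.
estimated_bytes = int(bytes_per_row * total_rows)
print(f"Estimated size: {estimated_bytes} bytes for {total_rows} rows")
```

With a real file, the io.StringIO(csv_text) calls would be replaced by the file path and an open file handle respectively.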
We can also estimate how much memory each column consumes. In the end, we can add up these per-column estimates to get the total memory consumption.
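As a minimal sketch of adding up the per-column estimates, the snippet below calls memory_usage() on a small DataFrame and sums the resulting values. The three-row DataFrame is a shortened, illustrative version of the data used later in this article.

```python
import pandas as pd

# Small illustrative DataFrame.
df = pd.DataFrame({
    "Players": ["Sachin", "Ganguly", "Dravid"],
    "Runs": [15921, 7212, 13228],
})

# Series of byte counts: one entry for the index and one per column.
per_column = df.memory_usage()

# Adding all entries gives the total estimate in bytes.
total_bytes = per_column.sum()

print(per_column)
print("Total bytes:", total_bytes)
```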
Problem statement
Given a Pandas DataFrame, we have to estimate how much memory it will need.
Estimating how much memory a Pandas DataFrame will need
For this purpose, we will use the pandas.DataFrame.memory_usage() method. It returns a Series containing the estimated memory usage, in bytes, of the index and of each column.
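The method also accepts index and deep parameters, which affect what is counted. The sketch below compares them on a small illustrative DataFrame (not the one from the example in this article). With the default deep=False, a column of object dtype is measured without inspecting the Python strings it refers to, so deep=True usually gives a larger and more realistic figure for such columns. The exact numbers depend on the pandas version and platform, so none are shown here.

```python
import pandas as pd

# Small illustrative DataFrame.
df = pd.DataFrame({
    "Players": ["Sachin", "Ganguly", "Dravid"],
    "Runs": [15921, 7212, 13228],
})

# Default call: index=True, deep=False.
shallow = df.memory_usage()

# deep=True also measures the string values themselves.
deep = df.memory_usage(deep=True)

# index=False leaves out the "Index" entry.
no_index = df.memory_usage(index=False)

# For a fixed-width numeric column: bytes = number of rows * bytes per value.
runs_bytes = len(df) * df["Runs"].dtype.itemsize

print(shallow, deep, no_index, sep="\n\n")
print("Runs column:", runs_bytes, "bytes")
```

The last calculation shows why numeric columns are easy to estimate by hand: the reported value for Runs matches the row count multiplied by the size of one value.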
Note
To work with pandas, we need to import the pandas package first. Below is the syntax:
import pandas as pd
Let us understand with the help of an example.
Python program to estimate how much memory a Pandas DataFrame will need
# Importing pandas package
import pandas as pd

# Creating a dictionary of cricket players and their ODI runs
d = {
    "Players": ['Sachin', 'Ganguly', 'Dravid', 'Yuvraj', 'Dhoni', 'Kohli'],
    "Format": ['ODI', 'ODI', 'ODI', 'ODI', 'ODI', 'ODI'],
    "Runs": [15921, 7212, 13228, 1900, 4876, 8043]
}

# Creating the DataFrame
df = pd.DataFrame(d)

# Viewing the DataFrame
print("DataFrame:\n", df, "\n\n")

# Accessing the estimated memory usage (in bytes) of the index and each column
result = df.memory_usage()

# Displaying the result
print("Estimated memory usage:\n", result)
Output
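The output of the above program shows the DataFrame, followed by the estimated memory usage in bytes for the index and for each of the Players, Format and Runs columns.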