Writing pandas DataFrame to JSON in unicode
Given a pandas DataFrame, we have to write it to JSON while keeping its Unicode characters intact.
By Pranit Sharma Last updated : October 05, 2023
Pandas is a Python library for manipulating and analyzing data effectively and efficiently. Inside pandas, we mostly deal with a dataset in the form of a DataFrame. DataFrames are 2-dimensional data structures in pandas. DataFrames consist of rows, columns, and data.
JSON is a text format whose objects resemble a Python dictionary: they hold elements in the form of key:value pairs. JSON allows programmers to store different data types in human-readable text, with the keys representing the names and the values containing the related data.
Unicode supports more than a million code points. A code point is written as U+ followed by its number in hexadecimal; for example, τ is U+03C4.
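The relationship between a Python dictionary and JSON text can be sketched with the standard-library json module. This is only an illustration; the record dictionary and its keys are made-up sample data.

```python
import json

# A hypothetical record; the keys and values are sample data only.
record = {"symbol": "a", "value": 1, "tags": ["x", "y"], "active": True, "note": None}

# Serialize the dictionary to JSON text (a plain string).
text = json.dumps(record)
print(text)

# Parse the JSON text back into Python objects.
restored = json.loads(text)
print(restored == record)
```

Note that JSON itself is always text: the dictionary exists only on the Python side, before serializing and after parsing.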
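As a small sketch, Python's built-in ord() returns the code point of a character, which can then be formatted in the U+ notation. The labels dictionary is just an illustrative name.

```python
# Map each character to its code point in the usual U+XXXX notation.
labels = {ch: f"U+{ord(ch):04X}" for ch in "τπ"}

for ch, label in labels.items():
    # Also show the bytes used when the character is stored as UTF-8.
    print(ch, label, ch.encode("utf-8"))
```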
Problem statement
We are given a DataFrame with multiple columns, some of which contain Unicode characters. Whenever we convert this DataFrame to JSON with the default settings, each non-ASCII character is replaced by its escaped code point.
Write a DataFrame to JSON in unicode
To keep the original characters in the JSON file, pass the parameter force_ascii=False to pandas.DataFrame.to_json(). By default, force_ascii is True, so every non-ASCII character is written as a \uXXXX escape sequence. When writing through a file object, open the file with encoding='utf-8' so that the characters are stored as UTF-8.
Let us understand with the help of an example.
Python program to write pandas DataFrame to JSON in unicode
# Importing pandas package
import pandas as pd
# Creating a dataframe
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
# Converting to json
df.to_json('df.json')
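Output (contents of df.json):
{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}
As we can see, the Unicode characters are written as escaped code points (for example, \u03c4 instead of τ). To keep the original characters, we write the file again using the following code: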
# Opening the file as UTF-8 and writing the JSON without ASCII escaping
with open('df.json', 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)
Output (contents of df.json):
{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}
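To compare the two settings without writing a file, here is a minimal sketch that relies on to_json() returning the JSON text as a string when no path is given. The variable names escaped, literal, and same_data are illustrative only.

```python
import json
import pandas as pd

df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

# With no path argument, to_json() returns the JSON text as a string.
escaped = df.to_json()                   # force_ascii defaults to True
literal = df.to_json(force_ascii=False)  # keep non-ASCII characters as-is

print(escaped)
print(literal)

# Both strings decode to the same data; only the text representation differs.
same_data = json.loads(escaped) == json.loads(literal)
print(same_data)
```

The escaped form is still valid JSON and any JSON parser will restore the original characters from it, so force_ascii=False mainly matters when the file should be readable by people or by tools that do not decode the escapes.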