Python program to calculate Jaccard similarity
Jaccard similarity in Python: In this tutorial, we will learn what Jaccard similarity is, how to calculate it, and how to write a Python program that calculates it.
By IncludeHelp Last updated : August 13, 2023
Jaccard similarity
The Jaccard similarity measures how similar two data sets are by comparing the members they share with all of their distinct members. It is calculated by dividing the size of the intersection of the two sets by the size of their union, so the result ranges from 0 (no shared members) to 1 (identical sets). Learn more about Jaccard similarity at learndatasci.
Jaccard similarity formula
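The formula for Jaccard similarity is:
J(A, B) = |A ∩ B| / |A ∪ B|
Here, |A ∩ B| is the number of members common to both sets and |A ∪ B| is the number of distinct members across both sets.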
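As a quick check of the formula, here is a minimal sketch that works through the arithmetic on two small sets. The sets `a` and `b` are hypothetical values chosen for illustration, and `Fraction` is used only to show the exact ratio before converting it to a float.

```python
from fractions import Fraction

# Hypothetical sample sets, chosen only for illustration
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

shared = a & b      # intersection: {3, 4}
combined = a | b    # union: {1, 2, 3, 4, 5, 6}

# Exact ratio: size of the intersection over size of the union
similarity = Fraction(len(shared), len(combined))

print(similarity)         # 1/3
print(float(similarity))  # 0.3333333333333333
```

Two of the six distinct members are shared, so the similarity is 2/6, which reduces to 1/3.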
Example 1: Python program to calculate Jaccard similarity
In this program, we are finding the Jaccard similarity of the given data sets.
set1 = {10, 20, 30, 40, 50}
set2 = {10, 20, 30, 80, 90}
# Finding the intersection of sets
intersection_result = set1.intersection(set2)
# Finding the union of sets
union_result = set1.union(set2)
# Printing the values
print("AnB =", intersection_result)
print("AUB =", union_result)
print(
    "Jaccard similarity: J(set1, set2) =",
    len(intersection_result) / len(union_result),
)
Output
AnB = {10, 20, 30}
AUB = {40, 10, 80, 50, 20, 90, 30}
Jaccard similarity: J(set1, set2) = 0.42857142857142855
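The size of the union can also be obtained without building the union set, by using the identity |A ∪ B| = |A| + |B| - |A ∩ B|. The following is a minimal sketch of that variant on the same two sets; the variable names `intersection_size`, `union_size`, and `jaccard` are illustrative and not part of the original program.

```python
set1 = {10, 20, 30, 40, 50}
set2 = {10, 20, 30, 80, 90}

# Size of the intersection: members present in both sets
intersection_size = len(set1 & set2)

# Inclusion-exclusion: |A U B| = |A| + |B| - |A n B|
union_size = len(set1) + len(set2) - intersection_size

jaccard = intersection_size / union_size
print("Jaccard similarity:", jaccard)
```

This gives the same value as dividing by `len(set1.union(set2))`, because shared members are counted once in the union.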
Example 2: Python program to calculate Jaccard similarity
In this program, we write a user-defined function that takes two sets as arguments and returns their Jaccard similarity.
# Function to calculate Jaccard similarity
def calculate_Jaccard_similarity(set1, set2):
    # Size of the intersection of the sets
    intersection = len(set1.intersection(set2))
    # Size of the union of the sets
    union = len(set1.union(set2))
    # Calculating Jaccard similarity
    Jaccard_similarity = intersection / union
    # Returning it
    return Jaccard_similarity

# Main code, i.e., call the function here
set1 = {10, 20, 30, 40, 50}
set2 = {10, 20, 30, 80, 90}
result = calculate_Jaccard_similarity(set1, set2)
print("Jaccard similarity is:", result)
Output
Jaccard similarity is: 0.42857142857142855
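The function above divides by the size of the union, so it raises ZeroDivisionError when both sets are empty. Below is a hedged sketch of one way to handle that case. The name `jaccard_similarity` is hypothetical, and returning 1.0 for two empty inputs is an assumed convention (two empty collections are treated as identical); other conventions, such as returning 0.0, are also used. The sketch also converts its inputs with `set()`, so lists containing duplicates can be passed directly.

```python
def jaccard_similarity(a, b):
    """Return the Jaccard similarity of two iterables of hashable items."""
    # Convert to sets so duplicates are ignored and lists are accepted
    set_a, set_b = set(a), set(b)
    union_size = len(set_a | set_b)
    if union_size == 0:
        # Assumption: two empty collections are treated as identical
        return 1.0
    return len(set_a & set_b) / union_size


print(jaccard_similarity([1, 2, 2, 3], [2, 3, 4]))  # 0.5
print(jaccard_similarity([], []))                   # 1.0
```

In the first call, the inputs reduce to {1, 2, 3} and {2, 3, 4}, which share two of four distinct members.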