Cosine Similarity Part 1: The Basics – Algorithms for Big Data

In business settings, cosine similarity is typically used to compare customer profiles, product profiles, or text documents. The underlying algorithmic question is whether two such objects, for example two customer profiles, are similar. Cosine similarity is perhaps the simplest way to answer it.

If one can measure how similar any two objects are, that similarity becomes a building block for more complex tasks, such as:

search: find the document most similar to a given one
classification: is a given customer likely to buy a given product?
clustering: are there natural groups of similar documents?
product recommendations: which products are similar to the customer’s past purchases?
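
To make the "building block" idea concrete, here is a minimal sketch of the search task built on top of a similarity function. It assumes each document has already been turned into a vector of numbers (here, made-up word counts) and uses the standard definition of cosine similarity: the dot product of the two vectors divided by the product of their lengths. The function names, the document names and the counts are all hypothetical, chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, candidates):
    """Search: return the name of the candidate most similar to the query."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))

# Hypothetical word-count vectors over the vocabulary ["big", "data", "cat"].
documents = {
    "doc_a": [2, 3, 0],
    "doc_b": [0, 0, 5],
    "doc_c": [1, 1, 1],
}
query = [1, 2, 0]

print(most_similar(query, documents))  # prints "doc_a"
```

The other tasks in the list can reuse the same `cosine_similarity` function in a similar way, for instance by ranking products against a vector of past purchases instead of ranking documents against a query.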