Implementing the Five Most Popular Similarity Measures in Python – Dataconomy

Similarity is the measure of how much alike two data objects are. Similarity in a data mining context is usually described as a distance with dimensions representing features of the objects. If this distance is small, there will be high degree of similarity; if a distance is large, there will be low degree of similarity. Similarity is subjective and is highly dependent on the domain and application. For example, two fruits are similar because of color or size or taste. Care should be taken when calculating distance across dimensions/features that are unrelated. The relative values of each feature must be normalized, or one feature could end up dominating the distance

These are my links for July 20th through July 24th:

Ask HN: Best-architected open-source business applications worth studying? | Hacker News -
Monospaced Programming Fonts with Ligatures | Hacker News -
The language of choice - Propositional logic was discovered by Stoics around 300 B.C., only to be abandoned in later antiquity and rebuilt in the 19th century by George Boole’s successors. One of them, Charles Peirce, saw its significance for what we now call logic circuits, yet that discovery too was forgotten until the 1930s. In the ’50s John McCarthy invented conditional expressions, casting the logic into the form we’ll study here; then in 1986 Randal Bryant repeated one of McCarthy’s constructions with a crucial tweak that made his report “for many years the most cited paper in all of computer science, because it revolutionized the data structures used to represent Boolean functions” (Knuth).1 Let’s explore and code up some of this heritage of millennia, and bring it to bear on a suitable challenge: playing tic-tac-toe.
Then we’ll tackle a task that’s a little more practical: verifying a carry-lookahead adder circuit. Supposedly logic gets used all the time for all kinds of serious work, but for such you’ll have to consult the serious authors; what I can say myself, from working out the code to follow, is that the subject offers a fun playground plus the most primitive form of the pun between meaning and mechanism.

You’re encouraged to read with this article’s code cloned and ready

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30

Chris's Digital Detritus

Videotext for the twenty first century.

Implementing the Five Most Popular Similarity Measures in Python – Dataconomy