- The 640 K Barrier | Hacker News
- Competitive Programmer’s Handbook – The purpose of this book is to give the reader a thorough introduction to competitive programming. The book is especially intended for students who want to learn algorithms and possibly participate in the International Olympiad in Informatics (IOI) or in the International Collegiate Programming Contest (ICPC).
- Competitive Programming Book Companion Website – This book contains a collection of relevant data structures, algorithms, and programming tips written for University students who want to be more competitive in the ACM International Collegiate Programming Contest (ICPC), high school students who are aspiring to be competitive in the International Olympiad in Informatics (IOI), coaches for these competitions, those who love problem solving using computer programs, and those who go for interviews in big IT-companies.
Bookmarks for April 8th
- https://therecipe.github.io/qt/
- GitHub – shlomif/PySolFC: An awesome collection of Solitaire games for Python. This is an unofficial clone of the Subversion repository, in order to maintain some branches with improvements.
- GitHub – happi/theBeamBook: A description of the Erlang Runtime System ERTS and the virtual Machine BEAM.
Bookmarks for April 3rd through April 4th
These are my links for April 3rd through April 4th:
- Python for Business: Identifying Duplicate Data – 33 Sticks – Data Preparation is one of those critical tasks that most digital analysts take for granted as many of the analytics platforms we use take care of this task for us or at least we like to believe they do so. With that said, Data Preparation should be a task that every good analyst completes as part of any data investigation.
  Wes McKinney, author of Python for Data Analysis, defines Data Preparation as “cleaning, munging, combining, normalizing, reshaping, slicing, dicing, and transforming data for analysis.”
  In this post, I am going to walk you through a real world example, focusing on Data Preparation, of how Python can be a very powerful tool for business focused data analysis.
- Data Mining: Finding Similar Items and Users – To find similar items to a certain item, you've got to first define what it means for 2 items to be similar and this depends on the problem you're trying to solve:
  - on a blog, you may want to suggest similar articles that share the same tags, or that have been viewed by the same people viewing the item you want to compare with
  - Amazon has this section called "customers that bought this item also bought", which is self-explanatory
  - a service like IMDB, based on your ratings, could find users similar to you, users that liked or hated approximately the same movies you did, thus giving you suggestions on movies you'd like to watch in the future
  In each case you need a way to classify these items you're comparing, whether it is tags, or items purchased, or movies reviewed. We'll be using tags, as it is simpler, but the formula holds for more complicated instances.
- Implementing the Five Most Popular Similarity Measures in Python – Dataconomy – Similarity is the measure of how much alike two data objects are. Similarity in a data mining context is usually described as a distance with dimensions representing features of the objects. If this distance is small, there will be high degree of similarity; if a distance is large, there will be low degree of similarity. Similarity is subjective and is highly dependent on the domain and application. For example, two fruits are similar because of color or size or taste. Care should be taken when calculating distance across dimensions/features that are unrelated. The relative values of each feature must be normalized, or one feature could end up dominating the distance
- Cosine Similarity Part 1: The Basics – Algorithms for Big Data – The business use case for cosine similarity involves comparing customer profiles, product profiles or text documents. The algorithmic question is whether two customer profiles are similar or not. Cosine similarity is perhaps the simplest way to determine this.
  If one can compare whether any two objects are similar, one can use the similarity as a building block to achieve more complex tasks, such as:
  - search: find the most similar document to a given one
  - classification: is some customer likely to buy that product
  - clustering: are there natural groups of similar documents
  - product recommendations: which products are similar to the customer’s past purchases
- Harry Potter and the Methods of Rationality | Petunia married a professor, and Harry grew up reading science and science fiction.
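A note to self on the similarity links above: the two ideas they describe (comparing items by shared tags, and comparing profiles by cosine similarity) can be sketched in a few lines of Python. This is only a minimal illustration, not code from any of the linked articles. The function names, the sample tags, and the sample purchase counts are all made up here, and using Jaccard overlap for the tag comparison is my assumption about what "the formula" for tags would look like.

```python
import math


def jaccard_similarity(tags_a, tags_b):
    """Fraction of tags shared by two items: |A & B| / |A | B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # convention chosen for this sketch: untagged items are not similar
    return len(a & b) / len(a | b)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


# Hypothetical blog articles described by their tags.
article_a = {"python", "pandas", "data"}
article_b = {"python", "data", "sql"}
print(jaccard_similarity(article_a, article_b))  # 2 shared tags out of 4 distinct

# Hypothetical customer profiles: purchase counts per product category.
customer_a = [3, 0, 1]
customer_b = [6, 0, 2]
print(cosine_similarity(customer_a, customer_b))  # close to 1: same buying pattern
```

Cosine similarity looks only at the direction of the vectors, so a customer who buys twice as much of everything still counts as similar. Distance-based measures behave differently, which is where the normalization warning in the Dataconomy excerpt comes in.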
Bookmarks for March 12th
- Problem Solving with Algorithms and Data Structures using Python
- I had an autoimmune disease, then the disease had me (2013) | Hacker News
- On snot and fonts / Luc Devroye
Bookmarks for February 16th
- Show HN: A guide to all HTML5 elements and attributes | Hacker News
- HTML Reference – A free guide to all HTML elements and attributes. –
- os01 by tuhdo
- Why LINQ beats SQL | Hacker News