- Men Without Work | Hacker News
- A deep dive into APL
- how_to_rust_language_visual_studio_2017
- Containers vs. Zones vs. Jails vs. VMs | Hacker News
Web Bookmarks
Bookmarked content from around the web. These links originally came from my Delicious feed; I’m now using Pinboard.
Bookmarks for April 5th through April 6th
These are my links for April 5th through April 6th:
- Tales from a Core File » Turtles on the Wire: Understanding how the OS uses the Modern NIC
- Alliance on Nautilus: The Unbearable Weirdness of CRISPR
- GitHub – hnarayanan/artistic-style-transfer: Convolutional neural networks for artistic style transfer.
- Convolutional neural networks for artistic style transfer — Harish Narayanan
- Deep Photo Style Transfer | Hacker News
- Using Rust in Windows
- Kotlin/Native Tech Preview: Kotlin without a VM | Kotlin Blog
- Why F.E.A.R.’s AI is still the best in first-person shooters | Hacker News
Bookmarks for April 5th
- Open Source Community Forum Software
- NASA Technical Reports Server (NTRS) – On the typography of flight-deck documentation – The report focuses on typographical factors such as type-faces, character height, use of lower- and upper-case characters, line length, and spacing. Some graphical aspects such as layout, color coding, fonts, and character contrast are also discussed. In addition, several aspects of cockpit reading conditions such as glare, angular alignment, and paper quality are addressed. Finally, a list of recommendations for the graphical design of flight-deck documentation is provided.
- Ask HN: Building a side project that makes money. Where to start? | Hacker News
Bookmarks for April 3rd through April 4th
These are my links for April 3rd through April 4th:
- Python for Business: Identifying Duplicate Data – 33 Sticks – Data Preparation is one of those critical tasks that most digital analysts take for granted as many of the analytics platforms we use take care of this task for us or at least we like to believe they do so. With that said, Data Preparation should be a task that every good analyst completes as part of any data investigation.
Wes McKinney, author of Python for Data Analysis, defines Data Preparation as “cleaning, munging, combining, normalizing, reshaping, slicing, dicing, and transforming data for analysis.”
In this post, I am going to walk you through a real world example, focusing on Data Preparation, of how Python can be a very powerful tool for business focused data analysis.
- Data Mining: Finding Similar Items and Users – To find similar items to a certain item, you've got to first define what it means for 2 items to be similar and this depends on the problem you're trying to solve:
  - on a blog, you may want to suggest similar articles that share the same tags, or that have been viewed by the same people viewing the item you want to compare with
  - Amazon has this section called "customers that bought this item also bought", which is self-explanatory
  - a service like IMDB, based on your ratings, could find users similar to you, users that liked or hated approximately the same movies you did, thus giving you suggestions on movies you'd like to watch in the future
In each case you need a way to classify these items you're comparing, whether it is tags, or items purchased, or movies reviewed. We'll be using tags, as it is simpler, but the formula holds for more complicated instances.
- Implementing the Five Most Popular Similarity Measures in Python – Dataconomy – Similarity is the measure of how much alike two data objects are. Similarity in a data mining context is usually described as a distance with dimensions representing features of the objects. If this distance is small, there will be high degree of similarity; if a distance is large, there will be low degree of similarity. Similarity is subjective and is highly dependent on the domain and application. For example, two fruits are similar because of color or size or taste. Care should be taken when calculating distance across dimensions/features that are unrelated. The relative values of each feature must be normalized, or one feature could end up dominating the distance.
- Cosine Similarity Part 1: The Basics – Algorithms for Big Data – The business use case for cosine similarity involves comparing customer profiles, product profiles or text documents. The algorithmic question is whether two customer profiles are similar or not. Cosine similarity is perhaps the simplest way to determine this.
If one can compare whether any two objects are similar, one can use the similarity as a building block to achieve more complex tasks, such as:
  - search: find the most similar document to a given one
  - classification: is some customer likely to buy that product
  - clustering: are there natural groups of similar documents
  - product recommendations: which products are similar to the customer’s past purchases
- Harry Potter and the Methods of Rationality | Petunia married a professor, and Harry grew up reading science and science fiction.
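As a rough illustration of the similarity measures described in the excerpts above, here is a minimal Python sketch. It is not taken from any of the linked articles: the function names are hypothetical, and using Jaccard overlap for tag-based similarity is an assumption on my part, since the excerpt does not say which formula the linked post uses. Cosine similarity is the standard definition (dot product divided by the product of the vector lengths).

```python
import math


def jaccard_similarity(tags_a, tags_b):
    """Share of tags two items have in common (assumed measure for tag overlap)."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two untagged items: treat as not similar
    return len(a & b) / len(a | b)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_u * norm_v)


# Illustrative data only: two blog posts with tags, two customer profiles.
post_a = ["python", "data", "pandas"]
post_b = ["python", "web"]
print(jaccard_similarity(post_a, post_b))

profile_a = [3, 0, 1]  # e.g. purchase counts per product category
profile_b = [6, 0, 2]
print(cosine_similarity(profile_a, profile_b))
```

Cosine similarity ignores vector length, so the two profiles above point in the same direction even though one customer bought twice as much. Distance-based measures such as Euclidean distance do depend on scale, which is why the Dataconomy excerpt warns about normalizing features first.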
Bookmarks for April 3rd
- Show HN: Fake SMTP server as a service | Hacker News
- Category Theory for Programmers: The Preface | Bartosz Milewski’s Programming Cafe
- Education of a Programmer | Hacker News