Wednesday, December 31, 2014

Programming Computer Vision with Python

http://programmingcomputervision.com/

PCV - an open source Python module for computer vision

PCV is a pure Python library for computer vision based on the book "Programming Computer Vision with Python" by Jan Erik Solem.
Available from Amazon and O'Reilly.

The final pre-production draft of the book (as of March 18, 2012) is available under a Creative Commons license. Note that this version does not have the final copy edits and last minute fixes. If you like the book, consider supporting O'Reilly and me by purchasing the official version.
The final draft PDF is here.
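
To give a flavor of what the book and library cover, here is a minimal sketch of a Harris corner response in the style of the book's chapter 2, written directly against NumPy, SciPy, and PIL (the stack PCV is built on) rather than PCV's own API; the function name, the smoothing parameters, and the example file are illustrative assumptions, not the library's interface.

# Illustrative sketch, not PCV's own API: Harris corner response
# in the spirit of the book's chapter 2, using NumPy/SciPy/PIL.
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def harris_response(im, sigma=3):
    """Harris corner response for a grayscale float image array."""
    # Gaussian-smoothed image derivatives in x and y
    imx = gaussian_filter(im, (sigma, sigma), order=(0, 1))
    imy = gaussian_filter(im, (sigma, sigma), order=(1, 0))
    # entries of the Harris matrix, smoothed again
    Wxx = gaussian_filter(imx * imx, sigma)
    Wxy = gaussian_filter(imx * imy, sigma)
    Wyy = gaussian_filter(imy * imy, sigma)
    # determinant-over-trace form of the response
    det = Wxx * Wyy - Wxy ** 2
    trace = Wxx + Wyy
    return det / (trace + 1e-12)

# usage with a hypothetical image file:
# im = np.array(Image.open('example.jpg').convert('L'), dtype=float)
# R = harris_response(im)

Large values of R mark corner-like neighborhoods; the book then thresholds this response and suppresses nearby maxima to get interest points.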

Thursday, December 25, 2014

Fast Domain Generalization with Kernel Methods – Part 2 (SSSL)

https://ghifar.wordpress.com/2014/12/25/fast-domain-generalization-with-kernel-methods-part-2-sssl/

Basically, when there are several candidate solutions to a problem, I am most drawn to the simplest one (one reading of Occam's razor). In semi-supervised learning, I would put the Simple Semi-Supervised Learning (SSSL) algorithm (Ji et al., ICML 2012) in this category. In many cases SSSL even outperforms one of the methods previously regarded as the best.
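
Part of SSSL's simplicity is that it reduces semi-supervised learning to two standard steps: use the unlabeled data to build a graph Laplacian and take its smoothest eigenvectors as features, then fit an ordinary supervised model on the labeled points in that basis. The sketch below captures that two-step structure; the k-NN graph construction, the eigenvector count, and the ridge classifier are illustrative assumptions, not the paper's exact recipe.

# Minimal sketch of the SSSL idea (Ji et al., ICML 2012):
# unlabeled data shapes the features, labeled data fits the model.
# Graph construction and hyperparameters here are illustrative.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.linear_model import RidgeClassifier

def sssl_features(X_all, n_eig=20, n_neighbors=10):
    # symmetric k-NN adjacency over labeled + unlabeled points
    W = kneighbors_graph(X_all, n_neighbors, mode='connectivity')
    W = (0.5 * (W + W.T)).toarray()
    d = W.sum(axis=1)
    # normalized graph Laplacian L = I - D^{-1/2} W D^{-1/2}
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(X_all)) - D_inv_sqrt @ W @ D_inv_sqrt
    # eigenvectors with the smallest eigenvalues vary smoothly on the graph
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, :n_eig]

# usage: rows 0..n_lab-1 of X_all are labeled with y_lab
# Z = sssl_features(X_all)
# clf = RidgeClassifier().fit(Z[:n_lab], y_lab)
# y_pred = clf.predict(Z[n_lab:])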

Friday, December 19, 2014

Machine Learning: The High-Interest Credit Card of Technical Debt


https://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/43146.pdf

Machine learning offers a fantastically powerful toolkit for building complex systems quickly. This paper argues that it is dangerous to think of these quick wins as coming for free. Using the framework of technical debt, we note that it is remarkably easy to incur massive ongoing maintenance costs at the system level when applying machine learning. The goal of this paper is to highlight several machine-learning-specific risk factors and design patterns to be avoided or refactored where possible. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, changes in the external world, and a variety of system-level anti-patterns.

Machine Learning Course at CMU


http://www.cs.cmu.edu/~tom/10701_sp11/

Course Description:
Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as Bayesian networks, decision tree learning, Support Vector Machines, statistical learning methods, unsupervised learning and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam's Razor. Short programming assignments include hands-on experiments with various learning algorithms, and a larger course project gives students a chance to dig into an area of their choice. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.

National Data Science Bowl

A data science competition: plankton classification

http://www.kaggle.com/c/datasciencebowl/data

The Geometry of Classifiers

Source: http://www.win-vector.com/blog/2014/12/the-geometry-of-classifiers/

As John mentioned in his last post, we have been quite interested in the recent study by Fernandez-Delgado, et al., "Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?" (the "DWN study" for short), which evaluated 179 popular implementations of common classification algorithms over 120 or so data sets, mostly from the UCI Machine Learning Repository. For fun, we decided to do a follow-up study, using their data and several classifier implementations from scikit-learn, the Python machine learning library. We were interested not just in classifier accuracy, but also in seeing if there is a "geometry" of classifiers: which classifiers produce prediction patterns that look similar to each other, and which classifiers produce predictions that are quite different? To examine these questions, we put together a Shiny app to interactively explore how the relative behavior of classifiers changes for different types of data sets.
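
The core measurement behind that "geometry" is simple to reproduce in miniature: fit several classifiers on the same training data and quantify how often each pair disagrees on held-out predictions. The sketch below does this with scikit-learn; the data set, the four classifiers, and the raw disagreement rate are illustrative choices, not the blog post's exact setup (which compares many more implementations over the DWN study's data sets via a Shiny app).

# Small sketch of the "geometry of classifiers" idea: pairwise
# disagreement between held-out predictions as a crude distance
# between classifiers. Choices here are illustrative.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    'logreg': LogisticRegression(max_iter=5000),
    'rf': RandomForestClassifier(random_state=0),
    'svm': SVC(),
    'nb': GaussianNB(),
}
preds = {name: m.fit(X_tr, y_tr).predict(X_te) for name, m in models.items()}

# disagreement rate for every pair of classifiers
names = list(preds)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        d = np.mean(preds[a] != preds[b])
        print(f'{a:>6} vs {b:<6} disagreement = {d:.3f}')

Classifiers with low mutual disagreement occupy the same "region" of this geometry; feeding the pairwise disagreement matrix to a clustering or multidimensional-scaling step would give a picture much like the one the post explores interactively.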