Putting Deep Learning to Work
http://blog.udacity.com/2016/01/putting-deep-learning-to-work.html
Deep learning is a modern take on the old idea of teaching
computers, instead of programming them. It has taken the world of
machine learning by storm in recent years, and for good reason! Deep
learning provides state-of-the-art results in many of the thorniest
problems in computing, from machine perception and forecasting, to
analytics and natural language processing. Our brand new Deep Learning Course,
a collaboration between Google and Udacity, will have you learning and
mastering these techniques in an interactive, hands-on fashion, and give
you the tools and best practices you need to apply deep learning to
solve your own problems.
Reading the flurry of recent popular
press around deep learning, you might rightfully wonder: isn’t deep
learning just a ‘Big Data’ thing? Don’t I need the computing resources
of Google or Facebook to take advantage of it? Isn’t there a lot of
‘black magic’ involved in making these models tick? And wouldn’t it only
work for a narrow spectrum of perception tasks in the first place?
As someone from industry who
accidentally fell into deep learning while working on Google Voice
Search just five years ago, I've seen how nothing could be further from
the truth. At that time, I didn’t use Google’s bazillion machines to get
started with deep learning: I bought a modest computer with a GPU.
Getting started was difficult then: few people outside of select
research groups in academia knew the tricks of the trade which were
necessary to make deep learning work well. But the trend – that
continues today – of researchers using open source tools, and open
sourcing the results of their papers started to take root, and today
that knowledge is readily accessible to anyone with a basic understanding
of machine learning.
Today, the best tools for both research and industrial deployment are all open source, and TensorFlow is a prime example of a framework that caters to the whole spectrum of users: from researchers and data scientists to systems engineers who need to deploy production-grade systems. Deep learning is also significantly impacting other arenas, including drug
discovery, natural language processing, and web analytics. In these
cases and more, deep learning is augmenting—and often replacing—the
traditional arsenal of machine learning. And it is surprisingly easy to
get started on it, with many examples to draw from on the web, open
datasets, and a thriving community of enthusiasts.
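To give a flavor of how little code a first experiment takes today, here is a minimal sketch of training a small image classifier with stochastic gradient descent. It uses the modern tf.keras API and the freely available MNIST digits; the course itself was built on an earlier TensorFlow API, so treat this as an illustration of the workflow, not the course's actual notebook code.

```python
import tensorflow as tf

# Load a small, freely available image dataset (MNIST ships with Keras).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# A small fully connected classifier, trained with stochastic gradient descent.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```

A dozen lines like these, a laptop (a GPU helps but isn't required), and an open dataset are genuinely all it takes to get started.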
This new course is geared toward those of you who are eager to try
deep learning in the real world, but have not yet made the jump. Perhaps that's because you need concrete solutions to concrete problems, and
want just enough theory to feel confident exploring what this approach
to machine learning can do for you. Or maybe you’re an undergraduate or
beginning graduate student, who wants to get your feet wet without
spending a whole semester on the problem quite yet, or who wants to
start coding here-and-now on a research problem that you care about.
This course is also a great way for those of you with some experience in deep learning to get started with TensorFlow, with end-to-end solutions to several classes of problems spelled out in detailed IPython notebooks.
Deep learning is just now getting out
of the lab and proving itself as a fantastic tool for a wide variety of
users. This course is a great opportunity for you to learn about this
exciting field and put it to work for you. I hope you enjoy the class
as much as I enjoyed putting it together, and that we’ll see you soon
join the ranks of this exciting community of researchers and engineers
changing the face of machine learning, one stochastic gradient step at a
time!
Friday, July 21, 2017
Accelerating Deep Learning with the OpenCL™ Platform and Intel® Stratix® 10 FPGAs
https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/wp/wp-01269-accelerating-deep-learning-with-opencl-and-intel-stratix-10-fpgas.pdf
Introduction
Internet video traffic will grow fourfold from 2015 to 2020.[1] With this explosion of visual data, it is critical to find effective methods for sorting, classifying, and identifying imagery. Convolutional neural networks (CNNs), a machine learning methodology loosely inspired by the function of the human brain, are commonly used to analyze images. The software separates the image into sections, often overlapping, and then analyzes them to form an overall map of the visual space. This process involves several complex mathematical steps to analyze, compare, and identify the image with a low error rate.
Developers create CNNs using computationally intensive algorithms and implement them on a variety of platforms. This white paper discusses a CNN implementation on an Intel® Stratix® 10 FPGA that processes 14,000 images/second at 70 images/second/watt for large batches, and 3,015 images/second at 18 images/second/watt for a batch size of 1.† As these numbers show, Intel Stratix 10 FPGAs are competitive with other high-performance computing (HPC) devices such as GPUs for large batch sizes, and are significantly faster than other devices at low batch sizes.
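As a rough sanity check, dividing throughput by efficiency gives the board power implied by the quoted figures (assuming both numbers describe the same operating point):

14,000 images/s ÷ 70 images/s/W ≈ 200 W (large batches)
3,015 images/s ÷ 18 images/s/W ≈ 168 W (batch size 1)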
CNN benchmarks
The Stanford Vision Lab has hosted the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) since 2010. Competitors are challenged to develop a CNN algorithm that can analyze and classify objects in data sets comprising millions of images or video clips. The 2012 contest-winning algorithm, dubbed AlexNet,*[2] provided a huge leap forward in reducing classification error rates compared to previous algorithms.[3] In 2014, the winning algorithm (GoogLeNet*) reduced the error rate even further.[4] Intel has developed a novel design that implements these benchmark algorithms with modifications to boost performance on Intel FPGAs.
CNN algorithms consist of a series of operations or layers. For example, the AlexNet algorithm has:
• Convolution layers that perform a convolution operation on a 3-dimensional (3D) data array (called a feature map) and a 3D filter. The operation uses a rectified linear unit (ReLU) as an activation function.
• Cross-channel local response normalization layers that scale each feature map element by a factor computed from the elements at the same location in the channels adjacent to it.
• Max pooling layers that read the data in 2-dimensional (2D) windows and output the maximum values. (A minimal code sketch of all three operations follows this list.)
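To make the three layer types concrete, here is a minimal NumPy sketch of each operation. This illustrates the underlying math only; it is not Intel's FPGA implementation, and the LRN constants (n, k, alpha, beta) are the conventional AlexNet values, assumed here for illustration.

```python
import numpy as np

def conv_relu(fmap, filt, stride=1):
    """Valid 2D convolution of a (C, H, W) feature map with a
    (C, kH, kW) filter, followed by a ReLU activation."""
    C, H, W = fmap.shape
    _, kH, kW = filt.shape
    out_h = (H - kH) // stride + 1
    out_w = (W - kW) // stride + 1
    out = np.empty((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            window = fmap[:, y*stride:y*stride+kH, x*stride:x*stride+kW]
            out[y, x] = np.sum(window * filt)  # multiply-accumulate
    return np.maximum(out, 0.0)  # ReLU activation

def local_response_norm(fmap, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Cross-channel LRN: each element is scaled by a factor computed
    from elements at the same (y, x) location in n adjacent channels."""
    C = fmap.shape[0]
    out = np.empty_like(fmap)
    for c in range(C):
        lo, hi = max(0, c - n // 2), min(C, c + n // 2 + 1)
        denom = (k + alpha * np.sum(fmap[lo:hi] ** 2, axis=0)) ** beta
        out[c] = fmap[c] / denom
    return out

def max_pool(fmap, size=2, stride=2):
    """Slide a 2D window over each channel and keep the maximum value."""
    C, H, W = fmap.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.empty((C, out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            out[:, y, x] = fmap[:, y*stride:y*stride+size,
                                x*stride:x*stride+size].max(axis=(1, 2))
    return out

# Example: one AlexNet-style stage on a random 3-channel 16x16 input.
x = np.random.rand(3, 16, 16)
feature = conv_relu(x, np.random.rand(3, 3, 3))        # one output channel
pooled = max_pool(local_response_norm(feature[None]))  # restore channel axis
```

In an accelerator, it is the nested multiply-accumulate loops of the convolution that dominate the computation, which is what makes these workloads attractive targets for FPGAs and GPUs.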
Thursday, July 13, 2017
My Curated List of AI and Machine Learning Resources from Around the Web
https://unsupervisedmethods.com/my-curated-list-of-ai-and-machine-learning-resources-from-around-the-web-9a97823b8524
When I was writing books on networking and programming topics in the early 2000s, the web was a good but incomplete resource. Blogging had started to take off, but YouTube wasn't around yet, nor was Quora, Twitter, or podcasts. Over ten years later, as I've been diving into AI and machine learning, it's a completely different ballgame. There are so many resources that it's difficult to know where to start (and stop)!
To save you some of the effort I went through researching all the different nooks and crannies of the web to find the best content, I've organized it into a big collection here. I'm only including links to free content; there is enough of it to keep you busy for a while.
It’s amazing just how much information is available on machine learning,
deep learning, and artificial intelligence on the web. This article
should give you a sense of the scope.
I’ve
created sections below that contain: well-known researchers, AI
organizations, video courses, bloggers, Medium writers, books, YouTube
channels, Quora topics, subreddits, Github repos, podcasts, newsletters,
conferences, research links, tutorials, and cheat sheets. It’s a lot, but given the popularity of my Tutorials and Cheat Sheets articles, there seems to be a need for this kind of curated list.
Note: I wrote this in early July 2017. In some sections, I’ve included subscriber/follower/view counts, which will go out-of-date as soon as the article is published, but it should still be useful to give you a sense of interest level.
Let me know if there's anything good I'm missing! I'm always looking to add to the list.
Researchers
Many of the most well-known AI researchers have a strong presence on the web. Below I've listed around twenty and included links to their websites, Wikipedia pages, Twitter profiles, Google Scholar profiles, and Quora profiles. Quite a few have done an Ask-Me-Anything on Reddit or a Quora Session, so I've included those as well when applicable.
I could include dozens more in a list like this. See Quora for more names.
- Sebastian Thrun (Wikipedia / Twitter / GScholar / Quora / AMA)
- Yann LeCun (Wikipedia / Twitter / GScholar / Quora / AMA)
- Nando de Freitas (Wikipedia / Twitter / GScholar / AMA)
- Andrew Ng (Wikipedia / Twitter / GScholar / Quora / AMA)
- Daphne Koller (Wikipedia / Twitter / GScholar / Quora / Quora Session)
- Adam Coates (Twitter / GScholar / AMA)
- Jürgen Schmidhuber (Wikipedia / GScholar / AMA)
- Geoffrey Hinton (Wikipedia / GScholar / AMA)
- Terry Sejnowski (Wikipedia / Twitter / GScholar / AMA)
- Michael Jordan (Wikipedia / GScholar / AMA)
- Peter Norvig (Wikipedia / GScholar / Quora / AMA)
- Yoshua Bengio (Wikipedia / GScholar / Quora / AMA)
- Ian Goodfellow (Wikipedia / Twitter / GScholar / Quora / Quora Session)
- Andrej Karpathy (Twitter / GScholar / Quora / Quora Session)
- Richard Socher (Twitter / GScholar / Interview)
- Demis Hassabis (Wikipedia / Twitter / GScholar / Interview)
- Christopher Manning (Twitter / GScholar)
- Fei-Fei Li (Wikipedia / Twitter / GScholar / TED Talk)
- François Chollet (Twitter / GScholar / Quora / Quora Session)
- Dan Jurafsky (Wikipedia / Twitter / GScholar)
- Oren Etzioni (Wikipedia / Twitter / GScholar / Quora / AMA)
Organizations
There
are a handful of well-known organizations that are dedicated to
furthering AI research and development. Below are the ones with
websites/blogs and Twitter accounts.
- OpenAI / Twitter (127K followers)
- DeepMind / Twitter (80K followers)
- Google Research / Twitter (1.1M followers)
- AWS AI / Twitter (1.4M followers)
- Facebook AI Research (no Twitter :)
- Microsoft Research / Twitter (341K followers)
- Baidu Research / Twitter (18K followers)
- IntelAI / Twitter (2K followers)
- AI² / Twitter (4.6K followers)
- Partnership on AI / Twitter (5K followers)
Video Courses
There are an overwhelming number of video courses and tutorials available online now, many of them free. There are some good paid options too, but for this article I'm focusing exclusively on free content. There are considerably more college courses where the professor has made the course materials available online but there are no videos; those can be more challenging to follow along with, and you probably don't need them. The following courses will keep you busy for months:
- Coursera — Machine Learning (Andrew Ng)
- Coursera — Neural Networks for Machine Learning (Geoffrey Hinton)
- Udacity — Intro to Machine Learning (Sebastian Thrun)
- Udacity — Machine Learning (Georgia Tech)
- Udacity — Deep Learning (Vincent Vanhoucke)
- Machine Learning (mathematicalmonk)
- Practical Deep Learning For Coders (Jeremy Howard & Rachel Thomas)
- Stanford CS231n — Convolutional Neural Networks for Visual Recognition (Winter 2016) (class link)
- Stanford CS224n — Natural Language Processing with Deep Learning (Winter 2017) (class link)
- Oxford Deep NLP 2017 (Phil Blunsom et al.)
- Reinforcement Learning (David Silver)
- Practical Machine Learning Tutorial with Python (sentdex)
YouTube
Below are links to YouTube channels or users with regular AI- or machine learning-related content, ordered by subscriber/view count to give a sense of their popularity.
- sentdex (225K subscribers, 21M views)
- Artificial Intelligence A.I. (7M views)
- Siraj Raval (140K subscribers, 5M views)
- Two Minute Papers (60K subscribers, 3.3M views)
- DeepLearning.TV (42K subscribers, 1.7M views)
- Data School (37K subscribers, 1.8M views)
- Machine Learning Recipes with Josh Gordon (324K views)
- Artificial Intelligence — Topic (10K subscribers)
- Allen Institute for Artificial Intelligence (AI2) (1.6K subscribers, 69K views)
- Machine Learning at Berkeley (634 subscribers, 48K views)
- Understanding Machine Learning — Shai Ben-David (973 subscribers, 43K views)
- Machine Learning TV (455 subscribers, 11K views)
Blogs
Given the popularity of AI and machine learning, I'm surprised there aren't more consistent bloggers; the material is complex enough that it takes quite a bit of effort to put together meaningful content. There are also outlets like Quora that give experts who want to give back, but don't have time to create longer-form content, another option.
Below I include bloggers who post consistently on AI-related topics with original material, and who are not just news feeds or company blogs, sorted by Twitter follower count.
- Andrej Karpathy / Twitter (69K followers)
- i am trask / Twitter (14K followers)
- Christopher Olah / Twitter (13K followers)
- Top Bots / Twitter (11K followers)
- WildML / Twitter (10K followers)
- Distill / Twitter (9K followers)
- Machine Learning Mastery / Twitter (5K followers)
- FastML / Twitter (5K followers)
- Sebastian Ruder / Twitter (3K followers)
- Unsupervised Methods / Twitter (1.7K followers)
- Explosion / Twitter (1K followers)
- Tim Dettmers / Twitter (1K followers)
- When trees fall… / Twitter (265 followers)
- ML@B / Twitter (80 followers)
Medium Writers
Below are some of the top writers on Medium who cover artificial intelligence, ordered by their ranking on Medium as of July 2017.
- Robbie Allen
- Erik P.M. Vermeulen
- Frank Chen
- azeem
- Sam DeBrule
- Derrick Harris
- Yitaek Hwang
- samim
- Paul Boutin
- Mariya Yao
- Rob May
- Avinash Hindupur
Books
There
are a lot of books out there that cover some aspect of machine
learning, deep learning, and NLP. In this section, I’m going to focus
purely on the free books that you can access or download straight from
the web.
Machine Learning
- Understanding Machine Learning From Theory to Algorithms
- Machine Learning Yearning
- A Course in Machine Learning
- Machine Learning
- Neural Networks and Deep Learning
- Deep Learning Book
- Reinforcement Learning: An Introduction
- Reinforcement Learning
NLP
- Speech and Language Processing (3rd ed. draft)
- Natural Language Processing with Python
- An Introduction to Information Retrieval
Math
- Introduction to Statistical Thought
- Introduction to Bayesian Statistics
- Introduction to Probability
- Think Stats: Probability and Statistics for Python programmers
- The Probability and Statistics Cookbook
- Linear Algebra
- Linear Algebra Done Wrong
- Linear Algebra, Theory And Applications
- Mathematics for Computer Science
- Calculus
- Calculus I for Computer Science and Statistics Students
Quora
Quora
has become a great resource for AI and machine learning. Many of the
top researchers answer questions on the site. Below I’ve listed some of
the main AI-related topics, which you can subscribe to if you want to
customize your Quora feed. Check out the FAQ section within each topic
(e.g. FAQ for Machine Learning) for a curated list of questions by the Quora community.
- Computer-Science (5.6M followers)
- Machine-Learning (1.1M followers)
- Artificial-Intelligence (635K followers)
- Deep-Learning (167K followers)
- Natural-Language-Processing (155K followers)
- Classification-machine-learning (119K followers)
- Artificial-General-Intelligence (82K followers)
- Convolutional-Neural-Networks-CNNs (25K followers)
- Computational-Linguistics (23K followers)
- Recurrent-Neural-Networks (17.4K followers)
Reddit
The AI community on Reddit isn't as large as Quora's, but it still has some good subreddits worth checking out. Reddit is useful for keeping up with the latest news and research, whereas Quora is oriented toward questions and answers. Below are the main AI-related subreddits, ordered by number of subscribers.
- /r/MachineLearning (111K readers)
- /r/robotics/ (43K readers)
- /r/artificial (35K readers)
- /r/datascience (34K readers)
- /r/learnmachinelearning (11K readers)
- /r/computervision (11K readers)
- /r/MLQuestions (8K readers)
- /r/LanguageTechnology (7K readers)
- /r/mlclass (4K readers)
- /r/mlpapers (4K readers)
GitHub
One of the nice things about the AI community is that most new projects are open sourced and made available on GitHub. There are also many educational resources on GitHub if you want example algorithm implementations in Python or Jupyter notebooks. Below are links to repos that have been tagged with a particular topic.
- Machine Learning (6K repos)
- Deep Learning (3K repos)
- Tensorflow (2K repos)
- Neural Network (1K repos)
- NLP (1K repos)
Podcasts
There are an increasing number of podcasts around AI, some centered on the latest news and others more educationally oriented.
- Concerning AI / iTunes
- This Week in Machine Learning and AI / iTunes
- The AI Podcast / iTunes
- Data Skeptic / iTunes
- Linear Digressions / iTunes
- Partially Derivative / iTunes
- O’Reilly Data Show / iTunes
- Learning Machines 101 / iTunes
- The Talking Machines / iTunes
- Artificial Intelligence in Industry / iTunes
- Machine Learning Guide / iTunes
Newsletters
If you want to stay up to speed with the latest news and research, there are a growing number of weekly newsletters to choose from. Most of them cover the same stories, so you'll only need a couple to stay current.
- The Exponential View
- AI Weekly
- Deep Hunt
- O’Reilly Artificial Intelligence Newsletter
- Machine Learning Weekly
- Data Science Weekly Newsletter
- Machine Learnings
- Artificial Intelligence News
- When trees fall…
- WildML
- Inside AI
- Kurzweil AI
- Import AI
- The Wild Week in AI
- Deep Learning Weekly
- Data Science Weekly
- KDnuggets Newsletter
Conferences
Unsurprisingly, with the rise in AI's popularity there has also been an increase in the number of AI-related conferences. Instead of providing a comprehensive list of every niche conference, I'm going to list the "major" conferences, for some definition of major. If I'm missing one you think should be included, let me know. (And note: these are not free!)
Academic:
- NIPS (Neural Information Processing Systems)
- ICML (International Conference on Machine Learning)
- KDD (Knowledge Discovery and Data Mining)
- ICLR (International Conference on Learning Representations)
- ACL (Association for Computational Linguistics)
- EMNLP (Empirical Methods in Natural Language Processing)
- CVPR (Computer Vision and Pattern Recognition)
- ICCV (International Conference on Computer Vision)
Professional:
- O’Reilly Artificial Intelligence Conference
- Machine Learning Conference (MLConf)
- AI Expo (North America, Europe, World)
- AI Summit
- AI Conference
Research Papers
Browse or search the academic papers being published.
arXiv.org subject classes:
- Artificial Intelligence
- Learning (Computer Science)
- Machine Learning (Stats)
- NLP
- Computer Vision
Semantic Scholar searches:
- Neural Networks (179K results)
- Machine Learning (94K results)
- Natural Language (62K results)
- Computer Vision (55K results)
- Deep Learning (24K results)
Another great resource for exploring research papers is Arxiv Sanity Preserver, a side project from Andrej Karpathy.
Tutorials
I created a separate comprehensive post covering all the good tutorial content I've found.
Cheatsheets
Similar to the tutorials, I created a separate article with a variety of good cheat sheets.