Sunday, May 7, 2017

Neural Network Classification Model for Handwritten Digit Recognition

https://www.kaggle.com/statinstilettos/neural-network-approach

Exploratory Analysis

Summary statistics and visualizations of the data. The data is first explored and preprocessed by visualizing the sample size for each digit in the dataset, plotting a few of the digits from the data provided to understand exactly what the data represents, normalizing the data, and reducing the number of features using PCA. It is important to note that this dataset is sparse, meaning that the feature matrix consists mostly of 0's. Some pixels carry a lot of information about the digit written, while others, such as those along the edges, are usually 0 and not very informative. The scale used to represent the pixel values is not numerically meaningful, so the data needs to be normalized so that the raw magnitudes do not contribute to the model in an improper way.
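The steps described above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked kernel. It assumes the pixels are 8-bit intensities (0 to 255) laid out as flattened 28x28 images, and it uses a synthetic stand-in for the real data so it runs on its own. The helper names (`normalize_pixels`, `sparsity`, `pca_reduce`) and the choice of 20 components are hypothetical, and PCA is done here with a plain NumPy SVD rather than any particular library routine.

```python
import numpy as np


def normalize_pixels(X, max_value=255.0):
    """Scale raw pixel intensities into [0, 1] (assumes values in 0..max_value)."""
    return np.asarray(X, dtype=np.float64) / max_value


def sparsity(X):
    """Return the fraction of entries in X that are exactly zero."""
    return float(np.mean(np.asarray(X) == 0))


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components using an SVD.

    Returns the reduced data and the explained-variance ratio per component.
    """
    X = np.asarray(X, dtype=np.float64)
    X_centered = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    reduced = X_centered @ Vt[:n_components].T
    return reduced, explained[:n_components]


# Synthetic stand-in for the digit data (an assumption, not the real dataset):
# only the central 12x12 patch is ever non-zero, so edge pixels stay at 0.
rng = np.random.default_rng(0)
n_samples, side = 200, 28
images = np.zeros((n_samples, side, side))
images[:, 8:20, 8:20] = rng.integers(0, 256, size=(n_samples, 12, 12))
X_raw = images.reshape(n_samples, -1)          # one row per image, 784 features
labels = rng.integers(0, 10, size=n_samples)   # placeholder digit labels

counts = np.bincount(labels, minlength=10)     # sample size for each digit
X = normalize_pixels(X_raw)
X_reduced, explained = pca_reduce(X, n_components=20)

print("samples per digit:", counts)
print("fraction of zeros:", round(sparsity(X_raw), 3))
print("reduced shape:", X_reduced.shape)
```

With the real data, `X_raw` and `labels` would come from the training file instead of the random generator. Pixels that are always 0, such as the corner pixel in this sketch, have zero variance and so contribute nothing to any principal component, which is one reason PCA suits this kind of sparse image data.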
