
Kannada Digit Classifier

Recognizing handwritten Kannada digits (0-9) using Convolutional Neural Networks (CNN) and K-Nearest Neighbors (KNN) with various preprocessing and augmentation techniques to optimize model performance.


Summary

View the Git repository.

About

The Kannada MNIST dataset is an adaptation of the classic MNIST digit recognition task, consisting of 75,000 grayscale images (28x28 pixels) of handwritten Kannada numerals. The objective was to accurately classify these digits using supervised machine learning techniques.

Dataset

  • Total Images: 75,000 (60,000 training, 10,000 validation, 5,000 testing)

  • Image Dimensions: 28x28 pixels, Grayscale

Preprocessing Techniques

Normalization

  • Rescaled pixel intensity values to the range [0, 1] to standardize image brightness and reduce noise.
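The project itself was built in MATLAB, but the rescaling step is simple enough to sketch in Python (the batch shape below is illustrative):

```python
import numpy as np

def normalize_images(images):
    """Rescale uint8 pixel values in [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

# A stand-in batch shaped like the 28x28 Kannada MNIST images
batch = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
normalized = normalize_images(batch)
```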

Principal Component Analysis (PCA)

  • Reduced dimensionality from 784 pixels to 237 principal components, explaining 95% of the variance.

  • Intended for use with KNN to simplify computation and remove less informative features.
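The same variance-threshold approach can be sketched with scikit-learn, which picks the component count automatically when `n_components` is a fraction (the random matrix below stands in for the real image data, so the resulting component count will differ from the report's 237):

```python
import numpy as np
from sklearn.decomposition import PCA

# Flattened 28x28 images become 784-dim vectors; keep enough
# principal components to explain 95% of the variance.
X = np.random.default_rng(0).random((500, 784))  # stand-in for training images
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
```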

Canny Edge Detection

  • Identified image edges to emphasize the shape and outline of digits.

  • Reduced complexity of image data for KNN classification.
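As a rough illustration of the idea, here is a gradient-magnitude edge map in Python. This is a simplified stand-in, not full Canny, which adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding (available in OpenCV or scikit-image):

```python
import numpy as np
from scipy import ndimage

def edge_map(image, threshold=0.3):
    """Gradient-magnitude edge detector: Sobel gradients, normalized
    magnitude, then a single threshold."""
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    magnitude = np.hypot(gx, gy)
    magnitude /= magnitude.max() + 1e-8
    return magnitude > threshold

# Toy "digit": a filled square whose outline should survive
digit = np.zeros((28, 28))
digit[8:20, 8:20] = 1.0
edges = edge_map(digit)
```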

Data Augmentation

Increased the training set from 60,000 to 120,000 images by applying the following augmentation techniques:

  • Random rotation [-45°, 45°]

  • Random scaling [0.75, 1.25]

  • Random translations [-2, 2] pixels

This improved the robustness and generalization of the CNN model.
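The three transforms above can be composed into a single affine warp so the output keeps the original 28x28 shape. A Python sketch of the idea (the original pipeline used MATLAB; the seed and toy image here are illustrative):

```python
import numpy as np
from scipy import ndimage

def augment(image, rng):
    """One random pass: rotation in [-45, 45] deg, scaling in
    [0.75, 1.25], translation in [-2, 2] px, as one affine warp."""
    h, w = image.shape
    angle = np.deg2rad(rng.uniform(-45, 45))
    scale = rng.uniform(0.75, 1.25)
    shift = rng.uniform(-2, 2, size=2)            # (row, col) translation
    c, s = np.cos(angle), np.sin(angle)
    matrix = np.array([[c, -s], [s, c]]) / scale  # output -> input mapping
    center = np.array([h / 2.0, w / 2.0])
    offset = center - matrix @ (center + shift)
    return ndimage.affine_transform(image, matrix, offset=offset,
                                    mode="constant", cval=0.0)

rng = np.random.default_rng(42)
original = np.zeros((28, 28))
original[10:18, 10:18] = 1.0
augmented = augment(original, rng)
```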

Machine Learning Models

K-Nearest Neighbors (KNN)

  • Evaluated with varying values of k (3, 5, 7).

  • Tested preprocessing combinations including PCA, Canny edge detection, and their combination.

  • The highest accuracy (~71.9%) came from the raw pixel data (with or without normalization); the PCA and Canny variants did not improve on it.
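The k sweep can be sketched with scikit-learn. The random arrays below stand in for the flattened Kannada MNIST data, so the scores are meaningless; the structure of the experiment is the point:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Stand-ins for the flattened 28x28 image arrays and labels
rng = np.random.default_rng(0)
X_train = rng.random((200, 784))
y_train = rng.integers(0, 10, size=200)
X_val = rng.random((50, 784))
y_val = rng.integers(0, 10, size=50)

# Evaluate each candidate k on the held-out split
accuracies = {}
for k in (3, 5, 7):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    accuracies[k] = knn.score(X_val, y_val)
```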

Convolutional Neural Network (CNN)

The architecture consisted of:

  • 1 Input Layer (28x28)

  • 3 Convolutional Layers (3x3 kernel, Batch Normalization, ReLU activation, Max Pooling)

  • 1 Fully Connected Layer (10 output neurons, Softmax activation)

Results:

  • 75.57% accuracy with the normalized, non-augmented dataset.

  • 87.28% accuracy with the augmented dataset.
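The layer stack described above can be sketched in PyTorch (the project used MATLAB's Deep Learning Toolbox, and the filter counts 16/32/64 below are assumptions, since the report excerpt does not list them):

```python
import torch
import torch.nn as nn

class KannadaCNN(nn.Module):
    """Three conv blocks (3x3 conv, BatchNorm, ReLU, 2x2 max pool)
    followed by a 10-way fully connected softmax layer."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Spatial size: 28 -> 14 -> 7 -> 3 after three 2x2 poolings
        self.classifier = nn.Linear(64 * 3 * 3, 10)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return torch.softmax(self.classifier(x), dim=1)

model = KannadaCNN().eval()
with torch.no_grad():
    out = model(torch.zeros(2, 1, 28, 28))  # batch of two blank images
```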

Ablation Study

Conducted experiments by varying the number of convolutional layers in the CNN:

| Convolutional Layers | Augmented Dataset Accuracy (%) | Non-Augmented Dataset Accuracy (%) |
| --- | --- | --- |
| Four | 85.31 | 76.77 |
| Three | 87.28 | 75.57 |
| Two | 84.22 | 72.90 |
| One | 77.54 | 70.01 |

The three-layer CNN architecture showed the best performance on the augmented dataset; on the non-augmented dataset the four-layer variant was marginally better.

Future Directions

  • Explore alternative preprocessing techniques like Scale-Invariant Feature Transform (SIFT) for improving KNN performance.

  • Enhance dataset augmentation strategies to further improve CNN robustness and accuracy.

  • Conduct training using GPU-accelerated computing environments for faster experimentation and optimization.

Tools Used

  • MATLAB

  • MATLAB Deep Learning Toolbox

Note: For detailed insights, visualization, and methodology, refer to the linked PDF report at the top.

This post is licensed under CC BY 4.0 by the author.