
Kannada Digit Classifier

Recognizing handwritten Kannada digits (0-9) using Convolutional Neural Networks (CNN) and K-Nearest Neighbors (KNN) with various preprocessing and augmentation techniques to optimize model performance.


Summary

View the Git repository.

About

The Kannada MNIST dataset is an adaptation of the classic MNIST digit recognition task, consisting of 75,000 grayscale images (28x28 pixels) of handwritten Kannada numerals. The objective was to accurately classify these digits using supervised machine learning techniques.

Dataset

  • Total Images: 75,000 (60,000 training, 10,000 validation, 5,000 testing)

  • Image Dimensions: 28x28 pixels, Grayscale

Preprocessing Techniques

Normalization

  • Rescaled pixel intensity values to the range [0, 1] to standardize image brightness and reduce noise.
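The project itself was built in MATLAB, but the rescaling step is simple enough to sketch in Python (the batch shape below is illustrative):

```python
import numpy as np

def normalize_images(images):
    """Rescale uint8 pixel values in [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

# A stand-in batch shaped like the 28x28 Kannada MNIST images
batch = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
normalized = normalize_images(batch)
```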

Principal Component Analysis (PCA)

  • Reduced dimensionality from 784 pixels to 237 principal components, explaining 95% of the variance.

  • Intended for use with KNN to simplify computation and remove less informative features.
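The same variance-threshold approach can be sketched with scikit-learn, which picks the component count automatically when `n_components` is a fraction (the random matrix below stands in for the real image data, so the resulting component count will differ from the report's 237):

```python
import numpy as np
from sklearn.decomposition import PCA

# Flattened 28x28 images become 784-dim vectors; keep enough
# principal components to explain 95% of the variance.
X = np.random.default_rng(0).random((500, 784))  # stand-in for training images
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
```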

Canny Edge Detection

  • Identified image edges to emphasize the shape and outline of digits.

  • Reduced complexity of image data for KNN classification.
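As a rough illustration of the idea, here is a gradient-magnitude edge map in Python. This is a simplified stand-in, not full Canny, which adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding (available in OpenCV or scikit-image):

```python
import numpy as np
from scipy import ndimage

def edge_map(image, threshold=0.3):
    """Gradient-magnitude edge detector: Sobel gradients, normalized
    magnitude, then a single threshold."""
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    magnitude = np.hypot(gx, gy)
    magnitude /= magnitude.max() + 1e-8
    return magnitude > threshold

# Toy "digit": a filled square whose outline should survive
digit = np.zeros((28, 28))
digit[8:20, 8:20] = 1.0
edges = edge_map(digit)
```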

Data Augmentation

Increased the training set from 60,000 to 120,000 images by applying the following augmentation techniques:

  • Random rotation [-45°, 45°]

  • Random scaling [0.75, 1.25]

  • Random translations [-2, 2] pixels

This improved the robustness and generalization of the CNN model.
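The three transforms above can be composed into a single affine warp so the output keeps the original 28x28 shape. A Python sketch of the idea (the original pipeline used MATLAB; the seed and toy image here are illustrative):

```python
import numpy as np
from scipy import ndimage

def augment(image, rng):
    """One random pass: rotation in [-45, 45] deg, scaling in
    [0.75, 1.25], translation in [-2, 2] px, as one affine warp."""
    h, w = image.shape
    angle = np.deg2rad(rng.uniform(-45, 45))
    scale = rng.uniform(0.75, 1.25)
    shift = rng.uniform(-2, 2, size=2)            # (row, col) translation
    c, s = np.cos(angle), np.sin(angle)
    matrix = np.array([[c, -s], [s, c]]) / scale  # output -> input mapping
    center = np.array([h / 2.0, w / 2.0])
    offset = center - matrix @ (center + shift)
    return ndimage.affine_transform(image, matrix, offset=offset,
                                    mode="constant", cval=0.0)

rng = np.random.default_rng(42)
original = np.zeros((28, 28))
original[10:18, 10:18] = 1.0
augmented = augment(original, rng)
```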

Machine Learning Models

K-Nearest Neighbors (KNN)

  • Evaluated with varying values of k (3, 5, 7).

  • Tested preprocessing combinations including PCA, Canny edge detection, and their combination.

  • The highest accuracy (~71.9%) came from the raw pixel data (with or without normalization); the PCA and Canny variants did not improve on it.
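The k sweep can be sketched with scikit-learn. The random arrays below stand in for the flattened Kannada MNIST data, so the scores are meaningless; the structure of the experiment is the point:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Stand-ins for the flattened 28x28 image arrays and labels
rng = np.random.default_rng(0)
X_train = rng.random((200, 784))
y_train = rng.integers(0, 10, size=200)
X_val = rng.random((50, 784))
y_val = rng.integers(0, 10, size=50)

# Evaluate each candidate k on the held-out split
accuracies = {}
for k in (3, 5, 7):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    accuracies[k] = knn.score(X_val, y_val)
```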

Convolutional Neural Network (CNN)

The architecture consisted of:

  • 1 Input Layer (28x28)

  • 3 Convolutional Layers (3x3 kernel, Batch Normalization, ReLU activation, Max Pooling)

  • 1 Fully Connected Layer (10 output neurons, Softmax activation)

Results:

  • 75.57% accuracy with the normalized, non-augmented dataset.

  • 87.28% accuracy with the augmented dataset.
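The layer stack described above can be sketched in PyTorch (the project used MATLAB's Deep Learning Toolbox, and the filter counts 16/32/64 below are assumptions, since the report excerpt does not list them):

```python
import torch
import torch.nn as nn

class KannadaCNN(nn.Module):
    """Three conv blocks (3x3 conv, BatchNorm, ReLU, 2x2 max pool)
    followed by a 10-way fully connected softmax layer."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Spatial size: 28 -> 14 -> 7 -> 3 after three 2x2 poolings
        self.classifier = nn.Linear(64 * 3 * 3, 10)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return torch.softmax(self.classifier(x), dim=1)

model = KannadaCNN().eval()
with torch.no_grad():
    out = model(torch.zeros(2, 1, 28, 28))  # batch of two blank images
```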

Ablation Study

Conducted experiments by varying the number of convolutional layers in the CNN:

| Convolutional Layers | Augmented Dataset Accuracy (%) | Non-Augmented Dataset Accuracy (%) |
| --- | --- | --- |
| Four | 85.31 | 76.77 |
| Three | 87.28 | 75.57 |
| Two | 84.22 | 72.90 |
| One | 77.54 | 70.01 |

The three-layer CNN architecture showed the best performance on the augmented dataset; on the non-augmented dataset the four-layer variant was marginally better.

Future Directions

  • Explore alternative preprocessing techniques like Scale-Invariant Feature Transform (SIFT) for improving KNN performance.

  • Enhance dataset augmentation strategies to further improve CNN robustness and accuracy.

  • Conduct training using GPU-accelerated computing environments for faster experimentation and optimization.

Tools Used

  • MATLAB

  • MATLAB Deep Learning Toolbox

Note: For detailed insights, visualization, and methodology, refer to the linked PDF report at the top.

This post is licensed under CC BY 4.0 by the author.