Decision Trees and k-NN on Synthetic Data
Entropy-based feature selection and learning curve analysis with custom 2D data distributions.
Summary
This project begins with a from-scratch implementation of a decision tree classifier, using entropy as the uncertainty measure to select splitting features and construct a shallow binary tree.
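As a rough illustration of that first step, the sketch below computes Shannon entropy and information gain and picks the best binary feature to split on. It is a minimal stand-in, not the project's exact code, and assumes binary labels and binary features.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of an integer label array."""
    if len(labels) == 0:
        return 0.0
    p = np.bincount(labels) / len(labels)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def information_gain(X, y, feature):
    """Entropy reduction from splitting on a binary feature."""
    mask = X[:, feature] == 1
    n = len(y)
    weighted = (mask.sum() / n) * entropy(y[mask]) + \
               ((~mask).sum() / n) * entropy(y[~mask])
    return entropy(y) - weighted

def best_feature(X, y):
    """Return the index of the feature with the highest information gain."""
    gains = [information_gain(X, y, f) for f in range(X.shape[1])]
    return int(np.argmax(gains))

# Toy example: feature 0 separates the two classes perfectly.
X = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
y = np.array([1, 1, 0, 0])
print(best_feature(X, y))  # -> 0
```

Recursing on the two halves of the chosen split, with a small depth limit, yields the shallow binary tree described above.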
The second half focuses on generating structured synthetic data from a probabilistic distribution and evaluating the k-Nearest Neighbors classifier.
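The snippet below sketches this stage under assumed settings: the synthetic 2D data comes from two Gaussian clusters (the project's actual distribution may differ), and the classifier is scikit-learn's `KNeighborsClassifier`.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def make_two_clusters(n_per_class=200):
    """Two Gaussian clusters in the plane, labeled 0 and 1 (illustrative only)."""
    a = rng.normal(loc=[-1.0, -1.0], scale=0.8, size=(n_per_class, 2))
    b = rng.normal(loc=[+1.0, +1.0], scale=0.8, size=(n_per_class, 2))
    X = np.vstack([a, b])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X, y = make_two_clusters()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print(f"test accuracy: {knn.score(X_test, y_test):.3f}")
```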
It explores the effect of training-set size and the hyperparameter k on prediction accuracy, providing a hands-on study of interpretable and non-parametric models in small-scale learning tasks.
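A hedged sketch of that accuracy study follows: it sweeps over training-set sizes and values of k and records test accuracy for each combination. Here scikit-learn's `make_blobs` stands in for the project's custom 2D distribution.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in synthetic data: two 2D Gaussian blobs.
X, y = make_blobs(n_samples=1000, centers=2, n_features=2,
                  cluster_std=1.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Learning-curve style sweep: training size on the outer loop, k on the inner.
for n_train in (20, 50, 100, 300, len(X_train)):
    for k in (1, 3, 5, 15):
        knn = KNeighborsClassifier(n_neighbors=k)
        knn.fit(X_train[:n_train], y_train[:n_train])
        acc = knn.score(X_test, y_test)
        print(f"n_train={n_train:4d}  k={k:2d}  accuracy={acc:.3f}")
```

Plotting accuracy against training size for each k gives the learning curves the study is built around; small k tends to track noise on tiny training sets, while larger k smooths the decision boundary.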