You may not realize how big a role machine learning already plays in your life. From that “ducking” auto-correct to bank alerts about unusual account activity to Amazon’s recommended purchases, here is some of the method behind the mystery.
Give a machine a fish, it will eat for a night. Teach a machine to program itself….
Machine learning teaches computers to program themselves so that they can accomplish tasks without being explicitly instructed by humans. Fields that rely on machine learning include:
- Image processing
- Facial recognition
- Speech recognition
- Natural language processing
- Recommender systems
In this article we will introduce you to the two main types of machine learning algorithms: supervised learning and unsupervised learning. We will focus on continuous data, which is numeric and can be measured, as opposed to categorical data, which sorts observations into discrete groups.
Supervised Machine Learning
With supervised learning, you use training data to train a machine learning algorithm. For example, if you want to detect spam e-mail, you could have a set of training data labeled with the correct answers (i.e., “spam” vs. “not spam”). The labeled data are used to train the algorithm, which can then be applied to unlabeled data.
Where it’s used: Spam Detection, Handwriting Recognition
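To make the train-then-apply idea concrete, here is a minimal sketch using a nearest-centroid classifier. This is only one simple way to do it, not how real spam filters necessarily work. The two features (number of links and number of exclamation marks per e-mail), the data values, and the function names `train` and `predict` are all made up for illustration.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Learn one centroid (mean feature vector) per label from labeled examples."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Give an unlabeled point the label of the closest centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled training data:
# (number of links, number of exclamation marks) -> label
training_data = [
    ((8, 5), "spam"),
    ((6, 7), "spam"),
    ((7, 6), "spam"),
    ((0, 1), "not spam"),
    ((1, 0), "not spam"),
    ((2, 1), "not spam"),
]

model = train(training_data)

# Apply the trained model to new, unlabeled e-mails.
prediction_a = predict(model, (9, 4))
prediction_b = predict(model, (1, 1))
print(prediction_a, "/", prediction_b)
```

The labeled examples are only used inside `train`. Once the centroids are learned, `predict` needs nothing but the features of the new e-mail.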
Supervised Learning Algorithms
Here are some examples of supervised learning techniques for continuous data:
- Decision Trees: predicts a label (classification trees) or a numeric value (regression trees) by repeatedly splitting the data on feature values. For classification, a good split increases purity, meaning each resulting group contains mostly examples with the same label.
- Regression (Linear & Polynomial): predicts a numeric output by fitting a line or polynomial curve to the training data.
- Random Forests: combines the output from multiple regression or classification trees to produce a more stable model that generalizes better to new data.
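As an illustration of the regression entry in the list above, the following sketch fits a straight line to a handful of points using the ordinary least-squares formulas. The data are invented so that they lie exactly on y = 2x + 1, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the normalizing factor cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data that follows y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)

# Use the fitted line to predict the output for an unseen input.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

Real data rarely fall on a perfect line, so in practice the fitted line is the one that minimizes the squared prediction errors rather than one that passes through every point.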
Unsupervised Machine Learning
Unsupervised learning uses no labeled training data: the machine tries to find patterns in the data without any guidance. For example, in credit card fraud detection, there are often not enough labeled examples of fraud to train a model (since most activity is not fraud). The goal for the machine is to find data points that don’t look like the rest.
Where it’s used: Customer Segmentation, Image Processing
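One simple way to find data points that don’t look like the rest is to flag values that sit far from the median. The sketch below does this with the median absolute deviation (MAD). The transaction amounts, the threshold of 5, and the function name `find_outliers` are assumptions chosen for illustration, and real fraud detection systems are considerably more involved.

```python
from statistics import median


def find_outliers(values, threshold=5.0):
    """Flag values whose distance from the median exceeds threshold * MAD."""
    center = median(values)
    # Median absolute deviation: the typical distance from the median.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # Degenerate case: at least half the values are identical,
        # so anything different from the median stands out.
        return [v for v in values if v != center]
    return [v for v in values if abs(v - center) > threshold * mad]


# Hypothetical card transactions in dollars; no labels are provided.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]

outliers = find_outliers(amounts)
print(outliers)
```

The median is used instead of the mean because a single extreme value can pull the mean and standard deviation toward itself, which can hide the very point you are trying to find.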
Types of Unsupervised Learning Algorithms
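Clustering is a common method for unsupervised learning with continuous data. Clustering is the process of finding coherent groups with similar traits, regardless of whether you can classify or describe the resulting groups. Dimensionality reduction is another common method; rather than grouping points, it finds a smaller set of directions that captures most of the variation in the data. Unsupervised techniques for continuous data include the following, of which SVD and PCA are dimensionality reduction methods and K-Means is a clustering method: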
- Singular Value Decomposition (SVD): factorizes a matrix and rotates the axes of a feature space, resulting in a ranked list of directions ordered from greatest to least variance.
- Principal Component Analysis (PCA): converts observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
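- K-Means: groups data points into k clusters by assigning each point to the nearest cluster center, moving each center to the mean of its assigned points, and repeating until the centers stop changing.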
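The K-Means entry in the list above can be sketched in a few lines of plain Python. This is a bare-bones version that takes the starting centers as an argument, whereas library implementations usually pick them with a randomized scheme. The points, the starting centers, and the function name `k_means` are made up for illustration.

```python
from math import dist


def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: dist(point, centroids[i]),
            )
            clusters[nearest].append(point)

        # Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]

        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visually obvious groups.
points = [(1, 1), (1.5, 2), (1, 0), (8, 8), (9, 9.5), (8.5, 9)]
initial_centroids = [(1, 1), (8, 8)]

centroids, clusters = k_means(points, initial_centroids)
print(centroids)
print(clusters)
```

Note that the algorithm never sees a label. It only uses distances between points, which is why the resulting groups may or may not correspond to categories a person would name.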