
Lecture Description
In the previous tutorial, we covered how to use the K Nearest Neighbors algorithm via Scikit-Learn to achieve about 95% accuracy in predicting benign vs. malignant tumors based on tumor attributes. Now, we're going to dig into how K Nearest Neighbors works so we have a full understanding of the algorithm itself, and a better sense of when it will and won't work for us.
We will come back to our breast cancer dataset, running it through our custom-made K Nearest Neighbors algorithm and comparing the results to Scikit-Learn's, but we're going to start off with some very simple data first. K Nearest Neighbors boils down to proximity, not by group, but by individual points. Thus, all the algorithm is actually doing is computing the distance between points, and then picking the most common class among the K points nearest to the point being classified. There are various ways to compute distance on a plane, many of which you could use here, but the most accepted version is Euclidean distance, named after Euclid, the famous mathematician popularly referred to as the father of geometry; he quite literally wrote the book on it (The Elements).
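The Euclidean distance between two points p and q is sqrt((q1 - p1)^2 + (q2 - p2)^2 + ...), summed over each dimension. As a rough sketch of that idea (not the code we will build in the coming lectures), here is a tiny, self-contained example; the knn_predict helper, the sample points, and the 'k'/'r' labels are made up purely for illustration:

```python
from math import sqrt
from collections import Counter

def euclidean_distance(p, q):
    # Square the per-dimension differences, sum them, and take the square root.
    return sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

def knn_predict(train_points, train_labels, new_point, k=3):
    # Distance from new_point to every training point, paired with that point's class.
    distances = sorted(
        (euclidean_distance(p, new_point), label)
        for p, label in zip(train_points, train_labels)
    )
    # Vote: the most common class among the k nearest points wins.
    top_k_labels = [label for _, label in distances[:k]]
    return Counter(top_k_labels).most_common(1)[0][0]

# Made-up two-class example data:
points = [(1, 2), (2, 3), (3, 1), (6, 5), (7, 7), (8, 6)]
labels = ['k', 'k', 'k', 'r', 'r', 'r']
print(knn_predict(points, labels, (5, 5), k=3))  # -> 'r'
```

In practice you generally want K to be an odd number larger than the number of classes, so the vote among the nearest neighbors can't end in a tie.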
pythonprogramming.net
twitter.com/sentdex
www.facebook.com/pythonprogramming.net/
plus.google.com/+sentdex
Course Index
- Introduction to Machine Learning
- Regression Intro
- Regression Features and Labels
- Regression Training and Testing
- Regression forecasting and predicting
- Pickling and Scaling
- Regression How it Works
- How to program the Best Fit Slope
- How to program the Best Fit Line
- R Squared Theory
- Programming R Squared
- Testing Assumptions
- Classification w/ K Nearest Neighbors Intro
- K Nearest Neighbors Application
- Euclidean Distance
- Creating Our K Nearest Neighbors Algorithm
- Writing our own K Nearest Neighbors in Code
- Applying our K Nearest Neighbors Algorithm
- Final thoughts on K Nearest Neighbors
- Support Vector Machine Intro and Application
- Understanding Vectors
- Support Vector Assertion
- Support Vector Machine Fundamentals
- Support Vector Machine Optimization
- Creating an SVM from scratch
- SVM Training
- SVM Optimization
- Completing SVM from Scratch
- Kernels Introduction
- Why Kernels
- Soft Margin SVM
- Soft Margin SVM and Kernels with CVXOPT
- SVM Parameters
- Clustering Introduction
- Handling Non-Numeric Data
- K Means with Titanic Dataset
- Custom K Means
- K Means from Scratch
- Mean Shift Intro
- Mean Shift with Titanic Dataset
- Mean Shift from Scratch
- Mean Shift Dynamic Bandwidth
Course Description
The objective of this course is to give you a holistic understanding of machine learning, covering the theory, application, and inner workings of supervised, unsupervised, and deep learning algorithms.
In this series, we'll be covering linear regression, K Nearest Neighbors, Support Vector Machines (SVM), flat clustering, hierarchical clustering, and neural networks.