iPython Tensorflow Notebook Tutorials
Tensorflow Programs and Tutorials - This Github repo contains multiple iPython notebooks which serve as tutorials for learning about deploying deep learning and machine learning models. The following topics are covered.
- Convolutional Neural Networks: This notebook shows you how to define a simple convolutional network using Tensorflow. We’ll discuss how to load in datasets, how to create our network architecture using Tensorflow variables, as well as how to define loss functions and optimizers.
- Generative Adversarial Networks: This notebook shows you how to create a simple GAN. The basic idea is that you have 2 different networks, a generator network and a discriminator network. The discriminative model has the task of determining whether a given image looks natural (an image from the dataset) or looks like it has been artificially created. The task of the generator is to create natural looking images that are similar to the original data distribution. In this tutorial, we’ll look at how to create both models, and the unique process of training them to reach a Nash equilibrium.
- CNN’s with Noisy Labels: This notebook looks at a recent paper that discusses how convolutional neural networks that are trained on random labels (with some probability) are still able to acheive good accuracy on MNIST. I thought that the paper showed some eye-brow raising results, so I went ahead and tried it out for myself. It was pretty amazing to see that even when training a CNN with random labels 50% of the time, and the correct labels the other 50% of the time, the network was still able to get a 90+% accuracy.
- Character Level RNN: This notebook shows you how to train a character level RNN in Tensorflow. The idea was inspired by Andrej Karpathy’s famous blog post and was based on this Keras implementation. In this notebook, you’ll learn more about what the model is doing, and how you can input your own dataset, and train a model to generate similar looking text.
- Linear Regression: This notebook shows you how Tensorflow is not just a deep learning library, but is a library centered on numerical computation, which allows you to create classic machine learning models relatively easily.
- Simple Neural Networks: This notebook shows you how to create simple 1 and 2 layer neural networks. We’ll then see how these networks perform on MNIST, and look at the type of hyperparamters that affect a model’s accuracy (network architecture, weight initialization, learning rate, etc)
React Web Apps
MLB Stats App - First web application using the React JS framework. Have been hearing a lot of good things about it, so I wanted to give it a shot. This app is pretty simple. It just displays the 2016 regular season stats for each MLB team. Looking to add more functionality and iron out a few of the bugs a little later.
Sports Data Analysis
March Madness 2017 Bracket Predictor Model - This iPython notebook looks at how we can use historical data on NCAA regular season games to develop a model that outputs win probability for 2 given teams facing each other. We can frame this as a supervised learning problem where we can use past game by game results as our labels. Each team is represented by a d-dimensional vector containing information for that team (PPG, Number of Wins, etc) during the given season. The element wise difference between the two teams is inputted into a neural network, where the output is the softmax-bounded probability of the likelihood that Team 1 will come out victorious in the matchup. I also wrote a blog post about predicting the 2017 March Madness tournament.
MLB Win Predictor Linear Regression Model - This project looks at how a team’s baseball statistics (ERA, Batting Average, RBI, etc), over the course of a season, contribute to their total number of wins. This uses the Torch7 computing framework to develop a linear regression model that takes in a set of 16 features representing traditional baseball statistics and outputs the predicted number of wins for the given season. Given the season statistics for a team, this model can predict a team’s win total within 3 games (1.85% error).
Kaggle Titanic - The Kaggle Titanic competition revolved around taking in a dataset of all the passengers in the Titanic, and then predicting whether or not they survived. The features in the dataset included room location, age, gender, etc. For this competition, I used a variety of different supervised learning approaches (SVMs, KNNs, Decision Trees, Neural Networks), but ultimately found that a KNN model (where K = 17) got the best accuracy of 78.95%. I used Numpy and Sklearn to help preprocess the data and create the models.
Kaggle MNIST - The Kaggle MNIST competiiton is a quite standard benchmark for all computer vision models. MNIST is a dataset of handwritten digits, and the overall goal is to have the model classify each image as a digit from 0-9. For this competition, I used a convolutional neural network written in Keras. The model gets an accuracy of 98.63%.
Kaggle Ghosts - This playground competition was a little different in that it involved a “fake” dataset and task. The goal was to take in features of different fantasy monsters (Ghoust, Ghool, or Goblin), and output the most likely classification. Some of the features included hair length, bone length and color. An SVM proved to work the best as I was able to achieve a 73.54% accuracy. I used Numpy, Pandas, and Matplotlib to help visualize the data and features.
Pandas Tutorial - Pandas is the one of the most popular Python libraries as it can be lots for a variety of different data sience tasks. From describing data structures to introducing the most important functions, this tutorial is a great starting point for anyone looking to start using Pandas.