
An Introduction

Greetings, and welcome to my blog!

In this blog, I’ll be illustrating the use of deep learning to solve various prediction tasks. For those of you who have never heard of deep learning, a proper introduction to the subject is in order.
Deep learning is a specific type of machine learning that attempts to approximate a function by constructing a mathematical model called a neural network. When we say the model “approximates a function,” we mean that it maps a certain set of inputs to a certain output. It’d be helpful to illustrate what this “mapping” looks like by way of an example.

For example, say I’d like to identify what species a certain flower is just by looking at the flower’s petal and sepal (the green leaves underlying the petals) length and width. If I were to attempt this with deep learning, I’d construct a model that maps petal and sepal length and width (the inputs) to a specific species (the output). This exact prediction task is widely used for practicing different kinds of machine learning, and we’ll give it a try later in this blog. This mapping from input to output is called a function, and deep learning attempts to learn how to calculate (really, approximate) that function by looking at sets of inputs matched with their correct outputs (thus it learns empirically).
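To make the idea of a mapping concrete, here’s a minimal, hand-written sketch of what such a function could look like in Python. The function name, the thresholds, and the measurements below are all made up for illustration; the whole point of deep learning is that we won’t write rules like these by hand, we’ll learn the mapping from example data:

```python
# A hand-written mapping from flower measurements (inputs) to a species label (output).
# Deep learning's job is to learn a function like this from labeled examples
# instead of having a person invent the rules.

def predict_species(petal_length, petal_width, sepal_length, sepal_width):
    # These thresholds are purely illustrative, not learned from data.
    # This toy rule also happens to ignore the sepal measurements;
    # a learned model would make use of all four inputs.
    if petal_length < 2.5:
        return "setosa"
    elif petal_width < 1.7:
        return "versicolor"
    else:
        return "virginica"

# Four input measurements (in centimeters) map to one output: a species name.
print(predict_species(1.4, 0.2, 5.1, 3.5))
```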

Really, a mapping can be postulated to exist between a bunch of different things. Perhaps what I eat for lunch can be mapped to (and thus predicted by) my mood that day and what day of the week it is. Maybe my mood that day can be predicted by other factors that can be considered inputs to a function, like whether the events leading up to that day included winning the lottery, getting into a car accident, or getting a proper night’s sleep. Maybe whether I got a proper night’s sleep can be mapped to whether I had a test the next day and what class that test is for, or whether I had to stay up that night writing a blog post for CS475: Technical Writing (Healthcare IT variant), taught by the venerable Dr. Palmer. As long as there’s some association between the postulated inputs and the postulated outputs, a mapping can exist between the two, and we can attempt to approximate it with deep learning.

This all may seem like magic that shouldn’t be possible for humans. And really, humans didn’t come up with it: we discovered it. Deep learning attempts to approximate a function by using a neural network, a mathematical model that attempts to mimic (but doesn’t strictly follow) the way neurons interact with each other in the brain. Discussions of how neural networks work can get really math-y really quickly, and a quick Google search yields millions of blogs and tutorials that don’t spare the mathematical jargon, so I’ll keep the math reined in a bit. Neural networks are composed of nodes, each with a specified value called an activation. Here’s an example of a small network:

[Figure: a small network with two layers of three nodes each.]

You can see that this network is composed of 2 layers of 3 nodes each. We can consider the first (left-most) layer the input layer, with input nodes, and the second layer the output layer, with output nodes. Each node has a specific value, and the value of any node outside the input layer is calculated as a function of the values of the previous layer. More precisely, each value of the previous layer is multiplied by the weight of its connection to the node in question, those products are summed, and the sum is passed through a function called an activation function.
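To make that weighted-sum-plus-activation idea concrete, here’s a tiny sketch of how one node’s value could be computed. The blog hasn’t settled on a framework or an activation function yet, so NumPy, the sigmoid function, and every number below are just illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    # A common activation function; it squashes any number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Activations (values) of the three nodes in the previous (input) layer.
previous_layer = np.array([0.5, -1.2, 3.0])

# Weights of the connections from each of those nodes to the node we're computing.
weights = np.array([0.8, 0.1, -0.4])

# Weighted sum of the previous layer's values, passed through the activation function.
node_value = sigmoid(np.dot(weights, previous_layer))
print(node_value)
```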

From this example, you can see that the values at any layer can be computed from the values of the layer before it. So really, when a neural network calculates an output, information first has to flow (propagate) from the inputs, through each layer, to the output layer.
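Here’s a sketch of that flow, again with NumPy as an assumption. I’ve inserted a hidden layer between the input and output layers of the example network so there’s actually something to propagate through, and the weight values are arbitrary placeholders rather than anything learned:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Activations of the 3 input nodes.
x = np.array([0.5, -1.2, 3.0])

# Placeholder weight matrices (a real network learns these from data).
# Row i of each matrix holds the weights feeding node i of the next layer.
W1 = np.array([[ 0.8,  0.1, -0.4],   # input layer (3 nodes) -> hidden layer (4 nodes)
               [-0.3,  0.5,  0.2],
               [ 0.7, -0.6,  0.1],
               [ 0.2,  0.9, -0.5]])
W2 = np.array([[ 0.1, -0.2,  0.4,  0.3],   # hidden layer (4 nodes) -> output layer (3 nodes)
               [ 0.6,  0.0, -0.1,  0.5],
               [-0.7,  0.3,  0.2, -0.4]])

# Forward propagation: each layer's activations are computed from the previous layer's.
hidden = sigmoid(W1 @ x)       # 4 hidden activations
output = sigmoid(W2 @ hidden)  # 3 output activations, one per output node
print(output)
```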


Now that we’ve introduced deep learning and neural networks, we can start applying it all. I’ll continue this tutorial with a quick install guide for the framework we’ll use to build and train these networks, and then we’ll get our hands dirty doing some predictive modeling.
