Greetings, and welcome to my blog!
In this blog, I’ll be illustrating the use of deep learning
to solve various prediction tasks. For those of you who have never heard of
deep learning, a proper introduction to the subject is in order.
Deep learning is a specific type of machine learning that attempts
to approximate a function through the construction of a mathematical model
called a neural network. Basically, when this model attempts to “approximate a
function,” this means that it maps a certain set of inputs to a certain output.
It’d be helpful to illustrate what this “mapping” looks like by way of an
example.
For example, suppose I’d like to identify what species a certain
flower is just by looking at the flower’s petal and sepal (green leaves underlying
petals) length and width. If I were to attempt this with deep learning, I’d be
constructing a model that maps petal and sepal length and width (the inputs) to
a specific type of species (the output). This exact prediction task is widely
used when practicing different modes of machine learning, and we’ll give it a
try later in this blog. This mapping from input to output is called a function,
and deep learning attempts to learn how to calculate (really, approximate) a
function by looking at sets of inputs matched with their correct output (thus
it learns empirically).
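To make the idea of a mapping concrete, here is a small Python sketch. The measurements, the species names, and the thresholds in `hand_written_guess` are all hypothetical values picked for illustration, not real data and not a trained model. The point is only to show the shape of the problem: each example pairs four inputs with one correct output, and a function turns inputs into a predicted output.

```python
# Each example pairs the inputs with the correct output.
# Inputs: (sepal length, sepal width, petal length, petal width), in cm.
# These values are hypothetical, chosen only to illustrate the format.
examples = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((6.0, 2.8, 4.5, 1.4), "versicolor"),
    ((6.5, 3.0, 5.8, 2.2), "virginica"),
]

def hand_written_guess(sepal_length, sepal_width, petal_length, petal_width):
    """A hand-written stand-in for the function we want to approximate.

    The thresholds are assumptions made up for this sketch. Deep learning
    would instead learn its own rule from many labeled examples.
    """
    if petal_length < 2.5:
        return "setosa"
    if petal_width < 1.8:
        return "versicolor"
    return "virginica"

# Check how often the function's output matches the correct output.
correct = sum(hand_written_guess(*inputs) == species for inputs, species in examples)
accuracy = correct / len(examples)
print(accuracy)
```

A neural network plays the same role as `hand_written_guess`, except that nobody writes the rule by hand: it is adjusted until its outputs agree with the labeled examples as often as possible.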
Really, a mapping can be postulated to exist between all sorts
of things. Perhaps what I eat for lunch can be mapped from (and thus
predicted by) my mood that day and what day of the week it is. Maybe my mood
that day can itself be predicted by other factors that serve as inputs to a
function, like whether the events leading up to that day included winning
the lottery, getting into a car accident, or getting a proper night’s
sleep. Maybe whether I got a proper night’s sleep can be predicted from whether I had a
test the next day and what class that test is for, or whether I had to stay up
that night to write a blog post for CS475: Technical Writing (Healthcare IT
variant) taught by the venerable Dr. Palmer. As long as there’s some association
between the postulated inputs and the postulated outputs, a mapping can exist
between the two, and we can attempt to approximate it with deep learning.
This all may seem like magic that shouldn’t be possible for
humans. And really, humans didn’t come up with it: we discovered it. Deep
learning attempts to approximate a function by using a neural network, a
mathematical model that attempts to mimic (but doesn’t strictly follow)
the way neurons interact with each other in the brain. Discussions of how
neural networks work can get really math-y really quickly, and a quick Google
search yields millions of blogs and tutorials that don’t spare the mathematical
jargon, so I’ll keep the math reined in a bit. Neural networks are composed of
nodes, each with a specified value, called an activation. Here’s an example of
a small network:
You can see that this network is composed of two layers of three
nodes each. We can consider the first (left-most) layer the input layer, with input nodes,
and the second layer the output layer, with output nodes. Each node has a
specific value, and the value of any node outside the input layer can be
calculated as a function of the values of the previous layer. More precisely, each
value in the previous layer is multiplied by the weight of its connection to the
node in question, the products are summed, and a function called an activation
function is applied to that sum.
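Here is a minimal Python sketch of how the value of a single output node could be computed. The sigmoid activation function, the input values, and the weights are all assumptions chosen for illustration; many other activation functions exist.

```python
import math

def sigmoid(z):
    """One common activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_activation(prev_activations, weights):
    """Value of one node: the activation function applied to the weighted sum
    of the previous layer's values."""
    weighted_sum = sum(a * w for a, w in zip(prev_activations, weights))
    return sigmoid(weighted_sum)

# Hypothetical values for the three input nodes, and hypothetical weights
# on their connections to one output node.
inputs = [0.5, 0.1, 0.9]
weights = [0.4, -0.2, 0.7]

print(node_activation(inputs, weights))
```

Each output node has its own set of weights, so the same three input values can produce a different activation at every output node.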
From this example, you can see that the values at any layer
can be computed from the values of the layer before it. So really, when a
neural network calculates an output, information first has to flow (propagate)
from the inputs, through each layer, to the output layer.
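This flow of information can be sketched in a few lines of Python. The sketch assumes a sigmoid activation function and made-up weights, and the function names `layer_forward` and `forward` are my own for this illustration, not taken from any framework.

```python
import math

def sigmoid(z):
    """Assumed activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(activations, weight_rows):
    """Compute one layer's values from the previous layer's values.

    weight_rows[j] holds the weights on the connections into node j.
    """
    return [
        sigmoid(sum(a * w for a, w in zip(activations, row)))
        for row in weight_rows
    ]

def forward(inputs, layers):
    """Propagate the inputs through every layer, in order, to the output."""
    activations = inputs
    for weight_rows in layers:
        activations = layer_forward(activations, weight_rows)
    return activations

# Hypothetical weights for a network with three input nodes and three
# output nodes, like the small network described above.
layers = [
    [
        [0.4, -0.2, 0.7],
        [0.1, 0.3, -0.5],
        [-0.6, 0.8, 0.2],
    ],
]

outputs = forward([0.5, 0.1, 0.9], layers)
print(outputs)
```

Adding more layers only means adding more weight tables to `layers`; the loop in `forward` stays the same, because every layer is computed from the one before it.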
Now that we’ve introduced deep learning and neural networks,
we can start applying it all. I’ll continue this tutorial with a quick install guide
for the framework we’ll use to build and train these networks, then we’ll get
our hands dirty with some predictive modeling.
I'm really interested to see where your blog goes. I'm very curious about deep learning and how it's actually implemented. I know it's a hot topic and will be seen more and more in technology. By the way, who is the author of this blog? A name was not provided. Thanks.