Let's continue our descent by evaluating how good our heart disease classifier is. We can do this by generating predictions on the test set and seeing how those predictions compare to the test set's ground-truth labels. That can be done with the following lines of code. At the bottom we can see that the classifier predicted 24 true negatives, 9 false positives, 8 false negatives, and 19 true positives. That's pretty okay. There is obviously some inaccuracy in the predictions, but let's calculate the accuracy anyway: (24+19)/(24+19+8+9) = 43/60 ≈ 0.717. So the test accuracy was about 71.7%, while, if you recall from the last post, the training accuracy was nearing 90%. This disparity between training and testing accuracy is a result of overfitting. Essentially, 50,000 training loops was too much training for this little data. The resulting network overfit to the noise inherent in the training data and, as a result, failed to generalize as well to the test set. Therefore, the testing...
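For reference, the predict-and-compare step described above might look roughly like the sketch below, assuming a PyTorch model and scikit-learn's `confusion_matrix`. The names `model`, `X_test`, and `y_test` are hypothetical stand-ins for the trained network and the held-out test tensors; the post's actual code may differ.

```python
import torch
from sklearn.metrics import confusion_matrix

# Hypothetical names: `model`, `X_test`, and `y_test` stand in for the trained
# network and the held-out test tensors from this series.
model.eval()
with torch.no_grad():
    logits = model(X_test)                        # raw outputs on the test set
    preds = (torch.sigmoid(logits) > 0.5).int()   # threshold probabilities at 0.5

# scikit-learn lays the confusion matrix out as [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_test.numpy().ravel(), preds.numpy().ravel())
print(cm)

tn, fp, fn, tp = cm.ravel()
accuracy = (tn + tp) / (tn + fp + fn + tp)
print(f"Test accuracy: {accuracy:.3f}")
```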
As I begin writing this post I realize that we're not training a heart disease predictor, per se, since the classification task at hand isn't really to divine the onset of heart disease in the distant future. Rather, we're really training a heart disease classifier that identifies heart disease given test results, vital measurements, and demographic information. While much less cool than a predictor, this is still a pretty difficult task, and I suspect the definition of "heart disease" carries some ambiguity and subjectivity even among cardiologists. To illustrate the difficulty of this task, ask yourself: if you were given the sex, angina status, heart rate measurements, cholesterol measurements, and EKG results of a patient, would you be able to diagnose that patient as having heart disease? Anyways, let's go ahead and train this network. That can be done with these lines of code. Execution of these lines of code takes about 20 m...
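Since the training snippet isn't reproduced in this excerpt, here's a rough sketch of what such a training loop might look like. It assumes a small PyTorch network, with `model`, `X_train`, and `y_train` as hypothetical names for the network and training tensors assembled earlier; the post's actual framework, optimizer, and learning rate may differ.

```python
import torch
import torch.nn as nn

# Hypothetical names: `model`, `X_train`, and `y_train` stand in for the
# network and training tensors built earlier in the post.
loss_fn = nn.BCEWithLogitsLoss()                  # binary cross-entropy on raw logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(50_000):                        # the series trains for 50,000 loops
    optimizer.zero_grad()
    logits = model(X_train)                       # forward pass
    loss = loss_fn(logits, y_train.float())       # compare predictions to ground truth
    loss.backward()                               # backpropagate the error
    optimizer.step()                              # nudge the weights downhill

    if step % 5_000 == 0:
        print(f"step {step}: loss {loss.item():.4f}")
```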