Day 87: Classifier progress
Well, it’s been an interesting experience. I’ve been working hard to train a binary classifier to predict the two classes in my data. There have been a lot of ups and downs and, more importantly, there has been some progress. It isn’t perfect, but it’s a start, so let’s look at what I’ve got so far and where I’m headed.
First, let it be known that I’ve never done anything like this before. Yeah, story of my life these days, but it’s a learning process and that means screwing up from time to time. Currently I am still trying to find an optimized version of my model. Will it happen? Who knows, but the bottom line is this: despite all the setbacks, I’m getting closer to creating a good working model. In fact, I’ve already got a pretty surprising result. It’s not perfect, but it’s pretty good.
Baby steps, I guess, but let’s look at where we were when we last talked about this and where I am now. You may recall from this post that I had a model that predicted both of my classes, but it didn’t do a great job. In fact, the output looked something like this.
That left me wondering whether I was ever going to train a model good enough to predict my classes. Well, after toying with a few settings, I’ve managed to get a better predictive model, and now I have something that looks like this.
Which, let’s be honest, isn’t half bad! Also, remember that the model has never seen this data before. This was data I specially set aside that it didn’t train on, so this (to me anyway) is pretty impressive, since I haven’t done anything really advanced to get to this point. It’s all basic machine learning processes.
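For anyone who wants to see the set-aside-data idea in code, here is a minimal sketch. To be clear, this isn’t my actual pipeline: the function names (holdout_split, confusion_counts, accuracy), the 20% test fraction, and the toy rows are all made up for illustration.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

def accuracy(counts):
    """Fraction of predictions that matched the true label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0

# Made-up labelled rows: (feature, label). Not real data.
rows = [(i, i % 2) for i in range(10)]
train, test = holdout_split(rows)
```

The model only ever gets fit on train. The numbers that matter come from running it on test and feeding the true and predicted labels into confusion_counts.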
How did I end up here? Well, I switched my kernel function. If you recall, there are several different functions that you can use to define your hyperplane (the thing that the computer uses to separate my two classes). Originally I was using a Gaussian, yes this Gaussian. Then I tried a cubic (which is the first graph I showed), and now I am using the RBF, or radial basis function. To be fair, the Gaussian is an RBF (we can go over that in some other post), but as we talked about in the past, the Gaussian is a good approximation for a lot of things.
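If it helps to see what a kernel function actually is, here is a rough sketch of the textbook forms of the two I mentioned. The function names and the gamma and coef0 defaults are placeholders I picked for illustration, not the settings from my model, and different toolboxes scale these formulas a little differently.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def cubic_kernel(x, y, coef0=1.0):
    """Cubic polynomial kernel: (x . y + coef0)^3."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** 3

# Two made-up 2-D points, purely for illustration.
a = [1.0, 2.0]
b = [1.5, 1.0]

same = rbf_kernel(a, a)     # identical points give the maximum value, 1.0
near = rbf_kernel(a, b)     # shrinks toward 0 as the points move apart
cubic = cubic_kernel(a, b)  # grows with the dot product instead of distance
```

The RBF only cares how far apart two points are, while the cubic cares about their dot product, which is part of why swapping one for the other can change the decision boundary so much.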
Basically, I’m making progress, and I’m doing all of this using imperfect data, so that’s even more impressive in my opinion. In any case, we will have to wait and see how it all turns out in the end, but for now it looks like I’m headed in the right direction. With my QE date nice and nebulous at the moment, I will have some extra time to figure it out.
Until next time, don’t stop learning!