My Python NN Port – Results
Before moving on to my next project, I thought a good segue would be to discuss the actual results of porting my NN to Python and some open issues that I have yet to resolve.
In general, the port seems to have worked correctly on the basis of the three indicators I’ve reviewed. The first is a qualitative indicator — specifically the visualization of Theta 1. Here’s what it looks like in the original Octave version:
And here’s the Python version:
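The plot itself is just each hidden unit's input weights tiled into a grid of small images. Here's roughly how the Python version does it, as a sketch that assumes Theta1 has the course exercise's (25, 401) shape: 25 hidden units, each with a bias weight plus 400 weights mapping back to a 20x20 input image.

```python
import numpy as np
import matplotlib.pyplot as plt

def display_theta1(theta1, img_shape=(20, 20), grid=(5, 5)):
    """Tile each hidden unit's input weights into one grayscale image grid."""
    weights = theta1[:, 1:]                      # drop the bias column
    h, w = img_shape
    rows, cols = grid
    canvas = np.zeros((rows * h, cols * w))
    for i, unit in enumerate(weights[:rows * cols]):
        r, c = divmod(i, cols)
        # Octave reshapes column-major, NumPy row-major; order="F" keeps the
        # patches oriented the same way as the original visualization.
        patch = unit.reshape(img_shape, order="F")
        patch = (patch - patch.min()) / (patch.max() - patch.min() + 1e-12)
        canvas[r * h:(r + 1) * h, c * w:(c + 1) * w] = patch
    plt.imshow(canvas, cmap="gray")
    plt.axis("off")
    plt.show()
```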
Of course, this doesn’t say anything quantitative. It’s merely an indicator that nothing has gone wrong. The quantitative metrics are right in line, though. I used a set of 5,000 labeled examples, randomly split into a training set of 3,500 and a validation set of 1,500. In both the Python and ML/O versions, training accuracy hovers between 99% and 100%, and validation accuracy between 92% and 94%, depending, of course, on the randomization of both the training/validation split and the initial Thetas. So it certainly looks like I’ve implemented everything correctly.
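For reference, the split and the accuracy numbers come from something along these lines (a minimal sketch, where X, y, and predict are stand-ins for my actual data arrays and feedforward prediction function):

```python
import numpy as np

# Stand-in names: X is the (5000, 400) feature matrix, y the (5000,) label vector.
rng = np.random.default_rng()
idx = rng.permutation(len(y))
train_idx, val_idx = idx[:3500], idx[3500:]
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

# ... train on (X_train, y_train) to get Theta1/Theta2, then:
train_acc = np.mean(predict(Theta1, Theta2, X_train) == y_train)
val_acc = np.mean(predict(Theta1, Theta2, X_val) == y_val)
print(f"training: {train_acc:.1%}   validation: {val_acc:.1%}")
```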
However, there’s a problem: my Python version takes about 50% longer to run the training. In both instances, I’m running a version of fmincg for 500 iterations using batch optimization. The ML/O version takes ~21 seconds to run all 500, whereas the Python version takes ~32 seconds. Since I don’t know much about the inner workings of either platform, it could be many things, but my hunch is that it’s one of a few areas.
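(For what it’s worth, the Python number is just wall-clock time around the optimizer call, roughly like this; fmincg and nn_cost_grad here are placeholders for my port and my cost/gradient function.)

```python
import time

# fmincg and nn_cost_grad are placeholders for my port and my cost/gradient function.
start = time.perf_counter()
theta_opt, cost_history = fmincg(nn_cost_grad, initial_theta, max_iter=500)
print(f"500 iterations in {time.perf_counter() - start:.1f} s")
```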
It could be something as simple as how my fmincg port performs on top of NumPy versus the original in ML/O. A few times in the ML course, Professor Ng alludes to the different numerical libraries available in the various languages and hints that some may be more efficiently implemented than others.
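One cheap way to test that theory would be to swap my fmincg port for SciPy’s off-the-shelf conjugate-gradient minimizer and compare wall-clock times. A sketch, assuming a cost function that returns a (cost, gradient) pair the way fmincg expects:

```python
from scipy.optimize import minimize

# nn_cost_grad(theta, X, y, lam) -> (cost, gradient) is assumed here; jac=True
# tells SciPy that the objective returns the gradient along with the cost.
res = minimize(nn_cost_grad, initial_theta,
               args=(X_train, y_train, lam),
               method="CG", jac=True,
               options={"maxiter": 500})
theta_opt = res.x
```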
Or it could be something to do with how I’ve implemented, and failed to optimize, the cost/gradient function in Python. Both the Python and ML/O versions are completely vectorized, so in that regard they’re as efficient as they can possibly be. In terms of the actual algorithm, they should be identical except syntactically, since all I really did was translate the ML/O algorithm into Python syntax. But I paid no attention to things like variable typing, letting Python make any decisions I wasn’t forced to make myself. I’m guessing it’s possible that, by using variables with more precision than necessary, I’m incurring some extra cost. That seems unlikely, though.
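If I wanted to rule the typing theory out, the test would be easy enough: NumPy, like Octave, defaults to 64-bit floats, so casting everything down to float32 and re-timing the cost/gradient call would show whether precision is where the time goes. A sketch, again with placeholder names:

```python
import time
import numpy as np

def time_cost_call(theta, X, y, lam, reps=20):
    """Average seconds per cost/gradient evaluation; nn_cost_grad is a placeholder."""
    start = time.perf_counter()
    for _ in range(reps):
        nn_cost_grad(theta, X, y, lam)
    return (time.perf_counter() - start) / reps

# NumPy (like Octave) defaults to float64; cast down to float32 and compare.
t64 = time_cost_call(initial_theta, X_train, y_train, lam)
t32 = time_cost_call(initial_theta.astype(np.float32),
                     X_train.astype(np.float32), y_train, lam)
print(f"float64: {t64:.4f} s/call   float32: {t32:.4f} s/call")
```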
I can tell by looking at my CPU usage during training that both versions are utilizing 8 threads across all four cores of my CPU. However, while the ML/O version spikes my GPU to around 50% usage, the Python version shows no change in GPU usage whether training is running or not. This seems like a plausible culprit, but it’s hard to know for sure.
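As far as I can tell, NumPy on its own never touches the GPU; all the heavy matrix math runs on the CPU through whatever BLAS library it was built against, which is easy enough to check:

```python
import numpy as np

# Reports the BLAS/LAPACK backend (OpenBLAS, MKL, ...) NumPy was built against;
# that library is what spreads the matrix math across the CPU threads.
np.show_config()
```

So the flat GPU usage on the Python side is probably to be expected, whatever ML/O is doing differently under the hood.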
This is kind of interesting, and I could definitely investigate further, but I probably won’t be doing that. The reality is, now that I have the basics down with Python, I’m much more interested in moving forward on learning actual AI and designing my first substantial AI project. At this point, that’s still the checkers AI. If I get any traction there, I’m sure there will be ample opportunity to see the effects of Python tuning when I’m running it not on my laptop, but on my desktop PC with its GTX 1080 Ti.