TomBolton.io

Tom Bolton’s AI and Machine Learning Lab Notebook.


Not Good

Despite my critical appraisal of Udacity in my last post, I have been trying to keep an open mind and remain optimistic that the course would start pulling its weight in terms of instruction. Having started the next unit, things are not looking good.

Given that the last unit ended with a nearly useless “coding exercise,” I was hopeful that the next unit would dive deeper into the specifics of how that algorithm actually works in practice, beyond just some complex and cryptic mathematical expressions and uncommented implementations of the same. Unfortunately, it seems we have left that in the dust.

Instead, the unit starts with how excited our new instructor is to be teaching us. Good for him. It then moves on to a video-less recap of what seems to be a simplified version of the mathematical algorithm with minimal support.

We then get a “Code Walkthrough,” which you might think would do at least what decent commenting would have done for the coding exercise discussed previously, and more. No. That is not the intent of the “Code Walkthrough.” Its only purpose, it seems, is to explain the scaffolding they’ve created for us to fill in with the actual algorithms, which they have not explained. This is useless.

But the real coup de grâce can be summed up in the subsequent videos. Rather than bother to explain anything we’ve learned previously in detail, our excited instructor is on to the next topic: a refinement of policy gradients called Proximal Policy Optimization (PPO). At this point, it would be far too time-consuming to go through a blow-by-blow of how useless these videos really are. Here’s the only economical way I can think of to explain their spectacular uselessness.

First watch the gradient ascent video I mentioned prior. While you’re doing that, think about the person for whom this video is necessary. That is: the person who is really having trouble wrapping their head around the concept of gradient ascent and truly needs the picture of the mountain.
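For the record, here is what the mountain picture is standing in for. This is a sketch of my own, not the course’s code: gradient ascent just repeatedly steps in the direction of the derivative until it sits at a peak. The toy function and its hand-computed derivative are my choices for illustration.

```python
# A minimal, self-contained sketch (not Udacity's code): gradient ascent on a
# one-dimensional "mountain" f(x) = -(x - 3)**2, whose single peak is at x = 3.
def gradient_ascent(x=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        grad = -2 * (x - 3)   # derivative of f(x) = -(x - 3)**2
        x += lr * grad        # step uphill, in the direction of the gradient
    return x

print(round(gradient_ascent(), 4))  # prints 3.0, the location of the peak
```

That is the entire concept the mountain video is built around: follow the slope upward, one small step at a time.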

Okay, got it? Next, imagine you are that person. Imagine also that you’ve somehow managed to understand the fundamentals of implementing policy gradients in code (and I’m going to tell you: that is quite a stretch). Then imagine, finally, that you’re now expected to learn PPO (whatever that is), and these three videos are the ONLY explanation you’re going to get. (You can save yourself some time and just watch one, because they’re all the same in their approach.)

The $64,000 question: do you, as the person who needed a metaphorical explanation of gradient ascent with a cute picture of a mountain, have the slightest clue what this guy is talking about? Because if you don’t, you’re out of luck. These videos are the entirety of the PPO instruction you’re going to get.
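And here, for contrast, is how little it takes to at least state PPO’s central idea. This is my own minimal sketch of the standard clipped surrogate objective, not anything from the course, and the function name and numbers are mine:

```python
# Hedged sketch of PPO's core trick: clip the probability ratio between the
# new and old policy so a single update can't move the policy too far.
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """ratio = pi_new(a|s) / pi_old(a|s); advantage = estimated advantage A."""
    clipped = max(min(ratio, 1 + eps), 1 - eps)  # confine ratio to [1-eps, 1+eps]
    # Take the minimum so the objective never rewards pushing the ratio
    # outside that trust region, in either direction.
    return min(ratio * advantage, clipped * advantage)

print(ppo_clip_objective(1.5, 1.0))   # prints 1.2: the 1.5 ratio is clipped
print(ppo_clip_objective(0.5, -1.0))  # prints -0.8: clipped from the other side
```

One sentence and ten lines. That is the concept the three videos manage not to convey.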
