
Dropout

Dropout helped the model a lot, but not by reducing overfitting. Instead, adding dropout layers radically improved the speed at which the model increased its win rate. Here is pre-dropout performance for the first three bootstrap iterations:

And here is post-dropout performance:

So adding a dropout layer after each of the fully-connected layers radically improves how quickly the model learns, reaching bootstrap version 3 in under 210K games versus over 340K games for the no-dropout version. However, it does not reduce overfitting, and in fact seems to slightly exacerbate it.
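To make that concrete, here's a rough PyTorch sketch of what I mean by dropout after each fully-connected layer. The layer sizes and dropout probability are placeholders, not the actual architecture:

```python
import torch
import torch.nn as nn

class FCHead(nn.Module):
    """Illustrative fully-connected head with dropout after each hidden layer.

    Sizes here are placeholders, not the exact ones from this model.
    """
    def __init__(self, in_features=1433, hidden=512, out_features=64, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),   # dropout after the first fully-connected layer
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),   # dropout after the second fully-connected layer
            nn.Linear(hidden, out_features),  # no dropout on the output layer
        )

    def forward(self, x):
        return self.net(x)

head = FCHead()
logits = head(torch.randn(8, 1433))  # batch of 8 feature vectors
```

Since this is an nn.Module, dropout is active during training and automatically disabled once you call .eval() for evaluation games.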

Here is the non-dropout evaluation:

And here is the dropout evaluation:

I think I will hang onto the dropout layers, if only for the improvement in training speed.

One other thing I tried is adding a projection layer where the conv-layer outputs are concatenated with the piece inputs. My understanding of projection layers is that they give the 1433-element combined vector the opportunity to not have 1024 elements associated exclusively with the conv net and the remaining 409 exclusive to the piece vectors, but rather to have the two commingled appropriately. When I added a straight linear projection, it significantly degraded learning performance; it was only after adding dropout to the projection as well that performance came back in line with what it had been. That training run is still in progress, so I'll report back once it finishes.
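Here's roughly what that looks like, again as a hedged PyTorch sketch (the class name and dropout probability are illustrative): the 1024 conv features and 409 piece features are concatenated, then a same-width linear layer mixes them, with dropout applied to the projection output.

```python
import torch
import torch.nn as nn

class ProjectionJoin(nn.Module):
    """Sketch of the projection idea: conv features (1024) and piece
    features (409) are concatenated into a 1433-element vector, then a
    straight linear projection lets the two sources commingle instead of
    staying in disjoint halves of the vector. Dropout on the projection
    output is what made this train well in practice."""
    def __init__(self, conv_dim=1024, piece_dim=409, p_drop=0.5):
        super().__init__()
        joined = conv_dim + piece_dim            # 1433
        self.project = nn.Linear(joined, joined)  # straight linear projection
        self.drop = nn.Dropout(p_drop)            # dropout added to the projection

    def forward(self, conv_out, piece_vec):
        x = torch.cat([conv_out, piece_vec], dim=1)  # (batch, 1433)
        return self.drop(self.project(x))

join = ProjectionJoin()
mixed = join(torch.randn(8, 1024), torch.randn(8, 409))  # -> (8, 1433)
```

The projection keeps the same 1433 width, so it slots in between the concatenation and the existing fully-connected layers without changing anything downstream.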
