Dropout
The Effect of Dropout
Let's see for ourselves how dropout actually affects training. We will use the MNIST dataset and a simple convolutional network to do that:
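The original code cell is not included in this export. A minimal sketch of the setup, assuming TensorFlow/Keras (the framework that matches the training output below):

```python
import tensorflow as tf

# Load MNIST, add a channel dimension, and normalize pixels to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0
```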
We will define a `train` function that takes care of the whole training process, including:
- Defining the neural network architecture with a given dropout rate
- Specifying suitable training parameters (optimizer and loss function)
- Running the training and collecting the history
We will then run this function for a bunch of different dropout values:
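The training cell itself is missing from this export. Below is a hedged sketch of what the `train` function and the loop over dropout values could look like; the exact architecture, the Adam optimizer, and the batch size of 64 (chosen because it yields the 938 steps per epoch seen in the output) are assumptions, not the notebook's verbatim code:

```python
import tensorflow as tf

# Load and normalize MNIST so this cell is self-contained
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

def train(dropout_rate):
    # A small CNN; the dropout layer sits between the convolutional
    # feature extractor and the dense classifier
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                               input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['acc'])
    # batch_size=64 gives ceil(60000 / 64) = 938 steps per epoch
    return model.fit(x_train, y_train, batch_size=64, epochs=5,
                     validation_data=(x_test, y_test))

histories = {}
for d in [0, 0.2, 0.5, 0.8]:
    print(f"Training with dropout = {d}")
    histories[d] = train(d)
```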
```
Training with dropout = 0
Epoch 1/5
938/938 [==============================] - 26s 27ms/step - loss: 0.1949 - acc: 0.9435 - val_loss: 0.0596 - val_acc: 0.9802
Epoch 2/5
938/938 [==============================] - 27s 29ms/step - loss: 0.0592 - acc: 0.9816 - val_loss: 0.0433 - val_acc: 0.9857
Epoch 3/5
938/938 [==============================] - 26s 28ms/step - loss: 0.0438 - acc: 0.9867 - val_loss: 0.0472 - val_acc: 0.9849
Epoch 4/5
938/938 [==============================] - 27s 28ms/step - loss: 0.0355 - acc: 0.9890 - val_loss: 0.0353 - val_acc: 0.9882
Epoch 5/5
938/938 [==============================] - 26s 28ms/step - loss: 0.0294 - acc: 0.9910 - val_loss: 0.0305 - val_acc: 0.9894
Training with dropout = 0.2
Epoch 1/5
938/938 [==============================] - 29s 31ms/step - loss: 0.2097 - acc: 0.9377 - val_loss: 0.0655 - val_acc: 0.9781
Epoch 2/5
938/938 [==============================] - 31s 33ms/step - loss: 0.0676 - acc: 0.9792 - val_loss: 0.0409 - val_acc: 0.9852
Epoch 3/5
938/938 [==============================] - 28s 30ms/step - loss: 0.0514 - acc: 0.9837 - val_loss: 0.0384 - val_acc: 0.9871
Epoch 4/5
938/938 [==============================] - 28s 29ms/step - loss: 0.0424 - acc: 0.9871 - val_loss: 0.0343 - val_acc: 0.9889
Epoch 5/5
938/938 [==============================] - 30s 32ms/step - loss: 0.0356 - acc: 0.9893 - val_loss: 0.0343 - val_acc: 0.9885
Training with dropout = 0.5
Epoch 1/5
938/938 [==============================] - 30s 31ms/step - loss: 0.2586 - acc: 0.9212 - val_loss: 0.0666 - val_acc: 0.9797
Epoch 2/5
938/938 [==============================] - 28s 30ms/step - loss: 0.0860 - acc: 0.9734 - val_loss: 0.0441 - val_acc: 0.9860
Epoch 3/5
938/938 [==============================] - 29s 31ms/step - loss: 0.0674 - acc: 0.9792 - val_loss: 0.0414 - val_acc: 0.9868
Epoch 4/5
938/938 [==============================] - 30s 32ms/step - loss: 0.0564 - acc: 0.9822 - val_loss: 0.0326 - val_acc: 0.9886
Epoch 5/5
938/938 [==============================] - 29s 31ms/step - loss: 0.0511 - acc: 0.9843 - val_loss: 0.0298 - val_acc: 0.9899
Training with dropout = 0.8
Epoch 1/5
938/938 [==============================] - 31s 32ms/step - loss: 0.3832 - acc: 0.8766 - val_loss: 0.0849 - val_acc: 0.9732
Epoch 2/5
938/938 [==============================] - 29s 31ms/step - loss: 0.1563 - acc: 0.9521 - val_loss: 0.0686 - val_acc: 0.9797
Epoch 3/5
938/938 [==============================] - 32s 34ms/step - loss: 0.1253 - acc: 0.9616 - val_loss: 0.0490 - val_acc: 0.9854
Epoch 4/5
938/938 [==============================] - 33s 35ms/step - loss: 0.1105 - acc: 0.9658 - val_loss: 0.0395 - val_acc: 0.9872
Epoch 5/5
938/938 [==============================] - 34s 36ms/step - loss: 0.1022 - acc: 0.9680 - val_loss: 0.0363 - val_acc: 0.9878
```
Now, let's plot the validation accuracy curves for the different dropout values to see how fast the training goes:
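The plotting cell was also dropped from this export. As a self-contained stand-in, the sketch below replots the per-epoch validation accuracies copied from the training output above (matplotlib is assumed):

```python
import matplotlib.pyplot as plt

# Per-epoch validation accuracies, copied from the training output above
val_acc = {
    0:   [0.9802, 0.9857, 0.9849, 0.9882, 0.9894],
    0.2: [0.9781, 0.9852, 0.9871, 0.9889, 0.9885],
    0.5: [0.9797, 0.9860, 0.9868, 0.9886, 0.9899],
    0.8: [0.9732, 0.9797, 0.9854, 0.9872, 0.9878],
}
for d, accs in val_acc.items():
    plt.plot(range(1, 6), accs, label=f"dropout = {d}")
plt.xlabel('Epoch')
plt.ylabel('Validation accuracy')
plt.legend()
plt.show()
```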
From this graph, you should be able to see the following:
- With dropout values in the 0.2-0.5 range, you get the fastest training and the best overall results
- Without dropout (0), training is likely to be slower and less stable
- A high dropout rate (0.8) makes things worse