A downloadable project

We trained a simple convolutional neural network (CNN) on a poisoned version of the MNIST dataset. Some images in the dataset carry a watermark, and the labels of those images have been modified. We describe how we uncover the path the watermark takes through the network using ablation, and how we visualize the poisoning using feature maximization methods. We also discuss applications to safety and possible generalizations.
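
To make the poisoning setup concrete, here is a minimal sketch of how a watermarked, relabeled subset could be built. It is not taken from the write-up: the watermark shape (a 3x3 bright patch in the bottom-right corner), the poisoned fraction, the target label, and the function names `add_watermark` and `poison_dataset` are all assumptions chosen for illustration, and random arrays stand in for the real MNIST images.

```python
import numpy as np

def add_watermark(image, size=3, value=1.0):
    """Return a copy of `image` with a bright square patch in the bottom-right corner.

    The patch shape and position are assumptions; the write-up may use a different watermark.
    """
    marked = image.copy()
    marked[-size:, -size:] = value
    return marked

def poison_dataset(images, labels, fraction=0.1, target_label=0, seed=0):
    """Watermark a random fraction of the images and overwrite their labels.

    Returns the poisoned images, the poisoned labels, and the sorted indices
    of the samples that were modified.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(fraction * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    poisoned_images = images.copy()
    poisoned_labels = labels.copy()
    for i in idx:
        poisoned_images[i] = add_watermark(images[i])
    poisoned_labels[idx] = target_label  # hypothetical target class
    return poisoned_images, poisoned_labels, np.sort(idx)

# Stand-in data with the shape of MNIST images (28x28, values in [0, 1)).
data_rng = np.random.default_rng(1)
images = data_rng.random((100, 28, 28)).astype(np.float32)
labels = data_rng.integers(0, 10, size=100)

p_images, p_labels, idx = poison_dataset(images, labels)
```

Keeping the indices of the poisoned samples makes it straightforward to compare the network's behavior on clean and watermarked inputs later, for example when ablating individual filters.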

Github repo for the project

Download

Write up.pdf 1.3 MB
