DavidSnider said:
I'm not sure I follow this.
We train the NN on a set of images, some of the images humans have labeled as cats, some labeled non-cat, but all 'normal' photos. Then we get a GA to converge on a bunch of images that the NN thinks with high probability are cat-like. A human looks at these images and they look more or less like noise (non-cat-like by human standards for sure). Is this what you mean by 'poison pattern'?
Let's say you have 100 cat images and 100 non-cat images. You have several convolution filters, and say filter number 3 is a mid-pass filter (perhaps a wavelength of 6 pixels). And let's say that, as it happens, one of your "max" values from this filter is very telling. Call this F3M9. We get an F3M9 that is less than 4 for 70% of the non-cat images and greater than 10 for every cat image. With that type of input to the NN, the NN is going to key off that F3M9 value. If it sees F3M9<10, it will weigh it as very unlikely to be a cat.
So F3M9<10 would become a "poison pattern". If it is there, it mustn't be a cat. So a GA could eventually learn to check for this pattern - so it could avoid it.
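To make that concrete, here is a rough sketch of the F3M9 rule taken on its own. The feature values are made up to match the proportions in my example (they are not from any real filter), and `has_poison_pattern` is just a name I picked for illustration. A real NN would weigh this value together with many others rather than apply a hard cutoff.

```python
import random

CAT_THRESHOLD = 10.0  # from the example: every cat image has F3M9 > 10

def has_poison_pattern(f3m9, threshold=CAT_THRESHOLD):
    """True when the feature value rules the image out as a cat."""
    return f3m9 < threshold

# Made-up F3M9 values that match the proportions in the example.
rng = random.Random(0)
cat_values = [rng.uniform(10.5, 20.0) for _ in range(100)]
non_cat_values = (
    [rng.uniform(0.0, 3.9) for _ in range(70)]     # the 70% below 4
    + [rng.uniform(4.0, 20.0) for _ in range(30)]  # the rest, anywhere
)

cats_rejected = sum(has_poison_pattern(v) for v in cat_values)
non_cats_rejected = sum(has_poison_pattern(v) for v in non_cat_values)

print(f"cats rejected: {cats_rejected} of {len(cat_values)}")
print(f"non-cats rejected: {non_cats_rejected} of {len(non_cat_values)}")
```

No cat is ever rejected by this rule, and at least 70 of the 100 non-cats are, which is why the NN would lean on it so heavily and why a GA has every reason to learn to stay above the threshold.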
It is important to note that we have to add something to our GA survival formula that prevents an algorithm from finding a winning cat image and then just repeating that image unchanged, over and over again - knowing that it will always get a win.
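One possible way to do that, as a minimal sketch: keep an archive of past winners and give zero fitness to any candidate that is too close to one of them. Everything here is an assumption for illustration: images are flat tuples of pixel values, closeness is the fraction of differing pixels, `min_novelty` is an arbitrary cutoff, and `fake_nn` is a stand-in for the real NN's cat probability.

```python
def hamming_fraction(a, b):
    """Fraction of positions where two equal-length images differ."""
    if len(a) != len(b):
        raise ValueError("images must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def novelty(candidate, archive):
    """Distance to the nearest previous winner (1.0 if there are none)."""
    return min((hamming_fraction(candidate, w) for w in archive), default=1.0)

def fitness(candidate, archive, cat_probability, min_novelty=0.1):
    """NN score, zeroed out for near-copies of earlier winners."""
    if novelty(candidate, archive) < min_novelty:
        return 0.0
    return cat_probability(candidate)

def fake_nn(image):
    """Hypothetical stand-in for the trained NN: always says 'cat-like'."""
    return 0.9

winner = (0, 1, 1, 0, 1, 0, 0, 1)
archive = [winner]

repeat_score = fitness(winner, archive, fake_nn)  # exact repeat of a winner
fresh = (1, 0, 0, 1, 0, 1, 1, 0)                  # differs in every pixel
fresh_score = fitness(fresh, archive, fake_nn)

print(repeat_score, fresh_score)
```

The repeated image scores nothing even though the NN still likes it, so the GA only gets credit for finding something new.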
Ideally, our GA should eventually evolve the ability to mimic the NN process - at least well enough to score wins every time.
DavidSnider said:
Then we add these images to the training set labeled as non-cat and retrain.
How would you expect this to affect the new NN's ability to label images 'cat' that humans would also label as 'cat'?
When we re-run the GA on this new NN, are we still going to converge on data that looks like noise to a human?
So you would have your original 100 cat images and 100 non-cat images.
Then, say, you let your GA "evolve" overnight, and in the morning you collect 1000 images.
You review all 1000 and you don't like any of them. So you add them to the training set. Now you have 100 cats and 1100 non-cats in your training set.
Then, say, you let your GA "evolve" overnight again, and in the morning you collect another 1000 images.
You review all 1000 and one kind of looks like a cat. So you add them all to the training set, that one as a cat and the other 999 as non-cats. Now you have 101 cats and 2099 non-cats in your training set.
And so on.
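The bookkeeping for those rounds can be sketched like this. It only tracks counts, not images, and `add_round` is a name I made up for the illustration.

```python
def add_round(cats, non_cats, collected, accepted_as_cat):
    """Fold one overnight batch of GA images into the training-set counts."""
    if not 0 <= accepted_as_cat <= collected:
        raise ValueError("accepted_as_cat must be between 0 and collected")
    return cats + accepted_as_cat, non_cats + (collected - accepted_as_cat)

counts = (100, 100)  # original training set
history = [counts]

# (images collected, images a human accepted as cat-like) for each night
for collected, accepted in [(1000, 0), (1000, 1)]:
    counts = add_round(*counts, collected, accepted)
    history.append(counts)

for cats, non_cats in history:
    share = cats / (cats + non_cats)
    print(f"{cats} cats, {non_cats} non-cats, cat share {share:.1%}")
```

After two nights the cats go from half of the training set to under 5% of it, so the set gets lopsided very quickly.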
The problem would be the information capacity of your NN. If it could store all of the original 100 images in its programming (in terms of its learning capacity), you would tend to drive it towards that state - except that you would occasionally find new images of cats.