Using Genetic Algorithms to Generate Negative Training Data

What is a Genetic Algorithm (GA)?

GAs simulate evolution: genotypes (like DNA in humans) generate phenotypes (like human beings). The genotypes are usually random at the start. The phenotypes are scored by some kind of testing function (the fitness function), and the genotypes behind the best ones are recombined with each other to make children. Some noise is added, since in the real world DNA isn’t copied exactly and cosmic rays occasionally break things. The phenotypes that don’t do well are discarded. Then the whole process is repeated. For more, see Wikipedia.
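The loop described above can be sketched in a few lines. This is a minimal illustration in Python, not the code from the notebook: the genotype is a list of bits, the fitness function (`count_ones`) is a toy stand-in, and every name and parameter value here is an assumption chosen for the example.

```python
import random


def evolve(fitness, genome_length=32, population_size=40, generations=60,
           mutation_rate=0.02, keep_fraction=0.25, seed=0):
    """Toy genetic algorithm over bit-list genotypes (illustrative only)."""
    rng = random.Random(seed)

    # Genotypes start out random.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    n_keep = max(2, int(population_size * keep_fraction))
    best_per_generation = []

    for _ in range(generations):
        # Score everything and keep the best; the rest are discarded.
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        survivors = ranked[:n_keep]

        # Refill the population with children of two distinct survivors.
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_length)      # one-point crossover
            child = mother[:cut] + father[cut:]
            # Noise: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)

        # Survivors are carried over unchanged, so the best score never drops.
        population = survivors + children

    return max(population, key=fitness), best_per_generation


def count_ones(genome):
    """Stand-in fitness: the number of 1 bits in the genotype."""
    return sum(genome)


best, history = evolve(count_ones)
```

Carrying the survivors over unchanged is one design choice among several; it guarantees the best score never gets worse between generations, at the cost of some diversity.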

The Wolfram Language makes it remarkably easy to generate random images, combine them, add noise, and test them against neural networks as a fitness function.
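As a rough illustration of those four operations (in Python rather than the Wolfram Language, and not the notebook's code), here is a sketch that treats an image as a flat list of grayscale pixel values. The image size, the operator choices, and `target_class_probability` are all assumptions; the last one is a placeholder for wherever a real network's confidence in the target class would be read off.

```python
import random

WIDTH, HEIGHT = 16, 16  # assumed tiny image size, to keep the example fast


def random_image(rng):
    """Generate a random grayscale image as a flat list of 0..255 values."""
    return [rng.randrange(256) for _ in range(WIDTH * HEIGHT)]


def combine(a, b, rng):
    """Combine two images by taking each pixel from one parent at random."""
    return [pa if rng.random() < 0.5 else pb for pa, pb in zip(a, b)]


def add_noise(image, rng, rate=0.01, sigma=20.0):
    """Perturb a small fraction of pixels with Gaussian noise, clamped to 0..255."""
    noisy = []
    for p in image:
        if rng.random() < rate:
            p = int(round(p + rng.gauss(0.0, sigma)))
            p = min(255, max(0, p))
        noisy.append(p)
    return noisy


def target_class_probability(image):
    """Hypothetical stand-in for a classifier's confidence in the target class.

    A real run would call an image classifier here; this placeholder just
    returns the mean brightness scaled to [0, 1].
    """
    return sum(image) / (255 * len(image))


rng = random.Random(0)
mother = random_image(rng)
father = random_image(rng)
child = add_noise(combine(mother, father, rng), rng)
score = target_class_probability(child)
```

Swapping the placeholder for a real classifier call is the only change needed to turn this into a fitness function of the kind described above.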

We can thus evolve images to fool neural networks fairly easily.

For example, the default image identification network model is 99.9977% sure that this is an image of Yoda:

That image took about 30 minutes of compute time to generate, and the process could be sped up in a variety of ways. It would be interesting to experiment with this to:

  • Select for compressibility in the fitness function, to try to remove the randomness
  • Restrict the colors
  • Use geometry features instead of pixels (lines, rectangles and so on)
  • Use the generated negative images to improve the models
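The first two ideas are easy to prototype. The sketch below is in Python and is only one possible reading of them: compressibility is measured as the zlib-compressed size relative to the raw size, colors are restricted by snapping each pixel to a few evenly spaced gray levels, and the penalty weight is an arbitrary assumption. None of these names come from the notebook.

```python
import random
import zlib


def compressed_fraction(pixels):
    """Compressed size divided by raw size; lower means more regular."""
    raw = bytes(pixels)  # pixels must be integers in 0..255
    return len(zlib.compress(raw, 9)) / len(raw)


def restrict_colors(pixels, levels=4):
    """Snap each pixel to one of `levels` evenly spaced gray values."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    restricted = []
    for p in pixels:
        index = p * levels // 256              # which band the pixel falls in
        restricted.append(index * 255 // (levels - 1))
    return restricted


def penalized_fitness(confidence, pixels, weight=0.1):
    """Classifier confidence minus a penalty for hard-to-compress images.

    `confidence` is assumed to come from whatever network is being fooled;
    `weight` is an arbitrary trade-off chosen for illustration.
    """
    return confidence - weight * compressed_fraction(pixels)


rng = random.Random(0)
noise_image = [rng.randrange(256) for _ in range(32 * 32)]
flat_image = [128] * (32 * 32)
```

With the same classifier confidence, `penalized_fitness` ranks `flat_image` above `noise_image`, which is the selection pressure the first bullet is after.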

I’m assuming all of this isn’t novel, but after some quick googling around I couldn’t find anything particularly similar.

You can find the notebook here and play with it yourself, and it’s also embedded below:

2 Responses to Using Genetic Algorithms to Generate Negative Training Data

  1. Ben Gimpert, March 14, 2020 at 10:00 am

    There’s a literature on Black-Box attacks, adversarial image patches, and so on:

    • Steve Coast, March 14, 2020 at 3:56 pm

      Yeah, to be clear, I meant GAs specifically
