Along with @maciejwolczyk we’ve been training a neural network that learns how to play NetHack, an old roguelike game, that looks like in the screenshot. Recently, something unexpected happened.

  • nomad
    link
    English
    2123 days ago

    Reminds me of a production bug we could not replicate for the life of me.

    The condition could logically not be reached. impossible.

    Turns out, in production we had two threads per process, and one would monkey patch a function in the shared process space with a non multi threading safe locking mechanism.

    That took several days to find.