I crafted a custom file analysis plugin that attaches to specific MIME types via file_sniff and fires an appropriate event once processing has been completed.

I had to jump through a few hoops to build a file analysis plugin in the first place, but those were cleared, and everything loads and runs appropriately there (verified with `bro -NN`). My test regime is very straightforward: I have several PCAPs that contain simple HTTP file GETs (which otherwise extract properly and do not exhibit missing_bytes), and I run them via `bro -C -r <>.pcap`. My issue is that execution is completely inconsistent: with zero changes between runs, it is effectively a coin flip whether a given run works.

My file analysis plugin has a secondary verification step to make sure the data it is passed is appropriate. That this check ever fails is confusing, since the MIME type fires correctly, which seems to indicate a bug somewhere in the data path. When I dump the buffers being processed, the correct executions clearly have the proper data in them. The invalid executions, again with nothing changed other than running a second time, show a buffer of what appears to be completely random data.
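
For comparing buffers between a good run and a bad run, a small fingerprinting helper along these lines could be dropped into the plugin. This is only a sketch under stated assumptions: the names `fingerprint`, `hex_prefix`, and `log_chunk` are hypothetical, FNV-1a is an arbitrary choice of hash, and where the call goes inside the plugin is left open. It uses only the C++ standard library and nothing from Bro itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// 64-bit FNV-1a hash over a raw buffer (hypothetical helper, not a Bro API).
std::uint64_t fingerprint(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Lowercase hex rendering of at most max_bytes leading bytes of the buffer.
std::string hex_prefix(const unsigned char* data, std::size_t len,
                       std::size_t max_bytes = 16) {
    static const char digits[] = "0123456789abcdef";
    const std::size_t n = len < max_bytes ? len : max_bytes;
    std::string out;
    out.reserve(n * 2);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(digits[data[i] >> 4]);
        out.push_back(digits[data[i] & 0x0f]);
    }
    return out;
}

// Writes one line per buffer to stderr so that two runs can be diffed.
void log_chunk(const char* tag, const unsigned char* data, std::size_t len) {
    std::fprintf(stderr, "%s len=%zu fnv1a=%016llx head=%s\n", tag, len,
                 static_cast<unsigned long long>(fingerprint(data, len)),
                 hex_prefix(data, len).c_str());
}
```

Redirecting stderr to a file on each run and diffing the results would show whether the length, the leading bytes, or only the later contents differ between a working and a failing execution. Logging the same buffer both where the plugin receives it and where the verification step reads it would help narrow down where the data changes.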

I currently cannot supply the file analysis plugin for inspection, but would very much appreciate insight into how to find the root cause. It very much seems to be upstream of the plugin. If I run the analysis portion of the plugin as a standalone executable outside of Bro against the data transferred via HTTP, everything works perfectly and the structures are filled in correctly.

I saw BIT-1832, and there could be a similar root cause there, but I have not had time to investigate further. The issues I am raising, again, occur with PCAP replay from the command line, not even “live” network traffic or tcpreplay over a NIC/dummy interface.


Aaron