Justin,

Indeed, breaking new ground is always interesting. As for the code:

https://github.com/aeppert/test_file_analyzer


File I am using for this case:
https://www.bro.org/static/exchange-2013/faf-exercise.pcap

I run `bro -C -r faf-exercise.pcap` after building and installing the plugin.

My suspicion is it’s either unbelievably trivial and I keep missing it because I am the only one staring at it, or it’s a rather deep rabbit hole. 

Aaron

On October 12, 2017 at 5:35:15 PM, Azoff, Justin S (jazoff@illinois.edu) wrote:


> On Oct 12, 2017, at 5:21 PM, Aaron Eppert <aaron.eppert@packetsled.com> wrote:
>
> I crafted a custom file analysis plugin that attaches to specific MIME types via file_sniff and fires an appropriate event once processing has been completed.
>
> I had to jump through a few hoops to make a file analysis plugin in the first place, but those were cleared, and everything loads and runs appropriately (verified with bro -NN). My test regime is very straightforward: I have several PCAPs that contain simple HTTP file GETs (which otherwise extract properly and do not exhibit missing_bytes), and I run them via `bro -C -r <>.pcap`. My issue is that execution is completely inconsistent from run to run: it is, effectively, a coin flip, with zero changes between runs.
>
> I have dumped the buffers being processed, since my file analysis plugin has a secondary verification step to make sure the data passed in is appropriate. On a correct execution, the buffer clearly has the proper data in it. On an invalid execution, with nothing changed other than running it again, the buffer shows what appears to be completely random data. This is confusing, because the MIME type event fires correctly, which seems to indicate a bug somewhere in the data path.
>
>

That sounds a lot like an uninitialized buffer somewhere. I wonder whether compiling bro and your plugin with -fsanitize=address would trigger something.

> I currently cannot supply the file analysis plugin for inspection, but would very much appreciate insight into how to find the root cause. It very much seems to be upstream. If I run the analysis portion of the plugin as a freestanding executable outside of Bro against the data transferred via HTTP, everything works perfectly and the structures are filled accordingly.


If you are seeing what looks like random data in your plugin you should be able to reproduce this behavior by having a file analysis plugin that just dumps out the buffers to stdout (as hex?). Can you rip out all the custom logic in your plugin leaving something that just dumps the buffers as-is? That should leave you with just the hello world of file analysis plugins. If that shows the problem we should be able to figure out where it is coming from.
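
As a rough sketch of the "just dump the buffers" idea, something like the helper below could be the only logic left in the stripped-down plugin. The name hex_dump and its parameters are made up for illustration and are not part of Bro; it only formats whatever bytes it is handed, so any difference between runs would have to come from the data being delivered.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of Bro: render a buffer as hex,
// 16 bytes per line, each line prefixed with its offset.
// base_offset lets the caller keep a running position across calls.
std::string hex_dump(const unsigned char* data, std::size_t len,
                     std::size_t base_offset = 0)
{
    std::string out;
    char tmp[32];

    for (std::size_t i = 0; i < len; i += 16) {
        // Offset column, zero-padded to 8 hex digits.
        std::snprintf(tmp, sizeof(tmp), "%08zx ", base_offset + i);
        out += tmp;

        // Up to 16 bytes on this line.
        for (std::size_t j = i; j < i + 16 && j < len; ++j) {
            std::snprintf(tmp, sizeof(tmp), " %02x",
                          static_cast<unsigned>(data[j]));
            out += tmp;
        }
        out += '\n';
    }
    return out;
}
```

Inside the plugin this would be called from whatever callback receives the file data, for example `fputs(hex_dump(data, len).c_str(), stdout);`. I believe that callback is DeliverStream(const u_char* data, uint64 len) on the file analyzer class, but treat that signature as an assumption and check it against your Bro version. Diffing the output of two runs over the same pcap should then show exactly where the buffers diverge.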

I don't think file analysis is inherently broken somewhere, otherwise the bro test suite would fail. I think this would have to point to something unique about your plugin. I think you are the first person to build an out-of-tree file analysis plugin, so there may be an issue with the bro<->plugin interface for file analysis itself. If that is the case, extracting something like the built-in md5 analysis plugin to an external plugin and calling it 'mymd5' would show the same problems.

> I saw BIT-1832, and there could be similar root causes in there, but I have not had time to investigate further. The issues I am raising, again, occur with PCAP replay via the command line, not even “live” network traffic or tcpreplay over a NIC/dummy interface.

That does sound similar, but I'm not sure if they were seeing different results on the same pcap on different runs.


Justin Azoff