On Feb 9, 2017, at 12:02 PM, Robin Sommer
- We need to maintain some predictability in scheduling, in
particular with regard to timing/timers. Bro's network time
is, by definition, defined through I/O. My gut feeling is
that we need to keep the tight coupling there, as otherwise
semantics would change quite a bit.
- Related, another reason for time playing such an important role
in the I/O loop is that Bro needs to process its soonest input
first. That's most important for packet sources: if we have
packets coming from multiple packet sources, earlier timestamps
must be processed before later ones across all of them.
- Time is generally complex. We actually have three different
notions of network time, each with its own specifics: time
during real-time processing, time during offline trace
processing, and pseudo-realtime.
I'm also not sure to what degree the coupling around time/timers can be reduced, though I
think an initial refactor of the run loop could at least be done without changing much
about how time currently works. Then, later or during the refactor, it should get easier
to see what exactly can be improved.
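
To make the "soonest input first" requirement above concrete, here is a minimal C++ sketch of merging several packet sources by timestamp. It is only an illustration and not Bro's actual code: `Packet`, `MergeByTimestamp`, and the assumption that each source is already sorted by timestamp are all hypothetical.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a packet: only the timestamp matters here.
struct Packet {
    double ts;      // capture timestamp in seconds
    int source_id;  // index of the source it came from
};

// Merge per-source packet lists (each assumed sorted by timestamp) into a
// single stream ordered by timestamp across all sources.
std::vector<Packet> MergeByTimestamp(const std::vector<std::vector<Packet>>& sources) {
    // A cursor points at the next unprocessed packet of one source.
    struct Cursor {
        double ts;
        std::size_t src;
        std::size_t idx;
    };
    // "Later" ordering turns std::priority_queue into a min-heap on ts.
    auto later = [](const Cursor& a, const Cursor& b) { return a.ts > b.ts; };
    std::priority_queue<Cursor, std::vector<Cursor>, decltype(later)> heap(later);

    for (std::size_t s = 0; s < sources.size(); ++s)
        if (!sources[s].empty())
            heap.push({sources[s][0].ts, s, 0});

    std::vector<Packet> out;
    while (!heap.empty()) {
        Cursor c = heap.top();
        heap.pop();
        out.push_back(sources[c.src][c.idx]);  // earliest packet overall
        if (c.idx + 1 < sources[c.src].size())
            heap.push({sources[c.src][c.idx + 1].ts, c.src, c.idx + 1});
    }
    return out;
}
```

With one source holding timestamps 1.0 and 4.0 and another holding 2.0 and 3.0, the merged order is 1.0, 2.0, 3.0, 4.0. A live loop cannot see future packets the way this offline merge can, which is part of why real-time and trace processing differ.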
- I believe we need to maintain the ability to have I/O loops that
don't have FDs.
Yep, don’t think there will be a problem there.
- I like the idea of using CAF, not least because it's going to be
a required dependency anyways in the future. I would also like
it conceptually to move I/O to actors, and I'm wondering if even
packet sources could go there. However, I can't quite tell if
that's feasible given other constraints and how other parts of
the system are laid out (including that in the end, everything
needs to go back into the main thread before being further
processed; at least for the time being).
I do think even packet sources could get moved into actors. My initial idea for the main
loop refactor is for it to be a single actor waiting for “ready for processing” messages
from IOSources, and then for each IOSource to be responsible for its own FD polling (if it
needs it). That way, the main loop doesn’t care about FDs at all anymore, and if an
IOSource needs to poll FDs it can just use poll() in its own actor/thread for now. My
guess is that most IOSources will just have a single FD to poll anyway, or that the polling
mechanism isn’t a very significant chunk of time for ones that may have more, but the only
way to answer that is to actually do the performance testing.
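
A rough C++ sketch of that shape, using plain threads and a mutex-protected queue rather than CAF so that no CAF API is guessed at. All names here (`ReadyMsg`, `Mailbox`, `RunSource`, `RunMainLoop`) are hypothetical, and the source threads only emit dummy messages where a real IOSource would block in poll() on its own FD.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical message an IOSource sends when it has input to process.
struct ReadyMsg {
    std::string source;  // name of the source that is ready
    int payload;         // stand-in for whatever the source produced
};

// Minimal thread-safe mailbox standing in for an actor's message queue.
class Mailbox {
public:
    void Send(ReadyMsg m) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(std::move(m));
        }
        cv_.notify_one();
    }

    // Blocks until a message is available, then returns it.
    ReadyMsg Receive() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        ReadyMsg m = std::move(q_.front());
        q_.pop();
        return m;
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<ReadyMsg> q_;
};

// Each source runs in its own thread. A real source would wait on its own
// FD here; this sketch just sends `count` messages to the main loop.
void RunSource(Mailbox& main_loop, const std::string& name, int count) {
    for (int i = 0; i < count; ++i)
        main_loop.Send({name, i});
}

// The main loop knows nothing about FDs: it only drains "ready" messages.
// Returns the number of messages it processed.
int RunMainLoop(int num_sources, int msgs_per_source) {
    Mailbox mailbox;
    std::vector<std::thread> sources;
    for (int s = 0; s < num_sources; ++s)
        sources.emplace_back(RunSource, std::ref(mailbox),
                             "src" + std::to_string(s), msgs_per_source);

    int processed = 0;
    for (int i = 0; i < num_sources * msgs_per_source; ++i) {
        ReadyMsg m = mailbox.Receive();
        (void)m;  // processing would happen here, on the main thread
        ++processed;
    }
    for (auto& t : sources)
        t.join();
    return processed;
}
```

Note that this sketch processes messages in arrival order, so it does not by itself preserve timestamp ordering across sources; that would still need to be layered on top.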
- One of the trickiest parts in the past has been
performance on a variety of platforms and OS versions. Whatever
we do, it'll be important to do quite a bit of test-driving and
benchmarking. Let's try to structure the work so that we can get
to a prototype quickly that allows for some initial performance
validation of the approach taken.
Sure. I was also expecting to try and just get something working without any significant
overhauling of any of Bro’s systems.