On May 9, 2011, at 12:14 PM, Matthias Vallentin wrote:
I'd love to do so, yet my cycles only allow for brief inline feedback.
All thoughts are welcomed. :)
In the metrics framework, a metric is just a key (or keys)
connected to a number that is collected over some interval
before being written to disk and reset.
If it is really just a sequence of numbers, why not call it a time
series? The word metric (in networking) implies some sort of property of
a path, and more generally, some sort of performance measure. This would
also make more sense in the statistical context, where time series
analysis is a well-defined field of its own. I prefer this term not only
because I have taken a statistics course, but mainly because it is more
neutral, maybe even more general, since it only describes the format of
I think I agree with this. I'll probably change the name at some point.
To preserve the temporal ordering, timestamps need to be part of the
synchronization game. It looks like a mergeable table indexed by
timestamp will do the trick.
Hm... there isn't a whole lot of attention given to temporal ordering. The manager
would just be asking for a particular measurement (conns originated, for example) along
with the index or indexes (possibly each local /24 where a connection was originated?) and
their counts. Once the workers have sent their values off to the manager, they would reset
back to zero and start counting up until the manager asks for the numbers again.
There is no actual attribute-based synchronization (&synchronize) going on because
workers don't care about the values on other workers, and full variable synchronization
would cause too much overhead.
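
To make the collect-and-reset idea concrete, here is a rough Python sketch (not Bro script, and not the framework's actual code). The names Worker, poll, and "conns.originated" are made up for illustration; the only assumption is that each worker keeps plain per-key counters and hands them over when asked.

```python
from collections import Counter


class Worker:
    """Counts events per (metric, index) until the manager collects them."""

    def __init__(self):
        self.counts = Counter()

    def add(self, metric, index, n=1):
        self.counts[(metric, index)] += n

    def collect(self):
        # Hand the current counts to the manager and start again from zero.
        snapshot, self.counts = self.counts, Counter()
        return snapshot


def poll(workers):
    """Manager side: ask every worker for its counts and sum them."""
    total = Counter()
    for w in workers:
        total.update(w.collect())
    return total


w1, w2 = Worker(), Worker()
w1.add("conns.originated", "10.0.1.0/24")
w1.add("conns.originated", "10.0.1.0/24")
w2.add("conns.originated", "10.0.1.0/24")
w2.add("conns.originated", "10.0.2.0/24")

first = poll([w1, w2])
second = poll([w1, w2])  # nothing happened in between, so this is empty
```

Workers never see each other's values here; only the manager holds the summed view, which is the point of skipping full synchronization.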
There's a whole subfield of statistics waiting for you. The natural
question is how much of this should be in Bro versus offline log munging.
Clearly you're talking Bro. It seems what you would like to have is a
variance analysis on a detrended series (i.e., on the first-order
differences between two data points). Other analyses would be to check
for seasonal components.
You lost me at "variance analyzer on a detrended series". :)
Anyway, what I'm searching for is just enough statistics (even if it's fake-ish
pseudo-statistics) to be able to raise notices when statistically significant changes
happen in the time series data. I just don't even know where to start with it.
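
One possible starting point, sketched in Python rather than Bro script: take the first-order differences mentioned above and flag the newest change when it sits far outside the spread of the earlier changes. The function names and the threshold of 3 standard deviations are assumptions chosen for illustration, and this is a crude heuristic, not a proper significance test.

```python
from statistics import mean, stdev


def first_differences(series):
    """Change between each pair of consecutive data points."""
    return [b - a for a, b in zip(series, series[1:])]


def is_unusual(series, threshold=3.0):
    """Return True if the latest change is far outside the earlier changes."""
    diffs = first_differences(series)
    if len(diffs) < 3:
        return False  # not enough history to estimate the spread
    history, latest = diffs[:-1], diffs[-1]
    spread = stdev(history)
    if spread == 0:
        # Every earlier change was identical, so any different change stands out.
        return latest != history[0]
    return abs(latest - mean(history)) > threshold * spread


steady = [100, 102, 101, 103, 102, 104]
spike = [100, 102, 101, 103, 102, 160]
steady_flag = is_unusual(steady)
spike_flag = is_unusual(spike)
```

With these sample values the steady series is not flagged and the series ending in a jump to 160 is. Seasonal effects (time of day, day of week) are ignored entirely in this sketch.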
Why not use R? It has brilliant time series support!
(There are also scripting language bindings if you really want a
separate tool. I tested the Ruby bindings once and they work well.)
Someone mentioned to me recently that graphviz will even do ascii art based graphs in your
terminal and it's already installed everywhere. This would just be a companion
script so people would be free to write whatever works best for themselves.
requests per host header (using a new SSL analyzer that
provides the information from the SSL establishment), this is an
example of a non-IP address based metric too.
Along those lines, one could (mis)use this new framework to count the
number of unique certificates per host as a crude way to identify TLS
Oof. Yes. :) I'm trying to implement this in a way that people will be able to do
things like that though, so I suppose it's all good.
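
For what it's worth, counting unique certificates per host is a slightly different shape from a plain counter: each key needs a set of values seen, not just a number. A rough Python sketch of that shape (observe, unique_cert_counts, and the fingerprint strings are hypothetical, and nothing here reflects how the framework stores data):

```python
from collections import defaultdict

# host -> set of certificate fingerprints seen for that host
certs_by_host = defaultdict(set)


def observe(host, cert_fingerprint):
    """Record one certificate sighting; repeats do not change the count."""
    certs_by_host[host].add(cert_fingerprint)


def unique_cert_counts():
    """Number of distinct certificates seen per host."""
    return {host: len(certs) for host, certs in certs_by_host.items()}


observe("www.example.com", "aa:11")
observe("www.example.com", "aa:11")  # same certificate again
observe("www.example.com", "bb:22")
observe("mail.example.com", "cc:33")

counts = unique_cert_counts()
```

The sets grow with the number of distinct values, so a real implementation would still need the same periodic reset as the plain counters.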
International Computer Science Institute
(Bro) because everyone has a network