To be honest, I have somehow lost track of the discussion. As far as I
can recall, it is about simplifying the API in light of multi-hop
routing, which is not fully functional yet.
Regarding multi-hop routing, I am not even sure what goal we are
currently aiming at. From a conceptual perspective, however, I think
"routing" needs either routing algorithms or strict conventions for how
the network that messages are routed through is structured. So, what
would a "deep cluster" look like, and what kind of message flows do we
expect in it?
Some comments on the observations:
On 06/08/18 21:50, Robin Sommer wrote:
- The main topics are bro/cluster/node/<name>. For these we wouldn't
have a problem with loops if we enabled automatic, topic-driven
forwarding as far as I can see.
How does forwarding work if I add another node type? Do we assume a
certain cluster structure here? If yes: Is that a valid assumption?
- bro/cluster/broadcast seems to be the main case with a looping
problem, because everybody subscribes to it. It's hardly used
though. (bro/config/change is used similarly though).
The topic concept is a multicast scheme, isn't it? Having broadcast
functionality on top of that feels odd. However, it is limited to the
cluster topic. This leads me to the question of which domains we operate
on. When I think of messages, I start to think about a cluster, but that
might be only one domain of application. I think it would be good to
define the layers of abstraction more precisely here.
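To make the looping concern a bit more concrete, here is a minimal
sketch in Python (not Bro script, and not how Broker is implemented).
It assumes a naive forwarding rule: a node passes a message on to every
peer whose subscription prefix matches the topic, except the peer it
just came from. The function name, the three-node fully peered topology
and the hop limit are all made up for illustration.

```python
from collections import deque

def forward(links, subscriptions, origin, topic, max_hops=8):
    """Naive topic-driven forwarding (hypothetical model).

    Returns how often each node received the message.
    """
    deliveries = {node: 0 for node in links}
    # Each entry: (current node, node the message came from, hops so far)
    queue = deque([(origin, None, 0)])
    while queue:
        node, came_from, hops = queue.popleft()
        if hops >= max_hops:
            continue  # hop limit only keeps the simulation finite
        for peer in links[node]:
            if peer == came_from:
                continue  # never send straight back to the sender
            if any(topic.startswith(p) for p in subscriptions[peer]):
                deliveries[peer] += 1
                queue.append((peer, node, hops + 1))
    return deliveries

# Assumed topology: three nodes, all peered with each other (a cycle).
nodes = ["manager", "proxy-1", "worker-1"]
links = {n: [m for m in nodes if m != n] for n in nodes}

# Every node subscribes to its own node topic and to the broadcast topic.
subscriptions = {
    n: ["bro/cluster/node/" + n, "bro/cluster/broadcast"] for n in nodes
}

to_manager = forward(links, subscriptions, "worker-1", "bro/cluster/node/manager")
to_everyone = forward(links, subscriptions, "worker-1", "bro/cluster/broadcast")

print(to_manager)   # only the manager is reached, exactly once
print(to_everyone)  # nodes are reached repeatedly until the hop limit
```

In this model the per-node topic stops after one delivery because only
one node subscribes to it, while the broadcast topic keeps circulating
because every peer matches. Whether that carries over to a real setup
depends on exactly the structural assumptions I am asking about.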
- There are a couple of script-specific topics where I'm wondering
if these could switch to using bro/cluster/<node-type> instead
(bro/intel/*, bro/irc/dcc_transfer_update). In other words: when
clusterizing scripts, prefer not to introduce new topics.
From my understanding this would mean going back to the old
communication patterns. What's the point of having topics if we don't
- There's a lot of checks in publishing code of the type "if I am
(not) of node type X".
That's something I would have expected. I don't think this is
necessarily an indicator of bad design. Having these kinds of checks
means that roles are somehow fixed and responsibilities are explicitly
- Pools are used for two different things: 1. the
pick a proxy to process and log the information; whereas 2. the
Intel scripts pick a proxy just as a relay to broadcast stuff
out, reducing load. That 1st application is a good one, but the 2nd
feels like it should be handled differently.
I think we should be careful about introducing too many abstractions.
Communication patterns tend to be complex, and the more of that
complexity is hidden, the easier it is to create misunderstandings. For
example, in the case of the intel framework, proxy nodes might at some
point implement more logic than just relaying. Having the relay
abstraction would then mean dealing with two different levels of
abstraction regarding intel on proxy nodes.
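A small sketch of why I see the two pool uses as closely related. It
assumes that a proxy is picked from the pool based on a key; I use
rendezvous (highest-random-weight) hashing purely as an illustration,
which is not necessarily what the cluster framework does. The names
pick_proxy, the proxy names and the keys are hypothetical.

```python
import hashlib

def pick_proxy(pool, key):
    """Pick the pool member with the highest hash weight for `key`.

    Illustrative rendezvous hashing; an assumption, not the actual
    selection logic of the cluster framework.
    """
    def weight(node):
        digest = hashlib.sha256(f"{node}|{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(pool, key=weight)

proxies = ["proxy-1", "proxy-2", "proxy-3"]

# Use 1: the chosen proxy owns the key, i.e. it processes and logs.
owner = pick_proxy(proxies, "10.0.0.1")

# Use 2: the chosen proxy only relays the message to everybody else.
relay = pick_proxy(proxies, "intel-update-42")

print(owner, relay)
```

The selection step is identical in both cases; only what the chosen
proxy does afterwards differs. If the relaying proxy later starts to do
more than relay, a dedicated relay abstraction would sit next to this
one rather than replace it.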
Overall I have to say I found it pretty hard to follow
because we don't have much consistency right now in how scripts
structure their communication. That's not surprising, given that we're
just starting to use all this, but it suggests that we have room for
improvement in our abstractions. :)
I totally agree here! I think it could help to come up with some more
use cases to identify the best abstractions.