I continued my work on the Bro deep cluster over the last months and just
want to share my outcome so far and future plans with you:
1. I want to get your opinion on my broker enhancement that allows
routing messages between peers that are not directly connected (given
that there is a path between them).
2. I want to share some preliminary thoughts on how to extend the
sumstats framework to a deep cluster setting, so that it is possible to
dynamically create (multiple) subgroups of Bros in the deep cluster
that can share and aggregate information.
Criticism, opinions, and further suggestions are very welcome!
*** Summary: deep cluster
A deep cluster provides one administrative interface for several
conventional clusters and/or standalone Bro nodes at once. A deep
cluster eases the monitoring of several links at once and can
interconnect different Bros and different Bro clusters in different
places. Due to its better scalability it can bring monitoring from the
edge of the monitored network into its depth (-> deep cluster).
Moreover, it enables and facilitates information exchange in between
different Bros and different Bro clusters. In essence, a deep cluster
can be seen as a P2P overlay network of Bro nodes, so that all Bros can
communicate with each other.
In summary, my outcome so far towards building such a deep cluster is
* a modified multi-hop broker that allows forwarding content between
peers that are only connected indirectly with each other,
* some Bro modifications in code and, foremost, in Bro script land, and
* an enhanced broctl that operates as a daemon and can initiate
connections to other such daemons (all communication based on broker),
including a JSON-based configuration of nodes and connections.
A summary of all changes can be found on the Bro website (including
instructions on how to run the current version of the deep cluster):
Ongoing work is currently the adaptation of my multi-hop broker to the
upcoming broker revision and the adaptation of sumstats to work in a
deep cluster setting. Both are described in detail in the long (sorry)
remainder of this email.
*** Multi-hop broker
I enhanced broker to support publish/subscribe-based communication
between nodes that are not connected directly, but that are connected by
a path of other nodes. The working title of this is multi-hop broker. As
broker will get a significant revision soon, I want to share my design
for multi-hop broker with you, so that I can incorporate your comments
when adding the multi-hop functionality to the revised broker.
A specific challenge here is the routing of publications to all
interested subscribers. For that, routing tables need to be established
among all nodes in a deep cluster. These routing tables are established
by flooding subscriptions through the deep cluster. Afterwards,
publications can be routed on the shortest paths to all interested
subscribers.
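To make the mechanism concrete, here is a minimal, hypothetical sketch in Python (not the actual broker code or API; all names are made up): flooded subscriptions leave per-topic next-hop entries behind, and publications then follow those entries. It assumes a loop-free topology; loops are the subject of the next section.

```python
from collections import defaultdict

class Node:
    """Toy broker endpoint: floods subscriptions, routes publications."""

    def __init__(self, name):
        self.name = name
        self.peers = []                    # directly connected endpoints
        self.routes = defaultdict(set)     # topic -> next hops toward subscribers
        self.inbox = []                    # locally delivered publications

    def peer_with(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def subscribe(self, topic):
        self.routes[topic].add(self)       # deliver matching publications locally
        for p in self.peers:               # flood the subscription
            p._learn(topic, via=self)

    def _learn(self, topic, via):
        if via in self.routes[topic]:      # already recorded: stop flooding
            return
        self.routes[topic].add(via)        # 'via' is the next hop toward a subscriber
        for p in self.peers:
            if p is not via:
                p._learn(topic, via=self)

    def publish(self, topic, msg, via=None):
        for hop in self.routes[topic]:     # follow the recorded next hops,
            if hop is self:                # but never send back where it came from
                self.inbox.append((topic, msg))
            elif hop is not via:
                hop.publish(topic, msg, via=self)
```

In a chain a - b - c, a publication by a then reaches a subscriber c although the two are not peered directly.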
In that context, two issues arise: loop detection and avoidance, and
limiting the scope of subscriptions for rudimentary access control. Both
issues are described in detail in the following.
*** Loop detection and avoidance
There is no unique identifier (like an IP address) anymore on whose
basis information can be forwarded. A publish operation might have only
one recipient, but it can also have many. This can result in routing
loops, in which messages are forwarded endlessly through the broker
topology that results from the peerings between broker endpoints. Such
loops have to be avoided as they would falsify results, e.g., results
stored in datastores.
There are basically two options here:
1. Loop avoidance: During the setup phase of the deep cluster it needs
to be ensured that the topology does not contain loops.
2. Loop detection: Detect loops and drop duplicate messages. This
requires either storing each forwarded message locally to detect
duplicates or, more lightweight, attaching a TTL value to every broker
message. When the TTL reaches 0, the message gets dropped. However, the
TTL does not prevent duplicates completely.
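The TTL variant can be sketched as follows (illustrative Python, not broker's actual implementation; the default of 32 is the one multi-hop broker uses):

```python
DEFAULT_TTL = 32   # multi-hop broker's default TTL

def forward(msg, errors):
    """Decrement the TTL on each hop; drop the message once it hits 0."""
    if msg["ttl"] <= 1:
        errors.append("routing loop: dropping message for " + msg["topic"])
        return None                        # message is deleted, error reported
    return {**msg, "ttl": msg["ttl"] - 1}  # forward with decremented TTL

# A message caught in a routing loop circles at most DEFAULT_TTL hops:
msg = {"topic": "bro/event", "ttl": DEFAULT_TTL}
errors, hops = [], 0
while msg is not None:
    msg = forward(msg, errors)
    hops += 1
```

Note that, as the text says, the TTL only bounds the damage of a loop; it does not prevent duplicate deliveries on the way.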
For multi-hop broker we chose a hybrid of the two options. Loops in the
broker topology need to be avoided during the initial configuration of
the deep cluster. In addition, a TTL attached to every broker message
allows detecting remaining routing loops and results in an error
output. The TTL value is configurable; its default is 32.
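The configuration-time part could, for instance, check the configured peerings for cycles before bringing the cluster up. A hypothetical union-find sketch (not part of the actual broctl code):

```python
def has_loop(peerings):
    """Return True if the undirected peering edges (a, b) contain a cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:              # walk to the root, compressing the path
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in peerings:
        ra, rb = find(a), find(b)
        if ra == rb:
            return True                    # endpoints already connected: edge closes a loop
        parent[ra] = rb                    # merge the two components
    return False
```

For example, the worker/manager/datanode triangle mentioned below is flagged, while a tree-shaped configuration passes.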
However, certain configurations require a denser interconnection of
nodes. In conventional Bro clusters, all workers are connected to the
manager and the datanode, while the manager is also connected to the
datanode. Obviously, this already represents a loop.
To avoid such routing loops we introduced an additional endpoint flag
``AUTO_ROUTING``. It indicates whether the respective endpoint is
allowed to route message topics on behalf of other nodes.
Multi-hop topics are only stored locally and propagated if this flag is
set. If an auto-routing endpoint is coupled with an ordinary endpoint,
only the auto-routing endpoint will forward messages on behalf of the
other endpoint. As a result, not every node forwards subscriptions
received from others, so loops can be prevented even though the
interconnection of nodes in the deep cluster contains topological loops.
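A hedged sketch of these semantics (only the ``AUTO_ROUTING`` name is from broker; the propagation logic below is illustrative): ordinary endpoints neither store nor re-propagate multi-hop subscriptions, so subscription flooding stops at them even in a meshed topology.

```python
class Endpoint:
    """Toy endpoint illustrating the AUTO_ROUTING flag."""

    def __init__(self, name, auto_routing=False):
        self.name = name
        self.auto_routing = auto_routing
        self.peers = []
        self.remote_topics = set()   # multi-hop topics routed on behalf of others

    def peer_with(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def receive_subscription(self, topic, via):
        if not self.auto_routing:
            return                   # ordinary endpoints never route for others
        if topic in self.remote_topics:
            return                   # already known: stop propagation
        self.remote_topics.add(topic)
        for p in self.peers:
            if p is not via:
                p.receive_subscription(topic, via=self)

    def announce(self, topic):
        """Advertise one of this endpoint's own multi-hop subscriptions."""
        for p in self.peers:
            p.receive_subscription(topic, via=self)
```

Even with a topological loop (worker - manager - datanode - worker), the flooding terminates because only the auto-routing manager propagates.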
*** Rudimentary Access Control
To prevent subscriptions from being disseminated throughout the whole
deep cluster, single-hop (= local) and multi-hop (= global)
subscriptions are introduced. Single-hop subscriptions are shared with
the direct neighbors only and are thus only available within the
one-hop neighborhood. In contrast, multi-hop subscriptions are flooded
through the whole deep cluster. This differentiation into subscriptions
with local (``LOCAL_SCOPE``) and global (``GLOBAL_SCOPE``) scope is
intended to provide better efficiency and is configured as an
additional parameter when creating a broker ``message_queue``. The
default setting is always
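The effect of the two scopes on dissemination might look like this (illustrative sketch; only the ``LOCAL_SCOPE``/``GLOBAL_SCOPE`` names are from broker):

```python
LOCAL_SCOPE, GLOBAL_SCOPE = "local", "global"

def propagate(graph, origin, scope):
    """Return the set of nodes that learn about a subscription.

    graph: adjacency dict {node: [neighbors]} of the deep cluster."""
    if scope == LOCAL_SCOPE:
        return set(graph[origin])            # one-hop neighborhood only
    # GLOBAL_SCOPE: flood through the whole deep cluster
    seen, frontier = {origin}, [origin]
    while frontier:
        n = frontier.pop()
        for nb in graph[n]:
            if nb not in seen:
                seen.add(nb)
                frontier.append(nb)
    return seen - {origin}
```

In a chain a - b - c - d, a local subscription by a stays with b, while a global one reaches every node.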
*** Sumstats in the deep cluster
The intention is to extend sumstats to be usable within a deep cluster
to aggregate results at large scale, but also to form sumstats groups
on the fly, e.g., as a result of detected events.
In the original sumstats, only directly connected nodes in a cluster
setup exchanged messages. By using multi-hop broker, we can extend this
to the complete deep cluster: we can form small groups of nodes that
are not directly connected to each other, but that are instead
connected indirectly by their subscriptions to a group id (e.g.,
To adapt sumstats to the deep cluster, two basic approaches are feasible:
1. Sumstats groups: Instead of a cluster, we apply sumstats to a group
of nodes in the deep cluster. This means that we keep the basic
structure and functioning of the current sumstats and only replace
direct links by multi-hop links via multi-hop broker. However, we need
a coordinator per group (in the original sumstats, the manager took
over this task). This manager will initiate queries and retrieve all
results via the routing mechanisms of multi-hop broker. There will be
no processing or aggregation of information directly in the deep
cluster; only nodes in the group, and foremost the manager, will be
able to process and aggregate information. The deep cluster will only
provide a routing service between all members of the group.
2. Sumstats and deep cluster become one: We integrate data forwarding
and data storage with each other. The deep cluster is used to aggregate
and process results in a completely distributed manner while forwarding
data to its destination. This means that all members of a sumstats
group get interconnected by the deep cluster (and thus multi-hop
broker) as in option 1, but information is now additionally processed
and aggregated while it is forwarded towards the manager, by nodes of
the deep cluster that are not part of the sumstats group. That is
definitely the most challenging option, but in the long term probably
the most valuable one.
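The structural difference between the two options can be illustrated with a toy aggregation (summed counters stand in for the real sumstats logic; all names are made up):

```python
def group_query(group, key):
    """Option 1: the deep cluster only routes; the group's coordinator
    pulls every member's partial result and aggregates it itself."""
    return sum(table.get(key, 0) for table in group.values())

def in_network_query(path_tables, key):
    """Option 2 flavor: each node on the forwarding path folds its own
    partial result into the message as it travels toward the manager,
    so aggregation happens inside the deep cluster itself."""
    total = 0
    for table in path_tables:          # one hop after another
        total += table.get(key, 0)
    return total
```

Both compute the same result; the point is where the aggregation work happens: at the coordinator (option 1) versus distributed along the forwarding path (option 2).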
I am currently working on option 1, as it is the straightforward option
and a necessary intermediate step to get to option 2. I would be
especially grateful for additional input / alternate views here.