If anyone has time or interest: I think the main components of Broker are now established and deserve feedback and critique. Rather than detail how things work here, it's probably best for people to work things out from the repo (e.g. source code comments and unit test examples) and ask about whatever is unclear.
It would also help to start a discussion on some of the planned features and open questions. Below is my TODO list, pasted as-is; I hope it's readable. The items are roughly ordered from most certain to least certain. Feedback is welcome generally, but particularly where questions are posed.
Broker TODO
===========
- C API
- Python bindings
- Persistent storage backend
- SSL/IPv6 (dependent on actor-framework support)
- Need/want overload or flow-control mechanisms?
E.g. a simple policy for handling overload is to let a user specify
a threshold for how many items are allowed in a queue before new
messages are dropped.
- In-place data store value modifications
Plan to support increment/decrement on integral values.
Need any other operations?
What to do when applying an operation to invalid data type?
Plan to just send error message back to sender and leave further
decisions up to them.
- Data store support for optional expiry model
What are the desired mechanisms? Options:
(1) Inserter may specify "expire this entry at time X" ?
(2) Inserter may specify "expire this entry based on
create/read/modification access time" ? Going this route,
seems read access times would need to be shared across
clones (contradicts goal of lightweight, local read-access)?
(3) Other hooks to make expiry conditional?
- Data typing model
Currently data in Broker is similar to Bro's threading::Value in
that full type info (from Bro's perspective) isn't available, just
the raw storage required for the types. Broker currently differs in
that it doesn't use any tag to distinguish between primitives that
share the same storage (e.g. enum/string or double/interval).
Interpretation of types is left entirely up to the receiver.
Do we need more strict typing than this? Options:
(1) Data holds additional type tag to suggest how to interpret
(2) Fully implement separate Bro-types.
Planning to try integrating w/ Bro as it is and see what specific
problems arise. I think (1) may end up being helpful, but maybe not
required and I'd like to avoid (2) if possible.
- Bro integration
Is Broker the default in Bro 2.4? That implies requiring C++11.
Broker also requires CMake 2.8.12+, and it may be hard to go below
2.8, whereas Bro is still happy with 2.6.3.
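
As a rough illustration of two items above (the drop-on-overload threshold, and in-place increments that report an error on a type mismatch), here is a minimal Python sketch. It is not Broker code: `BoundedQueue`, `Store`, `max_items`, and the `(ok, value_or_error)` return convention are hypothetical names chosen for illustration, and expiry is modeled only as option (1), an absolute expiry time.

```python
import time
from collections import deque


class BoundedQueue:
    """Hypothetical overload policy: drop new messages past a threshold."""

    def __init__(self, max_items):
        self.max_items = max_items
        self.items = deque()
        self.dropped = 0

    def push(self, msg):
        # Refuse (and count) the message once the queue is full.
        if len(self.items) >= self.max_items:
            self.dropped += 1
            return False
        self.items.append(msg)
        return True


class Store:
    """Hypothetical key/value store with increment and absolute expiry."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}  # key -> (value, expire_at or None)

    def insert(self, key, value, expire_at=None):
        self.data[key] = (value, expire_at)

    def lookup(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expire_at = entry
        # Option (1): the inserter chose an absolute expiry time.
        if expire_at is not None and self.clock() >= expire_at:
            del self.data[key]
            return None
        return value

    def increment(self, key, by=1):
        value = self.lookup(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            # Report the problem to the caller and leave the store untouched.
            return (False, "increment on missing or non-integral value")
        _, expire_at = self.data[key]
        self.data[key] = (value + by, expire_at)
        return (True, value + by)
```

A decrement is just `increment(key, by=-1)`. Expiry options (2) and (3) are deliberately left out, since they are still open questions.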
The MySQL analyzer is ready to go, apart from one issue: a memleak btest
that I wrote is failing on some of Bro's regex code.
> # @TEST-EXEC: HEAP_CHECK_DUMP_DIRECTORY=. HEAPCHECK=local btest-bg-run bro bro -b -m -r $TRACES/mysql/mysql.trace %INPUT
>
> @load base/protocols/mysql
This results in:
> Leak check net_run detected leaks of 203 bytes in 4 objects
>
> The 4 largest leaks:
> Leak of 72 bytes in 1 objects allocated from:
> @ 53e92d
>
> Leak of 56 bytes in 1 objects allocated from:
> @ 52fb66
> @ 83f663
>
> Leak of 56 bytes in 1 objects allocated from:
> @ 52fd52
> @ 20014
>
> Leak of 19 bytes in 1 objects allocated from:
> @ 53e9b6
Digging a little deeper shows that two of these leaks are in RE_parse
(re-parse.y:110 and re-parse.y:133), one is in re__scan_buffer
(re-scan.cc:2035), and one is in re__scan_bytes (re-scan.cc:2084).
The only regular expression that I have in the analyzer is:
type NUL_String = RE/[^\0]*/;
I'm pretty sure that this isn't really an issue, but can anyone help with
figuring out how to get the btest to pass? I'd really like to have a
memleak test for this.
Thanks,
--Vlad
[ https://bro-tracker.atlassian.net/browse/BIT-924?page=com.atlassian.jira.pl… ]
Robin Sommer commented on BIT-924:
----------------------------------
Yeah, that sounds good. We could also add a command line flag that
statically checks for use of deprecated functionality so that people
don't only see it during runtime.
A slightly different approach would be something like CMake's
policies, where one can set (and eventually switch) a default value.
But I've always found that confusing because it's not just the CMake
version that determines what's acceptable, but also the policy setting.
> String BIFs Return 1-indexed string_arrays
> ------------------------------------------
>
> Key: BIT-924
> URL: https://bro-tracker.atlassian.net/browse/BIT-924
> Project: Bro Issue Tracker
> Issue Type: Problem
> Components: Bro
> Affects Versions: git/master
> Reporter: grigorescu
> Fix For: 2.4
>
>
> The following BIFs return 1-indexed string_arrays:
> * sort_string_array
> * split
> * split1
> * split_all
> * split_n
--
This message was sent by Atlassian JIRA
(v6.4-OD-07-004#64005)
[ https://bro-tracker.atlassian.net/browse/BIT-924?page=com.atlassian.jira.pl… ]
Jon Siwek commented on BIT-924:
-------------------------------
{quote}
I think more generally, we want a good way to be able to make breaking changes to BIFs and the base scripts, which can be thought of as an API.
{quote}
Agreed.
{quote}
With this kind of notification, I could see a staged approach to "fix" these string functions (using split as an example, obviously extended to all the functions above):
Create split_0 and deprecate split
Rename split_0 to split, create a deprecated split_1 that implements the old functionality
Remove split_1
{quote}
Is it reasonable to expect that a user never skips a release version? If they did, they could miss a deprecation warning and silently pick up the breaking change. But yeah, maybe the best that can be done here is to clearly indicate in NEWS which breaking changes users need to audit their code for, and also to provide helpful hints in the form of deprecated-usage warnings for a limited time.
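
To make the staged approach concrete, here is a small Python sketch of the first stage: the new 0-indexed function exists, and the old 1-indexed one still works but warns. This is only an analogy, since Bro BIFs are not written in Python; the dict used as a stand-in for a string_array and the warning text are assumptions for illustration.

```python
import re
import warnings


def split_0(s, pattern):
    """New behavior: pieces indexed from 0."""
    return dict(enumerate(re.split(pattern, s)))


def split(s, pattern):
    """Old behavior (pieces indexed from 1), kept for one release cycle."""
    warnings.warn(
        "split is deprecated; use split_0 (0-indexed) instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return {i + 1: piece for i, piece in split_0(s, pattern).items()}
```

A user who skips the release in which `split` still warns goes straight from the old behavior to the renamed one without ever seeing the message, which is the concern raised above.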
[ https://bro-tracker.atlassian.net/browse/BIT-1283?page=com.atlassian.jira.p… ]
Robin Sommer commented on BIT-1283:
-----------------------------------
Bro 1.5 came with a tool, bdcat, that decrypts these files. I'm reopening the ticket to see if we want to bring that back.
> Bro crashes when using &encrypt
> -------------------------------
>
> Key: BIT-1283
> URL: https://bro-tracker.atlassian.net/browse/BIT-1283
> Project: Bro Issue Tracker
> Issue Type: Problem
> Components: Bro
> Environment: bro version 2.3-263-debug
> Reporter: AK
> Fix For: 2.4
>
>
> Bro crashes when applying the &encrypt attribute when opening a file.
> bro -Ci eth0 -e 'global f1: file = open("f.out") &encrypt;'
[ https://bro-tracker.atlassian.net/browse/BIT-1280?page=com.atlassian.jira.p… ]
Robin Sommer commented on BIT-1280:
-----------------------------------
Merging.
Generally, I would actually prefer vector elements to be automatically initialized with null values corresponding to the vector's type; then "in" would always return true. However, we don't have the concept of type-specific null/default values.
> Checking index in vectors is broken
> -----------------------------------
>
> Key: BIT-1280
> URL: https://bro-tracker.atlassian.net/browse/BIT-1280
> Project: Bro Issue Tracker
> Issue Type: Problem
> Components: Bro
> Affects Versions: 2.4
> Reporter: Seth Hall
> Assignee: Robin Sommer
> Fix For: 2.4
>
>
> If you try to check an index in a vector for existence, you get an error...
> {noformat}
> event bro_init()
>     {
>     local vec: vector of count = vector();
>     if ( 2 in vec )
>         print vec[2];
>     print vec;
>     }
> {noformat}
> Error:
> {quote}
> fatal error in bool: BroType::AsVectorType (bool/vector) (bool)
> {quote}
[ https://bro-tracker.atlassian.net/browse/BIT-1280?page=com.atlassian.jira.p… ]
Robin Sommer reassigned BIT-1280:
---------------------------------
Assignee: Robin Sommer
[ https://bro-tracker.atlassian.net/browse/BIT-1285?page=com.atlassian.jira.p… ]
grigorescu updated BIT-1285:
----------------------------
Status: Merge Request (was: Open)
> MySQL Protocol Analyzer
> -----------------------
>
> Key: BIT-1285
> URL: https://bro-tracker.atlassian.net/browse/BIT-1285
> Project: Bro Issue Tracker
> Issue Type: New Feature
> Components: Bro
> Affects Versions: git/master
> Reporter: grigorescu
>
> topic/vladg/mysql is ready to be merged.
> Note: memleak btest core.leaks.mysql is currently failing due to an issue with how regexes are initialized.
grigorescu created BIT-1285:
-------------------------------
Summary: MySQL Protocol Analyzer
Key: BIT-1285
URL: https://bro-tracker.atlassian.net/browse/BIT-1285
Project: Bro Issue Tracker
Issue Type: New Feature
Components: Bro
Affects Versions: git/master
Reporter: grigorescu
topic/vladg/mysql is ready to be merged.
Note: memleak btest core.leaks.mysql is currently failing due to an issue with how regexes are initialized.