Recently I have been having some problems with Bro and PF_RING in a cluster.
On my server, when I have fewer than 32 worker threads (rings), everything
is okay, but when I use more than 32 worker threads, PF_RING starts to
receive duplicate packets. For example, with fewer than 32 rings, if I send
400,000 packets to the server, the PF_RING info in /proc shows 400,000
packets in the rings; but with more than 32 rings, I get 800,000 packets
with 33 rings, 1,200,000 packets with 34 rings, and so on.
I wonder whether there is some rule that a PF_RING or Bro cluster can only
support up to 32 rings or worker threads on a server, or some other limit.
Any insight would be helpful.
I am working on a Zeek script and would like to understand how I can make
Zeek look at only the first ten packets of a TCP session. The first ten
packets are enough to fingerprint the traffic I am trying to identify, so I
want to ensure my script looks at only the first ten packets to save
processing time.
The communication is as follows: there is the initial three-way handshake,
and then there are seven packets with variable lengths on a non-default
destination port/service. So I had to use the tcp_packet event in my
script. Is there a better way of doing it? Using tcp_packet makes my script
check every TCP packet, increasing the load on my Zeek system.
Please let me know if you have any suggestions. Looking forward to your
response.
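The tcp_packet approach I have in mind looks roughly like this (an untested
sketch; the fingerprinting logic itself is a placeholder):

```zeek
# Untested sketch: count packets per connection and ignore everything
# after the tenth. The table key is the connection uid; entries expire
# once a connection has been quiet for a while.
global pkt_count: table[string] of count &default=0 &read_expire 5 mins;

event tcp_packet(c: connection, is_orig: bool, flags: string, seq: count,
                 ack: count, len: count, payload: string)
	{
	if ( pkt_count[c$uid] >= 10 )
		return;

	++pkt_count[c$uid];

	# ... fingerprinting logic over the first ten packets goes here ...
	}
```

This still costs a table lookup per TCP packet, though. If I understand the
BIF correctly, calling skip_further_processing(c$id) once the fingerprint
is decided should shed the remaining load for that connection entirely.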
I was hoping to understand how Zeek aggregates packets by connection. Is there any documentation that summarizes the approach? Is there a way to extract all the packets that correspond to a particular connection?
Ananditha Raghunath - 0557
Cyber Operations and Analysis Technology
MIT Lincoln Laboratory
ananditha.raghunath(a)ll.mit.edu | 781-981-9035
We have been seeing some crash reports on some of our nodes, regarding a tcmalloc error. I was wondering if anyone else has seen this before and if anyone has any suggestions on what the cause might be. We are running Zeek 2.6. Here is an example stderr.log output from one of these crashes:
Myricom: Local timesource
listening on p2p2
tcmalloc: large alloc 1329594368 bytes == 0xc701c000 @ 0x7f72a12604ef 0x7f72a1280d56 0x9623cf 0x9623ff 0x8d8c90 0x8d1b79 0x928352 0x92895f 0x928a71 0x9242bd 0x7b5908 0x7ff59f 0x7b535d 0x7b555f 0x7b3a98 0x8c422e 0x8c3a70 0x95d49e 0x95dc16 0x8c33cc 0x8c36f9 0x8c323f 0x8c18be 0x8bef32 0x95d352 0x5c61dd 0x676f75 0x677f1c 0x648a0f 0x914669 0x648ec5
tcmalloc: large alloc 1661992960 bytes == 0x11641c000 @ 0x7f72a12604ef 0x7f72a1280dad 0x9623cf 0x9623ff 0x8d8c90 0x8d1b79 0x928352 0x92895f 0x928a71 0x9242bd 0x7b5908 0x7ff59f 0x7b535d 0x7b555f 0x7b3a98 0x8c422e 0x8c3a70 0x95d49e 0x95dc16 0x8c33cc 0x8c36f9 0x8c323f 0x8c18be 0x8bef32 0x95d352 0x5c61dd 0x676f75 0x677f1c 0x648a0f 0x914669 0x648ec5
/usr/local/bro/share/broctl/scripts/run-bro: line 110: 138751 Killed nohup "$mybro" "$@"
Lead Security Analyst
Security and Network Monitoring
Oregon Research & Teaching Security Operations Center (ORTSOC)
GPG Fingerprint: ECC5 03A6 7E91 17C6 50C6 8FAC D6A0 8001 2869 BD52
Looking to see if anyone has created a script, or if there is an argument,
to process multiple PCAPs using bro -r.
I have it set up to output JSON and to convert epoch time to normal
date/time, but that handles one capture at a time, and I have multiple old
PCAP files to get re-ingested.
I am looking at either a batch script or maybe Python, but wanted to see if
anyone has done this before.
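In case it helps frame the question, the loop I have in mind is roughly
this (a sketch; the directory layout and the `LogAscii::use_json=T`
command-line redef are my assumptions):

```shell
#!/bin/sh
# Sketch: run bro -r on every .pcap in a directory, giving each capture
# its own output directory so the logs do not collide.
# Assumes "bro" is on PATH; override with BRO=... (e.g. for testing).
run_pcaps() {
    in_dir=$1   # directory containing .pcap files
    out_dir=$2  # root directory for per-capture logs
    bro_bin="${BRO:-bro}"

    for p in "$in_dir"/*.pcap; do
        [ -e "$p" ] || continue   # glob matched nothing
        name=$(basename "$p" .pcap)
        abs="$(cd "$(dirname "$p")" && pwd)/$(basename "$p")"
        mkdir -p "$out_dir/$name"
        # LogAscii::use_json=T switches the ASCII logs to JSON (Bro 2.6).
        ( cd "$out_dir/$name" && "$bro_bin" -r "$abs" LogAscii::use_json=T )
    done
}

# Example: run_pcaps ./old_pcaps ./logs
```

Converting the epoch timestamps afterwards would still be a separate step.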
We're going to be rolling out a newsletter.
Do you have any Zeek-related news you'd like me to consider adding?
Do you know of any Zeek related jobs?
If you have any topics you'd like to suggest, please let me know.
I look forward to hearing from you!
Director of Community
* Ask me about how you can participate in the Zeek (formerly Bro) community
* Remember - ZEEK AND YOU SHALL FIND!!
My understanding is that 4,000+ CPU cores would be necessary to support
this throughput. At the recent CERN meeting, I recall someone describing
200 Gbps, which would imply 8,000+ CPU cores. Is this accurate, or am I
doing the conversion incorrectly?
I am basing this purely on this quote, from
“The rule of thumb we have followed recently is to allocate approximately 1
core for every 250Mbps of traffic that is being analyzed. However, this
estimate could be extremely traffic mix-specific. It has generally worked
for mixed traffic with many users and servers. For example, if your traffic
peaks around 2Gbps (combined) and you want to handle traffic at peak load,
you may want to have 8 cores available (2048 / 250 == 8.2). ”
I want to extract files and have their names include their md5 hash.
The problem is that the MD5 hashing happens in the file_hash event, while
file extraction is set up in earlier events such as file_new.
Any ideas on how to accomplish this?
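The closest I have gotten is the untested sketch below: extract under a
temporary name keyed by the file id, then rename once the hash arrives. The
use of FileExtract::prefix for the on-disk path, the rename() BIF, and the
event ordering are all my assumptions:

```zeek
@load base/files/extract
@load base/files/hash

# Untested sketch: extract to a temporary name based on the file id,
# then rename to the MD5 once file_hash fires at end-of-file.
event file_new(f: fa_file)
	{
	Files::add_analyzer(f, Files::ANALYZER_MD5);
	Files::add_analyzer(f, Files::ANALYZER_EXTRACT,
	                    [$extract_filename=fmt("tmp-%s", f$id)]);
	}

event file_hash(f: fa_file, kind: string, hash: string)
	{
	if ( kind != "md5" || ! f?$info || ! f$info?$extracted )
		return;

	# Assumption: both the MD5 and extract analyzers finish at
	# end-of-file, so the extracted file is complete by now.
	rename(fmt("%s/%s", FileExtract::prefix, f$info$extracted),
	       fmt("%s/%s", FileExtract::prefix, hash));
	}
```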
I'd like to ask that there be some thought given to the deprecation and
eventual removal of the &persistent option in favor of Broker data
stores. IMHO, there are use cases where the &persistent attribute is
much more attractive and lower overhead than the data store approach.
As you are likely aware, &persistent is now marked deprecated and we
expect it to disappear in the next version or two. The recommendation
for replacement is the much more robust, SQLite backed, Broker data store.
The data store solution is very elegant, though it seems to require more
fiddling than it ought to in order to get a data store set up. In the long
term and when dealing with large amounts of data that must be persistent
and synchronized across nodes, this really is a wonderful solution.
That said, there seem to me to be some use cases where that is a massive
hammer to swing at some very small problems. For example, we have one
analysis script that is tracking successful external DNS resolutions.
Specifically, it is keeping track of all IPv4 and IPv6 addresses
resolved in the last 7 days (&read_expire 7 days) in a set. For all
outbound connection attempts, this script generates a notice when the
connection involves an external host that never appeared in a DNS answer
record. This is quite handy when it comes to locating unauthorized
outbound scanning, some C2 behaviors that do not rely on DNS/fast flux
sorts of things, fragile configurations of enterprise services, etc.
This has been performing quite well for several years now on more than
one relatively decent-sized network (100,000+ hosts).
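In rough outline, the script looks like this (simplified; the notice
details are elided, and Site::is_local_addr stands in for our actual
locality check):

```zeek
# Simplified outline: a single persistent set of addresses seen in
# DNS answer records within the last 7 days.
global resolved_hosts: set[addr] &read_expire 7 days &persistent;

event dns_A_reply(c: connection, msg: dns_msg, ans: dns_answer, a: addr)
	{
	add resolved_hosts[a];
	}

event dns_AAAA_reply(c: connection, msg: dns_msg, ans: dns_answer, a: addr)
	{
	add resolved_hosts[a];
	}

event new_connection(c: connection)
	{
	if ( Site::is_local_addr(c$id$orig_h) &&
	     ! Site::is_local_addr(c$id$resp_h) &&
	     c$id$resp_h !in resolved_hosts )
		{
		# NOTICE(...) about an outbound connection to a host that
		# never appeared in a DNS answer record.
		}
	}
```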
For this problem (and others I can imagine that would take a similar
tack, i.e., storing only a set, vector, or other single primitive rather
than a massive record in a table or a table of tables), &persistent is
perfectly "sized."
Am I alone in thinking that this feature should be retained *alongside*
Broker data stores, and potentially documented as the recommended approach
for simple primitive data persistence?
Chief of Operations
Enclave Forensics, Inc.
We are setting up a Zeek cluster consisting of a manager/logger and five
sensors. Each node uses the same hardware:
- 2.4 GHz AMD Epyc 7351P (16 cores, 32 threads)
- 256 GB DDR3 ECC RAM
- Intel X520-T2 10 Gbps to Arista with 0.5m DAC
- Arista 7150S hashing on 5-tuple
- Gigamon sends to Arista via 4x10 Gbps
- Zeek v2.6-167 with AF_Packet
- 16 workers per sensor (total: 5x16=80 workers)
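For reference, the per-sensor worker definition looks roughly like this
(the interface name and fanout id are placeholders, and the option names
are from my reading of the af_packet plugin's documentation):

```
[worker-1]
type=worker
host=sensor1
interface=af_packet::ens1f0
lb_method=custom
lb_procs=16
af_packet_fanout_id=23
af_packet_fanout_mode=AF_Packet::FANOUT_HASH
```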
The capture loss was 50-70% until I remembered to turn off offloading. Now
it averages about 0.8%, except that often 0-4 cores in a one-hour summary
spike at 60-70% capture loss. There doesn't appear to be a pattern in which
core suffers the high loss. Searches for how to identify and fix the cause
of such large losses have failed to yield any suggestions for debugging the
problem. Suggestions?