Questions about services generated by Zeek

Good morning. I have some questions about how to handle the service information that Zeek produces.

I am working with the following event: Known::log_known_services

Thanks to this event I can build a table that stores the services found for each host, which gives me an inventory with one unique row per host.

My problem is that this event does not contain the MAC address of the device, because it does not carry all the connection data.

This matters because if one or more DHCP servers are configured on the network Zeek is monitoring, the same IP address could belong to different computers over time, and their information would be mixed together.

I have come up with a solution: in my custom script, before the known-services event fires, I could handle a lower-level event such as new_packet to build a table whose key is the IP address and whose value is the MAC address.

That way, when the known-services event runs, I can look up the IP in that table to obtain the MAC, and then insert the data into the final table keyed by MAC.

As I understand it, the low-level event that fills the table should not fall out of sync with the service event, since both would be running at the same time.

What do you think about this solution?

Should I use another low level event to detect the basic connection data (ip and mac)?

Thanks in advance!

What do you think about this solution?

Glad you figured that out!

Should I use another low level event to detect the basic connection data (ip and mac)?

Use event new_connection(c: connection) rather than new_packet(), as the latter has a very high performance impact. The fields you want are c$orig$l2_addr and c$resp$l2_addr.
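A minimal sketch of the lookup-table approach using new_connection, assuming a single standalone Zeek process. The table name ip_to_mac and the one-day expiry are hypothetical choices, not anything Zeek provides. In a cluster, the table would live separately on each worker and might not be populated on the node where the log event is raised.

```zeek
@load protocols/conn/known-services

# ip_to_mac is a hypothetical name. Entries expire after a day (an
# arbitrary interval) so addresses from old DHCP leases do not linger.
global ip_to_mac: table[addr] of string &create_expire=1day;

event new_connection(c: connection)
        {
        # l2_addr is optional on the endpoint record, so test it first.
        if ( c$orig?$l2_addr )
                ip_to_mac[c$id$orig_h] = c$orig$l2_addr;

        if ( c$resp?$l2_addr )
                ip_to_mac[c$id$resp_h] = c$resp$l2_addr;
        }

event Known::log_known_services(rec: Known::ServicesInfo)
        {
        # Look up the MAC recorded for this host, if any.
        if ( rec$host in ip_to_mac )
                print fmt("%s %s %s", rec$host, rec$port_num, ip_to_mac[rec$host]);
        }
```

Note that for hosts behind a router, the link-layer address Zeek sees is the router's, so several IPs can map to the same MAC.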

That said, the known-services script isn’t really set up to let you do this more efficiently, namely by extending the ServicesInfo record with custom fields based on connection data. I wonder if the following patch would be interesting to more people:

diff --git a/scripts/policy/protocols/conn/known-services.zeek b/scripts/policy/protocols/conn/known-services.zeek
index e868868f7..0973c581c 100644
--- a/scripts/policy/protocols/conn/known-services.zeek
+++ b/scripts/policy/protocols/conn/known-services.zeek
@@ -81,6 +81,8 @@ export {
        ## Event that can be handled to access the :zeek:type:`Known::ServicesInfo`
        ## record as it is sent on to the logging framework.
        global log_known_services: event(rec: ServicesInfo);
+
+       global service_policy: hook(c: connection, rec: ServicesInfo);
 }
 
 redef record connection += {
@@ -281,6 +283,10 @@ function known_services_done(c: connection)
                                  $port_proto = get_port_transport_proto(id$resp_p),
                                  $service = tempservs);
 
+       # Allow extending or early filtering.
+       if ( ! hook Known::service_policy(c, info) )
+               return;
+
        # If no protocol was detected, wait a short time before attempting to log
        # in case a protocol is detected on another connection.
        if ( |c$service| == 0 )

Then you could implement the hook and attach data directly to the ServicesInfo record, since you might also want to include the MAC in the log.
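The "early filtering" half of the patch could be used roughly as follows. This is only a sketch that assumes the patch above has been applied, since Known::service_policy does not exist otherwise. The choice of port is a hypothetical example.

```zeek
@load protocols/conn/known-services

hook Known::service_policy(c: connection, info: Known::ServicesInfo)
        {
        # Hypothetical filter: do not track services on 5353/udp.
        # A break makes the hook invocation evaluate to false, so
        # known_services_done() returns before anything is logged.
        if ( info$port_num == 5353/udp )
                break;
        }
```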

Good afternoon, thank you very much for your attention!

I am just starting to program in Zeek and still have some doubts. From what I understand, even with the modification you offered I still can’t get the MAC address in the Known::log_known_services event. Or am I wrong?

I would also be grateful if you could tell me how to get these fields into the log.

Thank you very much in advance!

The patch shown would not log the MAC directly, but it would allow a script like the following:

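@load protocols/conn/known-services

# Add an optional, logged column for the responder's MAC address.
redef record Known::ServicesInfo += {
        l2_addr: string &log &optional;
};

hook Known::service_policy(c: connection, info: Known::ServicesInfo)
        {
        # l2_addr is optional on the endpoint record, so check it before reading.
        if ( c$resp?$l2_addr )
                info$l2_addr = c$resp$l2_addr;
        }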

And then the log looks like:

#fields ts      host    port_num        port_proto      service l2_addr
#types  time    addr    port    enum    set[string]     string
1300475168.784020       208.80.152.118  80      tcp     HTTP    00:13:7f:be:8c:ff
1300475168.916018       208.80.152.3    80      tcp     HTTP    00:13:7f:be:8c:ff

The patch to Zeek is a bit more complex, actually. Would that work for you?

diff --git a/scripts/policy/protocols/conn/known-services.zeek b/scripts/policy/protocols/conn/known-services.zeek
index e868868f7..40128bc5d 100644
--- a/scripts/policy/protocols/conn/known-services.zeek
+++ b/scripts/policy/protocols/conn/known-services.zeek
@@ -81,6 +81,8 @@ export {
        ## Event that can be handled to access the :zeek:type:`Known::ServicesInfo`
        ## record as it is sent on to the logging framework.
        global log_known_services: event(rec: ServicesInfo);
+
+       global service_policy: hook(c: connection, rec: ServicesInfo);
 }
 
 redef record connection += {
@@ -156,11 +158,7 @@ event known_service_add(info: ServicesInfo)
                Known::services[info$host, info$port_num] = set();
 
         # service to log can be a subset of info$service if some were already seen
-       local info_to_log: ServicesInfo;
-       info_to_log$ts = info$ts;
-       info_to_log$host = info$host;
-       info_to_log$port_num = info$port_num;
-       info_to_log$port_proto = info$port_proto;
+       local info_to_log = copy(info);
        info_to_log$service = set();
 
        for ( s in info$service )
@@ -281,6 +279,9 @@ function known_services_done(c: connection)
                                  $port_proto = get_port_transport_proto(id$resp_p),
                                  $service = tempservs);
 
+       if ( ! hook Known::service_policy(c, info) )
+               return;
+
        # If no protocol was detected, wait a short time before attempting to log
        # in case a protocol is detected on another connection.

Thank you very much, it works perfectly. I applied the change to the known-services.zeek script, added the other part to local.zeek, then loaded everything with zeekctl deploy, and it all worked correctly.

I would like to explain what I have understood, to check whether it is correct:

  • In the script known-services.zeek:

    EVENT THAT WRITES THE KNOWN SERVICE TO THE FINAL TABLE (event known_service_add):

    1- We add the hook as a global so that it is exported and can be used elsewhere (local.zeek).
    2- We comment out all the lines that assign the fields of the info_to_log record.
    3- We set the info_to_log variable to the result of passing info through the copy function.

    FUNCTION THAT RUNS JUST BEFORE THE LOG IS WRITTEN (function known_services_done):

    1- if ( ! hook Known::service_policy(c, info) ) → checks whether a policy hook is defined for the service and its associated information (c and info in this context). If no policy hook is defined, the expression ! hook Known::service_policy(c, info) will evaluate to true, meaning that the policy hook is either not enabled or not defined.
    If no policy hook is defined, the known_services_done() function returns immediately, so nothing further is done for that service and its associated information, and the known services log is not written.

    QUESTIONS:

    1- Have I defined correctly what exactly the patch does?
    2- For the change to the known_service_add event, I do not understand why point 2 is needed, where the assignments to the record fields are commented out. Can we not keep them and simply add another line that sets the MAC?
    3- For the change to the known_service_add event, I do not understand where the copy function is defined, nor exactly what it does.

Again thank you for all your help!

Not quite, actually.

The known-services script uses a table/set to detect duplicate services. That table is keyed on the host IP address (and port), not the MAC. It will not work correctly if you have different hosts with the same IP but different MAC addresses. Though… many other places will likely not work correctly in that case either, so the extension to add custom fields to known_services.log still seems useful.

If no policy hook is defined, the ! hook expression Known::service_policy(c, info) will evaluate to true,

…will evaluate to false. Invoking hook x() with no handlers attached to x evaluates to true, so ! hook x() is false and the function does not return early. Only a break in a hook handler causes the invocation to return false.
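A small standalone sketch of those semantics. The hooks no_handlers and with_break are hypothetical names made up for illustration and are not part of Zeek or the known-services script.

```zeek
# Hypothetical hooks, declared only for this demonstration.
global no_handlers: hook(n: count);
global with_break: hook(n: count);

# The only handler; it breaks for values above 10.
hook with_break(n: count)
        {
        if ( n > 10 )
                break;
        }

event zeek_init()
        {
        # No handler is attached, so the invocation evaluates to true.
        if ( hook no_handlers(1) )
                print "no_handlers: true";

        # The handler runs to completion without a break: true.
        if ( hook with_break(1) )
                print "with_break(1): true";

        # The handler executes break: the invocation evaluates to false.
        if ( ! hook with_break(42) )
                print "with_break(42): false";
        }
```

Applied to the patch, this means known_services_done() only returns early when some handler of Known::service_policy executes a break.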

Read more about hooks in the documentation.

, we can not use them and simply add another line in which we specify the MAC?

The known-services script should not be aware of the MAC address. It’s added by the user in an external script. I switched to copy() to ensure all fields (including custom ones added by a user) stay intact.

I do not understand where the copy function is defined, nor what does it do at 100%?

copy() is very lightly documented. It is built into the language and returns a deep copy of its argument.
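A standalone sketch of why copy() suits the patch. The record type Info and its values are hypothetical stand-ins for Known::ServicesInfo. The point is that a field added later through redef is carried along without the copying code having to know about it.

```zeek
# Hypothetical record standing in for Known::ServicesInfo.
type Info: record {
        host: addr;
        service: set[string];
};

# A user-added field, as an external script might add it.
redef record Info += {
        l2_addr: string &optional;
};

event zeek_init()
        {
        local a = Info($host=10.0.0.1, $service=set("HTTP"),
                       $l2_addr="00:11:22:33:44:55");

        # Deep copy: b is independent of a, including nested containers.
        local b = copy(a);
        b$service = set();

        # a keeps its one service, while b now has none.
        print |a$service|, |b$service|;

        # The user-added field came along with the copy.
        if ( b?$l2_addr )
                print b$l2_addr;
        }
```

Copying field by field, as the original code did, would silently drop l2_addr because the script has no way of knowing which fields users have added.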