Sawmill Analytics

Analyse und Reporting
für Web | Netzwerk | Sicherheit

Zugriffs- und Datenanalyse von Server-Logs (Proxy, Mailserver, Firewall, Webserver) und Überwachung der Sicherheit & Performance, Schwachstellenanalyse.

Sawmill Analytics 8 | Loganalyse


Showing Usernames, Instead of IP Addresses, in Reports

When using Sawmill to generate reports from a forward proxy server, you will often want to know what person was responsible for a particular bit of traffic. Almost all proxy servers log the IP address of the internal computer, but some do not log the username, or actual name, of the person using that IP address, at the time of the access. This article discusses ways to determine the person responsible for the traffic.

Method 1: Use a proxy server which logs the username

Ideally, the proxy server should simply log the username on each line, showing which user was responsible for the access. Technically, that can be accomplished by having the proxy server query a local LDAP or Open Directory server, to determine the username associated with the IP address at the time of the request. This is the optimal solution, because the proxy server is in the ideal position to do the query. All other solutions must be done after the fact, when the IP address may no longer correspond to the same username in the authentication server. So if your proxy server logs the username, you're done--Sawmill will report it. If your proxy server is not currently logging the username, see if it can be configured to compute and log the username at the time of the event. If it cannot, contact the proxy server vendor to see if such a function can be enabled, or can be added.

Method 2: Use a CFG file to map IP addresses to usernames

In the real world, your proxy server may not log usernames, and you may not have any way to make it log usernames. A simple alternative, though not usually as accurate, is to give Sawmill a CFG file which includes a list of all IP addresses, and the usernames they map to (see the December 2006 newsletter for a discussion of creating and using CFG files). Once you have the CFG file, Sawmill will be able to tag every line of log data with the username (based on the IP address in that line), and you'll get a Usernames report and field, just as though the device had logged the username. This is perfect if your IP-to-username mappings never change, and can be automated using a simple script to query the LDAP server and generate the CFG file.

Method 3: Use a timestamped CFG file to map IP addresses to usernames based on time

Method 2 works if IP-to-username mappings never change, but in many environments they do. You might have a DHCP environment where multiple users share the same IP addresses; or you might have shared workstations where one person might be sitting at a particular workstation one day, and another person might be sitting there another day. A simple IP-to-username map cannot capture this complexity, and you would need to choose which single username corresponds to each IP. If the environment is largely stable, that could still be fairly effective, but it won't work well for environments where most of the IPs are dynamic.

If the IPs are highly dynamic, you can use a CFG file which contains timestamp information for each IP-to-username mapping. The CFG file could include information about which time range a particular mapping was valid, and the log filter which converts IPs to usernames could use the timestamp of the log line to choose which username to use. In the simplest form, the CFG file could just have multiple dumps of the authentication server, each one under a timestamped subnode in the CFG file, and the log filter could find the closest dump to the log line's timestamp, and use the mappings from there. If the authentication dumps are frequent enough, this can approach the accuracy of Method 1. However, the additional overhead of managing the timestamps and the much larger CFG file can make log processing much slower with this approach, versus Method 2 (and with Method 1, there is no overhead at all).

More sophisticated layouts of the CFG file are also possible, including methods which eliminate redundancy by using a hierarchical structure. The precise details of the possible structures of the CFG file, and the log filter which parses it, are beyond the scope of this article, but if you need assistance implementing this approach, we can help you.

Method 4: Query the authentication server directly from the log filter

This method dispenses with CFG entirely, instead having the log filter query the authentication server each time it needs to know the username associated with a particular IP address. Salang (the language of log filters) does not have direct support for querying authentication servers, so this must be done by running an arbitrary command (with exec()) to do the lookup. Furthermore, because it takes a significant fraction of a second to exec() a script to do the lookup, this cannot efficiently run for every line of log data, so an additional level of caching (using an in-memory node) should be added to make it fast.

This method (4) suffers from a possible accuracy problem similar to Method 2: it uses the IP-to-username mapping as of the log processing time, which may not be the same as the mapping at the time of the access. So it is effective for environments where users do not typically move between IP addresses, but will give incorrect usernames for some events in some cases, in environments where users move.

The details of the authentication query script is beyond the scope of this article (in a nutshell, it could accept the IP address on the command line, query the authentication server, and write the username to a file to be read and cached by the log filter with read_file()), but if you need assistance implementing this approach, we can help you.


For perfect auditing (certain determination of which username was responsible for an access), Method 1 is the best, as it determines the username at the time of access. Method 3 can approach the accuracy of Method 1, but is still not perfectly precise, due to the high granularity of authentication server dumps; it can be made perfect if dumps can be arranged to occur with every change in the authentication database. Method 2 is a simple and fast approach, suitable for environments where IPs are generally stable, or where exact IP-to-username correlations are not required. Method 4 is similar to Method 2, but does its work in real time--it eliminates the need for a CFG file, but introduces the need for a querying script.

[Article revision v1.0]

Professionelle Dienstleistungen

Sollten Sie die Anpassung von Sawmill Analytics nicht selbst vornehmen wollen, können wir Ihnen dies als Dienstleisung anbieten. Unsere Experten setzen sich gerne mit Ihnen in Verbindung, um die Reports oder sonstige Aspekte von Sawmill Analytics an Ihre Gegebenheiten und Wünsche anzupassen. Kontakt

Zur Tutorial-Übersicht

Weitere Informationen

      Live-Demonstrationen »    
© 1995-2011 HAAGE & PARTNER Computer GmbH · Impressum · Datenschutz · www.haage-partner.de