HAAGE&PARTNER Computer GmbH

Sawmill Analytics

Analysis and reporting
for web | network | security

Access and data analysis of server logs (proxy, mail server, firewall, web server), monitoring of security and performance, and vulnerability analysis.


Sawmill Tutorial

Converting Log Data With process_logs


Typically, Sawmill is used to parse log data into a database and then generate reports from that database with Sawmill's web reporting interface. But Sawmill can also be used as a stand-alone parser, or log converter. This is useful for parsing complicated log formats, such as mail logs, which typically have many lines per event; those lines must be parsed intelligently, in a particular order, and then reassembled into a single event. This sort of parsing is difficult, but Sawmill already knows how to do it for all common log formats. So if you're writing a script to import log data into a database, to do alerting based on log data, or to do anything else with log data, you can use Sawmill as the first stage of your script to do the parsing and pass the parsed data on to you.

This type of conversion is typically done from a script, or from the command line. For instance, you might run the following command on a profile created from an Apache log file:


  $ sawmill -p my_profile -a pl -v 0 > out.csv


The -p option specifies the internal profile name; the "-a pl" is short for "-a process_logs", which specifies the action; "-v 0" tells Sawmill to generate no output other than the process_logs output. If the input (Apache) log data is this:


140.177.203.25 - - [07/Apr/1998:16:53:06 -0500] "GET / HTTP/1.0" 200 734 "-" "Mozilla/4.04 [en] (X11; I; SunOS 5.6 sun4u)"
140.177.203.25 - - [07/Apr/1998:16:53:06 -0500] "GET /sawmill/picts/TopBanner.gif HTTP/1.0" 200 4573 "http://asooo/" "Mozilla/4.04 [en] (X11; I; SunOS 5.6 sun4u)"
192.17.19.148 - - [07/Apr/1998:21:32:04 -0500] "GET / HTTP/1.0" 200 734 "http://developer.javasoft.com/developer/earlyAccess/java3d-features.html" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:12 -0500] "GET /sawmill/picts/TopBanner.gif HTTP/1.0" 200 4573 "http://asooo.wolfram.com/" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:15 -0500] "GET /flowerfire/ HTTP/1.0" 404 154 "http://asooo.wolfram.com/" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:18 -0500] "GET / HTTP/1.0" 200 734 "http://asooo.wolfram.com/flowerfire/" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:21 -0500] "GET /sawmill/ HTTP/1.0" 200 3344 "http://asooo.wolfram.com/" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:26 -0500] "GET /sawmill/picts/title.gif HTTP/1.0" 200 48771 "http://asooo.wolfram.com/sawmill/" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:29 -0500] "GET /sawmill/samples.html HTTP/1.0" 200 8996 "http://asooo.wolfram.com/sawmill/" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"
192.17.19.148 - - [07/Apr/1998:21:32:30 -0500] "GET /sawmill/picts/header.gif HTTP/1.0" 200 12371 "http://asooo.wolfram.com/sawmill/samples.html" "Mozilla/4.04 (Macintosh; I; PPC, Nav)"


then the result in out.csv will be this, a traditional CSV file with a comma-separated header listing the names of the fields, and one line per event with comma-separated values:

  date_time,day_of_week,hour_of_day,hit_type,page,file_type,worm,screen_dimensions,screen_depth,hostname,domain_description,location,organization,isp,domain,referrer,referrer_description,search_engine,search_phrase,web_browser,operating_system,spider,server_domain,authenticated_user,server_response,hits,page_views,spiders,worms,errors,broken_links,screen_info_hits,visitors,size
07/Apr/1998 16:53:06,3,16,page view,/{default},(no type),(not a worm),(-) x (-),(-),140.177.203.25,IP Address,United States/IL/Champaign,Wolfram Research,(unknown ISP),(unknown domain),(no referrer),(no referrer),(no search engine),(no search phrase),Netscape Navigator/4.04 ,SunOS,(not a spider),-,(not authenticated),200,1,1,(empty),(empty),(empty),(empty),(empty),140.177.203.25,734
07/Apr/1998 16:53:06,3,16,hit,/sawmill/picts/(nonpage),GIF,(not a worm),(-) x (-),(-),140.177.203.25,IP Address,United States/IL/Champaign,Wolfram Research,(unknown ISP),(unknown domain),http://asooo/(omitted),Unknown,(no search engine),(no search phrase),Netscape Navigator/4.04 ,SunOS,(not a spider),-,(not authenticated),200,1,(empty),(empty),(empty),(empty),(empty),(empty),140.177.203.25,4573
07/Apr/1998 21:32:04,3,21,page view,/{default},(no type),(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://developer.javasoft.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),200,1,1,(empty),(empty),(empty),(empty),(empty),192.17.19.148,734
07/Apr/1998 21:32:12,3,21,hit,/sawmill/picts/(nonpage),GIF,(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://asooo.wolfram.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),200,1,(empty),(empty),(empty),(empty),(empty),(empty),192.17.19.148,4573
07/Apr/1998 21:32:15,3,21,broken link,/flowerfire/(nonpage),(no type),(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://asooo.wolfram.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),404,1,(empty),(empty),(empty),1,1,(empty),192.17.19.148,154
07/Apr/1998 21:32:18,3,21,page view,/{default},(no type),(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://asooo.wolfram.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),200,1,1,(empty),(empty),(empty),(empty),(empty),192.17.19.148,734
07/Apr/1998 21:32:21,3,21,page view,/sawmill/{default},(no type),(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://asooo.wolfram.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),200,1,1,(empty),(empty),(empty),(empty),(empty),192.17.19.148,3344
07/Apr/1998 21:32:26,3,21,hit,/sawmill/picts/(nonpage),GIF,(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://asooo.wolfram.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),200,1,(empty),(empty),(empty),(empty),(empty),(empty),192.17.19.148,48771
07/Apr/1998 21:32:29,3,21,page view,/sawmill/samples.html,HTML,(not a worm),(-) x (-),(-),192.17.19.148,IP Address,United States/IL/Urbana,University of Illinois,(unknown ISP),(unknown domain),http://asooo.wolfram.com/(omitted),Commercial (com),(no search engine),(no search phrase),Netscape Navigator/4.04 ,Macintosh,(not a spider),-,(not authenticated),200,1,1,(empty),(empty),(empty),(empty),(empty),192.17.19.148,8996


This output can be used as input to a script. It is easier to parse than the original log data, and it carries some extra information, such as geographic locations. Sawmill's own log filtering functionality can also be used to add columns (database fields) that are populated programmatically. For instance, a script could use BCP or another program to import the comma-separated data into a SQL database for later querying.
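
As a rough illustration of the import step, the sketch below loads process_logs output into a database using Python's standard csv and sqlite3 modules. SQLite stands in for BCP and a server database purely so that the example is self-contained; the function name load_events, the table name events, and the choice to store every column as text are assumptions made for illustration and are not part of Sawmill. The inline sample is an abbreviated form of two rows from the output above, with most columns left out.

```python
import csv
import io
import sqlite3


def load_events(csv_file, conn, table="events"):
    """Load process_logs CSV output into a SQLite table (hypothetical helper).

    Column names are taken from the CSV header line, so the table follows
    whatever fields the profile defines. Returns the number of rows loaded.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote identifiers so unusual field names remain valid SQL.
    columns = ", ".join(
        '"{}" TEXT'.format(name.replace('"', '""')) for name in header
    )
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, columns))
    # Skip blank lines and lines whose field count does not match the header.
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Abbreviated sample; in practice, pass open("out.csv", newline="").
    sample = io.StringIO(
        "date_time,hit_type,page,server_response,size\n"
        "07/Apr/1998 16:53:06,page view,/{default},200,734\n"
        "07/Apr/1998 21:32:15,broken link,/flowerfire/(nonpage),404,154\n"
    )
    conn = sqlite3.connect(":memory:")
    print(load_events(sample, conn), "rows loaded")
    query = "SELECT page, size FROM events WHERE server_response = '404'"
    for row in conn.execute(query):
        print(row)
```

Because the table definition is built from the header line, the same helper should keep working if log filters add extra columns to the profile, at the cost of leaving numeric fields such as size stored as text.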


Advanced Topic: Customizing The Output of process_logs

The default output of process_logs is good for many purposes, but if you have a script which expects a particular format of input, you may need to use some of Sawmill's additional options to customize the output. Specifically,
Note: all four options above are (will be) available in Sawmill 7.2.16 and later. If you need them immediately, contact support for a pre-release download.


Conclusion

This newsletter describes the process_logs command-line action, which is generally useful for building environments in which Sawmill acts as a log parser or converter, and later stages of the process operate on the logs that Sawmill has parsed, converted, and simplified.
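
Such a pipeline might be sketched in Python as follows. The first stage runs the command line shown earlier in this article (sawmill -p my_profile -a pl -v 0) and reads its CSV output; the later stage does simple alerting by picking out broken links. The function names parse_events, sawmill_events, and broken_links are hypothetical, the sketch assumes that the sawmill executable is on the PATH, and the field names hit_type, date_time, page, and referrer are taken from the sample output above and may differ for other profiles.

```python
import csv
import subprocess


def parse_events(stream):
    """Turn process_logs CSV text into one dict per event.

    DictReader takes the field names from the CSV header line.
    """
    return csv.DictReader(stream)


def sawmill_events(profile, executable="sawmill"):
    """First stage: run process_logs for a profile and yield parsed events.

    Assumes `executable` is a Sawmill binary that accepts the options
    used in this article: -p <profile> -a pl -v 0.
    """
    cmd = [executable, "-p", profile, "-a", "pl", "-v", "0"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        yield from parse_events(proc.stdout)
    # Leaving the with-block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise RuntimeError(
            "sawmill exited with status {}".format(proc.returncode)
        )


def broken_links(events):
    """Later stage: yield (date_time, page, referrer) for broken links."""
    for event in events:
        if event.get("hit_type") == "broken link":
            yield event["date_time"], event["page"], event["referrer"]
```

A script would then loop over broken_links(sawmill_events("my_profile")) and print, mail, or log each tuple. Keeping the parsing in parse_events, separate from the subprocess call, also lets the later stage be tested against a saved out.csv without running Sawmill.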


[Article revision v1.0]


Professional Services

If you would rather not customize Sawmill Analytics yourself, we can do this for you as a service. Our experts will gladly get in touch with you to adapt the reports or other aspects of Sawmill Analytics to your circumstances and requirements.

© 1995-2011 HAAGE & PARTNER Computer GmbH · www.haage-partner.de