Zugriffs- und Datenanalyse von Server-Logs (Proxy, Mailserver, Firewall, Webserver) und Überwachung der Sicherheit & Performance, Schwachstellenanalyse.
Creating A MySQL Profile And Database
To create a Sawmill profile which uses MySQL as its back-end database, create a profile just as you would for the internal database, but on the Database Options page of the Create Profile Wizard, select "Use MySQL database" (again, this is available only in Sawmill Enterprise). Then enter the hostname, username, and password of your MySQL database server. If you're using a socket to communicate with MySQL, and the socket is not in the default location, you can also enter its pathname on this page. Finally, if you want the name of the database (schema) to be different from the name of the profile, you can enter a different database name on this page:

Querying The Main Table And The Itemnum Tables
The main table of the database is called "logfile," and contains one
row for each event in the log data. In this example, we are analyzing a
small 5000-line Apache log file, so each line of log data is a separate
event; so the resulting logfile contains 5000 rows (all examples below
are captures from the "mysql" command-line program. If your mail client
does not support the HTML tags used to render this in a mono-spaced
font, these may format poorly; changing the font to a mono-spaced font
like Courier or Monaco will show them better):
mysql> select count(*) from logfile; +----------+ | count(*) | +----------+ | 5000 | +----------+ |
mysql> select * from logfile limit 1; +-----------+---------------------+-----------------+-------------+-------------+----------+------+-----------+------+-------------------+--------------+----------+--------------------+----------+--------------+------+--------+----------+----------------------+---------------+---------------+-------------+------------------+--------+---------------+--------------------+-----------------+------+------------+---------+-------+--------+--------------+------------------+----------+------+ | loadorder | date_time | bottomleveldate | day_of_week | hour_of_day | hit_type | page | file_type | worm | screen_dimensions | screen_depth | hostname | domain_description | location | organization | isp | domain | referrer | referrer_description | search_engine | search_phrase | web_browser | operating_system | spider | server_domain | authenticated_user | server_response | hits | page_views | spiders | worms | errors | broken_links | screen_info_hits | visitors | size | +-----------+---------------------+-----------------+-------------+-------------+----------+------+-----------+------+-------------------+--------------+----------+--------------------+----------+--------------+------+--------+----------+----------------------+---------------+---------------+-------------+------------------+--------+---------------+--------------------+-----------------+------+------------+---------+-------+--------+--------------+------------------+----------+------+ | 1 | 1998-04-07 16:53:06 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 734 | +-----------+---------------------+-----------------+-------------+-------------+----------+------+-----------+------+-------------------+--------------+----------+--------------------+----------+--------------+------+--------+----------+----------------------+---------------+---------------+-------------+------------------+--------+---------------+--------------------+-----------------+------+------------+---------+-------+--------+--------------+------------------+----------+------+ |
mysql> select * from hostnameitemnum where itemnum = 2; +---------+----------------+ | itemnum | hostname | +---------+----------------+ | 2 | 140.177.203.25 | +---------+----------------+ |
By joining one or more itemnum tables to the main table, you can get
results similar to those shown in Sawmill's own reports. For instance,
let's generate a "top 10 hostnames" report. We can do that by
selecting/summing logfile, grouping on hostname, joining in
hostnameitemnum to get real hostnames in the result (rather than
normalized hostname itemnums), ordering by hits descending, and
limiting to the top 10:
mysql> select i.hostname, sum(hits) as hits, sum(page_views) as page_views, sum(size) as size from logfile l left join hostnameitemnum i on l.hostname = i.itemnum group by hostname order by hits desc limit 10; +-----------------------------+------+------------+----------+ | hostname | hits | page_views | size | +-----------------------------+------+------------+----------+ | lipowitz.isdn.uiuc.edu | 692 | 236 | 6348398 | | 192.17.19.150 | 466 | 46 | 6363739 | | 192.17.19.148 | 317 | 136 | 24279573 | | pale.kai.com | 308 | 70 | 5289858 | | flowerfire.isdn.uiuc.edu | 242 | 52 | 2337963 | | spider.unh.edu | 171 | 19 | 433709 | | 195.101.37.244 | 87 | 21 | 575602 | | 206.148.222.50 | 79 | 25 | 876475 | | isdn-5.nii.enterconnect.net | 79 | 16 | 579010 | | gli2302.ctea.com | 75 | 9 | 436318 | +-----------------------------+------+------------+----------+ |
Any of Sawmill's standard table reports can be generated similarly using SQL.
Filtering By Itemnum
Now suppose we want to filter the result. Any type of filter is possible using a WHERE clause in the SQL query. In this case, we'll filter the report above to show only .com hostnames. By including "where i.hostname like '%.com'", the query now selects only those rows from logfile which contain hostname values ending with .com, when de-normalized. So, the resulting list contains all .com addresses:
mysql> select i.hostname, sum(hits) as hits, sum(page_views) as page_views, sum(size) as size from logfile l left join hostnameitemnum i on l.hostname = i.itemnum where i.hostname like '%.com' group by hostname order by hits desc limit 10; +--------------------------------+------+------------+---------+ | hostname | hits | page_views | size | +--------------------------------+------+------------+---------+ | pale.kai.com | 308 | 70 | 5289858 | | gli2302.ctea.com | 75 | 9 | 436318 | | wat.thedj.com | 57 | 22 | 767059 | | tosainu.trimark.com | 45 | 12 | 341875 | | gianduia.compecon.com | 39 | 6 | 206124 | | ip252.ts4.phx.inficad.com | 34 | 4 | 160689 | | h-205-217-240-156.netscape.com | 33 | 4 | 173806 | | clarendon.weblogic.com | 32 | 4 | 151206 | | gw.vixel.com | 30 | 4 | 115755 | | cx51617-a.dnpt1.occa.home.com | 30 | 4 | 150578 | +--------------------------------+------+------------+---------+ |
The filter doesn't have to work on only the primary field of the query (hostname); it can work on any fields in logfile, or the joined fields of any other table. MySQL supports many joins in a single query, so we can join in one table for the main column (hostname), and also join additional tables for filtering. For instance, here are the hostnames which accessed GIF images (the hits column shows how many GIF accesses each hostname had; the size column shows how many GIF bytes each hostname transferred):
mysql> select i.hostname, sum(hits) as hits, sum(page_views) as page_views, sum(size) as size from logfile l left join hostnameitemnum i on l.hostname = i.itemnum left join file_typeitemnum fi on l.file_type = fi.itemnum where fi.file_type = 'GIF' group by hostname order by hits desc limit 10; +-----------------------------+------+------------+--------+ | hostname | hits | page_views | size | +-----------------------------+------+------------+--------+ | lipowitz.isdn.uiuc.edu | 432 | 0 | 447331 | | 192.17.19.150 | 420 | 0 | 209925 | | pale.kai.com | 235 | 0 | 144966 | | 192.17.19.148 | 161 | 0 | 322118 | | spider.unh.edu | 152 | 0 | 129566 | | flowerfire.isdn.uiuc.edu | 151 | 0 | 433550 | | gli2302.ctea.com | 66 | 0 | 56660 | | 195.101.37.244 | 65 | 0 | 87507 | | isdn-5.nii.enterconnect.net | 63 | 0 | 24508 | | 206.148.222.50 | 52 | 0 | 74631 | +-----------------------------+------+------------+--------+ 10 rows in set (0.03 sec) |