Sawmill Analytics

Sequential Scheduling

Sawmill's built-in Scheduler provides basic task scheduling capabilities. You can configure it to run a particular task, for a particular profile, at a particular time. For instance, you can configure it to update the databases for all your profiles at midnight every night, or to email yourself a Single-page Summary for a particular profile, every day at 8 AM. The Scheduler is available in the Admin page of the web interface.

However, there are some restrictions on which tasks can be run simultaneously. Database builds, database updates, and "remove data" tasks all modify the database, so they can conflict with each other and with reports if they are run simultaneously on the same profile. Depending on the number of processors (or cores) in the system and the speed of the disk, you may not be able to run more than a few simultaneous tasks. Each task generally uses as much as a full processor (or core), so on a four-processor system, performance will suffer if there are more than four simultaneous processes, even if they are on different profiles.

Therefore, it is often useful to run tasks sequentially rather than simultaneously. The Sawmill 7 Scheduler supports this in a few cases: you can rebuild or update databases for "all profiles," and it will rebuild or update them in sequence, starting the next task when the previous one completes (using one processor at all times). Some degree of sequencing is also possible by spacing the scheduled tasks so they cannot overlap. For instance, if a database update is to be followed by a report generation, and the database update takes 1 hour, then scheduling the report generation two hours after the database update will generally ensure that it runs after the update completes. But this is problematic, because the time taken for a task can never really be predicted: if the log data suddenly gets larger, or if the system slows down for some other reason, that database update might take 3 hours, and the report generation, started while the update is still running, will fail. What is sometimes needed is true sequencing of arbitrary tasks, running each task when the previous one completes.

To perform sequencing of arbitrary tasks, it is easiest to use a script (a .BAT file on Windows), which executes the tasks with command line syntax, one after another. For instance, this .BAT file would do the database update, and then email the report:

  "C:\Program Files\Sawmill 7\SawmillCL" -p profilename -a ud
  "C:\Program Files\Sawmill 7\SawmillCL" -p profilename -a srbe -ss mail -rca me@here.com -rna you@there.com -rn overview

This script runs a database update of profilename and, as soon as the update completes, emails the Overview report. To create it, open a new text file (for instance, with Notepad), paste in the two lines above, and save it as update_and_email.bat. On non-Windows systems the script is very similar, but uses the pathname of the "sawmill" binary instead of C:\Program Files\Sawmill 7\SawmillCL. There you might call it update_and_email.sh, and make it executable with "chmod a+x update_and_email.sh".
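As a rough illustration, a non-Windows version of the script might look like the sketch below. The command-line options are the ones from the .BAT file above; the install path /opt/sawmill/bin/sawmill, the SAWMILL and PROFILE variables, and the update_and_email function name are assumptions made for this example. Unlike the .BAT file, the sketch chains the two commands with "&&" so that the report is emailed only if the update succeeds, which assumes that the sawmill binary returns a non-zero exit status on failure.

```shell
#!/bin/sh
# update_and_email.sh: update the database, then email the Overview report.
# ASSUMPTION: the install path below is illustrative; set SAWMILL to the
# real location of the "sawmill" binary on your system.
SAWMILL="${SAWMILL:-/opt/sawmill/bin/sawmill}"
PROFILE="${PROFILE:-profilename}"

update_and_email() {
    # "&&" runs the second command only if the first exits with status 0
    # (assumes sawmill reports failure through its exit status).
    "$SAWMILL" -p "$PROFILE" -a ud &&
    "$SAWMILL" -p "$PROFILE" -a srbe -ss mail -rca me@here.com -rna you@there.com -rn overview
}

# Run the sequence only if the binary exists, so that a wrong path produces
# a clear message instead of two "command not found" errors.
if [ -x "$SAWMILL" ]; then
    update_and_email
else
    echo "sawmill binary not found at $SAWMILL" >&2
fi
```

More tasks can be appended to the function in the same way, each joined to the previous one with "&&", or listed on separate lines if later tasks should run even when an earlier one fails.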

The Sawmill Scheduler cannot run an arbitrary script, so to schedule this script it is necessary to use an external scheduler. On Windows, the Windows Scheduler is usually the best choice. Go to Control Panel, choose Scheduled Tasks, and choose Add Scheduled Task. This will start the Scheduled Task Wizard. In the wizard, select the .BAT file as the program to run, and schedule it to run daily at midnight.

Now, the .BAT file will run every day at midnight, and it will run its two tasks sequentially. Any number of tasks can be added to this script, and they will all be run sequentially, with no gap in between.

On Linux, macOS, UNIX, or other operating systems, this type of scheduling is usually done with cron, the built-in scheduler. The cron table can be edited through the graphical interface of the operating system, if one is available, or from the command line with the command "crontab -e". Add a line like this to the cron table:

  0 0 * * * /opt/sawmill/bin/update_and_email.sh >> /opt/sawmill/log/update_and_email.log 2>&1

This runs the update_and_email.sh script every day at midnight, logging the output to a file.
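Because cron starts the script at midnight whether or not the previous night's run has finished, an unusually long update could overlap with the next one on the same profile. A minimal sketch of one way to guard against that is shown below. It relies on the fact that mkdir either creates a directory or fails, so only one caller can hold the lock. The LOCKDIR path, the run_exclusively name, and the exit status 75 are illustrative choices for this example, not part of Sawmill.

```shell
#!/bin/sh
# ASSUMPTION: the lock location is illustrative; any directory path that
# all runs of the script agree on will do.
LOCKDIR="${LOCKDIR:-/tmp/update_and_email.lock}"

# Run the given command only if no other run currently holds the lock.
run_exclusively() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        status=0
        "$@" || status=$?       # run the task and remember its exit status
        rmdir "$LOCKDIR"        # release the lock for the next run
        return "$status"
    else
        echo "previous run still active ($LOCKDIR exists); skipping" >&2
        return 75               # arbitrary "try again later" status
    fi
}
```

A wrapper script could then end with a line such as "run_exclusively /opt/sawmill/bin/update_and_email.sh". If a run is killed before it removes the lock directory, the directory has to be deleted by hand before the next run will start.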

Sawmill 8 Scheduler Features

The next major release of Sawmill, version 8, will include direct support for sequential scheduling in the Sawmill Scheduler, so it will be possible to do this sort of "A then B then C etc." scheduling directly from the Sawmill Scheduler.

Advanced Topic: Optimal Scheduling for Multiple Processors/Cores

If you have a multiprocessor (or multi-core) system, the approach above does not take full advantage of all your processors, because the .BAT file (or script) runs on only one processor. It is possible to configure Sawmill to use multiple processors for database builds or updates (using the Log Processing Threads option), but report generation always uses one processor, and multi-processor database builds/updates are less efficient than single-processor builds (i.e., running on two processors is faster, but not twice as fast). If you have many tasks, the optimal scheduling for multiple processors is to use single-threaded builds and updates, but to keep one task running per processor at all times. For instance, if there are four processors, you start four single-threaded tasks, and as each task completes, you start another one, always ensuring that there are four tasks running. This can be done by running four scripts (or four .BAT files), like the one above, at the same time, as long as each script takes roughly the same amount of time as the others. That splits the work of the tasks into four equal pieces, and runs them simultaneously.
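On non-Windows systems, the "always keep four tasks running" pattern can be approximated with xargs, as in the sketch below. It assumes an xargs that supports the -P option (GNU and BSD versions do, although -P is not part of POSIX). The install path, the SAWMILL and JOBS variables, and the update_profiles function name are assumptions made for this example; the "-p" and "-a ud" options are the ones used earlier in this article.

```shell
#!/bin/sh
# ASSUMPTIONS: the binary path and the default of four jobs are illustrative.
SAWMILL="${SAWMILL:-/opt/sawmill/bin/sawmill}"
JOBS="${JOBS:-4}"   # one single-threaded task per processor

# Read profile names from standard input, one per line, and run a database
# update ("-a ud") for each. xargs -P keeps up to $JOBS commands running and
# starts the next one as soon as any of them exits.
update_profiles() {
    xargs -P "$JOBS" -I {} "$SAWMILL" -p {} -a ud
}
```

With a hypothetical file profiles.txt that lists one profile name per line, the updates would be started with "update_profiles < profiles.txt". Because a new task starts whenever a slot becomes free, the tasks do not need to take similar amounts of time, which is the main difference from running four fixed scripts side by side.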

It is also possible to write a script which does this sort of scheduling for you, and we have one, written in Perl. The script, called multisawmill.pl, is available by emailing support@sawmill.net. At this point, it is limited to only one type of task, so for instance it can run 100 database builds, split over four processors, or 1000 report generations, split over 8 processors.

© 1995-2011 HAAGE & PARTNER Computer GmbH · www.haage-partner.de