Matomo's [log analytics](https://matomo.org/log-analytics/), dockerized and automatically watching specified directories for logs to ingest.
The log analytics script is wrapped in an [`inotifywait`](https://linux.die.net/man/1/inotifywait) loop to automagically pick up logs from specified directories.
## Environment variables

- `WATCH_PATHS` (default: `"/srv/logs/"`)
  Whitespace-separated list of paths to watch. These *should* be directories, which will be watched recursively. Once an `inotify` event fires, all files in that directory matching the glob pattern `*.log` will be ingested, one by one.
- `WATCH_DELAY` (default: `"0.5"`)
  Delay, in seconds (decimals are supported), between detecting changes and starting to process the files. `inotifywait` detects the *first* change, so if a large batch of changes is happening (for example, a batch of large logfiles being copied into the directory), loading the files immediately would lead to unexpected results.

## Volume

The `/srv/logs` directory is exposed through the [`VOLUME` Dockerfile directive](https://docs.docker.com/engine/reference/builder/#volume), and is also configured as the default location to watch in `WATCH_PATHS`.

## Requirements

The `logwatch.py` script requires Matomo's `import_logs.py` log analytics script (branch `3-x.dev`) to be available for import. Because that script only runs on Python 2.7, so does `logwatch.py`. The requirements of `import_logs.py` must be satisfied, and the `inotify_simple` and `signal` modules must be available (`signal` is part of the Python standard library).

## Operation

The script sets [inotify](https://en.wikipedia.org/wiki/Inotify) watches on the listed directories.

When files matching the `--logfiles-glob` pattern are detected, the script waits `--ingestion-grace-period` seconds after all activity stops, then ingests the batch of detected files one by one. Ingested files are either renamed (using `--prefix-ingested` and `--suffix-ingested`) or deleted (`--delete-ingested`).

If an unrecoverable error occurs during ingestion of a file, the file is either renamed (using `--prefix-failed` and `--suffix-failed`) or deleted (`--delete-failed`). The exception is `--exit-on-error`: if it is used, the script exits immediately with an error message.

After all files in the batch are processed (either ingested or failed), Matomo's report processing is automagically triggered, unless `--no-auto-archive` is used.

While ingestion is in progress, new files are *not* added to the batch. Once processing of a batch ends, if any inotify events have arrived since processing started, all files matching the configured glob are added to a new batch, which is then processed. If no inotify events have arrived since processing of the batch started, the script waits for new events.
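
The batching rule described above can be sketched in a few lines. This is only an illustration, not the code in `logwatch.py`: the `BatchCollector` class and its method names are hypothetical, timestamps are passed in explicitly instead of being read from inotify, and it assumes that *any* event (not only one on a matching file) restarts the grace period.

```python
import fnmatch
import os


class BatchCollector(object):
    """Hypothetical helper: collect logfiles until activity has been quiet
    for `grace_period` seconds."""

    def __init__(self, grace_period=5.0, logfiles_glob="*.log"):
        self.grace_period = grace_period
        self.logfiles_glob = logfiles_glob
        self.pending = set()        # paths waiting to be ingested
        self.last_activity = None   # timestamp of the most recent event

    def on_event(self, path, now):
        """Record a file event; every event restarts the grace period
        (an assumption of this sketch)."""
        self.last_activity = now
        if fnmatch.fnmatch(os.path.basename(path), self.logfiles_glob):
            self.pending.add(path)

    def ready(self, now):
        """True once files are pending and no event arrived for grace_period."""
        if not self.pending or self.last_activity is None:
            return False
        return now - self.last_activity >= self.grace_period

    def take_batch(self):
        """Return the pending files (sorted) and start an empty batch."""
        batch = sorted(self.pending)
        self.pending = set()
        self.last_activity = None
        return batch


collector = BatchCollector(grace_period=5.0)
collector.on_event("/srv/logs/a.log", now=0.0)
collector.on_event("/srv/logs/b.log", now=3.0)
print(collector.ready(now=7.0))   # False: only 4 s since the last event
print(collector.ready(now=8.0))   # True: 5 s of quiet
print(collector.take_batch())     # ['/srv/logs/a.log', '/srv/logs/b.log']
```

Passing `now` in explicitly keeps the timing logic easy to test; a real loop would presumably feed it from a clock and from the inotify event stream.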
## Usage
Run `./logwatch.py --help` to get help. All `import_logs.py` options are supported, plus these additional ones:
- `--logfiles-glob` (default: `"*.log"`)
  Only files matching this shell glob expression will be ingested. It's
  important to make sure that the glob does not match ingested files after
  the prefix and suffix are applied! See `--prefix-ingested` and `--suffix-ingested`.
- `--ingestion-grace-period` (default: `5`)
  Delay (in seconds; fractions are supported) between noticing a logfile to be processed and starting to ingest it.
  This is part of the built-in heuristic for determining that a file is no longer being modified
  or moved and can be safely ingested.
- `--delete-ingested` (default: False)
  Delete successfully ingested logfiles.
- `--prefix-ingested` (default: `"ingested/"`)
  Rename ingested logfiles using this prefix; the prefix can indicate directories (in
  which case it should contain `/`), and is then relative to the directory a given
  logfile was originally in: when watching several directories, a prefix of
  `ingested/` will place ingested files in `./ingested/` subdirectories of the
  respective watched directories. Directories will be created if needed. This option
  is ignored if `--delete-ingested` is used.
- `--suffix-ingested` (default: `".ingested"`)
  Rename ingested logfiles using this suffix; it cannot contain any `/` characters.
  This option is ignored if `--delete-ingested` is used.
- `--exit-on-error` (default: False)
  Exit when ingestion errors are encountered.
- `--delete-failed` (default: False)
  Delete logfiles which failed to be ingested.
- `--prefix-failed` (default: `"failed/"`)
  Rename logfiles that failed to be ingested using this prefix; the prefix can
  indicate directories (in which case it should contain `/`), and is then relative
  to the directory a given logfile was originally in: when watching several
  directories, a prefix of `failed/` will place such files in `./failed/`
  subdirectories of the respective watched directories. Directories will be created
  if needed. This prefix will also be used for files containing information
  on what error was encountered and at which line.
  This option is ignored if `--delete-failed` is used.
- `--suffix-failed` (default: `".failed"`)
  Rename logfiles that failed to be ingested using this suffix; it cannot
  contain any `/` characters. This option is ignored if `--delete-failed` is used.
- `--no-auto-archive` (auto-archiving is on by default)
  Do not automatically run auto-archiving of Matomo reports. By default,
  auto-archiving is triggered after a batch of logfiles is ingested.
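
To make the prefix and suffix rules above concrete, here is a minimal sketch of how a renamed path could be built and checked against the glob. The helpers `renamed_path` and `still_matches_glob` are hypothetical and not part of `logwatch.py`; the sketch assumes the glob is matched against the file name only, and it does not create directories or move any files.

```python
import fnmatch
import os


def renamed_path(logfile, prefix, suffix):
    """Hypothetical helper: build the post-processing path for `logfile`."""
    if "/" in suffix:
        raise ValueError("suffix cannot contain '/'")
    directory, name = os.path.split(logfile)
    # The prefix is applied relative to the logfile's own directory, so a
    # prefix such as 'ingested/' points at a subdirectory next to the file.
    return os.path.join(directory, prefix + name + suffix)


def still_matches_glob(path, logfiles_glob):
    """True if the renamed file would be picked up again by the glob
    (assumption: the glob is matched against the file name only)."""
    return fnmatch.fnmatch(os.path.basename(path), logfiles_glob)


target = renamed_path("/srv/logs/site/access.log", "ingested/", ".ingested")
print(target)                               # /srv/logs/site/ingested/access.log.ingested
print(still_matches_glob(target, "*.log"))  # False: safe, will not be re-ingested

risky = renamed_path("/srv/logs/site/access.log", "done-", "")
print(still_matches_glob(risky, "*.log"))   # True: this combination would loop
```

The last call shows the situation the `--logfiles-glob` note warns about: a prefix without a suffix leaves the file name ending in `.log`, so it would match the glob again.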
## Docker usage
Run the image with the log directories you want to watch volume-mounted. Specify the options and the directories to watch directly as the command (`logwatch.py` is the entrypoint script, and the default command is `--help`).
self.parser.description = """Watch HTTP access log directories and import HTTP access logs to Matomo.
log_dir is the path to a directory with server access log files (uncompressed, .gz, or .bz2).
You may also watch many log file directories at once.
By default, the script will try to produce clean reports and will exclude bots and static files, discard HTTP errors and redirects, etc. This is customizable, see below."""