A small tutorial on how to analyze the Caddy access log with GoAccess and how to host the overview as a static web page. It is always nice to have a graphical overview of the amount and type of requests the web server is handling. GoAccess is a fairly powerful real-time web log analyzer that can save the output of parsed log files as an HTML report. I’ve automated this process with a bash script that runs hourly on my web server.

Configuring Caddy:

In order to parse the output of Caddy (which runs as a Docker container on my server), Caddy needs to be told to log all requests, and the log files need to be accessible from the host system. To make Caddy log, I’ve created a snippet that can be imported into any site block:

(logging) {
	log {
		output file /var/log/access.log {
			roll_size 1gb
			roll_keep 10
			roll_keep_for 2160h
		}
	}
}
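Any site block can then enable logging by importing the snippet. A minimal sketch (example.com and the backend address are placeholders, not part of my setup):

```
example.com {
	import logging
	reverse_proxy backend:8080
}
```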

With this configuration, all access logs are stored for 90 days and the log is rolled when it reaches 1GB in size. The access.log file also needs to be made accessible to the host, which I configured in Caddy’s docker-compose.yml.

volumes:
  - /opt/caddy/logs:/var/log/

I’ve updated my Caddyfile, enabling Caddy to host the newly generated HTML file:

access.{$DOMAIN} {
	import logging
	root * /var/www/static/access/
	file_server
}

To easily replace the static site files, I also mount a directory on my host system as Caddy’s static file directory, like this:

volumes:
  - /opt/caddy/static/access:/var/www/static/access/
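Putting both mounts together, the relevant service definition in docker-compose.yml might look roughly like this. This is a sketch of my assumed setup; the image tag, ports, and Caddyfile path are illustrative, only the two volume mounts are taken from the text above:

```
services:
  caddy:
    image: caddy:2
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /opt/caddy/Caddyfile:/etc/caddy/Caddyfile
      - /opt/caddy/logs:/var/log/
      - /opt/caddy/static/access:/var/www/static/access/
```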

Generating the Report HTML:

If you want to use the GeoIP feature, you need a GeoIP2 database. A free option is the GeoLite2 database by MaxMind. It does require creating an account to obtain a license key for downloading the database.

Every time the script runs, I check whether a newer version of the database has been released and, if so, use the direct download option to curl the database file and replace the old one before parsing. The actual report is created by this line of the script:

zcat --force $LOG_FILES | docker run --rm --name goaccess -i -e LANG="$LANG_VARIABLE" -e TZ="$TZ_VARIABLE" -v "$GEOIP_DATABASE:/GeoLite2-City.mmdb" allinurl/goaccess -a -o html --log-format CADDY --jobs 3 --geoip-database /GeoLite2-City.mmdb - > "$TEMP_OUTPUT_FILE"

I’m using zcat to feed compressed and uncompressed log files into GoAccess running in a Docker container. After the report HTML is generated, it is placed in Caddy’s static files directory. Since release 1.9, GoAccess also supports multithreaded parsing; the --jobs flag sets the number of threads and can noticeably boost parsing speed.
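To illustrate why zcat --force works here (file names below are hypothetical, not my actual logs): it decompresses gzipped files and passes plain files through unchanged, so the current plain access log and its rotated .gz siblings can be concatenated into a single stream:

```shell
# Create a plain log file and a gzip-compressed "rotated" one
mkdir -p /tmp/zcat_demo
echo 'plain entry'    >        /tmp/zcat_demo/access.log
echo 'rotated entry' | gzip >  /tmp/zcat_demo/access.log.1.gz

# --force makes zcat pass non-gzipped files through
# instead of failing on them
zcat --force /tmp/zcat_demo/access*.log*
```

Without --force, zcat would abort on the uncompressed access.log; with it, both entries come out in glob order.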

To generate the report on a regular basis, you can simply run the script from a cron job. In this example, a new report will be generated 30 minutes past the full hour:

30 * * * * /home/user/cron/generate_caddy_report.sh

Complete Bash script:

#!/bin/bash

# Set variables
LICENSE_KEY="XXX"
LANG_VARIABLE="$LANG"
TZ_VARIABLE="Europe/Berlin"
LOG_FILES="/opt/caddy/logs/access*.log*"
GEOIP_DATABASE="/opt/caddy/static/access/db/GeoLite2-City.mmdb"
OUTPUT_FILE="/opt/caddy/static/access/index.html"
TEMP_OUTPUT_FILE="/opt/caddy/static/access/index_temp.html"

DOWNLOAD_URL="https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-City&license_key=$LICENSE_KEY&suffix=tar.gz"
TEMP_DIR="/tmp/geolite2_update"
TEMP_DB_FILE="$TEMP_DIR/GeoLite2-City.mmdb"

# Create temporary directory
mkdir -p "$TEMP_DIR"

# Get remote last modified date
curl -s -I "$DOWNLOAD_URL" -o /tmp/headers.txt
REMOTE_LAST_MODIFIED=$(grep -i 'last-modified' /tmp/headers.txt | awk -F': ' '{print $2}' | tr -d '\r' | date -f- +"%Y-%m-%d")

# Check if local file exists and get the last modified date
if [ -f "$GEOIP_DATABASE" ]; then
    LOCAL_LAST_MODIFIED=$(stat -c %y "$GEOIP_DATABASE" | cut -d ' ' -f 1)
else
    LOCAL_LAST_MODIFIED="1970-01-02"
    echo "No IP location database found. Setting LOCAL_LAST_MODIFIED to 1970-01-02" >&2
fi

# Compare dates and update database if needed
if [[ "$(date -d "$REMOTE_LAST_MODIFIED" +%s)" -gt "$(date -d "$LOCAL_LAST_MODIFIED" +%s)" ]]; then
  curl -s -L "$DOWNLOAD_URL" | tar -xz --strip-components=1 -C "$TEMP_DIR"
  echo "Downloading new IP location database" >&2
  mv "$TEMP_DB_FILE" "$GEOIP_DATABASE"
  chown caddy:caddy "$GEOIP_DATABASE"
fi

# Cleanup temporary files
rm -rf "$TEMP_DIR"
rm -f /tmp/headers.txt

# Generate report using goaccess docker image
docker pull allinurl/goaccess
zcat --force $LOG_FILES | docker run --rm --name goaccess -i -e LANG="$LANG_VARIABLE" -e TZ="$TZ_VARIABLE" -v "$GEOIP_DATABASE:/GeoLite2-City.mmdb" allinurl/goaccess -a -o html --log-format CADDY --jobs 3 --geoip-database /GeoLite2-City.mmdb - > "$TEMP_OUTPUT_FILE"

# Set ownership for the temporary output file and move it to the final location
chown caddy:caddy "$TEMP_OUTPUT_FILE"
mv "$TEMP_OUTPUT_FILE" "$OUTPUT_FILE"