Server Setup

Using journalctl as the ultimate utility for troubleshooting LEMP stacks

by ,


We have by far the largest RPM repository with NGINX module packages and VMODs for Varnish. If you want to install NGINX, Varnish, and lots of useful performance/security software with smooth yum upgrades for production use, this is the repository for you.
Active subscription is required.

Intro to journalctl

The journalctl is an integral part of SystemD. The main benefit in the context of LEMP stacks is an EASY log aggregation. It makes locating issues that span several services (PHP-FPM, NGINX, MySQL) much easier.
All without resorting to external log aggregators like Logstash, Papertrail, and the likes.
The future is here, it’s been here for a while, but in a little use. Let’s fill in the gaps and see how you can query your entire server log system for troubleshooting your website.
You can do this without any third-party tools:

journalctl -u nginx -u php-fpm --since today

It will show you an aggregated log of these services.

However, that command may be of little use because it won’t show you the actual PHP errors, neither NGINX requests.

Let’s see how we can use journalctl for querying your LEMP stack logs, including all the useful data. But we start from the basics and understanding the logging capabilities in CentOS/RHEL.

CentOS/RHEL: Journald and Rsyslog in one server? Why?

Rsyslog has been around much longer than SystemD and its logging component, Journald. This is why Rsyslog and Journald both co-exist together on the same system.
On a CentOS/RHEL system, you will find both systemd-journald and rsyslogd running.
The default configuration results in some stored log data duplication between the two services.

How it works by default

RedHat actually made Rsyslog aware of Journald using a special module, imjournal.
So how this works by default in CentOS/RHEL?

Both are installed and running:

  • Journald runs without persistent storage and stores log data only in RAM
  • Rsyslog (aka /var/log/messages and friends) persists log data to flat files

The standard Syslog socket /dev/log is reserved/bound by Journald.
The Rsyslog is configured with the imjournal module which reads the actual Journald binary logs.
It does not listen on the /dev/log socket, but it persists Journald messages onto plain-text files like /var/log/messages.

Complicated? A little bit.

Any app which logs data to the /dev/log socket, has its messages delivered as such:

Application -> /dev/log socket -> Journald (transient storage in RAM) -> Rsyslog (plain-text log files).

As you can see, there is some duplication right there by default.
Rsyslog will persist much of the logs, while Journald will store only a fraction of already-persisted messages, in RAM.

Avoid duplication and persist the logs in Journald

In CentOS/RHEL, we have JournalD running in non-persistent mode, so on-disk logs are only in the files like /var/log/messages (Rsyslog).

To get rid of duplicate log data, you’d typically want to configure JournalD to persist log data to disk, and disable Rsyslog.

Make persistent storage for Journald

Simply create /var/log/journal directory and restart Journald, using the commands below. That’s all there is to it for it to store the logs.
Note that the created logs are binary. The proper way to view the logs from Journald is by using journalctl program.

# create the directory
mkdir -p /var/log/journal
# ensure the write permissions
systemd-tmpfiles --create --prefix /var/log/journal
# ping the daemon that the directory is there
killall -USR1 systemd-journald

If things worked out well, you’ll immediately notice some subdirectories created under /var/log/journal, and one of them having the system.journal binary log.
This is because the last command reloaded the daemon configuration and persisted the currently stored log data from RAM onto the disk.

Disable Rsyslog

As we know from the above, it is not JournalD, which sends log data to Rsyslog.
But it is Rsyslog (with the help of imjournal module) imports the JournalD log.

As such, configuring Journald with ForwardToSyslog=no alone will not actually disable storing of duplicate log data to Rsyslog.

To avoid duplication of log messages, disable Rsyslog:

systemctl disable rsyslog
systemctl stop rsyslog

PHP and Journald

Journald benefit over Rsyslog is that it can store some structured data, as opposed to Rsyslog which is basically logging plain strings.

The proposed patch to have PHP to send structured log data to Journald was never accepted, while having quite a useful goal:

in order to allow both PHP as well as PHP scripts to log directly to journald
rather than logging via Syslog. The primary advantage of this is to permit
structured logging; a secondary advantage is that non-structured logs
(including messages logged via the error_log() function) can contain
additional information, including script file, script line number, and
PHP function or class name.

So the way to log to journald from PHP remains the same: via standard Syslog socket and plain messages.

In php.ini or any other PHP configuration file where you make use of the error_log directive, ensure:

error_log = syslog 
syslog.ident = php-fpm

Unfortunately, the PHP-FPM has a long-standing bug which results in log messages being tagged with incorrect program name ool <NAME-OF-THE-POOL>.
To deal with this, create an auto-prepend file at /etc/php-syslog-fix.php:

<?php
// Open logs and set the syslog.ident to a sensible value on php-fpm
if ( PHP_SAPI === 'fpm-fcgi' ) {
        openlog( 'php-fpm', LOG_ODELAY, LOG_DAEMON );
}

Ensure that this file can be read by all PHP-FPM pool users:

chmod 0644 /etc/php-syslog-fix.php

In every PHP-FPM pool configuration file, e.g. /etc/php-fpm.d/example.com.conf, ensure:

php_admin_value[auto_prepend_file] = /etc/php-syslog-fix.php

Now, journalctl -u php-fpm will work as expected? Nope. The -u switch specifies UNIT field of the structured data being logged to Journald.
But for log messages which are coming from the PHP-FPM scripts themselves, there is no such field.
The field we need is SYSLOG_IDENTIFIER, which can be specified either by a field selector like SYSLOG_IDENTIFIER=php-fpm or simply -t php-fpm flag.

Because we need to filter and join log data from both PHP-FPM service and execution of its scripts, we will put a logical OR filter:

journalctl UNIT=php-fpm.service + SYSLOG_IDENTIFIER=php-fpm

By running this, you get a browsable log of all logged PHP errors.

NGINX and Journald

NGINX, likewise cannot log the structured data to Journald, but since it is capable of working with the standard Syslog server, it means it will work with Journald as well:

server {
       error_log syslog:server=unix:/dev/log;
       access_log syslog:server=unix:/dev/log;
       ...
}

Now, journalctl -u nginx will work as expected, and provide log messages about requests and errors.

Putting things together

Now you can troubleshoot LEMP stack errors through the convenience of binary logging:

journalctl UNIT=nginx.service + UNIT=php-fpm.service + SYSLOG_IDENTIFIER=php-fpm

For further convenience, you can create an alias in ~/.bashrc file:

alias weblogctl='journalctl UNIT=nginx.service + UNIT=php-fpm.service + SYSLOG_IDENTIFIER=php-fpm'

Then activate it in your current session by running source ~/.bashrc or log out/in.

At this point, you can experience the beauty of fast log filtering.
To display NGINX/PHP-FPM log messages as of today:

weblogctl --since today

To additionally filter by messages which are of a warning or a more severe level:

weblogctl --since today -p warning

Why this whole solution is ultimate?

When troubleshooting LEMP stack issues, you are solving either downtime, delays, or an incorrectly functioning website.
It is the time filtering which is most hard to deal with when you are using plain-text log files.

The binary logging and journalctl solve just that in an excellent way, using the --since switch.
For example to see entries from Jan 10, 5:15 PM:

weblogctl --since "2021-01-10 17:15:00"

And to close the window of the desired log data chunk, you can use the --until switch:

weblogctl --since "2021-01-10" --until "2021-01-11 03:00"

Some convenience human-friendly shortcuts are available. Suppose that you have got a customer email alerting you about the downtime within the last 2 hours:

weblogctl --since "2 hour ago"

Or, between 2 AM and 4 AM:

weblogctl --since 02:00  --until 4:00

Important caveats

By default, JournalD throws away log lines when the throughput is too high.
Likewise, by default,t SystemD allows 1,000 messages within a 30 second period.

E.g. logged 40k messages in a few minutes will have some message discard.
So unless you configure to disable this behavior, don’t always assume that your journald messages are absolutely complete.

Disabling rate-limiting poses a security risk of sorts, because some system users may abuse the logging.
If you are aware of this and are fine with it, completely disable rate-limiting, use in /etc/systemd/journald.conf file:

RateLimitInterval=30s
RateLimitBurst=1000

Apply configuration with:

killall -USR1 systemd-journald

Docs, docs, docs

That’s not all there is to it, of course. journactl, and JournalD system in whole is very powerful in what it can do for you. We just took the very first on the journey of switching to the modern log facility for your LEMP stack.
You can always learn more with man journalctl.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: