Troubleshoot Systemd Services
From: https://opensource.com/article/20/5/systemd-troubleshooting-tool
Start using systemd as a troubleshooting tool
While systemd is not really a troubleshooting tool, the information in its
output points the way toward solving problems.
By David Both (Correspondent) May 11, 2020
No one would really consider systemd to be a troubleshooting tool, but when
I encountered a problem on my webserver, my growing knowledge of systemd
and some of its features helped me locate and circumvent the problem.
The problem was that my server, yorktown, which provides name services,
DHCP, NTP, HTTPD, and SendMail email services for my home office network,
failed to start the Apache HTTPD daemon during normal startup. I had to
start it manually after I realized that it was not running. The problem
had been going on for some time, and I recently got around to trying to
fix it.
Some of you will say that systemd itself is the cause of this problem, and,
based on what I know now, I agree with you. However, I had similar types
of problems with SystemV. (In the first article in this series, I looked
at the controversy around systemd as a replacement for the old SystemV
init program and startup scripts. If you're interested in learning more
about systemd, read the second and third articles, too.) No software is
perfect, and neither systemd nor SystemV is an exception, but systemd
provides far more information for problem-solving than SystemV ever
offered.
Determining the problem
The first step to finding the source of this problem is to determine the
httpd service's status:
[root@yorktown ~]# systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor
preset: disabled)
Active: failed (Result: exit-code) since Thu 2020-04-16 11:54:37 EDT;
15min ago
Docs: man:httpd.service(8)
Process: 1101 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND
(code=exited, status=1/FAILURE)
Main PID: 1101 (code=exited, status=1/FAILURE)
Status: "Reading configuration..."
CPU: 60ms
Apr 16 11:54:35 yorktown.both.org systemd[1]: Starting The Apache HTTP
Server...
Apr 16 11:54:37 yorktown.both.org httpd[1101]: (99)Cannot assign requested
address: AH00072: make_sock: could not bind to address 192.168.0.52:80
Apr 16 11:54:37 yorktown.both.org httpd[1101]: no listening sockets
available, shutting down
Apr 16 11:54:37 yorktown.both.org httpd[1101]: AH00015: Unable to open logs
Apr 16 11:54:37 yorktown.both.org systemd[1]: httpd.service: Main process
exited, code=exited, status=1/FAILURE
Apr 16 11:54:37 yorktown.both.org systemd[1]: httpd.service: Failed with
result 'exit-code'.
Apr 16 11:54:37 yorktown.both.org systemd[1]: Failed to start The Apache
HTTP Server.
[root@yorktown ~]#
This status information is one of the systemd features that I find much
more useful than anything SystemV offers. The amount of helpful
information here leads me easily to a logical conclusion that takes me in
the right direction. All I ever got from the old SystemV service status
command was whether or not the service was running and, if so, its process
ID (PID). That is not very helpful.
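The log lines at the end of that report come from the systemd journal, so
the same entries can be retrieved with journalctl and filtered when there
are too many to scan by eye. What follows is a minimal sketch, assuming the
message format shown in the report above; bind_failures is a hypothetical
helper name, not part of systemd or Apache:

```shell
#!/bin/sh
# bind_failures: read log lines on stdin and print each address:port
# that httpd reported it could not bind to. The pattern is taken from
# the AH00072 message in the status output above.
bind_failures() {
    sed -n 's/.*could not bind to address \([^ ]*\).*/\1/p'
}

# Typical use on a systemd host (not run here):
#   journalctl -b -u httpd.service --no-pager | bind_failures
```

In the journalctl command, -b limits the output to the current boot and -u
limits it to the httpd unit.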
The key entry in this status report shows that HTTPD cannot bind to the IP
address, which means it cannot accept incoming requests. This indicates
that the network is not ready soon enough: the IP address has not yet been
assigned to the interface when the HTTPD service tries to bind to it. This
is not supposed to happen, so I explored my network service systemd startup
configuration files; all appeared to be correct, with the right "After" and
"Requires" statements. Here is the /lib/systemd/system/httpd.service file
from my server:
# Modifying this file in-place is not recommended, because changes
# will be overwritten during package upgrades. To customize the
# behaviour, run "systemctl edit httpd" to create an override unit.
# For example, to pass additional options (such as -D definitions) to
# the httpd binary at startup, create an override unit (as is done by
# systemctl edit) and enter the following:
# [Service]
# Environment=OPTIONS=-DMY_DEFINE
[Unit]
Description=The Apache HTTP Server
Wants=httpd-init.service
After=network.target remote-fs.target nss-lookup.target httpd-init.service
Documentation=man:httpd.service(8)
[Service]
Type=notify
Environment=LANG=C
ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND
ExecReload=/usr/sbin/httpd $OPTIONS -k graceful
# Send SIGWINCH for graceful stop
KillSignal=SIGWINCH
KillMode=mixed
PrivateTmp=true
[Install]
WantedBy=multi-user.target
The httpd.service unit file explicitly specifies that it should start after
network.target and httpd-init.service (among others). I tried to find all
of these units using the systemctl list-units command and searching for
them in the resulting data stream. All were present and should have ensured
that the httpd service did not start before the network IP address was set.
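That search can be scripted. The following is a sketch rather than the
exact commands used on yorktown: report_units and ORDERING_UNITS are
hypothetical names, and the unit list is copied from the After= line of the
httpd.service file above.

```shell
#!/bin/sh
# Units named on the After= line of httpd.service, as listed above.
ORDERING_UNITS='network.target remote-fs.target nss-lookup.target httpd-init.service'

# report_units: read a "systemctl list-units" listing on stdin and
# print, for each unit in ORDERING_UNITS, whether it appears there.
report_units() {
    listing=$(cat)
    for unit in $ORDERING_UNITS; do
        if printf '%s\n' "$listing" | grep -qF "$unit"; then
            echo "$unit: present"
        else
            echo "$unit: MISSING"
        fi
    done
}

# Typical use on a systemd host (not run here):
#   systemctl list-units --all --no-pager | report_units
```

Note that finding a unit in the listing shows only that systemd knows about
it; it says nothing about when the unit finished starting.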
First solution
A bit of searching on the internet confirmed that others had encountered
similar problems with httpd and other services. This appears to happen
because one of the required services indicates to systemd that it has
finished its startup—but it actually spins off a child process that has
not finished. After a bit more searching, I came up with a circumvention.
I could not figure out why the IP address was taking so long to be assigned
to the network interface card. So, I thought that if I could delay the
start of the HTTPD service by a reasonable amount of time, the IP address
would be assigned by that time.
Fortunately, the /lib/systemd/system/httpd.service file above provides some
direction. Although it says not to alter it, it does indicate how to
proceed: Use the command systemctl edit httpd, which automatically creates
a new file (/etc/systemd/system/httpd.service.d/override.conf) and opens
the GNU Nano editor. (If you are not familiar with Nano, be sure to look
at the hints at the bottom of the Nano interface.)
Add the following text to the new file and save it. The session below shows
the saved file; the file's content is everything after the cat command:
[root@yorktown ~]# cd /etc/systemd/system/httpd.service.d/
[root@yorktown httpd.service.d]# ll
total 4
-rw-r--r-- 1 root root 243 Apr 16 11:43 override.conf
[root@yorktown httpd.service.d]# cat override.conf
# Trying to delay the startup of httpd so that the network is
# fully up and running so that httpd can bind to the correct
# IP address
#
# By David Both, 2020-04-16
[Service]
ExecStartPre=/bin/sleep 30
The [Service] section of this override file contains a single line that
delays the start of the HTTPD service by 30 seconds. The following status
command shows the service status during the wait time:
[root@yorktown ~]# systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor
preset: disabled)
Drop-In: /etc/systemd/system/httpd.service.d
└─override.conf
/usr/lib/systemd/system/httpd.service.d
└─php-fpm.conf
Active: activating (start-pre) since Thu 2020-04-16 12:14:29 EDT; 28s
ago
Docs: man:httpd.service(8)
Cntrl PID: 1102 (sleep)
Tasks: 1 (limit: 38363)
Memory: 260.0K
CPU: 2ms
CGroup: /system.slice/httpd.service
└─1102 /bin/sleep 30
Apr 16 12:14:29 yorktown.both.org systemd[1]: Starting The Apache HTTP
Server...
Apr 16 12:15:01 yorktown.both.org systemd[1]: Started The Apache HTTP
Server.
[root@yorktown ~]#
And this command shows the status of the HTTPD service after the 30-second
delay expires. The service is up and running correctly:
[root@yorktown ~]# systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor
preset: disabled)
Drop-In: /etc/systemd/system/httpd.service.d
└─override.conf
/usr/lib/systemd/system/httpd.service.d
└─php-fpm.conf
Active: active (running) since Thu 2020-04-16 12:15:01 EDT; 1min 18s ago
Docs: man:httpd.service(8)
Process: 1102 ExecStartPre=/bin/sleep 30 (code=exited, status=0/SUCCESS)
Main PID: 1567 (httpd)
Status: "Total requests: 0; Idle/Busy workers 100/0;Requests/sec: 0;
Bytes served/sec: 0 B/sec"
Tasks: 213 (limit: 38363)
Memory: 21.8M
CPU: 82ms
CGroup: /system.slice/httpd.service
├─1567 /usr/sbin/httpd -DFOREGROUND
├─1569 /usr/sbin/httpd -DFOREGROUND
├─1570 /usr/sbin/httpd -DFOREGROUND
├─1571 /usr/sbin/httpd -DFOREGROUND
└─1572 /usr/sbin/httpd -DFOREGROUND
Apr 16 12:14:29 yorktown.both.org systemd[1]: Starting The Apache HTTP
Server...
Apr 16 12:15:01 yorktown.both.org systemd[1]: Started The Apache HTTP
Server.
I could have experimented to see if a shorter delay would work as well, but
my system is not that critical, so I decided not to. It works reliably as
it is, so I am happy.
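One way to avoid guessing at a delay is to poll for the address and stop
waiting as soon as it appears. The following is only a sketch of that idea,
not something that was run on the server described here. The names
wait_for_addr and ADDR_LIST_CMD are hypothetical, and the default assumes
the ip command from iproute2 is installed.

```shell
#!/bin/sh
# Command used to list local IPv4 addresses. ADDR_LIST_CMD is a
# hypothetical override hook so the function can be tried without ip(8).
ADDR_LIST_CMD=${ADDR_LIST_CMD:-"ip -o -4 addr show"}

# wait_for_addr ADDRESS [TIMEOUT_SECONDS]
# Poll once per second until ADDRESS is assigned to a local interface.
# Returns 0 as soon as it is found, 1 if the timeout (default 30) expires.
wait_for_addr() {
    addr=$1
    remaining=${2:-30}
    while [ "$remaining" -gt 0 ]; do
        # ip(8) prints an IPv4 address as "inet ADDRESS/PREFIX".
        if $ADDR_LIST_CMD 2>/dev/null | grep -qF "inet $addr/"; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}
```

A script that defines this function and calls it with the server's address
could stand in for /bin/sleep 30 on the ExecStartPre line, so the start is
delayed only as long as needed, up to the timeout.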
Because I gathered all this information, I reported it to Red Hat Bugzilla
as Bug 1825554. I believe that it is much more productive to report bugs
than it is to complain about them.
The better solution
A couple of days after reporting this as a bug, I received a response
indicating that systemd is just the manager, and if httpd needs to be
ordered after some requirements are met, it needs to be expressed in the
unit file. The response pointed me to the httpd.service man page. I wish I
had found this earlier because it is a better solution than the one I came
up with. This solution is explicitly targeted to the prerequisite target
unit rather than a somewhat random delay.
From the httpd.service man page:
Starting the service at boot time
The httpd.service and httpd.socket units are disabled by default. To start
the httpd service at boot time, run: systemctl enable httpd.service. In
the default configuration, the httpd daemon will accept connections on
port 80 (and, if mod_ssl is installed, TLS connections on port 443) for
any configured IPv4 or IPv6 address.
If httpd is configured to depend on any specific IP address (for example,
with a "Listen" directive) which may only become available during
start-up, or if httpd depends on other services (such as a database
daemon), the service must be configured to ensure correct start-up
ordering.
For example, to ensure httpd is only running after all configured
network interfaces are configured, create a drop-in file (as described
above) with the following section:
[Unit]
After=network-online.target
Wants=network-online.target
I still think this is a bug because it is quite common—at least in my
experience—to use a Listen directive in the httpd.conf configuration
file. I have always used Listen directives, even on hosts with only a
single IP address, and it is clearly necessary on hosts with multiple
network interface cards (NICs) and internet protocol (IP) addresses.
Adding the lines above to the /usr/lib/systemd/system/httpd.service
default file would not cause problems for configurations that do not use a
Listen directive and would prevent this problem for those that do.
In the meantime, I will use the suggested solution.
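The drop-in from the man page can be created with systemctl edit httpd, as
before, or written directly. The following sketch writes it as a separate
file. The function name write_network_online_dropin and the file name
network-online.conf are hypothetical choices, and the directory is passed
as an argument so the function can be tried in a scratch location first.

```shell
#!/bin/sh
# write_network_online_dropin DIR
# Create DIR/network-online.conf containing the [Unit] section quoted
# from the httpd.service man page. The file name is a hypothetical
# choice; systemd reads every *.conf file in a unit's drop-in directory.
write_network_online_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/network-online.conf" <<'EOF'
[Unit]
After=network-online.target
Wants=network-online.target
EOF
}

# Typical use, as root on a systemd host (not run here):
#   write_network_online_dropin /etc/systemd/system/httpd.service.d
#   systemctl daemon-reload
#   systemctl show -p After -p Wants httpd.service
```

If the earlier override.conf is still in place, its ExecStartPre=/bin/sleep
30 line continues to apply until it is removed. Also, network-online.target
delays startup only if a wait-online service for the network manager in
use, such as NetworkManager-wait-online.service, is enabled.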
Next steps
This article describes a problem I had with starting the Apache HTTPD
service on my server. It leads you through the problem determination steps
I took and shows how I used systemd to assist. I also covered the
circumvention I implemented using systemd and the better solution that
followed from my bug report.
As I mentioned at the start, it is very likely that this is the result of a
problem with systemd, specifically the configuration for httpd startup.
Nevertheless, systemd provided me with the tools to locate the likely
source of the problem and to formulate and implement a circumvention.
Neither solution really resolves the problem to my satisfaction. For now,
the root cause of the problem still exists and must be fixed. If the fix is
simply to add the recommended lines to the
/usr/lib/systemd/system/httpd.service file, that would work for me.
One of the things I discovered during this process is that I need to
learn more about defining the sequences in which things start. I will
explore that in my next article, the fifth in this series.
Resources
There is a great deal of information about systemd available on the
internet, but much is terse, obtuse, or even misleading. In addition to
the resources mentioned in this article, the following webpages offer more
detailed and reliable information about systemd startup.
- The Fedora Project has a good, practical guide to systemd. It has pretty
much everything you need to know in order to configure, manage, and
maintain a Fedora computer using systemd.
- The Fedora Project also has a good cheat sheet that cross-references the
old SystemV commands to comparable systemd ones.
- For detailed technical information about systemd and the reasons for
creating it, check out Freedesktop.org's description of systemd.
- Linux.com's "More systemd fun" offers more advanced systemd information
and tips.
There is also a series of deeply technical articles for Linux sysadmins by
Lennart Poettering, the designer and primary developer of systemd. These
articles were written between April 2010 and September 2011, but they are
just as relevant now as they were then. Much of everything else good that
has been written about systemd and its ecosystem is based on these papers.
- Rethinking PID 1
- systemd for Administrators, Part I
- systemd for Administrators, Part II
- systemd for Administrators, Part III
- systemd for Administrators, Part IV
- systemd for Administrators, Part V
- systemd for Administrators, Part VI
- systemd for Administrators, Part VII
- systemd for Administrators, Part VIII
- systemd for Administrators, Part IX
- systemd for Administrators, Part X
- systemd for Administrators, Part XI