[STOR-1395] StoRM Backend service enters failed state when stopped
Created: 16/Apr/21  Updated: 27/May/21  Resolved: 27/Apr/21

Status: Closed
Project: StoRM
Component/s: backend
Affects Version/s: 1.11.20
Fix Version/s: 1.11.21
Security Level: Public (Visible to non-authenticated users.)

Type: Bug
Priority: Major
Reporter: Enrico Vianello
Assignee: Enrico Vianello
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to STOR-1400 StoRM WebDAV service enters failed st... Closed

 Description   

Starting the backend service works fine, but when it is stopped the service is left in a failed state:

Apr 16 16:06:07 omii005-vm03.cnaf.infn.it systemd[1]: storm-backend-server.service: main process exited, code=exited, status=143/n/a
Apr 16 16:06:07 omii005-vm03.cnaf.infn.it systemd[1]: Stopped StoRM Backend service.
Apr 16 16:06:07 omii005-vm03.cnaf.infn.it systemd[1]: Unit storm-backend-server.service entered failed state.
Apr 16 16:06:07 omii005-vm03.cnaf.infn.it systemd[1]: storm-backend-server.service failed.
Apr 16 16:06:07 omii005-vm03.cnaf.infn.it systemd[1]: Started StoRM Backend service.

Exit code 143 is 128 + 15, meaning the program exited in response to SIGTERM (signal 15), which is the signal systemd sends to ask a service to stop. The JVM catches the signal and performs a clean shutdown, i.e. it runs all registered shutdown hooks (StoRM Backend registers one that stops several threads and services), but it still exits with code 143. This is standard JVM behavior, not a StoRM-specific fault.
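
The 128 + signal-number convention can be reproduced without StoRM or a JVM. The following is a minimal sketch, assuming a POSIX-like shell such as bash on Linux, where SIGTERM is signal 15. It uses `sleep` as a stand-in for the service process; it only illustrates where the number 143 comes from and is not how the backend itself is started or stopped.

```shell
# Start a long-running child as a stand-in for the service process.
sleep 60 &
pid=$!

# Ask it to terminate, as a service manager does on "stop".
kill -TERM "$pid"

# Collect the child's exit status. "wait" returns non-zero here,
# so capture it explicitly instead of letting it abort the script.
status=0
wait "$pid" || status=$?

# The shell reports 128 + 15 (SIGTERM) for a child ended by SIGTERM.
echo "exit status: $status"
```

The JVM exits with this same value after running its shutdown hooks, which is why systemd logs `status=143` even though the shutdown was clean.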

We should be able to stop systemd from treating this as a failure by declaring the exit code as a "success" exit status in the unit file:

[Service]
SuccessExitStatus=143
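
On a host still running the affected version, the same directive could presumably be applied through a systemd drop-in rather than by editing the packaged unit file. The sketch below is an assumption, not the fix that was committed: the drop-in file name `exit-status.conf` and the `DROPIN_ROOT` variable are hypothetical, and by default the script writes to a scratch directory so it can be tried safely. Setting `DROPIN_ROOT=/etc/systemd/system` as root would target the real location.

```shell
# Hypothetical root for drop-ins; defaults to a scratch directory.
# Use DROPIN_ROOT=/etc/systemd/system (as root) to apply it for real.
DROPIN_ROOT="${DROPIN_ROOT:-$(mktemp -d)}"

# Drop-in directory for the unit named in the journal output above.
dropin_dir="$DROPIN_ROOT/storm-backend-server.service.d"
mkdir -p "$dropin_dir"

# Declare exit code 143 as a successful exit for this service.
cat > "$dropin_dir/exit-status.conf" <<'EOF'
[Service]
SuccessExitStatus=143
EOF

echo "wrote $dropin_dir/exit-status.conf"

# When writing to the real location, systemd must reload its units:
#   systemctl daemon-reload
#   systemctl restart storm-backend-server
```

A drop-in keeps the change separate from the packaged unit, so it can simply be deleted once a release that ships the directive is installed.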


 Comments   
Comment by Enrico Vianello [ 27/Apr/21 ]

https://github.com/italiangrid/storm/commit/5a142d9a49beb8d9eb64c40a6d1b8d88618e0521

Comment by Andrea Ceccanti [ 20/Apr/21 ]

Ok, I expected that the unit included SuccessExitStatus=143, which it doesn't.

Comment by Enrico Vianello [ 20/Apr/21 ]
[root@transfer-test ~]# systemctl status storm-webdav
● storm-webdav.service - StoRM WebDAV service
   Loaded: loaded (/usr/lib/systemd/system/storm-webdav.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/storm-webdav.service.d
           └─filelimit.conf, storm-webdav.conf
   Active: active (running) since mar 2021-04-20 10:48:28 CEST; 1s ago
 Main PID: 7831 (java)
   CGroup: /system.slice/storm-webdav.service
           └─7831 /usr/bin/java -Xms4192m -Xmx4192m -Djava.io.tmpdir=/var/lib/storm-webdav/work -Dlogging.config=/etc/storm/webdav/logback.xml -jar /usr/share/java/storm-webdav/st...

apr 20 10:48:28 transfer-test.cr.cnaf.infn.it systemd[1]: storm-webdav.service: main process exited, code=exited, status=143/n/a
apr 20 10:48:28 transfer-test.cr.cnaf.infn.it systemd[1]: Stopped StoRM WebDAV service.
apr 20 10:48:28 transfer-test.cr.cnaf.infn.it systemd[1]: Unit storm-webdav.service entered failed state.
apr 20 10:48:28 transfer-test.cr.cnaf.infn.it systemd[1]: storm-webdav.service failed.
apr 20 10:48:28 transfer-test.cr.cnaf.infn.it systemd[1]: Started StoRM WebDAV service.

The service is restarted and its status is fine, but the exit code is 143 and it is wrongly treated as a failure code when the service is stopped. This is the T1 transfer-test node linked to the storm-test testbed. It's easy not to notice. Which T1 deployment do you mean? We can check the systemctl status there.

Comment by Andrea Ceccanti [ 20/Apr/21 ]

?
The unit worked fine AFAIU on the T1 deployment

Comment by Enrico Vianello [ 20/Apr/21 ]

StoRM WebDAV also has the same issue. Maybe it's related to the transition to Java 11?

Comment by Enrico Vianello [ 16/Apr/21 ]

Tested, it works.

Apr 16 16:56:52 omii005-vm03.cnaf.infn.it systemd[1]: Stopped StoRM Backend service.
Apr 16 16:56:52 omii005-vm03.cnaf.infn.it systemd[1]: Started StoRM Backend service. 

Opened PR
https://github.com/italiangrid/storm/pull/151
