- Bug
- Resolution: Fixed
- Minor
- 1.11.15
- Security Level: Public (Visible by non-authn users.)
- None
We see a situation in which a file is empty but its adler32 attribute holds a value other than 00000001 (the checksum of an empty file).
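As a quick check of the value quoted above, the following sketch computes the Adler-32 of empty input and flags the inconsistent combination described here. The helper names `adler32_hex` and `is_suspicious` are hypothetical and are not part of StoRM; the recorded checksum is passed in as a plain string rather than read from the extended attribute.

```python
import zlib

# Adler-32 of zero bytes, formatted as 8 hex digits.
EMPTY_ADLER32 = "00000001"


def adler32_hex(data: bytes) -> str:
    """Return the Adler-32 of data as an 8-digit lowercase hex string."""
    return f"{zlib.adler32(data) & 0xFFFFFFFF:08x}"


def is_suspicious(file_size: int, recorded_checksum: str) -> bool:
    """True when a file is empty but its recorded checksum is not the
    checksum of an empty file (hypothetical helper, for illustration)."""
    return file_size == 0 and recorded_checksum.lower() != EMPTY_ADLER32
```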
The hypothesis is that the following sequence occurs:
- FTS executes a PtP on StoRM and starts a transfer on one GridFTP server
- the GridFTP server, for some reason, gets stuck
- FTS times out the transfer, issues an srmRm on the file and re-issues the PtP, landing on another GridFTP server
- this time the transfer succeeds, the file is not empty and the checksum is correctly computed and recorded in the xattr
- the first GridFTP server wakes up and resumes the first transfer where it left off. In particular it opens the file with O_TRUNC, making it empty. Soon after, it stops because the transfer was aborted, without touching the xattr holding the checksum
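The last two steps of the hypothesised sequence can be reproduced locally with a minimal sketch. It is only an illustration: a plain dictionary stands in for the extended attribute, the payload is arbitrary, and the function name `simulate_stale_truncate` is hypothetical. No GridFTP or StoRM code is involved.

```python
import os
import zlib


def simulate_stale_truncate(path: str) -> tuple[str, str]:
    """Replay the successful transfer followed by the late O_TRUNC open.

    Returns (recorded_checksum, actual_checksum) as 8-digit hex strings.
    """
    xattr = {}  # stand-in for the file's extended attributes (assumption)

    # Second GridFTP server: the transfer succeeds and the checksum is recorded.
    payload = b"payload"
    with open(path, "wb") as f:
        f.write(payload)
    xattr["adler32"] = f"{zlib.adler32(payload) & 0xFFFFFFFF:08x}"

    # First GridFTP server wakes up: it opens the file with O_TRUNC, then
    # aborts without writing data and without updating the checksum.
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    os.close(fd)

    # The file is now empty, but the recorded checksum is unchanged.
    with open(path, "rb") as f:
        actual = f"{zlib.adler32(f.read()) & 0xFFFFFFFF:08x}"
    return xattr["adler32"], actual
```

After the call the file on disk is empty, while the recorded value still describes the payload written by the successful transfer, which matches the observed symptom.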
More details are in a mail thread on storm-devel.
The tentative fix (really a mitigation) is to check, in the StoRM GridFTP DSI module, that the file is empty before opening it. This assumes that the GridFTP server got stuck before the open; if the server gets stuck after the check, the problem persists.
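The idea of the mitigation can be sketched as follows. This is not the actual change to the DSI module, which is not Python; the names `open_for_transfer` and `NonEmptyFileError` are hypothetical, and treating a missing file as empty is an assumption of this sketch.

```python
import os


class NonEmptyFileError(Exception):
    """Raised when the target already holds data (hypothetical error type)."""


def open_for_transfer(path: str) -> int:
    """Open path for writing only if it is currently empty or missing.

    Returns a file descriptor. The check and the open are separate steps,
    so a process that stalls between them can still truncate a file that
    has been filled in the meantime.
    """
    try:
        size = os.stat(path).st_size
    except FileNotFoundError:
        size = 0  # assumption: a missing file counts as empty

    if size > 0:
        # Another transfer has already written data: refuse to truncate it.
        raise NonEmptyFileError(f"{path} is not empty ({size} bytes)")

    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
```

A caller that has been stuck before reaching this function would get an error instead of silently emptying the file. The window between `os.stat` and `os.open` is the residual race mentioned above.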
- relates to: STOR-1220 Enabling GPFS preallocation causes gfal-copy failures (Closed)