Observed errors on storm-fronted-server log file:
10/23 10:42:45.859 Thread 34 - ERROR [27c4a938-0d27-44eb-8b62-6ed696c50d3d]: rpcResponseHandler_ReleaseFiles : ERROR: XML-RPC Fault: RPC failed at server. Failed to invoke method releaseFiles in class it.grid.storm.xmlrpc.XMLRPCMethods: java.lang.RuntimeException: (File: /storage/gpfs_tsm_atlas/atlas/atlasmctape/mc16_13TeV/EVNT/e6719_e5984/mc16_13TeV.411033.PowhegHerwig7EvtGen_H7UE_704_tchan_lept_top.merge.EVNT.e6719_e5984_tid14297408_00/EVNT.14297408._000605.pool.root.1) Error removing extended attribute user.storm.pinned: No data available (code: 0)
At Backend's side:
it.grid.storm.ea.ExtendedAttributesException: java.lang.RuntimeException: (File: /storage/gemss_test1/tape/1-ddtest10GB) Error removing extended attribute user.storm.pinned: No data available at it.grid.storm.ea.ExtendedAttributesSwigImpl.rmXAttr(ExtendedAttributesSwigImpl.java:49) at it.grid.storm.ea.MetricsEAAdapter.rmXAttr(MetricsEAAdapter.java:54) at it.grid.storm.ea.StormEA.removePinned(StormEA.java:153) at it.grid.storm.synchcall.command.datatransfer.ReleaseFilesCommand.removePinneExtendedAttribute(ReleaseFilesCommand.java:398) at it.grid.storm.synchcall.command.datatransfer.ReleaseFilesCommand.execute(ReleaseFilesCommand.java:253) at it.grid.storm.synchcall.SimpleSynchcallDispatcher.processRequest(SimpleSynchcallDispatcher.java:61) at it.grid.storm.xmlrpc.XMLRPCExecutor.execute(XMLRPCExecutor.java:81) at it.grid.storm.xmlrpc.XMLRPCMethods.releaseFiles(XMLRPCMethods.java:48) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.xmlrpc.server.ReflectiveXmlRpcHandler.invoke(ReflectiveXmlRpcHandler.java:111) at org.apache.xmlrpc.server.ReflectiveXmlRpcHandler.execute(ReflectiveXmlRpcHandler.java:102) at org.apache.xmlrpc.server.XmlRpcServerWorker.execute(XmlRpcServerWorker.java:43) at org.apache.xmlrpc.server.XmlRpcServer.execute(XmlRpcServer.java:83) at org.apache.xmlrpc.server.XmlRpcStreamServer.execute(XmlRpcStreamServer.java:182) at org.apache.xmlrpc.webserver.XmlRpcServletServer.execute(XmlRpcServletServer.java:103) at org.apache.xmlrpc.webserver.XmlRpcServlet.doPost(XmlRpcServlet.java:120) at javax.servlet.http.HttpServlet.service(HttpServlet.java:755) at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448) at it.grid.storm.xmlrpc.XmlRpcTokenFilter.doFilter(XmlRpcTokenFilter.java:97) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at com.codahale.metrics.jetty8.InstrumentedHandler.handle(InstrumentedHandler.java:192) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:488) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:943) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.RuntimeException: (File: /storage/gemss_test1/tape/1-ddtest10GB) Error removing extended attribute user.storm.pinned: No data available at it.grid.storm.filesystem.swig.storm_xattrsJNI.remove_xattr(Native Method) at it.grid.storm.filesystem.swig.storm_xattrs.remove_xattr(storm_xattrs.java:40) at it.grid.storm.ea.ExtendedAttributesSwigImpl.rmXAttr(ExtendedAttributesSwigImpl.java:46) ... 44 common frames omitted
The tested sequence that triggers this behaviour is:
- srmBoL on SURL1 + SURL2 => got Token T_BOL
- srmPtG on SURL1 => got Token T_PTG1
- srmPtG on SURL2 => got Token T_PTG2
- srmRf on SURL1 + T_PTG1
- srmRf on SURL2 + T_PTG2
- srmRf on T_BOL
the latest srmRf returns a SRM_INTERNAL_ERROR due to the exception and the client starts to resubmit the srmRf again. Until the main operation reach a timeout.
Adding a try-catch on the remove-pin native call we can skip the exception and go ahead to a successful Rf (no pin is found because it has been removed by the srmRf submitted before).
This exception seems to be a consequence of GPFS update from v4 to v5.
It may also be related to CentOS 7.
This task needs further investigation.
At the moment, with a fixed rpm, the problem seems mitigated and no loop of srmRf happens anymore.
Backend's log contains not so many entries like:
17:42:01.978 - INFO [xmlrpc-73] - Cannot remove 'user.storm.pinned'. Attribute not found for file: /storage/gpfs_tsm_atlas/atlas/atlasmctape/mc16_13TeV/EVNT/e4944_e5984/mc16_13TeV.363602.MGPy8EG_N30NLO_Wenu_Ht0_70_BFilter.merge.EVNT.e4944_e5984_tid14442895_00/EVNT.14442895._000688.pool.root.1