[CREAM-76] Minor issues from INFN-T1 Created: 27/May/13  Updated: 27/May/13  Due: 15/May/13  Resolved: 27/May/13

Status: Resolved
Project: CREAM
Component/s: None
Affects Version/s: None
Fix Version/s: 1.15.2
Security Level: Public (Visible by non-authn users.)

Type: Bug Priority: Major
Reporter: Paolo Andreetto Assignee: Paolo Andreetto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

This ticket duplicates https://savannah.cern.ch/bugs/?101108 and refers to https://ggus.eu/ws/ticket_info.php?ticket=92492

The fix concerns:

1) After configuring with yaim, many tomcat6 errors are logged in catalina.out:

java.lang.IllegalArgumentException: Document base /usr/share/tomcat6/webapps/ce-cream-es does not exist

SEVERE: A web application appears to have started a thread named [Timer-4] but has failed to stop it. This is very likely to create a memory leak.

SEVERE: A web application created a ThreadLocal with key of type [null] (value [org.apache.axiom.util.UIDGenerator$1@4a88e4c0]) and a value of type [long[]] (value [[J@24edb15c]) but failed to remove it when the web application was stopped. To prevent a memory leak, the ThreadLocal has been forcibly removed.

After a while the CE starts swapping and becomes unhealthy.

WORKAROUND:
rm -f /usr/share/tomcat6/conf/Catalina/localhost/ce-cream-es.xml
/etc/init.d/tomcat6 stop && /etc/init.d/glite-ce-blah-parser stop && sleep 3 && /etc/init.d/glite-ce-blah-parser start && /etc/init.d/tomcat6 start
SOLUTION: Have this fixed in the next update
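
To check whether other context descriptors have the same problem as ce-cream-es.xml, something like the following could be used. This is only a sketch: the function name find_stale_contexts is made up for illustration, and it assumes that each descriptor NAME.xml refers to a webapp directory or .war file called NAME, which is what the "Document base" error suggests but is not confirmed for every descriptor.

```shell
#!/bin/sh
# List Tomcat context descriptors that have no matching webapp.
# Assumption: NAME.xml refers to WEBAPPS/NAME or WEBAPPS/NAME.war; a descriptor
# with an explicit docBase pointing elsewhere would be reported wrongly.
find_stale_contexts() {
    ctx_dir=$1
    webapps_dir=$2
    for ctx in "$ctx_dir"/*.xml; do
        [ -e "$ctx" ] || continue    # the glob matched nothing
        name=$(basename "$ctx" .xml)
        if [ ! -d "$webapps_dir/$name" ] && [ ! -f "$webapps_dir/$name.war" ]; then
            printf '%s\n' "$ctx"
        fi
    done
}

# Example, using the paths from this ticket:
# find_stale_contexts /usr/share/tomcat6/conf/Catalina/localhost /usr/share/tomcat6/webapps
```

The function only prints candidates; removing a descriptor is left as a manual step, as in the workaround above.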

2) Log rotation limits set by the default configuration:

[root@ce01-lcg ~]# cat /etc/glite-ce-cream/log4j.properties | egrep 'MaxFileSize|MaxBackupIndex'
log4j.appender.fileout.MaxFileSize=1000KB
log4j.appender.fileout.MaxBackupIndex=20

These values are too small for a production environment: an entire job lifecycle does not fit in 20MB of logs. Furthermore, every run of yaim restores the small defaults.

WORKAROUND:
Modify /etc/glite-ce-cream/log4j.properties:
log4j.appender.fileout.MaxFileSize=10MB

chattr +i /etc/glite-ce-cream/log4j.properties

SOLUTION: Have this fixed in the next update
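
The amount of log history kept can be estimated from the two properties: a log4j 1.x RollingFileAppender keeps the active file plus MaxBackupIndex backups, so the total is roughly MaxFileSize * (MaxBackupIndex + 1). The sketch below computes that figure. The function name log_capacity_bytes is illustrative, and it assumes the appender is called "fileout" as shown above and that the size is a plain number or uses a KB, MB or GB suffix.

```shell
#!/bin/sh
# Estimate how many bytes of log a log4j 1.x RollingFileAppender retains.
# Assumptions: appender name "fileout"; MaxFileSize is a plain byte count
# or carries a KB/MB/GB suffix; each property appears in the file.
log_capacity_bytes() {
    props=$1
    size=$(sed -n 's/^log4j\.appender\.fileout\.MaxFileSize=//p' "$props" | tail -n 1 | tr -d ' \r')
    backups=$(sed -n 's/^log4j\.appender\.fileout\.MaxBackupIndex=//p' "$props" | tail -n 1 | tr -d ' \r')
    num=${size%[KMG]B}               # strip the unit suffix, if any
    case $size in
        *KB) mult=1024 ;;
        *MB) mult=$((1024 * 1024)) ;;
        *GB) mult=$((1024 * 1024 * 1024)) ;;
        *)   mult=1 ;;
    esac
    # active file + rotated backups
    echo $(( num * mult * (backups + 1) ))
}

# Example:
# log_capacity_bytes /etc/glite-ce-cream/log4j.properties
```

For the values shown above (1000KB, 20 backups) this yields 21504000 bytes, which matches the roughly 20MB mentioned in the report.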

3) After configuring with yaim, services are up, but the CE remains unresponsive:

[sdalpra@ui01-ad32 ~]$ glite-ce-job-submit -a -r ce01-lcg.cr.cnaf.infn.it:8443/cream-lsf-dteam my.jdl
2013-03-14 14:41:23,596 FATAL - Received NULL fault; the error is due to another cause: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection timed out]

[sdalpra@ui01-ad32 ~]$ glite-ce-job-submit -a -r ce01-lcg.cr.cnaf.infn.it:8443/cream-lsf-dteam my.jdl
2013-03-14 14:43:10,813 FATAL - Received NULL fault; the error is due to another cause: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection timed out]

Tomcat is actually in a bad state:

[root@ce01-lcg ~]# service tomcat6 status
tomcat6 (pid 20389) is running... [ OK ]
[root@ce01-lcg ~]# service tomcat6 stop
Stopping tomcat6: [FAILED]

WORKAROUND:
service glite-ce-blah-parser stop
service tomcat6 stop && service glite-ce-blah-parser stop && sleep 3 && service glite-ce-blah-parser start && service tomcat6 start

Then it works:
[sdalpra@ui01-ad32 ~]$ glite-ce-job-submit -a -r ce01-lcg.cr.cnaf.infn.it:8443/cream-lsf-dteam my.jdl
https://ce01-lcg.cr.cnaf.infn.it:84...

SOLUTION: Have this fixed in the next update
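
Until the fix is released, the restart sequence from the workaround for item 3 could be wrapped in a small function so it can be rehearsed before touching the services. This is a sketch only: restart_cream and the RUN variable are hypothetical names introduced here, not part of CREAM or yaim. Setting RUN=echo prints the commands instead of executing them.

```shell
#!/bin/sh
# Restart sequence from the workaround for item 3, in the same order.
# RUN is a hypothetical hook: RUN=echo prints each command instead of running it.
restart_cream() {
    run=${RUN:-}
    # The first stop is not chained, as in the workaround: a failure here
    # does not prevent the remaining steps.
    $run service glite-ce-blah-parser stop
    # The remaining steps stop at the first command that fails.
    $run service tomcat6 stop &&
    $run service glite-ce-blah-parser stop &&
    $run sleep 3 &&
    $run service glite-ce-blah-parser start &&
    $run service tomcat6 start
}

# Dry run:   RUN=echo restart_cream
# Real run:  restart_cream   (as root)
```

The function returns the exit status of the last command executed, so a failed "service tomcat6 stop" is visible to the caller.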


Generated at Wed Apr 24 02:26:48 CEST 2024 using Jira 9.12.5#9120005-sha1:fa8821cdb090f6b5ec0424ddb13fa19bc92d8429.