EM12c Agent – java.lang.OutOfMemoryError: Java heap space

After an AIX 7.1 server reboot, one agent did not start. The command emctl start agent failed with a java.lang.OutOfMemoryError: Java heap space message, and several dump files were created in the agent subdirectory:

-rw-r----- 1 oracle dba 291373 Oct 19 11:00 javacore.20151019.110002.14876844.0002.txt
-rw-r----- 1 oracle dba 22182281 Oct 19 11:00 heapdump.20151019.110002.14876844.0001.phd
-rw-r--r-- 1 oracle dba 167408 Oct 19 11:00 Snap.20151019.110002.14876844.0003.trc
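These dumps pile up quickly when the agent is started several times, so it helps to list them in one go. A minimal sketch, assuming plain POSIX find; the function name list_jvm_dumps is my own and not part of any Oracle or IBM tooling:

```shell
#!/bin/sh
# list_jvm_dumps DIR
# Print the IBM JVM dump files (javacore, heapdump, Snap) found in DIR and
# its subdirectories, sorted by name. The function name is hypothetical.
list_jvm_dumps() {
    dir=${1:-.}
    find "$dir" -type f \
        \( -name 'javacore.*.txt' -o -name 'heapdump.*.phd' -o -name 'Snap.*.trc' \) \
        | sort
}

# Example with the path from this post:
# list_jvm_dumps /u00/app/oracle/product/agent12c/agent_inst/sysman
```

The heap dumps in particular are large (about 22 MB each here), so it is worth knowing where they end up.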

First I took a look in the agent logfiles to gather more information.

/u00/app/oracle/product/agent12c/agent_inst/sysman/log/emdctlj.log

2015-10-19 10:59:11,711 [1:F7DA781F:main-16449708] INFO - EmdCtl Timezone = Europe/Zurich
2015-10-19 10:59:11,884 [1:F288756D:main-11600120] INFO - EmdCtl Timezone = Europe/Zurich
2015-10-19 10:59:13,584 [1:2CF29A0A] INFO - Disconnecting: client terminus
2015-10-19 10:59:13,585 [1:2CF29A0A] INFO - stdout: Status agent Failure:unable to connect to http server at https://srvaix111.mvn.ch:3874/emd/lifecycle/main/. [peer not authenticated]
2015-10-19 10:59:13,585 [1:2CF29A0A] INFO - Exit Code: 2

There are a lot of MOS notes available for the search term peer not authenticated, but none of them were helpful.

/u00/app/oracle/product/agent12c/agent_inst/sysman/log/emagent.nohup

JVMDUMP032I JVM requested Snap dump using '/u00/app/oracle/product/agent12c/agent_inst/sysman/emd/Snap.20151019.110002.14876844.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /u00/app/oracle/product/agent12c/agent_inst/sysman/emd/Snap.Snap.20151019.110002.14876844.0003.trc
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
Exception in thread "HTTP Listener-51" java.lang.NoClassDefFoundError: org/apache/log4j/spi/ThrowableInformation
at ?
Exception in thread "GC.SysExecutor.6" java.lang.OutOfMemoryError: Java heap spacejava/lang/OutOfMemoryError: Java heap space
at java/lang/Throwable.fillInStackTrace (Native Method)
at java/lang/Throwable.<init> (Throwable.java:56)
at java/lang/Throwable.<init> (Throwable.java:67)
at java/lang/OutOfMemoryError.<init> (OutOfMemoryError.java:46)
at java/lang/ThreadLocal$ThreadLocalMap.set (ThreadLocal.java:436)
at java/lang/ThreadLocal$ThreadLocalMap.access$100 (ThreadLocal.java:254)
at java/lang/ThreadLocal.setInitialValue (ThreadLocal.java:156)
at java/lang/ThreadLocal.get (ThreadLocal.java:142)
at oracle/sysman/gcagent/util/logging/InMemoryLogging.isDisabled (InMemoryLogging.java:72)
at oracle/sysman/gcagent/util/logging/InMemoryLogging.addMessage (InMemoryLogging.java:100)
at oracle/sysman/gcagent/util/logging/Logger.info (Logger.java:157)
at oracle/sysman/gcagent/util/system/ThreadManager$ThreadUncaughtExceptionHandler.uncaughtException (ThreadManager.java:426)
at java/lang/Thread.uncaughtException (Thread.java:1218)
Exception in thread "HTTP Listener-49"
java.lang.OutOfMemoryError: Java heap space
at ?

Here we see that the EM12c agent had a JVM memory problem at startup. Let’s try out the emctl clearstate agent command.
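Before that, a quick way to check how often the error shows up in a log such as emagent.nohup is a simple grep count. A minimal sketch; the function name count_oom is my own, and the pattern covers both spellings seen above (java.lang.OutOfMemoryError and java/lang/OutOfMemoryError):

```shell
#!/bin/sh
# count_oom FILE
# Print the number of lines in FILE that mention OutOfMemoryError, in either
# the dotted or the slash-separated class name form.
# The function name is hypothetical.
count_oom() {
    grep -c -E 'java[./]lang[./]OutOfMemoryError' "$1"
}

# Example with the log from this post:
# count_oom /u00/app/oracle/product/agent12c/agent_inst/sysman/log/emagent.nohup
```

Note that grep -c counts matching lines, not occurrences, so a line that contains the error twice is counted once.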

emctl clearstate agent

From the Oracle documentation about the clearstate command – http://docs.oracle.com/cd/E29597_01/doc.1111/e24473/emctl.htm#r21c1-t27

emctl clearstate Clears the state directory contents. The files that are located under $ORACLE_HOME/sysman/emd/state will be deleted if this command is run. The state files are the files which are ready for the agent to convert them into corresponding xml files.

The emctl clearstate agent command produced the same error:

oracle@srvaix111:/u00/app/oracle/product/agent12c/agent_inst/sysman/log/ [agent12c] emctl clearstate agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/10/19 11:00:54 - please wait.
JVMDUMP032I JVM requested Heap dump using '/u00/app/oracle/product/agent12c/agent_inst/sysman/log/heapdump.20151019.110054.23920788.0001.phd' in response to an event
JVMDUMP010I Heap dump written to /u00/app/oracle/product/agent12c/agent_inst/sysman/log/heapdump.20151019.110054.23920788.0001.phd
JVMDUMP032I JVM requested Java dump using '/u00/app/oracle/product/agent12c/agent_inst/sysman/log/javacore.20151019.110054.23920788.0002.txt' in response to an event
JVMDUMP010I Java dump written to /u00/app/oracle/product/agent12c/agent_inst/sysman/log/javacore.20151019.110054.23920788.0002.txt
JVMDUMP032I JVM requested Snap dump using '/u00/app/oracle/product/agent12c/agent_inst/sysman/log/Snap.20151019.110054.23920788.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /u00/app/oracle/product/agent12c/agent_inst/sysman/log/Snap.20151019.110054.23920788.0003.trc
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
EMD clearstate failed: Offline clearstate failed : java.lang.OutOfMemoryError: Java heap space

After some research in My Oracle Support for OutOfMemoryError: Java heap space, I found this note:

Duplicate 1952593.1 – EM12c: emctl start agent Fails With ‘ Target Interaction Manager failed at Startup java.lang.OutOfMemoryError: Java heap space’ reported in gcagent_errors.log (Doc ID 1902124.1)

From the MOS note:

  1. Kill any leftover process -> not for me, the agent was not started at all
  2. Move old files from  /agent_inst/sysman/emd/state/* to a new directory -> sounds very interesting
  3. Execute clearstate command -> let’s try it out

The directory /u00/app/oracle/product/agent12c/agent_inst/sysman/emd/state/:

oracle@srvaix111:/u00/app/oracle/product/agent12c/agent_inst/sysman/emd/state/ [agent12c] ll
total 144
drwxr----- 16 oracle dba 4096 Oct 19 11:07 .
drwxr----- 13 oracle dba 8192 Oct 19 16:25 ..
-rw-r----- 1 oracle dba 10 Oct 19 11:07 2F18E248E1C379BFA31D904F149E3FDE_failedLogin.log
-rw-r----- 1 oracle dba 30 Oct 19 11:07 2F18E248E1C379BFA31D904F149E3FDE_seg_adv_count.log
-rw-r----- 1 oracle dba 10 Oct 19 11:07 4E85BB986B86B4995D5004CE37EB5C68_failedLogin.log
-rw-r----- 1 oracle dba 32 Oct 19 11:07 4E85BB986B86B4995D5004CE37EB5C68_seg_adv_count.log
-rw-r----- 1 oracle dba 19 Oct 19 11:07 TETI57.DB_ocm
-rw-r----- 1 oracle dba 19 Oct 19 11:07 SETZ71.DB_ocm
drwxr----- 2 oracle dba 4096 Oct 19 11:07 adr
drwxr----- 2 oracle dba 4096 Oct 19 11:07 configLogs
drwxr----- 6 oracle dba 256 Oct 19 11:07 configstate
drwxr----- 8 oracle dba 256 Oct 19 11:07 fetchlet_state
drwxr----- 3 oracle dba 256 Oct 19 11:07 has
drwxr----- 2 oracle dba 256 Oct 19 11:07 inbox
drwxr----- 2 oracle dba 256 Oct 19 11:07 mcePersist
drwxr----- 2 oracle dba 256 Oct 19 11:07 oracleHomeFirstConfig
-rw-r----- 1 oracle dba 178 Oct 19 11:07 parse-log-B52A1AB1C0789307041D7DFFE3574C2C
drwxr----- 2 oracle dba 4096 Oct 19 11:07 persistSchedules
-rw-r----- 1 oracle dba 12368 Oct 19 11:07 progResUtil.log
drwxr----- 5 oracle dba 256 Oct 19 11:07 recvlet_state
drwxr----- 2 oracle dba 4096 Oct 19 11:07 severityTraceRecords
drwxr----- 9 oracle dba 256 Oct 19 11:07 statemgmt
drwxr----- 5 oracle dba 256 Oct 19 11:07 storage
drwxr----- 3 oracle dba 256 Oct 19 11:07 trace

There was a lot of stuff in this directory – time to move:

oracle@srvaix111:/u00/app/oracle/product/agent12c/agent_inst/sysman/emd/ [agent12c] mv state/* /u04/tmp/
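The same move can be wrapped in a small function, so that the target directory is created first and an already empty state directory does not cause an error. This is only a sketch of step 2 from the MOS note, not Oracle's procedure: the function name move_state_aside and the backup subdirectory in the example are my own assumptions.

```shell
#!/bin/sh
# move_state_aside STATE_DIR BACKUP_DIR
# Move every visible entry (files and subdirectories) of STATE_DIR into
# BACKUP_DIR. BACKUP_DIR is created if it does not exist.
# The function name is hypothetical.
move_state_aside() {
    state_dir=$1
    backup_dir=$2
    [ -d "$state_dir" ] || { echo "no such directory: $state_dir" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    for entry in "$state_dir"/*; do
        # With an empty directory the glob stays unexpanded; skip it.
        [ -e "$entry" ] || continue
        mv "$entry" "$backup_dir"/ || return 1
    done
    return 0
}

# Example with the paths from this post (state_backup is an assumed name),
# run as the agent owner while the agent is down:
# move_state_aside /u00/app/oracle/product/agent12c/agent_inst/sysman/emd/state /u04/tmp/state_backup
```

Using a dedicated backup directory instead of /u04/tmp/ itself makes the later cleanup easier, because the moved files are not mixed with anything else.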

emctl clearstate agent – 2nd run

This time the clearstate command ran without errors.

oracle@srvaix111:/u00/app/oracle/product/agent12c/agent_inst/sysman/emd/ [agent12c] emctl clearstate agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD clearstate completed successfully

emctl start agent

The start was successful – fine.

oracle@srvaix111:/u00/app/oracle/product/agent12c/agent_inst/sysman/emd/ [agent12c] emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting agent ............................... started.

Summary

The cleanup of the state directory solved the problem. But why does an emctl clearstate agent command not clean up the state directory as described in the documentation? Aaah, and don’t forget to clean up the moved files. They are not needed anymore.