Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: CDH 5.7.1
Fix Version/s: None
Component/s: Cloudera Manager
Description
After moving the Service Monitor role from one host to another, we started getting error messages about messages being dropped by the role stage of the Service Monitor pipeline. The Service Monitor process is running but cannot serve requests as before, and the Role Stage Queue Size graph constantly shows 2049 messages.
Although we can see currently running YARN applications, we cannot see Impala queries; instead we get the error "Internal error while querying the Service Monitor".
The Service Monitor logs contain only this error:
2017-06-01 09:28:07,040 ERROR com.cloudera.cmon.kaiser.BaseTestRunner: Error running subject health tests
java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at com.cloudera.cmon.kaiser.BaseTestRunner.submitTestsOnSubjectsByType(BaseTestRunner.java:232)
    at com.cloudera.cmon.kaiser.SMONTestRunner.runRoleAndServiceTestsForSession(SMONTestRunner.java:166)
    at com.cloudera.cmon.kaiser.SMONTestRunner.runTestsForSession(SMONTestRunner.java:137)
    at com.cloudera.cmon.kaiser.BaseTestRunner.runTestsOnAllSubjects(BaseTestRunner.java:143)
    at com.cloudera.cmon.kaiser.KaiserService$KaiserServiceRunner.run(KaiserService.java:138)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at com.cloudera.cmon.kaiser.KaiserSubjectRecordFactory.extendStatusForSpecialRoles(KaiserSubjectRecordFactory.java:732)
    at com.cloudera.cmon.kaiser.KaiserSubjectRecordFactory.createForRole(KaiserSubjectRecordFactory.java:485)
    at com.cloudera.cmon.kaiser.BaseTestRunner$2.run(BaseTestRunner.java:295)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
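For anyone reading the trace: its shape suggests that the NullPointerException was thrown inside a Runnable executed by a thread pool, and that the calling thread only saw it later, wrapped in an ExecutionException, when it called FutureTask.get(). The sketch below reproduces that wrapping with plain JDK classes. It is only an illustration, not Cloudera code: the class WrappedNpeDemo, the method describeFailure, and the null status variable are hypothetical names made up for this example, and it says nothing about why the value is null in the Service Monitor itself.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WrappedNpeDemo {

    // Hypothetical stand-in for a per-role task; not taken from Cloudera Manager.
    static String describeFailure() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Submitting a Runnable wraps it in Executors$RunnableAdapter,
            // the same adapter that appears in the trace above.
            Future<?> result = pool.submit(new Runnable() {
                @Override
                public void run() {
                    String status = null;   // assumed missing value, for illustration only
                    status.length();        // throws NullPointerException on the worker thread
                }
            });
            // get() rethrows the worker's failure wrapped in ExecutionException.
            result.get();
            return "no failure";
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            return e.getClass().getSimpleName()
                    + " caused by " + cause.getClass().getSimpleName();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(describeFailure());
    }
}
```

The "Caused by" section of the trace is therefore the place to look: the frame at KaiserSubjectRecordFactory.extendStatusForSpecialRoles is where the null dereference happened, while the outer frames only show where the result was collected.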
This is a major problem for us, since we cannot see the Impala queries currently running on the cluster.
I think it is also related to DISTRO-802.