Description
When running the Kerberos Jenkins job: http://sandbox.jenkins.cloudera.com/view/RecordService/job/record-service-kerberos-test-5.4.x/610/
the job failed with the following error:
16:42:49 Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 172.141 sec <<< FAILURE! - in com.cloudera.recordservice.core.TestKerberosConnection
16:42:49 testPersistedTokens(com.cloudera.recordservice.core.TestKerberosConnection) Time elapsed: 126.574 sec <<< ERROR!
16:42:49 java.io.IOException: Could not connect to RecordServiceWorker: vd0226.halxg.cloudera.com:13050
16:42:49 at java.net.SocketInputStream.socketRead0(Native Method)
16:42:49 at java.net.SocketInputStream.read(SocketInputStream.java:152)
16:42:49 at java.net.SocketInputStream.read(SocketInputStream.java:122)
16:42:49 at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
16:42:49 at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
16:42:49 at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
16:42:49 at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
16:42:49 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
16:42:49 at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
16:42:49 at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:277)
16:42:49 at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
16:42:49 at com.cloudera.recordservice.core.ThriftUtils.createTransport(ThriftUtils.java:159)
16:42:49 at com.cloudera.recordservice.core.RecordServiceWorkerClient.connect(RecordServiceWorkerClient.java:426)
16:42:49 at com.cloudera.recordservice.core.RecordServiceWorkerClient.access$1200(RecordServiceWorkerClient.java:45)
16:42:49 at com.cloudera.recordservice.core.RecordServiceWorkerClient$Builder.connect(RecordServiceWorkerClient.java:234)
16:42:49 at com.cloudera.recordservice.core.TestKerberosConnection.testPersistedTokens(TestKerberosConnection.java:395)
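The frames above show the failure inside TSaslTransport.open, while reading a SASL message, which suggests the TCP connection itself was established and the handshake read then failed. To check separately whether the worker port accepts plain TCP connections, a minimal sketch could look like the following. WorkerPortProbe and probe are hypothetical names for illustration and are not part of RecordService; the host and port are copied from the error message.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper; not part of the RecordService code base.
public class WorkerPortProbe {

    // Attempts a plain TCP connect (no SASL/Kerberos) and reports the outcome.
    static String probe(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return "CONNECTED";
        } catch (IOException e) {
            // Covers connection refused, connect timeout, and unresolved hosts.
            return "CONNECT_FAILED: " + e;
        }
    }

    public static void main(String[] args) {
        // Host and port are taken from the error message above.
        System.out.println(probe("vd0226.halxg.cloudera.com", 13050, 5000));
    }
}
```

If this reports CONNECTED while the test still fails, the problem is more likely in the SASL/Kerberos handshake or on the worker side than in basic network reachability.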
When we investigated in CM, we found the following errors in the ZooKeeper log:
Dec 8, 4:21:45.430 PM WARN org.apache.zookeeper.server.NIOServerCnxn
caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x150bb4e4ac9a94f, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
Dec 8, 4:21:45.432 PM INFO org.apache.zookeeper.server.NIOServerCnxn
Closed socket connection for client /10.17.223.26:53823 which had sessionid 0x150bb4e4ac9a94f
...
Dec 8, 4:22:50.326 PM INFO org.apache.zookeeper.server.PrepRequestProcessor
Got user-level KeeperException when processing sessionid:0x150bb4e4ac9a9dd type:delete cxid:0x5 zxid:0x2083e9 txntype:-1 reqpath:n/a Error Path:/recordservice/planners/recordserviced@vd0226.halxg.cloudera.com:13050 Error:KeeperErrorCode = NoNode for /recordservice/planners/recordserviced@vd0226.halxg.cloudera.com:13050
The RecordService and ZooKeeper logs are attached.