Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: CDH4.4.0
    • Fix Version/s: None
    • Component/s: HDFS
    • Labels:

      Description

      Hi, in our HDFS HA setup, I see the following exceptions when I try to fail back. Automatic failover is enabled.

      When an automatic failover happens, I run health checks, and once they show that the primary is healthy I fail back manually with the haadmin command. The failback succeeds, but the exceptions and the exit status of 255 worry me, because I could not reliably script this step if I needed to. Please let me know whether this is a known issue and whether it is easily resolved.
      I am using Cloudera Hadoop 4.4.0, if that helps.

      Thanks.

      sudo -u hdfs hdfs haadmin -failover nn2 nn1
      Operation failed: Unable to become active. Service became unhealthy while trying to failover.
          at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:652)
          at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58)
          at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:591)
          at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:588)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
          at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:588)
          at org.apache.hadoop.ha.ZKFCRpcServer.gracefulFailover(ZKFCRpcServer.java:94)
          at org.apache.hadoop.ha.protocolPB.ZKFCProtocolServerSideTranslatorPB.gracefulFailover(ZKFCProtocolServerSideTranslatorPB.java:61)
          at org.apache.hadoop.ha.proto.ZKFCProtocolProtos$ZKFCProtocolService$2.callBlockingMethod(ZKFCProtocolProtos.java:1351)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)
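
Since the concern is scripting a failback when the command can exit 255 even though the target ends up active, one possible approach is to judge the result by the NameNode state reported by `haadmin -getServiceState` instead of by the exit status alone. The following is only a minimal sketch under that assumption: the `failback` function and the `HAADMIN` variable are hypothetical names introduced for illustration, and whether a non-zero exit is safe to downgrade to a warning depends on the cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): decide whether a failback worked
# by checking the resulting NameNode state, not only the exit status.
# HAADMIN is an assumption of this sketch; override it to match your setup.
HAADMIN="${HAADMIN:-sudo -u hdfs hdfs haadmin}"

failback() {
    from="$1"   # NameNode ID that is currently active, e.g. nn2
    to="$2"     # NameNode ID that should become active, e.g. nn1

    # Attempt the failover; keep the exit status for reporting only.
    $HAADMIN -failover "$from" "$to"
    rc=$?

    # -getServiceState prints "active" or "standby" for the given NameNode ID.
    state="$($HAADMIN -getServiceState "$to" 2>/dev/null)"

    if [ "$state" = "active" ]; then
        if [ "$rc" -ne 0 ]; then
            echo "warning: failover exited $rc but $to is active" >&2
        fi
        return 0
    fi

    echo "error: $to is '$state' after failover (exit $rc)" >&2
    return 1
}
```

A caller could then write `failback nn2 nn1 || exit 1`. The sketch treats the observed state as the source of truth, so the warning on stderr should still be reviewed rather than ignored.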

            People

            • Assignee:
              Unassigned
              Reporter:
              mnikhil Nikhil
            • Votes:
              0
              Watchers:
              1
