Description
The following command results in a corrupt NN editlog (note the double slash and reading from stdin):
$ cat /usr/share/dict/words | hadoop fs -put - hdfs://localhost:8020//path/file
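The double slash is not collapsed by URI parsing, so the malformed path reaches the NameNode verbatim. A minimal illustration with plain java.net.URI (not the Hadoop Path class itself, just a sketch of the parsing behavior):

```java
import java.net.URI;

public class DoubleSlashDemo {
    public static void main(String[] args) {
        // The authority ends at the first slash after "hdfs://"; the
        // duplicate slash that follows is kept verbatim in the path.
        URI uri = URI.create("hdfs://localhost:8020//path/file");
        System.out.println(uri.getAuthority()); // localhost:8020
        System.out.println(uri.getPath());      // //path/file
    }
}
```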
After this, restarting the namenode results in the following fatal exception:
2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_0000000000000000173-0000000000000000188 expecting start txid #173
2012-07-10 06:29:19,912 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation MkdirOp [length=0, path=/, timestamp=1341915658216, permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
java.lang.ArrayIndexOutOfBoundsException: -1
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1728)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1743)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1562)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1549)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:377)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:178)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
This exception is triggered by the following entries in the editlog (note the mkdir of / at txid 182, immediately followed by a mkdir of //path at txid 183):
<RECORD>
  <OPCODE>OP_MKDIR</OPCODE>
  <DATA>
    <TXID>182</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/</PATH>
    <TIMESTAMP>1341915658216</TIMESTAMP>
    <PERMISSION_STATUS>
      <USERNAME>cloudera</USERNAME>
      <GROUPNAME>supergroup</GROUPNAME>
      <MODE>493</MODE>
    </PERMISSION_STATUS>
  </DATA>
</RECORD>
<RECORD>
  <OPCODE>OP_MKDIR</OPCODE>
  <DATA>
    <TXID>183</TXID>
    <LENGTH>0</LENGTH>
    <PATH>//path</PATH>
    <TIMESTAMP>1341915658216</TIMESTAMP>
    <PERMISSION_STATUS>
      <USERNAME>cloudera</USERNAME>
      <GROUPNAME>supergroup</GROUPNAME>
      <MODE>493</MODE>
    </PERMISSION_STATUS>
  </DATA>
</RECORD>
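A plausible reading of why the double slash is dangerous (a sketch of the general hazard, not the actual FSDirectory code): splitting //path on / leaves an empty component between the two slashes, and any loader that looks up components by name can compute an index of -1 for the empty string, matching the ArrayIndexOutOfBoundsException above.

```java
import java.util.Arrays;

public class PathSplitDemo {
    public static void main(String[] args) {
        // A well-formed absolute path has one leading empty component.
        String[] good = "/path/file".split("/");
        // The malformed path has an extra empty component between the
        // two slashes, which name-based lookups will fail to resolve.
        String[] bad = "//path/file".split("/");
        System.out.println(Arrays.toString(good)); // [, path, file]
        System.out.println(Arrays.toString(bad));  // [, , path, file]
    }
}
```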
This initially happened on a client's HA setup, but I can reproduce it on a fresh CDH4 VM.
Locally I can fix it with hdfs namenode -recover.
I haven't yet tried fixing it on the HA setup.