[DISTRO-470] There is a race condition in FSEditLog when removing error edit stream - Cloudera Open Source

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: CDH3u5
Fix Version/s: CDH4.0.0
Component/s: HDFS
Labels: None

Description

In our cluster, we configure the NameNode to write to both local and NFS-mounted directories. When the NFS-mounted directory is inaccessible, the NameNode should keep running without error, but our NameNode crashes with the following stack trace.

2013-04-02 23:35:21,536 FATAL org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unable to find edits stream with IO error
java.lang.Exception: Unable to find edits stream with IO error
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:430)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsStreamsAndStorageDirs(FSEditLog.java:519)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:1139)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1641)
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:689)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)

According to the stack trace, when the NameNode tries to sync the edit log, it detects that the NFS-mounted directory is inaccessible and attempts to remove the corresponding stream from FSEditLog#editStreams. However, it finds that this edit stream has already been removed. In that situation the NameNode calls fatalExit and aborts.

After looking through the HDFS source code, I found another code path that removes edit streams from FSEditLog#editStreams and can cause the race condition above: the method FSEditLog#getEditLogSize.

synchronized long getEditLogSize() throws IOException {
  assert getNumStorageDirs() == editStreams.size();
  long size = 0;
  for (int idx = 0; idx < editStreams.size(); idx++) {
    EditLogOutputStream es = editStreams.get(idx);
    try {
      long curSize = es.length();
      assert (size == 0 || size == curSize);
    } catch (IOException ioe) {
      FSNamesystem.LOG.warn(
          "Unable to determine edit log length. Removing log.", ioe);
      removeEditsAndStorageDir(idx);
    }
  }
  return size;
}

The cause of this race condition lies in the FSEditLog#logSync method, which works in two steps:

1. Perform the sync operation; if any edit stream is inaccessible, add it to the error stream list (unsynchronized).
2. Remove the streams in that error stream list from FSEditLog#editStreams (synchronized).

Step #1 isn't synchronized, so between step #1 and step #2 another thread can remove the error stream by invoking FSEditLog#getEditLogSize.

The attached NameNode log shows exactly this sequence: the Secondary NameNode makes the NameNode#getEditLogSize RPC call, which ends up in FSEditLog#getEditLogSize and removes the error edit stream.
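
To make the interleaving concrete, here is a minimal single-threaded replay of the sequence described above. It is a simplified model and not the real FSEditLog: the class EditLogRaceSketch, its methods collectErrorStreams and removeErrorStreams, and the use of plain strings as streams are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the race; NOT the real FSEditLog.
class EditLogRaceSketch {
    private final List<String> editStreams = new ArrayList<>();

    EditLogRaceSketch(List<String> streams) {
        editStreams.addAll(streams);
    }

    // Models step 1 of logSync: remember which streams failed.
    // Deliberately not synchronized, as in the report.
    List<String> collectErrorStreams(String failing) {
        List<String> errors = new ArrayList<>();
        for (String s : editStreams) {
            if (s.equals(failing)) {
                errors.add(s);
            }
        }
        return errors;
    }

    // Models getEditLogSize: on an I/O error it removes the stream itself.
    synchronized void getEditLogSize(String failing) {
        editStreams.remove(failing);
    }

    // Models step 2 of logSync: remove the streams collected in step 1.
    // Returns false at the point where the real code would call fatalExit.
    synchronized boolean removeErrorStreams(List<String> errors) {
        for (String s : errors) {
            if (!editStreams.remove(s)) {
                return false; // stream was already removed by someone else
            }
        }
        return true;
    }

    // Replays the problematic order of events on one thread.
    static boolean replayInterleaving() {
        EditLogRaceSketch log = new EditLogRaceSketch(Arrays.asList("local", "nfs"));
        List<String> errors = log.collectErrorStreams("nfs"); // RPC handler, step 1
        log.getEditLogSize("nfs");                            // Secondary NameNode RPC
        return log.removeErrorStreams(errors);                // RPC handler, step 2
    }

    public static void main(String[] args) {
        System.out.println("step 2 found the stream: " + replayInterleaving());
    }
}
```

Running main prints "step 2 found the stream: false", which corresponds to the "Unable to find edits stream with IO error" path in the stack trace.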

We can fix the bug the same way Apache Hadoop branch 1.x did: in FSEditLog#getEditLogSize, throw an exception instead of trying to remove the error edit stream. The Secondary NameNode receiving this exception will simply retry.

The fix is minor. Can I submit a patch for this?
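
A rough sketch of that idea, under stated assumptions: this is not the actual branch 1.x patch, and EditLogSizeFixSketch, its nested Stream interface, and streamCount are hypothetical names used only to show the shape of the change. The point is that getEditLogSize lets the IOException propagate and leaves the stream list alone, so logSync remains the only code path that removes error streams.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed behavior; NOT the actual Hadoop patch.
class EditLogSizeFixSketch {
    // Hypothetical stand-in for EditLogOutputStream.
    interface Stream {
        long length() throws IOException;
    }

    private final List<Stream> editStreams = new ArrayList<>();

    EditLogSizeFixSketch(List<Stream> streams) {
        editStreams.addAll(streams);
    }

    // An I/O error is reported to the caller; no stream is removed here.
    synchronized long getEditLogSize() throws IOException {
        long size = 0;
        for (Stream es : editStreams) {
            size = es.length(); // may throw; the exception simply propagates
        }
        return size;
    }

    synchronized int streamCount() {
        return editStreams.size();
    }

    public static void main(String[] args) {
        Stream local = () -> 1024L;
        Stream nfs = () -> { throw new IOException("NFS unreachable"); };
        EditLogSizeFixSketch log = new EditLogSizeFixSketch(Arrays.asList(local, nfs));
        try {
            log.getEditLogSize();
        } catch (IOException ioe) {
            // In the report's proposal, the Secondary NameNode would retry here.
            System.out.println("caller sees: " + ioe.getMessage());
        }
        System.out.println("streams still registered: " + log.streamCount());
    }
}
```

With this shape, a failing size query no longer mutates editStreams, so step 2 of logSync cannot find its error stream already gone because of this call.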

Attachments

namenode-error-log.txt
5 kB
08/Apr/13 5:16 PM
