CDH / DISTRO-530

When the TaskTracker requests a block from a DataNode and a "java.io.IOException" occurs, the failed TCP socket is not closed (it remains in CLOSE_WAIT against the DataNode's port 50010)


    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: CDH4.4.0
    • Fix Version/s: CDH4.6.0
    • Component/s: MapReduce
    • Labels:
    • Environment:


      When running a MapReduce job, the TaskTracker requests a block of a job file (under the .staging directory) from a DataNode on port 50010. If the request fails with "java.io.IOException: Got error for OP_READ_BLOCK" (because the block's replica was marked invalid and has already been removed from the DataNode the TaskTracker contacted), the TCP socket the TaskTracker was using is not closed. Over time these leaked sockets accumulate until the Cloudera Manager Web UI shows the warning:
      "Open file descriptors: xxxxxx. File descriptor limit : xxxxxx...."
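One way to watch a process for this kind of descriptor leak is to count its open file descriptors directly. A minimal sketch, assuming a Linux host (where a process's open descriptors appear as entries under /proc/self/fd); FdCount is a hypothetical helper, not part of Hadoop:

```java
// Counts this JVM's open file descriptors by listing /proc/self/fd.
// Linux-only assumption: on other platforms listFiles() returns null
// and we report -1 instead.
import java.io.File;

public class FdCount {
    static int openFdCount() {
        File[] fds = new File("/proc/self/fd").listFiles();
        return fds == null ? -1 : fds.length;
    }

    public static void main(String[] args) {
        System.out.println("open fds: " + openFdCount());
    }
}
```

Sampling this periodically (or comparing against `ulimit -n`) shows whether the count climbs steadily while sockets sit in CLOSE_WAIT.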
      I think the problem is in DatanodeInfo blockSeekTo(long target), around line 503 of class DFSInputStream. The connection the TaskTracker uses is a BlockReader, created at line 538:

      blockReader = getBlockReader(targetAddr, chosenNode, src, blk,
          accessToken, offsetIntoBlock, blk.getNumBytes() - offsetIntoBlock,
          buffersize, verifyChecksum, dfsClient.clientName);

      If this call fails, the TaskTracker retries against another DataNode, but the old connection is never closed here. I think a small piece of code is needed before the call above to close the old connection, for example:

      if (blockReader != null) {
          blockReader.close();
          blockReader = null;
      }
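The effect of that guard can be illustrated with a self-contained sketch of a retry loop; FakeBlockReader and seekWithRetry are hypothetical stand-ins, not the actual HDFS classes, but the close-before-retry pattern is the one proposed above:

```java
// Sketch of the proposed fix: before retrying against another DataNode,
// close the reader left over from the failed attempt so its socket is
// released instead of lingering in CLOSE_WAIT.
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BlockReaderRetrySketch {
    // Records every reader ever created so a caller can check for leaks.
    static final List<FakeBlockReader> created = new ArrayList<>();

    // Hypothetical stand-in for org.apache.hadoop.hdfs.BlockReader.
    static class FakeBlockReader implements Closeable {
        final boolean failing;
        boolean closed = false;

        FakeBlockReader(boolean failing) {
            this.failing = failing;
            created.add(this);
        }

        void read() throws IOException {
            if (failing) throw new IOException("Got error for OP_READ_BLOCK");
        }

        @Override public void close() { closed = true; }
    }

    // Tries each "DataNode" in turn; each entry says whether that node fails.
    static FakeBlockReader seekWithRetry(boolean[] nodeFails) throws IOException {
        FakeBlockReader blockReader = null;
        for (boolean fails : nodeFails) {
            // The proposed guard: close the stale reader before replacing it.
            if (blockReader != null) {
                blockReader.close();
                blockReader = null;
            }
            FakeBlockReader r = new FakeBlockReader(fails);
            try {
                r.read();
                return r; // success: the caller now owns this reader
            } catch (IOException e) {
                blockReader = r; // remember it so the next pass closes it
            }
        }
        if (blockReader != null) blockReader.close(); // no node worked
        throw new IOException("all datanodes failed");
    }

    public static void main(String[] args) throws IOException {
        // First "node" fails, second succeeds.
        FakeBlockReader ok = seekWithRetry(new boolean[] { true, false });
        System.out.println("failed reader closed: " + created.get(0).closed);
        System.out.println("live reader closed:   " + ok.closed);
    }
}
```

With the guard in place the reader from the failed first attempt is closed before the second attempt, so only the successful reader remains open.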




            • Assignee:
              cmccabe Colin Patrick McCabe
            • Reporter:
              kyo88 long
            • Votes:
              0 Vote for this issue
            • Watchers:
              1 Start watching this issue


              • Created: