Description
We are running into HDFS-5225, which was resolved as a duplicate of HDFS-5031. Once the DataNode gets into this state, it writes hundreds of gigabytes of log output, all of it the exact same line referencing the same block:
2014-02-02 22:56:07,543 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-911678927-10.159.27.212-1363638216198:blk_2930043100041915344_33721178 is no longer in the dataset
2014-02-02 22:56:07,543 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-911678927-10.159.27.212-1363638216198:blk_2930043100041915344_33721178 is no longer in the dataset
2014-02-02 22:56:07,543 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-911678927-10.159.27.212-1363638216198:blk_2930043100041915344_33721178 is no longer in the dataset
We can suppress this message with log4j, but we are not sure whether that would just hide a real underlying issue. Once the log spam starts, the only way to stop it is to restart the DataNode.
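For reference, the log4j suppression we have in mind would look roughly like the fragment below. This is a sketch that assumes the DataNode reads a log4j 1.x properties file (typically log4j.properties in the Hadoop conf directory; the exact location on a given install is an assumption). The logger name is taken from the log lines above. Since the message is logged at INFO, raising this one logger to WARN should drop it.

```properties
# Assumption: log4j 1.x properties format, as read by the DataNode at startup.
# Raise only the BlockPoolSliceScanner logger to WARN so the INFO-level
# "is no longer in the dataset" message is no longer written.
log4j.logger.org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner=WARN
```

This only hides the message. It would not change whatever makes the scanner keep revisiting the same block, and it would also hide any other INFO output from this class.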
A fix or workaround would be appreciated.