Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: None
- Fix Version/s: None
- Component/s: None
- Labels: None
- Environment: Flume 0.9.1+29
Description
We have a setup of about 30 nodes with a simple single-master,
single-collector configuration. When Flume is started, it collects logs
without a problem for several hours, but then the collector suddenly
stops collecting, with the following messages in its log:
.......
2011-01-11 11:51:23,156 INFO com.cloudera.flume.handlers.hdfs.CustomDfsSink: done writing raw file to hdfs
2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: moved from partial to complete log.00000146.20110111-115112375-0800.73016743409545744.seq
2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Starting checksum group called log.00000152.20110111-115112845-0800.2329552799040638.seq
2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: initial checksum is 12d76a1d2ce
2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Finishing checksum group called 'log.00000152.20110111-115112845-0800.2329552799040638.seq'
2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Checksum succeeded 12d76a1d2ce
2011-01-11 11:51:23,641 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: moved from partial to complete log.00000152.20110111-115112845-0800.2329552799040638.seq
2011-01-11 11:51:23,642 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Starting checksum group called log.00000146.20110111-115112974-0800.49338303473496963.seq
2011-01-11 11:51:23,642 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: initial checksum is 12d76a1d34f
2011-01-11 11:51:23,642 INFO com.cloudera.flume.handlers.rolling.RollSink: closing RollSink 'collectorSink'
As you can see, the RollSink suddenly closes even though there was no
configuration or topology change of any kind. I have tried restarting
the server, but the same thing happened each time after several hours
had passed.
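To check whether the same message precedes every stall, the collector log can be scanned for the record that reports the sink closing. The following is only a rough sketch in Python, not part of Flume: the function names are made up for illustration, and it assumes each log record starts with a timestamp in the format shown above, with any other line being a wrapped continuation of the previous record.

```python
import re

# A record starts with a timestamp such as "2011-01-11 11:51:23,156 ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def join_records(lines):
    """Merge wrapped continuation lines back into one string per record.

    Continuations are joined with a single space, so a file name that was
    wrapped in the middle will contain a space at the wrap point.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def find_sink_closes(lines, sink_name="collectorSink"):
    """Return the records that report the named RollSink closing."""
    return [
        record
        for record in join_records(lines)
        if "closing RollSink" in record and sink_name in record
    ]
```

Passing an open log file to `find_sink_closes` returns the matching records, whose timestamps can then be compared against the time the collector stopped writing output.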
Below is the configuration we use:
collector : collector, collectorSource, collectorSink("file:///var/www/data/flume/collected/%Y-%m-%d/%H/", "%-", 600000)
agents : agent, tail("/var/www/logs/raw.log"), agentE2ESink("collector")
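For readers unfamiliar with the collectorSink arguments, here is a small Python sketch of what this configuration is expected to produce. It rests on two assumptions that are not stated in this report: that the third argument (600000) is the roll interval in milliseconds, and that the %Y, %m, %d and %H escapes in the output path mean the same as in strftime (year, month, day, hour). The helper names are hypothetical.

```python
from datetime import datetime

# Values copied from the collectorSink configuration above.
PATH_TEMPLATE = "file:///var/www/data/flume/collected/%Y-%m-%d/%H/"
ROLL_MILLIS = 600000  # assumed to be the roll interval in milliseconds


def bucket_for(event_time):
    """Expand the date escapes in the path for one event timestamp.

    Assumes the escapes behave like their strftime counterparts.
    """
    return event_time.strftime(PATH_TEMPLATE)


def roll_minutes(millis=ROLL_MILLIS):
    """Convert the roll interval from milliseconds to minutes."""
    return millis / 60000
```

Under these assumptions, an event stamped 2011-01-11 11:51:23 would be written under `file:///var/www/data/flume/collected/2011-01-11/11/`, and the sink would roll to a new file every 10 minutes, which is far more frequent than the several-hour interval at which the collector stalls.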
Issue Links
- duplicates FLUME-416 CollectorSink hangs due to ConcurrentModificationException in RollSink (Resolved)