Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
      Flume 0.9.1+29

      Description

      We have a ~30-node setup with a simple single-master,
      single-collector configuration. When Flume is started, it collects
      logs without a problem for several hours, but then the collector
      suddenly stops collecting with the following messages:

      .......
      2011-01-11 11:51:23,156 INFO com.cloudera.flume.handlers.hdfs.CustomDfsSink: done writing raw file to hdfs
      2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: moved from partial to complete log.00000146.20110111-115112375-0800.73016743409545744.seq
      2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Starting checksum group called log.00000152.20110111-115112845-0800.2329552799040638.seq
      2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: initial checksum is 12d76a1d2ce
      2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Finishing checksum group called 'log.00000152.20110111-115112845-0800.2329552799040638.seq'
      2011-01-11 11:51:23,356 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Checksum succeeded 12d76a1d2ce
      2011-01-11 11:51:23,641 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: moved from partial to complete log.00000152.20110111-115112845-0800.2329552799040638.seq
      2011-01-11 11:51:23,642 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: Starting checksum group called log.00000146.20110111-115112974-0800.49338303473496963.seq
      2011-01-11 11:51:23,642 INFO com.cloudera.flume.handlers.endtoend.AckChecksumChecker: initial checksum is 12d76a1d34f
      2011-01-11 11:51:23,642 INFO com.cloudera.flume.handlers.rolling.RollSink: closing RollSink 'collectorSink'

      As you can see, the RollSink suddenly closes, even though there was
      no configuration or topology change whatsoever. I have tried
      restarting the server, but the same thing happens each time after
      several hours.
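
      To pin down exactly when the sink closes after each restart, a small
      script along these lines could scan the collector log. This is only a
      sketch, not part of Flume: the names iter_entries and
      find_rollsink_closes are hypothetical, and it assumes the line layout
      seen in the excerpt (timestamp, level, logger class, message). It
      also tolerates entries that are wrapped across several lines.

```python
import re
from datetime import datetime

# Start of a log entry, e.g. "2011-01-11 11:51:23,642 INFO ..."
# (layout assumed from the excerpt in this report).
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s*(.*)$"
)


def iter_entries(lines):
    """Yield (timestamp, level, text) tuples, re-joining wrapped lines."""
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins: emit the one collected so far.
            if current is not None:
                yield current
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            current = (ts, match.group(2), match.group(3))
        elif current is not None:
            # Continuation of the previous entry (wrapped line).
            ts, level, text = current
            current = (ts, level, (text + " " + line).strip())
        # Lines before the first timestamp (e.g. ".......") are skipped.
    if current is not None:
        yield current


def find_rollsink_closes(lines):
    """Return (timestamp, text) for every 'closing RollSink' entry."""
    return [
        (ts, text)
        for ts, _level, text in iter_entries(lines)
        if "RollSink: closing RollSink" in text
    ]
```

      Running find_rollsink_closes(open(path)) against the collector log,
      where path is wherever that log lives on the collector host, would
      give the close times to compare against the start time of each run.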

      Below is the configuration we use:
      collector : collector, collectorSource, collectorSink("file:///var/www/data/flume/collected/%Y-%m-%d/%H/", "%{host}-", 600000)
      agents : agent, tail("/var/www/logs/raw.log"), agentE2ESink("collector")

              People

              • Assignee:
                jon Jonathan Hsieh
                Reporter:
                enis Enis Soztutar
