Flume (READ-ONLY) / FLUME-717

WAL data grows forever even though data is delivered in E2E

    Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: v0.9.5
    • Fix Version/s: None
    • Component/s: Master, Node, Sinks+Sources
    • Labels: None

      Description

      With a heavy enough write load, the E2E agent WAL appears to get into a state where data is constantly shuffled between the various directories / states (e.g. writing, logged, sending, sent). Once this happens, the WAL directories grow indefinitely until the disk is exhausted, regardless of how much data triggered the problem.

      To reproduce:

      • Use the supplied config (or something similar).
      • Write to the agent source at more than 1 MB/s for a short burst (for example, with the generator below).
      • Note that data is delivered to the collectorSink, but the agent's WAL directories keep growing.
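
      One way to confirm the growth in the last step is to sample the size of each WAL state directory over time. The following is a minimal sketch, not part of Flume: the helper names are hypothetical, the state names are taken from the description above, and both the WAL root path and the assumption that each state maps to a same-named subdirectory would need to be checked against the agent's actual layout.

```python
import os

# State names come from the description above. Whether the on-disk
# directories use exactly these names is an assumption.
WAL_STATES = ("writing", "logged", "sending", "sent")


def dir_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A file may be moved to another state directory mid-walk.
                pass
    return total


def wal_sizes(wal_root, states=WAL_STATES):
    """Map each state name to the byte size of its directory (0 if absent)."""
    return {s: dir_size(os.path.join(wal_root, s)) for s in states}
```

      Calling `wal_sizes(...)` every few seconds after the burst has stopped and comparing successive results should show whether the total keeps rising while no new data is being written.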

      The config:

      n1 : execStream("tail -F datafile") | agentE2ESink("host", 12345);
      n2 : collectorSource(12345) | collectorSink("file://...", "n2-");
      

      Generator:

      perl -e 'while (1) { print $i++, "\n"; }' >> datafile
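
      The one-liner above writes as fast as it can and never stops. For a bounded "short burst" at a known rate, something like the following could be used instead. It is only a sketch: `write_burst` is a hypothetical helper, and the achieved rate is approximate.

```python
import time


def write_burst(path, rate_bytes_per_sec, duration_sec, start=0):
    """Append sequential integers to path at roughly the given rate.

    Returns the next unused integer so a later burst can continue
    the same sequence.
    """
    i = start
    written = 0
    began = time.monotonic()
    deadline = began + duration_sec
    with open(path, "a") as f:
        while time.monotonic() < deadline:
            line = "%d\n" % i
            f.write(line)
            written += len(line)
            i += 1
            # How far ahead of the target rate we are, in seconds.
            ahead = written / rate_bytes_per_sec - (time.monotonic() - began)
            if ahead > 0.01:
                f.flush()  # make the data visible to tail -F before pausing
                time.sleep(ahead)
    return i
```

      For example, `write_burst("datafile", 2 * 1024 * 1024, 10)` would append to `datafile` for about ten seconds at roughly 2 MB/s and then stop, which makes it easier to see whether the WAL keeps growing after the input has gone quiet.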
      

      This looks and smells just like FLUME-430. I haven't yet examined the WAL or destination data for duplicates / missing events.
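
      Because the generator emits consecutive integers, one per line, checking for duplicates and missing events is mechanical once the event bodies have been extracted. The sketch below assumes the data under test has already been reduced to one event body per line; the collectorSink output may carry extra metadata, so that extraction step is not shown. `check_sequence` is a hypothetical helper, and "missing" is only computed between the smallest and largest values seen.

```python
from collections import Counter


def check_sequence(lines):
    """Report duplicated and missing integers in an iterable of text lines.

    Lines that are not plain ASCII integers are skipped and counted.
    """
    counts = Counter()
    skipped = 0
    for line in lines:
        text = line.strip()
        if text.isascii() and text.isdigit():
            counts[int(text)] += 1
        else:
            skipped += 1
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    if counts:
        lo, hi = min(counts), max(counts)
        missing = [n for n in range(lo, hi + 1) if n not in counts]
    else:
        missing = []
    return {"duplicates": duplicates, "missing": missing, "skipped": skipped}
```

      Running this over both the WAL contents and the destination files would show whether the shuffling between states re-sends events (duplicates) or loses them (gaps).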

            People

            • Assignee: Unassigned
            • Reporter: Eric Sammer (esammer)
            • Votes: 0
            • Watchers: 1
