  1. Flume (READ-ONLY)
  2. FLUME-600

Have collector source create names that are both lexicographically and chronologically ordered

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: v0.9.3
    • Fix Version/s: v0.9.4
    • Component/s: Sinks+Sources
    • Labels:
      None
    • Release Note:
      This patch changes the default filename convention that the collector writes out. Output file names will now have the following format: <prefix>yyyyMMdd-HHmmssSSSSz.<12digitNanos>.<8charTid>
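To illustrate why a name of this shape sorts chronologically, here is a minimal Python sketch. It is not Flume's implementation: the helper `collector_file_name`, its parameters, the 3-digit millisecond field, and the zero-padded decimal thread id are all assumptions made for illustration. The point is only that fixed-width, zero-padded fields ordered from most to least significant make string order agree with time order.

```python
from datetime import datetime, timezone


def collector_file_name(prefix: str, when: datetime, nanos: int, thread_id: int) -> str:
    """Hypothetical helper: build a name whose string order matches time order.

    Field widths here are assumptions for illustration, not Flume's exact format.
    """
    # Date and time, most significant field first, every field fixed width.
    stamp = when.strftime("%Y%m%d-%H%M%S")
    # Milliseconds, zero-padded to 3 digits (assumed width).
    millis = f"{when.microsecond // 1000:03d}"
    # UTC offset such as +0000; empty for a naive datetime.
    zone = when.strftime("%z")
    # Nanosecond counter padded to 12 digits, thread id padded to 8 digits.
    return f"{prefix}{stamp}{millis}{zone}.{nanos:012d}.{thread_id:08d}"


earlier = collector_file_name(
    "log-", datetime(2011, 3, 17, 4, 0, 0, 5000, tzinfo=timezone.utc), 42, 7
)
later = collector_file_name(
    "log-", datetime(2011, 3, 17, 4, 0, 10, 0, tzinfo=timezone.utc), 42, 7
)
print(earlier)
print(earlier < later)
```

Because every field has a fixed width, plain string comparison of two such names gives the same answer as comparing their timestamps, provided the prefix and the UTC offset are the same for both.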

      Description

      We're transitioning to Hadoop. Until then, we're parsing the files that Flume drops on S3.

      S3's API says that keys will be returned in lexicographic order. It's easy to ask S3:

      "Given I am on 2011-03-17/0400/flume-1.seq, give me one file."

      Assuming the next lexicographically ordered file is 2011-03-17/0400/flume-2.seq, then you don't have to do any cumbersome faux-directory sweeping (since S3 doesn't know about directories per se). You can let Amazon do that work for you.
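The request described above can be simulated locally without calling S3. The sketch below is an illustration only: `next_key_after` is a hypothetical helper that mimics a "list one key after this marker" request over an in-memory list, and the third key (`flume-10.seq`) is invented to show where unpadded counters break the assumption.

```python
from bisect import bisect_right


def next_key_after(keys, marker):
    """Return the first key strictly after `marker` in lexicographic order, or None.

    Hypothetical stand-in for asking the store for one key after a marker.
    """
    ordered = sorted(keys)
    i = bisect_right(ordered, marker)
    return ordered[i] if i < len(ordered) else None


keys = [
    "2011-03-17/0400/flume-1.seq",
    "2011-03-17/0400/flume-2.seq",
    "2011-03-17/0400/flume-10.seq",  # invented key, for illustration
]

print(next_key_after(keys, "2011-03-17/0400/flume-1.seq"))
```

With these keys the call returns `2011-03-17/0400/flume-10.seq` rather than `flume-2.seq`, because the character `1` sorts before `2`. The marker approach therefore only walks files in the order they were written if the names themselves sort chronologically.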

      We don't have any requirements about sprintf-style formatting of the filenames; we only need their lexicographic order to match the order in which they were written.

      Rob


              People

              • Assignee:
                jon Jonathan Hsieh
                Reporter:
                robert.slifka@gmail.com Robert Slifka