FLUME-643: Logging from Scribe to Hadoop via Flume breaks UTF-8 encoding

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: v0.9.3
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None

      Description

      Log messages containing UTF-8 characters such as äöü end up broken in Hadoop when logging via Scribe. We used a simple setup:
      exec config scribe_input scribe "scribe(1463)" "collectorSink("hdfs://localhost/testing/", "test", 1000)"
      exec spawn testserver scribe_input
      We usually use avrojson as the collector output format and gzip for compression, but the characters are still broken even when we deactivate both.
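      To tell whether the files on HDFS still contain well-formed UTF-8 (and are merely being viewed with the wrong charset) or genuinely mangled bytes, a small check along the following lines can help. This is a standalone sketch; Utf8Check and isValidUtf8 are illustrative names, not part of Flume:

      import java.nio.ByteBuffer;
      import java.nio.charset.CharacterCodingException;
      import java.nio.charset.Charset;
      import java.nio.charset.CodingErrorAction;

      // Hypothetical helper, not part of Flume: reports whether a byte
      // sequence (e.g. a record read back from the collector's HDFS
      // output) is well-formed UTF-8.
      public class Utf8Check {
          public static boolean isValidUtf8(byte[] bytes) {
              try {
                  Charset.forName("UTF-8").newDecoder()
                          .onMalformedInput(CodingErrorAction.REPORT)
                          .onUnmappableCharacter(CodingErrorAction.REPORT)
                          .decode(ByteBuffer.wrap(bytes));
                  return true;
              } catch (CharacterCodingException e) {
                  return false;
              }
          }
      }

      If the bytes fail this check, the data itself was corrupted on the way into HDFS rather than just being displayed with the wrong charset.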
      The problem seems to occur when Flume writes the files into Hadoop, since in a more complicated setup like:
      exec config scribe_input scribe "scribe(1463)" autoDFOChain
      exec config hdfs scribe autoCollectorSource "collectorSink("hdfs://localhost/testing/", "test", 1000)"
      exec spawn testserver1 scribe_input
      exec spawn testserver2 hdfs

      the characters are still intact in the DFO logs on testserver1.
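      For illustration, here is a minimal sketch of the failure mode we suspect (this is our assumption, not confirmed Flume code): the Scribe message body arrives as UTF-8 bytes, but somewhere on the collector's write path the bytes are decoded with the JVM default charset instead of UTF-8, which corrupts multi-byte characters like äöü:

      import java.nio.charset.Charset;

      // Standalone sketch of the suspected failure mode; not Flume code.
      // "äöü" is three characters but six bytes in UTF-8.
      public class CharsetRoundTrip {
          public static void main(String[] args) throws Exception {
              byte[] utf8 = "äöü".getBytes("UTF-8");

              // Correct: decode with the charset the bytes were encoded in.
              System.out.println(new String(utf8, "UTF-8"));       // äöü

              // Suspected bug: decode with a non-UTF-8 default charset,
              // e.g. ISO-8859-1, so each UTF-8 byte becomes its own char.
              System.out.println(new String(utf8, "ISO-8859-1"));  // Ã¤Ã¶Ã¼

              System.out.println("JVM default: " + Charset.defaultCharset());
          }
      }

      If the collector ever calls new String(bytes) or getBytes() without an explicit charset, the result depends on the locale of the machine running the collector, which would fit the symptom that the DFO logs on testserver1 are fine while the files on HDFS are not.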


    People

    • Assignee: Unassigned
    • Reporter: tmsmaster Sebastian
    • Votes: 0
    • Watchers: 0
