Sqoop (READ-ONLY) / SQOOP-28

Overlap DB fetch from InputFormat with write in OutputFormat

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: import, mapreduce
    • Labels: None

      Description

      The current import MapReduce framework retrieves each record from the database and then writes it to HDFS synchronously, so the database fetch and the HDFS write never overlap. We should use a producer/consumer queue in the OutputFormat so that round trips to the database in RecordReader.next() overlap with the preceding HDFS writes, similar to how ExportOutputFormat works.
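
      A minimal sketch of the proposed producer/consumer overlap, not Sqoop code. All names here (OverlappedCopy, copy, put, capacity) are hypothetical. A plain Iterator stands in for RecordReader.next(), and a Consumer stands in for the HDFS write. The calling thread keeps fetching records while a background thread drains a bounded queue and writes them, so a slow fetch and a slow write can proceed at the same time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names exist in Sqoop.
final class OverlappedCopy {
    // Sentinel that tells the writer thread the input is exhausted (compared by identity).
    private static final Object EOF = new Object();

    /**
     * Fetches records from source on the calling thread while a background
     * thread passes them to sink. Records must be non-null.
     * Returns the number of records the sink accepted.
     */
    static <T> int copy(Iterator<T> source, Consumer<T> sink, int capacity)
            throws InterruptedException {
        // Bounded queue: the fetch side blocks once it is `capacity` records ahead.
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(capacity);
        int[] written = {0};
        Throwable[] failure = {null};

        Thread writer = new Thread(() -> {
            try {
                for (Object item = queue.take(); item != EOF; item = queue.take()) {
                    @SuppressWarnings("unchecked")
                    T rec = (T) item;
                    sink.accept(rec);   // stands in for the HDFS write
                    written[0]++;
                }
            } catch (Throwable t) {
                failure[0] = t;         // read by the caller after the thread ends
            }
        });
        writer.start();
        try {
            while (source.hasNext()) {
                put(queue, source.next(), writer);  // stands in for the DB round trip
            }
            put(queue, EOF, writer);
            writer.join();
        } finally {
            writer.interrupt();         // no-op if the writer has already finished
        }
        if (failure[0] != null) {
            throw new IllegalStateException("writer failed", failure[0]);
        }
        return written[0];
    }

    // Enqueues with a timeout so the fetch side cannot block forever on a dead writer.
    private static void put(BlockingQueue<Object> queue, Object item, Thread writer)
            throws InterruptedException {
        while (!queue.offer(item, 50, TimeUnit.MILLISECONDS)) {
            if (!writer.isAlive()) {
                throw new IllegalStateException("writer thread stopped early");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> out = new ArrayList<>();
        int n = copy(List.of("a", "b", "c").iterator(), out::add, 2);
        System.out.println(n + " records written: " + out);
    }
}
```

      The queue capacity bounds how far the fetch side can run ahead of the writer, which keeps memory use predictable. An actual change would need to place the queue inside the OutputFormat's record writer and flush it when the task closes.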

            People

            • Assignee: Unassigned
            • Reporter: Aaron Kimball (aaron)
