  1. Sqoop (READ-ONLY)
  2. SQOOP-169

SqoopExport usability improvement for environment with often changing tables (add --columns for export)

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: export
    • Labels:
      None

      Description

      When Sqoop is used to export aggregated data stored in HDFS to an RDBMS (e.g. Oracle), changes to the target tables cause a lot of trouble. It should be possible to specify which target table columns, and in which order, the records stored in HDFS map to.

      Something like "sqoop export ... --table target_table ... --columns 'column1,column5,column10' ..." would make exports flexible and reusable.

      During active development and QA, the order of the columns in the RDBMS can differ between environments: one environment may be installed from the initial scripts ('create' statements) while another is upgraded with SQL deltas ('update' statements), so the column order may not be the same. As a result, a 'sqoop export' that succeeds on one environment can fail on another. Likewise, when you update tables to add NULLable columns and want existing jobs to keep running, not having --columns to define the targeted structure results in failed 'sqoop export' runs.
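
The effect of the requested option can be illustrated with a small sketch. It is not Sqoop code: it uses Python's built-in sqlite3 module as a stand-in for the RDBMS, and the table names (env_a, env_b), the column names, and the helpers build_insert and export_rows are hypothetical. It only shows that an INSERT which names its target columns works the same way on two tables whose column order differs and where one has an extra NULLable column.

```python
import sqlite3


def build_insert(table, columns):
    """Build a parameterized INSERT that names its target columns explicitly."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"


def export_rows(conn, table, columns, rows):
    """Insert rows whose fields are ordered like `columns`, not like the table."""
    conn.executemany(build_insert(table, columns), rows)


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical environment A: table created from the initial script.
    conn.execute("CREATE TABLE env_a (id INTEGER, region TEXT, total INTEGER)")
    # Hypothetical environment B: same data columns in a different order,
    # plus an extra NULLable column added by a later delta.
    conn.execute(
        "CREATE TABLE env_b (id INTEGER, total INTEGER, note TEXT, region TEXT)"
    )

    # The record layout of the exported data (assumed for this example).
    columns = ["id", "region", "total"]
    rows = [(1, "emea", 100), (2, "apac", 250)]

    results = {}
    for table in ("env_a", "env_b"):
        export_rows(conn, table, columns, rows)
        results[table] = conn.execute(
            f"SELECT id, region, total FROM {table} ORDER BY id"
        ).fetchall()
    conn.close()
    return results


if __name__ == "__main__":
    for table, contents in demo().items():
        print(table, contents)
```

Because the column list travels with the export rather than being inferred from the table definition, the same job definition could be reused across both environments; the unlisted column in env_b is simply left NULL.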

            People

            • Assignee:
              Unassigned
              Reporter:
              asudak Aleksei Sudak
            • Votes:
              3
              Watchers:
              5
