Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.2.0
Fix Version/s: None
Component/s: export
Labels: None
Description
When Sqoop is used to export aggregated data stored in HDFS to an RDBMS (e.g. Oracle), updates to the target tables cause a lot of trouble. It should be possible to specify which target table columns, and in what order, the fields of the records stored in HDFS are mapped to.
Having something like "sqoop export ... --table target_table ... --columns 'column1,column5,column10' ..." would make exports flexible and reusable.
During active development and QA, the order of columns in the RDBMS can differ between environments: one environment may be installed with the initial scripts ('create' statements) while another is upgraded with SQL deltas ('update' statements), so the column order may not match. As a result, a 'sqoop export' that succeeds in one environment can fail in another. Likewise, when you add NULLable columns to a table and want existing jobs to keep running, the lack of a --columns option to define the target structure results in failed 'sqoop export' runs.
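
To illustrate why an explicit column list would help, here is a minimal sketch. It is not Sqoop code and does not reflect how Sqoop generates its statements; it uses Python's built-in sqlite3 module as a stand-in RDBMS. The table names (env_a, env_b), the column names, the sample record, and the helper functions are all hypothetical. It shows that an INSERT naming its target columns works unchanged against two tables whose physical column order differs, and leaves a newly added NULLable column empty.

```python
import sqlite3

# Hypothetical record as it might appear in a comma-delimited HDFS export file.
RECORD = "42,2012-01-31,1999.5"
# Hypothetical column list, playing the role of the proposed --columns value.
COLUMNS = ["id", "day", "total"]


def build_insert(table, columns):
    """Build a parameterized INSERT that names its target columns explicitly."""
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def export_record(conn, table, columns, record):
    """Split one delimited record and insert it using the explicit column list."""
    values = record.split(",")
    if len(values) != len(columns):
        raise ValueError("record has a different number of fields than columns")
    conn.execute(build_insert(table, columns), values)


def demo():
    conn = sqlite3.connect(":memory:")
    # Environment A: table created by the initial script.
    conn.execute("CREATE TABLE env_a (id INTEGER, day TEXT, total REAL)")
    # Environment B: upgraded by deltas, so the column order differs
    # and an extra nullable column (note) exists.
    conn.execute("CREATE TABLE env_b (day TEXT, note TEXT, total REAL, id INTEGER)")
    for table in ("env_a", "env_b"):
        export_record(conn, table, COLUMNS, RECORD)
    # Read the rows back by column name to compare both environments.
    rows = [
        conn.execute(f"SELECT id, day, total FROM {table}").fetchone()
        for table in ("env_a", "env_b")
    ]
    note = conn.execute("SELECT note FROM env_b").fetchone()[0]
    conn.close()
    return rows, note


if __name__ == "__main__":
    print(demo())
```

Because the statement names its columns, the same export definition lands the same values in both tables, and the extra column in env_b simply stays NULL. A positional insert without a column list would depend on each environment's physical column order, which is the failure mode described above.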