Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: 0.16.0
- Fix Version/s: 1.0.0
- Component/s: None
- Labels: None
Description
Storing the Avro schema in HDFS for a Hive-based dataset may solve a couple of problems. Specifically:
- The original version of the schema can be preserved to ensure that future changes are compatible with it, as discussed at [1].
- It avoids limitations on the size of the schema that Hive supports, as discussed at [2]. (This is an issue we run into frequently.)
[1] https://groups.google.com/a/cloudera.org/forum/#!topic/cdk-dev/fKyK8RfYsRg
[2] https://groups.google.com/a/cloudera.org/forum/#!topic/cdk-dev/K4YNFbivVBA
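To illustrate the size limitation described above, here is a minimal sketch of how a table could reference its schema either inline or by location. It relies on the Hive AvroSerDe table properties `avro.schema.literal` and `avro.schema.url`. The function name, the HDFS path, and the 4000-character threshold are assumptions made for illustration; the actual limit depends on the metastore backing Hive, and this is not necessarily how the fix was implemented.

```python
import json

# Assumed limit on a single table property value in the Hive metastore.
# This number is an illustration; check the metastore schema in your deployment.
MAX_LITERAL_CHARS = 4000


def schema_table_properties(schema_json, schema_url, max_literal_chars=MAX_LITERAL_CHARS):
    """Hypothetical helper: pick the table property that points Hive at the schema.

    Small schemas can be embedded with avro.schema.literal; larger ones are
    referenced by location with avro.schema.url, so the full schema text never
    has to fit in a table property.
    """
    if len(schema_json) <= max_literal_chars:
        return {"avro.schema.literal": schema_json}
    return {"avro.schema.url": schema_url}


def make_schema(field_count):
    """Build a record schema with the given number of long fields, as JSON text."""
    fields = [{"name": "f%d" % i, "type": "long"} for i in range(field_count)]
    return json.dumps({"type": "record", "name": "Example", "fields": fields})


# Hypothetical location of the schema file stored alongside the dataset.
SCHEMA_URL = "hdfs:///datasets/example/.metadata/schema.avsc"

small_props = schema_table_properties(make_schema(2), SCHEMA_URL)
large_props = schema_table_properties(make_schema(500), SCHEMA_URL)
```

With the proposal in this issue, the schema would always be written to HDFS, so the URL form would apply regardless of size and the stored file would also serve as the preserved original for compatibility checks.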