Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.0.0
- Fix Version/s: 1.1.0
- Component/s: None
- Labels: None
Description
If you write records using a specific schema and later evolve that schema, you get errors when you read from a Parquet-formatted dataset. The problem is that Parquet internally instantiates objects based on the namespace and name in the stored Avro schema. When you evolve your schema and compile new specific classes, those objects are not compatible with the old schema if you're using the IndexedRecord interface, which Parquet does.
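The positional mismatch can be illustrated without Parquet at all: IndexedRecord exposes fields by index, so values written in the old schema's field order get misread by a class compiled from an evolved schema. A minimal sketch (plain Java, no Avro dependency; the field lists here are hypothetical stand-ins for the writer and reader specific schemas):

```java
import java.util.Arrays;
import java.util.List;

public class SchemaEvolutionDemo {
    // Field order of the writer schema at the time records were stored.
    static final List<String> WRITER_FIELDS = Arrays.asList("name", "age");
    // Evolved reader schema: a new field inserted *before* "age".
    static final List<String> READER_FIELDS = Arrays.asList("name", "email", "age");

    public static void main(String[] args) {
        // Values on disk, laid out in writer-schema field order
        // (this is how IndexedRecord.get(i)/put(i, v) sees them).
        Object[] stored = {"alice", 30};

        int ageInReader = READER_FIELDS.indexOf("age"); // 2
        int ageInWriter = WRITER_FIELDS.indexOf("age"); // 1

        // Purely positional access, as through IndexedRecord, looks up
        // index 2 in a record that only has 2 stored fields -- it breaks.
        System.out.println("reader expects age at index " + ageInReader
                + ", but it was written at index " + ageInWriter);

        // Avro's schema resolution avoids this by matching fields by NAME
        // against the writer schema when both schemas are available.
        Object age = stored[ageInWriter];
        System.out.println("age = " + age);
    }
}
```

This is why appending fields at the end appears safe: existing field indices are unchanged, so positional access still lines up with the stored layout.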
Currently, I think the only way you can safely evolve the schema of a Parquet dataset is if you're adding fields to the end of the schema,