  1. CDH (READ-ONLY)
  2. DISTRO-644

HIVE-5823 in CDH 5.1.x breaks Avro schema evolution

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: CDH 5.1.0, CDH 5.1.2
    • Fix Version/s: None
    • Component/s: Hive
    • Labels:
      None

      Description

      This is a nasty surprise for anyone who uses Hive tables with Avro and has, at some point, added a default null to a primitive value, thereby evolving the schema from a primitive type to a union. All your mappers will fail with a "Not a union: String" error.

      See my comment on https://issues.apache.org/jira/browse/HIVE-5823?focusedCommentId=14142801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14142801

      TL;DR - either the HIVE-6806 patch needs to be added as well (although it might have other problems lurking), or AvroDeserializer.java needs a manual fix so that the deserializeNullableUnion() method matches the version in the latest trunk:

      https://github.com/apache/hive/blob/2bb8ae7f352694f4becc9ff67e667620b2ee7fe9/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L261

      private Object deserializeNullableUnion(Object datum, Schema fileSchema, Schema recordSchema,
          TypeInfo columnType) throws AvroSerdeException {
        int tag = GenericData.get().resolveUnion(recordSchema, datum); // Determine index of value
        Schema schema = recordSchema.getTypes().get(tag);
        if (schema.getType().equals(Schema.Type.NULL)) {
          return null;
        }
        Schema currentFileSchema = null;
        if (fileSchema != null) {
          currentFileSchema =
              fileSchema.getType() == Type.UNION ? fileSchema.getTypes().get(tag) : fileSchema;
        }
        return worker(datum, currentFileSchema, schema, SchemaToTypeInfo.generateTypeInfo(schema));
      }
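
      The following is a minimal sketch of why the UNION check on the file schema matters. It uses simplified stand-in types (MiniSchema and Kind are hypothetical, not the real org.apache.avro classes) so that it runs without Avro on the classpath. It assumes the failing code path asks the file schema for its union branches unconditionally; that is an inference from the error message, not something confirmed against the CDH 5.1.x source.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified stand-ins for Avro schemas; NOT the real org.apache.avro API.
public class UnionEvolutionSketch {

    enum Kind { NULL, STRING, UNION }

    static final class MiniSchema {
        final Kind kind;
        final List<MiniSchema> branches; // empty unless kind == UNION

        private MiniSchema(Kind kind, List<MiniSchema> branches) {
            this.kind = kind;
            this.branches = branches;
        }

        static MiniSchema primitive(Kind kind) {
            return new MiniSchema(kind, Collections.<MiniSchema>emptyList());
        }

        static MiniSchema union(MiniSchema... branches) {
            return new MiniSchema(Kind.UNION, Arrays.asList(branches));
        }

        // Mimics the reported failure: a non-union has no branches to return.
        List<MiniSchema> getTypes() {
            if (kind != Kind.UNION) {
                throw new IllegalStateException("Not a union: " + kind);
            }
            return branches;
        }
    }

    // Assumed broken behaviour: treats the file schema as a union unconditionally.
    static MiniSchema fileBranchUnguarded(MiniSchema fileSchema, int tag) {
        return fileSchema.getTypes().get(tag);
    }

    // Mirrors the guard in the trunk method: only index into the file schema
    // when it really is a union; otherwise use it as-is.
    static MiniSchema fileBranchGuarded(MiniSchema fileSchema, int tag) {
        if (fileSchema == null) {
            return null;
        }
        return fileSchema.kind == Kind.UNION ? fileSchema.getTypes().get(tag) : fileSchema;
    }

    public static void main(String[] args) {
        // Old data files were written with a plain string field ...
        MiniSchema oldFileSchema = MiniSchema.primitive(Kind.STRING);
        // ... while the table schema has since evolved to ["null", "string"].
        MiniSchema evolved = MiniSchema.union(
                MiniSchema.primitive(Kind.NULL), MiniSchema.primitive(Kind.STRING));
        int tag = 1; // index of the string branch in the evolved union

        System.out.println("record branch: " + evolved.getTypes().get(tag).kind);
        System.out.println("guarded file branch: " + fileBranchGuarded(oldFileSchema, tag).kind);
        try {
            fileBranchUnguarded(oldFileSchema, tag);
        } catch (IllegalStateException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
    }
}
```

      In this sketch the guarded lookup falls back to the primitive file schema for files written before the field became nullable, while the unguarded lookup fails as soon as such a file is read.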
      

            People

            • Assignee:
              Unassigned
              Reporter:
              grisha Gregory Trubetskoy