Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 1.1.0
- Fix Version/s: None
- Component/s: Command-line Interface
- Labels: None
- Environment: RHEL 6.2, CDH 5.1.4
Description
With the sample file provided above, the following commands reproduce the error:
./kite-dataset json-schema tc_sample_data.flat.json --class MkoSample -o mkosample.avsc
./kite-dataset create dataset:hive:iptv/mkosample -s mkosample.avsc
./kite-dataset json-import tc_sample_data.flat.json dataset:hive:iptv/mkosample
This yields the following exception:
java.lang.Exception: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"badDate","namespace":"org.apache.avro.mapred","fields":[{"name":"timestamp","type":"long","doc":"Type inferred from '1436854413000'"},{"name":"iso","type":"string","doc":"Type inferred from '\"2015-07-14T07:13:33+01:00\"'"}]}]: {"timestamp": 1436854413000, "iso": "2015-07-14T07:13:33+01:00"}
There are three records in the sample data. Two of them contain a struct (record) value for the "badDate" key; the third does not contain this key at all.
The Avro schema produced for this field is:
{
  "name" : "badDate",
  "type" : [ "null", {
    "type" : "record",
    "name" : "badDate",
    "fields" : [ {
      "name" : "timestamp",
      "type" : "long",
      "doc" : "Type inferred from '1436854413000'"
    }, {
      "name" : "iso",
      "type" : "string",
      "doc" : "Type inferred from '\"2015-07-14T07:13:33+01:00\"'"
    } ]
  } ],
  "doc" : "Type inferred from '{\"timestamp\":1436854413000,\"iso\":\"2015-07-14T07:13:33+01:00\"}'",
  "default" : null
}
Is this an issue with Kite in particular, with Avro, or with both? I would expect this content to import without error, given that the field's default is null.
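One detail that may be relevant: the union printed in the exception carries the namespace "org.apache.avro.mapred" on the "badDate" record, while the generated schema has no namespace. Avro's generic union resolution matches a record value to a union branch by the record schema's full name, so a namespace mismatch between the two would plausibly produce "Not in union". The Python sketch below is only a simplified model of that lookup, not Avro's or Kite's actual code; the helper names `full_name`, `resolve_union`, and `try_resolve` are hypothetical, and whether this mismatch is the real cause here is an assumption.

```python
# Simplified model of name-based union resolution (NOT Avro's real implementation).

def full_name(schema):
    """Return the namespace-qualified name of a schema, or its primitive type name."""
    if isinstance(schema, str):
        return schema  # primitive such as "null" or "long"
    name = schema.get("name", schema["type"])
    namespace = schema.get("namespace")
    return f"{namespace}.{name}" if namespace else name


def resolve_union(union, datum_schema):
    """Return the index of the union branch whose full name matches the datum's schema."""
    wanted = full_name(datum_schema)
    for index, branch in enumerate(union):
        if full_name(branch) == wanted:
            return index
    raise LookupError(f"Not in union: {wanted}")


def try_resolve(union, datum_schema):
    """Like resolve_union, but return None instead of raising."""
    try:
        return resolve_union(union, datum_schema)
    except LookupError:
        return None


fields = [
    {"name": "timestamp", "type": "long"},
    {"name": "iso", "type": "string"},
]

# Union branch as printed in the exception message (record has a namespace).
union_in_exception = [
    "null",
    {"type": "record", "name": "badDate",
     "namespace": "org.apache.avro.mapred", "fields": fields},
]

# Record schema as printed in the generated .avsc (no namespace).
record_in_avsc = {"type": "record", "name": "badDate", "fields": fields}

# Same full name on both sides: the record branch is found.
print(try_resolve(["null", record_in_avsc], record_in_avsc))

# Namespaced branch vs. un-namespaced datum schema: no branch matches.
print(try_resolve(union_in_exception, record_in_avsc))
```

If this is what happens, the value itself is well-formed and the null default is not involved; the failure would come from the two sides disagreeing on the record's full name.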