Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.9.0, 0.10.0
- Fix Version/s: None
- Component/s: Maven Module
- Labels:

Description
The create-dataset Maven mojo only supports setting the Dataset schema through reflection or from a local file. Here is the local-file logic:
    if (avroSchemaFile != null) {
      File avroSchema = new File(avroSchemaFile);
      try {
        if (avroSchema.exists()) {
          descriptorBuilder.schema(avroSchema);
        } else {
          descriptorBuilder.schema(Resources.getResource(avroSchemaFile).openStream());
        }
      } catch (IOException e) {
        throw new MojoExecutionException("Problem while reading file " + avroSchemaFile, e);
      }
    } else if (avroSchemaReflectClass != null) {
      ...
    }
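The lookup order in the snippet above (a local file first, then a classpath resource of the same name) can be tried in isolation. The following is a minimal sketch using only JDK classes; `SchemaFileLookup` and `open` are hypothetical names for illustration and are not part of the plugin. It substitutes the JDK class loader for Guava's `Resources.getResource`, so its failure behavior is an assumption rather than a copy of the mojo's.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

// Hypothetical helper, not part of the plugin: mirrors the lookup order of
// the mojo snippet (local file, then classpath) with JDK classes only.
final class SchemaFileLookup {

  private SchemaFileLookup() {
  }

  // Opens the schema from a local file if one exists at the given path,
  // otherwise from a classpath resource with the same name.
  static InputStream open(String avroSchemaFile) throws IOException {
    File avroSchema = new File(avroSchemaFile);
    if (avroSchema.exists()) {
      return Files.newInputStream(avroSchema.toPath());
    }

    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      loader = SchemaFileLookup.class.getClassLoader();
    }
    InputStream in = loader.getResourceAsStream(avroSchemaFile);
    if (in == null) {
      // Guava's Resources.getResource reports a missing resource with an
      // IllegalArgumentException; this sketch raises IOException instead so
      // that a single catch block covers both lookup paths.
      throw new IOException(
          "Schema not found as file or classpath resource: " + avroSchemaFile);
    }
    return in;
  }
}
```

A caller would wrap `SchemaFileLookup.open(avroSchemaFile)` in try-with-resources and hand the stream to whatever consumes the schema. In both branches the caller ends up with the schema contents, not a location.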
This doesn't matter for FS Datasets, because there is always an HDFS schema URL, but it is a known problem for Hive (CDK-142). It also causes warning messages when using the log4j appender, because the schema literal is attached to each message (CDK-269).
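The contrast here is between a schema referenced by URL and a schema embedded as a literal. If the mojo were to accept a schema location, one possible first step would be deciding whether the configured string is already a URI or just a local path. The sketch below is an assumption about how that check could look, not the plugin's behavior; `SchemaLocation` and `resolve` are hypothetical names, and the `hdfs://` host and path in the usage note are made up.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the plugin: turns a configured schema
// location into a URI without reading the schema contents.
final class SchemaLocation {

  private SchemaLocation() {
  }

  // Returns the location unchanged if it already carries a URI scheme
  // (for example hdfs: or file:); otherwise treats it as a local path.
  static URI resolve(String location) {
    try {
      URI uri = new URI(location);
      String scheme = uri.getScheme();
      // A one-letter "scheme" is most likely a Windows drive letter (C:/...),
      // so only longer schemes are accepted as real URIs.
      if (scheme != null && scheme.length() > 1) {
        return uri;
      }
    } catch (URISyntaxException e) {
      // Not a valid URI (for example, it contains backslashes or spaces);
      // fall through and treat it as a plain file path.
    }
    return new File(location).getAbsoluteFile().toURI();
  }
}
```

For example, `SchemaLocation.resolve("hdfs://namenode/schemas/user.avsc")` returns the URI as given, while `SchemaLocation.resolve("schemas/user.avsc")` returns an absolute `file:` URI based on the current working directory.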