Kite SDK (READ-ONLY) / KITE-270

Maven integration should allow schemas to be stored in HDFS

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0, 0.10.0
    • Fix Version/s: None
    • Component/s: Maven Module
    • Labels:

      Description

      The create-dataset Maven mojo supports setting the Dataset schema only through reflection or from a local file (falling back to a classpath resource of the same name). Here's the local file logic:

        if (avroSchemaFile != null) {
          File avroSchema = new File(avroSchemaFile);
          try {
            if (avroSchema.exists()) {
              descriptorBuilder.schema(avroSchema);
            } else {
              descriptorBuilder.schema(Resources.getResource(avroSchemaFile).openStream());
            }
          } catch (IOException e) {
            throw new MojoExecutionException("Problem while reading file " + avroSchemaFile, e);
          }
        } else if (avroSchemaReflectClass != null) { ... }
      

      This doesn't matter for FS Datasets, because there is always an HDFS schema URL, but it is a known problem for Hive (CDK-142). It also causes warning messages when using the log4j appender, because the schema literal is attached to each message (CDK-269).
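
As a rough sketch of the direction this improvement could take, the mojo could first check whether the configured schema location is a URI with a scheme (for example `hdfs:`) before falling back to the existing local file and classpath lookups. The class `SchemaLocationSketch`, its `Kind` enum, and the methods `schemeOf` and `classify` are hypothetical names used only for illustration and are not part of Kite. The host and path in the example are placeholders. The sketch assumes that a location classified as a URI would then be handed to a URI-accepting builder method (such as `schemaUri`, if the Kite version in use provides one) rather than being read into a schema literal.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Kite: decides how a configured schema
// location string should be interpreted.
class SchemaLocationSketch {

  enum Kind { URI_LOCATION, LOCAL_FILE, CLASSPATH_RESOURCE }

  // Returns the URI scheme (e.g. "hdfs"), or null when the string has none.
  // Single-letter schemes are ignored so that Windows drive paths such as
  // "C:/schemas/user.avsc" are still treated as local paths.
  static String schemeOf(String location) {
    try {
      String scheme = new URI(location).getScheme();
      return (scheme != null && scheme.length() > 1) ? scheme : null;
    } catch (URISyntaxException e) {
      return null; // not a valid URI, so treat it as a plain path
    }
  }

  // Mirrors the existing order (local file, then classpath resource), with a
  // URI check added in front of it.
  static Kind classify(String location) {
    if (schemeOf(location) != null) {
      return Kind.URI_LOCATION;
    }
    if (new File(location).exists()) {
      return Kind.LOCAL_FILE;
    }
    return Kind.CLASSPATH_RESOURCE;
  }

  public static void main(String[] args) {
    // Placeholder host and path, for illustration only.
    System.out.println(classify("hdfs://namenode:8020/schemas/user.avsc"));
    System.out.println(classify("schemas/user.avsc"));
  }
}
```

Checking for a scheme first keeps the current behavior unchanged for plain paths, so existing plugin configurations that point at a local file or classpath resource would not be affected.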

              People

              • Assignee: Kaufman Ng (kaufman)
              • Reporter: Ryan Blue (blue)
              • Votes: 0
              • Watchers: 0
