Release Notes - Kite SDK - Version 0.12.0

Bug

  • [KITE-142] - Descriptors with file: schema URLs don't work with Hive/Impala
  • [KITE-174] - FS MetadataProvider should remove directories if data is not co-located.
  • [KITE-195] - PartitionStrategy names must not duplicate columns in Hive
  • [KITE-276] - Parallel builds of the Kite HBase Data Module result in random errors in tests
  • [KITE-317] - Repository URI for managed Hive table should not require a path component
  • [KITE-323] - Unpartitioned FS Datasets will create empty files when getWriter is called
  • [KITE-358] - Crunch target for Hive dataset fails
  • [KITE-360] - DatasetKeyOutputFormat creates orphan directories
  • [KITE-361] - Demo example fails with 'java.lang.ClassNotFoundException: org.apache.hcatalog.common.HCatUtil'
  • [KITE-362] - Dataset example is failing with 'org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4'

New Feature

  • [KITE-311] - Add DatasetRepository#getUri method
  • [KITE-318] - Add a sampling morphline command that forwards each input record with a given probability to its child, and silently ignores all other input records
  • [KITE-319] - Add morphline command that ignores all input records beyond the N-th record, thus emitting at most N records, akin to the Unix head command
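
The behavior described for KITE-318 and KITE-319 can be illustrated with a minimal Java sketch. This is not the Kite Morphlines implementation or its API: the class RecordFilters and the helpers sample, head, and apply are hypothetical names used only for illustration, and the fixed random seed is an assumption made so the example is repeatable. The sampling filter forwards each record independently with a given probability and silently drops the rest; the head filter forwards at most the first N records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical illustration only; not part of the Kite SDK.
class RecordFilters {

    // Forwards each record independently with the given probability
    // and drops all others. The seed is an assumption for repeatability.
    public static <T> Predicate<T> sample(double probability, long seed) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        Random random = new Random(seed);
        // nextDouble() returns a value in [0.0, 1.0), so probability 1.0
        // forwards every record and probability 0.0 forwards none.
        return record -> random.nextDouble() < probability;
    }

    // Forwards the first `limit` records and drops everything after them,
    // akin to the Unix head command.
    public static <T> Predicate<T> head(long limit) {
        long[] forwarded = {0};
        return record -> {
            if (forwarded[0] >= limit) {
                return false;
            }
            forwarded[0]++;
            return true;
        };
    }

    // Runs the records through a filter in order and collects what it forwards.
    public static <T> List<T> apply(List<T> records, Predicate<T> filter) {
        List<T> out = new ArrayList<>();
        for (T record : records) {
            if (filter.test(record)) {
                out.add(record);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> records = Arrays.asList(1, 2, 3, 4, 5);
        // At most three records are emitted, in input order.
        System.out.println(apply(records, RecordFilters.<Integer>head(3)));
        // Roughly half of the records are emitted; the exact set depends on the seed.
        System.out.println(apply(records, RecordFilters.<Integer>sample(0.5, 42L)));
    }
}
```

Both filters are stateful, so a new instance is needed for each pass over the input. In the sketch, the head filter keeps no record contents, only a counter, which is why it stays cheap for large inputs.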

Task

  • [KITE-313] - Deprecate MetadataProvider usages

Improvement

  • [KITE-247] - The dataset-staging example should run in parallel
  • [KITE-310] - addValuesIfAbsent morphline command should avoid performance degradation for large N
  • [KITE-312] - Add a module that contains examples for how to unit test Morphline config files and custom Morphline commands
  • [KITE-315] - Improve morphline import performance if all commands are specified via fully qualified class names

Sub-task

  • [KITE-304] - Backport Flume DatasetSink changes
  • [KITE-307] - Add DatasetInputFormat
  • [KITE-308] - Add DatasetOutputFormat
  • [KITE-324] - Regression in support for POJOs (Avro reflect) in Crunch dataset source
  • [KITE-325] - Minimize view constraints when filtering entities
  • [KITE-343] - Move partition strategy and field partitioners to SPI
  • [KITE-348] - RefineableView should be spelled without the "e" in the middle (i.e., RefinableView)
  • [KITE-354] - Add View#deleteAll
