Release Notes - Kite SDK - Version 0.15.0

Bug

  • [KITE-395] - CLI Dataset csv-schema examples do not include the required --class/--recordName setting
  • [KITE-405] - Kite and CDH 5 Issues
  • [KITE-426] - File writer caching should increase if thrashing
  • [KITE-457] - Snappy compression doesn't work with Parquet when Hadoop native snappy libs are not available
  • [KITE-463] - FilesystemDataset.deleteAll() on non-partitioned Dataset throws IllegalStateException
  • [KITE-469] - Dataset CLI should use avro schema URI property
  • [KITE-470] - Identity PartitionStrategy on String values with slashes leads to invalid Paths
  • [KITE-481] - SchemaTool createOrMigrateSchemaDirectory should not require StorageKey
  • [KITE-483] - Hive integration does not use environment's metastore config
  • [KITE-495] - Type inference error in Datasets.create
  • [KITE-505] - TestSchemaCommand unit test failure on OSX
  • [KITE-506] - Add compat for hbase-0.98
  • [KITE-508] - tryRules morphline command should also catch exceptions in doNotify() if catchExceptions : true
  • [KITE-509] - Fix crunch and solr versions for CDH5 test
  • [KITE-517] - Creating constraints from a view URI can result in incompatible string types
  • [KITE-521] - Merge docs changes from 0.14.x
  • [KITE-522] - Video links use target link attribute
  • [KITE-532] - ReadUserDatasetGenericOnePartition example returns no records
  • [KITE-533] - Dataset Compatibility Example fails with error in loader
  • [KITE-534] - Demo example no longer compiles since it uses deprecated methods
  • [KITE-535] - FileSystemDatasetRepository.partitionKeyForPath breaks backwards compatibility
  • [KITE-536] - AvroSerialization.setDataModelClass is not available on CDH4
  • [KITE-537] - Empty views cause MR failures
  • [KITE-538] - Write tests for DataModelUtil
  • [KITE-540] - Datasets.load(String, ? extends GenericRecord) will always use GenericData.Record for reading
  • [KITE-542] - Fix Avro 1.7.4 incompatibilities

Epic

  • [KITE-374] - Support for data compaction util

New Feature

  • [KITE-268] - Add morphline command that parses a non-Avro Parquet file
  • [KITE-454] - Add morphline command that removes all record fields for which the field name matches a blacklist but not a whitelist
  • [KITE-504] - Update to crunch-0.10

Task

  • [KITE-219] - Update HCatalog Maven group ID from org.apache.hcatalog to org.apache.hive.hcatalog
  • [KITE-281] - Push Kite artifacts to maven central
  • [KITE-283] - Move to Apache releases for dependencies
  • [KITE-332] - Call DatasetReader#open inside View#newReader
  • [KITE-407] - Document new PartitionStrategy format
  • [KITE-465] - Convert documentation from GitHub to Jekyll
  • [KITE-484] - Clean up separation between Registration and Datasets
  • [KITE-485] - Document the new CopyCommand in CLI
  • [KITE-490] - Remove methods deprecated in 0.14.0
  • [KITE-491] - Remove methods deprecated in 0.13.0
  • [KITE-498] - Add update command to CLI
  • [KITE-499] - Add a CLI command to help create column mappings
  • [KITE-500] - Add option to create command for column mapping file
  • [KITE-511] - Integrate kite-docs into release process
  • [KITE-514] - Move examples over to the views API

Improvement

  • [KITE-92] - Support use of a generic dataset reader even if reflect pojo is on classpath
  • [KITE-385] - Dataset URIs
  • [KITE-423] - Remove HCatalog dependency
  • [KITE-428] - Separate intro documentation from API intro
  • [KITE-429] - Document new ColumnMapping format
  • [KITE-445] - Hive repositories should open any Kite/Hive table
  • [KITE-453] - MapReduce API
  • [KITE-455] - Use generic Dataset type in DatasetRepository
  • [KITE-486] - Add EmbeddedSolrServer option to SolrLocator
  • [KITE-496] - Datasets#load, create, and update should take in a Class<E> parameter so you can control the type of the entities you want
  • [KITE-510] - Add PartitionedDataset
  • [KITE-512] - Add DatasetTarget.toString
  • [KITE-525] - Add page titles to kite-docs jekyll
  • [KITE-527] - Code block style should be consistent
  • [KITE-529] - Add Datasets methods without type
  • [KITE-530] - Deprecate CrunchDatasets#asSource with type argument

Sub-task

  • [KITE-94] - Create CDK runtime convenience Maven dependencies
  • [KITE-192] - Write a tool to convert a dataset to another format
  • [KITE-238] - Deprecate PartitionKey
  • [KITE-240] - Remove bound checking for unrestricted RandomAccessDataset writers
  • [KITE-265] - Hive repositories can see tables that cannot be opened
  • [KITE-367] - Support key mapping that uses a key field rather than a column
  • [KITE-391] - Publish different versions of the Kite Maven Plugin corresponding to the different Hadoop versions
  • [KITE-392] - Make Hadoop 2 the default profile
  • [KITE-456] - Add view constraints for derived partition fields
  • [KITE-488] - Add reducers to copy task to control the number of output files
  • [KITE-492] - Add Parquet appender that uses Avro for durability
