  1. Kite SDK (READ-ONLY)
  2. KITE-972

Add PartitionView for data management operations generally associated with partitions

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1.0
    • Component/s: None
    • Labels: None

      Description

      As discussed on the mailing list, [1] add a PartitionView to support operations that use partitions in other Hadoop-based tools, such as HCatalog/Hive.

      The semantics of a PartitionView should be refined as part of this issue, but here's a proposed starting point:

      • A PartitionView is simply a Kite View with certain properties.
      • These properties include:
        • A PartitionView is uniquely identified by a set of keys (e.g., {year = 2015, month = 3, day = 26}).
        • A PartitionView can be deleted (i.e., view.deleteAll() is guaranteed to work).
        • A PartitionView can be efficiently moved or archived for data management needs – but functions to do so may be out of the scope of this issue.
      • There should be a way to enumerate the PartitionViews for a Dataset or View.
      • Enumerating all PartitionViews should match the behavior of enumerating partitions in Hive (e.g. "show partitions") if possible. This seems like the Least Surprising behavior and stays consistent for those also using other systems.
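
      To make the proposed semantics concrete, here is a minimal, self-contained sketch. It is not Kite's API: the names PartitionSketch, SketchPartitionView, SketchDataset, add, showPartitions, hiveStyleName and size are hypothetical, and the data is kept in memory purely for illustration. It shows a view identified by an ordered set of keys, a deleteAll() that always succeeds, and an enumeration that prints names in the key=value/key=value form that Hive's "show partitions" uses.

      ```java
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.LinkedHashMap;
      import java.util.List;
      import java.util.Map;
      import java.util.StringJoiner;

      public class PartitionSketch {

          /** Hypothetical stand-in for a PartitionView: a view identified by a set of keys. */
          static final class SketchPartitionView {
              private final Map<String, Integer> keys;  // e.g. {year=2015, month=3, day=26}
              private final List<String> records;       // in-memory stand-in for the partition's data

              SketchPartitionView(Map<String, Integer> keys, List<String> records) {
                  this.keys = new LinkedHashMap<>(keys); // keeps year/month/day order
                  this.records = new ArrayList<>(records);
              }

              /** Renders the keys in Hive's partition-name style: year=2015/month=3/day=26. */
              String hiveStyleName() {
                  StringJoiner joiner = new StringJoiner("/");
                  keys.forEach((name, value) -> joiner.add(name + "=" + value));
                  return joiner.toString();
              }

              /** Removes everything in this partition; never rejected for a whole partition. */
              void deleteAll() {
                  records.clear();
              }

              int size() {
                  return records.size();
              }
          }

          /** Hypothetical dataset that can enumerate its partition views. */
          static final class SketchDataset {
              // Keyed by the key set, so each key set identifies exactly one view.
              private final Map<Map<String, Integer>, SketchPartitionView> partitions =
                      new LinkedHashMap<>();

              SketchPartitionView add(int year, int month, int day, String... records) {
                  Map<String, Integer> keys = new LinkedHashMap<>();
                  keys.put("year", year);
                  keys.put("month", month);
                  keys.put("day", day);
                  SketchPartitionView view = new SketchPartitionView(keys, Arrays.asList(records));
                  partitions.put(keys, view);
                  return view;
              }

              /** Lists partition names in insertion order, one entry per partition view. */
              List<String> showPartitions() {
                  List<String> names = new ArrayList<>();
                  for (SketchPartitionView view : partitions.values()) {
                      names.add(view.hiveStyleName());
                  }
                  return names;
              }
          }

          public static void main(String[] args) {
              SketchDataset events = new SketchDataset();
              SketchPartitionView march26 = events.add(2015, 3, 26, "a", "b");
              events.add(2015, 3, 27, "c");

              for (String name : events.showPartitions()) {
                  System.out.println(name);
              }

              march26.deleteAll();
              System.out.println("records left in first partition: " + march26.size());
          }
      }
      ```

      Whether an emptied partition should still appear in the enumeration is one of the semantics this issue would need to settle; the sketch leaves it listed.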

      [1]
      https://groups.google.com/a/cloudera.org/forum/#!topic/cdk-dev/RAJIdJSadT0

              People

              • Assignee: Ryan Blue (blue)
              • Reporter: Ryan Brush (rbrush)
              • Votes: 0
              • Watchers: 4