[KITE-144] Support loading of dataset repositories from URIs - Cloudera Open Source

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.7.0
Fix Version/s: 0.8.0
Component/s: Data Module
Labels:
None

Description

It would be really nice to support correctly loading DatasetRepository instances from a URI.

I currently have code that uses the following format for URIs:
dsr:<storage component>

<storage component> can be one of the following:

file:<path> where <path> is relative or absolute and indicates the root directory. An authority is not legal in a file: storage component. The null-authority form that is common in the wild, e.g. file://<path>, is also legal; in that case the path must always be absolute.
hdfs://<host>:<port>/<path> where <host> and <port> are required and <path> designates the root directory. The path may not be relative.
hive://<host>:<port>/<database> where <host>, <port>, and <database> are all required. The authority (host and port) indicates the metastore server to connect to.
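
As a rough illustration of the rules above, here is a minimal sketch of how such URIs could be validated with java.net.URI. The class DsrUriSketch, its Storage holder, and the parse and isValid methods are hypothetical names made up for this example; they are not the code referred to in this issue and not part of the DatasetRepository API. The sketch also assumes that a hive: database name contains no slashes.

```java
import java.net.URI;

/** Hypothetical helper for illustration only; not part of the actual code base. */
final class DsrUriSketch {

  /** Parsed pieces of a storage component. */
  static final class Storage {
    final String scheme; // "file", "hdfs" or "hive"
    final String host;   // null for file:
    final int port;      // -1 for file:
    final String path;   // root directory, or the database name for hive:

    Storage(String scheme, String host, int port, String path) {
      this.scheme = scheme;
      this.host = host;
      this.port = port;
      this.path = path;
    }
  }

  /** Parses "dsr:<storage component>"; throws IllegalArgumentException if a rule is violated. */
  static Storage parse(String uriString) {
    URI outer = URI.create(uriString);
    if (!"dsr".equals(outer.getScheme())) {
      throw new IllegalArgumentException("Not a dsr: URI: " + uriString);
    }
    // "dsr:<storage component>" is an opaque URI, so the storage component
    // is simply the scheme-specific part, which is parsed again as a URI.
    URI storage = URI.create(outer.getRawSchemeSpecificPart());
    String scheme = storage.getScheme();
    if ("file".equals(scheme)) {
      if (storage.getAuthority() != null) {
        throw new IllegalArgumentException("file: must not have an authority");
      }
      // "file:relative/path" is opaque; "file:/abs" and "file:///abs" are hierarchical.
      String path = storage.isOpaque() ? storage.getSchemeSpecificPart() : storage.getPath();
      return new Storage("file", null, -1, path);
    }
    if ("hdfs".equals(scheme) || "hive".equals(scheme)) {
      String path = storage.getPath();
      if (storage.getHost() == null || storage.getPort() == -1) {
        throw new IllegalArgumentException(scheme + ": requires <host>:<port>");
      }
      if (path == null || path.length() < 2 || !path.startsWith("/")) {
        throw new IllegalArgumentException(scheme + ": requires a path after the authority");
      }
      if ("hive".equals(scheme)) {
        path = path.substring(1); // the single path segment is the database name
        if (path.contains("/")) {
          throw new IllegalArgumentException("hive: expects exactly one database name");
        }
      }
      return new Storage(scheme, storage.getHost(), storage.getPort(), path);
    }
    throw new IllegalArgumentException("Unsupported storage component: " + storage);
  }

  /** Convenience wrapper: true if parse() accepts the URI. */
  static boolean isValid(String uriString) {
    try {
      parse(uriString);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    Storage s = parse("dsr:hdfs://namenode:8020/data");
    System.out.println(s.scheme + " " + s.host + " " + s.port + " " + s.path);
  }
}
```

In this sketch a URI such as dsr:file://somehost/tmp is rejected because java.net.URI reports somehost as an authority, while dsr:file:///tmp/data is accepted because its authority is empty.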

The hive: storage component implementation is currently incomplete. I'll open another JIRA for that as an enhancement. The intention is to let DatasetRepository implementations continue to pick their datasets' paths. In the case of hive://, the thinking is that a dataset created with a specific location in its DatasetDescriptor will function as an external table. All others will be "internal" or "normal" Hive tables. This is entirely independent of CDK-139 and does not conflict with it.

All of this is done outside of the existing code, as a thin layer on top of it that simply instantiates things correctly.

URIs are used rather than URLs because these identifiers are opaque locations and not necessarily singular resources, per the RFC.

References:
RFC 2396 (URIs) - http://www.ietf.org/rfc/rfc2396.txt

People

Assignee:

Eric Sammer

Reporter:

Eric Sammer

Dates

Created:

23/Sep/13 12:41 AM

Updated:

25/Sep/13 2:43 PM

Resolved:

25/Sep/13 2:43 PM