Kite SDK (READ-ONLY) / KITE-54

Crunch dataset sources only read first file in directory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.3.0
    • Component/s: Data Module
    • Labels:
      None

      Description

      FileSystemDataset#getCrunchSource uses only the first file in the dataset directory. It should either use all of the files (if that's possible in Crunch), or use the directory itself as the source and fail if the dataset is partitioned, so that callers have to specify a leaf partition. The latter would be simpler, but it doesn't currently work because Crunch's CompositePathIterable#FILTER doesn't ignore all hidden files: it skips only those whose names start with _.
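
      To illustrate the filtering gap described above, here is a minimal sketch in plain Java. It is not Crunch's or Kite's actual code: the class name, method names, and file names are hypothetical, and it assumes that the hidden files Crunch fails to skip are the ones whose names start with a dot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not the Crunch or Kite implementation.
public class HiddenFileFilterSketch {

    // Models the behavior described in the issue: only "_"-prefixed names are skipped.
    static boolean acceptUnderscoreOnly(String fileName) {
        return !fileName.startsWith("_");
    }

    // Assumed fix: also skip names that start with "." (assumption, see lead-in).
    static boolean acceptVisible(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    // Returns the names a directory-based source would read under the assumed fix.
    static List<String> visibleFiles(List<String> fileNames) {
        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            if (acceptVisible(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up directory listing for demonstration.
        List<String> listing =
            Arrays.asList("_SUCCESS", ".metadata", "part-0.avro", "part-1.avro");
        System.out.println(visibleFiles(listing)); // prints [part-0.avro, part-1.avro]
    }
}
```

      With the underscore-only check, an entry such as .metadata would be passed to the reader as if it were data, which is why using the whole directory as the source does not work until the filter also excludes dot-prefixed names.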


            People

            • Assignee: Tom White
            • Reporter: Tom White
