Kite SDK / KITE-600

Sporadic DatasetNotFoundException

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels: None

      Description

      I have a Hadoop process on a CDH 4.4 cluster that is scheduled to run multiple times per day and uses Kite and Crunch to write Avro data. About 50% of the time, the process fails with the following exception:

      2014-08-19 05:26:39,165 WARN org.apache.hadoop.mapred.Child: Error running child
      org.kitesdk.data.DatasetNotFoundException: Descriptor location is missing: hdfs://host/my/path/datasetname/.temp/job_201408151403_31467/job_201408151403_31467/.metadata
          at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.checkExists(FileSystemMetadataProvider.java:426)
          at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.load(FileSystemMetadataProvider.java:108)
          at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:216)
          at org.kitesdk.data.spi.AbstractDatasetRepository.load(AbstractDatasetRepository.java:41)
          at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadJobDataset(DatasetKeyOutputFormat.java:444)
          at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadOrCreateTaskAttemptDataset(DatasetKeyOutputFormat.java:455)
          at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getRecordWriter(DatasetKeyOutputFormat.java:329)
          at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:548)
          at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:653)
          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
          at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
          at org.apache.hadoop.mapred.Child.main(Child.java:262)
      

      This issue originated from a mailing list thread.

            People

            • Assignee: Ryan Blue
            • Reporter: Allan Shoup
            • Votes: 0
            • Watchers: 5
