Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.0.0
- Fix Version/s: None
- Component/s: Data Module
- Labels: None
Description
If a job is run with the default setting for mapreduce.am.max-attempts (or any value greater than 1), subsequent attempts are doomed to fail with exceptions like the following:
Job setup failed : org.kitesdk.data.DatasetExistsException: Descriptor directory already exists: hdfs://cluster/foo/default/.temp/job_1427843847272_2226/mr/job_1427843847272_2226/.metadata
	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:192)
	at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:136)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.createJobDataset(DatasetKeyOutputFormat.java:537)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.access$200(DatasetKeyOutputFormat.java:64)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.setupJob(DatasetKeyOutputFormat.java:358)
	at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:233)
	at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:213)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
This happens because subsequent AM attempts run with the same jobId, so the .metadata directory created during job setup by the first attempt already exists when the retry attempts to create it again.
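As a stopgap, the report implies that limiting the ApplicationMaster to a single attempt avoids the collision, since there is never a retry to trip over the leftover directory. A minimal sketch of that configuration (not part of the original report; the job name and remaining setup are placeholders):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SingleAttemptJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Restrict the MR ApplicationMaster to one attempt so a failed
        // job is not retried under the same jobId, which would hit the
        // DatasetExistsException on the leftover .metadata directory.
        conf.setInt("mapreduce.am.max-attempts", 1);

        Job job = Job.getInstance(conf, "kite-dataset-job");
        // ... configure DatasetKeyOutputFormat and submit as usual ...
    }
}
{code}

Note this trades retry resilience for a clean failure: a transient AM crash now fails the whole job instead of producing the misleading exception above.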
Issue Links
- duplicates KITE-735: Crunch-based MapReduce fails when writing to a View with provided values (Resolved)