Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.1.0
- Fix Version/s: 1.2.0
- Component/s: Data Module
- Labels: None
Description
I'm still trying to track down the issue, but after upgrading a project to Kite 1.1.0 I started getting test failures. The tests use MiniDFSCluster with the local job tracker, and I am currently on CDH 5.4.2.
Looking through the test logs, I see messages like the following where the DatasetWriter is created:
2015-06-23 11:20:09,516 [pool-18-thread-1] INFO org.kitesdk.data.spi.filesystem.FileSystemWriter - Opened output appender AvroAppender{path=hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000001_0/entitytype_partition=%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64/source_partition=source-2/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00/.4c0bdb72-f644-46c3-9780-7566f93ff564.avro.tmp, schema={"type":"record","name":"WolfeEvent","namespace":"com.cerner.wolfe","doc":"Represents a data event that flows through Wolfe.","fields":[{"name":"entityKey","type":{"type":"string","avro.java.string":"String"},"doc":"String representation of the event's EntityKey."},{"name":"entityType","type":{"type":"string","avro.java.string":"String"},"doc":"String representation of the event's EntityType."},{"name":"source","type":{"type":"string","avro.java.string":"String"},"doc":"Consumer Key of the event's originating source."},{"name":"version","type":"long","doc":"The consumer-specified version of the event."},{"name":"creationTime","type":"long","doc":"The time in milliseconds since epoch when the event was created in Wolfe."},{"name":"extractionTime","type":"long","doc":"The time in milliseconds since epoch when the event was extracted from Wolfe's staging or -1 if this value has not been set.","default":-1},{"name":"value","type":["null","bytes"],"doc":"The optional payload of the event.","default":null},{"name":"metadata","type":["null","bytes"],"doc":"The optional metadata of the event.","default":null},{"name":"properties","type":[{"type":"map","values":{"type":"string","avro.java.string":"String"},"avro.java.string":"String"},"null"],"doc":"Optional properties of the event.","default":{}},{"name":"operation","type":["null",{"type":"string","avro.java.string":"String"}],"doc":"Type of operation for the event.","default":null}]}, 
fileSystem=DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-908318914_1, ugi=mw010351 (auth:SIMPLE)]], enableCompression=true, dataFileWriter=org.apache.avro.file.DataFileWriter@6a013bf7, writer=org.apache.avro.reflect.ReflectDatumWriter@77080b7f} for hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000001_0/entitytype_partition=%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64/source_partition=source-2/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00/4c0bdb72-f644-46c3-9780-7566f93ff564.avro
The pipeline then fails, however, and the following message appears in the logs:
2015-06-23 11:20:12,112 [Thread-129] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local420942868_0001 java.lang.Exception: org.kitesdk.data.DatasetIOException: Could not move contents of hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 to hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/job_local420942868_0001/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) Caused by: org.kitesdk.data.DatasetIOException: Could not move contents of hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 to hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/job_local420942868_0001/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 at org.kitesdk.data.spi.filesystem.FileSystemUtil.stageMove(FileSystemUtil.java:117) at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:400) at 
org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:62) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitTask(DatasetKeyOutputFormat.java:426) at org.apache.hadoop.mapred.Task.commit(Task.java:1163) at org.apache.hadoop.mapred.Task.done(Task.java:1025) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:397) at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.FileNotFoundException: File hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:658) at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:104) at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:716) at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:712) at org.kitesdk.data.spi.filesystem.FileSystemUtil.stageMove(FileSystemUtil.java:96) ... 12 more 1 job failure(s) occurred:
I haven't tracked down where this happens yet, but look at where the FileSystemWriter is pointing:
hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000001_0/entitytype_partition=%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64/source_partition=source-2/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00/4c0bdb72-f644-46c3-9780-7566f93ff564.avro
and then compare it with the path whose contents the commit is trying to move:
hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00
You'll notice that
%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64 becomes
%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64
So something is re-escaping the URI; specifically, the percent signs in the already-escaped partition path are being escaped a second time (%2F becomes %252F, %3A becomes %253A).
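The double escaping can be reproduced with the JDK's URLEncoder, used here only as a stand-in for Kite's own path escaping (the exact Kite escaping utility is not shown in the logs, so the class below is purely illustrative). Applying the encoder twice to the raw partition value produces exactly the %252F/%253A form seen in the failing path:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DoubleEscapeDemo {
    public static void main(String[] args) {
        // Partition value before escaping (hypothetical input, reconstructed
        // from the escaped form in the writer log above).
        String raw = "/source:string/type_4:int64/payload:int64";

        // First escaping pass: '/' -> %2F, ':' -> %3A
        String once = URLEncoder.encode(raw, StandardCharsets.UTF_8);
        System.out.println(once);
        // %2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64

        // Second escaping pass: the '%' signs themselves become %25,
        // which is what the commit-time move path shows.
        String twice = URLEncoder.encode(once, StandardCharsets.UTF_8);
        System.out.println(twice);
        // %252Fsource%253Astring%252Ftype_4%253Aint64%252Fpayload%253Aint64
    }
}
```

The writer creates the file under the once-escaped directory, while the task commit computes the twice-escaped directory, so the move fails with FileNotFoundException.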