Kite SDK (READ-ONLY) / KITE-1023

Persisting to dataset fails on moving file because of double encoding

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Data Module
    • Labels:
      None

      Description

      I'm still trying to track down the issue, but after upgrading a project to Kite 1.1.0 I started getting some test failures. The tests use MiniDFSCluster with the local job tracker, and I am currently on CDH 5.4.2.

      Looking through the test logs, I see messages like the following where the DatasetWriter is created.

      2015-06-23 11:20:09,516 [pool-18-thread-1] INFO  org.kitesdk.data.spi.filesystem.FileSystemWriter - Opened output appender AvroAppender{path=hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000001_0/entitytype_partition=%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64/source_partition=source-2/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00/.4c0bdb72-f644-46c3-9780-7566f93ff564.avro.tmp, schema={"type":"record","name":"WolfeEvent","namespace":"com.cerner.wolfe","doc":"Represents a data event that flows through Wolfe.","fields":[{"name":"entityKey","type":{"type":"string","avro.java.string":"String"},"doc":"String representation of the event's EntityKey."},{"name":"entityType","type":{"type":"string","avro.java.string":"String"},"doc":"String representation of the event's EntityType."},{"name":"source","type":{"type":"string","avro.java.string":"String"},"doc":"Consumer Key of the event's originating source."},{"name":"version","type":"long","doc":"The consumer-specified version of the event."},{"name":"creationTime","type":"long","doc":"The time in milliseconds since epoch when the event was created in Wolfe."},{"name":"extractionTime","type":"long","doc":"The time in milliseconds since epoch when the event was extracted from Wolfe's staging or -1 if this value has not been set.","default":-1},{"name":"value","type":["null","bytes"],"doc":"The optional payload of the event.","default":null},{"name":"metadata","type":["null","bytes"],"doc":"The optional metadata of the event.","default":null},{"name":"properties","type":[{"type":"map","values":{"type":"string","avro.java.string":"String"},"avro.java.string":"String"},"null"],"doc":"Optional properties of the event.","default":{}},{"name":"operation","type":["null",{"type":"string","avro.java.string":"String"}],"doc":"Type of operation for the event.","default":null}]}, 
fileSystem=DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-908318914_1, ugi=mw010351 (auth:SIMPLE)]], enableCompression=true, dataFileWriter=org.apache.avro.file.DataFileWriter@6a013bf7, writer=org.apache.avro.reflect.ReflectDatumWriter@77080b7f} for hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000001_0/entitytype_partition=%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64/source_partition=source-2/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00/4c0bdb72-f644-46c3-9780-7566f93ff564.avro
      

      The test then fails the pipeline, however, and the following message appears in the logs:

      2015-06-23 11:20:12,112 [Thread-129] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local420942868_0001
      java.lang.Exception: org.kitesdk.data.DatasetIOException: Could not move contents of hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 to hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/job_local420942868_0001/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00
      	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
      	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
      Caused by: org.kitesdk.data.DatasetIOException: Could not move contents of hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 to hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/job_local420942868_0001/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00
      	at org.kitesdk.data.spi.filesystem.FileSystemUtil.stageMove(FileSystemUtil.java:117)
      	at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:400)
      	at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:62)
      	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitTask(DatasetKeyOutputFormat.java:426)
      	at org.apache.hadoop.mapred.Task.commit(Task.java:1163)
      	at org.apache.hadoop.mapred.Task.done(Task.java:1025)
      	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:397)
      	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.FileNotFoundException: File hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00 does not exist.
      	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:658)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:104)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:716)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:712)
      	at org.kitesdk.data.spi.filesystem.FileSystemUtil.stageMove(FileSystemUtil.java:96)
      	... 12 more
      1 job failure(s) occurred:
      

      I haven't tracked down where this happens yet, but looking at the path the FileWriter is pointing at:

      hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000001_0/entitytype_partition=%2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64/source_partition=source-2/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00/4c0bdb72-f644-46c3-9780-7566f93ff564.avro
      

      and then comparing it with the path of the files the commit is trying to move:

      hdfs://localhost:55425/wolfe/default/.temp/job_local420942868_0001/mr/attempt_local420942868_0001_r_000000_0/entitytype_partition=%252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64/source_partition=source-1/extractiontime_year_partition=1970/extractiontime_month_partition=01/extractiontime_day_partition=01/extractiontime_hour_partition=00
      

      You'll notice that
      %2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64 becomes
      %252Fsource%253Astring%252Ftype_0%253Aint64%252Fpayload%253Aint64

      So the already-escaped URI is being percent-encoded a second time: the '%' in each escape sequence is itself re-escaped to %25, turning %2F into %252F.
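
      The double escaping can be reproduced in isolation with plain `java.net.URLEncoder` (a minimal sketch of the encoding behavior only, not Kite's actual code path; the class name and `encode` helper are made up for illustration):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DoubleEncodeDemo {
    // Percent-encode a partition value as a path component.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Raw partition value containing '/' and ':' (as in the log above).
        String raw = "/source:string/type_4:int64/payload:int64";

        // First pass escapes the reserved characters: '/' -> %2F, ':' -> %3A
        String once = encode(raw);
        // Second pass escapes the '%' itself: %2F -> %252F, %3A -> %253A
        String twice = encode(once);

        System.out.println(once);  // %2Fsource%3Astring%2Ftype_4%3Aint64%2Fpayload%3Aint64
        System.out.println(twice); // %252Fsource%253Astring%252Ftype_4%253Aint64%252Fpayload%253Aint64
    }
}
```

      The second output matches the form seen in the failing move, which is consistent with one code path encoding an already-encoded partition path a second time.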

              People

              • Assignee: Ryan Blue (blue)
              • Reporter: Micah Whitacre (mkwhitacre)
              • Votes: 0
              • Watchers: 3
