Details
Description
This bug occurs on CDH version 0.20.2+737.
Steps to reproduce:
1. Create an S3 bucket and folder.
2. Upload one file to the folder.
3. Run an identity MapReduce job on that folder.
4. The job incorrectly reports that the number of inputs to process is 2, erroneously counting the folder itself as an input.
5. Partway through the MapReduce job, it fails with a NullPointerException and the job won't complete.
The same MapReduce job works fine on Apache Hadoop 0.21 (the latest Apache release).
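The double-count in step 4 is consistent with how S3 represents folders: a "folder" is just a zero-byte key ending in "/", so a naive listing under the prefix returns the folder marker alongside the real file. A minimal plain-Java simulation of that listing (the key names are hypothetical, not taken from the actual job):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class S3FolderListing {
    // Simulated raw keys returned by an S3 LIST under the prefix "folder/":
    // the zero-byte "folder/" marker object plus the one uploaded file.
    static final List<String> RAW_KEYS = Arrays.asList("folder/", "folder/file.txt");

    // A naive input enumeration counts every key, folder marker included,
    // which matches the "2 inputs to process" symptom.
    static int naiveInputCount(List<String> keys) {
        return keys.size();
    }

    // Filtering out keys that end in "/" (directory markers) leaves only
    // the genuine file inputs.
    static List<String> realInputs(List<String> keys) {
        return keys.stream()
                   .filter(k -> !k.endsWith("/"))
                   .collect(Collectors.toList());
    }
}
```

With the layout above, `naiveInputCount` returns 2 while `realInputs` returns only `folder/file.txt`.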
java.lang.NullPointerException
at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:129)
at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
at java.io.FilterInputStream.close(FilterInputStream.java:155)
at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:171)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:208)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:387)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:317)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
at org.apache.hadoop.mapred.Child.main(Child.java:211)
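The trace points at NativeS3FsInputStream.close(). A plausible reading (an assumption, not confirmed against the CDH source) is that close() dereferences an inner stream field that can be null when the split corresponds to the empty folder key. The pattern, and a defensive null-check, can be sketched with plain java.io types (WrappedS3Stream is a hypothetical simplification, not the real class):

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for NativeS3FsInputStream: wraps an inner stream
// that may be null (e.g. when the S3 "folder" key yields no data).
class WrappedS3Stream extends InputStream {
    private InputStream in;  // may be null for a zero-byte folder key

    WrappedS3Stream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        // Treat a missing inner stream as end-of-stream.
        return (in == null) ? -1 : in.read();
    }

    @Override
    public void close() throws IOException {
        // An unguarded close() reproduces the reported failure:
        //     in.close();  // NullPointerException when in == null
        // The defensive variant null-checks before closing:
        if (in != null) {
            in.close();
            in = null;
        }
    }
}
```

With the null check in place, closing a stream that was opened over the folder key is a no-op instead of an NPE, so LineRecordReader.close() can complete normally.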