Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 1.2.0
- Component/s: None
- Labels: None
- Environment: Fedora 13 64 bit
Description
Fetching blob data from a BlobRef using getData() returns an array that contains the correct data but is larger than the blob and zero-padded at the end.
Replication instructions:
1. Import a blob column into HDFS as a SequenceFile.
2. Read the blob back from HDFS with code similar to the following:
Configuration config = new Configuration();
config.set("fs.default.name", "hdfs://localhost:9000");
FileSystem hdfs = FileSystem.get(config);
Path path = new Path(TABLE_NAME);
FileStatus[] statuses = hdfs.listStatus(path);
for (FileStatus status : statuses) {
    if (status.getPath().getName().startsWith("part-m-")) {
        SequenceFile.Reader reader = new SequenceFile.Reader(hdfs, status.getPath(), config);
        LongWritable key = new LongWritable();
        @SuppressWarnings("unchecked")
        SqoopRecord value = ((Class<SqoopRecord>) reader.getValueClass()).getConstructor().newInstance();
        while (reader.next(key, value)) {
            Map<String, Object> fields = value.getFieldMap();
            BlobRef blob = (BlobRef) fields.get(BLOB_COLUMN);
            byte[] data = blob.getData();
        }
    }
}
The resulting data array contains the blob bytes but is longer than the blob and zero-padded.
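This kind of padding is typical when a Writable-style buffer's backing array grows beyond the valid data length and the raw array is returned without trimming; whether that is the exact cause inside BlobRef is an assumption here. The sketch below is a self-contained illustration of the symptom plus a caller-side workaround (copying only the valid prefix with Arrays.copyOf); rawBuffer is a hypothetical stand-in for the padded buffer, not Sqoop API.

```java
import java.util.Arrays;

public class BlobPadding {
    // Hypothetical stand-in for a Writable-style backing buffer: the
    // array is allocated larger than the valid data, leaving trailing zeros.
    static byte[] rawBuffer(byte[] data) {
        byte[] buf = new byte[data.length * 3 / 2 + 1]; // over-allocated capacity
        System.arraycopy(data, 0, buf, 0, data.length);
        return buf; // returned as-is, so callers see the zero padding
    }

    // Caller-side workaround: copy only the valid prefix when the
    // true blob length is known from another source.
    static byte[] trim(byte[] buf, int validLength) {
        return Arrays.copyOf(buf, validLength);
    }

    public static void main(String[] args) {
        byte[] blob = {1, 2, 3, 4};
        byte[] padded = rawBuffer(blob);
        System.out.println(padded.length > blob.length);          // true: padded
        System.out.println(Arrays.equals(trim(padded, 4), blob)); // true: trimmed
    }
}
```

If the true length is not available to the caller, the only robust fix is in the library itself (returning a copy sized to the valid data), which is presumably what the 1.2.0 fix does.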