Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.2.0
- Fix Version/s: 1.2.0
- Component/s: Data Module
- Labels: None
Description
While testing whether I could remove a workaround now that KITE-976 is fixed, I started getting a similar error. The stack trace, however, shows a different code path on the current master branch than the one previously reported:
2015-11-12 08:32:49,608 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalArgumentException: java.net.UnknownHostException: fakedev
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:664)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:608)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.kitesdk.data.spi.filesystem.FileSystemDataset$Builder.build(FileSystemDataset.java:689)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:199)
    at org.kitesdk.data.Datasets.load(Datasets.java:108)
    at org.kitesdk.data.Datasets.load(Datasets.java:165)
    at org.kitesdk.data.mapreduce.DatasetKeyInputFormat.load(DatasetKeyInputFormat.java:305)
    at org.kitesdk.data.mapreduce.DatasetKeyInputFormat.setConf(DatasetKeyInputFormat.java:241)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.crunch.impl.mr.run.CrunchRecordReader.initNextRecordReader(CrunchRecordReader.java:70)
    at org.apache.crunch.impl.mr.run.CrunchRecordReader.<init>(CrunchRecordReader.java:49)
    at org.apache.crunch.impl.mr.run.CrunchInputFormat.createRecordReader(CrunchInputFormat.java:77)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:515)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:758)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.net.UnknownHostException: fakedev
    ... 31 more
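The key thing to note is that the calls come in through Crunch classes (CrunchInputFormat and CrunchRecordReader) rather than directly through DatasetKeyInputFormat. The dataset is loaded from DatasetKeyInputFormat.setConf(...), which Crunch triggers via ReflectionUtils.newInstance(...), so this path likely never calls the "createRecordReader(...)"[1] method that needs to run for the KITE-976 fix to take effect.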
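The ordering problem suspected above can be sketched without any Hadoop or Kite dependency. Everything in this block is a hypothetical stand-in: `SetConfOrderSketch`, `SketchInputFormat`, `canResolve`, and the static `defaultConf` are invented for illustration, and the assumption that the job configuration is only installed inside `createRecordReader(...)` is my reading of the report, not something confirmed from the Kite source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the suspected ordering issue; not Kite or Crunch code.
final class SetConfOrderSketch {

    // Stand-in for a process-wide default configuration consulted when loading a dataset.
    static Map<String, String> defaultConf = new HashMap<>();

    // True if the default configuration declares the given nameservice.
    static boolean canResolve(String nameservice) {
        String declared = defaultConf.getOrDefault("dfs.nameservices", "");
        return Arrays.asList(declared.split(",")).contains(nameservice);
    }

    static final class SketchInputFormat {
        final List<String> events = new ArrayList<>();

        // Assumption: the dataset is loaded here, before the job conf is installed.
        void setConf(Map<String, String> jobConf) {
            events.add("setConf: fakedev resolvable=" + canResolve("fakedev"));
        }

        // Assumption: only this method installs the job conf before loading.
        void createRecordReader(Map<String, String> jobConf) {
            defaultConf = new HashMap<>(jobConf);
            events.add("createRecordReader: fakedev resolvable=" + canResolve("fakedev"));
        }
    }

    // Mimics a wrapper that instantiates the format and calls setConf first.
    static List<String> simulateWrapper() {
        defaultConf = new HashMap<>();
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("dfs.nameservices", "ingestiondev,fakedev");

        SketchInputFormat format = new SketchInputFormat();
        format.setConf(jobConf);            // reached first in the stack trace
        format.createRecordReader(jobConf); // too late for the load in setConf
        return format.events;
    }

    public static void main(String[] args) {
        simulateWrapper().forEach(System.out::println);
    }
}
```

In this model the lookup inside `setConf` fails even though the job configuration passed in is complete, which matches the shape of the failure in the trace: the load happens under `setConf`, not under `DatasetKeyInputFormat.createRecordReader`.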
I'm supplying a supplementary config file like the following when launching the job. Viewing the running job, I can see that all of these properties are part of the job's configuration.
<configuration>
  <property><name>dfs.nameservices</name><value>ingestiondev,fakedev</value></property>
  <property><name>dfs.client.failover.proxy.provider.fakedev</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.namenode.servicerpc-address.fakedev.namenode831</name><value>host2net:8022</value></property>
  <property><name>dfs.namenode.servicerpc-address.fakedev.namenode864</name><value>host1.net:8022</value></property>
  <property><name>dfs.namenode.rpc-address.fakedev.namenode864</name><value>host1.net:8020</value></property>
  <property><name>dfs.namenode.rpc-address.fakedev.namenode831</name><value>host2.net:8020</value></property>
  <property><name>dfs.ha.namenodes.fakedev</name><value>namenode864,namenode831</value></property>
</configuration>
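To rule out the supplementary file itself as the cause, a small check like the one below can confirm that the HA keys for a nameservice are all present. This is only a sketch: `HaConfigCheck`, `missingKeys`, and `reportedConfig` are hypothetical names, a plain `Map` stands in for Hadoop's `Configuration`, and the check covers only the key families that appear in this report, not everything the HDFS client may read.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or Kite.
final class HaConfigCheck {

    // Splits a comma-separated property value; a missing value yields an empty list.
    static List<String> splitCsv(String value) {
        List<String> parts = new ArrayList<>();
        if (value == null) {
            return parts;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    // Returns the property names that are absent for the given nameservice.
    static List<String> missingKeys(Map<String, String> conf, String nameservice) {
        List<String> missing = new ArrayList<>();
        if (!splitCsv(conf.get("dfs.nameservices")).contains(nameservice)) {
            missing.add("dfs.nameservices");
        }
        String providerKey = "dfs.client.failover.proxy.provider." + nameservice;
        if (!conf.containsKey(providerKey)) {
            missing.add(providerKey);
        }
        String namenodesKey = "dfs.ha.namenodes." + nameservice;
        List<String> ids = splitCsv(conf.get(namenodesKey));
        if (ids.isEmpty()) {
            missing.add(namenodesKey);
        }
        for (String id : ids) {
            String rpcKey = "dfs.namenode.rpc-address." + nameservice + "." + id;
            if (!conf.containsKey(rpcKey)) {
                missing.add(rpcKey);
            }
        }
        return missing;
    }

    // The properties from the supplementary file in this report, values copied as given.
    static Map<String, String> reportedConfig() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("dfs.nameservices", "ingestiondev,fakedev");
        conf.put("dfs.client.failover.proxy.provider.fakedev",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        conf.put("dfs.namenode.servicerpc-address.fakedev.namenode831", "host2net:8022");
        conf.put("dfs.namenode.servicerpc-address.fakedev.namenode864", "host1.net:8022");
        conf.put("dfs.namenode.rpc-address.fakedev.namenode864", "host1.net:8020");
        conf.put("dfs.namenode.rpc-address.fakedev.namenode831", "host2.net:8020");
        conf.put("dfs.ha.namenodes.fakedev", "namenode864,namenode831");
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(missingKeys(reportedConfig(), "fakedev"));
    }
}
```

If such a check reports nothing missing for fakedev while the task still fails with createNonHAProxy in the trace, that would be consistent with the file system being created from a configuration object that never received these properties, rather than with a problem in the file contents.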
Issue Links
- related to
- KITE-976 DatasetKeyInputFormat/DatasetKeyOutputFormat not setting job configuration before loading dataset (Resolved)