Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Won't Fix
- Affects Version/s: CDH3u3
- Component/s: Hadoop Common
- Labels: None
- Environment: Debian Wheezy 64-bit
uname -a = "Linux desktop 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 GNU/Linux"
cat /etc/issue = "Debian GNU/Linux wheezy/sid \n \l"
/etc/apt/sources.list = "
deb http://ftp.us.debian.org/debian/ wheezy main contrib non-free
deb-src http://ftp.us.debian.org/debian/ wheezy main contrib non-free
deb http://security.debian.org/ wheezy/updates main contrib non-free
deb-src http://security.debian.org/ wheezy/updates main contrib non-free
deb http://archive.cloudera.com/debian squeeze-cdh3 contrib
deb-src http://archive.cloudera.com/debian squeeze-cdh3 contrib"
Hadoop specific configuration (disabled permissions, pseudo-distributed mode, replication set to 1, from my own blog post here: http://j.mp/tsVBR4)
Description
When running MR jobs on Hadoop after installing CDH3u3, all map tasks fail with an error message like the following:
[exec] 12/02/03 09:50:58 INFO mapred.JobClient: Task Id : attempt_201202030949_0001_m_000000_0, Status : FAILED
[exec] Map output lost, rescheduling: getMapOutput(attempt_201202030949_0001_m_000000_0,0) failed :
[exec] EINVAL: Invalid argument
[exec] at org.apache.hadoop.io.nativeio.NativeIO.posix_fadvise(Native Method)
[exec] at org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible(NativeIO.java:177)
[exec] at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4026)
[exec] at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
[exec] at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
[exec] at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
[exec] at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
[exec] at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:829)
[exec] at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
[exec] at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
[exec] at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
This is after installing Hadoop with the following apt-get command:
apt-get install hadoop-0.20 hadoop-0.20-namenode hadoop-0.20-datanode hadoop-0.20-jobtracker hadoop-0.20-tasktracker
This command pulls in hadoop-0.20-native automatically, so the problem will potentially affect many users running Wheezy. While Wheezy is unsupported at the moment, it probably makes sense to investigate, since other distros could be affected.
harshj/QwertyM on IRC mentioned that this package could be the problem and suggested a workaround.
Workarounds include:
- Removing the native package and restarting Hadoop ("apt-get remove hadoop-0.20-native" / "stop-all.sh ; start-all.sh")
- Setting mapred.tasktracker.shuffle.fadvise to false in mapred-site.xml (suggested by harshj)
Both workarounds work on my installation.
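For the second workaround, the entry in mapred-site.xml would look roughly like the fragment below. This is a minimal sketch using the standard Hadoop property layout; it assumes the property goes inside the file's existing configuration element and that the TaskTracker is restarted afterwards so the new value is read.

```xml
<!-- Sketch: disable the posix_fadvise call made while serving map output -->
<property>
  <name>mapred.tasktracker.shuffle.fadvise</name>
  <value>false</value>
</property>
```

Unlike removing hadoop-0.20-native, this leaves the native package installed and only turns off the fadvise call that appears in the stack trace above.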
This does not affect the functionality of HDFS. "hadoop fs" commands like put, ls, cat, rmr all work properly. Only MR is affected.