Sqoop (READ-ONLY) / SQOOP-177

The number of executed map tasks is not equal to the number specified by "-m"

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: import
    • Labels:
    • Environment:
      CentOS 5.4, CDH3B4

      Description

      The number of map tasks that actually run does not match the number given with the "-m" argument.

      For example, specify the number of map tasks as 2 using the "-m" argument as follows:

      $ sqoop import --connect jdbc:postgresql://192.168.30.30:5432/sqoop_db --table table1 -m 2

      The job log shows:

      11/03/09 19:11:57 INFO mapred.JobClient: Running job: job_201103091445_0012
      11/03/09 19:11:58 INFO mapred.JobClient: map 0% reduce 0%
      11/03/09 19:12:03 INFO mapred.JobClient: map 66% reduce 0%
      11/03/09 19:12:04 INFO mapred.JobClient: map 100% reduce 0%
      11/03/09 19:12:04 INFO mapred.JobClient: Job complete: job_201103091445_0012
      11/03/09 19:12:04 INFO mapred.JobClient: Counters: 12
      11/03/09 19:12:04 INFO mapred.JobClient: Job Counters
      11/03/09 19:12:04 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=9909
      11/03/09 19:12:04 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
      11/03/09 19:12:04 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
      11/03/09 19:12:04 INFO mapred.JobClient: Launched map tasks=3
      11/03/09 19:12:04 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
      11/03/09 19:12:04 INFO mapred.JobClient: FileSystemCounters
      11/03/09 19:12:04 INFO mapred.JobClient: HDFS_BYTES_READ=324
      11/03/09 19:12:04 INFO mapred.JobClient: FILE_BYTES_WRITTEN=159311
      11/03/09 19:12:04 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=18786
      11/03/09 19:12:04 INFO mapred.JobClient: Map-Reduce Framework
      11/03/09 19:12:04 INFO mapred.JobClient: Map input records=1000
      11/03/09 19:12:04 INFO mapred.JobClient: Spilled Records=0
      11/03/09 19:12:04 INFO mapred.JobClient: Map output records=1000
      11/03/09 19:12:04 INFO mapred.JobClient: SPLIT_RAW_BYTES=324

      According to the log, 3 map tasks were launched ("Launched map tasks=3"), although only 2 were requested.
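      The comparison above can be automated. The following is a minimal sketch, assuming the JobClient output has been captured as text; the helper name `launched_map_tasks` is hypothetical and not part of Sqoop or Hadoop.

```python
import re


def launched_map_tasks(log_text):
    """Return the 'Launched map tasks' counter from JobClient output, or None if absent."""
    match = re.search(r"Launched map tasks=(\d+)", log_text)
    return int(match.group(1)) if match else None


# One line copied from the log in this report.
log = "11/03/09 19:12:04 INFO mapred.JobClient: Launched map tasks=3"
requested = 2  # value passed with -m

launched = launched_map_tasks(log)
print(f"requested={requested} launched={launched} mismatch={launched != requested}")
# prints: requested=2 launched=3 mismatch=True
```

      Note that this counter counts task attempts, so retries or speculative execution can also raise it; the output file count is a useful second check.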

      The HDFS output directory shows:

      $ hadoop fs -ls /user/sqoop/table1
      Found 4 items
      -rw-r--r-- 1 sqoop supergroup 0 2011-03-09 19:12 /user/sqoop/table1/_SUCCESS
      -rw-r--r-- 1 sqoop supergroup 9265 2011-03-09 19:12 /user/sqoop/table1/part-m-00000
      -rw-r--r-- 1 sqoop supergroup 9481 2011-03-09 19:12 /user/sqoop/table1/part-m-00001
      -rw-r--r-- 1 sqoop supergroup 40 2011-03-09 19:12 /user/sqoop/table1/part-m-00002

      The three part-m-* files show that 3 map tasks produced output.
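      Counting the map output files can be scripted as well. This is a small sketch, assuming the output of "hadoop fs -ls" is available as text; `count_part_files` is a hypothetical helper name.

```python
import re


def count_part_files(listing_text):
    """Count map output files (part-m-NNNNN) in 'hadoop fs -ls' output."""
    return len(re.findall(r"/part-m-\d{5}\s*$", listing_text, flags=re.MULTILINE))


# Paths taken from the directory listing in this report.
listing = """Found 4 items
-rw-r--r-- 1 sqoop supergroup 0 2011-03-09 19:12 /user/sqoop/table1/_SUCCESS
-rw-r--r-- 1 sqoop supergroup 9265 2011-03-09 19:12 /user/sqoop/table1/part-m-00000
-rw-r--r-- 1 sqoop supergroup 9481 2011-03-09 19:12 /user/sqoop/table1/part-m-00001
-rw-r--r-- 1 sqoop supergroup 40 2011-03-09 19:12 /user/sqoop/table1/part-m-00002
"""

print(count_part_files(listing))  # prints: 3
```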

      These results show that the number of executed map tasks does not equal the number specified with "-m".
      A patch for this issue has been submitted.
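      The report does not state the cause. One possible explanation, offered only as a hypothesis, is truncating integer division when split boundaries are computed over the split column. The sketch below is not Sqoop's code: `split_points` and `to_ranges` are hypothetical names, and the key range 1..1000 is an assumption (the report gives only the record count of 1000).

```python
def split_points(min_val, max_val, num_splits):
    """Step from min_val using a truncated step size, then append max_val if it was missed."""
    step = max((max_val - min_val) // num_splits, 1)
    points = list(range(min_val, max_val + 1, step))
    if points[-1] != max_val or len(points) == 1:
        points.append(max_val)
    return points


def to_ranges(points):
    """Pair consecutive boundaries into (low, high) ranges, one per map task."""
    return list(zip(points[:-1], points[1:]))


# Assumed key range 1..1000 with 2 requested splits: (1000 - 1) // 2 = 499.
ranges = to_ranges(split_points(1, 1000, 2))
print(ranges)       # prints: [(1, 500), (500, 999), (999, 1000)]
print(len(ranges))  # prints: 3
```

      Under these assumptions the leftover third range is very small, which would be consistent with the 40-byte part-m-00002 file above. When the range divides evenly, for example 0..1000, the same sketch yields exactly 2 ranges.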

            People

            • Assignee:
              Unassigned
              Reporter:
              yamashitamr MARIKO YAMASHITA