CDH (READ-ONLY) / DISTRO-749

Hive: select count(*) from <table> always shows only one record in the table

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: CDH 5.4.0
    • Fix Version/s: None
    • Component/s: Hive
    • Labels: None

      Description

      Problem

      I have a table in Hive (secured cluster) with multiple records in it. When I execute the following SQL in the Beeline console:

      0: jdbc:hive2://host> select count(*) smallairport;
      INFO  : Number of reduce tasks determined at compile time: 1
      INFO  : In order to change the average load for a reducer (in bytes):
      INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
      INFO  : In order to limit the maximum number of reducers:
      INFO  :   set hive.exec.reducers.max=<number>
      INFO  : In order to set a constant number of reducers:
      INFO  :   set mapreduce.job.reduces=<number>
      DEBUG : Configuring job job_1440059170439_0031 with /user/hive/.staging/job_1440059170439_0031 as the submit dir
      DEBUG : adding the following namenodes' delegation tokens:[hdfs://host:8020]
      WARN  : Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
      DEBUG : default FileSystem: hdfs://host:8020
      DEBUG : Creating splits at hdfs://host:8020/user/hive/.staging/job_1440059170439_0031
      INFO  : number of splits:1
      INFO  : Submitting tokens for job: job_1440059170439_0031
      INFO  : Kind: HDFS_DELEGATION_TOKEN, Service: host:8020, Ident: (HDFS_DELEGATION_TOKEN token 49 for hive)
      INFO  : The url to track the job: http://host:8088/proxy/application_1440059170439_0031/
      INFO  : Starting Job = job_1440059170439_0031, Tracking URL = http://host:8088/proxy/application_1440059170439_0031/
      INFO  : Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1440059170439_0031
      INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
      INFO  : 2015-08-24 06:50:27,696 Stage-1 map = 0%,  reduce = 0%
      INFO  : 2015-08-24 06:50:33,901 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.12 sec
      INFO  : 2015-08-24 06:50:42,158 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.7 sec
      INFO  : MapReduce Total cumulative CPU time: 2 seconds 700 msec
      INFO  : Ended Job = job_1440059170439_0031
      Getting log thread is interrupted, since query is done!
      +---------------+--+
      | smallairport  |
      +---------------+--+
      | 1             |
      +---------------+--+
      1 row selected (33.023 seconds)

      But when I execute the command:

      select * from smallairport
      ...
      (the complete listing of the table)
      ...
      59,906 rows selected (17.096 seconds)
      

      The count of 1 does not match the 59,906 rows returned by select *, so this looks like a problem with Hive.
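
      One thing worth checking before treating this as a Hive defect: the statement in the log above is typed as select count(*) smallairport, with no FROM keyword, and the result column is headed smallairport. In standard SQL that form makes smallairport a column alias, and count(*) with no FROM counts a single implicit row. The sketch below reproduces the effect using SQLite from Python as a stand-in. It does not run against Hive; the sample rows and the code column are hypothetical, and it is only an assumption that Hive on CDH 5.4.0 parses the statement the same way.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (assumption:
# Hive treats a SELECT without FROM the same way).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows; only the table name comes from the report.
conn.execute("CREATE TABLE smallairport (code TEXT)")
conn.executemany(
    "INSERT INTO smallairport VALUES (?)",
    [("AAA",), ("BBB",), ("CCC",)],
)

# As typed in the report: no FROM, so "smallairport" is only a column alias
# and the aggregate runs over a single implicit row.
cur = conn.execute("SELECT count(*) smallairport")
alias_header = cur.description[0][0]
alias_count = cur.fetchone()[0]

# With FROM: the table itself is counted.
table_count = conn.execute("SELECT count(*) FROM smallairport").fetchone()[0]

print(alias_header, alias_count, table_count)
```

      If the same thing is happening here, rerunning the query as select count(*) from smallairport; in Beeline should return the full row count.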

            People

            • Assignee: Unassigned
            • Reporter: Subroto Sanyal (sanyalsubroto)