CDH (READ-ONLY) / DISTRO-813

Hive uber tasks fail: native snappy library not available

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Hive
    • Labels:
      None
    • Environment:
      CentOS 6

      Description

      Even completely trivial operations fail in Hive in uber mode on CDH 5.7.1 (they used to work fine in CDH 5.4.4).

      I'm attaching a minimal example Parquet data file. You can create a table on top of it like so:

      CREATE TABLE `testuber`(
        `date` string, 
        `domain` string, 
        `countrycode` string, 
        `platform` string, 
        `day_of_week` tinyint, 
        `pageviews` bigint, 
        `unique_visitors` bigint, 
        `day` string, 
        `year` string, 
        `month` string)
      ROW FORMAT SERDE 
        'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
      STORED AS INPUTFORMAT 
        'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
      OUTPUTFORMAT 
        'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
      LOCATION
        'hdfs://nameservice1/tmp/hivebug';
      

      Then trigger the crash from Hive's beeline like so:

      set mapreduce.job.ubertask.enable=true;
      select count(*) from testuber;
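
      For comparison, the same query can be run with uber mode switched off in the same beeline session. This is only a sketch of an isolation step and is not part of the original report; it assumes the testuber table above already exists, and whether the query succeeds in this configuration on CDH 5.7.1 has not been verified here.

      ```sql
      -- Assumption: same beeline session and table as in the reproduction above.
      -- Disable uber mode so the job's tasks run in regular task containers
      -- instead of inside the MapReduce ApplicationMaster.
      set mapreduce.job.ubertask.enable=false;
      select count(*) from testuber;
      ```

      If the count succeeds here but fails with the setting enabled, that would point at the uber-mode code path rather than at the table or the data file.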
      

            People

            • Assignee:
              Unassigned
            • Reporter:
              David Watzke (dwatzke)