Hue (READ ONLY) / HUE-3035

[metastore] "Data Browser" and "Sample" on partitioned tables can put significant stress on HMS

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.9.0
    • Fix Version/s: 3.10.0
    • Component/s: app.catalog
    • Labels:
      None

      Description

      The Hue Metastore browser allows users to browse data and view sample data from tables. For heavily partitioned tables, this can put significant stress on the Hive Metastore (HMS). This is because Hue essentially runs a full table scan with a limit, "SELECT * FROM <db>.<tbl> LIMIT X", on the target table, which results in all partitions in the table being fetched. This can take an inordinate amount of time and stress the backend database.

      Would it make sense for Hue to first load the table metadata to find out the partition keys, then either:
      1) Harder - Loop over each partition key and issue a SELECT query with a filter on the key, until the limit is reached
      2) Easier - Add a warning or block this action for users if the # of partitions is > N (say 100).
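
      The two options above could be sketched roughly as follows. This is only an illustration of the idea, not how Hue implements the fix: `sample_rows`, `too_many_partitions`, `partition_filter`, and the `run_query` callable are hypothetical names, the partition list is assumed to have been fetched from the table metadata already, and all partition values are assumed to be strings.

```python
from typing import Callable, Dict, List, Sequence

# The "N" from option 2; 100 is only the example value suggested above.
MAX_PARTITIONS = 100


def partition_filter(spec: Dict[str, str]) -> str:
    """Build a WHERE clause body such as `dt` = '2016-01-01' AND `country` = 'us'.

    Assumes string-typed partition values; single quotes are backslash-escaped.
    """
    return " AND ".join(
        "`%s` = '%s'" % (key, value.replace("'", "\\'"))
        for key, value in spec.items()
    )


def sample_rows(
    database: str,
    table: str,
    partitions: Sequence[Dict[str, str]],
    run_query: Callable[[str], List[tuple]],
    limit: int = 100,
) -> List[tuple]:
    """Option 1: query one partition at a time until `limit` rows are collected.

    `run_query` is a hypothetical callable that executes a query string and
    returns the resulting rows as a list.
    """
    rows: List[tuple] = []
    for spec in partitions:
        remaining = limit - len(rows)
        if remaining <= 0:
            break  # enough rows; later partitions are never touched
        query = "SELECT * FROM `%s`.`%s` WHERE %s LIMIT %d" % (
            database, table, partition_filter(spec), remaining)
        rows.extend(run_query(query)[:remaining])
    return rows


def too_many_partitions(
    partitions: Sequence[Dict[str, str]],
    max_partitions: int = MAX_PARTITIONS,
) -> bool:
    """Option 2: report whether the caller should warn or block the sample."""
    return len(partitions) > max_partitions
```

      With this shape, a caller could check `too_many_partitions(...)` first and fall back to the per-partition loop (or a warning) only for heavily partitioned tables, so each query names a single partition instead of the whole table.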


              People

              • Assignee:
                jennykim Jenny Kim
                Reporter:
                lskuff Lenni Kuff
              • Votes:
                0
                Watchers:
                4

                Dates

                • Created:
                  Updated:
                  Resolved: