  1. Hue (READ ONLY)
  2. HUE-2861

[core] Not timing out getting thrift connections from the pool

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.9.0
    • Fix Version/s: 3.9.0
    • Component/s: core.api
    • Labels: None

      Description

      Hue does not appear to set a timeout when it takes a connection out of the Thrift connection pool, so a request thread can block indefinitely when no connection is free. This shows up as stack traces like the following:

      Thread CP WSGIServer Thread-4 140060258342656 (most recent call last):
        File "/usr/lib64/python2.6/threading.py", line 504, in __bootstrap
          self.__bootstrap_inner()
        File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
          self.run()
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/desktop/core/src/desktop/lib/wsgiserver.py", line 1294, in run
          conn.communicate()
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/desktop/core/src/desktop/lib/wsgiserver.py", line 1196, in communicate
          req.respond()
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/desktop/core/src/desktop/lib/wsgiserver.py", line 568, in respond
          self._respond()
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/desktop/core/src/desktop/lib/wsgiserver.py", line 580, in _respond
          response = self.wsgi_app(self.environ, self.start_response)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/wsgi.py", line 206, in __call__
          response = self.get_response(request)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/base.py", line 112, in get_response
          response = wrapped_callback(request, *callback_args, **callback_kwargs)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/db/transaction.py", line 371, in inner
          return func(*args, **kwargs)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/views.py", line 592, in install_examples
          beeswax.management.commands.beeswax_install_examples.Command().handle(app_name=app_name, user=request.user)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/management/commands/beeswax_install_examples.py", line 68, in handle
          self._install_tables(user, app_name, tables)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/management/commands/beeswax_install_examples.py", line 96, in _install_tables
          table.install(django_user)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/management/commands/beeswax_install_examples.py", line 135, in install
          if self.create(django_user):
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/management/commands/beeswax_install_examples.py", line 156, in create
          results = db.execute_and_wait(query)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/server/dbms.py", line 479, in execute_and_wait
          handle = self.client.query(query)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 863, in query
          return self._client.execute_async_query(query, statement)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 650, in execute_async_query
          return self.execute_async_statement(statement=query_statement, confOverlay=configuration)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 668, in execute_async_statement
          res = self.call(self._client.ExecuteStatement, req)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/desktop/core/src/desktop/lib/thrift_util.py", line 320, in __getattr__
          superclient = _connection_pool.get_client(self.conf)
        File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.969/lib/hue/desktop/core/src/desktop/lib/thrift_util.py", line 208, in get_client
          block=True, timeout=this_round_timeout)
        File "/usr/lib64/python2.6/Queue.py", line 168, in get
          self.not_empty.wait()
        File "/usr/lib64/python2.6/threading.py", line 239, in wait
          waiter.acquire()
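
The last frames of the trace sit in Queue.get waiting on not_empty with no deadline, which is consistent with the timeout argument being None. As a minimal sketch of the difference a timeout makes (written against Python 3's `queue` module rather than the Python 2.6 `Queue` module in the trace; `idle_connections` and `checkout` are hypothetical names, not Hue code):

```python
import queue
import time

# Hypothetical stand-in for the pool's internal queue of idle connections.
idle_connections = queue.Queue()


def checkout(timeout_seconds):
    """Take a connection from the queue, waiting at most timeout_seconds.

    With timeout_seconds=None, get() would block until an item arrives,
    however long that takes. With a number, it raises queue.Empty once
    the deadline passes.
    """
    try:
        return idle_connections.get(block=True, timeout=timeout_seconds)
    except queue.Empty:
        return None


# The queue is empty, so a bounded wait gives up instead of hanging.
start = time.monotonic()
result = checkout(timeout_seconds=0.2)
elapsed = time.monotonic() - start

# Once a connection is available, the same call returns it immediately.
idle_connections.put("connection-1")
available = checkout(timeout_seconds=0.2)
```

Calling `checkout(None)` on the empty queue would never return, which is why the sketch only exercises the bounded case.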
      

      After some debugging, we found that although thrift_util.get_client supports a timeout when fetching a connection from the connection pool, the caller in this stack trace does not pass one. It appears the argument was accidentally removed back in 2010 by this patch:

      -    superclient = _connection_pool.get_client(self.klass, self.host, self.port,
      -                                              kerberos_principal=self.kerberos_principal,
      -                                              get_client_timeout=self.timeout_seconds,
      -                                              service_name=self.service_name)
      +    superclient = _connection_pool.get_client(self.conf)
      

      I propose we add back the get_client_timeout argument. This may introduce some risk, since the code has run in production without it for 5 years, but the default timeout_seconds is 2 minutes, which seems like a reasonable time to wait for a connection.
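
To illustrate the shape of the proposed change, here is a rough sketch of a pool whose get_client accepts a get_client_timeout and a caller that forwards its configured timeout_seconds. This is not the thrift_util implementation: `ConnectionPool`, `PoolTimeout`, `Conf`, and `return_client` are hypothetical names for illustration, and only `get_client`, `get_client_timeout`, and `timeout_seconds` come from the report.

```python
import queue
from collections import namedtuple

# Hypothetical configuration object; only timeout_seconds is named in the report.
Conf = namedtuple("Conf", ["service_name", "timeout_seconds"])


class PoolTimeout(Exception):
    """Raised when no connection becomes free in time (hypothetical name)."""


class ConnectionPool:
    """Toy pool backed by a queue of idle connections."""

    def __init__(self, connections):
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)

    def get_client(self, conf, get_client_timeout=None):
        # get_client_timeout=None waits forever; a number bounds the wait.
        try:
            return self._idle.get(block=True, timeout=get_client_timeout)
        except queue.Empty:
            raise PoolTimeout(
                "no connection for %s within %s seconds"
                % (conf.service_name, get_client_timeout)
            )

    def return_client(self, conn):
        self._idle.put(conn)


def checkout_with_timeout(pool, conf):
    """Forward the configured timeout instead of omitting it."""
    return pool.get_client(conf, get_client_timeout=conf.timeout_seconds)


pool = ConnectionPool(["connection-1"])
conf = Conf(service_name="example-service", timeout_seconds=0.1)

first = checkout_with_timeout(pool, conf)      # takes the only connection
try:
    checkout_with_timeout(pool, conf)          # pool is now exhausted
    timed_out = False
except PoolTimeout:
    timed_out = True

pool.return_client(first)
second = checkout_with_timeout(pool, conf)     # succeeds again after return
```

The short 0.1 second timeout is only there to keep the sketch fast; the report mentions a default of 2 minutes.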


              People

              • Assignee: Erick Tryzelaar (erickt)
              • Reporter: Erick Tryzelaar (erickt)