  1. Hue (READ ONLY)
  2. HUE-2720

[oozie] Intermittent 500s when trying to view oozie workflow history v1

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.7.0
    • Fix Version/s: 3.9.0
    • Component/s: con.oozie
    • Labels: None
    • Environment:

      CDH5.4.0, CM5.4
      EC2, master nodes: 3x m3.xlarge, compute nodes: 32x d2.xlarge
      ubuntu 12
      Chrome 42/Firefox 32 Mac OS 10.9

      Description

      I'm getting intermittent but repeatable 500s when trying to view the status of some Oozie workflows in Hue (the /oozie/list_oozie_workflow endpoint). I turned on debug and have attached the HTML of the 500 page.

      This started happening after we upgraded to CDH 5.4.0, but it doesn't happen on all jobs. I can't find a pattern in which jobs it fails on and which it doesn't. At first I thought it had something to do with the problems we had with the V2 editor (since disabled): there appeared to be multiple workflows for the same user with the same name. However, I'm still getting the errors even after creating a fresh workflow (XML attached). I'm also getting this error when browsing the status of workflow jobs from a few long-running coordinators (but not all such long-running jobs for this user).

      I have tried resynchronizing the DB, but that had no effect.

      The status in the Oozie console looks normal (attached), as it does in the YARN RM, so I think it's just a Hue problem.

      Sorry for dropping so many bugs on you guys today, but CDH 5.4.0 has been a bit of a minefield for us.

            People

            • Assignee: Romain Rigaux (romain)
            • Reporter: Jonathan Traupman (jtraupman)
            • Votes: 0
            • Watchers: 2
