Hue should expose the following performance metrics:
- Page return time. We can capture that per-URL (without GET params) and aggregate up to per-app.
- RPC call time. We make various RPC calls to other systems (WebHDFS, HS2, Impala, Oozie, Sentry, SAML, etc.). We should capture timing stats for every call type and also report an aggregate per external system.
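A minimal sketch of how the two kinds of timings above could be keyed and recorded. Everything here is an assumption for illustration, not existing Hue code: the `samples` store, the `page_keys`, `rpc_keys` and `timed` helpers, the key naming scheme, and the guess that the first path segment identifies the app.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from urllib.parse import urlsplit

# Raw duration samples in seconds, keyed by metric name (hypothetical layout).
samples = defaultdict(list)


def page_keys(url):
    """Return the per-URL and per-app metric keys for a request URL."""
    path = urlsplit(url).path  # drops the query string (GET params)
    # Assumption: the first path segment names the app.
    app = path.strip("/").split("/")[0] or "root"
    return ["page.url." + path, "page.app." + app]


def rpc_keys(system, call):
    """Return the per-call-type and per-external-system metric keys."""
    return ["rpc.%s.%s" % (system, call), "rpc.%s" % system]


@contextmanager
def timed(*keys):
    """Record the wall-clock duration of the enclosed block under each key."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        for key in keys:
            samples[key].append(elapsed)


# Usage: one measurement feeds both the fine-grained and the aggregate key.
with timed(*rpc_keys("webhdfs", "liststatus")):
    pass  # the real RPC call would go here
```

Recording each measurement under both keys at capture time keeps the aggregation trivial, at the cost of storing each sample twice.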
For every metric, we should report min, max, mean, stddev, and last, with exponential decay so that recent samples carry more weight. RPC timings often depend on the amount of work done, e.g. retrieving 1000 things vs. retrieving 1 thing, so we could also report a normalized value.
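One possible reading of "exponential decay" is an exponentially weighted mean and variance, updated incrementally per sample. The sketch below takes that reading as an assumption; the `DecayedStats` class, the `alpha` smoothing factor, and the `normalized` helper (time divided by item count) are hypothetical names. Min and max are kept as plain running extremes here, since the request does not say how decay should apply to them.

```python
import math


class DecayedStats:
    """Running min/max/last plus an exponentially weighted mean and stddev."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # weight of the newest sample, 0 < alpha <= 1
        self.count = 0
        self.min = self.max = self.last = None
        self.mean = 0.0
        self.var = 0.0

    def update(self, value):
        self.count += 1
        self.last = value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)
        if self.count == 1:
            self.mean = value  # first sample seeds the mean; variance stays 0
            return
        # Incremental exponentially weighted mean and variance.
        diff = value - self.mean
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1 - self.alpha) * (self.var + diff * incr)

    def snapshot(self):
        return {
            "min": self.min,
            "max": self.max,
            "mean": self.mean,
            "stddev": math.sqrt(self.var),
            "last": self.last,
        }


def normalized(elapsed, items):
    """Time per unit of work; guards against a zero item count."""
    return elapsed / max(items, 1)


# Usage: track the raw and the normalized timing side by side.
raw, per_item = DecayedStats(), DecayedStats()
raw.update(2.0)
per_item.update(normalized(2.0, 1000))
```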
We can expose these in JSON format at a /metrics endpoint.
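As a rough illustration of what the /metrics response might contain, the sketch below serializes a registry of per-metric stats to JSON. The `render_metrics` function, the top-level `"metrics"` key, the metric names, and all numbers are placeholders chosen for the example, not a proposed schema or measured data.

```python
import json


def render_metrics(registry):
    """Serialize {metric name: stats dict} to a JSON string with stable key order."""
    return json.dumps({"metrics": registry}, sort_keys=True, indent=2)


# Placeholder values only, to show the shape of the document.
example = {
    "rpc.webhdfs": {"min": 0.01, "max": 0.5, "mean": 0.05, "stddev": 0.02, "last": 0.04},
    "page.app.beeswax": {"min": 0.1, "max": 2.0, "mean": 0.3, "stddev": 0.2, "last": 0.25},
}

body = render_metrics(example)
```

A view bound to /metrics would return `body` with a JSON content type.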