  1. CDH (READ-ONLY)
  2. DISTRO-533

Solr server watchdog makes heartbeat request using server short hostname, not FQDN

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: CDH4.4.0
    • Fix Version/s: CDH5.0.0, search-1.1.0
    • Component/s: Search
    • Labels:
      None

      Description

      Hello,
      Using CDH 4.4 and CM 4.7, I installed Solr.

      30 seconds after starting the Solr services, watchdog killed the service as the heartbeats were failing.

      I believe the heartbeats were failing because of our environment and configuration. We are running in a corporate network, with an HTTP proxy that forces DNS lookups onto the corporate name server and overrides any local DNS configuration (to protect against malware/DNS forgeries on the client; this point is key).

      In our environment, our Hadoop nodes have the following environmental configuration:
      1) A local DNS server for .hadoop.mydomain.com that is only reachable on the subnet where Hadoop is running. That is, this Hadoop subdomain is not in the corporate/global DNS.
      2) An http_proxy environment variable set to the corporate HTTP proxy, so these nodes can access the internet.
      3) A no_proxy environment variable that includes .hadoop.mydomain.com, so that HTTP requests to the (local) Hadoop subdomain are not routed through the corporate proxy.

      Our nodes have FQDNs that look like myserver.hadoop.mydomain.com. Again, these name records exist only in the local DNS server, not in the corporate name server.

      The watchdog issues its HTTP request to the node's short hostname, not its fully qualified hostname. Specifically, it makes a request to http://mysolr1:8939/solr.
      Because the short name is NOT covered by the no_proxy environment variable (nor can it be in a maintainable way at scale), this request is routed through the corporate proxy, which cannot resolve mysolr1, so the request fails.

      If, on the other hand, the watchdog made its request to the FQDN of the Solr server, i.e. http://mysolr1.hadoop.mydomain.com:8939/solr, the no_proxy rule would match, the request would not be routed through the proxy, and the call would succeed.
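
      The difference between the two requests can be sketched with a simplified no_proxy matcher. This is only an illustration of the common domain-suffix matching convention, not the watchdog's or any particular HTTP client's actual logic; the function name bypasses_proxy is hypothetical, and the hostnames are the examples from this report.

```python
def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Return True if `host` matches an entry in a comma-separated no_proxy list.

    Simplified assumption: an entry matches the host itself or any host
    that ends with "." + entry (leading dots on entries are ignored).
    """
    host = host.lower()
    for entry in no_proxy.split(","):
        # ".hadoop.mydomain.com" and "hadoop.mydomain.com" are treated alike.
        domain = entry.strip().lower().lstrip(".")
        if not domain:
            continue
        if host == domain or host.endswith("." + domain):
            return True
    return False


NO_PROXY = ".hadoop.mydomain.com"

# Short hostname: no suffix match, so the request would go to the proxy.
short_name_bypasses = bypasses_proxy("mysolr1", NO_PROXY)

# FQDN: matches the no_proxy suffix, so the request would go direct.
fqdn_bypasses = bypasses_proxy("mysolr1.hadoop.mydomain.com", NO_PROXY)

print(short_name_bypasses, fqdn_bypasses)
```

      Under this convention the short name can never match a domain-suffix entry, which is why each short hostname would have to be listed individually.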

      My workaround for this was to use the environment variable safety valve in CM for Solr, to explicitly unset http_proxy for the Solr service.
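
      The effect of that workaround can be sketched as follows. How CM applies the safety valve is not shown in this report, so this only illustrates the outcome: the service's environment no longer carries http_proxy, while other variables are left alone. The helper name without_http_proxy and the sample proxy URL are hypothetical, and also dropping the upper-case spelling HTTP_PROXY is an assumption.

```python
def without_http_proxy(env: dict) -> dict:
    """Return a copy of `env` with http_proxy removed.

    Assumption: the upper-case spelling HTTP_PROXY is dropped as well,
    since some clients read either form.
    """
    return {k: v for k, v in env.items() if k.lower() != "http_proxy"}


# Hypothetical environment resembling the one described in this report.
node_env = {
    "http_proxy": "http://proxy.example.invalid:8080",
    "no_proxy": ".hadoop.mydomain.com",
    "PATH": "/usr/bin:/bin",
}

solr_env = without_http_proxy(node_env)
print(sorted(solr_env))
```

      With http_proxy gone, a request to the short hostname is sent directly and resolved by the local DNS server instead of the corporate proxy.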

            People

            • Assignee: Roman V Shaposhnik (rvs)
            • Reporter: Joe Travaglini (jtravaglini)
