We've directly copied our third party dependencies into our repository, which could lead us to accidentally forking ourselves from those projects. There are two strategies I can think of:
- We leverage pip to download our dependencies, just like how the rest of the hadoop projects use maven to fetch their dependencies. We can use python wheels if we want to support installing hue without contacting the internet.
- We factor out all our dependencies into git submodules. This would allow us to track the upstream projects at the same time as cleanly separating out our customizations. This would simplify us upgrading our dependencies and make it easier to push our fixes upstream.