  2. DISTRO-507

Pig temp files not deleted when run in Web App


    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: CDH4.3.0
    • Fix Version/s: None
    • Component/s: Pig
    • Labels:
    • Environment:
      Linux CentOS 6.4, CDH4.3, Jetty 6


      We are executing Pig, via PigRunner, in a web app, but the temporary files created by Pig are not deleted until the web app is shut down. This clutters the /tmp directory very quickly and risks filling it up over time.

      The work started in FileLocalizer with deleteTempFiles() is an excellent start, but it does not go far enough. If it were fully implemented, we could use that method to delete the temp files after each Pig job has completed.

      Temp files need to always be created through FileLocalizer.getTemporaryPath() rather than through File.createTempFile(), as they are now.

      The places this needs to be implemented:

      1) FileLocalizer creation of localTempDir

      2) JobControlCompiler.getJob() creation of job jar

      3) DefaultAbstractBag creation of pigbag for spills

      If these temp files were then stored in the already existing ThreadLocal<Deque<ElementDescriptor>>(), we could call FileLocalizer.deleteTempFiles() to clean up after each Pig job instead of having to restart the web app.
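
To illustrate the pattern being requested, here is a minimal, self-contained sketch. It does not use Pig's classes: TempFileRegistry and its two methods are hypothetical names invented for this example, and the registry tracks plain java.io.File objects rather than Pig's ElementDescriptor. The idea is only that every temp file is created through one entry point, which records it in a per-thread deque, so that a single call can delete everything once a job finishes.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper (NOT part of Pig): records every temp file created on
 * the current thread so that all of them can be deleted after a job completes.
 */
final class TempFileRegistry {

    // One deque per thread, mirroring the per-thread tracking described in the report.
    private static final ThreadLocal<Deque<File>> TRACKED =
            ThreadLocal.withInitial(ArrayDeque::new);

    private TempFileRegistry() {
    }

    /** Creates a temp file in the default temp directory and records it for later cleanup. */
    static File createTempFile(String prefix, String suffix) {
        try {
            File file = File.createTempFile(prefix, suffix);
            TRACKED.get().push(file);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes every file recorded on the current thread; returns how many were removed. */
    static int deleteTempFiles() {
        Deque<File> files = TRACKED.get();
        int deleted = 0;
        while (!files.isEmpty()) {
            // File.delete() returns false if the file is already gone or cannot be removed.
            if (files.pop().delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

In a web app, the call to deleteTempFiles() would typically sit in a finally block around each job run. Because the registry is thread-local, cleanup only sees files created on the thread that calls it, so files created on other threads would need to be registered some other way. Temp directories, such as a local temp dir, would also need a recursive delete, which this sketch omits.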




            • Assignee:
              bfreitas Bob Freitas
            • Votes:
              0
            • Watchers:
              0


              • Created: