  1. CDH (READ-ONLY)
  2. DISTRO-507

Pig temp files not deleted when run in Web App

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: CDH4.3.0
    • Fix Version/s: None
    • Component/s: Pig
    • Labels:
    • Environment: Linux CentOS 6.4, CDH4.3, Jetty 6

      Description

      We are executing Pig, via PigRunner, in a web app, but the temporary files created by Pig are not deleted until the web app is shut down. The /tmp directory becomes cluttered very quickly and runs the risk of filling up over time.

      The work started in FileLocalizer with deleteTempFiles() is an excellent start, but it does not go far enough. If it were fully implemented, we could use that method to delete the temp files after each Pig job has completed.

      Temp files need to always be created through FileLocalizer.getTemporaryPath() instead of through File.createTempFile(), as they are now.

      The places this needs to be implemented:

      1) FileLocalizer creation of localTempDir

      2) JobControlCompiler.getJob() creation of job jar

      3) DefaultAbstractBag creation of pigbag for spills

      If these temp files were then stored in the already existing ThreadLocal<Deque<ElementDescriptor>>, we could call FileLocalizer.deleteTempFiles() to clean up after each Pig job without needing to restart the web app.
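
The tracking idea proposed above can be sketched in isolation. The following is a minimal, standalone illustration and not Pig's actual code: the class TempFileRegistry and its methods createTracked() and deleteTracked() are hypothetical names, and the sketch tracks java.nio.file.Path values rather than Pig's ElementDescriptor. It only shows the pattern of recording every temp file in a per-thread deque so that one call can remove them when a job finishes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the proposed tracking; this is NOT Pig's FileLocalizer.
final class TempFileRegistry {

    // One deque per thread, so each request thread only cleans up
    // the files that it created itself.
    private static final ThreadLocal<Deque<Path>> TRACKED =
            ThreadLocal.withInitial(ArrayDeque::new);

    private TempFileRegistry() {}

    // Used in place of a bare File.createTempFile(): creates the file and records it.
    static Path createTracked(String prefix, String suffix) {
        try {
            Path p = Files.createTempFile(prefix, suffix);
            TRACKED.get().push(p);
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes every file recorded by the current thread and
    // returns how many were actually removed from disk.
    static int deleteTracked() {
        Deque<Path> files = TRACKED.get();
        int deleted = 0;
        while (!files.isEmpty()) {
            Path p = files.pop();
            try {
                if (Files.deleteIfExists(p)) {
                    deleted++;
                }
            } catch (IOException e) {
                // Best effort: skip this file and continue with the rest.
            }
        }
        TRACKED.remove(); // avoid leaking the deque on pooled threads
        return deleted;
    }

    public static void main(String[] args) {
        // Stand-ins for the job jar and a bag spill file named in the list above.
        createTracked("Job", ".jar");
        createTracked("pigbag", ".tmp");
        try {
            // The Pig job would run here.
        } finally {
            System.out.println("deleted " + deleteTracked() + " temp files");
        }
    }
}
```

Running the cleanup in a finally block means it happens even when the job fails. One limitation of any ThreadLocal-based registry is that a file created on a different thread from the one that later calls the cleanup would not be found by it, so a real fix would need to check which thread creates each kind of temp file.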

            People

            • Assignee: Unassigned
            • Reporter: Bob Freitas (bfreitas)