Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: v0.9.0, v0.9.1
Fix Version/s: v0.9.2
Component/s: Sinks+Sources
Labels: None
Description
DirWatcher is used by TailDirSource to keep track of which files are added to and removed from a directory.
If you supply a regex to TailDirSource then it also supplies the regex to DirWatcher, so that DirWatcher only informs TailDirSource about the files that match the regex.
For new files that are added to the directory, DirWatcher checks the name of the file against the regex and, if it matches, tells TailDirSource about the new file.
But for files that are deleted from the directory, DirWatcher does not match the file's name against the regex and therefore tells TailDirSource about every file that is deleted, regardless of whether it would have matched the regex or not.
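The asymmetry described above can be sketched in a few lines. This is a simplified, hypothetical stand-in and not the Flume source: the class `FilteredWatcherSketch`, its `Handler` interface, and the use of `Pattern.matcher(...).matches()` on the bare file name are all assumptions made for illustration. It shows the intended behaviour, with the same regex test applied on both the created and the deleted path.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical, simplified stand-in for DirWatcher; not the Flume source.
class FilteredWatcherSketch {
    // Hypothetical callback interface, loosely modelled on the events in the report.
    interface Handler {
        void fileCreated(String name);
        void fileDeleted(String name);
    }

    private final Pattern filter;
    private final Handler handler;
    private Set<String> previous = new HashSet<>();

    FilteredWatcherSketch(String regex, Handler handler) {
        this.filter = Pattern.compile(regex);
        this.handler = handler;
    }

    // Compare the current listing with the previous one and fire events
    // only for names that match the regex.
    void check(Set<String> current) {
        for (String name : current) {
            if (!previous.contains(name) && filter.matcher(name).matches()) {
                handler.fileCreated(name);
            }
        }
        for (String name : previous) {
            // The reported bug corresponds to leaving out the regex test here,
            // so that every deleted file is reported to the handler.
            if (!current.contains(name) && filter.matcher(name).matches()) {
                handler.fileDeleted(name);
            }
        }
        previous = new HashSet<>(current);
    }
}
```

With a filter such as `.*\.log`, a directory that gains and then loses `a.log` and `b.tmp` would produce events for `a.log` only.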
When DirWatcher fires a 'File Deleted' event to TailDirSource for a file that does not match the regex, the following exception is thrown and the Flume node stops streaming data:
Exception in thread "DirWatcher" java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:75)
    at com.cloudera.flume.handlers.text.TailSource.removeCursor(TailSource.java:403)
    at com.cloudera.flume.handlers.text.TailDirSource$1.fileDeleted(TailDirSource.java:102)
    at com.cloudera.util.dirwatcher.DirWatcher.fireDeletedFile(DirWatcher.java:181)
    at com.cloudera.util.dirwatcher.DirWatcher.check(DirWatcher.java:148)
    at com.cloudera.util.dirwatcher.DirWatcher$Periodic.run(DirWatcher.java:111)
This happens because TailDirSource was never told to add the file and therefore never created a cursor for it. When it handles the 'File Deleted' event, it cannot find a cursor for the file, so a null object is passed to the TailSource.removeCursor() method, whose argument check fails.
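The failure mode can be reproduced in isolation with a small sketch. Everything here is hypothetical: `CursorRegistrySketch`, its map of cursors, and both handler methods are illustrative stand-ins, and the plain `IllegalArgumentException` check only imitates the Guava `Preconditions.checkArgument` call seen in the stack trace. The guarded variant is one possible defensive option; the report does not say how the fix in v0.9.2 was actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the cursor bookkeeping; not the Flume source.
class CursorRegistrySketch {
    private final Map<String, Object> cursors = new HashMap<>();

    // Called only for files the watcher reported as created.
    void fileCreated(String name) {
        cursors.put(name, new Object());
    }

    // Imitates the argument check seen in the stack trace: null is rejected.
    static void removeCursor(Object cursor) {
        if (cursor == null) {
            throw new IllegalArgumentException("cursor must not be null");
        }
    }

    // Unguarded handler: a delete event for a file that was never added
    // looks up no cursor and passes null to removeCursor, which throws.
    void fileDeletedUnguarded(String name) {
        removeCursor(cursors.remove(name));
    }

    // Guarded handler (assumption, not the confirmed fix): ignore delete
    // events for files without a cursor and report whether one was removed.
    boolean fileDeletedGuarded(String name) {
        Object cursor = cursors.remove(name);
        if (cursor == null) {
            return false;
        }
        removeCursor(cursor);
        return true;
    }
}
```

Filtering delete events in the watcher and guarding the lookup in the handler are complementary; either one alone would avoid the exception in this sketch.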