Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: v0.9.0, v0.9.1
Component/s: Sinks+Sources
Labels: None
Description
I tried to use output bucketing based on the scribe category by specifying
collectorSink("hdfs://localhost:9000/somepath/%
as the sink.
However, %{scribe.category} does not get replaced and shows up literally in the path name.
After some poking around, it turns out that the regular expression used to match tags is too restrictive:
final public static String TAG_REGEX = "\\%(\\w|\\%)|\\%
";
The \w character class is equivalent to [a-zA-Z0-9_], so it will never match tags including a dot.
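To make the limitation concrete, here is a minimal sketch. The pattern below is a hypothetical stand-in for a brace-tag pattern restricted to \w; it is not the exact TAG_REGEX from the code base, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class WordClassDemo {
    // Hypothetical pattern: a %{...} tag whose name may only contain \w characters.
    // This is an assumption for illustration, not the actual Flume constant.
    static final Pattern WORD_ONLY = Pattern.compile("%\\{(\\w+)\\}");

    // Returns true if the whole string is a tag accepted by WORD_ONLY.
    static boolean matchesTag(String tag) {
        return WORD_ONLY.matcher(tag).matches();
    }

    public static void main(String[] args) {
        // A name made of letters only is accepted.
        System.out.println(matchesTag("%{host}"));            // true
        // The dot is not part of \w, so the match fails.
        System.out.println(matchesTag("%{scribe.category}")); // false
    }
}
```

A tag that fails to match is never handed to the replacement logic, which would explain why it shows up literally in the output path.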
The regex should be expanded to match dots, and possibly other characters such as hyphens (underscores are already covered by \w). Maybe even any character that is not a closing curly bracket:
final public static String TAG_REGEX = "\\%(\\w|\\%)|\\%
";
or
final public static String TAG_REGEX = "\\%(\\w|\\%)|\\%
]+)
}";
It could be even more elaborate (e.g. it could allow single or double quotes so the tags themselves could contain curly brackets), but I guess it's a much better idea to just keep things reasonable.
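The two alternatives suggested above can be compared with a small sketch. Everything here is an assumption for illustration: the patterns DOTTED and ANY_BUT_BRACE, the expand helper, the attribute value "weblogs", and the sample path are hypothetical and are not taken from the Flume source.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExpansionSketch {
    // Hypothetical variant 1: word characters plus dot and hyphen.
    static final Pattern DOTTED = Pattern.compile("%\\{([\\w.-]+)\\}");
    // Hypothetical variant 2: anything except a closing curly bracket.
    static final Pattern ANY_BUT_BRACE = Pattern.compile("%\\{([^}]+)\\}");

    // Replaces every tag matched by p with its value from attrs.
    // Tags without a value are left in place unchanged.
    static String expand(Pattern p, String template, Map<String, String> attrs) {
        Matcher m = p.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = attrs.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("scribe.category", "weblogs"); // illustrative value

        String path = "hdfs://localhost:9000/somepath/%{scribe.category}"; // illustrative path
        System.out.println(expand(DOTTED, path, attrs));
        System.out.println(expand(ANY_BUT_BRACE, path, attrs));
    }
}
```

Both variants expand the dotted tag. The second is more permissive: it would also accept names containing spaces or other punctuation, which may or may not be desirable for attribute keys.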