I am trying to use StringTemplate to generate Pig/Hadoop code. Since I am a novice, I couldn't figure it out myself. Any help would be appreciated.
I have a List of LocalDate like the one shown below:
List<LocalDate> dates = Arrays.asList("20100101", "20100102").stream().map(d -> LocalDate.parse(d,formatter)).collect(Collectors.toList());
The list can contain one date or many dates.
If the list "dates" contains more than one element then I would like to generate:
SPLIT finalizedEvents INTO splitByDay_20100101 IF dataDate == 20100101,
INTO splitByDay_20100102 IF dataDate == 20100102, ....; // for all date in "dates" list
// similarly for all dates
// formatting substitution variable e.g. 2010/01/01 instead of 20100101 is needed
STORE splitByDay_20100101 INTO '/a/b/2010/01/01' USING AvroStorage();
STORE splitByDay_20100102 INTO '/a/b/2010/01/02' USING AvroStorage();
If the list "dates" contains only one element, then I would like to generate (assume dates = [20100101]):
splitByDay_20100101 = FOREACH finalizedEvents GENERATE $0..;
STORE splitByDay_20100101 INTO '/a/b/2010/01/01' USING AvroStorage();
So far I have done something like the following, but I am not sure how to do the conditionals:
ST e = new ST("SPLIT finalizedEvents INTO <[dates]:{ d | IF split_<d> BY daysSinceEpoch == <d>}; separator=\", \">;");
e.add("dates", dates);
System.out.println(e.render());
Here is what I came up with (detailed explanation below):
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import org.stringtemplate.v4.ST;
import org.stringtemplate.v4.STGroup;
import org.stringtemplate.v4.STGroupFile;
List<LocalDate> dates = new ArrayList<>();
dates.add(LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE));
dates.add(LocalDate.parse("20100201", DateTimeFormatter.BASIC_ISO_DATE));
// One List<Character> per date, built from the ISO form (e.g. 2010-01-01)
List<List<Character>> charListList = new ArrayList<>();
for (LocalDate date : dates) {
    List<Character> charList = new ArrayList<>();
    char[] dateCharArray = date.toString().toCharArray();
    for (char c : dateCharArray) {
        charList.add(c);
    }
    charListList.add(charList);
}
// Load the template group and fill the root template
STGroup dateGroup = new STGroupFile("./src/com/stackoverflow/DateList/dates.stg");
ST dateTemp = dateGroup.getInstanceOf("writeCode");
dateTemp.add("formattedDates", charListList);
dateTemp.add("isSingle", charListList.size() == 1);
System.out.println(dateTemp.render());
writeCode(formattedDates, isSingle) ::= <<
<if(isSingle)><writeSingleStuff(formattedDates)>
<else><writeMultipleStuff(formattedDates)>
<endif>
<writeStoreList(formattedDates)>
>>
writeSingleStuff(date)::= "<date:{d|splitByDay_<wordReplaceWEmpty(d)> = FOREACH finalizedEvents GENERATE $0..;}>"
writeMultipleStuff(rawDates)::= "SPLIT finalizedEvents <rawDates:{d|INTO splitByDay_<wordReplaceWEmpty(d)> IF dataDate == <wordReplaceWEmpty(d)>}; separator=\", \">;"
writeStoreList(formattedDates)::= "<formattedDates:{d|STORE splitByDay_<wordReplaceWEmpty(d)> INTO '/a/b/<wordReplaceWSlash(d)>' USING AvroStorage();<\n>}>"
wordReplaceWSlash(word) ::= "<word:{char|<charReplaceWSlash(char)>}>"
charReplaceWSlash(theChar) ::= <%<charReplaceWSlashMap.(theChar)>%>
charReplaceWSlashMap ::= [
"-":"/",
default:{<theChar>}
]
wordReplaceWEmpty(word) ::= "<word:{char|<charReplaceWEmpty(char)>}>"
charReplaceWEmpty(theChar) ::= <%<charReplaceWEmptyMap.(theChar)>%>
charReplaceWEmptyMap ::= [
"-":"",
default:{<theChar>}
]
List<LocalDate> dates = new ArrayList<>();
dates.add(LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE));
dates.add(LocalDate.parse("20100201", DateTimeFormatter.BASIC_ISO_DATE));
This is the list of LocalDates that we want to use as input for the templates. I parsed them with DateTimeFormatter.BASIC_ISO_DATE; calling toString() on a LocalDate then gives the ISO form, so that e.g. 20100101 becomes 2010-01-01. We need this later because we will tell StringTemplate to replace - with either / or an empty string to get the two date formats that we want (and I didn't find a formatter that gets 2010/01/01 in the first place).
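For comparison, both formats can also be produced on the Java side. The following is only a minimal sketch, not part of the templates above: DateView is a hypothetical helper class, and DateTimeFormatter.ofPattern("yyyy/MM/dd") is the standard JDK way to get 2010/01/01. StringTemplate reads bean-style getters as properties, so under that assumption a template could use <d.compact> and <d.slashed> instead of replacing characters (in a real project the class would probably be public).

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical view class (not in the original answer): exposes both
// date formats as bean properties.
final class DateView {
    private static final DateTimeFormatter SLASHED =
            DateTimeFormatter.ofPattern("yyyy/MM/dd");

    private final LocalDate date;

    DateView(LocalDate date) {
        this.date = date;
    }

    // Compact form such as 20100101, for relation names and the dataDate test.
    public String getCompact() {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Slashed form such as 2010/01/01, for the output path.
    public String getSlashed() {
        return date.format(SLASHED);
    }

    // Wraps every date in a DateView, keeping the input order.
    static List<DateView> of(List<LocalDate> dates) {
        List<DateView> views = new ArrayList<>();
        for (LocalDate d : dates) {
            views.add(new DateView(d));
        }
        return views;
    }

    public static void main(String[] args) {
        DateView v = new DateView(LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE));
        System.out.println(v.getCompact());
        System.out.println(v.getSlashed());
    }
}
```

With such a class, the list passed to the template would be DateView.of(dates) rather than the nested character lists.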
Two different approaches would be:
1. Replace - with / in Java code, not in StringTemplate. Then we would only have to replace / with an empty string when we need that date format.
2. [...] / when we need to.
List<List<Character>> charListList = new ArrayList<>();
for (LocalDate date : dates) {
List<Character> charList = new ArrayList<>();
char[] dateCharArray = date.toString().toCharArray();
for (char c : dateCharArray) {
charList.add(c);
}
charListList.add(charList);
}
It's not very pretty that we have to do this, but we need a list (and not an array) of the chars of each date so that we can "iterate" over it in StringTemplate, and there is no direct way to convert a char[] to a List<Character>. In the end, we put all these lists together in one list (charListList) so that we can generate code for every date we have.
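The same nested list can be built a little more compactly with streams. This is only a sketch of an alternative to the loops above; the class name CharLists and its method names are made up for illustration, while String.chars() and mapToObj are standard JDK calls.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not in the original answer).
final class CharLists {
    // Turns a string into a List<Character>, one entry per char.
    static List<Character> toCharList(String s) {
        return s.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.toList());
    }

    // One character list per date, based on LocalDate.toString() (e.g. 2010-01-01).
    static List<List<Character>> toCharLists(List<LocalDate> dates) {
        return dates.stream()
                .map(d -> toCharList(d.toString()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<LocalDate> dates = new ArrayList<>();
        dates.add(LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE));
        dates.add(LocalDate.parse("20100201", DateTimeFormatter.BASIC_ISO_DATE));
        System.out.println(toCharLists(dates));
    }
}
```

The result has the same shape as charListList, so it could be passed to the template as formattedDates without changing the .stg file.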
STGroup dateGroup = new STGroupFile("./src/com/stackoverflow/DateList/dates.stg");
ST dateTemp = dateGroup.getInstanceOf("writeCode");
dateTemp.add("formattedDates", charListList);
dateTemp.add("isSingle", charListList.size() == 1);
System.out.println(dateTemp.render());
Here we fill the template with values. We have to tell StringTemplate whether or not we have exactly one date in our charListList, because StringTemplate (by design) cannot compare values such as the size of a list inside a template.
writeCode(formattedDates, isSingle) ::= <<
<if(isSingle)><writeSingleStuff(formattedDates)>
<else><writeMultipleStuff(formattedDates)>
<endif>
<writeStoreList(formattedDates)>
>>
This is the "root" template that basically just delegates the work to other templates. It handles the case distinction between one date and many dates.
writeSingleStuff(date)::= "<date:{d|splitByDay_<wordReplaceWEmpty(d)> = FOREACH finalizedEvents GENERATE $0..;}>"
writeMultipleStuff(rawDates)::= "SPLIT finalizedEvents <rawDates:{d|INTO splitByDay_<wordReplaceWEmpty(d)> IF dataDate == <wordReplaceWEmpty(d)>}; separator=\", \">;"
writeStoreList(dates)::= "<dates:{d|STORE splitByDay_<wordReplaceWEmpty(d)> INTO '/a/b/<wordReplaceWSlash(d)>' USING AvroStorage();<\n>}>"
With these three lines, we write the code that is specific to a single item, the code that is specific to multiple items, and the code that both cases have in common.
Although we know that date has only one item when we enter writeSingleStuff, it is still a list, so we have to iterate through it.
wordReplaceWSlash(word) ::= "<word:{char|<charReplaceWSlash(char)>}>"
charReplaceWSlash(theChar) ::= <%<charReplaceWSlashMap.(theChar)>%>
charReplaceWSlashMap ::= [
"-":"/",
default:{<theChar>}
]
wordReplaceWEmpty(word) ::= "<word:{char|<charReplaceWEmpty(char)>}>"
charReplaceWEmpty(theChar) ::= <%<charReplaceWEmptyMap.(theChar)>%>
charReplaceWEmptyMap ::= [
"-":"",
default:{<theChar>}
]
These are two groups of templates that do almost the same thing: replace every - char in every "word" with either / or an empty string. For that we use a little dictionary that maps - to the replacement and every other char to itself.
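To check what the templates render, it can help to have the expected script from the question as a plain string. The sketch below is not StringTemplate and not part of the solution above; it is only a plain-Java cross-check, the class name ExpectedPigScript is made up, and it assumes the list is non-empty. Whitespace and line breaks in the template output may differ from it.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical reference generator (not in the original answer).
final class ExpectedPigScript {
    private static final DateTimeFormatter PATH = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    static String render(List<LocalDate> dates) {
        if (dates.isEmpty()) {
            throw new IllegalArgumentException("dates must not be empty");
        }
        StringBuilder sb = new StringBuilder();
        if (dates.size() == 1) {
            // Single date: plain FOREACH instead of SPLIT.
            sb.append("splitByDay_").append(id(dates.get(0)))
              .append(" = FOREACH finalizedEvents GENERATE $0..;\n");
        } else {
            // Several dates: one INTO ... IF ... clause per date, comma separated,
            // in the shape the question asks for.
            sb.append("SPLIT finalizedEvents ")
              .append(dates.stream()
                      .map(d -> "INTO splitByDay_" + id(d) + " IF dataDate == " + id(d))
                      .collect(Collectors.joining(", ")))
              .append(";\n");
        }
        // Common part: one STORE statement per date.
        for (LocalDate d : dates) {
            sb.append("STORE splitByDay_").append(id(d))
              .append(" INTO '/a/b/").append(d.format(PATH))
              .append("' USING AvroStorage();\n");
        }
        return sb.toString();
    }

    // Compact form such as 20100101.
    private static String id(LocalDate d) {
        return d.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    public static void main(String[] args) {
        System.out.print(render(java.util.Arrays.asList(
                LocalDate.of(2010, 1, 1), LocalDate.of(2010, 2, 1))));
    }
}
```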