
Code generation using StringTemplate

I am trying to use StringTemplate to generate Pig/Hadoop code. I am new to it and could not figure this out myself, so any help is appreciated.

I have a List of LocalDate like the one shown below:

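DateTimeFormatter formatter = DateTimeFormatter.BASIC_ISO_DATE; // assumed; the original snippet does not show how formatter is defined
List<LocalDate> dates = Arrays.asList("20100101", "20100102").stream().map(d -> LocalDate.parse(d, formatter)).collect(Collectors.toList());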

The list can contain one date or many.

If the list "dates" contains more than one element then I would like to generate:

SPLIT finalizedEvents INTO splitByDay_20100101 IF dataDate == 20100101,
                  INTO splitByDay_20100102 IF dataDate == 20100102, ....; // one branch per date in the "dates" list
// one STORE per date; the path needs the date reformatted, e.g. 2010/01/01 instead of 20100101
STORE splitByDay_20100101 INTO '/a/b/2010/01/01' USING AvroStorage();
STORE splitByDay_20100102 INTO '/a/b/2010/01/02' USING AvroStorage();

If the list "dates" contains only one element, I would like to generate (assume dates = [20100101]):

splitByDay_20100101 = FOREACH finalizedEvents GENERATE $0..;
STORE splitByDay_20100101 INTO '/a/b/2010/01/01' USING AvroStorage();

So far I have something like the following, but I am not sure how to do the conditionals:

ST e = new ST("SPLIT finalizedEvents INTO <[dates]:{ d | IF split_<d> BY daysSinceEpoch == <d>}; separator=\", \">;");
e.add("dates", dates);
System.out.println(e.render());

Here is what I came up with (detailed explanation below):

Java Code:

List<LocalDate> dates = new ArrayList<>();
dates.add(LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE));
dates.add(LocalDate.parse("20100201", DateTimeFormatter.BASIC_ISO_DATE));

List<List<Character>> charListList = new ArrayList<>();
for (LocalDate date : dates) {
    List<Character> charList = new ArrayList<>();
    char[] dateCharArray = date.toString().toCharArray();
    for (char c : dateCharArray) {
        charList.add(c);
    }
    charListList.add(charList);
}

STGroup dateGroup = new STGroupFile("./src/com/stackoverflow/DateList/dates.stg");
ST dateTemp = dateGroup.getInstanceOf("writeCode");
dateTemp.add("formattedDates", charListList);
dateTemp.add("isSingle", charListList.size() == 1);

System.out.println(dateTemp.render());


StringTemplate Code (dates.stg):

writeCode(formattedDates, isSingle) ::= <<
<if(isSingle)><writeSingleStuff(formattedDates)>
<else><writeMultipleStuff(formattedDates)>
<endif>
<writeStoreList(formattedDates)>
>>


writeSingleStuff(date)::= "<date:{d|splitByDay_<wordReplaceWEmpty(d)> = FOREACH finalizedEvents GENERATE $0..;}>"

writeMultipleStuff(rawDates)::= "SPLIT finalizedEvents <rawDates:{d|INTO splitByDay_<wordReplaceWEmpty(d)> IF dataDate == <wordReplaceWEmpty(d)>}; separator=\", \">;"

writeStoreList(formattedDates)::= "<formattedDates:{d|STORE splitByDay_<wordReplaceWEmpty(d)> INTO '/a/b/<wordReplaceWSlash(d)>' USING AvroStorage();<\n>}>"


wordReplaceWSlash(word) ::= "<word:{char|<charReplaceWSlash(char)>}>"

charReplaceWSlash(theChar) ::= <%<charReplaceWSlashMap.(theChar)>%>

charReplaceWSlashMap ::= [
    "-":"/",
    default:{<theChar>}
]


wordReplaceWEmpty(word) ::= "<word:{char|<charReplaceWEmpty(char)>}>"

charReplaceWEmpty(theChar) ::= <%<charReplaceWEmptyMap.(theChar)>%>

charReplaceWEmptyMap ::= [
    "-":"",
    default:{<theChar>}
]


What the code does:

Java Code:

List<LocalDate> dates = new ArrayList<>();
dates.add(LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE));
dates.add(LocalDate.parse("20100201", DateTimeFormatter.BASIC_ISO_DATE));

This is the list of LocalDates that we want to use as input for the templates. I parsed them with DateTimeFormatter.BASIC_ISO_DATE; LocalDate.toString() then yields the ISO form, so e.g. 20100101 becomes 2010-01-01. We need this later because we will tell StringTemplate to replace - with either / or an empty string to get the two date formats that we want (and I didn't find a predefined formatter that produces 2010/01/01 in the first place).
Two alternative approaches would be:

  • Replace - with / in Java code, not in StringTemplate. Then we would only have to replace / with an empty string where we need the compact date format.
  • Add year, month and day as three separate variables to the template. Then we could concatenate the strings and add / where we need it.
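
The first alternative could be taken one step further by producing both date strings in Java, so that the template needs no character replacement at all. The following is only a sketch under that assumption: DatePreformat, compact and slashed are hypothetical names, and the pattern "yyyy/MM/dd" is a custom pattern passed to DateTimeFormatter.ofPattern, not a predefined constant.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DatePreformat {
    // Custom pattern for the path form, e.g. 2010/01/01 (assumed to be the desired layout).
    static final DateTimeFormatter SLASHED = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    // Compact form used in the alias names, e.g. 20100101.
    static String compact(LocalDate date) {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Slash-separated form used in the output path.
    static String slashed(LocalDate date) {
        return date.format(SLASHED);
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.parse("20100101", DateTimeFormatter.BASIC_ISO_DATE);
        System.out.println(compact(d)); // 20100101
        System.out.println(slashed(d)); // 2010/01/01
    }
}
```

With both strings available, each date could be passed to the template as a small object or map holding the two forms, instead of a list of characters.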


List<List<Character>> charListList = new ArrayList<>();
for (LocalDate date : dates) {
    List<Character> charList = new ArrayList<>();
    char[] dateCharArray = date.toString().toCharArray();
    for (char c : dateCharArray) {
        charList.add(c);
    }
    charListList.add(charList);
}

It's not very pretty that we have to do this, but:
We need a list (and not an array) of the chars of each date so that we can "iterate" over it in StringTemplate, and there is no direct way to convert a char[] to a List<Character>. In the end, we put all these lists together into one list (charListList) so that we can generate code for every date we have.
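
If the nested loop feels too verbose, the same conversion can be written with streams. This is a sketch of an equivalent, not the code from the answer; DateChars, toCharList and toCharLists are hypothetical names.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;

class DateChars {
    // Turns e.g. 2010-01-01 into the list ['2','0','1','0','-','0','1','-','0','1'].
    static List<Character> toCharList(LocalDate date) {
        return date.toString().chars()      // IntStream of char codes
                .mapToObj(c -> (char) c)    // boxed to Character
                .collect(Collectors.toList());
    }

    // One character list per date, in the same order as the input.
    static List<List<Character>> toCharLists(List<LocalDate> dates) {
        return dates.stream()
                .map(DateChars::toCharList)
                .collect(Collectors.toList());
    }
}
```

The result of toCharLists(dates) can be passed to the template in place of charListList.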


STGroup dateGroup = new STGroupFile("./src/com/stackoverflow/DateList/dates.stg");
ST dateTemp = dateGroup.getInstanceOf("writeCode");
dateTemp.add("formattedDates", charListList);
dateTemp.add("isSingle", charListList.size() == 1);

System.out.println(dateTemp.render());

Here we fill the template with values. We have to tell StringTemplate whether we have exactly one date in charListList, because StringTemplate is (by design) not capable of evaluating a comparison like size() == 1 itself.


StringTemplate Code:

writeCode(formattedDates, isSingle) ::= <<
<if(isSingle)><writeSingleStuff(formattedDates)>
<else><writeMultipleStuff(formattedDates)>
<endif>
<writeStoreList(formattedDates)>
>>

This is the "root" template that basically just delegates the work to other templates. It handles the case distinction between one date and many dates.
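
To check the rendered template against the text asked for in the question, a plain-Java reference without StringTemplate can be handy. The sketch below mirrors the same case distinction; PigReference and render are hypothetical names, the statement layout (including the repeated INTO) is copied from the question, and rejecting an empty list is an assumption of this sketch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

class PigReference {
    private static final DateTimeFormatter COMPACT = DateTimeFormatter.BASIC_ISO_DATE;
    private static final DateTimeFormatter SLASHED = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    static String render(List<LocalDate> dates) {
        if (dates.isEmpty()) {
            // Assumption: the question only covers lists with one or more dates.
            throw new IllegalArgumentException("dates must not be empty");
        }
        StringBuilder out = new StringBuilder();
        if (dates.size() == 1) {
            // Single date: plain FOREACH instead of SPLIT.
            out.append("splitByDay_").append(dates.get(0).format(COMPACT))
               .append(" = FOREACH finalizedEvents GENERATE $0..;\n");
        } else {
            // Several dates: one comma-separated branch per date.
            String branches = dates.stream()
                    .map(d -> "INTO splitByDay_" + d.format(COMPACT)
                            + " IF dataDate == " + d.format(COMPACT))
                    .collect(Collectors.joining(", "));
            out.append("SPLIT finalizedEvents ").append(branches).append(";\n");
        }
        // Common part: one STORE statement per date.
        for (LocalDate d : dates) {
            out.append("STORE splitByDay_").append(d.format(COMPACT))
               .append(" INTO '/a/b/").append(d.format(SLASHED))
               .append("' USING AvroStorage();\n");
        }
        return out.toString();
    }
}
```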


writeSingleStuff(date)::= "<date:{d|splitByDay_<wordReplaceWEmpty(d)> = FOREACH finalizedEvents GENERATE $0..;}>"

writeMultipleStuff(rawDates)::= "SPLIT finalizedEvents <rawDates:{d|INTO splitByDay_<wordReplaceWEmpty(d)> IF dataDate == <wordReplaceWEmpty(d)>}; separator=\", \">;"

writeStoreList(formattedDates)::= "<formattedDates:{d|STORE splitByDay_<wordReplaceWEmpty(d)> INTO '/a/b/<wordReplaceWSlash(d)>' USING AvroStorage();<\n>}>"

With these three templates, we write the code that is specific to a single item, the code for multiple items, and the code that both cases have in common.
Although we know that date has only one item when we enter writeSingleStuff, we still have to iterate over it, because the argument is a list containing that one date.


wordReplaceWSlash(word) ::= "<word:{char|<charReplaceWSlash(char)>}>"

charReplaceWSlash(theChar) ::= <%<charReplaceWSlashMap.(theChar)>%>

charReplaceWSlashMap ::= [
    "-":"/",
    default:{<theChar>}
]


wordReplaceWEmpty(word) ::= "<word:{char|<charReplaceWEmpty(char)>}>"

charReplaceWEmpty(theChar) ::= <%<charReplaceWEmptyMap.(theChar)>%>

charReplaceWEmptyMap ::= [
    "-":"",
    default:{<theChar>}
]

These are two groups of templates that do almost the same thing: replace every - char in every "word" with either / or an empty string. We use a small dictionary for that, which maps - to the replacement and every other char to itself.
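
For comparison, the per-character dictionary lookup has the same effect as a plain string replacement on the ISO date string in Java. A minimal sketch; DashReplace, withSlash and withEmpty are hypothetical names that only mirror the two template groups.

```java
class DashReplace {
    // Same effect as wordReplaceWSlash: 2010-01-01 -> 2010/01/01
    static String withSlash(String isoDate) {
        return isoDate.replace("-", "/");
    }

    // Same effect as wordReplaceWEmpty: 2010-01-01 -> 20100101
    static String withEmpty(String isoDate) {
        return isoDate.replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(withSlash("2010-01-01")); // 2010/01/01
        System.out.println(withEmpty("2010-01-01")); // 20100101
    }
}
```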
