I have a sequence of strings like this (lines in a file):
[20150101] error a
details 1
details 2
[20150101] error b
details
[20150101] error c
I am trying to map this to a sequence of strings like this (log entries):
[20150101] error a details 1 details 2
[20150101] error b details
[20150101] error c
I can do this in an imperative way (by translating the code I would write in C#) - this works but it reads like pseudo-code because I have omitted the referenced functions:
let getLogEntries logFilePath =
    seq {
        let logEntryLines = new ResizeArray<string>()
        for lineOfText in getLinesOfText logFilePath do
            if isStartOfNewLogEntry lineOfText && logEntryLines.Any() then
                yield joinLines logEntryLines
                logEntryLines.Clear()
            logEntryLines.Add(lineOfText)
        if logEntryLines.Any() then
            yield joinLines logEntryLines
    }
Is there a more functional way of doing this?
I can't use Seq.map since it's not a one-to-one mapping, and Seq.fold doesn't seem right because I suspect it will process the entire input sequence before returning the results (not great if I have very large log files). I assume my code above isn't the ideal way to do this in F# because it's using ResizeArray<string>.
In general, when there is no built-in function that you can use, the functional way to solve things is to use recursion. Here, you can recursively walk over the input, remember the items of the last chunk (since the last [xyz] Info line) and produce new results when you reach a new starting block. In F#, you can write this nicely with sequence expressions:
let rec joinDetails (lines:string list) lastChunk = seq {
    match lines with
    | [] ->
        // We are at the end - if there are any records left, produce a new item!
        if lastChunk <> [] then yield String.concat " " (List.rev lastChunk)
    | line::lines when line.StartsWith("[") ->
        // New block starting. Produce a new item and then start a new chunk
        if lastChunk <> [] then yield String.concat " " (List.rev lastChunk)
        yield! joinDetails lines [line]
    | line::lines ->
        // Ordinary line - just add it to the last chunk that we're collecting
        yield! joinDetails lines (line::lastChunk) }
Here is an example showing the code in action:
let lines =
    [ "[20150101] error a"
      "details 1"
      "details 2"
      "[20150101] error b"
      "details"
      "[20150101] error c" ]
joinDetails lines []
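Since joinDetails takes a string list, the whole file has to be in memory before it starts. If that matters for very large logs, the same recursive idea can probably be applied to an enumerator instead. The following is only a sketch under that assumption: joinDetailsSeq and loop are hypothetical names, not part of the answer above.

```fsharp
open System.Collections.Generic

// Hypothetical streaming variant of joinDetails: it walks an enumerator
// instead of a list, so input lines are pulled only as results are consumed.
let joinDetailsSeq (lines: seq<string>) : seq<string> =
    let rec loop (e: IEnumerator<string>) (lastChunk: string list) = seq {
        if e.MoveNext() then
            let line = e.Current
            if line.StartsWith("[") then
                // New block starting: emit the previous chunk, if any
                if lastChunk <> [] then yield String.concat " " (List.rev lastChunk)
                yield! loop e [line]
            else
                // Ordinary line: add it to the chunk being collected
                yield! loop e (line :: lastChunk)
        else
            // End of input: emit whatever is left
            if lastChunk <> [] then yield String.concat " " (List.rev lastChunk) }
    seq {
        use e = lines.GetEnumerator()
        yield! loop e [] }

// Usage with the sample input from the question
let sampleLines =
    seq { yield "[20150101] error a"
          yield "details 1"
          yield "details 2"
          yield "[20150101] error b"
          yield "details"
          yield "[20150101] error c" }

joinDetailsSeq sampleLines |> Seq.iter (printfn "%s")
```

The chunk is still accumulated in reverse and flipped once with List.rev, exactly as in joinDetails; only the way the input is traversed differs.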
There is not much built into Seq that is going to help you, so you have to roll your own solution. Ultimately, parsing a file like this involves iterating and maintaining state, but what F# does is encapsulate that iteration and state by means of computation expressions (hence your use of the seq computation expression).
What you've done isn't bad, but you could extract your code into a generic function that computes the chunks (i.e. sequences of strings) in an input sequence without knowledge of the format. The rest, i.e. parsing an actual log file, can then be made purely functional.
I have written this function in the past to help with this.
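let chunkBy chunkIdentifier source =
    seq {
        // Items of the chunk currently being collected
        let chunk = ref []
        for sourceItem in source do
            let isNewChunk = chunkIdentifier sourceItem
            if isNewChunk && chunk.Value <> [] then
                yield chunk.Value
                chunk.Value <- [ sourceItem ]
            else
                chunk.Value <- chunk.Value @ [ sourceItem ]
        // Emit the final chunk, unless the source was empty
        if chunk.Value <> [] then yield chunk.Value
    }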
It takes a chunkIdentifier function which returns true if the input is the start of a new chunk.
Parsing a log file is simply a case of extracting the lines, computing the chunks and joining each chunk:
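// Requires: open System (for String.Join)
logEntryLines
|> chunkBy (fun (line: string) -> line.StartsWith("["))  // unlike line.[0], safe for empty lines
|> Seq.map (fun s -> String.Join (" ", s))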
By encapsulating the iteration and mutation as much as possible, while creating a reusable function, it's more in the spirit of functional programming.
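To address the concern about very large files, the pipeline can be fed from System.IO.File.ReadLines, which reads the file lazily. The sketch below is one way this might look, not a definitive implementation: readLogEntries is a hypothetical helper name, and chunkBy is repeated (with the chunk consed in reverse rather than appended with @, an assumption made here to avoid copying the list on every line) so that the block is self-contained.

```fsharp
open System.IO

// Same idea as chunkBy above, repeated so the sketch is self-contained.
// Items are consed onto the front and the chunk is reversed once when emitted.
let chunkBy chunkIdentifier (source: seq<'T>) =
    seq {
        let chunk = ref []
        for sourceItem in source do
            if chunkIdentifier sourceItem && not (List.isEmpty chunk.Value) then
                yield List.rev chunk.Value
                chunk.Value <- [ sourceItem ]
            else
                chunk.Value <- sourceItem :: chunk.Value
        // Emit the final chunk, unless the source was empty
        if not (List.isEmpty chunk.Value) then
            yield List.rev chunk.Value
    }

// Hypothetical helper: streams log entries from a file.
// File.ReadLines enumerates the file lazily instead of loading it all at once.
let readLogEntries (path: string) : seq<string> =
    File.ReadLines path
    |> chunkBy (fun (line: string) -> line.StartsWith("["))
    |> Seq.map (fun (chunkLines: string list) -> String.concat " " chunkLines)
```

Because every stage is a lazy sequence, a consumer such as Seq.take or Seq.iter only pulls as many lines from the file as it needs to produce the entries it asks for.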
Alternatively, another two variants:
let lst = ["[20150101] error a";
           "details 1";
           "details 2";
           "[20150101] error b";
           "details";
           "[20150101] error c"]

// Both variants assume a non-empty list (they use xs.Head)
let fun1 (xs:string list) =
    let sb = new System.Text.StringBuilder(xs.Head)
    xs.Tail
    |> Seq.iter (fun x ->
        // '[' starts a new entry; anything else continues the current one
        let separator = if x.StartsWith("[") then "\n" else " "
        sb.Append(separator).Append(x) |> ignore)
    sb.ToString()

lst |> fun1 |> printfn "%s"
printfn ""

let fun2 (xs:string list) =
    List.fold (fun acc (x:string) ->
        let separator = if x.StartsWith("[") then "\n" else " "
        acc + separator + x) xs.Head xs.Tail

lst |> fun2 |> printfn "%s"
Output:
[20150101] error a details 1 details 2
[20150101] error b details
[20150101] error c
[20150101] error a details 1 details 2
[20150101] error b details
[20150101] error c