
Refactoring small F# function

I've written the following F# function, which extracts a URL from the HTML contents of a web page:

let getPicUrl (urlContents : string) =
  let START_TOKEN = "jpg_url="
  let startIndex = urlContents.IndexOf(START_TOKEN)
  let endIndex = urlContents.IndexOf("&amp", startIndex)
  let s = startIndex + START_TOKEN.Length
  let l = endIndex-startIndex-START_TOKEN.Length

  urlContents.Substring(s, l)

The last line, urlContents.Substring(s, l), only needs s and l, so I was wondering whether I could move the other bindings into inner definitions to make my intent clearer. Ideally getPicUrl would have only two top-level let bindings, s and l, with everything else defined inside them. Whether that can actually be achieved is another story.

The only obvious improvement I can think of at the moment is to move endIndex inside l, so we'd have

let getPicUrl (urlContents : string) =
  let START_TOKEN = "jpg_url="
  let startIndex = urlContents.IndexOf(START_TOKEN)
  let s = startIndex + START_TOKEN.Length
  let l =
    let endIndex = urlContents.IndexOf("&amp", startIndex)
    endIndex-startIndex-START_TOKEN.Length

  urlContents.Substring(s, l)

but I keep wondering if there'd be a clearer way of organizing this function's let definitions.

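Firstly, your function is buggy. A non-matching string will make it grumpy: IndexOf returns -1 when a token is missing, and the function then throws an ArgumentOutOfRangeException.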
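To make the failure concrete, here is a minimal sketch that runs the question's function on three inputs. The `describe` helper and the sample strings are made up for illustration; only `getPicUrl` comes from the question.

```fsharp
open System

// The function from the question, unchanged.
let getPicUrl (urlContents : string) =
  let START_TOKEN = "jpg_url="
  let startIndex = urlContents.IndexOf(START_TOKEN)
  let endIndex = urlContents.IndexOf("&amp", startIndex)
  let s = startIndex + START_TOKEN.Length
  let l = endIndex - startIndex - START_TOKEN.Length
  urlContents.Substring(s, l)

// Hypothetical helper: report the outcome instead of letting the exception escape.
let describe (input : string) =
  try
    Ok (getPicUrl input)
  with
  | :? ArgumentOutOfRangeException -> Error "ArgumentOutOfRangeException"

// Both tokens present: the URL is extracted.
let good = describe "x?jpg_url=http://example.com/a.jpg&amp;y=1"
// Start token missing: IndexOf returns -1, which is an invalid start index for the next IndexOf.
let noStart = describe "no token here"
// End token missing: the computed length is negative, so Substring throws.
let noEnd = describe "x?jpg_url=http://example.com/a.jpg"
```

Only the first input yields a URL; the other two end up in the `Error` case, which is why the answers below return an option instead.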

I like regexes for this sort of thing. With this active pattern:

open System.Text.RegularExpressions

let (|Regex|_|) pattern input =
  let m = Regex.Match(input, pattern)
  if m.Success then Some(List.tail [for g in m.Groups -> g.Value])
  else None

you can do:

let tryGetPicUrl = function
  | Regex @"jpg_url=([^&]+)&amp" [url] -> Some url
  | _ -> None
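For a quick check of how this behaves, the sketch below repeats the two definitions so that it runs on its own and applies them to a couple of made-up inputs (the sample strings are assumptions, not real page contents).

```fsharp
open System.Text.RegularExpressions

// Active pattern from the answer: on a match, returns the captured groups without group 0.
let (|Regex|_|) pattern input =
  let m = Regex.Match(input, pattern)
  if m.Success then Some(List.tail [for g in m.Groups -> g.Value])
  else None

let tryGetPicUrl = function
  | Regex @"jpg_url=([^&]+)&amp" [url] -> Some url
  | _ -> None

// Token and terminator present: the single capture group is the URL.
let found = tryGetPicUrl "x?jpg_url=http://example.com/a.jpg&amp;y=1"
// No token at all: the pattern does not match, so the result is None.
let missing = tryGetPicUrl "no token here"
// Token present but no "&amp" after it: the regex requires the terminator, so None again.
let unterminated = tryGetPicUrl "x?jpg_url=http://example.com/a.jpg"
```

Note that the pattern insists on a trailing `&amp`, so a URL at the very end of the input is not picked up.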

You could also turn your original approach into an active pattern:

let (|Between|_|) (prefix:string) (suffix:string) (value:string) =
  match value.IndexOf(prefix) with
  | -1 -> None
  | s ->
    // The captured text starts right after the prefix.
    let n = s + prefix.Length
    match value.IndexOf(suffix, n) with
    | -1 -> None
    | e -> Some (value.Substring(n, e - n))

and do:

let tryGetPicUrl = function
  | Between "jpg_url=" "&amp" url -> Some url
  | _ -> None
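One nice property of a parameterized pattern like this is that it can be reused with other delimiters. The sketch below is a variant that starts capturing immediately after the full prefix (so the prefix has to include the `=`) and uses `StringComparison.Ordinal`, which is my own addition to keep the search culture-independent. `tryGetTitle` and the sample inputs are hypothetical.

```fsharp
open System

// Variant of the Between pattern: captures the text between prefix and the next suffix.
// StringComparison.Ordinal is an assumption added here, not part of the original answer.
let (|Between|_|) (prefix : string) (suffix : string) (value : string) =
  match value.IndexOf(prefix, StringComparison.Ordinal) with
  | -1 -> None
  | s ->
    let n = s + prefix.Length
    match value.IndexOf(suffix, n, StringComparison.Ordinal) with
    | -1 -> None
    | e -> Some (value.Substring(n, e - n))

let tryGetPicUrl = function
  | Between "jpg_url=" "&amp" url -> Some url
  | _ -> None

// Hypothetical second use: the same pattern with different delimiters.
let tryGetTitle = function
  | Between "<title>" "</title>" title -> Some title
  | _ -> None

let url = tryGetPicUrl "x?jpg_url=http://example.com/a.jpg&amp;y=1"
let title = tryGetTitle "<html><title>Pics</title></html>"
let unterminated = tryGetPicUrl "jpg_url=unterminated"
```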

You can write it this way:

let getPicUrl (urlContents : string) =
  let s =
    let START_TOKEN = "jpg_url="
    let startIndex = urlContents.IndexOf(START_TOKEN)
    startIndex + START_TOKEN.Length
  let l =
    let endIndex = urlContents.IndexOf("&amp", s)
    endIndex-s

  urlContents.Substring(s, l)
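This keeps the two-binding shape the question asked for, but it still throws on a non-matching string. As a rough sketch, the same shape can be kept while returning an option. The name `tryGetPicUrl` and the inner `tryIndexOf` helper are my own inventions for illustration.

```fsharp
let tryGetPicUrl (urlContents : string) =
  // Hypothetical helper: IndexOf that reports "not found" as None instead of -1.
  let tryIndexOf (token : string) (from : int) =
    match urlContents.IndexOf(token, from) with
    | -1 -> None
    | i -> Some i

  // Start of the URL, if the start token is present.
  let s =
    let START_TOKEN = "jpg_url="
    tryIndexOf START_TOKEN 0 |> Option.map (fun i -> i + START_TOKEN.Length)
  // Length of the URL, if the end token follows the start.
  let l =
    s |> Option.bind (fun s -> tryIndexOf "&amp" s |> Option.map (fun e -> e - s))

  match s, l with
  | Some s, Some l -> Some (urlContents.Substring(s, l))
  | _ -> None

let found = tryGetPicUrl "x?jpg_url=http://example.com/a.jpg&amp;y=1"
let noStart = tryGetPicUrl "no token here"
let noEnd = tryGetPicUrl "x?jpg_url=http://example.com/a.jpg"
```

Here `s` and `l` are still the only things the last expression depends on; they just carry the "not found" case along with them.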

Another option would be to use the string's Split method (I hope the string isn't too long, as that would be a performance hit) and return an option type to indicate whether the URL was found.

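open System

let getPicUrl (urlContents : string) =
    // Split on sep and return the n-th piece, or None if there are too few pieces.
    let splitAndGet n (sep:string) (str:string) =
        let spl = str.Split([|sep|], StringSplitOptions.None)
        match spl.Length with
        | x when x > n -> Some (spl.[n])
        | _ -> None
    // The URL starts after "jpg_url=" and ends before the next "&amp".
    match urlContents |> splitAndGet 1 "jpg_url=" with
    | Some str -> str |> splitAndGet 0 "&amp"
    | _ -> None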
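If you prefer pipelines, the final match can probably be replaced with `Option.bind`, and the length check with `Array.tryItem`. This is just a sketch of the same idea, with `splitAndGet` pulled out to the top level for brevity; the sample inputs are made up.

```fsharp
open System

// Split on a separator and return the n-th piece, if there is one.
let splitAndGet n (sep : string) (str : string) =
  str.Split([| sep |], StringSplitOptions.None) |> Array.tryItem n

let getPicUrl (urlContents : string) =
  urlContents
  |> splitAndGet 1 "jpg_url="            // text after the start token, if present
  |> Option.bind (splitAndGet 0 "&amp")  // text before the next "&amp"

let found = getPicUrl "x?jpg_url=http://example.com/a.jpg&amp;y=1"
let noStart = getPicUrl "no token here"
// With no "&amp", Split returns the whole remainder as piece 0.
let noEnd = getPicUrl "x?jpg_url=abc"
```

One thing to be aware of with the split approach in either form: when `&amp` is missing, the result is the rest of the string rather than None.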


 