Dictionary comprehensions(?) in F# (converting from C#)

OK, so, I'm just starting out learning F#. I have some exposure to functional languages from university, but I'm still quite green when it comes to real-world programming in languages such as F#.

On a day-to-day basis I work in C#, but today I had the opportunity to spend some time with my company's code base and look at it from an F# perspective. I decided to try to rewrite some of our C# code in F#, to get a feel for the language in a realistic business setting.

Here's a paraphrase of some C# code that I struggled to translate:

// MyData is a class with properties Id, Analysis, and some other relevant properties
// Each pair of (Id, Analysis) is (should be) distinct
IEnumerable<MyData> data = // fetch from DB...

// dataDict[id[analysis]] = MyData object (or "row") from DB
var dataDict = new Dictionary<String, Dictionary<String, MyData>> ();
foreach(var d in data)
{
    if(!dataDict.ContainsKey(d.Id))
        dataDict.Add(d.Id, new Dictionary<string, MyData>());

    if (dataDict[d.Id].ContainsKey(d.Analysis))
    {
        logger.Warn(String.Format("Id '{0}' has more than one analysis of type '{1}', rows will be ignored",
            d.Id, d.Analysis));
    }
    else
    {
        dataDict[d.Id].Add(d.Analysis, d);
    }
} 

My attempt at rewriting the loop in a "functional" manner resulted in the following code, but I don't feel all that good about it.

let dataDict = 
      dict [ 
        for d in data 
          |> Seq.distinctBy(fun d -> d.Id) -> d.Id, 
             dict [                                                                                                   
                 for x in data |> Seq.filter(fun a -> a.Id = d.Id) -> x.Analysis, x
             ]
      ]

A couple of issues with this code:

  • It does not log a warning in case of duplicate (Id, Analysis) pairs, and, even worse
  • I run through the data (at least) twice with the for and the Seq.filter.

How can I improve this? Am I doing it all wrong?

What I would consider a more functional approach:

let intoMap (data: seq<MyData>) = 
    Seq.fold (fun (datamap, dups) (data: MyData) -> 
        match datamap |> Map.tryFind data.Id with
        | Some submap when submap |> Map.containsKey data.Analysis -> 
            datamap, data :: dups
        | Some submap ->
            let ext = Map.add data.Analysis data submap
            (Map.add data.Id ext datamap), dups
        | None ->
            let submap = Map.ofArray [| (data.Analysis, data) |]
            (Map.add data.Id submap datamap), dups
        ) (Map.empty, List.empty) data

It's a fold over the data, so it traverses the sequence once. It's also more functional in that it's not side-effecting - instead of being logged, duplicates are collected and made part of the output. You can do whatever you like with them later.
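As a rough illustration of "doing something with them later", here is a minimal, self-contained sketch. The `MyData` record (with a made-up `Value` field) and the sample rows are hypothetical stand-ins for the class in the question, the fold is restated with a named `step` function so the snippet runs on its own, and `printfn` takes the place of `logger.Warn`.

```fsharp
// Hypothetical stand-in for the MyData class from the question.
type MyData = { Id: string; Analysis: string; Value: int }

// Same logic as the fold above, restated so this snippet is self-contained.
let intoMap (data: seq<MyData>) =
    let step (datamap, dups) (d: MyData) =
        match Map.tryFind d.Id datamap with
        | Some submap when Map.containsKey d.Analysis submap ->
            // (Id, Analysis) already present: keep the first row, record the duplicate.
            datamap, d :: dups
        | Some submap ->
            Map.add d.Id (Map.add d.Analysis d submap) datamap, dups
        | None ->
            Map.add d.Id (Map.ofList [ d.Analysis, d ]) datamap, dups
    Seq.fold step (Map.empty, []) data

// Made-up sample rows; the third repeats the key ("a", "x").
let rows =
    [ { Id = "a"; Analysis = "x"; Value = 1 }
      { Id = "a"; Analysis = "y"; Value = 2 }
      { Id = "a"; Analysis = "x"; Value = 3 }
      { Id = "b"; Analysis = "x"; Value = 4 } ]

let dataMap, dups = intoMap rows

// Duplicates were prepended during the fold, so reverse to report them in input order.
// printfn is an assumption standing in for the question's logger.Warn.
for d in List.rev dups do
    printfn "Id '%s' has more than one analysis of type '%s', row will be ignored" d.Id d.Analysis
```

The side effect (logging) now happens at the edge, after the pure fold has finished, and the caller decides whether duplicates are warnings, errors, or simply ignored.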

Also, I use the immutable Map instead of Dictionary - I find Dictionary to be a kind of code smell in F# code. The mutability it provides has its uses in some more esoteric scenarios, but for actually holding and passing around data, I would use Map exclusively.

That's the answer to your immediate question - but to be honest, I would probably go for one function that finds and splits out the duplicates and a separate function that builds up a map without caring about potential duplicates - even if that would mean multiple passes over the data.
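One possible shape for that two-function split is sketched below. The names `splitDuplicates` and `buildMap`, the `MyData` record, and the sample rows are all assumptions made for illustration; the sketch relies on `Seq.groupBy` keeping keys in order of first occurrence and keeping each group's elements in input order.

```fsharp
// Hypothetical stand-in for the MyData class from the question.
type MyData = { Id: string; Analysis: string; Value: int }

// Splits rows into (first row per (Id, Analysis) key, later rows sharing that key).
let splitDuplicates (data: seq<MyData>) =
    let groups =
        data
        |> Seq.groupBy (fun d -> d.Id, d.Analysis)
        |> Seq.map (fun (_, rows) -> List.ofSeq rows)
        |> List.ofSeq
    // Every group produced by groupBy is non-empty, so head/tail are safe here.
    let unique = groups |> List.map List.head
    let dups = groups |> List.collect List.tail
    unique, dups

// Builds Id -> Analysis -> row without checking for duplicates.
// If a key repeats, Map.ofSeq keeps the last row seen for it.
let buildMap (rows: seq<MyData>) =
    rows
    |> Seq.groupBy (fun d -> d.Id)
    |> Seq.map (fun (key, group) ->
        let submap = group |> Seq.map (fun d -> d.Analysis, d) |> Map.ofSeq
        key, submap)
    |> Map.ofSeq

// Made-up sample rows; the third repeats the key ("a", "x").
let rows =
    [ { Id = "a"; Analysis = "x"; Value = 1 }
      { Id = "a"; Analysis = "y"; Value = 2 }
      { Id = "a"; Analysis = "x"; Value = 3 }
      { Id = "b"; Analysis = "x"; Value = 4 } ]

let unique, dups = splitDuplicates rows
let dataMap = buildMap unique
```

This walks the data more than once, but each function does one job and can be tested separately.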

Given your requirements, what you have is probably best. You can tighten the code a bit using pattern matching.

open System.Collections.Generic

let dataDict = Dictionary<_,Dictionary<_,_>>()
for d in data do
    match dataDict.TryGetValue(d.Id) with
    | true, m when m.ContainsKey(d.Analysis) ->
        (d.Id, d.Analysis)
        ||> sprintf "Id '%s' has more than one analysis of type '%s', rows will be ignored" 
        |> logger.Warn
    | true, m -> 
        m.Add(d.Analysis, d)
    | _ ->
        let m = Dictionary()
        m.Add(d.Analysis, d)
        dataDict.Add(d.Id, m)
