
How can I determine which value occurs the most in my collection?

I have a JSON file that contains a list of fruits. The "fruits" key can map to a single fruit or to a collection of fruits.

E.g.:

[
    {
        "fruits": [
            "banana"
        ]
    },
    {
        "fruits": [
            "apple"
        ]
    },
    {
        "fruits": [
            "orange",
            "apple"
        ]
    }
]

How can I determine which fruit(s) occur most often in my JSON structure? That is, how do I find out how often each value occurs and which one leads the others?

Not sure if you're interested in having a class to deserialize into, but here's how you would do it. Feel free to skip the class and use dynamic deserialization:

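class FruitCollection
{
    // Json.NET only populates public properties by default.
    public string[] Fruits { get; set; }
}

// The root of the JSON is an array, so deserialize into a list.
var fruitColls = JsonConvert.DeserializeObject<List<FruitCollection>>(json);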
var mostCommon = fruitColls
    .SelectMany(fc => fc.Fruits)
    .GroupBy(f => f)
    .OrderByDescending(g => g.Count())
    .First()
    .Key;
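The question asks for the fruit(s) that occur most, and First() returns only one of them when several share the top count. A minimal sketch of returning every leader follows; the class and method names (MostCommon, Leaders) are made up for illustration, and the input is assumed to be an already flattened sequence of fruit names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MostCommon
{
    // Hypothetical helper: returns every value that shares the highest count.
    public static List<string> Leaders(IEnumerable<string> values)
    {
        // Count each distinct value once.
        var counts = values
            .GroupBy(v => v)
            .Select(g => new { Value = g.Key, Count = g.Count() })
            .ToList();

        // Max() would throw on an empty sequence, so handle that case first.
        if (counts.Count == 0)
            return new List<string>();

        int max = counts.Max(c => c.Count);
        return counts.Where(c => c.Count == max).Select(c => c.Value).ToList();
    }

    public static void Main()
    {
        var fruits = new[] { "banana", "apple", "orange", "apple", "banana" };
        Console.WriteLine(string.Join(", ", Leaders(fruits))); // prints "banana, apple"
    }
}
```

GroupBy yields groups in the order their keys first appear, so tied values come back in first-seen order.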

EDIT:

This question's pretty old, but I'll mention that the OrderByDescending, First combination does redundant work: you don't need to sort to get the maximum. It's an age-old lazy hack that people keep using because LINQ did not provide a nice MaxBy extension method (Enumerable.MaxBy was only added in .NET 6).

Usually your input is small enough, and everything else adds enough overhead, that you don't really care. But the "correct" way (e.g. if you had billions of fruit types) would be to use a proper MaxBy extension method or hack something out of Aggregate. Finding the max is worst-case linear, whereas sorting is worst-case O(n log n).
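As a rough sketch of the Aggregate approach, assuming the fruit names have already been flattened into a single sequence (the LinearMax class and MostFrequent method are illustrative names, not part of any library):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinearMax
{
    // Hypothetical helper: finds the most frequent value without sorting.
    public static string MostFrequent(IEnumerable<string> values)
    {
        return values
            .GroupBy(v => v)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            // One pass over the groups, keeping the largest seen so far.
            // Aggregate without a seed throws InvalidOperationException on empty input.
            .Aggregate((best, next) => next.Value > best.Value ? next : best)
            .Key;
    }

    public static void Main()
    {
        var fruits = new[] { "banana", "apple", "orange", "apple" };
        Console.WriteLine(MostFrequent(fruits)); // prints "apple"
    }
}
```

On .NET 6 or later, the Select and Aggregate steps could be replaced with .MaxBy(g => g.Count()) applied directly to the groups.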

If you use Json.NET, you can load your JSON using LINQ to JSON, then use SelectTokens to recursively find all "fruits" properties, recursively collect all descendant string values (those of type JValue), group them by their string value, and put them in descending order of count:

        var token = JToken.Parse(jsonString);

        var fruits = token.SelectTokens("..fruits")  // Recursively find all "fruits" properties
            .SelectMany(f => f.DescendantsAndSelf()) // Recursively find all string literals underneath each
            .OfType<JValue>()                        
            .GroupBy(f => (string)f)                 // Group by string value
            .OrderByDescending(g => g.Count())       // Descending order by count.
            .ToList();

Or, if you prefer to put your results into an anonymous type for clarity:

        var fruits = token.SelectTokens("..fruits")  // Recursively find all "fruits" properties
            .SelectMany(f => f.DescendantsAndSelf()) // Recursively find all string literals underneath each
            .OfType<JValue>()
            .GroupBy(f => (string)f)                 // Group by string value
            .Select(g => new { Fruit = (string)g.Key, Count = g.Count() } )
            .OrderByDescending(f => f.Count)       // Descending order by count.
            .ToList();

Then afterwards:

        Console.WriteLine(JsonConvert.SerializeObject(fruits, Formatting.Indented));

Produces:

[
  {
    "Fruit": "apple",
    "Count": 2
  },
  {
    "Fruit": "banana",
    "Count": 1
  },
  {
    "Fruit": "orange",
    "Count": 1
  }
]

Update

I forgot to include the following extension method:

public static class JsonExtensions
{
    public static IEnumerable<JToken> DescendantsAndSelf(this JToken node)
    {
        if (node == null)
            return Enumerable.Empty<JToken>();
        var container = node as JContainer;
        if (container != null)
            return container.DescendantsAndSelf();
        else
            return new [] { node };
    }
}

The original question was a little vague on the precise structure of the JSON, which is why I suggested using LINQ to JSON rather than deserialization.

The class to deserialize this structure into is simple:

public class RootObject
{
    public List<string> fruits { get; set; }
}

So to deserialize (the root of the JSON is an array, hence the list):

var fruitListContainer = JsonConvert.DeserializeObject<List<RootObject>>(jsonString);

Then you can put all fruits in one list:

List<string> fruits = fruitListContainer.SelectMany(r => r.fruits).ToList();

Now you have all fruits in one list and can do whatever you want with them. For sorting, see the other answers.
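One thing you might do with the flattened list is tally it with a plain dictionary. This is only a sketch: the FruitTally class and its Count method are hypothetical names, and the hard-coded list stands in for the result of the deserialization step.

```csharp
using System;
using System.Collections.Generic;

public static class FruitTally
{
    // Hypothetical helper: maps each fruit name to the number of times it appears.
    public static Dictionary<string, int> Count(IEnumerable<string> fruits)
    {
        var counts = new Dictionary<string, int>();
        foreach (var fruit in fruits)
        {
            // TryGetValue leaves current at 0 when the key is not present yet.
            counts.TryGetValue(fruit, out int current);
            counts[fruit] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        // Stand-in for the flattened list built from the JSON.
        var fruits = new List<string> { "banana", "apple", "orange", "apple" };
        foreach (var pair in Count(fruits))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

The dictionary is filled in a single pass, so the leader can then be picked with whichever max-finding approach you prefer.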

Assuming that the data is in a file named fruits.json, that jq (http://stedolan.github.io/jq/) is on the PATH, and that you're using a Mac or Linux-style shell:

$ jq 'reduce (.[].fruits[]) as $fruit ({}; .[$fruit] += 1)' fruits.json
{
  "banana": 1,
  "apple": 2,
  "orange": 1
}

On Windows, the same thing will work if the quotation marks are suitably adjusted. Alternatively, if the one-line jq program is put in a file, say fruits.jq, the following command can be run in any supported environment:

jq -f fruits.jq fruits.json

If the data is coming from some other process, you can pipe it into jq, which reads from standard input when no input file is given, e.g. like so:

jq -f fruits.jq

One way to find the maximum count is to add a couple of filters, e.g. as follows:

$ jq 'reduce (.[].fruits[]) as $fruit ({}; .[$fruit] += 1) |
      to_entries | max_by(.value)' fruits.json
{
  "key": "apple",
  "value": 2
}


