
Please explain the following line of code, i.e. pandas Series creation using 2 columns of a dataframe

industry_usa = f500["industry"][f500["country"] == "USA"].value_counts().head(2)

This is a dataframe where some of its columns are industry and country. So why do we need to put the 2 columns side by side while creating the industry_usa series? Please explain.

I will break it down for you:

f500["industry"] : This selects the Series (column) named industry.

f500["country"] == "USA" : This returns a boolean Series (a mask) that is True for every row whose country column is "USA" and False for the rest.

f500["industry"][f500["country"] == "USA"] : As you might have guessed, this is just like any other boolean indexing we do in pandas. The mask filters the industry Series, so it selects the industry of every row where the country is "USA".

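.value_counts() : This counts how many times each unique value occurs and, by default, sorts the result from most to least frequent, much like the Counter class in Python's collections module.

.head(2) : This keeps only the first two entries, i.e. the two most common industries among the USA rows.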
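The breakdown above can be followed step by step on a small example. This is only a sketch: the real f500 data is not shown in the question, so the dataframe below is a made-up stand-in with the same two column names, and the intermediate variable names are chosen purely for illustration.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for f500; the values are invented for illustration.
f500 = pd.DataFrame({
    "industry": ["Banking", "Retail", "Banking", "Energy",
                 "Banking", "Retail", "Energy", "Banking"],
    "country": ["USA", "USA", "USA", "USA",
                "China", "USA", "China", "USA"],
})

industry = f500["industry"]             # the industry column as a Series
is_usa = f500["country"] == "USA"       # boolean Series, True for USA rows
usa_industries = industry[is_usa]       # industries of the USA rows only
counts = usa_industries.value_counts()  # occurrences per industry, most frequent first
industry_usa = counts.head(2)           # the two most common industries

print(industry_usa)

# The same tally with collections.Counter, for comparison.
top_two = Counter(usa_industries).most_common(2)
print(top_two)
```

Chaining these steps into one expression gives exactly the line from the question.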

NOTE: Interestingly, you could change the order to f500[f500["country"] == "USA"]["industry"] and still get the same result. That version filters the rows of the dataframe first and then picks the industry column.
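A quick way to convince yourself that the order does not matter here is to compare both forms on a small example. This is a sketch on a made-up stand-in for f500, not the real data. The third form, using .loc, is not part of the original answer; it is added only as a commonly used equivalent that does the row filter and column selection in a single step.

```python
import pandas as pd

# Hypothetical stand-in for f500; the values are invented for illustration.
f500 = pd.DataFrame({
    "industry": ["Banking", "Retail", "Banking", "Energy", "Banking"],
    "country": ["USA", "USA", "China", "USA", "USA"],
})

mask = f500["country"] == "USA"

column_then_rows = f500["industry"][mask]   # select the column, then filter rows
rows_then_column = f500[mask]["industry"]   # filter rows, then select the column
with_loc = f500.loc[mask, "industry"]       # both selections in one .loc call

# All three hold the same values with the same index labels.
print(column_then_rows.equals(rows_then_column))
print(column_then_rows.equals(with_loc))
```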
