
Label a column based on the value of another column (same row) in a pandas DataFrame

I have a list of sub-categories that each correspond to a particular category. Think of it like this:

Category | Sub Category
a | 1
a | 2
a | 3
b | 4
b | 5
etc.

I was wondering what the best way is to assign the Category value to each row of the dataframe (~800,000 rows) based on that row's Sub Category.

I am currently using this method, but I know it's not the best, or even good:

df.loc[df.Subcategory == '1', 'Category'] = 'a'
df.loc[df.Subcategory == '2', 'Category'] = 'a'
df.loc[df.Subcategory == '3', 'Category'] = 'a'
df.loc[df.Subcategory == '4', 'Category'] = 'b'

and so on...

That leaves me with a long chunk of ugly code, and it isn't very efficient.

I was wondering if anyone has another method that might help. I am fairly new to coding, so this is only the 5th or so program I've written, and I am mostly self-taught, so any help would be really appreciated.

Based on your code, it looks like you have a DataFrame column called "Subcategory" and you want to create the column "Category" based on some mapping of subcategory to category. (Your initial description suggests that you already have the "Category" column, but then there would be no point to your code.)

If I'm understanding correctly and you want to create the "Category" column, equal to "a" when subcategory == 1, equal to "a" when subcategory == 2, ..., equal to "b" when subcategory == 5, and so on, then you could use the pandas Series.map() method with a dictionary:

subcategory_to_category_map = { "1": "a", "2": "a", "3": "a", "4": "b", "5": "b" }

df["Category"] = df["Subcategory"].map( subcategory_to_category_map )
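For reference, here is a self-contained version of the snippet above on a toy frame. The sample values and the `lookup` table are made up for illustration. The sketch assumes the subcategory-to-category list is available as a table, in which case the dictionary can be built from it rather than typed by hand.

```python
import pandas as pd

# Toy data standing in for the real ~800,000-row frame (values are made up).
df = pd.DataFrame({"Subcategory": ["1", "3", "4", "5", "2"]})

# Hypothetical lookup table holding the subcategory -> category list.
lookup = pd.DataFrame({
    "Subcategory": ["1", "2", "3", "4", "5"],
    "Category": ["a", "a", "a", "b", "b"],
})

# Build the dictionary from the lookup table instead of writing it by hand.
subcategory_to_category_map = dict(zip(lookup["Subcategory"], lookup["Category"]))

# One vectorized lookup replaces the chain of df.loc assignments.
df["Category"] = df["Subcategory"].map(subcategory_to_category_map)

print(df)
```

This does one pass over the column regardless of how many subcategories there are, whereas the chain of df.loc assignments scans the whole column once per subcategory.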

Make sure you use the same data type in the dictionary as in your "Subcategory" values: if they are numeric, use numeric keys, and if they are strings ("1", "2", etc.), use string keys (as shown). Also note that any value of "Subcategory" that is not a key in the dictionary will result in a missing value (NaN) in the new "Category" column.
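Both caveats can be seen in a small sketch. The sample frame below is made up and assumes integer subcategories with one value (9) that is not in the dictionary. The "unknown" placeholder is just an illustrative choice, not something the question asked for.

```python
import pandas as pd

subcategory_to_category_map = {"1": "a", "2": "a", "3": "a", "4": "b", "5": "b"}

# Made-up sample: integer subcategories, including 9, which is not in the map.
df = pd.DataFrame({"Subcategory": [1, 2, 4, 9]})

# Integer values never match string keys, so every row comes back missing.
mismatched = df["Subcategory"].map(subcategory_to_category_map)

# Casting to str first makes the values comparable with the string keys.
matched = df["Subcategory"].astype(str).map(subcategory_to_category_map)

# Rows whose subcategory is still unmapped after the cast.
unmapped = df.loc[matched.isna(), "Subcategory"]

# Optionally replace the missing values with a placeholder label.
df["Category"] = matched.fillna("unknown")
```

Checking `unmapped` before filling is a cheap way to catch subcategories that were left out of the dictionary by mistake.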


