
Modify row values in new column based on string value with loop in Python

I'd like to recode row values in a different column based on a string match in pandas using a loop. I found a way to do it by creating an entirely new column each time, but that doesn't work when I need to modify selected rows from multiple columns at different points in the analysis.

Here is the solution I used, with an example dataframe:

import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
iris.species.value_counts()

pattern = ['setosa', 'virginica']
iris['new_column'] = 0
lis = []

for index, row in iris.iterrows():
  #print (row['species'])
  if any(ele in row.species for ele in pattern):
    lis.append('matched')
  else:
    lis.append("notmatched")

iris['new_column'] = lis

I know there may be other ways through list comprehensions in pandas or using lambda/apply methods, but I'm requesting a solution using loops. (I don't have the full dataset here, but there are some complications with it and I believe a loop may be the most flexible.)

Any suggestions on how to use a loop and string match to modify rows in a different column? Thank you and let me know if I can make this question better!

One of the simplest loop solutions is to iterate over each value of the column iris['species'] and append to the list lis according to a condition using in:

pattern = ['setosa', 'virginica']
lis = []
for val in iris['species']:
  if val in pattern:
    lis.append('matched')
  else:
    lis.append("notmatched")

iris['new_column'] = lis
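
If the goal is to change only the matching rows of a column that already exists, rather than rebuild the whole column from a list, a loop can write to individual cells. The following is a minimal sketch under stated assumptions: the small dataframe stands in for iris so it runs without seaborn, and the `label` column is a hypothetical existing column, not something from the question.

```python
import pandas as pd

# Small stand-in for the iris data so the sketch runs without seaborn.
df = pd.DataFrame({
    "species": ["setosa", "versicolor", "virginica", "versicolor"],
    "label": ["a", "b", "c", "d"],  # hypothetical existing column
})

pattern = ["setosa", "virginica"]

# Walk over (index, value) pairs and overwrite only the matching rows;
# rows that do not match keep whatever value they already had.
for idx, val in df["species"].items():
    if val in pattern:
        df.at[idx, "label"] = "matched"

print(df["label"].tolist())
```

Because nothing is written for non-matching rows, the same loop can be run again later in the analysis with a different pattern or a different target column.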

A pandas solution is possible with numpy.where and Series.isin:

import numpy as np

iris['new_column'] = np.where(iris['species'].isin(pattern), 'matched', 'notmatched')
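
When only selected rows of an existing column should change, a boolean mask with `.loc` may be a closer fit than `numpy.where`, since it leaves the other rows untouched. This is a sketch, not part of the original answer; the dataframe and the `label` column are illustrative assumptions.

```python
import pandas as pd

# Illustrative data; "label" is a hypothetical column that already holds values.
df = pd.DataFrame({
    "species": ["setosa", "versicolor", "virginica"],
    "label": ["a", "b", "c"],
})

pattern = ["setosa", "virginica"]

# Boolean mask: True where the species is one of the patterns.
mask = df["species"].isin(pattern)

# Assign only to the masked rows of the target column.
df.loc[mask, "label"] = "matched"

print(df["label"].tolist())
```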

I ended up finding an answer through a few different threads.

Here's how I did it:

import re
import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
print(iris.species.value_counts())

pattern = ['setosa', 'virginica']
# Start with a string default so the column dtype matches the values written below
iris['new_column'] = 'no match'

for index, row in iris.iterrows():
  # re.match only matches at the start of the string
  match = re.match('|'.join(pattern), row.species)
  if match:
    iris.loc[index, "new_column"] = match.group(0)
  else:
    iris.loc[index, "new_column"] = 'no match'

print(iris.new_column.value_counts())

I imagine there's a more efficient way to do this, and I also have to specify the column, which isn't ideal. Feel free to comment!
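
One possible way to address both points is to wrap the regex match in a small function that takes the column name as an argument and uses `Series.str.extract` instead of a row loop. This is only a sketch: `recode_by_pattern` is a hypothetical helper name, the sample dataframe stands in for iris, and the `^` anchor is added so that the behavior matches `re.match`, which only matches at the start of the string.

```python
import re
import pandas as pd

def recode_by_pattern(df, column, patterns, default="no match"):
    """Hypothetical helper: return the matched pattern per row, or `default`."""
    # Escape each pattern and anchor at the start, mirroring re.match.
    regex = "^(" + "|".join(re.escape(p) for p in patterns) + ")"
    # With one capture group and expand=False, str.extract returns a Series
    # holding the matched text, or NaN where nothing matched.
    return df[column].str.extract(regex, expand=False).fillna(default)

# Illustrative stand-in for the iris data.
df = pd.DataFrame({"species": ["setosa", "versicolor", "virginica"]})
df["new_column"] = recode_by_pattern(df, "species", ["setosa", "virginica"])

print(df["new_column"].tolist())
```

The column name is now a parameter, so the same function can be reused on other columns at different points in the analysis.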
