
Replace column values that start with or match a string in a pandas dataframe

I have a column named thumbnail_url in my dataframe prices_df.

    zipcode thumbnail_url
0   11201   https://a0.muscache.com/im/pictures/6d7cbbf7-c...
1   10019   0
2   10027   https://a0.muscache.com/im/pictures/6fae5362-9...
3   94117   https://a0.muscache.com/im/pictures/72208dad-9...
4   20009   0
5   94131   https://a0.muscache.com/im/pictures/82509143-4...

I need to replace every value that contains https:// (or, say, contains .com) with the numeric value 1.

    zipcode thumbnail_url
0   11201   1
1   10019   0
2   10027   1

I tried this:

img_Uploaded = prices_df['thumbnail_url'].str.contains("http") == True
prices_df.replace(to_replace=prices_df[img_Uploaded],value=1,inplace=True)

My dataframe has shape (74111, 2).

This line of code took too long and my system froze. Can someone suggest a better vectorized operation and explain it?

My issue is resolved, but I am curious: what was wrong with my code, apart from the fact that it was not optimized with vectorized operations? Shouldn't it still run? Or is that the reason it froze, while the code suggested below ran in seconds?

You can use the apply() function to do this:

prices_df.thumbnail_url = prices_df.thumbnail_url.apply(lambda url: 1 if 'http' in str(url) else url)
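Since the question asks for a vectorized operation, here is a minimal sketch of one possible alternative to apply(), using a boolean mask built with str.contains. The sample frame and its example.com URLs are made up for illustration and are not the asker's data; the astype(str) step is an assumption that the column mixes strings with the numeric 0 shown in the question.

```python
import pandas as pd

# Hypothetical sample data; the URLs are placeholders, not the real ones.
prices_df = pd.DataFrame({
    "zipcode": [11201, 10019, 10027],
    "thumbnail_url": [
        "https://example.com/pictures/a.jpg",
        0,
        "https://example.com/pictures/b.jpg",
    ],
})

# Cast to str first so the numeric 0 entries yield False instead of NaN.
# regex=False treats "http" as a plain substring.
has_url = prices_df["thumbnail_url"].astype(str).str.contains("http", regex=False)

# Overwrite only the rows where the mask is True; other values stay as they are.
prices_df.loc[has_url, "thumbnail_url"] = 1

print(prices_df)
```

The mask is computed once for the whole column, and the .loc assignment touches only the matching rows, so no Python-level function is called per row.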

You can use a lambda expression:

prices_df['thumbnail_url'] = prices_df['thumbnail_url'].apply(lambda x: 1 if 'https://' in str(x) else 0)
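For the "starts with" case in the title, the same 0/1 flag could probably be produced without a lambda by using str.startswith. This is only a sketch under assumptions: the sample data below is invented, and every non-matching value becomes 0, which changes the column into a pure integer flag.

```python
import pandas as pd

# Hypothetical sample data; the URLs are placeholders, not the real ones.
prices_df = pd.DataFrame({
    "zipcode": [11201, 10019, 10027],
    "thumbnail_url": [
        "https://example.com/pictures/a.jpg",
        0,
        "https://example.com/pictures/b.jpg",
    ],
})

# Convert every entry to a string so that .str methods work on the 0 values too.
urls = prices_df["thumbnail_url"].astype(str)

# True where the value begins with the prefix, False otherwise.
starts_with_https = urls.str.startswith("https://")

# Booleans cast to integers give 1 for a match and 0 for everything else.
prices_df["thumbnail_url"] = starts_with_https.astype(int)

print(prices_df)
```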

Lambda expressions are shorthand for creating anonymous functions; the expression lambda parameters: expression yields a function object. The unnamed object behaves like a function object defined with a def statement.

Here's the doc.
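To illustrate the point about lambdas being anonymous functions, the sketch below places a lambda next to a named function with the same body. The names flag and flag_url are hypothetical and chosen only for this example.

```python
# A lambda bound to a name, with the same body as the one passed to apply().
flag = lambda x: 1 if 'https://' in str(x) else 0

# The equivalent function written with def (flag_url is a hypothetical name).
def flag_url(x):
    return 1 if 'https://' in str(x) else 0

# Both objects are ordinary callables and give the same results.
print(flag("https://example.com/a.jpg"), flag_url("https://example.com/a.jpg"))
print(flag(0), flag_url(0))
```

Either form can be passed to apply(); the lambda simply avoids naming a one-off function.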
