Find a specific symbol in a string and remove a number of characters to the left and right of it in Python
I have a pandas column with strings like these:
"[{'node': {'text': '2900₸ размері: 2-3-4-5-6..."
"[{'node': {'text': '3000₸ размері: 1-2-3-4..."
I'd like to remove everything to the left of the "₸" symbol except the price, e.g. 2900₸ (it may also be a 5-digit number), and then remove everything to the right of the "₸" symbol. The Unicode code point for ₸ is U+20B8.
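Define a function to extract the price (this requires import re):

import re

def get_price(s):
    # Capture the digits and the ₸ sign that directly follow a single quote.
    match = re.search(r"^.+'(\d+₸).*", s)
    if match is not None:
        return match.group(1)
    # Report strings that contain no price; the caller receives None.
    print('No match:', s)
    return None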
Now map the function over the column:
df['price'] = df['old_price'].map(get_price)
Here, 'old_price' is the name of the column containing the strings to reduce.
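A possible alternative, not part of the original answer, is to let pandas apply the regex directly with Series.str.extract, which avoids the Python-level function call per row and leaves missing values as NaN instead of raising. The sketch below assumes a looser pattern, (\d+₸), that does not require a preceding quote, and the DataFrame contents are made up to mirror the question.

```python
import pandas as pd

# Hypothetical data mirroring the strings shown in the question.
df = pd.DataFrame({
    "old_price": [
        "[{'node': {'text': '2900₸ размері: 2-3-4-5-6...",
        "[{'node': {'text': '3000₸ размері: 1-2-3-4...",
        None,  # a missing value, to show how it is handled
    ]
})

# str.extract returns the first capture group of the first match per row;
# expand=False yields a Series, and rows without a match become NaN.
df["price"] = df["old_price"].str.extract(r"(\d+₸)", expand=False)

print(df["price"])
```

If the ₸ sign itself is not wanted in the result, moving it outside the parentheses, as in r"(\d+)₸", would capture only the digits.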