Python: Using lambda with startswith

Question

I need to write my dataframe to CSV, and some of the values in one series start with characters such as "+", "-", "=", or " ", so I need to remove those leading characters first.

I tried to test by using a string:

test="+++++-= I love Mercedes-Benz"
while True:
    if test.startswith('+') or test.startswith('-') or test.startswith('=') or test.startswith(' '):
        test=test[1:]
        continue

    else:
        print(test)
        break

Output looks perfect:

I love Mercedes-Benz

Now I want to do the same with a lambda on my dataframe:

import pandas as pd

col_names =  ['A', 'B', 'C']
my_df  = pd.DataFrame(columns = col_names)
my_df.loc[len(my_df)] = ["++++-= I love Mercedes-Benz", 4, "Love this"]
my_df.loc[len(my_df)] = ["=Looks so good!", 2, "5-year-old"]
my_df

my_df["A"]=my_df["A"].map(lambda x: x[1:] if x.startswith('=') else x)
print(my_df["A"])

I am not sure how to combine the four startswith checks ("-", "=", "+", " ") and repeat them until the string reaches its first letter or other character (which may sometimes be Japanese or Chinese).

Expected final my_df:

         A                    B          C
0   I love Mercedes-Benz      4       Love this
1   Looks so good!            2       5-year-old

Answer 1

You can use str.lstrip to remove these leading characters:

my_df.A.str.lstrip('+-= ')

0    I love Mercedes-Benz
1          Looks so good!
Name: A, dtype: object
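
A minimal sketch of the same idea on a standalone Series, assuming the space should be stripped as well and that some rows hold non-ASCII text (the third sample row is made up for illustration). str.lstrip treats its argument as a set of characters rather than a prefix, so it stops at the first character outside that set:

```python
import pandas as pd

# Sample data; the third row is a hypothetical non-ASCII example.
s = pd.Series(["++++-= I love Mercedes-Benz", "=Looks so good!", "-= 素晴らしい車"])

# The argument is a set of characters to drop from the left, in any order.
cleaned = s.str.lstrip("+-= ")
print(cleaned.tolist())
```

The hyphen inside "Mercedes-Benz" is untouched because stripping ends at the first character that is not in the set.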

Answer 2

The method str.startswith accepts a tuple of prefixes:

while test.startswith(('+','-','=',' ')):
    test=test[1:]

You can't put a while loop in a lambda, but you don't need a lambda: just write a function and pass its name to map.
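
A sketch of what that could look like, assuming a helper named strip_leading (a hypothetical name, not from the question) and a dataframe built like the one in the question:

```python
import pandas as pd

def strip_leading(text, chars=("+", "-", "=", " ")):
    """Drop leading characters one at a time while text starts with any of chars."""
    while text.startswith(chars):
        text = text[1:]
    return text

my_df = pd.DataFrame({
    "A": ["++++-= I love Mercedes-Benz", "=Looks so good!"],
    "B": [4, 2],
    "C": ["Love this", "5-year-old"],
})

# Pass the function itself to map; no lambda is needed.
my_df["A"] = my_df["A"].map(strip_leading)
print(my_df["A"].tolist())
```

The loop also terminates on an empty string, because "".startswith(...) is False for every non-empty prefix.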

Answer 3

One way to achieve it could be:

old = None
# Repeat until a pass over the column changes nothing.
while old is None or not old.equals(my_df["A"]):
    old = my_df["A"].copy()
    my_df["A"] = my_df["A"].map(lambda x: x[1:] if any(x.startswith(char) for char in "-=+ ") else x)

But I'd like to point you to the strip() method for strings:

>>> test="+++++-= I love Mercedes-Benz"
>>> test.strip("+-=")
' I love Mercedes-Benz'

So your data cleaning can become simpler:

my_df["A"] = my_df["A"].str.strip("+=- ")

Just be careful, because strip removes the characters from both ends of the string. lstrip does the job on the left side only.
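
To make the difference concrete, here is a small sketch on a made-up string that has the unwanted characters at both ends:

```python
# Hypothetical sample with decoration on both sides.
text = "=+ 5 stars =+"

both_sides = text.strip("+-= ")   # trims the left and the right end
left_only = text.lstrip("+-= ")   # trims the left end only

print(repr(both_sides))
print(repr(left_only))
```

If trailing characters such as "=" or "+" can be meaningful in the data, lstrip is the safer choice.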

Answer 4

As a lover of regex and possibly convoluted solutions, I will add this solution as well:

import re

my_df["A"] = my_df["A"].map(lambda x: re.sub(r'^[+\-=\s]*', '', x))

The regex reads:
^ from the beginning of the string
[] any one of the items in this group
+ \- = the literal characters +, - (escaped so it is not read as a range), and =
\s any whitespace
* zero or more times
So this matches (and replaces with nothing) all the characters at the beginning of the string that are in the square brackets.
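
A standalone sketch of the regex approach, assuming a precompiled pattern and made-up sample strings. Inside square brackets an unescaped hyphen between two characters defines a range, so the hyphen is escaped here to keep it literal; otherwise leading digits could be removed as well:

```python
import re

# ^ anchors at the start; the class lists +, a literal -, =, and whitespace.
LEADING = re.compile(r"^[+\-=\s]+")

def clean(text):
    """Remove any leading run of +, -, = or whitespace characters."""
    return LEADING.sub("", text)

# Hypothetical samples, including a leading digit and non-ASCII text.
samples = ["++++-= I love Mercedes-Benz", "=2 doors", "-= 素晴らしい"]
cleaned = [clean(s) for s in samples]
print(cleaned)
```

The function can be passed to map on a column in the same way as any other named function.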

Answer 1: score 3, accepted, 2019-03-12 09:08:11

Answer 2: score 0, 2019-03-12 09:09:22

Answer 3: score 0, 2019-03-12 09:19:37

Answer 4: score 0, 2019-03-12 09:24:03
