
Using pandas in python to pull a specific string from a column

I have a CSV file with the following columns:

"Date","Time","TimeZone","Name","Type","Status","Currency","Gross","Fee","Net","From Email Address","To Email Address","Transaction ID","Shipping Address","Address Status","Item Title","Item ID","Shipping and Handling Amount","Insurance Amount","Sales Tax","Option 1 Name","Option 1 Value","Option 2 Name","Option 2 Value","Reference Txn ID","Invoice Number","Custom Number","Quantity","Receipt ID","Balance","Address Line 1","Address Line 2/District/Neighborhood","Town/City","State/Province/Region/County/Territory/Prefecture/Republic","Zip/Postal Code","Country","Contact Phone Number","Subject","Note","Country Code","Balance Impact"

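I am trying to grab the rows of data whose Item Title column contains the string Chain × Jewelry × Necklace.

The text under Item Title differs from row to row. For example, one might be Chain × Jewelry × Necklace Popcorn Necklace, while others are blank, but I only want the rows that contain Chain × Jewelry × Necklace.

How can I use pandas to pull the specific rows that contain this string? I'm having trouble. Any help is greatly appreciated.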

You can use a regular expression:

df[df["Item Title"].str.contains(r"^(?=.*\bChain\b)(?=.*\bJewelry\b)(?=.*\bNecklace\b).+", regex=True, na=False)]
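
To see what the lookahead pattern actually selects, here is a minimal sketch on a small made-up frame. The sample titles are hypothetical; only the "Item Title" column name comes from the question. Each lookahead requires one whole word to appear somewhere in the title, so the words may occur in any order, and na=False is passed so that blank titles count as non-matches.

```python
import pandas as pd

# Hypothetical sample data; only the "Item Title" column name comes from the question.
df = pd.DataFrame({
    "Item Title": [
        "Chain × Jewelry × Necklace Popcorn Necklace",
        "Necklace, Jewelry and Chain set",
        "Chain × Bracelet",
        None,
    ]
})

# Each lookahead asserts that one whole word occurs somewhere in the title.
pattern = r"^(?=.*\bChain\b)(?=.*\bJewelry\b)(?=.*\bNecklace\b)"

# na=False turns missing titles into False so the mask is purely boolean.
mask = df["Item Title"].str.contains(pattern, regex=True, na=False)
matches = df[mask]
print(matches)
```

Note that the second sample row matches even though it does not contain the exact phrase, so this approach is broader than a literal substring search.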

Try this:

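import pandas as pd
df = pd.read_csv('path/to/your/file.csv')
# Keep rows whose Item Title contains the literal string and whose Name is not empty
df = df[df['Item Title'].fillna('').str.contains('Chain × Jewelry × Necklace', regex=False) & df['Name'].fillna('').str.len().gt(0)]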
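
The effect of the extra Name condition is easier to see on a tiny example. The sketch below reads made-up CSV text from memory instead of a file; the rows, the names, and the reduced set of three columns are assumptions for illustration only. It compares filtering on Item Title alone with filtering on Item Title and a non-empty Name.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the real export; only three columns are shown.
csv_text = '''"Name","Item Title","Gross"
"Alice","Chain × Jewelry × Necklace Popcorn Necklace","10.00"
"","Chain × Jewelry × Necklace Rope Necklace","12.00"
"Bob","","5.00"
"Carol","Ring × Jewelry","7.50"
'''

df = pd.read_csv(io.StringIO(csv_text))

# regex=False treats the search text as a literal substring.
has_title = df["Item Title"].fillna("").str.contains("Chain × Jewelry × Necklace", regex=False)
# A row counts as having a name when the Name field is not blank.
has_name = df["Name"].fillna("").str.len().gt(0)

by_title = df[has_title]
by_title_and_name = df[has_title & has_name]
print(by_title_and_name)
```

If rows with a blank Name should also be kept, the has_name condition can simply be dropped, which matches what the question literally asks for.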

