Using pandas in Python to pull a specific string from a column
I have a CSV file with the following columns:
"Date","Time","TimeZone","Name","Type","Status","Currency","Gross","Fee","Net","From Email Address","To Email Address","Transaction ID","Shipping Address","Address Status","Item Title","Item ID","Shipping and Handling Amount","Insurance Amount","Sales Tax","Option 1 Name","Option 1 Value","Option 2 Name","Option 2 Value","Reference Txn ID","Invoice Number","Custom Number","Quantity","Receipt ID","Balance","Address Line 1","Address Line 2/District/Neighborhood","Town/City","State/Province/Region/County/Territory/Prefecture/Republic","Zip/Postal Code","Country","Contact Phone Number","Subject","Note","Country Code","Balance Impact"
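I am trying to grab the rows whose Item Title column contains the string Chain × Jewelry × Necklace.
The text under Item Title differs from row to row. For example, one value might be "Chain × Jewelry × Necklace Popcorn Necklace", while others are blank, but I only want the rows that contain Chain × Jewelry × Necklace.
How can I use pandas to extract the specific rows that contain this string? I'm having trouble with it. Any help is much appreciated.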
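You can use a regular expression. Because some Item Title values are blank, pass na=False so that missing values count as non-matches instead of producing NaN in the boolean mask:
df[df["Item Title"].str.contains(r"^(?=.*\bChain\b)(?=.*\bJewelry\b)(?=.*\bNecklace\b).+", regex=True, na=False)]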
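To make the behaviour of the lookahead pattern concrete, here is a minimal sketch on a small made-up DataFrame. The sample titles are hypothetical; only the "Item Title" column name comes from the question. It illustrates that the pattern matches the three words in any order, which may be broader than the exact phrase the question asks for.

```python
import pandas as pd

# Hypothetical sample data; only the "Item Title" column name comes from the question.
df = pd.DataFrame({
    "Item Title": [
        "Chain × Jewelry × Necklace Popcorn Necklace",  # exact phrase plus extra text
        "Necklace, Jewelry and Chain set",              # same words, different order
        "Chain × Bracelet",                             # missing two of the words
        None,                                           # blank title
    ]
})

# Each lookahead requires one whole word to appear somewhere in the title.
pattern = r"^(?=.*\bChain\b)(?=.*\bJewelry\b)(?=.*\bNecklace\b).+"

# na=False turns missing titles into False so the mask is purely boolean.
mask = df["Item Title"].str.contains(pattern, regex=True, na=False)
matches = df[mask]
print(matches)
```

If only the exact phrase should match, a plain substring search is likely the simpler choice.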
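Try this:
import pandas as pd
df = pd.read_csv('path/to/your/file.csv')
# Keep rows whose Item Title contains the phrase and whose Name is not empty
df = df[df['Item Title'].fillna('').str.contains('Chain × Jewelry × Necklace') & df['Name'].fillna('').str.len().gt(0)]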
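As a rough illustration of the literal-substring approach, the sketch below runs the same kind of filter on a few made-up rows instead of a real CSV file. The sample values are hypothetical, and passing regex=False and na=False is an assumption about what is wanted here: treat the phrase as plain text and count blank titles as non-matches.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", None, "Dana"],
    "Item Title": [
        "Chain × Jewelry × Necklace Popcorn Necklace",
        None,
        "Chain × Jewelry × Necklace",
        "Jewelry × Chain",
    ],
})

target = "Chain × Jewelry × Necklace"

# regex=False searches for the phrase literally; na=False maps blank titles to False.
title_mask = df["Item Title"].str.contains(target, regex=False, na=False)
by_title = df[title_mask]

# Optional extra condition from the answer above: also require a non-empty Name.
name_mask = df["Name"].fillna("").str.len().gt(0)
by_title_and_name = df[title_mask & name_mask]

print(by_title)
print(by_title_and_name)
```

Whether the Name condition belongs in the filter depends on the data; the question itself only asks about Item Title.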