
How to remove all strings or numbers after a certain character in Pandas?

Good evening. I have been scraping the sneakers section of the Amazon site.

Some of the prices come back as a range, for example 1100 - 2300.

How can I remove everything from the - to the end in Pandas, and also the currency symbol? This is what I have tried, along with my output.

The CSV file:

https://easyupload.io/ayaivg

import pandas as pd

Azs = pd.read_csv("amazsneakers.csv")

Azs['Price'].str.replace("-","")

Output:


You can split on whitespace with Series.str.split, select the second token of each split, remove the commas, and convert to numeric:

Azs = pd.read_csv("amazsneakers.csv")

Azs['Price'] = Azs['Price'].str.split().str[1].str.replace(',','').astype(float)
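To see why the second token is the one to keep, here is a minimal sketch on made-up data. The price format (currency symbol, space, amount, optional " - " range) is an assumption, since the CSV contents are not shown in the post, and the names prices and first_price are only for illustration.

```python
import pandas as pd

# Hypothetical sample; the real CSV's price format is assumed, not confirmed.
prices = pd.Series(["₹ 1,100.00 - ₹ 2,300.00", "₹ 645.00"])

# Splitting on whitespace gives tokens like
# ["₹", "1,100.00", "-", "₹", "2,300.00"], so index 1 is the first amount.
first_price = (
    prices.str.split()
    .str[1]
    .str.replace(",", "", regex=False)  # drop thousands separators
    .astype(float)
)

print(first_price.tolist())
```

This only works if the symbol and the amount are separated by a space; if they are glued together (for example "₹645.00"), index 1 would pick the wrong token.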

Or first remove the commas and then extract the floats with Series.str.extract:

Azs['Price'] = Azs['Price'].str.replace(',','').str.extract(r'(\d+\.\d+)', expand=False).astype(float)

print(Azs['Price'])

0      645.0
1      655.0
2      799.0
3      799.0
4      849.0
       ...  
169    367.0
170    199.0
171    386.0
172    499.0
173    401.0
Name: Price, Length: 174, dtype: float64
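The str.extract approach depends on every price having a decimal part. As a hedged variation, the sketch below makes the decimal part optional so that whole-number prices also match. The sample values and the pattern are assumptions for illustration, not taken from the CSV.

```python
import pandas as pd

# Hypothetical sample, including a price without decimals.
prices = pd.Series(["₹ 1,100.00 - ₹ 2,300.00", "₹ 645.00", "$799"])

# Digits, optionally followed by a dot and more digits.
pattern = r"(\d+(?:\.\d+)?)"

first_price = (
    prices.str.replace(",", "", regex=False)  # "1,100.00" -> "1100.00"
    .str.extract(pattern, expand=False)       # first match only, as a Series
    .astype(float)
)

print(first_price.tolist())
```

Because str.extract returns only the first match, the upper end of a range is dropped without any explicit handling of the "-".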

Please try this:

test_string = "ab-cd"

split_string = test_string.split("-", 1)

# Split into "ab" and "cd"

required_string = split_string[0]

print(required_string)

Azs['Price'] = Azs['Price'].apply(lambda x: x.split(".")[0])
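Note that the last line above splits on "." rather than "-". If the goal is what the question asks for (drop everything from the - onward, plus the currency symbol), a vectorized sketch could look like the following. The sample data and the names lower and cleaned are made up for illustration, and the cleanup regex assumes the currency marker itself contains no digits or dots.

```python
import pandas as pd

# Hypothetical sample; the real column format is assumed.
prices = pd.Series(["₹ 1,100.00 - ₹ 2,300.00", "₹ 645.00"])

# Keep only the part before the first "-".
lower = prices.str.split("-", n=1).str[0]

# Strip everything that is not a digit or a decimal point
# (currency symbol, spaces, thousands separators), then convert.
cleaned = lower.str.replace(r"[^\d.]", "", regex=True).astype(float)

print(cleaned.tolist())
```

A currency written as "Rs." would leave a stray dot behind and break the float conversion, so the regex would need adjusting for that case.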
