I am new to Pandas and am working with a dataset of 8000 rows. Here is a snippet from it:
(Screenshot of some of the rows: https://i.stack.imgur.com/8ftng.png) I have imported the file into a DataFrame named 'df'.
I have been trying to delete every row in the dataset whose 'source' column contains a link.
Here is my code so far:
def cleanLinks(col):
    if re.search('http\S+', col):
        return index(col)

df = df.drop(df.index[df['source'].apply(cleanLinks)])
I have no idea where to go from here so would greatly appreciate any help.
If I understood you correctly, you want to keep only the rows whose 'source' does not contain 'http':
df = df[~df['source'].str.contains('http')]
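To see what that one-liner does, here is a minimal self-contained sketch. The sample rows are made up for illustration and only stand in for the real 8000-row file. The `na=False` argument is an addition not in the line above: it assumes that rows with a missing 'source' should be kept rather than raise an error, which may or may not match your data.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataset.
df = pd.DataFrame({
    "source": [
        "plain text with no link",
        "see http://example.com/page for details",
        None,
        "secure link https://example.org",
    ]
})

# Boolean mask: True where 'source' contains the literal text "http"
# (this also matches "https"). regex=False does a plain substring test.
# na=False treats missing values as "no link", so those rows are kept.
has_link = df["source"].str.contains("http", regex=False, na=False)

# Keep only the rows without a link and renumber the index from 0.
cleaned = df[~has_link].reset_index(drop=True)

print(cleaned)
```

The original attempt fails because `index(col)` is not a defined function, and `apply` would return None for rows without a link, which cannot be used to index `df.index`. Building a boolean mask and inverting it with `~` avoids both problems.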