I'm accessing a CSV file, looping through all of its rows (strings), and I want to keep / print all parts of each string which start with a ".", have two words in the middle, and end with either a ".", "?" or "!".
For example, if the string was: "This is my new channel. Please subscribe!" I'd only want to keep the ". Please subscribe!"
So far I only have this to show me how many words are inside each string:
import csv

with open("data2.csv", encoding="utf-8", newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        rowstr = str(row[1])
        res = len(row[1].split())  # number of whitespace-separated words
        print(res)
I've tried:
with open("data2.csv", encoding="utf-8", newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        rowstr = row[1]
        res = len(row[1].split())
        re.findall(r"\.\S+\s\S+[.?!]", rowstr)
        print(row[1])
I get no output from findall, only from printing row[1].
Fixed it
Working code:
import csv
import re

with open("data2.csv", encoding="utf-8", newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        rowstr = row[1]
        res = len(row[1].split())
        # Store the result of findall so it can be printed
        finalData = re.findall(r"(\.\W\w+\W\w+[\.\?!])", rowstr)
        print(finalData)
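For anyone wondering why the first attempt printed nothing: the return value of re.findall was never stored or printed, and on top of that `\.\S+` needs a non-whitespace character right after the period, so ". Please" can't match. Here is a small sketch comparing the two patterns. The helper name `find_short_sentences` and the sample string are only for illustration, they are not part of the code above.

```python
import re

# Pattern from the first attempt: "\S+" right after the period needs a
# non-whitespace character there, so ". Please ..." cannot match.
FIRST_ATTEMPT = re.compile(r"\.\S+\s\S+[.?!]")

# Pattern from the working code: "\W" consumes the space after the period.
WORKING = re.compile(r"\.\W\w+\W\w+[.?!]")


def find_short_sentences(text, pattern=WORKING):
    """Return every substring of `text` matched by `pattern` (hypothetical helper)."""
    return pattern.findall(text)


sample = "This is my new channel. Please subscribe!"
print(find_short_sentences(sample, FIRST_ATTEMPT))  # []
print(find_short_sentences(sample))                 # ['. Please subscribe!']
```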
You can use a regular expression:
re.findall(r'(\.\W\w+\W\w+[\.\?!])$', "This is my new channel. Please subscribe!")
which outputs: ['. Please subscribe!']
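One thing to keep in mind with the `$` at the end: it only lets the pattern match when the two-word part is the very last thing in the string. If it can also show up in the middle of a row, leave the `$` out. A quick sketch of the difference (the sample text is made up for illustration):

```python
import re

# Made-up sample: the two-word sentence sits in the middle, not at the end.
text = "Hi. Please subscribe! Thanks a lot for watching."

anchored = re.findall(r"(\.\W\w+\W\w+[.?!])$", text)   # must end the string
unanchored = re.findall(r"(\.\W\w+\W\w+[.?!])", text)  # may appear anywhere

print(anchored)    # []
print(unanchored)  # ['. Please subscribe!']
```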
Regex is the best solution to problems like this. Please refer here!