Python "in" operator not finding substring in text

Question

I am trying to check whether any substring from a list of substrings occurs in a given string. To do so, I loop over the items of the list and check whether each one occurs in the string using Python's in operator. I get False even though I am sure one of the substrings is present in the string. I have tried everything I could think of to normalize the text and the substrings: replacing all " " with "", casefold(), strip(), and even unidecode. Still, the substring is not found.

My code:

from unidecode import unidecode

example_string = '''available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/nanotoday
REVIEW
Synthesis, properties and applications of Janus
nanoparticles
Marco Lattuada a, T. Alan Hatton b,''' # as extracted from PDF file using fitz's `doc.load_page(0)` and then `.get_text()` 

list_of_titles = ["Synthesis, properties and applications of Janus nanoparticles", "another_title", "another_title"]

example_string = example_string.casefold()
example_string = example_string.replace(" ", "")

for title in list_of_titles:
    title = title.replace(" ", "")
    title = title.casefold()
    if unidecode(title) in unidecode(example_string):
        print("Yes")

# Outputs nothing

Answer 1

Try replacing the newlines with spaces before comparing, and keep the spaces in both strings:

example_string = example_string.replace("\n", " ")
example_string = example_string.casefold()

for title in list_of_titles:
    if title.casefold() in example_string:  # casefold the title as well
        print("Yes")

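The \n characters are the problem. In the extracted text the title is split across two lines ("Janus\nnanoparticles"), so after you remove only the spaces the text contains "janus\nnanoparticles", while the title becomes "janusnanoparticles". The two never match because the newline is still there.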
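
If the extracted text may also contain tabs or runs of several spaces, a slightly more general variant is to collapse all whitespace on both sides before comparing. This is only a minimal sketch: normalize and find_titles are hypothetical helper names introduced here, not part of fitz or unidecode, and the sample text is the one from the question.

```python
def normalize(text):
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines), so joining with " " collapses them all.
    return " ".join(text.split()).casefold()


def find_titles(page_text, titles):
    # Normalize the page once, then test each normalized title against it.
    haystack = normalize(page_text)
    return [title for title in titles if normalize(title) in haystack]


page_text = '''available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/nanotoday
REVIEW
Synthesis, properties and applications of Janus
nanoparticles
Marco Lattuada a, T. Alan Hatton b,'''

titles = [
    "Synthesis, properties and applications of Janus nanoparticles",
    "another_title",
]

for title in find_titles(page_text, titles):
    print("Yes:", title)
```

Keeping a single space between words, rather than deleting all whitespace, avoids accidental matches across word boundaries.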

solution1
1 2022-06-06 18:01:27