Nested loop not iterating through entire list

I'm having a problem with my nested loop.

I have a list of titles and a list of keywords, and I want to create a new list containing all of the titles that contain any of the keywords in the list of keywords.

My code reads as follows:

Updated with test data

test_key_list = ['Argentinian Americans', 'Belizean Americans', 'Chicano Americans', 'Latino Americans', 'Latine', 'Bolivian Americans', 'Boricuas', 'Brazilian Americans', 'Chilean Americans', 'Colombian Americans', 'Costa Rican Americans', 'Costarisences', 'Cuban Americans', 'Dominican Americans', 'Ecuadorian Americans', 'Afro-Hispanics', 'Afro-Latinos', 'Guatemalan Americans', 'Hispanic Americans', 'Hispanos', 'Honduran Americans', 'Mejicano', 'Mexican Americans', 'Nicaraguan Americans', 'Panamanean Americans', 'Paraguayan Americans', 'Peruvian Americans', 'Puerto Rican Americans', 'Salvadoran Americans', 'Tejano', 'Uruguayan Americans', 'Venezuelan Americans', 'Argentinians', 'Belizeans', 'Chicanos', 'Latin Americans', 'Chicanas', 'Bolivians', 'Chicanx', 'Brazilians', 'Chileans', 'Colombians', 'Costa Ricans', 'Latinos', 'Cubans', 'Dominicans', 'Ecuadorians', 'Latinas', 'Afro-Latinas', 'Guatemalans', 'Hispanics', 'Latinx', 'Hondurans', 'Mexicano', 'Mexicans', 'Nicaraguans', 'Panamaneans', 'Paraguayans', 'Peruvians', 'Puerto Ricans', 'Salvadorans', 'Texano', 'Uruguayans', 'Venezuelans', 'Argentinos', 'Belizeanos', 'Bolivianos', 'Puerto Ricans', 'Brasileños', 'Chilenos', 'Colombianos', 'Costarricences', 'Costarricences', 'Cubanos', 'Dominicanos', 'Ecuatorianos', 'Guatemaltecos', 'Mexican Americans', 'Hondureños', 'Nicaragüenses', 'Panameños', 'Paraguayos', 'Peruanos', 'Puertorriqueños', 'Salvadoreños', 'Uruguayos', 'Venezolanos', 'latinx', 'latina', 'latino', 'latine', 'hispanic', 'hispanos']
test_title_list = ['The University library "Antonio Machado Ruiz" and its support for the teaching and educational process in the forestry career at the university of Granma, Cuba', 'The right to information and the right of information', 'Relational dimension of social capital in public libraries: A case study', 'Diagnosis of information literacy skills of professionals from the National Library of Cuba', 'Unravelling the basic concepts and intents of misbehavior in post-truth society', "Notes on Suzanne Briet and her “Qu'est-ce que la documentation?”", 'Critical spaces of social responsibility for Digital Humanities', 'Undergraduate research: Evaluation of its quality through theses', 'Information literacy, bastion in the post-truth era', 'Research on archival science, library science and information science in Colombia: 2007-2017', 'Approach to the Social Epistemology as theoretical project for Library Science', 'Web 2.0 in the Nordic libraries', 'Information system: Conceptual and methodological approach', 'Written information and the iconic representation of death in art', 'Professional perspectives in cloud computing environments', 'Information science in Uruguay (2013-2017): Research lines and academic output', 'Application and improvement of graph analysis in financial intelligence reports', 'Traditional, digital, on-demand and self-surfaced print publishing. Four models of book publishing requiring different evaluations', 'Cuban research in Information Sciences: The case of postgraduate studies (2008-2018)', 'Citation networks of Ibero-American journals of Library and Information Science in Scopus', 'Towards an Ibero-American informational thinking', 'Domain analysis on risks and climate in Web of Science', 'The information science in Brazil: Research mapping and institutional outlook', 'Knowledge audit oriented to the main process and human capital. 
A case study in the National Library of Cuba', 'Publication trends of the journal Cuba (1962-1969): A bibliometric analysis', 'Research on library and information science in Peru: A state of art', 'Information Science in Portugal in the first decades of the 21st century: A preliminary approach to an Ibero-American cartography', 'Impact of emerging Library and Information Science journals on the Web of Science (2017)', 'Procedure proposal to self-manage health knowledge from the web, through mobile devices and computers', "Cuban magazines of the 60's and the teaching of the history of Cuba: Course for school librarians", 'Altmetric study of the social repercussion of open access brazilian scientific journals', 'Libraries journal. Annals of research: Notes on the evolution of structure and content', 'Labor competencies for commercialization in a science, technology and innovation organization: Isotope center', 'Behavior of scientific production on digital marketing indicated in the scopus database, in the period 2016-2019', 'ASCUBI on its 35th anniversary: Development of Library Science', 'Quality evaluation parameters for electronic newsletters of the national medical library of Cuba', 'The research worker and the library.', "The graduate student's use of the subject catalog", 'Regional library centers tomorrow; a symposium', 'COLLEGE and university library statistics', 'Centralized cataloging in college and university libraries.', 'The reference survey as an administrative tool.', "What professional librarians expect from administrators: One librarian's view", 'Library skills, critical thinking, and the teacher-training curriculum', 'A quarter century of Advanced Data Processing in the University Library', 'Cooperation, collection management, and scientific journals', 'The configuration of reference in an electronic environment', 'Pay equity for women in academic libraries: An analysis of ARL salary surveys, 1976/77-1983/84', 'Automating bibliographic research: 
Identifying American fiction, 1901-1925', 'Special collections: Strategies for support in an era of limited resources','Immigrant rights advocacy as records literacy in Latinx communities','Access to the inter

test = []
for title in title_list:
    for key in key_list:
        if key in title:
            test.append(title)

I added this in a comment but figured I'd put it here as well:

I added a small sample set for everyone, but it seems to be working fine for that. However, I know from searching through the raw data that the keyword hispanic should show up over 30 times in the full data set. When I run the full loop, I get about 30 matches total for ALL of the keywords.

For example, I'm doing this with the list of author-assigned keywords I have as well (keywords_list). I'm comparing that to the other keywords (kw_list).

keyword_matches = []
count = 0

for item in keywords_list:
    if "Cuban" in item:
        keyword_matches.append(item)
        count+=1

print(count)

This prints 6.

keyword_matches = []
count = 0

for item in keywords_list:
    if "Mexican" in item:
        keyword_matches.append(item)
        count+=1

print(count)

This prints 2.

keyword_matches = []
count = 0

for item in keywords_list:
    if "Latino" in item:
        keyword_matches.append(item)
        count+=1

print(count)

This prints 15.

I can keep searching different keywords and I can get the count up further and further. That said, when I run the loop for the entire title list and all of the keywords, I get this:

keyword_matches = []
count = 0

for item in keywords_list:
    for key in kw_list:
        if key in item:
            keyword_matches.append(item)
            count+=1

print(count)

This prints 8. There are more than 8 articles that match these keywords.

What I find is that it's not iterating through all of the titles. It stops somewhere. I made another list to append the else values to, and that returned more items than I had in the initial titles list by a significant margin.

What should I do to make sure the list iterates through all of the titles in the list and compares each of the titles to each of the keywords in the keyword list?

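This code is fine if key_list is a list or tuple. It can break if key_list is a generator, because you can iterate over a generator only once: the inner loop consumes it while checking the first title, so every later title is compared against nothing and no error is raised. You can create key_real_list = list(key_list) before the loops and iterate over key_real_list instead (but note that building the list consumes the generator, so key_list itself will be exhausted afterwards).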
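To make the generator point concrete, here is a minimal sketch. It assumes the keywords might arrive as a generator, which the question does not confirm. The helper name find_matches and the three keywords are illustrative only, and the titles are taken from the sample data above.

```python
def find_matches(titles, keys):
    """Return every title that contains at least one key, once per title."""
    # Materialize the keys once, so the same keys are reused for every title
    # even if a one-shot generator was passed in.
    keys = list(keys)
    return [title for title in titles if any(key in title for key in keys)]


titles = [
    "Cuban research in Information Sciences: The case of postgraduate studies (2008-2018)",
    "Web 2.0 in the Nordic libraries",
    "Immigrant rights advocacy as records literacy in Latinx communities",
]

# Assumption for illustration: the keywords come from a generator.
key_gen = (k for k in ["Nordic", "Cuban", "Latinx"])

# The nested loop from the question: the generator is used up on the first title.
naive = []
for title in titles:
    for key in key_gen:
        if key in title:
            naive.append(title)

# Same data, but the keys are turned into a list inside the helper.
fixed = find_matches(titles, (k for k in ["Nordic", "Cuban", "Latinx"]))

print(len(naive), len(fixed))
```

Using any() also stops at the first matching keyword, so a title that contains several keywords is added only once. The in test is case-sensitive, so "hispanic" will not match "Hispanic" unless both sides are lowercased first.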
