
How to compare the strings stored in two lists in Python?

The issue is with the loop.

I can't iterate over the dgu list and check it against the values from solu.

It prints the output up to print(solu).

The loop used later hangs and stops there with no output, and I'm clueless here.

Could someone explain how to compare strings when they exist in two different files from different sources?

from pandas import *
import pandas as pd
import csv
import re
import deepdiff
from pprint import pprint
import xlrd
from difflib import SequenceMatcher
import xlsxwriter
import tocamelcase
from spellchecker import SpellChecker
import numpy as np

xlsx = ExcelFile('WrongSpelling.xlsx')
df = xlsx.parse(xlsx.sheet_names[0])

dg = pd.read_csv("pfm.csv", usecols = ['Place Id','Name','Category'])
pla = dg['Place Id'].values.tolist()
nam = dg['Name'].values.tolist()
cat = dg['Category'].values.tolist()

print()
df2 = pd.DataFrame(df, columns = ['Spelling'])
bat= df2['Spelling'].values.tolist()

namo = [x.lower() for x in nam]
bato = [x.lower() for x in bat]

sol = set(namo) & set(bato)
solu = list(sol)
dgu= dg.values.tolist()
nam=list(nam)

print(solu)

print()

print("The Count of Matches with the incorrect data is" ,len(solu))

print(dg[:5])


print()

while i < len(dgu):
    while i < len(solu):
        # a = solu[i]
        # b = dgu[i]
        # c = nam[i]
        if solu[i] in dgu[i]:
            print(dgu[i])
        else:
            pass
    i+=1

Your inner while loop uses the variable i in its condition, i < len(solu), but you never increment i inside that loop. Once the inner loop is entered, the condition can never evaluate to False, so it loops forever.
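If you wanted to keep the while loops, a minimal sketch of the fix might look like the following. It assumes small made-up lists in place of the real dgu and solu (the sample rows are hypothetical). Each loop gets its own counter, and the inner counter is reset on every pass of the outer loop:

```python
# Hypothetical stand-in data; the real dgu and solu come from the files in the question.
dgu = [[1, "cafe roma", "Food"], [2, "city park", "Outdoors"], [3, "book nook", "Retail"]]
solu = ["city park", "cafe roma"]

matches = []
i = 0
while i < len(dgu):
    j = 0  # reset the inner counter for every row of dgu
    while j < len(solu):
        if solu[j] in dgu[i]:
            matches.append(dgu[i])
        j += 1  # without this increment the inner loop never ends
    i += 1

print(matches)
```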

As @offeltoffel mentioned, a for loop seems to fit your needs better here. I can't run your code without a reproducible example, but here is what the for loops could look like:

for i in range(len(dgu)):
    for j in range(len(solu)):
        if solu[j] in dgu[i]:
            print(dgu[i])
        # no else/pass needed here, as it serves no purpose
    # no need to increment i or j manually: each for loop steps through the range built from the length of dgu or solu
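One more thing to watch for, judging from the code in the question: solu holds lowercased names, while the rows in dgu keep their original casing, so solu[j] in dgu[i] can only match names that were already all lowercase. A minimal sketch that lowercases the name before comparing and uses a set for the lookup is shown here. The sample rows, the NAME_INDEX constant, and the position of the Name column are assumptions for illustration, not taken from your files:

```python
# Hypothetical sample rows standing in for dg.values.tolist();
# the Name column is assumed to sit at index 1.
dgu = [[101, "Cafe Roma", "Food"], [102, "City Park", "Outdoors"], [103, "Book Nook", "Retail"]]
# Hypothetical stand-in for the 'Spelling' column of the spreadsheet.
bat = ["city park", "CAFE ROMA", "Bok Nook"]

NAME_INDEX = 1  # assumption: position of the Name column in each row

# Lowercase once and keep a set, so each membership test is a hash lookup.
wanted = {s.lower() for s in bat}

# Iterate over the rows directly; no index counters are needed.
matched_rows = [row for row in dgu if str(row[NAME_INDEX]).lower() in wanted]

for row in matched_rows:
    print(row)
```

This keeps the original rows intact for printing while comparing on a normalized copy of the name, and it avoids the nested loop entirely.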
