
How to loop within a loop in Python?

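I need to count how many times two groups of strings occur in a sentence. However, whenever a string from group A is preceded by a negation, I want its count to be added to group B instead.

To this end, I have written code that works well. Let me first show you the DataFrame and the groups of strings: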

import re
import pandas as pd

# Dataframe
df = pd.DataFrame({'X': ['Ciao, I would like to count the number of occurrences in this text considering negations that can change the meaning of the sentence',
                    "Hello, not number of negations, in this case we need to take care of the negation.",
                    "Hello world, don't number is another case in which where we need to consider negations."]})

# Group of words to look into text
a = pd.DataFrame(['number','ciao','text','care'], columns = ['A'])
d = pd.DataFrame(['need'], columns = ['D'])


And this is the code that does the job:

res0 = []
res1 = []

for i in range(len(df)):

    if df['X'][i].find('not') < df['X'][i].find('number') and df['X'][i].find('not') > 0 and abs(
            df['X'][i].find('not') - df['X'][i].find('number')) < 15:
        pattern0 = '|'.join(a[a.A != 'number'].A)
        text = df['X'][i]
        count0 = len(re.findall(pattern0, text))
        res0.append(count0)

        # DataFrame.append was removed in pandas 2.0, so build the word list directly
        pattern1 = '|'.join(list(d.D) + ['number'])
        count1 = len(re.findall(pattern1, text))
        res1.append(count1)

    else:
        pattern2 = '|'.join(a.A)
        text = df['X'][i]
        count2 = len(re.findall(pattern2, text))
        res0.append(count2)

        pattern3 = '|'.join(d.D)
        count3 = len(re.findall(pattern3, text))
        res1.append(count3)

pd.Series(res0)  # [2,1,1]
pd.Series(res1)  # [0,2,1]

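So what is the problem? The problem is that I only consider one negation ('not') and one word of a ('number'). What I would like to do is extend the code to loop over every negation in neg (see below) and every element of a. However, when I try to do so, I get wrong results. Find my attempt below: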

neg = ['not','dont',"wasnt"]

res0=[]
res1=[]

for i in range(len(df)):
    for j in range(len(neg)):
        for k in range(len(a)):
            if df['X'][i].find(neg[j]) < df['X'][i].find(a.A[k]) and df['X'][i].find(neg[j]) > 0 and abs(df['X'][i].find(neg[j]) - df['X'][i].find(a.A[k])) < 15:
                pattern0 = '|'.join(a[a.A != a.A[k]].A)
                text = df['X'][i]
                count0 = len(re.findall(pattern0, text))
                res0.append(count0)
    
                # DataFrame.append was removed in pandas 2.0, so build the word list directly
                pattern1 = '|'.join(list(d.D) + [a.A[k]])
                count1 = len(re.findall(pattern1, text))
                res1.append(count1)
        
            else:
                pattern2 = '|'.join(a.A)
                text = df['X'][i]
                count2 = len(re.findall(pattern2, text))
                res0.append(count2)
    
                pattern3 = '|'.join(d.D)
                count3 = len(re.findall(pattern3, text))
                res1.append(count3)


pd.Series(res0)    # nonsense
pd.Series(res1)    # nonsense

# results should remain a 3x1 vector

What am I doing wrong?

Thanks for your help!

You can use a positive lookbehind to check for the negation:

pattern = r"(?:(?<=not)|(?<=don't)|(?<=wasn't))\s+(?:number|other|words)"

df['neg_count'] = df['X'].str.findall(pattern).str.len()
print(df)

# Output
                                                   X  neg_count
0  Ciao, I would like to count the number of occu...          0
1  Hello, not number of negations, in this case w...          1
2  Hello world, don't number is another case in w...          1
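
The `neg_count` column only says how many negated words a row contains. To get the two counts the question asks for, one possible extension is to subtract the negated matches from group A and add them to group D. The following is a minimal sketch, not a definitive implementation. It assumes a negation counts only when it directly precedes the word (instead of the question's 15-character window), that matching is on whole words and case-sensitive, and that the negation list carries apostrophes as in the sample text. It uses plain lists and `re`; `count_groups` is a hypothetical helper name.

```python
import re

texts = [
    "Ciao, I would like to count the number of occurrences in this text "
    "considering negations that can change the meaning of the sentence",
    "Hello, not number of negations, in this case we need to take care of the negation.",
    "Hello world, don't number is another case in which where we need to consider negations.",
]

group_a = ['number', 'ciao', 'text', 'care']
group_d = ['need']
neg = ['not', "don't", "wasn't"]  # assumption: apostrophes as in the texts

a_alt = '|'.join(map(re.escape, group_a))
d_alt = '|'.join(map(re.escape, group_d))
neg_alt = '|'.join(map(re.escape, neg))

# A negation, whitespace, then a group A word (whole words only).
negated_a = re.compile(rf"\b(?:{neg_alt})\s+(?:{a_alt})\b")
any_a = re.compile(rf"\b(?:{a_alt})\b")
any_d = re.compile(rf"\b(?:{d_alt})\b")

def count_groups(text):
    """Return (count_A, count_D), moving negated group A hits to group D."""
    n_negated = len(negated_a.findall(text))
    count_a = len(any_a.findall(text)) - n_negated
    count_d = len(any_d.findall(text)) + n_negated
    return count_a, count_d

results = [count_groups(t) for t in texts]
print(results)  # [(2, 0), (1, 2), (0, 2)]
```

The third row differs from the question's `[2,1,1]` / `[0,2,1]` because "don't" is now treated as a negation as well. With the DataFrame from the question, the same function could be applied to each value of `df['X']`.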

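
As for why the nested-loop attempt in the question does not return three values: it appends to `res0` and `res1` inside the innermost loop, so every (row, negation, word) combination adds an entry, giving 3 × 3 × 4 = 36 values per list. In addition, `'dont'` and `"wasnt"` lack the apostrophe used in the text. A loop-based sketch that keeps the question's logic could first collect the negated words for a row and then append once per row. It assumes the same 15-character window and, like `str.find` in the question, looks only at the first occurrence of each negation and word. Unlike the question's `> 0` check, it also accepts a negation at position 0. `MAX_GAP` and `negated` are names introduced here for illustration.

```python
import re

texts = [
    "Ciao, I would like to count the number of occurrences in this text "
    "considering negations that can change the meaning of the sentence",
    "Hello, not number of negations, in this case we need to take care of the negation.",
    "Hello world, don't number is another case in which where we need to consider negations.",
]

group_a = ['number', 'ciao', 'text', 'care']
group_d = ['need']
neg = ['not', "don't", "wasn't"]  # assumption: apostrophes as in the texts
MAX_GAP = 15  # same window as in the question

res0, res1 = [], []
for text in texts:
    # Step 1: find which group A words are negated in this row.
    negated = set()
    for n in neg:
        n_pos = text.find(n)
        if n_pos < 0:
            continue  # this negation does not occur in the row
        for word in group_a:
            w_pos = text.find(word)
            if n_pos < w_pos and w_pos - n_pos < MAX_GAP:
                negated.add(word)

    # Step 2: count once per row, with negated words moved to group D.
    words0 = [w for w in group_a if w not in negated]
    words1 = group_d + sorted(negated)
    res0.append(len(re.findall('|'.join(words0), text)) if words0 else 0)
    res1.append(len(re.findall('|'.join(words1), text)) if words1 else 0)

print(res0)  # [2, 1, 0]
print(res1)  # [0, 2, 2]
```

The key design point is that the `append` calls sit in the outer loop body, so each list ends up with exactly one entry per row.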
