
Python nested for loops do not produce the expected results

I am new to Python. I am trying to run the code below, but the results are not as expected:

c = [0,1,2,3,4]
clus = [c0,c1,c2,c3,c4] #each element in the list is a dataframe
for i in c:
    Movie = data.Title[data.labels == i]
    for j in clus:
        vect = CountVectorizer(stop_words='english',max_features=5)
        cv_fit = vect.fit_transform(j).toarray()
        key_features = vect.get_feature_names()
    print("Cluster",i,"details:")
    print('-'*80)
    print("Key Features:", key_features)
    print("Movies in the cluster:")
    print(Movie)
    print("Movies in the cluster:",i)
    print(' ')
    print(' ')  

Expected output:


Cluster 0 details:
--------------------
Key features: ['water', 'on the', 'her', 'while', 'she']
Movies in this cluster:
One Flew Over the Cuckoo's Nest, The Sound of Music, Star Wars, Chinatown, The Bridge on the River Kwai, Apocalypse Now, Jaws, The Good, the Bad and the Ugly, Butch Cassidy and the Sundance Kid
========================================
Cluster 1 details:
--------------------
Key features: ['her', 'she', 'about', 'to her', 'that she']
Movies in this cluster:
Gone with the Wind, The Wizard of Oz, Titanic, Psycho, Sunset Blvd., Vertigo
========================================

and so on .... 

But my current output is:

Cluster 0 details:
--------------------------------------------------------------------------------
Key Features: ['water', 'on the', 'her', 'while', 'she']
Movies in the cluster:
0     One Flew Over the Cuckoo's Nest
1     The Sound of Music
3     Star Wars
4     Chinatown
6    The Bridge on the River Kwai
93    Apocalypse Now
94    Jaws
95    The Good
97    the Bad and the Ugly
99    Butch Cassidy and the Sundance Kid
Name: Title, Length: 67, dtype: object
 

Cluster 1 details:
--------------------------------------------------------------------------------
Key Features: ['water', 'on the', 'her', 'while', 'she']
Movies in the cluster:
7     Gone with the Wind
56    The Wizard of Oz
85    Titanic
89    Psycho
92    Sunset Blvd
100   Vertigo
Name: Title, dtype: object
 
and so on ...

The key features remain the same for all clusters. What should I adjust in my code so that the key features also change from cluster to cluster?

data.head(2) looks like the below:

     | Title         | Synopsis                           | Labels |
     ---------------------------------------------------------------
0    | The Godfather | Guests are gathered last summer... | 0      |
1    | Raging Bull   | The film opens in 1964 ....        | 1      |
    

CountVectorizer is a text feature extractor from scikit-learn that we use in natural language processing (NLP):

from sklearn.feature_extraction.text import CountVectorizer

I need the cluster number (0, 1, 2, 3, 4) followed by the key features of each cluster. Each cluster is a dataframe that is a subset of "data". c0 was taken from the rows of data where the label is 0, and c1, c2, c3 and c4 were built the same way.

Each cluster should have its own key features, since the input is different for each cluster, but my code prints the c0 key features for all the clusters, which is incorrect.

The 11th line of the code seems to have a problem: it prints the key features it got for cluster 0 instead of the result for cluster 1.

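There is no need for the nested loop. The inner loop refits the vectorizer on every cluster for each value of i, so by the time the print statements run, key_features holds whatever the last pass of the inner loop produced, regardless of i. Index the cluster list with i instead: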

from sklearn.feature_extraction.text import CountVectorizer

c = [0,1,2,3,4]
clus = [c0,c1,c2,c3,c4] #each element in the list is a dataframe
for i in c:
    Movie = data.Title[data.labels == i]
    cluster = clus[i]  # the dataframe that belongs to cluster i
    # fit a fresh vectorizer on this cluster only
    vect = CountVectorizer(stop_words='english',max_features=5)
    cv_fit = vect.fit_transform(cluster).toarray()
    # scikit-learn 1.2 and later: use vect.get_feature_names_out() instead
    key_features = vect.get_feature_names()
    print("Cluster",i,"details:")
    print('-'*80)
    print("Key Features:", key_features)
    print("Movies in the cluster:")
    print(Movie)
    print("Movies in the cluster:",i)
    print(' ')
    print(' ')
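
For anyone who wants to try the idea end to end, here is a minimal, self-contained sketch. It is not the asker's actual setup: the toy `data` frame, the `describe_clusters` helper and the choice of `max_features=2` are all made up for illustration. It also makes two assumptions beyond the code above. First, it fits the vectorizer on the `Synopsis` column rather than on the whole dataframe, because `CountVectorizer` expects an iterable of strings and iterating over a dataframe yields its column labels. Second, it reads the key features from the fitted `vocabulary_` attribute, which avoids the `get_feature_names` / `get_feature_names_out` difference between scikit-learn versions. Joining the titles with `", ".join(...)` gives the comma-separated list from the expected output instead of the printed Series.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data standing in for the question's `data` frame.
data = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Synopsis": [
        "shark ocean shark boat",
        "ocean shark storm",
        "desert camel desert sand",
        "camel desert sun",
    ],
    "labels": [0, 0, 1, 1],
})


def describe_clusters(frame, max_features=5):
    """Return {label: (key_features, titles)}, fitting one vectorizer per cluster."""
    summary = {}
    # groupby yields each label together with the rows that carry it,
    # so no separate list of per-cluster dataframes is needed.
    for label, cluster in frame.groupby("labels"):
        vect = CountVectorizer(stop_words="english", max_features=max_features)
        vect.fit(cluster["Synopsis"])  # a column of strings, not the dataframe
        key_features = sorted(vect.vocabulary_)  # terms kept after max_features
        titles = ", ".join(cluster["Title"])
        summary[int(label)] = (key_features, titles)
    return summary


summary = describe_clusters(data, max_features=2)
for label, (key_features, titles) in summary.items():
    print("Cluster", label, "details:")
    print("-" * 20)
    print("Key features:", key_features)
    print("Movies in this cluster:")
    print(titles)
    print("=" * 40)
```

Because a new vectorizer is created and fitted inside the loop body, each cluster's key features depend only on that cluster's text.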
