Trying to filter a list into another list based on the value of an index

I am trying to create an app that needs to use the IMDb dataset. IMDb publishes datasets that cover every title on IMDb, including TV shows and video games. For example:

"tt9620292 movie Promising Young Woman Promising Young Woman 0 2020 \N 113 Crime,Drama,Thriller

tt1568322 videoGame Batman: Arkham City Batman: Arkham City 0 2011 \N \N Action,Adventure,Crime

tt0141842 tvSeries The Sopranos The Sopranos 0 1999 2007 55 Crime,Drama"

I have read in the IMDb dataset and split each line on tabs, which gives me a list of lists where index 1 of each inner list holds the title's type (movie, tvEpisode, short, videoGame, etc.). The ultimate goal is to read this IMDb dataset text file and write another file containing only the titles whose type is movie, tvMovie, or short. For now, I am printing test lists to check whether I can correctly filter out everything except movie, tvMovie, and short titles.

Now I am stuck: I can print index 1 of each list, which shows only the type of each title, but when I try to use the same list and index to populate another list with only movie, tvMovie, and short titles, it fills the list with every title regardless.

IMDbFile = open("titleTest", "r")
lineList = IMDbFile.readlines()

listOfList = []
for x in lineList:
    listOfList.append(x.split("\t"))

movieOnly = []
for x in listOfList:
    if x[1] == "short" or "movie" or "tvMovie":
        movieOnly.append(x)
    #print(x[1])


for n in movieOnly:
    print(n)
IMDbFile.close()

When printing the "movieOnly" list, I get every title.

I think your problem lies in the way that you wrote the if statement. Instead of:

movieOnly = []
for x in listOfList:
    if x[1] == "short" or "movie" or "tvMovie":
        movieOnly.append(x)

try:

movieOnly = []
for x in listOfList:
    if x[1] in ["short","movie","tvMovie"]:
        movieOnly.append(x)
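As a rough sketch of how the same membership test could serve the stated end goal (writing only the wanted titles to another file), the filter can be applied line by line instead of building intermediate lists. The names `WANTED_TYPES`, `filter_titles`, and the output file name `moviesOnly.tsv` are assumptions made for illustration, and the sample rows assume the fields are tab-separated as described in the question.

```python
import io

# Title types to keep; a set gives fast membership tests.
WANTED_TYPES = {"short", "movie", "tvMovie"}

def filter_titles(lines, wanted=WANTED_TYPES):
    """Yield the lines whose second tab-separated field is in `wanted`."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines that have no second field.
        if len(fields) > 1 and fields[1] in wanted:
            yield line

# In-memory stand-in for the dataset file, using the rows from the question.
sample = io.StringIO(
    "tt9620292\tmovie\tPromising Young Woman\tPromising Young Woman\t0\t2020\t\\N\t113\tCrime,Drama,Thriller\n"
    "tt1568322\tvideoGame\tBatman: Arkham City\tBatman: Arkham City\t0\t2011\t\\N\t\\N\tAction,Adventure,Crime\n"
    "tt0141842\ttvSeries\tThe Sopranos\tThe Sopranos\t0\t1999\t2007\t55\tCrime,Drama\n"
)
kept = list(filter_titles(sample))
print(len(kept))  # 1 (only the "movie" row survives)

# With real files (the output name "moviesOnly.tsv" is hypothetical):
# with open("titleTest", "r", encoding="utf-8") as src, \
#      open("moviesOnly.tsv", "w", encoding="utf-8") as dst:
#     dst.writelines(filter_titles(src))
```

Because `filter_titles` is a generator, the file version never holds the whole dataset in memory, which may matter for the full IMDb dump.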

The reason the original condition fails is that this:

if 'a' == 'b' or 'c' or 'd':
    print('here')

will print 'here' because

or 'c'

evaluates whether 'c' is truthy, not whether 'a' == 'c'. A non-empty string is always truthy, so the whole condition is always true.
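To make the evaluation order concrete, here is a small sketch that can be pasted into an interpreter. The variable names `a`, `result`, and `explicit` are only illustrative. It shows that the chained `or` returns the first truthy operand rather than a comparison result, and that both the fully spelled-out comparison and the membership test behave as intended.

```python
a = "a"

# Python parses this as: (a == "b") or "c" or "d"
result = a == "b" or "c" or "d"
print(result)        # c  (the first truthy operand, not a boolean)
print(bool(result))  # True, so an `if` on this always runs

# Spelling out each comparison gives the intended meaning.
explicit = a == "b" or a == "c" or a == "d"
print(explicit)  # False

# The membership test is the shorter equivalent.
print(a in ("b", "c", "d"))  # False
```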
