Python: Looking for specific word or values in an excel file
So I was given two Excel files that look like this example:
movieId  title                               genres
1        Toy Story (1995)                    Adventure|Animation|Children|Comedy|Fantasy
2        Jumanji (1995)                      Adventure|Children|Fantasy
3        Grumpier Old Men (1995)             Comedy|Romance
4        Waiting to Exhale (1995)            Comedy|Drama|Romance
5        Father of the Bride Part II (1995)  Comedy
What I'm trying to build: when someone types in a title, the code finds the movieId and the movie name. The only problem is that I have no idea where to start. I'm a noob coder and I've been trying my best to learn, so if you guys can help and point me in the right direction, that would be amazing.
Thank you
Okay, since you're a noob coder, I'll explain it in a simple way that doesn't actually require any libraries. I'm also going to assume you are using movie title and movie name interchangeably.
First, you can transform an Excel file into a .csv file, which stands for comma-separated values (in Excel, just use Save As and select CSV; you can do it via Google Sheets too). What is a csv file? It's like the Excel file, except every row is on a line by itself and the columns are separated by commas. So the first three lines in your csv would be:
movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
Now, the .csv can be read as a regular file. You should read it in line by line. The Python documentation on reading files covers this, and it's pretty straightforward.
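As a minimal sketch of that step, the helper below reads a text file into a list of strings, one per line. The name read_lines and the file name movies.csv are made up for illustration, and UTF-8 is only an assumption about how the file was saved.

```python
def read_lines(path):
    """Return the lines of a text file as a list of strings, without trailing newlines."""
    lines = []
    # encoding="utf-8" is an assumption; use whatever encoding the file was saved with
    with open(path, encoding="utf-8") as f:
        for line in f:
            lines.append(line.rstrip("\n"))
    return lines

# Example call ("movies.csv" is a hypothetical file name):
# lines = read_lines("movies.csv")
```

The with block closes the file automatically once the loop finishes, even if an error occurs while reading.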
Now that you have every line as a string, you can split each one with the str.split() method. We need to split using the comma as the delimiter, since it's a comma-separated file. So far our code is something like this (I assume you read the lines of the csv into the lines list):
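# a list of strings which are the different lines of the csv
# (here: the first three lines of the example csv)
lines = [
    "movieId,title,genres",
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy",
    "2,Jumanji (1995),Adventure|Children|Fantasy",
]
name_im_looking_for = "Toy Story"  # the movie you're looking for

for line in lines[1:]:  # lines[1:] skips the header row
    columns = line.split(',')
    movie_id = columns[0]
    name = columns[1]
    if name.find(name_im_looking_for) != -1:
        # this means the name you're looking for is within the 'name' column
        print("id is", movie_id, "and full name is", name)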
This is just a crude way to do it, but if you're really new to programming it should help you get on your way! If you have any questions, feel free to ask (and if you're actually experienced and just want to know how to use openpyxl, please say so in your question).
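One caveat with splitting on commas by hand: a title that itself contains a comma is normally wrapped in quotes in a csv, and a plain split(',') would cut it in two. A possible alternative, not part of the answer above, is Python's standard csv module, which handles the quoting. The sketch below also ignores case when matching. The function name find_movies and the second sample row are made up for illustration.

```python
import csv
import io

def find_movies(csv_file, query):
    """Return (movieId, title) pairs whose title contains query, ignoring case."""
    matches = []
    reader = csv.DictReader(csv_file)  # uses the header row as column names
    for row in reader:
        if query.lower() in row["title"].lower():
            matches.append((row["movieId"], row["title"]))
    return matches

# In-memory sample data; the last row is an illustrative quoted title with a comma
sample = io.StringIO(
    'movieId,title,genres\n'
    '1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n'
    '11,"American President, The (1995)",Comedy|Drama|Romance\n'
)
print(find_movies(sample, "american"))
# prints [('11', 'American President, The (1995)')]
```

For a real file you would pass an open file object instead of the StringIO, for example one opened with open("movies.csv", newline="", encoding="utf-8"), where the file name and encoding are again assumptions.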
Here's how you'd do it in openpyxl, since you included the openpyxl tag in your question:
import openpyxl as xl

workbook = xl.load_workbook(filename="test.xlsx")
title_column_name = "title"

# Get the active worksheet
ws = workbook.active

# The string we'll search for. You could prompt the user to provide
# this using python2's raw_input, or python3's input function.
searchstring = "Grumpier"

# min_row=2 means we'll skip the first row (the header row).
# ws.rows is a generator in current openpyxl releases, so ws.rows[1:] cannot be sliced.
for row in ws.iter_rows(min_row=2):
    # row[1] is the title column. str.find(sub) returns -1
    # if the value was not found, or the index in the string if
    # the value was found.
    title = row[1].value
    # Empty cells have the value None, so skip them
    if title is not None and str(title).find(searchstring) != -1:
        print("Found a matching row! MovieId={0}, Title={1}".format(row[0].value, title))
Output:
Found a matching row! MovieId=3, Title=Grumpier Old Men (1995)