
Python: Looking for specific word or values in an excel file

So I was given two Excel files that look like this example:

movieId   title                 genres
1       Toy Story (1995)        Adventure|Animation|Children|Comedy|Fantasy
2       Jumanji   (1995)        Adventure|Children|Fantasy
3       Grumpier Old Men (1995) Comedy|Romance
4       Waiting to Exhale (1995)    Comedy|Drama|Romance
5       Father of the Bride Part II (1995)  Comedy

What I'm trying to make is this: when someone types in a title, the code finds the movieId and the movie name. The only problem is that I have no idea where to start. I'm a noob coder and I've been trying my best to learn, so if you guys can help me and point me in the right direction, that would be amazing.

Thank you

Okay, since you're a noob coder, I'll explain it to you in a simple way that doesn't actually require any libraries. Also, I'm going to assume you are using movie title and movie name interchangeably.

First, you can convert an Excel file into a .csv, which stands for comma-separated values (in Excel, use Save As and select CSV; you can do it via Google Sheets too). What is a csv file? It's like the Excel file, except every row is on a line by itself and the columns are separated by commas. So the first three lines in your csv would be:

movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji   (1995),Adventure|Children|Fantasy

Now, the .csv can be read as a regular text file. You should read it in line by line. The Python docs on reading and writing files cover that. It's pretty straightforward.
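Reading the file line by line could look something like the sketch below. The helper name read_lines and the file name movies.csv are just placeholders I made up, so use whatever name you saved your csv under.

```python
def read_lines(path):
    """Return the lines of a text file as a list of strings."""
    with open(path, encoding="utf-8") as f:
        # Strip the trailing newline from every line.
        return [line.rstrip("\n") for line in f]

# Hypothetical file name; replace it with your own:
# lines = read_lines("movies.csv")
```

The with statement closes the file for you once the lines have been read.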

Now that you have every line as a string, you can split each one with the str.split() method. We need to split using the comma as a delimiter since it's a comma-separated file. So far our code is something like this (I assume you read the different lines of the csv into the list called lines):

lines = [...] # a list of strings which are the different lines of the csv
name_im_looking_for = "movie you like" # the movie you're looking for
for line in lines:
    columns = line.split(',')
    movie_id = columns[0]  # first column: movieId
    name = columns[1]      # second column: title
    if name.find(name_im_looking_for) != -1:
        # this means the text you're looking for is within the 'name' col
        print("id is", movie_id, "and full name is", name)

This is just a crude way to do it, but if you're really new to programming it should help you get on your way! If you have any questions, feel free to ask (and if you're actually good and you just want to know how to use openpyxl, please specify so in your question).
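One thing the plain split(',') approach can't handle is a title that itself contains a comma, because the title would then be cut in two. If that turns out to matter, Python's built-in csv module deals with quoted fields for you. Here is a rough sketch, assuming the header row uses the column names movieId and title as in the example; the function name find_movies is made up for illustration, and I've made the match ignore upper/lower case as an extra.

```python
import csv

def find_movies(path, search):
    """Return (movieId, title) pairs whose title contains `search`, ignoring case."""
    matches = []
    # newline="" is what the csv module docs recommend when opening csv files.
    with open(path, newline="", encoding="utf-8") as f:
        # DictReader uses the first row as the column names.
        for row in csv.DictReader(f):
            if search.lower() in row["title"].lower():
                matches.append((row["movieId"], row["title"]))
    return matches
```

You would call it as find_movies("movies.csv", "jumanji"), where movies.csv is again a placeholder file name. Note that the IDs come back as strings, since everything in a csv is text.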

Here's how you'd do it in openpyxl, since you included the openpyxl tag in your question:

import openpyxl as xl

workbook = xl.load_workbook(filename="test.xlsx")

title_column_name = "title"

# Get the active worksheet
ws = workbook.active

# The string we'll search for. You could prompt the user to provide
# this using python2's raw_input, or python3's input function.
searchstring = "Grumpier"

# min_row=2 means we'll skip the first row (the header row).
# (ws.rows is a generator in current openpyxl, so it can't be sliced.)
for row in ws.iter_rows(min_row=2):
    # row[1] is the title column. str.find(sub) returns -1
    # if the value was not found, or the index in the string if
    # the value was found. Empty cells (None) are skipped.
    title = row[1].value
    if title is not None and str(title).find(searchstring) != -1:
        print("Found a matching row! MovieId={0}, Title={1}".format(row[0].value, title))

Output:

Found a matching row! MovieId=3, Title=Grumpier Old Men (1995)
