
How can I iterate through a column of a pandas DataFrame and return value from another column?

I am writing this code because I have a folder containing several images whose names I need to change. For each file I need to extract the useful number from the file name, search for that number in an Excel file, read the corresponding value from a different column of the same row, and rename the file with that value. The code also needs to work out whether the useful number appears in column 1 or column 2 (is the value present in Nbr1 or in Nbr2?). My problem is that the "useful number" in the file name is a string, while the values in Excel are numbers. I have tried changing them both to strings or both to integers, but the columns of the DataFrame remain an object, so I can't iterate over them and find the value that I need.

Nbr1  Nbr2  Nbr3
456  9630  778899
123  8520  445566
999  7410  112233

As an example, if an image is named "999-3.jpeg" I want it to be renamed to "112233c.jpeg", "112233" being the value corresponding to "999" in another column of the Excel file.

Feel free to criticize my code, I know it's not too organized or clean, but what I care about the most is getting it to work. Thanks a lot for your help.

I have used pandas and os, and have converted both the file name and the DataFrame values between string and integer several times. I have also stored each column in its own variable to see if I could iterate over them, but it didn't work.

import os
import pandas as pd

os.chdir("C:\\Users\\Documents\\Rename")

changes = {
    "1":"a",
    "2":"b",
    "3":"c"
    }

def pic_rename(separator):
    table = pd.read_excel("List.xlsx")
    df = pd.DataFrame(table)
    column1 = df["Nbr1"]
    column2 = df["Nbr2"]
    name_list = []
    for f in os.listdir():
        file_name, file_ext = os.path.splitext(f)
        if file_ext == (".jpg" or ".jpeg"):
            useful_name, extra = file_name.split(separator)
            useful_name = int(useful_name.strip())
            name_list.append(useful_name)
            counter1 = 0
            counter2 = 0
            for x in name_list:
                if x in column1:
                    counter2 = 0
                    if counter1 == 0:
                        df = df.set_index("Nbr1", drop = True, append = False, inplace = False, verify_integrity=False)
                        result = df.loc[x, "Nbr3"]
                        extra = extra.strip()[-1]
                        final_name = str(result) + str(changes.get(extra))
                        os.rename(f, result + file_ext)
                        counter1 += 1
                    else:
                        result = df.loc[x, "Nbr3"]
                        extra = extra.strip()[-1]
                        final_name = str(result) + str(changes.get(extra))
                        os.rename(f, result + file_ext)
                        counter1 += 1
                elif x in column2:
                    counter1 = 0
                    if counter2 == 0:
                        df = df.set_index("Nbr2", drop = True, append = False, inplace = False, verify_integrity=False)
                        result = df.loc[x, "Nbr3"]
                        extra = extra.strip()[-1]
                        final_name = str(result) + str(changes.get(extra))
                        os.rename(f, result + file_ext)
                        counter2 += 1
                    else:
                        result = df.loc[x, "Nbr3"]
                        extra = extra.strip()[-1]
                        final_name = str(result) + str(changes.get(extra))
                        os.rename(f, result + file_ext)
                        counter2 += 1
                else:
                    print("This number isn't in Column 1 or 2")
        else:
            print("This file is not an image")


separator = input("Please insert the character that separates the useful name from the extra that you don't want")

pic_rename(separator)

The latest error I have been getting is "TypeError: 'int' object is not iterable", but I have gotten a couple of other errors too, mainly while trying to search for the file name's number in a column ("Nbr1") and get the matching "Nbr3" value as a result. I can be more specific about the errors in a couple of hours.

Edit: The issue I'm currently getting is that the code works and iterates, but it's not finding the value inside the Excel column (even though I know it's there), and it's skipping the if and just printing my else statement.

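I didn't entirely follow your code, but here are some observations.

You can convert the DataFrame values to strings by using astype. It returns a new DataFrame rather than modifying the original, so assign the result:

df = df.astype(str)

The columns will show up as 'object' dtype, but that's fine for assignment/comparison with strings.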
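As a small illustration, here is a sketch that builds the table from the question by hand (the real data would come from pd.read_excel("List.xlsx")) and converts it to strings. The names df_str, found_in_index and found_in_values are made up for this example. It also shows one thing that may explain the "value not found" behaviour described in the edit: the in operator on a Series tests the index labels, not the values.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Nbr1": [456, 123, 999],
    "Nbr2": [9630, 8520, 7410],
    "Nbr3": [778899, 445566, 112233],
})

# astype returns a new DataFrame, so the result has to be assigned.
df_str = df.astype(str)

# `in` on a Series looks at the index labels (0, 1, 2 here), not the values.
found_in_index = "999" in df_str["Nbr1"]

# Testing against .values (or using .isin) looks at the actual cell contents.
found_in_values = "999" in df_str["Nbr1"].values

print(found_in_index, found_in_values)
```

If that is indeed the cause, replacing "x in column1" with "x in column1.values" in the question's code would be the smallest change to try.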

To iterate over a dataframe you can use:

for index, row in df.iterrows():

That will return the index of each row together with the entire row of the DataFrame you're iterating over. Then, to get the value of a column in the current row, you can simply use:

value1 = row['Nbr1']
value2 = row['Nbr2']
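Putting these pieces together, a minimal sketch of the lookup might look like the following. The helper names find_nbr3 and new_file_name are hypothetical, the table is built by hand instead of being read from List.xlsx, and the function only computes the new name rather than calling os.rename, so it can be tried safely before touching any files.

```python
import os
import pandas as pd

# Suffix mapping taken from the question.
changes = {"1": "a", "2": "b", "3": "c"}

def find_nbr3(df, number):
    """Return (Nbr3 value, matching column) for the first row whose
    Nbr1 or Nbr2 equals `number`, or None if no row matches."""
    for index, row in df.iterrows():
        if row["Nbr1"] == number:
            return row["Nbr3"], "Nbr1"
        if row["Nbr2"] == number:
            return row["Nbr3"], "Nbr2"
    return None

def new_file_name(df, old_name, separator="-"):
    """Build the new file name, or return None if the file is not a
    .jpg/.jpeg image or its number is not in Nbr1 or Nbr2."""
    stem, ext = os.path.splitext(old_name)
    if ext.lower() not in (".jpg", ".jpeg"):
        return None
    useful, _, extra = stem.partition(separator)
    match = find_nbr3(df, useful.strip())
    if match is None:
        return None
    nbr3, _column = match
    # Last character of the extra part selects the suffix; unknown ones add nothing.
    suffix = changes.get(extra.strip()[-1:], "")
    return f"{nbr3}{suffix}{ext}"

# Sample data from the question, converted to strings so that it compares
# equal to the pieces cut out of the file names.
df = pd.DataFrame({
    "Nbr1": [456, 123, 999],
    "Nbr2": [9630, 8520, 7410],
    "Nbr3": [778899, 445566, 112233],
}).astype(str)

print(new_file_name(df, "999-3.jpeg"))  # 112233c.jpeg
```

Note that the extension check uses a tuple membership test: in the question's code, (".jpg" or ".jpeg") evaluates to just ".jpg", so ".jpeg" files are never matched. Once the computed names look right, the result can be passed to os.rename(old_name, new_name) inside the loop over os.listdir().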
