
Adding a running number across 3 CSV files in Python

I have 3 CSV files, and I would like to add a column to each one containing a running number that depends on the number of rows in the files before it. For example, file 1 has 400 rows, file 2 has 240, and file 3 has 100. The added column for file 1 should run from 1 to 400, the added column for file 2 from 401 to 640, and the added column for file 3 from 641 to 740.

What I wrote is this:

file1 = str(path) + "file1"
file2 = str(path) + "file2"
file3 = str(path) + "file3"
files = [file1, file2, file3]


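class File_Editor():
    def line_len(self):
        # Loop over all three files (range(0,2) only covered the first two)
        for file_name in files:
            # The with block closes the file once the lines are counted
            with open(file_name + ".csv") as csv_file:
                numline = len(csv_file.readlines())
            print (numline)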

I am stuck on making the running number for each file remember how many rows were in the files before it.

Thanks a lot!

+++++EDIT+++++

@roganjosh Thanks a lot. I used your code with one small fix: I put running_number = 1 inside the def, so that the files share one running number.

One last thing: how can I put a header, for example "Number", in the first row of the new column, and then start the running number from the second row?

Thanks

@roganjosh I have fixed my code. I know the length of each file; now I need to add a column with running numbers like:

file1: 1 to 400

file2: 401 to 640

file3: 641 to 740

Thanks a lot!

Looking at your previous questions that are still open, the common theme is a fundamental misunderstanding of how to use functions in Python, and it isn't being addressed. I will try to unpick part of this to prevent similar questions arising. I'm assuming you come from a scientific background like me, so I'll stick to that.

You never pass arguments to your functions, only self. Instead you try to reference globals from within the function, but there is no need and it is confusing. For example, I might have the equation y = x^2 + 3x + 5, which is a mathematical function and can also be written as a Python function.

def quadratic(value_of_x):
    y = (value_of_x **2) + (3*value_of_x) + 5
    return y

eg_1 = quadratic(5)
print (eg_1)
eg_2 = quadratic(3)
print (eg_2)

# But this will fail
#print (y)

y exists only within the Python function as a local variable and is destroyed once you leave the def/return block. In this case, eg_1 and eg_2 take the value of y returned at the end of the function, and value_of_x takes the value that I put in brackets on the function call (the argument). That's the point of functions: they can be used over and over.

I can also pass multiple arguments to the function.

def new_quadratic(value_of_x, coefficient):
    y = coefficient*(value_of_x **2) + (3*value_of_x) + 5
    return y

eg_3 = new_quadratic(5, 2)
print (eg_3)

Not only can I not get a value for y outside the scope of the function, but a function also does nothing unless it's called. The following defines a function and never calls it, so the printed value is still 5. It's the equivalent of knowing the formula in your head but never running a number through it: you're just defining something that your script could use.

starting_number = 5

def modify_starting_number(starting_number):
    starting_number = starting_number * 2
    return starting_number

print (starting_number)

Whereas this does what you expected it to do. You call the function, i.e. pass the number through the formula.

starting_number = 5

def modify_starting_num(starting_num):
    starting_num = starting_num * 2
    return starting_num

starting_number = modify_starting_num(starting_number) # Calling the function
print (starting_number)
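The same pattern of returning a value and feeding it back in is what lets a count carry over from one file to the next. As a rough sketch using the row counts from the question (400, 240 and 100), and a made-up helper name number_ranges that is not part of your code:

```python
def number_ranges(row_counts, start=1):
    """Return (first, last) running numbers for each file.

    number_ranges is a hypothetical helper for illustration only.
    """
    ranges = []
    for count in row_counts:
        end = start + count - 1      # last number used by this file
        ranges.append((start, end))
        start = end + 1              # next file continues from here
    return ranges

for first, last in number_ranges([400, 240, 100]):
    print(first, last)
```

Nothing is stored globally here: the next starting point lives in the local variable start, which is updated on each pass through the loop.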

With that out of the way, on to your question.

import csv

files = ['file_1', 'file_2']

def running_number_in_csv(filename_list):
    """ running_number starts at 1 on every call to the function, but
    carries over from one file to the next within a single call"""
    running_number = 1

    for individual_file in filename_list:
        new_rows = [] # Make something to hold row + extra column

        # Read contents of each row and append the running number to the list
        with open(individual_file + '.csv', 'r', newline='') as infile:
            reader = csv.reader(infile)
            for row in reader:
                row.append(running_number)
                new_rows.append(row)
                running_number += 1 # Increments every row, regardless of which file it is in

        # Write the list containing the extra column for running number
        # newline='' (Python 3) stops the csv module writing blank lines on Windows
        with open(individual_file + '.csv', 'w', newline='') as outfile:
            writer = csv.writer(outfile)
            writer.writerows(new_rows)

running_number_in_csv(files) # CALL THE FUNCTION :)
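For the follow-up about a "Number" header, one possible variant is sketched below. It assumes Python 3, that the first row of every file is already a header row, and that you pass full file names including the .csv extension; the name add_running_number and its header and start parameters are made up for this example.

```python
import csv

def add_running_number(filenames, header="Number", start=1):
    """Append a numbered column to each CSV, continuing across files.

    Assumes the first row of every file is a header row.
    Returns the next unused running number.
    """
    running_number = start
    for name in filenames:
        # Read the whole file into memory first
        with open(name, "r", newline="") as infile:
            rows = list(csv.reader(infile))

        if rows:
            rows[0].append(header)            # label the new column
            for row in rows[1:]:              # number the data rows only
                row.append(running_number)
                running_number += 1

        # Overwrite the file with the extra column added
        with open(name, "w", newline="") as outfile:
            csv.writer(outfile).writerows(rows)

    return running_number
```

Because the function returns the next unused number, you could also call it once per file and pass that value back in as start, instead of handing it the whole list at once.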
