Applying code for one file to multiple files in Python (newbie question)

I wrote the following code, which takes a column from a CSV file, converts each value to an integer, and adds them all up. I have done this for only one file, and I have around 80 files to apply the same code to.

import csv
from collections import defaultdict
columns = defaultdict(list)
with open('Team11BoM.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k,v) in row.items():
            if k not in columns:
                columns[k] = list()
            columns[k].append(v)

import pandas as pd
df = pd.read_csv("Team11BoM.csv")

b = list(df['Reported Price'])
a = list(df['Actual Price'])

for i in range(0, len(a)):
    a[i] = int(float(a[i]))

v = sum(a)
print("the total actual cost(s) for team 11 is:", v)

for i in range(0, len(b)):
    b[i] = int(float(b[i]))

h = sum(b)
print("the total reported price for team 11 is:", h)

It prints out the following:

the total actual cost(s) for team 11 is: 945
the total reported price for team 11 is: 707

I want it to print out:

the total actual cost(s) for *filename* is: *Total cost of that team*
the total reported price for *filename* is: *Total reported price of that team*

Is there any simple way to do this?

Thanks, Irfan S.

First, you should define a function that you can reuse, to avoid repeating code.

import csv
from collections import defaultdict

def process_file(file_name):
    columns = defaultdict(list)
    with open(file_name) as f:
        reader = csv.DictReader(f)
        for row in reader:
            for (k,v) in row.items():
                if k not in columns:
                    columns[k] = list()
                columns[k].append(v)

    import pandas as pd
    df = pd.read_csv(file_name)

    b = list(df['Reported Price'])
    a = list(df['Actual Price'])

    for i in range(0, len(a)):
        a[i] = int(float(a[i]))

    v = sum(a)
    print(f"the total actual cost(s) for {file_name} is:", v)

    for i in range(0, len(b)):
        b[i] = int(float(b[i]))

    h = sum(b)
    print(f"the total reported price for {file_name} is:", h)

Second, call this function while iterating over the list of files:

import os

# assuming all of these files are in the current directory
list_of_files = [f for f in os.listdir('.') if os.path.isfile(f) and f.endswith('.csv')]
for file_name in list_of_files:
    process_file(file_name)

import os
import csv
import pandas as pd
from collections import defaultdict

files_dir = 'csv'

csv_files = os.listdir(files_dir)
print(csv_files)

def convert_to_int(file_name):
    file_name = f'{files_dir}/{file_name}'
    columns = defaultdict(list)
    with open(file_name) as f:
        reader = csv.DictReader(f)
        for row in reader:
            for (k,v) in row.items():
                if k not in columns:
                    columns[k] = list()
                columns[k].append(v)

    df = pd.read_csv(file_name)

    b = list(df['Reported Price'])
    a = list(df['Actual Price'])

    for i in range(0, len(a)):
        a[i] = int(float(a[i]))

    v = sum(a)
    print(f"the total actual cost(s) for {file_name} is:", v)

    for i in range(0, len(b)):
        b[i] = int(float(b[i]))

    h = sum(b)
    print(f"the total reported price for {file_name} is:", h)

for file in csv_files:
    convert_to_int(file)
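
The loop above hands every entry in the `csv` directory to the function, so a stray non-CSV file or subdirectory would make `pd.read_csv` fail. A minimal sketch of one way to select only the CSV files, using the standard library's `pathlib`; the helper name `find_csv_files` is hypothetical and not part of the answer:

```python
from pathlib import Path

def find_csv_files(directory):
    """Return the regular files ending in .csv directly inside `directory`, sorted by name."""
    return sorted(p for p in Path(directory).glob("*.csv") if p.is_file())
```

Each returned item is a `Path` that already includes the directory, so it can be passed straight to `open` or `pd.read_csv` without building the path by hand.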

You can use a for loop to iterate over each file in the current working directory and apply the same processing to all of them. Make sure all the files are in the same directory.

import csv
from collections import defaultdict
import pandas as pd
import os

def valueSum(filename):
    columns = defaultdict(list)
    with open(filename) as f:
        reader = csv.DictReader(f)
        for row in reader:
            for (k,v) in row.items():
                if k not in columns:
                    columns[k] = list()
                columns[k].append(v)

    df = pd.read_csv(filename)

    b = list(df['Reported Price'])
    a = list(df['Actual Price'])

    for i in range(0, len(a)):
        a[i] = int(float(a[i]))

    v = sum(a)

    for i in range(0, len(b)):
        b[i] = int(float(b[i]))

    h = sum(b)

    print(f"the total actual cost(s) for {filename} is:", v)
    print(f"the total reported price for {filename} is:", h)

for filename in os.listdir("."):
    if filename.endswith(".csv"): #count only csv files
        valueSum(filename)
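
In the function above, the `csv.DictReader` loop fills `columns`, but the totals are computed from the pandas DataFrame, so the file is read twice and `columns` is never used. If pandas is not otherwise needed, the same totals can be sketched with the standard library alone. The helper name `column_total` is hypothetical, and the sketch assumes every value in the column parses as a number:

```python
import csv

def column_total(csv_path, column):
    """Sum one column of a CSV file, truncating each value to an int first."""
    total = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # same per-value conversion as the original code: int(float(...))
            total += int(float(row[column]))
    return total
```

Calling `column_total(filename, 'Actual Price')` and `column_total(filename, 'Reported Price')` inside the loop would then give the two totals for each file.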

How about placing the CSV files in one directory and looping over them like this:

import os
import pandas as pd

def summer(f):
    # use the file name without directory or extension as the label
    name = os.path.splitext(os.path.basename(f))[0]
    df = pd.read_csv(f)

    b = list(df['Reported Price'])
    a = list(df['Actual Price'])

    for i in range(0, len(a)):
        a[i] = int(float(a[i]))

    v = sum(a)
    print(f"the total actual cost(s) for {name} is:", v)

    for i in range(0, len(b)):
        b[i] = int(float(b[i]))

    h = sum(b)
    print(f"the total reported price for {name} is:", h)

path = 'path/to/csv-files/directory/'

for fil in os.listdir(path):
    # listdir returns bare names, so join them with the directory
    summer(os.path.join(path, fil))
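
The question asks for each message to name the file it belongs to. A small sketch of how the label and the two output lines might be built separately from the summing; `label_from_path` and `format_report` are hypothetical helper names, and the wording of the lines is copied from the question:

```python
from pathlib import Path

def label_from_path(csv_path):
    """Return the file name without its directory or final extension."""
    return Path(csv_path).stem

def format_report(label, actual_total, reported_total):
    """Build the two report lines in the wording used in the question."""
    return (
        f"the total actual cost(s) for {label} is: {actual_total}\n"
        f"the total reported price for {label} is: {reported_total}"
    )
```

Unlike splitting on the first `.`, `Path.stem` removes only the final extension, so a name such as `team.v2.csv` keeps its full label `team.v2`.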
