
Running each row of a CSV with a module in Python

Working in Python 3.8.5.

Problem: I have a CSV file containing 100 rows, in which two columns hold the data that a module needs to perform a calculation. I would like to run each row's two data points through it, take the output, and insert it into a third column.

Action so far: I found the csv module and can use csv.reader to read each line. I can see how I would get at the data points, but not how to take them and use them in the module I need to run to process the data. I also found subprocess, which I believe is the module that will let me process each line. I'm just finding it difficult to connect the two.

Example data:

DateTime,Date,Time,Wind_Direction,Effective_Fetch,Wind_Speed
01/10/2012 00:00,01/10/2012,00:00:00,228,510,1.976
01/10/2012 00:10,01/10/2012,00:10:00,231,516,1.389
01/10/2012 00:20,01/10/2012,00:20:00,239,532,1.759

The two columns I want to process are Effective_Fetch and Wind_Speed.

The module is as follows:

def Hs(w, Lf):
    gravity=9.81 #ms^-2
    slope=0.0026
    x = (slope)*(gravity**(-0.53))*(w**(1.06)*(Lf**(0.47)))
    return x

w is Wind_Speed, Lf is Effective_Fetch, and x is the value that I would like to insert into a column following Wind_Speed, with the column header "Wave_Height". I've read that other modules, such as Pandas, should be able to do this too.

You probably want something like this:

import csv

# Hs is the function defined in the question
output_rows = []
with open('mycsv.csv', newline='') as f:
    reader = csv.reader(f)
    # Skip the first (header) row
    headers = next(reader)
    for row in reader:
        # Unpack the two values we are interested in, ignore the others
        *_, effective_fetch, wind_speed = row
        # Values read from CSVs are strings, so cast them to numeric types
        result = Hs(float(wind_speed), int(effective_fetch))
        # Make a new row of the original row and the result of calling Hs
        output_rows.append(row + [result])

# Write out a new csv (if required)
with open('mynewcsv.csv', 'w', newline='') as new_f:
    writer = csv.writer(new_f)
    # Use the column header requested in the question
    writer.writerow(headers + ['Wave_Height'])
    writer.writerows(output_rows)

The csv.reader object is an iterator, so calling next on it advances it one step, which here consumes the header row. This is less awkward than having a condition inside the main for loop to check whether we are processing the first row.
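To see that behaviour in isolation, here is a minimal sketch. It assumes an in-memory io.StringIO object as a stand-in for the real file, with rows copied from the question's example data; the names sample and data_rows are only for illustration.

```python
import csv
import io

# In-memory stand-in for the CSV file; rows copied from the question's example data
sample = io.StringIO(
    "DateTime,Date,Time,Wind_Direction,Effective_Fetch,Wind_Speed\n"
    "01/10/2012 00:00,01/10/2012,00:00:00,228,510,1.976\n"
    "01/10/2012 00:10,01/10/2012,00:10:00,231,516,1.389\n"
)

reader = csv.reader(sample)
headers = next(reader)      # consumes the header row
data_rows = list(reader)    # only the data rows are left to iterate over

print(headers[-2:])    # ['Effective_Fetch', 'Wind_Speed']
print(len(data_rows))  # 2
```

Because the header has already been consumed, any loop over reader afterwards starts at the first data row.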

The Hs function requires two inputs, and luckily they are the final two columns in the CSV rows. The line

*_, effective_fetch, wind_speed = row

tells the interpreter to assign the values of the last two columns to effective_fetch and wind_speed, and to assign all the preceding columns, as a list, to a variable named _, which is a common naming convention for a variable that we intend to ignore (you can call it whatever you like, of course).

You could also do this by row index, especially if the columns were less conveniently placed:

 effective_fetch, wind_speed = row[4], row[5]

or by indexing from the end:

 effective_fetch, wind_speed = row[-2], row[-1]

or by list slicing (either of these works for the six-column rows shown):

 effective_fetch, wind_speed = row[4:]
 effective_fetch, wind_speed = row[-2:]
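If the column positions are likely to change, another option is to select the values by header name instead of position. The following is only a sketch of that idea using csv.DictReader and csv.DictWriter from the standard library. It assumes in-memory io.StringIO objects in place of the real input and output files, and it casts Effective_Fetch with float rather than int so that a non-integer fetch would not raise an error; the names source and target are hypothetical.

```python
import csv
import io

def Hs(w, Lf):
    # Function from the question: w is wind speed, Lf is effective fetch
    gravity = 9.81  # m s^-2
    slope = 0.0026
    return slope * gravity ** (-0.53) * (w ** 1.06 * Lf ** 0.47)

# In-memory stand-ins for the input and output files (assumption for illustration)
source = io.StringIO(
    "DateTime,Date,Time,Wind_Direction,Effective_Fetch,Wind_Speed\n"
    "01/10/2012 00:00,01/10/2012,00:00:00,228,510,1.976\n"
    "01/10/2012 00:10,01/10/2012,00:10:00,231,516,1.389\n"
)
target = io.StringIO(newline="")

reader = csv.DictReader(source)  # the header row becomes the dictionary keys
fieldnames = reader.fieldnames + ["Wave_Height"]
writer = csv.DictWriter(target, fieldnames=fieldnames)
writer.writeheader()

for row in reader:
    # Look the two inputs up by column name and cast them from strings
    row["Wave_Height"] = Hs(float(row["Wind_Speed"]), float(row["Effective_Fetch"]))
    writer.writerow(row)

print(target.getvalue())
```

With real files, source and target would be replaced by file objects opened with newline='' as in the code above.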


 