
Write to CSV cell by cell in a row using Python

I've written a Python boto3 script to get the average EC2 CPU utilization per day for the last 2 days. Here's the code:

import boto3
import datetime
import csv

accountId = boto3.client('sts').get_caller_identity()['Account']
session = boto3.session.Session()
region = session.region_name
ec2 = session.resource('ec2',region_name=region)
s3 = session.resource('s3')

fields = ['Account' , 'Region' , 'InstanceID' , 'InstanceName']
start = datetime.datetime.today() - datetime.timedelta(days=2)
end = datetime.datetime.today()

instanceId = ''
instanceName = ''
rows = []
filename = 'CPUUtilization.csv'

def get_cpu_utilization(instanceId):
    cw = boto3.client('cloudwatch',region_name=region)
    res = cw.get_metric_statistics(
        Namespace = 'AWS/EC2',
        Period = 86400,
        StartTime = start,
        EndTime = end,
        MetricName = 'CPUUtilization',
        Statistics = ['Average'],
        Unit = 'Percent',
        Dimensions = [
            {
                'Name' : 'InstanceId',
                'Value' : instanceId
            }
        ]
    )
    return res

def lambda_handler(event, context):
    for instance in ec2.instances.all():
        if instance.tags != None:
            for tags in instance.tags:
                if tags['Key'] == 'Name':
                    instanceName = tags['Value']
                    break 
        instanceId = str(instance.id)
        response = get_cpu_utilization(instanceId)
        rows.append([accountId, region, instanceId, instanceName])

        for r in response['Datapoints']:
            day = r['Timestamp'].date()
            week = day.strftime('%a')
            avg = r['Average']
            day_uti = ' '.join([str(day),week])
            fields.append(day_uti)
            rows.append([avg])

    with open("/tmp/"+filename, 'w+') as csvfile:
        csvwriter = csv.writer(csvfile)
        csvwriter.writerow(fields)
        csvwriter.writerows(rows)
    csvfile.close()

    s3.Bucket('instances-cmdb').upload_file('/tmp/CPUUtilization.csv', 'CPUUtilization.csv')

The output written to the CSV file is like this:

[Image: CSV output from the first execution, with the average CPU utilization value appearing in cell A3]

The average CPU utilization value is written to cell A3, but it should be written to cell E2, under its date. All subsequent dates should go in the first row, and the corresponding values should go in the second row, cell by cell, under their respective dates.

How can I achieve this?

I have a couple of other questions related to AWS CloudWatch metrics.

  1. This particular instance was in a stopped state the whole day (1st April 2022), yet this Lambda function still returns a CPU utilization value for that day. When I check the same metric from the console, I don't see any data. How is this possible? Am I making a mistake somewhere?

  2. When I ran this function multiple times, I got different CPU utilization values. The image above is from the first execution (average CPU utilization = 0.110935...). Below is the result of the second execution:

[Image: CSV output from the second execution]

Here the average CPU utilization for the same instance on the same day (0.53698...) is different from the previous result. Is this a mistake on my side, or something else?

Please help.

NOTE: There is only one instance in my account. It was in a stopped state the whole day (1st April 2022) and was started only on 2nd April 2022 at around 8:00 PM IST.

You need to rethink your logic for adding columns for each datapoint returned.

The rows list contains one entry per row. It starts with this:

rows.append([accountId, region, instanceId, instanceName])

That creates one entry in the list that is a list with four values.

Later, the code attempts to add another column with:

rows.append([avg])

This results in rows having the value of [[accountId, region, instanceId, instanceName], [avg]].
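To make this concrete, here is a minimal, self-contained sketch of the structure that results; the account, region, and instance values are hypothetical placeholders:

rows = []
rows.append(['123456789012', 'us-east-1', 'i-0abc123', 'web-server'])  # one row of four values
rows.append([0.110935])                                                # a second, separate row

print(rows)
# [['123456789012', 'us-east-1', 'i-0abc123', 'web-server'], [0.110935]]
# csv.writer.writerows(rows) writes each inner list as its own CSV line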

This is adding another row, which is why it appears in the CSV file as a separate line. Rather than adding another row, the code needs to add another entry to the existing row.

The easiest way to do this would be to build the row up in a list and only append it to rows once you have all the information for that row.

So, you could replace this line:

rows.append([accountId, region, instanceId, instanceName])

with:

current_row = [accountId, region, instanceId, instanceName]

And you could later add to it with:

current_row.append(avg)

Then, after the for loop has finished adding all the columns, the completed row can be stored with:

rows.append(current_row)
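Putting those pieces together, the loop could look something like this sketch, which reuses the names from your question (ec2, accountId, region, fields, rows, get_cpu_utilization). It also resets instanceName for each instance so a name from a previous instance cannot carry over, and sorts the datapoints by timestamp, since CloudWatch does not guarantee their order:

def lambda_handler(event, context):
    for instance in ec2.instances.all():
        instanceName = ''  # reset per instance so an old name cannot carry over
        if instance.tags is not None:
            for tags in instance.tags:
                if tags['Key'] == 'Name':
                    instanceName = tags['Value']
                    break
        instanceId = str(instance.id)
        response = get_cpu_utilization(instanceId)

        # Build the whole row first...
        current_row = [accountId, region, instanceId, instanceName]
        for r in sorted(response['Datapoints'], key=lambda d: d['Timestamp']):
            day = r['Timestamp'].date()
            day_uti = ' '.join([str(day), day.strftime('%a')])
            fields.append(day_uti)
            current_row.append(r['Average'])

        # ...then append it once, so each instance becomes a single CSV line
        rows.append(current_row)

Note this sketch still appends to fields once per datapoint per instance, which is the caveat discussed next.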

Also, be careful with this line:

fields.append(day_uti)

It adds the date to the fields list, but if there is more than one instance, each instance will add its own entry. Presumably you want each date to appear only once, so this won't work out the way you expect.
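One way around this, assuming every instance reports the same set of dates (which may not hold if instances were launched at different times), would be to add each date header only if it is not already present:

day_uti = ' '.join([str(day), week])
if day_uti not in fields:   # add each date column header only once
    fields.append(day_uti)
current_row.append(r['Average'])

If instances can have differing dates, you would instead need to collect all dates first and align each row's values to the full header.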
