
Pandas apply Series - Order of the columns

To aggregate and find per-second values, I am doing the following in Python with pandas. However, the output logged to a file doesn't show the columns in the order they appear here. Somehow the column names are sorted, so TotalDMLsSec shows up before UpdateTotal and UpdatesSec.

    'DeletesTotal': x['Delete'].sum(),
    'DeletesSec': x['Delete'].sum()/VSeconds,
    'SelectsTotal': x['Select'].sum(),
    'SelectsSec': x['Select'].sum()/VSeconds,
    'UpdateTotal': x['Update'].sum(),
    'UpdatesSec': x['Update'].sum()/VSeconds,
    'InsertsTotal': x['Insert'].sum(),
    'InsertsSec': x['Insert'].sum()/VSeconds,
    'TotalDMLsSec':(x['Delete'].sum()+x['Update'].sum()+x['Insert'].sum())/VSeconds
    })
)
df.to_csv('/home/summary.log', sep='\t', encoding='utf-8-sig')

Apart from the question above, I have a couple of others:

  1. Despite writing the log in CSV format, all values/columns appear in a single column in Excel. Is there any way to make the CSV data load properly?
  2. Can the rows be sorted on one column (let's say InsertsSec) by default when writing to the CSV file?

Any help here would be really appreciated.

Assume that your DataFrame is something like this:

      Deletes  Selects  Updates  Inserts
Name                                    
Xxx        20       10       40       50
Yyy        12       32       24       11
Zzz        70       20       30       20

Then both the total and the total per second can be computed as:

total = df.sum().rename('Total')
VSeconds = 5   # I assumed some value
tps = (total / VSeconds).rename('Total per sec')

Then you can add both of these rows to the DataFrame:

# DataFrame.append was removed in pandas 2.0, so use pd.concat (assumes pandas is imported as pd)
df = pd.concat([df, total.to_frame().T, tps.to_frame().T])

The downside is that all numbers are converted to float. In pandas there is no way around this, because each column must hold values of a single type, and the per-second row contains floats.
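To make the float conversion concrete, here is a minimal, self-contained sketch of the steps above. The sample values come from the table in this answer, VSeconds = 5 is the same assumed value, and the name result is only illustrative.

```python
import pandas as pd

# Sample data mirroring the table above.
df = pd.DataFrame(
    {"Deletes": [20, 12, 70], "Selects": [10, 32, 20],
     "Updates": [40, 24, 30], "Inserts": [50, 11, 20]},
    index=pd.Index(["Xxx", "Yyy", "Zzz"], name="Name"),
)

VSeconds = 5  # assumed interval length, as in the answer
total = df.sum().rename("Total")
tps = (total / VSeconds).rename("Total per sec")

# Each Series becomes a one-row frame, then all rows are stacked.
result = pd.concat([df, total.to_frame().T, tps.to_frame().T])
result.index.name = "Name"

# Every column is now float64, because the last row holds floats.
print(result.dtypes)
```

If the integer display matters, one option is to keep the per-second values in a separate frame and write the two frames out separately.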

Then you can, for example, write it to a CSV file (with the totals included).
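The question also asked about column order and about sorting rows before writing. As far as I know, older pandas releases sorted dict keys when a Series was built from a dict, which would explain the reordering. Either way, stating the order explicitly works regardless of version. The sketch below is an assumption-laden illustration: the raw frame, the grouping column Name, and the helper summarize are hypothetical stand-ins, since the start of the original code is not shown.

```python
import io
import pandas as pd

# Hypothetical per-object counters standing in for the real input.
raw = pd.DataFrame({
    "Name": ["Xxx", "Xxx", "Yyy"],
    "Delete": [5, 15, 12],
    "Update": [10, 30, 24],
    "Insert": [20, 30, 110],
})
VSeconds = 5  # assumed interval length

# Desired column order, stated once and reused.
cols = ["DeletesTotal", "DeletesSec", "UpdateTotal", "UpdatesSec",
        "InsertsTotal", "InsertsSec", "TotalDMLsSec"]

def summarize(x):
    d, u, i = x["Delete"].sum(), x["Update"].sum(), x["Insert"].sum()
    values = {
        "DeletesTotal": d, "DeletesSec": d / VSeconds,
        "UpdateTotal": u, "UpdatesSec": u / VSeconds,
        "InsertsTotal": i, "InsertsSec": i / VSeconds,
        "TotalDMLsSec": (d + u + i) / VSeconds,
    }
    # index=cols fixes the order no matter how the dict is ordered.
    return pd.Series(values, index=cols)

summary = raw.groupby("Name")[["Delete", "Update", "Insert"]].apply(summarize)

# Sort rows before writing; to_csv itself has no sort option.
summary = summary[cols].sort_values("InsertsSec", ascending=False)

buf = io.StringIO()  # a file path works the same way
summary.to_csv(buf)
print(buf.getvalue())
```

to_csv also accepts a columns argument, so the order can be enforced at write time instead of by selecting summary[cols].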

This is how I ended up doing it:

    df.to_excel(vExcelFile, sheet_name='All')
    vSortedDF = df.sort_values(['Deletes%'], ascending=False)
    vSortedDF.loc[vSortedDF['Deletes%'] > 5, ['DeletesTotal', 'DeletesSec', 'Deletes%']].to_excel(vExcelFile, sheet_name='Top Delete objects')
    vExcelFile.close()  # assuming vExcelFile is a pd.ExcelWriter; save() was removed in pandas 2.0

For CSV, instead of the \t separator I used a comma, and it worked just fine. The original call was df.to_csv('/home/summary.log', sep='\t', encoding='utf-8-sig').
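A small sketch of the difference between the two separators, for anyone comparing them. The sample frame and the helper to_text are made up for illustration, and the text is written to an in-memory buffer rather than to /home/summary.log.

```python
import io
import pandas as pd

# Made-up summary values, just to have something to write.
df = pd.DataFrame(
    {"DeletesTotal": [20, 12], "DeletesSec": [4.0, 2.4]},
    index=pd.Index(["Xxx", "Yyy"], name="Name"),
)

def to_text(frame, sep):
    """Return the CSV text that to_csv would write with this separator."""
    buf = io.StringIO()
    frame.to_csv(buf, sep=sep)
    return buf.getvalue()

tab_text = to_text(df, "\t")    # tab-separated, as in the original call
comma_text = to_text(df, ",")   # comma-separated, the to_csv default

# Reading the comma-separated text back with defaults recovers the columns.
round_trip = pd.read_csv(io.StringIO(comma_text), index_col="Name")
print(round_trip)
```

Whether Excel splits on commas can depend on the regional list-separator setting, so treat this as something to check on your machine rather than a guarantee.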
