
Creating an empty DataFrame or list with column names, then adding data by column name

I am trying to learn Python 2.7 by converting code I wrote in VB to Python. I have column names and I am trying to create an empty DataFrame or list, then add rows by iterating (see below). I do not know in advance the total number of rows I will need to add. I can create a DataFrame with the column names but can't figure out how to add the data. I have looked at several questions like mine, but the rows/columns of data are unknown in advance.

Snippet of code:

import pandas as pd

cnames = ['Security', 'Time', 'Vol_21D', 'Vol2_21D', 'MaxAPV_21D', 'MinAPV_21D']
df_Calcs = pd.DataFrame(index=range(10), columns=cnames)

This creates the empty df (df_Calcs). The code below is where I get the data to fill the rows. I use n as a counter for the row number to insert (there are 20 other columns that I add to the row), but the code below should explain what I am trying to do.

i = 0
n = 0
while True:
    df_Calcs.Security[n] = i + 1
    df_Calcs.Time[n] = '09:30:00'
    df_Calcs.Vol_21D[n] = i + 2
    df_Calcs.Vol2_21D[n] = i + 3
    df_Calcs.MaxAPV_21D[n] = i + 4
    df_Calcs.MinAPV_21D[n] = i + 5
    i = i + 1
    n = n + 1
    if i > 4:
        break

print df_Calcs

If I should use a list or array instead, please let me know; I am trying to do this in the fastest, most efficient way. This data will then be sent to a MySQL db table.

Result:

  Security      Time Vol_21D Vol2_21D MaxAPV_21D MinAPV_21D
0        1  09:30:00       2        3          4          5
1        2  09:30:00       3        4          5          6
2        3  09:30:00       4        5          6          7
3        4  09:30:00       5        6          7          8
4        5  09:30:00       6        7          8          9
5      NaN       NaN     NaN      NaN        NaN        NaN
6      NaN       NaN     NaN      NaN        NaN        NaN
7      NaN       NaN     NaN      NaN        NaN        NaN
8      NaN       NaN     NaN      NaN        NaN        NaN
9      NaN       NaN     NaN      NaN        NaN        NaN

There are several ways to do that.

Create an empty DataFrame:

cnames=['Security', 'Time', 'Vol_21D', 'Vol2_21D', 'MaxAPV_21D', 'MinAPV_21D']
df = pd.DataFrame(columns=cnames)

Output:

Empty DataFrame
Columns: [Security, Time, Vol_21D, Vol2_21D, MaxAPV_21D, MinAPV_21D]
Index: []

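Then, in a loop, you can create a pd.Series and append it to your DataFrame. Note that append returns a new DataFrame rather than modifying df in place, and that it was removed in pandas 2.0, so this works only on older versions. For example: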

df.append(pd.Series([1, 2, 3, 4, 5, 6], cnames), ignore_index=True)

Or you can append a dict:

df.append({'Security': 1,
           'Time': 2,
           'Vol_21D': 3,
           'Vol2_21D': 4,
           'MaxAPV_21D': 5,
           'MinAPV_21D': 6
          }, ignore_index=True)

Either way, the output is the same:

  Security Time Vol_21D Vol2_21D MaxAPV_21D MinAPV_21D
0        1    2       3        4          5          6
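
On recent pandas versions, where DataFrame.append is no longer available, the same row-by-row idea can be written with pd.concat. The following is only a minimal sketch under that assumption: the starting row, the values in new_row, and the name new_row itself are made up for illustration.

```python
import pandas as pd

cnames = ['Security', 'Time', 'Vol_21D', 'Vol2_21D', 'MaxAPV_21D', 'MinAPV_21D']

# Start from a frame that already holds one row (illustrative values).
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=cnames)

# Build a one-row frame from a dict keyed by column name (hypothetical values).
new_row = pd.DataFrame([{'Security': 7, 'Time': 8, 'Vol_21D': 9,
                         'Vol2_21D': 10, 'MaxAPV_21D': 11, 'MinAPV_21D': 12}])

# concat returns a new DataFrame, so assign the result back;
# ignore_index=True renumbers the rows 0, 1, ...
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```

Like append, concat copies the existing rows on every call, so calling it once per row inside a long loop is slow.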

But I think a faster and more Pythonic way is to first create a list, append all rows to it, and then build the DataFrame from the list.

data = []
for i in range(0,5):
    data.append([1,2,3,4,i,6])
df = pd.DataFrame(data, columns=cnames)
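
Applied to the loop from the question, the same approach could look like the sketch below. It collects each row as a dict keyed by column name, so the number of rows does not need to be known in advance and no NaN placeholder rows are left over. The name rows is an assumption for illustration, and the fixed range(5) stands in for whatever condition ends the real loop.

```python
import pandas as pd

cnames = ['Security', 'Time', 'Vol_21D', 'Vol2_21D', 'MaxAPV_21D', 'MinAPV_21D']

rows = []  # plain Python list; appending to it is cheap
for i in range(5):
    # One dict per row, keyed by column name.
    rows.append({
        'Security': i + 1,
        'Time': '09:30:00',
        'Vol_21D': i + 2,
        'Vol2_21D': i + 3,
        'MaxAPV_21D': i + 4,
        'MinAPV_21D': i + 5,
    })

# Build the DataFrame once, after the loop; columns= fixes the column order.
df_Calcs = pd.DataFrame(rows, columns=cnames)
print(df_Calcs)
```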

I hope it helps.
