
Pandas DataFrame: add a new column with a range of values for each row

I have a DataFrame with a single column and want to add a new column called 'HOUR' that holds the hours 0-23, repeated for every existing row.

Current:

AN_LOG_ID
00000001
00000002
00000003

Desired output (one row per hour of the day, 0-23, for each AN_LOG_ID):

AN_LOG_ID    HOUR
00000001      0
00000001      1
...          ...
00000001      23
00000002      0
00000002      1
...          ...
00000002      23
00000003      0
00000003      1
...          ...
00000003      23
>>> df = df.assign(HOUR=[range(24)] * len(df)).explode("HOUR", ignore_index=True)
>>> df

   AN_LOG_ID HOUR
0   00000001    0
1   00000001    1
2   00000001    2
3   00000001    3
4   00000001    4
..       ...  ...
67  00000003   19
68  00000003   20
69  00000003   21
70  00000003   22
71  00000003   23

[72 rows x 2 columns]

First assign range(24) to each row as "HOUR", then explode that "HOUR" column so that each hour lands on its own row. (ignore_index=True resets the resulting index to 0, 1, 2, ...)
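One caveat: explode leaves the new column with object dtype, which is why HOUR prints as an object column in the output above. Here is a minimal self-contained sketch that casts it back to an integer. The sample frame is an assumption: the IDs are stored as strings so the leading zeros survive.

```python
import pandas as pd

# Sample frame (assumed): IDs kept as strings so leading zeros are preserved.
df = pd.DataFrame({"AN_LOG_ID": ["00000001", "00000002", "00000003"]})

# One range(24) per row, then explode so each hour gets its own row.
out = df.assign(HOUR=[range(24)] * len(df)).explode("HOUR", ignore_index=True)

# explode() returns an object column; cast it to int for numeric work.
out["HOUR"] = out["HOUR"].astype(int)

print(out.dtypes)
```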

We can use Index.repeat to duplicate each row 24 times, then groupby.cumcount to build the HOUR column:

df = df.loc[df.index.repeat(24)]
df = df.assign(HOUR=df.groupby(level=0).cumcount()).reset_index(drop=True)
   AN_LOG_ID  HOUR
0   00000001     0
1   00000001     1
2   00000001     2
3   00000001     3
4   00000001     4
..       ...   ...
67  00000003    19
68  00000003    20
69  00000003    21
70  00000003    22
71  00000003    23
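To make the two lines above easier to follow, here is the same approach split into its intermediate steps. This is a sketch on an assumed three-row sample frame, and it relies on the original index being unique; with duplicate labels, df.loc would return extra rows.

```python
import pandas as pd

# Sample frame (assumed) with a default, unique RangeIndex.
df = pd.DataFrame({"AN_LOG_ID": ["00000001", "00000002", "00000003"]})

# Step 1: repeat every index label 24 times -> 0, 0, ..., 1, 1, ..., 2
repeated = df.loc[df.index.repeat(24)]

# Step 2: number the rows within each original label, giving 0..23 per ID.
hours = repeated.groupby(level=0).cumcount()

# Step 3: attach the counter and drop the duplicated index labels.
out = repeated.assign(HOUR=hours).reset_index(drop=True)
```

Because cumcount returns integers, HOUR is a numeric column here, with no cast needed.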

Use a cross merge:

out = df.merge(pd.DataFrame({'HOUR': range(24)}), how='cross')

Output:

   AN_LOG_ID  HOUR
0   00000001     0
1   00000001     1
2   00000001     2
3   00000001     3
4   00000001     4
..       ...   ...
67  00000003    19
68  00000003    20
69  00000003    21
70  00000003    22
71  00000003    23

[72 rows x 2 columns]
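how='cross' is only available in pandas 1.2 and later. On older versions, a common workaround is to merge on a constant helper column. The sketch below uses a hypothetical column name, _key, and an assumed sample frame; it is a substitute technique, not part of the answer above.

```python
import pandas as pd

# Sample frame (assumed).
df = pd.DataFrame({"AN_LOG_ID": ["00000001", "00000002", "00000003"]})
hours = pd.DataFrame({"HOUR": range(24)})

# Give both frames the same constant key so every row matches every row,
# then drop the helper column (_key is a hypothetical name).
out = (
    df.assign(_key=1)
      .merge(hours.assign(_key=1), on="_key")
      .drop(columns="_key")
)
```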

Another possible solution, based on numpy:

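import numpy as np
import pandas as pd

# Repeat each ID 24 times, tile 0-23 once per row, and join the two as columns.
pd.DataFrame(
    np.concatenate(
        (np.repeat(df.values, 24).reshape(-1, 1),
         np.tile(np.arange(24), len(df)).reshape(-1, 1)), axis=1),
    columns=['AN_LOG_ID', 'HOUR'])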

Output:

   AN_LOG_ID HOUR
0   00000001    0
1   00000001    1
2   00000001    2
3   00000001    3
4   00000001    4
..       ...  ...
67  00000003   19
68  00000003   20
69  00000003   21
70  00000003   22
71  00000003   23
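Note that concatenating a string column with an integer column produces a single object array, so HOUR ends up as object dtype here as well. Here is a sketch of a variant that builds each column separately so HOUR stays an integer, using an assumed single-column sample frame:

```python
import numpy as np
import pandas as pd

# Sample frame (assumed).
df = pd.DataFrame({"AN_LOG_ID": ["00000001", "00000002", "00000003"]})

out = pd.DataFrame({
    # Each ID repeated 24 times in a row: id1 x24, id2 x24, ...
    "AN_LOG_ID": np.repeat(df["AN_LOG_ID"].to_numpy(), 24),
    # 0..23 tiled once per original row.
    "HOUR": np.tile(np.arange(24), len(df)),
})
```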
