
How to Insert New Rows (with a condition) in Excel using Python / Pandas

I am trying to insert new rows into an Excel sheet using a pandas DataFrame when particular columns meet a specific condition.

For example:

Input

    A   B   C   D   E
0   AA  111 2   2   
1   CC  222 8   12  
2   DD  333 3   3

Output

    A   B   C   D   E (Output Column)
0   AA  111 2   2   111-2   
1   CC  222 8   8   222-8
2   CC  222 9   9   222-9   
3   CC  222 10  10  222-10
4   CC  222 11  11  222-11
5   CC  222 12  12  222-12
6   DD  333 3   3   333-3

Here, columns C and D give the range 8-12 for row 1, so I need to split that row into one row per value in the range. If C and D are the same, no new rows are added.
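
To make the rule concrete: each input row should yield D - C + 1 output rows, so the sample input should expand to 1 + 5 + 1 = 7 rows. A minimal check of that arithmetic, assuming the sample data is held in a frame named `df` (the name and the construction are assumptions, since the question does not show how the data is loaded):

```python
import pandas as pd

# Sample input from the question; the variable name `df` is an assumption.
df = pd.DataFrame({
    'A': ['AA', 'CC', 'DD'],
    'B': [111, 222, 333],
    'C': [2, 8, 3],
    'D': [2, 12, 3],
})

# Each row covers the inclusive range C..D, so it expands to D - C + 1 rows.
rows_per_input = df['D'] - df['C'] + 1
total_rows = int(rows_per_input.sum())

print(rows_per_input.tolist())  # [1, 5, 1]
print(total_rows)               # 7
```

This count is a handy sanity check for any of the approaches: the expanded frame should have exactly `total_rows` rows.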

Another solution, using Index.repeat to create the output frame, then groupby.cumcount and string concatenation to update the values of columns C, D and E:

df1 = df.loc[df.index.repeat((df.D - df.C).add(1))]
df1['C'] = df1['C'] + df1.groupby('A').cumcount()
df1['D'] = df1['C']
df1['E'] = df1['B'].astype(str) + '-' + df1['C'].astype(str)

[out]

    A    B   C   D       E
0  AA  111   2   2   111-2
1  CC  222   8   8   222-8
1  CC  222   9   9   222-9
1  CC  222  10  10  222-10
1  CC  222  11  11  222-11
1  CC  222  12  12  222-12
2  DD  333   3   3   333-3
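
The output above keeps the repeated index labels (1 appears five times). A self-contained sketch of the same Index.repeat idea, wrapped in a hypothetical helper `expand_with_repeat` that also resets the index, might look like the following. It assumes the input index is unique and that D >= C in every row; it groups on the index level instead of column A so that duplicate values in A would not confuse the counter.

```python
import pandas as pd

def expand_with_repeat(df):
    """Expand each row into one row per value in the inclusive range C..D."""
    # Repeat each index label D - C + 1 times (assumes D >= C everywhere).
    counts = (df['D'] - df['C']).add(1)
    out = df.loc[df.index.repeat(counts)].copy()
    # Number the copies of each original row 0, 1, 2, ... by grouping on the
    # repeated index labels (assumes the original index was unique).
    out['C'] = out['C'] + out.groupby(level=0).cumcount()
    out['D'] = out['C']
    out['E'] = out['B'].astype(str) + '-' + out['C'].astype(str)
    # Replace the repeated labels with a fresh 0..n-1 index.
    return out.reset_index(drop=True)

df = pd.DataFrame({
    'A': ['AA', 'CC', 'DD'],
    'B': [111, 222, 333],
    'C': [2, 8, 3],
    'D': [2, 12, 3],
})
result = expand_with_repeat(df)
print(result)
```

With `reset_index(drop=True)` the rows are labelled 0 through 6, matching the index shown in the question's desired output.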

My example takes the rows where columns C and D differ and creates new rows for them. It then appends those new rows to the rows where C and D are equal.

import pandas as pd

# setup data
data_raw = [['AA', 111, 2, 2], ['CC', 222, 8, 12], ['DD', 333, 3, 3]]
data = pd.DataFrame(data_raw, columns=['A', 'B', 'C','D'])

# get items with no difference
rest_of_data = data.loc[data['C'] == data['D']]

# create value for E column, one string per row
rest_of_data = rest_of_data.copy()
rest_of_data['E'] = rest_of_data['B'].astype(str) + '-' + rest_of_data['C'].astype(str)

# find items with difference
difference_data = data.loc[data['C'] != data['D']]

# create new data: one row per value in the inclusive range C..D
create_data = []
for row in difference_data.itertuples(index=False):
    for i in range(row.C, row.D + 1):
        create_data.append([row.A, row.B, i, i, str(row.B) + '-' + str(i)])

new_data = pd.DataFrame(create_data, columns=['A', 'B', 'C','D', 'E'])

# concatenate frames
frames = [rest_of_data, new_data]
result = pd.concat(frames, ignore_index=True)

Result:

    A    B   C   D       E
0  AA  111   2   2   111-2
1  DD  333   3   3   333-3
2  CC  222   8   8   222-8
3  CC  222   9   9   222-9
4  CC  222  10  10  222-10
5  CC  222  11  11  222-11
6  CC  222  12  12  222-12
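
Note that the result above places the expanded CC rows after the unchanged rows instead of keeping the input order. If the original order matters, one possible alternative (not part of this answer, a sketch only) is DataFrame.explode. The helper name `expand_with_explode` is hypothetical, and the sketch assumes integer C and D columns with D >= C in every row.

```python
import pandas as pd

def expand_with_explode(data):
    """Expand rows in place, keeping the original row order."""
    out = data.copy()
    # One list per row holding every value in the inclusive range C..D.
    out['C'] = [list(range(c, d + 1)) for c, d in zip(out['C'], out['D'])]
    # explode() turns each list element into its own row, in order.
    out = out.explode('C', ignore_index=True)
    # explode() leaves an object-dtype column; convert it back to integers.
    out['C'] = out['C'].astype('int64')
    out['D'] = out['C']
    out['E'] = out['B'].astype(str) + '-' + out['C'].astype(str)
    return out

data = pd.DataFrame(
    [['AA', 111, 2, 2], ['CC', 222, 8, 12], ['DD', 333, 3, 3]],
    columns=['A', 'B', 'C', 'D'],
)
ordered = expand_with_explode(data)
print(ordered)
```

Here the CC rows stay between AA and DD, because explode expands each row where it stands rather than splitting the frame and concatenating the pieces.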

Another option is an explicit loop over the rows with iterrows:

import pandas as pd

df = pd.DataFrame(
    data={
        'A': ['AA', 'CC', 'DD'],
        'B': [111, 222, 333],
        'C': [2, 8, 3],
        'D': [2, 12, 3],
        'E': [None, None, None],
    }
)

# collect the output rows in a list and build the frame once at the end
# (DataFrame.append was removed in pandas 2.0)
rows = []
for idx, row in df.iterrows():
    # one output row per value in the inclusive range C..D;
    # a single row when C == D, none when D < C
    for value in range(int(row['C']), int(row['D']) + 1):
        rows.append(
            {
                'A': row['A'],
                'B': int(row['B']),
                'C': value,
                'D': value,
                'E': str(row['B']) + '-' + str(value),
            }
        )

new_df = pd.DataFrame(rows, columns=['A', 'B', 'C', 'D', 'E'])

print(new_df)
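
None of the answers shows the Excel round trip the title asks about. A rough sketch of how the expansion could be wired to a workbook follows. The helper names `expand_rows` and `expand_excel` and the file names are hypothetical, the sheet is assumed to have columns named A, B, C and D with integer values, and `read_excel` / `to_excel` need an Excel engine such as openpyxl to be installed.

```python
import pandas as pd

def expand_rows(df):
    """Return one row per value in the inclusive range C..D of each input row."""
    rows = []
    for row in df.itertuples(index=False):
        for value in range(int(row.C), int(row.D) + 1):
            rows.append({'A': row.A, 'B': row.B, 'C': value, 'D': value,
                         'E': f'{row.B}-{value}'})
    return pd.DataFrame(rows, columns=['A', 'B', 'C', 'D', 'E'])

def expand_excel(src_path, dst_path):
    """Read a workbook, expand its rows, and write the result to a new workbook."""
    # Assumes the first sheet holds the data with a header row A, B, C, D.
    df = pd.read_excel(src_path)
    expand_rows(df).to_excel(dst_path, index=False)

# Hypothetical file names, shown only to illustrate the call:
# expand_excel('input.xlsx', 'output.xlsx')
```

Writing to a separate output file rather than overwriting the input keeps the original workbook intact if the expansion turns out to be wrong.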

