Incorrect column concatenation in Python

Question

I have a DataFrame in which the first column should be filled by concatenating the other two columns whenever its value is empty.

 Cuenta CeCo   GLAccount   CeCoCeBe
  123 A           123         A
  234 S           234         S
  NaN             345         B
  NaN             987         A


for x in df1["Cuenta CeCo"].isna():
    if x:
        df1["Cuenta CeCo"]=df1["GLAccount"].apply(str)+" "+df1["CeCoCeBe"]
    else:
        df1["Cuenta CeCo"]

Types:

df1["Cuenta CeCo"] = dtype('O')
df1["GLAccount"] = dtype('float64')
df1["CeCoCeBe"] = dtype('O')

Expected output:

Cuenta CeCo   GLAccount   CeCoCeBe
  123 A           123         A
  234 S           234         S
  345 B           345         B
  987 A           987         A

However, the concatenation seems to do something strange and gives me other numbers and letters:

 Cuenta CeCo   
  251 O
  471 B
  791 R
  341 O

Could someone help me understand why this happens and how to correct it to get my expected output?

Answer 1

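Looping in pandas is usually bad practice, and here it does not do what you intend. Your loop walks the boolean values returned by isna(), but the assignment inside it replaces the whole column each time instead of a single row. Note also that iterating over a DataFrame directly walks its columns, not its rows. Try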

for x in df:
    print(x)

and you will see it print the column headings.
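
To make the difference concrete, here is a minimal sketch. The frame below is reconstructed from the question's table for illustration, so the values are an assumption and not the asker's real data. It shows what each kind of iteration yields and what the assignment in the question's loop body produces.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question (GLAccount stored as float64).
df1 = pd.DataFrame({
    "Cuenta CeCo": ["123 A", "234 S", np.nan, np.nan],
    "GLAccount": [123.0, 234.0, 345.0, 987.0],
    "CeCoCeBe": ["A", "S", "B", "A"],
})

# Iterating over a DataFrame yields its column labels.
labels = [x for x in df1]

# Iterating over a boolean Series yields one bool per row.
flags = [bool(x) for x in df1["Cuenta CeCo"].isna()]

# The right-hand side from the question's loop is not row-wise:
# it builds a value for every row, so assigning it replaces the
# entire column, including rows that already had a value.
overwritten = df1["GLAccount"].apply(str) + " " + df1["CeCoCeBe"]

print(labels)
print(flags)
print(overwritten.tolist())
```

Because GLAccount is float64, str() also keeps the decimal part (for example "123.0"), which is one more reason the loop's output differs from the expected one.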

As for what you're trying to do, try this:

cols = ['Cuenta CeCo', 'GLAccount', 'CeCoCeBe']
mask = df[cols[0]].isna()
df.loc[mask, cols[0]] = df.loc[mask, cols[1]].map(str) + " " + df.loc[mask, cols[2]]

This builds a mask (here, a boolean Series of True and False values) that selects only the NaN rows. Those rows are then replaced with the second column converted to a string and concatenated with the third; the same mask is applied on the right-hand side so that only the rows we need are computed.
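
A self-contained version of the mask approach might look like the sketch below. The sample frame is an assumption reconstructed from the question, with GLAccount as float64 as stated there. The cast to int is an addition not present in the answer: it assumes the account numbers are whole and non-missing in the masked rows, and it avoids strings such as "345.0".

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Cuenta CeCo": ["123 A", "234 S", np.nan, np.nan],
    "GLAccount": [123.0, 234.0, 345.0, 987.0],  # float64, as in the question
    "CeCoCeBe": ["A", "S", "B", "A"],
})

# Boolean Series: True where the first column is missing.
mask = df["Cuenta CeCo"].isna()

# Cast to int before str so 345.0 becomes "345" rather than "345.0".
# Assumes GLAccount has no missing values in the masked rows.
account = df.loc[mask, "GLAccount"].astype(int).astype(str)

# Only the masked rows are written; existing values are left untouched.
df.loc[mask, "Cuenta CeCo"] = account + " " + df.loc[mask, "CeCoCeBe"]

print(df["Cuenta CeCo"].tolist())
```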

Answer 2

import pandas as pd
import numpy as np

df = pd.DataFrame([
        ['123 A', 123, 'A'],
        ['234 S', 234, 'S'],
        [np.nan, 345, 'B'],
        [np.nan, 987, 'A']
    ], columns = ['Cuenta CeCo', 'GLAccount', 'CeCoCeBe']
)

def f(r):
    if pd.notna(r['Cuenta CeCo']):
        return r['Cuenta CeCo']
    else:
        return f"{r['GLAccount']} {r['CeCoCeBe']}"

df['Cuenta CeCo'] = df.apply(f, axis=1)
df

prints

index	Cuenta CeCo	GLAccount	CeCoCeBe
0	123 A	123	A
1	234 S	234	S
2	345 B	345	B
3	987 A	987	A
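
Row-wise apply calls a Python function once per row, which can be slow on large frames. As a possible alternative (not part of the answer above), the same result can be sketched with a vectorized fillna, which accepts a Series and aligns it on the index. The frame is the one from this answer; the name combined is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
        ['123 A', 123, 'A'],
        ['234 S', 234, 'S'],
        [np.nan, 345, 'B'],
        [np.nan, 987, 'A']
    ], columns=['Cuenta CeCo', 'GLAccount', 'CeCoCeBe']
)

# Build the fallback value for every row in one vectorized step.
combined = df['GLAccount'].astype(str) + " " + df['CeCoCeBe']

# fillna only uses the fallback where the original value is missing.
df['Cuenta CeCo'] = df['Cuenta CeCo'].fillna(combined)

print(df['Cuenta CeCo'].tolist())
```

Here GLAccount is an integer column, so astype(str) gives "345" directly; with a float64 column it would need a cast to int first.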

Answer 1
0 2022-08-12 21:15:19

Answer 2
0 2022-08-12 21:17:36
