How to combine multiple rows in a pandas dataframe which have only 1 non-null entry per column into one row?

I am using json_normalize to parse the JSON entries of a pandas column. As output, however, I get a dataframe with multiple rows, each row having only one non-null entry. I want to combine all these rows into one row in pandas.

  currency  custom.gt custom.eq  price.gt  price.lt
0      NaN        4.0       NaN       NaN       NaN
1      NaN        NaN       NaN     999.0       NaN
2      NaN        NaN       NaN       NaN  199000.0
3      NaN        NaN     other       NaN       NaN
4      USD        NaN       NaN       NaN       NaN

You can use ffill (forward fill) and bfill (backfill), which are pandas methods for filling NA values. Because each column holds exactly one non-null value, filling in both directions copies that value into every row.

# fill NA values
# option 1:
df = df.ffill().bfill()

# option 2 (equivalent on older pandas; fillna(method=...) is deprecated since pandas 2.1):
df = df.fillna(method='ffill').fillna(method='bfill')

print(df)

  currency  custom.gt custom.eq  price.gt  price.lt
0      USD        4.0     other     999.0  199000.0
1      USD        4.0     other     999.0  199000.0
2      USD        4.0     other     999.0  199000.0
3      USD        4.0     other     999.0  199000.0
4      USD        4.0     other     999.0  199000.0

You can then drop the duplicated rows using drop_duplicates and keep the first one:

df = df.drop_duplicates(keep='first')
print(df)

  currency  custom.gt custom.eq  price.gt  price.lt
0      USD        4.0     other     999.0  199000.0
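The fill-then-deduplicate steps can also be collapsed into one expression. The following is a minimal sketch, assuming the frame looks like the one in the question (the `df` built here is a hand-made reconstruction, not the asker's real data) and that every column has exactly one non-null value. A backfill alone moves each column's value up to the first row, so selecting that row gives the combined record.

```python
import numpy as np
import pandas as pd

# Hand-made reconstruction of the frame shown in the question (an assumption).
df = pd.DataFrame({
    "currency": [np.nan, np.nan, np.nan, np.nan, "USD"],
    "custom.gt": [4.0, np.nan, np.nan, np.nan, np.nan],
    "custom.eq": [np.nan, np.nan, np.nan, "other", np.nan],
    "price.gt": [np.nan, 999.0, np.nan, np.nan, np.nan],
    "price.lt": [np.nan, np.nan, 199000.0, np.nan, np.nan],
})

# Backfill pulls each column's single value up to row 0;
# iloc[[0]] keeps that row as a one-row DataFrame.
combined = df.bfill().iloc[[0]]
print(combined)
```

If a column could contain more than one non-null value, this keeps only the first one per column, so it is worth checking that assumption before relying on it.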

Depending on how many times you have to repeat the task, I might also take a look at how the JSON file is structured, to see whether a dictionary comprehension could clean things up so that json_normalize parses it into a single row the first time.
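As an illustration of cleaning the JSON up first, here is a rough sketch. The question does not show the raw JSON, so the `records` list below is purely a guess at its shape (a list of small dicts, one per condition), and the merging loop is just one possible way to do it. The idea is to merge the fragments into a single nested dict before calling json_normalize, so that it yields one row instead of five.

```python
import pandas as pd

# Hypothetical input: the real JSON structure is not shown in the question.
records = [
    {"custom": {"gt": 4.0}},
    {"price": {"gt": 999.0}},
    {"price": {"lt": 199000.0}},
    {"custom": {"eq": "other"}},
    {"currency": "USD"},
]

# Merge all fragments into one dict, combining nested dicts key by key.
merged = {}
for rec in records:
    for key, value in rec.items():
        if isinstance(value, dict):
            merged.setdefault(key, {}).update(value)
        else:
            merged[key] = value

# A single dict normalizes to a single row with dotted column names.
row = pd.json_normalize(merged)
print(row)
```

A plain loop is used here because the nested values need merging; for flat records a dictionary comprehension would be enough.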

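You could also reduce each row to its first non-null value:

import pandas as pd
from functools import reduce

df = pd.DataFrame.from_dict({"a": ["1", None, None], "b": [None, None, 1], "c": [None, 3, None]})

def red_func(x, y):
    # Keep x once it is non-null; until then, move on to the next value.
    if pd.isna(x):
        return y
    return x

# Apply the reduction to every row of the dataframe.
result = [*map(lambda x: reduce(red_func, x), [list(row) for i, row in df.iterrows()])]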

Output:

In [135]: df
Out[135]:
      a    b    c
0     1  NaN  NaN
1  None  NaN  3.0
2  None  1.0  NaN

In [136]: [*map(lambda x: reduce(red_func, x), [list(row) for i, row in df.iterrows()])]
Out[136]: ['1', 3.0, 1.0]
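The same row-wise result can be obtained without functools. This is a small sketch, not the answerer's code: it drops the nulls in each row and takes the first remaining value. It assumes every row has at least one non-null entry; a row that is entirely null would raise an IndexError.

```python
import pandas as pd

df = pd.DataFrame.from_dict(
    {"a": ["1", None, None], "b": [None, None, 1], "c": [None, 3, None]}
)

# For each row, discard null entries and keep the first value that is left.
first_non_null = df.apply(lambda row: row.dropna().iloc[0], axis=1)
print(first_non_null.tolist())
```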
