
Concatenating each row's values until a NaN value in a Python dataframe

I am very new to Python and am trying to solve the issue below. If anyone knows a solution, please help.

Thanks in advance!

I want to concatenate the values in each row until a NaN value is reached, and then store the result as that row's value in a new column.

Here is an example. The input dataframe is:

import numpy as np
import pandas as pd

df = pd.DataFrame({"student_name":['mike','maria','alex','mary','shirin'],"student_id":[1,2,3,4,5], "a1":[70,np.nan,64,78,79],"a2":[65,75,72,np.nan,61],"a3":[82,79,80,99,20],"a4":[90,34,56,89,67],"a5":[78,89,90,90,55],"a6":[55,78,88,77,84]})

I have attached a picture of the input dataframe below:

[Image: input dataframe]

What I need:

  1. I want to concatenate the values of columns a1...an in each row and store the result in a tag column.
  2. If the row contains a NaN or an empty value, concatenation should stop at that value.

The input dataframe above has columns a1 to a6, so the desired output is as shown in the picture below.

[Image: desired output]

Here's one quick Python 3 solution for your DataFrame (it assumes the table as a whole is non-empty):

# ... following your code up to the df = pd.DataFrame(...) line

tags_ = []
for rowIndex in range(len(df[df.columns[0]])):
    tag_ = ""
    for col in df:
        if col.startswith('a'):
            try:
                # int() raises on NaN (or None), which ends this row's tag
                tag_ += str(int(df[col][rowIndex]))
            except (ValueError, TypeError):
                break
    tags_.append(tag_)

df.insert(len(df.columns), "tag_", tags_)

Here, tags_ is a list that collects one concatenated string per row. Each row's string grows until a numpy.nan is encountered. The code does not test for NaN explicitly: it relies on int() raising an exception when given NaN, and the except clause then breaks out of the column loop. The nested loops iterate over every row and, within each row, over the columns whose names start with 'a'. The outer rowIndex loop takes the row count from the first column, so the DataFrame must have at least one column.
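If you would rather make the NaN check explicit than rely on an exception, the same stop-at-the-first-NaN rule could be sketched with itertools.takewhile and pd.notna. This is only an illustration, not the method from the answer above: the helper name concat_until_nan and the explicit score_cols list are assumptions introduced here, and the DataFrame is the one from the question.

```python
from itertools import takewhile

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_name": ["mike", "maria", "alex", "mary", "shirin"],
    "student_id": [1, 2, 3, 4, 5],
    "a1": [70, np.nan, 64, 78, 79],
    "a2": [65, 75, 72, np.nan, 61],
    "a3": [82, 79, 80, 99, 20],
    "a4": [90, 34, 56, 89, 67],
    "a5": [78, 89, 90, 90, 55],
    "a6": [55, 78, 88, 77, 84],
})

# Assumption: the score columns are exactly a1..a6. Listing them avoids
# matching unrelated columns whose names happen to start with "a".
score_cols = ["a1", "a2", "a3", "a4", "a5", "a6"]


def concat_until_nan(values):
    """Join values as integer strings, stopping at the first NaN/None."""
    # takewhile yields items only while pd.notna(item) is True,
    # so everything from the first missing value onward is dropped.
    return "".join(str(int(v)) for v in takewhile(pd.notna, values))


# Apply the helper row by row (axis=1) and store the result as a new column.
df["tag_"] = df[score_cols].apply(concat_until_nan, axis=1)
print(df[["student_name", "tag_"]])
```

With the sample data, a row whose a1 is NaN (maria) gets an empty string, and a row whose a2 is NaN (mary) keeps only its a1 value.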

df.insert(<location>, <column_name>, <values>) finally inserts the tag_ column as the last column of your DataFrame.

Hopefully this helps. Any corrections are most welcome.
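To see what the df.insert call does on its own, here is a small sketch comparing it with plain column assignment. The two-row frames and their values are stand-ins made up for illustration, not data from the question.

```python
import pandas as pd

# Small stand-in frames; column names and values are illustrative only.
left = pd.DataFrame({"student_id": [1, 2], "a1": [70, 64]})
right = left.copy()
tags = ["70", "64"]

# Insert at position len(columns), i.e. after the last existing column.
left.insert(len(left.columns), "tag_", tags)

# Plain assignment also appends a new column at the end.
right["tag_"] = tags

print(list(left.columns))
print(left.equals(right))
```

One difference worth knowing: by default df.insert raises a ValueError if a column with that name already exists, whereas assignment silently overwrites it.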


