Concatenate each row's values until a NaN value in a Python dataframe

Question

I am very new to Python and am trying to solve the issue below. If anyone knows a solution, please help.

Thanks in advance!

I want to concatenate each row's values until a NaN value, and then store the result as that row's value in a new column.

Below is an example to support my query. The input dataframe is:

import numpy as np
import pandas as pd

df = pd.DataFrame({"student_name":['mike','maria','alex','mary','shirin'],"student_id":[1,2,3,4,5], "a1":[70,np.nan,64,78,79],"a2":[65,75,72,np.nan,61],"a3":[82,79,80,99,20],"a4":[90,34,56,89,67],"a5":[78,89,90,90,55],"a6":[55,78,88,77,84]})

I have attached a picture of the input dataframe below:

My need is:

1) I want to concatenate the corresponding row values of columns a1...an and store the result in a tag column.
2) If the row contains a NaN or an empty value, concatenation should stop at that value.

The input dataframe above has columns a1 to a6, so the desired output is as shown in the picture below.
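
To make the rule concrete, here is a minimal sketch for a single row. The helper name concat_until_nan is made up for illustration, and joining the values as integers with no separator is an assumption about the desired format.

```python
import math


def concat_until_nan(values):
    """Join values as integer strings, stopping at the first NaN, None or empty value."""
    parts = []
    for v in values:
        is_nan = isinstance(v, float) and math.isnan(v)
        if v is None or v == "" or is_nan:
            break  # rule 2: stop at the first missing value
        parts.append(str(int(v)))
    return "".join(parts)


# mary's row from the example: a2 is NaN, so only a1 survives
print(concat_until_nan([78, float("nan"), 99, 89, 90, 77]))  # 78
# maria's row: a1 is already NaN, so the tag is an empty string
print(repr(concat_until_nan([float("nan"), 75, 79, 34, 89, 78])))  # ''
```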

Answer 1

Here's one quick Python 3 solution for your DataFrame (this snippet assumes your table as a whole is not empty).

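# ... continuing from the df = pd.DataFrame(...) line in your code

tags_ = []
for rowIndex in range(len(df[df.columns[0]])):
    tag_ = ""
    for col in df:
        # only the score columns a1...an start with 'a'
        if col.startswith('a'):
            try:
                # int() raises ValueError on NaN, which ends this row's tag
                tag_ += str(int(df[col].iloc[rowIndex]))
            except (ValueError, TypeError):
                break
    tags_.append(tag_)

# append tag_ as the last column
df.insert(len(df.columns), "tag_", tags_)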

Here, tags_ is just a list that stores the concatenated string for each row. Concatenation for a row stops when a numpy.nan is encountered. The code doesn't test for NaN explicitly; instead, int() raises an error on NaN and the except clause breaks out of the inner loop. The nested loops iterate through every row and column of your DataFrame. The DataFrame must have at least one column, because the outer rowIndex for-loop uses the first column to count the rows.
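
For larger tables, the same result can probably be reached without explicit Python loops. The following is only a sketch of one alternative, not part of the answer's original code: the score_cols list is written out by hand as an assumption, and the tag format (integers, no separator) mirrors the loop above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_name": ["mike", "maria", "alex", "mary", "shirin"],
    "student_id": [1, 2, 3, 4, 5],
    "a1": [70, np.nan, 64, 78, 79],
    "a2": [65, 75, 72, np.nan, 61],
    "a3": [82, 79, 80, 99, 20],
    "a4": [90, 34, 56, 89, 67],
    "a5": [78, 89, 90, 90, 55],
    "a6": [55, 78, 88, 77, 84],
})

# Assumption: the score columns are listed explicitly rather than detected
score_cols = ["a1", "a2", "a3", "a4", "a5", "a6"]
scores = df[score_cols]

# True at the first NaN in each row and at every column after it
blocked = scores.isna().cumsum(axis=1) > 0

# Render every score as an integer string (NaN placeholders get masked out next)
as_text = scores.fillna(0).astype(int).astype(str)

# Blank out the blocked cells, then join what is left of each row
df["tag_"] = as_text.mask(blocked, "").apply(lambda row: "".join(row), axis=1)

print(df[["student_name", "tag_"]])
```

The running count of NaN values per row is what encodes "stop at the first NaN": once it becomes positive, every later column in that row is dropped, even if it holds a valid score.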

df.insert(<location>, <column_name>, <values>) finally inserts the desired tag_ column as the last column of your DataFrame.
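
As a small illustration of that call, the sketch below uses a made-up two-row frame (the names left, right and tags are hypothetical) to compare df.insert at position len(df.columns) with plain column assignment. Both should put the new column at the end.

```python
import pandas as pd

left = pd.DataFrame({"student_id": [1, 2], "a1": [70, 64]})
right = left.copy()
tags = ["70", "64"]

# Explicit position: len(left.columns) means "after the current last column"
left.insert(len(left.columns), "tag_", tags)

# Plain assignment also appends a new column at the end
right["tag_"] = tags

print(list(left.columns))
print(left.equals(right))
```

One difference worth knowing: by default, df.insert raises a ValueError if a column with that name already exists, whereas assignment silently overwrites it.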

Hopefully, it helps. Any corrections are most welcome.

Answer 1
0 2022-01-12 18:34:07