Concatenating each row's values until a NaN value in a Python DataFrame
I am very new to Python and am trying to solve the issue below. If anyone knows a solution, please help.
Thanks in advance!
I want to concatenate the values in each row until a NaN value is reached, and then store the result as that row's value in a new column.
Here is an example to support my query. The input DataFrame is:
import numpy as np
import pandas as pd

df = pd.DataFrame({"student_name":['mike','maria','alex','mary','shirin'],"student_id":[1,2,3,4,5], "a1":[70,np.nan,64,78,79],"a2":[65,75,72,np.nan,61],"a3":[82,79,80,99,20],"a4":[90,34,56,89,67],"a5":[78,89,90,90,55],"a6":[55,78,88,77,84]})
I have attached a picture of the input DataFrame below:
My need is:
The input DataFrame above has columns a1 to a6, and the desired output is shown in the picture below.
Here's one quick Python 3 solution for your DataFrame (this snippet assumes the table as a whole is non-empty):
# ... following your code up to the df = pd.DataFrame(...) line
tags_ = []
for rowIndex in range(len(df[df.columns[0]])):
    tag_ = ""
    for col in df:
        if col.startswith('a'):
            try:
                tag_ += str(int(df[col].iloc[rowIndex]))
            except (ValueError, TypeError):
                # int() fails on NaN, so stop at the first missing value
                break
    tags_.append(tag_)
df.insert(len(df.columns), "tag_", tags_)
Here, tags_ is just a list that stores the concatenated string for each row, built up until a numpy.nan is encountered. The code doesn't check for NaN explicitly: int() raises an exception on NaN, and the except clause breaks out of the inner loop. The nested loops iterate over every row and column of your DataFrame. The DataFrame needs at least one column so that the initial rowIndex for-loop can determine how many rows there are.
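To illustrate why the loop stops at a missing value without checking for it, here is a minimal sketch of the conversion step on its own. The helper name to_int_text is hypothetical and only used for this illustration; it is not part of the solution above.

```python
def to_int_text(value):
    """Return value as an integer string, or None if it cannot be converted."""
    try:
        return str(int(value))
    except (ValueError, TypeError):
        # int(float("nan")) raises ValueError, which is what ends the inner loop
        return None


print(to_int_text(70.0))          # a regular score
print(to_int_text(float("nan")))  # a missing score
```

A regular score such as 70.0 becomes the string "70", while NaN yields None here, which corresponds to the break in the loop.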
df.insert(<location>, <column_name>, <values>) finally inserts the desired tag_ column as the last column of your DataFrame.
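For larger tables, the same result can probably be reached without explicit Python loops. The following is only a sketch of one alternative, not the method described above. It assumes that the score columns are exactly those whose names start with "a", that the scores are whole numbers, and that the values should be joined without a separator, as the loop does. The names score_cols, before_first_nan and as_text are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_name": ["mike", "maria", "alex", "mary", "shirin"],
    "student_id": [1, 2, 3, 4, 5],
    "a1": [70, np.nan, 64, 78, 79],
    "a2": [65, 75, 72, np.nan, 61],
    "a3": [82, 79, 80, 99, 20],
    "a4": [90, 34, 56, 89, 67],
    "a5": [78, 89, 90, 90, 55],
    "a6": [55, 78, 88, 77, 84],
})

# Assumption: every column whose name starts with "a" is a score column.
score_cols = [c for c in df.columns if c.startswith("a")]
scores = df[score_cols]

# Running count of NaNs along each row; it stays 0 only before the first NaN.
before_first_nan = scores.isna().cumsum(axis=1).eq(0)

# Convert scores to integer strings (NaN is filled with a placeholder 0
# that gets masked out in the next step).
as_text = scores.fillna(0).astype(int).astype(str)

# Blank out everything from the first NaN onward, then join each row.
df["tag_"] = as_text.where(before_first_nan, "").apply(
    lambda row: "".join(row), axis=1
)

print(df[["student_name", "tag_"]])
```

With the sample data, maria gets an empty string because a1 is already NaN, and mary gets "78" because a2 is the first NaN in her row.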
Hopefully, it helps. Any corrections are most welcome.