
Create new df with for loop in Pandas

Not sure if I am doing this right - first post here so please be gentle :)

See the picture below.

[Image: print screen from my Jupyter Notebook]

What I am trying to do is create a new dataframe from the df_Grundinladdning['Datan'] column that includes only the rows containing the string "#TRANS".

Here's a way to do that:

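import pandas as pd

# Small sample frame standing in for the real data
df = pd.DataFrame({"Datan": ["x", "TRANS y", "z", "TRANS u", "v", "TRANS w"]})
print(df)

# Keep only the rows whose "Datan" value contains "TRANS"
new_df = df[df.Datan.str.contains("TRANS")]
print(new_df)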

Results:

(original dataframe)
     Datan
0        x
1  TRANS y
2        z
3  TRANS u
4        v
5  TRANS w

(new dataframe)
     Datan
1  TRANS y
3  TRANS u
5  TRANS w

The right method is described here. The loop, even if it did not have syntax errors, would be very slow.
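To make the comparison concrete, here is a minimal sketch of both approaches side by side. The loop shown is only an assumption about what a corrected version of the loop might look like (the original is in the screenshot), and the sample values are made up for illustration.

```python
import pandas as pd

# Made-up sample data; the real frame comes from the asker's notebook
df = pd.DataFrame({"Datan": ["x", "#TRANS 1", "z", "#TRANS 2"]})

# Loop version: visits every row in Python, which is slow on large frames
rows = []
for _, row in df.iterrows():
    if "#TRANS" in row["Datan"]:
        rows.append(row)
loop_df = pd.DataFrame(rows)

# Vectorized version: one boolean mask over the whole column
mask_df = df[df["Datan"].str.contains("#TRANS", regex=False)]

print(loop_df)
print(mask_df)
```

Both frames hold the same rows with the same index labels; the mask version simply avoids the per-row Python overhead.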

You don't need to loop over the dataframe; you can get the result dataframe easily with this:

df_transOnly = df_Grundinladdning[df_Grundinladdning["Datan"].str.contains('#TRANS')]
df_transOnly  # displays the dataframe in a notebook cell

So you will get the needed dataframe like this:

      Datan
5     #TRANS232
12    #TRANS455
20    #TRANS3144
104   #TRANS1234
500   #TRANS213
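One thing that may be worth guarding against: if the column can contain missing values, str.contains can return NaN for those rows in many pandas versions, and the boolean indexing then fails. The sketch below assumes a made-up frame with one missing value; na=False treats missing entries as non-matches, and regex=False makes "#TRANS" a literal substring rather than a regular expression.

```python
import pandas as pd

# Hypothetical data with a missing value, for illustration only
df_Grundinladdning = pd.DataFrame(
    {"Datan": ["#FLAGGA 0", "#TRANS232", None, "#TRANS455"]}
)

# na=False: rows with missing values count as "does not contain"
# regex=False: match the literal text "#TRANS"
mask = df_Grundinladdning["Datan"].str.contains("#TRANS", regex=False, na=False)

# .copy() gives an independent frame, so later edits do not touch the original
df_transOnly = df_Grundinladdning[mask].copy()
print(df_transOnly)
```

The original index labels are kept; calling reset_index(drop=True) on the result would renumber the rows from 0 if that is preferred.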
