How to move data from one column to another?
I have the following data:
id date oked_1 oked_2 KPS address type
225 001041004832 2000-10-12 71209 01111 105 196430100 3
225 001041004832 2000-10-12 71209 46211 105 196430100 3
225 001041004832 2000-10-12 71209 52101 105 196430100 3
I need to move the values of "oked_2" into "oked_1" in such a way that all other columns are replicated. For example, below you can see how the oked_2 values are copied to oked_1 while the other columns stay the same. I want only oked_1 in my final dataframe (all oked_2 data has to be moved to oked_1). I expect:
id date oked_1 oked_2 KPS address type
225 001041004832 2000-10-12 71209 01111 105 196430100 3
225 001041004832 2000-10-12 01111 46211 105 196430100 3
225 001041004832 2000-10-12 46211 52101 105 196430100 3
225 001041004832 2000-10-12 52101 52101 105 196430100 3
How can I do that? I have not tried anything, because I have no idea how to approach it.
If you look at the expected dataframe, you can see that the values from oked_2 are copied to oked_1. One row was added because there were 3 different values in oked_2 and one in oked_1, for a total of 4 unique values.
You can try this:
import pandas as pd
df=pd.DataFrame({"oked_1":["71209","71209","71209"],"oked_2":["01111","46211","52101"]})
print(df)
"""
oked_1 oked_2
0 71209 01111
1 71209 46211
2 71209 52101
"""
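# Append a copy of the last row so there is room for all 4 unique values.
df.loc[len(df.index)] = df.loc[len(df.index)-1]
# Overwrite oked_1 with the unique values of both columns
# (this only works when their count equals the number of rows).
df["oked_1"]=pd.unique(df[["oked_1","oked_2"]].values.ravel('K'))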
print(df)
"""
oked_1 oked_2
0 71209 01111
1 01111 46211
2 46211 52101
3 52101 52101
"""
I don't think I have completely understood your logic, but as far as I can tell it gives the expected result.
Edit: I have tested it with this dataset:
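The snippet above relies on the number of unique codes matching the number of rows after one row is appended. A minimal sketch that does not depend on that coincidence is shown here; it assumes the remaining columns are identical on every row (as in the question), and the `KPS` column is included only to illustrate a shared column.

```python
import pandas as pd

df = pd.DataFrame({
    "oked_1": ["71209", "71209", "71209"],
    "oked_2": ["01111", "46211", "52101"],
    "KPS": ["105", "105", "105"],  # stands in for the other, shared columns
})

# Collect the codes from both columns in first-seen order, without duplicates.
codes = pd.concat([df["oked_1"], df["oked_2"]]).drop_duplicates().tolist()

# Assumption: the non-oked columns are the same on every row, so one row is left.
shared = df.drop(columns=["oked_1", "oked_2"]).drop_duplicates()

# Repeat that shared row once per code, then attach the codes as oked_1.
result = shared.loc[shared.index.repeat(len(codes))].reset_index(drop=True)
result.insert(0, "oked_1", codes)
print(result)
```

If the shared columns differ between rows, this sketch does not apply as written, and a per-row approach such as `explode` is a better fit.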
id,date,oked_1,oked_2,KPS,address,type
001041004832,2000-10-12,71209,01111,105,196430100,3
001041004832,2000-10-12,71209,46211,105,196430100,3
001041004832,2000-10-12,71209,52101,105,196430100,3
And the output is:
id date oked_1 oked_2 KPS address type
0 1041004832 2000-10-12 71209 1111 105 196430100 3
1 1041004832 2000-10-12 71209 46211 105 196430100 3
2 1041004832 2000-10-12 71209 52101 105 196430100 3
id date oked_1 oked_2 KPS address type
0 1041004832 2000-10-12 71209 1111 105 196430100 3
1 1041004832 2000-10-12 1111 46211 105 196430100 3
2 1041004832 2000-10-12 46211 52101 105 196430100 3
3 1041004832 2000-10-12 52101 52101 105 196430100 3
And it is working as expected!
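One detail in the output above: the leading zeros of `id` and `oked_2` are gone (`001041004832` became `1041004832`, `01111` became `1111`), which is what happens when pandas infers those columns as integers. A small sketch, assuming the codes should be kept as text, compares the default read with `dtype=str`:

```python
from io import StringIO
import pandas as pd

csv_text = """id,date,oked_1,oked_2,KPS,address,type
001041004832,2000-10-12,71209,01111,105,196430100,3
001041004832,2000-10-12,71209,46211,105,196430100,3
001041004832,2000-10-12,71209,52101,105,196430100,3
"""

# Default type inference: numeric-looking columns become integers.
as_inferred = pd.read_csv(StringIO(csv_text))

# Reading everything as strings keeps the codes exactly as written.
as_text = pd.read_csv(StringIO(csv_text), dtype=str)

print(as_inferred["oked_2"].tolist())
print(as_text["oked_2"].tolist())
```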
from io import StringIO
import pandas as pd
data = """
_ id date oked_1 oked_2 KPS address type
225 001041004832 2000-10-12 71209 01111 105 196430100 3
225 001041004832 2000-10-12 71209 46211 105 196430100 3
225 001041004832 2000-10-12 71209 52101 105 196430100 3
"""
# sep=r"\s+" splits on whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv(StringIO(data), dtype=str, sep=r"\s+")
df['oked_1'] = df[['oked_1', 'oked_2']].to_numpy().tolist()
df = (df.explode('oked_1')
.drop_duplicates('oked_1', ignore_index=True)
.drop('oked_2', axis=1)
)
Output for df:
_ id date oked_1 KPS address type
0 225 001041004832 2000-10-12 71209 105 196430100 3
1 225 001041004832 2000-10-12 01111 105 196430100 3
2 225 001041004832 2000-10-12 46211 105 196430100 3
3 225 001041004832 2000-10-12 52101 105 196430100 3
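Note that `drop_duplicates('oked_1')` removes duplicates across the whole frame. If the data holds more than one `id`, a code shared by two ids would survive only once. The sketch below de-duplicates per `id` instead; the second id and its rows are made up purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    # The second id (…833) is hypothetical, added to show the per-id behaviour.
    "id": ["001041004832"] * 3 + ["001041004833"] * 2,
    "oked_1": ["71209"] * 5,
    "oked_2": ["01111", "46211", "52101", "01111", "46211"],
    "KPS": ["105"] * 5,
})

# Pack both codes of each row into a list, then give every code its own row.
df["oked_1"] = df[["oked_1", "oked_2"]].to_numpy().tolist()
result = (
    df.explode("oked_1")
      # Keep one row per (id, code) pair rather than one per code overall.
      .drop_duplicates(["id", "oked_1"], ignore_index=True)
      .drop(columns="oked_2")
)
print(result)
```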
You can create separate dataframes for oked_1 and oked_2, then drop duplicates and combine them, as shown below.
import pandas as pd
df = pd.read_csv(filepath, dtype = str) # filepath is the path to your CSV file; this is your main dataframe
df1 = df.drop(columns = ['oked_2']).drop_duplicates(subset=['oked_1'])
df2 = df.drop(columns = ['oked_1']).drop_duplicates(subset=['oked_2']).rename(columns = {'oked_2': 'oked_1'})
data = pd.concat([df1,df2]).reset_index()
print(data)
which looks like this:
index id date oked_1 KPS address type
0 0 1041004832 2000-10-12 71209 105 196430100 3
1 0 1041004832 2000-10-12 01111 105 196430100 3
2 1 1041004832 2000-10-12 46211 105 196430100 3
3 2 1041004832 2000-10-12 52101 105 196430100 3
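The extra `index` column in that output comes from `reset_index()`, which keeps the old row labels as a column. A self-contained variant of the same idea, assuming the old labels are not needed, passes `ignore_index=True` to `pd.concat` instead. The inline CSV text stands in for the file at `filepath`.

```python
from io import StringIO
import pandas as pd

csv_text = """id,date,oked_1,oked_2,KPS,address,type
001041004832,2000-10-12,71209,01111,105,196430100,3
001041004832,2000-10-12,71209,46211,105,196430100,3
001041004832,2000-10-12,71209,52101,105,196430100,3
"""
df = pd.read_csv(StringIO(csv_text), dtype=str)

# One frame per code column, each reduced to its distinct codes.
df1 = df.drop(columns=["oked_2"]).drop_duplicates(subset=["oked_1"])
df2 = (
    df.drop(columns=["oked_1"])
      .drop_duplicates(subset=["oked_2"])
      .rename(columns={"oked_2": "oked_1"})
)

# ignore_index=True renumbers the rows 0..n-1 without adding an "index" column.
data = pd.concat([df1, df2], ignore_index=True)
print(data)
```

If a code can appear in both `oked_1` and `oked_2`, a final `drop_duplicates(subset=["oked_1"])` on `data` would be needed as well.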