I have a 1996 * 9 array:
array([[ 0., 1., 1., ..., 1., 1., 0.],
[ 1., 1., 0., ..., 1., 0., 1.],
[ 0., 1., 1., ..., 1., 1., 0.],
...,
[ 0., 0., 0., ..., 0., 0., 1.],
[ 0., 1., 1., ..., 1., 1., 0.],
[ 0., 1., 1., ..., 1., 1., 0.]])
I want a 1996 * 1 array.
What I did:
pd.DataFrame(train_L.astype(int)).apply(lambda x: ''.join(str(x)), axis = 1)
I get
0 0 0\n1 1\n2 1\n3 1\n4 1\n5 1...
1 0 1\n1 1\n2 0\n3 0\n4 0\n5 0...
2 0 0\n1 1\n2 1\n3 0\n4 1\n5 1...
3 0 0\n1 1\n2 1\n3 0\n4 1\n5 1...
4 0 1\n1 0\n2 0\n3 0\n4 0\n5 0...
The problem: the result is full of \n sequences (the printed form of each whole row) instead of the joined values.
My question: Is there an easy way to do the merge without such caveats?
Example output
What I have:
v1 v2 v3 ... v9
1 0 0 ... 1
I want:
v1
1\t0\t0\t...\t1
(values separated by \t).
Why I need such a weird form:
For image processing, we have one column for the labels of each image. However, one image may have multiple labels, so I have to squeeze multiple labels into one column. That is a requirement of the library.
You can apply a lambda after converting the dtype to str:
In [14]:
df = pd.DataFrame(np.random.randn(4,5))
df
Out[14]:
0 1 2 3 4
0 1.036485 -1.243777 1.286254 1.973786 -0.083245
1 1.698828 1.696846 0.037732 -0.630546 -0.135069
2 -1.231337 -1.166480 0.046414 -0.965710 1.341809
3 0.591176 0.275267 -0.446553 -0.230353 0.258817
In [16]:
df.astype(str).apply(lambda x: ''.join(x), axis=1)
Out[16]:
0 1.03648484941-1.243776761241.286253591521.9737...
1 1.698827772721.696846119330.0377324485782-0.63...
2 -1.23133722226-1.166480155330.046414100678-0.9...
3 0.5911755605680.275266550205-0.446552705185-0....
dtype: object
It seems you want a tab separator, so you can just join with a tab:
In [17]:
df.astype(str).apply(lambda x: '\t'.join(x), axis=1)
Out[17]:
0 1.03648484941\t-1.24377676124\t1.28625359152\t...
1 1.69882777272\t1.69684611933\t0.0377324485782\...
2 -1.23133722226\t-1.16648015533\t0.046414100678...
3 0.591175560568\t0.275266550205\t-0.44655270518...
dtype: object
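Applied to the 0/1 label array from the question, a minimal sketch might look like the following. The small train_L here is a made-up stand-in for the real 1996 * 9 array. Casting to int first keeps the float formatting (0.0, 1.0) out of the strings. The original attempt called str(x) on each row, which returns the printed form of the whole row (index and newlines included), whereas astype(str) converts each cell individually.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real 1996 x 9 train_L array
train_L = np.array([[0., 1., 1., 0.],
                    [1., 1., 0., 1.]])

# Cast to int so cells become "0"/"1" rather than "0.0"/"1.0"
df = pd.DataFrame(train_L.astype(int))

# Convert every cell to str, then join each row with a tab
labels = df.astype(str).apply(lambda row: '\t'.join(row), axis=1)

print(labels.tolist())
```

The result is a Series with one tab-separated string per row; labels.to_frame('v1') would turn it into a single-column DataFrame if the library expects one.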
This results in a string, which is probably not what you want. Perhaps you should explain why you would like your data in your requested format.
a = np.array([[ 0., 1., 1., 1., 1., 0.],
[ 1., 1., 0., 1., 0., 1.],
[ 0., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 1.],
[ 0., 1., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 1., 0.]])
v = pd.DataFrame(['\t'.join([str(val) for val in row]) for row in a], columns=['v1'])
for row in v.iterrows():
print(row[1].v1)
0.0 1.0 1.0 1.0 1.0 0.0
1.0 1.0 0.0 1.0 0.0 1.0
0.0 1.0 1.0 1.0 1.0 0.0
0.0 0.0 0.0 0.0 0.0 1.0
0.0 1.0 1.0 1.0 1.0 0.0
0.0 1.0 1.0 1.0 1.0 0.0
>>> v
v1
0 0.0\t1.0\t1.0\t1.0\t1.0\t0.0
1 1.0\t1.0\t0.0\t1.0\t0.0\t1.0
2 0.0\t1.0\t1.0\t1.0\t1.0\t0.0
3 0.0\t0.0\t0.0\t0.0\t0.0\t1.0
4 0.0\t1.0\t1.0\t1.0\t1.0\t0.0
5 0.0\t1.0\t1.0\t1.0\t1.0\t0.0
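If the integer form (0 instead of 0.0) is preferred, the same list-comprehension idea can be sketched with an int cast. The round trip at the end is only a sanity check that no labels are lost in the squeezed column; the array a below is a shortened, made-up example, not the real data.

```python
import numpy as np
import pandas as pd

# Shortened, made-up example array
a = np.array([[0., 1., 1.],
              [1., 0., 1.]])

# One tab-joined string per row, built from the int-cast values
v = pd.DataFrame({'v1': ['\t'.join(map(str, row)) for row in a.astype(int)]})

# Sanity check: split the strings back into columns and compare
restored = v['v1'].str.split('\t', expand=True).astype(int).to_numpy()
print(np.array_equal(restored, a.astype(int)))
```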