
Pandas: How to join CSV columns without a header?

I have CSV data like the following.

1,2,3,4
a,b,c,d

1,2,3,4 is not a CSV header; it is data.
All of the values are strings.
I want to join the columns at (list) indexes 1 and 2 using pandas.
I want to get a result like the following.
The result data are strings.

1,23,4
a,bc,d

In plain Python, the code looks like the following.

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]
vals = lines[0]
s = vals[0] + ',' + (vals[1] + vals[2]) + ',' + vals[3] + '\n'
vals = lines[1]
s += vals[0] + ',' + (vals[1] + vals[2]) + ',' + vals[3] + '\n'
print(s)

How do you do it?

You can loop over the rows using a for loop or a list comprehension.

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

vals = [','.join([w, f'{x}{y}', *z]) for w, x, y, *z in lines]
s = '\n'.join(vals)
print(s)

# prints:
# 1,23,4
# a,bc,d
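If the rows come from an actual CSV file rather than a hard-coded list, the same unpacking idea can be combined with the standard library's csv module. This is only a sketch: the io.StringIO objects stand in for real files, and the names src and dst are made up for illustration.

```python
import csv
import io

# Stand-ins for real input/output files (assumption for illustration).
src = io.StringIO("1,2,3,4\na,b,c,d\n")
dst = io.StringIO()

writer = csv.writer(dst, lineterminator="\n")
for w, x, y, *z in csv.reader(src):
    # Concatenate the fields at index 1 and 2; keep the rest unchanged.
    writer.writerow([w, x + y, *z])

print(dst.getvalue())
```

With real files, open both with newline="" as the csv module documentation recommends.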

If you want to use pandas, you can create a new column and remove the old ones:

import pandas as pd

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

df = pd.DataFrame(lines)

# Create new column
df['new_col'] = df[1] + df[2]

print(df)
#    0  1  2  3 new_col
# 0  1  2  3  4      23
# 1  a  b  c  d      bc

# Remove old columns if needed
df.drop([1, 2], axis=1, inplace=True)

print(df)
#    0  3 new_col
# 0  1  4      23
# 1  a  d      bc

If you want the columns in a specific order, use something like this:

print(df[[0, 'new_col', 3]])
#    0 new_col  3
# 0  1      23  4
# 1  a      bc  d

That said, it is better to store headers in the CSV file.
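When the file itself cannot be changed, one option is to assign column names at read time. The sketch below assumes the data is read with pd.read_csv; the column names id, left, right, tail and joined are hypothetical and chosen only for illustration, and io.StringIO stands in for a real file.

```python
import io
import pandas as pd

# Stand-in for a headerless CSV file (assumption for illustration).
csv_text = "1,2,3,4\na,b,c,d\n"

# header=None treats the first row as data; names= supplies hypothetical labels.
df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["id", "left", "right", "tail"],
    dtype=str,
)

# Named columns make the join and the final column order easier to read.
df["joined"] = df["left"] + df["right"]
out = df[["id", "joined", "tail"]]

# This time the header row is written out with the data.
csv_out = out.to_csv(index=False)
print(csv_out)
```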

You can do something like this.

import pandas as pd
lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

df = pd.DataFrame(lines)
df['new_col'] = df.iloc[:, 1] + df.iloc[:, 2]
print(df)

Output:

   0  1  2  3 new_col
0  1  2  3  4      23
1  a  b  c  d      bc

You can then drop the columns you don't want.
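One possible way to do the dropping step, shown here as a sketch rather than the answerer's own code, is to compute the joined values first, drop the two source columns, and then use DataFrame.insert to put the new column where the old ones were. The variable name joined is made up for illustration.

```python
import pandas as pd

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

df = pd.DataFrame(lines)

# Compute the joined values before removing the source columns.
joined = df.iloc[:, 1] + df.iloc[:, 2]
df = df.drop(columns=[1, 2])

# Insert the new column at position 1, between columns 0 and 3.
df.insert(1, 'new_col', joined)
print(df)
```

This avoids a separate reordering step, since the column lands in its final position directly.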

Since the OP specified pandas, here's a solution that may work.

Once the data is in pandas, e.g. via pd.read_csv(), you can simply concatenate text (object) columns with +.

import pandas as pd

lines = [ ['1', '2', '3', '4'],
        ['a', 'b', 'c', 'd']]
df = pd.DataFrame(lines)

df[1] = df[1]+df[2]
df.drop(columns=2, inplace=True)
df
#    0   1  3
# 0  1  23  4
# 1  a  bc  d

This should give you what you want in a pandas DataFrame.
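For completeness, here is a sketch of the full round trip from headerless CSV text to headerless CSV text. It assumes the input is read with pd.read_csv using header=None (so the first row stays data) and dtype=str (so every value stays a string); io.StringIO stands in for a real file path.

```python
import io
import pandas as pd

# Stand-in for a headerless CSV file on disk (assumption for illustration).
csv_text = "1,2,3,4\na,b,c,d\n"

# header=None keeps the first row as data; dtype=str keeps values as strings.
df = pd.read_csv(io.StringIO(csv_text), header=None, dtype=str)

# Join columns 1 and 2 into column 1, then drop column 2.
df[1] = df[1] + df[2]
df = df.drop(columns=2)

# Write the result back without a header row or index column.
result = df.to_csv(header=False, index=False)
print(result)
```

Without dtype=str, pandas may infer other types for columns that contain only numbers, and + would then add rather than concatenate.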
