How to add a column to a pandas dataframe if it does not exist?

Question

I'm new to pandas and am writing a script that reads in a dataframe and then does some computation on some of its columns.

Sometimes I will have a column called "Met":

df = pd.read_csv(File, 
  sep='\t', 
  compression='gzip', 
  header=0, 
  names=["Chrom", "Site", "coverage", "Met"]
)

Other times I will have:

df = pd.read_csv(File, 
  sep='\t', 
  compression='gzip', 
  header=0, 
  names=["Chrom", "Site", "coverage", "freqC"]
)

I need to do some computation with the "Met" column, so if it isn't present I need to calculate it using:

df['Met'] = df['freqC'] * df['coverage']

Is there a way to check whether the "Met" column is present in the dataframe and, if not, add it?

Answer 1

You can check it like this:

if 'Met' not in df:
    df['Met'] = df['freqC'] * df['coverage']
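
As a minimal sketch of this check, the snippet below wraps it in a helper and runs it on two small made-up frames standing in for the two file layouts from the question. The helper name `ensure_met` and the sample values are assumptions for illustration only.

```python
import pandas as pd


def ensure_met(df: pd.DataFrame) -> pd.DataFrame:
    """Add a 'Met' column computed from 'freqC' and 'coverage' if it is missing."""
    # `'Met' in df` tests the column labels, like `'Met' in df.columns`.
    if 'Met' not in df:
        df['Met'] = df['freqC'] * df['coverage']
    return df


# Hypothetical data standing in for the two file layouts in the question.
with_met = pd.DataFrame(
    {'Chrom': ['chr1'], 'Site': [100], 'coverage': [10], 'Met': [4.0]}
)
with_freq = pd.DataFrame(
    {'Chrom': ['chr1'], 'Site': [100], 'coverage': [10], 'freqC': [0.5]}
)

with_met = ensure_met(with_met)    # 'Met' already present, left untouched
with_freq = ensure_met(with_freq)  # 'Met' computed as freqC * coverage
print(with_freq)
```

Because the check happens before `df['freqC']` is touched, the frame that already has "Met" never needs a "freqC" column.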

Answer 2

If you want to add columns conditionally in a method chain, consider using pipe() with a lambda:

df.pipe(lambda d: (
    d.assign(Met=d['freqC'] * d['coverage'])
    if 'Met' not in d else d
))
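
A short sketch of how this might sit inside a longer chain follows. The sample data and the extra `sort_values` / `reset_index` steps are assumptions added only to show the chaining. Note that `assign` returns a new dataframe, so the result has to be captured.

```python
import pandas as pd

# Hypothetical input without a 'Met' column.
raw = pd.DataFrame(
    {'Site': [200, 100], 'coverage': [10, 20], 'freqC': [0.5, 0.25]}
)

result = (
    raw
    # Add 'Met' only when it is missing; otherwise pass the frame through.
    .pipe(lambda d: (
        d.assign(Met=d['freqC'] * d['coverage'])
        if 'Met' not in d else d
    ))
    # Further chained steps can rely on 'Met' being present.
    .sort_values('Site')
    .reset_index(drop=True)
)

print(result)
```

The original `raw` frame is left without a "Met" column; only `result` carries it.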

Answer 3

If you are creating the dataframe from scratch, you can create the missing columns without a loop simply by passing the column names to the pd.DataFrame() call:

cols = ['column 1','column 2','column 3','column 4','column 5']
df = pd.DataFrame(list_or_dict, index=['a',], columns=cols)
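
To make the behaviour concrete, here is a small sketch with a made-up dict in place of `list_or_dict` (the data values are assumptions). Column names listed in `columns` that are not keys of the dict come out as all-NaN columns.

```python
import pandas as pd

cols = ['column 1', 'column 2', 'column 3', 'column 4', 'column 5']

# Hypothetical input that only provides two of the five columns.
data = {'column 1': [1], 'column 2': [2]}

df = pd.DataFrame(data, index=['a'], columns=cols)

# 'column 3' to 'column 5' exist but hold only missing values.
print(df)
```

This only creates empty placeholder columns. It does not compute a value such as "Met" from other columns, so for the case in the question one of the conditional approaches is still needed.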

Answer 4

Alternatively, you can use get:

df['Met'] = df.get('Met', df['freqC'] * df['coverage'])

If the column Met exists, its values are used. Otherwise the product of freqC and coverage is used.
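
One caveat worth sketching: Python evaluates the default argument before get() is called, so the one-liner still needs a "freqC" column even when "Met" is already there. In the question's setup, where a file has either "Met" or "freqC", that would raise a KeyError. The frames below are made-up stand-ins, and the lazy variant at the end is just one possible workaround, not part of the original answer.

```python
import pandas as pd

# Hypothetical frames: one has 'Met' but no 'freqC', the other the reverse.
has_met = pd.DataFrame({'coverage': [10], 'Met': [4.0]})
has_freq = pd.DataFrame({'coverage': [10], 'freqC': [0.5]})

# Works: 'freqC' exists, so the default can be computed.
has_freq['Met'] = has_freq.get('Met', has_freq['freqC'] * has_freq['coverage'])

# Fails: the default is evaluated eagerly and 'freqC' is missing.
try:
    has_met['Met'] = has_met.get('Met', has_met['freqC'] * has_met['coverage'])
    raised = False
except KeyError:
    raised = True
print('KeyError raised:', raised)

# Lazy variant: get() returns None when the column is absent,
# so the product is only computed when it is actually needed.
met = has_met.get('Met')
if met is None:
    met = has_met['freqC'] * has_met['coverage']
has_met['Met'] = met
```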

Answer 1: score 83, accepted, 2014-09-17 17:15:26
Answer 2: score 6, 2021-03-05 14:52:46
Answer 3: score 5, 2020-05-27 20:05:54
Answer 4: score 3, 2022-06-07 20:58:21
