
When using a pandas dataframe, how do I add a column if it does not exist?

I'm new to pandas and am writing a script that reads in a dataframe and then does some computation on some of its columns.

Sometimes I will have the column called "Met":

df = pd.read_csv(File, 
  sep='\t', 
  compression='gzip', 
  header=0, 
  names=["Chrom", "Site", "coverage", "Met"]
)

Other times I will have:

df = pd.read_csv(File, 
  sep='\t', 
  compression='gzip', 
  header=0, 
  names=["Chrom", "Site", "coverage", "freqC"]
)

I need to do some computation with the "Met" column, so if it isn't present I will need to calculate it using:

df['Met'] = df['freqC'] * df['coverage'] 

Is there a way to check whether the "Met" column is present in the dataframe and, if not, add it?

You check it like this:

if 'Met' not in df:
    df['Met'] = df['freqC'] * df['coverage'] 
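
As a minimal sketch of that check in action, the snippet below wraps it in a helper and runs it on two small in-memory frames instead of the gzipped files from the question. The helper name ensure_met and the sample values are hypothetical, chosen only for illustration.

```python
import pandas as pd


def ensure_met(df: pd.DataFrame) -> pd.DataFrame:
    """Add a 'Met' column (freqC * coverage) only when it is missing."""
    # `in` on a DataFrame tests membership among the column labels.
    if 'Met' not in df:
        df['Met'] = df['freqC'] * df['coverage']
    return df


# Hypothetical data: one frame arrives with freqC, the other with Met.
with_freq = pd.DataFrame({'Chrom': ['chr1', 'chr1'], 'Site': [10, 20],
                          'coverage': [4, 8], 'freqC': [0.5, 0.25]})
with_met = pd.DataFrame({'Chrom': ['chr1', 'chr1'], 'Site': [10, 20],
                         'coverage': [4, 8], 'Met': [3.0, 5.0]})

with_freq = ensure_met(with_freq)  # Met is computed from freqC * coverage
with_met = ensure_met(with_met)    # existing Met values are left untouched
```

Note that the helper modifies the frame it receives in place, which matches the behavior of the two-line check above.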

To add a column conditionally inside a method chain, consider using pipe() with a lambda:

df.pipe(lambda d: (
    d.assign(Met=d['freqC'] * d['coverage'])
    if 'Met' not in d else d
))
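
The same idea reads a little more clearly with a named function instead of a lambda. This is a sketch under assumptions: add_met_if_missing and the sample frame are hypothetical names and data, not part of the original answer. Because assign() returns a new frame, the result of the chain has to be bound to a name; the input frame is not changed.

```python
import pandas as pd


def add_met_if_missing(d: pd.DataFrame) -> pd.DataFrame:
    """Return d unchanged if it has 'Met', else a copy with 'Met' added."""
    if 'Met' in d:
        return d
    return d.assign(Met=d['freqC'] * d['coverage'])


# Hypothetical input that lacks the 'Met' column.
raw = pd.DataFrame({'coverage': [4, 8], 'freqC': [0.5, 0.25]})

# pipe() passes the frame to the function, so the step fits in a chain.
result = (
    raw
    .pipe(add_met_if_missing)
    .sort_values('coverage')
)
```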

If you were creating the dataframe from scratch, you could create the missing columns without a loop simply by passing the column names to the pd.DataFrame() call:

cols = ['column 1','column 2','column 3','column 4','column 5']
df = pd.DataFrame(list_or_dict, index=['a',], columns=cols)
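
A runnable version of this, assuming the data is a dict that supplies only some of the requested columns (the values here are made up for illustration), could look like the following. Columns named in cols but absent from the dict are created and filled with missing values.

```python
import pandas as pd

cols = ['column 1', 'column 2', 'column 3', 'column 4', 'column 5']

# Hypothetical data that only provides two of the five columns.
data = {'column 1': [1], 'column 3': [3]}

# Columns listed in `cols` but missing from `data` are added as NaN.
df = pd.DataFrame(data, index=['a'], columns=cols)
```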

Alternatively, you can use get:

df['Met'] = df.get('Met', df['freqC'] * df['coverage'])    

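If the column Met exists, its values are kept. Otherwise Met is computed by multiplying freqC and coverage. Note that the default passed to get() is evaluated before get() is called, so df['freqC'] must exist even when Met is already present; if it does not, this line raises a KeyError.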
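
Python evaluates the default argument eagerly, so the one-liner above can fail on a frame that has Met but no freqC, which is one of the two layouts in the question. The sketch below first reproduces that failure and then shows one possible workaround that still uses get(). The helper name met_or_compute and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical frames matching the two layouts described in the question.
with_met = pd.DataFrame({'coverage': [4, 8], 'Met': [3.0, 5.0]})
with_freq = pd.DataFrame({'coverage': [4, 8], 'freqC': [0.5, 0.25]})

# The default expression runs before get() is called, so the missing
# 'freqC' column raises KeyError even though 'Met' is present.
try:
    with_met.get('Met', with_met['freqC'] * with_met['coverage'])
    raised = False
except KeyError:
    raised = True


def met_or_compute(df: pd.DataFrame) -> pd.Series:
    """Return the 'Met' column, computing it only when it is missing."""
    met = df.get('Met')  # get() returns None when the column is absent
    if met is None:
        met = df['freqC'] * df['coverage']
    return met


with_met['Met'] = met_or_compute(with_met)
with_freq['Met'] = met_or_compute(with_freq)
```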
