Write the last non-null row of a pandas DataFrame to CSV?

Question

This writes all records, including those with a null PbRatio. I would like to write only the last non-null record. When I add df[df.asOfDate == df.asOfDate.max()].to_csv, it gets the last record, which is always null.

import pandas as pd
from yahooquery import Ticker

symbols = ['AAPL', 'GOOG', 'MSFT', 'NVDA']
header = ["asOfDate", "PbRatio"]

for tick in symbols:
    faang = Ticker(tick)
    df = faang.valuation_measures
    try:
        # Add any expected column that is missing so to_csv can select it
        for column_name in header:
            if column_name not in df.columns:
                df.loc[:, column_name] = None
        # Appends every row, including rows where PbRatio is null
        df.to_csv('output.csv', mode='a', index=True, header=False, columns=header)
    except AttributeError:
        # Raised when the result has no .columns, i.e. is not a DataFrame
        continue

Current output: [output not included]

Desired output: [output not included]

Answer 1

This should work. Filter out the rows where PbRatio is NaN first, then filter for the max asOfDate.

for tick in symbols:
    faang = Ticker(tick)
    df = faang.valuation_measures
    try:
        # Add any expected column that is missing
        for column_name in header:
            if column_name not in df.columns:
                df.loc[:, column_name] = None
    except AttributeError:
        # Skip tickers whose result is not a DataFrame
        continue

    # keep only rows with a non-null PbRatio
    df = df[df['PbRatio'].notna()]

    # of those, keep the row(s) with the latest date
    df = df[df['asOfDate'] == df['asOfDate'].max()]
    df.to_csv('output.csv', mode='a', index=True, header=False, columns=header)
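The order of the two filters is what matters here. The following is a minimal sketch on made-up data (the dates and ratios are hypothetical, and an in-memory buffer stands in for output.csv so nothing is fetched from Yahoo); it shows that the max-date filter alone picks the null row, while filtering nulls first picks the last row with a value.

```python
import io
import pandas as pd

# Hypothetical stand-in for one ticker's valuation_measures frame
df = pd.DataFrame(
    {
        "asOfDate": pd.to_datetime(["2022-09-30", "2022-12-21", "2022-12-22"]),
        "PbRatio": [40.1, 42.5, None],
    },
    index=pd.Index(["AAPL"] * 3, name="symbol"),
)

# Max date on the unfiltered frame: selects the row whose PbRatio is null
latest_any = df[df["asOfDate"] == df["asOfDate"].max()]

# Drop null PbRatio rows first, then take the max date of what remains
valid = df[df["PbRatio"].notna()]
latest_valid = valid[valid["asOfDate"] == valid["asOfDate"].max()]

# Write to an in-memory buffer instead of output.csv
buffer = io.StringIO()
latest_valid.to_csv(buffer, index=True, header=False, columns=["asOfDate", "PbRatio"])
csv_text = buffer.getvalue()
print(csv_text)
```

Only one line is written per ticker, and it carries the most recent non-null PbRatio.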

Answer 2

Here I created some dummy data to work with; it would have been nice if you had provided data.

df = pd.DataFrame([['A',12,123],['A',13,125],['A',2,None],['B',16,133],
['B',16,None],['B',14,139]], columns=['Name','id','score'])

    Name    id  score
0   A   12  123.0
1   A   13  125.0
2   A   2   NaN
3   B   16  133.0
4   B   16  NaN
5   B   14  139.0

Then drop the rows with missing values:

df = df.dropna(how = 'any')

The result looks like this:

    Name    id  score
0   A   12  123.0
1   A   13  125.0
3   B   16  133.0
5   B   14  139.0
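Note that how='any' drops a row when any column is missing, not just the one of interest. As a small sketch on a hypothetical variant of the dummy data (one id is made null for illustration), the subset parameter of dropna limits the check to chosen columns:

```python
import pandas as pd

# Hypothetical variant of the dummy data with a missing id in row 1
df = pd.DataFrame(
    [["A", 12, 123], ["A", None, 125], ["A", 2, None]],
    columns=["Name", "id", "score"],
)

# how="any" drops a row if any column is missing
all_cols = df.dropna(how="any")

# subset restricts the null check to the listed columns
score_only = df.dropna(subset=["score"])

print(all_cols)
print(score_only)
```

For the question's data, the equivalent would presumably be subset=['PbRatio'], so that nulls in other columns do not discard otherwise valid rows.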

I get the set of unique names; this corresponds to whatever column holds 'AAPL'/'NVDA' in your data.

names = set(df['Name'])

Create a new dataframe that holds only the last row for each unique name. In my example the names are 'A' and 'B'; in yours they should be 'AAPL'/'NVDA'.

new_df = pd.DataFrame(columns=df.columns)
for n in names:
    new_df.loc[new_df.shape[0]] = df.loc[df.query(f"Name== '{n}'").index[-1]]

The result should look like this:

new_df
>>>

    Name    id  score
0   B   14  139.0
1   A   13  125.0
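Because a set has no guaranteed order, the rows above may come out in a different order from run to run. As an alternative sketch (not part of the original answer), the same result can be obtained without a loop by combining dropna with groupby and tail, which keeps the rows in their original order:

```python
import pandas as pd

# Same dummy data as above
df = pd.DataFrame(
    [["A", 12, 123], ["A", 13, 125], ["A", 2, None],
     ["B", 16, 133], ["B", 16, None], ["B", 14, 139]],
    columns=["Name", "id", "score"],
)

# Drop rows with a null score, then keep the last remaining row per name
last_rows = (
    df.dropna(subset=["score"])
      .groupby("Name", sort=False)
      .tail(1)
      .reset_index(drop=True)
)

print(last_rows)
```

This assumes the rows are already sorted so that "last" means "most recent"; if not, sort by the date column before grouping.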

Answer 1: score 2, accepted, 2022-12-22 16:55:27
Answer 2: score 0, 2022-12-22 17:11:40
