
Is there a way to clean the null data of 2 different CSVs at once?

My aim is to create one DataFrame that holds the data from both CSVs and still lets me deal with the rows that have null values.

I have 2 CSVs (link to the Google Sheet): sheet 1 is nifty and sheet 2 is nsebank. When I clean the sheets individually, I either use dropna or replace the null values with, say, the mean. The code I use is below.

However, the nifty sheet has 35 null values while the nsebank sheet has 305. Because the counts differ, I want to know whether I can read both CSVs into the same DataFrame and handle the null values there. For instance, since the nsebank sheet has far more null values, if I read it into the same DataFrame as the nifty sheet and call dropna, a lot of valid nifty data for those dates will be dropped as well.
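The concern above depends on how the two sheets are combined. The sketch below uses small made-up frames (the values and dates are assumptions, not the real sheet data) to compare placing the sheets side by side with stacking them under a `source` label; `wide`, `long` and `source` are illustrative names only.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for the two sheets; the real files are not used here.
dates = pd.date_range("2020-01-01", periods=5, freq="D")
nifty = pd.DataFrame({"Adj Close": [100.0, 101.0, np.nan, 103.0, 104.0]}, index=dates)
bnf = pd.DataFrame({"Adj Close": [200.0, np.nan, np.nan, np.nan, 204.0]}, index=dates)

# Side by side: one row per date, one column group per sheet.
wide = pd.concat({"nifty": nifty, "bnf": bnf}, axis=1)
wide_clean = wide.dropna()  # a date is dropped if EITHER sheet is null on it

# Stacked: one row per (source, date) pair.
long = pd.concat({"nifty": nifty, "bnf": bnf}, names=["source", "Date"])
long_clean = long.dropna()  # only the rows that are themselves null are dropped

print(len(wide_clean))
print(long_clean.groupby(level="source").size())
```

In the side-by-side layout the nulls in one sheet cost rows in the other, while in the stacked layout each sheet loses only its own null rows.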

For example, if I pull data from AV or any other data provider, cleaning up the data for each stock individually is going to be tedious and slow.

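import pandas as pd

# Load the nsebank sheet, using the first column as the index.
bnf = pd.read_csv('nsebank.csv', index_col=0)
# Drop rows with nulls; .copy() makes newbnf an independent frame before a column is added.
newbnf = bnf.dropna().copy()

newbnf['Daily Returns'] = newbnf['Adj Close'].pct_change()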

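I simply concatenated the two Excel sheets and called dropna(). Because pd.concat stacks the rows of one sheet under the other, dropna() removes only the rows that themselves contain a null, provided both sheets have the same columns. See if that works for you.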

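# Read each sheet of the workbook into its own DataFrame.
df_nifty = pd.read_excel('Indexdata.xlsx', sheet_name='nifty')
df_bnf = pd.read_excel('Indexdata.xlsx', sheet_name='bnf')
# Stack the rows of both sheets, then drop every row that contains a null.
df = pd.concat([df_nifty, df_bnf])
df.dropna(inplace=True)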
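One caveat with a plain concat is that the combined frame no longer records which sheet a row came from. This matters if nulls are filled with a mean, or if pct_change is computed across the boundary between the two sheets. A minimal sketch follows, assuming made-up data in place of Indexdata.xlsx and a hypothetical `source` index level added through the `keys` argument of `pd.concat`:

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for the two sheets (values are assumptions).
dates = pd.date_range("2020-01-01", periods=4, freq="D")
df_nifty = pd.DataFrame({"Adj Close": [100.0, np.nan, 102.0, 104.0]}, index=dates)
df_bnf = pd.DataFrame({"Adj Close": [200.0, 210.0, np.nan, 220.0]}, index=dates)

# keys= labels each block of rows with the sheet it came from.
df = pd.concat([df_nifty, df_bnf], keys=["nifty", "bnf"], names=["source", "Date"])

# Option A: drop null rows; each sheet loses only its own nulls.
dropped = df.dropna().copy()

# Option B: fill nulls with the mean of the same sheet, not the overall mean.
filled = df.copy()
filled["Adj Close"] = (
    filled.groupby(level="source")["Adj Close"]
    .transform(lambda s: s.fillna(s.mean()))
)

# Compute returns within each sheet so the first bnf row is not
# compared against the last nifty row.
dropped["Daily Returns"] = dropped.groupby(level="source")["Adj Close"].pct_change()
print(dropped)
```

A single sheet can still be pulled back out with `dropped.loc["nifty"]` or `filled.loc["bnf"]`.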


 