
Faster way to get row data (based on a condition) from one dataframe and merge it onto another in pandas (Python)

I have two dataframes with different indices and lengths. For each row of ticker_df, I'd like to fill in the Asset column with the value from asset_df whose ticker and year match. See below.

I created a simplistic implementation using for-loops, but I imagine there are fancier, faster ways to do this?

ticker_df

Year   Ticker    Asset    Doc
2011   Fb        NaN      doc1
2012   Fb        NaN      doc2

asset_df

Year   Ticker    Asset
2011   FB        100
2012   FB        200
2013   GOOG      300

Ideal result for ticker_df

Year   Ticker    Asset    Doc     
2011   Fb        100      doc1
2012   Fb        200      doc2

My sucky implementation:

import numpy as np

for i in ticker_df.index:

    # Rows of asset_df for this company
    c_asset = asset_df[asset_df.tic == ticker_df.loc[i, 'Name']]
    if len(c_asset) > 0:
        # There is asset data on this company; look for the matching year
        asset = c_asset.loc[c_asset.fyear == ticker_df.loc[i, 'Year'], 'at']

        if len(asset) > 0:
            asset = int(asset.iloc[0])
            print('g', asset, type(asset))
            ticker_df.loc[i, 'asset'] = asset
        else:
            ticker_df.loc[i, 'asset'] = np.nan

    else:
        # No asset data at all for this company
        ticker_df.loc[i, 'asset'] = np.nan

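You can use the DataFrame.update method. Just get the indices aligned first: upper-case the tickers so that "Fb" matches "FB", then index both frames by the key columns. Here idx is the list of key columns (['Year', 'Ticker'], judging by the index of the result). update then fills ticker_df in place with the non-NA values from asset_df wherever the index labels and column names match.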

In [23]: ticker_df['Ticker'] = ticker_df.Ticker.str.upper()

In [24]: ticker_df = ticker_df.set_index(idx)

In [25]: asset_df = asset_df.set_index(idx)

In [26]: ticker_df.update(asset_df)

In [27]: ticker_df
Out[27]: 
             Asset   Doc
Year Ticker             
2011 FB        100  doc1
2012 FB        200  doc2

[2 rows x 2 columns]
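
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same update approach. The two frames are rebuilt by hand from the tables in the question, and the value of idx is an assumption (it is not shown in the session above, but the Year/Ticker index in the output suggests it).

```python
import numpy as np
import pandas as pd

# Sample frames reconstructed from the tables in the question.
ticker_df = pd.DataFrame({
    "Year": [2011, 2012],
    "Ticker": ["Fb", "Fb"],
    "Asset": [np.nan, np.nan],
    "Doc": ["doc1", "doc2"],
})
asset_df = pd.DataFrame({
    "Year": [2011, 2012, 2013],
    "Ticker": ["FB", "FB", "GOOG"],
    "Asset": [100, 200, 300],
})

# Normalise ticker case so that "Fb" and "FB" compare equal.
ticker_df["Ticker"] = ticker_df["Ticker"].str.upper()

# Key columns shared by both frames (assumed value of `idx`).
idx = ["Year", "Ticker"]
ticker_df = ticker_df.set_index(idx)
asset_df = asset_df.set_index(idx)

# update() modifies ticker_df in place: non-NA values from asset_df
# overwrite ticker_df wherever index labels and column names match.
# Rows that exist only in asset_df (2013, GOOG) are not added.
ticker_df.update(asset_df)

print(ticker_df)
```

Because update works in place and returns None, the result should not be assigned back to ticker_df. If Year and Ticker are needed as ordinary columns again afterwards, ticker_df.reset_index() restores them.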
