Dataframes have different sizes when referencing the same dataframe?

I have a strange problem. I have a dataset from which I'm trying to select unique values and then save those values into a dataframe. After that, I want one dataframe with all the data and another with simply a numerical value. Both dataframes should have the same number of rows.

Here's my loop; I'm not sure why it's not working:

for uniqueFundName in HoldingCompanies['fund_ticker'].unique():
  print(uniqueFundName)
  modelY = (HoldingCompanies.loc[HoldingCompanies['fund_ticker'] == uniqueFundName]) 
  modelX = modelY['fund_ticker']
  modelX['fund_ticker'] = 1

  del modelY['fund_ticker']
  print(modelX.shape)
  print(modelY.shape)

This is my output:

GBRE.LSE
(234,)
(233, 174)
MACEX.US
(35,)
(34, 174)
ANFVX.US
(43,)
(42, 174)
LQGH.LSE
(11,)
(10, 174)
HAC.TO
(39,)
(38, 174)
JSAYX.US
(26,)
(25, 174)

As the output shows, modelX always has one more value than modelY. This is confusing because I'm referencing modelY to create modelX.

What am I doing wrong?

modelY['fund_ticker'] is a pandas Series, not a DataFrame. When you later run modelX['fund_ticker'] = 1, you are not setting a column; you are adding a new entry to that same Series under the index label 'fund_ticker', with a value of 1. So the size of modelX grows by 1, because all you did was add another row to the Series.
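To make this concrete, here is a minimal sketch that reproduces the extra row and shows one possible way to get two frames with the same number of rows. The small holding_companies frame and its weight column are made up for illustration and stand in for the asker's HoldingCompanies data; the fix shown is just one option, not necessarily what the asker needs.

```python
import pandas as pd

# Hypothetical stand-in for HoldingCompanies; only 'fund_ticker' comes
# from the question, the 'weight' column is invented for this sketch.
holding_companies = pd.DataFrame({
    "fund_ticker": ["GBRE.LSE", "GBRE.LSE", "MACEX.US"],
    "weight": [0.5, 0.3, 0.2],
})

subset = holding_companies.loc[holding_companies["fund_ticker"] == "GBRE.LSE"]

# Selecting a single column returns a Series, not a DataFrame.
# astype(object) gives an independent object-dtype copy, so the demo
# does not depend on how a given pandas version stores strings.
ticker_series = subset["fund_ticker"].astype(object)
rows_before = ticker_series.shape[0]

# On a Series, [] assignment with a label that is not in the index
# appends a new entry instead of setting a column.
ticker_series["fund_ticker"] = 1
rows_after = ticker_series.shape[0]

# One possible fix: build a one-column DataFrame of ones on the same
# index, and drop the column from a separate frame instead of using del.
model_x = pd.DataFrame({"fund_ticker": 1}, index=subset.index)
model_y = subset.drop(columns="fund_ticker")

print(rows_before, rows_after)        # 2 3
print(model_x.shape, model_y.shape)   # (2, 1) (2, 1)
```

Because model_x is built on subset.index, its rows line up with model_y one to one, which is usually what you want when the two are later passed to a model together.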
