
Creating new pandas dataframe by extracting columns from other dataframes - ValueError

I have to extract columns from different pandas dataframes and merge them into a single new dataframe. This is what I am doing:

newdf=pd.DataFrame()
newdf['col1']=sorted(df1.columndf1.unique())
newdf['col2']=df2.columndf2.unique(),
newdf['col3']=df3.columndf3.unique()
newdf

I am sure that the three columns have the same length (I have checked) but I get the error

ValueError: Length of values does not match length of index

I have tried to pass them as pd.Series but the result is the same. I am on Python 2.7.

It seems the problem is that the number of unique values differs between the columns.
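A quick way to check this is to count the unique values in each column rather than the rows. The sketch below uses small made-up dataframes (the data is hypothetical; only the column names come from the question) that have the same number of rows but different numbers of unique values:

```python
import pandas as pd

# Hypothetical sample data: every frame has 4 rows,
# but the columns hold 3, 2 and 4 distinct values.
df1 = pd.DataFrame({'columndf1': [3, 1, 2, 1]})
df2 = pd.DataFrame({'columndf2': ['a', 'b', 'a', 'b']})
df3 = pd.DataFrame({'columndf3': [10, 20, 30, 40]})

# nunique() counts distinct non-null values in a Series.
unique_counts = {
    'col1': df1.columndf1.nunique(),
    'col2': df2.columndf2.nunique(),
    'col3': df3.columndf3.nunique(),
}
print(unique_counts)
```

If the counts differ, assigning the second unique() result to newdf fails, because the first assignment has already fixed the length of the index.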

One possible solution is to concat all the data together and apply unique to each column.
If the columns have different numbers of unique values, the shorter ones are padded with NaN in their last rows.

newdf = (pd.concat([df1.columndf1, df2.columndf2, df3.columndf3], axis=1)
           .apply(lambda x: pd.Series(x.unique())))
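To see the NaN padding in action, here is a minimal runnable sketch of the concat approach. The sample values are invented for illustration, and the final renaming to col1..col3 is an assumption based on the names used in the question:

```python
import pandas as pd

# Hypothetical sample data with 3, 2 and 4 unique values respectively.
df1 = pd.DataFrame({'columndf1': [3, 1, 2, 1]})
df2 = pd.DataFrame({'columndf2': ['a', 'b', 'a', 'b']})
df3 = pd.DataFrame({'columndf3': [10, 20, 30, 40]})

# Put the columns side by side, then replace each one with its unique values.
# Each lambda call returns a Series of a different length; pandas aligns them
# on the index and fills the missing positions with NaN.
newdf = (
    pd.concat([df1.columndf1, df2.columndf2, df3.columndf3], axis=1)
      .apply(lambda x: pd.Series(x.unique()))
)

# Rename to the column names the question wanted.
newdf.columns = ['col1', 'col2', 'col3']
print(newdf)
```

Note that unique() keeps values in order of first appearance, so col1 is not sorted here as it was in the question.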

EDIT:

Another possible solution, which only works if the three lists have the same length:

a = sorted(df1.columndf1.unique())
b = list(df2.columndf2.unique())
c = list(df3.columndf3.unique())

newdf=pd.DataFrame({'col1':a, 'col2':b, 'col3':c})
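If the lists turn out to have different lengths, passing them as plain lists in a dictionary raises a ValueError again. One variant that tolerates this, offered as a sketch rather than as part of the original answer, is to wrap each list in pd.Series so that pandas aligns them on the index and pads with NaN. The sample data is hypothetical:

```python
import pandas as pd

# Hypothetical sample data with 3, 2 and 4 unique values respectively.
df1 = pd.DataFrame({'columndf1': [3, 1, 2, 1]})
df2 = pd.DataFrame({'columndf2': ['a', 'b', 'a', 'b']})
df3 = pd.DataFrame({'columndf3': [10, 20, 30, 40]})

a = sorted(df1.columndf1.unique())
b = list(df2.columndf2.unique())
c = list(df3.columndf3.unique())

# Series values are aligned on their index, so shorter columns get NaN
# at the end instead of triggering a length mismatch.
newdf = pd.DataFrame({
    'col1': pd.Series(a),
    'col2': pd.Series(b),
    'col3': pd.Series(c),
})
print(newdf)
```

This keeps col1 sorted, as in the question, while still allowing the columns to differ in length.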


 