I would like to add another column to an existing DataFrame. The column should contain the values of a dictionary, and each value is a list of tokens. So far it is not working. Is there a way to add them to the DataFrame?
import sys
import pandas as pd

df = pd.read_csv(sys.argv[1], na_values=['no info', '.'], encoding='Cp1252', delimiter=';')
s = pd.DataFrame(dict1).T.reset_index()
print(s)
#result
'''
index 0 1 2
0 231 2470 11854 2368
1 236 3132 11130 1236
2 237 4527 14593 1924
3 238 6167 8222 1070
'''
s.columns = ['number','grade1','grade2','grade3']
print(s.head())
#result
'''
number grade1 grade2 grade3
0 231 2470 11854 2368
1 236 3132 11130 1236
2 237 4527 14593 1924
3 238 6167 8222 1070
'''
df=pd.concat([df,s],axis=1)
print(df)
#result
'''
id ... grade3
0 231 ... 2368
1 236 ... 1236
'''
# Write the result to an Excel file
df.to_excel('exit_test2.xlsx')
# Fill a new column with a list of tokens for each cell. The keys of the two dicts are the same, so I only need the lists of tokens.
df['tokens'] = ' '
for k,v in dict2.items():
    df.at[int(k), 'tokens'] = v
print(df)
# This raises an error. Traceback:
File "Script_JDM_sans_sens.py", line 101, in <module>
df.at[int(k), 'tokens'] = v #change -1 for verbatim
File "C:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 2287, in __setitem__
self.obj._set_value(*key, takeable=self._takeable)
File "C:\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2823, in _set_value
self.loc[index, col] = value
File "C:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 190, in __setitem__
self._setitem_with_indexer(indexer, value)
File "C:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 611, in _setitem_with_indexer
raise ValueError('Must have equal len keys and value '
ValueError: Must have equal len keys and value when setting with an iterable
# Contents of dict2
dict2 = {'231': ['look','eat','at'], '236': ['lay','good', 'fun'], ….}
How can I resolve this error?
You can convert dict2 to a pandas Series object and then add that Series as a column of your DataFrame df.
Convert dict2 to Series:
s = pd.Series(dict2)
If the 'id' column in df is of type int, you'll have to convert the Series index to int:
s.index = s.index.astype(int)
Then, set the index of df to be the same as the Series' index:
df.set_index('id', inplace=True)
And finally add the 'token' column:
df['token'] = s
Here is the result:
     grade1  grade2  grade3            token
id
231    2470   11854    2368  [look, eat, at]
236    3132   11130    1236  [lay, good, fun]
237    4527   14593    1924              NaN
238    6167    8222    1070              NaN
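The steps above can be put together as a small self-contained sketch. The sample frame here is a hypothetical stand-in for the one read from the CSV (only an id column and one grade column), and dict2 is cut down to the two entries shown in the question. The name tokens is used for the Series only to keep it distinct from the earlier s.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's CSV and dict2.
df = pd.DataFrame({
    "id": [231, 236, 237, 238],
    "grade1": [2470, 3132, 4527, 6167],
})
dict2 = {"231": ["look", "eat", "at"], "236": ["lay", "good", "fun"]}

# One list per key; the string keys are cast to int to match df["id"].
tokens = pd.Series(dict2)
tokens.index = tokens.index.astype(int)

# Column assignment aligns on the index, so ids missing from dict2 get NaN.
df = df.set_index("id")
df["token"] = tokens

print(df)
```

Because the assignment aligns on index labels, the order of the keys in dict2 does not matter, and rows without a matching key are left as NaN rather than raising.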
You need to set the index of df to the id column. It is currently a RangeIndex, so the labels you pass to df.at do not match any existing row. Try this:
df['tokens'] = ' '
df = df.set_index('id')
for k,v in dict2.items():
    df.at[int(k), 'tokens'] = v
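Here is a minimal runnable sketch of the same idea, assuming a small hypothetical frame in place of the CSV data. It differs from the snippet above in two ways, both of which are my own assumptions rather than part of the original answer: the column is created with an explicit object dtype so that a cell can hold a Python list, and keys that have no matching row are skipped.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's CSV and dict2.
df = pd.DataFrame({
    "id": [231, 236, 237, 238],
    "grade1": [2470, 3132, 4527, 6167],
})
dict2 = {"231": ["look", "eat", "at"], "236": ["lay", "good", "fun"]}

# With the default RangeIndex the row labels are 0..3, not the ids.
print(231 in df.index)

df = df.set_index("id")
# Explicit object dtype (an assumption of this sketch) so cells can store lists.
df["tokens"] = pd.Series([None] * len(df), index=df.index, dtype=object)

for k, v in dict2.items():
    key = int(k)
    if key in df.index:  # skip keys that have no matching row
        df.at[key, "tokens"] = v

print(df)
```

Rows whose id does not appear in dict2 keep the None placeholder.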