I have a string which is a file name:
File_Name = '23092020_indent.xlsx'
Now I have a dataframe as follows:
Id  fileKey      fileSource  fileStringLookup
10  rel_ind      sap_indent  indent
20  dm_material  sap_mm      mater
30  dm_vendor    sap_vm      vendor
Objective: Find the fileKey and fileSource where fileStringLookup matches the file name.
An exact match is not possible, so I set regex=True. I am using the following code snippet:
if tbl_master_file['fileStringLookup'].str.contains(File_Name,regex=True):
    File_Key = np.where(tbl_master_file['fileStringLookup'].str.contains(File_Name,regex=True),\
                        tbl_master_file['fileKey'],'')
    File_Source = np.where(tbl_master_file['fileStringLookup'].str.contains(File_Name,regex=True),\
                           tbl_master_file['fileSource'],'')
But this is not returning any value for File_Key and File_Source. Instead I am getting the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I investigated further to see whether df['fileStringLookup'].str.contains(File_Name,regex=True) returns True for any row. But it returns False, even for Id=10!
My desired output:
File_Key = 'rel_ind'
File_Source = 'sap_indent'
Am I missing anything?
Your error occurs because the call to str.contains returns a Series of booleans, one for every element of the original Series. The if statement therefore does not know what to check, since the truth value of a Series of booleans is ambiguous.
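A minimal sketch of the ambiguity, using a small frame rebuilt from the table in the question. The names mask, any_match and reverse_mask are illustrative and not from the original post, and regex=False is used here only so that the "." in the file name is not treated as a wildcard. It also suggests why every row came back False: str.contains(File_Name) asks whether the short lookup string contains the whole file name, which is the reverse of what is wanted.

```python
import pandas as pd

# Frame rebuilt from the question's table.
tbl_master_file = pd.DataFrame({
    "Id": [10, 20, 30],
    "fileKey": ["rel_ind", "dm_material", "dm_vendor"],
    "fileSource": ["sap_indent", "sap_mm", "sap_vm"],
    "fileStringLookup": ["indent", "mater", "vendor"],
})
File_Name = "23092020_indent.xlsx"

# Direction used in the question: does "indent" contain "23092020_indent.xlsx"?
mask = tbl_master_file["fileStringLookup"].str.contains(File_Name, regex=False)

# `if mask:` would raise ValueError, so reduce the Series to a single bool first.
any_match = bool(mask.any())

# Reversed direction: is each lookup string a substring of the file name?
reverse_mask = tbl_master_file["fileStringLookup"].apply(lambda s: s in File_Name)
```

With the reversed test, only the row with Id=10 is selected, which matches the desired output in the question.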
I would use df.iterrows() inside a function, like:
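def get_filekey_filesource(filename, df):
    # One dict per row: populated when the row's lookup string
    # occurs in the file name, empty otherwise.
    return [{"fileSource": data.loc["fileSource"],
             "fileKey": data.loc["fileKey"]}
            if data.loc["fileStringLookup"] in filename
            else {}
            for index, data in df.iterrows()]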
As you can see, this returns a list of dictionaries in which the keys fileSource and fileKey hold their respective values for rows that match, or an empty dict where matching fails.
This looks far from ideal, but it is the best I could come up with. Feedback welcome.
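One possible alternative that avoids the explicit row loop, sketched under the assumption that only the first matching row is wanted. The function name find_file_key_source is hypothetical, and the frame is rebuilt from the table in the question.

```python
import pandas as pd

def find_file_key_source(filename, df):
    """Return (fileKey, fileSource) for the first row whose lookup string
    occurs in filename, or None when no row matches."""
    # Boolean mask: True where the row's lookup string is a substring of filename.
    mask = df["fileStringLookup"].apply(lambda s: s in filename)
    matches = df.loc[mask, ["fileKey", "fileSource"]]
    if matches.empty:
        return None
    first = matches.iloc[0]
    return first["fileKey"], first["fileSource"]

# Frame rebuilt from the question's table.
tbl_master_file = pd.DataFrame({
    "Id": [10, 20, 30],
    "fileKey": ["rel_ind", "dm_material", "dm_vendor"],
    "fileSource": ["sap_indent", "sap_mm", "sap_vm"],
    "fileStringLookup": ["indent", "mater", "vendor"],
})

result = find_file_key_source("23092020_indent.xlsx", tbl_master_file)
```

Returning None for the no-match case keeps the caller's check to a single `if result is not None:` instead of filtering empty dicts out of a list.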