Python: create a boolean df from existing df, if column values equal to
I noticed an error in my code and would like your help with my GUI.
I have a function which gets a selected column name (line 3), identifies all the unique values of that column, and then creates one new data frame per unique value.
I noticed an issue with line 8: I couldn't find a function similar to contains() that checks for equality instead, and I am trying to avoid loops in this case. Any help will be appreciated. Thanks!
1)  def basic_splitter():
2)      global df
3)      column = combobox_column_list.get()
4)      unique_values = df[column].unique()
5)      for i in unique_values:
6)
7)          # first df[] will split the original data frame into smaller data frames based on i value
8)          df_output = df[df[column].str.contains(i)]
9)
10)         output_path = csv_xlsx_file_path + '/' + i + '.xlsx'
11)         df_output.to_excel(output_path, sheet_name=i, index=False)
12)     label_after_split = Label(my_frame_1, text="Saved in: " + csv_xlsx_file_path)
13)     label_after_split.grid(row=4, column=1)
Error message:
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\orkhamir\AppData\Local\Programs\Python\Python310\lib\tkinter\__init__.py", line 1921, in __call__
    return self.func(*args)
  File "C:\Users\orkhamir\AppData\Local\Temp\1/ipykernel_1976/2220190921.py", line 76, in basic_splitter
    df_output = df[df[column].str.contains(i)]
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!
Convert the column to str and then run the function.
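As a rough illustration of that suggestion, the sketch below reproduces the error on a numeric column and then converts the column to text first. The data frame and the column name `dept` are made up for the example and are not from the original code.

```python
import pandas as pd

# Hypothetical data: a numeric column, similar to the one that triggered the error
df = pd.DataFrame({"dept": [10, 20, 10], "name": ["a", "b", "c"]})

# The .str accessor is only available on string-like columns
try:
    df["dept"].str.contains("10")
    raised = False
except AttributeError:
    raised = True

# Converting the column to text first makes the .str accessor usable
as_text = df["dept"].astype(str)
mask = as_text.str.contains("10")
```

Note that the value passed to `contains()` also has to be a string once the column has been converted.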
UPDATE: I have changed the code to the following, which solves all the issues I had previously.
def basic_splitter():
    global df
    column = combobox_column_list.get()
    unique_values = df[column].unique()
    for i in range(len(unique_values)):
        # create a new file to store the df
        output_path = 'C:/Users/orkhamir/Desktop/New folder/' + str(unique_values[i]) + '.xlsx'
        # create a first df where the column value is equal to first unique value
        df_output = df[df[column] == unique_values[i]]
        df_output.to_excel(output_path, sheet_name=str(unique_values[i]), index=False)
    label_after_split = Label(my_frame_1, text="Saved in: " + csv_xlsx_file_path)
    label_after_split.grid(row=4, column=1)
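A possible alternative to calling unique() and building a boolean mask for each value is to let pandas do the splitting with groupby(). The following is only a sketch: the helper name `split_by_column` and the sample data are hypothetical, and the Excel export is left out so the example runs without an Excel writer installed.

```python
import pandas as pd

def split_by_column(frame, column):
    """Return a dict mapping each unique value of `column` to its sub-frame."""
    # groupby yields (key, sub-frame) pairs, one per distinct value.
    # Rows whose key is NaN are dropped by default.
    return {key: group for key, group in frame.groupby(column)}

# Hypothetical sample data, not from the original question
df = pd.DataFrame({"city": ["Baku", "Oslo", "Baku"], "sales": [1, 2, 3]})
parts = split_by_column(df, "city")
```

Each sub-frame in `parts` could then be written out with to_excel() in the same way as in the function above, using str(key) for the file and sheet names.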
You need to make sure your column is of type string before trying to call the str accessor on it. Just try:
df_output = df[df[column].astype('string').str.contains(i)]
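One thing to keep in mind with this approach: contains() does a substring match (and treats the pattern as a regular expression by default), so it can select more rows than an equality check would. A small sketch with made-up values shows the difference; the column name `code` is an assumption for the example.

```python
import pandas as pd

# Hypothetical values where one is a prefix of another
df = pd.DataFrame({"code": ["A1", "A10", "B2"]})

# Substring match: "A1" is also found inside "A10"
contains_mask = df["code"].astype("string").str.contains("A1")

# Exact match: only the row equal to "A1" is selected
equals_mask = df["code"] == "A1"
```

If the goal is one file per unique value, the equality mask is probably the safer choice, since each row then ends up in exactly one output.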