pandas query for string returns empty result
I have a DataFrame imported with read_csv, as in the sample below, but when I use query to filter its content I get an empty result.
The result I expect from the query is rows 0 and 2.
(pandas v1.3.1, Python v3.9)
import pandas as pd

df1 = pd.read_csv(r'C:\Users\Dorin\Desktop\folder_files\test_1.txt',
                  encoding='utf-8',
                  sep=';',
                  names=["i_line", "f_path", "f_type", "f_hash"],
                  dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
                  keep_default_na=False,
                  na_values=['_'],
                  index_col=False)
DataFrame, print(df1):
i_line f_path f_type f_hash
0 i: 1 "content 1" d n/a
1 i: 2 "content 2" f 1111
2 i: 3 "content 3" d n/a
Result of the query, print(df1.query("f_hash == 'n/a'")):
Empty DataFrame
Columns: [i_line, f_path, f_type, f_hash]
Index: []
File content
In your file, the separator is not ";" but rather "; " (a semicolon followed by an optional space).
Thus your "n/a" is in fact " n/a", with a leading space, so the comparison f_hash == 'n/a' matches nothing.
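One way to confirm this is to print the column as a Python list, where the quotes make stray whitespace visible. The sketch below is only illustrative: the file content is not shown in the post, so SAMPLE is an assumed reconstruction based on the printed DataFrame, and it is read from an in-memory buffer instead of the original path.

```python
import io

import pandas as pd

# Assumed file content (not shown in the post): fields separated by "; ".
SAMPLE = (
    'i: 1; "content 1"; d; n/a\n'
    'i: 2; "content 2"; f; 1111\n'
    'i: 3; "content 3"; d; n/a\n'
)

columns = ["i_line", "f_path", "f_type", "f_hash"]

# Same options as in the question, with sep=';' only.
df_raw = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=";",
    names=columns,
    dtype={name: "string" for name in columns},
    keep_default_na=False,
    na_values=["_"],
    index_col=False,
)

# The list repr shows the leading space that print(df_raw) hides.
hash_values = df_raw["f_hash"].tolist()
print(hash_values)  # [' n/a', ' 1111', ' n/a']
```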
You have to change the separator in read_csv:
df1 = pd.read_csv('/tmp/t.csv',
                  encoding='utf-8',
                  sep='; ?',        # regex: ";" followed by an optional space
                  engine='python',  # regex separators need the Python engine; stating it avoids a ParserWarning
                  names=["i_line", "f_path", "f_type", "f_hash"],
                  dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
                  keep_default_na=False,
                  na_values=['_'],
                  index_col=False)
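A possible alternative, not part of the answer above, is to keep sep=';' and pass skipinitialspace=True, which tells read_csv to drop whitespace that directly follows the delimiter. The sketch below assumes the same reconstructed file content as a hypothetical SAMPLE string (the real file is not shown in the post), and it filters with a boolean mask, which selects the same rows as the query in the question.

```python
import io

import pandas as pd

# Assumed file content (not shown in the post): fields separated by "; ".
SAMPLE = (
    'i: 1; "content 1"; d; n/a\n'
    'i: 2; "content 2"; f; 1111\n'
    'i: 3; "content 3"; d; n/a\n'
)

columns = ["i_line", "f_path", "f_type", "f_hash"]

df_clean = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=";",
    skipinitialspace=True,  # drop the space that follows each ";"
    names=columns,
    dtype={name: "string" for name in columns},
    keep_default_na=False,
    na_values=["_"],
    index_col=False,
)

# Equivalent to df_clean.query("f_hash == 'n/a'").
matches = df_clean[df_clean["f_hash"] == "n/a"]
print(matches.index.tolist())  # [0, 2]
```

Because the single-character separator is kept, this variant stays on the default C parser instead of the Python engine that a regex separator requires.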