
pandas query for string returns empty result

I have a DataFrame imported with read_csv, as in the sample below, but when I use query to filter the content I get an empty result.

The result I expect from the query is rows 0 and 2.

(pandas v1.3.1, Python v3.9)

df1 = pd.read_csv(r'C:\Users\Dorin\Desktop\folder_files\test_1.txt',
              encoding='utf-8',
              sep=';',
              names=["i_line", "f_path", "f_type", "f_hash"],
              dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
              keep_default_na=False,
              na_values=['_'],
              index_col=False)

DataFrame, print(df1):

  i_line        f_path f_type f_hash
0   i: 1   "content 1"      d    n/a
1   i: 2   "content 2"      f   1111
2   i: 3   "content 3"      d    n/a

Result of the query, print(df1.query("f_hash == 'n/a'")):

Empty DataFrame
Columns: [i_line, f_path, f_type, f_hash]
Index: []

File content: [screenshot in the original post]

In your file, the separator is not ";" but rather "; " (a semicolon followed by an optional space).

Thus your "n/a" is in fact " n/a" (with a leading space), so the comparison f_hash == 'n/a' matches no rows.
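A quick way to check for this kind of problem is to look at the raw values instead of the printed frame, where a leading space is easy to miss. The sketch below is only an illustration: the original file is not reproduced here, so the in-memory `sample` text is a hypothetical stand-in that assumes fields separated by "; ". It uses a plain boolean mask, which performs the same equality comparison as the query in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for test_1.txt (assumed layout: fields separated by "; ").
sample = (
    'i: 1; "content 1"; d; n/a\n'
    'i: 2; "content 2"; f; 1111\n'
    'i: 3; "content 3"; d; n/a\n'
)

# Read it the same way as in the question, with sep=';'.
df = pd.read_csv(io.StringIO(sample),
                 sep=';',
                 names=["i_line", "f_path", "f_type", "f_hash"],
                 dtype='string',
                 keep_default_na=False,
                 na_values=['_'],
                 index_col=False)

# tolist() shows the values with their quotes, so leading spaces become visible.
hash_values = df['f_hash'].tolist()
print(hash_values)

# Number of rows whose f_hash equals 'n/a' exactly.
n_matches = int((df['f_hash'] == 'n/a').sum())
print(n_matches)
```

If the values print with a leading space, as they should for this sample, the empty result is explained by the data and not by query itself.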

You have to change the separator in read_csv:

df1 = pd.read_csv('/tmp/t.csv',
              encoding='utf-8',
              sep='; ?',  # regex: ";" followed by an optional space
              engine='python',  # regex separators are handled by the Python parser
              names=["i_line", "f_path", "f_type", "f_hash"],
              dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
              keep_default_na=False,
              na_values=['_'],
              index_col=False)
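A possible alternative, not part of the answer above, is to keep sep=';' and pass skipinitialspace=True, a read_csv parameter that skips spaces following the delimiter. The sketch below assumes the same hypothetical in-memory `sample` layout (fields separated by "; "), since the real file is not shown. It is worth checking against the actual file before relying on it.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file (assumed layout: fields separated by "; ").
sample = (
    'i: 1; "content 1"; d; n/a\n'
    'i: 2; "content 2"; f; 1111\n'
    'i: 3; "content 3"; d; n/a\n'
)

df = pd.read_csv(io.StringIO(sample),
                 sep=';',
                 skipinitialspace=True,  # skip spaces that follow each ';'
                 names=["i_line", "f_path", "f_type", "f_hash"],
                 dtype='string',
                 keep_default_na=False,
                 na_values=['_'],
                 index_col=False)

# Same comparison as the query in the question, written as a boolean mask.
matched = df[df['f_hash'] == 'n/a']
matched_rows = matched.index.tolist()
print(matched_rows)
```

Compared with the regex separator, this keeps a single-character delimiter, so read_csv does not need to treat the separator as a regular expression.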
