Pandas query on a string column returns an empty result

Question

I have a data frame imported with read_csv, as in the sample below, but when I use query to filter its content I get an empty result.

The result I expect from the query is rows 0 and 2.

(pandas v1.3.1, Python v3.9)

import pandas as pd

df1 = pd.read_csv(r'C:\Users\Dorin\Desktop\folder_files\test_1.txt',
              encoding='utf-8',
              sep=';',
              names=["i_line", "f_path", "f_type", "f_hash"],
              dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
              keep_default_na=False,
              na_values=['_'],
              index_col=False)

DataFrame, print(df1):

  i_line        f_path f_type f_hash
0   i: 1   "content 1"      d    n/a
1   i: 2   "content 2"      f   1111
2   i: 3   "content 3"      d    n/a

Result of the query, print(df1.query("f_hash == 'n/a'")):

Empty DataFrame
Columns: [i_line, f_path, f_type, f_hash]
Index: []

File content

Answer 1

In your file, the separator is not ";" alone but ";" followed by an optional space.

Thus the value that prints as n/a is in fact " n/a" (with a leading space), so the comparison f_hash == 'n/a' never matches.
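One way to confirm this kind of hidden whitespace is to look at the raw values instead of the pretty-printed frame. The snippet below is only a sketch: the Series is built by hand with hypothetical values that imitate what sep=';' would leave in f_hash, since the original file is not shown here.

```python
import pandas as pd

# Hypothetical values imitating f_hash after parsing with sep=';'
# (each one keeps the space that followed the separator).
f_hash = pd.Series([" n/a", " 1111", " n/a"], dtype="string")

# A plain list shows each value in quotes, which exposes the leading space
# that the aligned DataFrame printout hides.
print(f_hash.tolist())

# The exact comparison finds nothing, while the stripped comparison matches.
exact = f_hash == "n/a"
stripped = f_hash.str.strip() == "n/a"
print(exact.tolist())
print(stripped.tolist())
```

If the stripped comparison matches where the exact one does not, stray whitespace from parsing is the likely cause.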

You have to change the separator in read_csv:

df1 = pd.read_csv('/tmp/t.csv',
              encoding='utf-8',
              sep='; ?',  # regex: ";" followed by an optional space
              engine='python',  # regex separators are handled by the Python parser
              names=["i_line", "f_path", "f_type", "f_hash"],
              dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
              keep_default_na=False,
              na_values=['_'],
              index_col=False)
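To see the effect without the original file, here is a minimal, self-contained sketch. The SAMPLE text is an assumption reconstructed from the printed DataFrame, not the real file, and load and matching are hypothetical helper names. It compares the original separator with the regex separator and, as an alternative not mentioned in the answer, with read_csv's skipinitialspace=True option. The query is run with engine="python" so the sketch does not depend on numexpr being installed.

```python
import io

import pandas as pd

# Assumed file content, reconstructed from the printed DataFrame.
SAMPLE = (
    'i: 1; "content 1"; d; n/a\n'
    'i: 2; "content 2"; f; 1111\n'
    'i: 3; "content 3"; d; n/a\n'
)
COLUMNS = ["i_line", "f_path", "f_type", "f_hash"]


def load(sep, **kwargs):
    """Parse SAMPLE with the question's options and the given separator."""
    return pd.read_csv(
        io.StringIO(SAMPLE),
        sep=sep,
        names=COLUMNS,
        dtype="string",
        keep_default_na=False,
        na_values=["_"],
        index_col=False,
        **kwargs,
    )


def matching(df):
    """Rows whose f_hash is exactly 'n/a'."""
    return df.query("f_hash == 'n/a'", engine="python")


broken = load(";")                            # values keep a leading space
fixed_regex = load("; ?", engine="python")    # the fix from the answer
fixed_skip = load(";", skipinitialspace=True) # alternative: drop spaces after ';'

print(len(matching(broken)))
print(list(matching(fixed_regex).index))
print(list(matching(fixed_skip).index))
```

The two fixes differ in one detail: with the regex separator the quotes around f_path stay in the data, whereas skipinitialspace=True lets the default parser treat them as quoting and remove them.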
