I have a DataFrame imported with read_csv, as in the sample below, but when I use query to filter the content I get an empty result.
The result I expect from query is rows 0 and 2.
(pandas v1.3.1, python v3.9)
df1 = pd.read_csv(r'C:\Users\Dorin\Desktop\folder_files\test_1.txt',
                  encoding='utf-8',
                  sep=';',
                  names=["i_line", "f_path", "f_type", "f_hash"],
                  dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
                  keep_default_na=False,
                  na_values=['_'],
                  index_col=False)
DataFrame, from print(df1):
  i_line       f_path f_type f_hash
0   i: 1  "content 1"      d    n/a
1   i: 2  "content 2"      f   1111
2   i: 3  "content 3"      d    n/a
Result of query, from print(df1.query("f_hash == 'n/a'")):
Empty DataFrame
Columns: [i_line, f_path, f_type, f_hash]
Index: []
File content
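In your file, the separator is not ";" but rather "; " (a semicolon followed by an optional space).
Thus your "n/a" is in fact " n/a", with a leading space, so the comparison with 'n/a' in query never matches.
You have to change the separator in read_csv: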
df1 = pd.read_csv('/tmp/t.csv',
                  encoding='utf-8',
                  sep=r'; ?',       # regex: ";" followed by an optional space
                  engine='python',  # regex separators need the python engine; this avoids a ParserWarning
                  names=["i_line", "f_path", "f_type", "f_hash"],
                  dtype={'i_line': 'string', 'f_path': 'string', 'f_type': 'string', 'f_hash': 'string'},
                  keep_default_na=False,
                  na_values=['_'],
                  index_col=False)
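To check the diagnosis without the original file, here is a minimal sketch that reads an in-memory sample instead. The SAMPLE text is an assumption reconstructed from the printed DataFrame (a single space after each ";"), and load is a hypothetical helper, not part of pandas. It also shows skipinitialspace=True, a read_csv option that may serve as an alternative to the regex separator.

```python
import io

import pandas as pd

# Hypothetical file content, reconstructed from the printed DataFrame:
# each ";" is followed by a single space.
SAMPLE = (
    'i: 1; "content 1"; d; n/a\n'
    'i: 2; "content 2"; f; 1111\n'
    'i: 3; "content 3"; d; n/a\n'
)


def load(**kwargs):
    """Read SAMPLE with the options from the question plus any overrides."""
    return pd.read_csv(
        io.StringIO(SAMPLE),
        names=["i_line", "f_path", "f_type", "f_hash"],
        dtype="string",
        keep_default_na=False,
        na_values=["_"],
        index_col=False,
        **kwargs,
    )


# Original call: the space after ";" stays inside each value.
naive = load(sep=";")
leading = repr(naive.loc[0, "f_hash"])  # repr() makes the leading space visible
print(leading)

# Alternative to a regex separator: skip spaces that follow the delimiter.
stripped = load(sep=";", skipinitialspace=True)
matches = stripped.query("f_hash == 'n/a'")
print(matches.index.tolist())
```

Printing a value with repr() is a quick way to spot stray whitespace that print(df1) hides through column alignment. With skipinitialspace=True the default parser is kept, so the double quotes around f_path should then be treated as quoting and dropped from the values, which may or may not be what you want.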