sample_file.txt
6|test|3|4
5|test||8
9|test|NA|12
Script
import pandas as pd
df = pd.read_csv('sample_file.txt', dtype='str', sep='|', names=['upc_cd', 'chr_typ', 'chr_vl','chr_vl_typ'])
df["chr_vl"].fillna("NOT AVLBL", inplace = True)
print(df)
Current output
upc_cd chr_typ chr_vl chr_vl_typ
0 6 test 3 4
1 5 test NOT AVLBL 8
2 9 test NOT AVLBL 12
Required output
upc_cd chr_typ chr_vl chr_vl_typ
0 6 test 3 4
1 5 test NOT AVLBL 8
2 9 test NA 12
Basically, I need NA to stay as it is in the output, while null (empty) values should be replaced with the specific text 'NOT AVLBL'. I tried the replace method as well, but couldn't get the desired output.
The pandas read_csv function already defines a set of strings that are interpreted as NaN when you load a CSV file. You can either extend that list with other strings or overwrite it completely. In your case you have to overwrite it, as NA is one of the default values used by pandas. To do so, you could try something like:
df = pd.read_csv('sample_file.txt', dtype='str', sep='|',
names=['upc_cd', 'chr_typ', 'chr_vl','chr_vl_typ'],
na_values=[''], keep_default_na=False)
...
This will interpret only the empty string as NA, because we have set keep_default_na to False and passed only '' as an NA value through the na_values argument. If you want to learn more, have a look at the pandas docs.
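As a minimal end-to-end sketch of the approach above, the snippet below reads the sample rows from an in-memory string instead of sample_file.txt (the `raw` variable and the use of io.StringIO are assumptions made only so the example is self-contained) and then fills the remaining missing values:

```python
import io

import pandas as pd

# Same three rows as sample_file.txt, held in memory for illustration.
raw = "6|test|3|4\n5|test||8\n9|test|NA|12\n"

df = pd.read_csv(
    io.StringIO(raw),
    dtype="str",
    sep="|",
    names=["upc_cd", "chr_typ", "chr_vl", "chr_vl_typ"],
    na_values=[""],          # treat only the empty string as missing
    keep_default_na=False,   # drop the built-in list, which includes 'NA'
)

# Assign the result back instead of using inplace=True on a column selection.
df["chr_vl"] = df["chr_vl"].fillna("NOT AVLBL")
print(df)
```

Assigning the filled column back is a stylistic choice here; it avoids relying on in-place modification of a selected column, which newer pandas versions warn about.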
Pandas read_csv is a bit too clever here. The problem is that many strings are commonly used to identify missing values in CSV files.
According to the official documentation:
... By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NA', 'NULL', 'NaN', 'n/a', 'nan', 'null'.
So your dataframe does contain a NaN where the file had NA, and fillna fills it just as it normally would.
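A quick way to confirm this is to load the data with the default settings and check which entries of chr_vl are missing. The sketch below uses an in-memory copy of the sample rows (the `raw` and `df_default` names are only illustrative):

```python
import io

import pandas as pd

# Same three rows as sample_file.txt, held in memory for illustration.
raw = "6|test|3|4\n5|test||8\n9|test|NA|12\n"

# Default NA handling: both the empty field and the literal 'NA' become NaN.
df_default = pd.read_csv(
    io.StringIO(raw),
    dtype="str",
    sep="|",
    names=["upc_cd", "chr_typ", "chr_vl", "chr_vl_typ"],
)

missing = df_default["chr_vl"].isna().tolist()
print(missing)  # [False, True, True]
```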
To accept only the empty string as NaN, you have to both set na_values to '' and keep_default_na to False:
df = pd.read_csv('sample_file.txt', dtype='str', sep='|',
names=['upc_cd', 'chr_typ', 'chr_vl','chr_vl_typ'],
na_values='', keep_default_na=False)