I want to count the occurrences of contaminants, but some cases have more than one contaminant, so value_counts() counts each combination as a single value. For example, "Gasoline, Diesel" = 8. How can I count them separately without doing it manually?
Would it also be possible to create a function that makes it easier to categorize them into, let's say, 4 types of contaminant? I just need a clue or a simple explanation of what I need to do.
import pandas as pd
data = pd.read_csv('Data gathered.csv')
data
data['CONTAMINANTS'].value_counts().plot(kind='barh').invert_yaxis()
Assuming the contaminants are always separated by commas in your data, you can use pandas.Series.str.split() to get them into lists. Then you can get them into distinct rows with pandas.DataFrame.explode(), which finally allows using the value_counts() method.
For example:
import pandas as pd
data = pd.DataFrame({'File Number': [1, 2, 3, 4],
                     'CONTAMINANTS': ['ACENAPHTENE, ANTHRACENE, BENZ-A-ANTHRACENE',
                                      'CHLORINATED SOLVENTS',
                                      'DIESEL, GASOLINE, ACENAPHTENE',
                                      'GASOLINE, ACENAPHTENE']})
data
   File Number                                CONTAMINANTS
0            1  ACENAPHTENE, ANTHRACENE, BENZ-A-ANTHRACENE
1            2                        CHLORINATED SOLVENTS
2            3               DIESEL, GASOLINE, ACENAPHTENE
3            4                       GASOLINE, ACENAPHTENE
data['CONTAMINANTS'] = data['CONTAMINANTS'].str.split(pat=', ')
data_long = data.explode('CONTAMINANTS')
data_long['CONTAMINANTS'].value_counts()
ACENAPHTENE             3
GASOLINE                2
DIESEL                  1
ANTHRACENE              1
BENZ-A-ANTHRACENE       1
CHLORINATED SOLVENTS    1
Name: CONTAMINANTS, dtype: int64
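If the separators in the real file are less regular than in this example (a missing space after a comma, stray whitespace, mixed case), splitting on ', ' alone could leave near-duplicate labels such as 'DIESEL' and ' diesel'. A minimal sketch of one way to guard against that, assuming commas are the only delimiter; the sample strings below are made up for illustration and are not taken from the original data:

```python
import pandas as pd

# Made-up sample with inconsistent spacing and case (an assumption, not real data)
raw = pd.Series(['Gasoline,Diesel', 'GASOLINE ,  DIESEL', ' diesel'],
                name='CONTAMINANTS')

counts = (
    raw.str.split(',')   # split on bare commas, whatever the spacing
       .explode()        # one contaminant per row
       .str.strip()      # drop surrounding whitespace
       .str.upper()      # normalize case so variants collapse together
       .value_counts()
)
print(counts)
```

Normalizing after explode() rather than before keeps the original column untouched, which may be convenient if the raw strings are still needed elsewhere.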
To categorize the contaminants, you could define a dictionary that maps them to types. Then you can use that dictionary to add a column of types to the exploded dataframe:
types = {'ACENAPHTENE': 1,
         'GASOLINE': 2,
         'DIESEL': 2,
         'ANTHRACENE': 1,
         'BENZ-A-ANTHRACENE': 1,
         'CHLORINATED SOLVENTS': 3}
data_long['contaminant type'] = data_long['CONTAMINANTS'].apply(lambda x: types[x])
data_long
   File Number          CONTAMINANTS  contaminant type
0            1           ACENAPHTENE                 1
0            1            ANTHRACENE                 1
0            1     BENZ-A-ANTHRACENE                 1
1            2  CHLORINATED SOLVENTS                 3
2            3                DIESEL                 2
2            3              GASOLINE                 2
2            3           ACENAPHTENE                 1
3            4              GASOLINE                 2
3            4           ACENAPHTENE                 1
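Since the question asks for a function, the dictionary lookup can be wrapped in one. The sketch below is only one possible shape: the function name categorize, the four type labels ('PAH', 'FUEL', 'SOLVENT', 'OTHER'), and the 'LEAD' entry are hypothetical placeholders, and it uses Series.map() with a fallback so that a contaminant missing from the dictionary gets a default label instead of raising a KeyError.

```python
import pandas as pd

# Hypothetical grouping; the labels and memberships are placeholders to adapt
TYPE_BY_CONTAMINANT = {
    'ACENAPHTENE': 'PAH',
    'ANTHRACENE': 'PAH',
    'BENZ-A-ANTHRACENE': 'PAH',
    'GASOLINE': 'FUEL',
    'DIESEL': 'FUEL',
    'CHLORINATED SOLVENTS': 'SOLVENT',
}

def categorize(contaminants: pd.Series, mapping: dict,
               default: str = 'OTHER') -> pd.Series:
    """Map each contaminant name to a type; unknown names get `default`."""
    # Series.map() yields NaN for keys missing from the dict; fill those in
    return contaminants.map(mapping).fillna(default)

# Already-exploded names; 'LEAD' is a made-up entry that is not in the mapping
names = pd.Series(['ACENAPHTENE', 'ANTHRACENE', 'DIESEL', 'GASOLINE',
                   'CHLORINATED SOLVENTS', 'LEAD'], name='CONTAMINANTS')

contaminant_types = categorize(names, TYPE_BY_CONTAMINANT)
type_counts = contaminant_types.value_counts()
print(type_counts)
```

With the exploded frame from the answer, the call would be along the lines of data_long['contaminant type'] = categorize(data_long['CONTAMINANTS'], TYPE_BY_CONTAMINANT), after which value_counts() on the new column gives the count per type.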