
Python: count occurrences in CSV with pandas

I'm new to Python. I'm working on a small project and got a little confused.

I have 2 CSV files that look like this:

all_cars:

first_Car,second_car
Mazda, Skoda
Ferrari, Volkswagen
Volkswagen, Toyota
BMW, Ferrari
BMW, Mercedes

super_cars:

super_car_name
Ferrari
BMW
Mercedes

What I'm basically trying to do is count how many times each car from file 2 appears in file 1. If a car appears only in file 1 and not in file 2, I don't want it counted.

The result I'm trying to get, based on my example files, is:

Ferrari : 2
BMW : 2
Mercedes : 1

I'd do it this way:

In [220]: d1.stack().value_counts().to_frame('car').loc[d2.super_car_name]
Out[220]:
          car
Ferrari     2
BMW         2
Mercedes    1

where d1 and d2 are your source DataFrames (which can easily be parsed from the CSV files using pd.read_csv()):

In [218]: d1
Out[218]:
    first_Car  second_car
0       Mazda       Skoda
1     Ferrari  Volkswagen
2  Volkswagen      Toyota
3         BMW     Ferrari
4         BMW    Mercedes

In [219]: d2
Out[219]:
  super_car_name
0        Ferrari
1            BMW
2       Mercedes
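
For completeness, here is a minimal sketch of the whole pipeline, from CSV text to counts. It assumes the files contain exactly the sample data from the question; io.StringIO stands in for the real file paths, which aren't given. Two choices here are my own assumptions rather than part of the answer above: skipinitialspace=True is passed because the sample rows have a space after each comma, and reindex(..., fill_value=0) is used in place of .loc so that a super car that never appears in all_cars gets a count of 0 instead of raising a KeyError on recent pandas versions.

```python
import io

import pandas as pd

# Sample data copied from the question; in practice these would be file paths.
ALL_CARS_CSV = """first_Car,second_car
Mazda, Skoda
Ferrari, Volkswagen
Volkswagen, Toyota
BMW, Ferrari
BMW, Mercedes
"""

SUPER_CARS_CSV = """super_car_name
Ferrari
BMW
Mercedes
"""

# skipinitialspace=True drops the space that follows each comma,
# so " Ferrari" is read as "Ferrari".
d1 = pd.read_csv(io.StringIO(ALL_CARS_CSV), skipinitialspace=True)
d2 = pd.read_csv(io.StringIO(SUPER_CARS_CSV), skipinitialspace=True)

# Flatten both columns into one Series, count every car name,
# then keep only the names listed in d2 (absent names get 0).
counts = d1.stack().value_counts().reindex(d2["super_car_name"], fill_value=0)

print(counts)
```

With real files, replacing the io.StringIO(...) arguments with the two file paths should be the only change needed.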

You can use isin to find the matches, then stack and value_counts to get everything in one table (df1 and df2 are the two DataFrames read from the CSV files):

df1[df1.isin(df2.super_car_name.values)].stack().value_counts()

Ferrari     2
BMW         2
Mercedes    1
dtype: int64
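
One caveat with isin is that it matches strings exactly. The sketch below illustrates this under an assumption: the frames are built by hand to mimic what a plain pd.read_csv() call would likely produce from the sample files, where the second column keeps the space after each comma. The name cleaned is introduced only for this illustration. Stripping whitespace first lets the second-column values match.

```python
import pandas as pd

# Hand-built frames mimicking a plain pd.read_csv() of the sample files:
# the second column's values keep their leading space.
df1 = pd.DataFrame({
    "first_Car": ["Mazda", "Ferrari", "Volkswagen", "BMW", "BMW"],
    "second_car": [" Skoda", " Volkswagen", " Toyota", " Ferrari", " Mercedes"],
})
df2 = pd.DataFrame({"super_car_name": ["Ferrari", "BMW", "Mercedes"]})

# Strip surrounding whitespace in every column so the names compare equal.
cleaned = df1.apply(lambda col: col.str.strip())

# Boolean mask of cells holding a super car; non-matching cells become NaN
# when the mask is applied, and value_counts ignores NaN.
mask = cleaned.isin(df2["super_car_name"].tolist())
counts = cleaned[mask].stack().value_counts()

print(counts)
```

Unlike the reindex approach, this variant simply omits any super car that never appears in df1.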
