I'm new to Python and got a little confused while working on a small project.
I have 2 CSV files that look like this:
all_cars:
first_Car,second_car
Mazda, Skoda
Ferrari, Volkswagen
Volkswagen, Toyota
BMW, Ferrari
BMW, Mercedes
super_cars:
super_car_name
Ferrari
BMW
Mercedes
What I'm basically trying to do is count how many times each car from file 2 appears in file 1. If a car appears only in file 1 and not in file 2, I don't want it.
Based on my example files, the result I'm after is:
Ferrari : 2
BMW : 2
Mercedes : 1
I'd do it this way:
In [220]: d1.stack().value_counts().to_frame('car').loc[d2.super_car_name]
Out[220]:
car
Ferrari 2
BMW 2
Mercedes 1
where d1 and d2 are your source DataFrames (which can easily be parsed from the CSV files using the pd.read_csv() method):
In [218]: d1
Out[218]:
first_Car second_car
0 Mazda Skoda
1 Ferrari Volkswagen
2 Volkswagen Toyota
3 BMW Ferrari
4 BMW Mercedes
In [219]: d2
Out[219]:
super_car_name
0 Ferrari
1 BMW
2 Mercedes
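For completeness, here is a rough end-to-end sketch starting from the CSV text itself. It assumes the files really do have a space after each comma, as shown in the question, in which case skipinitialspace=True is needed so that " Ferrari" is read as "Ferrari". The io.StringIO buffers are only stand-ins for the real file paths, and using reindex with fill_value=0 is my own variation on .loc: it reports 0 for a super car that never appears instead of raising a KeyError.

```python
import io

import pandas as pd

# Inline copies of the two files from the question (stand-ins for real paths).
all_cars_csv = """first_Car,second_car
Mazda, Skoda
Ferrari, Volkswagen
Volkswagen, Toyota
BMW, Ferrari
BMW, Mercedes
"""
super_cars_csv = """super_car_name
Ferrari
BMW
Mercedes
"""

# skipinitialspace=True strips the blank that follows each comma.
d1 = pd.read_csv(io.StringIO(all_cars_csv), skipinitialspace=True)
d2 = pd.read_csv(io.StringIO(super_cars_csv), skipinitialspace=True)

# Flatten both columns into one Series and count every car name.
counts = d1.stack().value_counts()

# Keep only the super cars; a super car missing from d1 gets a count of 0.
result = counts.reindex(d2["super_car_name"].tolist(), fill_value=0)
print(result)
```

With the sample data this should give 2 for Ferrari, 2 for BMW and 1 for Mercedes, in the order the names appear in the super cars file.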
You can use isin to find the matches, then stack and value_counts to get everything in one table:
df1[df1.isin(df2.super_car_name.values)].stack().value_counts()
Ferrari 2
BMW 2
Mercedes 1
dtype: int64
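To make the one-liner easier to follow, here it is split into its intermediate steps. This is only an illustrative sketch: the DataFrames are built inline from the question's sample data rather than read from files, and the names mask and matches are mine.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from CSV.
df1 = pd.DataFrame({
    "first_Car": ["Mazda", "Ferrari", "Volkswagen", "BMW", "BMW"],
    "second_car": ["Skoda", "Volkswagen", "Toyota", "Ferrari", "Mercedes"],
})
df2 = pd.DataFrame({"super_car_name": ["Ferrari", "BMW", "Mercedes"]})

# Step 1: boolean DataFrame, True wherever a cell holds a super car name.
mask = df1.isin(df2["super_car_name"].values)

# Step 2: indexing with the mask keeps matching cells and turns the rest into NaN.
matches = df1[mask]

# Step 3: flatten to a single Series and count; value_counts skips NaN by default.
counts = matches.stack().value_counts()
print(counts)
```

Because non-matching cells are blanked out before counting, cars such as Mazda or Toyota never show up in the result.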