Is there a way to find the number of occurrences of each value in a column in another column?

I have two dataframes called dataset1 and dataset2 (shown below). The "pattern" and "SAX" columns contain string values.

dataset1=
       pattern   tstamps
0    glngsyu     1610460
1    zicobgm     1610466
2    eerptow        .
3    cqbsynt        .
4    zvmqben        .
..       ...
475  rfikekw
476  bnbzvqx
477  rsuhgax
478  ckhloio
479  lbzujtw

480 rows × 1 columns

dataset2 =
    SAX     timestamp
0   hssrlcu 16015
1   ktyuymp 16016
2   xncqmfr 16017
3   aanlmna 16018
4   urvahvo 16019
... ... ...
263455  jeivqzo 279470
263456  bzasxgw 279471
263457  jspqnqv 279472
263458  sxwfchj 279473
263459  gxqnhfr 279474

263460 rows × 2 columns

Is there a way to get the occurrence count of each row of pattern (dataset1) in SAX (dataset2)? Basically, the number of times each value in the pattern column of dataset1 appears in the SAX column of dataset2.

Something basically like this:

dataset1=
       pattern  no. of occurrences
0    glngsyu          3
1    zicobgm          0
2    eerptow          1
.       .             .
.       .             .
.       .             .
479  lbzujtw          2

480 rows × 2 columns

Thanks.

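This should do it. Note the use of .get with a default of 0, so that patterns which never appear in SAX get a count of 0 instead of raising a KeyError (which .loc[x] would do):

# Count how many times each distinct value appears in SAX
dataset2_SAX_value_counts = dataset2["SAX"].value_counts()
# Look up each pattern's count, falling back to 0 when it is absent
dataset1["no. of occurrences"] = dataset1["pattern"].apply(lambda x: dataset2_SAX_value_counts.get(x, 0))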
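
For anyone who wants to try the idea in isolation, here is a minimal self-contained sketch. The two small frames are made-up stand-ins for dataset1 and dataset2 (the values are illustrative, not the asker's real data), and it uses Series.map rather than apply, which is just one alternative way to do the same lookup. Patterns that are missing from SAX come back as NaN from map, so they are filled with 0 afterwards.

```python
import pandas as pd

# Toy stand-ins for the question's frames (assumed values, for illustration only).
dataset1 = pd.DataFrame({"pattern": ["glngsyu", "zicobgm", "eerptow"]})
dataset2 = pd.DataFrame(
    {"SAX": ["glngsyu", "hssrlcu", "glngsyu", "eerptow", "glngsyu"]}
)

# Count each distinct SAX value once, up front.
sax_counts = dataset2["SAX"].value_counts()

# Look up every pattern in the counts; patterns absent from SAX
# map to NaN, which is replaced by 0 before casting back to int.
dataset1["no. of occurrences"] = (
    dataset1["pattern"].map(sax_counts).fillna(0).astype(int)
)

print(dataset1)
```

With this toy data the counts come out as 3, 0 and 1. Computing value_counts once and then looking patterns up in it avoids rescanning the 263460 rows of dataset2 for each of the 480 patterns.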
