How do I sum two frequency datasets that have the same variables?

Question

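Hi, I have created two frequency datasets of tennis players who won a certain number of matches. Both datasets have exactly the same column order.

How I created the datasets: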

PROC FREQ data=projet.matchs;
    TABLES player1 / out = table1;
run;

player1              Fréquence  Pourcentage  Fréquence cumulée  Pourcentage cumulé
Adrian Mannarino     3          1.18         3                  1.18
Agnieszka Radwanska  2          0.79         5                  1.97
Ajla Tomljanovic     1          0.39         6                  2.36
Albert Ramos         1          0.39         7                  2.76

The second dataset, table2:

PROC FREQ data=projet.matchs;
    TABLES player2 / out= table2;
run;

player2              Fréquence  Pourcentage  Fréquence cumulée  Pourcentage cumulé
Adrian Mannarino     1          0.39         1                  0.39
Alex Bolt            1          0.39         2                  0.79
Alex De Minaur       1          0.39         3                  1.18
Alexander Zverev     3          1.18         6                  2.36

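What I want is to create a new dataset containing the sum of table1 and table2. My datasets are much larger; I have only shown the first 4 observations of each.

Any help would be much appreciated! Thanks.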

Answer 1

How about this? Does it work for you?

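data combined / view=combined;
    /* Stack both tables; rename player2 so all names share one column */
    set table1 table2(rename=(player2=player1));
run;

/* The OUT= datasets from PROC FREQ store the statistics as COUNT and PERCENT */
proc means data=combined nway sum;
    class player1;
    var count percent;
run;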
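
If the raw data is still available, another option is to stack the two player columns first and run PROC FREQ once, so the counts, percentages and cumulative values are all computed on the combined list. This is only a sketch: it assumes that an appearance in either player1 or player2 should be counted, and the dataset names players_long and table_total are made up for illustration.

```sas
/* Assumption: player1 and player2 are character variables of compatible
   length; if they differ, the length comes from the first dataset read
   and longer names could be truncated. */
data players_long;
    set projet.matchs(keep=player1 rename=(player1=player))
        projet.matchs(keep=player2 rename=(player2=player));
run;

/* One frequency table over the combined column */
proc freq data=players_long;
    tables player / out=table_total;
run;
```

table_total then has one row per player, with COUNT and PERCENT taken over both columns.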

Answer 2

You can use proc sql with the sum function to combine them into one table.

Have1 dataset:

+---------------------+-----------+-------------+------------------+-------------------+
|       player1       | Frequence | Pourcentage | Frequencecumulee | Pourcentagecumule |
+---------------------+-----------+-------------+------------------+-------------------+
| Adrian Mannarino    |         3 |        1.18 |                3 |              1.18 |
| Agnieszka Radwanska |         2 |        0.79 |                5 |              1.97 |
| Ajla Tomljanovic    |         1 |        0.39 |                6 |              2.36 |
| Albert Ramos        |         1 |        0.39 |                7 |              2.76 |
+---------------------+-----------+-------------+------------------+-------------------+

Have2 dataset:

+------------------+-----------+-------------+------------------+-------------------+
|     player2      | Frequence | Pourcentage | Frequencecumulee | Pourcentagecumule |
+------------------+-----------+-------------+------------------+-------------------+
| Adrian Mannarino |         1 |        0.39 |                1 |              0.39 |
| Alex Bolt        |         1 |        0.39 |                2 |              0.79 |
| Alex De Minaur   |         1 |        0.39 |                3 |              1.18 |
| Alexander Zverev |         3 |        1.18 |                6 |              2.36 |
+------------------+-----------+-------------+------------------+-------------------+

Solution:

proc sql noprint;
   create table want1 as
   select 
            coalesce(player1,player2) as player,
            sum(t1.Frequence,t2.Frequence) as Frequence,
            sum(t1.Pourcentage,t2.Pourcentage) as Pourcentage,
            sum(t1.Frequencecumulee,t2.Frequencecumulee) as Frequencecumulee,
            sum(t1.Pourcentagecumule,t2.Pourcentagecumule) as Pourcentagecumule
   from
            have1 t1
   full join 
            have2 t2
   on
            strip(player1)=strip(player2);
quit;

Output:

+---------------------+-----------+-------------+------------------+-------------------+
|       player        | Frequence | Pourcentage | Frequencecumulee | Pourcentagecumule |
+---------------------+-----------+-------------+------------------+-------------------+
| Adrian Mannarino    |         4 |        1.57 |                4 |              1.57 |
| Agnieszka Radwanska |         2 |        0.79 |                5 |              1.97 |
| Ajla Tomljanovic    |         1 |        0.39 |                6 |              2.36 |
| Albert Ramos        |         1 |        0.39 |                7 |              2.76 |
| Alex Bolt           |         1 |        0.39 |                2 |              0.79 |
| Alex De Minaur      |         1 |        0.39 |                3 |              1.18 |
| Alexander Zverev    |         3 |        1.18 |                6 |              2.36 |
+---------------------+-----------+-------------+------------------+-------------------+

Or you can try a data step + proc summary:

data want2;
  set have2(rename=(player2=player)) have1(rename=(player1=player));
run;

proc summary data=want2 nway;
  var Frequence Pourcentage Frequencecumulee Pourcentagecumule;
  class player;
  output out=want2 (drop=_:) sum=;
run;

Output:

+---------------------+-----------+-------------+------------------+-------------------+
|       player        | Frequence | Pourcentage | Frequencecumulee | Pourcentagecumule |
+---------------------+-----------+-------------+------------------+-------------------+
| Adrian Mannarino    |         4 |        1.57 |                4 |              1.57 |
| Agnieszka Radwanska |         2 |        0.79 |                5 |              1.97 |
| Ajla Tomljanovic    |         1 |        0.39 |                6 |              2.36 |
| Albert Ramos        |         1 |        0.39 |                7 |              2.76 |
| Alex Bolt           |         1 |        0.39 |                2 |              0.79 |
| Alex De Minaur      |         1 |        0.39 |                3 |              1.18 |
| Alexander Zverev    |         3 |        1.18 |                6 |              2.36 |
+---------------------+-----------+-------------+------------------+-------------------+
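
Note that in both outputs the cumulative columns are simply added row by row, so they no longer form a running total. One way to rebuild them, sketched here on the assumption that want2 from the step above exists with the summed counts in Frequence, is to feed the combined counts back into PROC FREQ with a WEIGHT statement. The output name want_freq is hypothetical.

```sas
/* Treat each row's summed count as its weight, so PROC FREQ
   recomputes percentages and cumulative values on the combined data */
proc freq data=want2;
    tables player / out=want_freq outcum;  /* OUTCUM adds cumulative columns to the OUT= dataset */
    weight Frequence;
run;
```

The percentages produced this way are relative to the combined total of both tables, so they will differ from the row-wise sums of Pourcentage shown above.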

Answer 3

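Sure, use ODS table output instead. This gives you a nice clean version. The table named temp is the output of proc freq, which I then clean up into a displayable table named want. It is quite generic, so change your dataset name and variable names in the first step and everything else should work as is.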

*Run frequency for tables;
ods table onewayfreqs=temp;
proc freq data=sashelp.class;
    table sex age;
run;

*Format output;
data want;
length variable $32. variable_value $50.;
set temp;
Variable=scan(table, 2);

Variable_Value=strip(trim(vvaluex(variable)));

keep variable variable_value frequency percent cum:;
label variable='Variable' 
    variable_value='Variable Value';
run;

*Display;
proc print data=want(obs=20) label;
run;
