
How do I get the aggregate number of a variable against another variable in R? Both these variables are non-numeric

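I have this dataset, and I am trying to create a new variable (n_commitments) that gives me the total number of paragraphs per country. I know this is super basic, but I have somehow been stuck on it for an hour. I think it has to do with the fact that both variables are of class character, whereas I want a number as output.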

Please help, so I can finally move on. Thank you.

      structure(list(country = c("Afghanistan", "Afghanistan"), paragraphs = c("The representative of Afghanistan confirmed that his Government would ensure the transparency of its ongoing privatization programme. He stated that his Government would provide reports to WTO Members on developments in its privatisation programme, periodically and upon request, as long as the programme would be in existence, and along the lines of the information already provided to the Working Party during the accession process. The Working Party took note of this commitment. ", 
"The representative of Afghanistan confirmed that from the date of accession, State-trading enterprises (including State-owned and State-controlled enterprises, enterprises with special or exclusive privileges, and unitary enterprises) in Afghanistan would make any purchases or sales, which were not for the Government's own use or consumption, solely in accordance with commercial considerations, including price, quality, availability, marketability, transportation and other conditions of purchase or sale. He further confirmed that these State trading enterprises would afford the enterprises of other Members adequate opportunity, in accordance with customary business practice, to compete for participation in purchases from or sales to Afghanistan's State enterprises. The Working Party took note of these commitments.  "
)), row.names = 1:2, class = "data.frame")

    Columns: 8
$ country            <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanis…
$ category           <chr> "State Ownership and Privatization; State-Trading Entities", "State Ownership and Pr…
$ paragraphs         <chr> "The representative of Afghanistan confirmed that his Government would ensure the tr…
$ year_complete      <int> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, …
$ year_start         <int> 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, …
$ accession_duration <int> 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, …
$ wto                <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ n_commitments      <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", …

Here is how to count unique paragraphs by country:

library(dplyr)

df %>%
  group_by(country) %>%
  summarize(n_unique_paragraphs = n_distinct(paragraphs))
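
As a quick illustration of what n_distinct() does here, the sketch below uses a small toy data frame (hypothetical, not the asker's data) in which one country has a duplicated paragraph, and compares the distinct count with a plain row count:

```r
library(dplyr)

# Toy data (hypothetical): country "A" has the paragraph "p1" twice
toy <- data.frame(
  country    = c("A", "A", "A", "B"),
  paragraphs = c("p1", "p1", "p2", "p3"),
  stringsAsFactors = FALSE
)

res <- toy %>%
  group_by(country) %>%
  summarize(
    n_rows              = n(),                    # every row per country
    n_unique_paragraphs = n_distinct(paragraphs)  # distinct paragraph texts only
  )

print(res)
```

The two columns differ only for countries that contain repeated paragraph text, so on data where each row is already a unique paragraph they should agree.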

If, as you say, "each row of the data is a unique paragraph", then we can simplify and just count rows:

df %>%
  group_by(country) %>%
  summarize(n = n())

There is also a built-in utility function for this:

df %>% count(country)
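
The question asks for a new column (n_commitments) rather than a summary table. One possible sketch, assuming the goal is to keep every row and store the per-country count on each of them, is to compute the count inside mutate() on grouped data. The toy data frame below is hypothetical and merely stands in for df; it includes an empty character n_commitments column like the one in the question, which the mutate() call overwrites with integers.

```r
library(dplyr)

# Toy stand-in for df (hypothetical values)
toy <- data.frame(
  country       = c("Afghanistan", "Afghanistan", "Albania"),
  paragraphs    = c("text 1", "text 2", "text 3"),
  n_commitments = "",  # empty character column, as in the question
  stringsAsFactors = FALSE
)

out <- toy %>%
  group_by(country) %>%
  mutate(n_commitments = n()) %>%  # rows per country, repeated on each row
  ungroup()

print(out)
```

If duplicated paragraphs should be counted only once, n() can be swapped for n_distinct(paragraphs) in the same mutate() call.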

