
How do I find the highest number of commas that appear in a single row of one column in a data frame in R?

I want to find the maximum number of commas that appear in any single row of one column.

For example:

     Cars
1    Bugatti (4)","Ferrari (7)","Audi (10)
2    Toyota (6)
3    Tesla (9)","Mercedes(8)
4    Suzuki (11)","Mitsubishi (19)","Ford (7)","BMW (6)

For the column above, the maximum number of commas in a single row is 3, and it occurs on row 4. How do I compute this on a much larger data set (4000+ rows)?

You can use gregexpr() to return a vector of the positions of the comma(s) in each string. Then you can apply the length() function to each result to count the commas:

sapply(gregexpr(",", df$cars), length)
## 2 1 1 3

To answer the exact question asked, wrap the line of code above in max() to get the maximum number of times a comma appears in any one of your strings.
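As a minimal sketch of that wrapping step, the block below rebuilds the question's example as a data frame and takes the maximum. The data frame name df and the lowercase column name cars follow the answer's code and are assumptions about the real data; which.max() is added here only to show where the maximum occurs.

```r
# Example data copied from the question; `df` and `cars` are assumed names.
df <- data.frame(
  cars = c('Bugatti (4)","Ferrari (7)","Audi (10)',
           'Toyota (6)',
           'Tesla (9)","Mercedes(8)',
           'Suzuki (11)","Mitsubishi (19)","Ford (7)","BMW (6)'),
  stringsAsFactors = FALSE
)

# One count per row; a string with no comma is reported as 1 here, not 0.
counts <- sapply(gregexpr(",", df$cars, fixed = TRUE), length)

max(counts)        # 3
which.max(counts)  # 4, the first row that reaches the maximum
```

The maximum is still right whenever at least one string contains a comma, because the miscount only affects strings with none.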


The above actually returns 1 where 0 is expected: gregexpr() returns -1 for a string with no match, and that single value still has length 1. There is probably a more elegant solution, but here's a function that handles zeros correctly:

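count_commas <- function(x) {
    # lapply() always returns a list; sapply() could simplify to a vector or matrix
    y <- lapply(gregexpr(",", x, fixed = TRUE), as.integer) # positions of commas, -1 if none
    y <- lapply(y, function(pos) if (pos[1] == -1) NULL else pos) # drop the "no match" marker
    return( sapply(y, length) ) # return count of commas
}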

count_commas(df$cars)
# 2 0 1 3
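One possible shorter alternative, offered as a sketch rather than as the answer's own method, is to extract the matches with base R's regmatches() and count them with lengths(). The vector name x is chosen here for illustration only.

```r
# Example strings copied from the question; `x` is a name chosen for this sketch.
x <- c('Bugatti (4)","Ferrari (7)","Audi (10)',
       'Toyota (6)',
       'Tesla (9)","Mercedes(8)',
       'Suzuki (11)","Mitsubishi (19)","Ford (7)","BMW (6)')

# regmatches() yields character(0) for a string with no match,
# so lengths() reports 0 for it instead of 1.
comma_counts <- lengths(regmatches(x, gregexpr(",", x, fixed = TRUE)))

comma_counts       # 2 0 1 3
max(comma_counts)  # 3
```

This swaps the explicit check for -1 for the empty-match behaviour of regmatches(), so no helper function is needed.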

My idea is to remove all non-comma characters and then count the characters that remain.

I don't know which class of object you are using for cars. Assuming your input is

cars <- c(' Bugatti (4)","Ferrari (7)","Audi (10)','Toyota (6)','Tesla (9)","Mercedes(8)','Suzuki (11)","Mitsubishi (19)","Ford (7)","BMW (6)')

then you can use nchar(gsub("[^,]","", cars)) to get the number of commas in each row.
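To get from these per-row counts to the answer the question asks for, a sketch along the following lines should work. The variable name comma_counts is introduced here for illustration, and which() is used so that every row tied for the maximum is reported.

```r
# Example input, following the assumed character vector above.
cars <- c('Bugatti (4)","Ferrari (7)","Audi (10)',
          'Toyota (6)',
          'Tesla (9)","Mercedes(8)',
          'Suzuki (11)","Mitsubishi (19)","Ford (7)","BMW (6)')

# Delete every character that is not a comma, then count what is left.
comma_counts <- nchar(gsub("[^,]", "", cars))

comma_counts                              # 2 0 1 3
max(comma_counts)                         # 3
which(comma_counts == max(comma_counts))  # 4
```

Both gsub() and nchar() are vectorised, so the same lines apply unchanged to a column of 4000+ rows, for example by passing a data frame column in place of cars.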


