Create a column based on repetition in another column

I have a table with several columns. I need to go down one of them and create a new column from it: if the same number appears twice, the new column gets a marker (for example "p"), and if it appears only once, the new column gets "-". Anyone?

The column holds barcodes such as TCGA-3M-AB47-01A-22R-A414-31, and the part I need is AB47.

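# Extract the third dash-separated field ("AB47"); index 2 would return "3M".
# strsplit is vectorized, so no for loop over tabela$Barcode is needed.
codes <- sapply(strsplit(as.character(tabela$Barcode), "-"), function(x) x[[3]])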

This is essentially equivalent to the common "find all duplicates" question (which is, ironically, very widely duplicated).

For a vector x such as x = c("A", "A", "C", "B", "C", "D", "E", "F"), the most common answer is to call the duplicated function twice, once as is and once with fromLast = TRUE, and combine the two results with | to flag all duplicates. This gives a logical vector indicating whether each value occurs more than once. Adding 1 to it converts TRUE / FALSE to 2 / 1, which can be used as a subsetting index into your desired markings:

y <- c("-", "P")[(duplicated(x) | duplicated(x, fromLast = TRUE)) + 1]
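
Applied to the barcode table from the question, the same idea might look like the sketch below. The sample rows are made up for illustration (only the first barcode appears in the question), and the column name Flag is a hypothetical choice, not something the question specifies.

```r
# Hypothetical sample data; only the first barcode comes from the question.
tabela <- data.frame(
  Barcode = c(
    "TCGA-3M-AB47-01A-22R-A414-31",
    "TCGA-3M-AB47-11A-22R-A414-31",  # made-up row sharing the code AB47
    "TCGA-XX-CD12-01A-22R-A414-31"   # made-up row with a unique code
  ),
  stringsAsFactors = FALSE
)

# Third dash-separated field of each barcode, e.g. "AB47"
code <- vapply(strsplit(tabela$Barcode, "-"), function(x) x[3], character(1))

# TRUE for every row whose code occurs more than once, first occurrence included
repeated <- duplicated(code) | duplicated(code, fromLast = TRUE)

# FALSE/TRUE + 1 gives 1/2, which indexes into c("-", "p")
tabela$Flag <- c("-", "p")[repeated + 1]
```

With these three rows, tabela$Flag comes out as "p", "p", "-": both AB47 rows are marked, and the single CD12 row is not.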
