How to get all values of a specific column based on a specific value in another column in R?

My CSV files have the following format:

Camp.csv

Campaign,AdGroup,Keyword,Status
florida,orlando,floridaorlando,Paused
new york,albany,new yorkalbany,Active

geo_fl.csv

Campaign,Adgroup
florida,orlando
florida,miami
new york,new york
california,san francisco
california,los angeles

I want to list all the Adgroup values in 'geo_fl.csv' for each 'Campaign' in 'Camp.csv'. For example, for florida in 'Camp.csv' it should return the values (orlando, miami) from 'geo_fl.csv'.

My code so far is as follows:

# Declare a function to check whether the 'campaignname' is present
campaignname <- function(point1, point2) {
  conditioncheck <- any(point2==point1)
}
# Declare a function to check whether the 'adgroupname' is present

# Read the CSV files for reference
newlistings <- read.csv("/home/chi/Downloads/Camp.csv",header=TRUE)
georeportrecord <- read.csv("/home/chi/Downloads/geo_fl.csv",header=TRUE)
# Store the data of each column in a variable for 'Camp.csv'
Keyword <- newlistings$keyword
campaign <- newlistings$Campaign
adgroup <- newlistings$AdGroup
status <- newlistings$Status
# Store the data of each column in a variable for 'geo_fl.csv'
geoCampaign <- georeportrecord$Campaign
geoAdGroup <- georeportrecord$Adgroup

# getting the values for 'number of rows' in each CSV list
nCGM <- nrow(newlistings)
nAdwords <- nrow(georeportrecord)

Pts2 <- georeportrecord[,c("Campaign")]
CGMGeoList <- NULL
# checking for the presence of the element in the vector
#for(i in campaign){
for(i in 1:nCGM){
  Pts1 <- NULL
  Pts1$Campaign <- (newlistings$Campaign[i])
  # passing the value to the function for 'campaign' presence check
  checkcondition <- campaignname(Pts1,Pts2)
  if(checkcondition == TRUE){
    ad <- geoAdgroup[which(geoCampaign==i)# Stuck here (returning no result)
  }
}

I have also tried:

for(i in campaign) {
  if (any(geoCampaign==i) == TRUE) {
    print(i)
    # But I also want to list all the adgroups from 'geo_fl.csv' together.
  }
}

My desired output

 Campaign,AdGroup,Keyword,Status,Campaignpresentingeo_fl,Adgrouppresentingeo_fl
 florida,orlando,floridaorlando,Paused,YES,YES
 new york,albany,new yorkalbany,Active,YES,NO

The condition for the desired result above, in pseudocode:

 for (i in campaign) {
   if (i is present in georeportrecord$Campaign) {
     # for that particular 'campaign' in 'Camp.csv', check the 'Adgroup' in 'geo_fl.csv'
     if (AdGroup[i] is present in georeportrecord$Adgroup) {
       # the AdGroup for campaign 'i' in 'Camp.csv' is also an adgroup in 'geo_fl.csv'
       write.csv(florida,orlando,floridaorlando,Paused,YES,YES)
     } else {
       write.csv(florida,orlando,floridaorlando,Paused,YES,NO)
     }
   } else {
     write.csv(florida,orlando,floridaorlando,Paused,NO,NO)
   }
 }

The output should be written to a CSV file: the contents of Camp.csv plus just two additional columns indicating YES or NO. How can I list the values as described above so that I can write them to another CSV file? I am new to R, and any help is appreciated.

It's unclear what you want your output to look like, but here's a simple way to concatenate all levels of one factor that belong to each level of another factor:

georeportrecord <- read.csv(text='Campaign,Adgroup
florida,orlando
florida,miami
new york,new york
california,san francisco
california,los angeles', header=TRUE)

newlistings <- read.csv(text='Campaign,AdGroup,Keyword,Status
florida,orlando,floridaorlando,Paused
new york,albany,new yorkalbany,Active', header=TRUE)

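# Keep only the campaigns that also appear in newlistings, then collapse
# each campaign's Adgroup values into one comma-separated string.
# toString is used so the result is a plain character column that
# write.csv can handle.
out <- aggregate(subset(georeportrecord, 
                        Campaign %in% newlistings$Campaign)$Adgroup, 
                 list(Campaign=subset(georeportrecord, 
                      Campaign %in% newlistings$Campaign)$Campaign), 
                 toString)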

out

  Campaign              x
1  florida orlando, miami
2 new york       new york

Use write.csv to write the data out to a CSV file (see ?write.csv).

EDIT: (After clarification of desired output)

The above code returns, for each Campaign that exists in newlistings, the Adgroups listed for it in georeportrecord. To add the two presence columns requested by the OP (as TRUE/FALSE values):

newlistings$Campaignpresentingeo_fl <- 
  newlistings$Campaign %in% georeportrecord$Campaign

newlistings$Adgrouppresentingeo_fl <- 
  apply(newlistings, 1, function(x) x[2] %in% 
          subset(georeportrecord, Campaign==x[1])[, 'Adgroup'])
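The two columns above hold TRUE/FALSE rather than the YES/NO strings shown in the desired output. The following is a minimal sketch of the conversion and the final write, assuming the column names from the question. The helper names campaign_found, adgroup_found and out_file are hypothetical, and a temporary file stands in for the real output path.

```r
# Same sample data as above, read as plain character columns
georeportrecord <- read.csv(text='Campaign,Adgroup
florida,orlando
florida,miami
new york,new york
california,san francisco
california,los angeles', header=TRUE, stringsAsFactors=FALSE)

newlistings <- read.csv(text='Campaign,AdGroup,Keyword,Status
florida,orlando,floridaorlando,Paused
new york,albany,new yorkalbany,Active', header=TRUE, stringsAsFactors=FALSE)

# Is the campaign present in geo_fl at all?
campaign_found <- newlistings$Campaign %in% georeportrecord$Campaign

# Is the ad group listed under that same campaign in geo_fl?
# mapply walks the Campaign and AdGroup columns in parallel, row by row.
adgroup_found <- mapply(function(camp, ad) {
  ad %in% georeportrecord$Adgroup[georeportrecord$Campaign == camp]
}, newlistings$Campaign, newlistings$AdGroup, USE.NAMES = FALSE)

# Convert the logical flags to the YES/NO strings from the desired output
newlistings$Campaignpresentingeo_fl <- ifelse(campaign_found, "YES", "NO")
newlistings$Adgrouppresentingeo_fl <- ifelse(adgroup_found, "YES", "NO")

# Hypothetical output path; replace with the real destination
out_file <- tempfile(fileext = ".csv")
write.csv(newlistings, out_file, row.names = FALSE)
```

Passing row.names = FALSE keeps R's row numbers out of the file, so the result has exactly the six columns from the desired output.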

Based on the required output:

x<-read.csv(text='Campaign,Adgroup
florida,orlando
florida,miami
new york,new york
california,san francisco
california,los angeles', header=T, stringsAsFactors=F)


y=read.csv(text="Campaign,AdGroup,Keyword,Status
florida,orlando,floridaorlando,Paused
new york,albany,new yorkalbany,Active", header=T, stringsAsFactors=F)

Campaigns<-x$Campaign
AdGroups<-interaction(x$Campaign, x$Adgroup)

y$campaignpresence<-ifelse(y$Campaign %in% Campaigns,"YES", "NO")
y$geopresence<-ifelse(interaction(y$Campaign, y$AdGroup) %in% AdGroups,"YES", "NO")

Output:

 y
  Campaign AdGroup        Keyword Status campaignpresence geopresence
1  florida orlando floridaorlando Paused              YES         YES
2 new york  albany new yorkalbany Active              YES          NO
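The reason for comparing (Campaign, AdGroup) pairs instead of the AdGroup column alone may be easier to see with a small counterexample. This is only a sketch: the listing row below is hypothetical (it is not in the question's data), and the "|" separator assumes that character never appears in a campaign or ad group name.

```r
geo <- data.frame(Campaign = c("florida", "florida", "new york"),
                  Adgroup  = c("orlando", "miami", "new york"),
                  stringsAsFactors = FALSE)

# Hypothetical listing: "miami" exists in geo, but under florida, not new york
camp <- data.frame(Campaign = "new york", AdGroup = "miami",
                   stringsAsFactors = FALSE)

# Column-wise check: ignores which campaign the ad group belongs to
naive <- camp$AdGroup %in% geo$Adgroup

# Pair-wise check: one key per (Campaign, Adgroup) row
geo_keys  <- paste(geo$Campaign, geo$Adgroup, sep = "|")
camp_keys <- paste(camp$Campaign, camp$AdGroup, sep = "|")
paired <- camp_keys %in% geo_keys
```

Here naive is TRUE while paired is FALSE, which is the distinction the interaction() call above is making.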

Ignore the part below, as it answers a different question.

Another approach uses data.table. I don't even see the need for the first table (Camp.csv), provided that all your unique campaigns are in the second table. I just made dummy data here, where x is your campaign and y is your Adgroup.

require(data.table)
x<-data.frame(x=sample(1:10, 100, replace=T), y=sample(100:999,100))
y<-data.table(x)
l<-y[,list(y=list(y)),by=x]
l$y<-sapply(l$y, paste, collapse=",")
write.table(l,...)

Be careful when writing this as CSV, because the second column now contains commas; TSV may be a better choice.
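As a rough illustration of both options, here is a sketch with made-up data and temporary files (the file names and the data frame contents are assumptions for the example). Base R's write.csv quotes character fields by default, so the embedded commas survive a round trip through read.csv; whether another tool reads the quoted file correctly depends on that tool, which is where a tab-separated file can be the safer choice.

```r
# Dummy data in the same spirit: one row per campaign, ad groups joined by commas
l <- data.frame(x = c("florida", "new york"),
                y = c("orlando,miami", "new york"),
                stringsAsFactors = FALSE)

csv_file <- tempfile(fileext = ".csv")
tsv_file <- tempfile(fileext = ".tsv")

# write.csv quotes character fields by default, protecting embedded commas
write.csv(l, csv_file, row.names = FALSE)

# Tab-separated alternative: commas need no quoting at all
write.table(l, tsv_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read both files back to check that the second column is intact
from_csv <- read.csv(csv_file, stringsAsFactors = FALSE)
from_tsv <- read.delim(tsv_file, stringsAsFactors = FALSE)
```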

