Match one column of data frame to another, pull in other columns, combine into large dataset
I've got a list of Store IDs and their Zipcodes in a 2-column numeric vector (in R). I'm using the "Zipcode" package ( https://cran.rproject.org/web/packages/zipcode/zipcode.pdf ) and have access to the longitude/latitude coordinates for these zipcodes. The zipcode package provides a large data frame with the zip code, city, state, longitude, and latitude of every zipcode.
I'm looking to get the longitude and latitude coordinates for my Zipcodes and add them as columns 3 and 4 (i.e. Store ID, Zip Code, Longitude, Latitude).
Any thoughts? Thank you!
EDIT: I've tried the merge function, i.e. total<-merged(CleanData,zipcode, by=zip), and I'm getting an error. Is it because the two data frames must have the same number of columns?
The column name passed as the by argument has to be enclosed in quotes. In this example you don't need the by argument in merge at all if zipcode is the only common column in the two dataframes.
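The quoting point can be sketched with a small, self-contained example. The tables and column names below (stores, zip_lookup, store_id, zip, latitude, longitude) are made up for illustration and are not the asker's real data. Note that the function is merge(), and that an unquoted zip is looked up as an R object rather than treated as a column name.

```r
# Hypothetical store table; names and values are assumptions for illustration.
stores <- data.frame(
  store_id = c(1, 2, 5),
  zip = c("00131", "00114", "00155"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the zip code lookup table.
zip_lookup <- data.frame(
  zip = c("00131", "00166", "00114", "00155"),
  latitude = c(40.1, 41.2, 42.3, 43.4),
  longitude = c(-70.1, -71.2, -72.3, -73.4),
  stringsAsFactors = FALSE
)

# The key column is given as a quoted string.
total <- merge(stores, zip_lookup, by = "zip")
print(total)
```

The two tables have different numbers of columns and rows here, which merge() handles fine; only the key column has to be present in both.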
Example datasets:
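# cleanData: zip code (z) and store ID (id)
d1 <- tibble::tribble(~z, ~id, 131, 1, 114, 2, 155, 5)
# zipcode: zip code (z) plus two extra columns (x, y)
d2 <- tibble::tribble(~z, ~x, ~y, 131, 2, 5, 166, 2, 6, 162, 6, 5, 177, 7, 1, 114, 2, 1, 155, 5, 9)
# z is the only shared column, so merge() joins on it by default
result <- merge(d1, d2)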
gives
    z id x y
1 114  2 2 1
2 131  1 2 5
3 155  5 5 9
You can remove any unnecessary columns from the result dataframe with dplyr::select(). Suppose you don't need column y (which may be a state name, for example):
result <- dplyr::select(result, z, id, x)
Ended up using this: How to join (merge) data frames (inner, outer, left, right)?
Essentially, I used a left outer join because I wanted to keep all of the zipcodes in my store database. I believe the answer above would eliminate zipcodes not found in the second list of zipcodes.
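The difference between the two joins can be sketched in base R as follows. The data are hypothetical (zip 999 is deliberately missing from the lookup table), and all.x = TRUE is the merge() argument that requests a left outer join.

```r
# Hypothetical data: store 5 has a zip code (999) that is absent from the lookup.
stores <- data.frame(store_id = c(1, 2, 5), z = c(131, 114, 999))
zip_lookup <- data.frame(z = c(131, 166, 114), x = c(2, 2, 2), y = c(5, 6, 1))

# Default (inner) join: rows without a match in zip_lookup are dropped.
inner <- merge(stores, zip_lookup, by = "z")

# Left outer join: every row of stores is kept; unmatched rows get NA in x and y.
left <- merge(stores, zip_lookup, by = "z", all.x = TRUE)

print(inner)
print(left)
```

With these inputs the inner join keeps two stores, while the left join keeps all three and fills the coordinates of the unmatched store with NA.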