
Loading a raw file from GitHub in R

I am trying to load a codebook from GitHub in RStudio. The URL is here. According to the link it is an .md file, but I want to load its raw file. (As pic1 shows, there is a tab called raw at the top right, and when I click it, it shows pic2.) I tried to use the link provided, but it does not work. Could anyone tell me how to do that? Thanks a lot!

cddf<-url("https://github.com/HimesGroup/BMIN503/blob/master/DataFiles/NHANES_2007to2008_DataDictionary.md")
cd<-read.table(cddf )

Update: When I changed the code:

codebook<-read.table("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/NHANES_2007to2008_DataDictionary.md",skip = 4, sep = "|", head = TRUE)

R successfully read most of the rows, but the separator "|" did not work for two variables: INDHHIN2 and MCQ010. See pic. Can anyone help figure out why? Thanks!


There are two issues here.

First, the raw file is available at the link https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/NHANES_2007to2008_DataDictionary.md . However, read.table is not going to be able to read that file without some help: read.table is meant for tab- or comma-delimited files, and this is a table marked up for Markdown. This comes close:

read.table("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/NHANES_2007to2008_DataDictionary.md",
 skip = 4, sep = "|", head = TRUE)

but it will still need some cleanup: the first and last columns are junk created by the leading and trailing "|" on each line, and the first row needs to be deleted.
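As a rough sketch of that cleanup, the snippet below parses a small made-up Markdown table from inline text instead of downloading the real file, so the column names and descriptions are hypothetical stand-ins. It assumes the row to delete is the Markdown separator line of dashes. The quote = "" and comment.char = "" arguments are a defensive guess: by default read.table treats ' and " as quote characters and # as a comment marker, which can merge or truncate fields. Whether that is what affects INDHHIN2 and MCQ010 has not been confirmed here.

```r
# Hypothetical stand-in for the Markdown table; the real file's columns may differ.
md_lines <- c(
  "| Variable | Description |",
  "|----------|-------------|",
  "| INDHHIN2 | Annual household income |",
  "| MCQ010   | Doctor's diagnosis of asthma |"
)

# Split on "|" and trim the padding around each cell.
# quote = "" and comment.char = "" keep apostrophes and "#" as literal text.
raw <- read.table(text = md_lines, sep = "|", header = TRUE,
                  quote = "", comment.char = "",
                  strip.white = TRUE, stringsAsFactors = FALSE)

# Drop the empty first and last columns (from the leading and trailing "|")
# and the first row (the |---|---| separator line).
codebook <- raw[-1, -c(1, ncol(raw))]
rownames(codebook) <- NULL

print(codebook)
```

For the real file, the same steps should apply with the raw URL as the first argument and skip = 4 in place of text = md_lines, though the result is worth checking by eye.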
