
R - Parsing XML from SharePoint List

I am collecting data from a SharePoint list. The connection works; however, the result is limited to 1,000 items even though the full list contains more than 25,000. The same connection from Tableau and Excel returns the full list.

library(XML)

URL <- "http://XXXXXX/XXXXXX/_vti_bin/ListData.svc/RequiredLearningStatus"
URL_parsed <- xmlParse(readLines(URL, warn = FALSE))
items <- getNodeSet(URL_parsed, "//m:properties")
x <- xmlToDataFrame(items, stringsAsFactors = FALSE)

I receive the following error message when the readLines() function executes:

"In readLines(URL): incomplete final line found"

How can I deal with the EOL error and retrieve the full list?

I only have a basic understanding here, but I will try to assist, since there are no answers yet and I have been successful myself.

The URL I use for pulling in my SP list (which returns all items) is built from the GUID of the list, like so:

"https://XXXX/XXXX/_vti_bin/owssvr.dll?Cmd=Display&Query=*&XMLDATA=TRUE&List={<<YOUR GUID HERE>>}"

To get your GUID, I've plagiarized from this link:

There are times when you need to find the Id (a Guid) of a list – for example, when setting the Task list to be used with SharePoint Designer Workflows (see my blog post here). Here's a simple way of doing this:

  • Navigate to the SharePoint list using the browser.
  • Select the Settings + List Settings menu command.
  • Copy the Url from the browser address bar into Notepad. It will look something like:

    • http://moss2007/ProjectX/_layouts/listedit.aspx?List=%7B26534EF9%2DAB3A%2D46E0%2DAE56%2DEFF168BE562F%7D
  • Delete everything before and including “List=”.

  • Change “%7B” to “{”
  • Change all “%2D” to “-”
  • Change “%7D” to “}”

You are now left with the Id:

{26534EF9-AB3A-46E0-AE56-EFF168BE562F}
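If you'd rather not do the find-and-replace by hand, the same steps can be sketched in base R. This is only an illustration: the function name extract_list_guid is made up here, and it assumes the GUID sits in a List= query parameter as in the example URL above.

```r
# Hypothetical helper (not part of the quoted blog post): pulls the list
# GUID out of a SharePoint "List Settings" URL.
extract_list_guid <- function(settings_url) {
  # Drop everything up to and including "List=".
  encoded <- sub("^.*[?&]List=", "", settings_url)
  # Drop any further query parameters that might follow the GUID.
  encoded <- sub("&.*$", "", encoded)
  # Undo the percent-encoding: %7B -> {, %2D -> -, %7D -> }.
  utils::URLdecode(encoded)
}

# Example URL taken from the steps above.
settings_url <- "http://moss2007/ProjectX/_layouts/listedit.aspx?List=%7B26534EF9%2DAB3A%2D46E0%2DAE56%2DEFF168BE562F%7D"
list_guid <- extract_list_guid(settings_url)
print(list_guid)
```

URLdecode() comes with base R (the utils package), so nothing extra needs installing for this step.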

Once you've got that URL, I read in the data in a slightly different fashion:

library(RCurl)
library(XML)
library(data.table)

URL <- "https://XXXX/XXXX/_vti_bin/owssvr.dll?Cmd=Display&Query=*&XMLDATA=TRUE&List={<<YOUR GUID HERE>>}"
xml <- xmlParse(getURL(URL, userpwd='<<youruser>>:<<yourpass>>'))
finalData <- data.table(do.call(rbind, xmlToList(xmlRoot(xml)[['data']])))

In the code above, be sure to replace <<YOUR GUID HERE>> and <<youruser>>:<<yourpass>> with appropriate values for your environment. There should be no < or > in any of the code (aside from R's assignment, <- ).
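As a rough sketch of that substitution, the URL can be assembled from its parts with a check that no placeholder bracket was left behind. The helper name build_owssvr_url and the site address below are assumptions made for illustration, not anything SharePoint or the packages above provide.

```r
# Hypothetical helper: assembles the owssvr.dll URL used above and stops
# if a "<" or ">" from an unreplaced placeholder is still present.
build_owssvr_url <- function(site, guid) {
  url <- sprintf(
    "%s/_vti_bin/owssvr.dll?Cmd=Display&Query=*&XMLDATA=TRUE&List=%s",
    site, guid
  )
  if (grepl("[<>]", url)) {
    stop("Placeholder left in URL: replace the <<...>> markers first.")
  }
  url
}

# Example values only; substitute your own site and list GUID.
URL <- build_owssvr_url(
  "https://sharepoint.example.com/sites/team",
  "{26534EF9-AB3A-46E0-AE56-EFF168BE562F}"
)
print(URL)
```

The resulting URL can then be passed to getURL() exactly as in the code above.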
