
Subsetting a DateVector

I'm trying to extract and subset a vector containing date information from a data.frame. I'm able to successfully extract the DateVector from the DataFrame; however, I receive an error when trying to subset the data.

The code below works fine as long as the DateVector subsets stay wrapped in /* */ comments.

Rcpp::cppFunction('
Rcpp::DataFrame test(DataFrame x, StringVector y ) {

  StringVector New = x["string_1"];
  std::string KEY = Rcpp::as<std::string>(y[0]);
  Rcpp::LogicalVector ind(New.size());

  for(int i = 0; i < New.size(); i++){
  ind[i] = (New[i] == KEY);
  }


  Rcpp::StringVector st1 = x["string_1"];
  Rcpp::StringVector Id = x["ID"];
  Rcpp::StringVector NameId = x["NameID"];
  Rcpp::DateVector StDate = x["StartDate"];
  Rcpp::DateVector EtDate = x["EndDate"]; 

  /*
  Rcpp::DateVector StDate_sub = StDate[ind];
  Rcpp::DateVector EtDate_sub = EtDate[ind]; 
  */

  return Rcpp::DataFrame::create(Rcpp::Named("string_1") = st1[ind],
                                 Rcpp::Named("ID") = Id[ind],
                                 Rcpp::Named("NameID") = NameId[ind]/*,
                                 Rcpp::Named("StartDate") = StDate_sub,
                                 Rcpp::Named("EndDate") = EtDate_sub*/
                                 );
}')

There are two notable errors I receive:

error: invalid user-defined conversion from 'Rcpp::LogicalVector {aka Rcpp::Vector<10, Rcpp::PreserveStorage>}' to 'int' [-fpermissive]

Rcpp::DateVector StDate_sub = StDate[ind]

The second is:

no known conversion from 'SEXP' to 'int' file585c1863151c.cpp:23:53: error: conversion from 'Rcpp::Date' to non-scalar type 'Rcpp::DateVector {aka Rcpp::oldDateVector}' requested

Rcpp::DateVector EtDate_sub = EtDate[ind];

I looked at the docs but couldn't find a way. Sorry if I missed it. I have a couple of date variables in a data.frame. I am using Rcpp to subset the data set in a nested for loop. Currently, it is taking too much time. I cannot implement it in data.table or dplyr, as the subset data set is required for some processing.

First off, your example is not minimally reproducible as there is no defined data set.

Second, you are making the (heroic?) assumption that assignment by index vector is defined for Date vectors. It appears it may not be.

Third, just looping is trivial. Amended code below. It builds without a hitch; no idea if it runs, as you supplied no reference data.

#define RCPP_NEW_DATE_DATETIME_VECTORS 1
#include <Rcpp.h>

using namespace Rcpp;

// [[Rcpp::export]]
Rcpp::DataFrame dftest(DataFrame x, StringVector y ) {

  StringVector New = x["string_1"];
  std::string KEY = Rcpp::as<std::string>(y[0]);
  Rcpp::LogicalVector ind(New.size());

  for(int i = 0; i < New.size(); i++){
    ind[i] = (New[i] == KEY);
  }


  Rcpp::StringVector st1 = x["string_1"];
  Rcpp::StringVector Id = x["ID"];
  Rcpp::StringVector NameId = x["NameID"];
  Rcpp::DateVector StDate = x["StartDate"];
  Rcpp::DateVector EtDate = x["EndDate"]; 

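  // Number of rows that matched KEY; this is the size of the subsets.
  int n = sum(ind);
  Rcpp::DateVector StDate_sub(n);
  Rcpp::DateVector EtDate_sub(n);
  // Walk every input row and copy the dates of the matching rows;
  // j tracks the next free slot in the output vectors.
  for (int i = 0, j = 0; i < ind.size(); i++) {
    if (ind[i]) {
      StDate_sub[j] = StDate[i];
      EtDate_sub[j] = EtDate[i];
      j++;
    }
  }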

  return Rcpp::DataFrame::create(Rcpp::Named("string_1") = st1[ind],
                                 Rcpp::Named("ID") = Id[ind],
                                 Rcpp::Named("NameID") = NameId[ind],
                                 Rcpp::Named("StartDate") = StDate_sub,
                                 Rcpp::Named("EndDate") = EtDate_sub);
}
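
For what it's worth, the looping idea itself does not depend on Rcpp. Below is a minimal sketch of the copy-where-true pattern in plain C++, so it can be compiled and checked without R. The helper name subset_by_mask is made up for illustration, and the sketch assumes dates are stored as plain doubles counting days since 1970-01-01 (the way R stores Date values); it is not part of the Rcpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Rcpp function): return the elements of
// `values` whose matching entry in `mask` is true, in original order.
template <typename T>
std::vector<T> subset_by_mask(const std::vector<T>& values,
                              const std::vector<bool>& mask) {
  // The mask must have exactly one flag per value.
  assert(values.size() == mask.size());

  std::vector<T> out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (mask[i]) {
      out.push_back(values[i]);  // keep only the rows flagged true
    }
  }
  return out;
}
```

Calling subset_by_mask(days, keep) with a std::vector<double> of day counts and a std::vector<bool> of the same length returns only the flagged entries. The logic matches what an Rcpp loop over a LogicalVector has to do: walk every input row, and advance the output position only on a match.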
