2-way lookup in SAS

Let's say I have a (9000x9000) table like the following:

 zone 304  305  306  307  308 ...

  001   1    2    8    9   12 ...
  002   6    8    3    7    1 ...
  003   4    8    1   12    9 ...
  004   2    7    3   16   34 ...
  ...

The main data table looks like this:

  package #    weight    origin    destination    zone
       123      2oz       004          305        7 to be inputted here
        .
        .
        .

I need SAS to output the "zone" corresponding to a given ordered pair. I fear the only way would be with some type of loop. For instance, in the example above, the origin value comes from the row labels and the destination from the column labels. The intersection is the target value I need assigned to "zone".

A solution using Python data wrangling libraries would also work.

Also, the 9000x9000 table is an Excel CSV file.

My approach:

  1. Load the data set into a temporary array (9000x9000) and then look up each element as needed. This could be memory intensive, but 9000*9000 seems small enough to me.
  2. Another safe approach: transpose the data into a long format:

     Key1 Key2 Value
     001  304  1
     001  305  2
     ...

Then, in any language, it becomes a join/merge instead of a lookup.

  3. You can also use PROC IML, which loads the data as a matrix that you can then access by index.
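Since the question also allows Python, the array/matrix idea (load the table once, then index into it) can be sketched with the standard library only. This is a minimal sketch, not PROC IML: the inline `WIDE_CSV` sample, the `row_of`/`col_of` maps, and the `zone_for` helper are all hypothetical names introduced for illustration, and the real 9000x9000 file would be opened from disk instead.

```python
import csv
import io

# Hypothetical sample that mirrors the wide zone table from the question.
WIDE_CSV = """zone,304,305,306,307,308
001,1,2,8,9,12
002,6,8,3,7,1
003,4,8,1,12,9
004,2,7,3,16,34
"""

reader = csv.reader(io.StringIO(WIDE_CSV))
header = next(reader)

# Map each destination label to its column position (skip the corner cell).
col_of = {label: j for j, label in enumerate(header[1:])}

# Map each origin label to its row position and store the values as a 2-D list.
row_of = {}
matrix = []
for i, row in enumerate(reader):
    row_of[row[0]] = i
    matrix.append([int(value) for value in row[1:]])


def zone_for(origin, destination):
    """Return the zone at the intersection of an origin row and a destination column."""
    return matrix[row_of[origin]][col_of[destination]]


print(zone_for("004", "305"))  # 7
```

The labels are kept as strings so that leading zeros such as `004` survive; each lookup is then two dictionary hits and one list index, with no loop over the table.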

There are also ways in SAS to do this lookup via a merge, primarily using VVALUEX.

Without knowing how you're going to use it, I can't provide any more information.

EDIT: added a 3rd option, which is IML. Basically, there are many ways to do this; the best depends on how you're planning to use it overall.

EDIT2:

  1. Import the first data set into SAS (PROC IMPORT).
  2. Transpose it using PROC TRANSPOSE.
  3. Merge, with either a data step or PROC SQL, by ORIGIN and DESTINATION, which will be straightforward.

At this point it's really a standard lookup with 2 keys.
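For readers taking the Python route the question mentions, the same import, transpose, merge sequence might look roughly like this in pandas. It is a sketch under assumptions: the inline `WIDE_CSV` sample and the `packages` frame are hypothetical stand-ins for the real files, and the column names `origin`, `destination`, and `zone` are chosen to match the question's main table.

```python
import io

import pandas as pd

# Hypothetical sample that mirrors the wide zone table from the question.
WIDE_CSV = """zone,304,305,306,307,308
001,1,2,8,9,12
002,6,8,3,7,1
003,4,8,1,12,9
004,2,7,3,16,34
"""

# Step 1: import. dtype=str keeps leading zeros in labels such as "004".
wide = pd.read_csv(io.StringIO(WIDE_CSV), dtype=str)
wide = wide.rename(columns={"zone": "origin"})

# Step 2: transpose to long format, one row per (origin, destination) pair.
long = wide.melt(id_vars="origin", var_name="destination", value_name="zone")
long["zone"] = long["zone"].astype(int)

# Hypothetical main table with the two key columns.
packages = pd.DataFrame(
    {
        "package": [123],
        "weight": ["2oz"],
        "origin": ["004"],
        "destination": ["305"],
    }
)

# Step 3: merge on both keys; a left join keeps packages with no matching zone.
result = packages.merge(long, on=["origin", "destination"], how="left")
print(result)
```

Reading the keys as strings on both sides matters here: if one table stored `004` as text and the other as the integer 4, the merge would silently find no match.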

You could use pandas, which has a built-in function to read from an Excel document: pandas.read_excel()

So for this file (shown as a screenshot in the original post):

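import pandas as pd

# Read the first sheet; the header row becomes the column labels.
df = pd.read_excel('test.xlsx')
# Select column 101, then the row whose index label is 502.
print(df[101][502])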

Output:

67
