
How do you look up a particular pandas dataframe column value in a reference table and copy a reference table value to the dataframe?

I have a reference table that I imported from a .csv into a dataframe (df2). It has 3 columns and around 400 rows. I have another dataframe (df) that has many columns and rows. I want to look up a value in the reference table and add it to the appropriate column in df.

The data format for the reference table:

MANUF   PRODTYPE        PRODCODE
 
ALPHA       1           ALPHA1
ALPHA       2           ALPHA2
BETA        1           BETA1
BETA        2           BETA2
DELTA       1           DELTA1
DELTA       2           DELTA2

The dataframe (df) is set up like this:

MANUF    PRODTYPE    SERIALNO   PRODCODE    INVENTORY   
ALPHA       1        00001                      5
ALPHA       2        00001                      3
BETA        1        00001                      4
DELTA       1        00001                      8
ALPHA       2        00002                      3
BETA        1        00002                      4
DELTA       2        00001                      9
DELTA       2        00002                      9
DELTA       1        00002                      8
BETA        2        00001                      12
ALPHA       2        00003                      3

I am trying to populate PRODCODE in df with the appropriate value from the reference table, based on MANUF and PRODTYPE.

I tried:

df3 = df.merge(df2, how='left') 

and

df3 = df2.merge(df, how='left')

but both gave me either inaccurate or incomplete merges.

I expect this to work in your case:

import pandas as pd
from io import StringIO

# Reference table: one PRODCODE per (MANUF, PRODTYPE) pair.
data1 = StringIO("""MANUF;PRODTYPE;PRODCODE
ALPHA;1;ALPHA1
ALPHA;2;ALPHA2
BETA;1;BETA1
BETA;2;BETA2
DELTA;1;DELTA1
DELTA;2;DELTA2
""")
df1 = pd.read_csv(data1, sep=";")
print(df1)

# Main table. PRODCODE exists but is empty, hence the ';;' in every row.
data2 = StringIO("""MANUF;PRODTYPE;SERIALNO;PRODCODE;INVENTORY
ALPHA;1;00001;;5
ALPHA;2;00001;;3
BETA;1;00001;;4
DELTA;1;00001;;8
ALPHA;2;00002;;3
BETA;1;00002;;4
DELTA;2;00001;;9
DELTA;2;00002;;9
DELTA;1;00002;;8
BETA;2;00001;;12
ALPHA;2;00003;;3
""")
# Read SERIALNO as text so the leading zeros are kept.
df2 = pd.read_csv(data2, sep=";", dtype={"SERIALNO": str})
print(df2)

# Drop the empty PRODCODE column before merging; otherwise the result
# carries both copies as PRODCODE_x and PRODCODE_y.
df3 = df2.drop(columns="PRODCODE").merge(df1, on=['MANUF', 'PRODTYPE'], how='left')
print(df3)

Result:

    MANUF  PRODTYPE SERIALNO  INVENTORY PRODCODE
0   ALPHA         1    00001          5   ALPHA1
1   ALPHA         2    00001          3   ALPHA2
2    BETA         1    00001          4    BETA1
3   DELTA         1    00001          8   DELTA1
4   ALPHA         2    00002          3   ALPHA2
5    BETA         1    00002          4    BETA1
6   DELTA         2    00001          9   DELTA2
7   DELTA         2    00002          9   DELTA2
8   DELTA         1    00002          8   DELTA1
9    BETA         2    00001         12    BETA2
10  ALPHA         2    00003          3   ALPHA2
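
As for why the two bare merges in the question misbehave: when `on` is omitted, `merge` joins on every column the two frames share, which here includes PRODCODE as well as MANUF and PRODTYPE. The sketch below reproduces that on made-up miniature tables. The names `ref` and `main` are hypothetical, and it assumes the blank PRODCODE cells are empty strings, which may not match how your data was loaded.

```python
import pandas as pd

# Hypothetical miniature versions of the question's two tables.
ref = pd.DataFrame({
    "MANUF": ["ALPHA", "ALPHA", "BETA"],
    "PRODTYPE": [1, 2, 1],
    "PRODCODE": ["ALPHA1", "ALPHA2", "BETA1"],
})
main = pd.DataFrame({
    "MANUF": ["ALPHA", "BETA", "ALPHA"],
    "PRODTYPE": [2, 1, 1],
    "SERIALNO": ["00001", "00001", "00002"],
    "PRODCODE": ["", "", ""],  # assumption: blank cells are empty strings
    "INVENTORY": [3, 4, 5],
})

# No `on=`: pandas joins on all shared columns (MANUF, PRODTYPE, PRODCODE).
# A blank PRODCODE never equals a real code, so no row finds a match.
bare = main.merge(ref, how="left")

# Joining on the two real keys, after dropping the blank column, fills the codes in.
fixed = main.drop(columns="PRODCODE").merge(ref, on=["MANUF", "PRODTYPE"], how="left")

print(bare["PRODCODE"].tolist())
print(fixed["PRODCODE"].tolist())
```

Merging the other way round (reference table on the left) has a second problem: a left merge keeps only the left frame's rows, so the result is built around the reference rows rather than the inventory rows.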

Another way without merge would be this:

df2 = df2.set_index(['MANUF', 'PRODTYPE'])
output = df2.combine_first(df1.set_index(['MANUF', 'PRODTYPE'])).reset_index()
print(output)

    MANUF  PRODTYPE  INVENTORY PRODCODE  SERIALNO
0   ALPHA         1          5   ALPHA1         1
1   ALPHA         2          3   ALPHA2         1
2   ALPHA         2          3   ALPHA2         2
3   ALPHA         2          3   ALPHA2         3
4    BETA         1          4    BETA1         1
5    BETA         1          4    BETA1         2
6    BETA         2         12    BETA2         1
7   DELTA         1          8   DELTA1         1
8   DELTA         1          8   DELTA1         2
9   DELTA         2          9   DELTA2         1
10  DELTA         2          9   DELTA2         2
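
Note that the output above comes back ordered by MANUF and PRODTYPE rather than in the original row order. If the order matters, one possible alternative is a plain dictionary lookup keyed on the (MANUF, PRODTYPE) pair. This is only a sketch on made-up data: `ref`, `main` and `lookup` are hypothetical names, and the GAMMA row is added here to show what happens to a key that is missing from the reference table.

```python
import pandas as pd

# Hypothetical miniature reference table and main table.
ref = pd.DataFrame({
    "MANUF": ["ALPHA", "ALPHA", "BETA"],
    "PRODTYPE": [1, 2, 1],
    "PRODCODE": ["ALPHA1", "ALPHA2", "BETA1"],
})
main = pd.DataFrame({
    "MANUF": ["ALPHA", "BETA", "ALPHA", "GAMMA"],
    "PRODTYPE": [2, 1, 1, 1],
    "INVENTORY": [3, 4, 5, 7],
})

# Build {(MANUF, PRODTYPE): PRODCODE} from the reference table.
lookup = {
    (manuf, prodtype): code
    for manuf, prodtype, code in ref[["MANUF", "PRODTYPE", "PRODCODE"]].itertuples(index=False)
}

# Look up each row's key pair; keys missing from the reference table give a missing value.
main["PRODCODE"] = [lookup.get(key) for key in zip(main["MANUF"], main["PRODTYPE"])]
print(main)
```

The rows of `main` stay in place and only the new column is added. For large tables the merge approach is likely faster, since this version loops in Python.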

Input used:

df1 = pd.DataFrame({'MANUF': {0: 'ALPHA',
  1: 'ALPHA',
  2: 'BETA',
  3: 'BETA',
  4: 'DELTA',
  5: 'DELTA'},
 'PRODTYPE': {0: 1, 1: 2, 2: 1, 3: 2, 4: 1, 5: 2},
 'PRODCODE': {0: 'ALPHA1',
  1: 'ALPHA2',
  2: 'BETA1',
  3: 'BETA2',
  4: 'DELTA1',
  5: 'DELTA2'}})

df2 = pd.DataFrame({'MANUF': {0: 'ALPHA',
  1: 'ALPHA',
  2: 'BETA',
  3: 'DELTA',
  4: 'ALPHA',
  5: 'BETA',
  6: 'DELTA',
  7: 'DELTA',
  8: 'DELTA',
  9: 'BETA',
  10: 'ALPHA'},
 'PRODTYPE': {0: 1,
  1: 2,
  2: 1,
  3: 1,
  4: 2,
  5: 1,
  6: 2,
  7: 2,
  8: 1,
  9: 2,
  10: 2},
 'SERIALNO': {0: 1,
  1: 1,
  2: 1,
  3: 1,
  4: 2,
  5: 2,
  6: 1,
  7: 2,
  8: 2,
  9: 1,
  10: 3},
 'INVENTORY': {0: 5,
  1: 3,
  2: 4,
  3: 8,
  4: 3,
  5: 4,
  6: 9,
  7: 9,
  8: 8,
  9: 12,
  10: 3}})
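
With a real reference table of around 400 rows, it may be worth checking that the lookup is complete and unambiguous. The sketch below uses two existing `merge` parameters for that: `validate="many_to_one"` raises an error if a key pair appears more than once in the reference table, and `indicator=True` adds a `_merge` column showing which rows found a match. The frames `ref` and `main` are made-up stand-ins, and the GAMMA row is a hypothetical unmatched key.

```python
import pandas as pd

# Hypothetical stand-ins for the reference table and the main table.
ref = pd.DataFrame({
    "MANUF": ["ALPHA", "ALPHA", "BETA"],
    "PRODTYPE": [1, 2, 1],
    "PRODCODE": ["ALPHA1", "ALPHA2", "BETA1"],
})
main = pd.DataFrame({
    "MANUF": ["ALPHA", "BETA", "GAMMA"],
    "PRODTYPE": [2, 1, 1],
    "INVENTORY": [3, 4, 7],
})

checked = main.merge(
    ref,
    on=["MANUF", "PRODTYPE"],
    how="left",
    validate="many_to_one",  # fail loudly if the reference table has duplicate keys
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Rows whose key pair was not found in the reference table.
missing = checked[checked["_merge"] == "left_only"]
print(missing[["MANUF", "PRODTYPE"]])
```

If `missing` is empty, every row received a PRODCODE, and the `_merge` column can then be dropped.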
