
Pandas compare column in same data frame and replace values based on comparison

I have an Excel sheet with over 6000 rows. It has two columns: "IP Address CMDB", which contains IP addresses, and another column called "IP Address LM". For each IP address in "IP Address CMDB", I want to check whether "IP Address LM" contains it and, if so, return ABCD. I could not attach the Excel sheet, so I have attached a screenshot of it.

(screenshot of the Excel sheet)

for col in report:
    if col == "IP Address CMDB":
        col_num = report[col]
        for num in col_num:
            if report["IP Address LM"].str.contains(num):
                print("ABCD")
<ipython-input-13-40cfae2bd937>:5: UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
  if report["IP Address LM"].str.contains(num):
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-40cfae2bd937> in <module>
      3         col_num = report[col]
      4         for num in col_num:
----> 5             if report["IP Address LM"].str.contains(num):
      6                 print("ABCD")
      7 

c:\users\rohit verma\appdata\local\programs\python\python39\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1535     @final
   1536     def __nonzero__(self):
-> 1537         raise ValueError(
   1538             f"The truth value of a {type(self).__name__} is ambiguous. "
   1539             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

You can use the code below to check whether IP Address LM contains the value of IP Address CMDB in the same row:

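# df is the DataFrame read from the Excel sheet (named report in the question).
checkColumn = []
for index, row in df.iterrows():
    Ip = row["IP Address LM"]
    toCheck = row["IP Address CMDB"]
    # Substring test on the two strings of the current row.
    if toCheck in Ip:
        checkColumn.append("ABCD")
    else:
        checkColumn.append(None)
# Store the per-row results in a new column.
df["check"] = checkColumn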

Explanation

iterrows() loops over all of the DataFrame's rows. For each row, the expression toCheck in Ip checks whether the IP Address CMDB value occurs in the IP Address LM value of that row. If it does, ABCD is appended as requested; otherwise None is appended.
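The same per-row test as the loop above can also be written without iterrows(), which tends to be slow on thousands of rows. The following is only a sketch: the sample values are made up for illustration, and it assumes both columns hold plain strings with no missing values.

```python
import pandas as pd

# Illustrative sample data (assumed, not the asker's real sheet).
df = pd.DataFrame({
    "IP Address CMDB": ["10.1.0.36,10.1.53.1", "10.1.11.21", "10.1.5.100"],
    "IP Address LM": ["10.1.0.36", "10.1.11.21", "10.1.5.140"],
})

# Pair up the two columns row by row and apply the same substring test
# as the loop: is the CMDB string contained in the LM string?
check = [
    "ABCD" if cmdb in lm else None
    for cmdb, lm in zip(df["IP Address CMDB"], df["IP Address LM"])
]
df["check"] = check
print(check)
```

Note that the first row does not match here, because the whole CMDB string (two addresses joined by a comma) is not a substring of the single LM address.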

As the source DataFrame (report) I created:

                        IP Address CMDB IP Address LM
0                   10.1.0.36,10.1.53.1     10.1.0.36
1                            10.1.11.21    10.1.11.21
2  10.1.148.20,192.168.128.3,10.1.5.130    10.1.5.130
3                            10.1.5.100    10.1.5.140
4                            10.1.6.120    10.1.6.140

To identify rows where IP Address CMDB contains IP Address LM, you can run, e.g.:

report.apply(lambda row: row['IP Address LM'] in row['IP Address CMDB'], axis=1)

Details:

  1. report.apply - applies the given lambda function to each row (due to the axis=1 parameter).
  2. row['IP Address LM'] in row['IP Address CMDB'] - a substring test on two strings: it checks whether the text in IP Address LM occurs anywhere within the text in IP Address CMDB of the current row.
  3. The returned value answers your question (does IP Address CMDB contain IP Address LM).
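One caveat worth illustrating: on strings, in is a substring test, so 10.1.5.1 would be reported as contained in 10.1.5.130. A possible way around this, sketched below, is to split the CMDB value on commas and compare whole addresses. The helper name lm_in_cmdb and the sample values are assumptions made for this example, and the sketch assumes comma-separated strings with no missing values.

```python
import pandas as pd

# Illustrative data; the second row is chosen to expose the substring pitfall.
report = pd.DataFrame({
    "IP Address CMDB": ["10.1.0.36,10.1.53.1", "10.1.5.130", "10.1.5.100"],
    "IP Address LM": ["10.1.0.36", "10.1.5.1", "10.1.5.140"],
})

def lm_in_cmdb(row):
    """True when the LM address equals one of the comma-separated CMDB addresses."""
    cmdb_ips = {ip.strip() for ip in row["IP Address CMDB"].split(",")}
    return row["IP Address LM"].strip() in cmdb_ips

# Plain substring test, as in the lambda above.
substring_match = report.apply(
    lambda row: row["IP Address LM"] in row["IP Address CMDB"], axis=1
)
# Whole-address comparison.
exact_match = report.apply(lm_in_cmdb, axis=1)

print(pd.DataFrame({"substring": substring_match, "exact": exact_match}))
```

The two results differ only for the second row, where the substring test gives a false positive.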

The result is:

0     True
1     True
2     True
3    False
4    False
dtype: bool

As you can see, IP Address CMDB in the first 3 rows contains IP Address LM from the same row.

If you want to do something more, write your own function that performs your actions and returns some result for the current row, then replace the lambda function with it.
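As a rough sketch of that idea, the named function below returns ABCD for matching rows and stores the result in a new column. The function name label_row, the column name check and the placeholder "no match" are all assumptions made for this example; the data is the sample table shown earlier.

```python
import pandas as pd

# The sample DataFrame from this answer.
report = pd.DataFrame({
    "IP Address CMDB": [
        "10.1.0.36,10.1.53.1",
        "10.1.11.21",
        "10.1.148.20,192.168.128.3,10.1.5.130",
        "10.1.5.100",
        "10.1.6.120",
    ],
    "IP Address LM": [
        "10.1.0.36", "10.1.11.21", "10.1.5.130", "10.1.5.140", "10.1.6.140",
    ],
})

def label_row(row):
    """Hypothetical replacement for the lambda: label the row instead of returning a bool."""
    if row["IP Address LM"] in row["IP Address CMDB"]:
        return "ABCD"
    return "no match"  # placeholder chosen for this sketch

labels = report.apply(label_row, axis=1)
report["check"] = labels
print(report)
```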

A note about your code: str.contains checks whether each element of a column contains a fixed value and returns a boolean Series, one value per row. Using that Series directly in an if statement is what raises the ValueError. What you actually want is to check containment for the values in the current row only.
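To make the difference concrete, here is a small sketch of what str.contains returns and how it could be reduced to a single True/False if a whole-column lookup were really intended. The sample values are illustrative. Passing regex=False treats the pattern as literal text, so the dots in an address are not interpreted as regex wildcards and no regex-related warning is issued.

```python
import pandas as pd

# Illustrative data (assumed).
report = pd.DataFrame({
    "IP Address CMDB": ["10.1.0.36", "10.1.11.21", "10.1.5.100"],
    "IP Address LM": ["10.1.11.21", "10.1.0.36", "10.1.5.140"],
})

num = "10.1.0.36"

# One boolean per row of the LM column, not a single True/False.
mask = report["IP Address LM"].str.contains(num, regex=False)
print(mask)

# An if statement needs one boolean, so the Series must be reduced explicitly,
# for example with any(): does at least one LM value contain num?
found_anywhere = bool(mask.any())
if found_anywhere:
    print("ABCD")
```

This answers "does num occur anywhere in the LM column", which is a different question from the row-by-row comparison shown above.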
