Python Pandas: changing column values to NULL and restoring their original values

Question

I am writing a Python script that automatically changes certain column values to NULL before the file is sent by e-mail.

My goal is to temporarily replace some column values because they contain confidential data. Here is what the data looks like:

svc_no   last_name   first_name   acc_no     some_column
12345    Parker      Peter        1111111    some_value
11111    Stark       Tony         2222222    some_value
22222    Rogers      Steve        3333333    some_value

I have multiple Excel files that I will send to someone, who will then do some processing on them. Before I send them by e-mail, I need to change some column values to NULL because they are confidential.

My desired output looks like this:

svc_no   last_name   first_name   acc_no     some_column
12345    NULL        NULL         NULL       some_value
11111    NULL        NULL         NULL       some_value
22222    NULL        NULL         NULL       some_value

Here is what I did:

I iterate over all the files and get the directory path so that I can back up all the Excel files; I plan to use the backups later as a reference for restoring the original column values. I used the os, shutil and glob libraries.
```
path = os.path.absolute(__file__)
new_path = path + 'source'
files = []
if not os.path.exists(new_path):
    os.makedirs(new_path)
for file in files:
    if file not in new_path:
        shutil.copy(file, new_path)
# continued in the second snippet
```

This code creates a folder in the same directory as the script and copies all the Excel files into the newly created directory, new_path.
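For reference, the backup step described above could be sketched as follows. This is a minimal sketch, not the asker's actual code: the function name `backup_excel_files`, the `*.xlsx` pattern and the default folder name `source` are assumptions made for illustration. Note that the standard library function is `os.path.abspath`; `os.path` has no `absolute`.

```python
import glob
import os
import shutil


def backup_excel_files(source_dir, backup_name="source"):
    """Copy every .xlsx file in source_dir into source_dir/backup_name.

    Hypothetical helper: the name, the pattern and the folder name are
    illustrative assumptions, not taken from the original script.
    """
    backup_dir = os.path.join(source_dir, backup_name)
    os.makedirs(backup_dir, exist_ok=True)  # no error if it already exists

    copied = []
    # The pattern is not recursive, so files already in backup_dir are skipped.
    for path in glob.glob(os.path.join(source_dir, "*.xlsx")):
        target = os.path.join(backup_dir, os.path.basename(path))
        if not os.path.exists(target):  # leave an existing backup untouched
            shutil.copy(path, target)
            copied.append(target)
    return copied
```

Calling `backup_excel_files(os.path.dirname(os.path.abspath(__file__)))` from the script would back up the workbooks that sit next to it. Using `os.path.join` avoids the missing path separator that plain string concatenation such as `path + 'source'` produces.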

Next, I load each Excel file into a DataFrame and change the column values to NULL using .loc:
```
df = pd.read_excel(file)
df.loc[df['l_name'].notnull(), 'last_name'] = 'NULL'
```
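The snippet above builds the mask from `l_name` but assigns to `last_name`. The sample table only shows a `last_name` column, so this may be one source of trouble, since indexing a DataFrame with a missing column label raises `KeyError`. A minimal sketch with the same label in both places, run on hypothetical in-memory data rather than a real Excel file:

```python
import pandas as pd

# Hypothetical sample mirroring the table in the question.
df = pd.DataFrame({
    "svc_no": [12345, 11111, 22222],
    "last_name": ["Parker", "Stark", "Rogers"],
    "first_name": ["Peter", "Tony", "Steve"],
})

# Use the same, existing label for the mask and for the target column.
df.loc[df["last_name"].notnull(), "last_name"] = "NULL"

print(df["last_name"].tolist())
```

The assignment only changes the DataFrame in memory. The Excel file on disk stays as it was until the frame is written back, for example with `to_excel`.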

I also tried inserting a column that contains NULL values and copying it to the desired column using iloc, but nothing happened either. It did not even create the column.

df.insert(loc=5, column='empty_column', value='NULL')
df.iloc[:,1] = df.iloc[:,5]

My problem is that this doesn't change the last_name column values to NULL. Is there another way to do this?

I have already used .iloc and .loc in some of my projects, where they work, so I am confused about why they do nothing here.

Any help will be highly appreciated.

Answer 1

I really don't see the issue here. You seem to be overcomplicating things. Would this not suffice:

df

0   12345   Parker  Peter   1111111 some_value
1   11111   Stark   Tony    2222222 some_value
2   22222   Rogers  Steve   3333333 some_value

Create a confidential version:

confidential_columns = ['last_name', 'first_name', 'acc_no']

confidential_df = df.copy()
confidential_df[confidential_columns] = 'NULL'

You get this:

confidential_df

0   12345   NULL    NULL    NULL    some_value
1   11111   NULL    NULL    NULL    some_value
2   22222   NULL    NULL    NULL    some_value
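Put together as a self-contained script, the steps above might look like this. The DataFrame is built by hand from the sample values in the question, which is an assumption made so the sketch runs without an Excel file:

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "svc_no": [12345, 11111, 22222],
    "last_name": ["Parker", "Stark", "Rogers"],
    "first_name": ["Peter", "Tony", "Steve"],
    "acc_no": [1111111, 2222222, 3333333],
    "some_column": ["some_value"] * 3,
})

confidential_columns = ["last_name", "first_name", "acc_no"]

# Work on a copy so the original values stay available in df.
confidential_df = df.copy()
confidential_df[confidential_columns] = "NULL"

print(confidential_df)
print(df)  # unchanged
```

Because the masking is done on a copy, `df` still holds the real names and account numbers and can serve as the reference for restoring them later.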

Then decide which one to write based on some condition:

confidential = True

def write():
    # The context manager saves and closes the workbook on exit.
    with pd.ExcelWriter('output.xlsx') as writer:
        if confidential:
            confidential_df.to_excel(writer, sheet_name='report')
        else:
            df.to_excel(writer, sheet_name='report')

write()

I'm not going to deal with path/file/directory management when it comes time to write because that seems like it's out of the scope of your issue.
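The "restore the original values" half of the question is not covered above. One way it could be done, sketched under the assumption that `svc_no` is a unique key that the recipient leaves untouched: drop the masked columns from the returned data and merge the originals back in. The `returned` frame below is a hypothetical stand-in for the processed file.

```python
import pandas as pd

confidential_columns = ["last_name", "first_name", "acc_no"]

# Original (backed-up) data, copied from the question's table.
original = pd.DataFrame({
    "svc_no": [12345, 11111, 22222],
    "last_name": ["Parker", "Stark", "Rogers"],
    "first_name": ["Peter", "Tony", "Steve"],
    "acc_no": [1111111, 2222222, 3333333],
    "some_column": ["some_value"] * 3,
})

# Hypothetical stand-in for the file that comes back: masked columns,
# a modified some_column, and rows in a different order.
returned = original.copy()
returned[confidential_columns] = "NULL"
returned["some_column"] = "processed"
returned = returned.iloc[::-1].reset_index(drop=True)

# Drop the masked columns, then bring the originals back via svc_no.
restored = returned.drop(columns=confidential_columns).merge(
    original[["svc_no"] + confidential_columns], on="svc_no", how="left"
)
restored = restored[list(original.columns)]  # original column order

print(restored)
```

Merging on a key rather than copying columns by position keeps each row matched with the right person even if the recipient sorts or filters the rows.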

Answer 1
1 2018-08-04 17:38:45
