How to format a table extracted from Salesforce in Python?

Question

I was able to extract some fields from Salesforce using Python.

I used the following code block:

!pip install simple_salesforce 

from simple_salesforce import Salesforce
import pandas as pd

sf = Salesforce(
    username='',
    password='',
    security_token='')

sf_data = sf.query_all("SELECT Brand_Name__c,Name FROM AuthorisedProduct__c")

sf_df = pd.DataFrame(sf_data)

sf_df.head()

This process puts all items in one 'records' field.

records	total size
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fmAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-B')])	14000
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fnAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-SB')])	14000
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1foAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-2.0-TL-PLUS-B')])	14000

Please note there are 14000 values under records. I want only two fields in a simple dataframe: a table with the 'Brand_Name__c' and 'Name' fields.

Brand_Name__c	Name
ABB	UNO-DM-2.0-TL-PLUS-B
ABB	UNO-DM-1.2-TL-PLUS-SB

That would give a 14000 by 2 matrix.

Please advise how to achieve that.

Also, how can that process be reversed?

Thank you all so much.

Answer 1

You can unpack the OrderedDict objects in the records column:

from collections import OrderedDict
import pandas as pd

df = pd.DataFrame({
    'records':[
        OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fmAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-B')]),
        OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fnAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-SB')]),
        OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1foAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-2.0-TL-PLUS-B')])
    ],
    'total size': [14000]*3
})

df['Brand_Name__c'] = df['records'].apply(lambda x: x['Brand_Name__c'])
df['Name'] = df['records'].apply(lambda x: x['Name'])

Result:

>>> df
                                             records  total size Brand_Name__c                   Name
0  {'attributes': {'type': 'AuthorisedProduct__c'...       14000           ABB   UNO-DM-1.2-TL-PLUS-B
1  {'attributes': {'type': 'AuthorisedProduct__c'...       14000           ABB  UNO-DM-1.2-TL-PLUS-SB
2  {'attributes': {'type': 'AuthorisedProduct__c'...       14000           ABB   UNO-DM-2.0-TL-PLUS-B
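
If only the two requested columns are wanted, one possible variation on the approach above is to expand the whole records column in one step and then select the fields. This is a minimal sketch on sample rows shaped like those in the question (the 'url' entries are left out for brevity); the names `records` and `flat` are illustrative and not from the original answer.

```python
import pandas as pd

# Sample rows shaped like the 'records' column in the question
# ('url' entries omitted for brevity).
records = [
    {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-B'},
    {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-SB'},
    {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-2.0-TL-PLUS-B'},
]
df = pd.DataFrame({'records': records, 'total size': [14000] * 3})

# Expand each dict into its own columns, then keep only the requested fields.
flat = pd.DataFrame(df['records'].tolist())[['Brand_Name__c', 'Name']]
print(flat.shape)  # (3, 2)
```

With the full query result this would presumably give the 14000 by 2 table asked for, without the helper columns.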

Answer 2

You have to be aware of the actual shape of the JSON response sent by Salesforce, which includes a top-level "records" key under which all of your data is contained. Additionally, each record entry contains an "attributes" key, besides the data for the fields you actually requested. You cannot change the shape of the JSON response.
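
To make that shape concrete, here is a small sketch that uses a hand-written dict as a stand-in for the query response. The sample values are made up, and a real response may carry more keys or records than shown here. It illustrates why passing the whole response to pandas leaves every record in a single 'records' cell.

```python
import pandas as pd

# Hand-written stand-in for the dict returned by sf.query_all(...).
# The top-level keys mirror a Salesforce query response; values are sample data.
sf_data = {
    'totalSize': 2,
    'done': True,
    'records': [
        {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-B'},
        {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-2.0-TL-PLUS-B'},
    ],
}

# Passing the whole response yields one column per top-level key,
# so each record stays packed inside a single 'records' cell.
whole = pd.DataFrame(sf_data)
print(list(whole.columns))

# Passing only the list under 'records' yields one column per field.
fields = pd.DataFrame(sf_data['records'])
print(list(fields.columns))
```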

The simple_salesforce documentation provides an example showing how to digest this API response for Pandas:

Generate Pandas Dataframe from SFDC API Query (ex.query,query_all)

import pandas as pd

# Keep the response so that data['records'] below has something to read.
data = sf.query("SELECT Id, Email FROM Contact")

df = pd.DataFrame(data['records']).drop(['attributes'],axis=1)
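
The question also asks how to reverse the process, which neither answer addresses. One possible reading is turning the flat table back into a list of per-row dicts; the sketch below assumes that interpretation and uses a hand-written `data` dict in place of a live query, so the sample values are illustrative only. The Salesforce-specific "attributes" entries are not restored.

```python
import pandas as pd

# Hand-written stand-in for: data = sf.query_all("SELECT Brand_Name__c,Name FROM AuthorisedProduct__c")
data = {
    'totalSize': 2,
    'done': True,
    'records': [
        {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-B'},
        {'attributes': {'type': 'AuthorisedProduct__c'}, 'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-SB'},
    ],
}

# Forward: flatten the records and drop the metadata column.
df = pd.DataFrame(data['records']).drop(columns=['attributes'])

# Reverse (assumed meaning): one plain dict per row, keyed by column name.
rows = df.to_dict(orient='records')
print(rows[0])
```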

Answer 1
1 2021-07-26 04:21:17

Answer 2
1 Accepted 2021-07-26 16:17:01