
Best practice for collections in jsons: array vs dict/map

I need to pass data from a Python back end to a front end through an API call, using JSON. In the Python back end, the data is held in a dictionary, which I can easily and directly convert to JSON. But should I?

My front-end developer believes the answer is no, for reasons related to best practice.

But I challenge that:

Is it best to structure the JSON as the data is structured in Python, or should it rather be converted to some other form, such as several arrays (as would be necessary in my example case below)?

Or, differently put:

What should the governing principles be for collections/dicts/maps/arrays when exchanging information as JSON?

I've done some googling for an answer, but I've not come across much that addresses this directly. Links would be appreciated.

(Note about the example below: of course if the data is written to a database, it would probably make most sense for the front-end to access the database directly, but let's assume this is not the case)

Example:

In the back end there is a collection of objects called pets: each item in the collection has a unique pet_id, some non-optional properties, e.g. name and date_of_birth, some optional properties such as registration_certificate_nr and adopted_from_kennel, some lists like siblings and children, and some objects like medication.

Assuming that the front end needs all of this info at some point, it could be

{
  "pets": {
    "17-01-24-01": {
      "name": "Buster",
      "date_of_birth": "04/01/2017",
      "registration_certificate_nr": "AAD-1123-1432"
    },
    "17-03-04-01": {
      "name": "Hooch",
      "date_of_birth": "05/02/2015",
      "adopted_from_kennel": "Pretoria Shire",
      "children": [
        "17-05-01-01",
        "17-05-01-02",
        "17-05-01-03"
      ]
    },
    "17-05-01-01": {
      "name": "Snappy",
      "date_of_birth": "17-05-01",
      "siblings": [
        "17-05-01-02",
        "17-05-01-03"
      ]
    },
    "17-05-01-02": {
      "name": "Gizmo",
      "date_of_birth": "17-05-01",
      "siblings": [
        "17-05-01-01",
        "17-05-01-03"
      ]
    },
    "17-05-01-03": {
      "name": "Toothless",
      "date_of_birth": "17-05-01",
      "siblings": [
        "17-05-01-01",
        "17-05-01-02"
      ],
      "medication": [
        {
          "name": "anti-worm",
          "code": "aw445",
          "dosage": "1 pill per day"
        },
        {
          "name": "disinfectant",
          "code": "pdi-2",
          "dosage": "as required"
        }
      ]
    }
  }
}

JSON formatting is a somewhat subjective matter, and related disagreements are usually best settled between colleagues.
That being said, there are some potentially valid criticisms to be made against the JSON format in the question, especially if we are trying to create a consistent, RESTful API.

The 2 pain points that stand out:

  1. A map collection is represented in JSON, which isn't really what the JSON standard intends, nor is it particularly RESTful.

  2. None of the pet objects have an id defined. There is a pet_id mentioned in the question, but it seems to be maintained separately from the pet object itself. If a value is accessed in the pets map in the question, a user of the API would have to manually add the pet_id to the provided pet object in order to have the id available further down the line, when the full JSON may no longer be available.
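
Pain point 2 can be illustrated with a minimal Python sketch. It assumes the payload has already been parsed into a dict shaped like the question's example (trimmed here to two pets); the helper name `with_id` is hypothetical and only there for illustration.

```python
import json

# A trimmed copy of the map-style payload from the question.
payload = json.loads("""
{
  "pets": {
    "17-01-24-01": {"name": "Buster", "date_of_birth": "04/01/2017"},
    "17-03-04-01": {"name": "Hooch", "date_of_birth": "05/02/2015"}
  }
}
""")

# Pulling a value out of the map yields an object with no id of its own.
buster = payload["pets"]["17-01-24-01"]
print("id" in buster)  # False


def with_id(pet_id, pet):
    """Hypothetical helper: copy the map key into a new pet object."""
    return {"id": pet_id, **pet}


# The consumer has to do this by hand before passing the pet along.
buster_with_id = with_id("17-01-24-01", buster)
```

Every consumer of the API would need some equivalent of `with_id`, which is the extra work the second pain point describes.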

The closest things we have to guiding standards in this situation are the REST architectural style and the JSON standard.


We can start by looking at the JSON standard. Here is a quote from the JSON wiki:

JavaScript syntax defines several native data types that are not included in the JSON standard: Map, Set, Date, Error, Regular Expression, Function, Promise, and undefined.

The key takeaway here is that JSON is not meant to represent the map data type. Python dictionaries are a map implementation, so directly serializing a dictionary to JSON with the intent to represent a map-like collection goes against the intended use of JSON.

For an individual object like a pet, the JSON object is appropriate, but for collections there is one option: the JSON array. There is a usage example with the JSON array further down in this answer.

There may be edge cases where deviating from the standard makes sense, but I don't see a reason in this scenario.


There are also some shortcomings in the question's JSON format from a RESTful design perspective. RESTful API design is nice because it encourages one to keep things simple and consistent. It also happens to be a de facto industry standard.

In a RESTful HTTP API, this is how fetching a single pet resource should look:

Request: GET /api/pets/17-01-24-01

Response: 200 {
    "id": "17-01-24-01",
    "name": "Buster",
    "date_of_birth": "04/01/2017",
    "registration_certificate_nr": "AAD-1123-1432"
}

The response is a completely defined resource with an explicitly defined id. It is also the simplest complete JSON representation of a pet.
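
On the Python side, producing that single-pet response from a dictionary keyed by pet_id might look like the sketch below. The in-memory `PETS` store, the `get_pet` function, and the 404 error body are all assumptions made for illustration; they are not part of the question, and no particular web framework is implied.

```python
import json

# Hypothetical in-memory store, keyed by pet_id as in the question's back end.
PETS = {
    "17-01-24-01": {
        "name": "Buster",
        "date_of_birth": "04/01/2017",
        "registration_certificate_nr": "AAD-1123-1432",
    },
}


def get_pet(pet_id):
    """Return (status, JSON body) for GET /api/pets/<pet_id>."""
    pet = PETS.get(pet_id)
    if pet is None:
        # Assumed error shape; the original answer does not specify one.
        return 404, json.dumps({"error": "pet not found"})
    # The dictionary key becomes an explicit "id" field on the resource.
    return 200, json.dumps({"id": pet_id, **pet})


status, body = get_pet("17-01-24-01")
```

The back end can keep its dictionary for fast lookups internally; only the serialized resource needs to carry the id.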

Next, we define what fetching multiple pet resources looks like, assuming only 2 pets are defined:

Request: GET /api/pets

Response: 200 [
    {
        "id": "17-01-24-01",
        "name": "Buster",
        "date_of_birth": "04/01/2017",
        "registration_certificate_nr": "AAD-1123-1432"
    },
    {
        "id": "17-03-04-01",
        "name": "Hooch",
        "date_of_birth": "05/02/2015",
        "adopted_from_kennel": "Pretoria Shire",
        "children": [
            "17-05-01-01",
            "17-05-01-02",
            "17-05-01-03"
        ]
    }
]

The above response format is the most straightforward way to pluralize the single resource response format, thus keeping the API as simple and consistent as possible. (For the sake of brevity, I only used 2 of the sample resources from the question.) Once again, the ids are explicitly defined, and belong to their respective pet objects.
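
Getting from the back end's dictionary to this array is a one-line transformation. The following is a minimal sketch, assuming the data lives in a dict keyed by pet_id; the function name `pets_to_array` is hypothetical.

```python
import json


def pets_to_array(pets_by_id):
    """Turn {pet_id: pet} into a list of pets that each carry their own id.

    Assumes the stored pet objects do not already contain an "id" key.
    """
    return [{"id": pet_id, **pet} for pet_id, pet in pets_by_id.items()]


# Back-end data in its native dictionary form (two pets, as in the response above).
pets_by_id = {
    "17-01-24-01": {
        "name": "Buster",
        "date_of_birth": "04/01/2017",
        "registration_certificate_nr": "AAD-1123-1432",
    },
    "17-03-04-01": {
        "name": "Hooch",
        "date_of_birth": "05/02/2015",
        "adopted_from_kennel": "Pretoria Shire",
        "children": ["17-05-01-01", "17-05-01-02", "17-05-01-03"],
    },
}

# Body for GET /api/pets: a JSON array rather than a JSON object.
body = json.dumps(pets_to_array(pets_by_id), indent=4)
```

Since Python dictionaries preserve insertion order, the array comes out in the same order the pets were stored.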

Nothing is gained from adding map keys to the above format.
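
If a consumer does want keyed access, it can rebuild the index from the array in one pass. The sketch below uses Python for consistency with the back end (a front end would do the equivalent in its own language), and the sample data is trimmed to the fields needed for the illustration.

```python
# Array-style response, already parsed; fields trimmed for brevity.
pets = [
    {"id": "17-01-24-01", "name": "Buster"},
    {"id": "17-03-04-01", "name": "Hooch"},
]

# One pass rebuilds keyed access, and each pet still carries its own id.
pets_by_id = {pet["id"]: pet for pet in pets}

hooch = pets_by_id["17-03-04-01"]
```

The reverse direction is what costs extra: a value taken from a map-style response has lost its key unless the consumer copies it in.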

Proponents of the JSON format in the question may suggest simply adding the id field to each pet object in order to work around pain point 2, but that raises the question of repeated data within the response. Why does the id need to be both inside and outside the object? Surely it should only be on the inside? After eliminating the redundant data, the result will look like the response above.

That is the REST argument. There are use cases where REST doesn't really work, but this is far from that.


PS. Front ends should never access databases directly. The API is responsible for writing to and reading from whatever data persistence infrastructure is used. In many larger real-world systems, there is even an additional BFF (backend-for-frontend) layer between the front end and the API(s), separating the front end and the DB even further.
