
How do I turn a piece of text into a parent children JSON file?

I have a text file that consists of chapters and clauses. It's the constitution of Kenya.

I want to convert it to something similar to Flare.json, which looks like this:

{"name": "ROOT",
 "children": [
        {"name": "Hemiptera",
         "children": [
             {"name": "Miridae",
              "children": [
                  {"name": "Kanakamiris", "children":[]},
                  {"name": "Neophloeobia",
                   "children": [
                       {"name": "incisa", "children":[] }
                   ]}
              ]}
         ]},
        {"name": "Lepidoptera",
         "children": [
             {"name": "Nymphalidae",
              "children": [
                  {"name": "Ephinephile",
                   "children": [
                       {"name": "rawnsleyi", "children":[] }
                   ]}
              ]}
         ]}
    ]}

Is there a way I can do this programmatically in JavaScript, Python, or R?

First, let me propose an input format for you. It could look like this:

var kenyaConstitutionArray = ["1#SOVEREIGNTY OF CONSTITUTION", "1:1#All sovereign...", "1:2#...",....,"100#....","100:1#..."]

Here, 1# on its own marks a chapter, 1:1# marks the first sub-clause of chapter 1, and 1:1:1# marks the first sub-sub-clause of chapter 1. I've used # as the separator because I assume it does not appear in the text.

To get the chapters and clauses, do the following:

var path = text.substring(0, text.indexOf('#')); // the path (levels), e.g. "1:1"

Here, text is an element of the array, e.g. text = kenyaConstitutionArray[1].

Now get the chapter:

var chapter = path.split(':')[0]; // also works for a chapter line, whose path contains no ':'

Get the sub-clauses in the same way, with small modifications, and build the JSON either in a loop or recursively.
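The loop-based approach could be sketched as follows. This is a minimal sketch, not a definitive implementation: the function name buildTree, the nodesByPath lookup, and the placeholder clause texts are my own assumptions, and it assumes every parent entry appears in the array before its children.

```javascript
// Assumed input format: "<path>#<text>", where <path> is a colon-separated
// list of numbers such as "1", "1:1" or "1:1:1".
// Assumption: a parent entry always appears before its children.
function buildTree(entries, rootName) {
    var root = { name: rootName, children: [] };
    // Maps a path string (e.g. "1:2") to the node created for it.
    // The empty path stands for the root.
    var nodesByPath = {};
    nodesByPath[""] = root;

    entries.forEach(function (entry) {
        var hashIndex = entry.indexOf("#");
        if (hashIndex === -1) {
            throw new Error("Entry has no '#' separator: " + entry);
        }
        var path = entry.slice(0, hashIndex);
        var text = entry.slice(hashIndex + 1);

        // "1:2:3" has parent path "1:2"; a chapter such as "1" hangs off the root.
        var lastColon = path.lastIndexOf(":");
        var parentPath = lastColon === -1 ? "" : path.slice(0, lastColon);

        var parent = nodesByPath[parentPath];
        if (!parent) {
            throw new Error("No parent found for entry: " + entry);
        }
        var node = { name: text, children: [] };
        parent.children.push(node);
        nodesByPath[path] = node;
    });
    return root;
}

// Placeholder data; only the first title comes from the example above.
var sample = [
    "1#SOVEREIGNTY OF CONSTITUTION",
    "1:1#placeholder clause",
    "1:1:1#placeholder sub-clause",
    "1:2#another placeholder clause",
    "2#placeholder chapter"
];
console.log(JSON.stringify(buildTree(sample, "Constitution of Kenya"), null, 2));
```

Looking parents up by path keeps this to a single pass over the array, at the cost of requiring the entries to be ordered parent-first.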

Another way: for the input, you can use nested arrays as well, like this:

var kenyaConstitution = [["chapter1",[["clause1", [["subclause1"], ["subclause2"]]],["clause2",[["subclause2-1"], ["subclause2-2"]]]]]];

Converting the nested array above to JSON is straightforward. In this case, recursion is a good approach.

EDIT:

Complete code:

[Ignore the comments in the code.]

    <!DOCTYPE html>
<head>
    <title>JSON</title>
        <script>
            function kenyaConstitutionToJSON() {
                var kenyaConstitution = [["chapter1",[["clause1", [["subclause1"], ["subclause2"]]],["clause2",[["subclause2-1"], ["subclause2-2"]]]]]];
                var kenyaChapterJSON;
                var kenJSON = {};
                kenJSON["name"] = "Constitution of Kenya";
                kenJSON["children"] = [];
                if (kenyaConstitution.length === 0) {
                    console.log("Constitution is empty! Please pass constitution through Parliament...");
                    return;
                }
                // Convert each chapter and attach it to the root node.
                // An index loop is used because for...in is not meant for arrays.
                for (var i = 0; i < kenyaConstitution.length; i++) {
                    kenyaChapterJSON = convertToJSON(kenyaConstitution[i]) || {};
                    kenJSON["children"].push(kenyaChapterJSON);
                }
                return kenJSON;
            }
            function convertToJSON(constitutionArray) {
                // constitutionArray has the shape [name, [child, child, ...]];
                // a leaf such as ["subclause1"] has no second element.
                var obj = {};
                obj["name"] = constitutionArray[0];
                obj["children"] = [];
                var children = constitutionArray[1] || [];
                for (var i = 0; i < children.length; i++) {
                    // Recurse into each child, which has the same shape.
                    obj["children"].push(convertToJSON(children[i]));
                }
                return obj;
            }

            kenyaConstitutionToJSON();
        </script>
</head>
<body>
</body>

Place a breakpoint on the return kenJSON; line and inspect the output. It should look like this:

OUTPUT:

{
    "name":"Constitution of Kenya",
    "children":[
        {
            "name":"chapter1",
            "children":[
                {
                    "name":"clause1",
                    "children":[
                        {
                            "name":"subclause1",
                            "children":[

                            ]
                        },
                        {
                            "name":"subclause2",
                            "children":[

                            ]
                        }
                    ]
                },
                {
                    "name":"clause2",
                    "children":[
                        {
                            "name":"subclause2-1",
                            "children":[

                            ]
                        },
                        {
                            "name":"subclause2-2",
                            "children":[

                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
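Since the question asks for a JSON file, the resulting object still has to be serialized. A minimal sketch, assuming the code runs under Node.js rather than in the browser page above; writeTreeToFile, exampleTree, and the constitution.json file name are hypothetical names of mine:

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Serializes a {name, children} tree with 4-space indentation and writes it to disk.
function writeTreeToFile(tree, filePath) {
    fs.writeFileSync(filePath, JSON.stringify(tree, null, 4), "utf8");
}

// Small stand-in for the object returned by kenyaConstitutionToJSON().
var exampleTree = {
    name: "Constitution of Kenya",
    children: [{ name: "chapter1", children: [] }]
};

// The temp directory is used here only so the sketch can run anywhere.
var outputPath = path.join(os.tmpdir(), "constitution.json");
writeTreeToFile(exampleTree, outputPath);
```

In a browser, calling JSON.stringify on the object and logging or copying the string would serve the same purpose.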

Hope that helps.
