
Parsing hOCR to JSON with Python

Original post: 2018-07-19 11:16:36 | Tags: python, postgresql, parsing, python-tesseract, hocr

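I am using tesseract-ocr and getting its output in hOCR format. I need to store this hOCR output in a database (PostgreSQL in my case).

Since I may need to retrieve each piece of information in the hOCR separately (about 80% of it), is this the right approach? Should I store it using the XML data type, or parse it to JSON and store that? And if I go with JSON, how can I parse hOCR to JSON using Python? Any other relevant suggestions are also appreciated.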

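hOCR appears to be a dialect of XML, so you should be able to use the xml.etree.ElementTree module from the stdlib to parse the hOCR markup into a tree you can navigate in Python. Then walk that tree to build objects or nested dictionaries, and finally use the stdlib json module to convert that dictionary to JSON.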
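The approach described above could look roughly like the sketch below. It is only an illustration under a few assumptions: the hOCR is well-formed XHTML (which is what tesseract normally emits), the `SAMPLE_HOCR` string is a made-up stand-in for real output, and the helper names `parse_title`, `element_to_dict` and `hocr_to_json` as well as the shape of the resulting JSON are hypothetical choices, not part of any library.

```python
import json
import xml.etree.ElementTree as ET

# Made-up, minimal hOCR fragment used only for illustration.
SAMPLE_HOCR = """<html xmlns="http://www.w3.org/1999/xhtml">
 <body>
  <div class="ocr_page" id="page_1" title="bbox 0 0 640 480">
   <span class="ocr_line" id="line_1_1" title="bbox 36 92 580 122">
    <span class="ocrx_word" id="word_1_1" title="bbox 36 92 96 116; x_wconf 95">Hello</span>
    <span class="ocrx_word" id="word_1_2" title="bbox 110 92 200 116; x_wconf 91">world</span>
   </span>
  </div>
 </body>
</html>"""


def parse_title(title):
    """Split a title attribute like 'bbox 0 0 10 10; x_wconf 95' into a dict."""
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        key, values = tokens[0], tokens[1:]
        if key == "bbox":
            # Bounding boxes are four integers: x0 y0 x1 y1.
            props[key] = [int(v) for v in values]
        else:
            # Other properties are kept as plain strings in this sketch.
            props[key] = " ".join(values)
    return props


def element_to_dict(elem):
    """Recursively turn an hOCR element into a nested dictionary."""
    node = {
        "class": elem.get("class"),
        "id": elem.get("id"),
        "properties": parse_title(elem.get("title", "")),
    }
    # Only direct children that carry an hOCR class (ocr_*, ocrx_*) are nested.
    children = [
        element_to_dict(child)
        for child in elem
        if child.get("class", "").startswith("ocr")
    ]
    if children:
        node["children"] = children
    else:
        # Leaf elements (typically words) keep their text content.
        node["text"] = "".join(elem.itertext()).strip()
    return node


def hocr_to_json(hocr_text):
    """Parse hOCR markup and return a JSON string with one entry per page."""
    root = ET.fromstring(hocr_text)
    pages = [
        element_to_dict(el) for el in root.iter() if el.get("class") == "ocr_page"
    ]
    return json.dumps({"pages": pages}, ensure_ascii=False)


if __name__ == "__main__":
    print(hocr_to_json(SAMPLE_HOCR))
```

The string returned by `hocr_to_json` could then be stored in a PostgreSQL `json` or `jsonb` column, which would let you query individual words or bounding boxes later. Real hOCR files may include elements or properties this sketch does not handle, so treat it as a starting point.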
