Python: find and replace multiple comment lines in an array with a single parsed comment line

Question

Suppose we have read a Python file that contains multi-line comments and some code. It is stored in data as a list or np.ndarray:

data = ["# this", "# is", "# the first comment", "print('hello world')", "# second comment"]

expected_output = ["```this is the first comment```", "print('hello world')", "``` second comment```"]
expected_output

The desired output replaces each run of elements starting with the # character with a single parsed comment wrapped in backtick characters:

['```this is the first comment```',
 "print('hello world')",
 '``` second comment```']

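I can do the parsing, but I don't know how to replace the individual lines (e.g. indices [0, 1, 2] in the example above) with the newly formatted single line.

The script so far: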

from pathlib import Path
import numpy as np 
from itertools import groupby
from operator import itemgetter


def get_consecutive_group_edges(data: np.ndarray):
    # https://stackoverflow.com/a/2154437/9940782
    edges = []

    for k, g in groupby(enumerate(data),lambda x:x[0]-x[1]):
        group = (map(itemgetter(1),g))
        group = list(map(int, group))
        edges.append((group[0],group[-1]))
    
    # convert ranges into group index
    # https://stackoverflow.com/a/952952/9940782
    group_lookup = dict(enumerate(edges))

    return group_lookup

if __name__ == "__main__":

    # https://stackoverflow.com/a/17141572/9940782
    filedata = ["# this", "# is", "# the first comment", "print('hello world')", "# second comment"]

    # find all consecutive lines starting as comments
    comment_lines = np.argwhere([l[0] == "#" for l in filedata])
    group_lookup = get_consecutive_group_edges(comment_lines)

    output_lines = []
    for comment_idx in group_lookup.keys():
        # extract the comment groups
        min_comment_line = group_lookup[comment_idx][0]
        max_comment_line = group_lookup[comment_idx][1] + 1
        data = filedata[min_comment_line: max_comment_line]
        
        # remove the comment characters
        output = "".join(data).replace("\n", " ").replace("#", "")
        # wrap in ```
        output = "```" + output + "```" + "\n"

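I'm failing at the last step: how do I replace all the values between min_comment_line and max_comment_line of each group with the single newly parsed output?

Is there something I could do with the non-comment lines instead?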

non_comment_lines = np.argwhere([l[0] != "#" for l in filedata])

Answer 1

You can assign to a list slice in Python, which lets you replace several elements with a single one:

    ...
    # make a copy of the original list, so we can replace the comments
    output_lines = filedata.copy()
    # iterate backwards so the indices line up
    for comment_idx in reversed(group_lookup):
        # extract the comment groups
        min_comment_line = group_lookup[comment_idx][0]
        max_comment_line = group_lookup[comment_idx][1] + 1
        data = filedata[min_comment_line:max_comment_line]

        # remove the comment characters
        output = "".join(data).replace("\n", " ").replace("#", "")
        # wrap in ```
        output = "```" + output + "```"
        output_lines[min_comment_line:max_comment_line] = [output]
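
As a minimal, self-contained sketch of the slice assignment described above: the `lines` list and the `groups` index pairs below are made-up illustration values, not part of the question's script. Assigning a one-element list to a slice collapses the whole range into a single item, and working from the last group to the first keeps the earlier indices valid.

```python
# Hypothetical sample data: three comment lines, one code line, one comment line.
lines = ["# this", "# is", "# the first comment", "print('hello world')", "# second comment"]

# Hypothetical (start, stop) pairs, one per run of comment lines; stop is exclusive.
groups = [(0, 3), (4, 5)]

result = lines.copy()
# Go from the last group to the first: shrinking a later slice
# does not shift the indices of the earlier ones.
for start, stop in reversed(groups):
    merged = "".join(lines[start:stop]).replace("#", "")
    # Assigning a one-element list replaces the whole slice with a single item.
    result[start:stop] = ["```" + merged + "```"]

print(result)
```

Iterating forwards would also work if the indices were adjusted after each replacement, but going backwards avoids that bookkeeping.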

However, the whole operation can be much simpler, because groupby only groups consecutive matching elements:

    output_lines = []
    # iterate over consecutive sections of comments and code
    for is_comment, lines in groupby(filedata, key=lambda x: x[0] == "#"):
        if is_comment:
            # remove the comment characters
            output = "".join(lines).replace("\n", " ").replace("#", "")
            # wrap in ```
            output_lines.append("```" + output + "```")
        else:
            # leave code lines unchanged
            output_lines.extend(lines)
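
For reference, here is a runnable sketch of the same groupby idea wrapped in a function. The name `merge_comment_runs` is hypothetical, and two details are assumptions rather than part of the answer above: it tests lines with `str.startswith` (so an empty string does not raise an IndexError the way `x[0]` would), and it strips the whitespace around each comment before joining the pieces with single spaces.

```python
from itertools import groupby


def merge_comment_runs(lines):
    """Collapse each run of consecutive '#' lines into one backtick-wrapped string."""
    merged = []
    # groupby yields one group per run of consecutive lines sharing the same key,
    # so runs of comments and runs of code alternate.
    for is_comment, run in groupby(lines, key=lambda line: line.startswith("#")):
        if is_comment:
            # Drop the leading '#' and surrounding whitespace from each line,
            # then join the pieces with single spaces.
            text = " ".join(line.lstrip("#").strip() for line in run)
            merged.append("```" + text + "```")
        else:
            # Code lines pass through unchanged.
            merged.extend(run)
    return merged


filedata = ["# this", "# is", "# the first comment", "print('hello world')", "# second comment"]
print(merge_comment_runs(filedata))
```

Because of the whitespace stripping, the second comment comes out without the leading space shown in the question's expected output; drop the `.strip()` call if that space should be kept.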

Solution 1
1 Accepted 2022-03-02 00:30:51
