Write all lines for each group of a range to a new file each time the range changes (Python 3.6)

Question

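I'm trying to find a way to make this process work Pythonically, or at all. Basically, I have a very long text file that is split into lines. Every x lines there is a line that is mostly uppercase, which should roughly be the title of that particular section. Ideally, I want the title and everything after it to go into a text file that uses the title as its file name. In this case that would have to happen 3039 times, since that is how many titles there will be. My process so far is this: I created a function that reads a piece of text and tells me whether it is mostly uppercase.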

import numpy as np

def mostly_uppercase(text):
    """Return True if at least 70% of the characters in text are uppercase."""
    threshold = 0.7
    if not text:  # an empty string has no characters to average
        return False
    isupper_ints = [int(character.isupper()) for character in text]
    return bool(np.mean(isupper_ints) >= threshold)

After that, I made a counter so that I could build a list of headline indices, and then combined it with the text:

headline_indices = []

# Record the index of every line that looks like a headline.
for index, line in enumerate(page_text):
    if mostly_uppercase(line):
        print(line)
        headline_indices.append(index)

headlines_with_articles = []
# Use len(page_text) as the final boundary so the last line is not dropped.
headline_indices_expanded = [0] + headline_indices + [len(page_text)]

# Slice the text between each pair of consecutive boundaries.
for first, second in zip(headline_indices_expanded, headline_indices_expanded[1:]):
    article_text = page_text[first:second]
    headlines_with_articles.append(article_text)

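As far as I can tell, all of this seems to work fine. However, when I try to write the pieces out to files, all I manage to do is write the same text into every one of the txt files.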

for i in range(100):
    out_pathname = '/sharedfolder/temp_directory/' + 'new_file_' + str(i) + '.txt'
    with open(out_pathname, 'w') as fo:
        fo.write(articles_filtered[2])

Edit: This got me halfway there. Now I just need a way to name each file after its first line.

for i, text in enumerate(articles_filtered):
    # The with block closes each file as soon as it has been written.
    with open('/sharedfolder/temp_directory/' + str(i + 1) + '.txt', 'w') as fo:
        fo.write(str(text))

Answer 1

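One traditional way of processing a single input file is to use a Python with statement and a for loop, as shown below. I have also adapted a good answer from someone else for counting uppercase characters, to get the fraction that you need.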

def mostly_upper(text):
    threshold = 0.7
    ## adapted from https://stackoverflow.com/a/18129868/131187
    if not text:  # a blank line has no characters; avoid dividing by zero
        return False
    upper_count = sum(1 for c in text if c.isupper())
    return upper_count/len(text) >= threshold

first = True
out_file = None
with open('some_uppers.txt') as some_uppers:
    for line in some_uppers:
        line = line.rstrip()
        # Start a new output file on the first line and on every headline.
        if first or mostly_upper(line):
            first = False
            if out_file:
                out_file.close()
            out_file = open(line + '.txt', 'w')
        print(line, file=out_file)
# out_file is still None if the input file was empty.
if out_file:
    out_file.close()

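In the loop we read each line and ask whether it is mostly uppercase. If it is, we close the file that was being used for the previous collection of lines and open a new file for the next collection, using the contents of the current line as its name.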
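The same grouping logic can also be sketched as a function that collects the sections in memory before anything is written, which makes it easy to check on a small sample. This is only an illustration: `group_by_headline` is a hypothetical helper name that does not appear in the answer, and it assumes the same 0.7 threshold.

```python
def mostly_upper(text, threshold=0.7):
    """Return True if at least `threshold` of the characters are uppercase."""
    if not text:  # guard against blank lines (avoids division by zero)
        return False
    upper_count = sum(1 for c in text if c.isupper())
    return upper_count / len(text) >= threshold


def group_by_headline(lines):
    """Split lines into sections; each mostly-uppercase line starts a new one.

    Hypothetical helper for illustration. Lines that come before the first
    headline form a section of their own.
    """
    sections = []
    for line in lines:
        line = line.rstrip()
        if not sections or mostly_upper(line):
            sections.append([])  # start a new section
        sections[-1].append(line)
    return sections


sample = ["intro text", "FIRST HEADLINE", "body one", "", "SECOND HEADLINE", "body two"]
sections = group_by_headline(sample)
```

Each inner list then holds one headline followed by its lines, so the first element of a section can serve as the file name when the section is written out.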

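I have allowed for the possibility that the first line might not be a title. In that case the code creates a file named after the contents of the first line, and keeps writing everything it finds to that file until it reaches a title line.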
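Because a headline can contain characters that are not valid in a file name (a `/`, for example), it may be worth cleaning the line before passing it to `open`. The following is a minimal sketch under stated assumptions: `safe_filename` and `write_section` are hypothetical helpers, and the temporary directory is used only for the demonstration. None of this comes from the original answer.

```python
import re
import tempfile
from pathlib import Path


def safe_filename(headline, max_length=100):
    """Turn a headline into a conservative file name (hypothetical helper)."""
    # Replace every run of characters other than ASCII letters, digits,
    # '-' or '_' with a single underscore, then trim underscores at the ends.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", headline).strip("_")
    # Fall back to a placeholder if nothing usable is left.
    return (cleaned[:max_length] or "untitled") + ".txt"


def write_section(directory, lines):
    """Write one section to a file named after its first line."""
    path = Path(directory) / safe_filename(lines[0])
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Demonstration in a throwaway directory (assumed for this example only).
with tempfile.TemporaryDirectory() as tmp:
    written = write_section(tmp, ["BREAKING: NEWS/UPDATE", "body text"])
    written_name = written.name
    written_text = written.read_text(encoding="utf-8")
```

Note that two sections with the same cleaned headline would overwrite each other in this sketch; appending a counter to the name is one way around that if duplicate headlines are expected.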
