
How to transfer different portions of text from one file to another in Python

My goal is to split a text document into different percentages of its text (5%, 10%, 15%, …) and then insert each portion into a different file located in a directory.

My attempt

Code for opening and splitting a text document into fractions.

def text_percent(fn, *percentages):
    # Read the whole file, then return the first pt% of it for each percentage.
    with open(fn) as f:
        text = f.read()
    return [text[:int(pt / 100. * len(text))] for pt in percentages]

vi = range(5, 100, 5)

for x in vi:
    # Raw string so the backslash in the Windows path is taken literally.
    print("\n\n".join(text_percent(r"C:\zzzz", x)))

Code for listing the files (in the directory) into which the different portions of text will be inserted.

import os

def dir_files(paf):
    # Collect the full path of every file under paf, including subdirectories.
    files_ = []
    for dirname, dirnames, filenames in os.walk(paf):
        for filename in filenames:
            files_.append(os.path.join(dirname, filename))
    return files_

Area of difficulty: How to automatically take 5% of the text and insert it into the first file of the directory, then 10% into the second file of the directory, and so on.

Thanks for your suggestions.

One problem with your code is that your text_percent function gets all of the text from the beginning of the file to the percent point you specify, rather than only the section you want. The following will break a file into the segments you want:

def text_percent(fn, percentages):
  text = open(fn).read()
  # convert the percents to the number of characters
  percentinchars = [int(x * len(text) / 100) for x in [0] + list(percentages) + [100]]
  # convert the markers into pairs of lo/hi bounds
  bounds = zip(percentinchars, percentinchars[1:])
  # use those lo/hi bounds to get the actual character sets
  return [text[lo:hi] for lo,hi in bounds]
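
To make the lo/hi bounds concrete, here is a minimal sketch of the same slicing idea applied to a short in-memory string instead of a file. The helper name split_by_percent and the sample string are made up for illustration and are not part of the original code.

```python
def split_by_percent(text, percentages):
    # Cut points as percentages: 0, each requested marker, and 100.
    marks = [0] + list(percentages) + [100]
    # Convert each percentage to a character offset (integer division).
    cuts = [len(text) * p // 100 for p in marks]
    # Pair neighbouring cut points into (lo, hi) slices.
    return [text[lo:hi] for lo, hi in zip(cuts, cuts[1:])]

sample = "abcdefghijklmnopqrst"  # hypothetical 20-character input
segments = split_by_percent(sample, [25, 50, 75])
print(segments)  # ['abcde', 'fghij', 'klmno', 'pqrst']
```

The segments do not overlap, and joining them back together gives the original text.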

Another problem is that you were passing only a single percent marker to the function, rather than all of the percent markers you wanted. Below I pass the whole range of percent markers to the function, and get back the full list of segments from the file.

print("\n\n".join(text_percent(r"C:\zzzz", range(5,100,5))))
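
For the remaining step, putting the first segment into the first file, the second into the second, and so on, one possible approach is to pair the two lists with zip. This is only a sketch: the helper name insert_segments is hypothetical, and it assumes that "insert" means appending to the end of each existing file. The demo uses a throwaway temporary directory and made-up segments so that no real files are touched.

```python
import os
import tempfile

def insert_segments(segments, paths):
    # Pair the i-th segment with the i-th file; zip stops at the shorter list.
    for segment, path in zip(segments, paths):
        # Assumption: "insert" means append to the end of the existing file.
        with open(path, "a") as f:
            f.write(segment)

# Demo in a temporary directory with three small placeholder files.
demo_dir = tempfile.mkdtemp()
paths = []
for name in ("a.txt", "b.txt", "c.txt"):
    path = os.path.join(demo_dir, name)
    with open(path, "w") as f:
        f.write("old:")
    paths.append(path)

insert_segments(["one", "two", "three"], paths)

contents = []
for path in paths:
    with open(path) as f:
        contents.append(f.read())
print(contents)  # ['old:one', 'old:two', 'old:three']
```

With the functions from this thread, the segments would come from text_percent and the paths from dir_files. Since os.walk does not return file names in a guaranteed order, sorting the paths first (for example with sorted) gives "first file" and "second file" a stable meaning.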
