
Search for text and copy it into the corresponding file using pandas/Python

Suppose I have a source file containing product prices from local shops.

$ less SourceFile.txt  # view this file with less in the terminal.
Store Price Dollars
>shop1 >price1 $5
>shop2 >price2 $3

There is also some marketing data for each Store, kept in separate files called sub files. Unfortunately, they are incomplete.

$ less SubFile1.txt  
>owner
TimmyCoLtd
>shop1
grape 

$ less SubFile2.txt 
>shop2
potato
>salesman
John

$ less SubFile3.txt  # no discount information in Source File.
>discount
Nothing

Here is the exact output I'd like to see.

$ less New.SubFile1.txt  
>owner
TimmyCoLtd
>shop1
grape
>price1
$5 

$ less New.SubFile2.txt 
>shop2
potato
>salesman
John
>price2 
$3

$ less New.SubFile3.txt  # an identical copy of SubFile3.txt.
>discount
Nothing

If a Store in a Sub File also appears in the Source File (all Store and Price names start with >), then copy its Price and Dollars from the Source File and append them to the Sub File.

If a Sub File shares no Store with the Source File, then simply write an identical copy of it, as with New.SubFile3.txt.

Are there any good Python packages for this?

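An efficient approach is to build a dictionary from the source file, using the Store column as the key and the remaining columns (Price and Dollars) as the value. Each sub file is then copied, and if one of its > lines is a key in the dictionary, the matching Price and Dollars are appended on separate lines.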
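To make the shape of that dictionary concrete, here is a minimal sketch that builds it from the sample rows in the question. The helper name build_price_lookup is hypothetical, and storing each value as a (price, dollars) tuple is an assumption made for readability; it is not taken from the answer's own code.

```python
import io

# Sample rows copied from SourceFile.txt in the question.
SOURCE_TEXT = """Store Price Dollars
>shop1 >price1 $5
>shop2 >price2 $3
"""

def build_price_lookup(lines):
    """Map each store name to a (price name, dollars) tuple.

    Hypothetical helper: assumes one header row followed by
    whitespace-separated rows of exactly three fields.
    """
    rows = iter(lines)
    next(rows, None)  # skip the header row
    lookup = {}
    for row in rows:
        fields = row.split()
        if len(fields) == 3:  # ignore blank or malformed rows
            store, price, dollars = fields
            lookup[store] = (price, dollars)
    return lookup

prices = build_price_lookup(io.StringIO(SOURCE_TEXT))
print(prices)
# → {'>shop1': ('>price1', '$5'), '>shop2': ('>price2', '$3')}
```

A store that is missing from the source file, such as >discount, is simply absent from the dictionary, which is what lets the later lookup fall back to a plain copy.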

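from pathlib import Path

# Map each store name (e.g. '>shop1') to the rest of its row (e.g. '>price1 $5').
with open('source_file.txt') as fp:
    next(fp)  # skip the header row
    res = dict(line.split(maxsplit=1) for line in fp if line.strip())

# The sub files are assumed to live in a 'files' directory; adjust the paths as needed.
for file in Path('files').glob('*.txt'):
    with file.open() as fp, open(f'new_{file.stem}.txt', 'w') as fw:
        data = fp.readlines()
        fw.writelines(data)  # always copy the sub file unchanged first
        for line in data:
            if line.startswith('>') and line.strip() in res:
                # Make sure the copied text ends with a newline before appending.
                if not data[-1].endswith('\n'):
                    fw.write('\n')
                # Write the price name and the dollar amount on separate lines.
                fw.write('\n'.join(res[line.strip()].split()) + '\n')
                break  # only the first matching store is used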
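The matching rule can also be checked without touching the file system. The sketch below is one possible way to do that: merge_prices is a hypothetical helper name, and unlike the loop above it appends the rows for every matching store rather than stopping at the first one, which is an assumption about what is wanted when a sub file lists several shops.

```python
def merge_prices(sub_text, lookup):
    """Return sub_text with the price rows of each matching store appended.

    lookup maps a store name such as '>shop1' to the text that follows it
    in the source file, e.g. '>price1 $5'. Hypothetical helper, not part
    of any library.
    """
    lines = sub_text.splitlines()
    extra = []
    for line in lines:
        key = line.strip()
        if key.startswith('>') and key in lookup:
            # Put the price name and the dollar amount on separate lines.
            extra.extend(lookup[key].split())
    # Re-join with exactly one newline per line, so no blank line appears
    # between the original text and the appended rows.
    return ''.join(text + '\n' for text in lines + extra)


lookup = {'>shop1': '>price1 $5', '>shop2': '>price2 $3'}
print(merge_prices('>owner\nTimmyCoLtd\n>shop1\ngrape\n', lookup), end='')
print(merge_prices('>discount\nNothing\n', lookup), end='')
```

The first call appends >price1 and $5 after grape; the second returns the text unchanged because >discount is not a key. Writing the result to a file named as in the question, for example with Path(f'New.{file.name}').write_text(...), is left to the caller.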


 