
Python txt extractor & organizer

I need to extract the details of some customers and save them in a new database. All I have is a txt file covering 5,000 customers or more, and every record in it is saved in this layout:

first and last name   
NAME SURNAME            
zip country n. phone number mobile
United Kingdom      +1111111111
e-mail
email@email.email
guest first and last name 1°
NAME SURNAME
guest first and last name 2°
NAME SURNAME
name    address city    province
NAME SURNAME    London  London  
zip
AAAAA
Cancellation of the reservation.

Since the file always follows this layout, I thought there might be a way to scrape it. After some research, this is what I have come up with so far, but it is not really what I need:

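with open('input.txt') as infile, open('output.txt', 'w') as outfile:
    copy = False
    for line in infile:
        # a label line containing this text switches copying on
        if "first and last name" in line:
            copy = True
        # the end-of-record marker switches copying off
        elif "Cancellation of the reservation." in line:
            copy = False
        # every other line is copied unchanged while copying is on
        elif copy:
            outfile.write(line)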

The code works, but it simply reads the file from one marker line to the other and copies the content as-is. I need something that writes the content in a different format so that I can upload it to the database. The format I need is this:

first and last name | zip country n. phone number mobile|e-mail|guest first and last name 1°|name    address city    province|zip

So in this case I need it like this:

NAME SURNAME | United Kingdom      +1111111111|email@email.email|NAME SURNAME   London  London  |AAAAA

Every line in output.txt should look like this.

Do you think this is hard to create? Could someone help me? Any advice would be helpful.

Here are some basic string-splitting techniques that should help with what you're trying to do:

data = '''first and last name   
        NAME SURNAME            
        zip country n. phone number mobile
        United Kingdom      +1111111111
        e-mail
        email@email.email
        guest first and last name 1
        NAME SURNAME
        guest first and last name 2
        NAME SURNAME
        name    address city    province
        NAME SURNAME    London  London  
        zip
        AAAAA
        Cancellation of the reservation.
        '''
# split on whitespace, convert to list
ldata = data.split()
# strip leading and trailing white space from each item
ldata = [i.strip() for i in ldata]
# split on line break, convert to list
ndata = data.split('\n')
ndata = [i.strip() for i in ndata]
# convert list to string
sdata = ' '.join(ldata)

print(ldata)
print(ndata)
print(sdata)

# two examples of split after, split before
name_surname = sdata.split('first and last name')[1].split('zip')[0]
print(name_surname)

country_phone = sdata.split('mobile')[1].split('e-mail')[0]
print(country_phone)

>>>

['first', 'and', 'last', 'name', 'NAME', 'SURNAME', 'zip', 'country', 'n.', 'phone', 'number', 'mobile', 'United', 'Kingdom', '+1111111111', 'e-mail', 'email@email.email', 'guest', 'first', 'and', 'last', 'name', '1', 'NAME', 'SURNAME', 'guest', 'first', 'and', 'last', 'name', '2', 'NAME', 'SURNAME', 'name', 'address', 'city', 'province', 'NAME', 'SURNAME', 'London', 'London', 'zip', 'AAAAA', 'Cancellation', 'of', 'the', 'reservation.']
['first and last name', 'NAME SURNAME', 'zip country n. phone number mobile', 'United Kingdom      +1111111111', 'e-mail', 'email@email.email', 'guest first and last name 1', 'NAME SURNAME', 'guest first and last name 2', 'NAME SURNAME', 'name    address city    province', 'NAME SURNAME    London  London', 'zip', 'AAAAA', 'Cancellation of the reservation.', '']
first and last name NAME SURNAME zip country n. phone number mobile United Kingdom +1111111111 e-mail email@email.email guest first and last name 1 NAME SURNAME guest first and last name 2 NAME SURNAME name address city province NAME SURNAME London London zip AAAAA Cancellation of the reservation.
 NAME SURNAME 
 United Kingdom +1111111111 
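
To get from there to the pipe-delimited line the question asks for, one possible approach is to pair each label line with the value line that follows it. The sketch below is only that, a sketch: `FIELDS`, `normalize`, `parse_records` and `to_row` are names made up for this example. It assumes that every label sits alone on its own line, that its value is on the next non-empty line, and that each record ends with the "Cancellation of the reservation." line. It also collapses runs of spaces inside values and skips the second guest, since the requested format does not list it.

```python
# Labels in the order the requested output format lists them.
# Assumption: each label is alone on a line and its value is on the next line.
FIELDS = [
    "first and last name",
    "zip country n. phone number mobile",
    "e-mail",
    "guest first and last name 1",
    "name address city province",
    "zip",
]
END_MARKER = "Cancellation of the reservation."


def normalize(text):
    """Drop the degree sign and collapse runs of whitespace."""
    return " ".join(text.replace("°", "").split())


def parse_records(lines):
    """Yield one dict per customer, mapping label -> value."""
    labels = set(FIELDS)
    record, pending = {}, None
    for raw in lines:
        line = normalize(raw)
        if not line:
            continue
        if line == END_MARKER:
            if record:
                yield record
            record, pending = {}, None
        elif line in labels:
            pending = line          # the next non-label line is this label's value
        elif pending is not None:
            record[pending] = line
            pending = None
        # anything else (e.g. the second guest's label and value) is ignored
    if record:                      # last record without an end marker
        yield record


def to_row(record):
    """Join the fields with '|', using an empty string for missing ones."""
    return "|".join(record.get(field, "") for field in FIELDS)


sample = """first and last name
NAME SURNAME
zip country n. phone number mobile
United Kingdom      +1111111111
e-mail
email@email.email
guest first and last name 1°
NAME SURNAME
guest first and last name 2°
NAME SURNAME
name    address city    province
NAME SURNAME    London  London
zip
AAAAA
Cancellation of the reservation.
"""

rows = [to_row(rec) for rec in parse_records(sample.splitlines())]
for row in rows:
    print(row)
# NAME SURNAME|United Kingdom +1111111111|email@email.email|NAME SURNAME|NAME SURNAME London London|AAAAA
```

To convert the real file, pass the open file object to `parse_records` instead of `sample.splitlines()` and write each row to output.txt followed by a newline. The question does not say which encoding the file uses, so that would need checking before the degree sign can be matched reliably.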
