How do I split a string BEFORE a certain character?

timeline = '''
1961 - Second child, John, is born. 1963 - Is ordained as a Presbyterian minister. Makes his on-camera debut as host of a series of 15-minute episodes for children produced in Toronto. The program is titled "Mister Rogers." 1964 - Returns to Pittsburgh
    and turns his 15-minute show into the half-hour "Mister Rogers' Neighborhood." 1968 - "Mister Rogers' Neighborhood" debuts on PBS and wins the first of its several Emmy Awards. Rogers is appointed chairman of the Forum on Mass Media and Child Development
    of the White House Conference on Youth. 1969 - Wins the first of his two George Foster Peabody Awards for television excellence. Fred Rogers readies the opening pitch to start the Pirates' season in 1988. (Post-Gazette archives) 1971 - Forms his production
    company, Family Communications Inc. 1975 - Ceases production on "Mister Rogers' Neighborhood," which continues to air on PBS in reruns. 1978 - Creates the PBS series "Old Friends, New Friends," focusing on older people. 1979 - Production resumes on
    "Mister Rogers' Neighborhood." 1981 - Eddie Murphy plays an inner-city children's show host in his "Mister Robinson's Neighborhood" sketch on "Saturday Night Live," which is probably the most famous of the many "Mister Rogers" spoofs. 1984 - The Smithsonian
    Institution in Washington makes Fred Rogers' trademark sweater part of its permanent collection. Rogers forces the fast-food chain Burger King to remove ads featuring a Mister Rogers lookalike. 1987 - Appears on a children's TV show in the Soviet
    Union and reciprocates by having a Soviet host appear on his show later in the year. 1989 - "Mister Rogers' Neighborhood of Make Believe" opens as an attraction at Idlewild Park. 1990 - Sues the Ku Klux Klan and forces the organization to stop playing
    racist telephone recordings featuring imitations of Fred Rogers' voice, speech patterns and theme song. 1991 - Tapes spots for PBS on the eve of the Gulf War to reassure children that they will be all right if hostilities occur. 1996 - TV Guide names
    Fred Rogers one of the 50 greatest TV stars of all time. Rogers makes his only appearance as someone other than himself when he takes a cameo role as a preacher on the drama series "Dr. Quinn, Medicine Woman," which he says is one of his favorite
    shows. 1997 - Is honored for lifetime achievement by the National Academy of Television Arts and Sciences (who hand out the Emmy Awards) and by the Television Critics Association. Named Pittsburgher of the Year by Pittsburgh magazine. 1998 - Gets
    a star on the Hollywood Walk of Fame. The Pittsburgh Children's Museum opens a Mister Rogers exhibit. 1999 - Is inducted into the Television Hall of Fame. 2000 - Unveils a planetarium show featuring "Neighborhood" characters at the Carnegie Science
    Center; ceases production on "Mister Rogers' Neighborhood." 2001 - Last original episodes of "Mister Rogers' Neighborhood" airs on PBS. 2002 - In July, is presented the Presidential Medal of Freedom, the nation's highest civilian honor. Filmed PBS
    public service announcements in August encouraging parents to read to their children and giving them advice on how to handle the first anniversary of the Sept. 11, 2001, terrorist attacks. Is named grand marshal of the Tournament of Roses Parade in
    Pasadena, Calif., with Bill Cosby and Art Linkletter. The theme: "Children's Wishes, Dreams and Imagination." 2003 - Dies of stomach cancer at age 74.

'''

My goal is to make this a dictionary where the keys are the years and the values are the events on the timeline. However, I'm having a little trouble.

I want to first split the timeline into a list like so: [key, value, key, value], and then turn it into a dict. To do that, I want to split the string 5 or 6 characters before each " - ", but I have no idea how. Can anyone help me?

EDIT: Came up with this possible solution -

timeline = timeline.replace("\n","")
timeline = timeline.replace("\\","")
timeline = timeline.replace("19","zz19")
timeline = timeline.replace("20","zz20")
timeline = timeline.split(" - ")

So basically I end up with this:

['zz1961', 'Second child, John, is born. zz1963', 'Is ordained as a Presbyterian minister. Makes his on-camera debut as host of a series of 15-minute episodes for children produced in Toronto. The program is titled "Mister Rogers." zz1964', 'Returns to Pittsburgh    and turns his 15-minute show into the half-hour "Mister Rogers\' Neighborhood." zz1968', '"Mister Rogers\' Neighborhood" debuts on PBS and wins the first of its several Emmy Awards. Rogers is appointed chairman of the Forum on Mass Media and Child Development    of the White House Conference on Youth. zz1969', "Wins the first of his two George Foster Peabody Awards for television excellence. Fred Rogers readies the opening pitch to start the Pirates' season in zz1988. (Post-Gazette archives) zz1971", 'Forms his production    company, Family Communications Inc. zz1975', 'Ceases production on "Mister Rogers\' Neighborhood," which continues to air on PBS in reruns. zz1978', 'Creates the PBS series "Old Friends, New Friends," focusing on older people. zz1979', 'Production resumes on    "Mister Rogers\' Neighborhood." zz1981', 'Eddie Murphy plays an inner-city children\'s show host in his "Mister Robinson\'s Neighborhood" sketch on "Saturday Night Live," which is probably the most famous of the many "Mister Rogers" spoofs. zz1984', "The Smithsonian    Institution in Washington makes Fred Rogers' trademark sweater part of its permanent collection. Rogers forces the fast-food chain Burger King to remove ads featuring a Mister Rogers lookalike. zz1987", "Appears on a children's TV show in the Soviet    Union and reciprocates by having a Soviet host appear on his show later in the year. zz1989", '"Mister Rogers\' Neighborhood of Make Believe" opens as an attraction at Idlewild Park. zz1990', "Sues the Ku Klux Klan and forces the organization to stop playing    racist telephone recordings featuring imitations of Fred Rogers' voice, speech patterns and theme song. 
zz1991", 'Tapes spots for PBS on the eve of the Gulf War to reassure children that they will be all right if hostilities occur. zz1996', 'TV Guide names    Fred Rogers one of the 50 greatest TV stars of all time. Rogers makes his only appearance as someone other than himself when he takes a cameo role as a preacher on the drama series "Dr. Quinn, Medicine Woman," which he says is one of his favorite    shows. zz1997', 'Is honored for lifetime achievement by the National Academy of Television Arts and Sciences (who hand out the Emmy Awards) and by the Television Critics Association. Named Pittsburgher of the Year by Pittsburgh magazine. zz1998', "Gets    a star on the Hollywood Walk of Fame. The Pittsburgh Children's Museum opens a Mister Rogers exhibit. zz1999", 'Is inducted into the Television Hall of Fame. zz2000', 'Unveils a planetarium show featuring "Neighborhood" characters at the Carnegie Science    Center; ceases production on "Mister Rogers\' Neighborhood." zz2001', 'Last original episodes of "Mister Rogers\' Neighborhood" airs on PBS. zz2002', 'In July, is presented the Presidential Medal of Freedom, the nation\'s highest civilian honor. Filmed PBS    public service announcements in August encouraging parents to read to their children and giving them advice on how to handle the first anniversary of the Sept. 11, zz2001, terrorist attacks. Is named grand marshal of the Tournament of Roses Parade in    Pasadena, Calif., with Bill Cosby and Art Linkletter. The theme: "Children\'s Wishes, Dreams and Imagination." zz2003', 'Dies of stomach cancer at age 74.']

So I would want to split on "zz" next. The only problem is that I can't split a second time, because the result is already a list. Obviously I could join the list back into a string, but then I would just run into the same problem.

Tried doing this:

for i in timeline:
    i = i.split("zz")

But it doesn't work :(

Let's split using regex!

import re
stripped_string = timeline.replace('\n', '').strip() # drop newlines and leading/trailing whitespace
splitted = re.split(r'(\d{4}) - ', stripped_string) # raw string; split to ['', year1, event1, year2, event2, ...]
lst = splitted[1:] # first empty split goes away
keys = lst[0::2] # get years
values = lst[1::2] # get events
dic = dict(zip(keys, values))
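
The key detail is that re.split keeps the text matched by a capturing group in its output, which is what puts the years into the list. A minimal sketch on a shortened sample (the `sample` string is a made-up abbreviation of the timeline, and `pieces`, `pairs` and `timeline_dict` are illustrative names):

```python
import re

# Shortened stand-in for the full timeline (illustrative data only).
sample = "1961 - Second child, John, is born. 1963 - Is ordained as a Presbyterian minister."

# Because the year sits inside a capturing group, re.split returns it
# alongside the text found between matches.
pieces = re.split(r'(\d{4}) - ', sample)
print(pieces)

# The first element is the (empty) text before the first year, so skip it,
# then pair every year with the event that follows it.
pairs = pieces[1:]
timeline_dict = dict(zip(pairs[0::2], pairs[1::2]))
print(timeline_dict)
```

Without the parentheses in the pattern, the years would be consumed as delimiters and lost.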

You could make the pattern more specific. For example, if you know that the years are all 19xx or 20xx, then:

splitted = re.split(r'((?:19|20)\d\d) - ', stripped_string)

will work as well.

Also, since you have the keys and values as separate lists, it is easy to post-process them, such as stripping whitespace from each element of values. For example:

values = [event.strip() for event in lst[1::2]]
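
Stripping only removes whitespace at the ends. Removing the newlines also leaves runs of spaces inside some events (as in "Returns to Pittsburgh    and turns"), because the continuation lines were indented. One possible way to tidy both at once, sketched here with a hypothetical helper `parse_timeline` and a shortened, partly made-up sample, is to collapse every whitespace run to a single space before splitting:

```python
import re

def parse_timeline(text):
    """Hypothetical helper: map each 4-digit year to its whitespace-normalized event."""
    # str.split() with no arguments splits on any whitespace run (including
    # newlines), so joining with a single space collapses all of them.
    flat = " ".join(text.split())
    pieces = re.split(r'(\d{4}) - ', flat)[1:]
    return {year: event.strip() for year, event in zip(pieces[0::2], pieces[1::2])}

# Illustrative sample: the 1968 event is abbreviated, not the original wording.
sample = '''
1964 - Returns to Pittsburgh
    and turns his 15-minute show into the half-hour "Mister Rogers' Neighborhood." 1968 - Debuts on PBS.
'''
result = parse_timeline(sample)
print(result["1964"])
```

Note that this turns each line break into a single space instead of deleting it, which differs slightly from replace('\n', '').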

Here's the result:

>>> print("\n".join("{}: {}".format(k, v) for k, v in dic.items()))
1989: "Mister Rogers' Neighborhood of Make Believe" opens as an attraction at Idlewild Park.
1987: Appears on a children's TV show in the Soviet    Union and reciprocates by having a Soviet host appear on his show later in the year.
1984: The Smithsonian    Institution in Washington makes Fred Rogers' trademark sweater part of its permanent collection. Rogers forces the fast-food chain Burger King to remove ads featuring a Mister Rogers lookalike.
1968: "Mister Rogers' Neighborhood" debuts on PBS and wins the first of its several Emmy Awards. Rogers is appointed chairman of the Forum on Mass Media and Child Development    of the White House Conference on Youth.
1969: Wins the first of his two George Foster Peabody Awards for television excellence. Fred Rogers readies the opening pitch to start the Pirates' season in 1988. (Post-Gazette archives)
1981: Eddie Murphy plays an inner-city children's show host in his "Mister Robinson's Neighborhood" sketch on "Saturday Night Live," which is probably the most famous of the many "Mister Rogers" spoofs.
1964: Returns to Pittsburgh    and turns his 15-minute show into the half-hour "Mister Rogers' Neighborhood."
1961: Second child, John, is born.
1963: Is ordained as a Presbyterian minister. Makes his on-camera debut as host of a series of 15-minute episodes for children produced in Toronto. The program is titled "Mister Rogers."
1991: Tapes spots for PBS on the eve of the Gulf War to reassure children that they will be all right if hostilities occur.
1990: Sues the Ku Klux Klan and forces the organization to stop playing    racist telephone recordings featuring imitations of Fred Rogers' voice, speech patterns and theme song.
1997: Is honored for lifetime achievement by the National Academy of Television Arts and Sciences (who hand out the Emmy Awards) and by the Television Critics Association. Named Pittsburgher of the Year by Pittsburgh magazine.
1979: Production resumes on    "Mister Rogers' Neighborhood."
1996: TV Guide names    Fred Rogers one of the 50 greatest TV stars of all time. Rogers makes his only appearance as someone other than himself when he takes a cameo role as a preacher on the drama series "Dr. Quinn, Medicine Woman," which he says is one of his favorite    shows.
1999: Is inducted into the Television Hall of Fame.
1998: Gets    a star on the Hollywood Walk of Fame. The Pittsburgh Children's Museum opens a Mister Rogers exhibit.
1975: Ceases production on "Mister Rogers' Neighborhood," which continues to air on PBS in reruns.
1978: Creates the PBS series "Old Friends, New Friends," focusing on older people.
1971: Forms his production    company, Family Communications Inc.
2002: In July, is presented the Presidential Medal of Freedom, the nation's highest civilian honor. Filmed PBS    public service announcements in August encouraging parents to read to their children and giving them advice on how to handle the first anniversary of the Sept. 11, 2001, terrorist attacks. Is named grand marshal of the Tournament of Roses Parade in    Pasadena, Calif., with Bill Cosby and Art Linkletter. The theme: "Children's Wishes, Dreams and Imagination."
2003: Dies of stomach cancer at age 74.
2000: Unveils a planetarium show featuring "Neighborhood" characters at the Carnegie Science    Center; ceases production on "Mister Rogers' Neighborhood."
2001: Last original episodes of "Mister Rogers' Neighborhood" airs on PBS.

You could make the dictionary like this:

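parts = timeline.replace('\n', '').split(" - ")
d = dict()
for x in range(len(parts) - 1):
    # key: the year that ends this part; value: the next part minus its trailing year
    # (the final part has no trailing year, so its last 4 characters are cut as well)
    d[parts[x][-4:]] = parts[x + 1][:-4]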

Since your delimiter is clearly " - ", splitting on it gives you a list of items like:

'1961'
'blablablablabla. 1963'
'blablablablabla. 1968'
'blablablablabla. 1969'
'blablablablabla'

Then you only need to pair the last four characters of each item with the next item, minus that item's own last four characters. So a simple split and slice will work for you.

Using this to show the result:

for a in d:
    print(a + " " + d[a])

This is what I got:

1975 Ceases production on "Mister Rogers' Neighborhood," which continues to air on PBS in reruns. 
1968 "Mister Rogers' Neighborhood" debuts on PBS and wins the first of its several Emmy Awards. Rogers is appointed chairman of the Forum on Mass Media and Child Development    of the White House Conference on Youth. 
2001 Last original episodes of "Mister Rogers' Neighborhood" airs on PBS. 
1990 Sues the Ku Klux Klan and forces the organization to stop playing    racist telephone recordings featuring imitations of Fred Rogers' voice, speech patterns and theme song. 
2002 In July, is presented the Presidential Medal of Freedom, the nation's highest civilian honor. Filmed PBS    public service announcements in August encouraging parents to read to their children and giving them advice on how to handle the first anniversary of the Sept. 11, 2001, terrorist attacks. Is named grand marshal of the Tournament of Roses Parade in    Pasadena, Calif., with Bill Cosby and Art Linkletter. The theme: "Children's Wishes, Dreams and Imagination." 
1989 "Mister Rogers' Neighborhood of Make Believe" opens as an attraction at Idlewild Park. 
1996 TV Guide names    Fred Rogers one of the 50 greatest TV stars of all time. Rogers makes his only appearance as someone other than himself when he takes a cameo role as a preacher on the drama series "Dr. Quinn, Medicine Woman," which he says is one of his favorite    shows. 
1987 Appears on a children's TV show in the Soviet    Union and reciprocates by having a Soviet host appear on his show later in the year. 
1963 Is ordained as a Presbyterian minister. Makes his on-camera debut as host of a series of 15-minute episodes for children produced in Toronto. The program is titled "Mister Rogers." 
1984 The Smithsonian    Institution in Washington makes Fred Rogers' trademark sweater part of its permanent collection. Rogers forces the fast-food chain Burger King to remove ads featuring a Mister Rogers lookalike. 
1971 Forms his production    company, Family Communications Inc. 
1991 Tapes spots for PBS on the eve of the Gulf War to reassure children that they will be all right if hostilities occur. 
1997 Is honored for lifetime achievement by the National Academy of Television Arts and Sciences (who hand out the Emmy Awards) and by the Television Critics Association. Named Pittsburgher of the Year by Pittsburgh magazine. 
1999 Is inducted into the Television Hall of Fame. 
1964 Returns to Pittsburgh    and turns his 15-minute show into the half-hour "Mister Rogers' Neighborhood." 
2000 Unveils a planetarium show featuring "Neighborhood" characters at the Carnegie Science    Center; ceases production on "Mister Rogers' Neighborhood." 
1981 Eddie Murphy plays an inner-city children's show host in his "Mister Robinson's Neighborhood" sketch on "Saturday Night Live," which is probably the most famous of the many "Mister Rogers" spoofs. 
1978 Creates the PBS series "Old Friends, New Friends," focusing on older people. 
1969 Wins the first of his two George Foster Peabody Awards for television excellence. Fred Rogers readies the opening pitch to start the Pirates' season in 1988. (Post-Gazette archives) 
1961 Second child, John, is born. 
2003 Dies of stomach cancer at age
1998 Gets    a star on the Hollywood Walk of Fame. The Pittsburgh Children's Museum opens a Mister Rogers exhibit. 
1979 Production resumes on    "Mister Rogers' Neighborhood." 
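
Notice that the 2003 entry above ends at "age": the last part has no trailing year, so slicing four characters off it removes " 74." instead. A minimal sketch of one way around that, on a shortened sample and with the hypothetical name `split_and_slice`, is to trim the trailing year only when another part follows (it also strips the leftover trailing space from each event):

```python
def split_and_slice(text):
    """Hypothetical helper: pair each year with its event using split and slices only."""
    parts = text.replace('\n', '').split(" - ")
    result = {}
    last = len(parts) - 1
    for i in range(last):
        year = parts[i][-4:]      # the year that closes the current part
        event = parts[i + 1]
        if i + 1 < last:          # every part except the final one ends with the next year
            event = event[:-4]
        result[year] = event.strip()
    return result

# Shortened stand-in for the full timeline (illustrative data only).
sample = "1961 - Second child, John, is born. 2003 - Dies of stomach cancer at age 74."
d = split_and_slice(sample)
print(d["2003"])
```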

Note that a plain dict does not guarantee any order on Python versions before 3.7. Use OrderedDict if you need the entries ordered there.

An ordered dictionary is built just as easily:

import collections
parts = timeline.replace('\n', '').split(" - ")
od = collections.OrderedDict()
for x in range(len(parts) - 1):
    # same pairing as before; insertion order (chronological here) is preserved
    od[parts[x][-4:]] = parts[x + 1][:-4]

And using this:

for a in od:
    print(a + " " + od[a])

The result will be ordered:

1961 Second child, John, is born. 
1963 Is ordained as a Presbyterian minister. Makes his on-camera debut as host of a series of 15-minute episodes for children produced in Toronto. The program is titled "Mister Rogers." 
1964 Returns to Pittsburgh    and turns his 15-minute show into the half-hour "Mister Rogers' Neighborhood." 
1968 "Mister Rogers' Neighborhood" debuts on PBS and wins the first of its several Emmy Awards. Rogers is appointed chairman of the Forum on Mass Media and Child Development    of the White House Conference on Youth. 
1969 Wins the first of his two George Foster Peabody Awards for television excellence. Fred Rogers readies the opening pitch to start the Pirates' season in 1988. (Post-Gazette archives) 
1971 Forms his production    company, Family Communications Inc. 
1975 Ceases production on "Mister Rogers' Neighborhood," which continues to air on PBS in reruns. 
1978 Creates the PBS series "Old Friends, New Friends," focusing on older people. 
1979 Production resumes on    "Mister Rogers' Neighborhood." 
1981 Eddie Murphy plays an inner-city children's show host in his "Mister Robinson's Neighborhood" sketch on "Saturday Night Live," which is probably the most famous of the many "Mister Rogers" spoofs. 
1984 The Smithsonian    Institution in Washington makes Fred Rogers' trademark sweater part of its permanent collection. Rogers forces the fast-food chain Burger King to remove ads featuring a Mister Rogers lookalike. 
1987 Appears on a children's TV show in the Soviet    Union and reciprocates by having a Soviet host appear on his show later in the year. 
1989 "Mister Rogers' Neighborhood of Make Believe" opens as an attraction at Idlewild Park. 
1990 Sues the Ku Klux Klan and forces the organization to stop playing    racist telephone recordings featuring imitations of Fred Rogers' voice, speech patterns and theme song. 
1991 Tapes spots for PBS on the eve of the Gulf War to reassure children that they will be all right if hostilities occur. 
1996 TV Guide names    Fred Rogers one of the 50 greatest TV stars of all time. Rogers makes his only appearance as someone other than himself when he takes a cameo role as a preacher on the drama series "Dr. Quinn, Medicine Woman," which he says is one of his favorite    shows. 
1997 Is honored for lifetime achievement by the National Academy of Television Arts and Sciences (who hand out the Emmy Awards) and by the Television Critics Association. Named Pittsburgher of the Year by Pittsburgh magazine. 
1998 Gets    a star on the Hollywood Walk of Fame. The Pittsburgh Children's Museum opens a Mister Rogers exhibit. 
1999 Is inducted into the Television Hall of Fame. 
2000 Unveils a planetarium show featuring "Neighborhood" characters at the Carnegie Science    Center; ceases production on "Mister Rogers' Neighborhood." 
2001 Last original episodes of "Mister Rogers' Neighborhood" airs on PBS. 
2002 In July, is presented the Presidential Medal of Freedom, the nation's highest civilian honor. Filmed PBS    public service announcements in August encouraging parents to read to their children and giving them advice on how to handle the first anniversary of the Sept. 11, 2001, terrorist attacks. Is named grand marshal of the Tournament of Roses Parade in    Pasadena, Calif., with Bill Cosby and Art Linkletter. The theme: "Children's Wishes, Dreams and Imagination." 
2003 Dies of stomach cancer at age
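
On Python 3.7 and later, a plain dict also remembers insertion order, so the same ordered output can be had without OrderedDict. A small sketch with two illustrative entries; sorting the keys is an alternative that works on any version, assuming the keys are four-digit year strings:

```python
# Illustrative entries, inserted out of chronological order on purpose.
d = {}
d["1963"] = "Is ordained as a Presbyterian minister."
d["1961"] = "Second child, John, is born."

# Python 3.7+: iteration follows insertion order.
print(list(d))

# Any version: sort the keys explicitly (four-digit year strings sort chronologically).
for year in sorted(d):
    print(year + " " + d[year])
```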


 