
Add two strings in a list

I am trying to build something that takes a multiline string of names as input and outputs the lines as a 2D list, with each first name joined to its surname. My issue is that the input may contain either a first name followed by a surname or a first name on its own. This may be confusing, so I have an example below.

This is in Python 3.6.

I have a list of first names:

Bob
Steve
Ted
Blake
Harry
Edric
Tommy
Bartholomew

and a list of surnames:

Fischer
Stinson
McCord
Bone
Harvey

Input

"""Bob Fischer Steve Ted Stinson Blake Harry McCord
Edric Bone Tommy Harvey Bartholomew"""

Output

[["Bob Fischer","Steve","Ted Stinson","Blake","Harry McCord"],
["Edric Bone","Tommy Harvey","Bartholomew"]]

I get stuck on telling apart the spaces between separate names (Steve Ted) from the spaces between a first name and its surname.

Can anyone help with this? I'm really stuck...

You seem to want to match a first name that is optionally followed by whitespace and a last name.

You may create a single regex pattern from the name lists you have and use re.findall to find all non-overlapping occurrences:

import re
first = ['Bob','Steve','Ted','Blake','Harry','Edric','Tommy','Bartholomew']
surnames = ['Fischer','Stinson','McCord','Bone','Harvey']
r = r"\b(?:{})\b(?:\s+(?:{})\b)?".format("|".join(first),"|".join(surnames))
s = """Bob Fischer Steve Ted Stinson Blake Harry McCord
Edric Bone Tommy Harvey Bartholomew"""
print(re.findall(r, s))
# => ['Bob Fischer', 'Steve', 'Ted Stinson', 'Blake', 'Harry McCord', 'Edric Bone', 'Tommy Harvey', 'Bartholomew']

The regex generated by this code:

\b(?:Bob|Steve|Ted|Blake|Harry|Edric|Tommy|Bartholomew)\b(?:\s+(?:Fischer|Stinson|McCord|Bone|Harvey)\b)?

Basically, \b(?:...)\b(?:\s+(?:...)\b)? matches a first name from the alternatives as a whole word (due to the \b on either side of the first (?:...) group), and then (?:\s+(?:...)\b)? matches 1 or 0 occurrences (due to the ? quantifier) of 1+ whitespace characters (\s+) followed by any of the last names (again as whole words, due to the trailing \b).
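
Note that re.findall on the whole string returns a flat list, whereas the question asks for one sub-list per input line. A minimal sketch of one way to get that, assuming the same name lists and applying the same pattern to each line separately. The helper name group_names is hypothetical, and re.escape is added only as a precaution in case a name ever contains a regex metacharacter:

```python
import re

first = ['Bob', 'Steve', 'Ted', 'Blake', 'Harry', 'Edric', 'Tommy', 'Bartholomew']
surnames = ['Fischer', 'Stinson', 'McCord', 'Bone', 'Harvey']

def group_names(text, first_names, last_names):
    """Return one list of matched names per line of text (hypothetical helper)."""
    # Same pattern as above, with each name escaped before joining.
    pattern = re.compile(r"\b(?:{})\b(?:\s+(?:{})\b)?".format(
        "|".join(map(re.escape, first_names)),
        "|".join(map(re.escape, last_names)),
    ))
    # Matching line by line keeps \s+ from pairing names across a newline.
    return [pattern.findall(line) for line in text.splitlines()]

s = """Bob Fischer Steve Ted Stinson Blake Harry McCord
Edric Bone Tommy Harvey Bartholomew"""
result = group_names(s, first, surnames)
print(result)
```

Processing each line on its own also means a first name at the end of one line is never joined to a surname at the start of the next.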

Try this. Instead of first names and surnames, I used nouns and the categories they fall under.

A = [ 'Beaver' , 'Strawberry']
B = [ 'Animal' , 'Fruit']

input_string = 'Beaver Animal Strawberry Strawberry Fruit'
input_string = input_string.split(' ')

def combinestring( x_string ):
    compiling_string = []

    for i,x in enumerate(x_string):

        if (i+1) < len(x_string):
            if x in A and x_string[i+1] in B:
                compiling_string.append(x + ' ' + x_string[i+1])
            elif x in A:
                compiling_string.append(x)

        elif (i+1) == len(x_string) and x in A:
            compiling_string.append(x)

    return compiling_string



print(combinestring(input_string))
#>>> ['Beaver Animal', 'Strawberry', 'Strawberry Fruit']

In [21]: first_names
Out[21]: ['Bob', 'Steve', 'Ted', 'Blake', 'Harry', 'Edric', 'Tommy', 'Bartholomew']

In [22]: surnames
Out[22]: ['Fischer', 'Stinson', 'McCord', 'Bone', 'Harvey']

In [23]: inp = """Bob Fischer Steve Ted Stinson Blake Harry McCord
    ...: Edric Bone Tommy Harvey Bartholomew""".split()

In [24]: out = []
    ...: fullname = None
    ...: for name in inp:
    ...:     if name in first_names:
    ...:         if fullname:
    ...:             out.append(fullname)
    ...:         fullname = name
    ...:     elif name in surnames and fullname:
    ...:         fullname += ' ' + name
    ...: if fullname:
    ...:     out.append(fullname)
    ...:

In [25]: out
Out[25]:
['Bob Fischer',
 'Steve',
 'Ted Stinson',
 'Blake',
 'Harry McCord',
 'Edric Bone',
 'Tommy Harvey',
 'Bartholomew']
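
The loop above also produces a flat list. A sketch of the same idea run once per input line, to get the nested list the question asks for. The function name join_names is hypothetical, and two behaviours here are assumptions about what is wanted: a surname with no first name before it on the line is ignored, and a name takes at most one surname.

```python
first_names = ['Bob', 'Steve', 'Ted', 'Blake', 'Harry', 'Edric', 'Tommy', 'Bartholomew']
surnames = ['Fischer', 'Stinson', 'McCord', 'Bone', 'Harvey']

def join_names(text, first_names, surnames):
    """Return one list of joined names per line of text (hypothetical helper)."""
    # Sets give constant-time membership tests.
    first_set, surname_set = set(first_names), set(surnames)
    rows = []
    for line in text.splitlines():
        row = []
        for word in line.split():
            if word in first_set:
                # A first name always starts a new entry.
                row.append(word)
            elif word in surname_set and row and ' ' not in row[-1]:
                # Attach the surname to the last entry if it has none yet.
                row[-1] += ' ' + word
        rows.append(row)
    return rows

s = """Bob Fischer Steve Ted Stinson Blake Harry McCord
Edric Bone Tommy Harvey Bartholomew"""
rows = join_names(s, first_names, surnames)
print(rows)
```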
