Grab certain words and phrases from a text file in Python

Question

I have this block of code that goes through a text file line by line and splits each line into separate words. That works, but my text file contains certain words and phrases that start and end with '-', for example '-foo-' or '-foo bar-'. Right now the code splits those into '-foo' and 'bar-'. I understand why this happens.

My plan is to grab the instances that start and end with '-', store them in a separate list, let the user change each phrase into something new, and then put them back into the list. How do I tell it to grab a phrase when it consists of two separate words?

def madLibIt(text_file):
    listOfWords = []  # collects every word in the file
    for eachLine in text_file:
        # split each line on whitespace into separate words
        listOfWords.extend(eachLine.split())
    print(listOfWords)

Answer 1

Calling str.split() without a separator splits the text on runs of whitespace, so '-' is never treated as a delimiter and '-foo bar-' is cut at the space.
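
To make the difference concrete, here is a small sketch. The sample string and variable names are made up for illustration.

```python
line = 'This is a -foo bar- test'

# No separator: split on runs of whitespace, so the phrase is broken apart.
whitespace_tokens = line.split()
print(whitespace_tokens)  # ['This', 'is', 'a', '-foo', 'bar-', 'test']

# Explicit '-' separator: the phrase stays together, but the rest of the
# text is left in unsplit chunks.
dash_chunks = line.split('-')
print(dash_chunks)  # ['This is a ', 'foo bar', ' test']
```

Neither call gives the mixed result the question asks for, which is why a regular expression is a better fit here.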

You can use re.findall() with the pattern (-.+?-). The non-greedy .+? stops at the next '-', so each phrase is matched on its own:

import re

matches = re.findall(r'(-.+?-)', 'This is a -string- with a -foo bar-')
print(matches) # ['-string-', '-foo bar-']
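
If the goal is to swap each phrase for something the user supplies, re.sub() with a callback may be simpler than collecting the matches and putting them back by hand. The following is only a sketch: fill_placeholders, ask and the answers dictionary are hypothetical names, and the dictionary stands in for real user input so the example runs without prompting.

```python
import re

# Same idea as the pattern above, but the group captures only the text
# between the dashes.
PLACEHOLDER = re.compile(r'-(.+?)-')

def fill_placeholders(text, ask):
    """Replace each -phrase- in text with whatever ask(phrase) returns."""
    return PLACEHOLDER.sub(lambda match: ask(match.group(1)), text)

# Stand-in for user input (assumption); in an interactive script, ask
# could be something like: lambda phrase: input(phrase + ': ')
answers = {'string': 'sentence', 'foo bar': 'purple cat'}
result = fill_placeholders('This is a -string- with a -foo bar-', answers.get)
print(result)  # This is a sentence with a purple cat
```

One caveat: this pattern treats any two dashes on the same line as a pair, so text with ordinary hyphenated words (for example 'well-known' followed later by 're-use') would produce unwanted matches.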

Answer 2

This regular expression keeps each '-...-' phrase as a single token and still splits the rest of the text into individual words:

import re

s = 'This is a string with -parts like this- and -normal- parts -as well-'

print(re.findall(r'((?:-\w[\w\s]*\w-)|(?:\b\w+\b))', s))

>>> 
['This', 'is', 'a', 'string', 'with', '-parts like this-', 'and', '-normal-', 'parts', '-as well-']
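
Applied to the function from the question, this could look roughly like the sketch below. mad_lib_it and ask are hypothetical names, io.StringIO stands in for an opened text file, and the lambda stands in for real user input.

```python
import io
import re

# Same pattern as above: a -phrase- first, otherwise a plain word.
TOKEN = re.compile(r'((?:-\w[\w\s]*\w-)|(?:\b\w+\b))')

def mad_lib_it(text_file, ask):
    """Tokenize every line, then swap each -phrase- token for ask(phrase)."""
    words = []
    for line in text_file:
        words.extend(TOKEN.findall(line))
    # Only tokens wrapped in dashes are placeholders; the rest are kept as-is.
    return [ask(word[1:-1]) if word.startswith('-') and word.endswith('-') else word
            for word in words]

# io.StringIO is a stand-in for a real file object (assumption).
sample = io.StringIO('The -adjective- dog\nate -plural noun- today\n')
filled = mad_lib_it(sample, lambda phrase: '<' + phrase + '>')
print(filled)  # ['The', '<adjective>', 'dog', 'ate', '<plural noun>', 'today']
```

Note that this pattern drops punctuation, since neither alternative matches it, and that the phrase between the dashes needs at least two word characters to be recognised.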

2 answers

solution1: 2, ACCEPTED, 2013-04-14 15:02:03
solution2: 1, 2013-04-14 15:28:53