Get the string before " when not all lines contain one (Python)

Question

I have a txt file with entries structured like this:

Some sentence.
Some other "other" sentence.
Some other smth "other" sentence.

In the original data:

Камиш-Бурунський залізорудний комбінат
Відкрите акціонерне товариство "Кар'єр мармуровий"
Закрите акціонерне товариство "Кар'єр мармуровий"

I want to extract everything before the first " on each line and write it to another file. I want the result to be:

Some other
Some other smth
Відкрите акціонерне товариство
Закрите акціонерне товариство

I have done this:

import codecs

f=codecs.open('organization.txt','r+','utf-8')
text=f.read()
words_sp=text.split()
for line in text:
    before_keyword, after_keyword = line.split(u'"',1)
    before_word=before_keyword.split()[0]
    encoded=before_word.encode('cp1251')
    print encoded

But it doesn't work, because some lines in the file don't contain a " character. How can I improve my code to make it work?

Answer 1

There are two problems. First, you must use the splitlines() method to break the string into lines. (Iterating over text directly, as you do now, yields one character at a time.) Second, the following unpacking raises a ValueError whenever split returns a single item, which happens for every line that has no " in it:

before_keyword, after_keyword = line.split(u'"',1)
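
Both failures can be reproduced in isolation. This is only a minimal sketch, assuming Python 3 (the question's code is Python 2) and using made-up sample strings rather than the real file:

```python
# Sample strings are illustrative assumptions, not the asker's data.
text = 'ab\ncd'

# Iterating over a str yields single characters, not lines.
chars = [ch for ch in text]

# splitlines() gives the lines the loop was meant to see.
lines = text.splitlines()

# A line without a double quote splits into a one-item list ...
parts = 'Some sentence.'.split('"', 1)

# ... so unpacking it into two names raises ValueError.
try:
    before_keyword, after_keyword = parts
    unpack_failed = False
except ValueError:
    unpack_failed = True

print(chars)
print(lines)
print(parts, unpack_failed)
```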

The following works for me:

for line in text.splitlines():
    if u'"' in line:
        before_keyword, after_keyword = line.split(u'"',1)
        ... etc. ...
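
For completeness, here is one way the whole task could look. Treat it as a sketch under assumptions, not the asker's final code: it assumes Python 3, the names text_before_quote and extract_to_file are hypothetical, and it keeps the whole text before the first " (stripped of surrounding whitespace) instead of only its first word, since that is what the desired output in the question shows.

```python
def text_before_quote(text):
    """Return the text before the first '"' for each line that contains one."""
    results = []
    for line in text.splitlines():
        if '"' in line:
            # Safe to unpack: the check above guarantees two parts.
            before_keyword, _after_keyword = line.split('"', 1)
            results.append(before_keyword.strip())
    return results


def extract_to_file(src_path, dst_path):
    """Hypothetical helper: read src_path as UTF-8, write one result per line."""
    with open(src_path, encoding='utf-8') as src:
        found = text_before_quote(src.read())
    with open(dst_path, 'w', encoding='utf-8') as dst:
        dst.write('\n'.join(found))


sample = (
    'Some sentence.\n'
    'Some other "other" sentence.\n'
    'Some other smth "other" sentence.\n'
)
print(text_before_quote(sample))  # ['Some other', 'Some other smth']
```

Calling extract_to_file('organization.txt', 'result.txt') would then write the extracted strings to a second file; the output file name here is just an example.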

solution1
2 ACCEPTED 2013-11-09 21:23:49
