How to apply string method on regular expression in Python

Question

I have a Markdown file that is slightly broken: links and images that are too long contain line breaks. I would like to remove those line breaks.

Example:

from:

See for example the
[installation process for Ubuntu
Trusty](https://wiki.diasporafoundation.org/Installation/Ubuntu/Trusty). The
project offers a Vagrant installation too, but the documentation only admits
that you know what you do, that you are a developer. If it is difficult to

![https://diasporafoundation.org/assets/pages/about/network-
distributed-e941dd3e345d022ceae909beccccbacd.png](data/images/network-
distributed-e941dd3e345d022ceae909beccccbacd.png)

_A pretty decentralized network (Source: <https://diasporafoundation.org/>)_

to:

See for example the
[installation process for Ubuntu Trusty](https://wiki.diasporafoundation.org/Installation/Ubuntu/Trusty). The
project offers a Vagrant installation too, but the documentation only admits
that you know what you do, that you are a developer. If it is difficult to

![https://diasporafoundation.org/assets/pages/about/network-distributed-e941dd3e345d022ceae909beccccbacd.png](data/images/network-distributed-e941dd3e345d022ceae909beccccbacd.png)

_A pretty decentralized network (Source: <https://diasporafoundation.org/>)_

As you can see in this snippet, I managed to match all the links and images with the right pattern: https://regex101.com/r/uL8pO4/2

But what is the Python syntax for applying a string method, something like string.trim(), to what I have captured with the regular expression?

For the moment, I'm stuck with this:

fix_newlines = re.compile(r'\[([\w\s*:/]*)\]\(([^()]+)\)')
# Capture the links and remove line-breaks from their urls
# Something like r'[\1](\2)'.trim() ??
post['content'] = fix_newlines.sub(r'[\1](\2)', post['content'])

Edit: I updated the example to be more explicit about my problem.

Thank you for your answer

Answer 1

strip works similarly to trim. Since you need to trim the newlines, use strip('\n'):

fin.readline().strip('\n')
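One caveat is worth illustrating here: strip only removes characters from the two ends of a string, so it does not touch a line break in the middle of a wrapped link. The short sketch below uses made-up sample strings (not taken from the question's file) to compare strip with replace, which removes every occurrence.

```python
# Hypothetical sample strings, for illustration only.
line = "distributed-e941dd3e345d022ceae909beccccbacd.png)\n"
wrapped = "data/images/network-\ndistributed.png\n"

# strip('\n') removes newlines at the start and end only.
stripped_line = line.strip("\n")
stripped_wrapped = wrapped.strip("\n")  # the internal newline is still there

# replace('\n', '') removes every newline, wherever it occurs.
flattened = wrapped.replace("\n", "")

print(repr(stripped_line))
print(repr(stripped_wrapped))
print(repr(flattened))
```

So strip is a good fit for cleaning up a line read from a file, while a break inside the captured text needs replace or a split/join.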

Answer 2

This will also work:

>>> s = """
...    ![https://diasporafoundation.org/assets/pages/about/network-
... distributed-e941dd3e345d022ceae909beccccbacd.png](data/images/network-
... distributed-e941dd3e345d022ceae909beccccbacd.png)
... """

>>> new_s = "".join(s.strip().split('\n'))
>>> new_s
'![https://diasporafoundation.org/assets/pages/about/network-distributed-e941dd3e345d022ceae909beccccbacd.png](data/images/network-distributed-e941dd3e345d022ceae909beccccbacd.png)'
>>>

Often, built-in string methods will do, and they are easier to read than a regex. In this case, strip removes leading and trailing whitespace, split returns a list of the pieces between newlines, and join puts them back together into a single string.

Answer 3

Alright, I finally found what I was looking for. With the snippet below, I can capture each link with a regex and then apply the processing to every match.

import re

def remove_newlines(match):
    # match.group() is the full text of the matched link or image
    return "".join(match.group().strip().split('\n'))

# sub() accepts a function as the replacement; it is called once per match
links_pattern = re.compile(r'\[([\w\s*:/\-\.]*)\]\(([^()]+)\)')
post['content'] = links_pattern.sub(remove_newlines, post['content'])
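For readers who want to try this outside the original script, here is a self-contained sketch that runs the same pattern on the sample text from the question. It also shows one detail: joining with an empty string turns "Ubuntu\nTrusty" into "UbuntuTrusty", whereas the desired output in the question keeps a space there. The second callback, join_wrapped_link, is a hypothetical variant and not part of the original answer; it rests on the assumptions stated in its comments.

```python
import re

# Pattern from the answer above: link text in [...], target in (...).
links_pattern = re.compile(r'\[([\w\s*:/\-\.]*)\]\(([^()]+)\)')


def remove_newlines(match):
    """Callback from the answer above: drop every newline in the match."""
    return "".join(match.group().strip().split('\n'))


def join_wrapped_link(match):
    """Hypothetical variant that treats link text and target separately."""
    text, target = match.group(1), match.group(2)
    # Assumption: a break right after a hyphen split one token, so it is
    # dropped; any other break in the link text stood in for a space.
    text = text.replace('-\n', '-').replace('\n', ' ')
    # Assumption: targets (URLs, paths) never contain real whitespace.
    target = target.replace('\n', '')
    return '[{}]({})'.format(text, target)


# Sample adapted from the question.
content = (
    "See for example the\n"
    "[installation process for Ubuntu\n"
    "Trusty](https://wiki.diasporafoundation.org/Installation/Ubuntu/Trusty). The\n"
    "project offers a Vagrant installation too.\n"
    "\n"
    "![https://diasporafoundation.org/assets/pages/about/network-\n"
    "distributed-e941dd3e345d022ceae909beccccbacd.png](data/images/network-\n"
    "distributed-e941dd3e345d022ceae909beccccbacd.png)\n"
)

naive = links_pattern.sub(remove_newlines, content)
fixed = links_pattern.sub(join_wrapped_link, content)
print(fixed)
```

Either callback is passed to sub in the same way; the difference is only in how each one rebuilds the replacement string from the match object.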

Thank you for your answers and sorry if my question wasn't explicit enough.

3 answers

solution1
0 2016-06-29 09:51:10

solution2
0 2016-06-29 10:26:26

solution3
0 ACCEPTED 2016-06-29 14:55:58
