
Parsing a string in Python for #hashtags

How could I write an algorithm that parses a string for the hashtag symbol '#' and returns the full string, but with every word that starts with '#' turned into a link? I am using Python with Google App Engine (webapp2 and Jinja2), and I am building a blog. Thanks.

A more efficient and complete way to find the "hashwords":

def hash_position(string):
    # Index of the first '#', or -1 if there is none.
    return string.find('#')

def delimiter_position(string, delimiters):
    # Index of the earliest delimiter in the string, or -1 if none occurs.
    positions = [string.find(delimiter) for delimiter in delimiters]
    return min((p for p in positions if p >= 0), default=-1)

def get_hashed_words(string, delimiters):
    results = []
    current_hash_position = hash_position(string)
    while current_hash_position != -1:
        # Drop everything before the '#', so the haystack starts with the hashword.
        string = string[current_hash_position:]
        current_delimiter_position = delimiter_position(string, delimiters)
        if current_delimiter_position == -1:
            # No delimiter left: the rest of the string is the last hashword.
            # Stopping here also avoids looping forever on a trailing '#'.
            results.append(string)
            break
        results.append(string[:current_delimiter_position])
        # Update the haystack and look for the next '#'.
        string = string[current_delimiter_position:]
        current_hash_position = hash_position(string)
    return results

if __name__ == "__main__":
    string = "Please #clarify: What do you #mean with returning something as a #link. #herp"
    delimiters = [' ', '.', ',', ':']
    print(get_hashed_words(string, delimiters))  # ['#clarify', '#mean', '#link', '#herp']

Imperative code with updates of the haystack looks a little bit ugly but hey, that's what we get for (ab-)using mutable variables.
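For comparison, the same scan can probably be written with the standard re module instead of updating the haystack by hand. This is only a sketch under two assumptions: every delimiter is a single character, and the delimiter list is not empty. The name find_hashed_words is made up for this example.

```python
import re

def find_hashed_words(string, delimiters):
    # Assumption: delimiters is a non-empty list of single characters.
    # Build a character class that matches anything except a delimiter.
    not_delimiter = '[^' + re.escape(''.join(delimiters)) + ']'
    # A hashword is '#' followed by any run of non-delimiter characters.
    pattern = re.compile('#' + not_delimiter + '*')
    return pattern.findall(string)

if __name__ == "__main__":
    string = "Please #clarify: What do you #mean with returning something as a #link. #herp"
    print(find_hashed_words(string, [' ', '.', ',', ':']))
```

Like the loop above, this treats "#foo#bar" as a single hashword, because a second '#' is not a delimiter.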

And I still have no idea what you mean by "returning something as a link".

Hope that helps.

I don't know where you get the data for the links from, but it could be something like:

[('<a href="...">%s</a>' % word) for word in text.split() if word.startswith('#')]

Are you talking about Twitter? Maybe this?

def get_hashtag_link(hashtag):
    if hashtag.startswith("#"):
        return '<a href="https://twitter.com/search?q=%s">%s</a>' % (hashtag[1:], hashtag)

>>> get_hashtag_link("#stackoverflow")
'<a href="https://twitter.com/search?q=stackoverflow">#stackoverflow</a>'

It returns None if the argument does not start with '#'.
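To get back the full string with the hashtags replaced, as the question asks, something along these lines might work. It is a sketch, not a definitive implementation: it assumes a hashtag is '#' followed by letters, digits or underscores, and the names linkify_hashtags and url_template are invented for this example. The default URL is the Twitter search address from the function above; a blog would pass its own tag route instead.

```python
import html
import re
from urllib.parse import quote

# Assumption: a hashtag is '#' plus word characters, and the '#' must not
# directly follow a word character (so "C#" is left alone).
HASHTAG_RE = re.compile(r'(?<!\w)#(\w+)')

def linkify_hashtags(text, url_template="https://twitter.com/search?q={tag}"):
    """Return text as HTML with every #hashtag wrapped in a link."""
    parts = []
    last_end = 0
    for match in HASHTAG_RE.finditer(text):
        # Escape the plain text between hashtags so user input stays harmless.
        parts.append(html.escape(text[last_end:match.start()]))
        tag = match.group(1)
        href = url_template.format(tag=quote(tag))
        parts.append('<a href="%s">#%s</a>' % (html.escape(href), html.escape(tag)))
        last_end = match.end()
    # Escape whatever follows the last hashtag.
    parts.append(html.escape(text[last_end:]))
    return "".join(parts)

if __name__ == "__main__":
    print(linkify_hashtags("Please #clarify what a #link is.", url_template="/tag/{tag}"))
```

The result is already HTML, so a Jinja2 template with autoescaping turned on would need it marked as safe (for example with the safe filter) for the links to render.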

