
Implementing a More Efficient Python Algorithm for Substrings in Lists of Strings

I'm working on the following problem in Python.

I have a large list (size n) of input strings, and I need to check whether each string contains, as a substring, any word from a large dictionary (size m).
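As a baseline, here is a minimal sketch of the naive approach (the names `inputs` and `words` are hypothetical stand-ins for my actual data):

```python
# Hypothetical sample data: `inputs` is the list of n strings,
# `words` is the dictionary of m words to look for.
inputs = ["the quick brown fox", "lorem ipsum"]
words = {"quick", "ipsum", "missing"}

# Naive check: for each input string, test each dictionary word
# with the `in` operator (one substring search per word).
hits = {s: [w for w in words if w in s] for s in inputs}
```

This is O(n * m) substring searches, which is what I'm trying to beat.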

I've looked around for efficient algorithms for this problem and found these: https://github.com/laurentluce/python-algorithms/blob/master/algorithms/string_matching.py

the Rabin-Karp and KMP matching algorithms. (Note that I've replaced the ord() calls in the Rabin-Karp implementation with a precomputed dictionary for efficiency.)
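For reference, here is a minimal Rabin-Karp sketch of the kind I'm using (an assumption on my part: this mirrors the linked code's rolling-hash approach, with a precomputed dict standing in for repeated ord() calls; `base` and `mod` values are illustrative):

```python
def rabin_karp(pattern, text, base=256, mod=101):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Precompute character codes once instead of calling ord() repeatedly.
    codes = {c: ord(c) for c in set(pattern) | set(text)}
    h = pow(base, m - 1, mod)  # weight of the leading character in the window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (base * p_hash + codes[pattern[i]]) % mod
        t_hash = (base * t_hash + codes[text[i]]) % mod
    for s in range(n - m + 1):
        # On a hash match, verify the window to rule out collisions.
        if p_hash == t_hash and text[s:s + m] == pattern:
            return s
        if s < n - m:
            # Roll the hash: drop text[s], append text[s + m].
            t_hash = (base * (t_hash - codes[text[s]] * h)
                      + codes[text[s + m]]) % mod
    return -1
```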

However, these actually perform slower than using an 'in' operation in Python, which uses the Boyer–Moore–Horspool algorithm. I suppose that this is because the contains() method invoked by 'in' is implemented in C.

How can I override this method for Python's string class with a Rabin-Karp implementation written in C?

You could have a look at cython:

http://docs.cython.org/src/quickstart/cythonize.html

I find it easier to write some custom code for a specific operation than to override a core Python structure.

You can't, I am sorry to say.

The built-in types, being constructed at the time the interpreter is compiled, cannot be patched at run-time. If speed is really so important then you might want to write a C extension type that subclasses the built-in string type but with a different __contains__ method.
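To illustrate the interface such a subclass would expose, here is a pure-Python sketch (assumption: a real speed-up would require writing this as a C extension type, as suggested above; the class name `MatchStr` is hypothetical, and plain str.find stands in for a custom matcher):

```python
class MatchStr(str):
    """A str subclass whose `in` operator routes through a custom matcher."""

    def __contains__(self, pattern):
        # Swap in any search routine here; str.find stands in for a
        # hypothetical Rabin-Karp implementation written in C.
        return self.find(pattern) != -1

s = MatchStr("hello world")
```

Note that `pattern in s` only hits the override when `s` is a MatchStr, and a pure-Python override will be slower than the built-in, not faster; the point is the shape a C subclass would take.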
