
Parsing and constructing filtering queries similar to a SQL WHERE clause in Python/JavaScript

I am building a query engine for a database that pulls data from SQL and other sources. For normal use cases, users can use a web form where they specify filtering parameters with select and range inputs. But for advanced use cases, I'd like to provide a filtering equation box where users could type:

  • AND, OR

  • Nested parentheses

  • variable names

  • >, <, =, != operators

So the filtering equation could look something like:

 ((age > 50) or (weight > 100)) and diabetes='yes'

This input would then be parsed, input errors detected (non-existent variable names, etc.), and SQLAlchemy queries built from it.
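To make the pipeline concrete, here is a minimal sketch of the parse-and-validate step using only the standard library. It is an illustration under assumptions, not an existing package: `ALLOWED_FIELDS`, `FilterSyntaxError`, and `parse_filter` are hypothetical names, and literals are limited to numbers and single-quoted strings. It turns the expression into nested tuples and rejects unknown variable names.

```python
import re

# Hypothetical whitelist of variables; a real engine would derive it from its schema.
ALLOWED_FIELDS = {"age", "weight", "diabetes"}

# One token per match: number, quoted string, operator, parenthesis, or name.
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<num>\d+(?:\.\d+)?)
      | (?P<str>'[^']*')
      | (?P<op>!=|<|>|=)
      | (?P<paren>[()])
      | (?P<name>[A-Za-z_]\w*)
    )""", re.VERBOSE)


class FilterSyntaxError(ValueError):
    """Raised for malformed input or unknown variable names."""


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise FilterSyntaxError(f"unexpected character at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    """Recursive descent parser; OR binds looser than AND."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def is_keyword(self, word):
        kind, value = self.peek()
        return kind == "name" and value.lower() == word

    def parse_or(self):
        node = self.parse_and()
        while self.is_keyword("or"):
            self.take()
            node = ("or", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_term()
        while self.is_keyword("and"):
            self.take()
            node = ("and", node, self.parse_term())
        return node

    def parse_term(self):
        kind, value = self.take()
        if (kind, value) == ("paren", "("):
            node = self.parse_or()
            if self.take() != ("paren", ")"):
                raise FilterSyntaxError("missing closing parenthesis")
            return node
        if kind != "name":
            raise FilterSyntaxError(f"expected a variable name, got {value!r}")
        if value not in ALLOWED_FIELDS:
            raise FilterSyntaxError(f"unknown variable: {value}")
        op_kind, op = self.take()
        if op_kind != "op":
            raise FilterSyntaxError(f"expected an operator after {value}")
        lit_kind, lit = self.take()
        if lit_kind == "num":
            return (op, value, float(lit))
        if lit_kind == "str":
            return (op, value, lit[1:-1])  # strip the surrounding quotes
        raise FilterSyntaxError(f"expected a number or quoted string after {op}")


def parse_filter(text):
    parser = Parser(tokenize(text))
    tree = parser.parse_or()
    if parser.peek() != (None, None):
        raise FilterSyntaxError(f"unexpected trailing input: {parser.peek()[1]!r}")
    return tree
```

Calling `parse_filter("((age > 50) or (weight > 100)) and diabetes='yes'")` yields a tree whose root is an `"and"` node. Each tuple node could then be mapped onto SQLAlchemy's `and_`, `or_`, and column comparison operators to build the actual query; that mapping is left out here because it depends on the table definitions.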

I saw an earlier post about a similar problem: https://stackoverflow.com/a/1395854/315168

There seem to be several language and mini-language parsers for Python: http://navarra.ca/?p=538

However, is there a package that would be an out-of-the-box or near-complete solution for my problem? If not, what would be the simplest way to build such a query parser and constructor in Python?

Have a look at https://github.com/dfilatov/jspath

It's similar to XPath, so the syntax isn't as familiar as SQL, but it's powerful for querying hierarchical data.

I don't know if this is still relevant to you, but here is my answer:

First, I created a class that does exactly what you need. You can find it here: https://github.com/snow884/filter_expression_parser/ It takes a list of dictionaries plus a filter query as input and returns the filtered results. You just have to define the list of allowed fields, plus functions for checking the format of the constants passed as part of the filter expression.

The filter expression it ingests has to have the following format:

(time > 45.34) OR (((user_id eq 1) OR (date gt '2019-01-04')) AND (username ne 'john.doe'))

or just

username ne 'john123'

Second, it was foolish of me to even write this code, because DataFrame.query(...) from pandas already does almost exactly what you need: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html
