
Django Raw Query with params on Table Column (SQL Injection)

I have a somewhat unusual scenario: in addition to my SQL parameters, I need to let the user / API define the table column name too. The problem is that when I pass the column name as a parameter, the query ends up as:

SELECT device_id, time, 's0' ...

instead of

SELECT device_id, time, s0 ...

Is there another way to do that through raw(), or would I need to escape the column name myself?

queryset = Measurement.objects.raw(
            '''
            SELECT device_id, time, %(sensor)s FROM measurements
            WHERE device_id=%(device_id)s AND time >= to_timestamp(%(start)s) AND time <= to_timestamp(%(end)s)
            ORDER BY time ASC;
            ''', {'device_id': device_id, 'sensor': sensor, 'start': start, 'end': end})

As with any potential for SQL injection, be careful.

But essentially this is a fairly common problem with a fairly safe solution. The problem, in general, is that query parameters are "the right way" to handle query values, but they're not designed for schema elements.
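To see why the placeholder produces 's0' rather than s0, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the question appears to target PostgreSQL), and the table contents are made up, but the behaviour of bound parameters is the same in spirit: a parameter is always bound as a value, never as an identifier.

```python
import sqlite3

# In-memory stand-in for the measurements table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (device_id INTEGER, s0 REAL)")
conn.execute("INSERT INTO measurements VALUES (1, 21.5)")

# A bound parameter is treated as a value, so this selects the string
# 's0' for every row rather than the contents of the column s0.
as_param = conn.execute(
    "SELECT device_id, ? FROM measurements", ("s0",)
).fetchall()

# Writing the identifier into the SQL text selects the actual column.
as_identifier = conn.execute(
    "SELECT device_id, s0 FROM measurements"
).fetchall()

print(as_param)       # the literal string comes back
print(as_identifier)  # the stored reading comes back
```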

To dynamically include schema elements in your query, you generally have to resort to string concatenation, which is exactly the thing we're all told not to do with SQL queries.

But the good news here is that you don't have to use the actual user input. While the possible query values are infinite, the set of valid schema elements is quite finite, so you can validate the user's input against that set.

For example, consider the following process:

  1. User inputs a value as a column name.
  2. Code compares that value (raw string comparison) against a list of known possible values. (This list can be hard-coded, or can be dynamically fetched from the database schema.)
  3. If no match is found, return an error.
  4. If a match is found, use the matched known value directly in the SQL query.

So all you're ever using are the very strings you, as the programmer, put in the code. Which is the same as writing the SQL yourself anyway.
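The four steps above can be sketched roughly as follows. The allowlist contents (s1 and s2 in particular) and the helper names resolve_sensor_column and build_measurement_query are hypothetical, chosen only for illustration; the SQL text is the query from the question with the column placeholder removed.

```python
# Hypothetical allowlist: only "s0" appears in the question, the other
# names are assumed for illustration.
ALLOWED_SENSOR_COLUMNS = ("s0", "s1", "s2")


def resolve_sensor_column(user_value):
    """Return the matching allowlisted column name, or raise ValueError."""
    for known in ALLOWED_SENSOR_COLUMNS:
        if user_value == known:  # raw string comparison, no normalisation
            return known         # hand back our own string, not the user's
    raise ValueError("unknown sensor column: %r" % (user_value,))


def build_measurement_query(user_sensor, device_id, start, end):
    """Build the SQL text and the value parameters for the raw query."""
    column = resolve_sensor_column(user_sensor)
    # Only the allowlisted identifier is concatenated into the SQL text;
    # the values stay as %(name)s placeholders for the driver to bind.
    sql = (
        "SELECT device_id, time, " + column + " FROM measurements "
        "WHERE device_id = %(device_id)s "
        "AND time >= to_timestamp(%(start)s) "
        "AND time <= to_timestamp(%(end)s) "
        "ORDER BY time ASC"
    )
    params = {"device_id": device_id, "start": start, "end": end}
    return sql, params
```

The returned pair could then be passed along as Measurement.objects.raw(sql, params), in the same way as in the question. An input such as "s0; DROP TABLE measurements" never reaches the database because it fails the comparison and raises ValueError.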

It doesn't look like you need raw() for the example query you posted. I think the following queryset is very similar.

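from datetime import datetime, timezone

# start and end look like epoch seconds in the question (hence to_timestamp),
# so convert them to aware datetimes before filtering on the time field.
measurements = Measurement.objects.filter(
    device_id=device_id,
    time__gte=datetime.fromtimestamp(start, tz=timezone.utc),
    time__lte=datetime.fromtimestamp(end, tz=timezone.utc),
).order_by('time')

for measurement in measurements:
    # sensor should already have been checked against the known column names
    print(getattr(measurement, sensor))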

If you need to optimise and avoid loading other fields, you can use values() or only().


 