
How to use variables (strings) with quotes and whitespace at the end inside query in Pandas?

Let's say I have the following code:

comm='This is/a string with (some single quotes, inside like 'F' this)'
print(df.query('Column1==@comm')['Column2'].values[0])

This raises an error instead of returning the value of Column2 for the row where Column1 equals comm.

I also tried:

df.query("Column1=='{0}'".format(comm))['Column2'].values[0]

That did not work either.

If the variable is a string without single ' or double " quotes, it works just fine.

In the actual code, comm is a dynamic variable whose values can contain both single ' and double " quotes.

Thanks in advance.

EDIT: It seems that pandas queries suffer from various other problems if the string contains symbols.

As advised, I tried comm.replace("'","\\'"), and it worked for strings containing single quotes ' .

Now I am facing another problem: the query fails to find the string in the dataframe (even though the string exists) if the string has whitespace at the end.

comm='This is a test. string '
comm='This is a test string/ '

I see that your string contains both kinds of quotes.

That is no problem. Use a double-quoted literal and escape the inner double quotes:

comm = "string with \" and ' in it!"

You can't write unescaped single quotes inside a single-quoted string, because each one ends the literal. The text is split into two separate strings with a bare name between them, which is a syntax error. Conceptually, your line is read like this:

comm='This is/a string with (some single quotes, inside like ' +  F  + ' this)'

Here F is just a variable name, not part of the string.
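As a minimal sketch of the point above, here are several equivalent ways to write a literal that contains quote characters. The variable names are only illustrative.

```python
# Double-quoted literal: single quotes inside need no escaping.
a = "inside like 'F' this"

# Single-quoted literal: each inner single quote must be escaped.
b = 'inside like \'F\' this'

# Triple-quoted literal: both kinds of quotes can appear unescaped.
c = '''inside like 'F' this'''

# A string holding both quote characters, written two ways.
both = "string with \" and ' in it!"
both_triple = '''string with " and ' in it!'''

print(a == b == c)          # all three spell the same string
print(both == both_triple)  # so do these two
```

All of these produce ordinary str objects; the quoting style only matters in the source code, not in the resulting value.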

These lines of code work fine:

import pandas as pd

df = pd.DataFrame({'Column1': ["string with 'single' quotes,inside like 'F' this", 'Data'],
                   'Column2': ['Done', 'Data2']})
comm = "string with 'single' quotes,inside like 'F' this"
print(df.query('Column1==@comm')['Column2'].values[0])

Edit: you can also use single quotes inside a single-quoted string by escaping each one with a backslash, as in \'F\':

df = pd.DataFrame({'Column1': ['string with \'single\' quotes,inside like \'F\' this', 'Data'],
'Column2':['Done','Data2']})
comm='string with \'single\' quotes,inside like \'F\' this'
print(df.query('Column1==@comm')['Column2'].values[0])
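The question also mentions strings that end with whitespace. The following is a sketch, not a definitive fix: it assumes the lookup should be an exact match, and the helper names (lookup_query, lookup_mask, lookup_stripped) are hypothetical. With @value the variable is taken from the local scope, so its content is never parsed as part of the expression, and quotes or trailing spaces in it should not matter as long as the stored value is identical.

```python
import pandas as pd

# The first Column1 value contains both quote types and ends with a space.
df = pd.DataFrame({
    "Column1": ["it's a \"test\" string ", "Data"],
    "Column2": ["Done", "Data2"],
})

def lookup_query(frame, value):
    # @value refers to the local variable; its text is not parsed by query.
    return frame.query("Column1 == @value")["Column2"].tolist()

def lookup_mask(frame, value):
    # Plain boolean indexing avoids the query expression parser entirely.
    return frame.loc[frame["Column1"] == value, "Column2"].tolist()

def lookup_stripped(frame, value):
    # Assumption: surrounding whitespace is not significant for the match.
    mask = frame["Column1"].str.strip() == value.strip()
    return frame.loc[mask, "Column2"].tolist()

comm = "it's a \"test\" string "
print(lookup_query(df, comm))              # exact match, trailing space included
print(lookup_mask(df, comm))               # same result without query
print(lookup_query(df, comm.rstrip()))     # no match: the trailing space differs
print(lookup_stripped(df, comm.rstrip()))  # match after stripping both sides
```

If a lookup fails even though the string appears to exist, comparing repr(comm) with repr of the stored value is a quick way to spot a whitespace difference.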

This trick makes query work with string formatting: json.dumps wraps the value in double quotes and escapes any double quotes inside it.

import json

def convert_string(string):  # Return the value as a double-quoted, escaped literal: "<string>"
    return json.dumps(string)

df = pd.DataFrame({'Column1': ['here', 'Data'],
'Column2':['Done','Data2']})
comm="here"
converted = convert_string(comm)
print(df.query('Column1=={}'.format(converted))['Column2'].values[0])
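To see why the formatting trick can work, it may help to look at the text json.dumps actually produces. This sketch uses only the standard library; the sample value is made up for illustration. It checks that the JSON-quoted text is also a valid Python string literal that evaluates back to the original value, which is what the query expression needs.

```python
import ast
import json

# Sample value with a single quote, double quotes, and a trailing space.
comm = "it's a \"test\" string "

# json.dumps wraps the text in double quotes and escapes inner double quotes.
literal = json.dumps(comm)
print(literal)

# The expression that would be passed to df.query.
expr = "Column1 == {}".format(literal)
print(expr)

# The quoted text parses as a Python string literal equal to the original.
print(ast.literal_eval(literal) == comm)
```

JSON and Python escaping agree for ordinary text like this, but they are not identical in every case (for example, characters outside the Basic Multilingual Plane), so referencing the variable with @comm remains the simpler route when it is available.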

Another option is to wrap the query in a try/except block:

df = pd.DataFrame({'Column1': ['here', 'Data'],
'Column2':['Done','Data2']})
comm="here"
try:
    print(df.query('Column1==@comm')['Column2'].values[0])
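except Exception:  # a bare except would also catch KeyboardInterrupt
    # Same expression as above; only the Python quoting of the query string differs.
    print(df.query("Column1==@comm")['Column2'].values[0])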

Edit 2:

This script quickly removes all punctuation symbols from a dataframe column.

#Create random dataframe
import pandas as pd
import numpy as np
import random
import string

random.seed(0)

def random_String(Length=20):
    letters = string.ascii_lowercase + string.punctuation
    return ''.join(random.choice(letters) for i in range(Length))
data_shape = 100000
data = {'A':[random_String() for i in range(data_shape)],'B':['Here string {}'.format(i) for i in range(data_shape)]}
df = pd.DataFrame(data)

df.head()
Out[]: 
                      A              B
0  {y[}!cq'&z]`t%w,~n'i  Here string 0
1  si[g.^q)>^-~jtg?e~{<  Here string 1
2  v%*gw"u./n*%#|(qd^*a  Here string 2
3  f?`z>_];/a.&_|vp?u>|  Here string 3
4  em+op^j^)#ffu}'&gt*s  Here string 4

def remove_symbols(s):  # Remove all punctuation symbols from a string
    return s.translate(str.maketrans('', '', string.punctuation))

def convert_data(pandas_series):  # Apply remove_symbols to every value in a Series
    return pandas_series.apply(remove_symbols)

df['A'] = convert_data(df['A'])
df.head()
Out[]: 
             A              B
0     ycqztwni  Here string 0
1     sigqjtge  Here string 1
2     vgwunqda  Here string 2
3       fzavpu  Here string 3
4  emopjffugts  Here string 4
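The same cleanup can be tried on a single string to see what the translation table does. This is a small sketch using only the standard library; the first sample is row 0 of the table above. Note that string.punctuation does not include whitespace, so a trailing space survives the translation and has to be stripped separately if it should be ignored.

```python
import string

# Third argument of str.maketrans lists the characters to delete.
table = str.maketrans("", "", string.punctuation)

sample = "{y[}!cq'&z]`t%w,~n'i"
cleaned = sample.translate(table)
print(cleaned)

# Punctuation is removed, but the trailing space is kept until strip() is called.
with_space = "This is a test string/ "
no_punct = with_space.translate(table)
print(repr(no_punct))
print(repr(no_punct.strip()))
```

Building the table once and reusing it avoids recreating it for every value when it is applied to a whole column.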
