How to use variables (strings) with quotes and whitespace at the end inside query in Pandas?
Let's say that I have the following code:
comm='This is/a string with (some single quotes, inside like 'F' this)'
print(df.query('Column1==@comm')['Column2'].values[0])
This gives me an error instead of returning the value of Column2 when comm exists in Column1.
I also tried:
df.query("Column1=='{0}'".format(comm))['Column2'].values[0]
That did not work either.
If the variable is a string without single ' or double " quotes, it works just fine.
In the actual code, comm is a dynamic variable that changes, and its values are strings containing single ' and double " quotes.
Thanks in advance.
EDIT: It seems that pandas queries suffer from various other problems if the string contains symbols.
As advised, I tried comm.replace("'","\\'") and it worked for strings containing single quotes '.
Now I'm facing another problem: the query fails to find the string in the dataframe (even though the string exists) if the string contains whitespace at the end.
comm='This is a test. string '
comm='This is a test string/ '
As I see, your string contains both kinds of quotes.
No problem:
comm = "string with \" and ' in it!"
You can't write an unescaped single quote inside a single-quoted string, because it ends the string early. The literal is split into two strings with stray code between them, which is a syntax error. Python effectively reads your line like this:
comm='This is/a string with (some single quotes, inside like ' + F + ' this)'
Here F is just a variable, not part of the string.
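As a small illustration of the point above, here are three ways to write a literal that contains single quotes without splitting it. The variable names are made up for this sketch and are not from the question.

```python
# Three equivalent ways to write a literal that contains single quotes.
escaped = 'inside like \'F\' this'          # backslash-escaped single quotes
double_quoted = "inside like 'F' this"      # delimit with double quotes instead
triple_quoted = '''inside like 'F' this'''  # triple quotes allow both kinds

assert escaped == double_quoted == triple_quoted

# A string that mixes both quote characters.
both = """string with " and ' in it!"""
print(both)
```

All three spellings produce the same string object contents; the choice only affects how the source code is written.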
These lines of code work fine:
import pandas as pd
df = pd.DataFrame({'Column1': ["string with 'single' quotes,inside like 'F' this", 'Data'],
                   'Column2':['Done','Data2']})
comm="string with 'single' quotes,inside like 'F' this"
print(df.query('Column1==@comm')['Column2'].values[0])
Edit: You can use single quotes inside a single-quoted string by escaping each one with a backslash, as in \'F\'.
df = pd.DataFrame({'Column1': ['string with \'single\' quotes,inside like \'F\' this', 'Data'],
'Column2':['Done','Data2']})
comm='string with \'single\' quotes,inside like \'F\' this'
print(df.query('Column1==@comm')['Column2'].values[0])
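To tie this back to the question, here is a minimal sketch, with made-up sample data, showing that the @comm form compares against the Python object itself, so quotes inside the value need no special handling. It also shows that the comparison is exact, which is one possible reason a value with trailing whitespace is not found: the stored value and the search value must carry the same whitespace.

```python
import pandas as pd

df = pd.DataFrame({
    'Column1': ["It's a \"test\" string ", 'Data'],   # note the trailing space
    'Column2': ['Done', 'Data2'],
})

comm = "It's a \"test\" string "    # same text, same trailing space

# @comm refers to the Python variable, so nothing is pasted into the expression.
via_query = df.query('Column1 == @comm')['Column2'].values[0]

# Plain boolean indexing performs the same comparison without query.
via_mask = df.loc[df['Column1'] == comm, 'Column2'].values[0]

# The match is exact: dropping the trailing space finds nothing.
stripped = comm.rstrip()
no_match = df.query('Column1 == @stripped')

print(via_query, via_mask, len(no_match))
```

If the data and the search value may differ only in surrounding whitespace, stripping both sides before comparing is one option.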
This trick makes query work with string formatting: json.dumps wraps the string in double quotes and escapes any double quotes inside it, so the result can be inserted into the query expression.
import json
def convert_string(string):  # Wrap the string in double quotes, giving '"<string>"'
    return json.dumps(string)
df = pd.DataFrame({'Column1': ['here', 'Data'],
'Column2':['Done','Data2']})
comm="here"
converted = convert_string(comm)
print(df.query('Column1=={}'.format(converted))['Column2'].values[0])
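The example above uses a value without quotes. As a hedged sketch with made-up data, the same approach applied to a value that contains both kinds of quotes looks like this; it relies on the query expression accepting a Python-style string literal with backslash escapes.

```python
import json
import pandas as pd

df = pd.DataFrame({
    'Column1': ['say "hi" to \'F\'', 'Data'],
    'Column2': ['Done', 'Data2'],
})

comm = 'say "hi" to \'F\''

# json.dumps returns the text wrapped in double quotes, with inner double
# quotes backslash-escaped; single quotes are left alone.
literal = json.dumps(comm)
print(literal)

# The literal is pasted into the expression as source text.
result = df.query('Column1 == {}'.format(literal))['Column2'].values[0]
print(result)
```

Compared with the @comm form, this builds the expression from text, so it is only as safe as the escaping step.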
A better solution is to use exception handling:
df = pd.DataFrame({'Column1': ['here', 'Data'],
'Column2':['Done','Data2']})
comm="here"
try:
    print(df.query('Column1==@comm')['Column2'].values[0])
except Exception:
    print(df.query("Column1==@comm")['Column2'].values[0])
Edit 2:
This script removes all punctuation symbols from a dataframe column very quickly.
#Create random dataframe
import pandas as pd
import numpy as np
import random
import string
random.seed(0)
def random_String(Length=20):
    letters = string.ascii_lowercase + string.punctuation
    return ''.join(random.choice(letters) for i in range(Length))
data_shape = 100000
data = {'A':[random_String() for i in range(data_shape)],'B':['Here string {}'.format(i) for i in range(data_shape)]}
df = pd.DataFrame(data)
df.head()
Out[1]:
                      A              B
0  {y[}!cq'&z]`t%w,~n'i  Here string 0
1  si[g.^q)>^-~jtg?e~{<  Here string 1
2  v%*gw"u./n*%#|(qd^*a  Here string 2
3  f?`z>_];/a.&_|vp?u>|  Here string 3
4    em+op^j^)#ffu}'>*s  Here string 4
def remove_symbols(s):  # Remove all punctuation symbols from a string
    return s.translate(str.maketrans('', '', string.punctuation))
def convert_data(pandas_series):
    return pandas_series.apply(remove_symbols)
df['A'] = convert_data(df['A'])
df.head()
Out[2]:
             A              B
0     ycqztwni  Here string 0
1     sigqjtge  Here string 1
2     vgwunqda  Here string 2
3       fzavpu  Here string 3
4  emopjffugts  Here string 4
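If the cleaned column is then used for lookups, the search value has to go through the same cleaning step. The sketch below assumes a hypothetical helper named normalize and a hypothetical column Column1_clean; it also strips surrounding whitespace, which the script above does not do.

```python
import string
import pandas as pd

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def normalize(text):
    """Drop ASCII punctuation and surrounding whitespace (hypothetical helper)."""
    return text.translate(_PUNCT_TABLE).strip()

df = pd.DataFrame({
    'Column1': ["This is/a string with 'F' (inside) ", 'Data'],
    'Column2': ['Done', 'Data2'],
})

# Keep the original column and store the cleaned key next to it.
df['Column1_clean'] = df['Column1'].apply(normalize)

# Clean the search value the same way before comparing.
comm = normalize("This is/a string with 'F' (inside) ")
result = df.query('Column1_clean == @comm')['Column2'].values[0]
print(comm, '->', result)
```

One trade-off to keep in mind: strings that differ only in punctuation become identical after cleaning, so the lookup can match more rows than an exact comparison would.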