A value is trying to be set on a copy of a slice from a DataFrame - pandas
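I'm new to pandas and, given a data frame, I'm trying to drop the rows that don't meet a specific requirement. Researching how to do it, I arrived at this construct: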
df = df.loc[df['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])]
However, when processing the frame, I get this warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[name] = value
I'm not sure what to do, since I'm already using .loc. What am I missing?
f = ['ID_manifest', 'issue_date', 'channel', 'product', 'ID_client', 'desc_manifest']
df = pd.DataFrame(columns=f)
for chunk in df2017_chunks:
    aux = preProcess(chunk, f)
    df = pd.concat([df, aux])
import re
import numpy as np
from nltk.corpus import stopwords

def preProcess(df, f):
    stops = list(stopwords.words("portuguese"))
    stops.extend(['reclama', 'cliente', 'santander', 'cartao', 'cartão'])
    df = df.loc[df['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])]
    df.columns = f
    df.desc_manifest = df.desc_manifest.str.lower()  # All lower case
    df.desc_manifest = df.desc_manifest.apply(lambda x: re.sub('[^A-Za-zÀ-ÿ]', ' ', str(x)))  # Just letters (A-z would also match [ \ ] ^ _ `)
    df.replace(['NaN', 'nan'], np.nan, inplace=True)  # Remove nan
    df.dropna(subset=['desc_manifest'], inplace=True)
    df.desc_manifest = df.desc_manifest.apply(lambda x: [word for word in str(x).split() if word not in stops])  # Remove stop words
    return df
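The purpose of the warning is to tell users that they may be operating on a copy rather than on the original, but there can be false positives. As mentioned in the comments, this is not a problem for your use case.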
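As a rough illustration of the false-positive case, the sketch below filters a small made-up frame without .copy(), assigns to the result, and records whatever warnings pandas raises. The frame contents and the names raw, subset and warning_names are assumptions for this example only. Whether a warning is actually recorded depends on the installed pandas version, so no particular warning list should be expected.

```python
import warnings
import pandas as pd

# Made-up sample data, only for illustration
raw = pd.DataFrame({'DS_FAMILIA_PROD': ['CARTOES', 'SEGUROS', 'CARTÕES'],
                    'desc_manifest': ['Foo', 'Bar', 'Baz']})

# Boolean filtering without .copy(): pandas may flag the result as a possible copy
subset = raw.loc[raw['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    subset['desc_manifest'] = subset['desc_manifest'].str.lower()

# Names of any warnings raised by the assignment (may be empty on newer pandas)
warning_names = [w.category.__name__ for w in caught]
print(warning_names)

# The filtered frame was modified; the original frame was not
print(subset['desc_manifest'].tolist())
print(raw['desc_manifest'].tolist())
```

Either way the assignment lands in subset and raw stays as it was, which is exactly what the preProcess function wants, so the warning is harmless here.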
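You can simply turn off the check for the data frame (this only works on older pandas versions; the is_copy attribute was deprecated in pandas 0.23 and has since been removed):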
df.is_copy = False
Or you can make an explicit copy:
df = df.loc[df['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])].copy()
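You need copy because, if you modify values in df later, you will find that the modifications do not propagate back to the original data (df), and pandas warns about that.
loc can be omitted, but without copy the warning is still raised.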
import pandas as pd

df = pd.DataFrame({'DS_FAMILIA_PROD':['a','d','b'],
                   'desc_manifest':['F','rR', 'H'],
                   'C':[7,8,9]})
def preProcess(df):
    df = df[df['DS_FAMILIA_PROD'].isin([u'a', u'b'])].copy()
    df.desc_manifest = df.desc_manifest.str.lower()  # All
    ...
    ...
    return df
print (preProcess(df))
   C DS_FAMILIA_PROD desc_manifest
0  7               a             f
2  9               b             h
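To check that the explicit copy really is independent of the frame it was taken from, a minimal sketch along the same lines can compare both frames after the modification. The helper name lower_selected and the variable names original and result are hypothetical and used only for this illustration.

```python
import pandas as pd

original = pd.DataFrame({'DS_FAMILIA_PROD': ['a', 'd', 'b'],
                         'desc_manifest': ['F', 'rR', 'H']})

def lower_selected(frame):
    # Hypothetical helper: filter rows, then work on an explicit copy
    selected = frame[frame['DS_FAMILIA_PROD'].isin(['a', 'b'])].copy()
    selected['desc_manifest'] = selected['desc_manifest'].str.lower()
    return selected

result = lower_selected(original)
print(result['desc_manifest'].tolist())    # values in the filtered copy
print(original['desc_manifest'].tolist())  # values in the source frame
```

The lower-cased values appear only in result, while original keeps its upper-case values and all three rows.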
If your program deliberately intends to work on a copy of df, you can stop the warning this way:
pd.set_option('mode.chained_assignment', None)  # silence the warning

pd.set_option('mode.chained_assignment', 'warn')  # the default setting
# with 'warn', if you set a value on a copy, the warning will show
df = pd.DataFrame({'DS_FAMILIA_PROD' : [1, 2, 3], 'COL2' : [5, 6, 7]})
df = df[df.DS_FAMILIA_PROD.isin([1, 2])]
df
Out[29]:
   COL2  DS_FAMILIA_PROD
0     5                1
1     6                2
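If changing the option for the whole program feels too broad, one possible variation is to change it only around the code that intentionally works on a copy. The sketch below uses pd.option_context for that. The variable names previous, inside and restored are assumptions for this example, and it presumes the installed pandas version still provides the mode.chained_assignment option.

```python
import pandas as pd

df = pd.DataFrame({'DS_FAMILIA_PROD': [1, 2, 3], 'COL2': [5, 6, 7]})
previous = pd.get_option('mode.chained_assignment')

# The option is changed only inside the with block
with pd.option_context('mode.chained_assignment', None):
    subset = df[df.DS_FAMILIA_PROD.isin([1, 2])]
    subset['COL2'] = subset['COL2'] * 10
    inside = pd.get_option('mode.chained_assignment')

# Outside the block the earlier setting is back in effect
restored = pd.get_option('mode.chained_assignment')
print(inside, restored)
```

This keeps the warning active for the rest of the program, where it may still point at a real mistake.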