
How to manipulate a Pandas Series without changing the original?

Context

I have a method that takes a Pandas Series of categorical data and returns an integer-indexed version of it. However, I think my implementation also modifies the Series that is passed in, rather than only returning a modified new Series. I also get the following warnings:

A value is trying to be set on a copy of a slice from a DataFrame.
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  series[series == value] = index

SettingWithCopyWarning: modifications to a property of a datetimelike object are not supported and are discarded. Change values on the original.
  cacher_needs_updating = self._check_is_chained_assignment_possible()


Code

import pandas


def categorials(series: pandas.Series) -> pandas.Series:
    unique = series.unique()

    for index, value in enumerate(unique):
        # This assignment writes into the Series object passed in by the caller.
        series[series == value] = index

    return series.astype(pandas.Int64Dtype())

Question

  • How can I make this method return the modified Series without changing the original Series that was passed in?

You need to .copy() the incoming argument. Normally that warning would not appear, since we are free to write to a Series or DataFrame. However, in the code you did not share, the argument you pass in appears to have been obtained as a subset of another Series or DataFrame (or possibly of itself). As a general rule, if you plan to modify a subset, chain .copy() onto the expression that creates it.
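As a minimal sketch of that advice: the frame, its column names, and the filter below are hypothetical stand-ins for the code the question did not share. The point is only that a subset created with .copy() can be written to without touching its source.

```python
import pandas as pd

# Hypothetical data; the question does not show where its Series came from.
frame = pd.DataFrame(
    {"color": ["red", "blue", "red", "green"], "size": [1, 2, 3, 4]}
)

# Chaining .copy() makes the subset an independent object, so later
# writes to it are unambiguous and leave `frame` alone.
subset = frame.loc[frame["size"] > 1, "color"].copy()

subset.iloc[0] = "yellow"  # modifies only the subset

print(subset.tolist())          # ['yellow', 'red', 'green']
print(frame["color"].tolist())  # ['red', 'blue', 'red', 'green']
```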

Anyway, back to the question: adding series = series.copy() as the first line of the function should resolve the issue. However, your method is actually doing factorization, so

pd.Series(pd.factorize(series)[0], index=series.index)

is equivalent to what your function does. Since pd.factorize returns a 2-tuple of (codes, uniques), we take element 0. The codes come back as a NumPy array, so we wrap them in a Series that reuses the incoming index. Note that this does not attempt to modify the original Series, so no .copy() is needed.
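Put together as a function, the factorize approach might look like the sketch below. The function name is taken from the question; the sample data and the cast to the nullable "Int64" dtype (to match the question's return type) are assumptions made for illustration.

```python
import pandas as pd


def categorials(series: pd.Series) -> pd.Series:
    # factorize returns (codes, uniques); codes are numbered in order of
    # first appearance and the input Series is only read, never written.
    codes, _uniques = pd.factorize(series)
    # Reuse the caller's index and cast to the nullable integer dtype.
    return pd.Series(codes, index=series.index, dtype="Int64")


# Hypothetical sample data for illustration.
original = pd.Series(["b", "a", "b", "c"], index=[10, 11, 12, 13])
encoded = categorials(original)

print(encoded.tolist())   # [0, 1, 0, 2]
print(original.tolist())  # ['b', 'a', 'b', 'c']
```

One difference worth knowing: by default pd.factorize assigns the code -1 to missing values, so inputs containing NaN may need separate handling.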

