
Using loc on two columns to perform calculations that replace values of another column

I have been stuck on this for way too long. All I am trying to do is create a new column called Duration Target Date, derived from Standard Duration Days + Date/Time Created. My code so far is below. As I understand it, the code iterates from 0 to the length of the data frame. If the Standard Duration Days column contains "No Set Standard Duration", it goes to the else branch and overwrites that cell with a blank (the same value I initialized it with). If the cell contains anything other than "No Set Standard Duration", it should add the value from Standard Duration Days to the value in Date/Time Created. I want the result stored in the new column Duration Target Date at the corresponding index.

newDF["Duration Target Date"] = ""

for i in range(0,len(newDF)):
    if newDF.loc[i,"Standard Duration Days"] != "No Set Standard Duration":
        newDF.loc[i,"Duration Target Date"] = (timedelta(days = int(newDF.loc[i,"Standard Duration Days"])) + newDF.loc[i,"Date/Time Created"])
    else:
        newDF.loc[i,"Duration Target Date"] == ""

I noticed that this works partially, but then it eventually stops working. I also get an error when I run it: "KeyError 326"

I would just add the columns and leave NaT (Not a Time) wherever no duration is set.

import pandas as pd

df = pd.DataFrame({
    "Standard Duration Days": [3, 5, "No Set Standard Duration"],
    "Date/Time Created": ['2019-01-01', '2019-02-01', '2019-03-01']
})

# 1. Convert string dates to pandas timestamps.
df['Date/Time Created'] = pd.to_datetime(df['Date/Time Created'])

# 2. Create time deltas, coercing errors.
delta = pd.to_timedelta(df['Standard Duration Days'], unit='D', errors='coerce')

# 3. Create new column by adding delta to 'Date/Time Created'.
df['Duration Target Date'] = (df['Date/Time Created'] + delta).dt.normalize()

>>> df
     Standard Duration Days Date/Time Created Duration Target Date
0                         3        2019-01-01           2019-01-04
1                         5        2019-02-01           2019-02-06
2  No Set Standard Duration        2019-03-01                  NaT

Adding text to a numeric or datetime column converts the entire column to object dtype, which takes more memory and is less efficient. Generally, one wants to leave empty values as np.nan (NaT for dates) or possibly a sentinel value in the case of integers. Those get converted only for display purposes, e.g. df['Duration Target Date'].fillna('') .
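To illustrate the dtype point, here is a minimal sketch on made-up data (the two values are assumptions for illustration only). It shows a datetime column turning into object once a string is written into it, and one way to blank out missing values only at display time:

```python
import pandas as pd

# Hypothetical data: one real date and one missing value.
dates = pd.Series(pd.to_datetime(["2019-01-04", None]))
print(dates.dtype)   # a datetime64 dtype; the missing value is stored as NaT

# Writing a string into the column forces the whole column to object dtype.
mixed = dates.astype(object)
mixed.iloc[1] = ""
print(mixed.dtype)   # object

# Keep NaT for computation and convert to text only when displaying.
display = dates.dt.strftime("%Y-%m-%d").fillna("")
print(display.tolist())
```

The original column stays usable for date arithmetic, while the display copy carries the blanks.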

A couple of issues here. First, it looks like you're confusing loc with iloc . Very easy to do. loc looks up rows by index label, which may or may not match the integer position. But your i in range(0, len(newDF)) iterates by integer position. So you're getting KeyError 326 because the loop reaches i = 326 and no row in your dataframe has the index label 326 (for example, because rows were filtered out earlier). You can check this by looking at print(newDF.iloc[320:330]) .
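To make the difference concrete, here is a minimal sketch with made-up data (the frame and its values are assumptions for illustration, not the asker's data). After filtering, the index labels have a gap, so a label lookup with loc fails where a positional lookup with iloc succeeds:

```python
import pandas as pd

# Hypothetical data: filtering leaves a gap in the index labels.
df = pd.DataFrame({"value": [10, 20, 30, 40]})
filtered = df[df["value"] != 20]
print(list(filtered.index))   # the labels that survived the filter

# Position 1 exists: it is the second remaining row.
second_row_value = filtered.iloc[1]["value"]

# Label 1 was filtered out, so loc raises KeyError.
try:
    filtered.loc[1, "value"]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Resetting the index makes labels match positions again, which is
# one way to make a range(len(df)) loop with loc safe.
renumbered = filtered.reset_index(drop=True)
print(renumbered.loc[1, "value"])
```

If the loop has to stay, either switch to iloc with a column position or call reset_index(drop=True) first.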

Second and more important issue: you almost never want to iterate through rows in a pandas dataframe. Instead, use a vectorized function that applies to a full column at a time. For your case where you want conditional assignment, the relevant function is np.where :

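import numpy as np
import pandas as pd

boolean_filter = newDF["Standard Duration Days"] != "No Set Standard Duration"
# Coerce the placeholder text to NaN so the arithmetic works on the whole column.
days = pd.to_numeric(newDF["Standard Duration Days"], errors="coerce")
# Assumes "Date/Time Created" already holds timezone-naive datetimes.
value_where_true = newDF["Date/Time Created"] + pd.to_timedelta(days, unit="D")
# NaT rather than "" keeps the result a datetime column.
value_where_false = np.datetime64("NaT")

newDF["Duration Target Date"] = np.where(boolean_filter, value_where_true, value_where_false)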

Here's a way using .apply row-wise:

import pandas as pd

# No astype(int) on the whole column: it would fail on the placeholder text.
newDF['Duration Target Date'] = newDF.apply(
    lambda x: x["Date/Time Created"] + pd.Timedelta(days=int(x["Standard Duration Days"]))
    if x["Standard Duration Days"] != "No Set Standard Duration" else pd.NaT,
    axis=1,
)

Note: Since you haven't provided any data, it is not tested.
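As a quick sanity check on made-up data (the three rows are assumptions borrowed from the sample frame in the first answer, not the asker's data, and target_date is a hypothetical helper name), a row-wise apply and a vectorized computation should give the same dates:

```python
import pandas as pd

PLACEHOLDER = "No Set Standard Duration"

# Hypothetical sample data.
df = pd.DataFrame({
    "Standard Duration Days": [3, 5, PLACEHOLDER],
    "Date/Time Created": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-03-01"]),
})

def target_date(row):
    """Return created date plus duration, or NaT when no duration is set."""
    days = row["Standard Duration Days"]
    if days == PLACEHOLDER:
        return pd.NaT
    return row["Date/Time Created"] + pd.Timedelta(days=int(days))

row_wise = df.apply(target_date, axis=1)

# Vectorized equivalent: the placeholder text becomes NaN, then NaT.
days = pd.to_numeric(df["Standard Duration Days"], errors="coerce")
vectorized = df["Date/Time Created"] + pd.to_timedelta(days, unit="D")

print(row_wise)
print(vectorized)
```

Both give the same dates here; the vectorized form is usually the better choice on large frames because it avoids calling a Python function once per row.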
