
Insert a series of values into pd.dataframe randomly

I have a large dataframe and want to overwrite X of its entries with a new value that I set. The block of new entries has to start at a random position, but the entries themselves have to be consecutive. For example, I have a column of random numbers and want to overwrite 20 of them in a row with the new value x.

I tried df.sample(x) and then updating the dataframe, but that only gives me individual, scattered entries. I need the X new entries to be in a row (consecutive).

Does anybody have a solution? I'm quite new to Python and need to learn it for my master's thesis.

CLARIFICATION:

My dataframe has 5 columns and almost 60,000 rows, with each row covering 10 minutes of the year.

  • One column is 'output', with the electricity production values for those 10 minutes.
  • For 2 consecutive hours of the year (120 consecutive minutes, hence 12 consecutive rows) I want to lower that production to 60%. This should happen at a random time of the year.
  • Another column is 'status', which records whether the production is reduced or not.

I tried:

df_update = df.sample(12)
df_update.status = 'reduced'
df.update(df_update)
df.loc[df['status'] == 'reduced', 'output'] *= 0.6

This covers the right total amount of time (12 × 10 minutes), but the affected rows are scattered across the year. I want 120 consecutive minutes, not separate intervals.

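I decided to pick a random starting row and then scale that row and the next 11 rows to 60% of their value. Label slicing with .loc includes both endpoints, so idx:idx+11 covers 12 rows (this assumes the default integer index). I think this is what you want.

import numpy as np
import pandas as pd

df = pd.DataFrame({'output': np.random.randn(20), 'status': [0] * 20})
# Sample the start from all but the last 11 rows so the 12-row window always fits
idx = df.iloc[:-11].sample(1).index[0]
df.loc[idx:idx+11, 'output'] *= 0.6  # .loc is end-inclusive: 12 rows, scaled to 60%
df.loc[idx:idx+11, 'status'] = 1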
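
For the setup described in the question (a time-based index and a text 'status' column), the same idea can be sketched with positional indexing so that it does not depend on the index type. This is only a sketch under assumptions: the helper name reduce_random_window, the 'normal' status label, the toy production value of 100.0 and the one-day date range are all made up for illustration and are not from the original data.

```python
import numpy as np
import pandas as pd


def reduce_random_window(df, n_rows=12, factor=0.6, rng=None):
    """Scale 'output' by `factor` over `n_rows` consecutive rows at a random position.

    Hypothetical helper: it works on row positions, so any index type is fine.
    The dataframe is modified in place; the chosen start position is returned.
    """
    if n_rows > len(df):
        raise ValueError("window is longer than the dataframe")
    rng = np.random.default_rng() if rng is None else rng
    # The highest valid start is len(df) - n_rows, so the window never runs off the end.
    start = int(rng.integers(0, len(df) - n_rows + 1))
    rows = slice(start, start + n_rows)  # positional slice, end exclusive
    df.iloc[rows, df.columns.get_loc("output")] *= factor
    df.iloc[rows, df.columns.get_loc("status")] = "reduced"
    return start


# Toy data (assumed): one day at 10-minute resolution, all rows at full production.
index = pd.date_range("2021-01-01", periods=144, freq="10min")
df = pd.DataFrame({"output": 100.0, "status": "normal"}, index=index)

start = reduce_random_window(df, rng=np.random.default_rng(0))
print(df.iloc[start:start + 12])
```

Passing a seeded generator, as in the usage lines, makes the chosen window reproducible; leaving rng out gives a different window on each run.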
