Reinforcement Learning with MDP for revenues optimization

Original post 2018-06-07 09:27:01 4 1 python/ optimization/ reinforcement-learning/ markov-decision-process

I want to model the sale of seats on an airplane as an MDP (Markov decision process) so that I can use reinforcement learning to optimize airline revenue. For that I need to define the states, actions, policy, value, and reward. I have thought a bit about it, but I think something is still missing.

I model my system this way:

States = (r, c), where r is the number of passengers and c is the number of seats bought, so r >= c.
Actions = (p1, p2, p3), the 3 prices. The objective is to decide which one of them gives the most revenue.
Reward: revenue.

Could you please tell me what you think and help me?

After the modeling, I have to implement all of that with reinforcement learning. Is there a package that does the work?

1 answer

I think the biggest thing missing in your formulation is the sequential part. Reinforcement learning is useful for sequential problems, where the next state depends on the current state and the action taken (hence "Markovian"). In your formulation, you have not specified any such transition behavior at all. Also, the reward is a scalar that depends on either the current state or the combination of current state and action. In your case, the revenue depends on the price (the action), but it has no connection to the state (the seats). These are the two big problems I see with your formulation; there are others as well. I suggest you go through the RL theory (online courses and such) and work through a few sample problems before trying to formulate your own.
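To make the "sequential part" concrete, here is a minimal sketch of one way the seat-selling problem could be written with explicit transitions, trained with plain tabular Q-learning. It is an illustration under stated assumptions, not the only valid formulation: the state is changed to (selling period, seats left), and the fares, purchase probabilities, capacity, and horizon are all made-up numbers that do not come from the question.

```python
import random

# All constants below are hypothetical, chosen only for illustration.
PRICES = [100, 150, 200]      # the three fares p1, p2, p3
BUY_PROB = [0.8, 0.5, 0.2]    # assumed chance that a customer buys at each fare
CAPACITY = 5                  # assumed number of seats on the plane
HORIZON = 10                  # assumed number of selling periods before departure


def step(state, action, rng):
    """Advance one selling period and return (next_state, reward, done)."""
    t, seats = state
    # A seat is sold with a probability that depends on the posted fare.
    sold = seats > 0 and rng.random() < BUY_PROB[action]
    reward = PRICES[action] if sold else 0
    # The next state depends on the current state and the action: Markovian.
    next_state = (t + 1, seats - 1 if sold else seats)
    done = next_state[0] == HORIZON or next_state[1] == 0
    return next_state, reward, done


def q_learning(episodes=20000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the estimated remaining revenue

    def best(state):
        # Greedy action under the current estimates (unseen pairs count as 0).
        return max(range(len(PRICES)), key=lambda a: q.get((state, a), 0.0))

    for _ in range(episodes):
        state, done = (0, CAPACITY), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))   # explore
            else:
                action = best(state)                  # exploit
            nxt, reward, done = step(state, action, rng)
            future = 0.0 if done else gamma * q.get((nxt, best(nxt)), 0.0)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + future - old)
            state = nxt
    return q, best
```

After `q, best = q_learning()`, the expression `PRICES[best((0, CAPACITY))]` gives the fare the learned policy would post in the first period with all seats unsold. The point of the sketch is that the reward now depends on both the state (a sale is only possible while seats remain) and the action (the fare), and that each sale changes the state seen in the next period.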

MDP & Reinforcement Learning - Convergence Comparison of VI, PI and QLearning Algorithms

Tensorflow Reinforcement Learning RNN returning NaN's after Optimization with GradientTape

Reinforcement learning, pendulum python

Negative reward in reinforcement learning

Time step in reinforcement learning

Simple interface for reinforcement learning

Reinforcement Learning on a Supervised Dataset

reinforcement learning - number of actions

Reinforcement Learning with Keras model

Regression through reinforcement learning


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1 0 ACCEPTED 2018-06-07 18:33:45