
Reinforcement Learning without a final state?

I have a question about my use case in the area of reinforcement learning.

I want to measure the price acceptance of two products that depend on each other. That means if I change the price of product A, customers might rather buy product B instead.

As I imagine it, I need a reinforcement learning algorithm for this. The state would be the current prices of A and B (e.g. A: 15€, B: 12€).

The actions would be the possible price changes (e.g. price of A − 2€).

So the next state in this example would be (A: 13€, B: 12€).

The reward would be something like the profit difference, or some other variable that tells me how successful the price change was.
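In code, the setup I have in mind would look roughly like this (the demand/profit model below is just a placeholder I made up to illustrate the state/action/reward idea):

```python
import numpy as np

PRICE_DELTAS = [-2.0, -1.0, 0.0, 1.0, 2.0]  # possible price changes for one product

class PricingEnv:
    """State = current prices of A and B; action = (product index, delta index)."""

    def __init__(self, price_a=15.0, price_b=12.0):
        self.prices = np.array([price_a, price_b])

    def step(self, action):
        product, delta_idx = action
        old_profit = self._profit(self.prices)
        self.prices[product] += PRICE_DELTAS[delta_idx]
        new_profit = self._profit(self.prices)
        reward = new_profit - old_profit      # profit difference as reward
        return self.prices.copy(), reward     # no terminal flag: the process just continues

    def _profit(self, prices):
        # Placeholder demand model with cross-price effects; a real system would
        # use observed sales data instead of these invented coefficients.
        demand_a = max(0.0, 100 - 4 * prices[0] + 2 * prices[1])
        demand_b = max(0.0, 100 - 4 * prices[1] + 2 * prices[0])
        return demand_a * prices[0] + demand_b * prices[1]
```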

My question now is: I don't have a final state, right? How can I handle this? I just want to maximize the reward. Is reinforcement learning even the right method, or is there something more suitable for me?

A final state isn't necessary in reinforcement learning; you just have to be careful with how you set your discount factor gamma.
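For example, plain tabular Q-learning handles a continuing task without any terminal-state check, as long as gamma is strictly below 1 so the discounted return stays finite. A minimal sketch, assuming an environment like the one described in the question:

```python
import random
from collections import defaultdict

def q_learning(env, actions, steps=10_000, alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)                         # Q[(state, action)] -> estimated value
    state = tuple(env.prices)
    for _ in range(steps):
        # epsilon-greedy action selection
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: Q[(state, a)])
        next_prices, reward = env.step(action)
        next_state = tuple(next_prices)
        # standard Q-learning update; no special case for a final state is needed
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state
    return Q

# Example usage with the PricingEnv sketch from the question (assumed):
# env = PricingEnv()
# actions = [(p, d) for p in range(2) for d in range(len(PRICE_DELTAS))]
# Q = q_learning(env, actions)
```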

Could you give a bit more information about how the price acceptance is calculated?

One other thing: I don't really see the benefit of using a neural network for your problem. Your goal is to find the pair of prices (A, B) that, given your environment's price acceptance, yields the best profit. But once you have found that pair, it stays the best pair regardless of what the network's inputs are, doesn't it?

I think a neural network combined with Q-learning only becomes interesting if, in addition to the current prices, you feed the network some environmental variables that are directly related to price acceptance.
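Roughly something like this; the extra features, layer sizes and names here are only illustrative assumptions, not part of your problem description:

```python
import torch
import torch.nn as nn

class PricingQNetwork(nn.Module):
    """Q-network over current prices plus environmental features (e.g. seasonality,
    competitor prices) that influence price acceptance."""

    def __init__(self, n_env_features, n_actions):
        super().__init__()
        # input = [price_A, price_B] + environmental features
        self.net = nn.Sequential(
            nn.Linear(2 + n_env_features, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),   # one Q-value per possible price change
        )

    def forward(self, prices, env_features):
        x = torch.cat([prices, env_features], dim=-1)
        return self.net(x)
```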
