Q-learning, MATLAB

Size: 4 KB

File type: .m

Coins: 1

Downloads: 0

Published: 2021-05-16
Language: Matlab
Tags: Q-learning

Resource description

Q-learning; very helpful. Introduces the basic usage of Q-learning.

Code snippet and file information

%% Q-learning with epsilon-greedy exploration Algorithm for Deterministic Cleaning Robot V1
%  Matlab code : Reza Ahmadzadeh
%  email: reza.ahmadzadeh@iit.it
%  March-2014
%% The deterministic cleaning-robot MDP
% A cleaning robot has to collect a used can and also has to recharge its
% batteries. The state describes the position of the robot and the action
% describes the direction of motion. The robot can move to the left or to
% the right. The first (1) and the final (6) states are the terminal
% states. The goal is to find an optimal policy that maximizes the return
% from any initial state. Here the Q-learning epsilon-greedy exploration
% algorithm (in Reinforcement learning) is used.
% Algorithm 2-3 from:
% @book{busoniu2010reinforcement,
%   title={Reinforcement learning and dynamic programming using function approximators},
%   author={Busoniu, Lucian and Babuska, Robert and De Schutter, Bart and Ernst, Damien},
%   year={2010},
%   publisher={CRC Press}
% }
% note: the code uses 1-based indexing instead of 0-based indexing
%
% V1 the initial evaluation of the algorithm
%
%% this is the main function including the initialization and the algorithm
% the inputs are: initial Q matrix, set of actions, set of states,
% discounting factor, learning rate, exploration probability,
% number of iterations, and the initial state.
function qlearning
% learning parameters
gamma = 0.5;    % discount factor  % TODO : we need learning rate schedule
alpha = 0.5;    % learning rate    % TODO : we need exploration rate schedule
epsilon = 0.9;  % exploration probability (1-epsilon = exploit / epsilon = explore)
% states
state = [0,1,2,3,4,5];
% actions
action = [-1,1];
% initial Q matrix
Q = zeros(length(state),length(action));
K = 1000;     % maximum number of the iterations
state_idx = 3;  % the initial state to begin from
%% the main loop of the algorithm
for k = 1:K
    disp(['iteration: ' num2str(k)]);
    r=rand; % get 1 uniform random number
    x=sum(r>=cumsum([0 1-eps
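
The snippet shown on this page is cut off partway through the main loop. As a rough illustration only, the following is a minimal sketch of how a tabular Q-learning loop with epsilon-greedy exploration could look for this MDP. It is not the author's remaining code. The reward values (5 for reaching the last state, 1 for reaching the first, 0 otherwise), the restart at the initial state after a terminal state, and the names startIdx, act_idx, next_idx, target, isTerminal and greedy are all assumptions introduced for the example; only gamma, alpha, epsilon, state, action, Q, K and state_idx come from the snippet.

```matlab
% Minimal sketch: tabular Q-learning with epsilon-greedy exploration on the
% deterministic cleaning-robot MDP. Reward values are ASSUMED, not from the snippet.
rng(0);                          % fix the seed so runs are repeatable

gamma   = 0.5;                   % discount factor (from the snippet)
alpha   = 0.5;                   % learning rate (from the snippet)
epsilon = 0.9;                   % exploration probability (from the snippet)
state   = [0, 1, 2, 3, 4, 5];    % robot positions
action  = [-1, 1];               % move left / move right
Q       = zeros(length(state), length(action));
K       = 1000;                  % number of iterations
startIdx  = 3;                   % 1-based initial state index, as in the snippet
state_idx = startIdx;

for k = 1:K
    % epsilon-greedy: explore with probability epsilon, otherwise exploit
    if rand < epsilon
        act_idx = randi(length(action));
    else
        [~, act_idx] = max(Q(state_idx, :));
    end

    % deterministic transition, clamped to the valid index range
    next_idx = min(max(state_idx + action(act_idx), 1), length(state));

    % ASSUMED rewards: 5 for reaching the last state, 1 for the first, 0 otherwise
    if next_idx == length(state)
        reward = 5;
    elseif next_idx == 1
        reward = 1;
    else
        reward = 0;
    end

    % terminal states contribute no future value to the target
    isTerminal = (next_idx == 1) || (next_idx == length(state));
    if isTerminal
        target = reward;
    else
        target = reward + gamma * max(Q(next_idx, :));
    end

    % Q-learning update: move the estimate a fraction alpha toward the target
    Q(state_idx, act_idx) = Q(state_idx, act_idx) ...
        + alpha * (target - Q(state_idx, act_idx));

    % ASSUMED episode handling: restart from the initial state after a terminal state
    if isTerminal
        state_idx = startIdx;
    else
        state_idx = next_idx;
    end
end

[~, greedy] = max(Q, [], 2);     % index of the greedy action in each state
disp(Q);
disp(action(greedy));
```

The rows of Q for the two terminal states are never updated in this sketch, because the episode restarts as soon as one of them is reached, so the greedy action printed for those rows carries no meaning.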
