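Resource overview
The original English book: Reinforcement Learning: An Introduction, second edition (PDF), together with a Python implementation of its source code. Also includes a Chinese translation of chapters 2 through 10 of the first edition.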

Code snippet and file information
#######################################################################
# Copyright (C)                                                       #
# 2016 - 2018 Shangtong Zhang(zhangshangtong.cpp@gmail.com)           #
# 2016 Jan Hakenberg(jan.hakenberg@gmail.com)                         #
# 2016 Tian Jun(tianjun.cpp@gmail.com)                                #
# 2016 Kenta Shimada(hyperkentakun@gmail.com)                         #
# Permission given to modify the code as long as you keep this        #
# declaration at the top                                              #
#######################################################################

import numpy as np
import pickle

BOARD_ROWS = 3
BOARD_COLS = 3
BOARD_SIZE = BOARD_ROWS * BOARD_COLS


class State:
    def __init__(self):
        # the board is represented by an n * n array,
        # 1 represents a chessman of the player who moves first,
        # -1 represents a chessman of another player
        # 0 represents an empty position
        self.data = np.zeros((BOARD_ROWS, BOARD_COLS))
        self.winner = None
        self.hash_val = None
        self.end = None

    # compute the hash value for one state, it's unique
    def hash(self):
        if self.hash_val is None:
            self.hash_val = 0
            for i in self.data.reshape(BOARD_ROWS * BOARD_COLS):
                if i == -1:
                    i = 2
                self.hash_val = self.hash_val * 3 + i
        return int(self.hash_val)

    # check whether a player has won the game, or it's a tie
    def is_end(self):
        if self.end is not None:
            return self.end
        results = []
        # check rows
        for i in range(0, BOARD_ROWS):
            results.append(np.sum(self.data[i, :]))
        # check columns
        for i in range(0, BOARD_COLS):
            results.append(np.sum(self.data[:, i]))

        # check diagonals
        results.append(0)
        for i in range(0, BOARD_ROWS):
            results[-1] += self.data[i, i]
        results.append(0)
        for i in range(0, BOARD_ROWS):
            results[-1] += self.data[i, BOARD_ROWS - 1 - i]

        for result in results:
            if result == 3:
                self.winner = 1
                self.end = True
                return self.end
            if result == -3:
                self.winner = -1
                self.end = True
                return self.end

        # whether it's a tie
        sum_values = np.sum(np.abs(self.data))
        if sum_values == BOARD_ROWS * BOARD_COLS:
            self.winner = 0
            self.end = True
            return self.end

        # game is still going on
        self.end = False
        return self.end

    # @symbol: 1 or -1
    # put chessman symbol in position (i, j)
    def next_state(self, i, j, symbol):
        new_state = State()
        new_state.data = np.copy(self.data)
        new_state.data[i, j] = symbol
        return new_state

    # print the board
    def print_state(self):
        ...  # the snippet ends here; the method body is not shown
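The hash and end-of-game check in the snippet above rest on two small ideas: the board is read as a 9-digit base-3 number (with -1 mapped to the digit 2), and a win shows up as a row, column, or diagonal sum of +3 or -3. The following is a minimal standalone sketch of both ideas, not part of the original file. The function names `encode`, `decode`, and `line_sums` are hypothetical, and `decode` in particular is added only to illustrate why the hash is unique per board.

```python
import numpy as np

BOARD_ROWS = 3
BOARD_COLS = 3


def encode(board):
    """Base-3 encoding in the style of State.hash: 0 -> 0, 1 -> 1, -1 -> 2."""
    value = 0
    for cell in board.reshape(-1):
        digit = 2 if cell == -1 else int(cell)
        value = value * 3 + digit
    return value


def decode(value):
    """Hypothetical inverse of encode (an assumption, not in the source)."""
    digits = []
    for _ in range(BOARD_ROWS * BOARD_COLS):
        value, digit = divmod(value, 3)
        digits.append(-1 if digit == 2 else digit)
    # digits come out least-significant first, so reverse them
    return np.array(digits[::-1], dtype=float).reshape(BOARD_ROWS, BOARD_COLS)


def line_sums(board):
    """Row, column, and diagonal sums; +3 or -3 means a player has won."""
    return np.concatenate([
        board.sum(axis=1),                              # rows
        board.sum(axis=0),                              # columns
        [np.trace(board), np.trace(np.fliplr(board))],  # both diagonals
    ])


board = np.zeros((BOARD_ROWS, BOARD_COLS))
board[0, 0] = 1    # first player takes the top-left corner
board[1, 1] = -1   # second player takes the center
code = encode(board)
restored = decode(code)
```

Because every board maps to a distinct base-3 number and back, the hash can serve as a dictionary key for stored state values without collisions.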
Attributes       Size  Date       Time   Name
----------- ---------  ---------- -----  ----
    .......        40  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\.gitignore
    .......       148  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\.travis.yml
    .......     11292  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter01\tic_tac_toe.py
    .......      9151  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter02\ten_armed_testbed.py
    .......      3808  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter03\grid_world.py
    .......      7391  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter04\car_rental.py
    .......      8716  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter04\car_rental_synchronous.py
    .......      2445  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter04\gamblers_problem.py
    .......      3436  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter04\grid_world.py
    .......     13167  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter05\blackjack.py
    .......      1814  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter05\infinite_variance.py
    .......      9355  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter06\cliff_walking.py
    .......      4269  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter06\maximization_bias.py
    .......      6574  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter06\random_walk.py
    .......      4018  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter06\windy_grid_world.py
    .......      4249  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter07\random_walk.py
    .......      1627  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter08\expectation_vs_sample.py
    .......     23252  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter08\maze.py
    .......      4892  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter08\trajectory_sampling.py
    .......     15793  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter09\random_walk.py
    .......      4262  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter09\square_wave.py
    .......      9605  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter10\access_control.py
    .......     13681  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter10\mountain_car.py
    .......     11839  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter11\counterexample.py
    .......     12140  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter12\mountain_car.py
    .......      9679  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter12\random_walk.py
    .......      8012  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\chapter13\short_corridor.py
    .......     36003  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\images\example_13_1.png
    .......    238133  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\images\example_6_2.png
    .......     31488  2019-03-14 04:56  《強化學習導論》第二版源代碼(python)\images\example_8_4.png
............ (67 more file entries omitted here)