UC Berkeley 2017: Deep Reinforcement Learning course - Summary of policy gradients and temporal difference methods (Schulman) (Part 3) - NetEase Open Course


Summary of policy gradients and temporal difference methods (Schulman) (Part 3)

1022 plays



Playlist (57 videos)

[1] Introduction and course overview (Levine, Finn, Schulman) (Part 1), 26:11, 4914 plays
[2] Introduction and course overview (Levine, Finn, Schulman) (Part 2), 26:14, 1581 plays
[3] Introduction and course overview (Levine, Finn, Schulman) (Part 3), 26:08, 1283 plays
[4] Supervised learning and decision making (Levine) (Part 1), 24:06, 1853 plays
[5] Supervised learning and decision making (Levine) (Part 2), 24:07, 1242 plays
[6] Supervised learning and decision making (Levine) (Part 3), 24:03, 702 plays
[7] Optimal control and planning (Levine) (Part 1), 21:06, 1585 plays
[8] Optimal control and planning (Levine) (Part 2), 21:13, 607 plays
[9] Optimal control and planning (Levine) (Part 3), 21:03, 526 plays
[10] Learning dynamical system models from data (Levine) (Part 1), 27:27, 1204 plays
[11] Learning dynamical system models from data (Levine) (Part 2), 27:35, 1406 plays
[12] Learning dynamical system models from data (Levine) (Part 3), 27:22, 790 plays
[13] Learning policies by imitating optimal controllers (Levine) (Part 1), 23:05, 598 plays
[14] Learning policies by imitating optimal controllers (Levine) (Part 2), 23:08, 1451 plays
[15] Learning policies by imitating optimal controllers (Levine) (Part 3), 22:58, 1165 plays
[16] RL definitions, value iteration, policy iteration (Schulman) (Part 1), 17:19, 1145 plays
[17] RL definitions, value iteration, policy iteration (Schulman) (Part 2), 17:22, 1301 plays
[18] RL definitions, value iteration, policy iteration (Schulman) (Part 3), 17:18, 517 plays
[19] Reinforcement learning with policy gradients (Schulman) (Part 1), 21:48, 927 plays
[20] Reinforcement learning with policy gradients (Schulman) (Part 2), 21:54, 890 plays
[21] Reinforcement learning with policy gradients (Schulman) (Part 3), 21:42, 860 plays
[22] Learning Q-functions: Q-learning, SARSA, and others (Schulman) (Part 1), 25:50, 1353 plays
[23] Learning Q-functions: Q-learning, SARSA, and others (Schulman) (Part 2), 25:53, 1128 plays
[24] Learning Q-functions: Q-learning, SARSA, and others (Schulman) (Part 3), 25:42, 830 plays
[25] Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc (Part 1), 26:47, 641 plays
[26] Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc (Part 2), 26:55, 854 plays
[27] Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc (Part 3), 26:41, 1029 plays
[28] Advanced topics in imitation and safety (Finn) (Part 1), 27:53, 1480 plays
[29] Advanced topics in imitation and safety (Finn) (Part 2), 27:56, 1371 plays
[30] Advanced topics in imitation and safety (Finn) (Part 3), 27:47, 1002 plays
[31] Inverse RL: acquiring objectives from demonstration (Finn) (Part 1), 24:47, 1459 plays
[32] Inverse RL: acquiring objectives from demonstration (Finn) (Part 2), 24:48, 1195 plays
[33] Inverse RL: acquiring objectives from demonstration (Finn) (Part 3), 24:47, 1225 plays
[34] Advanced policy gradients: natural gradient and TRPO (Schulman) (Part 1), 28:05, 1228 plays
[35] Advanced policy gradients: natural gradient and TRPO (Schulman) (Part 2), 28:08, 755 plays
[36] Advanced policy gradients: natural gradient and TRPO (Schulman) (Part 3), 28:02, 917 plays
[37] Policy gradient variance reduction and actor-critic algorithms (Schulman) (Part 1), 26:55, 1186 plays
[38] Policy gradient variance reduction and actor-critic algorithms (Schulman) (Part 2), 27:00, 1417 plays
[39] Policy gradient variance reduction and actor-critic algorithms (Schulman) (Part 3), 26:51, 911 plays
[40] Summary of policy gradients and temporal difference methods (Schulman) (Part 1), 24:06, 920 plays
[41] Summary of policy gradients and temporal difference methods (Schulman) (Part 2), 24:10, 1446 plays
[42] Summary of policy gradients and temporal difference methods (Schulman) (Part 3), queued for playback, 1022 plays
[43] The exploration problem (Schulman) (Part 1), 27:18, 678 plays
[44] The exploration problem (Schulman) (Part 2), 27:18, 894 plays
[45] The exploration problem (Schulman) (Part 3), 27:17, 727 plays
[46] Parallel RL algorithms, open problems and challenges in deep reinforcement (Part 1), 26:14, 1485 plays
[47] Parallel RL algorithms, open problems and challenges in deep reinforcement (Part 2), 26:22, 806 plays
[48] Parallel RL algorithms, open problems and challenges in deep reinforcement (Part 3), 26:11, 1507 plays
[49] Transfer in Reinforcement Learning (Finn) (Part 1), 28:18, 1371 plays
[50] Transfer in Reinforcement Learning (Finn) (Part 2), 28:18, 634 plays
[51] Transfer in Reinforcement Learning (Finn) (Part 3), 28:16, 984 plays
[52] Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z (Part 1), 25:24, 1011 plays
[53] Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z (Part 2), 25:29, 934 plays
[54] Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z (Part 3), 25:17, 956 plays
[55] Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar (Part 1), 25:39, 886 plays
[56] Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar (Part 2), 25:40, 544 plays
[57] Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar (Part 3), 25:33, 838 plays
