commit files to HF hub
papers.csv +1 -1
@@ -1107,7 +1107,7 @@ Multiplier Bootstrap-based Exploration,"Runzhe Wan, Haoyu Wei, Branislav Kveton,
 Sequential Strategic Screening ,"Lee Cohen, Saeed Sharifi-Malvajerdi, Kevin Stangl, Ali Vakilian, Juba Ziani",,,,,,,,,
 Robust Subtask Learning for Compositional Generalization,"Kishor Jothimurugan, Steve Hsu, Osbert Bastani, Rajeev Alur",http://arxiv.org/abs/2302.02984,,https://huggingface.co/papers/2302.02984,,,,2302.02984,4,0
 Hindsight Learning for MDPs with Exogenous Inputs,"Sean R. Sinclair, Felipe Vieira Frujeri, Ching-An Cheng, Luke Marshall, Hugo Barbalho, Jingling Li, Jennifer Neville, Ishai Menache, Adith Swaminathan",http://arxiv.org/abs/2207.06272,,https://huggingface.co/papers/2207.06272,,,,2207.06272,9,0
-Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding,"Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian M Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova",http://arxiv.org/abs/2210.03347,,https://huggingface.co/papers/2210.03347,,,,2210.03347,10,
+Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding,"Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian M Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova",http://arxiv.org/abs/2210.03347,,https://huggingface.co/papers/2210.03347,,,,2210.03347,10,4
 Settling the Reward Hypothesis,"John Martin, Michael Bowling, David Abel, Will Dabney",http://arxiv.org/abs/2212.10420,,https://huggingface.co/papers/2212.10420,,,,2212.10420,4,0
 The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation,"Mark Rowland, Yunhao Tang, Clare Lyle, Remi Munos, Marc Bellemare, Will Dabney",http://arxiv.org/abs/2305.18388,,https://huggingface.co/papers/2305.18388,,,,2305.18388,6,0
 Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition,"Yash Chandak, Shantanu Thakoor, Zhaohan Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana Borsa",http://arxiv.org/abs/2305.00654,,https://huggingface.co/papers/2305.00654,,,,2305.00654,7,0