	---
	language: en
	tags:
	- machine-learning
	- reinforcement-learning
	- sokoban
	- planning
	license: apache-2.0
	---

	# Trained learned planners

	This repository contains the trained networks from the paper ["Planning behavior in a recurrent neural network that
	plays Sokoban"](https://openreview.net/forum?id=T9sB3S2hok), presented at the ICML 2024 Mechanistic Interpretability
	Workshop.

	To load and use the networks, see the [learned-planner
	repository](http://github.com/alignmentresearch/learned-planner) and, if needed, the [training
	code](https://github.com/AlignmentResearch/train-learned-planner).

	# Model details

	## Hyperparameters:

	See `model//cp_/cfg.json` for the hyperparameters that were used to train a particular run.
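
As a minimal sketch of inspecting those hyperparameters, the snippet below reads `cfg.json` from a checkpoint directory that has already been downloaded to disk. It assumes the file is plain JSON, as its extension suggests. The helper names `load_run_config` and `top_level_keys` are hypothetical and not part of the learned-planner code, and no particular keys are assumed to exist in the file.

```python
import json
from pathlib import Path


def load_run_config(checkpoint_dir):
    """Read cfg.json from a locally available checkpoint directory.

    Assumption: cfg.json is a plain JSON document. Raises FileNotFoundError
    if the directory does not contain a cfg.json file.
    """
    cfg_path = Path(checkpoint_dir) / "cfg.json"
    with cfg_path.open(encoding="utf-8") as f:
        return json.load(f)


def top_level_keys(cfg):
    """Return the sorted top-level keys, as a quick overview of what a run recorded."""
    return sorted(cfg.keys())
```

For example, `top_level_keys(load_run_config(path_to_checkpoint))` lists the top-level settings stored for that run, where `path_to_checkpoint` is wherever the checkpoint directory was saved locally.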

	## Best Models:

	The best model of each type is stored in the following directory:

	| Model | Directory | Parameter Count |
	|:-------|:-----------|:-----------------|
	| DRC(3, 3) | `drc33/bkynosqi/cp_2002944000` | 1,285,125 (1.29M) |
	| DRC(1, 1) | `drc11/eue6pax7/cp_2002944000` | 987,525 (0.99M) |
	| ResNet | `resnet/syb50iz7/cp_2002944000` | 3,068,421 (3.07M) |
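
One way to fetch only a single best checkpoint, rather than the whole repository, is sketched below with `snapshot_download` from the `huggingface_hub` package. The directories are copied from the table above. The repository id `AlignmentResearch/learned-planner` is an assumption read off this model's Hub page, and the short keys (`drc33`, `drc11`, `resnet`) and helper names are hypothetical conveniences. This only downloads files; building and running the networks is left to the learned-planner repository.

```python
from pathlib import Path

# Assumption: the Hub repository id of this model card.
REPO_ID = "AlignmentResearch/learned-planner"

# Directories copied from the "Best Models" table.
BEST_CHECKPOINTS = {
    "drc33": "drc33/bkynosqi/cp_2002944000",
    "drc11": "drc11/eue6pax7/cp_2002944000",
    "resnet": "resnet/syb50iz7/cp_2002944000",
}


def checkpoint_patterns(model_key):
    """Return glob patterns that select only the files of one best checkpoint."""
    if model_key not in BEST_CHECKPOINTS:
        raise ValueError(
            f"unknown model {model_key!r}; expected one of {sorted(BEST_CHECKPOINTS)}"
        )
    return [f"{BEST_CHECKPOINTS[model_key]}/*"]


def download_best_checkpoint(model_key, local_dir=None):
    """Download one best checkpoint and return the path to its directory.

    Requires network access and the huggingface_hub package, which is
    imported lazily so the helpers above work without it.
    """
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=checkpoint_patterns(model_key),
        local_dir=local_dir,
    )
    return Path(root) / BEST_CHECKPOINTS[model_key]
```

Calling `download_best_checkpoint("drc33")` would place the files in the local Hugging Face cache and return the checkpoint directory. Restricting the download with `allow_patterns` avoids also pulling the other runs, probes, and SAEs.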

	## Probes & SAEs:

	The trained probes and SAEs are stored in the `probes` and `saes` directories, respectively.

	## Training dataset:

	The networks were trained on the [Boxoban set of levels by DeepMind](https://github.com/google-deepmind/boxoban-levels).

	# Citation

	If you use any of these artifacts, please cite our work:

	```bibtex
	@inproceedings{garriga-alonso2024planning,
	title={Planning behavior in a recurrent neural network that plays Sokoban},
	author={Adri{\`a} Garriga-Alonso and Mohammad Taufeeque and Adam Gleave},
	booktitle={ICML 2024 Workshop on Mechanistic Interpretability},
	year={2024},
	url={https://openreview.net/forum?id=T9sB3S2hok}
	}
	```