Spaces:
Running
Running
Creating a custom task | |
______________________ | |
Creating a custom task in `BrowserGym` can be done easily by inheriting from the `AbstractBrowserTask` class. | |
Let's start with an example, we will build a task that starts from the Google Search page and asks for the Eiffel Tower Wikipedia page. | |
* **Goal**: Search for 'Eiffel Tower' Wikipedia page. | |
* **Reward**: Gets reward = 1 if reaches the expected Wikipedia page on Eiffel Tower, else gets 0. | |
.. code-block:: python | |
from typing import Tuple | |
import playwright.sync_api | |
from browsergym.core.task import AbstractBrowserTask | |
class SampleTask(AbstractBrowserTask): | |
def __init__(self, seed: int) -> None: | |
super().__init__(seed) | |
def get_task_id(cls): | |
return "sample_task" | |
First, let's setup the task. To do this, we need to implement the `setup()` function. The starting point is *https://www.google.com* page and the goal is *"Search for 'Eiffel Tower' Wikipedia page"*. | |
.. code-block:: python | |
class SampleTask(AbstractBrowserTask): | |
# ... | |
# Code above | |
# ... | |
def setup(self, page: playwright.sync_api.Page) -> Tuple[str, dict]: | |
"""Set up everything needed to execute the task.""" | |
page.goto("https://www.google.com", timeout=10000) | |
goal = "Search for 'Eiffel Tower' Wikipedia page." | |
info = {} | |
return goal, info | |
Next, we need to compute a reward. For this, we'll implement our validation criteria in the `validate()` function. In our case, we consider the task being completed when the Eiffel Tower Wikipedia page is reached. Else it's a fail. | |
.. code-block:: python | |
class SampleTask(AbstractBrowserTask): | |
# ... | |
# Code above | |
# ... | |
def validate( | |
self, page: playwright.sync_api.Page, chat_messages: list[str] | |
) -> Tuple[float, bool, str, dict]: | |
"""Compute reward based on reaching final URL.""" | |
if page.url == "https://en.wikipedia.org/wiki/Eiffel_Tower": | |
return 1.0, True, "Task completed", {} | |
else: | |
return 0.0, False, "", {} | |
We can also implement the code for completing the task, it's an oracle (a.k.a. cheat) version. For this, we'll fill out the `cheat()` function. | |
.. code-block:: python | |
class SampleTask(AbstractBrowserTask): | |
# ... | |
# Code above | |
# ... | |
def cheat(self, page: playwright.sync_api.Page, chat_messages: list[str]) -> None: | |
"""Solve the task in a single step using a hard-coded Playwright solution.""" | |
page.get_by_text("Search").fill("Eiffel Tower") | |
page.get_by_text("Google Search").click() | |
page.get_by_text("Eiffel Tower - Wikipedia").click() | |
Finally, the `teardown()` function. This function allows to clean resources before closing the enviroment. In our case, nothing need to be done, so we will leave it empty. | |
.. code-block:: python | |
class SampleTask(AbstractBrowserTask): | |
# ... | |
# Code above | |
# ... | |
def teardown(self) -> None: | |
# Nothing to do for this task. | |
pass | |
Our folder structure should look like the following: | |
.. code-block:: bash | |
. | |
|ββ tasks | |
| βββ __init__.py | |
| βββ sample_task.py | |
βββ run_task.py | |
Now we should register the task in the gym environment using the following code in the `__init__.py` of your package: | |
.. code-block:: python | |
from browsergym.core.registration import register_task | |
from .sample_task import SampleTask | |
register_task(id=SampleTask.get_task_id(), task_class=SampleTask) | |
Now that the task is registered it can be called via this code that you can put in `run_task.py` file: | |
.. code-block:: python | |
import gymnasium as gym | |
import tasks # will register the gym environment | |
env = gym.make("browsergym/sample_task") | |
obs, info = env.reset() | |
done = False | |
while not done: | |
action = "noop()" | |
obs, reward, terminated, truncated, info = env.step(action) | |
print(f"Reward: {reward}, Done: {done}, Info: {info}") | |