# Async Website Loader
This loader is an asynchronous web scraper that fetches static web pages and converts their HTML to plain text.
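To make the "HTML to text" step concrete, here is a minimal sketch using only the Python standard library. It is not the loader's actual implementation, which may rely on a different conversion library; `TextExtractor` and `html_to_text` are hypothetical names used for illustration only.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and skips script/style content."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering a tag whose content should not appear in the text output.
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
)
print(html_to_text(sample))
```

The sketch only handles markup that is already present in the fetched HTML, which is why a loader of this kind is suited to static pages rather than pages rendered by JavaScript.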
## Usage
To use this loader, pass in a list of URLs.
```python
from llama_index.readers.web.async_web.base import AsyncWebPageReader
# for jupyter notebooks uncomment the following two lines of code:
# import nest_asyncio
# nest_asyncio.apply()
loader = AsyncWebPageReader()
documents = loader.load_data(urls=["https://google.com"])
```
### Jupyter Notebook asyncio Issues
If you get a `RuntimeError: asyncio.run() cannot be called from a running event loop`, uncomment the two `nest_asyncio` lines in the usage example above. For background, see this [solution here](https://saturncloud.io/blog/asynciorun-cannot-be-called-from-a-running-event-loop-a-guide-for-data-scientists-using-jupyter-notebook/#option-3-use-nest_asyncio).
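The error comes from calling `asyncio.run()` while an event loop is already running, which is the situation inside a Jupyter cell. The following standard-library sketch reproduces it outside a notebook; `inner` and `simulate_notebook_cell` are hypothetical names, and the assumption that the loader calls `asyncio.run()` internally is inferred from the error message rather than confirmed here.

```python
import asyncio


async def inner():
    return "done"


async def simulate_notebook_cell():
    # This coroutine already runs inside an event loop, much like code
    # executed in a Jupyter cell.
    coro = inner()
    try:
        # Starting a second loop from within a running one is not allowed.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


message = asyncio.run(simulate_notebook_cell())
print(message)
```

`nest_asyncio.apply()` patches asyncio so that such nested calls are permitted, which is why the usage example suggests it for notebooks.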
### Old Usage
Use this syntax for earlier versions of llama_index, where llama_hub loaders were loaded via a separate download process:
```python
from llama_index import download_loader
AsyncWebPageReader = download_loader("AsyncWebPageReader")
loader = AsyncWebPageReader()
documents = loader.load_data(urls=["https://google.com"])
```