[NEW] Internet access for Assistants

#385
by nsarrazin HF staff - opened
Hugging Chat org
•
edited Mar 25, 2024

Hey! We have just released an update to HuggingChat Assistants that allows you to connect them to the Internet to get more relevant and interactive answers. When you create or edit an Assistant, you will now see an option for Internet access. It can have four settings: enabled, domain search, specific links, and disabled.

  • Enabled is the same websearch we currently use in HuggingChat. Use it for generic assistants that can have conversations about many domains.
  • Domains search allows you to specify a domain name, or even part of a website, that the web search will crawl to find relevant information. Useful if you want an Assistant that can search and use content from your website or a particular news outlet you like, for example.
  • Specific Links lets you specify a direct URL to a web page or plain text document that you want to pass to the Assistant. This is very useful for talking to your text documents, adding extra context to your role-playing game, or passing any arbitrary data from your web server to the Assistant.
  • Dynamic links allow the use of template variables such as {{url=https://example.com/path}} to insert dynamic content into your prompt by making GET requests to the specified URLs on each inference.
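
As a rough illustration of the Dynamic links idea, the sketch below resolves {{url=...}} placeholders in a prompt by fetching each URL with a GET request. This is not the actual chat-ui implementation; the names `resolve_dynamic_links` and `http_get` and the injectable `fetch` parameter are assumptions made for the example.

```python
import re
import urllib.request
from typing import Callable

# Matches placeholders of the form {{url=https://example.com/path}}
URL_TEMPLATE = re.compile(r"\{\{\s*url=(.+?)\s*\}\}")


def http_get(url: str) -> str:
    """Fetch a URL with a plain GET request and return the body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def resolve_dynamic_links(prompt: str, fetch: Callable[[str], str] = http_get) -> str:
    """Replace every {{url=...}} placeholder with the content fetched from that URL.

    `fetch` is injectable (a hypothetical design choice) so the substitution
    can be tried out without network access.
    """
    return URL_TEMPLATE.sub(lambda match: fetch(match.group(1)), prompt)
```

Calling `resolve_dynamic_links(system_prompt)` once per inference would mirror the "GET request on each inference" behaviour described above; passing a stub as `fetch` makes it easy to check the substitution offline.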

Enabled

This is equivalent to enabling the web search toggle in HuggingChat. The model generates a query from your question, uses it to search the web, and parses the results to improve the answer. You can't restrict which domains or links are used in this mode.

Domains search

In this mode you can restrict the domains used in the web search. This is equivalent to appending site:example.com to the web query. For example, if you want to make an assistant that only uses Wikipedia as a source:

You can also be more specific, for example you could have an assistant that only knows about the docs for our library diffusers.
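
To make the site: equivalence concrete, here is a minimal sketch of how a query could be restricted to a set of domains. The function name `restrict_query` is hypothetical, and joining several domains with OR is an assumption based on common search-engine syntax, not something confirmed for HuggingChat.

```python
def restrict_query(query: str, domains: list[str]) -> str:
    """Append site: filters so a search engine only returns results from the given domains."""
    if not domains:
        # No restriction requested: behave like the plain "Enabled" mode.
        return query
    # Assumption: multiple domains are combined with the OR operator.
    site_filter = " OR ".join(f"site:{domain}" for domain in domains)
    return f"{query} {site_filter}"
```

For instance, `restrict_query("schedulers", ["huggingface.co/docs/diffusers"])` yields a query limited to one part of a website, which is the "more specific" case mentioned above.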

Specific Links

This allows you to directly specify up to 10 URLs that will be added directly to the context for better results. We support both HTML and plain text content such as markdown! For example, if you want your assistant to know about the chat-ui repository README file, you can add a link to the markdown file:

Or if you want an assistant that knows about the current top news headlines:

This is a very flexible setting because you have complete control over what is passed to the context. For example, you could create your own web server in a Space that returns a different country name every day and use it to make a country guessing game where every user gets the same new country every day. Also note that long web pages are cut to fit the maximum context the Assistant can use, so shorter pages may perform better.
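
The country-of-the-day idea could be sketched roughly as follows, using only the Python standard library. The country list, the names `country_for`, `CountryHandler` and `serve`, and the port 7860 are all assumptions for illustration; any server that returns plain text at a stable URL would do.

```python
import datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list; any fixed list of countries works.
COUNTRIES = ["Brazil", "Canada", "Japan", "Kenya", "Norway"]


def country_for(day: datetime.date) -> str:
    """Pick a country deterministically from the date, so every user gets the same one that day."""
    return COUNTRIES[day.toordinal() % len(COUNTRIES)]


class CountryHandler(BaseHTTPRequestHandler):
    """Answers every GET request with today's country as plain text."""

    def do_GET(self) -> None:
        body = country_for(datetime.date.today()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 7860) -> None:
    """Run the server; the default port is an assumption, adjust it for your host."""
    HTTPServer(("0.0.0.0", port), CountryHandler).serve_forever()
```

Running `serve()` and adding the server's URL under Specific Links would give the Assistant the same short plain-text answer for every user on a given day.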

Safety & Trust

Because web-connected assistants use an external source of information, you should never trust any information without verifying it yourself, and you should always manually check which source they use; this information is publicly available on the Assistant Settings page. Web-connected assistants have a special icon so you can easily identify them.

Let us know what you think of the feature, and feel free to share your new assistants in this thread! 🤗

It doesn't look like it's been pushed to huggingface.co/chat, is this a bug?

Hugging Chat org

@EveryPizza Feel free to check again! Should be there now

Hugging Chat org

this is incredible 🔥

Love it! Great job team!

Can the specific domains option be used in some clever way with a cyoa game, where the user mostly inputs numbers?
I tried it here, but web fails: https://hf.co/chat/assistant/65c80b2288d76ebc0fc357ca

I'm afraid that the Domains search option is not working correctly

I have just created an assistant for the records in the President John F. Kennedy Assassination Records Collection

https://hf.co/chat/assistant/65f4604731b538d22b432df7

I need it to answer based only on material found on archives.gov subdomains but, alas, it returns text from Reddit, Wikipedia and other sources

Hugging Chat org
•
edited Mar 15, 2024

@emilios can you share a conversation where this doesn't work for you? because for me it works: https://hf.co/chat/r/DN9CIVy

Can you check this please?

https://hf.co/chat/r/lWWJp8E

Hugging Chat org

Ok interesting seems like there's indeed an issue, let me have a look!

Well the good news is that it did not happen again the last 5 hours :)

Hugging Chat org

I pushed a fix for it today, shouldn't happen again! @emilios

Is there a way, in the system prompt or otherwise, to stop web search from being used for every response? I'm testing a coding helper: if its initial response used web search and I reply "continue" (or anything else meant to make it carry on with the process it outlined), it searches the web for "continue" instead of continuing.

  • Note: I have gotten it to avoid this issue and keep responding in the context of the project implementation, but it still searches the web every time, which would be nice to be able to toggle.
Hugging Chat org

Good point @Csplk we'll see what we can do.

Thanks

Awesome, it just works perfectly !
I have just automated my assistant with live data.
I'll make a video probably today to celebrate this.

It works perfectly well:

There seems to be an occasional problem, like a content mismatch, when using Dynamic Prompt.

Sometimes the results are all right, but other times they are confusing.
I'm guessing it either hallucinates or sometimes cannot access the web.

I've tried {{url=https://www.imdb.com/title/tt14539740/}} Godzilla x Kong
but I got https://www.imdb.com/title/tt13320622/ Lost City

https://hf.co/chat/r/giJJUN8

And it is not related to the LLM you choose

Hugging Chat org
•
edited Apr 5, 2024

@emilios this might not be super clear from the current layout but dynamic prompt only injects web pages in the assistant system instructions. You would have to put your URL in your system instructions for them to be picked up.

Thank you for your answer. Since we can already give a specific URL for dynamic content in the instructions, wouldn't it be nice if something like {{url=}} could also be used in the chat prompt, to parse the text of a different URL whenever needed?

@nsarrazin
image.png

I think you are using google json api and that's the reason it stops working.

Hugging Chat org

Sry @KingNish it's back working!

Hi all, and thank you HF team for making this awesome feature available. After much experimentation and many failed attempts, it seems I finally managed to make an assistant that works for finding relevant content on my website with the "domains search" option. For anyone interested in the same use case, you can check my system prompt and conversation starters. I have observed that when I ask it not to generate URLs as part of the response (the links are generated by the app) and send very simple prompts ("Show me content about..."), it does what is expected most of the time and provides links to relevant pages and articles along with a coherent text intro:

Talking to Chatbots Web Browser (assistant)

For general web browsing, here's another assistant I built and works fine so far: Web Browsing Chatbot (assistant)

A potential improvement for this feature would be to connect assistants with APIs, so content retrieval could be refined (for example, leveraging WordPress search on blogs built with WP), or to incorporate any sort of interaction with a website that can be defined with an API.

I have switched on Command R and it also rocks !!!

Same, it's even better than Mixtral. (Mixtral sometimes repeats the same thing, or sometimes the system prompt. I told it not to do this, but it does not stop permanently. When I switched to Command R+, it worked.)

I was on Nous ;-p

Nous was better than Mixtral at creative work, and Mixtral is better at specific text generation work.
So, it depends on your type of work.

But it also gives some wrong information
image.png

like saying it was made by OpenAI, and many more.

Is it possible to pass the user's query as an argument to "Specific links"?

Yes, use dynamic links.

But what is the syntax to send user input to the dynamic links? {{prompt}} does not work

Hugging Chat org

We are currently working on something to do this 👀

Could you add a button to enable or disable internet search if the chatbot allows it, as in your default models? This would prevent the chatbot from automatically using internet search every time a prompt is sent.

If the resource is a URL targeting an image, would it be possible to display it directly in the chat? Currently I can get the URL, but I have to click on it to see the image in another tab.

Could you add a button to enable or disable internet search if the chatbot allows it, as in your default models? This would prevent the chatbot from automatically using internet search every time a prompt is sent.

Just state in the system prompt when to use web search and when not to.

If the resource is a URL targeting an image, would it be possible to display it directly in the chat? Currently I can get the URL, but I have to click on it to see the image in another tab.

Use url in format

![](url)

example - ![](https://source.unsplash.com/random/?photo)

but what is the syntax in order to send user input to the dynamic links? {{prompt}} does not work.

State in the system prompt that {prompt} is the prompt given by the user.

Here is an example - https://hf.co/chat/r/-XIrb_g

System prompt: -
Answer the query given by user from this site:
https://en.wikipedia.org/wiki/{prompt}
Here {prompt} is prompt given by user.
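
For the Wikipedia pattern in this system prompt, the user input would need to become a valid article URL. A minimal sketch of that step, assuming English Wikipedia's convention of underscores for spaces; the function name `wikipedia_url` is hypothetical, and this is not how HuggingChat handles {prompt} internally:

```python
from urllib.parse import quote


def wikipedia_url(user_prompt: str) -> str:
    """Build an English Wikipedia article URL from free-form user input."""
    # Wikipedia article paths use underscores where titles have spaces.
    title = user_prompt.strip().replace(" ", "_")
    # Percent-encode everything else that is not safe in a URL path segment.
    return "https://en.wikipedia.org/wiki/" + quote(title, safe="_()")
```

With this helper, the input "Hugging Face" maps to https://en.wikipedia.org/wiki/Hugging_Face.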

@KingNish : I'm going to try to tune the SYSTEM prompt to ask... when an image url is found, to answer so

Can you share a link to your bot?

@rastadidi Are you trying to make a bot that can identify objects in an image, or one that answers the question asked by the user?

@KingNish :

Indeed, I would like it to enhance the answer with images
Chatbot : https://hf.co/chat/assistant/65bd4d47a16aaa191b5b501d

(the image url is part of a csv)

@rastadidi see this https://hf.co/chat/r/OjN_g8M

Image URLs were not working before because of this
image.png

I shifted it to the Hugging Face database and it started working

But what is the syntax to send user input to the dynamic links? {{prompt}} does not work

Now Dynamic links are working.

@KingNish , you mentioned that we should 'Just conclude in System prompt when to use web search and when not,' as you can see in your response here: https://huggingface.co/spaces/huggingchat/chat-ui/discussions/385#661e39348d2ef5cea2837fd1. I was suggesting adding this button to our customized chatbot because if we enable this option (Web Research), all responses will come solely from web research. Unlike the main HuggingChat, where we have this button as an option, we can activate it whenever we choose to.

@IRZOUNI OOo, something like this
image.png
+1 for this feature

@KingNish Definitely yes!!! that would be very much appreciated. Thank you.

Can you open the prompt customization of the Internet search model? I have been trying to rewrite papers with cR+ recently, but it can only make up fake references, and it often gets stuck after enabling Internet search.

Any update on dynamic links where API URL with arguments is provided?

Is anyone else having problems with web search not working? It just says "generation failed"
Screenshot 2024-04-19 at 7.17.32 PM.png

I think that there should also be the option to enable or disable the web search for each specific query within the chat just like in normal (non-assistant) chats.

Hugging Chat org

Or integrate it as a tool that can be called by the model (or not) by himself? we are working on something like this.

I think they can both be helpful. Personally I like the toggle option because I have more control over it. But I also get that the tool use might make the experience a little more fluent (if it works well enough).

Or integrate it as a tool that can be called by the model (or not) by himself? we are working on something like this.

This is the only correct solution. The existing search essentially lets another model guess what we want to search for, which is far less accurate than letting the model handle the function call by itself. Function calls can also easily add mathematics and other abilities to the model, at a low coding cost.

You mean... we could build then embed third party custom tools ?... "à la langchain" ? That will be awesome !

The assistant should be able to converse and interact with the user, only doing an internet search when it is useful. If I greet the assistant, for example, I do not want my greeting to trigger an internet search. Essentially, the assistant should use the internet for information retrieval, and it should know when an internet search is necessary.

Hugging Chat org
•
edited Apr 30, 2024

don't tell anyone but this is coming @Joseph717171 (btw sometimes I want to decide to use websearch or not by myself vs always letting the model decide)

It would be great to add a feature to custom assistants, which have been given internet capabilities, to disable/enable them with a button, as is the case with the models on the page https://huggingface.co/chat/models.

I also think it would be good to disable the option that hides/unhides the copy to clipboard, retry buttons. On a PC, it is sometimes difficult to get these buttons to appear and to click on them to copy to the clipboard.

Thanks for everything you do, the Hugging Face team, its approach, its free open source offering. I love you guys.

don't tell anyone but this is coming @Joseph717171 (btw sometimes I want to decide to use websearch or not by myself vs always letting the model decide)

I love lamp. And thank you for your service, amen.

I pushed a fix for it today, shouldn't happen again! @emilios

Try this.... https://hf.co/chat/r/NaBll9k

Just change the icon, it's ugly. The interface is very beautiful, but the icon/logo is ugly. That's my only remark.

The AI assistant searches the web even when web search is disabled

See, everyone builds their own web search tool, and most likely it is free, like DuckDuckGo search. But there is a catch. Take a prompt like "search for a specific product on various websites and compare their prices": here the model has to call the search function more than once, collect the product's prices from various e-commerce websites, study them, and then compare them. That is not possible at the moment; as far as I know, no AI search agent can do this.

Remember, this is not agentic work. It is more about improving the function-calling ability of the LLM, so that it can call the search function to study a topic, not just to provide an answer.

None of the search functionalities (web, domain, specific links) work for my assistants anymore. Here is an example: https://hf.co/chat/r/my6uN3Y

Hugging Chat org

Thanks for bringing this up @HannaLueschow ! I just deployed a fix for this, it should be live in about 10 minutes or so. Let me know if that resolved the issue for you.

Thanks for the quick response! But it doesn't seem to have worked (yet)

Edit: It's back up

Is it not possible to just include the toggle web search on and off button for assistants?

A thought: I would like my assistant to have access to do web search, but only on demand. Meaning, usually it won't have access, unless I'm telling him to retrieve some information (e.g., searching for some articles on google scholar). Additionally, I am not sure if assistants that are not configured to search the internet are capable of retrieving information from links they are provided with.

I assume that since this hasn't been added yet, it must be non-trivial. I'm guessing it has to do with the way models are initialized, and that their chat thread is likely affixed to the instance of the conversation you are in. To toggle the web search off, you'd have to reinitialize the model, which is easy to do when you don't have a single instance of a model carrying the world on its shoulders. Or am I completely off and dead wrong? I don't know, I am just learning to break scripts :D

Codewise, this is not easy to implement. You mark flag or a checkbox or something and it's done. I guess that doing this within an assistant is not that easy. However, I think that this is highly valuable. For example, I have an assistant that I want it to provide me links during a discussion. Most of the time I don't need it to search the web. At the moment it is either searching the web with every message, or not at all. Searching the web with every message costs money and CO2 emission. So looking into it is worthwhile.

Yes, we all want the assistant to do a better job, but she's actually kind of a bitch if you get to know her. I used to live in the same apartment as her and I could hear her every night just laughing into a bag of popcorn melodramatically in such a way that it would amplify her droning cackle 2-3x.

I create assistants to summarise contents from websites. However, I need the summary to be specific only to the content of the target website. The Web Search feature would search for and add in related content from other sources. As such, it would be useful to include the URL Fetcher tool as an option.

Hi, I created a new Assistant (https://hf.co/chat/assistant/66a8d1fbf1ee8893bb66a6ec) and I was able to get this to work. I want to share the link with others. Do you need to login to HF for this to work? I tried to use the link in Incognito mode. I see my new Assistant but when I click Start Chatting, it goes to the generic Assistant.

Is it possible to know the privacy of some particular Assistants (e.g. BreakBot, https://hf.co/chat/assistant/66017fca58d60bd7d5c5c26c)? It seems this particular one has an option where it says "This assistant uses the websearch" as well as "This Assistant has dynamic prompts enabled and can make requests to external services.", however, it's not possible for me to see what external services it's using and it doesn't seem to be capable of accessing the internet.
I see Clone of Hugging Face CTO (https://hf.co/chat/assistant/65b26737e9ccc6d0853dc16f) uses the same kind of Dynamic Prompting for requests.
Could this mean it would get conversation information or queries? Or is it more of a default option that most Assistants have?

Allows users to custom search views

Could this, however, indicate that queries are sent to an external server or are privacy limitations set in place?

Hey, internet access for assistants is working very well, but I would like to know how I can instruct the assistant to search in a Git repo. I mean, when I use the llama 3.1 70b model with the "Url Fetch" feature and ask something about the codebase like:

"Do you know that this library supports any HashMap structure? Please search in codeberg.org/user/repo."

It's smart enough to perform a search in the codebase like: https://codeberg.org/user/repo/search?q=HashMap&ref=master

And pull out the relevant information. But I'm not able to instruct the same way for the assistant I've created.
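
For reference, the search URL the model came up with above can be built mechanically. A minimal sketch, assuming the same URL pattern holds for other repositories; the function name `repo_search_url` is hypothetical:

```python
from urllib.parse import urlencode


def repo_search_url(repo: str, term: str, ref: str = "master") -> str:
    """Build a code-search URL following the pattern quoted above for a Codeberg repository."""
    # urlencode takes care of escaping spaces and special characters in the search term.
    return f"https://codeberg.org/{repo}/search?" + urlencode({"q": term, "ref": ref})
```

A URL built this way could be pasted into an assistant's Specific Links or Dynamic links setting, though whether the assistant then uses the search results as intended is untested here.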

Hi folks, the "Default" setting for the assistant seems to be searching the web now. It usually only happens on my second example prompt. The UI says the default option is no web search. I'm confused as to why my assistant is now searching stuff.

same happening here with my assistant
managed to get around it by stating "Only use web search when asked to" in the system prompt, seems to work for now

It's happening to me too, more so with a custom assistant. Add "Do not search the web" at the end of prompts as a workaround for now.

It's happening with command-r-plus.

I hope the developers will give due consideration to integrating the 'Fetch URL' tool into the Custom Assistants creation process. This tool would be highly beneficial for users who, like myself, have created assistants designed to summarize content from specific URLs. At present, the 'Web Search' tool is the only option, but it often includes external sources in its summaries. This can be problematic as it may compromise the accuracy of the information, drawing from sources beyond the intended URL content.

Hugging Chat org

@LostSpirit @louay01 @pearsonkyle should be fixed now! there was an issue in the backend code

I just want to express my appreciation to the developers and whoever's servers this is running on, you change my life for the better

c4ai-command-r-plus-08-2024 currently copes well with Fetch URL processing 🤗 (I will continue to observe)
