Do HTML documents need tags removed?

#10
by awokeknowing - opened

My documents are descriptions in HTML, with paragraph tags, ul/li lists, some strong tags, and so on. Do I need to strip all of that before embedding, or does the markup help the model understand the meaning of the text?

NLP Group of The University of Hong Kong org

Hi, thanks a lot for your interest in the INSTRUCTOR model!

For the HTML descriptions, I would suggest removing the tags before embedding, for better semantic understanding.
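As a rough illustration of the tag removal suggested here, the following is a minimal sketch using only Python's standard-library `html.parser`. The names `strip_html` and `_TextExtractor` and the `BLOCK_TAGS` set are hypothetical helpers made up for this example, not part of INSTRUCTOR, and the list of block-level tags is an assumption that you may need to adjust for your documents.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content and discards the tags themselves."""

    # Assumed, incomplete set of block-level tags: a space is inserted
    # around these so that adjacent paragraphs and list items do not fuse.
    BLOCK_TAGS = {"p", "div", "ul", "ol", "li", "br", "h1", "h2", "h3", "tr", "td"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html: str) -> str:
    """Return the plain text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


if __name__ == "__main__":
    sample = "<p>A <strong>sturdy</strong> desk.</p><ul><li>Oak top</li><li>Steel legs</li></ul>"
    print(strip_html(sample))
```

The cleaned string can then be passed to the embedding model in place of the raw HTML. Inline tags such as `strong` are dropped without adding a space, so words that are only partly emphasized stay intact.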

Feel free to add any further questions or comments!
