openai
pdftotext
sentence_transformers
numpy
pandas