Evo crashes the Databricks server when doing a simple inference
Compute resource: Databricks Azure cluster with an NVIDIA A100
The model loaded without any problem.
Code (I added `import torch`, a `device` definition, and a `model.to(device)` call, which were missing from my original snippet but present in my actual run):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM
from stripedhyena.tokenizer import CharLevelTokenizer

device = 'cuda:0'

tokenizer = CharLevelTokenizer(512)

hf_model_name = 'togethercomputer/evo-1-131k-base'
model_config = AutoConfig.from_pretrained(
    hf_model_name,
    trust_remote_code=True,
    revision='1.1_fix',
)
model_config.use_cache = True

model = AutoModelForCausalLM.from_pretrained(
    hf_model_name,
    config=model_config,
    trust_remote_code=True,
    revision='1.1_fix',
)
model.to(device)
model.eval()

sequence = 'ACGT'
input_ids = torch.tensor(
    tokenizer.tokenize(sequence),
    dtype=torch.int,
).to(device).unsqueeze(0)

# Crash happens here, in the forward pass
with torch.no_grad():
    logits, _ = model(input_ids)
```
It crashed the Python kernel at the last line.

I observed exactly the same behavior with StripedHyena itself.

I was not sure whether the issue was caused by the configuration of my Databricks resources, so I also tried other models of a similar size (e.g. "HuggingFaceH4/zephyr-7b-beta"), and those ran inference without any problem. I do not know whether there is some other incompatibility between StripedHyena/Evo and Databricks, though.

Has anyone else encountered this problem?