Spaces: nvidia/Test-Time-Translation-LLM-Demo
huckiyang committed on Mar 15

Commit fe23ebb (1 parent: 135276f)

gated model updates

Files changed (1)

app.py

@@ -26,11 +26,13 @@ lm_model = AutoModelForCausalLM.from_pretrained(
     device_map="auto"
 )
-# Load the reward model
+# Load the reward model - fix the offloading issue
+print("Loading reward model...")
 RM = AutoModelForCausalLMWithValueHead.from_pretrained(
     'ray24724919/plan2align_rm',
     torch_dtype=torch_dtype,
-    device_map="auto"
+    device_map={"": 0},  # Force model to stay on GPU (device 0)
+    offload_folder=None,  # Disable offloading
 )
 RM.eval()
 print("Models loaded successfully!")
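
The change above swaps `device_map="auto"` for an explicit map that pins the whole reward model to GPU 0. As a minimal sketch of the same idea, the helper below builds the keyword arguments for `from_pretrained` and falls back to the CPU when no GPU is available. The name `reward_model_load_kwargs` and the CPU fallback are assumptions for illustration and are not part of this commit.

```python
from typing import Any, Dict, Optional


def reward_model_load_kwargs(cuda_available: bool,
                             dtype: Optional[Any] = None) -> Dict[str, Any]:
    """Build from_pretrained keyword arguments (hypothetical helper).

    An empty-string key in device_map refers to the root module, so the
    whole model is placed on a single device and nothing is split or
    offloaded across devices.
    """
    kwargs: Dict[str, Any] = {}
    if dtype is not None:
        kwargs["torch_dtype"] = dtype
    # GPU index 0 when CUDA is present; CPU fallback is an assumption
    # added here, the commit itself always uses device 0.
    kwargs["device_map"] = {"": 0} if cuda_available else {"": "cpu"}
    return kwargs


if __name__ == "__main__":
    print(reward_model_load_kwargs(cuda_available=True))
```

In `app.py` this could be used as `AutoModelForCausalLMWithValueHead.from_pretrained('ray24724919/plan2align_rm', **reward_model_load_kwargs(torch.cuda.is_available(), torch_dtype))`, assuming `torch` is imported and `torch_dtype` is defined as in the surrounding file.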