import torch
from diffusers import StableDiffusion3Pipeline

model_dir = 'llm_models/T2V/stable-diffusion-3-medium-diffusers' # local model path

pipe = StableDiffusion3Pipeline.from_pretrained(
    model_dir, torch_dtype=torch.float16, revision="fp16",
)
pipe = pipe.to("cuda")

prompt = "A cat holding a sign that says hello world"

image = pipe(
    prompt,
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image

[ I ran the code above and got the following error: ]

---> 69 output, invvar = fused_layer_norm_cuda.rms_forward_affine(
70 input_, ctx.normalized_shape, weight_, ctx.eps)
71 ctx.save_for_backward(input_, weight_, invvar)
72 return output

RuntimeError: expected scalar type Float but found Half

To resolve this issue, you can try the following:
Use torch.float32 (32-bit floating-point) instead of torch.float16:
python
pipe = StableDiffusion3Pipeline.from_pretrained(
    model_dir, torch_dtype=torch.float32, revision="fp16",
)

Loading the pipeline in float32 gives the weights and activations the Float type that the fused_layer_norm_cuda kernel expects, at the cost of roughly twice the GPU memory of float16.
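If you want to keep torch.float16 for memory efficiency, note that casting the result afterwards does not help. The error is raised inside the pipe(...) call itself, so any conversion placed after that call never runs, and the object the pipeline returns is a PIL image, which has no .to() method. The dtype mismatch has to be resolved where the normalization is applied.
The traceback ends in fused_layer_norm_cuda.rms_forward_affine, which belongs to NVIDIA Apex's fused layer norm extension, so the failing kernel is Apex code rather than Diffusers code. A likely cause (an assumption, since the full traceback is not shown) is that one of the text encoders picked up Apex's fused RMS norm because Apex is installed in the environment. In that case, uninstalling Apex or running in an environment without it should let the model fall back to a plain PyTorch layer norm that accepts float16.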
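The traceback ends in fused_layer_norm_cuda, the compiled extension that ships with NVIDIA Apex, so it can be worth checking whether Apex is present in the environment at all. The following is a minimal sketch using only the standard library; the helper name find_installed is made up for illustration, and the check only reports whether Python can locate the modules, not whether they are the cause of the error.

```python
import importlib.util


def find_installed(module_names):
    """Map each module name to True if Python can locate it (without importing it)."""
    found = {}
    for name in module_names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-installed packages
            found[name] = False
    return found


# "apex" is the Python package; "fused_layer_norm_cuda" is the compiled
# extension module named in the traceback.
status = find_installed(["apex", "fused_layer_norm_cuda"])
for name, present in status.items():
    print(f"{name}: {'installed' if present else 'not found'}")
```

If both are reported as installed, rerunning the original float16 code in an environment without Apex is a quick way to confirm or rule out this explanation.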
If the issue persists, you can try the following additional steps:
Ensure that you have the latest version of PyTorch and the Diffusers library installed.
Check if there are any updates or known issues related to the StableDiffusion3Pipeline or the specific model you are using.
If you are running your code in a specific environment (e.g., Google Colab, Jupyter Notebook), make sure that the environment is properly configured and has the necessary dependencies installed.
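When checking versions and dependencies as suggested above, it helps to print what is actually installed in the environment the notebook or script runs in. This is a small sketch using the standard library's importlib.metadata; the helper name installed_versions and the choice of packages are assumptions for illustration.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string}, or None for packages that are not installed."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None
    return versions


# Packages involved in loading and running the pipeline (assumed list).
report = installed_versions(["torch", "diffusers", "transformers"])
for pkg, ver in report.items():
    print(f"{pkg}: {ver or 'not installed'}")
```

Run it in the same kernel as the failing code; a Jupyter or Colab session can use a different environment from the one where packages were upgraded on the command line.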