nvidia-smi before starting pipeline:
1978MiB / 19195MiB
At 0% of loading safetensors:
param.data = param.data.to("cpu", non_blocking=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: out of memory
My (slightly modified) script:
from OmniGen import OmniGenPipeline
import torch
torch.cuda.empty_cache()
torch.cuda.ipc_collect()
pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")
# Note: Your local model path is also acceptable, such as 'pipe = OmniGenPipeline.from_pretrained(your_local_model_path)', where all files in your_local_model_path should be organized as https://huggingface.co/Shitao/OmniGen-v1/tree/main
print("Starting first pipe...")
# Text to Image
images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=512,
    width=512,
    guidance_scale=2.5,
    seed=0,
    offload_model=False,
    use_kv_cache=True,
    offload_kv_cache=True,
    separate_cfg_infer=True
)
images[0].save("example_t2i.png") # save output PIL Image
print("Starting second pipe...")
torch.cuda.empty_cache()
torch.cuda.ipc_collect()
# Multi-modal to Image
# In the prompt, we use the placeholder to represent the image. The image placeholder should be in the format of <|image_*|>
# You can add multiple images in the input_images. Please ensure that each image has its placeholder. For example, for the list input_images [img1_path, img2_path], the prompt needs to have two placeholders: <|image_1|>, <|image_2|>.
images = pipe(
    prompt="A man in a black shirt is reading a book. The man is the right man in <|image_1|>.",
    input_images=["./imgs/test_cases/two_man.jpg"],
    height=512,
    width=512,
    guidance_scale=2.5,
    img_guidance_scale=1.6,
    seed=0,
    offload_model=True,
    use_kv_cache=True,
    offload_kv_cache=True,
    separate_cfg_infer=True
)
images[0].save("example_ti2i.png") # save output PIL image
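To rule out a prompt/image mismatch as a factor in the second call, a small check along these lines could be run before calling pipe. This is only a sketch: `missing_placeholders` is a hypothetical helper, not part of OmniGen, and it assumes the placeholder format is exactly `<|image_N|>` with N starting at 1, as in the comments above.

```python
from typing import List


def missing_placeholders(prompt: str, input_images: List[str]) -> List[str]:
    """Return the placeholders that input_images requires but the prompt lacks.

    Hypothetical helper: assumes one "<|image_N|>" placeholder per image,
    numbered from 1 in the order of input_images.
    """
    expected = [f"<|image_{i}|>" for i in range(1, len(input_images) + 1)]
    return [p for p in expected if p not in prompt]


prompt = "A man in a black shirt is reading a book. The man is the right man in <|image_1|>."
print(missing_placeholders(prompt, ["./imgs/test_cases/two_man.jpg"]))
```

An empty list means every image in input_images has a matching placeholder in the prompt.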
Note: Sometimes the first pipeline runs to completion and the failure happens on the second pipeline instead.
Note: During the first pipeline, system RAM climbs by about 12GB, then drops by about 12GB as VRAM climbs by 12GB.
Note: If I set 'offload_model=True' for the first pipeline, it runs out of memory before the first pipeline finishes.
Note: This is running on Ubuntu under WSL on Windows 10.
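For more precise numbers than eyeballing nvidia-smi, memory could be logged from inside the script around each pipeline call. The sketch below is not part of the original script: `free_mib` and `cuda_memory_report` are hypothetical helper names, and the report only uses the standard `torch.cuda` memory queries. With the nvidia-smi figures above, the headroom before loading works out to 19195 - 1978 = 17217 MiB.

```python
def free_mib(used_mib: int, total_mib: int) -> int:
    """Headroom left on the GPU, in MiB (hypothetical helper)."""
    return total_mib - used_mib


def cuda_memory_report(tag: str) -> str:
    """Return a one-line GPU memory summary, or a notice if torch/CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return f"[{tag}] torch not installed"
    if not torch.cuda.is_available():
        return f"[{tag}] CUDA not available"
    mib = 1024 ** 2
    free_b, total_b = torch.cuda.mem_get_info()  # device-wide free/total bytes
    allocated_b = torch.cuda.memory_allocated()  # bytes held by live tensors in this process
    reserved_b = torch.cuda.memory_reserved()    # bytes held by PyTorch's caching allocator
    return (
        f"[{tag}] device used {(total_b - free_b) // mib} MiB / {total_b // mib} MiB, "
        f"allocated {allocated_b // mib} MiB, reserved {reserved_b // mib} MiB"
    )


print(free_mib(1978, 19195))  # headroom from the nvidia-smi reading above
print(cuda_memory_report("before first pipe"))
```

Calling `cuda_memory_report` before from_pretrained, after the first pipe, and before the second pipe would show whether the 12GB from the first call is still allocated when the second one starts.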