
CUDA Error: out of memory #117

Description

@bagelbig

nvidia-smi before starting pipeline:

1978MiB / 19195MiB
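
For reference, that reading leaves 17217 MiB free before anything is loaded. A small sketch of the arithmetic follows; `parse_nvidia_smi_memory` is a hypothetical helper name used for illustration only and is not part of nvidia-smi or OmniGen.

```python
import re


def parse_nvidia_smi_memory(reading):
    """Parse a 'usedMiB / totalMiB' string into (used, total, free) in MiB."""
    match = re.fullmatch(r"\s*(\d+)MiB\s*/\s*(\d+)MiB\s*", reading)
    if match is None:
        raise ValueError(f"unrecognized memory reading: {reading!r}")
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, total - used


used, total, free = parse_nvidia_smi_memory("1978MiB / 19195MiB")
print(f"{free} MiB free ({free / 1024:.1f} GiB)")
```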

At 0% of loading safetensors:

param.data = param.data.to("cpu", non_blocking=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

RuntimeError: CUDA error: out of memory

My (slightly modified) script:
from OmniGen import OmniGenPipeline

import torch
torch.cuda.empty_cache()
torch.cuda.ipc_collect()

pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")

# Note: Your local model path is also acceptable, such as 'pipe = OmniGenPipeline.from_pretrained(your_local_model_path)', where all files in your_local_model_path should be organized as https://huggingface.co/Shitao/OmniGen-v1/tree/main

print("Starting first pipe...")

# Text to Image

images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=512,
    width=512,
    guidance_scale=2.5,
    seed=0,
    offload_model=False,
    use_kv_cache=True,
    offload_kv_cache=True,
    separate_cfg_infer=True
)
images[0].save("example_t2i.png")  # save output PIL Image

print("Starting second pipe...")
torch.cuda.empty_cache()
torch.cuda.ipc_collect()

# Multi-modal to Image

# In the prompt, we use the placeholder to represent the image. The image placeholder should be in the format of <|image_*|>

# You can add multiple images in the input_images. Please ensure that each image has its placeholder. For example, for the list input_images [img1_path, img2_path], the prompt needs to have two placeholders: <|image_1|>, <|image_2|>.

images = pipe(
    prompt="A man in a black shirt is reading a book. The man is the right man in <|image_1|>.",
    input_images=["./imgs/test_cases/two_man.jpg"],
    height=512,
    width=512,
    guidance_scale=2.5,
    img_guidance_scale=1.6,
    seed=0,
    offload_model=True,
    use_kv_cache=True,
    offload_kv_cache=True,
    separate_cfg_infer=True
)
images[0].save("example_ti2i.png")  # save output PIL Image

Note: Sometimes the first pipeline call completes and the second one fails.
Note: During the first pipe, system RAM climbs by about 12GB, then drops by about 12GB as VRAM climbs by about 12GB.
Note: If I set 'offload_model=True' for the first pipeline, it runs out of memory before the first pipeline finishes.
Note: This is running on WSL Ubuntu on Windows 10.
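
To narrow down where the memory goes, a reading could be logged before and after each pipe call. This is a minimal sketch that assumes only the standard `torch.cuda` calls; `cuda_memory_report` is a hypothetical helper name and not part of OmniGen.

```python
def cuda_memory_report(tag):
    """Return a one-line summary of CUDA memory usage, labelled with `tag`."""
    try:
        import torch
    except ImportError:
        return f"[{tag}] torch is not installed"
    if not torch.cuda.is_available():
        return f"[{tag}] CUDA is not available"

    mib = 1024 ** 2
    # Device-wide free/total memory in bytes, comparable to nvidia-smi.
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    # Memory occupied by tensors in this process.
    allocated = torch.cuda.memory_allocated()
    # Memory held by PyTorch's caching allocator (allocated plus cached).
    reserved = torch.cuda.memory_reserved()
    return (
        f"[{tag}] free {free_bytes // mib} MiB of {total_bytes // mib} MiB, "
        f"allocated {allocated // mib} MiB, reserved {reserved // mib} MiB"
    )


print(cuda_memory_report("before first pipe"))
```

Calling it again after each `images[0].save(...)` line would show whether memory from the first pipe is still reserved when the second one starts.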
