Archit01/M2SVID-gui

M2SVID GUI - Windows Version

This project is a user-friendly GUI version of the original M2SVID (Monocular-to-Stereo Video Conversion). It provides an end-to-end pipeline for depth-based warping, inpainting, and merging to create high-quality stereo (3D) videos from standard 2D monocular videos.

Note

This is a fork/standalone version focused on ease of use for Windows users, featuring a complete Gradio-based graphical interface. GUI developed by Archit. Original research and code by Nina Shvetsova et al. [3DV 2026].


✨ Key Features

  • Automated Windows Installation: One-click setup using a portable Python 3.12 environment (no complex conda setup required).
  • Gradio GUI: A complete graphical interface to manage all stages of the pipeline:
    • Section 1: Warping: Generate reprojected right-eye views from depth maps.
    • Section 2: Inpainting: Fill in disocclusions using a temporal/spatial-aware model.
    • Section 3: Merging: Final SBS (Side-by-Side) encoding with custom shadow/edge mitigation and color transfer.
  • Optimized for RTX GPUs: Pre-configured for CUDA 12.8 with support for RTX 20/30/40/50 series GPUs.
  • Efficient Memory Management: Built-in support for tiling and chunking to handle high-resolution videos without running out of VRAM.
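The chunking idea in the last bullet can be illustrated with a minimal sketch. It is not the project's actual implementation: the function name `chunk_ranges` and the `chunk_size` and `overlap` parameters are hypothetical and chosen only for illustration. The point is that a long video is processed as short, optionally overlapping frame ranges so that only one range needs to be held in VRAM at a time.

```python
def chunk_ranges(num_frames, chunk_size, overlap=0):
    """Return (start, end) frame ranges covering num_frames; end is exclusive.

    Hypothetical helper for illustration only, not part of M2SVID-gui.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new chunk advances
    ranges = []
    start = 0
    while start < num_frames:
        end = min(start + chunk_size, num_frames)
        ranges.append((start, end))
        if end == num_frames:  # last chunk reached the end of the video
            break
        start += step
    return ranges


if __name__ == "__main__":
    # 10 frames, 4 frames per chunk, 1 frame shared between neighbours
    print(chunk_ranges(10, 4, overlap=1))
```

Overlapping frames give neighbouring chunks shared context, which is one common way to reduce visible seams at chunk boundaries; whether and how the GUI blends chunks is not stated here.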

🛠️ Getting Started (Windows)

1. Prerequisites

2. Installation

  1. Clone this repository recursively:
    git clone --recursive https://github.com/Archit01/M2SVID-gui.git
    Alternatively, download the ZIP. GitHub ZIP archives do not include submodule contents, so the submodules must then be added manually.
  2. Double-click install_windows.bat.
    • This will download a portable Python 3.12 environment.
    • It will install all necessary dependencies (PyTorch 2.9.1, CUDA 12.8, xformers, etc.).
    • This keeps your system's global Python installation untouched.
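After installation, a short check like the one below can confirm that PyTorch was installed with CUDA support. This is a sketch, not a script shipped with the project: the function name `describe_torch_environment` is hypothetical, and it must be run with the portable Python interpreter that install_windows.bat sets up (its location is not given here), not with your system Python.

```python
import importlib.util


def describe_torch_environment():
    """Report whether PyTorch is importable and whether it can see a CUDA GPU.

    Hypothetical helper for illustration only, not part of M2SVID-gui.
    """
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in the interpreter running this script.
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,   # installed PyTorch version
        "cuda_build": torch.version.cuda,      # CUDA version of the build, or None
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as false, the usual suspects are an outdated GPU driver or a CPU-only PyTorch build.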

3. Download Models (Checkpoints)

You must download the following weights and place them in a ckpts folder in the project root:

  1. CLIP & SGM Weights: Download ckpts.zip from the Hi3D repo and unzip it into ckpts/ (follow the Hi3D step "2. Download checkpoints here and unzip."). M2SVid follows the Hi3D implementation and uses the same OpenCLIP model. Link: https://drive.google.com/file/d/1j_NEG2CPhFeRetYziWK6Qe62R5h7lG_V/view?usp=sharing
  2. M2SVid Weights: Download the M2SVid weights and extract them into ckpts/.
    • You should have m2svid_weights.pt and m2svid_no_full_atten_weights.pt in the ckpts folder.
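Before launching the GUI, you can verify that the two M2SVid weight files named above are in place. The sketch below is not part of the project, and the helper name `missing_checkpoints` is hypothetical. It checks only the two files listed in this README, because the contents of the Hi3D ckpts.zip are not listed here.

```python
from pathlib import Path

# File names taken from the README; the Hi3D archive contents are not checked.
EXPECTED_M2SVID_WEIGHTS = (
    "m2svid_weights.pt",
    "m2svid_no_full_atten_weights.pt",
)


def missing_checkpoints(ckpt_dir="ckpts", expected=EXPECTED_M2SVID_WEIGHTS):
    """Return the expected weight files that are not present in ckpt_dir.

    Hypothetical helper for illustration only, not part of M2SVID-gui.
    """
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing from ckpts/:", ", ".join(missing))
    else:
        print("Both M2SVid weight files were found.")
```

Run it from the project root so that the relative `ckpts` path resolves correctly.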

🚀 How to Use

1. Launching the GUI

Double-click run_app.bat. This will start the Gradio server and open the interface in your web browser.

2. Pipeline Steps

The interface is split into three tabs:

  • Tab 1: Warping: Provide your input videos and corresponding depth maps (generated by tools like DepthCrafter).
  • Tab 2: Inpainting and Refine: Choose the model variant (Full Attention or No Full Attention) and process the warped videos to fill gaps.
  • Tab 3: Merging: Preview the final output, adjust SBS settings (Full/Half SBS, Anaglyph), and render the final 3D video.
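To see why the warping stage leaves gaps for the inpainting stage to fill, consider a toy forward warp of a single image row. This is a simplified illustration under stated assumptions, not the warping code used by M2SVID: it assumes integer, non-negative per-pixel disparities (which a real pipeline would derive from the depth map), shifts each pixel left by its disparity to form the right-eye view, and lets the nearer pixel (larger disparity) win when two pixels land on the same target. The name `warp_row` is hypothetical.

```python
def warp_row(row, disparity, hole=None):
    """Forward-warp one scanline from the left view to the right-eye view.

    Toy illustration only, not the M2SVID implementation.
    row:       pixel values of the left-view scanline
    disparity: integer shift per pixel (larger means closer to the camera)
    hole:      marker for target pixels that no source pixel reaches
    """
    width = len(row)
    out = [hole] * width
    best = [-1] * width  # largest disparity written to each target so far

    for x in range(width):
        d = disparity[x]
        tx = x - d  # a point in the left view appears further left in the right view
        if 0 <= tx < width and d > best[tx]:
            out[tx] = row[x]  # nearer pixels overwrite farther ones
            best[tx] = d
    return out


if __name__ == "__main__":
    row = list("abcdef")
    disparity = [0, 0, 0, 2, 2, 0]  # "d" and "e" belong to a nearby object
    print(warp_row(row, disparity))
    # ['a', 'd', 'e', None, None, 'f']
```

The two `None` entries are disocclusions: background that the foreground object hid in the left view and that has no source pixel in the right view. These are the regions the inpainting stage has to synthesize.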

📜 Original Credits & Citation

If you use this work, please cite the original authors:

@article{shvetsova2026m2svid,
  title={M2SVid: End-to-End Inpainting and Refinement for Monocular-to-Stereo Video Conversion},
  author={Shvetsova, Nina and Bhat, Goutam and Truong, Prune and Kuehne, Hilde and Tombari, Federico},
  journal={3DV},
  year={2026}
}

Original Repository: google-research/m2svid

About

By Croods
