A 3D Rendering Engine for ComfyUI
A community project shared on GitHub Discussions by paulh4x
Project Description
🧊 ComfyUI-PHRenderFormerWrapper 🧊
PH's ComfyUI RenderFormer Wrapper is a complete set of custom nodes for token-based 3D rendering inside ComfyUI.
(Demo video: ComfyUI_00184_.mp4)
Download and Documentation: ComfyUI_PHRenderFormerWrapper GitHub Repository
RenderFormer is a model for token-based neural rendering of 3D geometry. It is a new approach to rendering 3D scenes and is best described in this 5-minute YouTube video: https://www.youtube.com/watch?v=qYJk9l65eJ8&ab_channel=TwoMinutePapers. The model is being presented by the authors at the next SIGGRAPH, and I see a lot of potential in it. That's why I tried to contribute something back to the open-source community, without having written any code before.
Everything is still very limited and these are just my baby steps in coding, but here are the current features:
🎨 End-to-End Rendering: Load 3D models, define materials, set up cameras, and render—all within ComfyUI.
⚙️ Modular Node-Based Workflow: Each step of the rendering pipeline is a separate node, allowing for flexible and complex setups.
🎥 Animation & Video: Create camera and light animations by interpolating between keyframes (see the short sketch after this list). The nodes output image batches compatible with ComfyUI's native video-saving nodes.
🔧 Advanced Mesh Processing: Includes nodes for loading, combining, remeshing, and applying simple color randomization to your 3D assets.
💡 Lighting and Material Control: Easily add and combine multiple light sources and control PBR material properties like diffuse, specular, roughness, and emission.
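The animation feature boils down to blending camera (or light) parameters between two keyframes and rendering one frame per step. The snippet below is only a generic sketch of that idea, not the wrapper's actual code; the function name and values are made up for illustration:

```python
import torch

def interpolate_camera(pos_a, pos_b, num_frames):
    """Linearly blend a camera position from keyframe A to keyframe B."""
    t = torch.linspace(0.0, 1.0, num_frames).unsqueeze(1)  # (num_frames, 1)
    a = torch.tensor(pos_a, dtype=torch.float32)            # (3,)
    b = torch.tensor(pos_b, dtype=torch.float32)
    return (1.0 - t) * a + t * b                             # (num_frames, 3)

# Rendering once per interpolated position yields an image batch that
# ComfyUI's native video-saving nodes can consume directly.
positions = interpolate_camera([0.0, 1.5, 4.0], [2.0, 1.5, 4.0], num_frames=48)
```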
More resources:
ComfyUI
Microsoft's RenderFormer GitHub Repository
I am completely new to all of this and was pointed to Kilo Code, so I installed VS Code for it and tried it with the free $20 of credits. The first attempt failed, but then I read about architect mode and created a plan for this:
# ComfyUI RenderFormer Custom Nodes: Implementation Plan
This document outlines the plan for creating ComfyUI custom nodes to integrate the RenderFormer model.
## 1. Project Setup
- **Directory:** A new directory named `ComfyUI-PHRenderFormerWrapper` will be created to house all the necessary files for the custom node.
- **Core Files:**
- `__init__.py`: This file will register the custom nodes with ComfyUI, making them available in the user interface (a short sketch follows this list).
- `nodes.py`: This file will contain the Python code for all the custom nodes.
- `requirements.txt`: This file will list all the Python dependencies required for the nodes to function correctly.
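For context, here is a minimal sketch of what `__init__.py` could look like. The two dictionaries are the standard way ComfyUI discovers custom nodes at startup; the class names are the nodes planned in section 3, not final code:

```python
# __init__.py -- minimal sketch of how ComfyUI discovers this node pack.
# The imported classes are the nodes planned in section 3.
from .nodes import RenderFormerModelLoader, RenderFormerSceneBuilder, RenderFormerGenerator

# ComfyUI scans custom_nodes/*/ for this dictionary at startup.
NODE_CLASS_MAPPINGS = {
    "RenderFormerModelLoader": RenderFormerModelLoader,
    "RenderFormerSceneBuilder": RenderFormerSceneBuilder,
    "RenderFormerGenerator": RenderFormerGenerator,
}

# Optional: nicer labels shown in the node search menu.
NODE_DISPLAY_NAME_MAPPINGS = {
    "RenderFormerModelLoader": "RenderFormer Model Loader",
    "RenderFormerSceneBuilder": "RenderFormer Scene Builder",
    "RenderFormerGenerator": "RenderFormer Generator",
}

__all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS"]
```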
## 2. Dependencies
The `requirements.txt` file will include a merged list of dependencies from both the original RenderFormer model and the standard requirements for a ComfyUI custom node. This ensures a complete and functional environment. Key dependencies will include:
- `torch`
- `trimesh`
- `huggingface_hub`
- `simple-ocio`
- `dacite`
- `einops`
- `roma`
- `safetensors`
- `scipy`
- `pymeshlab`
- `h5py`
- `natsort`
- `imageio[ffmpeg]`
- `git+https://github.com/iamNCJ/renderformer-liger-kernel.git`
## 3. Node Implementation (`nodes.py`)
The integration will be achieved through a set of modular and user-friendly nodes that form a clear pipeline within ComfyUI.
### Node 1: `RenderFormerModelLoader`
- **Purpose:** To load the pre-trained RenderFormer model from Hugging Face.
- **Inputs:**
- `model_id` (string): The identifier of the model on Hugging Face (e.g., `microsoft/renderformer-v1.1-swin-large`).
- **Logic:**
- The node will call `RenderFormerRenderingPipeline.from_pretrained(model_id)` to download and initialize the model.
- **Output:**
- `RENDERFORMER_PIPELINE`: A pipeline object that is ready for rendering.
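A minimal sketch of what this loader could look like in `nodes.py`. The `RenderFormerRenderingPipeline` class and its `from_pretrained` call come from the RenderFormer repository (the import path here is a guess); the class attributes (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`) follow the standard ComfyUI node convention:

```python
# Sketch of Node 1 -- the import path may differ in the actual RenderFormer package.
from renderformer import RenderFormerRenderingPipeline

class RenderFormerModelLoader:
    CATEGORY = "RenderFormer"
    RETURN_TYPES = ("RENDERFORMER_PIPELINE",)
    FUNCTION = "load"

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "model_id": ("STRING", {"default": "microsoft/renderformer-v1.1-swin-large"}),
            }
        }

    def load(self, model_id):
        # Downloads the weights from Hugging Face (cached after the first run)
        # and returns a ready-to-use rendering pipeline.
        pipeline = RenderFormerRenderingPipeline.from_pretrained(model_id)
        return (pipeline,)
```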
### Node 2: `RenderFormerSceneBuilder`
- **Purpose:** To construct a 3D scene in the format that RenderFormer requires, directly from inputs within ComfyUI. This node abstracts away the manual process of creating JSON and HDF5 files.
- **Inputs:**
- `trimesh`: A mesh object provided by another node (e.g., a mesh loader).
- `camera_config`: A configuration object for the camera, including position, look-at point, and field of view (FOV).
- `material_properties`: A configuration object for material properties like diffuse color, roughness, and emissive values.
- **Logic:**
- This node will programmatically generate the necessary data tensors (`triangles`, `texture`, `vn`, `c2w`, `fov`) in memory, replicating the logic from `scene_processor/to_h5.py`.
- **Output:**
- `RENDERFORMER_SCENE`: A scene object containing all the prepared data tensors required for rendering.
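A structural sketch of this node follows. The real tensor layouts (`triangles`, `texture`, `vn`, `c2w`, `fov`) are defined by RenderFormer's `scene_processor/to_h5.py`, so the body below only illustrates how the inputs could be packaged into a single scene object; the custom type names are placeholders:

```python
class RenderFormerSceneBuilder:
    CATEGORY = "RenderFormer"
    RETURN_TYPES = ("RENDERFORMER_SCENE",)
    FUNCTION = "build"

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "trimesh": ("TRIMESH",),                 # mesh from a loader node
                "camera_config": ("RF_CAMERA_CONFIG",),  # position, look-at, fov
                "material_properties": ("RF_MATERIAL",), # diffuse, roughness, emission
            }
        }

    def build(self, trimesh, camera_config, material_properties):
        # Placeholder: the real node replicates scene_processor/to_h5.py to turn
        # the mesh, camera, and materials into the tensors RenderFormer expects
        # (triangles, texture, vn, c2w, fov), kept in memory instead of an HDF5 file.
        scene = {
            "mesh": trimesh,
            "camera": camera_config,
            "material": material_properties,
        }
        return (scene,)
```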
### Node 3: `RenderFormerGenerator`
- **Purpose:** The main node that executes the rendering process.
- **Inputs:**
- `pipeline`: The `RENDERFORMER_PIPELINE` object from the model loader node.
- `scene`: The `RENDERFORMER_SCENE` object from the scene builder node.
- `resolution` (integer): The desired width and height of the output image.
- `tone_mapper` (string, optional): The tone mapping algorithm to apply to the final image (e.g., 'agx', 'filmic').
- **Logic:**
- The node will invoke the `pipeline.render()` method, passing in the scene data and rendering parameters, as demonstrated in the `infer.py` script.
- **Output:**
- `IMAGE`: The final rendered image.
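A sketch of the generator node. The exact `pipeline.render()` signature lives in RenderFormer's `infer.py`, so the argument names here are placeholders; the only ComfyUI-specific step is returning the result as an `IMAGE`, i.e. a float tensor batch shaped `[B, H, W, C]` in the 0-1 range:

```python
import torch

class RenderFormerGenerator:
    CATEGORY = "RenderFormer"
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "render"

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "pipeline": ("RENDERFORMER_PIPELINE",),
                "scene": ("RENDERFORMER_SCENE",),
                "resolution": ("INT", {"default": 512, "min": 64, "max": 2048}),
            },
            "optional": {
                "tone_mapper": (["agx", "filmic", "none"],),
            },
        }

    def render(self, pipeline, scene, resolution, tone_mapper="agx"):
        # Placeholder call: the real argument names follow RenderFormer's infer.py.
        rendered = pipeline.render(scene, resolution=resolution, tone_mapper=tone_mapper)

        # ComfyUI expects IMAGE as a float32 tensor batch shaped [B, H, W, C] in 0..1.
        image = torch.as_tensor(rendered, dtype=torch.float32)
        if image.dim() == 3:
            image = image.unsqueeze(0)
        return (image.clamp(0.0, 1.0),)
```

Wired together, the graph is simply: model loader → scene builder → generator → ComfyUI's image or video save nodes.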
This plan ensures a robust, modular, and user-friendly integration of RenderFormer into ComfyUI.
I added Kijai's wrapper for the Hunyuan3D model and the example node pack from filltm as context.
I cannot remember the exact prompt for this, but it was something like:
create a detailed plan to integrate the RenderFormer Model from @/url_to_microsoft github_rep with a complete repository of custom nodes for @/url_to_ComfyUI, make sure to follow these steps:
- look at the requirements of the renderformer model and add them to the requirements of this repository
- look at @/renderformer/readme_file and make sure to follow the process to render an image for creating nodes
- create a set of nodes to load the model in comfyui and render a 3d scene
From then on it went pretty well once the plan was adjusted for certain tasks, like making this an independent repo or adding animation functionality to an object. Basically, I kept adding functions and nodes in code mode and used debug mode whenever I got an error. This was exceptionally easy: pasting the logs back into debug mode fixed almost everything. Sometimes I had to point to certain files, like rendering_pipeline.py or batch_infer.py, that I thought could help the LLM understand the process, and that also helped a lot. Sometimes I blew far past the context window because I forgot to keep an eye on it, and I was completely mind-blown by how far I got with this.

All the major steps were done with the free credits only, but especially closer to Thursdays I ran short. In the end I really wanted it to work out and spent 6x15€ over the entire creation, which took place mainly on some nights between 06/25 and 07/15. I would estimate the total time spent, from installing Kilo to opening the repo publicly, at about 30-40 hours, with a lot of adjusting my prompts before sending them.
I would love to push this development further and appreciate any ideas, feedback and support for it.
Thanks for the read,
/PH
Continue the conversation
This project was originally shared as a GitHub Discussion. Join the conversation, ask questions, or share your own implementation.
View Discussion on GitHub