PromptCharm is an interactive system for iterative refinement of text-to-image creation with diffusion models. This repository contains the official implementation of our related paper:
PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement
Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, Tianyi Zhang
2024 ACM CHI Conference on Human Factors in Computing Systems (CHI 2024)
We suggest using a virtual environment so that PromptCharm's dependencies stay separate from your existing Python setup.
Create a virtual environment (optional)
$ cd ./backend
$ python -m venv ./venv
$ source ./venv/bin/activate
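To confirm that the environment is active before installing anything, a quick check like the one below can help. It is a minimal sketch that relies only on the standard library; the helper name `in_virtualenv` is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```

If this prints `False` after running `source ./venv/bin/activate`, the `python` on your PATH is probably not the one inside `./venv`.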
Install
$ pip install -r requirements.txt
$ git clone -b promptcharm https://github.com/paulwong16/ecco.git
$ cd ecco
$ pip install -e .
$ cd ..
$ git clone https://github.com/paulwong16/daam.git
$ cd daam
$ pip install -e .
$ cd ..
Download the pre-mined images from diffusion_db and organize them as follows. Alternatively, follow the notebook in ./backend
to mine the images yourself.
├── web/dashboard
│ ├── public
│ ├── src
│ │ └── data
│ │ │── diffusion_db
│ │ │ │── 0.jpg
│ │ │ │── 1.jpg
│ │ │ └── ...
│ │ └── ...
│ └── ...
├── backend
└── ...
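If the downloaded images sit in some other folder, the layout above can be produced with a small script. The following is only a sketch under stated assumptions: the helper `organize_images` and the source folder name are hypothetical, and it assumes the dashboard only needs JPEG files named `0.jpg`, `1.jpg`, and so on. If the dashboard relies on a specific image order or on metadata produced by the notebook in ./backend, use that notebook instead.

```python
import shutil
from pathlib import Path

# Assumption: only JPEG files are relevant for the dashboard.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def organize_images(source_dir, target_dir):
    """Copy JPEG files from source_dir into target_dir as 0.jpg, 1.jpg, ..."""
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Sort the files so that the numbering is the same on every run.
    images = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    for index, image in enumerate(images):
        shutil.copyfile(image, target / f"{index}.jpg")
    return len(images)
```

For example, calling `organize_images("./downloads", "./web/dashboard/src/data/diffusion_db")` from the repository root would fill the folder shown in the tree, where `./downloads` stands for wherever you saved the images.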
Install
$ cd ./web/dashboard
$ npm install
$ npm start
Copy the URL printed in the terminal and open it in a browser.
$ cd ./backend
$ python main.py --seed [YOUR RANDOM SEED]
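The `--seed` option presumably fixes the randomness of generation so that runs can be repeated. The sketch below is not taken from `main.py`; it only illustrates the general pattern of reading an integer seed from the command line and using it to drive a random number generator, using the standard library alone. The function names and the default value are assumptions made for illustration, and a diffusion backend would typically seed its tensor library rather than Python's `random` module.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse a --seed flag (hypothetical stand-in for the real CLI)."""
    parser = argparse.ArgumentParser(description="Seed handling sketch")
    # The default of 0 is an assumption, not the backend's actual default.
    parser.add_argument("--seed", type=int, default=0, help="random seed")
    return parser.parse_args(argv)


def sample(seed, n=3):
    """Draw n pseudo-random numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # a private generator leaves global state untouched
    return [rng.random() for _ in range(n)]


args = parse_args(["--seed", "42"])
# The same seed reproduces the same sequence of draws.
print(sample(args.seed) == sample(args.seed))
```

Reusing the same seed across sessions is what makes it possible to compare the effect of a prompt change without the noise of a different random starting point.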
If you find our paper or code useful in your research, please consider citing:
@inproceedings{wang2024promptcharm,
author = {Wang, Zhijie and Huang, Yuheng and Song, Da and Ma, Lei and Zhang, Tianyi},
title = {PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement},
booktitle = {Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems},
year = {2024},
}
This project is released under the MIT license.
Kudos to the following projects: