Running Text-to-Image AI Models Locally

Jun 4, 2023

While ChatGPT has certainly made a significant impact as a leading generative AI, text-to-image models have also gained prominence and popularity. In a short span of time, numerous text-to-image services have emerged, offering their own trained models or repackaging existing models. These AI-generated images have become widely popular on social media, leading to the formation of vibrant communities around them within weeks.

For those who are curious and eager to explore, it may be valuable to run these models locally and gain a deeper understanding of their workings or uncover the “magic” behind them. However, before delving into the provided Jupyter notebooks, there are a few essential points to keep in mind:

OpenJourney

OpenJourney is a pre-trained model developed and maintained by PromptHero. Starting from a pre-trained model is a wise choice: it lets you focus on the steps involved in generating an image without getting entangled in the complexities of training a model from scratch.
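
To make those generation steps concrete, here is a minimal sketch of what loading and prompting OpenJourney can look like with the diffusers library. It is not necessarily what the notebooks do: the repo id `prompthero/openjourney`, the `mdjrny-v4 style` keyword, and the helper names `build_prompt` and `generate_image` are my assumptions for illustration.

```python
MODEL_ID = "prompthero/openjourney"  # assumed Hub repo id for OpenJourney
STYLE_TAG = "mdjrny-v4 style"        # assumed style keyword from the model card


def build_prompt(subject: str) -> str:
    """Append the style keyword unless the prompt already contains it."""
    subject = subject.strip()
    if STYLE_TAG in subject:
        return subject
    return f"{subject}, {STYLE_TAG}"


def generate_image(subject: str, steps: int = 25, device: str = "cpu"):
    """Load the pre-trained pipeline and run a single text-to-image generation."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision is only used on the GPU; the CPU path stays in float32.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=dtype)
    pipe = pipe.to(device)

    # Each inference step is one denoising iteration.
    result = pipe(build_prompt(subject), num_inference_steps=steps)
    return result.images[0]  # a PIL image
```

Calling `generate_image("a lighthouse at dusk", device="cuda").save("out.png")` would download the weights on first use and write the result to disk.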

GPU vs. CPU

I have created two versions of the Jupyter notebooks to cater to different hardware setups: one for CUDA-compatible GPUs and another for CPUs. While the code itself remains largely the same, there is a significant performance difference between the two versions.

When I ran the notebook on a CUDA-compatible GPU (an Nvidia A10), generation took approximately 1.10 seconds per iteration. On an 8-core i7 CPU, the same prompt and settings took over 40 seconds per iteration. That gap of more than 36x is the practical argument for using a CUDA-compatible GPU for image generation.
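
The per-iteration numbers are easier to appreciate as a total per image. The sketch below picks a device and multiplies out the timings quoted above. The step count of 50 is a hypothetical value I chose for the arithmetic, not a setting taken from the notebooks, and the function names are mine.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def estimate_seconds(steps: int, seconds_per_iteration: float) -> float:
    """Total generation time, assuming every iteration costs the same."""
    return steps * seconds_per_iteration


GPU_SEC_PER_IT = 1.10  # Nvidia A10 figure quoted above
CPU_SEC_PER_IT = 40.0  # lower bound: the i7 took "over 40 seconds"
STEPS = 50             # hypothetical number of inference steps

gpu_total = estimate_seconds(STEPS, GPU_SEC_PER_IT)
cpu_total = estimate_seconds(STEPS, CPU_SEC_PER_IT)

print(f"Selected device: {pick_device()}")
print(f"GPU estimate: about {gpu_total:.0f} s per image")
print(f"CPU estimate: at least {cpu_total:.0f} s per image")
```

With 50 steps, that works out to roughly 55 seconds on the A10 versus at least 2,000 seconds (over half an hour) on the CPU.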

Hugging Face Hub

Often regarded as the “GitHub of AI/ML models,” the Hugging Face Hub is an excellent resource for accessing and sharing AI/ML models. Creating an account is free and necessary for running these models locally. The Hub is also a good place to explore a wide range of pre-trained text-to-image models and, from there, general-purpose text-to-text language models.
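
As a rough sketch of how a model can be pulled from the Hub with an account token, the snippet below uses `snapshot_download` from the `huggingface_hub` package. Reading the token from an `HF_TOKEN` environment variable, the repo id, and the helper names are assumptions on my part; the notebooks may handle authentication differently.

```python
import os


def hub_token_from_env(var_name: str = "HF_TOKEN"):
    """Return the access token stored in the environment, or None if unset."""
    token = os.environ.get(var_name, "").strip()
    return token or None


def download_model(repo_id: str = "prompthero/openjourney") -> str:
    """Download every file of a Hub repo and return the local folder path."""
    # Imported lazily so the token helper works without the package installed.
    from huggingface_hub import snapshot_download

    # token=None lets huggingface_hub fall back to any login saved on this machine.
    return snapshot_download(repo_id=repo_id, token=hub_token_from_env())
```

The files land in the local Hugging Face cache, so later runs reuse them instead of downloading again.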

Notebooks

The notebook with CUDA-compatible GPU optimizations can be found here.
The notebook to run the model on CPU can be found here.

Disclaimers

The code above is for illustration/education purposes only.
The image generation process is extremely resource intensive and will push all available CPUs or GPUs to their limit. If heat is an issue, I would suggest running these notebooks in Google Colab or on a cloud service with GPU instances.