I would like to know about image generation AI!
Reliability of This Article
by Our Founder/CEO&CTO Hiroyuki Chishiro
- He has been involved in 12 years of research on real-time systems.
- He teaches OS (Linux kernel) in English at the University of Tokyo.
- From September 2012 to August 2013, he was a visiting researcher at the Department of Computer Science, the University of North Carolina at Chapel Hill (UNC), Chapel Hill, North Carolina, United States. He has been involved in research and development of real-time Linux in C language.
- He has more than 15 years of experience with programming languages: C/C++, Python, Solidity/Vyper, Java, Ruby, Go, Rust, D, HTML/CSS/JS/PHP, MATLAB, Verse (UEFN), Assembler (x64, ARM).
- While a faculty member at the University of Tokyo, he developed the "Extension of LLVM Compiler" in C++ language and his own real-time OS "Mcube Kernel" in C language, which he published as open source on GitHub.
- Since January 2020, he has been CTO of Guarantee Happiness LLC, Chapel Hill, North Carolina, United States, in charge of e-commerce site development and web/social network marketing. Since June 2022, he has been CEO&CTO of Japanese Tar Heel, Inc. in Chapel Hill, North Carolina, United States.
- We have been engaged in disseminating useful information on AI and Crypto (Web3).
- We have written more than 20 articles on AI including AI chatbots such as ChatGPT, Auto-GPT, Gemini (formerly Bard). He has experience in contract work as a prompt engineer, manager, and quality assurance (QA) for several companies in San Francisco, United States (Silicon Valley in the broadest sense of the word).
- We have written more than 40 articles on cryptocurrency (including smart contract programming). He has experience as an outsourced translator of English articles on cryptocurrency into Japanese for a company in London, England.
You can learn from us.
If you would like to know the recommended job sites for AI Engineers, please click the following.
If you would like to know the recommended job sites for Prompt Engineers, please click the following.
What is Image Generation AI?
Image generation AI is AI that generates images from text (text-to-image).
In other words, you describe the image you want in text, and the AI creates an image that matches the description.
Image generation AI is one type of Generative AI.
If you would like to know more about Generative AI, please click the following.
An AI chatbot is a text generation AI, another type of Generative AI.
If you would like to know more about AI chatbots, please click here.
Main Technologies Used in Image Generation AI
This section introduces the main technologies used in image generation AI.
NOTE: Variational Autoencoders (VAEs), autoregressive models, and other models are also used in image generation AI.
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs) are a method of generating new images using two neural networks, a Generator and a Discriminator.
The Generator starts with random noise and tries to generate an image.
The Discriminator is trained to distinguish between images generated by the Generator and real images.
The Generator is trained repeatedly so that the Discriminator does not judge its generated images as fake.
This training is repeated until the Generator is able to produce images that are close to real images.
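The alternating training described above can be sketched with a toy one-dimensional GAN. This is only an illustrative sketch under stated assumptions, not a practical image generator: the "images" are single numbers, the Generator is a hypothetical linear map `a*z + b`, the Discriminator is a hypothetical logistic model `sigmoid(w*x + c)`, and the gradients are written out by hand. All function names, the learning rate, and the real-data distribution are assumptions made for illustration.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def generate(gen, z):
    """Generator (assumed linear): maps random noise z to a fake sample."""
    a, b = gen
    return a * z + b


def discriminate(disc, x):
    """Discriminator (assumed logistic): estimated probability that x is real."""
    w, c = disc
    return sigmoid(w * x + c)


def d_loss(disc, x_real, x_fake):
    """Discriminator loss: -[log D(x_real) + log(1 - D(x_fake))]."""
    eps = 1e-12  # guards against log(0)
    return -(math.log(discriminate(disc, x_real) + eps)
             + math.log(1.0 - discriminate(disc, x_fake) + eps))


def d_step(disc, x_real, x_fake, lr):
    """One gradient-descent step on the discriminator loss."""
    w, c = disc
    s_real = discriminate(disc, x_real)
    s_fake = discriminate(disc, x_fake)
    grad_w = -(1.0 - s_real) * x_real + s_fake * x_fake
    grad_c = -(1.0 - s_real) + s_fake
    return (w - lr * grad_w, c - lr * grad_c)


def g_step(gen, disc, z, lr):
    """One gradient-descent step on the generator loss -log D(G(z))."""
    a, b = gen
    w, _ = disc
    s_fake = discriminate(disc, generate(gen, z))
    grad_x = -(1.0 - s_fake) * w  # derivative of -log D(x) at x = G(z)
    return (a - lr * grad_x * z, b - lr * grad_x)


def train(steps=2000, lr=0.05, real_mean=4.0, real_std=0.5, seed=0):
    """Alternate Discriminator and Generator updates on toy 1-D data."""
    rng = random.Random(seed)
    gen, disc = (1.0, 0.0), (0.0, 0.0)
    for _ in range(steps):
        x_real = rng.gauss(real_mean, real_std)  # a "real" sample
        z = rng.gauss(0.0, 1.0)                  # random noise input
        disc = d_step(disc, x_real, generate(gen, z), lr)
        gen = g_step(gen, disc, z, lr)
    return gen, disc
```

Calling `train()` returns the final Generator and Discriminator parameters. With such a simple Discriminator the parameters tend to oscillate rather than settle, which hints at why training real GANs is considered delicate.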
Applications of GANs include image restoration, super-resolution, and image transformation, as well as image generation.
These are explanatory videos of GANs by Ian Goodfellow, who proposed GANs (slides).
Diffusion Model
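A Diffusion Model is a probabilistic generative model used in image generation AI.
It learns to reverse a process that gradually adds noise to images, so a new image can be generated by starting from random noise and removing the noise step by step.
A representative paper on Diffusion Models is "Denoising Diffusion Probabilistic Models".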
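As a rough illustration of the noising half of a diffusion model, the following sketch adds Gaussian noise to a tiny list of pixel values in one shot, using the closed-form expression x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. The function names are hypothetical, the default schedule values (1000 steps, variances from 1e-4 to 0.02) follow the linear schedule reported in the paper above, and the learned denoising network is deliberately left out.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, linearly spaced."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for all s up to t."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample a noisy x_t directly from the clean x0 at step index t."""
    signal = math.sqrt(alpha_bars[t])       # how much of x0 survives
    sigma = math.sqrt(1.0 - alpha_bars[t])  # how much noise is mixed in
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return xt, noise


# Usage: noise a 3-"pixel" image lightly (t = 0) and heavily (t = 999).
alpha_bars = cumulative_alphas(linear_beta_schedule())
rng = random.Random(0)
slightly_noisy, _ = add_noise([1.0, -1.0, 0.5], 0, alpha_bars, rng)
mostly_noise, _ = add_noise([1.0, -1.0, 0.5], 999, alpha_bars, rng)
```

In training, a neural network would be asked to predict `noise` from `xt` and `t`. Generation then runs the learned denoising steps in reverse, starting from pure noise.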
These are explanatory videos of Diffusion Model.
Representative Software for Image Generation AI
The following is a list of typical image generation AI software.
Most of this software is based on Diffusion Models.
DALL-E
DALL-E is an image generation AI developed by OpenAI.
The versions of DALL-E are as follows:
- 2021: DALL-E released
- 2022: DALL-E 2 released
- 2023: DALL-E 3 released
These are explanatory videos of DALL-E.
Midjourney
Midjourney is an image generation AI developed by Midjourney, Inc. in 2022.
Midjourney made headlines in March 2023 when it was used to create deepfake images of Donald Trump.
These are explanatory videos of Midjourney.
Stable Diffusion
Stable Diffusion is an image generation AI developed by researchers including Robin Rombach and Andreas Blattmann at Ludwig Maximilian University of Munich and released in 2022.
Stable Diffusion makes it easy to create your own images for use in articles.
- Paper of Stable Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models
- GitHub of Stable Diffusion: GitHub
These are explanatory videos of Stable Diffusion.
Stable Diffusion is easily available at the following.
Adobe Firefly
Adobe Firefly is an image generation AI provided by Adobe.
These are explanatory videos of Adobe Firefly.
Bing Image Creator
Bing Image Creator is an image generation AI provided by Microsoft through Bing.
These are explanatory videos of Bing Image Creator.
Leonardo.Ai
Leonardo.Ai is an image generation AI that can generate up to 150 images per day for free.
In addition, Leonardo.Ai can be used commercially.
These are explanatory videos of Leonardo.Ai.
FLUX.1
FLUX.1 is an image generation AI developed by Black Forest Labs, a company founded by original developers of Stable Diffusion. Some FLUX.1 models are released with open weights.
Click the following to watch videos introducing FLUX.1.
Summary
We introduced image generation AI, its main technologies, and representative software.
As the list above shows, there are many image generation AIs to choose from.