Apple’s research division, together with researchers at the University of California, Santa Barbara, has released MGIE, a multimodal artificial intelligence model designed for image editing. To edit an image, the user only needs to describe in natural language what they want the result to look like.
MGIE (Multimodal Large Language Model-Guided Image Editing) handles image-editing tasks of varying complexity: objects can be added, removed, or altered. On receiving a command, the model interprets the user’s words and then “imagines” what the picture will look like once modified accordingly.
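To make that two-stage flow concrete, here is a minimal, self-contained sketch: a language-model stage expands a terse user command into an explicit edit description, which then drives an image-editing stage. All names and stubs below are illustrative assumptions, not Apple’s actual API.

```python
from dataclasses import dataclass


@dataclass
class EditRequest:
    image_path: str
    instruction: str  # terse user command, e.g. "make it healthier"


def derive_expressive_instruction(req: EditRequest) -> str:
    """Stand-in for the MLLM stage: expand the terse command into an
    explicit visual description an image editor could follow."""
    # A real multimodal model would inspect the image itself; this stub
    # only illustrates the shape of the transformation.
    return f"{req.instruction}: add visible vegetable toppings to {req.image_path}"


def apply_edit(req: EditRequest, expressive_instruction: str) -> str:
    """Stand-in for the image-editing stage (a diffusion model in practice)."""
    # Pretend we produced a new file guided by the expanded instruction.
    return f"edited_{req.image_path}"


req = EditRequest("pizza.jpg", "make it healthier")
guidance = derive_expressive_instruction(req)
print(apply_edit(req, guidance))  # -> edited_pizza.jpg
```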
The article describing MGIE gives several examples of the model’s work. Given a photo of a pizza and the instruction “make it healthier”, the model added vegetable toppings. Given an overly dark photograph of a cheetah in the desert and the instruction “add contrast, simulating more light”, it brightened the image. MGIE is available for free download on GitHub, and you can try it out on Hugging Face Spaces. Apple did not specify the company’s plans for the model beyond the research project.
Several AI generators already support image editing, including OpenAI’s DALL-E 3, and Photoshop offers generative AI features through Adobe’s Firefly model. Apple does not position itself as a major AI player the way Microsoft, Meta, or Google do, but CEO Tim Cook recently said the company will add new AI features to its devices this year. Last December, Apple released MLX, an open framework for training AI models on Apple Silicon chips.
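For readers unfamiliar with MLX, it exposes a NumPy-like Python API with lazy evaluation and unified memory on Apple Silicon. A minimal sketch using the public mlx.core module, assuming MLX is installed (pip install mlx):

```python
import mlx.core as mx

# Arrays live in unified memory, so the CPU and GPU can share them
# without explicit copies.
a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))

c = a @ b    # operations are lazy: this builds a compute graph, nothing runs yet
mx.eval(c)   # evaluation is triggered explicitly (or implicitly, e.g. by printing)
print(c.shape)  # (1024, 1024)
```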