New AI image generator runs in 10 times fewer steps than today's best models — and it's coming to smartphones and laptops
Researchers have developed an AI image generator that produces images in just four steps, rather than dozens. This could bring fast, private image generation directly to consumer devices.
Artificial intelligence (AI) image generators are becoming more powerful, and they usually rely on heavyweight models running in the cloud. But researchers say they've built a new system that can generate high-quality images using roughly 10 times fewer processing steps.
The result is AI that's fast and efficient enough to run locally on phones and laptops, while being more secure and environmentally friendly than AI that runs in power-hungry data centers.
The technology, called Stable Diffusion 3.5 Flash (SD3.5-Flash), was developed through a collaboration between researchers at the University of Surrey's Institute for People-Centred AI and the company Stability AI.
They outlined how the new model works in a study uploaded Sept. 25, 2025, to the preprint database arXiv, and announced March 4 in a statement that Lenovo has licensed the model for integration into its upcoming on-device AI platform. That means the system will soon appear in smartphones, tablets and laptops.
The goal is simple but ambitious: to bring powerful generative AI out of remote data centers and onto the devices people actually use. This not only has implications for environmental impact and privacy, but could also make AI-based image generation faster than ever before.
Why most AI image generators are slow
Most modern text-to-image systems rely on a technique called diffusion. These AI models start with random noise – essentially a grid of pixels filled with random values – and gradually refine it into an image through a long sequence of steps.
Typically, that process takes 30 to 50 iterations to produce a finished image, with each step requiring significant computing power. That's why many popular AI image generation tools run on large clusters of graphics processing units (GPUs) in remote servers via the cloud, rather than locally on a phone or laptop.
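The refinement loop described above can be sketched in a toy form. This is an illustration only, not the SD3.5 code: a real diffusion model replaces the `toy_denoise_step` below with a trained neural network that predicts and removes noise, and it has no access to a `target` image — the target here is simply a stand-in so the sketch is runnable.

```python
import numpy as np

def toy_denoise_step(image, step, total_steps, target):
    # Stand-in for the neural network's denoising prediction:
    # nudge the current image a fraction of the way toward the target.
    fraction = 1.0 / (total_steps - step)
    return image + fraction * (target - image)

def generate(target, total_steps=50, seed=0):
    """Start from pure random noise and refine it over `total_steps` steps."""
    rng = np.random.default_rng(seed)
    image = rng.standard_normal(target.shape)  # grid of pixels with random values
    for step in range(total_steps):
        image = toy_denoise_step(image, step, total_steps, target)
    return image

target = np.ones((8, 8))  # pretend "clean" image the loop is steered toward
print(np.allclose(generate(target, total_steps=50), target))  # → True
```

Note that the loop runs the denoising function once per step — which is why cutting 30 to 50 iterations down to a handful translates directly into less computation.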
That architecture works well for producing high-quality images, but it also creates practical limitations. The models are slow and energy-intensive, and they must send prompts or images to remote servers and wait for a response.
In the new study, the scientists set out to tackle that bottleneck. SD3.5-Flash dramatically shortens the generation pipeline. Instead of dozens of iterations, the model can produce an image in just four processing steps, the scientists said.
This is achieved by compressing the diffusion process into a more efficient form while preserving image quality. In essence, the system learns how to "jump" through the denoising process in larger leaps rather than inching forward step by step. According to the study, maintaining visual quality while reducing the number of steps is the core technical challenge.
"Our SD3.5-Flash model allows users to create images from text descriptions entirely on their device, with no data leaving their hardware," said Hmrishav Bandyopadhyay, a doctoral researcher at the University of Surrey who developed the model during an internship at Stability AI, in the statement. "Achieving this level of efficiency is technically challenging, as it requires compressing a diffusion model to run in only a few steps while maintaining quality."
Reducing the number of inference steps means the model requires far fewer computational resources, thus making it feasible to run on consumer-grade hardware.
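Under the simplifying assumption that compute cost scales linearly with the number of denoising steps (each step runs the full network once), the back-of-the-envelope saving looks like this — the numbers are illustrative, not measured benchmarks:

```python
def total_cost(steps, cost_per_step=1.0):
    # Each denoising step runs the full network once, so total cost
    # grows roughly linearly with the step count.
    return steps * cost_per_step

baseline = total_cost(50)  # a typical 30- to 50-step diffusion pipeline
flash = total_cost(4)      # a four-step pipeline like SD3.5-Flash
print(f"~{baseline / flash:.1f}x less compute")  # → ~12.5x less compute
```

That rough factor of 10 or more is what moves image generation from a GPU cluster into the range of consumer-grade hardware.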
Greater privacy, speed and AI sustainability
Running generative AI locally rather than in the cloud could have several advantages. The first is privacy: if an AI model runs entirely on a device, prompts and generated images don't need to be sent to remote servers, which reduces the risk of data exposure, interception, or misuse.
The second is speed: with fewer processing steps and no network latency, image generation could become nearly instantaneous.
Finally, there's an environmental angle. Large cloud AI models consume substantial energy and water through data center operations, but lightweight models running locally can dramatically reduce those demands.
Yi-Zhe Song, director of the SketchX Lab at the University of Surrey, said the broader aim is to make AI more accessible and practical: "SD3.5-Flash puts a powerful creative tool directly in users' hands while keeping their data private and reducing the energy demands associated with cloud processing."
In the study, the team tested SD3.5-Flash against traditional diffusion pipelines to measure whether the drastic reduction in processing steps affected the quality of the images. They evaluated the system using standard benchmarks for generative models, including image fidelity and the extent to which outputs match text prompts. These metrics are widely used in machine learning research to compare different image generation approaches.
Tests on standard image-generation benchmarks found the model could deliver results similar to traditional diffusion systems, despite cutting the number of processing steps from around 30–50 down to just four.
Most notably, the technology is already heading toward real products. Lenovo has licensed the model for integration into its upcoming Personal Ambient Intelligence platform, called Qira, which aims to bring AI capabilities directly to consumer devices.
That could enable features like AI image generation on laptops, tablets and smartphones without the need for an internet connection. In March, the company introduced its first set of Qira-compatible devices, including new concept devices, suggesting the system will reach consumer hardware soon.
If successful, it would represent a broader shift in how generative AI is delivered. Instead of relying on centralized infrastructure, future AI tools may increasingly run locally on the edge — embedded directly into everyday devices. It's something the researchers see as part of a larger push to make generative AI more efficient and practical.
Compressing large models without sacrificing quality remains an active area of research, but SD3.5-Flash suggests the gap between powerful AI systems and consumer hardware may be shrinking quickly. If companies like Lenovo follow through with device integrations, the next wave of AI creativity tools might not live in the cloud but in your pocket.
Carly Page is a technology journalist and copywriter with more than a decade of experience covering cybersecurity, emerging tech, and digital policy. She previously served as the senior cybersecurity reporter at TechCrunch.
Now a freelancer, she writes news, analysis, interviews, and long-form features for publications including Forbes, IT Pro, LeadDev, Resilience Media, The Register, TechCrunch, TechFinitive, TechRadar, TES, The Telegraph, TIME, Uswitch, WIRED, and others. Carly also produces copywriting and editorial work for technology companies and events.