From 0 to 100% Faster: How Scenario Optimized Image Generation in 3 Weeks

Feb 10, 2025

Quentin Sinig

Go-to-Market Lead

John Rachwan

Cofounder & CTO

A matter of timing

In early 2024, Pruna AI was introduced to Scenario through Mottier Ventures. Hervé, Scenario’s founder, saw the potential in Pruna AI’s optimization capabilities but had to be pragmatic—his small team needed to prioritize their most pressing challenges.

So, we kept the conversation going, sharing only what mattered: tangible proof of significant efficiency gains and feature updates that could directly improve his team’s productivity. Still, we waited for the right trigger.

That moment came when Scenario explored a new use case: an image upscaler. An independent project, separate from Scenario’s existing ML pipelines. It was the perfect opportunity to put our solution to the test.

Why was it a mutual fit?

From our conversations with other image generation companies, we realized something: when teams are already deeply invested in inference optimization, proving our value isn’t as straightforward as it seems. Sure, we can push beyond what TensorRT and torch.compile offer, but demonstrating that in a meaningful way is another challenge.

For example, we showed that we could make Flux Dev (see playground) 9x faster, bringing it down to 30ms per step on 1024x1024 images. But later, we recognized a key gap in our benchmarking: comparing a vanilla model’s performance with Pruna’s optimizations wasn’t a perfect 1:1, since many teams—Scenario included—had already implemented their own efficiency tweaks.

What made Scenario x Pruna a great fit, however, was the way we worked together. Scenario dedicated a key person to our collaboration, allowing us to operate in a task-force mode—handling one task at a time with direct, hands-on involvement. No waiting on tickets, no long feedback loops—just tight, iterative collaboration, making each other’s work easier.

From “Zero to Prod” in 3 weeks

As mentioned earlier, we started with the upscaler model. In just one week, we implemented a hands-on optimization in the dev environment, ensuring the upscaler preserved image quality with zero degradation. After intensive QA and stress testing, the optimized model went to production in just three weeks.

The upscaler was one of Scenario’s most used services, but its deployment came with a major challenge: dynamic pipeline switching. Scenario’s infrastructure needed to handle:

  • Model and scheduler switching

  • Parameter variations

To address this, we integrated Pruna in a way that supports their dynamic architecture and configurations, including LoRA switching. We also experimented with integrating a third-party tool for anonymous voting tests, validating that our optimizations maintained or improved image quality compared to the original model.
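To make the switching problem concrete, here is a minimal sketch of one generic way to serve optimized pipelines when the model, scheduler, and LoRA can change per request: build each optimized variant once and keep a small least-recently-used cache of them. This is not Scenario’s or Pruna’s actual integration. `PipelineKey`, `OptimizedPipelineCache`, and the `build` callback are hypothetical names introduced for illustration, and the real optimization step is replaced by whatever function you pass in.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Callable, Generic, List, Optional, TypeVar

P = TypeVar("P")  # type of the optimized pipeline object


@dataclass(frozen=True)
class PipelineKey:
    """Hypothetical identifier for one runtime configuration."""
    model: str
    scheduler: str
    lora: Optional[str] = None


class OptimizedPipelineCache(Generic[P]):
    """Keeps at most `capacity` optimized pipelines in memory.

    The least recently used entry is evicted first, which bounds
    memory use when many configurations are requested.
    """

    def __init__(self, build: Callable[[PipelineKey], P], capacity: int = 2) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._build = build  # assumed to load and optimize a pipeline
        self._capacity = capacity
        self._entries: "OrderedDict[PipelineKey, P]" = OrderedDict()

    def get(self, key: PipelineKey) -> P:
        if key in self._entries:
            # Cache hit: mark as most recently used and reuse it.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: pay the build/optimization cost once.
        pipeline = self._build(key)
        self._entries[key] = pipeline
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the oldest entry
        return pipeline

    def cached_keys(self) -> List[PipelineKey]:
        """Keys currently held, from least to most recently used."""
        return list(self._entries)
```

In practice the `build` callback would load the requested model and scheduler, attach the LoRA, and apply the optimization. The capacity would be set from available GPU memory, since holding too many variants at once is one way to run into out-of-memory errors.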

Scenario’s QA team ran intensive stress tests on different inputs and parameter settings. After quickly resolving OOM issues and ensuring no breaking failures (artifacts or inconsistencies), the final version worked across all scenarios. It passed development, then production—within just three weeks.

A word about the evaluation process

It’s worth noting that the current evaluation process is mostly manual. Validation today relies on blind testing with shuffled outputs for human evaluation—a standard approach, but one that’s time-consuming and subjective. While pixel-by-pixel checks are integrated into CI/CD workflows, they don’t fully ensure that the image’s overall quality (e.g., artifacts, visual coherence) aligns with customer expectations.

Pruna AI has a plan for this. After experimenting with a voting system to incorporate human-in-the-loop evaluations, we recognized that the industry needs a plug-and-play validation tool—one that’s:

  • Easy to integrate with minimal setup

  • Capable of comparing inputs and outputs in a structured way

  • Equipped with automated metrics to identify the best-performing model

To bring this all together, we’re envisioning dashboards that allow teams to monitor each deployment, track before-and-after performance, and stay on top of model quality as it evolves.
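The blind test with shuffled outputs described above can be sketched in a few lines. This is an illustrative outline only, not the third-party voting tool or Pruna’s planned validation product. The function names, the `"baseline"`/`"optimized"` labels, and the vote format are assumptions made for the example.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def make_blind_trials(sample_ids: Sequence[str], seed: int = 0) -> List[Dict[str, str]]:
    """Randomly assign baseline/optimized outputs to sides 'A' and 'B'.

    Reviewers only ever see 'A' and 'B', so they cannot tell which
    model produced which image.
    """
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    trials = []
    for sample_id in sample_ids:
        if rng.random() < 0.5:
            sides = {"A": "optimized", "B": "baseline"}
        else:
            sides = {"A": "baseline", "B": "optimized"}
        trials.append({"sample_id": sample_id, **sides})
    rng.shuffle(trials)  # also shuffle the order in which samples are shown
    return trials


def tally_votes(trials: List[Dict[str, str]], votes: Dict[str, str]) -> Counter:
    """Map each 'A'/'B'/'tie' vote back to the model that was behind it."""
    counts: Counter = Counter()
    for trial in trials:
        vote = votes[trial["sample_id"]]
        if vote == "tie":
            counts["tie"] += 1
        elif vote in ("A", "B"):
            counts[trial[vote]] += 1
        else:
            raise ValueError(f"unexpected vote: {vote!r}")
    return counts
```

A result where "optimized" and "tie" together clearly outnumber "baseline" would support the claim that quality was maintained. How many samples and reviewers are enough is a separate statistical question that this sketch does not address.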

40% baseline improvement, up to 3x faster

The results were game-changing. In just three weeks, Scenario’s upscaler saw a 40% speed improvement on A100 80GB GPUs, allowing them to process significantly more with the same resources, without any quality trade-offs.

But the impact didn’t stop there.

  • The style transfer model, a simpler integration, was optimized with a 2x speedup and was production-ready in just one week.

  • For Flux Schnell, Pruna AI estimated a potential 2-3x speed improvement, pushing toward sub-second generation times, even under complex dynamic switching conditions.

Because Flux Schnell is core to Scenario’s business and tied to very complex pipelines, it will require more work. We’re not done yet, and that’s what we’re currently working on with Scenario’s team.

Pruna pays for itself through compute savings

Last point: ROI. This comes up in every conversation with a new prospect (and hopefully with you too, dear reader!), so let’s break it down with some real numbers. While we can’t disclose exact figures (fun fact: everyone wants to know, no one wants to share), we can talk about ratios instead.

Here’s the ROI case:

  • Pruna AI represents ~12.5% of the total compute costs where it’s deployed.

  • On average, we provide a 100% efficiency gain (though this varies significantly between models).

  • For every $1 saved, ~$0.25 goes to Pruna, and the rest translates into direct cost savings for Scenario.

Now, let’s put that in perspective.

An A100 80GB costs ~$4 per GPU per hour on AWS (purchased in packs of 8). Running a single instance 24/7 results in $8,640 saved per month (or $1,080 per GPU)—after Pruna’s licensing cost is deducted.
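The figures above can be reproduced with a short calculation. It rests on two assumptions not stated explicitly in the text: a 30-day month, and reading the "100% efficiency gain" as a 2x speedup, meaning the same workload needs half the GPU hours. The function and variable names are ours.

```python
def monthly_roi(
    gpu_hourly_usd: float = 4.0,          # ~$4 per A100 80GB per hour (from the text)
    gpus_per_instance: int = 8,           # instances come in packs of 8 GPUs
    hours_per_month: int = 24 * 30,       # assumption: 30-day month, running 24/7
    speedup: float = 2.0,                 # assumption: 100% efficiency gain = 2x
    pruna_share_of_savings: float = 0.25, # ~$0.25 of every $1 saved
) -> dict:
    """Monthly cost breakdown for one instance, in USD."""
    baseline = gpu_hourly_usd * gpus_per_instance * hours_per_month
    optimized_compute = baseline / speedup   # same workload, fewer GPU hours
    gross_savings = baseline - optimized_compute
    pruna_fee = gross_savings * pruna_share_of_savings
    net_savings = gross_savings - pruna_fee
    return {
        "baseline": baseline,
        "gross_savings": gross_savings,
        "pruna_fee": pruna_fee,
        "net_savings": net_savings,
        "net_savings_per_gpu": net_savings / gpus_per_instance,
        "fee_share_of_baseline": pruna_fee / baseline,
    }


roi = monthly_roi()
print(roi["baseline"])               # 23040.0
print(roi["net_savings"])            # 8640.0
print(roi["net_savings_per_gpu"])    # 1080.0
print(roi["fee_share_of_baseline"])  # 0.125
```

Under these assumptions the three ratios in the list are consistent with each other: a fee of 25% of the savings on a workload whose compute cost is halved equals 12.5% of the original compute bill.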

Bottom line: Pruna’s licensing costs are offset by the compute cost reductions.


About Scenario

Scenario is a powerful suite of AI tools designed for the gaming, media, entertainment, design, and creative industries. It helps teams create visually consistent assets with maximum efficiency and productivity. Scenario provides users with intuitive, high-precision tools to streamline and manage AI-driven content creation at every stage.


Speed Up Your Models With Pruna AI.

Inefficient models drive up costs, slow down your productivity and increase carbon emissions. Make your AI more accessible and sustainable with Pruna.

© 2025 Pruna AI - Built with Pretzels & Croissants 🥨 🥐