Common media processing operations with Jetpack Media3 Transformer

Posted by Nevin Mital – Developer Relations Engineer, and Kristina Simakova – Engineering Manager

Android users have demonstrated an increasing desire to create, personalize, and share video content online, whether to preserve their memories or to make people laugh. As such, media editing is a cornerstone of many engaging Android apps, and historically developers have often relied on external libraries to handle operations such as Trimming and Resizing. While these solutions are powerful, integrating and managing external library dependencies can introduce complexity and lead to challenges with managing performance and quality.

The Jetpack Media3 Transformer APIs offer a native Android solution that streamline media editing with fast performance, extensive customizability, and broad device compatibility. In this blog post, we’ll walk through some of the most common editing operations with Transformer and discuss its performance.

Getting set up with Transformer

To get started with Transformer, check out our Getting Started documentation for details on how to add the dependency to your project and a basic understanding of the workflow when using Transformer. In a nutshell, you’ll:

Create one or many MediaItem instances from your video file(s), then

Apply item-specific edits to them by building an EditedMediaItem for each MediaItem,

Create a Transformer instance configured with settings applicable to the whole exported video,

and finally start the export to save your applied edits to a file.

Aside: You can also use a CompositionPlayer to preview your edits before exporting, but this is out of scope for this blog post, as this API is still a work in progress. Please stay tuned for a future post!

Here’s what this looks like in code:

val mediaItem = MediaItem.Builder().setUri(mediaItemUri).build()
val editedMediaItem = EditedMediaItem.Builder(mediaItem).build()
val transformer = 
  Transformer.Builder(context)
    .addListener(/* Add a Transformer.Listener instance here for completion events */)
    .build()
transformer.start(editedMediaItem, outputFilePath)

Transcoding, Trimming, Muting, and Resizing with the Transformer API

Let’s now take a look at four of the most common single-asset media editing operations, starting with Transcoding.

Transcoding is the process of re-encoding an input file into a specified output format. For this example, we’ll request the output to have video in HEVC (H265) and audio in AAC. Starting with the code above, here are the lines that change:

val transformer = 
  Transformer.Builder(context)
    .addListener(...)
    .setVideoMimeType(MimeTypes.VIDEO_H265)
    .setAudioMimeType(MimeTypes.AUDIO_AAC)
    .build()

Many of you may already be familiar with FFmpeg, a popular open-source library for processing media files, so we’ll also include FFmpeg commands for each example to serve as a helpful reference. Here’s how you can perform the same transcoding with FFmpeg:

$ ffmpeg -i $inputVideoPath -c:v libx265 -c:a aac $outputFilePath

The next operation we’ll try is Trimming.

Specifically, we’ll set Transformer up to trim the input video from the 3 second mark to the 8 second mark, resulting in a 5 second output video. Starting again from the code in the “Getting set up” section above, here are the lines that change:

// Configure the trim operation by adding a ClippingConfiguration to
// the media item
val clippingConfiguration =
   MediaItem.ClippingConfiguration.Builder()
     .setStartPositionMs(3000)
     .setEndPositionMs(8000)
     .build()
val mediaItem =
   MediaItem.Builder()
     .setUri(mediaItemUri)
     .setClippingConfiguration(clippingConfiguration)
     .build()

// Transformer also has a trim optimization feature we can enable.
// This will prioritize Transmuxing over Transcoding where possible.
// See more about Transmuxing further down in this post.
val transformer = 
  Transformer.Builder(context)
    .addListener(...)
    .experimentalSetTrimOptimizationEnabled(true)
    .build()

With FFmpeg:

$ ffmpeg -ss 00:00:03 -i $inputVideoPath -t 00:00:05 $outputFilePath

Next, we can mute the audio in the exported video file.

val editedMediaItem = 
  EditedMediaItem.Builder(mediaItem)
    .setRemoveAudio(true)
    .build()

The corresponding FFmpeg command:

$ ffmpeg -i $inputVideoPath -c copy -an $outputFilePath

And for our final example, we’ll try resizing the input video by scaling it down to half its original height and width.

val scaleEffect = 
  ScaleAndRotateTransformation.Builder()
    .setScale(0.5f, 0.5f)
    .build()
val editedMediaItem =
  EditedMediaItem.Builder(mediaItem)
    .setEffects(
      /* audio */ Effects(emptyList(), 
      /* video */ listOf(scaleEffect))
    )
    .build()

An FFmpeg command could look like this:

$ ffmpeg -i $inputVideoPath -filter:v scale=w=trunc(iw/4)*2:h=trunc(ih/4)*2 $outputFilePath

Of course, you can also combine these operations to apply multiple edits on the same video, but hopefully these examples serve to demonstrate that the Transformer APIs make configuring these edits simple.

Transformer API Performance results

Here are some benchmarking measurements for each of the 4 operations taken with the Stopwatch API, running on a Pixel 9 Pro XL device:

(Note that performance for operations like these can depend on a variety of reasons, such as the current load the device is under, so the numbers below should be taken as rough estimates.)

Input video format: 10s 720p H264 video with AAC audio

Transcoding to H265 video and AAC audio: ~1300ms

Trimming video to 00:03-00:08: ~2300ms

Muting audio: ~200ms

Resizing video to half height and width: ~1200ms

Input video format: 25s 360p VP8 video with Vorbis audio

Transcoding to H265 video and AAC audio: ~3400ms

Trimming video to 00:03-00:08: ~1700ms

Muting audio: ~1600ms

Resizing video to half height and width: ~4800ms

Input video format: 4s 8k H265 video with AAC audio

Transcoding to H265 video and AAC audio: ~2300ms

Trimming video to 00:03-00:08: ~1800ms

Muting audio: ~2000ms

Resizing video to half height and width: ~3700ms

One technique Transformer uses to speed up editing operations is by prioritizing transmuxing for basic video edits where possible. Transmuxing refers to the process of repackaging video streams without re-encoding, which ensures high-quality output and significantly faster processing times.

When not possible, Transformer falls back to transcoding, a process that involves first decoding video samples into raw data, then re-encoding them for storage in a new container. Here are some of these differences:

Transmuxing

Transformer’s preferred approach when possible - a quick transformation that preserves elementary streams.
Only applicable to basic operations, such as rotating, trimming, or container conversion.
No quality loss or bitrate change.

Transcoding

Transformer's fallback approach in cases when Transmuxing isn't possible - Involves decoding and re-encoding elementary streams.
More extensive modifications to the input video are possible.
Loss in quality due to re-encoding, but can achieve a desired bitrate target.

We are continuously implementing further optimizations, such as the recently introduced experimentalSetTrimOptimizationEnabled setting that we used in the Trimming example above.

A trim is usually performed by re-encoding all the samples in the file, but since encoded media samples are stored chronologically in their container, we can improve efficiency by only re-encoding the group of pictures (GOP) between the start point of the trim and the first keyframes at/after the start point, then stream-copying the rest.

Since we only decode and encode a fixed portion of any file, the encoding latency is roughly constant, regardless of what the input video duration is. For long videos, this improved latency is dramatic. The optimization relies on being able to stitch part of the input file with newly-encoded output, which means that the encoder's output format and the input format must be compatible.

If the optimization fails, Transformer automatically falls back to normal export.

What’s next?

As part of Media3, Transformer is a native solution with low integration complexity, is tested on and ensures compatibility with a wide variety of devices, and is customizable to fit your specific needs.

To dive deeper, you can explore Media3 Transformer documentation, run our sample apps, or learn how to complement your media editing pipeline with Jetpack Media3. We’ve already seen app developers benefit greatly from adopting Transformer, so we encourage you to try them out yourself to streamline your media editing workflows and enhance your app’s performance!

Source: Android Developers Blog

Chrome Beta for Desktop Update

The Chrome team is excited to announce the promotion of Chrome 135 to the Beta channel for Windows, Mac and Linux. Chrome 135.0.7049.3 contains our usual under-the-hood performance and stability tweaks, but there are also some cool new features to explore - please head to the Chromium blog to learn more!

A partial list of changes is available in the Git log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Chrome Release Team
Google Chrome

Source: Google Chrome Releases

Dev Channel Update for ChromeOS / ChromeOS Flex

The Dev channel is being updated to OS version 16209.5.0 (Browser version 135.0.7049.0) for most ChromeOS devices.

If you find new issues, please let us know one of the following ways

File a bug
Visit our ChromeOS communities
General: Chromebook Help Community
Beta Specific: ChromeOS Beta Help Community
Report an issue or send feedback on Chrome
Interested in switching channels? Find out how.

Alon Bajayo,
Google ChromeOS Release Team

Source: Google Chrome Releases

Google won two major awards at MWC Barcelona 2025.

Today Google has been honored with two Global Mobile (GLOMO) Awards at Mobile World Congress 2025. Celebrating 30 years of excellence, the GLOMOs, judged by over 200 ind…

Source: The Official Google Blog

Expanding AI Overviews and introducing AI Mode

AI Mode is a new generative AI experiment in Google Search.

Source: The Official Google Blog

Chrome Beta for iOS Update

Hi everyone! We've just released Chrome Beta 135 (135.0.7049.3) for iOS; it'll become available on App Store in the next few days.

You can see a partial list of the changes in the Git log. If you find a new issue, please let us know by filing a bug.

Chrome Release Team
Google Chrome

Source: Google Chrome Releases

Drive with Chaka Khan on Waze

Check out Waze’s latest driving experience with Chaka Khan

Source: The Official Google Blog

Use AI to find exactly the right clothes, try on makeup and more

Learn more about Google Shopping’s new AI tools, like image generation for shopping, new AR beauty features and more.

Source: Google Ads & Commerce

Search Central Live is going to Madrid

We're very excited to announce that Search Central Live is going to Madrid for the first time on April 9! The event will have a mix of presenters from the Google Search, News, and Partnerships teams and the content will be delivered in English and Spanish, but we'll have live translation.

Source: Google Search Central Blog

Generate Stunning Visuals in Your Android Apps with Imagen 3 via Vertex AI in Firebase

Posted by Thomas Ezan Sr. – Android Developer Relation Engineer (@lethargicpanda)

Imagen 3, our most advanced image generation model, is now available through Vertex AI in Firebase, making it even easier to integrate it to your Android apps.

Designed to generate well-composed images with exceptional details, reduced artifacts, and rich lighting, Imagen 3 represents a significant leap forward in image generation capabilities.

Hot air balloons float over a scenic desert landscape with unique rock formations.

Image generated by Imagen 3 with prompt: “Shot in the style of DSLR camera with the polarizing filter. A photo of two hot air balloons over the unique rock formations in Cappadocia, Turkey. The colors and patterns on these balloons contrast beautifully against the earthy tones of the landscape below. This shot captures the sense of adventure that comes with enjoying such an experience.”

A wooden robot stands in a field of yellow flowers, holding a small blue bird on its outstretched hand.

Image generated by Imagen 3 with prompt: A weathered, wooden mech robot covered in flowering vines stands peacefully in a field of tall wildflowers, with a small blue bird resting on its outstretched hand. Digital cartoon, with warm colors and soft lines. A large cliff with a waterfall looms behind.

Imagen 3 unlocks exciting new possibilities for Android developers. Generated visuals can adapt to the content of your app, creating a more engaging user experience. For instance, your users can generate custom artwork to enhance their in-app profile. Imagen can also improve your app's storytelling by bringing its narratives to life with delightful personalized illustrations.

You can experiment with image prompts in Vertex AI Studio, and learn how to improve your prompts by reviewing the prompt and image attribute guide.

Get started with Imagen 3

The integration of Imagen 3 is similar to adding Gemini access via Vertex AI in Firebase. Start by adding the gradle dependencies to your Android project:

dependencies {
    implementation(platform("com.google.firebase:firebase-bom:33.10.0"))

    implementation("com.google.firebase:firebase-vertexai")
}

Then, in your Kotlin code, create an ImageModel instance by passing the model name and optionally, a model configuration and safety settings:

val imageModel = Firebase.vertexAI.imagenModel(
  modelName = "imagen-3.0-generate-001",
  generationConfig = ImagenGenerationConfig(
    imageFormat = ImagenImageFormat.jpeg(compresssionQuality = 75),
    addWatermark = true,
    numberOfImages = 1,
    aspectRatio = ImagenAspectRatio.SQUARE_1x1
  ),
  safetySettings = ImagenSafetySettings(
    safetyFilterLevel = ImagenSafetyFilterLevel.BLOCK_LOW_AND_ABOVE
    personFilterLevel = ImagenPersonFilterLevel.ALLOW_ADULT
  )
)

Finally generate the image by calling generateImages:

val imageResponse = imageModel.generateImages(
  prompt = "An astronaut riding a horse"
)

Retrieve the generated image from the imageResponse and display it as a bitmap as follow:

val image = imageResponse.images.first()
val uiImage = image.asBitmap()

Next steps

Explore the comprehensive Firebase documentation for detailed API information.

Access to Imagen 3 using Vertex AI in Firebase is currently in Public Preview, giving you an early opportunity to experiment and innovate. For pricing details, please refer to the Vertex AI in Firebase pricing page.

Start experimenting with Imagen 3 today! We're looking forward to seeing how you’ll leverage Imagen 3's capabilities to create truly unique, immersive and personalized Android experiences.

googblogs.com

All Google blogs and Press in one site

Common media processing operations with Jetpack Media3 Transformer

Getting set up with Transformer

Transcoding, Trimming, Muting, and Resizing with the Transformer API

Transformer API Performance results

Transmuxing

Transcoding

What’s next?

Source: Android Developers Blog

Chrome Beta for Desktop Update

Source: Google Chrome Releases

Dev Channel Update for ChromeOS / ChromeOS Flex

Source: Google Chrome Releases

Google won two major awards at MWC Barcelona 2025.

Source: The Official Google Blog

Expanding AI Overviews and introducing AI Mode

Source: The Official Google Blog

Chrome Beta for iOS Update

Source: Google Chrome Releases

Drive with Chaka Khan on Waze

Source: The Official Google Blog

Use AI to find exactly the right clothes, try on makeup and more

Source: Google Ads & Commerce

Search Central Live is going to Madrid

Source: Google Search Central Blog

Generate Stunning Visuals in Your Android Apps with Imagen 3 via Vertex AI in Firebase

Get started with Imagen 3

Next steps

Source: Android Developers Blog