Kotlin’s Flow in ViewModels: it’s complicated

LiveData is still your friend

Loading UI data in Android applications can be challenging. The lifecycles of the various screens need to be taken into account, as well as configuration changes leading to the destruction and recreation of Activities.

The individual screens of an app constantly toggle between interactive and hidden as the user navigates further and back in an app, switches from one app to another, or the device screen gets locked or unlocked. Each component needs to play fair and only perform active work when given the ball.

Configuration changes happen on various occasions: when changing the device orientation, switching the app to multi-window mode or resizing its window size, switching to dark or light mode, changing the default locale or font sizes, and more.

Goals of efficiency

To achieve efficient data loading in Activities and Fragments leading to the best user experience, the following should be considered:

  1. Caching: data that has been loaded successfully and is still valid should be delivered immediately and not loaded a second time. In particular, when an existing Activity or Fragment becomes visible again, or after an Activity gets recreated on configuration change;
  2. Avoiding background work: when an Activity or Fragment becomes invisible (moves from the STARTED to the STOPPED state), any ongoing loading work should be paused or canceled in order to save resources. This is especially important for endless streams of data like location updates or periodic refreshes of any kind;
  3. No work interruption during configuration changes: this is an exception to the second goal. During configuration changes, an Activity gets replaced by a new instance of it while preserving its state, so canceling ongoing work when the old instance is destroyed to immediately restart it when the new instance is created would be counter-productive.

Today: ViewModel and LiveData

To help developers reach these goals with code of manageable complexity, Google released the first Architecture Components libraries in 2017 in the form of ViewModel and LiveData. This was before Kotlin was introduced as the recommended programming language to develop Android applications.

ViewModels are objects preserved across configuration changes. They are useful to reach goals #1 and #3: loading operations can run uninterrupted inside them during configuration changes, while the resulting data can be cached in them and shared with the Activity and/or Fragments currently attached to them.

LiveData is a simple observable data holder class that is also lifecycle-aware. New values are only dispatched to observers when their lifecycle is at least in the STARTED (visible) state, and observers are unregistered automatically, which is handy to avoid memory leaks. LiveData is useful to reach goals #1 and #2: it caches the latest value of the data it holds and that value is automatically dispatched to new observers. Plus, it is notified when there are no more registered observers in the STARTED state, which makes it possible to avoid performing unnecessary background work.

A graph illustrating the ViewModel Scope in relation to the Activity lifecycle

If you are an experienced Android developer, you probably know all of this already. But it’s important to recap these features in order to compare them with those of Flow.

LiveData + Coroutines

LiveData itself is quite limited compared to reactive streams solutions like RxJava:

  • It only handles passing data to and from the main thread, leaving the burden of managing background threads to the developer.
    Notably, the map() operator executes its transformation function on the main thread and cannot be used to perform I/O operations or heavy CPU work. For that case the switchMap() operator needs to be used, in combination with manually launching an asynchronous operation on a background thread, even if only a single value has to be posted back on the main thread.
  • Only 3 transformation operators are provided for LiveData: map(), switchMap() and distinctUntilChanged(). If more are needed, you must implement them yourself using MediatorLiveData.

To help overcome these limitations, the Jetpack libraries also provide bridges from LiveData to other technologies like RxJava or Kotlin’s coroutines.

The simplest and most elegant bridge in my opinion is the LiveData coroutine builder function provided by the androidx.lifecycle:lifecycle-livedata-ktx Gradle dependency. This function is similar to the flow {} builder function from the Kotlin Coroutines library and lets you wrap a coroutine as a LiveData instance:

    val result: LiveData<Result> = liveData {
        val data = someSuspendingFunction()
        emit(data)
    }

  • You can use all the power of coroutines and coroutine contexts to write asynchronous code in a synchronous way without callbacks, switching between threads automatically as needed;
  • New values are dispatched to the LiveData observers on the main thread by calling the emit() or emitSource() suspending functions from the coroutine;
  • The coroutine uses a special scope and lifecycle tied to the LiveData instance. When the LiveData becomes inactive (has no more observers in the STARTED state), the coroutine is automatically canceled, which achieves goal #2 without any extra work;
  • The cancellation of the coroutine will actually be delayed by 5 seconds after the LiveData becomes inactive in order to handle configuration changes gracefully: if a new Activity immediately replaces the old one and the LiveData becomes active again before the timeout, cancellation will not happen and the cost of an unnecessary restart will be avoided (goal #3);
  • If the user comes back to the screen and the LiveData becomes active again, the coroutine will automatically restart, but only if it was previously canceled before completion. As soon as the coroutine completes, it will not restart anymore, which avoids loading the same data twice if the input did not change (goal #1).

Conclusion: by using the LiveData coroutines builder you get the best behavior by default with the simplest code.

Instead of suspending functions returning a single value, what if a repository provides functions returning a stream of values in the form of a Flow? It is also possible to convert it to a LiveData and take advantage of all the above features by using the asLiveData() extension function:

val result: LiveData<Result> = someFunctionReturningFlow().asLiveData()

Under the hood, asLiveData() also uses the LiveData coroutines builder to create a simple coroutine collecting the Flow while the LiveData is active:

    fun <T> Flow<T>.asLiveData(): LiveData<T> = liveData {
        collect {
            emit(it)
        }
    }

But let’s pause for a moment: what exactly is a Flow, and can it be used as a complete replacement for LiveData?

Introducing Kotlin’s Flow

Charlie Chaplin turning his back on his wife labeled LiveData to look at an attractive woman labeled Flow
Flow is newer and trendy so it must be better, right?

Flow is a class from Kotlin’s Coroutines library introduced in 2019 which represents a stream of values computed asynchronously. It’s similar in concept to RxJava Observables but is based on coroutines and has a simpler API.

At first, only cold flows were available: stateless streams that are created on demand each time an observer starts collecting their values in the scope of a coroutine. Each observer gets its own sequence of values, they are not shared.
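To make the “created on demand” part concrete, here is a minimal sketch using only kotlinx.coroutines. The `numbers()` function and the `producerStarts` counter are names I made up for illustration; the point is simply that the flow body runs again for every collection.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Counts how many times the flow body has been started (illustrative only).
var producerStarts = 0

// A cold flow: the body below runs from scratch for every collector.
fun numbers(): Flow<Int> = flow {
    producerStarts++
    emit(1)
    emit(2)
}

fun main() = runBlocking {
    val cold = numbers()
    val first = cold.toList()   // starts the producer
    val second = cold.toList()  // starts it again, independently of the first run
    println("first=$first second=$second starts=$producerStarts")
}
```

Both collections receive the full sequence, and the counter ends at 2 because nothing is shared between them.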

Later, the new hot flow subtypes SharedFlow and StateFlow were added and graduated as stable APIs in release 1.4.0 of the Coroutines library.

SharedFlow publishes values that are broadcast to all observers. It can manage an optional replay cache and/or a buffer and basically replaces all variants of the deprecated BroadcastChannel API.

StateFlow is a specialized and optimized subtype of SharedFlow which stores and replays the latest value only. Sounds familiar?

StateFlow and LiveData have a lot in common:

  • They are observable classes
  • They store and broadcast the latest value to any number of observers
  • They force you to catch exceptions early: an uncaught Exception in a LiveData callback stops the application. An uncaught Exception in a hot Flow ends the stream with no possibility to restart it, even when using the .catch() operator.

But they also have important differences:

  • MutableStateFlow requires an initial value, MutableLiveData does not (note: MutableSharedFlow(replay = 1) can be used to emulate a MutableStateFlow with no initial value, but its implementation is a bit less efficient)
  • StateFlow always filters out repetitions of the same value using Any.equals() for comparison, LiveData does not unless combined with the distinctUntilChanged() operator (note: a SharedFlow can also be used to prevent this behavior)
  • StateFlow is not lifecycle-aware. However, a Flow can be collected from a lifecycle-aware coroutine, which requires more setup code than LiveData does (more details below)
  • LiveData uses versioning to keep track of which value has been dispatched to which observer. This avoids dispatching the same value twice to the same observer when it moves back to the STARTED state.
    StateFlow has no versioning. Each time a coroutine collects a Flow, it is considered as a new observer and will always receive the latest value first. This can lead to performing duplicate work as we will see in the following case study.
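Two of these differences can be observed without any Android dependency. The sketch below is only an illustration (the `stateFlowDemo()` function is a hypothetical name): it shows that an equal value is not dispatched again to a running collector, and that a brand new collection always starts with the latest value.

```kotlin
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Returns the values seen by one long-running collector,
// plus the first value received by a brand new collector.
fun stateFlowDemo(): Pair<List<String>, String> = runBlocking {
    val state = MutableStateFlow("a")   // an initial value is mandatory
    val received = mutableListOf<String>()

    val job = launch { state.collect { received += it } }
    yield()             // let the collector start; it receives "a" right away
    state.value = "a"   // equal to the current value: not dispatched again
    yield()
    state.value = "b"   // different value: dispatched
    yield()
    job.cancel()

    // No versioning: a new collection always begins with the latest value.
    val replayed = state.first()
    received.toList() to replayed
}

fun main() {
    println(stateFlowDemo())
}
```

The long-running collector sees "a" then "b" (the repeated "a" is filtered out), while the new collector is handed "b" again even though it was already delivered once.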

Observing LiveData vs Collecting Flow

Observing a LiveData instance from an Activity or Fragment is straightforward:

    viewModel.results.observe(viewLifecycleOwner) { data ->
        displayResult(data)
    }

It’s a one-time operation and LiveData takes care of syncing the stream with the lifecycle of the observers.

The equivalent operation for a Flow is called collecting and collection needs to be done from a coroutine. Because a Flow itself is not lifecycle-aware, the responsibility of syncing with the lifecycle is moved up to the coroutine collecting the Flow.

To create a lifecycle-aware coroutine collecting a Flow while an Activity/Fragment is in the STARTED state and cancel the collection automatically when the Activity/Fragment is destroyed, the following code can be used:

    viewLifecycleOwner.lifecycleScope.launchWhenStarted {
        viewModel.result.collect { data ->
            displayResult(data)
        }
    }

But there is a major limitation with this code: it will only work properly with cold flows not backed by a channel or buffer. Such a flow is only driven by the coroutine collecting it: when the Activity/Fragment moves to the STOPPED state, the coroutine will suspend, the Flow producer will suspend along and nothing else will happen until the coroutine is resumed.

However, there are other kinds of flows:

  • hot flows which are always active and will dispatch results to all current observers (including the suspended ones);
  • callback-based or channel-based cold flows which subscribe to an active data source when collection starts and only stop the subscription when the collection is canceled (not suspended).

For these cases, the underlying flow producer will be kept active even when the coroutine collecting the Flow is suspended, buffering new results in the background. Resources are wasted and goal #2 is missed.
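Here is a rough, Android-free sketch of the second case. The `updates` flow is a hypothetical stand-in for a callback-based source such as location updates, and the never-completed `resumed` deferred imitates a collector paused by launchWhenStarted. It is a simplification, not what the lifecycle library does internally.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun subscriptionDemo(): List<String> = runBlocking {
    val log = mutableListOf<String>()

    // Hypothetical callback-based source: subscribes when collection starts.
    val updates = callbackFlow<Int> {
        log += "subscribed"
        trySend(1)
        awaitClose { log += "unsubscribed" }   // runs only when collection ends
    }

    val resumed = CompletableDeferred<Unit>()  // never completed: collector stays suspended
    val job = launch {
        updates.collect { resumed.await() }
    }

    delay(100)
    log += "collector suspended"   // the source is still subscribed at this point
    job.cancelAndJoin()            // only cancellation releases the subscription
    log.toList()
}

fun main() {
    println(subscriptionDemo())
}
```

The log shows the subscription outliving the suspension of the collector: "unsubscribed" only appears after the collecting coroutine is canceled.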

Forrest Gump on a bench saying “Life is like a box of chocolates, you never know which kind of Flow you’re going to collect.”

A safer way to collect Flows of any kind needs to be implemented. The coroutine performing the collection must be canceled when the Activity/Fragment becomes invisible and restarted when it becomes visible again, exactly like what the LiveData coroutine builder does. For this purpose, new APIs were introduced in androidx.lifecycle:lifecycle-runtime-ktx:2.4.0 (still in alpha at the time of writing this article):

    viewLifecycleOwner.lifecycleScope.launch {
        viewLifecycleOwner.repeatOnLifecycle(Lifecycle.State.STARTED) {
            viewModel.result.collect { data ->
                displayResult(data)
            }
        }
    }

or alternatively:

    viewLifecycleOwner.lifecycleScope.launch {
        viewModel.result
            .flowWithLifecycle(viewLifecycleOwner.lifecycle, Lifecycle.State.STARTED)
            .collect { data ->
                displayResult(data)
            }
    }

As you can see, to achieve the same level of safety and efficiency, observing results from an Activity or Fragment is simpler with LiveData.

You can learn more about these new APIs in the article “A safer way to collect flows from Android UIs” by Manuel Vivo.

Replacing LiveData with StateFlow in ViewModels

Let’s go back to the ViewModel. We established that this is a simple and efficient way to fetch data asynchronously using LiveData:

    val result: LiveData<Result> = liveData {
        val data = someSuspendingFunction()
        emit(data)
    }

How can we achieve the same effect using StateFlow in place of LiveData? Jose Alcérreca wrote a long migration guide to help answer this question. Long story short, for the above use case the equivalent code is:

    val result: Flow<Result> = flow {
        val data = someSuspendingFunction()
        emit(data)
    }.stateIn(
        scope = viewModelScope,
        started = SharingStarted.WhileSubscribed(5000L),
        initialValue = Result.Loading
    )

The stateIn() operator transforms our cold flow into a hot flow able to share a single result between multiple observers. Thanks to SharingStarted.WhileSubscribed(5000L), the hot flow is started lazily when the first observer subscribes and is canceled 5 seconds after the last observer unsubscribes, which avoids doing unnecessary work in the background while also taking configuration changes into account. Furthermore, as soon as the upstream flow reaches its end, it won’t be restarted automatically by the sharing coroutine so we avoid doing the same work twice.

It looks like we managed to achieve our 3 goals and replicate almost the same behavior as LiveData using code that is a bit more complex.

But there remains a small yet key difference: each time an Activity/Fragment becomes visible again, a new flow collection starts, and StateFlow always begins by immediately delivering the latest result to the observer, even if that same result had already been delivered to the same Activity/Fragment during the previous collection. Unlike LiveData, StateFlow doesn’t support versioning, so each flow collection is treated as a brand new observer.

Is that problematic? For this simple use case, not really: an Activity or Fragment could just perform an extra check to avoid updating the View if the data hasn’t changed.

    viewLifecycleOwner.lifecycleScope.launch {
        viewModel.result
            .flowWithLifecycle(viewLifecycleOwner.lifecycle, Lifecycle.State.STARTED)
            .distinctUntilChanged()
            .collect { data ->
                displayResult(data)
            }
    }

But problems may arise in more complicated, real-life use cases, as we’ll see in the next section.

Using StateFlow as trigger in a ViewModel

A common scenario is to use a trigger-based approach to load data in a ViewModel: every time the trigger value is updated, the data gets refreshed.

Using MutableLiveData this works very well:

    class MyViewModel(repository: MyRepository) : ViewModel() {
        private val trigger = MutableLiveData<String>()

        fun setQuery(query: String) {
            trigger.value = query
        }

        val results: LiveData<SearchResult> = trigger.switchMap { query ->
            liveData {
                emit(repository.search(query))
            }
        }
    }

  • On refresh, the switchMap() operator will connect the observers to a new underlying LiveData source, replacing the old one. And because the above example uses the LiveData coroutine builder, the previous LiveData source will automatically cancel its associated coroutine 5 seconds after being disconnected from its observers. Work on obsolete values is avoided with a small delay.
  • Because LiveData has versioning, the MutableLiveData trigger will only dispatch the new value once to the switchMap() operator, as soon as there is at least one active observer. Later, when observers become inactive and active again, the work of the latest underlying LiveData source will just resume where it left off.

The code is simple enough and reaches all goals of efficiency.

Now let’s see if we can implement the same logic using MutableStateFlow in place of MutableLiveData.

The naive approach

    class MyViewModel(repository: MyRepository) : ViewModel() {
        private val trigger = MutableStateFlow("")

        fun setQuery(query: String) {
            trigger.value = query
        }

        val results: Flow<SearchResult> = trigger.mapLatest { query ->
            repository.search(query)
        }.stateIn(
            scope = viewModelScope,
            started = SharingStarted.WhileSubscribed(5000L),
            initialValue = SearchResult.EMPTY
        )
    }

The API of MutableLiveData and MutableStateFlow being very close, the trigger code looks almost identical. The biggest difference is the usage of the mapLatest() transformation function which is equivalent to LiveData’s switchMap() for a single return value (for multiple return values, flatMapLatest() should be used).

mapLatest() works like map() but instead of fully executing the transformation on all input values in sequence, the input values are consumed immediately and the transformation is executed asynchronously in a separate coroutine. When a new value is emitted in the upstream flow, the transformation coroutine for the previous value will be canceled immediately if it was still running and a new one will be launched to replace it. This way, work on obsolete values is avoided.
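The cancellation behavior is easy to observe in isolation. In this minimal sketch, the query strings and the delays are arbitrary values I picked for illustration, with delay(200) simulating a slow search: only the transformation of the last value gets to complete.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.mapLatest
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

@OptIn(ExperimentalCoroutinesApi::class)
fun mapLatestDemo(): List<String> = runBlocking {
    // Emits a new query every 50 ms, faster than the search can complete.
    val queries = flow {
        emit("a")
        delay(50)
        emit("ab")
        delay(50)
        emit("abc")
    }
    queries.mapLatest { query ->
        delay(200)               // simulated slow search, canceled when a newer query arrives
        "results for $query"
    }.toList()
}

fun main() {
    println(mapLatestDemo())
}
```

The searches for "a" and "ab" are canceled before they finish, so the resulting list only contains the result for "abc".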

So far so good. However, here comes the major problem with this code: because StateFlow does not support versioning, the trigger will re-emit the latest value when the flow collection restarts. This happens every time the Activity/Fragment becomes visible again after being invisible for more than 5 seconds.

Britney Spears singing “Oops!… I emit again”

And when the trigger re-emits the same value, the mapLatest() transformation will run again, hitting the repository one more time with the same arguments, even though the result had already been delivered and cached!
Goal #1 is missed: data that is still valid should not be loaded a second time.
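The effect can be reproduced outside of a ViewModel. The following sketch is a simplification: it leaves out stateIn() and simply collects the transformed flow twice, which is roughly what happens when the sharing coroutine restarts the upstream collection. The `searchCount` counter is a hypothetical stand-in for calls to repository.search().

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.flow.mapLatest
import kotlinx.coroutines.runBlocking

@OptIn(ExperimentalCoroutinesApi::class)
fun reEmissionDemo(): Int = runBlocking {
    val trigger = MutableStateFlow("kotlin")
    var searchCount = 0

    val results = trigger.mapLatest { query ->
        searchCount++            // stands in for repository.search(query)
        "results for $query"
    }

    results.first()   // first collection: the search runs
    results.first()   // collection restarts: same query, the search runs again
    searchCount
}

fun main() {
    println("searches performed: ${reEmissionDemo()}")
}
```

The trigger value never changed, yet the search ran once per collection, so the function returns 2.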

Preventing re-emission of the latest trigger value

The next questions that come to mind are: should we prevent this re-emission and how? StateFlow already takes care of deduplicating values from within a flow collection, and the distinctUntilChanged() operator does the same for other kinds of flows. But no standard operator exists to deduplicate values across multiple collections of the same flow, because flow collections are supposed to be self-contained. This is a major difference with LiveData.

In the specific case of a Flow shared between multiple observers using the stateIn() operator, emitted values will be cached and there will always be at most one coroutine collecting the source Flow at any given time. It looks tempting to hack around some operator function that would remember the latest value of a previous collection to be able to skip it when a new collection starts:

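    // Don't do this at home (or at work)
    private object NoValue // sentinel meaning "nothing emitted yet"

    fun <T> Flow<T>.rememberLatest(): Flow<T> {
        // Shared by all collections of the returned flow
        var latest: Any? = NoValue
        return flow {
            collectIndexed { index, value ->
                // Skip only the first value of a new collection if it was already emitted
                if (index != 0 || value !== latest) {
                    emit(value)
                    latest = value
                }
            }
        }
    }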

Remark: an attentive reader noted that the same behavior can be achieved by replacing the MutableStateFlow with a Channel(capacity = CONFLATED) then turning it into a Flow using receiveAsFlow(). Channels never re-emit values.
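As a quick illustration of that remark (a sketch with names of my own choosing, not code from the reader), a value sent to a conflated Channel is consumed by the first collection and is not available to the next one:

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.flow.receiveAsFlow
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withTimeoutOrNull

fun conflatedChannelDemo(): Pair<String, String?> = runBlocking {
    val trigger = Channel<String>(Channel.CONFLATED)
    val queries = trigger.receiveAsFlow()

    trigger.trySend("kotlin")

    val first = queries.first()   // consumes the value from the channel
    // A second collection finds the channel empty and times out after 100 ms.
    val second = withTimeoutOrNull(100) { queries.first() }
    first to second
}

fun main() {
    println(conflatedChannelDemo())
}
```

The second collection receives nothing (null here, because of the timeout), which is exactly the “no re-emission” property. It also means the value is gone for good once received, whether or not the downstream finished processing it.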

Unfortunately the above logic is flawed and will not work as intended when the downstream flow transformation is canceled before completion.

The code assumes that after emit(value) returns, the value has been processed and should not be emitted again if the flow collection restarts, but this is only true when using unbuffered Flow operators. Operators like mapLatest() are buffered and in this case emit(value) will return immediately while the transformation is executed asynchronously. This means that there is no way to know when a value has been fully processed by the downstream flow. If the flow collection is canceled in the middle of an asynchronous transformation, we still need to re-emit the latest value when the flow collection restarts in order to resume that transformation, otherwise that value will be lost!

TL;DR: Using StateFlow as a trigger in a ViewModel leads to repeating work every time the Activity/Fragment becomes visible again and there is no simple way to avoid it.

This is why LiveData is superior to StateFlow when used as a trigger in a ViewModel, even though these differences are not mentioned in Google’s “Advanced coroutines with Kotlin Flow” codelab which implies the Flow implementation behaves the exact same way as the LiveData one. It does not.

Conclusion

Here are my recommendations based on the above demonstration:

  • Keep using LiveData in your Android UI layer and ViewModels, especially for triggers. Use it whenever possible to expose data to be consumed in Activities and Fragments: it will make your code both simple and efficient;
  • The LiveData coroutine builder function is your friend and can replace Flows in ViewModels in many cases;
  • You can still use the power of Flow operators when you need them, then convert the resulting Flow to a LiveData;
  • Flow is a better fit than LiveData for all the other layers of an application like repositories or data sources, because it doesn’t depend on Android-specific lifecycles and is easier to test.

Now you know the tradeoffs you will have to make if you still want to fully “go with the Flow” and eradicate LiveData from your Android UI layer.

Are you concerned about performance and experienced similar issues? Did you find other ways to work around them? If so, feel free to post a message in the comments section and as always, please share or subscribe if you enjoyed reading this long article.

Android developer from Belgium, blogging about advanced programming topics.