Skip to content

Back-filling streams

Let's say you have an existing stream which already contains data, in this case a topic:

my-topic := (topic partitions = 16)

And you want to create a child stream of my-topic:

child-stream := my-topic -> (filter by len(val) > 1000) -> (store stream)

When child-stream is created it won't receive any of the messages in my-topic that were there before it was created. It will only receive new messages that arrive after it was created.

Sometimes you want to run all the existing data into the child stream. You do this with a backfill operator:

child-stream := my-topic -> (backfill) ->
    (filter by len(val) > 1000) -> (store stream)

The backfill operator will scan all the data in the parent stream and feed it into the child stream before the child stream receives any new messages.

Please note, that if the parent stream has a lot of data, it can take a significant time to back-fill the child stream