
Enumerator and Lazy Evaluation in Ruby — Processing Large Collections Without Blowing Memory

Most Ruby developers reach for map, select, and each without thinking twice — and for small arrays, that’s totally fine. But the moment you’re working with large datasets, external files, paginated API responses, or sequences that are theoretically infinite, eager evaluation starts costing memory you can’t afford to spend. Ruby’s Enumerator and lazy evaluation give you a way to process data one element at a time, computing only what you actually need.

What Is an Enumerator?


An Enumerator is an object that wraps iteration. Instead of executing a loop immediately, it encapsulates the iteration logic so you can control when and how elements are produced. Most built-in iteration methods (each, map, select, and friends) return an Enumerator when called without a block.

Example:

enum = [1, 2, 3].each
# => #<Enumerator: [1, 2, 3]:each>

enum.next  # => 1
enum.next  # => 2
enum.next  # => 3
enum.next  # raises StopIteration

You can also build custom enumerators with Enumerator.new. The block receives a “yielder” object — call << or yield on it to push values out.

Example:

counter = Enumerator.new do |yielder|
  i = 0
  loop do
    yielder << i
    i += 1
  end
end

counter.next   # => 0
counter.next   # => 1
counter.take(5)  # => [0, 1, 2, 3, 4]

That loop do runs forever in theory, but the enumerator only executes as far as the caller pulls. That’s the key insight: the producer and consumer are decoupled.

Infinite Sequences


Infinite enumerators are genuinely useful — not just academic curiosities. The classic example is a Fibonacci sequence that never terminates:

Example:

fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

fibonacci.take(10)
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

fibonacci.first(5)
# => [0, 1, 1, 2, 3]

No array is ever allocated to hold “all Fibonacci numbers.” The sequence runs as long as the caller keeps asking. This pattern is also useful for things like unique ID generators, paginated API cursors, or any sequence where the next value depends on the previous one.
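For instance, a paginated cursor can be wrapped in an enumerator so callers consume records as one flat stream. This is a minimal sketch: fetch_page and the PAGES stub are hypothetical stand-ins for a real API client that returns items plus a next-page cursor.

```ruby
# Hypothetical two-page API response, keyed by cursor (nil = first page).
PAGES = {
  nil  => { items: [1, 2, 3], next_cursor: "p2" },
  "p2" => { items: [4, 5],    next_cursor: nil }
}.freeze

# Stand-in for a real API call.
def fetch_page(cursor)
  PAGES.fetch(cursor)
end

# Yield records page by page; fetch the next page only when needed.
records = Enumerator.new do |yielder|
  cursor = nil
  loop do
    page = fetch_page(cursor)
    page[:items].each { |item| yielder << item }
    cursor = page[:next_cursor]
    break if cursor.nil?
  end
end

records.take(4)  # => [1, 2, 3, 4] — the second page is fetched only for the 4th record
```

The caller never sees pagination; it just pulls records, and pages are fetched on demand.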

Lazy Enumerators — Deferring Computation


Where a custom Enumerator.new gives you explicit control over production, .lazy gives you laziness on top of existing collections and chains. Calling .lazy on any enumerable returns an Enumerator::Lazy — operations chained onto it are not executed until values are consumed.

Example:

# Eager — never returns: select tries to walk the entire infinite
# range before map or first ever run.
# result = (1..Float::INFINITY)
#   .select { |n| n.odd? }
#   .map { |n| n ** 2 }
#   .first(5)

# Lazy — processes one element at a time, stops after 5
result = (1..Float::INFINITY)
  .lazy
  .select { |n| n.odd? }
  .map { |n| n ** 2 }
  .first(5)
# => [1, 9, 25, 49, 81]

The lazy version processes element 1, checks the predicate, applies the transform, and moves on — only doing this 5 times. The eager version would try to evaluate an infinite range before you ever get a result.

Lazy vs Eager on Finite Collections

For small collections, lazy is often slower because of the overhead of the Enumerator::Lazy wrapper. Benchmarks generally show the crossover around a few thousand elements, depending on the complexity of the chain.

Example:

require "benchmark"

data = (1..10_000).to_a

Benchmark.bm do |x|
  x.report("eager: ") do
    data.select { |n| n % 3 == 0 }.map { |n| n * 2 }.first(100)
  end
  x.report("lazy:  ") do
    data.lazy.select { |n| n % 3 == 0 }.map { |n| n * 2 }.first(100)
  end
end

For small arrays where you want all results, eager wins. For large collections where you only need the first N elements, or for infinite sequences, lazy wins clearly.

Chaining Lazy Operations


Lazy chains compose naturally. Every lazy operation returns another Enumerator::Lazy, so you can chain .select, .map, .reject, .flat_map, and .take freely. The chain only fires when you terminate it with .first(n), .to_a, .force, or iterate with .each.

Example:

result = (1..Float::INFINITY)
  .lazy
  .select { |n| n % 2 == 0 }          # keep even numbers
  .map { |n| n * n }                   # square them
  .reject { |n| n.to_s.include?("4") } # drop those containing digit 4
  .first(5)

# => [16, 36, 100, 196, 256]
# squares: 4, 16, 36, 64, 100, 144, 196, 256 — 4, 64, 144 dropped (contain "4"), the rest kept

.force is an alias for .to_a on lazy enumerators — it forces full evaluation and returns a plain array. Use it when you want all remaining elements but want to signal to readers that this is intentionally materializing a lazy chain.

Example:

(1..20).lazy.select(&:odd?).map { |n| n * 3 }.force
# => [3, 9, 15, 21, 27, 33, 39, 45, 51, 57]

Pro-Tip: each_with_object is not lazy-chain-friendly. Enumerator::Lazy doesn’t override it, so it falls back to the eager Enumerable implementation — on an infinite lazy sequence it never returns, and on a finite one it silently materializes the whole chain. If you need to accumulate, bound the sequence first with .first(n), .take(n).force, or .to_a, then fold the resulting array with inject/reduce or each_with_object.
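A safe version of that accumulation pattern bounds the lazy sequence first, then folds the finite array it returns:

```ruby
squares = (1..Float::INFINITY).lazy.map { |n| n * n }

# Wrong order: each_with_object iterates eagerly, so calling it directly
# on the infinite lazy chain would never return.
# squares.each_with_object([]) { |n, acc| acc << n }  # hangs!

# Safe order: take a finite prefix lazily, then fold it eagerly.
grouped = squares.first(6).each_with_object(Hash.new(0)) do |n, counts|
  counts[n.even? ? :even : :odd] += 1
end
# => {:odd=>3, :even=>3}
```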

Enumerator::Chain — Combining Enumerators


Ruby 2.6 introduced Enumerator::Chain and the + operator for enumerators, letting you concatenate two independent sequences into a single one.

Example:

first_batch  = [1, 2, 3].each
second_batch = [4, 5, 6].each

combined = first_batch + second_batch
combined.to_a  # => [1, 2, 3, 4, 5, 6]

You can also use Enumerator::Chain.new explicitly:

Example:

chain = Enumerator::Chain.new(
  (1..3).each,
  ("a".."c").each,
  [:x, :y, :z].each
)

chain.to_a  # => [1, 2, 3, "a", "b", "c", :x, :y, :z]

This is useful when you want to process multiple data sources — like shards of a database result, or chunks of a file — as a single stream without loading them all into memory simultaneously. Combine it with lazy for a memory-efficient pipeline across multiple sources:

Example:

source_a = (1..Float::INFINITY).lazy.select { |n| n % 5 == 0 }
source_b = (1..Float::INFINITY).lazy.select { |n| n % 7 == 0 }

combined = source_a + source_b
combined.first(6)  # => [5, 10, 15, 20, 25, 30]

Note that a chain is sequential, not interleaved: source_b is only consumed after source_a is exhausted, so with an infinite source_a the first six values all come from it.

When to Use Lazy vs Eager


The decision is simpler than it sounds:

  • Use eager when the collection fits comfortably in memory and you need all the results anyway. Array#map and Array#select are optimized and fast for typical use.
  • Use lazy when you only need the first N results from a large or infinite sequence — the savings on computation and memory are real.
  • Use lazy when reading large files line-by-line or processing paginated external data, so you never hold the full dataset in RAM.
  • Use Enumerator.new when you need custom iteration logic, stateful sequences, or to expose an iteratable interface from a non-standard data source.

One pattern worth knowing: IO#each_line already behaves lazily in practice, but wrapping it in a lazy chain lets you compose transformations cleanly:

Example:

interesting_lines = File.foreach("large_log.txt")
  .lazy
  .select { |line| line.include?("ERROR") }
  .map(&:chomp)
  .first(50)

This reads only as many lines as needed to find 50 errors. No 500MB log file loaded into memory.

Conclusion


Ruby’s Enumerator is one of those features that unlocks a completely different way of thinking about iteration. Once you understand that computation can be deferred — that you can define what to produce without immediately producing it — you start seeing patterns everywhere: infinite sequences, streaming pipelines, memory-efficient transforms over large data. Lazy enumerators and Enumerator::Chain extend that model to composition and multi-source pipelines. The performance win is real for the right use cases, but honestly, the bigger win is writing code that clearly expresses “I only care about the first N results” rather than computing everything and throwing most of it away.

FAQs


Q1: Is lazy evaluation always faster than eager?
No. For small collections, lazy has overhead from the Enumerator::Lazy wrapper that makes it slower than a plain map/select chain. Lazy wins when you’re working with large or infinite sequences and only need a subset of results — the savings on unnecessary computation outweigh the wrapper overhead.

Q2: What’s the difference between .first(n) and .take(n).to_a on a lazy enumerator?
They’re functionally equivalent for lazy enumerators. Both consume exactly n elements and return an array. .first(n) is slightly more idiomatic and readable; .take(n).to_a (or its alias .take(n).force) is explicit about the lazy/force boundary if you want to signal intent in the code.
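A quick demonstration of the equivalence, and of the fact that .take(n) alone computes nothing:

```ruby
doubled = (1..Float::INFINITY).lazy.map { |n| n * 2 }

doubled.first(3)       # => [2, 4, 6]
doubled.take(3).force  # => [2, 4, 6]
doubled.take(3)        # still an Enumerator::Lazy — no values computed yet
```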

Q3: Can I reuse an Enumerator after calling .next on it?
Not without rewinding. An Enumerator maintains internal state. Call .rewind to reset it to the beginning. Note that lazy enumerators built from infinite sequences can’t be meaningfully rewound — creating a new one is the right approach.
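A minimal illustration of rewinding external iteration:

```ruby
enum = [10, 20, 30].each

enum.next    # => 10
enum.next    # => 20
enum.rewind  # reset internal state to the beginning
enum.next    # => 10
```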

Q4: How does Enumerator.new differ from to_enum / enum_for?
Enumerator.new { |y| ... } creates a fully custom enumerator with explicit yielding logic. to_enum(:each) (aliased as enum_for) wraps an existing each method into an enumerator, and is the standard guard — return enum_for(:each) unless block_given? — for making each usable without a block. For a custom class, including Enumerable and defining each is usually the cleaner approach.

Q5: Does lazy work with flat_map?
Yes. Enumerator::Lazy supports flat_map (also aliased as collect_concat). It flattens one level and stays lazy. This is useful for expanding paginated results or nested data structures without materializing the entire output before processing.
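A small sketch: each element expands into a pair, flat_map flattens one level, and the chain stays lazy over the infinite source.

```ruby
# Expand each n into [n, -n]; nothing is materialized until first(6).
pairs = (1..Float::INFINITY).lazy.flat_map { |n| [n, -n] }

pairs.first(6)  # => [1, -1, 2, -2, 3, -3]
```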

cdrrazan

Rajan Bhattarai

Full Stack Software Developer! 💻 🏡 Grad. Student, MCS. 🎓 Class of '23. GitKraken Ambassador 🇳🇵 2021/22. Works with Ruby / Rails. Photography when no coding. Also tweets a lot at TW / @cdrrazan!
