A ScalaMeter generator, represented by the Gen[T] trait, is a datatype that provides input data for the test. More specifically, a generator provides a sequence of warmup test inputs, a sequence of parameter combinations which produce a specific test input value and allows producing a test input value from a parameter combination.

The trait Gen[T] looks roughly like this:

trait Gen[T] {
  def warmupset: Iterator[T]
  def dataset: Iterator[Parameters]
  def generate(params: Parameters): T
}

ScalaMeter generators are lazy – they do not internally hold references to test input objects by default. Instead, they generate them lazily when calling the next method of their input data iterators.

As mentioned earlier, generators are divided into two main categories – the basic generators and the composed generators. A number of basic generators are already predefined, and you can obtain new ones by implementing the above mentioned trait. However, the preferred way to obtain generators for more complex data types is from basic ones using for-comprehensions. The generators obtained this way are called the composed generators.

Basic generators

Gen.unit(axis: String)
Iterates only a single value – a (). This generator is useful when we don’t need a range of different inputs, or there is just one meaningful input that is encoded in the microbenchmark. For example, measuring the time needed to ping some fixed web address fits into this category.

Gen.single[T](axis: String)(v: T)
Generates a single, specified value v. Similar to the previous generator, but more general.

Gen.range(axis: String)(from: Int, upto: Int, hop: Int)
Generates an inclusive range of integer values. Used to generate collection sizes, problem size input for various algorithms, or parametrizing algorithms.

Gen.enumeration[T](axis: String)(xs: T*)
Generates the enumerated values of type T. Useful when parametrizing benchmarked algorithms or methods with non-numeric data.

Gen.exponential(axis: String)(from: Int, until: Int, factor: Int)
Generates an inclusive exponential range of integer values. The starting value is from, and each subsequent value is mutliplied by factor, until the value until is reached. Useful as an input when the measurement changes in an interesting way with a power of some parameter – for example, the parallelism level or the data size for a sorting algorithm.

Each basic generator has a single axis. The name of this axis is the name specified when the generator was created. This same name will be the name of an axis when you generate a chart using a ChartReporter.

A special, caching, generator can be obtained by calling cached on a generator. This generator will not recreate the test input values each time the input data is traversed. Instead, it will create the data only once on first iteration and keep it cached afterwards. This is useful to avoid regenerating expensive objects like thread pools or database connections when only a few such objects are needed during the entire test.

class CachedGeneratorTest
extends Bench.OnlineRegressionReport {
  def persistor = new persistence.SerializationPersistor

  val sizes = Gen.range("size")(100000000, 500000000, 200000000)
  val parallelismLevels = Gen.enumeration("parallelismLevel")(1, 2, 4, 8)
  val pools = (for (par <- parallelismLevels) yield
    new collection.parallel.ForkJoinTaskSupport(
      new concurrent.forkjoin.ForkJoinPool(par))).cached
  val inputs = Gen.tupled(sizes, pools)

  performance of "foreach" in {
    performance of "ParRange" in {
      using(inputs) config (
        exec.benchRuns -> 30,
        exec.independentSamples -> 5
      ) in { case (sz, p) =>
        val pr = (0 until sz).par
        pr.tasksupport = p
        pr.foreach(x => ())
      }
    }
  }
}

Composed generators

Here is an example of a composed generator:

for {
  size <- Gen.sizes("size")(5000, 50000, 10000)
  par <- Gen.exponential("par")(1, 8, 2)
} yield {
  val parrange = (0 until size).par
  parrange.tasksupport = createTaskSupport(par)
  parrange
}

The for-comprehension is desugared into map and flatMap calls on generators. The new generator will go over the combinations of "size" and "par" to generate different values. It will have two axes, meaning that every running time of a benchmark run using it will depend on two input parameters.
Such data dependency is best displayed using a 3D chart.

In the next section we take a look at the different reporters.