
Workloads

Clayton Knittel edited this page Feb 12, 2021 · 4 revisions

Workload stages define the benchmark tool's behavior, from how records are constructed to which types of transaction calls are made.

Workload stage structure

To specify workload stages, create a YAML file (for now, let's call it workload.yaml). The structure of a stage looks like this:

- stage: <n> (the stage number, starting from 1)
  desc: "a description of the stage, which will be printed at the start of the stage"
  duration: the minimum amount of time the stage will run for, in seconds. This argument
      is required.
  workload: the workload type, either I (linear insertion), RU,<read percent> (random
      read/update), or DB (delete bins)
  tps: maximum possible transactions per second, used as a throttle (default is 0, i.e.
      no throttling).
  object-spec: the object spec for the stage, otherwise inherited from the previous stage,
      with the first stage inheriting from the global context (see the object spec wiki
      page)
  key-start: integer key start (if not specified, inherited from global context).
  key-end: integer key end (if not specified, adds --keys from the global context to
      key-start to calculate key-end).
  read-bins: which bins to read from records when performing read operations (has no
      effect if the workload doesn't contain reads). The default is to read all bins.
  pause: maximum number of seconds to wait before this stage begins. The true wait time
      is chosen as a random integer number of seconds between 1 and pause. Default is 0.
  async: when true/yes, uses asynchronous commands for this stage, dispatching those
      commands with only a single worker thread. Default is false.
  random: when true/yes, randomly generates a new object for each write. Otherwise, a
      single object is made at the beginning of the stage according to the stage's
      object-spec, and that same object is used for each write transaction. Default is
      false.
  batch-size: specifies the batch size of reads for this stage. Default is 1, i.e. don't
      use batch read commands.
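
For instance, a two-stage file built from the fields above might look like the following sketch (the descriptions, key range, and tps values are illustrative):

```yaml
# Stage 1: seed the key range with linear insertion.
# duration is 0 because insertion runs until every key has been written.
- stage: 1
  desc: "linear insert"
  duration: 0
  workload: I
  key-start: 0
  key-end: 100000

# Stage 2: 30 seconds of 50% reads / 50% updates over the same keys,
# throttled to 10,000 transactions per second, generating a fresh
# random object for each write.
- stage: 2
  desc: "random read/update"
  duration: 30
  workload: RU,50
  tps: 10000
  random: true
```

Stage 2 omits key-start/key-end and object-spec, so it inherits them from stage 1.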

Notes:

  • For read/update workloads, key values are chosen randomly between key-start (inclusive) and key-end (exclusive)
    • Since there is no set amount of work needed to be done for this type of workload, the stage will last exactly duration seconds
  • For both linear and deletion workloads, all keys are iterated over by the transaction worker threads
    • Since these workloads have a fixed amount of work to do, the stage runs until the workload completes, regardless of what duration is set to. If duration turns out to be longer than the time the work actually takes, the benchmark idles until duration elapses. It is recommended to set duration to 0 for these types of workloads
    • In async mode, keys are iterated from start to end, meaning you get true linear insertion/deletion from start to end (minus reordering over the network or from thread scheduling). However, in synchronous mode, each thread takes a continuous subrange of key values from the total range and iterates linearly over that range, meaning keys are not inserted in order from a global perspective
  • You can set the number of client event loop threads with --eventLoops, which controls the amount of concurrency possible when doing an async workload.
  • You can set the maximum number of outstanding asynchronous command calls with --asyncMaxCommands (default is 50)
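
Following the recommendation above, a fixed-work stage (here a delete-bins stage, as a sketch) sets duration to 0 so it ends as soon as every key has been visited:

```yaml
# Delete bins across the whole key range; duration: 0 ends the stage
# as soon as all keys have been processed.
- stage: 1
  desc: "delete bins"
  duration: 0
  workload: DB
```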

Running only a single workload stage

To run just a single workload stage, you can supply each of the fields from the workload stages yaml file as command line arguments.

Here is a list of all the command line arguments:

  • --duration (-t): sets the duration of the stage
  • --workload (-w): sets the workload type of the stage
  • --throughput (-g): sets the target number of transactions per second
  • --objectSpec (-o): sets the global object spec
  • --startKey (-K): sets the global start key
  • --keys (-k): sets the total number of keys to be used
    • key-end is calculated as key-start + keys
  • --readBins: sets the bins to be read from
  • --async: run this workload in async mode
  • --random: randomly generates records for each write transaction
  • --batchSize: sets the batch size on read transactions

For example, to run a 30-second async random read/update workload (50% reads, 50% updates), randomly generating each record on writes, run:

target/benchmark -t 30 -w RU,50 --async --random
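
Similarly (key counts and tuning values are illustrative), a linear insertion over a fixed key range would set the duration to 0, since the stage ends once every key is written:

```sh
# Linear insertion of 1,000,000 keys starting at key 0; -t 0 because
# the stage completes when all keys have been inserted.
target/benchmark -t 0 -w I -K 0 -k 1000000

# The same insertion in async mode, with 4 client event loop threads
# and up to 100 outstanding async commands.
target/benchmark -t 0 -w I -K 0 -k 1000000 --async --eventLoops 4 --asyncMaxCommands 100
```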