Summarization of Formula Variables

Variables can be mapped to run attributes that hold multiple values. A calculation still returns a single value, so a summarization method must be specified for each variable to determine how multiple values are combined into one.

A quick review of terminology:

  • Calculation: The act or result of running a formula.
  • Formula: A set of variables used to calculate a value.
  • Run Attribute: A value that can be mapped to a variable in the formula editor.

Potential scenarios where a variable might be mapped to multiple values:

  1. N runs on Step 1 merge onto M runs on Step 2. A formula that references a run attribute on Step 1 might have multiple upstream runs connected to it. A summarization method is needed to determine which values from which runs should be used.
  2. 1 run on Step 1 merges onto 1 run on Step 2. A formula that references a run attribute on Step 1 might reference a run attribute with a multivalued data block. A summarization method is needed to determine which values from the multivalued data block should be used.
  3. N runs on Step 1 merge onto M runs on Step 2. A formula that references a run attribute on Step 1 might have multiple upstream runs connected with each run attribute having a multivalued data block. A summarization method is needed to determine which values from which multivalued data block from which runs should be used.
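
The three scenarios differ only in how many upstream runs are connected and how many values each run attribute holds. A minimal sketch in Python makes this concrete. It assumes a hypothetical layout in which each connected run is keyed by its run number and maps to the list of values in its data block; the layout, the run numbers, and the helper `candidate_values` are illustrative and do not reflect the product's actual data model.

```python
# Hypothetical shape: {run_number: [values in that run attribute's data block]}

# Scenario 1: several upstream runs, one value each.
scenario_1 = {101: [4.0], 102: [6.0], 103: [5.0]}

# Scenario 2: one upstream run with a multivalued data block.
scenario_2 = {101: [4.0, 6.0, 5.0]}

# Scenario 3: several upstream runs, each with a multivalued data block.
scenario_3 = {101: [4.0, 6.0], 102: [5.0, 7.0, 9.0]}


def candidate_values(runs):
    """Flatten every value a variable could take, ordered by run number."""
    return [v for run_number in sorted(runs) for v in runs[run_number]]


for name, runs in [("1", scenario_1), ("2", scenario_2), ("3", scenario_3)]:
    values = candidate_values(runs)
    print(f"Scenario {name}: {len(runs)} run(s), {len(values)} candidate value(s)")
```

In every scenario the variable sees more than one candidate value, which is why a summarization method has to pick or combine them.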

Summarization methods vary based on the data types being mapped to a variable. A brief description of each summarization method is given below. Unless otherwise noted, "run" refers to all runs connected to the run that the calculation is being executed on.

| Name | Description | Applies to | Action Within a Run | Action Across Runs |
| --- | --- | --- | --- | --- |
| first | Returns the first value on the first connected run (by run number). | any | first | first run by run number |
| first by run end | Returns the first value on the first connected run (by run end time). Ignores any runs without an end time. | any | first | first run by end time |
| first by run start | Returns the first value on the first connected run (by run start time). Ignores any runs without a start time. | any | first | first run by start time |
| last | Returns the last value on the last connected run (by run number). | any | last | last run by run number |
| last by run end | Returns the last value on the last connected run (by run end time). Ignores any runs without an end time. | any | last | last run by end time |
| last by run start | Returns the last value on the last connected run (by run start time). Ignores any runs without a start time. | any | last | last run by start time |
| max | Finds the maximum value across all runs. | numeric, datetime | max | max |
| mean | Averages all values as if they were one data set. | numeric | mean | mean (weighted by number of values) |
| mean of run means | Averages values on each run, then averages the averages. | numeric | mean | mean |
| median | Takes the median of all values as if they were one data set. | numeric | none | median |
| median of run medians | Takes the median of values on each run, then the median of the medians. | numeric | median | median |
| min | Finds the minimum value across all runs. | numeric, datetime | min | min |
| run count | Counts the number of connected runs. | any | 1 | sum |
| SD (pop) | Takes the standard deviation of all values as if they were one population. | numeric, datetime | none | standard deviation (population) |
| SD (sam) | Takes the standard deviation of all values as if they were a set of samples from one population. | numeric | none | standard deviation (sample) |
| sum | Adds up all values across all runs. | numeric | sum | sum |
| value count | Counts the number of values. | any | number of values | sum |
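
The "Action Within a Run" and "Action Across Runs" columns matter most when connected runs hold different numbers of values. The sketch below, in Python, contrasts a few of the methods on made-up data. The `Run` record, its field names, and the sample values are assumptions chosen for illustration; this shows how the descriptions above can be read, not how the product implements them.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, median
from typing import List, Optional


@dataclass
class Run:
    """Hypothetical record for one connected run (field names are assumptions)."""
    number: int
    end: Optional[datetime]
    values: List[float]


# Made-up data: three connected runs with unequal numbers of values.
runs = [
    Run(101, datetime(2024, 1, 2), [10.0]),
    Run(102, datetime(2024, 1, 1), [2.0, 4.0, 6.0]),
    Run(103, None, [7.0]),  # no end time yet
]

by_number = sorted(runs, key=lambda r: r.number)
all_values = [v for r in by_number for v in r.values]

# mean: pool every value, so runs with more values carry more weight.
pooled_mean = mean(all_values)                                    # 5.8
# mean of run means: average within each run, then average those results.
mean_of_run_means = mean([mean(r.values) for r in runs])          # 7.0

# median and median of run medians follow the same pattern.
pooled_median = median(all_values)                                # 6.0
median_of_run_medians = median([median(r.values) for r in runs])  # 7.0

# first: first value on the lowest-numbered run.
first = by_number[0].values[0]                                    # 10.0

# first by run end: drop runs without an end time, then take the earliest.
# (Assumes at least one run has ended; min() raises ValueError otherwise.)
ended = [r for r in runs if r.end is not None]
first_by_run_end = min(ended, key=lambda r: r.end).values[0]      # 2.0

# run count: 1 per run, summed. value count: values per run, summed.
run_count = sum(1 for r in runs)                                  # 3
value_count = sum(len(r.values) for r in runs)                    # 5

print(pooled_mean, mean_of_run_means, pooled_median, median_of_run_medians)
print(first, first_by_run_end, run_count, value_count)
```

Run 102 contributes three of the five values, so it pulls the pooled mean down to 5.8, while mean of run means treats each run equally and gives 7.0. Choosing between the two depends on whether each value or each run should carry equal weight.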
