Joining and Connecting Data in Riffyn

Riffyn joins data in several ways, depending on how you designed your process and executed your experiments. The joins fall into the basic types described below. In the data tables shown for each type, each table holds the data on one step, R denotes a resource, and P denotes a property.

No joining

Picture1.png

Runs are not connected, so the corresponding rows of data are not connected either. This case is uncommon and represents data on steps that have no relationship to each other, such as generating samples on one step and measuring standard solutions on another. The two sets of data are unrelated and are not joined.

Joining by 1:1 run connections

Picture2.png

Runs are connected at a 1:1 ratio, where each upstream run is connected to one downstream run. This is the default when flowing runs or linking runs by assigning a resource. It is most common when a single sample is passed through to another step to be used or tested. It also applies to individual samples that are related to each other through a 1:1 genealogy.
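
The 1:1 case can be sketched in plain Python. This is only an illustration of the resulting rows, not Riffyn's API: the run IDs, the `connections` list, and the `join_runs` helper are all hypothetical names invented for the example.

```python
# Illustrative only: plain Python dicts, not Riffyn's API.
# Each table maps a (hypothetical) run ID to that step's row of data.
upstream = {
    "run1": {"R1": "sample A", "P1": 7.0},
    "run2": {"R1": "sample B", "P1": 7.4},
}
downstream = {
    "run1'": {"R2": "sample A", "P2": 0.12},
    "run2'": {"R2": "sample B", "P2": 0.19},
}

# 1:1 connections: each upstream run feeds exactly one downstream run.
connections = [("run1", "run1'"), ("run2", "run2'")]


def join_runs(upstream, downstream, connections):
    """Return one joined row per (upstream run, downstream run) connection."""
    return [{**upstream[u], **downstream[d]} for u, d in connections]


joined = join_runs(upstream, downstream, connections)
for row in joined:
    print(row)
```

With two connections, the joined table has two rows, each combining one upstream row with its single downstream partner.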

Joining by N:1 run connections

Picture4.png

Runs are connected in an N:1 merge, so that each downstream run is connected to each of its corresponding runs on the upstream step. This is most common when blending or merging samples together, as is done in formulation or mixture testing. This is done by connecting runs across steps.
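
As a rough sketch of what an N:1 merge does to the joined rows, assume two component runs feed one blend run and a third component run feeds another. The run IDs and property names below are made up for illustration and are not taken from Riffyn.

```python
# Illustrative only: hypothetical run IDs and property names.
components = {  # upstream step: one run per ingredient
    "c1": {"R_in": "buffer", "P_volume_mL": 40.0},
    "c2": {"R_in": "additive", "P_volume_mL": 10.0},
    "c3": {"R_in": "buffer", "P_volume_mL": 25.0},
}
blends = {  # downstream step: one run per blend
    "b1": {"R_out": "blend 1", "P_viscosity": 1.8},
    "b2": {"R_out": "blend 2", "P_viscosity": 1.1},
}

# N:1 merge: c1 and c2 both feed b1, while c3 alone feeds b2.
connections = [("c1", "b1"), ("c2", "b1"), ("c3", "b2")]

# One joined row per connection; the blend's data repeats for each input.
joined = [{**components[u], **blends[d]} for u, d in connections]
rows_for_b1 = [row for row in joined if row["R_out"] == "blend 1"]
print(len(joined), len(rows_for_b1))
```

The downstream row for blend 1 appears once per upstream run that was merged into it, so its viscosity value is repeated on two joined rows.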

Joining by 1:M run connections

Picture3.png

Runs are connected in a 1:M split, so that each upstream run is connected to each of its corresponding runs on the downstream step. This is most common when taking many subsamples from a parent sample, or when a single sample is split for testing in multiple places. This is done by connecting runs across steps.
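
A 1:M split mirrors the merge case: the parent run's data is repeated on every row of its subsamples. The sketch below assumes one parent run split into three aliquots; all names are hypothetical and the code only illustrates the shape of the joined data.

```python
from collections import Counter

# Illustrative only: hypothetical run IDs and property names.
parents = {"p1": {"R_parent": "batch 1", "P_pH": 6.9}}
subsamples = {
    "s1": {"R_sub": "aliquot 1", "P_assay": 0.51},
    "s2": {"R_sub": "aliquot 2", "P_assay": 0.48},
    "s3": {"R_sub": "aliquot 3", "P_assay": 0.53},
}

# 1:M split: the single parent run is connected to every subsample run.
connections = [("p1", "s1"), ("p1", "s2"), ("p1", "s3")]

joined = [{**parents[u], **subsamples[d]} for u, d in connections]

# Count how many downstream runs each upstream run fans out to.
fan_out = Counter(u for u, _ in connections)
print(fan_out["p1"], len(joined))
```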

Joining multivalued data

Picture6.png

Runs are connected at a 1:1 ratio, where each upstream run is connected to one downstream run (N:M relationships are also possible). This is most common when a single sample that was measured multiple times is passed through to another step to be used or tested. The join of multivalued data results in each row of data on each upstream run being connected to each row of data on each downstream run. In cases where you have multiple successive time series measured for the same sample, the number of data rows can become exceptionally large due to the multiplicative nature of joining multivalued data. Consider a custom join rule to reduce the number of rows created upon joining.
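
The multiplicative growth is easy to see in a small sketch. Assume one sample carries a 4-row time series on one step and a 3-row time series on the next; the property names and values are invented, and the cross product stands in for the default join described above.

```python
from itertools import product
from math import prod

# Illustrative only: one connected pair of runs, each with multivalued data.
step1_rows = [{"t1_min": t, "P_od": 0.5 * t} for t in range(1, 5)]     # 4 rows
step2_rows = [{"t2_min": t, "P_titer": 2.0 * t} for t in range(1, 4)]  # 3 rows

# Every upstream row is paired with every downstream row.
joined = [{**a, **b} for a, b in product(step1_rows, step2_rows)]
print(len(joined))  # 4 * 3 rows

# Three successive 100-point time series on the same sample multiply the same way.
rows_if_chained = prod([100, 100, 100])
print(rows_if_chained)
```

Here 4 rows joined to 3 rows give 12 rows, and three chained 100-point series would give 1,000,000 rows for a single sample.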

Joining using a custom join rule

Picture5.png

Data on a run are connected in an N:M relationship, so that each data point on one upstream run is connected to the corresponding data points on multiple downstream runs. This is not a standard way of relating materials, but it is most commonly applicable to processes where a series of time data is to be connected to a discrete sample taken from it. This is done via custom join rules; domain-specific use cases are described in Processes that Benefit From Custom Join Rules.
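
To make the idea concrete, here is a minimal sketch of one possible rule: join each discrete sample to the single time-series row recorded at or just before the sample was taken. This is an assumed rule chosen for illustration. It does not show how custom join rules are actually written in Riffyn, and the function and field names are hypothetical.

```python
from bisect import bisect_right

# Illustrative only: upstream time series on one run, sorted by time.
series = [
    {"t_min": 0, "P_temp_C": 30.0},
    {"t_min": 10, "P_temp_C": 31.5},
    {"t_min": 20, "P_temp_C": 33.0},
    {"t_min": 30, "P_temp_C": 34.2},
]
# Downstream: discrete samples drawn from the same process at known times.
samples = [
    {"R_sample": "S1", "t_sample_min": 12},
    {"R_sample": "S2", "t_sample_min": 30},
]


def join_at_or_before(series, samples):
    """Pair each sample with the latest series row at or before its time."""
    times = [row["t_min"] for row in series]
    joined = []
    for s in samples:
        i = bisect_right(times, s["t_sample_min"]) - 1
        if i >= 0:  # skip samples taken before the series starts
            joined.append({**series[i], **s})
    return joined


joined = join_at_or_before(series, samples)
for row in joined:
    print(row)
```

Under this assumed rule the join yields one row per sample (2 rows) rather than every series row paired with every sample (8 rows).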
