User manual - Data analyses

The interface allows one to perform the following types of data analysis:
  • Data analyses to get moments, minimum, maximum, PDF, etc.

  • Marginals inference

  • Dependence inference

  • Metamodel creation

1- Data analysis

1-1 Creation

A new sample analysis can be created through:
  • the context menu of the Definition item of the data model

    ../../../_images/dataAnalysisdefContextMenu.png
  • the Data analysis box of the model diagram

    ../../../_images/dataModelDiagramBoxes.png

When the analysis is requested, a new item is added in the study tree below the data model item.

Its context menu has the following actions:
  • Rename: Rename the analysis

  • Remove: Remove the analysis from the study

This item is associated with a window showing a progress bar and Run/Stop buttons, to launch or stop the analysis.

../../../_images/dataAnalysisWindow.png

1-2 Results

When the analysis is finished or stopped, the following window appears.

../../../_images/data_model_analysis_summary.png

The window shows numerous tabs, some of which are interactively linked (Table, Parallel coordinates plot, Plot matrix and Scatter plot tabs): when the user selects points on one of these representations, the same points are automatically selected in the other tabs.

  • The Summary tab summarizes the results of the analysis, for a selected variable (left column): sample size, elapsed time, moment estimates, empirical quantiles, minimum/maximum values, input values at extremum.

  • The PDF/CDF tab presents the PDF/CDF of the variables together with a kernel smoothing representation.

    • Use the Graph settings window to set up graphical parameters and select the graphic type: PDF (default) or CDF

    • Graph interactivity:
      • Left-click to translate the graph

      • Mouse wheel up/down to zoom in/zoom out

    ../../../_images/data_model_analysis_PDF.png
  • The Box plots tab presents the box plot of the variables. They are rescaled for each variable (x), using mean (\mu) and standard deviation (\sigma): y = (x - \mu)/\sigma

    • Use the Graph settings window to set up graphical parameters.

    • Graph interactivity:
      • Left-click to translate the graph

      • Mouse wheel up/down to zoom in/zoom out

    ../../../_images/data_model_analysis_boxplot.png
  • The Dependence tab displays the estimate of the Spearman correlation matrix.

    • The cells are colored according to the value of the Spearman coefficient.

    • Its context menu allows one to export the table to a CSV file or as a PNG image.

    • Select cells and press Ctrl+C to copy values to the clipboard

    ../../../_images/doe_dependence.png
  • The Table tab shows the input/output samples. The table can be exported (Export button).

    • Table interactivity:
      • Left-click on lines to select them (optional: + Ctrl for adding to current selection, + Shift for contiguous selection)

      • Left-click on column header to sort values in ascending or descending order

      • Left-click on a column header and drag it to another position to change the column order

      • Right-click on the table to export the current selection or copy it to the clipboard (shortcut: Ctrl+C)

      • Ctrl+A selects all lines

    ../../../_images/designOfExperimentTable.png
  • The Parallel coordinates plot tab displays the sample points.

    • Use the Graph settings window to set up graphical parameters.

    • Graph interactivity:
      • Left-click on columns to select curves (multiple selection possible)

    ../../../_images/data_model_analysis_Cobweb.png
  • The Plot matrix tab displays histograms of the distribution of each variable (diagonal) and scatter plots for each pair of input/output variables (off-diagonal).

    • Use the Graph settings window to set up graphical parameters.

    • Graph interactivity:
      • Right-click to select points

      • Left-click to translate the graph

      • Mouse wheel up/down to zoom in/zoom out

    ../../../_images/data_model_analysis_plotmatrixYX.png
  • The Scatter plots tab displays the scatter plot of two parameters.

    • Use the Graph settings window to set up graphical parameters and select the variables to plot on X-axis and Y-axis (default: first output versus first input)

    • Graph interactivity:
      • Right-click to select points

      • Left-click to translate the graph

      • Mouse wheel up/down to zoom in/zoom out

    ../../../_images/data_model_analysis_scatterplot.png
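
The rescaling used by the Box plots tab and the rank correlation shown in the Dependence tab can be sketched in a few lines of Python. This is an illustration only, not the code used by the interface: the function names are hypothetical, and the use of the sample standard deviation (n - 1 divisor) and of average ranks for ties are assumptions.

```python
import statistics


def standardize(values):
    """Rescale a sample as y = (x - mu) / sigma (box plot rescaling)."""
    mu = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 divisor).
    sigma = statistics.stdev(values)
    return [(x - mu) / sigma for x in values]


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the current one.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson linear correlation coefficient of two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = (sum((x - mean_a) ** 2 for x in a)
           * sum((y - mean_b) ** 2 for y in b)) ** 0.5
    return num / den


def spearman(a, b):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))
```

Because the Spearman coefficient depends only on ranks, any increasing relation between two variables, linear or not, gives a coefficient of 1.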

2- Marginals inference

The inference analysis computes the Bayesian Information Criterion (BIC) and performs either a Kolmogorov-Smirnov or a Lilliefors goodness-of-fit test for 1-d continuous distributions.
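
As an illustration of the two quantities involved, the sketch below fits a Normal distribution by maximum likelihood, then computes a BIC value and the Kolmogorov-Smirnov distance. It is not the implementation used by the interface. The function names and the sample are hypothetical, and the BIC uses the textbook form k ln(n) - 2 ln(L); the tool may apply a different normalization.

```python
import math


def fit_normal(sample):
    """Maximum likelihood estimates (mu, sigma) of a Normal distribution."""
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    return mu, sigma


def normal_cdf(x, mu, sigma):
    """CDF of the Normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_log_likelihood(sample, mu, sigma):
    """Log-likelihood of the sample under Normal(mu, sigma)."""
    n = len(sample)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
            - sum((x - mu) ** 2 for x in sample) / (2.0 * sigma ** 2))


def bic(log_likelihood, n_params, n):
    """Textbook BIC (assumption: no normalization by the sample size)."""
    return n_params * math.log(n) - 2.0 * log_likelihood


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and the candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance


# Hypothetical sample, for illustration only.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 5.1]
mu, sigma = fit_normal(sample)
log_l = normal_log_likelihood(sample, mu, sigma)
bic_value = bic(log_l, n_params=2, n=len(sample))
ks_distance = ks_statistic(sample, lambda x: normal_cdf(x, mu, sigma))
```

With this convention a lower BIC indicates a better trade-off between fit and number of parameters. When the parameters are estimated from the tested sample, as here, the standard Kolmogorov-Smirnov p-value is no longer valid, which is the situation the Lilliefors test addresses.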

A new marginals inference can be created through:
  • the context menu of the Definition item of the data model

    ../../../_images/dataAnalysisdefContextMenu.png
  • the Marginals inference box of the model diagram

    ../../../_images/dataModelDiagramBoxes.png

2-1 Definition

../../../_images/inference_wizard1.png
When an analysis is requested, a window appears to set up:
  • the variables of interest (default: all variables are analysed) by checking off the corresponding line in the first table

  • the list of distributions to infer for each variable (default: Normal distribution):
    • The list of distributions can be different for each variable.

    • In the context menu of a variable, click on Apply the list of distributions to all variables to apply the same list of distributions to the other checked variables.

      ../../../_images/inference_wizard_applyToAll.png
    • To add a distribution, click on the Add combo box and select a distribution from the list that appears (or all of them with the All item):

      • the distribution is added to the table

      • the distribution is removed from the combo box

      ../../../_images/inference_wizard_distributions_list.png
    • To remove a distribution, select it in the table and click on Remove. Press the Ctrl or Shift key to select multiple lines.

  • the Kolmogorov-Smirnov/Lilliefors level such that \alpha = 1 - {\rm level} is the risk of committing a Type I error, that is an incorrect rejection of a true null hypothesis (default: 0.05, expected: float in the range ]0, 1[)

  • Advanced parameters are as follows:

    • Request an estimate of the confidence interval of the tested distributions' parameters at a specified level

    • Fine-tune Lilliefors parameters (precision, min/max sampling sizes)

2-2 Launch

When the analysis is requested, a new item is added in the study tree below the data model item.

Its context menu has the following actions:
  • Rename: Rename the analysis

  • Modify: Reopen the setting window to change the analysis parameters

  • Remove: Remove the analysis from the study

This item is associated with a window displaying the list of the parameters, a progress bar and Run/Stop buttons, to launch or stop the analysis.

../../../_images/inferenceWindow.png

2-3 Results

When the analysis is finished or stopped, a window appears.

../../../_images/inference_resultWindow_tab_summary_PDF.png

The results window gathers:

  • The Summary tab includes, for a selected variable (left column):
    • a table of all the tested distributions, the associated Bayesian Information Criterion value and the p-value.
      • The last column indicates whether the distribution is accepted or not according to the given level.

      • The distributions are sorted in increasing order of BIC values.

    • for the selected distribution:
      • The PDF/CDF tab presents the PDF/CDF of the sample together with the distribution PDF.

        • Use the Graph settings window to set up graphical parameters and select the graphic type: PDF (default) or CDF

        • Graph interactivity:
          • Left-click to translate the graph

          • Mouse wheel up/down to zoom in/zoom out

      • The Q-Q plot tab presents the Q-Q plot, which plots the data quantiles against the quantiles of the tested distribution.

        ../../../_images/inference_resultWindow_tab_summary_QQplot.png
        • Use the Graph settings window to set up graphical parameters.

        • Graph interactivity:
          • Left-click to translate the graph

          • Mouse wheel up/down to zoom in/zoom out

      • The Parameters tab includes a table with the moments of the selected distribution and the estimated values of its native parameters.

        ../../../_images/inference_resultWindow_tab_summary_parameters.png

        The value failed in the Acceptation column means that an error occurred when building a distribution with the given sample. In that case, the Parameters tab shows the error message.

        ../../../_images/inference_resultWindow_tab_summary_parameters_error_message.png

The result can be used in the Probabilistic model window.

3- Dependence inference

The dependence inference allows one to infer copulas on the sample of the data model.

This analysis can be created through:
  • the context menu of the Definition item of the relevant data model

    ../../../_images/dataAnalysisdefContextMenu.png
  • the Dependence inference box of the model diagram

    ../../../_images/dataModelDiagramBoxes.png

3-1 Definition

When an analysis is requested, a window appears:

../../../_images/dependenceInference_wizard.png
The window allows one to set up:
  • the groups of variables to test:
    • Select at least two variables of the model (left table):
      • Refer to the estimate of the Spearman’s matrix in the data analysis result window to create groups

      • For convenience, the list of groups may be set by default thanks to this estimate (if correlation between variables exists)

    • Click on the right arrow:
      • the group is added in the second table

      • a third table appears with the default item Normal

../../../_images/dependenceInference_wizardOneGroup.png
  • the copulas to infer on the groups:
    • Click on the Add combo box

    • Select a copula in the list (or all of them with the All item):

      • For a pair of variables: bivariate copulas are available (Ali-Mikhail-Haq, Clayton, Farlie-Gumbel-Morgenstern, Frank, Gumbel, Normal)

      • For a group with more than two variables: only the Normal copula is available (Add and Remove buttons are then disabled)

    ../../../_images/dependenceInference_wizard_copulaList.png
To remove a group:
  • Select a group in the second table

  • Click on the left arrow

3-2 Launch

When the analysis is requested, a new item is added in the study tree below the data model item.

Its context menu has the following actions:
  • Rename: Rename the analysis

  • Modify: Reopen the setting window to change the analysis parameters

  • Remove: Remove the analysis from the study

This item is associated with a window displaying the list of the parameters, a progress bar and Run/Stop buttons, to launch or stop the analysis.

../../../_images/copulaInferenceWindow.png

3-3 Results

When the analysis is finished or stopped, a window appears:

../../../_images/copulaInference_resultWindow_tab_summary_PDF.png

The window gathers:

  • The Summary tab includes, for a selected set of variables:
    • a table of all the tested copulas

    • for the selected copula:
      • the PDF/CDF tab presents, for each pair of variables, the PDF/CDF of the sample together with the distribution PDF.

        • Use the Graph settings window to set up graphical parameters and select the graphic type: PDF (default) or CDF

        • Graph interactivity:
          • Left-click to translate the graph

          • Mouse wheel up/down to zoom in/zoom out

      • the Kendall plot tab presents a visual fitting test for each pair of variables using the Kendall plot. This plot can be interpreted as a QQ-plot (for marginals): the more the curve fits the diagonal, the more adequate the dependence model is.

        • Use the Graph settings window to set up graphical parameters.

        • Graph interactivity:
          • Left-click to translate the graph

          • Mouse wheel up/down to zoom in/zoom out

      ../../../_images/copulaInference_resultWindow_tab_summary_Kendall.png
      • the Parameters tab shows the estimated parameters of the selected copula.

        ../../../_images/copulaInference_resultWindow_tab_summary_parameters.png
        • For the Gaussian copula: the tab displays the Spearman’s coefficients.

        • ‘-’ in the BIC column means that an error occurred when building a copula with the given sample. Then, the Parameters tab shows the error message.

        ../../../_images/copulaInference_resultWindow_tab_summary_parameters_ErrorMessage.png

The result can be used in the Probabilistic model window.

4- Metamodel creation

To perform this analysis, the data model or the design of experiments must contain an output sample.

A new metamodel can be created in 4 different ways:
  • the context menu of a design of experiments item

    ../../../_images/doe_eval_ContextMenu.png
  • the Metamodel creation box of a physical model diagram

    ../../../_images/physicalModel_Diagram_metamodelBox.png
  • the context menu of the Definition item of a data model

    ../../../_images/dataAnalysisdefContextMenu.png
  • the Metamodel creation box of a data model diagram

    ../../../_images/dataModelDiagramBoxes.png

4-1 Definition

When an analysis is requested, a window appears to set up:
  • the outputs of interest (Select outputs - default: all outputs are analyzed)

  • the method: polynomial regression (default), functional chaos or kriging

../../../_images/metaModel_wizard.png

4-1-1 Linear regression

The Linear regression window allows one to define:
  • Parameters: polynomial degree (default: 1, expected: integer in [1, 2]), interaction terms (if degree>1 only)

Refer to PolynomialRegressionAnalysis for implementation details.
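
For a single input and a polynomial degree of 1, the regression reduces to an ordinary least squares line. The sketch below illustrates this special case only. The function name is hypothetical, and this is not the code of PolynomialRegressionAnalysis.

```python
def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x for one input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted degree-1 metamodel at x."""
    return intercept + slope * x
```

With a degree of 2, squared terms are added to the basis, together with products of distinct inputs when interaction terms are enabled.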

4-1-2 Functional chaos

../../../_images/metaModel_functional_chaos_wizard.png
The Functional chaos parameters window allows one to define:
  • Parameters: chaos degree (default: 2, expected: integer greater than or equal to 1)

  • Advanced Parameters (default: hidden): sparse chaos (default: not sparse)

Refer to FunctionalChaosAnalysis for implementation details.

4-1-3 Kriging

../../../_images/metaModel_kriging_wizard.png
The Kriging parameters window allows one to define:
  • Parameters:
    • The type of covariance model: Squared exponential (default), Absolute exponential, Generalized exponential, Matérn model

    • Parameters of the covariance model (default: hidden, visible if a model is chosen):
      • Generalized exponential: parameter p, exponent of the Euclidean norm (default: 1., positive float expected)

      ../../../_images/kriging_p_parameter.png
      • Matérn: coefficient nu (default: 1.5, positive float expected)

      ../../../_images/kriging_nu_parameter.png
    • The type of the trend basis: Constant (default), Linear or Quadratic

  • Advanced Parameters are accessible for covariance model optimization (default: hidden):
    • Optimize the covariance model parameters (default: checked)

    • Scales for each input (default: 1): To edit the scales, click on the “” button to generate the input variables table and their scale through a wizard.

    ../../../_images/kriging_scale_wizard.png
    • Amplitude of the process (default: 1., positive float expected)

Refer to KrigingAnalysis for implementation details.
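
To give an idea of how the covariance models differ, the sketch below writes common textbook forms of the four models for a one-dimensional input, as functions of the distance d between two points. The exact parameterization used by KrigingAnalysis may differ: the function names and formulas here are assumptions for illustration, and the Matérn model is shown only for nu = 1.5, where it has a closed form.

```python
import math


def squared_exponential(d, scale=1.0, amplitude=1.0):
    """amplitude^2 * exp(-(d / scale)^2 / 2)."""
    return amplitude ** 2 * math.exp(-0.5 * (d / scale) ** 2)


def absolute_exponential(d, scale=1.0, amplitude=1.0):
    """amplitude^2 * exp(-|d| / scale)."""
    return amplitude ** 2 * math.exp(-abs(d) / scale)


def generalized_exponential(d, p=1.0, scale=1.0, amplitude=1.0):
    """amplitude^2 * exp(-(|d| / scale)^p), with p > 0."""
    return amplitude ** 2 * math.exp(-(abs(d) / scale) ** p)


def matern_nu_1_5(d, scale=1.0, amplitude=1.0):
    """Matern model for nu = 1.5: amplitude^2 * (1 + r) * exp(-r)."""
    r = math.sqrt(3.0) * abs(d) / scale
    return amplitude ** 2 * (1.0 + r) * math.exp(-r)
```

In all four forms the covariance equals the squared amplitude at d = 0 and decreases with distance; a larger scale makes it decrease more slowly, so distant points remain more strongly correlated.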

4-1-4 Validation

In the following window, the generated metamodel can be validated, with three different methods:
  • Analytically (default): This method corresponds to an approximation of the Leave-one-out method result.
    • For more information about Kriging, see O. Dubrule, Cross Validation of Kriging in a Unique Neighborhood, Mathematical Geology, 1983.

    • For more information about Functional chaos, see G. Blatman, Adaptive sparse polynomial chaos expansions for uncertainty propagation and sensitivity analysis, PhD thesis, Blaise Pascal University-Clermont II, France, 2009.

  • Using a test sample: The data sample is divided into two subsamples, by picking points randomly (default seed = 1): training sample (default: 80% of the sample points) and test sample (default: 20% of the sample points). A new metamodel is built with the training sample and is validated with the test sample.

  • Using the K-Fold method: Define the number of folds (default: 5, expected: integer greater than 1) and specify how the folds are generated (default seed:1).

../../../_images/metaModel_validation_page.png
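
The test sample and K-Fold methods both rely on a seeded random partition of the sample points. The sketch below shows one possible way to build such partitions. The function names are hypothetical, and since the interface uses its own random generator, the same seed will not reproduce the same partition here.

```python
import random


def split_train_test(n_points, train_fraction=0.8, seed=1):
    """Randomly split point indices into a training and a test subsample."""
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * n_points))
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def k_fold_indices(n_points, n_folds=5, seed=1):
    """Return a list of (train, test) index lists, one pair per fold."""
    if n_folds < 2:
        raise ValueError("the number of folds must be greater than 1")
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    folds = []
    for k in range(n_folds):
        # Every n_folds-th shuffled index goes to the k-th test fold.
        test = sorted(indices[k::n_folds])
        held_out = set(test)
        train = [i for i in range(n_points) if i not in held_out]
        folds.append((train, test))
    return folds
```

In the K-Fold case a metamodel is built on each training part and evaluated on the corresponding held-out fold, so every point is used exactly once for validation.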

4-2 Results

When the window is validated, a new element appears in the study tree below the data model item or the design of experiments item.

The context menu of this item contains these actions:
  • Rename: Rename the analysis

  • Modify: Reopen the setting window to change the analysis parameters

  • Convert metamodel into physical model (default: disabled, enabled when the analysis is successfully finished): Add the metamodel in the study tree

  • Export metamodel (default: disabled, enabled when the analysis is successfully finished): Export the metamodel to a standalone OpenTURNS study file (XML), which can be loaded by running the generated Python script

  • Remove: Remove the analysis from the study

This item is associated with a window displaying the list of the parameters, a progress bar and Run/Stop buttons, to launch or stop the analysis.

../../../_images/metaModelWindow.png

4-2-1 Functional chaos

../../../_images/metaModel_result_window_moments.png

The results window gathers:

  • The Results tab shows different information about the selected output (left column):
    • Residual

    • Relative error

    • First- and second-order moments

    • Polynomial basis: dimension, maximum degree, full/truncated size

    • Part of variance explained by each polynomial

../../../_images/metaModel_result_window_plot.png
  • The Adequation tab shows the fitting curve between the physical model output values (Real output values) and the metamodel values (Prediction). The reference diagonal (in black) is built with the physical model output values.

    • Use the Graph settings window to set up graphical parameters.

    • Graph interactivity:

      • Left-click to translate the graph

      • Mouse wheel up/down to zoom in/zoom out

  • The Sobol indices tab includes, for a selected output (left column):

    • The graphic representation of the first and total order indices for each variable. Use the Graph settings window to set up graphical parameters.

    • A summary table with the first and total order indices.

      • Table interactivity:
        • Select cells and press Ctrl+C to copy values to the clipboard

        • Left-click on column header to sort values in ascending or descending order. Sorting the table will automatically sort the indices on the graph.

    • The index corresponding to the interactions (below the table).

    If the estimates of the Sobol indices are inconsistent, a warning icon (attentionButton) appears in the table. Refer to the associated warning message (tooltip of the icon).

    ../../../_images/metaModel_result_window_sobol_indices.png
  • The Validation tab (default: hidden; visible if a metamodel validation is required) shows for each method and selected output:
    • The metamodel predictivity coefficient: \displaystyle Q2 = 1 - \frac{\sum_{i=0}^N (y_i - \hat{y_i})^2}{\sum_{i=0}^N {(\bar{y} - y_i)^2}}

    • The residual: \displaystyle res = \frac{\sqrt{\sum_{i=0}^N (y_i - \hat{y_i})^2}}{N}.

    • K-Fold and Test sample: A plot showing the relation between the output values (physical model) and the predicted metamodel values. The relation is compared to a reference diagonal built with the physical model output values.

      • Use the Graph settings window to set up graphical parameters.

      • Graph interactivity:
        • Left-click to translate the graph

        • Mouse wheel up/down to zoom in/zoom out

      ../../../_images/metaModel_result_window_LOO_plot.png
    • Analytical: the Q2 value

      ../../../_images/FC_analyticalValidation.png
  • The Parameters tab summarizes the parameters of the metamodel creation.

    ../../../_images/metaModel_result_window_parameters.png
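
The two validation indicators of the Validation tab can be computed directly from the output values y_i and the metamodel predictions \hat{y}_i. The sketch below transcribes the formulas of the Validation tab; the function names are hypothetical and N is taken as the number of validation points.

```python
import math


def q2(y_true, y_pred):
    """Predictivity coefficient: 1 - sum (y - y_hat)^2 / sum (y_bar - y)^2."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((mean_y - y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def residual(y_true, y_pred):
    """Residual: sqrt(sum (y - y_hat)^2) / N."""
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return math.sqrt(ss_res) / len(y_true)
```

A Q2 close to 1 means that the metamodel reproduces almost all of the output variability on the validation points; a value near 0 means it does no better than predicting the mean.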

4-2-2 Kriging

../../../_images/metaModel_result_window_kriging_plot.png

The results window gathers:

  • The Metamodel tab shows for a selected output the graphic relation between output values from the physical model (Real output values) and metamodel values (Prediction). The reference diagonal (in black) is built with the physical model output values.

    • Use the Graph settings window to set up graphical parameters.

    • Graph interactivity:
      • Left-click to translate the graph

      • Mouse wheel up/down to zoom in/zoom out

  • The Results tab presents the optimized covariance model parameters and the trend coefficients.

    ../../../_images/metaModel_result_window_kriging_results.png
  • If a metamodel validation is required, a Validation tab appears for the selected method and output:
    • The residual: \displaystyle res = \frac{\sqrt{\sum_{i=0}^N (y_i - \hat{y_i})^2}}{N}.

    • The metamodel predictivity coefficient: \displaystyle Q2 = 1 - \frac{\sum_{i=0}^N (y_i - \hat{y_i})^2}{\sum_{i=0}^N {(\bar{y} - y_i)^2}}

    • A plot showing the relation between the output values (physical model) and the predicted metamodel values. The relation is compared to a reference diagonal built with the physical model output values.

      • Use the Graph settings window to set up graphical parameters.

      • Graph interactivity:
        • Left-click to translate the graph

        • Mouse wheel up/down to zoom in/zoom out

      ../../../_images/metaModel_result_window_LOO_plot.png
  • The Parameters tab summarizes the parameters of the metamodel creation.