DataCleaning
- class persalys.DataCleaning(*args)
DataModel sample manipulation. Allows one to remove/replace irrelevant/erroneous values on-the-fly.
- Parameters
- sample : openturns.Sample
Sample to clean. The original sample is copied and is never modified; use getSample() to retrieve the cleaned copy.
Examples
>>> import math
>>> import openturns as ot
>>> import persalys
>>> sample = ot.Sample(0, 3)
>>> sample.add([4, 2, 4])
>>> sample.add([2, math.nan, 4])
>>> sample.add([2, 3, 7])
>>> cleaner = persalys.DataCleaning(sample)
>>> cleaner.removeAllNans()
>>> cleaned_sample = cleaner.getSample()
Methods
- analyseSample(): Column-by-column sample analysis.
- computeGeometricMAD(): Computes the sample geometric median absolute deviation.
- computeMAD(): Computes the sample median absolute deviation.
- getClassName(): Accessor to the object's name.
- getGeometricMAD(): Geometric MAD accessor.
- getMAD(): MAD accessor.
- getMean(): Mean accessor.
- getMedian(): Median accessor.
- getNanNumbers(): Returns the number of NaNs/Infs in each sample column.
- getSample(): Sample accessor.
- removeAllNans(): Removes NaNs/Infs in the sample.
- removeNansByColumn(col): Removes NaNs/Infs in a sample column.
- replaceAllNans(point): Replaces NaNs/Infs in the sample with the values given in point.
- replaceNansByColumn(col, val): Replaces NaNs/Infs in a sample column with a value.
- __init__(*args)
- analyseSample()
Column-by-column sample analysis. Computes the mean and median of each marginal while ignoring NaNs/Infs, and counts the NaNs/Infs in each marginal.
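The column-wise analysis described here can be pictured with a short pure-Python sketch. It is not the persalys implementation: the helper name `analyse_columns` and the list-of-rows input are assumptions made for illustration. It skips non-finite entries when computing each column's mean and median, and counts how many were skipped.

```python
import math
import statistics

def analyse_columns(rows):
    """Return per-column (means, medians, non-finite counts).

    Hypothetical helper, not part of persalys: `rows` is a list of
    equal-length lists of floats.
    """
    means, medians, nan_counts = [], [], []
    for column in zip(*rows):
        finite = [x for x in column if math.isfinite(x)]
        nan_counts.append(len(column) - len(finite))
        # A column without any finite value has no meaningful statistics.
        means.append(statistics.fmean(finite) if finite else math.nan)
        medians.append(statistics.median(finite) if finite else math.nan)
    return means, medians, nan_counts

# Same data as the class example above, as plain Python lists.
rows = [[4.0, 2.0, 4.0], [2.0, math.nan, 4.0], [2.0, 3.0, 7.0]]
means, medians, nan_counts = analyse_columns(rows)
```

With this input the second column contains one non-finite value, so its statistics are computed from the two remaining entries only.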
- computeGeometricMAD()
Computes the geometric median absolute deviation (MAD) of the sample.
- computeMAD()
Computes the median absolute deviation (MAD) of the sample.
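To illustrate the quantity involved, the sketch below computes an unscaled median absolute deviation, median(|x_i - median(x)|), over the finite values of one column. Whether persalys applies a scaling constant or handles NaNs/Infs in exactly this way is not stated here, so the function name `column_mad` and its conventions are assumptions.

```python
import math
import statistics

def column_mad(column):
    """Unscaled median absolute deviation of the finite values in `column`.

    Hypothetical helper for illustration; returns NaN when no finite
    value is available.
    """
    finite = [x for x in column if math.isfinite(x)]
    if not finite:
        return math.nan
    med = statistics.median(finite)
    return statistics.median([abs(x - med) for x in finite])

# The infinite entry is ignored: finite values are 1, 2, 4, 9 (median 3).
mad = column_mad([1.0, 2.0, 4.0, math.inf, 9.0])
```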
- getClassName()
Accessor to the object’s name.
- Returns
- class_name : str
The object class name (object.__class__.__name__).
- getGeometricMAD()
Geometric MAD accessor
- Returns
- geomMad : openturns.Scalar
- getMAD()
MAD accessor
- Returns
- MAD : openturns.Point
- getMean()
Mean accessor
- Returns
- mean : openturns.Point
- getMedian()
Median accessor
- Returns
- median : openturns.Point
- getNanNumbers()
Returns the number of NaNs/Infs in each sample column.
- Returns
- nNans : openturns.Point
- getSample()
Sample accessor
- Returns
- sample : openturns.Sample
Sample associated with the DataCleaning object. It is built from a copy of the original sample and edited on the fly.
- removeAllNans()
Removes NaNs/Infs in the sample.
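A minimal sketch of what such a removal could look like on plain Python data, assuming that removal operates on whole points (rows) so that the result stays rectangular. This assumption, and the helper name `remove_non_finite_rows`, are illustrative and not taken from persalys.

```python
import math

def remove_non_finite_rows(rows):
    """Return a new list keeping only the rows whose values are all finite.

    Hypothetical helper; the input list is left untouched, mirroring the
    copy-then-edit behaviour described for DataCleaning.
    """
    return [list(row) for row in rows if all(math.isfinite(x) for x in row)]

rows = [[4.0, 2.0, 4.0], [2.0, math.nan, 4.0], [2.0, 3.0, 7.0]]
cleaned = remove_non_finite_rows(rows)
```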
- removeNansByColumn(col)
Removes NaNs/Infs in a sample column.
- Parameters
- col : int
Column index to clean
- replaceAllNans(point)
Replaces NaNs/Infs in the sample with the values given in point.
- Parameters
- point : openturns.Point
Replacement values
- replaceNansByColumn(col, val)
Replaces NaNs/Infs in a sample column with a value.
- Parameters
- col : int
Column index to clean
- val : openturns.Scalar
Replacement value
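The column-wise replacement can be sketched in pure Python as follows. The helper `replace_non_finite_in_column` is hypothetical and only mirrors the documented (col, val) parameters; it is not the persalys implementation.

```python
import math

def replace_non_finite_in_column(rows, col, val):
    """Return a copy of `rows` where non-finite entries of column `col` become `val`."""
    cleaned = [list(row) for row in rows]  # work on a copy, keep the input intact
    for row in cleaned:
        if not math.isfinite(row[col]):
            row[col] = val
    return cleaned

rows = [[4.0, 2.0, 4.0], [2.0, math.nan, 4.0], [2.0, 3.0, 7.0]]
replaced = replace_non_finite_in_column(rows, 1, 0.0)
```

Unlike removal, replacement keeps every point of the sample, which matters when the other columns of an affected row still carry usable values.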