Orbital library

orbital.math
Class Stat

java.lang.Object
  extended by orbital.math.Stat

public final class Stat
extends java.lang.Object

This class contains algorithms and utilities for stochastics and statistical mathematics. It works much like the standard class Math extended for statistics.

Author:
André Platzer
See Also:
MathUtilities, Combinatorical
Stereotype:
Utilities, Module

Method Summary
static double arithmeticMean(double[] x)
          Returns the arithmetic mean (average) of a set of n values.
static double average(double[] x)
          Normal arithmetic mean of a set of values.
static double coefficientOfCorrelation(double[] x, double[] y)
          Returns the (2D) coefficient of correlation of a set of n pairs (xi,yi).
static double coefficientOfVariation(double[] x)
          Returns the coefficient of variation of a set of n values.
static Function functionalRegression(Function composedFunc, Matrix experiment)
          Performs linear regression to estimate a composed function with least squares.
static double geometricMean(double[] x)
          Returns the geometric mean of a set of n values.
static double harmonicMean(double[] x)
          Returns the harmonic mean of a set of n values.
static double mean(double[] x)
          Normal arithmetic mean of a set of values.
static double meanDeviation(double[] x)
          Returns the mean absolute deviation of a set of n values.
static double median(double[] x)
          Returns the median of a set of n values sorted in ascending numerical order.
static double quantile(double[] x, double a)
          Returns the a-quantile of a set of n values sorted in ascending numerical order.
static Vector regression(Function[] funcs, Matrix experiment)
          Performs linear regression to estimate the statistical mean according to least squares.
static Vector regression(Vector u, Matrix A, Matrix Cu)
          Performs elemental linear regression to estimate the statistical mean according to the method of least squares.
static double standardDeviation(double[] x)
          Returns the standard deviation of a set of n values.
static java.lang.String statistics(double[] x)
          Returns a string with the usual descriptive statistics for an array of double values.
static double trimmedMean(double[] x, double a)
          Returns the mean of a set of n values, sorted in ascending numerical order, with a fraction a of entries at each end dropped.
static double variance(double[] x)
          Returns the variance of a set of n values.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

arithmeticMean

public static double arithmeticMean(double[] x)
Returns the arithmetic mean (average) of a set of n values. Sensitive to erroneous data.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
1/n * ∑_{i=0}^{n-1} x_i, where n is the length of x.
Preconditions:
x.length>0

geometricMean

public static double geometricMean(double[] x)
Returns the geometric mean of a set of n values.

For non-negative values it is always true that geometricMean ≤ arithmeticMean. Sensitive to erroneous data.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
(∏_{i=0}^{n-1} x_i)^{1/n}, that is, the n-th root of the product of all values.

harmonicMean

public static double harmonicMean(double[] x)
Returns the harmonic mean of a set of n values.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
n / ∑_{i=0}^{n-1} (1/x_i).
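
The three mean formulas above can be written out directly. The following is a minimal sketch, not the Orbital source: the class name MeansSketch is hypothetical, and computing the geometric mean through logarithms (which assumes strictly positive values) is a choice made here to avoid overflow of the raw product.

```java
final class MeansSketch {
    /** Arithmetic mean: 1/n * sum of x_i. Requires x.length > 0. */
    static double arithmeticMean(double[] x) {
        double sum = 0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Geometric mean: n-th root of the product, via logarithms. Assumes every x_i > 0. */
    static double geometricMean(double[] x) {
        double logSum = 0;
        for (double v : x) logSum += Math.log(v);
        return Math.exp(logSum / x.length);
    }

    /** Harmonic mean: n / sum of 1/x_i. Assumes no x_i is zero. */
    static double harmonicMean(double[] x) {
        double reciprocalSum = 0;
        for (double v : x) reciprocalSum += 1.0 / v;
        return x.length / reciprocalSum;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 4};
        System.out.println("arithmetic = " + arithmeticMean(x));
        System.out.println("geometric  = " + geometricMean(x));
        System.out.println("harmonic   = " + harmonicMean(x));
    }
}
```

For positive data the three results are ordered harmonic ≤ geometric ≤ arithmetic, which makes a convenient sanity check.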

mean

public static double mean(double[] x)
Normal arithmetic mean of a set of values. Also called Average.

See Also:
arithmeticMean(double[]), average(double[])

average

public static double average(double[] x)
Normal arithmetic mean of a set of values. Also called Average.

See Also:
arithmeticMean(double[]), mean(double[])

variance

public static double variance(double[] x)
Returns the variance of a set of n values. Sensitive to erroneous data.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
1/(n-1) * ∑_{i=0}^{n-1} (x_i - mean)^2.

standardDeviation

public static double standardDeviation(double[] x)
Returns the standard deviation of a set of n values. Sensitive to erroneous data.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
Sqrt(variance).

coefficientOfVariation

public static double coefficientOfVariation(double[] x)
Returns the coefficient of variation of a set of n values. Sensitive to erroneous data.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
standardDeviation/mean.
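
Variance, standard deviation and coefficient of variation build on each other, as the following sketch illustrates. It follows the documented formulas with the 1/(n-1) factor; the class name SpreadSketch is hypothetical, and the requirement of at least two values is an assumption that follows from dividing by n-1.

```java
final class SpreadSketch {
    static double mean(double[] x) {
        double sum = 0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Sample variance: 1/(n-1) * sum of (x_i - mean)^2. Assumes x.length >= 2. */
    static double variance(double[] x) {
        double m = mean(x);
        double squares = 0;
        for (double v : x) squares += (v - m) * (v - m);
        return squares / (x.length - 1);
    }

    /** Standard deviation: square root of the variance. */
    static double standardDeviation(double[] x) {
        return Math.sqrt(variance(x));
    }

    /** Coefficient of variation: standardDeviation / mean. Assumes mean != 0. */
    static double coefficientOfVariation(double[] x) {
        return standardDeviation(x) / mean(x);
    }

    public static void main(String[] args) {
        double[] x = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("variance = " + variance(x));
        System.out.println("stddev   = " + standardDeviation(x));
        System.out.println("cv       = " + coefficientOfVariation(x));
    }
}
```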

median

public static double median(double[] x)
Returns the median of a set of n values sorted in ascending numerical order.

Parameters:
x - the sorted array of double values representing the set of statistical data.
Returns:
x_{(n-1)/2} if n is odd, and (x_{n/2-1} + x_{n/2}) / 2 if n is even.
See Also:
Arrays.sort(double[]), System.arraycopy(java.lang.Object, int, java.lang.Object, int, int)
Preconditions:
sorted(x)
Note:
The complexity of determining the median of an unsorted sequence of length n is in Θ(n).
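
Because the input must already be sorted, a caller with unsorted data would typically sort a copy first. The sketch below shows that pattern together with the documented odd/even rule; the class and method names are hypothetical, and sorting costs O(n log n) rather than the Θ(n) bound mentioned in the note.

```java
import java.util.Arrays;

final class MedianSketch {
    /** Median of an array that is already sorted in ascending order. */
    static double medianOfSorted(double[] x) {
        int n = x.length;
        return (n % 2 == 1)
                ? x[(n - 1) / 2]
                : (x[n / 2 - 1] + x[n / 2]) / 2;
    }

    /** Sorts a copy so that the caller's array keeps its original order. */
    static double medianOfUnsorted(double[] x) {
        double[] copy = Arrays.copyOf(x, x.length);
        Arrays.sort(copy);
        return medianOfSorted(copy);
    }

    public static void main(String[] args) {
        System.out.println(medianOfUnsorted(new double[] {3, 1, 2}));
        System.out.println(medianOfUnsorted(new double[] {4, 1, 3, 2}));
    }
}
```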

quantile

public static double quantile(double[] x,
                              double a)
Returns the a-quantile of a set of n values sorted in ascending numerical order.

quantile(x,0.25) is called the "lower quartile", and quantile(x,0.75) is called the "upper quartile". The interquartile range, quantile(x,0.75)-quantile(x,0.25), is a good measure for statistical deviation. Note that median(x)==quantile(x,0.5).

Parameters:
x - the sorted array of double values representing the set of statistical data.
a - a number within the open range ]0,1[ that specifies which fraction of the data the desired quantile delimits.
Returns:
x_k if n*a is not a natural number (that is, fractional), and (x_{k-1} + x_k) / 2 if n*a is a natural number, with k := [n*a] (Gaussian brackets, i.e. rounding down).
See Also:
Arrays.sort(double[]), System.arraycopy(java.lang.Object, int, java.lang.Object, int, int)
Preconditions:
a ∈ (0,1) && sorted(x)
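
The case distinction in the return value can be sketched as follows. This is an illustration of the documented rule only: QuantileSketch is a hypothetical name, and testing whether n*a is a natural number with a floating-point comparison is reliable only when the product is exactly representable.

```java
final class QuantileSketch {
    /** a-quantile of an array sorted in ascending order, for a in the open range (0,1). */
    static double quantileOfSorted(double[] x, double a) {
        if (!(a > 0 && a < 1)) {
            throw new IllegalArgumentException("a must lie in the open range (0,1)");
        }
        double na = x.length * a;
        int k = (int) Math.floor(na);   // Gaussian brackets [n*a]
        // n*a natural: average the two neighbours; otherwise take x_k directly.
        return (na == k) ? (x[k - 1] + x[k]) / 2 : x[k];
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        double lower = quantileOfSorted(x, 0.25);
        double upper = quantileOfSorted(x, 0.75);
        System.out.println("lower quartile      = " + lower);
        System.out.println("upper quartile      = " + upper);
        System.out.println("interquartile range = " + (upper - lower));
    }
}
```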

trimmedMean

public static double trimmedMean(double[] x,
                                 double a)
Returns the mean of a set of n values, sorted in ascending numerical order, with a fraction a of entries at each end dropped.

Parameters:
x - the sorted array of double values representing the set of statistical data.
a - a number within the semi-open range of [0,0.5[.
Returns:
1/(n-2k) * (x_k + ... + x_{n-k-1}), with k := [n*a] (Gaussian brackets).
See Also:
Arrays.sort(double[]), System.arraycopy(java.lang.Object, int, java.lang.Object, int, int)
Preconditions:
a ∈ [0, 0.5) && sorted(x)
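
A short sketch of the trimming rule, assuming sorted input as the precondition demands. The class name TrimmedMeanSketch is hypothetical; the example data contains one outlier to show why dropping k entries at each end makes the result more robust than the plain mean.

```java
final class TrimmedMeanSketch {
    /** Mean of a sorted array with k = [n*a] entries dropped at each end, for a in [0, 0.5). */
    static double trimmedMeanOfSorted(double[] x, double a) {
        if (!(a >= 0 && a < 0.5)) {
            throw new IllegalArgumentException("a must lie in the range [0, 0.5)");
        }
        int n = x.length;
        int k = (int) Math.floor(n * a);   // Gaussian brackets [n*a]
        double sum = 0;
        for (int i = k; i <= n - k - 1; i++) sum += x[i];
        return sum / (n - 2 * k);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 6, 7, 100};   // 100 is an outlier
        System.out.println("untrimmed   = " + trimmedMeanOfSorted(x, 0.0));
        System.out.println("25% trimmed = " + trimmedMeanOfSorted(x, 0.25));
    }
}
```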

meanDeviation

public static double meanDeviation(double[] x)
Returns the mean absolute deviation of a set of n values. It is a good measure for statistical deviations.

Parameters:
x - the array of double values representing the set of statistical data.
Returns:
1/n * ∑_{i=0}^{n-1} |x_i - mean|.
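
As an illustration, the mean absolute deviation can be computed in two passes: one for the mean and one for the absolute differences. The class name MeanDeviationSketch is hypothetical and the code is a sketch of the formula, not the library's implementation.

```java
final class MeanDeviationSketch {
    /** Mean absolute deviation: 1/n * sum of |x_i - mean|. Requires x.length > 0. */
    static double meanDeviation(double[] x) {
        double sum = 0;
        for (double v : x) sum += v;
        double mean = sum / x.length;

        double absoluteSum = 0;
        for (double v : x) absoluteSum += Math.abs(v - mean);
        return absoluteSum / x.length;
    }

    public static void main(String[] args) {
        System.out.println(meanDeviation(new double[] {2, 4, 4, 4, 5, 5, 7, 9}));
    }
}
```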

statistics

public static java.lang.String statistics(double[] x)
Returns a string with the usual descriptive statistics for an array of double values.


coefficientOfCorrelation

public static double coefficientOfCorrelation(double[] x,
                                              double[] y)
Returns the (2D) coefficient of correlation of a set of n pairs (xi,yi).

Parameters:
x - the array of double values representing the x part of the set of statistical data (with same length and in same order as y).
y - the array of double values representing the y part of the set of statistical data (with same length and in same order as x).
Returns:
1/(n-1) * ∑_{i=0}^{n-1} (x_i - mean(x)) * (y_i - mean(y)) / (standardDeviation(x) * standardDeviation(y)).
Preconditions:
x.length == y.length
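
The correlation coefficient is the sample covariance of the pairs divided by the product of both standard deviations. The sketch below spells this out with the 1/(n-1) factor used throughout this class; CorrelationSketch is a hypothetical name, and at least two pairs with non-constant x and y are assumed.

```java
final class CorrelationSketch {
    static double mean(double[] v) {
        double sum = 0;
        for (double e : v) sum += e;
        return sum / v.length;
    }

    /** Coefficient of correlation of the pairs (x_i, y_i). */
    static double coefficientOfCorrelation(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        int n = x.length;
        double mx = mean(x), my = mean(y);
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
            syy += (y[i] - my) * (y[i] - my);
        }
        double covariance = sxy / (n - 1);
        double sdX = Math.sqrt(sxx / (n - 1));
        double sdY = Math.sqrt(syy / (n - 1));
        return covariance / (sdX * sdY);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        System.out.println(coefficientOfCorrelation(x, new double[] {2, 4, 6, 8}));
        System.out.println(coefficientOfCorrelation(x, new double[] {8, 6, 4, 2}));
    }
}
```

Perfectly linear data yields a value of magnitude 1, with the sign giving the direction of the relationship.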

functionalRegression

public static Function functionalRegression(Function composedFunc,
                                            Matrix experiment)
Performs linear regression to estimate a composed function with least squares.

Unlike regression(Function[],Matrix), this method is a facade that works only for functions of a single parameter.

See Also:
Facade Pattern

regression

public static Vector regression(Function[] funcs,
                                Matrix experiment)
Performs linear regression to estimate the statistical mean according to least squares. For a (vectorial) theory u = a*f(x) the coefficient-vector a can be estimated using experimental data.

The data of an experiment is represented as a matrix containing the experimental data for the parameter row-vectors x in the first columns and - in the last column - the experimental data of the result scalar ui of each experiment i.

Here û denotes the vector of n response variables, and â denotes the vector of p unknown parameters to be estimated. The function f must be determined according to a hypothesis.

f: R^k → R^m; x ↦ f(x) = (f_1(x_1), ..., f_m(x_k))^T if k = m.
The theoretical function f can apply any combination of real functions f_i on the different parameters x_i.

Returns:
an estimate for the true coefficients vector â.
See Also:
regression(Vector, Matrix, Matrix)
Preconditions:
experiment.dimension().width - 1 == funcs.length
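
One reading of the experiment layout described above is that column i of each row is passed through f_i to form a matrix of predictors, while the last column supplies the responses. The sketch below illustrates that reading with plain arrays. It does not use the Orbital Function or Matrix types; the class DesignMatrixSketch, its method names and the use of DoubleUnaryOperator are assumptions made for illustration.

```java
import java.util.function.DoubleUnaryOperator;

final class DesignMatrixSketch {
    /** Applies funcs[i] to column i of every experiment row; the last column is skipped. */
    static double[][] designMatrix(DoubleUnaryOperator[] funcs, double[][] experiment) {
        int n = experiment.length;
        int m = funcs.length;
        double[][] A = new double[n][m];
        for (int r = 0; r < n; r++) {
            if (experiment[r].length - 1 != m) {
                throw new IllegalArgumentException("row width - 1 must equal funcs.length");
            }
            for (int i = 0; i < m; i++) {
                A[r][i] = funcs[i].applyAsDouble(experiment[r][i]);
            }
        }
        return A;
    }

    /** Extracts the response u_i of each experiment from the last column. */
    static double[] responses(double[][] experiment) {
        double[] u = new double[experiment.length];
        for (int r = 0; r < experiment.length; r++) {
            u[r] = experiment[r][experiment[r].length - 1];
        }
        return u;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator[] funcs = { x -> x, x -> x * x };
        double[][] experiment = { {1, 2, 9}, {2, 3, 20} };   // columns: x1, x2, u
        System.out.println(java.util.Arrays.deepToString(designMatrix(funcs, experiment)));
        System.out.println(java.util.Arrays.toString(responses(experiment)));
    }
}
```

The resulting pair (A, u) has the shape expected by an elemental least-squares step for a theory of the form u = A*a.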

regression

public static Vector regression(Vector u,
                                Matrix A,
                                Matrix Cu)
                         throws java.lang.ArithmeticException
Performs elemental linear regression to estimate the statistical mean according to the method of least squares. For a (vectorial) theory u = A*a an estimate for the true coefficient-vector â is predicted such that
||A*â - u||_2 = min_a ||A*a - u||_2.

Here û denotes the vector of n response variables, and â denotes the vector of p unknown parameters to be estimated. The p predictor variables for n experiments are denoted by the n×p Matrix A.

This method is called with the experimentally deviated data u, A and the covariance Cu of the variables.

Comments: For Cu=IDENTITY(n), the result equals

pseudoInverse(A) * u
because it is the minimum-norm-solution.
  • linear equalization with least squares uses ||.||_2 as the norm.
  • linear optimization uses ||.||_1 as the norm.
  • Chebyshev equalization uses ||.||_∞ as the norm.

Parameters:
u - the experimental vector of n response variables.
A - the n×p Matrix of predictor variables for n experiments.
Cu - the n×n covariance Matrix of u. The diagonal contains the variance of each variable during experimental determination; the off-diagonal components contain the covariances between the variables.
Returns:
an estimate for the true coefficients vector â. With regard to the statistical deviation, this vector a can be used to calculate the estimated scalar u for other parameter-vectors p as
u = p*a
Throws:
java.lang.IllegalArgumentException - if the size of the response vector differs from the matrix height n.
java.lang.ArithmeticException - if the solution would be (n-m) parametric since fewer experiments exist than unknown parameters.
Preconditions:
u.dimension() == A.dimension().height
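
For the special case Cu = IDENTITY(n) and a matrix A with full column rank, the minimiser of ||A*a - u||_2 can be sketched with the normal equations (A^T A) a = A^T u. The code below is an illustration with plain arrays, not the Orbital implementation: the class LeastSquaresSketch is hypothetical, the covariance Cu is ignored, and Gaussian elimination is a substitute chosen here for the pseudo-inverse mentioned above.

```java
final class LeastSquaresSketch {
    /** Solves min_a ||A*a - u||_2 via the normal equations (A^T A) a = A^T u. */
    static double[] leastSquares(double[][] A, double[] u) {
        int n = A.length, p = A[0].length;
        if (u.length != n) throw new IllegalArgumentException("u must have one entry per row of A");
        if (n < p) throw new ArithmeticException("fewer experiments than unknown parameters");

        // Augmented system [A^T A | A^T u] of size p x (p+1).
        double[][] m = new double[p][p + 1];
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++)
                for (int r = 0; r < n; r++) m[i][j] += A[r][i] * A[r][j];
            for (int r = 0; r < n; r++) m[i][p] += A[r][i] * u[r];
        }

        // Gaussian elimination with partial pivoting.
        for (int c = 0; c < p; c++) {
            int pivot = c;
            for (int r = c + 1; r < p; r++)
                if (Math.abs(m[r][c]) > Math.abs(m[pivot][c])) pivot = r;
            double[] tmp = m[c]; m[c] = m[pivot]; m[pivot] = tmp;
            if (m[c][c] == 0) throw new ArithmeticException("A^T A is singular");
            for (int r = c + 1; r < p; r++) {
                double factor = m[r][c] / m[c][c];
                for (int j = c; j <= p; j++) m[r][j] -= factor * m[c][j];
            }
        }

        // Back substitution.
        double[] a = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double s = m[i][p];
            for (int j = i + 1; j < p; j++) s -= m[i][j] * a[j];
            a[i] = s / m[i][i];
        }
        return a;
    }

    public static void main(String[] args) {
        // Fit u = a0 + a1*x to points that lie on u = 1 + 2x.
        double[][] A = { {1, 0}, {1, 1}, {1, 2}, {1, 3} };
        double[] u = {1, 3, 5, 7};
        System.out.println(java.util.Arrays.toString(leastSquares(A, u)));
    }
}
```

With the estimate a in hand, the predicted response for a new parameter row p is the dot product p*a, as stated in the Returns section.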

Orbital library
1.3.0: 11 Apr 2009

Copyright © 1996-2009 André Platzer
All Rights Reserved.