11 Numerics

In this chapter, we will begin to shape our toolkit towards more specific scientific and engineering needs. As we survey the field, the gnu scientific library is a strong model to aspire to. In fact in a later chapter will build a binding directly to the library. Another family favored in the world of Python is the NumPy, SciPy and friends. In fact it wont be far fetched to assert these libraries have rendered Python a system of choice in machine learning, signal processing and other such applications.

The libraries are great but were they to be not easily adaptible to our needs, building one from scratch is a daunting effort. They tend to hide some of the practical issues and potential solutions e.g. comparison of floating point variables or dealing with zero divide issues and the like.

The projectlets of this chapter fill the engineer’s toolkit with e.g. support for descriptive statistics.

11.1 Learning Objectives

Real (floating point) arrays
Basic descriptive statistics support.
Random numbers and test data generation

11.2 Numbers Library for Floating Point data types

11.2.1 Core Definitions

Being a strongly typed language, we have to strategize be library development. Ada solution for this need is the generics. In other words like the rest of the Predefined library, a generic package can be designed then instantiated for other data types. With a view to eventually transitioning in this direction, the core approach is to rename the standard data type float as RealType and use the latter throughout. When the package is migrated to be generic the specific data type (float, long_float or even Long_Long_Float) will be the instantiation parameter.

~/bin/codemd ../../toolkit/adalib/src/numlib.ads -x Core -l

0005 | package numlib is
0006 | 
0007 |    subtype RealType is Float;
0008 |    eps : constant RealType := 0.000_1;
0009 |    type RealArray is array (Integer range <>) of RealType;
0010 |    package RealVectors_Pkg is new Ada.Containers.Vectors (Natural, RealType);
0011 |    use RealVectors_Pkg;
0012 |    package Sorter_Pkg is new RealVectors_Pkg.Generic_Sorting ("<");
0013 | 
0014 |    function Convert (a : RealArray) return Vector;
0015 |    function Convert (v : Vector) return RealArray;
0016 | 
0017 |    function Create
0018 |      (length : Natural; default : RealType := RealType'First) return Vector;
0019 |    function Create (from : Vector) return Vector;
0020 |    function Create (length : Natural; low, high : RealType) return Vector;
0021 |    function Create (low, high, step : RealType) return Vector;
0022 |

The above segment specifies several functions and procedures in terms of the data type RealType. Based on this a new package is created ie RealVectors_Pkg which affords a lot more power than the basic array supported by the language. E.g. This provides an iteration framework which we will be leveraging extensively.

However we also define some helpers Convert to ease the adoption of this package. In addition, specification and implementation of other support e.g. for pairwise operations is significantly improved.

~/bin/codemd ../../toolkit/adalib/src/numlib.ads -x Pair -l

0044 |    -- Pairwise arithmetic support
0045 |    procedure Add (v : in out Vector; value : Vector);
0046 |    procedure Sub (v : in out Vector; value : Vector);
0047 |    procedure Mult (v : in out Vector; value : Vector);
0048 |    procedure Div (v : in out Vector; value : Vector);
0049 | 
0050 |    procedure Append (v1 : in out Vector; v2 : Vector) renames
0051 |      RealVectors_Pkg.Append_Vector;
0052 |    function Append (v1 : Vector; v2 : Vector) return Vector;

11.2.2 Vector Creation

In the spirit of NumPy many flexible ways to create arrays:

~/bin/codemd ../../toolkit/adalib/src/numlib.adb -x Create -l

0040 |    function Create (from : Vector) return Vector is
0041 |       result : Vector;
0042 |    begin
0043 |       result := To_Vector (0.0, from.Length);
0044 |       Add (result, from);
0045 |       return result;
0046 |    end Create;
0047 | 
0048 |    function Create (length : Natural; low, high : RealType) return Vector is
0049 |       result : Vector            := Create (length);
0050 |       val    : RealType          := low;
0051 |       vdelta : constant RealType := (high - low) / Float (length - 1);
0052 |    begin
0053 |       for idx in 0 .. length - 1 loop
0054 |          result.Replace_Element (idx, val);
0055 |          val := val + vdelta;
0056 |       end loop;
0057 |       return result;
0058 |    end Create;

or a much more flexible, customizable appoach :

~/bin/codemd ../../toolkit/adalib/src/numlib.adb -x Populate -l

0073 |    function Create
0074 |      (length   : Natural;
0075 |       populate : not null access function
0076 |         (length, idx : Natural) return RealType)
0077 |       return Vector
0078 |    is
0079 |       use type Ada.Containers.Count_Type;
0080 |       result : Vector;
0081 |    begin
0082 |       result.Set_Length (Ada.Containers.Count_Type (length));
0083 |       for idx in 0 .. length - 1 loop
0084 |          Replace_Element (result, idx, populate (length, idx));
0085 |       end loop;
0086 |       return result;
0087 |    end Create;

This could be used for example to create signals to explore Digital Signal Processing algorithms.

11.3 Extensions - Statistics

In the following example, an array is created and populated with Normally distributed random variables:

~/bin/codemd ../../toolkit/adalib/src/numlib-statistics.adb -x Normal -l

0027 |    function Normal
0028 |      (length : Natural; mean : RealType := 0.0; stdev : RealType := 1.0)
0029 |       return Vector
0030 |    is
0031 |       result : Vector;
0032 |       procedure Normal (c : Cursor) is
0033 |          g : constant RealType := GNAT.Random_Numbers.Random_Gaussian (Gen);
0034 |       begin
0035 |          Replace_Element (result, To_Index (c), g * stdev + mean);
0036 |       end Normal;
0037 |    begin
0038 |       result.Set_Length (Ada.Containers.Count_Type (length));
0039 |       result.Iterate (Normal'Access);
0040 |       return result;
0041 |    end Normal;

11.3.1 Sample Runs

The following example generates 100 normally distributed random variates with a mean of 10 and a standard deviation of 5:

~/bin/codemd ../../toolkit/examples/stats/src/stats.adb -x Random -l

0053 |    procedure random is
0054 |       use numlib.RealVectors_Pkg ;
0055 |       unit : constant String := Enclosing_Entity ;
0056 |       rv : Vector ;
0057 |    begin
0058 |       if verbose
0059 |       then
0060 |          Put_Line(unit);
0061 |       end if;
0062 |       if Argument_Count > 1
0063 |       then 
0064 |          rv := numlib.statistics.Normal(100,10.0,5.0);
0065 |          Put_Line("Normal ");
0066 |       else
0067 |          rv := numlib.statistics.Uniform(100);
0068 |          Put_Line("Uniform ");
0069 |       end if ;
0070 |       show(" " , rv);
0071 |    end random ;

and a sample run of the above:

~/bin/stats r n

Stats.random
Normal 
Vector  
Length  100
Mean  1.04035E+01
Stdev  5.44411E+00
Values 
 6.89268E+00, 1.34863E+01, 7.88176E+00, 9.10447E+00, 1.34546E+01, 1.19114E+01, 8.53111E+00,
 2.08609E+00, 1.93468E+01, 9.92501E+00, 1.08521E+01, 4.81616E+00, 7.66150E+00, 1.09800E+01,
 1.03812E+01, 7.74668E+00, 1.39057E+01, 1.33314E+00, 4.37127E+00,-1.46028E+00, 1.51027E+01,
 1.54312E+01, 1.28105E+01, 5.66142E+00, 7.96846E+00, 1.80275E+01, 2.00069E+01, 1.53938E+01,
 2.72482E+00, 7.31482E+00, 1.41641E+01, 6.43051E+00, 1.94287E+01, 8.58459E+00, 2.48774E+01,
 5.88362E+00, 1.26574E+01, 9.17087E+00, 3.03655E+00,-7.20230E-01, 3.58198E+00, 1.74868E+01,
 8.99496E+00, 1.13188E+01, 9.84350E+00, 9.32866E+00, 1.16671E+01, 8.52222E-01, 1.18996E+01,
 2.28308E+01, 2.09008E+01, 1.04644E+01, 1.28988E+01, 7.29402E+00, 9.65872E+00, 5.34604E+00,
 1.37380E+01, 1.55229E+01, 1.47116E+01, 1.12253E+01, 8.70695E+00, 1.52813E+01, 1.27516E+01,
 1.23011E+01, 1.05752E+01, 1.17954E+01, 7.80758E+00, 4.33020E+00, 1.14758E+01, 1.99570E+00,
 9.87057E+00, 1.39714E+01, 1.56334E+01, 3.81311E+00, 7.74414E+00, 4.02613E+00, 1.31363E+01,
 8.24636E+00, 9.04350E+00, 1.07669E+01, 5.43279E+00, 5.90409E+00,-2.54082E-01, 1.64418E+01,
 1.01381E+01, 1.44102E+01, 9.11493E+00, 1.20032E+01, 6.24473E+00, 1.91956E+01, 7.99577E+00,
 1.40970E+01, 1.36972E+01, 7.34773E+00, 8.97826E+00, 2.49183E+01, 1.04659E+00, 1.65986E+01,
 1.43930E+01, 1.26444E+01,

11.4 Extensions - Polynomials

With the toolkit chosen, polynomial evaluation and such become extremely simple:

~/bin/codemd ../../toolkit/adalib/src/numlib-polynomials.adb -x Eval -l

0003 |    function Evaluate (v : Vector; x : RealType) return RealType is
0004 |       result : RealType := 0.0;
0005 |       procedure Evaluate (c : Cursor) is
0006 |       begin
0007 |          result := result * x + Element (c);
0008 |       end Evaluate;
0009 |    begin
0010 |       v.Reverse_Iterate (Evaluate'Access);
0011 |       return result;
0012 |    end Evaluate;

11.5 Stretch

As developed now, the data items are of Float data type. Conversion to a generic approach to enable Long_Float and other Fixed data types would enhance the library. An example is https://github.com/RajaSrinivasan/numerics.git.
Subsetting vectors using windows or slices is a common need specifically for Signal processing applications.
Higher order moments like skewness, kurtosis and other computations such as median and quantiles are extremely useful for data analysis.

11.6 Sample Implementation

Repository: toolkit
Repository: TechAdaBook 
Directory: examples/stats