Regression Challenge

zrosnbrick
Posts: 137
Joined: Fri Mar 17, 2006 3:58 am
Location: Madison, WI, USA

Regression Challenge

Post by zrosnbrick »

The Challenge:
Write, from scratch, a regression program that, given n points (x1, y1), (x2, y2), ..., (xn, yn), produces the coefficients of the degree-(n-1) polynomial that passes through all of them.

Post your code and times for various orders here.
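For concreteness, here is a minimal Python sketch of one way to meet the challenge (illustrative only; it solves the Vandermonde system with plain Gaussian elimination, and the function name is mine, not part of the challenge):

```python
# Illustrative sketch: find the coefficients of the degree-(n-1)
# polynomial through n points by solving the Vandermonde system
# V c = y with Gaussian elimination and partial pivoting.

def interpolating_coefficients(points):
    """Return coefficients [c0, c1, ..., c_{n-1}] in ascending powers."""
    n = len(points)
    # Augmented Vandermonde matrix: row r is [x^0, x^1, ..., x^{n-1} | y].
    m = [[x ** p for p in range(n)] + [y] for x, y in points]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = s / m[r][r]
    return coeffs
```

For example, the points (0, 1), (1, 0), (2, 3) lie on 2x^2 - 3x + 1, so the sketch should return approximately [1, -3, 2].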
ζ(α + βi) = 0 ⇒ α = 1/2
π + e


Post by zrosnbrick »

Code: Select all

use strict;
use warnings;

# Determinant of a k x k matrix stored flat in row-major order,
# computed by recursive cofactor expansion along the first row.
sub determinant { my($a) = @_ ;
        my $k = scalar(@$a) ** .5;
        if ($k == 1) {
                return $$a[0];
        }
        if ($k == 2) {
                return ($$a[0] * $$a[3] - $$a[1] * $$a[2]);
        }
        my $sign = -1;
        my $sum = 0;
        for (my $i = 0; $i < $k; $i ++) {
                $sign *= -1;
                next if($$a[$i] == 0);  # a zero entry contributes nothing
                # Build the minor: drop row 0 and column $i.
                my @b = ();
                for (my $j = $k; $j < $k ** 2; $j ++) {
                        next if($j % $k == $i);
                        push @b, $$a[$j];
                }
                $sum += $sign * $$a[$i] * determinant(\@b);
        }
        return $sum;
}

# Inverse of a k x k matrix stored flat: the adjugate (transposed
# cofactor matrix) divided by the determinant.
sub inverse { my($a) = @_ ;
        my $k = scalar(@$a) ** .5;
        my @i = ();
        for (my $i = 0; $i < $k ** 2; $i += $k) {
                for (my $j = $i; $j < $i + $k; $j ++) {
                        # Minor for the transposed position: drop the row
                        # and column with indices swapped relative to $j.
                        my @b = ();
                        for (my $n = 0; $n < $k ** 2; $n ++) {
                                next if($n % $k == int($j / $k));
                                next if(int($n / $k) == $j % $k);
                                push @b, $$a[$n];
                        }
                        push @i, ((-1) ** (int($j / $k) + $j % $k) * determinant(\@b));
                }
        }
        my $d = determinant($a);
        for (my $i = 0; $i < $k ** 2; $i ++) {
                $i[$i] /= $d;
        }
        return @i;
}

# Multiply a flat s x s matrix by a length-s column vector.
sub multiply { my($a, $b) = @_ ;
        my @c = ();
        my $s = scalar(@$a) ** .5;
        for (my $i = 0; $i < $s; $i ++) {
                my $p = 0;
                for (my $j = 0; $j < $s; $j ++) {
                        $p += $$a[$s * $i + $j] * $$b[$j];
                }
                push @c, $p;
        }
        return @c;
}

# Read x/y pairs from the command line, build the Vandermonde matrix
# (descending powers of each x), and solve V * coeffs = y.
my @a1 = ();
my @a2 = ();
my $n = scalar(@ARGV);
for (my $r = 0; $r < $n / 2; $r ++) {
        for (my $p = $n / 2 - 1; $p >= 0; $p --) {
                push @a1, $ARGV[2 * $r] ** $p;
        }
        push @a2, $ARGV[2 * $r + 1];
}

@a1 = inverse(\@a1);
my @ans = multiply(\@a1, \@a2);
print "@ans\n";

Code: Select all

Order 1: 0.013s
Order 2: 0.013s
Order 3: 0.013s
Order 4: 0.016s
Order 5: 0.030s
...
Order 8: 0.691s
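The blow-up in those times comes from the determinant routine: recursive cofactor expansion makes on the order of k! calls. A hedged Python sketch (my own illustration, not the Perl above) of the same determinant via row reduction runs in O(k^3) instead:

```python
# Illustrative O(k^3) determinant by Gaussian elimination, as an
# alternative to O(k!) cofactor expansion.  The product of the pivots,
# with a sign flip per row swap, equals the determinant.

def det(a):
    """Determinant of a square matrix given as a list of rows."""
    m = [row[:] for row in a]          # work on a copy
    n = len(m)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0                 # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            d = -d                     # a row swap flips the sign
        d *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return d
```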

jdrandall123
Posts: 65
Joined: Sun Mar 26, 2006 11:57 am
Location: New York, USA

Post by jdrandall123 »

Linear algebra approach, in J. Essentially an O(n^3) algorithm, so not practical for large arrays. However, it is easy to program and fast for small arrays. Timings are on a 1.6 GHz Celeron with 1 GB of RAM.

Code: Select all

ts=: 6!:2, 7!:2@]                 NB. time and space
VM=: ((/ ~) (@i.)) (@#)           NB. Vandermonde adverb
L=:] %. ^VM @: [                  NB. Lagrange interpolation x L y
test=:([: i. x:) L ([: ? ]#0:) f. NB. test n generates data and interpolates
   ts 'test 5'
0.00197427 16448
   ts 'test 10'
0.0284352 63424
   ts 'test 20'
0.201783 294272
   ts 'test 30'
1.29029 698176
 


Post by jdrandall123 »

If the purpose of this is actually to interpolate, you don't need the explicit coefficients of the Lagrange polynomial, and you can write much faster, O(n^2), code. In particular, you do not need to invert a matrix. Here is some code in J:

Code: Select all

ts=: 6!:2, 7!:2@]                 NB. time and space
NB. x lag y gives function to do interpolation
lag=:2 : '+/n*(*/"1 (y-c)%(m-c)) [ c=.1 ]\. m'
NB. test n generates data, interpolates and evaluates
test=:3 : '((i. x: y) lag (? y # 0)) ? 0'

   ts 'test 10'
0.000339708 17792
   ts 'test 100'
0.0303413 1.3872e6
   ts 'test 1000'
1.45002 1.14175e8
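The same O(n^2), no-matrix-inversion idea can be sketched in Python using Newton's divided differences (my choice of route, not a transcription of the J code above):

```python
# Illustrative O(n^2) interpolation via Newton's divided differences:
# build the difference table in O(n^2), then evaluate the Newton form
# in O(n).  No matrix is ever inverted.

def newton_interpolate(xs, ys, t):
    """Evaluate the interpolating polynomial through (xs, ys) at t."""
    n = len(xs)
    coef = list(ys)                    # divided-difference table, in place
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    # Horner-style evaluation of the Newton form.
    result = coef[-1]
    for i in range(n - 2, -1, -1):
        result = result * (t - xs[i]) + coef[i]
    return result
```

For example, through (0, 1), (1, 0), (2, 3) the interpolant is 2t^2 - 3t + 1, so evaluating at t = 3 should give 10.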
 
If you can evaluate the interpolating function at complex values, you can get the coefficients by evaluating it at the roots of unity and applying the Fast Fourier Transform, which is O(n log(n)); the n evaluations dominate, giving an overall O(n^2). Any approach which involves inverting a matrix is O(n^3).
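The roots-of-unity step can be sketched as follows (my illustration, with a naive O(n^2) inverse DFT standing in for the FFT to keep the code short; an FFT would make this step O(n log n)):

```python
import cmath

# Illustrative sketch: if the degree-(n-1) interpolant can be evaluated
# at the n-th roots of unity, its coefficients are the inverse DFT of
# those values.

def coeffs_from_root_values(values):
    """Inverse DFT: values p(w^0), ..., p(w^{n-1}) -> coefficients c0..c_{n-1}."""
    n = len(values)
    w = cmath.exp(-2j * cmath.pi / n)  # conjugate root for the inverse transform
    return [sum(v * w ** (j * k) for j, v in enumerate(values)) / n
            for k in range(n)]

def eval_poly(coeffs, x):
    """Horner evaluation; coefficients in ascending powers."""
    r = 0
    for c in reversed(coeffs):
        r = r * x + c
    return r
```

Evaluating a known polynomial at the roots of unity and feeding the values back through `coeffs_from_root_values` should recover its coefficients.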

Lagrange interpolation is not suitable for real-world problems with large n: the high-degree polynomials oscillate too much between the data points (Runge's phenomenon). This is why people tend to use some form of spline interpolation (e.g. cubic splines, NURBS).
