A PHP Linear Regression Function

The inspiration for this function came from "Fitting Functions to Data: Linear and Exponential Regression", one of the miscellaneous on-line topics for Finite Mathematics and Calculus Applied to the Real World.


/**
 * linear regression function: fits a least-squares line y = m*x + b
 * @param array $x x-coords (zero-indexed list)
 * @param array $y y-coords (zero-indexed list, same length as $x)
 * @return array m=>slope, b=>intercept
 */
function linear_regression($x, $y) {

  // calculate number of points
  $n = count($x);
  
  // ensure both arrays of points are the same size
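  if ($n != count($y)) {

    // throw rather than trigger_error(): passing E_USER_ERROR to trigger_error() is deprecated as of PHP 8.4
    throw new InvalidArgumentException("linear_regression(): Number of elements in coordinate arrays do not match.");
  
  }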

  // calculate sums
  $x_sum = array_sum($x);
  $y_sum = array_sum($y);

  $xx_sum = 0;
  $xy_sum = 0;
  
  for($i = 0; $i < $n; $i++) {
  
    $xy_sum+=($x[$i]*$y[$i]);
    $xx_sum+=($x[$i]*$x[$i]);
    
  }
  
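  // calculate slope; the denominator is zero when there are no points
  // or when every x value is the same (a vertical line has no finite slope)
  $denominator = ($n * $xx_sum) - ($x_sum * $x_sum);

  if ($denominator == 0) {

    throw new InvalidArgumentException("linear_regression(): Slope is undefined; x-coords need at least two distinct values.");

  }

  $m = (($n * $xy_sum) - ($x_sum * $y_sum)) / $denominator;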
  
  // calculate intercept
  $b = ($y_sum - ($m * $x_sum)) / $n;
    
  // return result
  return array("m"=>$m, "b"=>$b);

}

Example Usage:

var_dump( linear_regression(array(1, 2, 3, 4), array(1.5, 1.6, 2.1, 3.0)) );
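
As a quick check on the example above, the sums work out by hand to n = 4, Σx = 10, Σy = 8.2, Σxy = 23 and Σx² = 30, which gives a slope of (4·23 − 10·8.2) / (4·30 − 10²) = 0.5 and an intercept of (8.2 − 0.5·10) / 4 = 0.8. The sketch below restates the function compactly so that it runs on its own, then uses the result to compute points on the fitted line. Note that `trend_line()` is a hypothetical helper added for illustration and is not part of the original function.

```php
<?php
// Compact restatement of linear_regression() from the post so this block runs on its own.
function linear_regression($x, $y) {
  $n = count($x);
  $x_sum = array_sum($x);
  $y_sum = array_sum($y);
  $xx_sum = 0;
  $xy_sum = 0;
  for ($i = 0; $i < $n; $i++) {
    $xy_sum += $x[$i] * $y[$i];
    $xx_sum += $x[$i] * $x[$i];
  }
  $m = (($n * $xy_sum) - ($x_sum * $y_sum)) / (($n * $xx_sum) - ($x_sum * $x_sum));
  $b = ($y_sum - ($m * $x_sum)) / $n;
  return array("m" => $m, "b" => $b);
}

// Hypothetical helper (not in the original post): evaluate y = m*x + b at each x.
function trend_line($x, $fit) {
  $points = array();
  foreach ($x as $x_val) {
    $points[] = $fit["m"] * $x_val + $fit["b"];
  }
  return $points;
}

$x = array(1, 2, 3, 4);
$y = array(1.5, 1.6, 2.1, 3.0);

$fit = linear_regression($x, $y);
$line = trend_line($x, $fit);

// Print slope and intercept, rounded to hide floating-point noise.
printf("m = %.3f, b = %.3f\n", $fit["m"], $fit["b"]);

// Print the fitted y-values at each x, rounded to two decimals.
$rounded = array_map(function ($v) { return number_format($v, 2); }, $line);
echo implode(", ", $rounded), "\n";
```

The fitted values are what a charting library would plot as the straight trend line through the data.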

Software developer by day, scale model builder and wargamer by night.

Posted in PHP
20 comments on “A PHP Linear Regression Function”
  1. […] Filtering out the crap so you don’t have to « A PHP Linear Regression Function An ASP.Net Linear Regression Function January 25th,2006 […]

  2. Robb Laney says:

    On your linear regression code …

    I don’t know PHP but I could still follow the action. The parameter equations seem to match my CRC Math Tables.

    Thanks for the time saver!

  3. Richard@Home says:

    Hehe, ’tis funny, Robb: before I wrote this I couldn’t follow the CRC Math Tables 😉

  4. Nice, dude. I had forgotten that it really was this easy. That’ll save me some time.

  5. Khany says:

    Brilliant tutorial.

    How would I expand this to do multiple regression on a nonlinear curve?

    I would have 1 dependent column and 2 independent columns and at least 5 records of data.

    Thanks

  6. JakeT says:

    I used this code to create trendlines in Open Flash Charts.

    I was making trendlines from a single set of y-values, though. So I built a couple functions around this. I posted what I did on my site.

    Altogether, it returns an array of values that Open Flash Chart can then chart as a straight line. Enjoy.

  7. mBird says:

    Hi —

    In some cases you could get a Divide by Zero warning, so I would make a slight change to your code (note I do not use try/catch since PHP won’t catch warnings):
    $divisor = (($n * $xx_sum) - ($x_sum * $x_sum));
    if ($divisor == 0) $m = 0;
    else $m = (($n * $xy_sum) - ($x_sum * $y_sum)) / $divisor;

    Thank you!

  8. aarts says:

    You’re the damn master!!

    Now my cloud tags look seriously good 🙂

    Thank you very much!

  9. Daniel says:

    Any chance you can turn this into cubic regression for us??

  10. paul salber says:

    I am sure the correlation coefficient is easy to add to this code. Anybody want to have a go?

  11. Petronio Costa says:

    adding r2:

    in this part of the code add

    for($i = 0; $i < $n; $i++) {

    $xy_sum+=($x[$i]*$y[$i]);
    $xx_sum+=($x[$i]*$x[$i]);
    $yy_sum+=($y[$i]*$y[$i]);
    […]
    "m"=>$m, "b"=>$b, "r"=>$r, "r2"=>$r2);

  12. ironpoet says:

    Sorry, the comment above was scrambled.

    Let me try again:

    for($i = 0; $i < […] "m"=>$m, "b"=>$b, "r"=>$r, "r2"=>$r2);

  13. Thomas says:

    thanks man, easy and works like a charm ..

  14. VIDA says:

    I will immediately take hold of your rss as I can’t find your e-mail subscription hyperlink or newsletter service. Do you’ve any? Kindly allow me recognize in order that I could subscribe. Thanks.
    VIDA http://www.net-ict.be/

  15. Mauricio says:

    Thank you very much, from Colombia.
    I hope I can help you sometime.

  16. OPS says:

    Thank you very much, it helped me a lot.
    Greetings from Chile

  17. Mwalima says:

    I like it very much and adapted it to OOP:

    class Regression {

    private $xGegeven = array(1, 2, 3, 4);
    private $yGegeven = array(12851, 13524, 14257, 14552);

    public function __construct($x = array(), $y = array()) {
    // use the supplied points when given, otherwise keep the sample data above
    if (count($x) > 0) {
    $this->xGegeven = $x;
    $this->yGegeven = $y;
    }
    }

    public function NumberOfPoints($x) {
    // calculate number points
    $x = $this->xGegeven;
    //var_dump($x).'x';
    $n = count($x);
    //var_dump($n).'n';
    return $n;
    }

    public function variableY($y) {
    // ensure both arrays of points are the same size
    $y = $this->yGegeven;
    $n = $this->NumberOfPoints($this->xGegeven);
    if ($n != count($y)) {
    trigger_error("linear_regression(): Number of elements in coordinate arrays do not match.", E_USER_ERROR);
    }
    }

    public function xx_Sum($x) {
    // calculate sums
    $x = $this->xGegeven;
    $n = $this->NumberOfPoints($this->xGegeven);

    $xx_sum = 0;

    for ($i = 0; $i < $n; $i++) {
    $xx_sum+=($x[$i] * $x[$i]);
    }
    // echo 'xx_sum'.$xx_sum.'';
    return $xx_sum;
    }

    public function xy_Sum($x, $y) {
    // calculate sums
    $y = $this->yGegeven;
    $x = $this->xGegeven;
    $n = $this->NumberOfPoints($this->xGegeven);

    $xy_sum = 0;

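    for ($i = 0; $i < $n; $i++) {
    $xy_sum+=($x[$i] * $y[$i]);
    }
    return $xy_sum;
    }

    public function y_sum() {
    $y = $this->yGegeven;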
    $y_sum = array_sum($y);
    return $y_sum;
    }

    public function x_sum() {
    $x = $this->xGegeven;
    $x_sum = array_sum($x);
    return $x_sum;
    }

    public function calculateM() {
    // calculate slope

    $xy_sum = $this->xy_Sum($this->xGegeven, $this->yGegeven);
    $x_sum = $this->x_sum();
    $xx_sum = $this->xx_Sum($this->xGegeven);
    $y_sum = $this->y_sum();
    $n = $this->NumberOfPoints($this->xGegeven);
    $m = (($n * $xy_sum) - ($x_sum * $y_sum)) / (($n * $xx_sum) - ($x_sum * $x_sum));
    return $m;
    }

    public function calculateB() {
    // calculate intercept

    $x_sum = $this->x_sum();
    $y_sum = $this->y_sum();
    $n = $this->NumberOfPoints($this->xGegeven);
    $m = $this->calculateM();
    $b = ($y_sum - ($m * $x_sum)) / $n;
    return $b;
    }

    }

  18. liars says:

    Yes! Finally someone writes about cheaters.

  19. I added SEE calculations:

    // Calculate SEE
    foreach($x as $x_row => $x_val){
    $y_val = $y[$x_row];
    $error[] = pow($y_val-($m*$x_val+$b),2);
    }
    $s = sqrt(array_sum($error)/($n-2));
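
Several of the comments above ask for, or partly sketch, extras on top of the slope and intercept: a guard against dividing by zero, the correlation coefficient r with r², and the standard error of the estimate (SEE). Since the r² snippets arrived scrambled, what follows is an independent sketch using the standard Pearson formula, r = (nΣxy − ΣxΣy) / √((nΣx² − (Σx)²)(nΣy² − (Σy)²)), and not a reconstruction of anyone's code. The function name `linear_regression_stats()` and the `"r"`, `"r2"` and `"see"` keys are assumptions made for illustration.

```php
<?php
// Hypothetical extension of linear_regression(); the name and extra keys are illustrative only.
function linear_regression_stats($x, $y) {
  $n = count($x);
  if ($n != count($y)) {
    throw new InvalidArgumentException("Coordinate arrays must be the same size.");
  }

  // Accumulate the sums needed for the slope, intercept and correlation.
  $x_sum = array_sum($x);
  $y_sum = array_sum($y);
  $xx_sum = 0;
  $xy_sum = 0;
  $yy_sum = 0;
  for ($i = 0; $i < $n; $i++) {
    $xy_sum += $x[$i] * $y[$i];
    $xx_sum += $x[$i] * $x[$i];
    $yy_sum += $y[$i] * $y[$i];
  }

  // n*Sxx - (Sx)^2 is zero when there are no points or all x values are equal.
  $x_spread = ($n * $xx_sum) - ($x_sum * $x_sum);
  $y_spread = ($n * $yy_sum) - ($y_sum * $y_sum);
  if ($x_spread <= 0) {
    throw new InvalidArgumentException("Slope is undefined: x-coords need at least two distinct values.");
  }

  $covariance = ($n * $xy_sum) - ($x_sum * $y_sum);
  $m = $covariance / $x_spread;
  $b = ($y_sum - ($m * $x_sum)) / $n;

  // Pearson r is undefined when all y values are equal; report null in that case.
  $r = ($y_spread > 0) ? $covariance / sqrt($x_spread * $y_spread) : null;
  $r2 = ($r === null) ? null : $r * $r;

  // SEE = sqrt(sum of squared residuals / (n - 2)); only defined for n > 2.
  $see = null;
  if ($n > 2) {
    $squared_error = 0;
    for ($i = 0; $i < $n; $i++) {
      $residual = $y[$i] - ($m * $x[$i] + $b);
      $squared_error += $residual * $residual;
    }
    $see = sqrt($squared_error / ($n - 2));
  }

  return array("m" => $m, "b" => $b, "r" => $r, "r2" => $r2, "see" => $see);
}

$stats = linear_regression_stats(array(1, 2, 3, 4), array(1.5, 1.6, 2.1, 3.0));
printf("m = %.3f, b = %.3f, r = %.4f, r2 = %.4f, SEE = %.4f\n",
  $stats["m"], $stats["b"], $stats["r"], $stats["r2"], $stats["see"]);
```

For the example data from the post this gives r² of roughly 0.89 and an SEE of roughly 0.28. Throwing on a zero denominator, instead of silently returning a slope of 0, is a design choice in this sketch: identical x values describe a vertical line, for which no finite slope exists.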
