Non-linearity

Dr. D’Agostino McGowan

Non-linear relationships

What have we used so far to deal with non-linear relationships?

  • Hint: What did you use in Lab 03?
  • Polynomials!

Polynomials

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \dots + \beta_d x_i^d + \epsilon_i$$

  • These data come from the Colombia World Fertility Survey (1975-76), collected to examine household composition
  • Fit here with a 4th degree polynomial

How is it done?

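  • New variables are created ($X_1 = X$, $X_2 = X^2$, $X_3 = X^3$, etc.) and treated as predictors in a multiple linear regression
  • We are not interested in the individual coefficients; we are interested in the fitted function at a specific value $x_0$

$$\hat{f}(x_0) = \hat\beta_0 + \hat\beta_1 x_0 + \hat\beta_2 x_0^2 + \hat\beta_3 x_0^3 + \hat\beta_4 x_0^4$$

  • or, more often, in the change between two values, $a$ and $b$

$$\hat{f}(b) - \hat{f}(a) = \hat\beta_1 b + \hat\beta_2 b^2 + \hat\beta_3 b^3 + \hat\beta_4 b^4 - \hat\beta_1 a - \hat\beta_2 a^2 - \hat\beta_3 a^3 - \hat\beta_4 a^4$$

$$\hat{f}(b) - \hat{f}(a) = \hat\beta_1 (b - a) + \hat\beta_2 (b^2 - a^2) + \hat\beta_3 (b^3 - a^3) + \hat\beta_4 (b^4 - a^4)$$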

Polynomial Regression

$$\hat{f}(b) - \hat{f}(a) = \hat\beta_1 (b - a) + \hat\beta_2 (b^2 - a^2) + \hat\beta_3 (b^3 - a^3) + \hat\beta_4 (b^4 - a^4)$$

How do you pick a and b?

  • If given no other information, a sensible choice may be the 25th and 75th percentiles of x
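
The change formula and the default choice of a and b can be sketched in a few lines of Python. This is only an illustration: the data, the coefficient values, and the helper name `predicted_change` are all made up here and do not come from the fitted model in these slides.

```python
import numpy as np

def predicted_change(beta, a, b):
    """Return f_hat(b) - f_hat(a) for a polynomial with coefficients
    beta = [beta_0, beta_1, ..., beta_d]. The intercept cancels."""
    powers = np.arange(1, len(beta))          # 1, 2, ..., d
    return float(np.sum(np.asarray(beta[1:]) * (b ** powers - a ** powers)))

# Hypothetical data and coefficients, for illustration only
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
beta_hat = [2.0, 0.5, -0.3, 0.02, 0.001]      # made-up beta_0, ..., beta_4

a, b = np.percentile(x, [25, 75])             # 25th and 75th percentiles of x
change = predicted_change(beta_hat, a, b)

# Same answer as evaluating the full polynomial at b and at a
f_hat = np.polynomial.Polynomial(beta_hat)
print(a, b, change, f_hat(b) - f_hat(a))
```

Because the intercept drops out of the difference, only the slope-type coefficients are needed to report the change.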

Polynomial Regression

$$\text{pop} = \beta_0 + \beta_1 \text{age} + \beta_2 \text{age}^2 + \beta_3 \text{age}^3 + \beta_4 \text{age}^4 + \epsilon$$

Using the information below, write out the equation for the predicted change in population when age changes from the 25th percentile (24.5) to the 75th percentile (73.5).

term          estimate    std.error   statistic   p.value
(Intercept)   1672.0854   64.5606     25.8995     0.0000
age            -10.6429    9.2268     -1.1535     0.2516
I(age^2)        -1.1427    0.3857     -2.9627     0.0039
I(age^3)         0.0216    0.0059      3.6498     0.0004
I(age^4)        -0.0001    0.0000     -3.6540     0.0004

Choosing d

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \dots + \beta_d x_i^d + \epsilon_i$$

Either:

  • Pre-specify d (before looking 👀 at your data!)
  • Use cross-validation to pick d

Why?
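
One reason is that the training error can never get worse as d grows, so it cannot be used to choose d. Cross-validation scores each degree on data the fit did not see. A minimal sketch, assuming simulated data and a hypothetical helper `cv_mse` (neither is from the slides):

```python
import numpy as np

def cv_mse(x, y, degree, k=5, seed=0):
    """Mean squared prediction error of a degree-`degree` polynomial,
    estimated by k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, k)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)   # everything not in this fold
        fit = np.polynomial.Polynomial.fit(x[train], y[train], degree)
        errors.append(np.mean((y[held_out] - fit(x[held_out])) ** 2))
    return float(np.mean(errors))

# Simulated data (hypothetical): the truth is a cubic plus noise
rng = np.random.default_rng(42)
x = rng.uniform(-2, 2, size=200)
y = 1 + x - 2 * x**2 + 0.5 * x**3 + rng.normal(scale=1.0, size=200)

scores = {d: cv_mse(x, y, d) for d in range(1, 9)}
best_d = min(scores, key=scores.get)          # degree with the smallest CV error
print(scores, best_d)
```

The same folds are reused for every degree (fixed `seed`), so the scores are compared on equal footing.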

Polynomial Regression

  • Polynomials have notoriously bad tail behavior (so they can be bad for extrapolation)

What does this mean?
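
One way to see it is to fit a high-degree polynomial and evaluate it outside the range of the observed x. The sketch below uses simulated data and a degree chosen only for illustration; it is not the example from the slides.

```python
import numpy as np

# Simulated data (hypothetical): a smooth curve observed only on [0, 1]
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

fit = np.polynomial.Polynomial.fit(x, y, deg=9)

inside = fit(0.5)      # within the observed range
outside = fit(2.0)     # well beyond the observed range

print("largest |y| observed:", np.max(np.abs(y)))
print("fitted value at x = 0.5:", inside)
print("fitted value at x = 2.0:", outside)
```

Inside the data the fit stays on the scale of y; outside it, the highest powers of x dominate and the prediction runs off.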

Step functions

  • Another way to create a transformation is to cut the variable into distinct regions

$$C_1(X) = I(X < 35), \quad C_2(X) = I(35 \le X < 65), \quad C_3(X) = I(X \ge 65)$$

Step functions

  • Create dummy variables for each group
  • Include each of these variables in multiple regression
  • The choice of cutpoints or knots can be problematic (and make a big difference!)

Step functions

$$C_1(X) = I(X < 35), \quad C_2(X) = I(35 \le X < 65), \quad C_3(X) = I(X \ge 65)$$

$$C_1(X) = I(X < 15), \quad C_2(X) = I(15 \le X < 65), \quad C_3(X) = I(X \ge 65)$$

Piecewise polynomials

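  • Instead of a single polynomial in X over its whole domain, we can use different polynomials in regions defined by knots

$$y_i = \begin{cases} \beta_{01} + \beta_{11} x_i + \beta_{21} x_i^2 + \beta_{31} x_i^3 + \epsilon_i & \text{if } x_i < c \\ \beta_{02} + \beta_{12} x_i + \beta_{22} x_i^2 + \beta_{32} x_i^3 + \epsilon_i & \text{if } x_i \ge c \end{cases}$$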

What could go wrong here?

  • It would be nice to have constraints (like continuity!)
  • Insert splines!
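
The continuity problem is easy to reproduce. The sketch below fits two unconstrained cubics on either side of a single knot, using simulated data and an arbitrarily chosen knot c (both are assumptions for illustration).

```python
import numpy as np

# Simulated data (hypothetical) with a single knot at c
rng = np.random.default_rng(7)
x = np.sort(rng.uniform(0, 10, size=100))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
c = 5.0

# Two unconstrained cubic fits, one on each side of the knot
left = np.polynomial.Polynomial.fit(x[x < c], y[x < c], deg=3)
right = np.polynomial.Polynomial.fit(x[x >= c], y[x >= c], deg=3)

# Nothing forces the two pieces to agree at the knot
jump = right(c) - left(c)
print("left piece at c:", left(c), "right piece at c:", right(c), "jump:", jump)
```

The two pieces generally disagree at c, so the fitted curve has a visible jump there unless continuity is imposed.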

Linear Splines

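A linear spline with knots at $\xi_k$, $k = 1, \dots, K$, is a piecewise linear polynomial that is continuous at each knot

$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \dots + \beta_{K+1} b_{K+1}(x_i) + \epsilon_i$$

  • $b_k$ are basis functions

$$\begin{aligned} b_1(x_i) &= x_i \\ b_{k+1}(x_i) &= (x_i - \xi_k)_+, \quad k = 1, \dots, K \end{aligned}$$

Here $(\cdot)_+$ means the positive part

$$(x_i - \xi_k)_+ = \begin{cases} x_i - \xi_k & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases}$$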
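
The basis functions above translate directly into columns of a design matrix. A minimal sketch, assuming simulated data, hand-picked knots, and a hypothetical helper `linear_spline_basis`:

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns: 1, x, (x - xi_1)_+, ..., (x - xi_K)_+."""
    x = np.asarray(x, dtype=float)
    truncated = [np.maximum(x - k, 0.0) for k in knots]   # positive part
    return np.column_stack([np.ones_like(x), x, *truncated])

# Simulated data (hypothetical) and hand-picked knots
rng = np.random.default_rng(11)
x = np.sort(rng.uniform(0, 10, size=150))
y = np.abs(x - 4) + rng.normal(scale=0.5, size=x.size)
knots = [3.0, 6.0]

B = linear_spline_basis(x, knots)                  # shape (n, K + 2)
beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)   # beta_0, ..., beta_{K+1}

# Slope in each region: beta_1, then beta_1 + beta_2, then beta_1 + beta_2 + beta_3
slopes = np.cumsum(beta_hat[1:])
print(beta_hat, slopes)
```

Each truncated term is zero to the left of its knot, so it changes the slope from that knot onward without breaking continuity.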

Cubic Splines

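A cubic spline with knots at $\xi_k$, $k = 1, \dots, K$, is a piecewise cubic polynomial with continuous derivatives up to order 2 at each knot.

Again we can represent this model with truncated power functions

$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \dots + \beta_{K+3} b_{K+3}(x_i) + \epsilon_i$$

$$\begin{aligned} b_1(x_i) &= x_i \\ b_2(x_i) &= x_i^2 \\ b_3(x_i) &= x_i^3 \\ b_{k+3}(x_i) &= (x_i - \xi_k)_+^3, \quad k = 1, \dots, K \end{aligned}$$

where

$$(x_i - \xi_k)_+^3 = \begin{cases} (x_i - \xi_k)^3 & \text{if } x_i > \xi_k \\ 0 & \text{otherwise} \end{cases}$$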
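
The truncated power representation can be checked numerically. The sketch below builds the design matrix, generates a noiseless response from made-up coefficients, and recovers them by least squares. The knots, coefficients, and helper name `cubic_spline_basis` are assumptions for illustration.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Columns: 1, x, x^2, x^3, (x - xi_1)^3_+, ..., (x - xi_K)^3_+."""
    x = np.asarray(x, dtype=float)
    truncated = [np.maximum(x - k, 0.0) ** 3 for k in knots]
    return np.column_stack([np.ones_like(x), x, x**2, x**3, *truncated])

# Hypothetical knots and coefficients, used to build an exact cubic spline
knots = [0.25, 0.5, 0.75]
beta_true = np.array([1.0, -2.0, 3.0, 0.5, -4.0, 6.0, -5.0])   # K + 4 = 7 values

x = np.linspace(0, 1, 200)
B = cubic_spline_basis(x, knots)
y = B @ beta_true                                   # noiseless, for checking only

beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)
print(B.shape, np.round(beta_hat, 4))
```

With K = 3 knots the design matrix has K + 4 = 7 columns once the intercept is included.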

Natural cubic splines

A natural cubic spline extrapolates linearly beyond the boundary knots

  • This adds 4 extra constraints (2 at each boundary knot) and lets us place more internal knots for the same degrees of freedom as a regular cubic spline
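
The linear extrapolation can be verified with one standard construction of the natural cubic spline basis, built from truncated cubic terms. This is a sketch under assumptions: the helper `natural_cubic_basis` and the knot locations are made up, and software used for the slides may construct the basis differently.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """Natural cubic spline basis with K = len(knots) columns (K >= 3):
    1, x, and d_k(x) - d_{K-1}(x) for k = 1, ..., K - 2, where
    d_k(x) = [(x - xi_k)^3_+ - (x - xi_K)^3_+] / (xi_K - xi_k)."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)

    def d(k):
        # k is a 0-based index into knots; knots[-1] is the last knot xi_K
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - knots[-1], 0.0) ** 3
        return num / (knots[-1] - knots[k])

    last = len(knots) - 2                        # 0-based index of xi_{K-1}
    cols = [np.ones_like(x), x] + [d(k) - d(last) for k in range(last)]
    return np.column_stack(cols)

knots = [0.1, 0.3, 0.5, 0.7, 0.9]            # hypothetical knot locations
beyond = np.array([1.0, 1.1, 1.2, 1.3])      # equally spaced points past the last knot

N = natural_cubic_basis(beyond, knots)
# Second differences vanish for a linear function on an equally spaced grid
second_diff = np.diff(N, n=2, axis=0)
print(N.shape, np.max(np.abs(second_diff)))
```

Every column is linear past the last knot, so any fitted combination of them is linear there too. The basis has K columns for K knots.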

Knot placement

  • One strategy is to decide K (the number of knots) in advance and then place them at appropriate quantiles of the observed X
  • A cubic spline with K knots has K+4 parameters (or degrees of freedom!)
  • A natural spline with K knots has K degrees of freedom
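
Placing knots at quantiles is a one-liner with NumPy. The sketch below uses equally spaced quantiles, which is one reasonable reading of "appropriate quantiles" rather than the only one. The data and the helper `quantile_knots` are hypothetical.

```python
import numpy as np

def quantile_knots(x, K):
    """K interior knots at equally spaced quantiles of x."""
    probs = np.arange(1, K + 1) / (K + 1)     # e.g. K = 3 -> 0.25, 0.5, 0.75
    return np.quantile(x, probs)

rng = np.random.default_rng(5)
x = rng.exponential(scale=2.0, size=500)      # hypothetical skewed predictor

K = 3
knots = quantile_knots(x, K)
print(knots)
print("cubic spline parameters:", K + 4)
```

Quantile-based knots put more knots where the data are dense, which matters for skewed predictors like this one.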

Knot placement

  • Here is a comparison of a degree-14 polynomial and a natural cubic spline (both have 15 degrees of freedom)
