Is there any special case where ridge regression can shrink coefficients to zero?
Are there some special cases where ridge regression can also lead to coefficients that are exactly zero?
It is widely known that the lasso shrinks coefficients towards, or exactly to, zero, while ridge regression cannot shrink coefficients to zero.
machine-learning lasso ridge-regression
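The contrast behind the question is easy to see numerically. Under an orthonormal design both estimators have closed forms: ridge scales the OLS coefficients by $1/(1+\lambda)$, while the lasso soft-thresholds them at $\lambda/2$. A minimal sketch (the data and the value of $\lambda$ are made up for illustration, not taken from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
# Orthonormalize a random matrix so that X.T @ X = I exactly.
X, _ = np.linalg.qr(rng.normal(size=(n, p)))
beta_true = np.array([2.0, 1.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

beta_ols = X.T @ y  # OLS solution, since X.T @ X = I
lam = 1.0
# Ridge: proportional shrinkage -- nonzero inputs stay nonzero.
beta_ridge = beta_ols / (1.0 + lam)
# Lasso: soft-thresholding -- small coefficients snap to exactly 0.
beta_lasso = np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam / 2.0, 0.0)

print(beta_ridge)  # no exact zeros
print(beta_lasso)  # third coefficient is exactly 0.0
```

The third true coefficient is zero, so its OLS estimate is small noise; the lasso snaps it to exactly zero, while ridge merely rescales it.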
– whuber♦ (Aug 28 '18 at 18:53): Of course! If the least squares estimates are zero, then ridge regression will always produce zeros. What would be of interest is to find any other situation :-).
– Vala (Aug 28 '18 at 18:54): In which cases can an OLS coefficient be exactly zero?
– whuber♦ (Aug 28 '18 at 18:56): This will happen whenever the response variable is orthogonal to each of the explanatory variables.
– Vala (Aug 28 '18 at 18:57): Would the predictors also have to be orthogonal to each other, or would it be enough if just the correlation with the response is zero?
– jbowman (Aug 28 '18 at 20:14): See also stats.stackexchange.com/questions/74542/… for an explanation of why ridge cannot shrink the parameters to zero (unless they start there, as @whuber observes.)
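whuber's condition in the comments can be checked directly: if the response is orthogonal to every column of $X$, then $X^\top y = 0$ and hence $\hat\beta_{\text{OLS}} = (X^\top X)^{-1} X^\top y = 0$, regardless of how correlated the predictors are with each other, and ridge keeps that zero solution. A sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
# Two deliberately correlated predictors.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2])

# Make y orthogonal to both columns: take the residual of a random
# vector after projecting it onto the column space of X.
z = rng.normal(size=n)
y = z - X @ np.linalg.lstsq(X, z, rcond=None)[0]

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
lam = 0.5
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

print(beta_ols)    # numerically zero despite correlated predictors
print(beta_ridge)  # ridge keeps the zero solution
```

So orthogonality among the predictors is not required; $X^\top y = 0$ alone forces all coefficients to zero.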
asked Aug 28 '18 at 18:48 by Vala (edited Aug 28 '18 at 19:00)
1 Answer
Suppose, as in the case of least squares methods, you are trying to solve a statistical estimation problem for a (vector-valued) parameter $\beta$ by minimizing an objective function $Q(\beta)$ (such as the sum of squares of the residuals). Ridge regression "regularizes" the problem by adding a non-negative linear combination of the squares of the parameter, $P(\beta).$ $P$ is (obviously) differentiable with a unique global minimum at $\beta=0.$
The question asks: when is it possible for the global minimum of $Q+P$ to occur at $\beta=0$? Assume, as in least squares methods, that $Q$ is differentiable in a neighborhood of $0.$ Because $0$ is a global minimum for $Q+P$, it is a local minimum, implying all its partial derivatives are $0.$ The sum rule of differentiation implies
$$\frac{\partial}{\partial \beta_i}\left(Q(\beta) + P(\beta)\right) = \frac{\partial}{\partial \beta_i}Q(\beta) + \frac{\partial}{\partial \beta_i}P(\beta) = Q_i(\beta) + P_i(\beta)$$
is zero at $\beta=0.$ But since $P_i(0)=0$ for all $i,$ this implies $Q_i(0)=0$ for all $i,$ which makes $0$ at least a local minimum of the original objective function $Q.$ For any least squares technique, every local minimum is also a global minimum. This compels us to conclude that
Quadratic regularization of least squares procedures ("ridge regression") has $\beta=0$ as a solution if and only if $\beta=0$ is a solution of the original unregularized problem.
– Vala (Aug 28 '18 at 19:14): As pointed out by Martijn Weterings, it would also shrink coefficients to zero if $t=0$, or if $\lambda$ goes to infinity. Regarding the latter: could ridge shrink a coefficient to zero for a sufficiently large tuning parameter, or is it just a theoretical statement that as $\lambda$ tends to infinity the coefficient also converges to zero?
– whuber♦ (Aug 28 '18 at 19:20): Lambda going to infinity is the equivalent of minimizing $Q/\lambda + P.$ I hope it's easy to see that for sufficiently large $\lambda$ the solutions will have to be close to $\beta=0,$ guaranteeing convergence to $\beta=0$ in the limit.
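The limiting behavior whuber describes can be illustrated numerically: as $\lambda$ grows, the ridge solution $(X^\top X + \lambda I)^{-1} X^\top y$ shrinks toward the origin but stays nonzero for every finite $\lambda$. A sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 80, 2
X = rng.normal(size=(n, p))
y = X @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=n)

for lam in [1e0, 1e3, 1e6, 1e9]:
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    # The norm shrinks roughly like 1/lam but never hits zero exactly.
    print(f"lambda={lam:.0e}  beta={beta}  norm={np.linalg.norm(beta):.3e}")
```

Even at $\lambda = 10^9$ the coefficients are tiny but not exactly zero, consistent with the answer's conclusion.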
– Vala (Aug 28 '18 at 19:34): They will also be close to zero, but can't be exactly zero, right?
– whuber♦ (Aug 28 '18 at 19:36): Please re-read the conclusion of my answer: I can't think of any way to make it clearer.
– Vala (Aug 28 '18 at 19:39): Going by your final conclusion, that should be correct.
answered Aug 28 '18 at 19:08 by whuber♦