Gradients2 #44

Merged: 3 commits, Oct 24, 2017.
FUNCTIONS.md: 35 changes (17 additions, 18 deletions)

@@ -16,6 +16,7 @@ Name | Type of set | Properties
`IndBallRank` | Set of matrices with given rank | nonconvex
`IndBinary` | Indicator of a binary set | nonconvex, separable
`IndBox` | Box | convex, separable
+`IndGraph` | Indicator of the graph of a linear op. | convex
`IndExpPrimal` | Indicator of (primal) exponential cone | convex cone
`IndExpDual` | Indicator of (dual) exponential cone | convex cone
`IndFree` | Indicator of the free cone | convex cone, separable
@@ -34,33 +35,31 @@ Name | Type of set | Properties

Name | Description | Properties
----------------|-------------------------------------|----------------
-`ElasticNet` | Elastic-net regularization | convex, separable
+`ElasticNet` | Elastic-net regularization | convex, separable, gradient
`NormL0` | L0 pseudo-norm | nonconvex
-`NormL1` | L1 norm | convex, separable
-`NormL2` | Euclidean norm | convex
+`NormL1` | L1 norm | convex, separable, gradient
+`NormL2` | Euclidean norm | convex, gradient
`NormL21` | Sum-of-L2 norms | convex
-`NormLinf` | L-infinity norm | convex
+`NormLinf` | L-infinity norm | convex, gradient
`NuclearNorm` | Nuclear norm | convex
-`SqrNormL2` | Squared Euclidean norm | convex, separable
+`SqrNormL2` | Squared Euclidean norm | convex, separable, gradient

-### Penalties
+### Penalties and other functions

Name | Description | Properties
----------------|-------------------------------------|-----------------
-`HingeLoss` | Hinge loss function | convex, separable
-`HuberLoss` | Huber loss function | convex
-`LogBarrier` | Logarithmic barrier | convex, separable
-`LeastSquares` | Sum-of-residual-squares | convex
+`HingeLoss` | Hinge loss function | convex, separable, gradient
+`HuberLoss` | Huber loss function | convex, gradient
+`LogBarrier` | Logarithmic barrier | convex, separable, gradient
+`LeastSquares` | Sum-of-residual-squares | convex, gradient
+`Maximum` | Maximum coordinate of a vector | convex
+`Linear` | Linear function | convex, separable, gradient
+`Quadratic` | Quadratic function | convex, gradient
+`SumPositive` | Sum of the positive coefficients | convex, gradient

### Distances

Name | Description | Properties
----------------|------------------------------------------------------|----------------
-`DistL2` | Euclidean distance from a convex set | convex
-`SqrDistL2` | Squared Euclidean distance from a convex set | convex
-
-### Other functions
-
-Name | Description | Properties
-----------------|------------------------------------------------------|----------------
-`Maximum` | Maximum coordinate of a vector | convex
+`DistL2` | Euclidean distance from a convex set | convex, gradient
+`SqrDistL2` | Squared Euclidean distance from a convex set | convex, gradient
docs/src/functions.md: 6 changes (6 additions, 0 deletions)

@@ -75,3 +75,9 @@ Maximum
Quadratic
SumPositive
```

+## Distances
Member: This I have also put in the section describing calculus rules, since these are really modifiers of some other (indicator) functions; but I guess they can stay in both places.

Collaborator (author): They were in the FUNCTIONS.md in the root, so I copied them here. But you are right, we should think about how we can unify the different documentations.

+```@docs
+DistL2
+SqrDistL2
+```
src/ProximalOperators.jl: 1 change (1 addition, 0 deletions)

@@ -23,6 +23,7 @@ include("utilities/deep.jl")
include("utilities/linops.jl")
include("utilities/symmetricpacked.jl")
include("utilities/uniformarrays.jl")
include("utilities/vecnormdiff.jl")

# Basic functions

src/calculus/distL2.jl: 24 changes (16 additions, 8 deletions)

@@ -9,7 +9,7 @@ export DistL2

Given `ind_S` the indicator function of a convex set ``S``, and an optional positive parameter `λ`, returns the (weighted) Euclidean distance from ``S``, that is, the function
```math
-g(x) = \\mathrm{dist}_S(x) = \\min \\{ \\|y - x\\| : y \\in S \\}.
+g(x) = λ\\mathrm{dist}_S(x) = \\min \\{ λ\\|y - x\\| : y \\in S \\}.
```
"""

@@ -35,24 +35,32 @@ DistL2{R <: Real, T <: ProximableFunction}(ind::T, lambda::R=1.0) = DistL2{R, T}

function (f::DistL2){R <: RealOrComplex}(x::AbstractArray{R})
  p, = prox(f.ind, x)
-  return f.lambda*vecnorm(x-p)
+  return f.lambda*vecnormdiff(x,p)
end

function prox!{R <: RealOrComplex}(y::AbstractArray{R}, f::DistL2, x::AbstractArray{R}, gamma::Real=1.0)
-  p, = prox(f.ind, x)
-  d = vecnorm(x-p)
+  prox!(y, f.ind, x)
+  d = vecnormdiff(x,y)
  gamlam = (gamma*f.lambda)
  if gamlam < d
    gamlamd = gamlam/d
-    for k in eachindex(p)
-      y[k] = (1-gamlamd)*x[k] + gamlamd*p[k]
-    end
+    y .= (1-gamlamd).*x .+ gamlamd.*y
    return f.lambda*(d-gamlam)
  end
-  y[:] = p
  return 0.0
end

+function gradient!{T <: RealOrComplex}(y::AbstractArray{T}, f::DistL2, x::AbstractArray{T})
+  prox!(y, f.ind, x) # use y as temporary storage for the projection onto S
+  dist = vecnormdiff(x,y)
+  if dist > 0
+    y .= (f.lambda/dist).*(x .- y)
+  else
+    y .= 0
+  end
+  return f.lambda*dist
+end

fun_name(f::DistL2) = "Euclidean distance from a convex set"
fun_dom(f::DistL2) = fun_dom(f.ind)
fun_expr(f::DistL2) = "x ↦ λ inf { ||x-y|| : y ∈ S} "
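A quick illustration of the new `gradient!` method for `DistL2` (a sketch in the Julia 0.6-era syntax of this PR; `IndBallL2` is the package's indicator of the Euclidean ball):

```julia
using ProximalOperators

f = DistL2(IndBallL2(1.0))  # distance from the unit Euclidean ball
x = [3.0, 4.0]              # vecnorm(x) == 5, so the distance is 4
y = similar(x)
fx = gradient!(y, f, x)     # fx == 4.0, y == [0.6, 0.8] == (x - p)/dist
```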
src/calculus/sqrDistL2.jl: 11 changes (9 additions, 2 deletions)

@@ -38,12 +38,12 @@ SqrDistL2{R <: Real, T <: ProximableFunction}(ind::T, lambda::R=1.0) = SqrDistL2

function (f::SqrDistL2){T <: RealOrComplex}(x::AbstractArray{T})
  p, = prox(f.ind, x)
-  return (f.lambda/2)*vecnorm(x-p)^2
+  return (f.lambda/2)*vecnormdiff2(x,p)
end

function prox!{T <: RealOrComplex}(y::AbstractArray{T}, f::SqrDistL2, x::AbstractArray{T}, gamma::Real=1.0)
  p, = prox(f.ind, x)
-  sqrd = (f.lambda/2)*vecnorm(x-p)^2
+  sqrd = (f.lambda/2)*vecnormdiff2(x,p)
  c1 = 1/(1+f.lambda*gamma)
  c2 = f.lambda*gamma*c1
  for k in eachindex(p)

@@ -52,6 +52,13 @@ function prox!{T <: RealOrComplex}(y::AbstractArray{T}, f::SqrDistL2, x::Abstrac
  return sqrd*c1^2
end

+function gradient!{T <: RealOrComplex}(y::AbstractArray{T}, f::SqrDistL2, x::AbstractArray{T})
+  p, = prox(f.ind, x)
+  dist2 = vecnormdiff2(x,p)
+  y .= f.lambda.*(x .- p)
+  return (f.lambda/2)*dist2
+end

fun_name(f::SqrDistL2) = "squared Euclidean distance from a convex set"
fun_dom(f::SqrDistL2) = fun_dom(f.ind)
fun_expr(f::SqrDistL2) = "x ↦ (λ/2) inf { ||x-y||^2 : y ∈ S} "
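For reference, the added `gradient!` relies on a standard fact: when ``S`` is convex, the squared distance is differentiable everywhere, and its gradient is the scaled projection residual,

```math
\nabla \Big( \frac{\lambda}{2}\,\mathrm{dist}_S^2 \Big)(x) = \lambda \big( x - P_S(x) \big),
```

which is exactly what the new method computes via `prox`.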
src/functions/elasticNet.jl: 8 changes (8 additions, 0 deletions)

@@ -88,6 +88,14 @@ function prox!{R <: Real}(y::AbstractArray{Complex{R}}, f::ElasticNet{R}, x::Abs
  return f.mu*norm1x + (f.lambda/2)*sqnorm2x
end

+function gradient!{T <: RealOrComplex, R <: Real}(y::AbstractArray{T}, f::ElasticNet{R}, x::AbstractArray{T})
+  # (sub)gradient of the L1 term
+  y .= f.mu.*sign.(x)
+  # gradient of the squared L2 term
+  y .+= f.lambda.*x
+  return f.mu*vecnorm(x,1) + (f.lambda/2)*vecnorm(x,2)^2
+end

fun_name(f::ElasticNet) = "elastic-net regularization"
fun_dom(f::ElasticNet) = "AbstractArray{Real}, AbstractArray{Complex}"
fun_expr(f::ElasticNet) = "x ↦ μ||x||_1 + (λ/2)||x||²"
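Written out, the added method computes the (sub)gradient

```math
\nabla \Big( \mu \|x\|_1 + \frac{\lambda}{2}\|x\|^2 \Big) = \mu\,\mathrm{sign}(x) + \lambda x,
```

with the usual convention ``\mathrm{sign}(0) = 0`` at the points where the L1 term is not differentiable.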
src/functions/indSOC.jl: 12 changes (6 additions, 6 deletions)

@@ -27,7 +27,7 @@ is_convex(f::IndSOC) = true
is_set(f::IndSOC) = true

function prox!{T <: Real}(y::AbstractArray{T,1}, f::IndSOC, x::AbstractArray{T,1}, gamma::T=one(T))
-  nx = norm(x[2:end])
+  @views nx = norm(x[2:end])
Member: Nice, I didn’t know this!
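For context, `@views` rewrites slicing expressions like `x[2:end]` into allocation-free views; a minimal sketch of the difference:

```julia
x = rand(1000)
nx1 = norm(x[2:end])         # x[2:end] copies 999 elements before taking the norm
@views nx2 = norm(x[2:end])  # x[2:end] becomes a SubArray view; no copy is made
nx1 == nx2                   # true: same value, less allocation
```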

  t = x[1]
  if t <= -nx
    y[:] = 0.0

@@ -36,7 +36,7 @@ function prox!{T <: Real}(y::AbstractArray{T,1}, f::IndSOC, x::AbstractArray{T,1
  else
    r = 0.5 * (1 + t / nx)
    y[1] = r * nx
-    y[2:end] = r * x[2:end]
+    @views y[2:end] .= r .* x[2:end]
  end
  return 0.0
end
@@ -57,7 +57,7 @@ function prox_naive{T <: Real}(f::IndSOC, x::AbstractArray{T,1}, gamma=1.0)
    y = zeros(x)
    r = 0.5 * (1 + t / nx)
    y[1] = r * nx
-    y[2:end] = r * x[2:end]
+    y[2:end] .= r .* x[2:end]
  end
  return y, 0.0
end
@@ -95,19 +95,19 @@ function prox!{T <: Real}(y::AbstractArray{T,1}, f::IndRotatedSOC, x::AbstractAr
  x1 = 0.7071067811865475*x[1] + 0.7071067811865475*x[2]
  x2 = 0.7071067811865475*x[1] - 0.7071067811865475*x[2]
  # project rotated x onto SOC
-  nx = sqrt(x2^2+norm(x[3:end])^2)
+  @views nx = sqrt(x2^2+norm(x[3:end])^2)
  t = x1
  if t <= -nx
    y[:] = 0.0
  elseif t >= nx
    y[1] = x1
    y[2] = x2
-    y[3:end] = x[3:end]
+    @views y[3:end] .= x[3:end]
  else
    r = 0.5 * (1 + t / nx)
    y[1] = r * nx
    y[2] = r * x2
-    y[3:end] = r * x[3:end]
+    @views y[3:end] .= r .* x[3:end]
  end
  # rotate back y cw by pi/4
  y1 = 0.7071067811865475*y[1] + 0.7071067811865475*y[2]
src/functions/linear.jl: 4 changes (3 additions, 1 deletion)

@@ -17,6 +17,9 @@ end

Linear(c::A) where {R <: RealOrComplex, A <: AbstractArray{R}} = Linear{R, A}(c)

+is_separable(f::Linear) = true
+is_convex(f::Linear) = true
Member: I would also add `is_smooth = true` if not already there.


function (f::Linear{RC, A})(x::AbstractArray{RC}) where {R <: Real, RC <: Union{R, Complex{R}}, A <: AbstractArray{RC}}
  return vecdot(f.c, x)
end

@@ -28,7 +31,6 @@ function prox!(y::AbstractArray{RC}, f::Linear{RC, A}, x::AbstractArray{RC}, gam
end
Collaborator (author): Not changed in this PR, but for efficiency this should be `y .= x .- gamma.*(f.c)`.


function gradient!(y::AbstractArray{RC}, f::Linear{RC, A}, x::AbstractArray{RC}) where {R <: Real, RC <: Union{R, Complex{R}}, A <: AbstractArray{RC}}
-  A_mul_B!(y, f.Q, x)
  y .= f.c
  return vecdot(f.c, x)
end
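With the stray `A_mul_B!` line removed (the `Linear` type has no field `Q`), the method matches the textbook fact that the gradient of `x ↦ ⟨c, x⟩` is the constant `c`; a small sanity-check sketch:

```julia
using ProximalOperators

c = [1.0, -2.0, 3.0]
f = Linear(c)
x = randn(3)
y = similar(x)
fx = gradient!(y, f, x)  # y == c, and fx == vecdot(c, x) == f(x)
```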
src/functions/logBarrier.jl: 11 changes (11 additions, 0 deletions)

@@ -74,6 +74,17 @@ function prox!{T <: Real}(y::AbstractArray{T}, f::LogBarrier, x::AbstractArray{T
  return -f.mu*sumf
end

+function gradient!{T <: Real}(y::AbstractArray{T}, f::LogBarrier, x::AbstractArray{T})
+  sum = 0.0
+  for i in eachindex(x)
+    logarg = f.a*x[i]+f.b
+    y[i] = -f.mu*f.a/logarg
+    sum += log(logarg)
+  end
+  sum *= -f.mu
+  return sum
+end

fun_name(f::LogBarrier) = "logarithmic barrier"
fun_dom(f::LogBarrier) = "AbstractArray{Real}"
fun_expr(f::LogBarrier) = "x ↦ -μ * sum( log(a*x_i+b), i=1,...,n )"
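The loop implements the componentwise derivative (stated here for reference):

```math
\frac{\partial}{\partial x_i} \Big( -\mu \sum_{j} \log(a x_j + b) \Big) = -\frac{\mu a}{a x_i + b}.
```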
src/functions/normL1.jl: 5 changes (5 additions, 0 deletions)

@@ -135,6 +135,11 @@ function prox!{T <: Real, R <: Real}(y::AbstractArray{Complex{R}}, f::NormL1{T},
  return f.lambda*n1y
end

+function gradient!{T <: Union{Real, Complex}}(y::AbstractArray{T}, f::NormL1, x::AbstractArray{T})
+  y .= f.lambda.*sign.(x)
+  return f(x)
+end

fun_name(f::NormL1) = "weighted L1 norm"
fun_dom(f::NormL1) = "AbstractArray{Real}, AbstractArray{Complex}"
fun_expr{R <: Real}(f::NormL1{R}) = "x ↦ λ||x||_1"
src/functions/normL2.jl: 10 changes (10 additions, 0 deletions)

@@ -41,6 +41,16 @@ function prox!{T <: RealOrComplex}(y::AbstractArray{T}, f::NormL2, x::AbstractAr
  return f.lambda*scale*vecnormx
end

+function gradient!{T <: Union{Real, Complex}}(y::AbstractArray{T}, f::NormL2, x::AbstractArray{T})
+  fx = vecnorm(x) # value of f, without lambda
+  if fx == 0
+    y .= 0
+  else
+    y .= (f.lambda/fx).*x
+  end
+  return f.lambda*fx
+end

fun_name(f::NormL2) = "Euclidean norm"
fun_dom(f::NormL2) = "AbstractArray{Real}, AbstractArray{Complex}"
fun_expr(f::NormL2) = "x ↦ λ||x||_2"
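For ``x \neq 0`` this is the usual gradient of the Euclidean norm; at ``x = 0``, where the norm is not differentiable, the method returns the subgradient element ``0``:

```math
\nabla \big( \lambda \|x\|_2 \big)(x) = \lambda\, \frac{x}{\|x\|_2}, \qquad x \neq 0.
```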
src/functions/normLinf.jl: 7 changes (7 additions, 0 deletions)

@@ -20,6 +20,13 @@ function (f::Conjugate{IndBallL1{R}}){R <: Real, S <: RealOrComplex}(x::Abstract
  return (f.f.r)*vecnorm(x, Inf)
end

+function gradient!{T <: RealOrComplex, R <: Real}(y::AbstractArray{T}, f::Conjugate{IndBallL1{R}}, x::AbstractArray{T})
+  absxi, i = findmax(abs(xi) for xi in x) # largest absolute value and its index
+  y .= 0
+  y[i] = f.f.r*sign(x[i])
+  return f.f.r*absxi
+end

fun_name{R <: Real}(f::Postcompose{Conjugate{IndBallL1{R}}, R}) = "weighted L-infinity norm"
fun_expr{R <: Real}(f::Postcompose{Conjugate{IndBallL1{R}}, R}) = "x ↦ λ||x||_∞ = λ⋅max(abs(x))"
fun_params{R <: Real}(f::Postcompose{Conjugate{IndBallL1{R}}, R}) = "λ = $(f.a*(f.f.f.r))"
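For reference, the added method returns a subgradient of the max-norm: if ``i^*`` is an index attaining the maximum absolute value and ``e_{i^*}`` the corresponding standard basis vector, then

```math
\lambda\, \mathrm{sign}(x_{i^*})\, e_{i^*} \in \partial \big( \lambda \|x\|_\infty \big)(x).
```

When several entries attain the maximum, `findmax` simply picks the first such index, which is a valid choice.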
src/functions/quadratic.jl: 1 change (1 addition, 0 deletions)

@@ -19,6 +19,7 @@ If `iterative=true`, then `prox!` is evaluated approximately using an iterative

abstract type Quadratic <: ProximableFunction end

+is_convex(f::Quadratic) = true
Member: This I have not previously put because one should really check that Q is positive semidefinite...

Member: Any thoughts on this @mfalt? Of course checking the eigenvalues is not viable, but maybe the user should be able to assert whether the function is convex or not?

Member: No ok, the docstring says we assume Q to be positive definite (should be semidefinite, to be corrected). We may want to change this in the near future, not all quadratics are convex!

Collaborator (author): Yes, I added this because of the documentation, but you are right, maybe some keyword from the user, acknowledging whether it is positive (semi)definite, would be good.
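A hypothetical sketch of the user-supplied assertion discussed above (not part of this PR; the type name and the `isconvex` keyword are invented for illustration):

```julia
# Hypothetical: carry a caller-supplied convexity flag in the type,
# since checking the spectrum of Q is not viable in general.
struct AssertedQuadratic{M, V} <: ProximableFunction
    Q::M
    q::V
    isconvex::Bool
end

AssertedQuadratic(Q, q; isconvex::Bool=false) = AssertedQuadratic(Q, q, isconvex)

is_convex(f::AssertedQuadratic) = f.isconvex  # trust the caller's assertion
```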

is_smooth(f::Quadratic) = true
is_quadratic(f::Quadratic) = true

src/functions/sumPositive.jl: 7 changes (6 additions, 1 deletion)

@@ -19,7 +19,7 @@ is_separable(f::SumPositive) = true
is_convex(f::SumPositive) = true

function (f::SumPositive){T <: Real}(x::AbstractArray{T})
-  return sum(max.(0.0, x))
+  return sum(xi -> max(xi,0), x)
end

function prox!{T <: Real}(y::AbstractArray{T}, f::SumPositive, x::AbstractArray{T}, gamma::Real=1.0)

@@ -31,6 +31,11 @@ function prox!{T <: Real}(y::AbstractArray{T}, f::SumPositive, x::AbstractArray{
  return fsum
end

+function gradient!{T <: Real}(y::AbstractArray{T}, f::SumPositive, x::AbstractArray{T})
+  y .= max.(0, sign.(x))
+  return sum(xi -> max(xi,0), x)
+end

fun_name(f::SumPositive) = "Sum of the positive coefficients"
fun_dom(f::SumPositive) = "AbstractArray{Real}"
fun_expr(f::SumPositive) = "x ↦ sum(max(0, x))"
src/utilities/vecnormdiff.jl: 15 changes (15 additions, 0 deletions, new file)

@@ -0,0 +1,15 @@
"""
Fast (non-allocating) version of vecnorm(x-y,2)^2
"""
function vecnormdiff2(x,y)
s = 0.0
for i in eachindex(x)
s += abs2(x[i]-y[i])
end
return s
end

"""
Fast (non-allocating) version of vecnorm(x-y,2)
"""
vecnormdiff(x,y) = sqrt(vecnormdiff2(x,y))
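What the helper buys (a sketch): the subtraction in `vecnorm(x - y)` allocates a temporary array, while the loop above accumulates the squared differences in place:

```julia
x, y = rand(10^6), rand(10^6)
a = vecnorm(x - y)     # allocates a temporary array for x - y
b = vecnormdiff(x, y)  # same value, no intermediate allocation
a ≈ b                  # true
```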
test/runtests.jl: 4 changes (4 additions, 0 deletions)

@@ -77,6 +77,10 @@ end
include("test_graph.jl")
end

@testset "Gradients" begin
include("test_gradients.jl")
end

@testset "Calculus rules" begin
include("test_calculus.jl")
include("test_moreauEnvelope.jl")
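The new test file is not shown in this diff; a minimal sketch of the kind of finite-difference check such a file might contain (hypothetical, not the actual contents of `test_gradients.jl`):

```julia
using Base.Test
using ProximalOperators

# Compare gradient!(y, f, x) against central finite differences.
function fd_check(f, x; h=1e-6, tol=1e-4)
    y = similar(x)
    fx = gradient!(y, f, x)
    @test fx ≈ f(x) atol=tol   # returned value must match f(x)
    for i in eachindex(x)
        e = zeros(x); e[i] = h
        @test abs((f(x + e) - f(x - e))/(2h) - y[i]) <= tol
    end
end

fd_check(NormL2(2.0), randn(10))
fd_check(SqrDistL2(IndBallL2(1.0)), randn(10))
```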