PowerLU and ExpoLU

Just tried two new LUs, PowerLU and ExpoLU, which, like ReLU, are not continuously differentiable.

1) PowerLU’s math:

for x<-1, y=0 and y’=0;

for -1<=x<=1, y=(x+1)**p/(2**p) and y’=p*(x+1)**(p-1)/(2**p), so that at x=1, y=1 and y’=p/2;

for 1<x, y=x and y’=1.
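
A minimal sketch of the piecewise definition above in plain Python. The function names, the default p=2.0, and the choice to evaluate x=1 with the middle branch are assumptions for illustration, not the code used in the test. It also assumes p>=1, since a smaller p makes the derivative blow up at x=-1.

```python
def power_lu(x: float, p: float = 2.0) -> float:
    """PowerLU value at x, following the three branches above."""
    if x < -1.0:
        return 0.0
    if x <= 1.0:
        # Middle branch: rises from 0 at x=-1 to 1 at x=1.
        return (x + 1.0) ** p / 2.0 ** p
    return x


def power_lu_grad(x: float, p: float = 2.0) -> float:
    """Derivative of PowerLU; at x=1 it returns the middle-branch slope p/2."""
    if x < -1.0:
        return 0.0
    if x <= 1.0:
        return p * (x + 1.0) ** (p - 1.0) / 2.0 ** p
    return 1.0
```

The slope just left of x=1 is p/2 and the slope to the right is 1, so the derivative only matches on both sides when p=2; any other p gives a jump at x=1.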

2) ExpoLU’s math:

for x<-b, y=0 and y’=0;

for -b<=x<=b, y=(x+b)**p and y’=p*(x+b)**(p-1), so that at x=b, y=(2*b)**p;

for b<x, y=x+(2*b)**p-b and y’=1, which approaches y=(2*b)**p as x approaches b, so the function is continuous at x=b.
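
The same kind of sketch for ExpoLU, again in plain Python. The function names and the defaults b=0.5 and p=2.0 are hypothetical values picked for illustration, and x=b is evaluated with the middle branch; it assumes b>0 and p>=1.

```python
def expo_lu(x: float, b: float = 0.5, p: float = 2.0) -> float:
    """ExpoLU value at x, following the three branches above."""
    if x < -b:
        return 0.0
    if x <= b:
        # Middle branch: rises from 0 at x=-b to (2*b)**p at x=b.
        return (x + b) ** p
    # Linear branch, shifted so it meets the middle branch at x=b.
    return x + (2.0 * b) ** p - b


def expo_lu_grad(x: float, b: float = 0.5, p: float = 2.0) -> float:
    """Derivative of ExpoLU; at x=b it returns the middle-branch slope."""
    if x < -b:
        return 0.0
    if x <= b:
        return p * (x + b) ** (p - 1.0)
    return 1.0
```

The slope just left of x=b is p*(2*b)**(p-1), while the slope to the right is 1, so the derivative is continuous at x=b only for particular combinations of b and p (for example b=0.25 with p=2).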

3) I’m trying these linear units for fun, and also for two reasons: first, to make training jump around more as it converges, so that it avoids getting stuck in small deep holes; second, to provide some curvature around x=0 in the activation.

In a test with a neural network of 512 features and 15/30 layers, trained to approximate 21 pairs of x and y, both PowerLU and ExpoLU work with proper settings.
