Thank you for forwarding this interesting work. My group has some experience with the sensitivity of stochastic gradient methods to learning rates, but not with the second-order methods you pursue here. We ended up developing a non-stochastic online alternative that avoids learning rates entirely: [paper title]
Your comment about "reusing the previous information" sounds related.
I recently returned from a very exciting NIPS workshop on variational inference, which had many talks and posters relevant to your interests: