On the performance of random reshuffling in stochastic learning
File(s)ita_2017.pdf (465.21 KB)
Accepted version
Author(s)
Ying, Bicheng
Yuan, Kun
Vlaski, Stefan
Sayed, Ali H
Type
Conference Paper
Abstract
In empirical risk optimization, it has been observed that gradient descent implementations that rely on random reshuffling of the data achieve better performance than implementations that rely on sampling the data randomly and independently of each other. Recent works have pursued justifications for this behavior by examining the convergence rate of the learning process under diminishing step-sizes. Some of these justifications rely on loose bounds, or their conclusions are dependent on the sample size which is problematic for large datasets. This work focuses on constant step-size adaptation, where the agent is continuously learning. In this case, convergence is only guaranteed to a small neighborhood of the optimizer albeit at a linear rate. The analysis establishes analytically that random reshuffling outperforms independent sampling by showing that the iterate at the end of each run approaches a smaller neighborhood of size O(μ 2 ) around the minimizer rather than O(μ). Simulation results illustrate the theoretical findings.
Date Issued
2017-08-31
Date Acceptance
2017-02-01
Citation
2017 Information Theory and Applications Workshop (ITA), 2017
Publisher
IEEE
Journal / Book Title
2017 Information Theory and Applications Workshop (ITA)
Copyright Statement
Copyright © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Identifier
https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000426450600029&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=a2bf6146997ec60c407a63945d4e92bb
Source
Information Theory and Applications Workshop (ITA)
Subjects
Computer Science
Computer Science, Information Systems
convergence analysis
mean-square performance
Random reshuffling
Science & Technology
stochastic gradient descent
Technology
Publication Status
Published
Start Date
2017-02-12
Finish Date
2017-02-17
Coverage Spatial
San Diego, CA, USA