On dynamic pricing with covariates
File(s)On_Dynamic_Pricing_minor.pdf (911.65 KB)
Accepted version
Author(s)
Wang, Hanzhao
Talluri, Kalyan
Li, XIaocheng
Type
Journal Article
Abstract
We consider dynamic pricing with covariates under a generalized linear demand model: a seller can dynam ically adjust the price of a product over a horizon of T time periods, and at each time period t, the demand
of the product is jointly determined by the price and an observable covariate vector xt ∈ R
d
through a gener alized linear model with unknown co-efficients. Most of the existing literature assumes the covariate vectors
xt’s are independently and identically distributed (i.i.d.); the few papers that relax this assumption either
sacrifice model generality or yield sub-optimal regret bounds. In this paper, we show that UCB and Thompson sampling-based pricing algorithms can achieve an O(d
√
T log T) regret upper bound without assuming
any statistical structure on the covariates xt. Our upper bound on the regret matches the lower bound up to
logarithmic factors. We thus show that (i) the i.i.d. assumption is not necessary for obtaining low regret, and
(ii) the regret bound can be independent of the (inverse) minimum eigenvalue of the covariance matrix of
the xt’s, a quantity present in previous bounds. Moreover, we consider a constrained setting of the dynamic
pricing problem where there is a limited and unreplenishable inventory and we develop theoretical results
that relate the best achievable algorithm performance to a variation measure with respect to the temporal
distribution shift of the covariates. We also demonstrate the proposed algorithms’ performance with numerical experiments.
of the product is jointly determined by the price and an observable covariate vector xt ∈ R
d
through a gener alized linear model with unknown co-efficients. Most of the existing literature assumes the covariate vectors
xt’s are independently and identically distributed (i.i.d.); the few papers that relax this assumption either
sacrifice model generality or yield sub-optimal regret bounds. In this paper, we show that UCB and Thompson sampling-based pricing algorithms can achieve an O(d
√
T log T) regret upper bound without assuming
any statistical structure on the covariates xt. Our upper bound on the regret matches the lower bound up to
logarithmic factors. We thus show that (i) the i.i.d. assumption is not necessary for obtaining low regret, and
(ii) the regret bound can be independent of the (inverse) minimum eigenvalue of the covariance matrix of
the xt’s, a quantity present in previous bounds. Moreover, we consider a constrained setting of the dynamic
pricing problem where there is a limited and unreplenishable inventory and we develop theoretical results
that relate the best achievable algorithm performance to a variation measure with respect to the temporal
distribution shift of the covariates. We also demonstrate the proposed algorithms’ performance with numerical experiments.
Date Acceptance
2024-12-23
Citation
Operations Research
ISSN
0030-364X
Publisher
Institute for Operations Research and Management Sciences
Journal / Book Title
Operations Research
Copyright Statement
Subject to copyright. This paper is embargoed until publication. Once published the author’s accepted manuscript will be made available under a CC-BY License in accordance with Imperial’s Research Publications Open Access policy (www.imperial.ac.uk/oa-policy).
License URL
Publication Status
Accepted
Rights Embargo Date
10000-01-01