Abstract
Billions of dollars are exchanged between companies through accounts payable in today's business landscape. Given its immense scale and critical role in business operations, effectively managing accounts payable and working capital is essential for organizations. In this study, we examine the financial supply chain problem, where a company seeks to minimize total payments toward accounts payable received from its upstream partners (e.g., suppliers) while leveraging cash inflows from downstream partners (e.g., distributors, wholesalers, retailers, and customers). This is accomplished by optimizing payment decisions based on payment terms and capitalizing on interest gains from cash on hand over time. Unlike prior studies, we investigate this problem in a more realistic setting where information about incoming invoices and cash inflows is uncertain. We formulate the problem as a stochastic dynamic program and identify the structural properties of an optimal policy. The optimal policy reveals payment priorities among invoices and establishes thresholds for maintaining cash on hand. We further find that payment priorities can be deterministic or stochastic, depending on the problem's state and random parameters. Additionally, we identify all instances where payment priorities are deterministic. Our study provides valuable managerial insights and practical implications derived from the structural properties of the optimal policy. Notably, some of these insights challenge well-known heuristics and seemingly intuitive practices. Lastly, we develop a simple heuristic based on the identified structural properties and demonstrate that it outperforms other widely used methods for solving large-scale practical problems.
Introduction
Supply chains have been a focal point of study in operations management, with growing attention to the financial dimension—the flow of money that directly affects the financial performance of supply chain parties (Kouvelis et al., 2006; Pfohl and Gomm, 2009).
One of the research initiatives focusing on the financial aspects of supply chains explores the optimal management of money flow, similar to the control of material or information flow within a supply chain (Gupta and Dutta, 2011; Gupta et al., 1987; Ng et al., 2012; Rios-Solis et al., 2017). For instance, consider a simple three-layer supply chain with a company positioned in the middle layer. The company settles multiple invoices from its upstream partners (suppliers, vendors) using incoming cash inflows received from downstream partners (distributors, retailers, wholesalers, customers). Invoices come with payment terms that include discount and penalty conditions, and unused cash on hand generates interest gains. The company aims to minimize total payments by strategically timing and sizing invoice payments. We follow Gupta and Dutta (2011) in calling this the financial supply chain problem.
Prior studies of this problem (Gupta and Dutta, 2011; Gupta et al., 1987; Ng et al., 2012; Rios-Solis et al., 2017) have assumed that all invoices and cash inflows are known in advance—a deterministic setting that yields foundational structural insights but does not reflect the dynamic uncertainty firms face in practice. In reality, purchase orders and incoming cash arrive stochastically over time (Berling and Martínez-de-Albéniz, 2011; Katehakis et al., 2016). We address this gap by studying the Stochastic Dynamic Financial Supply Chain Problem (SDFSCP).
In SDFSCP, a firm receives a stream of invoices with discount and penalty terms while facing uncertain cash inflows and interest on cash holdings. At each decision epoch, the firm chooses which invoices to pay (and by how much, subject to preemptive or non-preemptive rules) to minimize total payment over a finite horizon, where unpaid invoices are valued at their outstanding balances at the end of the horizon. The setting jointly features time-variant resource capacity (cash that evolves with inflows and interest) and time-variant processing requirements (invoice balances that change with discounts and penalties) under uncertainty.
To underscore the practical importance of studying SDFSCP, we now give a stylized illustrative example and point to company evidence showing how naïve payment rules can be costly.
Because the problem is complex, practitioners often rely on simple heuristics such as snowball or avalanche (Merritt, 2024), which are known to be far from optimal (Rios-Solis et al., 2017). Consider three invoices I1, I2, and I3 with terms in Table 1. The firm receives $350 per week, and unused cash earns 0.1% weekly interest. The snowball rule (ascending payment amount) pays I2 in week 4 ($1,400, discounted), I1 in week 14 ($3,200, normal), and I3 in week 28 ($5,000, normal), yielding a net-present total of $9411.96—34% above the optimal $7020.77. The avalanche rule (descending penalty rate) pays I3 in week 10 ($3500), I1 in week 20 ($3200), and I2 in week 25 ($2000), which is 22% above optimal. Thus, even in a small instance, intuitive heuristics can perform poorly.
Payment terms of the three invoices.
Note: 1. The structure of payment terms follows prior literature (Gupta and Dutta, 2011; Gupta et al., 1987; Ng et al., 2012; Rios-Solis et al., 2017). 2. The discounted value is paid if an invoice is paid before or at its discount payment due date. If the invoice is paid after the discount period but before or at the payment due date, the face value must be paid. If the invoice is paid after the payment due date, the payment amount is subject to a weekly compounded penalty rate, leading to an increased total. For example, if I1 is paid before or at week 10, $2240 is needed to pay off the invoice; if it is paid after week 10 but before or at week 22, it must be paid at its face value, $3200. If it is paid after week 22, for instance at week 24, it must be paid at the increased total of $3225.65 (= $3200 × (1 + 0.004)^2).
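The three-regime payment rule in the note can be sketched in a few lines of Python. This is only an illustration of the rule as stated for invoice I1; the function name `amount_due` and its parameter names are our own labels, not notation from the model.

```python
def amount_due(week, discounted, face, discount_due, payment_due, penalty_rate):
    """Amount needed to settle an invoice in full at `week` (illustrative names)."""
    if week <= discount_due:
        # Paid within the discount period: the discounted value applies.
        return discounted
    if week <= payment_due:
        # Paid after the discount period but by the due date: face value.
        return face
    # Paid late: face value grows by the weekly compounded penalty rate.
    weeks_late = week - payment_due
    return face * (1 + penalty_rate) ** weeks_late


# Invoice I1 from the note: $2240 through week 10, $3200 through week 22,
# then a 0.4% weekly compounded penalty.
i1 = dict(discounted=2240.0, face=3200.0, discount_due=10,
          payment_due=22, penalty_rate=0.004)

for week in (10, 11, 22, 24):
    print(week, round(amount_due(week, **i1), 2))
```

At week 24 the function reproduces the $3225.65 figure given in the note.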
Company financial statements underscore the stakes. Accounts payable (AP) balances are sizable in absolute terms and relative to assets or retained earnings (Table 2). In 2022, Amazon reports $79.6B in AP (17.2% of total assets; 95.7% of retained earnings), Tesla $15.3B (18.6% of assets; 118.6% of retained earnings), Intel $9.6B (5.3% of assets), Alphabet $5.1B (1.4%), and Macy's $2.9B (16.5%). Given these magnitudes, AP policies materially affect costs and liquidity. These practical stakes motivate the modeling approach below and guide our main results.
Size of accounts payable (in billions, in 2022).
Note: Percentages in parentheses represent the ratios of accounts payable to total assets and retained earnings. Company data is extracted from 10-K reports.
We formulate SDFSCP as a finite-horizon stochastic dynamic program and characterize structural properties of the optimal policy, including threshold-type conditions and priority ordering rules (deterministic and stochastic), with an at-most-one-split property. Leveraging these results, we design a simple heuristic that mimics the structure of the optimal policy and runs in O(n log n) time. Extensive experiments across cash tightness, inflow skewness, and preemptive/non-preemptive settings show that the heuristic consistently and significantly outperforms widely used benchmarks (e.g., snowball and avalanche).
Our contributions are: (i) to our knowledge, the first dynamic stochastic formulation of SDFSCP that jointly models time-varying cash capacity and time-dependent invoice requirements; (ii) new structural properties—threshold conditions, priority orderings, and an at-most-one-split property—that clarify optimal behavior; (iii) a scalable, simple, structure-guided heuristic with both preemptive and non-preemptive variants; (iv) an empirical evaluation demonstrating robustness across realistic and worst-case regimes and offering actionable guidance; and (v) methodological insights that extend beyond SDFSCP to dynamic job assignment and stochastic scheduling with time-dependent resource requirements and time-dependent resource capacity.
Supply chain finance, working capital, and trade credits
Supply chain finance—encompassing trade credits, buyer-led finance, and third-party financing—has been extensively studied (Babich and Kouvelis, 2018; Kouvelis, 2023). Trade credits in particular have received substantial attention from both economics and operations management perspectives, with research examining their rationale (Burkart and Ellingsen, 2004; Petersen and Rajan, 1997) and their strategic use in supply chains (Astvansh and Jindal, 2022; Kouvelis and Zhao, 2018; Lee et al., 2018; Wu et al., 2019a).
This study addresses a financial supply chain management problem closely related to trade credit, often interchangeably referred to as AP. While prior work has focused on the structure of trade credits (Kouvelis and Zhao, 2018; Wu et al., 2019a) and their supply chain impacts (Lee et al., 2018), this study centers on the operational payment decisions themselves: how a company strategically settles AP to minimize costs by leveraging payment terms and interest on unused cash.
Financial supply chain problem: multiple accounts payable optimization
Despite the extensive literature on trade credits and supply chain management, few studies address the financial supply chain problem of optimizing a company's payment decisions for multiple AP. Gupta et al. (1987) are the first to introduce this problem. They define the problem, examine some unique characteristics of optimal payment strategies, and propose a branch and bound algorithm to solve the problem optimally. Although Gupta et al. (1987) open up this new problem category, their exact solution approach is limited by the computational complexity of the problem. Gupta and Dutta (2011) extend the work of Gupta et al. (1987). They show that the financial supply chain problem is NP-hard, provide some heuristics, and use computational experiments to show that the heuristics perform well in a certain range of problem instances. Ng et al. (2012) address a simpler version of the financial supply chain problem by assuming a constant rate of cash inflows and no interest gains on cash on hand over time. They formulate this simplified version as a single machine scheduling problem, convert it into a continuous non-linear optimization problem, and provide an approximate solution using a linear programming approximation. Rios-Solis et al. (2017) examine the financial supply chain problem from a more practical perspective. They conduct a computational experiment to show that widely used practical heuristics are far from the optimal solution, including the highest interest debt method, where invoices are paid in descending order of their penalty rates, and the debt snowball method, where invoices are paid in ascending order of their invoice amounts.
Similar problems have been studied. For the two-layer supply chain, Devalkar and Krishnan (2019) consider the working capital financing problem with cash flows between the bank, supplier, and buyer. Similarly, Peng and Zhou (2019) study the optimal deployment of working capital to maximize profits in a supply chain with one supplier and one retailer who faces uncertain demand. Huang (2022) also studies dyadic supply chain financing, considering advance payment with a tailored discount rate and an extended payment timeline for the balance due. With consideration of a discount rate, Wu et al. (2019b) examine the influence of three supply chain finance schemes (early payment, delayed payment, and reverse factoring) on the financial performance of the supplier and retailer. Semaa et al. (2020) study the deterministic version of the financial supply chain problem for a three-layer supply chain with supplier and customer invoices. The invoices have a given payment date and the corresponding discount and penalty rates for early and late payments, respectively. In the same vein, Zhu et al. (2022) investigate the optimum operational schedule and accounts receivable financing in a supply chain.
The above studies collectively establish the deterministic structure of the problem, but all assume invoices and cash inflows are known in advance. Gupta and Dutta (2011) partially relax this by sketching a stochastic extension, but stop short of a full stochastic formulation and do not characterize an optimal policy. Our work closes this gap: we formulate SDFSCP as a stochastic dynamic program, characterize structural properties of the optimal policy, and propose a structure-guided heuristic.
Related problems in stochastic dynamic optimization
Related problems exist in the literature. One example is the stochastic dynamic job assignment problem, as studied by Akcay et al. (2010) and Kang et al. (2016). These problems are similar to SDFSCP in that they involve dynamic decision-making to allocate resources (cash on hand in SDFSCP) to complete incoming jobs (invoices in SDFSCP). While both problems address dynamic decisions under uncertainty regarding incoming tasks and resource availability, they differ significantly in several respects. One key distinction is that, unlike the job assignment problem, where the maximum resource capacity is fixed, in SDFSCP, the level of cash on hand—analogous to resource capacity—is dynamic and may increase or decrease over time. Additionally, while the resource required to complete a given job in the job assignment problem is fixed, in SDFSCP, the cash required to pay off an invoice is not fixed but increases over time due to payment terms involving discounts and penalties.
Another related problem is the scheduling problem with time-dependent processing times (e.g., Cheng et al., 2004). A simplified version of the financial supply chain problem can be conceptualized as a scheduling problem where job processing times increase as a function of their start times, often taking a specific non-linear form (Ng et al., 2012). However, there are fundamental differences between these two problems. In scheduling problems, machine capacity remains constant or unaffected by idle periods. By contrast, in the financial supply chain problem, cash on hand (analogous to machine capacity) increases over time through interest gains if left unused or fluctuates based on dynamic cash inflows. Prior studies have also explored stochastic scheduling problems in queuing systems, which assume dynamic and stochastic incoming jobs for scheduling (e.g., Pinedo, 1983; Shanthikumar and Yao, 1992).
In both dynamic job assignment and scheduling problems, while not identical, settings similar to SDFSCP can be observed. For instance, in manufacturing and service scheduling, delays in processing jobs can increase resource requirements due to degradation or complications (Browne and Yechiali, 1990; Mosheiov, 1994; Oron, 2014). Conversely, server capacities often improve after periods of inactivity due to recovery mechanisms. Workers regain efficiency with rest, machines perform better after cooling or maintenance, and battery-powered systems recharge during idle times, enhancing overall performance.
However, to the best of our knowledge, no prior research directly addresses the specific problem examined in this study—one that considers both time-variant capacity and time-variant processing time under uncertainty while analyzing the structural properties of its optimal policy. Our work, therefore, goes beyond application-specific insights. Methodologically, we introduce a dynamic stochastic programming formulation that explicitly incorporates both evolving capacity (cash inflows and reserves) and time-dependent processing requirements (invoice balances with discounts and penalties). We characterize structural properties of the optimal policy, including threshold-type decision rules and priority orders that we classify into deterministic and stochastic types. Finally, we design a heuristic that leverages these properties, demonstrating how structural analysis can inform practical, implementable decision rules. Taken together, these methodological innovations extend beyond the financial supply chain context and contribute to the broader literature on dynamic job assignment and stochastic scheduling, where problems with both stochastic arrivals and time-dependent resource requirements are increasingly relevant.
Problem definition and formulation
In this section, we formally define SDFSCP as a finite-horizon, discrete-time, stochastic dynamic program. All notation is explained when it is first introduced. We also provide a summary of the main notation in the e-companion, which also contains all proofs.
Stochastic dynamic financial supply chain problem (SDFSCP)
The decision horizon of SDFSCP is composed of T time intervals, each having a uniform length (e.g., day or week). Interval inv(t) is defined as [t, t + 1), where t = 0,1,2,…,T − 1. Within each interval, new invoices from upstream partners and cash inflows from downstream partners arrive at the end of each interval. At the start of each interval, the company uses available cash to decide which existing invoices to pay.
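The event timing within one interval can be illustrated with a small sketch. It assumes a particular ordering that the text only partly specifies: payments leave at the start of the interval, the cash that remains earns interest by a per-interval factor r, and the inflow is added at the end of the interval. The function `step_cash` and the numbers (a $350 weekly inflow and a 0.1% weekly rate, borrowed from the introductory example) are illustrative assumptions, not the paper's formal state transition.

```python
def step_cash(cash, payments, inflow, r):
    """Advance cash on hand by one interval under the assumed event ordering."""
    total_paid = sum(payments)
    if total_paid > cash:
        # Payments are limited by the cash available at the start of the interval.
        raise ValueError("payments exceed cash on hand")
    # Unused cash earns interest over the interval; the inflow arrives at the end.
    return (cash - total_paid) * r + inflow


cash = 0.0
for week in range(4):
    # No payments are made in this toy path; cash simply accumulates.
    cash = step_cash(cash, payments=[], inflow=350.0, r=1.001)

print(round(cash, 2))
```

Under these assumptions the firm holds slightly more than $1,400 after four inflows, which is the amount needed to pay the discounted invoice I2 in the introductory snowball example.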
Consider an invoice k, that arrives at the end of inv(t − 1) (i.e., at time t). We set its availability time to
Cash on hand accrues interest at factor per interval r (≥1): cash c at time
State variables
For an invoice k with arrival time
For notation and aggregation, we group invoices into types by their discount conversion rate
Decision variables
Payment decisions are made at the beginning of each interval (D1 in Figure 1). Let

SDFSCP event diagram.
Let
Both
Stochastic dynamic program
We formulate SDFSCP as a stochastic dynamic program that minimizes expected total payments over the horizon, with payments valued in end-of-horizon units. A dollar paid at time t is scaled by
The first term of the objective function represents the end-of-horizon value of payments made at time t; the second term is the expected minimal payment from t + 1 to T. The first and third constraints ensure that the payment for an invoice cannot be less than zero and exceed the outstanding invoice amount, respectively. The second constraint enforces that the total amount of payment does not exceed the available cash on hand, which also implies
For structural analysis, it is convenient to define the post-decision objective
Computing an exact optimal policy for SDFSCP is computationally prohibitive due to the curse of dimensionality (Powell, 2007). Accordingly, we focus on characterizing structural properties of the optimal policy and the managerial insights they imply. These properties provide a foundation for efficient, implementable heuristics; leveraging them, we propose a new heuristic for SDFSCP.
To characterize the structural properties of an optimal policy, we analyze the objective function of P1,
Existence of an optimal policy
The constraint sets (C1)–(C3) in P1 form a closed and bounded convex set. Hence, if the objective function,
Lemma 1 proves the existence of an optimal policy of SDFSCP.
We examine the structural properties of an optimal policy. Because the problem's state spaces are multidimensional, characterizing the global structure of an optimal policy is challenging (Zhuang and Li, 2012). Following prior work (Kang et al., 2016; Zhuang and Li, 2012), we address this difficulty by first analyzing the two-dimensional cases (Sections 4.2.1 and 4.2.2). Insights gained there are then used to inform and characterize the structure of the optimal policy in the original multidimensional setting (Section 4.3).
Optimal policy in
Consider a firm that holds a single outstanding invoice and some cash on hand. The central question is simple: how much of the available cash should be used to pay the invoice now, and how much should be kept in reserve for future obligations? Paying more now reduces the invoice balance and avoids potential penalties, but it leaves less cash available for upcoming invoices or cash shortfalls. Keeping more cash in reserve preserves flexibility but may increase the cost of the current invoice over time. The optimal policy must strike the right balance between these two competing pressures.
This trade-off has a clean geometric interpretation. In the two-dimensional space

Structure of the
We now formalize this structure. We introduce subconvexity (Koole, 1998; Zhuang and Li, 2012) and its monotone structure, represented by the notation
(subconvexity). A real function
For a given base point
That is,
Lemma 2 establishes monotonicity of
Subconvexity (and the related notion of the directional property used in Lemma 4) is closely related to the more familiar concepts of supermodularity (and its dual, submodularity), which are widely used in optimization to characterize monotone behavior of optimal solutions (Kang et al., 2016; Powell, 2007). Whereas supermodularity describes how an optimal solution varies with respect to an individual variable (e.g., in
Theorem 3 establishes the subconvexity of the two-dimensional objective function for t = 1, 2, …, T.
Let
Let the
The structure of the
The optimal policy is a switching-curve policy that partitions the state space into regions (A) and (B), as illustrated in Figure 2(a). The boundary,
Now suppose the firm has two outstanding invoices and limited cash that cannot cover both. The central question shifts: which invoice should be paid first, and should cash ever be split between them? Intuitively, splitting cash proportionally might seem like a reasonable hedge. But as we show, this is never optimal—the firm should always fully settle one invoice before allocating any cash to the other.
This trade-off has a clean geometric interpretation. In the two-dimensional space of remaining invoice balances, any feasible payment moves the state along a −45° line: one dollar more toward one invoice means one dollar less toward the other. The optimal decision is the point on this −45° line that minimizes total expected cost. As shown in Figure 2(b), the unconstrained minimizer lies at the origin (0, 0), since reducing both balances is always preferable, and the locus of constrained minimizers (the dashed curve) is downward-sloping with a slope between −45° and −90°. This geometry implies that the optimal point always lies at one of the two endpoints of the feasible −45° segment: the firm pays as much as possible toward one invoice before touching the other. This is the at-most-one-split property, which defines a strict payment priority between any two invoices. To characterize which invoice takes priority, we use the directional property—the two-invoice analog of subconvexity.
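As a rough illustration of the at-most-one-split property in the preemptive setting, the sketch below allocates cash along a given priority order. The priority order itself is taken as an input here; how it is determined is the subject of the analysis that follows. The function name `allocate_by_priority` is hypothetical.

```python
def allocate_by_priority(cash, balances):
    """Pay invoices in priority order until cash runs out.

    `balances` lists outstanding amounts, highest priority first.
    Returns the payment made toward each invoice.
    """
    payments = []
    for balance in balances:
        pay = min(cash, balance)  # settle fully if possible, else pay what is left
        payments.append(pay)
        cash -= pay
    return payments


balances = [600.0, 700.0]
payments = allocate_by_priority(1000.0, balances)

# Invoices that receive a strictly partial payment.
partial = [p for p, b in zip(payments, balances) if 0 < p < b]
print(payments, len(partial))
```

Because each invoice is settled in full before the next one receives any cash, at most one invoice ends up partially paid.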
We next consider allocating available cash between two invoices,
(Superconvexity/Subconcavity). A real function
A real function
For a given base point
That is,
Lemma 4 establishes monotonicity of
In our problem, when two invoices
To analyze payment priority between invoices
Let
Let the
The structure of the
The
The optimal policy forms a switching-curve policy. The policy divides the state space into three adjacent areas: areas (A), (B), and (C). The solid arrows represent the optimal decision in each area. According to the policy, in area (A), all
The
Theorems 3 and 5 provide structural properties based on the location of an optimal solution, which is typically unknown due to the computational challenges posed by the curse of dimensionality. However, such properties often yield valuable insights even in the absence of direct knowledge of the actual optimal solution (Kang et al., 2016; Smith and McCardle, 2002). Theorem 3 offers a key managerial insight: managers should avoid increasing cash on hand as invoice amounts rise, regardless of the optimal solution's location. Similarly, Theorem 5 suggests that one invoice should be fully paid before initiating payment on another.
Theorems 3 and 5 provide key structural insight into an overall optimal policy for SDFSCP within the original multidimensional state space
Under an optimal policy
Corollary 6 and Theorem 5 show that the optimal policy for SDFSCP is governed by payment priorities among invoices. We therefore characterize the structure of these priorities, distinguishing between deterministic priority, which is state-independent, and stochastic priority, which depends on the state and random parameters. We first identify conditions for deterministic priority and when it becomes stochastic. The analysis proceeds from a single invoice (Section 4.4.1), to two invoices (Section 4.4.2), and then to the multi-invoice case, where we revisit the overall policy structure (Section 4.4.3).
Single invoice
For a single invoice, we characterize the optimal payment timing independent of other invoices. By converting all payment amounts into end-of-horizon values, we define a projected cost function that allows time-invariant comparison across payment dates. We show that each invoice has a well-defined optimal payment time determined by the trade-off between discount, interest, and penalty effects, and Lemma 7 establishes that in an optimal policy an invoice is paid only at (or after) its optimal timing threshold, never before.
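The idea of a projected cost can be sketched numerically. The example assumes that a payment made at week t is converted to end-of-horizon value by multiplying by r^(T − t); this scaling, the arrival of the invoice at week 0, and the function names are our assumptions for illustration. The invoice terms are those of I1 from the introductory example, with r = 1.001 and T = 52.

```python
def amount_due(week, discounted, face, discount_due, payment_due, penalty_rate):
    """Amount needed to settle the invoice in full at `week`."""
    if week <= discount_due:
        return discounted
    if week <= payment_due:
        return face
    return face * (1 + penalty_rate) ** (week - payment_due)


def projected_cost(week, horizon, r, **terms):
    """End-of-horizon value of paying in full at `week` (assumed scaling r**(T - t))."""
    return amount_due(week, **terms) * r ** (horizon - week)


terms = dict(discounted=2240.0, face=3200.0, discount_due=10,
             payment_due=22, penalty_rate=0.004)
horizon, r = 52, 1.001

# Week with the smallest projected cost when the invoice is considered in isolation.
best_week = min(range(horizon + 1),
                key=lambda t: projected_cost(t, horizon, r, **terms))
print(best_week)
```

In this instance the discount outweighs the interest that could be earned by waiting, so the projected cost is smallest at the end of the discount period; paying earlier than that only forgoes interest.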
Consider invoice k. Let
We assume the value of money changes over time according to the interest rate r. Hence, we define
Let
Let
In an optimal policy, if invoice k is paid at time
Lemma 7 provides meaningful managerial insights into the timing of invoice payments. If there is sufficient cash on hand to pay all existing invoices and no new invoices arrive, then every invoice can be paid at its optimal payment time
We next characterize payment priorities between two invoices. Using the relative projected cost ratio
Let
If
Under the conditions of Proposition 8, it is preferable for the company to pay both invoices i and j immediately because
If
While the company should pay invoice j at time
If
Because
Propositions 8 and 9 define time-dependent payment priority rules that hold at a specific point in time but may not necessarily remain valid beyond that moment. For instance, if invoices i and j satisfy the conditions of Proposition 8 or 9 and invoice j is not fully paid at time
If
Propositions 8, 9, 10, and 11 provide valuable insights into the payment priority rules between two invoices. These rules hold consistently regardless of other random parameters, such as the arrival rates of new invoices and cash inflows. For this reason, we refer to these payment priorities as deterministic optimal payment priorities. However, not all invoice pairs can be prioritized using these propositions. In some cases, the payment priority between two existing invoices depends on the problem state and random parameters. We refer to this as the stochastic optimal payment priority. Proposition 12 illustrates an example of this stochastic optimal payment priority.
If there exist
Figure 3 shows an example when the optimal payment priority between invoices i and j becomes stochastic. The figure shows

An example of the stochastic optimal payment priority.
We now understand both the overall structure of an optimal policy and the payment priorities between two invoices. Building on these insights, we extend the structure to characterize an optimal policy for multiple invoices.
Let
We consider two different types of invoices: l and
Rule (1) represents the case when both invoices
Rule (2) addresses the cases when both invoices benefit from being paid at discounted prices rather than normal prices, as the discount gains exceed the interest gains that can be earned during the normal periods (
Rule (3) addresses the cases when invoice
Rule (4) represents the opposite cases of rule (3), showing situations where invoice
(1) For
(1.1) When
(1.2) When
(2) For
(2.1) When
(2.2) When
(2.3) When
(2.4) When
(3) For
(3.1) When
(3.2) When
(4) For
(4.1) When
(4.2) When
Theorem 13 characterizes when one invoice has a deterministic optimal payment priority over another. When these conditions hold, the ordering is stable and state-independent; otherwise, priority becomes state-dependent and stochastic. For invoices of the same type, additional refinements are provided in the e-companion. Together, Propositions 8–12 and Theorem 13 clarify the structure of the optimal policy. Once parameters are specified, deterministic priorities can be identified at each state, substantially reducing computational complexity in both exact optimization and heuristic design.
Importantly, Theorem 13 shows that common managerial instincts can be misleading. Consider rule (4.1), where invoice
Although the complete payment rules are intricate and state-dependent, the structural results yield several high-level managerial takeaways:
These principles guide the design of our heuristic, which operationalizes the structural ordering and pay-or-reserve rules in a computationally tractable manner.
To demonstrate the effectiveness of the structural properties of the optimal policy, we develop a simple heuristic that leverages these properties. We then compare its performance to existing heuristics, specifically the snowball and the avalanche methods. These two methods are the most widely used in practice and are implemented in many commercial software packages (Merritt, 2024; Rios-Solis et al., 2017).
A new heuristic: optimal-policies-based heuristic
Our proposed heuristic, the optimal-policies-based heuristic (OPBH), operationalizes the structural priority rules (Propositions 8–12) and the pay-or-reserve logic of Lemma 7. It first applies all deterministic priority conditions. When priority is stochastic (Proposition 12), invoices are ranked using a second-order dominance approximation: we compare the areas under their projected cost ratio curves,
Experimental design
We consider a company that receives invoices from upstream partners and cash inflows from downstream partners on a weekly basis, making payment decisions over a 1-year horizon (52 weeks). Parameter values reflect common business practices: early payment discounts of 1% to 5% (PYMNTS, 2022), late penalties of 0.3% to 1.3% per week (Resolve, 2025b), and a 0.1% weekly interest rate (about 5.3% annually). These settings align with prior literature (Gupta and Dutta, 2011; Gupta et al., 1987; Rios-Solis et al., 2017) and provide a realistic basis to evaluate the effectiveness of our heuristic. The parameters used in this experiment are detailed in Table 3. For each combination of parameters, we randomly generate 30 problem instances.
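The annualized figure quoted for the weekly interest rate can be checked with a short calculation, assuming weekly compounding over 52 weeks.

```python
weekly_rate = 0.001  # 0.1% per week
weeks_per_year = 52

# Compound the weekly rate over one year.
annual_rate = (1 + weekly_rate) ** weeks_per_year - 1
print(f"{annual_rate:.2%}")
```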
Parameters used in the experiments.
Note. *For the main analysis, we use the base values for n (number of invoices) and the tightness to control cash inflow
To demonstrate the effectiveness of our proposed heuristic, we compare it against two established approaches: the snowball method and the avalanche method (Merritt, 2024; Rios-Solis et al., 2017). Specifically, we implement four variations tailored to our problem structure: (a) snowball with the smallest face value (SF), (b) snowball with the smallest current debt amount (SC), (c) avalanche with the highest penalty rate (AP), and (d) avalanche with the highest discount rate (AD). Detailed algorithmic descriptions are provided in the e-companion. We use the same objective as the analytical model: total payment—the amount paid toward all invoices received during the horizon; any unpaid invoices at the end are valued at their outstanding balance and included. All heuristics and the problem generator are implemented in Python, and the experiments are conducted on a system with an Intel i9-10900KF processor and 64 GB of RAM.
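The four benchmark variants differ only in the attribute used to rank invoices, which can be sketched as sort keys. This is a minimal illustration of the orderings named above, not the full algorithms described in the e-companion; the `Invoice` fields and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    name: str
    face_value: float
    current_debt: float   # amount currently needed to settle the invoice
    penalty_rate: float   # weekly late-payment penalty rate
    discount_rate: float  # early-payment discount rate


# Sort keys for each benchmark; negated keys give descending order.
ORDERINGS = {
    "SF": lambda inv: inv.face_value,      # snowball: smallest face value first
    "SC": lambda inv: inv.current_debt,    # snowball: smallest current debt first
    "AP": lambda inv: -inv.penalty_rate,   # avalanche: highest penalty rate first
    "AD": lambda inv: -inv.discount_rate,  # avalanche: highest discount rate first
}


def payment_order(invoices, rule):
    """Names of invoices in the order the given benchmark would pay them."""
    return [inv.name for inv in sorted(invoices, key=ORDERINGS[rule])]


# Hypothetical invoices, used only to show that the rules can disagree.
invoices = [
    Invoice("A", 3000.0, 2900.0, 0.004, 0.03),
    Invoice("B", 2000.0, 2050.0, 0.010, 0.02),
    Invoice("C", 5000.0, 1500.0, 0.007, 0.05),
]

for rule in ORDERINGS:
    print(rule, payment_order(invoices, rule))
```

Each rule ranks on a single static attribute, so the four variants can produce four different payment sequences for the same set of invoices.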
Before presenting aggregate results, we provide a small illustrative sample path to clarify why simple benchmark policies can perform poorly relative to our OPBH heuristic. Consider
Experimental results
Table 4 reports the main results. It illustrates the comparative effectiveness of our heuristic against the others, measured by gap performance, defined as (other heuristic's total payment – our heuristic's total payment)/our heuristic's total payment × 100. Each cell in the table provides descriptive statistics of gap performance in the format “average (standard deviation) [min – max].” OPBH consistently outperforms all alternatives: the average gap is 4.01%, statistically significant (t-test, p < 0.001). It consistently outperforms the other methods in every problem instance (i.e., minimum gap > 0). The source of this systematic advantage is structural: OPBH leverages two properties that benchmark heuristics ignore. First, the pay-now trigger (Lemma 7) prevents premature payments that sacrifice liquidity without reducing long-run cost—benchmark heuristics frequently pay invoices before their economic trigger, locking up cash that could earn interest or cover higher-urgency invoices arriving later. Second, the projected-cost priority (Propositions 8–11) sequences invoices by their time-adjusted burden rather than by static attributes such as balance size or penalty rate, capturing the compounding dynamics of discounts and penalties over the remaining horizon. These advantages are most pronounced when cash is tight (tightness = 0.4–0.7), where the pay-now discipline prevents costly early commitments, and when the horizon is long, where priority misordering by benchmarks compounds over many periods. When cash is abundant (tightness = 1.6), all invoices can be paid at their optimal times regardless of sequencing, so the structural advantage shrinks—though OPBH still dominates in every instance.
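The gap measure and the cell format used in the tables can be sketched as follows. The use of the sample standard deviation is an assumption, since the text does not state which estimator is reported, and the list of gaps passed to `summarize` is made up for illustration. The first call applies the same formula to the introductory example, with the optimal total as the reference instead of OPBH.

```python
from statistics import mean, stdev


def gap_percent(other_total, reference_total):
    """Percentage by which another heuristic's total payment exceeds the reference."""
    return (other_total - reference_total) / reference_total * 100


def summarize(gaps):
    """Format gaps as 'average (standard deviation) [min - max]'."""
    return (f"{mean(gaps):.2f} ({stdev(gaps):.2f}) "
            f"[{min(gaps):.2f} - {max(gaps):.2f}]")


# Snowball total versus the optimal total in the introductory example.
intro_gap = gap_percent(9411.96, 7020.77)
print(round(intro_gap, 1))

# Hypothetical gaps across three instances, for formatting only.
print(summarize([2.0, 4.0, 6.0]))
```

A strictly positive minimum in a cell means the reference heuristic had the lower total payment in every instance of that setting.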
Gap performance.
Table 5 presents a sensitivity analysis on cash-on-hand tightness and the number of invoices. Cash tightness materially affects gap performance. The problem becomes close to trivial when cash is abundant: firms can pay all discounted invoices and avoid late penalties, so differences among heuristics narrow—yet our heuristic still outperforms others by better exploiting payment terms. When cash is scarce, the problem is harder because selecting which invoices to pay is critical; our heuristic is designed for this setting and continues to perform strongly. By contrast, gap performance is largely unaffected by the number of invoices, indicating that the heuristic performs consistently well across problem sizes. The heuristic runs in
Sensitivity analysis on cash-on-hand tightness and the number of invoices.
The proposed heuristic, OPBH, utilizes both optimal deterministic and stochastic payment priorities identified in Section 4.4.2 and the insights from Theorems 3 and 5. While deterministic priorities ensure optimality, computing the optimal stochastic priority is computationally infeasible due to its complexity. Instead, OPBH employs a simple approximation by comparing the under-area of the
To assess OPBH's performance under such conditions, we test it in scenarios with highly skewed cash inflows, as shown in Table 6. As expected, the effectiveness of OPBH decreases as the skewness of cash inflows increases. For instance, when 90% of cash inflows are concentrated in the last five periods (∼10% of the horizon), the average performance gap decreases by 2.35%.
Sensitivity analysis on the skewedness of cash inflow.
Note. Seventy percent (70%) or ninety percent (90%) indicates that 70% or 90% of the total cash inflow is concentrated in the last five periods, which represent approximately 10% of the decision horizon.
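To make the skewness settings concrete, the sketch below builds a cash-inflow path in which a given share of the total inflow falls in the last five periods. It is a deterministic simplification under stated assumptions: the actual problem generator is described in the e-companion and draws random inflows, whereas this hypothetical `skewed_inflows` helper spreads each share evenly within its block of periods.

```python
def skewed_inflows(total_inflow, horizon, tail_share, tail_periods=5):
    """Split total_inflow so that tail_share of it lands in the last tail_periods.

    Even spreading within each block is an assumption of this sketch;
    it is not the randomized generator used in the experiments.
    """
    if not 0.0 <= tail_share <= 1.0:
        raise ValueError("tail_share must lie in [0, 1]")
    if not 0 < tail_periods < horizon:
        raise ValueError("tail_periods must be positive and shorter than the horizon")
    head_periods = horizon - tail_periods
    head_flow = total_inflow * (1.0 - tail_share) / head_periods
    tail_flow = total_inflow * tail_share / tail_periods
    return [head_flow] * head_periods + [tail_flow] * tail_periods


# Weekly epochs over one year: the last 5 of 52 periods are roughly 10% of the horizon.
flows = skewed_inflows(total_inflow=5200.0, horizon=52, tail_share=0.9)
print(round(sum(flows[-5:]) / sum(flows), 3))
```

With a 90% tail share, the first 47 periods together receive only a tenth of the total inflow, so early payment decisions are made under very tight cash.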
According to the sensitivity analysis of problem parameters, OPBH's gap is minimized when cash is abundant (tightness = 1.6). Combined with this disadvantageous parameter setting, highly skewed cash inflows (90%) constitute a worst-case configuration in our experiments. In this worst case, the average gap is 2.30%, and the difference is statistically significant. OPBH still outperforms all benchmark heuristics in every instance (see Table 7).
Worst case scenario analysis.
We further evaluate OPBH across a range of settings. Although our main experiments use a weekly decision epoch and a one-year horizon, the heuristic applies equally to other intervals (including daily decisions). Additional experiments show that changing the decision epoch does not materially affect performance. The decision horizon can also be lengthened or shortened; the observed performance gap tends to widen as the horizon increases, suggesting that OPBH's advantages accumulate over time and yield greater payment efficiency. We also test the effect of invoice arrival concentration and find that concentration alone does not materially alter OPBH's performance. Finally, we conduct sensitivity analyses over discount, penalty, and interest rates; OPBH continues to perform robustly. Detailed results are reported in the e-companion.
In summary, across all problem instances, including those under worst-case or extreme scenarios, our heuristic consistently outperforms all other heuristics. Given the size of accounts payable relative to retained earnings or total assets (see Table 2), even a small percentage improvement in total payment savings can translate into millions or even billions of dollars, directly enhancing profit margins.
Partial invoice payments are increasingly common in supply chain and financial management. According to Resolve (2025a), over 30% of B2B transactions now involve partial payment arrangements. For instance, Amazon allows its business customers and AWS users to make partial payments, providing greater flexibility in managing cash flows (McMillan, 2025). Similarly, Buy Now, Pay Later (BNPL) providers, such as Klarna, Affirm, and Afterpay, enable consumers to make partial payments on their purchases (Investopedia, 2025). Many enterprise systems, such as SAP, natively support partial invoice payments, enabling firms to record and manage split settlements in both accounts receivable and AP processes. Prior research has similarly assumed preemptive invoice structures that allow partitioned payments (Ng et al., 2012; Rios-Solis et al., 2017). These practices and prior studies support our assumption that invoices are preemptive, allowing partitioning for partial settlements to support dynamic cash flow management.
While our analysis focuses on preemptive payments, in which invoices may be settled in parts, non-preemptive (full) payments are also widely used in practice. The structural properties of an optimal preemptive policy do not guarantee optimality in non-preemptive settings, due to stricter indivisibility and sequencing constraints (Correa et al., 2012; Lawler and Labetoulle, 1978). Nevertheless, key insights, such as priority ordering and threshold-based decision rules, can often be adapted as effective heuristics in non-preemptive environments (Correa et al., 2012; Pinedo, 2016). Although deriving tight theoretical performance bounds when applying preemptive-based strategies to non-preemptive contexts is challenging, assessing whether these structural insights yield good practical outcomes remains feasible. Motivated by this, we modify the heuristic and compare its performance to that of existing heuristics in non-preemptive scenarios.
For this analysis, we enforce a non-preemptive constraint in OPBH: if available funds are insufficient to fully pay an invoice, payment is skipped rather than partially allocated. We likewise adjust all benchmark heuristics to follow the same non-preemptive rules for a fair comparison. The pseudocode appears in the e-companion. Table 8 reports results under the same parameter settings as the preemptive analysis. In the non-preemptive setting, OPBH consistently outperforms existing heuristics, with an average performance gap of 4.08%. As in the preemptive setting, OPBH dominates in every instance, reinforcing the robustness of the structural insights derived from the preemptive model. The average performance difference between the non-preemptive and preemptive versions of OPBH is 0.07%. While not a theoretical bound, this suggests that the loss of optimality from enforcing non-preemption may be small.
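The difference between the two payment rules can be sketched in a single-period allocation step. This is an illustration under assumptions, not the e-companion pseudocode: the hypothetical `allocate` function takes invoices already sorted by priority, ignores the pay-now trigger and any cash reservation threshold, and assumes that a skipped invoice does not block lower-priority invoices that can still be paid in full.

```python
def allocate(cash, balances, preemptive=True):
    """Pay invoices in the given priority order from the available cash.

    balances: list of (name, outstanding_balance), highest priority first.
    Returns (payments by invoice name, remaining cash).
    """
    payments = {}
    for name, balance in balances:
        if cash <= 0:
            break
        if balance <= cash:
            pay = balance   # enough cash: pay the invoice in full
        elif preemptive:
            pay = cash      # partial payment with whatever cash is left
        else:
            continue        # non-preemptive: skip, never pay partially
        payments[name] = pay
        cash -= pay
    return payments, cash


# Invented balances, listed in priority order.
queue = [("A", 60.0), ("B", 70.0), ("C", 30.0)]
print(allocate(100.0, queue, preemptive=True))
print(allocate(100.0, queue, preemptive=False))
```

In this example the preemptive rule spends all cash on the two highest-priority invoices, while the non-preemptive rule skips the second invoice, pays the third in full, and carries the remainder forward.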
Gap performance—non-preemptive case.
This study investigates the SDFSCP, where a company seeks to minimize total invoice payments under uncertain cash inflows, payment terms, and interest on cash holdings. Our contributions are fourfold. First, we introduce the first stochastic dynamic programming formulation of SDFSCP. The model explicitly incorporates both time-variant processing requirements (invoice balances that evolve through discounts and penalties) and time-variant capacity (cash inflows and reserves), extending beyond prior deterministic or simplified approaches (Gupta and Dutta, 2011). Second, we identify structural properties of the optimal policy. These properties, which reveal threshold-like decision rules and priority-ordering patterns, reduce the effective complexity of the problem and provide new theoretical insights into stochastic dynamic optimization with evolving capacity and requirements. Third, we develop a heuristic grounded in these structural properties. Computational experiments show that it consistently outperforms widely used strategies such as the snowball and avalanche methods. Importantly, we extend the analysis to non-preemptive settings, where partial payments are disallowed, and demonstrate that our heuristic continues to outperform benchmarks across all problem instances. This robustness underscores the practical value of the structural results, even under stricter real-world constraints. Fourth, our findings contribute to broader literatures on scheduling and dynamic job assignment. SDFSCP shares features with problems in which both resource capacity and job requirements evolve dynamically, a combination rarely addressed jointly in prior research.
Methodologically, our formulation and structural results, including diagonal analysis and structural properties using subconvexity and the directional property (Kang et al., 2016), may inform the design of algorithms and heuristics in domains such as healthcare scheduling, cloud computing, and energy management, where time-dependent capacities and processing requirements play a central role. From a managerial perspective, our findings demonstrate that intuitive practices—such as paying invoices with higher penalty rates or smaller amounts—may be systematically suboptimal. The structural properties identified provide actionable guidance for designing payment strategies that achieve significant cost savings while maintaining computational efficiency.
Limitations and future research
This study has several limitations that also create opportunities for future research. For tractability, the main model assumes preemptive invoices, that is, partial payments are allowed. While this assumption aligns with many modern enterprise systems, not all financial settings permit such flexibility. We therefore extend our heuristic to non-preemptive settings and show experimentally that it remains robust, consistently outperforming existing heuristics. A formal structural analysis of the non-preemptive case, or theoretical bounds on the performance gap between preemptive and non-preemptive policies, represents a promising and challenging direction.
Computing an exact optimal policy for SDFSCP remains difficult due to the curse of dimensionality (Powell, 2007). Although we characterize structural properties of the optimal policy that guide efficient algorithm design, solving the full model exactly is often computationally intractable. Consequently, some structural insights arise from optimal solutions that cannot be directly computed in large-scale instances. Nonetheless, these properties provide valuable guidance for heuristic design and deepen qualitative understanding of how optimal decisions respond to parameter changes (Smith and McCardle, 2002).
Beyond these limitations, our findings highlight several avenues for future research. The structural properties we identify (threshold-type reservation rules and priority orderings), together with the analytical features underlying them (directional modular properties such as subconvexity and superconcavity, and the directional property of Kang et al., 2016), suggest broader methodological opportunities for stochastic dynamic resource allocation and scheduling, particularly when resource capacity and requirements fluctuate over time.
These properties may inform approximation schemes (e.g., rollout policies or index-based scoring rules), structure-guided decompositions (e.g., Lagrangian relaxations with efficiently solvable subproblems and column generation), instance-specific performance bounds via dual–primal gaps, and learning-based approaches that embed structural constraints to improve sample efficiency and interpretability. Potential application domains include patient scheduling in healthcare (where treatment complexity evolves with delay), task allocation in cloud computing (where server availability and task requirements fluctuate), and energy storage management (where capacity both depletes and replenishes dynamically).
To illustrate how these structural insights can be operationalized, we describe two concise methodological “templates” that future researchers may adopt.
Finally, while the proposed heuristic performs strongly in our experiments, formal optimality guarantees remain open. Deriving theoretical performance bounds or certificates is an important avenue for future research. In addition, empirical validation using proprietary, transaction-level AP and cash flow data would further test generalizability and enable more precise calibration.
Supplemental Material
sj-pdf-1-pao-10.1177_10591478261460124 - Supplemental material for Optimal policy for managing stochastic cash flows in a financial supply chain
Supplemental material, sj-pdf-1-pao-10.1177_10591478261460124 for Optimal policy for managing stochastic cash flows in a financial supply chain by Keumseok Kang, Sushil Gupta, Inkyoung Hur and Sungbum Jun in Production and Operations Management
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the IITP-ITRC grant (IITP-2026-RS-2021-II211816), the IITP-Global Data-X Leader HRD program grant (IITP-RS-2024-00440626), and the NRF grants (2024S1A5A2A0303904513; RS-2022-NR068758; RS-2024-00341647; RS-2025-16072058) funded by the Korea Government.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
How to cite this article
Kang K, Gupta S, Hur I, Jun S (2026) Optimal Policy for Managing Stochastic Cash Flows in a Financial Supply Chain. Production and Operations Management XX(XX): 1–21.
