Cross-substrate transfers apparently cost around 2 pJ/bit vs. 0.3-ish over passive interposers (measured against bandwidth actually consumed, not link capacity). The question then becomes how much you are willing to pay in development and validation/integration costs to save maybe a Watt or two per link, a saving that is only realized under the heaviest transfer scenarios, which are likely to be transient conditions for all but a small subset of users' applications. The trade-off makes more sense for HBM, since each module can already transfer around 250 GB/s (and about double that for HBM3) and is likely to run close to fully saturated, whereas an Infinity Fabric link is around 50 GB/s and will have very low duty cycles for many workloads.
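
For a rough sense of scale, here is a back-of-envelope sketch of the arithmetic, using the approximate figures quoted above (2 vs. 0.3 pJ/bit, 50 and 250 GB/s). The duty-cycle values and the function name are my own assumptions for illustration, not measurements.

```python
# Back-of-envelope link power: bits/s * joules/bit * fraction of time active.
# Energy and bandwidth figures are the rough ones quoted in the comment above;
# the duty cycles below are hypothetical, picked only to show the contrast.

def link_power_watts(bandwidth_gb_per_s, energy_pj_per_bit, duty_cycle=1.0):
    """Average power in watts for a link moving data at the given rate."""
    bits_per_s = bandwidth_gb_per_s * 1e9 * 8       # GB/s -> bit/s
    joules_per_bit = energy_pj_per_bit * 1e-12      # pJ -> J
    return bits_per_s * joules_per_bit * duty_cycle


def interposer_savings_watts(bandwidth_gb_per_s, duty_cycle=1.0,
                             substrate_pj=2.0, interposer_pj=0.3):
    """Power saved by moving a link from organic substrate to an interposer."""
    on_substrate = link_power_watts(bandwidth_gb_per_s, substrate_pj, duty_cycle)
    on_interposer = link_power_watts(bandwidth_gb_per_s, interposer_pj, duty_cycle)
    return on_substrate - on_interposer


if __name__ == "__main__":
    # Assumed duty cycles: fabric link mostly idle, HBM close to saturated.
    cases = [
        ("Fabric link, 50 GB/s, saturated", 50, 1.0),
        ("Fabric link, 50 GB/s, 10% duty", 50, 0.1),
        ("HBM module, 250 GB/s, 90% duty", 250, 0.9),
    ]
    for label, bandwidth, duty in cases:
        saved = interposer_savings_watts(bandwidth, duty)
        print(f"{label}: ~{saved:.2f} W saved")
```

With these inputs the saving is about 0.7 W for a fully saturated 50 GB/s link and about 3.4 W for a fully saturated 250 GB/s HBM module; bidirectional traffic or wider links would scale that up proportionally, and a low duty cycle scales it down just as directly.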
