Author: Admin

  • S&P 500 Return Contribution Analysis: Which Stocks Are Really Driving the Index?

    S&P 500 return contribution analysis helps answer a simple question: which stocks are actually driving the index?

    The S&P 500 is usually discussed as one number. The index is up, the index is down, the market rallied, or the market sold off. But in a market where mega-cap companies account for a growing share of index weight, that single number can hide what is happening underneath the surface.

    That is why S&P 500 concentration has become such a popular topic. Visual Capitalist’s “The Entire S&P 500 in 2026 in One Chart” makes the concentration visible, showing that just 13 companies make up over 40% of the S&P 500. Slickcharts’ S&P 500 companies by weight page shows the same issue from another angle by listing current S&P 500 constituents and their weights.

    Those resources are useful for seeing what the index looks like today. But current constituent weights do not fully answer the historical performance question:

    Which individual stocks actually contributed to S&P 500 price-index returns through time?

    That is the gap open-spx is designed to explore. open-spx is open Python tooling for approximate, bottom-up S&P 500 price-index replication and constituent-level contribution analysis using local CSV inputs.

    Example open-spx output comparing the S&P 500 price index with an approximate bottom-up replicated SPX series.

    Why S&P 500 Concentration Is Everywhere Right Now

    The S&P 500 contains around 500 companies, but the index is not equally weighted. The largest companies matter far more than the smallest companies.

    When Nvidia, Apple, Microsoft, Amazon, Alphabet, Meta, Broadcom, Tesla, Berkshire Hathaway, or JPMorgan move, the index feels it much more than when a smaller constituent moves.

    That is not a bug. It is how a market-cap-weighted index works.

    But it does mean that “owning the S&P 500” is not the same thing as owning 500 companies in equal proportion.

    This is why so much market commentary now focuses on concentration risk, narrow market leadership, and the role of the Magnificent Seven.

    The important follow-up question is not only whether the index is concentrated.

    The better question is:

    How much did each stock actually contribute to the index return?

    The LinkedIn Conversation: S&P 500 Concentration Is Already Mainstream

    The concentration debate is not abstract. It is already showing up across LinkedIn finance commentary, advisor posts, asset-management discussions, and market research threads.

    These posts all point toward the same underlying question:

    If the S&P 500 is becoming more concentrated, can we inspect which constituents actually contributed to the index return?

    That is where open-spx fits.

    Many public discussions stop at current weights or concentration charts. Visual Capitalist makes the size distribution of the index easy to see. Slickcharts provides a current constituent-weight snapshot.

    Those are useful starting points.

    But current weight is not the same as historical return contribution.

    open-spx is aimed at the next layer down: approximate, constituent-level S&P 500 price-index return contribution analysis through time.

    What Is S&P 500 Return Contribution Analysis?

    S&P 500 return contribution analysis is a way to decompose index performance into the stocks that drove it.

    In simple terms:

    stock contribution = stock weight × stock return
    

    If a company has a large index weight and a strong return, it can contribute meaningfully to the S&P 500’s return. If a company has a small weight, even a very large stock move may have only a small index-level impact.
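
The formula above can be illustrated with a minimal sketch. The three tickers, weights, and returns below are made-up values for a toy index, not real S&P 500 data.

```python
# Hypothetical one-day snapshot of a toy three-stock index.
# Weights sum to 1; returns are same-day simple price returns.
weights = {"MEGA": 0.70, "MID": 0.25, "TINY": 0.05}
returns = {"MEGA": 0.02, "MID": -0.01, "TINY": 0.10}

# contribution = weight x return, computed per stock
contributions = {t: weights[t] * returns[t] for t in weights}

# The index-level return is the sum of the constituent contributions.
index_return = sum(contributions.values())

# Print contributors from largest to smallest.
for ticker, c in sorted(contributions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ticker}: {c:+.4%}")
print(f"index: {index_return:+.4%}")
```

In this toy case the small stock rises 10% but adds only 0.50 percentage points, while the large stock rises 2% and adds 1.40 percentage points of the 1.65% index return.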

    Related terms for the same idea include:

    • S&P 500 contribution analysis
    • S&P 500 performance attribution
    • constituent-level return attribution
    • stock-level index attribution
    • S&P 500 return decomposition
    • bottom-up S&P 500 replication

    The concept is simple.

    The implementation is not.

    Why This Is Hard to Find for Free

    Current S&P 500 weights are relatively easy to inspect. Slickcharts, ETF holdings pages, and other public sources can show a current snapshot of component weights.

    But stock-level contribution analysis through time requires more than current weights.

    To estimate historical S&P 500 constituent contribution, you need:

    1. Point-in-time index membership.
    2. Point-in-time constituent weights.
    3. Price returns for each constituent.
    4. Correct handling of ticker changes, mergers, additions, deletions, spin-offs, share-class events, splits, and other corporate actions.
    5. A target index series to compare against.

    The S&P 500 is maintained using float-adjusted market capitalization and official index methodology. Exact official weights, float adjustments, divisor changes, and corporate-action treatments are not fully observable from free public data.

    That is the gap.

    You can read many articles and LinkedIn posts saying that the S&P 500 is concentrated. You can view today’s largest weights. You can see beautiful visualizations of the index by company size.

    But if you want an open, reproducible table showing approximate stock-level return contributions over time, the options are much thinner.

    Introducing open-spx

    open-spx is open Python tooling for approximate bottom-up replication and contribution analysis of S&P 500 price-index returns from user-provided CSV inputs.

    It helps answer questions like:

    • Which stocks contributed most to the S&P 500 price-index return?
    • Which stocks detracted most?
    • How much of the index move came from the largest names?
    • How did contribution change through time?
    • How closely can a bottom-up approximation replicate a supplied S&P 500 price-index series?
    • Where might data-quality issues, ticker mappings, or corporate actions be affecting the result?

    The project is designed around a practical reality:

    Official S&P 500 contribution data is not freely available as a complete point-in-time dataset, but an approximate, transparent, bottom-up workflow is still useful.

    What open-spx Does

    At a high level, open-spx performs five tasks.

    1. Builds a point-in-time membership matrix

    A static list of today’s S&P 500 constituents is not enough.

    The index changes. Companies are added and removed. Tickers change. Share classes appear. Mergers and spin-offs happen. Historical analysis needs a point-in-time view of who was in the index on each date.

    open-spx builds a membership matrix from historical constituent snapshots and optional ticker mappings.
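
As a rough illustration of the idea, and not the project's actual implementation, the sketch below turns date,ticker snapshots into a boolean date-by-ticker matrix. The function name `build_membership` and its `ticker_map` argument are hypothetical.

```python
import csv
import io

# Toy constituent snapshots in a date,ticker layout.
SNAPSHOTS = """date,ticker
2024-01-01,A
2024-01-01,B
2024-01-02,A
2024-01-02,C
"""

def build_membership(csv_text, ticker_map=None):
    """Return (dates, tickers, matrix); matrix[i][j] is True when
    tickers[j] was a member on dates[i].

    ticker_map is a hypothetical {old_symbol: new_symbol} rename table.
    """
    ticker_map = ticker_map or {}
    members = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        ticker = ticker_map.get(row["ticker"], row["ticker"])
        members.setdefault(row["date"], set()).add(ticker)
    dates = sorted(members)  # ISO dates sort chronologically as strings
    tickers = sorted(set().union(*members.values()))
    matrix = [[t in members[d] for t in tickers] for d in dates]
    return dates, tickers, matrix

dates, tickers, matrix = build_membership(SNAPSHOTS)
```

Here B drops out and C enters on the second date, which a static constituent list could not represent.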

    2. Loads constituent prices from local CSV files

    The project expects users to provide their own local price data.

    This matters because open-spx is not a data vendor. It does not redistribute licensed constituent price histories or official index data. Users remain responsible for the data they are allowed to use.

    3. Builds prior weights from market caps or shares outstanding

    A stock’s contribution depends on both return and weight.

    open-spx estimates prior weights from user-provided market-cap CSVs or from shares outstanding combined with close prices.

    These are approximate prior weights, not official S&P Dow Jones Indices weights.
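
A minimal sketch of one way to form such weights, assuming "prior" means market caps observed before the return period (for example the previous close) and normalized across current members only. The helper name `prior_weights` and the numbers are illustrative, and no float adjustment is applied.

```python
def prior_weights(market_caps, members):
    """Normalize the market caps of current members into weights summing to 1.

    market_caps: {ticker: market cap}; members: tickers currently in the index.
    Members with missing or non-positive caps are skipped.
    """
    caps = {t: market_caps[t] for t in members if market_caps.get(t, 0) > 0}
    total = sum(caps.values())
    if total == 0:
        raise ValueError("no usable market caps for the given members")
    return {t: cap / total for t, cap in caps.items()}

# Made-up previous-day market caps; "OUT" is not a member and is ignored.
caps_prev_day = {"A": 3.0e12, "B": 1.0e12, "C": 0.5e12, "OUT": 2.0e12}
weights = prior_weights(caps_prev_day, members=["A", "B", "C"])
```

Restricting the normalization to current members is what ties the weights to the point-in-time membership view.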

    4. Computes bottom-up return contributions

    Once membership, prices, returns, and prior weights are aligned, the tool computes stock-level return contributions.

    The output can be used to inspect:

    • largest cumulative contributors
    • largest cumulative detractors
    • daily contribution tables
    • prior-weight replication
    • replicated index returns versus the supplied S&P 500 price-index series
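
The comparison between a replicated return and the supplied index return can be sketched as follows. All numbers are made up, and the root-mean-square error used here is one plausible replication metric, not necessarily the one open-spx reports.

```python
import math

# Made-up two-day example: per-day prior weights and constituent price returns.
weights_by_day = {
    "2024-01-02": {"A": 0.60, "B": 0.40},
    "2024-01-03": {"A": 0.62, "B": 0.38},
}
returns_by_day = {
    "2024-01-02": {"A": 0.010, "B": -0.005},
    "2024-01-03": {"A": -0.020, "B": 0.015},
}
# Made-up returns of the supplied index series on the same days.
index_returns = {"2024-01-02": 0.0041, "2024-01-03": -0.0065}

replicated, errors = {}, {}
for day in sorted(weights_by_day):
    # Per-stock contribution, then sum to the bottom-up index return.
    contribs = {t: w * returns_by_day[day][t] for t, w in weights_by_day[day].items()}
    replicated[day] = sum(contribs.values())
    errors[day] = replicated[day] - index_returns[day]

# Root-mean-square replication error across days.
rmse = math.sqrt(sum(e * e for e in errors.values()) / len(errors))
```

Days with unusually large errors are natural candidates for the data-quality review described later.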

    5. Optionally fits a constrained RNN adjustment layer

    One daily index return cannot uniquely identify hundreds of constituent weights.

    Because of that, open-spx optionally fits a regularized, prior-constrained masked RNN weight path as one smooth explanation of the supplied return series.

    This fitted layer should not be treated as official index data. It is a diagnostic tool, not a source of truth.

    The model-implied weights are ex-post and in-sample unless the user implements a holdout or walk-forward validation.
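
The identification problem can be seen in a toy case. The sketch below uses made-up returns for three stocks and shows two different valid weight vectors that produce the same index return, which is why any fitted weight path needs a prior or regularization to pick one explanation among many.

```python
# Made-up one-day returns for three stocks.
returns = [0.02, -0.01, 0.04]

def index_return(weights):
    """Weighted sum of constituent returns."""
    return sum(w * r for w, r in zip(weights, returns))

# Two different weight vectors, both non-negative and summing to 1.
weights_a = [0.50, 0.30, 0.20]
weights_b = [0.40, 0.34, 0.26]

ret_a = index_return(weights_a)
ret_b = index_return(weights_b)

# Both explain the same observed index return, so the return alone
# cannot tell them apart.
same_return = abs(ret_a - ret_b) < 1e-12
```

With hundreds of constituents the set of weight vectors consistent with a single daily return is far larger, so a fitted path is one explanation, not a recovered truth.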

    What open-spx Does Not Do

    open-spx is intentionally explicit about its limitations.

    It does not:

    • reproduce the official S&P 500 methodology
    • provide official S&P 500 constituent weights
    • model the official index divisor
    • reproduce all float adjustments
    • model all corporate-action treatments
    • recover official investable weight factors
    • compute total-return index contribution
    • redistribute licensed data

    The project focuses on the S&P 500 price index, not the total-return index. Dividend-adjusted price series embed ordinary dividends that the price index excludes, so they should not be mixed into the inputs.

    This distinction matters. If you are trying to explain the S&P 500 price index, use price-index-compatible inputs. If you are trying to explain total return, that is a different problem.

    Why Approximate Contribution Analysis Is Still Useful

    Approximate does not mean useless.

    A transparent approximation can still help answer important questions:

    • Is the index being carried by a small number of companies?
    • Which names contributed most over a specific window?
    • Are the biggest contributors the same as the biggest weights?
    • Which stocks are offsetting the leading contributors?
    • Does a bottom-up replication broadly track the supplied index?
    • Where does replication error appear?
    • Which dates or tickers deserve data-quality review?

    That is often enough to move from vague commentary to concrete analysis.

    Instead of saying:

    “The S&P 500 is being driven by a handful of stocks.”

    You can ask:

    “Which stocks, by approximate contribution, drove the S&P 500 price-index return over this period?”

    That is a better research question.

    Example Outputs

    open-spx writes CSV files and plots designed for inspection.

    Example outputs include:

    historical_constituents.csv
    membership_date_ranges.csv
    prices.csv
    market_caps_prior_timeseries.csv
    weights_prior_timeseries.csv
    replication_prior_weights.csv
    return_contributions_prior_weights.csv
    weights_model_implied.csv
    effective_exposures_model_fit.csv
    market_cap_equivalent_exposure_gap.csv
    returns.csv
    return_contributions.csv
    cumulative_top_return_contributors.csv
    cumulative_top_return_bleeders.csv
    replication_vs_sp500.csv
    replication_metrics.csv
    replication_metrics_by_model.csv
    anomaly_report.csv
    input_usage_report.csv
    spx_vs_replicated_spx.png
    largest_market_cap_difference_case.png
    

    The most useful files for contribution analysis are:

    • return_contributions.csv
    • return_contributions_prior_weights.csv
    • cumulative_top_return_contributors.csv
    • cumulative_top_return_bleeders.csv
    • replication_vs_sp500.csv
    • replication_metrics_by_model.csv
    • anomaly_report.csv

    The anomaly report is especially useful because strange contribution results often come from data issues: split handling, stale shares outstanding, ticker mappings, missing membership transitions, spin-offs, special dividends, or other corporate actions.

    How to Run open-spx

    Install the project:

    git clone https://github.com/wrageul/open-spx.git
    cd open-spx
    pip install -r requirements.txt
    pip install -e . --no-deps
    

    Then run it with local CSV inputs:

    open-spx \
      --start 2024-01-01 \
      --index data/sp500_index.csv \
      --local-data-dir data/inputs \
      --out data/run
    

    For quieter logs or CI usage:

    open-spx --start 2024-01-01 --quiet
    

    You can also override the constituent input folders independently:

    open-spx \
      --start 2024-01-01 \
      --index data/sp500_index.csv \
      --local-prices-dir data/prices \
      --local-market-caps-dir data/market_caps \
      --out data/run
    

    Required Data Inputs

    open-spx expects plain CSV inputs.

    S&P 500 price-index series

    Date,Close
    2024-01-02,4742.83
    2024-01-03,4704.81
    

    Accepted value column names include Close, sp500_index, index, or level.
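
As an illustration of how such a file could be read, here is a minimal standard-library sketch that finds the value column among the accepted names and computes simple daily price returns. The function name `load_index_returns`, the case-sensitive matching, and the fixed `Date` column name are assumptions of this sketch, not documented open-spx behavior.

```python
import csv
import io

ACCEPTED_VALUE_COLUMNS = ("Close", "sp500_index", "index", "level")

INDEX_CSV = """Date,Close
2024-01-02,4742.83
2024-01-03,4704.81
"""

def load_index_returns(csv_text):
    """Parse an index CSV and return [(date, simple_price_return), ...]."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    value_col = next((c for c in ACCEPTED_VALUE_COLUMNS if c in fieldnames), None)
    if value_col is None:
        raise ValueError(f"expected one of {ACCEPTED_VALUE_COLUMNS}")
    # ISO dates sort chronologically as strings.
    rows = sorted((row["Date"], float(row[value_col])) for row in reader)
    # Return on day t is level_t / level_{t-1} - 1.
    return [(d1, p1 / p0 - 1.0) for (_, p0), (d1, p1) in zip(rows, rows[1:])]

index_returns = load_index_returns(INDEX_CSV)
```

For the two sample rows this yields a single return of roughly -0.80% dated 2024-01-03.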

    Historical constituents

    date,ticker
    2024-01-01,A
    2024-01-01,B
    2024-01-02,A
    2024-01-02,C
    

    The default constituent source points to an open historical S&P 500 component dataset. It is useful, but it is not an official S&P constituent feed. Serious use still requires validation.

    Constituent prices

    Date,Open,High,Low,Close,Volume
    2024-01-02,101.0,103.0,100.5,102.2,1234567
    2024-01-03,102.2,104.1,101.7,103.6,1456789
    

    Daily close data is strongly recommended.

    Market caps or shares outstanding

    Market-cap example:

    date,market_cap
    2024-01-02,12345678900
    2024-01-03,12400000000
    

    Shares-outstanding example:

    date,shares_outstanding
    2024-01-02,123456789
    

    If only shares outstanding are provided, open-spx builds the market-cap prior as:

    market cap prior = close price × shares outstanding
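
A minimal sketch of that calculation, assuming share counts are reported sparsely and the most recent known count is carried forward to later price dates. The carry-forward rule and the helper name `market_cap_prior` are assumptions of this example.

```python
import bisect

def market_cap_prior(closes, shares_outstanding):
    """Return {date: close x latest known shares outstanding}.

    closes: {date: close price}; shares_outstanding: {date: share count},
    possibly sparse. Dates are ISO strings. Price dates that precede the
    first share-count observation are skipped.
    """
    share_dates = sorted(shares_outstanding)
    caps = {}
    for date in sorted(closes):
        # Index of the first share date strictly after this price date.
        i = bisect.bisect_right(share_dates, date)
        if i == 0:
            continue  # no share count known yet
        caps[date] = closes[date] * shares_outstanding[share_dates[i - 1]]
    return caps

closes = {"2024-01-02": 102.2, "2024-01-03": 103.6}
shares = {"2024-01-02": 123456789}
caps = market_cap_prior(closes, shares)
```

Carrying share counts forward is also where stale shares outstanding can creep in, one of the data issues the anomaly report is meant to surface.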
    

    How This Complements Visual Capitalist and Slickcharts

    Visual Capitalist is excellent for seeing the S&P 500 in one chart. It makes concentration visually obvious.

    Slickcharts is useful for checking current S&P 500 companies by weight.

    But both are mostly snapshot-oriented resources. They help answer:

    What does the S&P 500 look like now?

    open-spx is aimed at a different question:

    Which constituents approximately contributed to S&P 500 price-index returns through time?

    That distinction is important.

    Current weight is not the same as historical contribution. A stock can have a large current weight because it performed well in the past. Contribution analysis tries to show how that performance accumulated.

    Why This Matters for Investors, Researchers, and Developers

    S&P 500 concentration is not just a portfolio-management topic. It is also a data-transparency topic.

    If a small group of companies drives a large share of index performance, then understanding the index requires more than looking at the headline return.

    You need to inspect the drivers.

    For investors, that can clarify how much passive exposure depends on a few mega-cap names.

    For researchers, it creates a reproducible way to study concentration and return decomposition.

    For developers, it provides a concrete Python workflow for working with point-in-time membership, constituent returns, prior weights, and replication diagnostics.

    For market commentators, it creates a more precise alternative to broad claims about “narrow leadership.”

    The Main Takeaway

    The S&P 500 may contain around 500 companies, but its returns are not produced equally by 500 companies.

    As concentration rises, the question becomes more important:

    Which stocks are actually driving the index?

    open-spx does not claim to provide official S&P 500 weights or exact index replication. Instead, it provides open Python tooling for approximate, inspectable, bottom-up S&P 500 price-index contribution analysis using user-provided CSV inputs.

    That is the missing middle ground between high-level concentration charts and proprietary index attribution systems.

    If you want to move beyond “the S&P 500 is concentrated” and start inspecting approximate stock-level contribution directly, open-spx is built for that.

    Repository

    Find the project on GitHub:

    github.com/wgeul/open-spx

    Code is licensed under Apache-2.0. Users are responsible for ensuring they have the rights to use and distribute the CSV inputs and generated outputs they create with the project.

    This project is independent and is not affiliated with, endorsed by, or sponsored by S&P Dow Jones Indices, S&P Global, or CME Group.

  • When Growth Destroys Value: Capital Intensity, ROIC, and the Cost of Capital

    Growth is usually treated as a virtue in finance. Firms that expand revenue, assets, or market share often receive higher valuations and optimistic narratives.

    But growth is not the same thing as value creation. In fact, growth can destroy value when it requires large reinvestment and earns returns below the cost of capital.

    This post explains why that happens, why capital intensity matters, and how to evaluate growth using a simple economic lens: ROIC versus WACC.


    Growth and value are not the same thing

    The key condition for value creation is straightforward:

    A firm creates value when its return on invested capital (ROIC) exceeds its weighted average cost of capital (WACC).

    McKinsey defines economic profit in the same terms: the spread between ROIC and WACC or, in absolute terms, that spread multiplied by invested capital. See their discussion of economic profit here: McKinsey – Global economic profit.

    This leads to a simple but often overlooked implication:

    • If ROIC > WACC, growth tends to increase value.
    • If ROIC < WACC, growth tends to destroy value.

    Capital intensity is the hidden variable in growth stories

    Two firms can grow at the same pace and still have very different economic outcomes. The difference is often capital intensity – how much incremental capital is needed to produce incremental output.

    • Capital-light growth can scale with relatively little new investment.
    • Capital-intensive growth requires continuous reinvestment just to keep expanding.

    McKinsey has an older but still relevant note on why return-on-capital comparisons behave differently when invested capital is low: McKinsey – Comparing performance when invested capital is low.

    The practical takeaway is that growth metrics (revenue growth, EBITDA growth, even earnings growth) do not tell you whether growth is value-creating unless you also understand the capital required to generate it.


    Why earnings growth can be misleading

    Accounting earnings can rise even when economic value is falling. The typical reason is that earnings do not explicitly charge the firm for the full cost of capital used to produce them.

    This is the motivation behind economic profit and EVA-style thinking. Damodaran provides a clear overview of EVA and the economic profit logic here: Damodaran – Economic Value Added (EVA).

    In plain terms: a firm can report higher profits while becoming a worse business if it needs an even larger capital base to generate those profits at returns below its cost of capital.
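
A small worked example makes the point concrete. The firm, its numbers, and the 9% cost of capital are made up, and after-tax operating profit (NOPAT) is assumed as the profit measure.

```python
WACC = 0.09  # assumed cost of capital

# Made-up firm: it doubles its capital base, but the new capital earns only 5%.
year_1 = {"invested_capital": 1_000.0, "nopat": 100.0}
year_2 = {"invested_capital": 2_000.0, "nopat": 150.0}

def economic_profit(nopat, invested_capital, wacc):
    """After-tax operating profit minus a charge for the capital used."""
    return nopat - wacc * invested_capital

ep_1 = economic_profit(year_1["nopat"], year_1["invested_capital"], WACC)
ep_2 = economic_profit(year_2["nopat"], year_2["invested_capital"], WACC)

# Reported profit grows, while profit after the capital charge shrinks.
profit_growth = year_2["nopat"] / year_1["nopat"] - 1.0
```

Profit rises 50%, from 100 to 150, yet economic profit falls from +10 to -30 because the extra 1,000 of capital costs 90 a year and earns only 50.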


    The real test: incremental returns versus the cost of capital

    To judge whether expansion creates value, focus on the economics of the next unit of growth:

    • What is the incremental return on the new capital being invested?
    • Is that incremental return above or below WACC?

    A concise way to frame it is with economic profit:

    Economic profit = (ROIC – WACC) * invested capital.

    For a practitioner-oriented explanation of the economic profit formula and intuition, see: Wall Street Prep – Economic profit.
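
The incremental test and the formula above can be sketched together. The two expansion projects and the 8% WACC are hypothetical; both projects add the same profit but need very different amounts of capital.

```python
def incremental_roic(delta_nopat, delta_invested_capital):
    """Return earned on the newly invested capital only."""
    return delta_nopat / delta_invested_capital

def economic_profit(roic, wacc, invested_capital):
    """Economic profit = (ROIC - WACC) * invested capital."""
    return (roic - wacc) * invested_capital

WACC = 0.08  # assumed cost of capital

# Two made-up projects, each adding 20 of after-tax operating profit.
projects = {
    "capital_light": {"delta_nopat": 20.0, "delta_capital": 100.0},
    "capital_heavy": {"delta_nopat": 20.0, "delta_capital": 400.0},
}

results = {}
for name, p in projects.items():
    roic = incremental_roic(p["delta_nopat"], p["delta_capital"])
    results[name] = economic_profit(roic, WACC, p["delta_capital"])
```

The capital-light project earns 20% on new capital and adds 12 of economic profit. The capital-heavy one earns 5% and subtracts 12, even though both show identical profit growth.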


    The broader framework in my Master’s thesis is based on the same economic logic: value creation is about the spread between returns on capital and the cost of capital, scaled by the capital employed and related to prices. That framework is applied to explain differences in average stock returns across firms.

    If you want the full empirical and methodological details, you can read the thesis here: Value-Creation Pricing Factor (PDF).


    Implications for investors

    Growth should not be evaluated in isolation. A few questions help keep the analysis grounded:

    • How much invested capital is required to sustain growth?
    • Is incremental ROIC above WACC, or below it?
    • Is growth improving capital efficiency, or diluting it?

    None of these questions requires a perfect model. They just force the conversation away from growth narratives and toward capital discipline.


    Conclusion

    Growth is not inherently good or bad. Its value depends on the return the firm earns on the capital required to grow.

    When growth is capital-intensive and incremental returns fall short of the cost of capital, expansion can destroy value even as revenue and earnings rise. ROIC versus WACC is a simple framework, but it remains one of the most effective ways to separate value-creating growth from value-destroying growth.

  • Book-Based vs Market-Based WACC: Explaining the Cross-Section of Returns

    The Weighted Average Cost of Capital (WACC) is most often estimated using market values and market-implied discount rates. This approach is well aligned with valuation and with the idea that markets are forward-looking.

    In empirical asset pricing, however, the objective is different. Rather than estimating intrinsic value, the goal is to explain why firms with certain characteristics earn systematically different average stock returns. In this setting, historical and book-based measures of capital costs – interpreted as realized financing costs rather than required returns – can be informative.

    This post discusses the role of book-based WACC in cross-sectional return analysis, highlights important limitations, and summarizes evidence from my Master’s thesis showing that book-based costs of capital were more informative than market-based alternatives when explaining the cross-section of returns.


    What is meant by book-based WACC

    A book-based WACC is constructed using:

    • Book values of equity and debt from the balance sheet
    • Realized equity financing costs, measured as cash remuneration to equity holders (dividends and net share repurchases) relative to book equity
    • Contractual or realized costs of debt, such as interest expense relative to book debt

    Importantly, the equity component is not interpreted as a required or expected return, but as the firm’s ex-post cash cost of servicing equity capital.

    By contrast, a market-based WACC relies on market capitalization, market values of debt, and discount rates inferred from current prices and expected returns.
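
To make the book-based construction concrete, here is a minimal sketch under simplifying assumptions: book-value weights, equity cost as cash paid to shareholders over book equity, debt cost as interest expense over book debt, and no tax adjustment on debt. The function name and inputs are illustrative and may differ from the exact construction used in the thesis.

```python
def book_based_wacc(book_equity, book_debt, dividends, net_repurchases, interest_expense):
    """Book-value-weighted average of realized equity and debt servicing costs.

    No tax shield is applied to the cost of debt (an assumption of this sketch).
    """
    if book_equity <= 0 or book_debt <= 0:
        raise ValueError("this sketch requires positive book equity and book debt")
    total_capital = book_equity + book_debt
    cost_of_equity = (dividends + net_repurchases) / book_equity  # realized cash cost
    cost_of_debt = interest_expense / book_debt
    return (
        (book_equity / total_capital) * cost_of_equity
        + (book_debt / total_capital) * cost_of_debt
    )

# Made-up balance-sheet and cash-flow figures for one firm-year.
wacc = book_based_wacc(
    book_equity=600.0,
    book_debt=400.0,
    dividends=18.0,
    net_repurchases=12.0,
    interest_expense=24.0,
)
```

With these inputs the equity cost is 5%, the debt cost is 6%, and the book-based WACC is 5.4%. Under these definitions the measure reduces to total cash paid to capital providers divided by total book capital, which is why it moves with payout and issuance decisions rather than with market prices.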

    Both approaches are internally coherent. Their relevance depends on the research question.


    Why book-based WACC can be informative in asset pricing

    Book-based WACC captures historical financing conditions and realized capital servicing costs that reflect past issuance and payout decisions. These characteristics tend to evolve slowly and are often persistent over time.

    In cross-sectional return studies, persistence is frequently important. Many established return predictors are derived from accounting data rather than market prices, including measures of profitability, investment, and leverage. Book-based capital costs naturally belong to this broader class of slow-moving firm characteristics.

    A further advantage is that book-based WACC is largely insulated from contemporaneous price movements. Because it is constructed from accounting quantities and realized cash flows, it avoids mechanical links between explanatory variables and returns that can complicate interpretation when market-based measures are used.


    Evidence from the Master’s thesis

    In my Master’s thesis, a book-based cost of capital proved more informative than a market-based cost of capital when explaining the cross-section of stock returns.

    When incorporated into measures of economic value creation, the book-based cost of capital exhibited stronger and more robust associations with average returns across portfolios. Market-based cost of capital measures, while theoretically appealing from a valuation perspective, showed weaker explanatory power in this empirical setting.

    The full thesis is available here: Value-Creation Pricing Factor (PDF).


    Limitations of book-based WACC

    At the same time, book-based WACC has important limitations.

    Historical financing costs may no longer reflect firms’ current risk profiles. Business risk, leverage, and competitive conditions can change, making book-based measures potentially stale.

    In addition, the explanatory power of book-based WACC may reflect delayed market adjustment rather than compensation for risk. Because book-based measures update slowly, they may proxy for information that markets incorporate only gradually.

    Finally, book-based WACC aggregates several distinct elements, including historical financing conditions, managerial payout and issuance decisions, and accounting conventions. This complicates interpretation and makes it difficult to attribute explanatory power to a single underlying mechanism.


    Brief context from the literature

    Existing theory does not provide a clear framework for why historical, cash-based financing costs should dominate market-implied discount rates in explaining the cross-section of returns.

    In corporate finance, market-value weights and market-implied discount rates are generally viewed as theoretically correct for WACC in valuation. The use of book-value weights is typically justified as a practical approximation rather than a normative benchmark (see, for example, Fernández (2011) and Damodaran’s valuation materials on his NYU Stern page).

    At the same time, a well-established valuation and performance-measurement literature applies a cost of capital as a charge to book capital, most notably in residual income and EVA frameworks. In these models, book values define the capital base, while the required return itself remains market-based (see Ohlson, 1995).


    Open questions

    Against this backdrop, the finding that a book-based cost of capital is more informative than a market-based alternative in explaining the cross-section of returns should be interpreted as an empirical regularity that points to a missing mechanism in standard asset-pricing benchmarks.

    One candidate explanation is that historical financing costs embed managerial timing skill in capital issuance and payout decisions. Firms differ systematically in their ability to issue equity or debt, and to distribute cash, when financing conditions are favorable. These decisions accumulate over time and are reflected in realized, book-based financing costs.

    In contrast, market-implied costs of capital reflect prevailing market sentiment and discount rates at a point in time, but abstract from the conditions under which existing capital was raised and serviced. As a result, they do not capture cross-sectional differences in firms’ realized financing outcomes arising from heterogeneous issuance timing ability.

    If financing timing is a persistent managerial attribute, then book-based capital costs may serve as a sufficient statistic for the long-run interaction between managerial decisions and capital market conditions. This provides a natural explanation for why book-based measures outperform contemporaneous market-based costs in explaining returns, even though the latter remain theoretically appropriate under frictionless markets.

    Understanding how issuance timing skill is priced, how persistent it is across firms, and whether it reflects informational advantages or agency-driven behavior remains an open avenue for future research.


    Conclusion

    Book-based WACC is not a substitute for market-based discount rates in valuation. Its relevance instead lies in empirical asset pricing, where the objective is to explain cross-sectional differences in realized stock returns rather than to infer intrinsic value.

    Evidence from the Master’s thesis indicates that historical, cash-based financing costs contain economically meaningful information in this setting. One plausible interpretation is that book-based measures reflect firm-specific histories of financing decisions, including managers’ ability to time equity and debt issuance and to service capital under favorable conditions. These realized financing outcomes accumulate in book values but are not captured by contemporaneous market-implied costs of capital.

    While this interpretation offers a coherent economic rationale for the empirical results, a fully developed theoretical framework linking issuance timing, persistence in financing conditions, and expected returns is still lacking. Consequently, the findings should be interpreted with appropriate caution.

    Clarifying the mechanisms through which historical financing costs become priced in the cross-section of returns remains an important direction for future research.