Source attribution
Most financial data APIs are black boxes: a price is “the price” and you have
no way to know where it came from. oneapi.finance does the opposite. Every
row we serve carries a meta.source and a meta.fetched_at, and we treat that
as a load-bearing part of the contract.
This page explains why source attribution exists, how to read it, and how to use it in production.
Why we do this
Three reasons.
- Reproducibility. If your screener flagged AAPL as a buy yesterday based on our pegRatio, you can reproduce that decision because the source of the underlying inputs is recorded. Black-box vendors cannot offer this.
- Defensive monitoring. When an upstream goes down or starts emitting garbage, the source field is your earliest warning. A spike in source = "stockanalysis" for a ticker that usually shows source = "yahoo_query1" tells you that Yahoo failed and we fell back. You can alert on that.
- Honest disclosure. We scrape public sources. We do not pretend otherwise. By labeling every row, we make the trust model explicit instead of papering over it.
What meta.source looks like
Every quote, time-series response, statistics row, and FX rate carries:
    "meta": {
      "source": "yahoo_query1",
      "fetched_at": "2026-05-04T20:14:32Z"
    }

fetched_at is when our worker pulled the data from source. It is not
necessarily when the data became live upstream — see delayed data
for the full freshness story.
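As a minimal sketch of reading this block in Python: the code below assumes the response has already been decoded into a dict and that fetched_at is always a UTC timestamp ending in Z, as in the example above. The parse_meta helper and the Meta type are hypothetical names for illustration, not part of any official client.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Meta:
    source: str
    fetched_at: datetime  # timezone-aware, normalized to UTC


def parse_meta(row: dict) -> Meta:
    """Extract the meta block from a decoded response row (hypothetical helper)."""
    meta = row["meta"]
    # Older Python versions of fromisoformat() reject a trailing "Z",
    # so rewrite it as an explicit UTC offset first.
    fetched_at = datetime.fromisoformat(meta["fetched_at"].replace("Z", "+00:00"))
    return Meta(source=meta["source"], fetched_at=fetched_at.astimezone(timezone.utc))


row = {"meta": {"source": "yahoo_query1", "fetched_at": "2026-05-04T20:14:32Z"}}
meta = parse_meta(row)
print(meta.source, meta.fetched_at.isoformat())
```

Parsing into a timezone-aware datetime up front keeps later age calculations from mixing naive and aware values.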
Source identifiers
The current set of source identifiers and what they cover:
| meta.source | Upstream | Covers |
|---|---|---|
| yahoo_query1 | query1.finance.yahoo.com | Quotes, time-series, statistics, dividends/splits, profile. Primary for retail tickers. |
| yahoo_query2 | query2.finance.yahoo.com | Failover for yahoo_query1. Same shape. |
| stockanalysis | stockanalysis.com | Statistics, fundamentals (TTM/quarterly), profile. Strong for non-US listings. |
| sec_edgar | data.sec.gov | Authoritative filings for US issuers. Income, balance, cash flow. |
| nbim | nbim.no | Limited use for sovereign-fund holdings (roadmap). |
| coingecko | coingecko.com | Crypto only. |
| exchangerate_host | exchangerate.host | FX rates. |
| internal_cache | (us) | Served from our cache. The original source is stored alongside but not surfaced when we do this. |
The list is stable but not closed: we add sources as we add coverage. Any addition is announced in the changelog.
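Because the list can grow, client code should probably tolerate identifiers it has not seen before. The sketch below is one way to do that, assuming you keep a local copy of the table above; KNOWN_SOURCES and describe_source are hypothetical names, and the mapping is only as current as the copy you maintain.

```python
from typing import Optional

# Local copy of the identifiers in the table above (an assumption: you maintain
# this yourself and update it when the changelog announces a new source).
KNOWN_SOURCES: dict[str, Optional[str]] = {
    "yahoo_query1": "query1.finance.yahoo.com",
    "yahoo_query2": "query2.finance.yahoo.com",
    "stockanalysis": "stockanalysis.com",
    "sec_edgar": "data.sec.gov",
    "nbim": "nbim.no",
    "coingecko": "coingecko.com",
    "exchangerate_host": "exchangerate.host",
    "internal_cache": None,  # served from cache; no upstream host to show
}


def describe_source(source: str) -> str:
    """Return a display label for a source, without failing on unknown identifiers."""
    if source not in KNOWN_SOURCES:
        return f"{source} (unrecognized source)"
    upstream = KNOWN_SOURCES[source]
    return source if upstream is None else f"{source} ({upstream})"


print(describe_source("sec_edgar"))
print(describe_source("some_future_source"))
```

Returning a label for unknown identifiers, rather than raising, means a newly added source shows up in your logs instead of breaking your pipeline.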
Practical patterns
Alert when a primary source fails
If your application normally sees source = "yahoo_query1" for a ticker and
you start seeing the fallback (stockanalysis), the upstream failed. Track
the source distribution over time and alert on shifts.
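The per-ticker check described above could be sketched as follows. This is an illustration under assumptions, not a prescribed method: it assumes each quote dict carries its ticker under a symbol key (a hypothetical field name; use whatever key your responses actually contain), and usual_sources and fallback_alerts are made-up helper names.

```python
from collections import Counter, defaultdict


def usual_sources(history):
    """Map each ticker to the source it was served from most often."""
    counts = defaultdict(Counter)
    for quote in history:
        counts[quote["symbol"]][quote["meta"]["source"]] += 1
    return {symbol: c.most_common(1)[0][0] for symbol, c in counts.items()}


def fallback_alerts(history, latest):
    """Return (symbol, usual_source, seen_source) for quotes served from an unusual source."""
    usual = usual_sources(history)
    return [
        (q["symbol"], usual[q["symbol"]], q["meta"]["source"])
        for q in latest
        if q["symbol"] in usual and q["meta"]["source"] != usual[q["symbol"]]
    ]


history = [
    {"symbol": "AAPL", "meta": {"source": "yahoo_query1"}},
    {"symbol": "AAPL", "meta": {"source": "yahoo_query1"}},
    {"symbol": "AAPL", "meta": {"source": "stockanalysis"}},
]
latest = [{"symbol": "AAPL", "meta": {"source": "stockanalysis"}}]
alerts = fallback_alerts(history, latest)
print(alerts)
```

Tickers with no history are skipped rather than flagged, since there is no baseline to compare against. For the aggregate view, counting sources across a whole batch is enough: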
    from collections import Counter

    def source_distribution(quotes):
        return Counter(q["meta"]["source"] for q in quotes if q.get("meta"))

    # Run nightly; alert if the distribution diverges from the previous week's.

Mark stale data in your UI
If you display a price to a user, surface the source and freshness:
    function QuoteFreshness({ quote }) {
      const fetchedAt = new Date(quote.meta.fetched_at);
      const ageMin = Math.round((Date.now() - fetchedAt.getTime()) / 60_000);
      return (
        <span className="quote-meta">
          {ageMin}m ago · {quote.meta.source}
        </span>
      );
    }

Refuse to act on stale fundamentals
For monthly screening, you might require statistics fetched within the last 24 hours:
    from datetime import datetime, timezone, timedelta

    def is_fresh(stats, max_age=timedelta(hours=24)):
        fetched = datetime.fromisoformat(stats["meta"]["fetched_at"].replace("Z", "+00:00"))
        return datetime.now(timezone.utc) - fetched < max_age

Provenance at the field level
For statistics specifically, each field can come from a different source (for
example, revenueTtm from sec_edgar and forwardPe from stockanalysis).
The database stores this in a provenance JSONB column. The API does not yet
expose it row-by-row. If you need field-level provenance, file a request — we
will add it as a meta.provenance map without breaking the existing shape.
What’s next
- Data sources policy — full disclosure of upstreams and refresh cadence.
- Delayed data — why prices are 15 minutes behind.