Raw-as-reported

The rule is one line: the first observation for a given (station, observed_at, report_type) is final. No corrections, no overwrites, no backfills. If NWS issues a correction an hour later, the original number is still what the market settled on. So the original number is still the number the SDK returns for that as-of.

Corrected-in-place (wrong)

# Most weather archives silently overwrite
# corrections. Your settlement replay drifts.
obs = archive.get("NYC", at="2026-04-02T12:00Z")
obs.temp_f
# 71  ← NWS correction issued Apr 4
#       (original report was 69)

Raw-as-reported (right)

# Mostly Right preserves what was actually
# reported at query time. Replay is exact.
import mostlyright
df = mostlyright.research(
  "KNYC",
  from_date="2026-04-02",
  to_date="2026-04-02",
  as_of="2026-04-02T12:00:00Z",
)
df.iloc[0]["obs_high_f"]
# 69  ← original METAR, as received

01 · Why this matters for settlement

A Kalshi weather contract settles on the number the NWS reported at the settlement time. If at 14:51Z the NYC station broadcast a temperature of 52°F, that contract settled on 52. An hour later, the station may publish a correction saying it was actually 51. Too late. The market already paid out.

If your backtest archive quietly replaced 52 with 51, your model now believes a contract settled on 51. When you size the next position against that belief, you are trading on a market that never existed.

Your backtest is only useful if it reproduces the data the market traded on. That is what raw-as-reported guarantees.

02 · The dedup rule

When the SDK fetches a window, for each (station, observed_at, report_type) key it keeps the first reported value:

1
if row.key not already seen:
2
    keep row
3
else:
4
    skip  # first-reported wins

The key is (station, observed_at, report_type). A later-arriving row with the same key is discarded, not merged. The SDK never substitutes a correction for an observation it has already surfaced under that key.

This is the opposite of standard data-engineering hygiene. In most pipelines, data quality means “catch errors and fix them.” Here, data quality means “replay the record exactly as it was the first time anyone saw it.”

03 · What we do keep about corrections

The SDK does not substitute the correction, but it does not drop it either. When AWC publishes a COR or RTD METAR (coded correction), the SDK surfaces it as a distinct row with a different observation_type and, if one is given, a different observed_at. If the correction carries the same observed_at as the original, it is kept as a separate correction row keyed by observation_type, not folded into the original observation.

A dedicated corrections feed is on the roadmap. For now, the SDK does not expose corrections separately. Settlement always uses the original observation_type="METAR" row.

04 · The `as_of` guarantee

Because the SDK returns the original reported value (never a later substitution), as_of queries are reproducible — re-running the same as_of returns the same rows:

1
import mostlyright
2

3
# Query A, run today:
4
a = mostlyright.research("KNYC", "2026-04-02", "2026-04-02", as_of="2026-04-02T12:00:00Z")
5

6
# Query A, run six months from today:
7
a_later = mostlyright.research("KNYC", "2026-04-02", "2026-04-02", as_of="2026-04-02T12:00:00Z")
8

9
# These return the same rows. Bit-for-bit.
10
assert a.equals(a_later)

You can cache indefinitely. You can version your model against a fixed as_of and know that re-running with that timestamp months later produces the same inputs.

05 · Where the rule applies, and where it does not

Observations. Covered, without exception. Even if we find a bug in our parser six months from now, we fix the parser forward; already-cached rows aren’t rewritten, and re-fetching with a past as_of still reproduces the original number.
Forecasts. Not covered in the same way. A forecast is a prediction about a future window. The same forecast model issues new runs every few hours. Each forecast run is a distinct row keyed by issued_at; you fetch a specific historical cycle, so replay of “what did the NBM think at 06Z?” works the same way.
Climate aggregates. Not covered. CLIMATE rollups are derivations, not observations. They can be re-derived on schema changes without breaking the settlement story, because the underlying observation values they derived from are re-fetchable raw-as-reported and don’t change.

The single rule, applied to the single place where it matters, gives you the one property you actually need: a backtest you can trust against a market that actually cleared.

Data sources for why AWC beats IEM beats GHCNh.
Observation schema for the 30 fields every raw row carries.
SDK overview for the query-side surface of the rule.