Congressional Trades Fund

83K+ trades analyzed · 11 autonomous agents · 222% CAGR validated · 30 days to build

What We Built and Why It Matters

The STOCK Act requires members of Congress to disclose their stock trades — but the 45-day filing delay and sheer volume of disclosures make it nearly impossible for individual investors to act on the data. Tools exist to surface the filings, but none close the loop from disclosure to trade execution.

We built a system that does. The Congressional Trades Fund is a fully automated pipeline that ingests disclosures, enriches them with lobbying and government contract intelligence, scores every signal with machine learning, and executes trades through a brokerage API — with mandatory human approval at two separate gates.

Public Data, Private Advantage

Members of Congress trade stocks while shaping the laws that move markets. Their disclosures are public, but the delay between execution and disclosure means most people never see them in time to act.

Hundreds of filings land every month. Manually tracking them, cross-referencing with lobbying records, evaluating committee conflicts, and making timely decisions is impractical for any individual investor. Existing tools show you the filings — they don't tell you which ones matter, and they certainly don't trade on them.

The opportunity: Build a system that closes the entire loop — from the moment a disclosure is filed to a trade placed in a brokerage account — with intelligent filtering at every stage.

An 11-Agent Automated Pipeline

The system is built as a pipeline of eleven specialized agents, each responsible for a single stage of the workflow. This isn't a monolith — each agent has a defined input, a defined output, and can be tested, replaced, or improved independently.

Continuous Pipeline Architecture
(Pipeline diagram) Agent 11, the Orchestrator, coordinates all stages: 01 Data Acquisition → 02 Processing → 03 Analytics → 04 Model Development → 05 Signal Generation → 06 Approval ⚑ → 07 Risk Check ⚑ → 08 Execution → 09 Reconciliation → 10 Reporting — a continuous cycle driven by seven automated jobs daily.

Agent 1 — Data Acquisition. Pulls congressional trade disclosures from the Quiver API, fetches price data from Yahoo Finance, and ingests lobbying records and government contract data. Rate-limit aware with scoped fetching: historical prices are pulled once when trades arrive, current prices only for open positions.
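
Rate-limit awareness can be as simple as enforcing a minimum spacing between outbound API calls. A minimal sketch; the class name and the per-minute budget are illustrative, not the production values:

```python
import time

class RateLimiter:
    """Enforce a minimum interval between outbound API calls."""
    def __init__(self, calls_per_minute: float):
        self.interval = 60.0 / calls_per_minute
        self.last = 0.0

    def wait(self):
        """Sleep just long enough to respect the configured call rate."""
        elapsed = time.monotonic() - self.last
        if elapsed < self.interval:
            time.sleep(self.interval - elapsed)
        self.last = time.monotonic()

# Hypothetical budget: one Quiver request per second.
quiver_limiter = RateLimiter(calls_per_minute=60)
# Call quiver_limiter.wait() before each request.
```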

Agent 2 — Data Processing. Validates and cleans incoming trades. Calculates disclosure lag, enriches records with price data on both dates, and flags data quality issues. The dataset achieved an 86.6% quality score across 83,000+ records.
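
The disclosure-lag calculation is the simplest enrichment step: the gap between when a member executed a trade and when the public filing appeared. A sketch with a hypothetical helper name:

```python
from datetime import date

def disclosure_lag_days(trade_date: date, filing_date: date) -> int:
    """Days between trade execution and its public disclosure."""
    return (filing_date - trade_date).days

# A trade executed Jan 3, 2024 and filed Feb 10, 2024 surfaced 38 days later.
lag = disclosure_lag_days(date(2024, 1, 3), date(2024, 2, 10))
```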

Agent 3 — Performance Analytics. Multi-dimensional analysis across party, chamber, sector, trade size, and individual member. Produces performance rankings with composite scoring — answering the core question: which members of Congress are actually good at picking stocks?

Agent 4 — Model Development. Walk-forward backtesting engine evaluating 18 scoring models across expanding time windows. Handles smart retraining — quick daily updates when new data arrives, full weekly comparisons on Sundays.
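
The smart-retraining policy described above can be sketched as a small decision function. The function name and the "skip when no new data" rule are assumptions for illustration; the source only specifies daily quick updates and full Sunday comparisons:

```python
from datetime import date

def retrain_mode(today: date, new_rows: int) -> str:
    """Pick a retraining mode: full model comparison on Sundays,
    quick update when fresh disclosures arrived, otherwise skip."""
    if today.weekday() == 6:  # Sunday
        return "full"
    if new_rows > 0:
        return "quick"
    return "skip"

mode = retrain_mode(date(2025, 1, 5), new_rows=12)  # 2025-01-05 is a Sunday
```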

Agent 5 — Signal Generation. The decision engine. Applies the trained model to recent disclosures, layers in lobbying and contract intelligence, checks committee conflicts, and produces scored buy/sell signals. Nine trigger-based exit conditions replace arbitrary holding periods.
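
Trigger-based exits evaluate a set of named conditions against each open position rather than counting days to a fixed horizon. A sketch with four illustrative triggers; the source does not enumerate the nine production conditions, and the thresholds here are invented:

```python
def exit_triggers(position: dict) -> list:
    """Return the names of exit conditions that fired for a position.
    Trigger names and thresholds are illustrative, not the production set."""
    triggers = {
        "stop_loss": position["return_pct"] <= -0.15,
        "take_profit": position["return_pct"] >= 0.40,
        "max_hold": position["days_held"] >= 180,
        "member_sold": position["member_filed_sale"],
    }
    return [name for name, fired in triggers.items() if fired]

fired = exit_triggers(
    {"return_pct": 0.45, "days_held": 30, "member_filed_sale": False}
)
```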

Agents 6 & 7 — Approval & Risk Management. Trade proposals are sent to Telegram with inline keyboard buttons — no trade moves forward without explicit human approval. Eight risk checks run per signal, including position limits, sector concentration, cash reserves, and correlation analysis. Oversized positions are automatically reduced. A second human approval is required before execution.
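
An approval prompt with inline buttons maps onto the Telegram Bot API's `sendMessage` call with an `inline_keyboard` reply markup. This sketch only builds the request body; the chat id, message wording, and `approve:`/`reject:` callback scheme are placeholders, not the production protocol:

```python
import json

def approval_payload(chat_id: int, signal_id: str,
                     ticker: str, action: str, score: float) -> dict:
    """Build the body of a Telegram sendMessage call with
    Approve/Reject inline buttons."""
    return {
        "chat_id": chat_id,
        "text": f"Proposed {action} {ticker} | signal score {score:.2f}. Approve?",
        "reply_markup": json.dumps({
            "inline_keyboard": [[
                {"text": "Approve", "callback_data": f"approve:{signal_id}"},
                {"text": "Reject", "callback_data": f"reject:{signal_id}"},
            ]]
        }),
    }

# POST this to https://api.telegram.org/bot<TOKEN>/sendMessage; the bot
# receives a callback_query update when a button is pressed.
payload = approval_payload(12345, "sig-7", "LMT", "BUY", 0.87)
```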

Agent 8 — Trade Execution. Submits approved orders to Alpaca's brokerage API. Includes stale order reconciliation on startup and deduplication to prevent double-execution.
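
Deduplication against double-execution usually hinges on a deterministic order identifier: if the same signal is resubmitted, it maps to the same id and the broker rejects the duplicate instead of filling it twice. The field name matches Alpaca's `client_order_id` concept, but this derivation is an illustrative sketch, not the production scheme:

```python
import hashlib

def client_order_id(signal_id: str, ticker: str, side: str) -> str:
    """Derive a stable order id from the signal, so retries and restarts
    cannot produce a second fill for the same decision."""
    raw = f"{signal_id}|{ticker}|{side}"
    return hashlib.sha256(raw.encode()).hexdigest()[:32]

oid = client_order_id("sig-42", "LMT", "buy")
```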

Agent 9 — Reconciliation. Compares internal ledger against broker-reported positions and cash. Across all validation runs: zero discrepancies.
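
Position reconciliation reduces to a set comparison between the internal ledger and the broker's reported holdings. A minimal sketch, assuming both sides are expressed as ticker-to-quantity maps:

```python
def reconcile(ledger: dict, broker: dict) -> list:
    """Return (ticker, internal_qty, broker_qty) for every disagreement.
    An empty list means the two books match."""
    tickers = set(ledger) | set(broker)
    return [
        (t, ledger.get(t, 0), broker.get(t, 0))
        for t in sorted(tickers)
        if ledger.get(t, 0) != broker.get(t, 0)
    ]

diffs = reconcile({"LMT": 10, "NVDA": 5}, {"LMT": 10, "NVDA": 4})
```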

Agent 10 — Reporting. Generates daily, weekly, and monthly performance reports delivered via Telegram.

Orchestrator. Coordinates the entire pipeline via APScheduler. Seven automated jobs run throughout the day. Every job sends a Telegram summary on completion.

Layered Signals, Compounding Edge

The system didn't start with four data sources. It evolved through three iterations, each adding context that sharpened signal quality.

V1 — Congressional Trades. Ingested 83,000+ trade disclosures from 2012 to present. A fixed-rule strategy filtering by member quality and disclosure speed generated 232% total return over 2020–2024 — roughly 2.4x the S&P 500.

V2 — Lobbying & Government Contracts. Expanded to 25,400+ lobbying records across 1,220 tickers and 48,700 government contracts across 2,200 tickers. Lobbying spend is quartile-bucketed with graduated score boosts. Contract trends add another dimension.
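
Quartile bucketing with graduated boosts can be sketched as a rank lookup against the sorted distribution of lobbying spend. The boost magnitudes below are illustrative, not the production values:

```python
import bisect

def quartile_boost(spend: float, sorted_spends: list,
                   boosts=(0.00, 0.05, 0.10, 0.15)) -> float:
    """Map a ticker's lobbying spend to a graduated score boost by quartile
    of the observed distribution."""
    n = len(sorted_spends)
    rank = bisect.bisect_right(sorted_spends, spend)  # spends <= this one
    quartile = min(3, (rank * 4 - 1) // n) if rank else 0
    return boosts[quartile]

spends = [1, 2, 3, 4, 5, 6, 7, 8]  # toy distribution
top_boost = quartile_boost(8, spends)
```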

V3 — Committee Conflict Detection. Maps 532 members across 230 committees to 79 policy issues. When a member of the Armed Services Committee buys defense contractor stock, the system detects the conflict and applies a 2.50x multiplier. Structured lookup using BioGuide IDs — no fuzzy name matching.
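
Structured conflict detection is a chain of exact-key lookups: BioGuide ID to committees, committees to policy issues, ticker to policy issues, then an intersection test. The tables below are a hypothetical subset for illustration; only the 2.50x multiplier comes from the source:

```python
# Hypothetical subset of the mapping tables, keyed by BioGuide ID.
MEMBER_COMMITTEES = {"A000001": {"armed_services"}}
COMMITTEE_ISSUES = {"armed_services": {"defense"}}
TICKER_ISSUES = {"LMT": {"defense"}, "AAPL": {"technology"}}

def conflict_multiplier(bioguide_id: str, ticker: str,
                        boost: float = 2.50) -> float:
    """Apply the conflict boost when a member's committee issues overlap
    the ticker's policy issues. Exact keys only, no fuzzy name matching."""
    member_issues = set()
    for committee in MEMBER_COMMITTEES.get(bioguide_id, set()):
        member_issues |= COMMITTEE_ISSUES.get(committee, set())
    return boost if member_issues & TICKER_ISSUES.get(ticker, set()) else 1.0
```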

Key insight: Layered filtering produces 4–9x better returns than any single filter alone. Each data source adds modest individual value. Combined, they compound.

18 Models. One Deliberate Choice.

Scoring congressional trades is not a standard classification problem. The signal is noisy, the labels are ambiguous, and the data has strong regime shifts between market environments.

The system evaluates six model architectures: WeightedScorers, MLScorers (gradient-boosted trees on 21 features), HybridScorer, EnsembleScorers, GatedScorers, and BlendScorers.

Walk-forward validation uses expanding windows — train on all data up to a cutoff, score the next period against actual returns. No look-ahead bias.
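
The expanding-window split described above can be sketched in a few lines, assuming the data is grouped into ordered periods:

```python
def expanding_splits(periods: list, min_train: int = 2):
    """Expanding-window walk-forward: train on everything up to a cutoff,
    evaluate on the single next period. No future data leaks into training."""
    for cut in range(min_train, len(periods)):
        yield periods[:cut], periods[cut]

splits = list(expanding_splits(["2021", "2022", "2023", "2024"]))
```

Each successive split trains on a strictly longer history, mirroring how the model would have been retrained in live operation.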

The most aggressive model posted a 626% CAGR in its best splits. We didn't pick it. It had a losing split in one validation window. The winning model is a BlendScorer at 60/40 — 222% CAGR with zero losing splits across all validation windows. We gave up upside ceiling for downside consistency, because in production what kills you isn't missing the best month — it's the drawdown that shakes you out of the strategy entirely.
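
The 60/40 blend itself is a fixed weighted average of two component model scores. The source does not name the components, so this is a sketch of the combination step only:

```python
def blend_score(score_a: float, score_b: float,
                weight_a: float = 0.60) -> float:
    """BlendScorer-style combination: fixed 60/40 weighted average of two
    component scores (component models are not specified in the source)."""
    return weight_a * score_a + (1 - weight_a) * score_b

blended = blend_score(0.8, 0.5)
```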

The production model retrains automatically. Quick retrains run daily (~30 seconds). Full model comparisons run weekly on Sundays.

Results & Validation

Validated through eleven end-to-end runs, each exercising all thirteen pipeline phases including full portfolio resets.

Production Model — Walk-Forward (2024–2026)
  CAGR: 222%
  Profit Factor: 3.11
  Win Rate: 55.1%
  Losing Splits: 0 / 4
  Stability Score: 0.697

Earlier Fixed-Rule Backtest (2020–2024)
  Total Return: 232.8%
  vs. S&P 500: 95.3%
  Alpha: +137.5 pp
  Sharpe Ratio: 0.73
  Win Rate: 66.3%

How This Was Built in 30 Days

Everyone has access to AI coding tools now. The tools are not the differentiator. Knowing what to build — and how to direct the tools toward a production-grade result — is.

This system was built across 21 development sessions spanning 30 days, using Claude Code as the AI-assisted development tool. That timeline would have been unthinkable for a system this complex even two years ago. But the speed didn't come from the tool alone — it came from years of experience in backend data infrastructure, pipeline architecture, and ML systems.

The eleven-agent pipeline design, the two-gate Telegram approval flow, the decision to use trigger-based exits instead of fixed holding periods, the choice to build reconciliation before scaling — none of that came from an AI suggestion. Those are patterns from years of building enterprise data systems. An AI tool can write a SQLAlchemy model or wire up an API integration in minutes. It cannot tell you that your pipeline needs a reconciliation agent, or that your ML model selection should penalize volatility.

The pattern is reusable. Swap out congressional trades for any other data source — earnings transcripts, patent filings, supply chain signals — and the same pipeline architecture applies. Eleven agents. Human approval gates. ML scoring with automated retraining. Reconciliation. Reporting.

Your Complex Data Workflow, Automated

If your organization has a complex data workflow that needs to be automated — collecting from multiple sources, applying intelligence to filter and score, executing decisions with audit trails — that's exactly what we do.

Schedule a Free Consultation

Built by Drip AI & Data LLC. The Congressional Trades Fund is a research and paper-trading system. Past backtested performance does not guarantee future results. This is not investment advice.
