Financial data that's already
AI-ready
Vector embeddings of earnings calls and SEC filings, plus derived quant signals — delivered as Iceberg tables into your Snowflake, BigQuery, or Databricks. Bitemporal, nightly updated, no ETL.
curl https://api.vectorfinancials.com/v1/embeddings/AAPL \
-H "X-API-Key: vf_sk_••••••••••" \
-G -d "fiscal_period=2024-Q3&limit=5"
# Returns: [{ticker, fiscal_period, chunk_idx,
#   embedding[1536], effective_ts, knowledge_ts}]
The problem
Raw financial data isn't model-ready
Building an ML pipeline on financial data means months of plumbing: scraping filings, chunking transcripts, computing accounting ratios, normalizing schemas, and engineering point-in-time correctness. VectorFin ships all of that as a data product — so your team can focus on alpha, not infrastructure.
What you get
Two product lines — embeddings and signals — in one API and one Iceberg catalog.
Transcript & Filing Embeddings
Every earnings call and 10-K/10-Q chunked per fiscal period and vectorized with Google text-embedding-004. Drop into your RAG pipeline or similarity search — no preprocessing required.
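As a sketch of the similarity-search use case: rank returned chunks against a query embedding with plain cosine similarity. The field names mirror the API response shown above; in production you would likely use a vector index or your warehouse's vector functions instead of a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, chunks, k=3):
    """Rank transcript chunks by similarity to a query embedding.

    `chunks` is a list of dicts carrying "embedding" and "chunk_idx"
    keys, as in the API response fields shown in the example above.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["embedding"]),
                    reverse=True)
    return ranked[:k]
```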
Derived Quant Signals
Piotroski F-score, Altman Z-score, Beneish M-score, regime classification (safe/grey/distress), GARCH volatility forecasts, and anomaly flags — nightly updated across 5,000+ tickers.
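For reference, the Piotroski F-score is the sum of nine binary tests on year-over-year fundamentals. A minimal sketch of the computation follows; the fundamental field names are illustrative, not VectorFin's delivered schema.

```python
def piotroski_f(cur, prev):
    """Piotroski F-score: nine binary profitability, leverage, and
    efficiency tests, each worth one point. `cur` and `prev` are dicts
    of fundamentals for the current and prior fiscal year
    (field names are illustrative)."""
    score = 0
    score += cur["net_income"] > 0                          # profitable
    score += cur["cfo"] > 0                                 # positive operating cash flow
    score += cur["roa"] > prev["roa"]                       # improving return on assets
    score += cur["cfo"] > cur["net_income"]                 # accrual check: CFO > net income
    score += cur["lt_debt_ratio"] < prev["lt_debt_ratio"]   # falling long-term leverage
    score += cur["current_ratio"] > prev["current_ratio"]   # improving liquidity
    score += cur["shares_out"] <= prev["shares_out"]        # no share dilution
    score += cur["gross_margin"] > prev["gross_margin"]     # improving gross margin
    score += cur["asset_turnover"] > prev["asset_turnover"] # improving asset turnover
    return int(score)
```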
Native Iceberg Delivery
Data lives as Apache Iceberg tables on GCS, served via Polaris catalog. Mount it in Snowflake, BigQuery, or Databricks as a native external table — no ETL, no replication, no pipelines.
Bitemporal by Design
Every record carries both effective_ts (when it was true in the world) and knowledge_ts (when we learned it). Point-in-time backtests are safe — no lookahead bias, no data leakage.
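The point-in-time guarantee can be sketched as: for each (ticker, effective_ts), keep only the latest version learned on or before the backtest's knowledge cutoff, so later restatements stay invisible. This is an illustrative filter over records shaped like the API response, not VectorFin's query layer.

```python
def as_of(records, knowledge_cutoff):
    """Point-in-time view over bitemporal records.

    For each (ticker, effective_ts) pair, keep the most recently
    learned record whose knowledge_ts is on or before the cutoff.
    Anything learned after the cutoff is dropped, so a backtest run
    "as of" that date cannot see future revisions.
    Timestamps are ISO-8601 strings, which compare chronologically.
    """
    latest = {}
    for r in records:
        if r["knowledge_ts"] > knowledge_cutoff:
            continue  # not yet known at the cutoff
        key = (r["ticker"], r["effective_ts"])
        if key not in latest or r["knowledge_ts"] > latest[key]["knowledge_ts"]:
            latest[key] = r
    return list(latest.values())
```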
Nightly Pipeline
Embeddings and signals refresh every night. New earnings transcripts are embedded within hours of release. Iceberg tables are append-only — your queries always see the latest snapshot.
Ready to Model
Not raw scraped data — derived signals already normalized for ML consumption. Piotroski, Altman, and Beneish scores computed and stored. Embeddings chunked per fiscal period at consistent granularity.
How it works
A nightly pipeline ingests, derives, and delivers — so the data in your warehouse is always fresh.
Ingest
Nightly jobs pull earnings transcripts and SEC filings, then chunk and embed each document per fiscal period.
Derive
Quant signals — Piotroski, Altman Z, Beneish M, regime, volatility, anomaly — are computed and appended to Iceberg tables.
Deliver
Access via REST API in minutes. On Pro, mount Iceberg tables directly in your Snowflake, BigQuery, or Databricks workspace.
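The REST access in the Deliver step can be sketched in Python with only the standard library. The endpoint, header, and query parameters mirror the curl example at the top of the page; the API key is a placeholder.

```python
import json
import urllib.parse
import urllib.request

API_KEY = "vf_sk_..."  # placeholder: your key from the dashboard
BASE = "https://api.vectorfinancials.com/v1"

def build_request(ticker, fiscal_period, limit=5):
    """Build the GET request used by the curl example above."""
    query = urllib.parse.urlencode({"fiscal_period": fiscal_period,
                                    "limit": limit})
    url = f"{BASE}/embeddings/{urllib.parse.quote(ticker)}?{query}"
    return urllib.request.Request(url, headers={"X-API-Key": API_KEY})

def fetch_embeddings(ticker, fiscal_period, limit=5):
    """Fetch and decode per-chunk embeddings for one fiscal period."""
    with urllib.request.urlopen(build_request(ticker, fiscal_period, limit)) as resp:
        return json.load(resp)
```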
Deliver to your existing data warehouse — no ETL
Snowflake
CREATE ICEBERG TABLE via Polaris catalog
BigQuery
Analytics Hub shared datasets
Databricks
Unity Catalog external Iceberg tables
REST API
JSON endpoints, all tiers
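On Snowflake, the mount is a short DDL sketch along these lines; the catalog endpoint, credentials, external volume, and table names below are placeholders (Pro onboarding supplies the actual Polaris endpoint and credentials).

```sql
-- Illustrative only: endpoint, credentials, and names are placeholders.
CREATE CATALOG INTEGRATION vectorfin_polaris
  CATALOG_SOURCE = POLARIS
  TABLE_FORMAT = ICEBERG
  REST_CONFIG = (CATALOG_URI = 'https://<polaris-endpoint>/api/catalog')
  REST_AUTHENTICATION = (
    TYPE = OAUTH,
    OAUTH_CLIENT_ID = '<client-id>',
    OAUTH_CLIENT_SECRET = '<client-secret>',
    OAUTH_ALLOWED_SCOPES = ('PRINCIPAL_ROLE:ALL')
  )
  ENABLED = TRUE;

CREATE ICEBERG TABLE signals
  CATALOG = 'vectorfin_polaris'
  EXTERNAL_VOLUME = '<gcs-external-volume>'
  CATALOG_TABLE_NAME = 'signals';
```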
Simple, transparent pricing
Start free. Upgrade when you need more tickers or warehouse delivery. See full pricing
Free
Evaluate VectorFin with the top 100 tickers. No credit card required.
Starter
For quant teams that need broad ticker coverage and bulk parquet delivery.
Pro
For hedge funds that need unlimited data delivered directly into their data warehouse.
Enterprise
Dedicated infrastructure, SLAs, and white-glove onboarding.
Stop building pipelines. Start building alpha.
Get your API key in minutes. First 1,000 calls and top 100 tickers are free — no credit card required.
Get API Access — Free