Now in private beta — apply for API access

Financial data that's already AI-ready

Vector embeddings of earnings calls and SEC filings, plus derived quant signals, delivered as Iceberg tables into your Snowflake, BigQuery, or Databricks. Bitemporal, updated nightly, no ETL.

REST API — embeddings endpoint
curl https://api.vectorfinancials.com/v1/embeddings/AAPL \
  -H "X-API-Key: vf_sk_••••••••••" \
  -G -d "fiscal_period=2024-Q3&limit=5"

# Returns: [{ticker, fiscal_period, chunk_idx,
#            embedding[1536], effective_ts, knowledge_ts}]
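The same request can be sketched in Python using only the standard library. The base URL, header name, and query parameters are taken from the curl call above; the function names are hypothetical, and the response is assumed to be a JSON list of rows as the comment above suggests.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example; everything else here is illustrative.
BASE_URL = "https://api.vectorfinancials.com/v1"


def build_embeddings_request(ticker, api_key, fiscal_period=None, limit=None):
    """Build (but do not send) a GET request for one ticker's embeddings."""
    params = {}
    if fiscal_period is not None:
        params["fiscal_period"] = fiscal_period
    if limit is not None:
        params["limit"] = limit
    url = f"{BASE_URL}/embeddings/{urllib.parse.quote(ticker)}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def fetch_embeddings(ticker, api_key, **kwargs):
    """Send the request and decode the body (assumed to be a JSON list of rows)."""
    req = build_embeddings_request(ticker, api_key, **kwargs)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling `fetch_embeddings("AAPL", api_key, fiscal_period="2024-Q3", limit=5)` would issue the same request as the curl command, given a valid key.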

The problem

Raw financial data isn't model-ready

Building an ML pipeline on financial data means months of plumbing: scraping filings, chunking transcripts, computing accounting ratios, normalizing schemas, and engineering point-in-time correctness. VectorFin ships all of that as a data product — so your team can focus on alpha, not infrastructure.

What you get

Two product lines — embeddings and signals — in one API and one Iceberg catalog.

Transcript & Filing Embeddings

Every earnings call and 10-K/10-Q chunked per fiscal period and vectorized with Google text-embedding-004. Drop into your RAG pipeline or similarity search — no preprocessing required.
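As a rough illustration of the similarity-search use case, the sketch below ranks chunks by cosine similarity to a query vector. The row fields (ticker, fiscal_period, chunk_idx, embedding) follow the API response shape shown earlier; the toy 3-dimensional vectors and the helper names are made up for the example, and a real query vector would need to come from the same embedding model as the stored chunks.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=3):
    """Return the k (score, row) pairs most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, r["embedding"]), r) for r in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy rows: real embeddings are much longer; these values are invented.
rows = [
    {"ticker": "AAPL", "fiscal_period": "2024-Q3", "chunk_idx": 0, "embedding": [1.0, 0.0, 0.0]},
    {"ticker": "AAPL", "fiscal_period": "2024-Q3", "chunk_idx": 1, "embedding": [0.0, 1.0, 0.0]},
    {"ticker": "AAPL", "fiscal_period": "2024-Q3", "chunk_idx": 2, "embedding": [0.7, 0.7, 0.0]},
]
best = top_k([1.0, 0.1, 0.0], rows, k=2)
```

For more than a few thousand chunks, a vector index or the warehouse's own vector functions would likely be a better fit than a linear scan like this one.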

Derived Quant Signals

Piotroski F-score, Altman Z-score, Beneish M-score, regime classification (safe/grey/distress), GARCH volatility forecasts, and anomaly flags, updated nightly across 5,000+ tickers.
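To make one of these signals concrete, here is a minimal sketch of the classic Altman Z-score (the original 1968 formulation for public manufacturers) together with its conventional safe/grey/distress cutoffs. Whether VectorFin uses this exact variant, or derives its regime classification from these zones, is an assumption; the function and parameter names are illustrative only.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, total_liabilities, sales, total_assets):
    """Classic Altman Z-score for a public manufacturing firm."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def altman_zone(z):
    """Map a Z-score to the conventional zones (cutoffs 1.81 and 2.99)."""
    if z > 2.99:
        return "safe"
    if z >= 1.81:
        return "grey"
    return "distress"


# Invented balance-sheet figures, in any consistent currency unit.
z = altman_z(working_capital=20, retained_earnings=30, ebit=10,
             market_value_equity=120, total_liabilities=60,
             sales=150, total_assets=100)
zone = altman_zone(z)
```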

Native Iceberg Delivery

Data lives as Apache Iceberg tables on GCS, served via Polaris catalog. Mount it in Snowflake, BigQuery, or Databricks as a native external table — no ETL, no replication, no pipelines.

Bitemporal by Design

Every record carries both effective_ts (when it was true in the world) and knowledge_ts (when we learned it). Point-in-time backtests are safe — no lookahead bias, no data leakage.
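A point-in-time read can be sketched as follows: for each (ticker, effective_ts) pair, keep only the most recent version whose knowledge_ts is on or before the backtest date. The effective_ts and knowledge_ts fields come from the description above; the f_score column, the sample values, and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    """Convenience constructor for timezone-aware UTC timestamps."""
    return datetime(year, month, day, tzinfo=timezone.utc)


def as_of(records, as_of_ts):
    """Return, per (ticker, effective_ts), the latest version known at as_of_ts."""
    latest = {}
    for rec in records:
        if rec["knowledge_ts"] > as_of_ts:
            continue  # not yet known on the backtest date, so it must be ignored
        key = (rec["ticker"], rec["effective_ts"])
        held = latest.get(key)
        if held is None or rec["knowledge_ts"] > held["knowledge_ts"]:
            latest[key] = rec
    return list(latest.values())


# Invented example: a value first published in August, then revised in November.
records = [
    {"ticker": "AAPL", "effective_ts": utc(2024, 6, 30),
     "knowledge_ts": utc(2024, 8, 2), "f_score": 7},
    {"ticker": "AAPL", "effective_ts": utc(2024, 6, 30),
     "knowledge_ts": utc(2024, 11, 5), "f_score": 6},
]
september_view = as_of(records, utc(2024, 9, 1))
december_view = as_of(records, utc(2024, 12, 1))
```

A backtest dated in September sees only the original value, while one dated in December sees the revision. That filter on knowledge_ts is what keeps later revisions from leaking into earlier simulated dates.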

Nightly Pipeline

Embeddings and signals refresh every night. New earnings transcripts are embedded within hours of release. Iceberg tables are append-only, so earlier records are never overwritten and your queries see the latest snapshot by default.

Ready to Model

Not raw scraped data — derived signals already normalized for ML consumption. Piotroski, Altman, and Beneish scores computed and stored. Embeddings chunked per fiscal period at consistent granularity.

How it works

A nightly pipeline ingests, derives, and delivers — so the data in your warehouse is always fresh.

01

Ingest

Nightly jobs pull earnings transcripts and SEC filings, then chunk and embed each document per fiscal period.

02

Derive

Quant signals — Piotroski, Altman Z, Beneish M, regime, volatility, anomaly — are computed and appended to Iceberg tables.
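As an illustration of what a derived signal involves, the sketch below computes a Piotroski F-score from two consecutive fiscal-year snapshots by counting nine binary checks. It is a simplified version (ratios are scaled by same-period total assets rather than beginning-of-year assets), the field names are hypothetical, and it is not necessarily how VectorFin computes the score.

```python
def piotroski_f_score(cur, prev):
    """Count the nine Piotroski checks passed by `cur` relative to `prev` (0 to 9)."""
    def roa(p):
        return p["net_income"] / p["total_assets"]

    def leverage(p):
        return p["long_term_debt"] / p["total_assets"]

    def current_ratio(p):
        return p["current_assets"] / p["current_liabilities"]

    def gross_margin(p):
        return (p["revenue"] - p["cogs"]) / p["revenue"]

    def asset_turnover(p):
        return p["revenue"] / p["total_assets"]

    checks = [
        roa(cur) > 0,                                   # positive return on assets
        cur["cfo"] > 0,                                 # positive operating cash flow
        roa(cur) > roa(prev),                           # improving return on assets
        cur["cfo"] > cur["net_income"],                 # cash flow exceeds earnings
        leverage(cur) < leverage(prev),                 # falling long-term leverage
        current_ratio(cur) > current_ratio(prev),       # improving liquidity
        cur["shares_outstanding"] <= prev["shares_outstanding"],  # no dilution
        gross_margin(cur) > gross_margin(prev),         # improving gross margin
        asset_turnover(cur) > asset_turnover(prev),     # improving asset turnover
    ]
    return sum(checks)


# Invented figures for two consecutive fiscal years.
prev_year = {"net_income": 8, "total_assets": 100, "cfo": 10, "long_term_debt": 30,
             "current_assets": 40, "current_liabilities": 25, "revenue": 120,
             "cogs": 80, "shares_outstanding": 50}
cur_year = {"net_income": 12, "total_assets": 110, "cfo": 15, "long_term_debt": 28,
            "current_assets": 48, "current_liabilities": 26, "revenue": 140,
            "cogs": 90, "shares_outstanding": 50}
score = piotroski_f_score(cur_year, prev_year)
```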

03

Deliver

Access via REST API in minutes. On Pro, mount Iceberg tables directly in your Snowflake, BigQuery, or Databricks workspace.

Deliver to your existing data warehouse — no ETL

Snowflake

CREATE ICEBERG TABLE via Polaris catalog

BigQuery

Analytics Hub shared datasets

Databricks

Unity Catalog external Iceberg tables

REST API

JSON endpoints, all tiers

Simple, transparent pricing

Start free. Upgrade when you need more tickers or warehouse delivery. See full pricing

Free

$0 forever

Evaluate VectorFin with the top 100 tickers. No credit card required.

Start for free

Starter

$2,000 per month

For quant teams that need broad ticker coverage and bulk Parquet delivery.

Get started

Most popular

Pro

$10,000 per month

For hedge funds that need unlimited data delivered directly into their data warehouse.

Get started

Enterprise

Custom pricing

Dedicated infrastructure, SLAs, and white-glove onboarding.

Talk to us

Stop building pipelines. Start building alpha.

Get your API key in minutes. Your first 1,000 calls and the top 100 tickers are free, with no credit card required.

Get API Access — Free