cvfile.org

One file. Three audiences. Real semantic search.

.cv bundles a designed PDF, a clean Markdown copy, a self-contained HTML rendering, and pre-computed BGE-M3 embeddings into a single PDF/A-3u file that opens in any PDF reader on day one.


Why .cv exists

Today a person sends three different artifacts to three different audiences: a polished PDF to recruiters, a Markdown copy to applicant tracking systems (ATS), and a web bio for their own site. The copies drift out of sync within weeks, and AI tools re-embed the same documents over and over.

.cv fixes all three problems with one file and one source of truth that opens in any PDF reader on day one. Bots that ask for text/markdown get the Markdown back. RAG pipelines that recognize the format read the pre-computed BGE-M3 vectors directly instead of re-embedding.

What ships today

Hello, .cv

# Build a .cv from a PDF, its Markdown copy, and an HTML rendering
cv pack \
  --pdf resume.pdf \
  --md resume.md \
  --html resume.html \
  --lang en \
  -o resume.cv

# Read it back any way you like
cv extract resume.cv --format md      # markdown stream
cv inspect resume.cv --json           # XMP + payload metadata
cv validate resume.cv --strict        # PDF/A-3u + cv-strict gate
cv search  resume.cv "kubernetes"     # semantic search via BGE-M3
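
One way to picture a semantic search over pre-computed vectors is a cosine-similarity ranking. The sketch below is only an illustration: the `cosine` and `rank` helpers, the chunk texts, and the three-dimensional toy vectors are all hypothetical, and it does not reflect how `cv search` is implemented or how a .cv file stores its embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, chunks):
    """Return chunk texts ordered from most to least similar to the query.

    `chunks` is a list of (text, vector) pairs whose vectors were computed
    ahead of time, so nothing is re-embedded at query time except the query.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


# Hypothetical toy data: real embedding vectors are far larger than 3 values.
chunks = [
    ("Watercolor painting workshops", [0.0, 0.2, 0.9]),
    ("Ran Kubernetes clusters in production", [0.9, 0.1, 0.0]),
]
query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded query "kubernetes"

results = rank(query_vec, chunks)
print(results[0])
```

Because the document vectors already exist, only the query needs to be embedded at search time; the ranking itself is plain arithmetic.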

The killer property

A producer ships one .cv file. The HTTP middleware makes the wrapper invisible to consumers:

Consumer Accept header                    Sees
----------------------------------------  -------------------------------------------
text/html, */* (LLM crawlers, browsers)   Extracted resume.html
text/markdown (newer agents, our SDK)     Extracted resume.md
application/pdf                           The visual PDF
No Accept header                          The visual PDF (renders in built-in viewer)

The bot never needs to know .cv exists. The format is a producer-side convenience; on the consumer side it is invisible.
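
The Accept-header mapping in the table can be sketched as a small selection function. This is a minimal illustration, not the project's middleware: `parse_accept`, `select_payload`, and the payload labels are hypothetical names, and two behaviours are assumptions read off the table (a bare `*/*` is treated as a request for HTML, and unrecognized types fall back to the PDF).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        items.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


# Mapping taken from the table; the labels "html", "md", "pdf" are hypothetical.
PAYLOADS = {
    "text/html": "html",
    "*/*": "html",          # assumption: wildcard clients get the HTML rendering
    "text/markdown": "md",
    "application/pdf": "pdf",
}


def select_payload(accept_header):
    """Pick which embedded payload to serve for a given Accept header."""
    if not accept_header or not accept_header.strip():
        return "pdf"  # no Accept header: serve the visual PDF
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in PAYLOADS:
            return PAYLOADS[media_type]
    return "pdf"  # assumption: fall back to the PDF for anything else


print(select_payload("text/markdown"))
```

A server would call something like `select_payload(request_headers.get("Accept"))` and stream the matching payload from the .cv file, so the client only ever sees an ordinary HTML, Markdown, or PDF response.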