.cv Open Resume File Format

Spec cv 1.0 · Apache 2.0 · IANA media type

One file.
Three readers.
One source of truth.

.cv is the open file format for resumes: a single PDF/A-3u file that carries a designed PDF, a clean Markdown copy, an HTML rendering, and precomputed BGE-M3 vectors. It opens in any PDF reader on day one; machines read the text inside without OCR.

Humans: the designed PDF
ATS · CRM · recruiters: clean Markdown
AI · RAG pipelines: precomputed vectors
application/vnd.cv+pdf · PDF/A-3u
  • resume.pdf application/pdf the visual layer
  • resume.md text/markdown ATS · AI ← primary
  • resume.html text/html the web
  • embeddings.cbor vnd.cv.embeddings+cbor BGE-M3 · 1024d
XMP · cv: namespace · SHA-256 per payload
Fig.1 · one container, every copy verified in sync
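
The per-payload SHA-256 check in Fig. 1 can be illustrated with a minimal sketch. It assumes the payload bytes have already been extracted and that the declared digests are available as a plain name-to-hex mapping. The function name and that mapping shape are hypothetical; they are not the actual layout of the cv: XMP namespace.

```python
import hashlib


def find_out_of_sync(payloads, declared_digests):
    """Return the names of payloads whose SHA-256 differs from the declared one.

    payloads: dict of payload name -> raw bytes (already extracted).
    declared_digests: dict of payload name -> hex digest (hypothetical shape).
    """
    mismatched = []
    for name, data in payloads.items():
        actual = hashlib.sha256(data).hexdigest()
        # A payload with no declared digest is treated as a mismatch.
        if declared_digests.get(name, "").lower() != actual:
            mismatched.append(name)
    return sorted(mismatched)


if __name__ == "__main__":
    payloads = {
        "resume.md": b"# Jane Doe\n",
        "resume.html": b"<h1>Jane Doe</h1>",
    }
    declared = {n: hashlib.sha256(d).hexdigest() for n, d in payloads.items()}
    payloads["resume.md"] = b"# Jane Doe (edited)\n"  # simulate drift
    print(find_out_of_sync(payloads, declared))
```

A consumer could run a check like this before trusting the Markdown over the visual layer.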
§02 Rationale

Four artifacts drift. One file does not.

Today a person sends three different artifacts to three different audiences: a polished PDF to recruiters, a Markdown copy to ATS systems, a web bio for their site. They fall out of sync within weeks. AI tools re-OCR and re-embed the same documents over and over.

.cv collapses them into one. One file, one source of truth, opens in any PDF reader on day one. Bots that ask for text/markdown get the Markdown back. RAG pipelines that recognise the format read precomputed BGE-M3 vectors directly instead of re-embedding.

§03 Comparison

Schemas drop the design. PDFs drop the text. .cv keeps both.

Format              | Visual PDF | Clean ATS text | Precomputed vectors | Opens anywhere
.cv                 | yes        | yes            | yes                 | yes
JSON Resume / FRESH | data only  | yes            | no                  | no
Europass XML        | data only  | yes            | no                  | no
HR-XML / HR Open    | data only  | yes            | no                  | no
Plain PDF           | yes        | OCR only       | no                  | yes
§04 Invisible

The consumer never learns .cv exists.

A producer ships one file. The HTTP middleware reads the Accept header and hands every consumer exactly the representation it wants. The wrapper is producer-side convenience; consumer-side it stays invisible.

Accept header    | Typical clients        | Response
text/html, */*   | LLM crawlers, browsers | extracted resume.html
text/markdown    | newer agents, our SDK  | extracted resume.md
application/pdf  | PDF clients            | the visual PDF
no Accept header | anything else          | the visual PDF
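
The routing above can be sketched as a small content-negotiation function. This is an illustration, not the shipped middleware. It assumes that a bare `*/*` maps to the HTML payload (one reading of the first row), that ties between equal q-values are broken by header order, and that partial wildcards such as `text/*` are not handled. The function names are hypothetical.

```python
def parse_accept(header):
    """Return the media ranges of an Accept header, highest q-value first."""
    ranges = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # malformed q-value: treat as not acceptable
        if q > 0:
            ranges.append((-q, position, media))
    return [media for _, _, media in sorted(ranges)]


# Mapping taken from the table; "*/*" -> HTML is an assumption.
REPRESENTATIONS = {
    "text/html": "resume.html",
    "*/*": "resume.html",
    "text/markdown": "resume.md",
    "application/pdf": "resume.pdf",
}


def choose_payload(accept):
    """Pick the payload to serve for a given Accept header (or None)."""
    if not accept:
        return "resume.pdf"  # no Accept header: the visual PDF
    for media in parse_accept(accept):
        if media in REPRESENTATIONS:
            return REPRESENTATIONS[media]
    return "resume.pdf"  # anything else: the visual PDF


if __name__ == "__main__":
    for header in (None, "text/markdown", "text/html,*/*;q=0.8", "application/pdf"):
        print(header, "->", choose_payload(header))
```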
§05 Hello, .cv

Pack once. Read it any way you like.

terminal · cv command line

# Build a .cv from an existing PDF, Markdown, and HTML
cv pack \
  --pdf resume.pdf \
  --md resume.md \
  --html resume.html \
  --lang en \
  -o resume.cv

# Read it back any way you like
cv extract resume.cv --format md      # markdown stream
cv inspect resume.cv --json           # XMP + payload metadata
cv validate resume.cv --strict        # PDF/A-3u + cv-strict gate
cv search  resume.cv "kubernetes"     # semantic search via BGE-M3
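
For callers that prefer to shell out rather than link an SDK, the extract command above could be wrapped roughly as follows. This is a sketch that assumes the `cv` binary is on PATH and that `cv extract ... --format md` writes the Markdown payload to stdout; the helper names are hypothetical.

```python
import shutil
import subprocess


def extract_command(cv_path):
    """Build the argv for `cv extract <file> --format md` as shown above."""
    return ["cv", "extract", str(cv_path), "--format", "md"]


def extract_markdown(cv_path):
    """Run the CLI and return the Markdown payload as text.

    Assumes the `cv` binary is installed and prints the payload to stdout.
    """
    if shutil.which("cv") is None:
        raise FileNotFoundError("cv CLI not found on PATH")
    result = subprocess.run(
        extract_command(cv_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```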
§06 Status

What ships today.

  1. Stable spec at cv-1.0, frozen and versioned.
  2. Reference SDKs in JavaScript, Python, and Go.
  3. Single-binary CLI: cv pack / extract / inspect / validate / search.
  4. <cv-embed> web component (Lit, ~10 KB shell, lazy PDF.js worker).
  5. Server middleware for Express, Fastify, Hono, FastAPI, Flask, Django, Go net/http.
  6. Optional BGE-M3 embedding generation (@cvfile/embed, cvfile[embed]).
  7. LangChain, LlamaIndex, and Haystack document loaders.
  8. 200-line reference sniffer (cvfile-cv-detector) for crawler vendors.
§07 FAQ

Frequently asked questions.

01 What is the .cv file format?

.cv is an open file format for resumes. It is a valid PDF/A-3u file that bundles a designed PDF, a clean Markdown copy, a self-contained HTML rendering, and optional pre-computed BGE-M3 embeddings inside one file. Any PDF reader opens it visually; ATS systems and AI agents read the embedded Markdown directly.

02 How is .cv different from JSON Resume, FRESH, or Europass?

Those are pure data schemas (JSON or XML) that drop the visual artifact and require the consumer to render it. .cv keeps the designer-controlled PDF intact and carries the Markdown, HTML, and embeddings alongside it as PDF Associated Files (ISO 32000-2 §14.13). A recruiter still opens a polished PDF; an ATS or LLM still reads clean text; both come from the same file.

03 Do I need a special viewer to open a .cv file?

No. A .cv is a valid PDF. Preview, Adobe Reader, Chrome, and every other PDF reader shipped in the last fifteen years open it normally and show the visual layer. The additional payloads (Markdown, HTML, embeddings) are discoverable via the standard PDF Associated Files mechanism.

04 Why are pre-computed embeddings inside the file?

So that any third-party RAG pipeline indexing the file (LangChain, LlamaIndex, Haystack, custom vector DB) can skip the document-embedding API call entirely. The default model is BAAI BGE-M3 (MIT licensed, multilingual, 1024-dim, free). Producers may also ship vectors in proprietary spaces (OpenAI text-embedding-3-large, Voyage-3, Gemini text-embedding-004) when they target a specific downstream stack.
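
As an illustration of what a pipeline does with the shipped vectors, here is a minimal cosine-similarity ranking sketch. It assumes the vectors have already been decoded into a dict of chunk id to list of floats (a hypothetical shape, not the embeddings.cbor layout) and uses toy 3-dimensional vectors instead of 1024-dimensional ones. The query still has to be embedded with the same model as the stored vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_vector, chunk_vectors, top_k=3):
    """Return up to top_k (chunk_id, score) pairs, most similar first."""
    scored = [
        (cosine(query_vector, vector), chunk_id)
        for chunk_id, vector in chunk_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:top_k]]


if __name__ == "__main__":
    # Toy vectors standing in for precomputed chunk embeddings.
    chunks = {
        "skills": [0.9, 0.1, 0.0],
        "education": [0.1, 0.9, 0.0],
        "contact": [0.0, 0.1, 0.9],
    }
    query = [1.0, 0.2, 0.0]  # would come from embedding the search text
    for chunk_id, score in rank_chunks(query, chunks, top_k=2):
        print(chunk_id, round(score, 3))
```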

05 What MIME type does a .cv file use?

application/vnd.cv+pdf, with registration pending at IANA under the RFC 6838 vendor tree and using a +pdf structured syntax suffix (RFC 6838 §4.2.8). Until IANA approves, servers can safely emit application/pdf alongside a Link header advertising the .cv alternates. A 200-line reference sniffer (cvfile-cv-detector, available in Python, Go, and TypeScript) detects .cv wrapping inside any application/pdf bytes.
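
One possible shape for that interim response is sketched below. The exact Link header emitted by the real middleware is not specified here, so the list of alternates, the use of the same URL for each, and the added Vary header are all assumptions made for illustration; the function name is hypothetical.

```python
def fallback_headers(resource_url):
    """Build interim response headers: plain application/pdf plus alternates.

    The alternate list and header layout are illustrative assumptions.
    """
    alternate_types = ["text/markdown", "text/html"]
    link_value = ", ".join(
        f'<{resource_url}>; rel="alternate"; type="{media_type}"'
        for media_type in alternate_types
    )
    return {
        "Content-Type": "application/pdf",
        "Link": link_value,
        # The response depends on the request's Accept header.
        "Vary": "Accept",
    }


if __name__ == "__main__":
    for name, value in fallback_headers("https://example.com/resume.cv").items():
        print(f"{name}: {value}")
```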

06 Are the format and tooling free?

Yes. The spec is CC-BY-4.0. The CLI, the three SDKs (JavaScript, Python, Go), the web component, the server middleware, and the RAG integrations (langchain-cvfile, llama-index-readers-cvfile, cvfile-haystack) are Apache-2.0. There is no vendor lock-in. A planned cvfile.org/cloud paid tier exists for hosted convenience but is non-essential.

07 Can a job seeker create a .cv file?

Yes, via two paths. (1) Use the in-browser builder at cvfile.org/create: fill in a form and download a .cv with PDF + Markdown payloads ready. (2) Install the CLI (brew install cvfile/tap/cv) and run cv pack with your existing PDF + Markdown.

08 How can an ATS or LLM read the inside of a .cv file?

Several options. (a) Use one of the published reference SDKs (npm: @cvfile/sdk, PyPI: cvfile, Go: github.com/cvfile/cv/sdks/go). (b) Use one of the RAG integrations (langchain-cvfile, llama-index-readers-cvfile, cvfile-haystack). (c) Use the 200-line cvfile-cv-detector reference sniffer that depends only on the PDF parser the host already trusts. (d) Send an Accept: text/markdown header to a server running the @cvfile/server middleware; you receive the Markdown payload directly.
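
Option (d) needs nothing beyond a standard HTTP client. The sketch below uses only the Python standard library; the URL is a placeholder, the helper names are hypothetical, and it assumes the server runs the middleware described above.

```python
import urllib.request


def markdown_request(url):
    """Build a GET request that asks for the Markdown representation."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def fetch_markdown(url, timeout=10):
    """Fetch the Markdown payload from a server that negotiates on Accept."""
    with urllib.request.urlopen(markdown_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


if __name__ == "__main__":
    # Placeholder URL; replace with a real .cv served by the middleware.
    request = markdown_request("https://example.com/resume.cv")
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```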