Nine+ formats · One bundle · Zero setup

Drag a folder. Get one LLM-ready file.

Most AI work falls apart at the same step: you have twelve files the model needs to see, and a context window that wants one. The File Concatenator parses PDF, DOCX, PPTX, XLSX, ODT, EML, and source code in your browser and emits a single bundle with a table of contents and per-file delimiters. Nothing uploads. Nothing waits on a server.

Drop a folder. Get a paste-ready bundle in seconds.

  • 0 uploads
  • 9+ formats parsed in-browser
  • 1 bundle out
output - bundle.txt - 47.3 kB - 12 files
// Table of contents
1. README.md                  (1.2 kB)
2. src/auth.js                (3.4 kB)
3. src/auth.test.js           (2.1 kB)
4. docs/spec.pdf              (8.9 kB extracted)
5. docs/brief.docx            (4.7 kB extracted)
... 7 more

===== FILE: src/auth.js =====
import { verify } from './jwt.js';

export async function authenticate(req) {
  const token = req.headers.get('authorization');
  if (!token) return { ok: false, reason: 'missing' };
  return verify(token);
}

===== FILE: docs/spec.pdf =====
// Extracted via pdf.js - page 1
Authentication flow. The service accepts a bearer
token in the Authorization header. Tokens are signed
with the shared HMAC key rotated every 90 days.
...

delimited · table of contents · ready to paste

Parsers ship with the page. pdf.js, mammoth.js, xlsx, and jszip load on demand. No remote conversion service.
Verifiable, not just claimed. The page's own Content-Security-Policy sets connect-src 'self', so the browser blocks fetch, XHR, and WebSocket connections to any other origin. View source to verify.
Output is plain text. UTF-8 with explicit delimiters. Inspect it in any editor before sending.
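
If you want to check a policy string rather than read it by eye, here is a minimal sketch. The helper names parseCsp and connectSrcIsSelfOnly are hypothetical, not part of the page, and the sketch deliberately ignores the default-src fallback that browsers apply when connect-src is absent.

```javascript
// Hypothetical helper: split a CSP string into a Map of directive -> sources.
function parseCsp(policy) {
  const directives = new Map();
  for (const part of policy.split(';')) {
    const tokens = part.trim().split(/\s+/).filter(Boolean);
    if (tokens.length === 0) continue;
    const [name, ...sources] = tokens;
    const key = name.toLowerCase(); // directive names are case-insensitive
    // When a directive is repeated, only the first occurrence counts.
    if (!directives.has(key)) directives.set(key, sources);
  }
  return directives;
}

// True when connect-src is present and lists 'self' as its only source.
// Assumption: no default-src fallback is considered in this sketch.
function connectSrcIsSelfOnly(policy) {
  const sources = parseCsp(policy).get('connect-src');
  return Array.isArray(sources) && sources.length === 1 && sources[0] === "'self'";
}
```

Paste the policy from the page source or the response headers into connectSrcIsSelfOnly to confirm that no other origin is listed.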

Why concatenation matters

A folder is a structure. A chat window is a slot.

Feeding files to an LLM is mostly a structure problem. Every time you upload a doc separately, you lose the relationship between files. Every time you paste twelve excerpts in a row, you lose source attribution. Concatenation with explicit delimiters gives the model the same view of your folder that you have.

What a good bundle contains: TOC + delimiters + extracted text
// 1. Table of contents (always first)
file count, total size, format breakdown

// 2. Per-file delimiter
===== FILE: relative/path/to/file =====

// 3. Extracted text
the actual contents, decoded from the source format

// 4. Repeat for every file in the drop
no silent skips - unreadable files get an explicit
note so you can decide whether to drop them or not
  1. Context windows have a budget. Pasting twelve files individually wastes tokens on chat boilerplate and breaks the model's sense of which file you mean. One bundle keeps the budget tight.
  2. Structure disappears in copy-paste. A PDF table becomes a wall of text. A repository becomes a pile of snippets. Delimiters and a table of contents put the folder shape back.
  3. Attribution matters when the model is wrong. If the model misquotes a clause, you want to know which file it came from. Per-file headers make every reply auditable.
  4. One file is portable. One bundle pastes cleanly into ChatGPT, Claude, Gemini, or your IDE. The same artifact also goes into a Git commit, an email thread, or a code review note.
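
The layout above (table of contents first, then one delimited section per file) can be sketched in a few lines. This is an illustration, not the tool's actual code: buildBundle, formatSize, and the { path, text } input shape are hypothetical, and sizes are assumed to be UTF-8 byte counts with 1 kB = 1000 bytes.

```javascript
// Assumption: sizes are reported in SI kilobytes (1 kB = 1000 bytes).
function formatSize(bytes) {
  return (bytes / 1000).toFixed(1) + ' kB';
}

// Hypothetical input shape: [{ path: 'src/auth.js', text: '...' }, ...]
function buildBundle(files) {
  const encoder = new TextEncoder();
  const toc = ['// Table of contents'];
  const sections = [];
  files.forEach((file, i) => {
    const size = encoder.encode(file.text).length; // UTF-8 byte length
    toc.push(`${i + 1}. ${file.path} (${formatSize(size)})`);
    // The delimiter carries the relative path so every excerpt stays attributable.
    sections.push(`===== FILE: ${file.path} =====\n${file.text}`);
  });
  return toc.join('\n') + '\n\n' + sections.join('\n\n') + '\n';
}
```

Because the table of contents and the delimiters use the same relative paths, a reader (or a model) can match any section back to its entry.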

If you would open it in Finder, it parses here.

pdf.js handles PDFs page by page. mammoth.js converts DOCX to clean text without losing headings. xlsx parses spreadsheets sheet by sheet, row by row. jszip cracks open PPTX and ODT containers so the same pipeline can read both.

Unknown extensions fall back to a UTF-8 plain-text read with byte-order-mark handling. Source code files are passed through unchanged, so syntax stays exactly as you wrote it.
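
A plain-text fallback of this kind can be sketched with the standard TextDecoder API. The function name decodeAsPlainText is hypothetical; the BOM behavior is the decoder's documented default.

```javascript
// Decode raw bytes as UTF-8 text. With the default options (ignoreBOM: false),
// TextDecoder strips a leading UTF-8 byte-order mark (EF BB BF) from the output.
// Invalid byte sequences become U+FFFD instead of throwing (fatal: false).
function decodeAsPlainText(bytes) {
  return new TextDecoder('utf-8').decode(bytes);
}
```

In a browser, the bytes for a dropped file could come from `new Uint8Array(await file.arrayBuffer())`.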

A whitespace-minimization engine collapses redundant blank runs before they hit the bundle. It is a toggle, off by default.
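
One plausible reading of "collapses redundant blank runs" is: keep a single blank line wherever several appear in a row. The sketch below assumes exactly that and nothing more; collapseBlankRuns is a hypothetical name, and the real engine may apply other rules.

```javascript
// Collapse each run of blank (empty or whitespace-only) lines to one empty line.
// Non-blank lines pass through untouched, so indentation is preserved.
function collapseBlankRuns(text) {
  const out = [];
  let previousBlank = false;
  for (const line of text.split('\n')) {
    const blank = line.trim() === '';
    if (blank && previousBlank) continue; // drop repeated blank lines
    out.push(blank ? '' : line);
    previousBlank = blank;
  }
  return out.join('\n');
}
```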

  • PDF: pdf.js parses pages locally, including PDFs with embedded text layers
  • DOCX: mammoth.js extracts text and heading structure from Word files
  • PPTX: jszip unwraps the slide XML; readable slide-by-slide text
  • XLSX: xlsx reads every sheet, every row, with column headers preserved
  • ODT: OpenDocument text via the same XML-in-zip path as PPTX
  • EML: headers extracted, MIME bodies decoded, attachments listed by name
  • Plain text: .txt, .md, .json, .csv, .log, .yaml, .toml, .xml
  • Source code: JS, TS, Python, Go, Rust, Java, Ruby, C/C++, shell, SQL
  • Unknown extensions: UTF-8 plain-text fallback so nothing silently drops
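
The format list above amounts to a lookup from file extension to reader, with plain text as the default. A minimal sketch of that dispatch follows; the READERS table and pickReader are hypothetical, and the 'eml' label is a placeholder because the page does not name an EML library.

```javascript
// Extension -> reader label, mirroring the list above. The mapping is an assumption.
const READERS = new Map([
  ['pdf', 'pdf.js'],
  ['docx', 'mammoth.js'],
  ['xlsx', 'xlsx'],
  ['pptx', 'jszip'],
  ['odt', 'jszip'],
  ['eml', 'eml'], // placeholder label; no library is named in the source
]);

function pickReader(path) {
  const name = path.split('/').pop();
  const dot = name.lastIndexOf('.');
  // dot > 0 treats dotfiles such as ".gitignore" as having no extension.
  const ext = dot > 0 ? name.slice(dot + 1).toLowerCase() : '';
  // Plain text, source code, and unknown extensions all take the fallback.
  return READERS.get(ext) ?? 'plain-text';
}
```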

Four steps you can watch happen.

The progress meter shows file-by-file parsing. The preview shows the bundle as it grows. Nothing finalizes until you click copy or download.

Step 1: Drop the folder. Drag a folder or pick individual files. The browser hands the app File handles, and parsing starts in the tab.
Step 2: Watch the parse. Each format runs through its specific reader. PDFs come out page by page. DOCX keeps heading structure. The progress meter shows every file as it lands.
Step 3: Optional minimize. Toggle the whitespace engine to collapse redundant blank lines. Off by default, so original spacing survives unless you turn it on.
Step 4: Copy or download. One UTF-8 text file. Paste into ChatGPT, Claude, Gemini, or your editor. Save it next to your prompts. Diff it next week.
preview - bundle.txt 12 files · 47.3 kB · 0 uploads
// 12 files queued
README.md             [parsed]   1.2 kB
src/auth.js           [parsed]   3.4 kB
src/auth.test.js      [parsed]   2.1 kB
src/jwt.js            [parsed]   1.8 kB
docs/spec.pdf         [parsed]   8.9 kB
docs/brief.docx       [parsed]   4.7 kB
data/pricing.xlsx     [parsed]   2.4 kB
slides/q3.pptx        [parsed]   6.1 kB
notes/inbox.eml       [parsed]   1.9 kB
notes/research.odt    [parsed]   3.0 kB
config/settings.json  [parsed]   0.4 kB
LICENSE               [parsed]   1.1 kB

// done. 12 of 12. 0 errors.

Frequently asked, plainly answered.

Does this work offline?

Yes. Every parser (pdf.js, mammoth.js, xlsx, jszip) runs in your browser. The folder you drop never leaves the tab. Open the network panel and watch: no upload requests fire during parsing.

What file types does it read?

PDF, DOCX, PPTX, XLSX, ODT, EML, plain text, Markdown, JSON, CSV, and source code (JS, TS, Python, Go, Rust, Java, C/C++, and similar). Unknown extensions fall back to a plain-text reader so nothing is silently dropped.

What happens with very large folders?

Files are parsed one at a time and streamed into the output buffer. A 200-file repository with a mix of PDFs and source code typically completes in a few seconds. Memory pressure shows in the progress meter so you can stop and trim if needed.
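
Parsing one file at a time, appending to a buffer, and noting failures instead of skipping them could look like the sketch below. Everything here is illustrative: parseSequentially, the { path, read } entry shape, and the "[unreadable: ...]" note format are assumptions, and real readers would wrap pdf.js, mammoth.js, and the others.

```javascript
// Hypothetical entry shape: { path, read }, where read() returns a Promise of text.
async function parseSequentially(entries, onProgress = () => {}) {
  const chunks = []; // output buffer, appended to as each file lands
  let errors = 0;
  for (const [index, entry] of entries.entries()) {
    let body;
    try {
      body = await entry.read(); // awaited inside the loop: one file at a time
    } catch (err) {
      errors += 1;
      body = `[unreadable: ${err.message}]`; // explicit note, no silent skip
    }
    chunks.push(`===== FILE: ${entry.path} =====\n${body}\n`);
    onProgress(index + 1, entries.length); // e.g. drive a progress meter
  }
  return { text: chunks.join('\n'), errors };
}
```

Awaiting inside the loop keeps only one parse in flight at a time, at the cost of not overlapping work across files.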

What does the output look like?

One UTF-8 text bundle. A table of contents is prepended, each file is wrapped in a labeled delimiter (===== FILE: path/name =====), and the result is ready to copy, save, or pipe into a context window.
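
Because the delimiter is a fixed, labeled line, a bundle can also be split back into its files, for example to trace a quoted passage to its source. A minimal sketch, assuming the delimiter format shown above; splitBundle is a hypothetical helper, and a file whose own content contains a line in the delimiter format would be split incorrectly.

```javascript
const DELIMITER = /^===== FILE: (.+) =====$/;

// Split a bundle into [{ path, text }]. Lines before the first delimiter
// (the table of contents) are ignored.
function splitBundle(bundle) {
  const files = [];
  let current = null;
  for (const line of bundle.split('\n')) {
    const match = DELIMITER.exec(line);
    if (match) {
      current = { path: match[1], lines: [] };
      files.push(current);
    } else if (current) {
      current.lines.push(line);
    }
  }
  // trim() drops leading and trailing whitespace, including separator blank lines.
  return files.map((f) => ({ path: f.path, text: f.lines.join('\n').trim() }));
}
```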

One bundle is worth a hundred copy-pastes.

The File Concatenator is built into Prompt Organizer. Free, local, no account. The folder you drop and the bundle that comes out both live in this browser until you choose to move them.