Nine+ formats · One bundle · Zero setup
Drag a folder. Get one LLM-ready file.
Most AI work falls apart at the same step: you have twelve files the model needs to see, and a context window that wants one. The File Concatenator parses PDF, DOCX, PPTX, XLSX, ODT, EML, and source code in your browser and emits a single bundle with a table of contents and per-file delimiters. Nothing uploads. Nothing waits on a server.
Drop a folder. Get a paste-ready bundle in seconds.
- 0 uploads
- 9+ formats parsed in-browser
- 1 bundle out
// Table of contents
1. README.md (1.2 kB)
2. src/auth.js (3.4 kB)
3. src/auth.test.js (2.1 kB)
4. docs/spec.pdf (8.9 kB extracted)
5. docs/brief.docx (4.7 kB extracted)
... 7 more

===== FILE: src/auth.js =====
import { verify } from './jwt.js';

export async function authenticate(req) {
  const token = req.headers.get('authorization');
  if (!token) return { ok: false, reason: 'missing' };
  return verify(token);
}

===== FILE: docs/spec.pdf =====
// Extracted via pdf.js - page 1
Authentication flow. The service accepts a bearer token in the Authorization header. Tokens are signed with the shared HMAC key rotated every 90 days.
...
delimited · table of contents · ready to paste
connect-src 'self'; the browser blocks outbound calls to any other origin. View source to verify.
Why concatenation matters
A folder is a structure. A chat window is a slot.
Feeding files to an LLM is mostly a structure problem. Every time you upload a doc separately, you lose the relationship between files. Every time you paste twelve excerpts in a row, you lose source attribution. Concatenation with explicit delimiters gives the model the same view of your folder that you have.
// 1. Table of contents (always first)
file count, total size, format breakdown

// 2. Per-file delimiter
===== FILE: relative/path/to/file =====

// 3. Extracted text
the actual contents, decoded from the source format

// 4. Repeat for every file in the drop
no silent skips - unreadable files get an explicit note so you can decide whether to drop them or not
- Context windows have a budget. Pasting twelve files individually wastes tokens on chat boilerplate and breaks the model's sense of which file you mean. One bundle keeps the budget tight.
- Structure disappears in copy-paste. A PDF table becomes a wall of text. A repository becomes a pile of snippets. Delimiters and a table of contents put the folder shape back.
- Attribution matters when the model is wrong. If the model misquotes a clause, you want to know which file it came from. Per-file headers make every reply auditable.
- One file is portable. One bundle pastes cleanly into ChatGPT, Claude, Gemini, or your IDE. The same artifact also goes into a Git commit, an email thread, or a code review note.
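The four-part layout above can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: the input shape (`{ path, text }` for parsed files, `{ path, error }` for unreadable ones), the function names, and the wording of the unreadable-file note are all assumptions.

```javascript
// Hypothetical input: { path, text } for parsed files, { path, error } for unreadable ones.
function fileDelimiter(path) {
  return `===== FILE: ${path} =====`;
}

// Byte count of the extracted text, shown as "B" or decimal "kB".
function formatSize(text) {
  const bytes = new TextEncoder().encode(text).length;
  return bytes < 1000 ? `${bytes} B` : `${(bytes / 1000).toFixed(1)} kB`;
}

function buildBundle(files) {
  // 1. Table of contents, always first.
  const toc = files.map((file, index) => {
    const size = file.error ? "unreadable" : formatSize(file.text);
    return `${index + 1}. ${file.path} (${size})`;
  });

  // 2-4. One delimiter plus body per file. Unreadable files get an
  // explicit note instead of being skipped silently.
  const sections = files.map((file) => {
    const body = file.error
      ? `// Could not read this file: ${file.error}`
      : file.text;
    return `${fileDelimiter(file.path)}\n${body}`;
  });

  return ["// Table of contents", ...toc, "", sections.join("\n\n")].join("\n");
}
```

Keeping the path in both the table of contents and the delimiter means a reply that quotes a file can be checked against either one.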
If you would open it in Finder, it parses here.
pdf.js handles PDFs page by page. mammoth.js converts DOCX to clean text without losing headings. xlsx parses spreadsheets sheet by sheet, row by row. jszip cracks open PPTX and ODT containers so the same pipeline can read both.
Unknown extensions fall back to a UTF-8 plain-text read with byte-order-mark handling. Source code files are passed through unchanged, so syntax stays exactly as you wrote it.
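One way to get a UTF-8 read with byte-order-mark handling is the standard `TextDecoder`, which drops a leading BOM by default. The sketch below assumes the file's bytes are already in a `Uint8Array`; the function name `decodePlainText` is hypothetical.

```javascript
// Decode raw bytes as UTF-8 text. With the default options (ignoreBOM: false),
// TextDecoder removes a leading byte-order mark, so the text never starts
// with U+FEFF. Invalid byte sequences become U+FFFD instead of throwing.
function decodePlainText(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}
```

In a browser, the bytes for a dropped `File` could come from `new Uint8Array(await file.arrayBuffer())`.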
A whitespace-minimization engine collapses redundant blank runs before they hit the bundle. It is a toggle, off by default.
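A toggle like this could be as small as one regular expression. The sketch assumes "redundant blank runs" means two or more consecutive blank (or whitespace-only) lines, which it collapses to a single blank line; the real engine may do more or less than that. Indentation and in-line spacing are left alone.

```javascript
// Assumption: a "blank run" is two or more consecutive lines that are empty
// or contain only spaces and tabs. Each run collapses to one blank line.
function minimizeWhitespace(text, enabled = false) {
  if (!enabled) return text; // off by default, matching the toggle
  return text.replace(/\n(?:[ \t]*\n){2,}/g, "\n\n");
}
```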
- PDF: pdf.js parses pages locally, including PDFs with embedded text layers
- DOCX: mammoth.js extracts text and heading structure from Word files
- PPTX: jszip unwraps the slide XML; readable slide-by-slide text
- XLSX: xlsx reads every sheet, every row, with column headers preserved
- ODT: OpenDocument text via the same XML-in-zip path as PPTX
- EML: headers extracted, MIME bodies decoded, attachments listed by name
- Plain text: .txt, .md, .json, .csv, .log, .yaml, .toml, .xml
- Source code: JS, TS, Python, Go, Rust, Java, Ruby, C/C++, shell, SQL
- Unknown extensions: UTF-8 plain-text fallback so nothing silently drops
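The routing in the list above amounts to a lookup by extension with a plain-text default. A minimal sketch, assuming the values are just labels for illustration (the `"eml"` label in particular is hypothetical, since no EML library is named here):

```javascript
// Extension -> parser label. Labels are illustrative, not function references.
const PARSER_BY_EXTENSION = new Map([
  ["pdf", "pdf.js"],
  ["docx", "mammoth.js"],
  ["xlsx", "xlsx"],
  ["pptx", "jszip"],
  ["odt", "jszip"],
  ["eml", "eml"],
]);

function pickParser(path) {
  const name = path.split("/").pop();
  const dot = name.lastIndexOf(".");
  // Names with no dot, or only a leading dot (".gitignore"), have no extension.
  const extension = dot > 0 ? name.slice(dot + 1).toLowerCase() : "";
  // Plain text, source code, and unknown extensions share the plain-text reader.
  return PARSER_BY_EXTENSION.get(extension) ?? "plain-text";
}
```

Making the fallback the default branch, rather than a special case, is what keeps unknown files from dropping out of the bundle.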
Four steps you can watch happen.
The whole flow is four steps, and each one is visible as it runs. The progress meter shows file-by-file parsing. The preview shows the bundle as it grows. Nothing finalizes until you click copy or download.
// 12 files queued
README.md             [parsed]  1.2 kB
src/auth.js           [parsed]  3.4 kB
src/auth.test.js      [parsed]  2.1 kB
src/jwt.js            [parsed]  1.8 kB
docs/spec.pdf         [parsed]  8.9 kB
docs/brief.docx       [parsed]  4.7 kB
data/pricing.xlsx     [parsed]  2.4 kB
slides/q3.pptx        [parsed]  6.1 kB
notes/inbox.eml       [parsed]  1.9 kB
notes/research.odt    [parsed]  3.0 kB
config/settings.json  [parsed]  0.4 kB
LICENSE               [parsed]  1.1 kB
// done. 12 of 12. 0 errors.
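A progress log like this falls out of parsing one file at a time and reporting after each. The sketch below is synchronous to stay short; real parsers are asynchronous, and an async generator would follow the same shape. The `parse` callback, the file objects, and the `[failed]` label are assumptions made for illustration.

```javascript
// Hypothetical shapes: each file has a `path`; parse(file) returns the
// extracted text or throws if the file cannot be read.
function* parseWithProgress(files, parse) {
  let errors = 0;
  yield `// ${files.length} files queued`;
  for (const file of files) {
    try {
      const text = parse(file);
      const kb = (new TextEncoder().encode(text).length / 1000).toFixed(1);
      yield `${file.path} [parsed] ${kb} kB`;
    } catch (err) {
      errors += 1;
      yield `${file.path} [failed] ${err.message}`;
    }
  }
  yield `// done. ${files.length - errors} of ${files.length}. ${errors} errors.`;
}
```

Because the generator yields after every file, the caller can update a meter or stop early between files.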
Frequently asked, plainly answered.
Does this work offline?
Yes. Every parser (pdf.js, mammoth.js, xlsx, jszip) runs in your browser. The folder you drop never leaves the tab. Open the network panel and watch: no upload requests fire during parsing.
What file types does it read?
PDF, DOCX, PPTX, XLSX, ODT, EML, plain text, Markdown, JSON, CSV, and source code (JS, TS, Python, Go, Rust, Java, C/C++, and similar). Unknown extensions fall back to a plain-text reader so nothing is silently dropped.
What happens with very large folders?
Files are parsed one at a time and streamed into the output buffer. A 200-file repository with a mix of PDFs and source code typically completes in a few seconds. Memory pressure shows in the progress meter so you can stop and trim if needed.
What does the output look like?
One UTF-8 text bundle. A table of contents is prepended, each file is wrapped in a labeled delimiter (===== FILE: path/name =====), and the result is ready to copy, save, or pipe into a context window.
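Because the delimiter is a fixed, labeled line, a bundle can also be split back into its files, which is handy for checking which file a quoted passage came from. A minimal sketch, assuming the `===== FILE: path =====` form shown above and that no file body contains a line in exactly that form; the function name is hypothetical.

```javascript
// Matches a delimiter line such as "===== FILE: src/auth.js =====".
const FILE_DELIMITER = /^===== FILE: (.+) =====$/;

function splitBundle(bundle) {
  const files = new Map();
  let currentPath = null;
  for (const line of bundle.split("\n")) {
    const match = FILE_DELIMITER.exec(line);
    if (match) {
      currentPath = match[1];
      files.set(currentPath, []);
    } else if (currentPath !== null) {
      files.get(currentPath).push(line);
    }
    // Lines before the first delimiter (the table of contents) are skipped.
  }
  // Join each file's lines and drop leading and trailing blank lines.
  return new Map(
    [...files].map(([path, lines]) => [
      path,
      lines.join("\n").replace(/^\n+|\n+$/g, ""),
    ])
  );
}
```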
One bundle is worth a hundred copy-pastes.
The File Concatenator is built into Prompt Organizer. Free, local, no account. The folder you drop and the bundle that comes out both live in this browser until you choose to move them.