pdfrex

pdfrex is a command-line tool and Deno module for manipulating PDF files. It can merge PDFs, split them into individual pages, and extract their text, either from the command line or from your own code. pdfrex is built on the pdf-lib and PDF.js libraries.

Features

Merge PDFs: Combine multiple PDFs into one.
Split PDFs: Separate a PDF into individual pages.
Extract text: Convert PDFs to text files.
Flexible CLI: Easily execute PDF operations from the command line.
Programmatic Use: Import pdfrex functions directly in Deno projects.

Installation

To install pdfrex as a CLI tool, run:

deno install --global --allow-read --allow-write jsr:@jackfiszr/pdfrex@0.0.7

This command installs pdfrex globally, enabling the pdfrex command with the merge, split, and totxt subcommands.

Permissions

Since pdfrex reads and writes files, it requires the following permissions:

--allow-read for reading PDF files.
--allow-write for writing merged, split, or extracted text files.
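
Deno also accepts scoped forms of these flags, such as `--allow-read=<paths>`, which limit access to specific paths. The sketch below shows how such flags are composed. `permissionFlags` is a hypothetical helper, not part of pdfrex, and whether pdfrex runs correctly under a narrowed scope is an assumption you would need to verify for your setup.

```typescript
// Hypothetical helper (not part of pdfrex): builds Deno permission flags.
// An empty path list yields the unscoped flag used in the install command.
function permissionFlags(readPaths: string[], writePaths: string[]): string[] {
  const flag = (name: string, paths: string[]): string =>
    paths.length > 0
      ? `--allow-${name}=${paths.join(",")}` // Deno takes comma-separated paths
      : `--allow-${name}`;
  return [flag("read", readPaths), flag("write", writePaths)];
}

console.log(permissionFlags(["./pdfs"], ["./out"]).join(" "));
// prints: --allow-read=./pdfs --allow-write=./out
```

Flags passed to `deno install` are stored with the installed command, so a narrower scope would have to be chosen at install time.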

Usage

General CLI Usage

pdfrex <command> [options]

Commands

merge: Combines multiple PDF files into a single document.
split: Divides a PDF document into individual pages.
totxt: Extracts text from PDFs and saves it as text files.

Merge

Combine multiple PDF files into one.

CLI Usage

pdfrex merge -d <directory> -f <file1,file2,...> -o <output-file>

Options

-d, --dir <string>: Directory to search for PDF files to merge (defaults to the current directory).
-f, --files <string>: Specific files to merge (comma-separated).
-o, --output <string>: File path for the output merged PDF (defaults to merged.pdf in the current directory).

Examples

Merge all PDFs in a directory:
```
pdfrex merge -d ./pdfs -o combined.pdf
```

Merge specific files:

pdfrex merge -f file1.pdf,file2.pdf,file3.pdf -o result.pdf
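
When the merge command is launched from a script, the documented flags can be assembled from a small options object. This is a minimal sketch: `MergeCliOptions` and `buildMergeArgs` are hypothetical names, not part of pdfrex, and the code only builds the argument list without running anything.

```typescript
// Hypothetical option shape mirroring the documented merge flags.
interface MergeCliOptions {
  dir?: string; // -d, --dir
  files?: string[]; // -f, --files (joined with commas)
  output?: string; // -o, --output
}

// Builds the argument list for `pdfrex merge`; omitted options are left
// out so that the CLI's documented defaults apply.
function buildMergeArgs(opts: MergeCliOptions): string[] {
  const args: string[] = ["merge"];
  if (opts.dir !== undefined) args.push("-d", opts.dir);
  if (opts.files !== undefined && opts.files.length > 0) {
    args.push("-f", opts.files.join(","));
  }
  if (opts.output !== undefined) args.push("-o", opts.output);
  return args;
}

const mergeArgs = buildMergeArgs({
  files: ["file1.pdf", "file2.pdf"],
  output: "result.pdf",
});
console.log(["pdfrex", ...mergeArgs].join(" "));
// prints: pdfrex merge -f file1.pdf,file2.pdf -o result.pdf
```

Because `-f` takes a comma-separated list, a file name that itself contains a comma cannot be expressed this way; placing such files in a directory and using `-d` may be the simpler route.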

Split

Split a PDF into individual pages.

CLI Usage

pdfrex split -d <directory> -f <file1,file2,...> -o <output-dir> -p <prefix>

Options

-d, --dir <string>: Directory to search for PDF files to split (defaults to the current directory).
-f, --files <string>: Specific files to split (comma-separated).
-o, --output-dir <string>: Directory to save the split PDF pages (by default, a new directory named after the source file is created).
-p, --prefix <string>: Prefix for naming split files (default is the source file name).

Examples

Split all PDFs in a directory:
```
pdfrex split -d ./pdfs -o ./split_pdfs
```

Split a specific file with a custom prefix:

pdfrex split -f my_document.pdf -o ./output -p page

Programmatic Usage

Import pdfrex functions in your Deno project to perform PDF operations directly.

Merge PDFs

import { mergeAll, mergePdfs } from "jsr:@jackfiszr/pdfrex@0.0.7";

// Merge all PDFs in the current directory
await mergeAll();

// Merge all PDFs in a specified directory
await mergeAll({ dir: "./my_pdfs" });

// Merge specific files with a custom output path
await mergePdfs(["file1.pdf", "file2.pdf", "file3.pdf"], {
  output: "combined.pdf",
});
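
Since `mergePdfs` takes an explicit file list, it can help to sort that list before merging. The sketch below assumes that the order of the array determines the page order of the result, which this README does not state. `naturalOrder` is a hypothetical helper, not part of pdfrex.

```typescript
// Hypothetical helper (not part of pdfrex): sorts file names so that
// embedded numbers compare numerically ("scan2" before "scan10").
function naturalOrder(files: string[]): string[] {
  // Copy first so the caller's array is left untouched.
  return [...files].sort((a, b) =>
    a.localeCompare(b, undefined, { numeric: true, sensitivity: "base" })
  );
}

const ordered = naturalOrder(["scan10.pdf", "scan2.pdf", "scan1.pdf"]);
console.log(ordered.join(", "));
// prints: scan1.pdf, scan2.pdf, scan10.pdf

// The sorted list could then be handed to pdfrex, for example:
// await mergePdfs(ordered, { output: "combined.pdf" });
```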

Split PDFs

import { splitAll, splitPdf } from "jsr:@jackfiszr/pdfrex@0.0.7";

// Split all PDFs in the current directory
await splitAll();

// Split all PDFs in a specified directory
await splitAll({ dir: "./my_pdfs" });

// Split a specific PDF
await splitPdf("document.pdf", { outputDir: "./pages", prefix: "page" });
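
When several documents are split in one run, giving each its own output directory and prefix keeps the pages apart. The following is a sketch under that assumption: `SplitJob` and `planSplits` are hypothetical names, and the code only computes the arguments that would be passed to `splitPdf`.

```typescript
// Hypothetical plan entry: the arguments for one splitPdf call.
interface SplitJob {
  file: string;
  outputDir: string;
  prefix: string;
}

// Derives a per-document output directory and prefix from each file name.
function planSplits(files: string[], baseDir: string): SplitJob[] {
  return files.map((file) => {
    // Drop any directory part, then a trailing ".pdf" (case-insensitive).
    const name = file.split(/[\\/]/).pop() ?? file;
    const stem = name.replace(/\.pdf$/i, "");
    return { file, outputDir: `${baseDir}/${stem}`, prefix: stem };
  });
}

const jobs = planSplits(["docs/report.pdf", "invoice.pdf"], "./pages");
console.log(jobs.map((job) => job.outputDir).join(", "));
// prints: ./pages/report, ./pages/invoice

// Each entry could then be passed to pdfrex, for example:
// for (const job of jobs) {
//   await splitPdf(job.file, { outputDir: job.outputDir, prefix: job.prefix });
// }
```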

Extract Text

Convert PDF files to text files.

CLI Usage

pdfrex totxt -d <directory> -f <file1,file2,...> -o <output-dir>

Options

-d, --dir <string>: Directory to search for PDF files to extract text from (defaults to the current directory).
-f, --files <string>: Specific files to extract text from (comma-separated).
-o, --output-dir <string>: Directory to save the extracted text files (default is the same as the source PDF location).

Examples

Extract text from all PDFs in a directory:
```
pdfrex totxt -d ./pdfs -o ./texts
```

Extract text from a specific PDF:

pdfrex totxt -f document.pdf -o ./output

Programmatic Usage

Import the text-extraction functions in your Deno project to convert PDFs to text directly.

Extract Text from PDFs

import { pdfToTxt, toTxtAll } from "jsr:@jackfiszr/pdfrex@0.0.7";

// Extract text from all PDFs in the current directory
await toTxtAll();

// Extract text from a specific PDF
await pdfToTxt("document.pdf", { outputDir: "./texts" });
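
For a batch of files, it can be useful to keep going when one PDF fails and to report the failures at the end. The sketch below takes the extraction function as a parameter so that it stays self-contained. `Extractor` and `extractMany` are hypothetical names, and it is an assumption that `pdfToTxt` rejects its promise when a file cannot be processed.

```typescript
// Hypothetical signature matching the pdfToTxt call shown above.
type Extractor = (file: string, opts: { outputDir: string }) => Promise<unknown>;

// Runs the extractor on each file in turn and records which files
// succeeded and which failed, instead of stopping at the first error.
async function extractMany(
  files: string[],
  outputDir: string,
  extract: Extractor,
): Promise<{ ok: string[]; failed: string[] }> {
  const ok: string[] = [];
  const failed: string[] = [];
  for (const file of files) {
    try {
      await extract(file, { outputDir });
      ok.push(file);
    } catch {
      failed.push(file);
    }
  }
  return { ok, failed };
}

// Stand-in extractor for demonstration only; it fails for one file name.
const fakeExtract: Extractor = async (file) => {
  if (file === "broken.pdf") throw new Error("unreadable");
};

extractMany(["a.pdf", "broken.pdf"], "./texts", fakeExtract).then((report) => {
  console.log(`ok: ${report.ok.join(", ")}; failed: ${report.failed.join(", ")}`);
  // prints: ok: a.pdf; failed: broken.pdf
});

// With pdfrex, the real function would take the place of the stand-in:
// const report = await extractMany(["a.pdf", "b.pdf"], "./texts", pdfToTxt);
```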

Contributing

Contributions, issues, and feature requests are welcome! Feel free to check out the issue tracker and contribute.

License

Licensed under the GNU General Public License v3.0.
