TalonProbeite/PureFile

PureFile 🛡️

A FastAPI-based Python service that reads and removes sensitive metadata from your files. Use it to avoid leaking your GPS location, device serial numbers, and personal information when you share documents or photos.

🌟 Key Features

  • Metadata Extraction: View all hidden technical tags before they are wiped.
  • Deep Sanitization: Strips EXIF, XMP, and other metadata from images, PDFs, and Word documents.
  • Privacy-First: Files are processed in-memory and never stored on the server.
  • Modern Stack: Built with FastAPI and powered by the ultra-fast uv package manager.
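As a rough illustration of the in-memory sanitization described in the list above, the sketch below blanks the core properties (author, title, timestamps) of a .docx file using bytes only, without touching the disk. It relies on the standard library's zipfile module rather than python-docx, so it is an assumption-laden illustration and not PureFile's actual code. The function name strip_docx_core_properties is hypothetical, and the sketch does not handle other metadata parts such as docProps/app.xml.

```python
import io
import zipfile

# Minimal, empty core-properties part: no author, title, or timestamps.
EMPTY_CORE_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'xmlns:dcterms="http://purl.org/dc/terms/" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'
)

# Fixed timestamp so the rewritten archive does not record when it was cleaned.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def strip_docx_core_properties(data: bytes) -> bytes:
    """Return a copy of a .docx (given as bytes) with blank core properties.

    Hypothetical helper: everything happens in memory via BytesIO buffers.
    """
    out_buf = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out_buf, "w") as dst:
        for item in src.infolist():
            if item.filename == "docProps/core.xml":
                payload = EMPTY_CORE_XML.encode("utf-8")
            else:
                payload = src.read(item.filename)
            info = zipfile.ZipInfo(item.filename, date_time=FIXED_DATE)
            dst.writestr(info, payload, compress_type=zipfile.ZIP_DEFLATED)
    return out_buf.getvalue()
```

The function takes and returns bytes, so it could sit behind an upload handler without writing temporary files.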

🛠️ Tech Stack

  • Framework: FastAPI (Asynchronous API)
  • Package Manager: uv
  • Libraries: Pillow (Images), PyMuPDF (PDF), python-docx (Word)
  • Containerization: Docker

🚀 Getting Started

Option 1: Docker (Recommended)

  1. Build the image: docker build -t purefile-app .

  2. Run the container: docker run -d -p 8000:8000 --name purefile-container purefile-app

Option 2: Local Development (using uv)

  1. Install dependencies: uv pip install -r pyproject.toml

  2. Run the application: python run.py


🖥️ Usage

Once the service is running, open your browser at: http://localhost:8000/docs

You will see the interactive Swagger UI where you can upload a file and get a "purified" version back instantly.
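This README does not name the individual endpoints, so a script cannot assume their paths. One way to discover them, sketched below, is to read the OpenAPI schema that FastAPI serves at /openapi.json by default (the same schema that backs the Swagger UI). The helper names list_operations and fetch_schema are hypothetical, and the sketch assumes the default schema URL has not been changed or disabled.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(schema: dict) -> list:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema from a running service (assumed default URL)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


# Usage, with the service running locally:
#     for method, path in list_operations(fetch_schema()):
#         print(method, path)
```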


📂 Supported Formats

  • Images: .jpg, .jpeg, .png, .webp
  • PDF: .pdf
  • Documents: .docx

⚙️ How it works

PureFile acts as a digital filter. It parses the binary structure of your file, identifies metadata segments (like EXIF in photos or Author properties in DOCX), and re-saves the file while intentionally omitting these segments. The result is a visually identical file with a clean "digital history".
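The parse, identify, and omit idea above can be sketched for PNG files using only the standard library. A PNG is a signature followed by chunks (length, type, data, CRC), and text, EXIF, and timestamp information live in dedicated ancillary chunks. This is an illustration of the general technique, not PureFile's implementation (which, per the tech stack, uses Pillow). The function name and the chosen set of chunk types are assumptions.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Ancillary chunk types that carry text, EXIF data, or a modification time.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf", b"tIME"}


def strip_png_metadata(data: bytes) -> bytes:
    """Return PNG bytes with the metadata chunks above left out.

    Hypothetical helper: pixel data (IDAT) and all other chunks are copied
    byte for byte, so the image itself is not re-encoded.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")

    out = bytearray(PNG_SIGNATURE)
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated PNG chunk")
        if chunk_type not in METADATA_CHUNKS:
            out += data[pos:end]
        pos = end
        if chunk_type == b"IEND":
            break
    return bytes(out)
```

Because whole chunks are copied or skipped, the existing CRC values stay valid and need no recomputation.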

About

API for reading and removing metadata from PNG, JPG, DOCX, and PDF files.
