internet-archiving

Here are 25 public repositories matching this topic...

ArchiveBox / ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Updated Jun 10, 2024
Python

akamhy / waybackpy

Wayback Machine API interface & a command-line tool

osint internet-archive web-archiving wayback-machine webarchiving cdx-api internet-archiving savepagenow archive-webpage archive-webpages wayback-machine-api wayback-machine-python

Updated Feb 26, 2024
Python
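
For readers unfamiliar with what a Wayback Machine interface has to address, here is a minimal sketch of the two URL shapes involved: the Internet Archive's public availability endpoint and the dated snapshot URL. It is not waybackpy's API; the function names below are hypothetical and the sketch only builds URLs, without making any network request.

```python
from datetime import datetime
from typing import Optional
from urllib.parse import urlencode

# Public Internet Archive endpoints (the helper names below are illustrative only).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
SNAPSHOT_PREFIX = "https://web.archive.org/web"


def wayback_timestamp(moment: datetime) -> str:
    """Format a datetime as the 14-digit YYYYMMDDhhmmss stamp used in Wayback URLs."""
    return moment.strftime("%Y%m%d%H%M%S")


def availability_query(url: str, moment: Optional[datetime] = None) -> str:
    """Build a query URL asking for the snapshot closest to `moment` (or the latest)."""
    params = {"url": url}
    if moment is not None:
        params["timestamp"] = wayback_timestamp(moment)
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def snapshot_url(url: str, moment: datetime) -> str:
    """Build a direct link to the capture of `url` nearest to `moment`."""
    return f"{SNAPSHOT_PREFIX}/{wayback_timestamp(moment)}/{url}"


if __name__ == "__main__":
    when = datetime(2006, 1, 1)
    print(availability_query("example.com", when))
    print(snapshot_url("http://example.com/", when))
```

The Wayback Machine redirects a snapshot URL to the nearest capture it holds, so the timestamp does not need to match an existing capture exactly.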

pirate / wikipedia-mirror

🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump

html docker nginx wiki docker-compose mediawiki wikipedia archiving datascience kiwix zim wikipedia-dump wikipedia-mirror openzim xowa internet-archiving mwdumper kiwix-offline-wikipedia

Updated Apr 7, 2021
Shell

ArchiveBox / electron-archivebox

Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)

electron windows macos linux docker gui desktop web-archiving digipres internet-archiving archivebox desktop-electron

Updated Feb 28, 2023
JavaScript

ArchiveBox / archivebox-browser-extension

Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.

chrome-extension archiving svelte firefox-extension browser-extension web-archiving digital-preservation digipres internet-archiving archivebox

Updated Apr 11, 2024
TypeScript

ArchiveBox / readability-extractor

JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.

wrapper node readability internet-archiving archivebox

Updated Apr 11, 2024
JavaScript

ArchiveBox / docker-archivebox

Home of the official docker image for ArchiveBox

docker kubernetes image docker-compose docker-image container oci digipres podman internet-archiving archivebox

Updated Feb 19, 2024
Dockerfile

ArchiveBox / good-karma-kit

😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...

docker docker-compose ipfs distributed-computing tor distributed-storage sia boinc kiwix i2p foldingathome storj pywb internet-archiving archivebox good-karma archivewarrior zimfarm

Updated May 11, 2024

vegetableman / vandal

Navigator for Web Archive

chrome-extension firefox-addon wayback-machine webarchive internet-archiving

Updated Nov 23, 2023
JavaScript

pirate / internet-archiving-talk

🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.

slideshow wget talks warc censorship web-archiving ethics internet-archiving archivebox

Updated Oct 19, 2020
JavaScript

ArchiveBox / debian-archivebox

Home of the official apt/deb package for Ubuntu/Debian-based systems.

package debian apt ubuntu web-archiving aptitude digipres internet-archiving archivebox stdeb

Updated May 20, 2024
Python

ArchiveBox / docs

Source for the GitHub Wiki / ReadTheDocs documentation for ArchiveBox, the self-hosted internet archiving solution.

python cli community documentation ui rest wiki sphinx usage web-archiving digipres internet-archiving archivebox

Updated May 7, 2024
CSS

ArchiveBox / homebrew-archivebox

Homebrew formula for the ArchiveBox self-hosted internet archiving solution.

macos homebrew package linuxbrew web-archiving digipres brew-tap internet-archiving archivebox

Updated Feb 19, 2024
Ruby

ArchiveBox / pip-archivebox

Official Python package for ArchiveBox, the self-hosted internet archiving solution.

python pypi wheel pip setuptools web-archiving digipres sdist internet-archiving archivebox

Updated May 21, 2024

mikwielgus / forum-dl

Scrape posts and threads from forums, news aggregators, and mail archives, and export them to JSONL, mailbox, or WARC.

python scraper forum discourse phpbb warc data-fetching simplemachines internet-archiving

Updated Sep 19, 2023
Python

itsliamdowd / WaybackBrowserMacOS

Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻

Updated Jul 1, 2022
Swift

Own-Data-Privateer / pwebarc

A suite of tools for mirroring and hoarding the web pages you visit for later offline viewing: your own personal Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data. It follows an "archive everything now, figure out what to do with it later" philosophy.

backups internet self-hosted archive web-archiving wayback-machine internet-archiving

Updated Jun 7, 2024
Python

itsliamdowd / WaybackBrowserWindows

Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻

Updated Jun 14, 2022
Python

httpreserve / conventoarchiver

Repository for collecting scripts to help capture MyConvento newsroom press-releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.

internet-archive web-archiving digipres webarchives internet-archiving press-releases myconvento pr-newsroom my-convento

Updated Jan 5, 2022
Python

gabldotink / sharkive.old

Upload stuff to the Internet Archive using a shell script.

youtube youtube-dl internet-archive youtube-downloader internet-archiving

Updated Jul 28, 2023
Shell

