soyjersh
kiwifarms.net
- Joined
- Jun 12, 2023
This post is a call to action as well as a technical proposal for review. I claim no expertise in the systematic implementation of a solution like the one proposed here, only a modest familiarity with the technologies involved and an idea of how to deploy them. The reason for such a deployment is simple: dissatisfaction with the current landscape of archival resources for Kiwifarms material, specifically those that archive websites. Only one is suggested in this thread, archive.is, and while it serves the necessary purpose, it stands alone and outside our control. As we’ve seen in the past with censorship performed by the Internet Archive and the data breach/DDoS of the Wayback Machine, other services are unreliable, and a single reliable choice is a system with a target on its back, probably one drawn there by someone who commits “consent accidents” with impunity. Therefore my desire is to use the untapped social capital of Kiwifarms to host a truly antifragile archive system for website snapshots, hosted by trusted individuals and governed by, preferably, our enlightened despot Null or an adequate stand-in.
The system is explained here in nontechnical language for the sake of a more general understanding. It would involve OpenZiti, tooling for building a zero trust overlay network, an “invisible Internet” where only approved, cryptographically authenticated peers can communicate, and only with the services they’re authorized to see. Participants would be connected to peers through programmed policies in an anonymous, encrypted way, without complex routing, firewall changes, or public DNS configuration. Services can be published securely from anywhere. It decouples identity and access from the underlying topology, making it easy to securely connect microservices, containers, or IoT nodes across disparate environments.
It would be possible for anyone meeting the criteria set by the network operator to join and provide their system’s resources in a highly secure, use-specific way. What this would best be used for, in my opinion, is hosting a bunch of InterPlanetary File System (IPFS) nodes, with select nodes being IPFS gateways. IPFS is a peer-to-peer, content-addressable network for storing and sharing data in a distributed fashion. Being distributed, IPFS lets many nodes hold copies of the same data, though a node only keeps what it pins or fetches, so participants would need to pin the archives deliberately. Files could then exist in dozens or hundreds of locations simultaneously, giving resilience against single-point failures, reduced downtime, and global redundancy without relying on one provider.
In essence, network participants would run a node on this file system through the OpenZiti network, linking various peers into a hidden, highly resilient storage pool. What would be the best use of such a pool? In my opinion it would be the hosting of ZIM file snapshots of website URLs. ZIM files already achieve high compression (often reducing website size by 60–90%). IPFS adds deduplication: if multiple ZIM backups share identical chunks (say, common assets or articles), each node stores those chunks only once, and they share a single content hash across the network.
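To make the deduplication idea concrete, here is a minimal Python sketch of a content-addressed chunk store. It is a toy under stated assumptions, not how IPFS is actually implemented: `ChunkStore`, `CHUNK_SIZE`, and the fixed-size splitting are my own illustrative names and choices, and plain SHA-256 hex digests stand in for real CIDs.

```python
import hashlib

# Tiny chunk size so the example is readable; real chunk sizes are far larger.
CHUNK_SIZE = 4


class ChunkStore:
    """Toy content-addressed store: each block is keyed by the hash of its bytes."""

    def __init__(self):
        self.blocks = {}  # digest -> chunk bytes

    def add(self, data: bytes) -> list:
        """Split data into fixed-size chunks, store each once, return the digests."""
        digests = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault keeps the existing block if this digest is already known
            self.blocks.setdefault(digest, chunk)
            digests.append(digest)
        return digests

    def get(self, digests: list) -> bytes:
        """Reassemble the original bytes from an ordered list of digests."""
        return b"".join(self.blocks[d] for d in digests)


store = ChunkStore()
snapshot_v1 = store.add(b"AAAABBBBCCCC")  # first "archive"
snapshot_v2 = store.add(b"AAAABBBBDDDD")  # second version, last chunk changed

# Six chunks were added in total, but only four unique blocks are stored.
print(len(store.blocks))  # 4
```

The second snapshot only costs one new block because its first two chunks hash to digests the store already holds. How much two real ZIM snapshots actually overlap at the block level depends on how each ZIM was compressed and chunked, so the savings are something to measure rather than assume.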
The combination results in massive bandwidth savings for distribution, storage optimization across multiple archives or mirrors, and incremental versioning: new versions of a site only store changed blocks, not the entire archive. This makes IPFS + ZIM an ideal setup for versioned website archiving over time. Traditional backups often involve database dumps (SQL files), separate asset directories, and configuration files. Restoring them requires a specific environment and version compatibility.
A ZIM file is a “frozen” snapshot encapsulating everything necessary to render the site. By hosting a ZIM on IPFS, complexity is reduced to one file and one CID. Restoration becomes as easy as fetching the file and viewing it locally or serving it from an IPFS gateway. It’s also possible to integrate IPFS CIDs into a website’s metadata or Git commits for transparent public archiving.
The system described so far is designed so that whoever runs specific OpenZiti overlay network components can decide who runs a node and contributes to it, in an anonymous fashion where no one really needs to know who’s who and would have a hard time finding out. There is of course the concern of a malicious node, possibly hosting content which would ideally be inaccessible.
This is where the fatefully named Kiwitrix and the IPFS gateway nodes come into play. Any IPFS resource is identified by a content identifier (CID) derived from a hash of its content. In effect, the CID is the content: if even one byte changes, the CID changes. This provides a tamper-evident fingerprint of any given dataset. For moderation, this is powerful because at gateway nodes you can blacklist specific CIDs that represent illegal, malicious, or inappropriate archives. You can publish whitelists of verified, reputable ZIM archives. Communities can build reputation registries mapping CIDs to trust scores or moderation categories. So, instead of deleting data, IPFS moderation often means maintaining cryptographically verifiable blacklists and trust registries that nodes, gateways, or frontends can voluntarily follow.
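The gateway-side check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `fake_cid` and `gateway_decision` are hypothetical names, the lists are plain in-memory sets, and a SHA-256 hex digest stands in for a real CID (actual CIDs wrap the hash in a multihash encoding).

```python
import hashlib
from typing import Optional, Set


def fake_cid(data: bytes) -> str:
    """Stand-in for a CID: a plain SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


def gateway_decision(cid: str, denylist: Set[str],
                     allowlist: Optional[Set[str]] = None) -> str:
    """Decide what a gateway does with a request for this identifier.

    The denylist always wins. If an allowlist is supplied, anything
    not on it is reported as unverified instead of being served.
    """
    if cid in denylist:
        return "blocked"
    if allowlist is not None and cid not in allowlist:
        return "unverified"
    return "served"


good = fake_cid(b"archived snapshot")
tampered = fake_cid(b"archived snapshot!")  # one extra byte

denylist = {tampered}
allowlist = {good}

print(good != tampered)                                 # True
print(gateway_decision(good, denylist, allowlist))      # served
print(gateway_decision(tampered, denylist, allowlist))  # blocked
```

Because the identifier is derived from the bytes themselves, a listed archive cannot be swapped out under the same identifier; a modified copy simply shows up as a different, unlisted entry. Each gateway decides which lists it follows, so the same content can be blocked at one gateway and served at another.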
The frontend in this case would be a Kiwitrix server. Kiwitrix comes as both server and client: the server hosts the various backup ZIMs, and the client is what the end user connects with to access them. When Kiwitrix imports a ZIM file, it also reads and stores metadata. This metadata can be used to detect and flag anomalies (e.g., fake Wikipedia mirrors or files with no provenance), to categorize content based on educational, cultural, or sensitive material tags, and finally to filter content visibility, only loading ZIMs signed by known public keys. Kiwitrix can therefore implement a policy-based moderation layer, where the visibility of content depends on cryptographic signatures from trusted curators, metadata validation, or local administrator preferences. This transforms moderation from censorship into contextual access control.
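As a rough sketch of what such a policy layer could look like, here is a small Python example. Everything in it is hypothetical: `ArchiveRecord`, `visibility`, the `signer` field, and the choice of required metadata keys are my own illustration, not an existing API of the frontend. It also assumes the cryptographic signature was already verified elsewhere and only the signer's key fingerprint is passed in.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Assumed provenance fields; adjust to whatever metadata the archives carry.
REQUIRED_METADATA = ("Title", "Creator", "Date")


@dataclass
class ArchiveRecord:
    cid: str
    metadata: dict
    # Fingerprint of the key whose signature was verified upstream (hypothetical).
    signer: Optional[str] = None
    tags: Set[str] = field(default_factory=set)


def visibility(record: ArchiveRecord, trusted_signers: Set[str],
               hidden_tags: Set[str]) -> str:
    """Apply the three checks in order: signer, metadata, local preferences."""
    if record.signer not in trusted_signers:
        return "hidden"   # unsigned, or signed by an unknown key
    missing = [key for key in REQUIRED_METADATA if not record.metadata.get(key)]
    if missing:
        return "flagged"  # trusted signer, but provenance is incomplete
    if record.tags & hidden_tags:
        return "hidden"   # local administrator preference
    return "visible"


trusted = {"curator-key-1"}
record = ArchiveRecord(
    cid="example-cid",
    metadata={"Title": "Example", "Creator": "someone", "Date": "2024-01-01"},
    signer="curator-key-1",
    tags={"educational"},
)

print(visibility(record, trusted, hidden_tags=set()))            # visible
print(visibility(record, trusted, hidden_tags={"educational"}))  # hidden
```

The point of the ordering is that nothing is deleted: an archive that fails a check stays in the pool and just isn't shown by a frontend following that policy, and a different administrator can pass different `trusted_signers` and `hidden_tags`.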
Any technically minded reader will see how many details are lacking in this explanation, but will hopefully appreciate the potential such a system has, particularly in combination with what I firmly believe is the vast untapped potential of website users united under the common cause of archiving everything, even against pressure from ontologically evil people, to borrow a term used by someone who is one. I have further details about my potential implementation of this system which I think are better shared with the people who would implement it for the goals it should fulfill, but I’m open to questions and comments, as I am again no expert on how a system like this could be used or implemented, only having some of the technical knowledge needed to deploy it.