
Oniongroove Design and Specification v0.0.1

!!! warning "Under development"

    The current Oniongroove codebase does not fully implement these specifications.
    They are also likely to change, so for now they are used mainly as a reference,
    and not as something set in stone.

groove, n:

  1. A long, narrow channel or depression [...] to provide a location for an engineering component

  2. A fixed routine.

[...]

  1. (music) A pronounced, enjoyable rhythm.

Goals

  1. Give balance, rhythm and smoothness in configuring, monitoring and maintaining Tor Onion Services.

  2. Increase Onion Service adoption by providing a tool that abstracts most of the details involved in Onion Services management.

The problem

Setting up Onion Service sites with load balancing, Denial of Service (DoS) defenses, guard protection and other best practices involves many moving parts, from installing different pieces of software to proceeding through many incremental steps, making it difficult to go beyond creating a simple .onion address.

Proposed solution

The solution proposed in this specification tackles the problem using the Infrastructure as Code (IaC) paradigm to manage high availability Onion Service sites by:

  1. Offering a simplified interface, configuration format and convention.
  2. Providing a deployment tool that takes care of most of the setup.

Addressing the problem helps to make Onion Services a common website feature rather than a special case, similar to what happened over the years with HTTPS adoption.

It's important to note that Oniongroove is not a caching/mirroring solution, but an Onion Services proxy layer. It requires active endpoint(s) to connect to.

The rest of this document details the design choices, architecture and implementation of Oniongroove.

Use cases

Fit

Oniongroove fits the following use cases:

  • Onionizing existing websites (as EOTK does).
  • Hosting a single Onion Service website with scalability (an Oniongroove provider with a unique site).
  • Hosting many Onion Services websites.
  • Hosting anonymous websites (i.e., those not served via regular HTTP/HTTPS connections outside the Tor network).
  • Hosting websites with or without HTTPS.
  • A command line tool (CLI) for Onion Service management supporting teamwork.
  • A library for Onion Service management, available to be integrated into deployment systems.
  • A service API, enabling systems such as "click-to-deploy" dashboards to offer Onion Service deployment as an additional feature when creating websites.

Don't fit

Oniongroove does not fit the following use cases:

  • Hosting basic Onion Services websites (like Tor and Apache in a shared hosting environment).
  • Adding the Onion-Location header into existing sites (that's the job of the existing site infrastructure that serves content outside the Tor network).

May fit

Oniongroove MAY fit the following use cases in the future:

  • Hosting Onion Services that do not rely on HTTP (Onion Services that aren't websites).

Preliminaries

Assumptions

  • The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14.

  • This specification takes the meaning of terms like Operator, User and Client from the Tor Rendezvous Specification - Version 3.

  • Familiarity with Tor Onion Services, Onionbalance and the Vanguards Add-On is assumed.

  • The terms "suite" and "software suite" both refers implicitly to Oniongroove.

  • Every specification deals with the difficulty of defining all the relevant details of a would-be universe where instances of a system exist, provided that the specification is carefully followed. The question remains not only of what should be added but also of what should be removed, until only the necessary (while never sufficient) is there. This specification is no exception.

  • At the same time, it's not the intention of this document to be "written in stone" or to serve as a perfect model for future implementations, as happens in waterfall-like models. It's mainly intended to aid development.

This is a living specification/design: it is not meant to be finished, but continuously updated as needed, in the spirit of (and paraphrasing) the Tor Protocol Specification:

Note: This document aims to specify Oniongroove as currently implemented, though it may take it a little time to become fully up to date. Future versions of Oniongroove may implement improved protocols, and compatibility is not guaranteed. We may or may not remove compatibility notes for other obsolete versions of Oniongroove as they become obsolete.

Guiding questions

The basic questions that oriented the overall Oniongroove planning are split into three dimensions:

  1. Architecture: what are the possible architectures to build a balanced Onion Service? Knowing that there's no "one size fits all" choice, which service topology could work best for most use cases?

  2. Implementation: what are the available stacks/tools to implement those architectures? Ideally, the suite should not be tightly coupled to or heavily dependent on any specific implementation, leaving room for refactoring and internals replacement while keeping the same user experience and configuration format.

  3. Support: what are the most common hosting environments and use cases that should be supported?

Overview

The Oniongroove design focuses on some useful properties/features for Onion Services security and high availability:

  1. Reusability: a single Oniongroove instance MAY host more than a single .onion site. The total number of hosted .onion sites SHOULD NOT be limited by any software constraint but only by the available computing resources. Such an Oniongroove instance is called a provider. Teams of Operators MAY manage multiple providers.
graph LR
  subgraph Operator Computer
    Oniongroove[Oniongroove tool]

    R1[Provider Repository 1]
    R2[Provider Repository 2]
    Rn[Provider Repository n]

    Oniongroove -- Manages --> R1
    Oniongroove -- Manages --> R2
    Oniongroove -- Manages --> Rn

    S11[Onion Site 1.1]
    S12[Onion Site 1.2]
    S1n[Onion Site 1.n]

    R1 -- Defines --> S11
    R1 -- Defines --> S12
    R1 -- Defines --> S1n

    S21[Onion Site 2.1]
    S22[Onion Site 2.2]
    S2n[Onion Site 2.n]

    R2 -- Defines --> S21
    R2 -- Defines --> S22
    R2 -- Defines --> S2n

    S31[Onion Site 3.1]
    S32[Onion Site 3.2]
    S3n[Onion Site 3.n]

    Rn -- Defines --> S31
    Rn -- Defines --> S32
    Rn -- Defines --> S3n

  end
  2. Key generation: all .onion keys MUST be generated locally -- i.e., at the Operator's box prior to deployment -- encrypted and optionally pushed to a private repository. This allows other members of the Operators team to clone that internal repository and start managing sites and nodes. This ensures fast disaster recovery and also avoids the need for a backup system just for onion balancing.

  3. Disposability: depending on design choices and implementation availability in Onionbalance, the frontend .onion address MAY be the only persistent data, and everything else MAY be disposable/recycled/recreated in case of failure or major infrastructure/design refactoring.

That of course depends on whether Onionbalance supports backend rotation. But even if backend .onion keys also need to be backed up, the system could be regenerated using only the combination of the software suite, the keys and the custom configuration.

graph LR
  subgraph Operator Computer
    Conf[Configuration]
    Keys

    OC[Oniongroove configurator] -- Generates --> Conf
    OC -- Generates --> Keys
    OP[Oniongroove provisioner]
  end

  Conf --> OP -- Applies into --> P[Remote machines]
  Keys --> OP
  P -- Produces --> CDN[Oniongroove CDN]
  4. Elasticity: the disposability property leads to the OPTIONAL feature in which backend nodes can be added and removed at will, making it compatible from the start with any elastic capability to be specified in the future -- like adding and removing nodes according to backend overload.

This depends on whether the Onionbalance instance needs to be restarted (resulting in unwanted downtime) every time the set of backend nodes changes -- either by adding or removing nodes.

This should also be done in sync with the timeframe of Onion Services' descriptor updates, so Clients don't end up trying to access backends that are no longer online.

At the time this spec was written, the maximum number of Onionbalance backends was reported to be 8, so an initial implementation MAY use a fixed maximum of 8 backends per .onion site, but OPTIONAL experiments can be made to test whether a site could have a dynamic number of backends.

  5. Uniformity with flexibility: Oniongroove works with the assumption that all sites can have the same "CDN" fronting setup, while their "last mile"/endpoints might be all different. That said, the "first half" of the solution MUST be based on the same software suite and workflow, which MUST be flexible enough to accept distinct "last mile" (endpoint) configurations.

  6. Migration support: built-in support to migrate existing Onion Services into the CDN instance, by just:

    • Importing their frontend and backend keys along with configuration.
    • Deploying configuration and keys to fresh nodes using Oniongroove.
    • Turning off the old system.
  7. Testing: this suite MAY include test suites using either Chutney, the Shadow Simulator (GitLab CI) or custom procedures.

  8. Limits: while Oniongroove aims to scale and be as secure as possible, some limits might exist in the current technology. Some can be worked around, but others require improvements in the underlying stack, like the following:

    • The Tor daemon is single-process, with limited threading support. How does it scale under load for Onion Services, and with a varying number of Onion Services?

    • The reported maximum number of Onionbalance backends.

    • Despite the support for offline keys in the Onion Services v3 specification (section 1.7), the Tor daemon currently does not support this feature, requiring a high level of operational security to protect Onion Services' keys.

    • Other limits important to be considered in the scope of this project.
  9. No lock-in: operators aren't required to use the Oniongroove suite in order to implement its functionality. They can instead selectively pick some subsystems and plug them into their existing deployment workflow, or just follow the conventions.

  10. Requirements: Oniongroove CDN requirements MUST be low. The system can run in parallel alongside existing websites available via the "regular/vanilla" internet, and can be hosted in a different infrastructure if needed. Minor modifications to existing systems are needed only for OPTIONAL security and circumvention features like HTTPS for .onion addresses and the Onion-Location header for service discovery.

Architecture

Overall functioning

Inspired by rdsys, Oniongroove aspires to implement the design philosophy of The Clean Architecture.

The topology is different from "classic" content delivery networks: an Onionbalance frontend acts as a publisher for backend servers in Tor's Onion Services hashring (DHT). A single .onion address allows access to any backend server.

Backends MUST be configured to host Onion Services with the following topology:

  • A single horizontal layer (or "wall") of backend instances (in different nodes/VMs/containers) acts as Onionbalance backends for many sites, given that a single Tor daemon can host many services (see the torrc sketch after this list).

  • There's no limit on the number of backend instances, but there is no guarantee that all instances will be used by every frontend instance, due to the size limit of each Onion Service descriptor.

  • Operators wishing to offer a single Onion Service in an entire horizontal layer of backend nodes MUST set up different Oniongroove providers, one for each site.
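
As a non-normative illustration of a single backend Tor daemon hosting several sites, the torrc fragment could look roughly like the sketch below. Paths, names and ports are hypothetical, and the Onionbalance-specific per-service options are shown further down in this document:

# Hypothetical backend torrc fragment: one HiddenServiceDir per hosted site,
# each mapped to its proxy balancer over a local UNIX socket.
HiddenServiceDir /var/lib/tor/oniongroove/site1
HiddenServicePort 80 unix:/var/run/oniongroove/site1.sock

HiddenServiceDir /var/lib/tor/oniongroove/site2
HiddenServicePort 80 unix:/var/run/oniongroove/site2.sock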

Frontend architecture MUST be the following:

  • A single frontend instance can manage many frontend Onion Services, each one with multiple associated backends.

  • Additional frontend nodes MAY act as failover, being activated only if the main one fails.

  • Operators wishing to run more than a single frontend instance at the same time, each one managing a different set of sites, MUST set up different Oniongroove providers. Frontend instances MUST be considered only as publishing nodes for superdescriptors, so currently there is no actual need to scale these nodes, only to have failover capabilities. This might be an issue in the future if providers contain a huge number of Onion Services to be managed (many superdescriptors to handle), but in that case those "superproviders" may be acting as a single point of failure. This might also be solved by patching Onionbalance to allow for multiprocessing. A configuration sketch for a multi-site frontend follows below.
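
For reference, a single frontend instance managing several sites is expressed in Onionbalance as one service entry per site. The sketch below follows the general shape of an Onionbalance v3 config.yaml, with hypothetical key paths and backend addresses; consult the Onionbalance documentation for the authoritative schema:

# Hypothetical Onionbalance config.yaml for a frontend managing two sites.
services:
  - key: /etc/onionbalance/site1_frontend.key   # frontend key for site 1
    instances:
      - address: site1backend1address.onion
      - address: site1backend2address.onion
  - key: /etc/onionbalance/site2_frontend.key   # frontend key for site 2
    instances:
      - address: site2backend1address.onion
      - address: site2backend2address.onion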

The connection between the Onionbalance backends and the Onion Site endpoints MAY be set up in a number of different ways:

  • Running the Onionbalance backend in the same host or network as the application endpoint.

  • Using VPN (like Wireguard) links.

  • Setting up OnionCat links.

  • HTTPS link, with certificate verification and OPTIONAL authentication.

  • A collection of .onion addresses (perhaps each one even set up using Oniongroove), but that would probably defeat the purpose of having an .onion fronting in the first place.

Topology

The proposed solution is composed of the services shown in the following diagram:

graph TD

    subgraph "Oniongroove CDN"

        B1[Tor backend 1] --> L1[Proxy balancer 1]
        B2[Tor backend 2] --> L2[Proxy balancer 2]
        B3[Tor backend 3] --> L3[Proxy balancer 3]
        Bn[Tor backend n] --> Ln[Proxy balancer n]

    end

    subgraph "Application"

        L1 --> E1
        L1 --> E2
        L1 --> E3
        L1 --> En

        L2 --> E1
        L2 --> E2
        L2 --> E3
        L2 --> En

        L3 --> E1
        L3 --> E2
        L3 --> E3
        L3 --> En

        Ln --> E1
        Ln --> E2
        Ln --> E3
        Ln --> En

        E1[Endpoint 1]
        E2[Endpoint 2]
        E3[Endpoint 3]
        En[Endpoint n]

    end

    subgraph "Oniongroove fronting"

        O1[Onionbalance 1] --> F1[Tor Frontend 1] --> H1[Tor HSDir]

    end

Please note that the graph above is simplified:

  1. In fact, each .onion site can have a different number of endpoints, even a single one.
  2. Metrics and monitoring subsystems are not displayed.

Types of services

Main services:

  • endpoints: each endpoint service can be an entry point for upstream sites available via HTTP (available only in the case of a local connection) or HTTPS; a layer of endpoints gives redundancy for up to N - 1 failing endpoints.
  • balancers: each proxy balancer service runs an NGINX/OpenResty instance configured to serve all websites from all endpoints; each website also runs on its own port, and a reverse proxy is set up with all endpoints as upstreams; this also gives additional redundancy; port-number-to-server-name conversion happens at this stage.
  • backends: each backend service runs a Tor instance running an Onion Service for each site; for redundancy, the Onion Service for each site is linked to an NGINX instance which is linked to all application endpoints.
  • frontends: runs a Tor instance acting as the Onion Service frontend for each Onion Site; right now just a single frontend is supported, so redundancy at this level should be implemented with a high availability configuration where a spare service with the same Tor data starts only if the current frontend becomes unresponsive.
  • onionbalance: runs an instance of Onionbalance; like the frontend service, only a single service is supported and it needs an additional high availability setup; it controls the frontend service for all the sites.

Additional services:

  • monitor: monitoring nodes.
  • tor: nodes with Tor daemons used by the monitoring nodes.

Service grouping

  • Each backend and its proxy balancer counterpart (say backend1 and balancer1) MAY be grouped together in the same worker machine.

  • Each frontend and its onionbalance counterpart (say frontend1 and onionbalance1) MAY also be grouped in a single machine.

  • Each worker MAY stay in a separate machine/location.

  • Each endpoint MAY stay in a separate machine/location. Isolation between the Oniongroove CDN and the endpoints is RECOMMENDED as a best practice.

  • Each monitor node could be grouped with its tor counterpart (say monitor1 and tor1).

Entity-relationship model

Simply put, this is the ER model for Oniongroove:

  • The "suite" can manage many
    • "providers",
    • which host many
      • "onion services" sites, each one split in
      • a "frontend" Onionbalance service instance
      • "backend" Onion Service instances,
        • which connects to one or more:
        • endpoints / upstreams
      • a "monitor" instance (also gathers stats)
      • "probes"
    • which are split in
      • frontend node(s)
      • backend nodes
      • proxy balancers (to be plugged in the endpoint)
      • monitor nodes
      • probe nodes

Which in an ER diagram looks like this:

erDiagram
  Oniongroove ||--o{ Provider  : manages
  Provider ||--o{ OnionSite    : hosts
  Provider ||--o{ FrontendNode : "split in"
  Provider ||--o{ BackendNode  : "split in"
  Provider ||--o{ BalancerNode : "split in"
  Provider ||--o{ MonitorNode  : "split in"
  Provider ||--o{ ProbeNode    : "split in"

  OnionSite ||--o| FrontendInstance : "split in"
  OnionSite ||--o{ BackendInstance  : "split in"
  OnionSite ||--o| MonitorInstance  : "split in"
  OnionSite ||--o{ ProbeInstance    : "split in"

  BackendInstance ||--o{ Upstream : "connects to"

The cardinality between OnionSite and FrontendInstance and MonitorInstance depends on whether the implementation has failover between frontend and monitoring instances.

Implementation

Overview

Implementations MUST consider a two-tier approach, where sysadmins keep a copy of all .onion keys (and optionally TLS certs) so the entire Onion Service operation can be redeployed if needed (e.g. disaster recovery, switching providers, or even cases where the deployment procedure/stack changes and it is decided that a redeployment is less costly than keeping an existing installation).

  1. Vendor-specific tier: module(s) that bootstrap the basic environment (like a system with a shell; could be based on Terraform), taking care of vendor-specific logic depending on the hosting platform chosen.

  2. Vendor-neutral tier: a module that bootstraps the Onionbalance frontend(s), backends, proxy balancers etc. (could use Ansible for that, or even Docker). This step should be vendor-neutral.

Oniongroove MUST operate as a translator between an implementation-neutral configuration format and the specific configuration of any module/recipe such as Ansible cookbooks. In this sense, Oniongroove behaves basically as a configuration generator and dispatcher, abstracting details and presenting a clean configuration structure.
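
As a purely hypothetical example of that translation, the node definitions under $PROVIDER_PATH/nodes/ could be compiled into a provisioner-specific artifact such as an Ansible inventory stored under $PROVIDER_PATH/provisioners/ansible/; the layout below is illustrative only and not part of this specification:

# Hypothetical generated inventory.yaml for an Ansible-based provisioner,
# compiled from the nodes/<type>/<node-name>.yaml definitions.
all:
  children:
    backends:
      hosts:
        backend1:
          ansible_host: node-name.example.org
    frontends:
      hosts:
        frontend1:
          ansible_host: frontend-node.example.org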

This means that it MUST be possible to (re)bootstrap a whole provider by using just:

  • The relevant Onion Service keys.
  • The configuration.
  • The software solution.
  • A bunch of nodes (like VPSes).

A system with such capacity would then behave like an "Onion As a Service":

graph LR
  OC[Provider configuration]
  Keys[Provider keys]

  subgraph "Oniongroove provisioner"
    Keys -- Compilation --> VNC
    OC -- Compilation --> VNC[Provisioning configuration] -- Provisioning --> AP[Applied configuration]
  end

Not sticking with a specific tool allows Oniongroove to be sustainable in the long term if the underlying deployment technology changes or if the community prefers to change implementations.

Modularization

Such a system MUST NOT be composed of a single, monolithic codebase, but instead of many smaller modules that can be used as standalone applications or reused in other implementations.

Deployment

Deployments MUST happen according to this workflow:

  • Every site created MUST trigger a keypair generation for the frontend and all backends as well as an Onionbalance config update. Since all key generation happens locally during "compile time", there's no need for a back-and-forth between the frontend and the backends in order to discover all backend addresses/pubkeys for all sites.

  • Every backend node creation MUST trigger a backend keypair generation for all sites as well as an Onionbalance config update. This step also happens during compilation time.

  • Ideally, Onionbalance SHOULD pick backend descriptors in a randomized order to ensure an even distribution of backend node usage. This depends on Onionbalance behavior.

  • Every site has a set of endpoints (minimum of 1) which are compiled into the configuration of each balancer instance -- in the case of NGINX, the upstream block.

This ensures an even distribution across the upstream connections. If the endpoint is already a CDN, then the endpoint set is composed of a single item.

  • Redeployment MAY be not only needed but also planned, with minimized downtimes and with keys wiped from the old, decommissioned systems.

Key management

When designing this solution, two choices for Onion Services key handling and certificates were considered:

  1. Locally generated at the Operator's computer and then deployed remotely along with the configuration OR

  2. Generated directly in the machines hosting the Onion Service.

In the end, local generation was chosen, as this approach has some advantages:

  • Easier to back up the keys (using GnuPG, optionally with a password/secret sharing application, with multiple recipients and with digital signature support).

  • Easy to redeploy the whole frontend/backend infrastructure if needed.

The Onionbalance frontend address MUST be considered long term, and hence its keys MUST be stored in encrypted backups with some redundancy (both physical and human).

Keys MUST NOT be stored in plaintext in the provider repository.
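
For illustration only, deriving a v3 .onion address from a locally generated ed25519 keypair follows the Tor Rendezvous Specification - Version 3. The Python sketch below (assuming the pyca/cryptography library) shows just the address derivation; writing Tor's on-disk key format and the encryption step are left out and would still be needed by an implementation:

# Sketch: local generation of a v3 onion keypair and address derivation.
import base64
import hashlib

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.Raw,
    format=serialization.PublicFormat.Raw,
)

# rend-spec-v3: onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion",
# where CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2] and H is SHA3-256.
version = b"\x03"
checksum = hashlib.sha3_256(b".onion checksum" + public_key + version).digest()[:2]
onion_address = base64.b32encode(public_key + checksum + version).decode().lower() + ".onion"

print(onion_address)
# The private key would then be serialized to Tor's key format, encrypted
# (e.g. with GnuPG) and stored under $PROVIDER_PATH/keyring/.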

Security considerations

Running Onion Services is more sensitive than running Tor relays (not to say exit nodes), so not only basic/general but also additional security checks and measures are needed, including but not limited to the following:

  • SHOULD use UNIX sockets whenever possible: they're faster and more secure, as no TCP is involved. When in use, they MUST be used properly (with correct permissions and ownership, for example).

  • MUST use Tor packages from the official Tor repositories instead of the operating system's distribution repositories (e.g. Debian); see the example after this list.

  • The application server (be it a webserver or anything else) MUST NOT leak identifiable information such as hostnames or IP addresses (which can be achieved by disabling ServerTokens and mod_status on Apache).

  • SHOULD follow other recommended and best practices from the Tor and Riseup documentation.
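
For the packaging requirement above, the Debian-family setup documented by the Tor Project looks roughly like the sources.list entry below; check the official Tor documentation for the current keyring path and supported distribution names:

# Example APT source for official Tor packages (replace <distribution>).
deb     [signed-by=/usr/share/keyrings/deb.torproject.org-keyring.gpg] https://deb.torproject.org/torproject.org <distribution> main
deb-src [signed-by=/usr/share/keyrings/deb.torproject.org-keyring.gpg] https://deb.torproject.org/torproject.org <distribution> main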

Defenses and SPOFs (single points of failure) avoidance SHOULD also be considered:

  • Setting The Vanguards Onion Service Addon on each Tor daemon. For reference, check:
  • Consider adopting vanguard's security suggestions for onionbalance.
  • How to OnionBalance - Vanguards security documentation.

  • Encrypted offline backups of Onion Service keys: at least of the Onionbalance frontend service keys. If backend keys cannot be disposable, they SHOULD also be backed up.

  • Using automated and deterministic admin tasks prevents the SPOF represented by the absence of the Onion Site Operator, as automation allows more people to take care of the maintenance. While this requirement cannot be fulfilled by Oniongroove itself, its implementation can lower the entrance barrier for new Operators.

For mission-critical applications, additional precautions SHOULD also be implemented whenever possible:

  • System checks: disk encryption, firewall and other protections.

  • More careful service provider selection, or even better: run Onion Services on premises. While this is out of Oniongroove scope, it's something that needs to be documented and explained to Operators.

  • Regular audits.

Scalability

Performance measures and anti-DoS mitigations to be taken into account:

  • MUST use Onionbalance for Onion Service scalability:
  • The number of backends MAY be far greater than those included in each superdescriptor. This means that the number of introduction points available $I_a$ for each onion site MAY be much bigger than the number of introduction points included in the superdescriptor $I_s$ for that same onion site, i.e., $I_a \gg I_s$.

  • The task of introduction point selection remains with Onionbalance, which MAY be patched to select introduction points randomly each time it publishes a superdescriptor, providing automatic backend failover selection and an additional balancing mechanism.

  • As a hypothetical example, the number of backend nodes -- each one publishing one backend onion service per site in the provider -- could yield $I_a = 200$ per site, whereas each frontend Onion Service could publish up to $I_s = 20$.

  • It's important to note that having a huge number of published backend onion services with established introduction point connections MAY imply some overhead for the Tor network; although this is believed to be low, it MAY be the subject of further research.

  • SHOULD apply relevant anti-DoS measures.

  • For public-facing services (with known IP addresses), MUST offer single onion service mode, using HiddenServiceSingleHopMode to enhance performance (at the cost of losing server-side anonymity); see the torrc sketch at the end of this section. In this case, using the Vanguards Addon is OPTIONAL, but note that disabling it is a provider-wide configuration, so it is RECOMMENDED that public-facing and pure-.onion services not be mixed in the same provider.

  • Basic CDN functionalities such as temporary caching to avoid excessive access to the endpoints SHOULD be considered, without converting the suite into a caching/mirroring application, relying on caching only to alleviate load on the endpoints.

Oniongroove SHOULD mainly be a load balanced proxy solution. Temporary caching SHOULD be available for optimization, which is an interesting feature for public sites. But some applications MAY require that caching is always off, to prevent upstream content from being stored in caches.
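
For the single onion service case mentioned above, the relevant torrc options are sketched below; Tor requires the non-anonymous mode flag together with the single-hop flag, and this mode cannot be combined with client functionality:

# Hypothetical torrc fragment for a public-facing (non-anonymous) provider.
HiddenServiceNonAnonymousMode 1
HiddenServiceSingleHopMode 1
SOCKSPort 0   # client functionality must be disabled in this mode

HiddenServiceDir /var/lib/tor/oniongroove/site1
HiddenServicePort 80 unix:/var/run/oniongroove/site1.sock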

Proxy balancers

These are the requisites for the proxy balancer instances:

  • Rewriting proxy capabilities: the proxy balancer instances MUST be equipped with URL rewriting to ensure that links are changed to their .onion counterparts (see the sketch after this list).

  • Blocking capabilities: proxy balancer instances MUST also provide support for URL restrictions, in a way that Operators can block access to login, admin and other pages from .onion access.

  • REQUIRED TLS certificate verification in the upstream connection with OPTIONAL custom warning page in case of an invalid certificate.

  • OPTIONAL upstream connection via Tor circuit, to prevent detection of the Oniongroove CDN location.

  • OPTIONAL page in case of any reverse proxy connection error.

  • SHOULD support Onion Service sites with subdomains in the URL. Implementations MUST provide a "catch-all" rule to match all subdomains, regardless of whether a string matches a configured subdomain: this ensures that a sequence like $random_string.$onion_address.onion does not match a Virtual Host definition. In the case of a non-configured subdomain, an error page/message MUST be presented.
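
A minimal, illustrative NGINX sketch of such a balancer virtual host is shown below. Names, ports and rules are hypothetical, the rewriting shown is far simpler than what a real deployment (or EOTK) needs, and sub_filter only rewrites uncompressed upstream responses:

# Hypothetical proxy balancer configuration for one site.
upstream site1_endpoints {
    server endpoint1.example.org:443;
    server endpoint2.example.org:443;
}

server {
    # One local port (or UNIX socket) per site; the backend Tor instance
    # maps the .onion virtual port here.
    listen 127.0.0.1:10080;

    # Example of blocking a sensitive path from .onion access.
    location /wp-admin/ { return 403; }

    location / {
        proxy_pass https://site1_endpoints;

        # REQUIRED upstream TLS certificate verification.
        proxy_ssl_verify on;
        proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
        proxy_ssl_server_name on;

        proxy_set_header Host example.org;
        proxy_set_header Accept-Encoding "";  # so sub_filter can rewrite bodies

        # Naive link rewriting to the .onion counterpart.
        sub_filter_once off;
        sub_filter 'example.org' 'frontendaddress.onion';
    }
}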

Folder scheme

Folder scheme relies on the XDG Base Directory Specification v0.8:

  • $XDG_CONFIG_HOME/oniongroove: local configurations for oniongroove, i.e., not shared between operators. Subfolders:

    • $XDG_CONFIG_HOME/oniongroove/providers/<provider-name>: operator configuration for a given provider. Per-persona settings, not meant to be shared between operators, so they are left outside the provider git repository:
      • $XDG_CONFIG_HOME/oniongroove/providers/<provider-name>/config.yaml: main configuration file for per-persona and provider specific settings.
  • $PROVIDER_PATH: arbitrary path specified by the operator when setting up (or cloning) a new provider repository. Used to store the local working copy of the provider's Git repository. Subfolders and files:

    • $PROVIDER_PATH/oniongroove.yaml: main config file.
    • $PROVIDER_PATH/sites/<site-name>.yaml: configuration for a single site name.
    • $PROVIDER_PATH/nodes/<type>/<node-name>.yaml: configuration for a single node from a given type.
    • $PROVIDER_PATH/provisioners/<implementation>: generated configuration for a given provisioning technology.
    • $PROVIDER_PATH/keyring/: encrypted keystore.

Configuration format

Oniongroove MUST use a semantic versioned configuration in the YAML format for defining all entity instances in the model.

Structure for the $XDG_CONFIG_HOME/oniongroove/providers/<provider-name>/config.yaml config file:

---
operator_name        : 'Some Name'                                # Operator persona name (nickname)
operator_email       : 'someone@example.org'                      # Operator persona email
openpgp_fingerprint  : 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' # OpenPGP key fingerprint

provider_path: "~/some/path/to/oniongroove/provider" # Provider path in the local machine

Structure for the $PROVIDER_PATH/oniongroove.yaml config file:

---
version : '0.0.1'   # Config version
provider: 'Example' # Provider name

Structure for the $PROVIDER_PATH/sites/<site-name>.yaml config file:

---
endpoints: # List of endpoints for the "top-level" .onion domain
  - 'endpoint1.example.org'
  - 'endpoint2.example.org'
  - 'endpoint3.example.org'
  - 'endpoint4.example.org'

subdomains:
  www:
    endpoints: # List of endpoints for that subdomain
      - 'endpoint1.example.org'
      - 'endpoint2.example.org'
      - 'endpoint3.example.org'
      - 'endpoint4.example.org'

  someservice:
    endpoints: # List of endpoints for that subdomain
      - 'endpoint5.example.org'
      - 'endpoint6.example.org'
      - 'endpoint7.example.org'
      - 'endpoint8.example.org'

Structure for the $PROVIDER_PATH/nodes/<type>/<node-name>.yaml config file:

---
type: 'backend' # Node type
fqdn: 'node-name.example.org' # FQDN for SSH access. Can be an onion service.

UX

Example command line invocations and behaviors.

Provider config CRUD

Creating a provider:

oniongroove create provider <provider-name>   \
    [--path <local-folder-path>]              \
    [--repository <remote-repository-url>]    \
    [--gnupg-homedir <path-to-gnupg-homedir>] \
    [--operator-name <operator-persona-name>] \
    [--operator-email <operator-email-addrs>] \
    [--openpgp <openpgp-fingerprint>]

-> Check if provider already exists
-> Check if a local operator config for that provider exists
  -> Check if OpenPGP keypair is available
    -> Create OpenPGP keypair if needed
-> Setup provider folder
  -> Clone provider git repository if that was specified
    -> (Optionally) check latest commit OpenPGP signature
  -> Create provider git repository from template and using operator name/email
  -> (Optionally) Setup OpenPGP repository signature

Reading a provider:

oniongroove info provider <provider-name>

-> Prints config, nodes etc

Updating a provider:

  • Should be done by editing the configuration files and re-provisioning.

Removing a provider:

oniongroove remove provider <provider-name> [-y] [-wipe]

-> Ask for confirmation if needed
-> Removes or wipes $XDG_CONFIG_HOME/oniongroove/providers/<provider-name>
-> Removes or wipes $PROVIDER_PATH
-> Informs Operator that all remote nodes should be removed manually

Node config CRUD

Creating a node:

oniongroove create node <provider-name> <node-name> \
  [--type <node-type>]

-> Check if node already exists
-> Create config file with (default?) node type
-> If backend, generate .onion keys for this backend for each defined site
-> If frontend, generate main .onion keys for each defined site
-> Encrypt keys for storage

Reading a node:

oniongroove info node <provider-name> <node-name>

Updating a node:

  • Should be done by editing the configuration files and re-provisioning.

Removing a node:

oniongroove remove node <provider-name> <node-name> [-y]

Onion Services config CRUD

Creating an onion site:

oniongroove create site <provider-name> <site-name>

Reading a site:

oniongroove info site <provider-name> <site-name>

Updating a site:

  • Should be done by editing the configuration files and re-provisioning.

Removing a site:

oniongroove remove site <provider-name> <site-name>

Applying the configuration

Deployment is done with:

oniongroove deploy <provider-name>

-> Compiles configuration
-> Provision each node

Roadmap

It's RECOMMENDED that implementations start with simple, usable modules, planned as follows:

  • Milestone #0: Prototype: a working CDN running vendor-neutral modules, including:
    • Key generation module.
    • Provider configuration management module.
  • Milestone #1: MVP: statistics modules.
  • Milestone #2: Monitoring modules.
  • Milestone #3: Traffic control / protections modules.
  • Milestone #4: Redundancy at the frontend, stats and monitoring nodes.
  • Milestone #5: Vendor-specific module tiers.

Relevant issues

Some of the relevant Onion Services issues for the Oniongroove project:

Inspirations

Open topics and questions

Meta

  • This document is very big! Perhaps the design discussion should be separated from the specifications (a more synthetic document).

Existing solutions

  • EOTK: why not just use/extend it?

    • Need for an idempotent Infrastructure as Code (IaC) implementation ensuring that systems are in a defined state.

    • Different use cases: EOTK is focused mainly on public-facing sites, while a more general tool might be useful.

    • Onionbalance support.

    • Key management with encrypted backups.

Deployment

  • OPTIONAL support for node discovery using an external service/source/API.

Onionbalance

  • See these questions about current Onionbalance limits.

  • Backend rotation:

    • Does Onionbalance really need persistent backend addresses? From what has been investigated so far, it seems that what matters is that:

      • The backend addresses are currently published.

      • Every backend has both the HiddenServiceOnionBalanceInstance and MasterOnionAddress configuration options set (see the sketch after this list).

    • When rotating backends, it's important to not just turn off the older backends, but to keep them online until all copies of the older superdescriptor have expired or been replaced with the updated list of introduction points.

    • But this behavior MUST be tested in practice before considering any backend rotation feature.
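
For reference, the backend-side pairing mentioned above is sketched here: the per-service torrc option plus an ob_config file inside the corresponding HiddenServiceDir pointing to the frontend address. Paths and the address are hypothetical; check the Onionbalance documentation for the authoritative syntax:

# torrc, inside the backend's per-site block:
HiddenServiceDir /var/lib/tor/oniongroove/site1
HiddenServiceOnionBalanceInstance 1
HiddenServicePort 80 unix:/var/run/oniongroove/site1.sock

# /var/lib/tor/oniongroove/site1/ob_config:
MasterOnionAddress frontendaddress.onion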

Key management

Metrics

  • Needs description and diagram about metrics collection and reporting.

  • Can metrics be submitted/collected via .onion services?

Monitoring

  • Needs description and diagram about monitoring and notifications.

  • Don't prioritize a domain for the monitoring dashboard: use an .onion by default, with authentication. Currently Onionbalance does not support authentication, so this service MUST be implemented outside the CDN pool.

  • The nature of Tor networking might make the use of many probes unnecessary in terms of network perspectives (like OONI does), making additional probes only a matter of failover. But in any case, probes could report their status in case they cannot reach the Tor network.

  • Oniongroove SHOULD use Onionprobe or a compatible tool.
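
For ad-hoc checks, an Onionprobe invocation could be as simple as the example below (flags as per the Onionprobe documentation at the time of writing; a continuous setup would instead ship a YAML configuration listing all frontend and backend addresses to be probed):

onionprobe -e http://frontendaddress.onion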

Failover

  • How to do failover for probing/monitoring/stats/notifications?

  • How to do the Onionbalance failover? Config and keys might be easily replicated, but there needs to be a way for backup frontend nodes to take over if others are unavailable.

Security considerations

  • Some of them can be moved to official documentation and be used as a reference for what SHOULD be done.

Scalability

  • HTTPS support MAY be considered depending also on performance considerations:

    • Arguments to avoid HTTPS in the Onion Service related to performance:

      • HTTPS requires additional CPU consumption, especially for the handshake, which could worsen the consequences of a DoS attack (CLIENT HELLO flooding).
      • Since HTTPS is served from an Onion Service, traditional HTTPS DoS mitigations (like fail2ban or firewall rate limiting) would not work.
    • Arguments in favour of HTTPS with a special cipher suite:

      • Hypothesis to be researched/tested: perhaps an Onion Service serving HTTPS with a proper cipher suite might provide both an additional layer of security and some rough kind of "proof of work", by making the client use more CPU than the server; but right now I don't know the state of the art on cipher suites, nor whether there is a choice which provides both properties.
      • But even with these properties, HTTPS over an Onion Service might still be costly in comparison with plain HTTP over an Onion Service.
      • Only with more research and testing can this be decided.
      • So, if performance is a concern, it would be best to start by only using HTTP over the Onion Service and, if HTTPS proves to be useful, introduce it later with the HSTS header and HTTP-to-HTTPS redirection.

Service discovery

Service discovery functionality for public-facing sites is outside of the Oniongroove scope, as it needs to be set up directly in the endpoints' public-facing servers, but it MAY be mentioned as a recommendation to be used together with the CDN deployment (see the sketch after the list below).

Some technologies are available for that:

  • Onion-Location HTTP header in the endpoint.

  • Alt-Svc Header.

  • Sauteed Onions.
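
As an example of the first option, the public-facing endpoint webserver (outside Oniongroove itself) can advertise the onion counterpart with a response header; an NGINX sketch with a hypothetical frontend address:

# On the public-facing endpoint server (not managed by Oniongroove):
add_header Onion-Location http://frontendaddress.onion$request_uri;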

Subdomains for non-HTTP services

Although RFC 7686 - The ".onion" Special-Use Domain Name specifies subdomain support for Onion Services, not all protocols will support it, as there's no subdomain-to-IP-address mapping. Oniongroove support for subdomains may be limited to a few protocols where this information is exchanged, like HTTP.

Metrics

Regarding metric collection, some approaches to collect metrics from the Onion Service need to be considered:

  • Use a proxy middleware that parses HTTP requests and responses, processing the minimum metadata needed for stats on pages hits etc.

  • Read data from the ControlPort, preferably as a UNIX socket, though that needs additional security protections.

  • Use the new MetricsPort config (see the sketch after this list). But does this parameter support UNIX sockets as well?

  • Perhaps the Stem Python library has some functionality to monitor Onion Services.

  • Onionbalance also supports a UNIX status socket.

  • Log parsing. But what useful information on Onion Services is available in the Tor daemon's log?
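
For the MetricsPort approach above, a torrc sketch follows. These options exist in recent Tor versions, the port exposes Prometheus-format metrics, and it is sensitive, so it needs to be kept reachable by a local scraper only (whether it accepts UNIX sockets is the open question above); the port number is arbitrary:

# Hypothetical torrc fragment exposing Tor metrics to a local scraper only.
MetricsPort 127.0.0.1:9035
MetricsPortPolicy accept 127.0.0.1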

Aggregation can be done in a number of ways:

  • Sending them to Prometheus or other system.

  • Setting up a private metrics/monitoring dashboard for the CDN.

  • With temporal and other metrics aggregation to avoid any traffic correlation.

Testing

The whole infrastructure needs to be tested, with notifications in case of incidents like failures or high load:

  • By setting up external probes using OnionScan and other monitoring tools to test the health, quality and security of the Onion Service from the outside. They could probe both the Onionbalance frontend and the backends. They do not need to be positioned in any special place on the internet as long as they can access the Tor network, but preferably in a different network from the Onionbalance CDN.

  • By using internal probes between each Onion Service backend and the application endpoint.

UX

Other subcommands to be considered:

  • Git wrapper.
  • Key regeneration?
  • X.509 keys and certs management?

Dashboard

A possible integration with the Bypass Censorship Dashboard: