Proposals for changes in the Tor protocols

This "book" is a list of proposals that people have made over the years (dating back to 2007) for protocol changes in Tor. Some of these proposals are already implemented or rejected; others are under active discussion.

If you're looking for a specific proposal, you can find it by filename in the summary bar on the left, or at this index. You can also see a list of Tor proposals by their status at [README.md].

For information on creating a new proposal, you would ideally look at [001-process.txt]. That file is a bit out of date, though, and you should probably just contact the developers.

Tor proposals by number

Here we have a set of proposals for changes to the Tor protocol. Some of these proposals are implemented; some are works in progress; and some will never be implemented.

Below is a list of proposals sorted by their proposal number. See README.md for a list of proposals sorted by status.

Tor proposals by status

Here we have a set of proposals for changes to the Tor protocol. Some of these proposals are implemented; some are works in progress; and some will never be implemented.

Below is a list of proposals sorted by status. See BY_INDEX.md for a list of proposals sorted by number.

Active proposals by status

OPEN proposals: under discussion

These are proposals that we think are likely to be complete, and ripe for discussion.

ACCEPTED proposals: slated for implementation

These are the proposals that we agree we'd like to implement. They might or might not have a specific timeframe planned for their implementation.

FINISHED proposals: implemented, specs not merged

These proposals are implemented in some version of Tor; the proposals themselves still need to be merged into the specifications proper.

META proposals: about the proposal process

These proposals describe ongoing policies and changes to the proposals process.

INFORMATIONAL proposals: not actually specifications

These proposals describe a process or project, but aren't actually proposed changes in the Tor specifications.

Preliminary proposals

DRAFT proposals: incomplete works

These proposals have been marked as a draft by their author or the editors, indicating that they aren't yet in a complete form. They're still open for discussion.

NEEDS-REVISION proposals: ideas that we can't implement as-is

These proposals have some promise, but we can't implement them without certain changes.

NEEDS-RESEARCH proposals: blocking on research

These proposals are interesting ideas, but there's more research that would need to happen before we can know whether to implement them or not, or to fill in certain details.

(There are no proposals in this category)

Inactive proposals by status

CLOSED proposals: implemented and specified

These proposals have been implemented in some version of Tor, and the changes from the proposals have been merged into the specifications as necessary.

RESERVE proposals: saving for later

These proposals aren't anything we plan to implement soon, but for one reason or another we think they might be a good idea in the future. We're keeping them around as a reference in case we someday confront the problems that they try to solve.

SUPERSEDED proposals: replaced by something else

These proposals were obsoleted by a later proposal before they were implemented.

DEAD, REJECTED, OBSOLETE proposals: not in our plans

These proposals are not on track for discussion or implementation. Either discussion has stalled out (the proposal is DEAD), the proposal has been considered and not adopted (the proposal is REJECTED), or the proposal addresses an issue or a solution that is no longer relevant (the proposal is OBSOLETE).

Filename: 000-index.txt
Title: Index of Tor Proposals
Author: Nick Mathewson
Created: 26-Jan-2007
Status: Meta

Overview:

This document provides an index to Tor proposals.

This is an informational document.

Everything in this document below the line of '=' signs is automatically generated by reindex.py; do not edit by hand.

============================================================
Proposals by number:

000 Index of Tor Proposals [META]
001 The Tor Proposal Process [META]
098 Proposals that should be written [OBSOLETE]
099 Miscellaneous proposals [OBSOLETE]
100 Tor Unreliable Datagram Extension Proposal [DEAD]
101 Voting on the Tor Directory System [CLOSED]
102 Dropping "opt" from the directory format [CLOSED]
103 Splitting identity key from regularly used signing key [CLOSED]
104 Long and Short Router Descriptors [CLOSED]
105 Version negotiation for the Tor protocol [CLOSED]
106 Checking fewer things during TLS handshakes [CLOSED]
107 Uptime Sanity Checking [CLOSED]
108 Base "Stable" Flag on Mean Time Between Failures [CLOSED]
109 No more than one server per IP address [CLOSED]
110 Avoiding infinite length circuits [CLOSED]
111 Prioritizing local traffic over relayed traffic [CLOSED]
112 Bring Back Pathlen Coin Weight [SUPERSEDED]
113 Simplifying directory authority administration [SUPERSEDED]
114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
115 Two Hop Paths [DEAD]
116 Two hop paths from entry guards [DEAD]
117 IPv6 exits [CLOSED]
118 Advertising multiple ORPorts at once [SUPERSEDED]
119 New PROTOCOLINFO command for controllers [CLOSED]
120 Shutdown descriptors when Tor servers stop [DEAD]
121 Hidden Service Authentication [CLOSED]
122 Network status entries need a new Unnamed flag [CLOSED]
123 Naming authorities automatically create bindings [CLOSED]
124 Blocking resistant TLS certificate usage [SUPERSEDED]
125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
126 Getting GeoIP data and publishing usage summaries [CLOSED]
127 Relaying dirport requests to Tor download site / website [OBSOLETE]
128 Families of private bridges [DEAD]
129 Block Insecure Protocols by Default [CLOSED]
130 Version 2 Tor connection protocol [CLOSED]
131 Help users to verify they are using Tor [OBSOLETE]
132 A Tor Web Service For Verifying Correct Browser Configuration [OBSOLETE]
133 Incorporate Unreachable ORs into the Tor Network [RESERVE]
134 More robust consensus voting with diverse authority sets [REJECTED]
135 Simplify Configuration of Private Tor Networks [CLOSED]
136 Mass authority migration with legacy keys [CLOSED]
137 Keep controllers informed as Tor bootstraps [CLOSED]
138 Remove routers that are not Running from consensus documents [CLOSED]
139 Download consensus documents only when it will be trusted [CLOSED]
140 Provide diffs between consensuses [CLOSED]
141 Download server descriptors on demand [OBSOLETE]
142 Combine Introduction and Rendezvous Points [DEAD]
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [SUPERSEDED]
144 Increase the diversity of circuits by detecting nodes belonging the same provider [OBSOLETE]
145 Separate "suitable as a guard" from "suitable as a new guard" [SUPERSEDED]
146 Add new flag to reflect long-term stability [SUPERSEDED]
147 Eliminate the need for v2 directories in generating v3 directories [REJECTED]
148 Stream end reasons from the client side should be uniform [CLOSED]
149 Using data from NETINFO cells [SUPERSEDED]
150 Exclude Exit Nodes from a circuit [CLOSED]
151 Improving Tor Path Selection [CLOSED]
152 Optionally allow exit from single-hop circuits [CLOSED]
153 Automatic software update protocol [SUPERSEDED]
154 Automatic Software Update Protocol [SUPERSEDED]
155 Four Improvements of Hidden Service Performance [CLOSED]
156 Tracking blocked ports on the client side [SUPERSEDED]
157 Make certificate downloads specific [CLOSED]
158 Clients download consensus + microdescriptors [CLOSED]
159 Exit Scanning [INFORMATIONAL]
160 Authorities vote for bandwidth offsets in consensus [CLOSED]
161 Computing Bandwidth Adjustments [CLOSED]
162 Publish the consensus in multiple flavors [CLOSED]
163 Detecting whether a connection comes from a client [SUPERSEDED]
164 Reporting the status of server votes [OBSOLETE]
165 Easy migration for voting authority sets [REJECTED]
166 Including Network Statistics in Extra-Info Documents [CLOSED]
167 Vote on network parameters in consensus [CLOSED]
168 Reduce default circuit window [REJECTED]
169 Eliminate TLS renegotiation for the Tor connection handshake [SUPERSEDED]
170 Configuration options regarding circuit building [SUPERSEDED]
171 Separate streams across circuits by connection metadata [CLOSED]
172 GETINFO controller option for circuit information [RESERVE]
173 GETINFO Option Expansion [OBSOLETE]
174 Optimistic Data for Tor: Server Side [CLOSED]
175 Automatically promoting Tor clients to nodes [REJECTED]
176 Proposed version-3 link handshake for Tor [CLOSED]
177 Abstaining from votes on individual flags [RESERVE]
178 Require majority of authorities to vote for consensus parameters [CLOSED]
179 TLS certificate and parameter normalization [CLOSED]
180 Pluggable transports for circumvention [CLOSED]
181 Optimistic Data for Tor: Client Side [CLOSED]
182 Credit Bucket [OBSOLETE]
183 Refill Intervals [CLOSED]
184 Miscellaneous changes for a v3 Tor link protocol [CLOSED]
185 Directory caches without DirPort [SUPERSEDED]
186 Multiple addresses for one OR or bridge [CLOSED]
187 Reserve a cell type to allow client authorization [CLOSED]
188 Bridge Guards and other anti-enumeration defenses [RESERVE]
189 AUTHORIZE and AUTHORIZED cells [OBSOLETE]
190 Bridge Client Authorization Based on a Shared Secret [OBSOLETE]
191 Bridge Detection Resistance against MITM-capable Adversaries [OBSOLETE]
192 Automatically retrieve and store information about bridges [OBSOLETE]
193 Safe cookie authentication for Tor controllers [CLOSED]
194 Mnemonic .onion URLs [SUPERSEDED]
195 TLS certificate normalization for Tor 0.2.4.x [DEAD]
196 Extended ORPort and TransportControlPort [CLOSED]
197 Message-based Inter-Controller IPC Channel [REJECTED]
198 Restore semantics of TLS ClientHello [CLOSED]
199 Integration of BridgeFinder and BridgeFinderHelper [OBSOLETE]
200 Adding new, extensible CREATE, EXTEND, and related cells [CLOSED]
201 Make bridges report statistics on daily v3 network status requests [RESERVE]
202 Two improved relay encryption protocols for Tor cells [META]
203 Avoiding censorship by impersonating an HTTPS server [OBSOLETE]
204 Subdomain support for Hidden Service addresses [CLOSED]
205 Remove global client-side DNS caching [CLOSED]
206 Preconfigured directory sources for bootstrapping [CLOSED]
207 Directory guards [CLOSED]
208 IPv6 Exits Redux [CLOSED]
209 Tuning the Parameters for the Path Bias Defense [OBSOLETE]
210 Faster Headless Consensus Bootstrapping [SUPERSEDED]
211 Internal Mapaddress for Tor Configuration Testing [RESERVE]
212 Increase Acceptable Consensus Age [NEEDS-REVISION]
213 Remove stream-level sendmes from the design [DEAD]
214 Allow 4-byte circuit IDs in a new link protocol [CLOSED]
215 Let the minimum consensus method change with time [CLOSED]
216 Improved circuit-creation key exchange [CLOSED]
217 Tor Extended ORPort Authentication [CLOSED]
218 Controller events to better understand connection/circuit usage [CLOSED]
219 Support for full DNS and DNSSEC resolution in Tor [NEEDS-REVISION]
220 Migrate server identity keys to Ed25519 [CLOSED]
221 Stop using CREATE_FAST [CLOSED]
222 Stop sending client timestamps [CLOSED]
223 Ace: Improved circuit-creation key exchange [RESERVE]
224 Next-Generation Hidden Services in Tor [CLOSED]
225 Strawman proposal: commit-and-reveal shared rng [SUPERSEDED]
226 "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS" [RESERVE]
227 Include package fingerprints in consensus documents [CLOSED]
228 Cross-certifying identity keys with onion keys [CLOSED]
229 Further SOCKS5 extensions [REJECTED]
230 How to change RSA1024 relay identity keys [OBSOLETE]
231 Migrating authority RSA1024 identity keys [OBSOLETE]
232 Pluggable Transport through SOCKS proxy [CLOSED]
233 Making Tor2Web mode faster [REJECTED]
234 Adding remittance field to directory specification [REJECTED]
235 Stop assigning (and eventually supporting) the Named flag [CLOSED]
236 The move to a single guard node [CLOSED]
237 All relays are directory servers [CLOSED]
238 Better hidden service stats from Tor relays [CLOSED]
239 Consensus Hash Chaining [OPEN]
240 Early signing key revocation for directory authorities [OPEN]
241 Resisting guard-turnover attacks [REJECTED]
242 Better performance and usability for the MyFamily option [SUPERSEDED]
243 Give out HSDir flag only to relays with Stable flag [CLOSED]
244 Use RFC5705 Key Exporting in our AUTHENTICATE calls [CLOSED]
245 Deprecating and removing the TAP circuit extension protocol [NEEDS-REVISION]
246 Merging Hidden Service Directories and Introduction Points [REJECTED]
247 Defending Against Guard Discovery Attacks using Vanguards [SUPERSEDED]
248 Remove all RSA identity keys [NEEDS-REVISION]
249 Allow CREATE cells with >505 bytes of handshake data [SUPERSEDED]
250 Random Number Generation During Tor Voting [CLOSED]
251 Padding for netflow record resolution reduction [CLOSED]
252 Single Onion Services [SUPERSEDED]
253 Out of Band Circuit HMACs [DEAD]
254 Padding Negotiation [CLOSED]
255 Controller features to allow for load-balancing hidden services [RESERVE]
256 Key revocation for relays and authorities [RESERVE]
257 Refactoring authorities and making them more isolated from the net [META]
258 Denial-of-service resistance for directory authorities [DEAD]
259 New Guard Selection Behaviour [OBSOLETE]
260 Rendezvous Single Onion Services [FINISHED]
261 AEZ for relay cryptography [OBSOLETE]
262 Re-keying live circuits with new cryptographic material [RESERVE]
263 Request to change key exchange protocol for handshake v1.2 [OBSOLETE]
264 Putting version numbers on the Tor subprotocols [CLOSED]
265 Load Balancing with Overhead Parameters [OPEN]
266 Removing current obsolete clients from the Tor network [SUPERSEDED]
267 Tor Consensus Transparency [OPEN]
268 New Guard Selection Behaviour [OBSOLETE]
269 Transitionally secure hybrid handshakes [NEEDS-REVISION]
270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [OBSOLETE]
271 Another algorithm for guard selection [CLOSED]
272 Listed routers should be Valid, Running, and treated as such [CLOSED]
273 Exit relay pinning for web services [RESERVE]
274 Rotate onion keys less frequently [CLOSED]
275 Stop including meaningful "published" time in microdescriptor consensus [CLOSED]
276 Report bandwidth with lower granularity in consensus documents [DEAD]
277 Detect multiple relay instances running with same ID [OPEN]
278 Directory Compression Scheme Negotiation [CLOSED]
279 A Name System API for Tor Onion Services [NEEDS-REVISION]
280 Privacy-Preserving Statistics with Privcount in Tor [SUPERSEDED]
281 Downloading microdescriptors in bulk [RESERVE]
282 Remove "Named" and "Unnamed" handling from consensus voting [ACCEPTED]
283 Move IPv6 ORPorts from microdescriptors to the microdesc consensus [CLOSED]
284 Hidden Service v3 Control Port [CLOSED]
285 Directory documents should be standardized as UTF-8 [ACCEPTED]
286 Controller APIs for hibernation access on mobile [REJECTED]
287 Reduce circuit lifetime without overloading the network [OPEN]
288 Privacy-Preserving Statistics with Privcount in Tor (Shamir version) [RESERVE]
289 Authenticating sendme cells to mitigate bandwidth attacks [CLOSED]
290 Continuously update consensus methods [META]
291 The move to two guard nodes [FINISHED]
292 Mesh-based vanguards [ACCEPTED]
293 Other ways for relays to know when to publish [CLOSED]
294 TLS 1.3 Migration [DRAFT]
295 Using ADL for relay cryptography (solving the crypto-tagging attack) [OPEN]
296 Have Directory Authorities expose raw bandwidth list files [CLOSED]
297 Relaxing the protover-based shutdown rules [CLOSED]
298 Putting family lines in canonical form [CLOSED]
299 Preferring IPv4 or IPv6 based on IP Version Failure Count [SUPERSEDED]
300 Walking Onions: Scaling and Saving Bandwidth [INFORMATIONAL]
301 Don't include package fingerprints in consensus documents [CLOSED]
302 Hiding onion service clients using padding [CLOSED]
303 When and how to remove support for protocol versions [OPEN]
304 Extending SOCKS5 Onion Service Error Codes [CLOSED]
305 ESTABLISH_INTRO Cell DoS Defense Extension [CLOSED]
306 A Tor Implementation of IPv6 Happy Eyeballs [OPEN]
307 Onion Balance Support for Onion Service v3 [RESERVE]
308 Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography [SUPERSEDED]
309 Optimistic SOCKS Data [OPEN]
310 Towards load-balancing in Prop 271 [CLOSED]
311 Tor Relay IPv6 Reachability [ACCEPTED]
312 Tor Relay Automatic IPv6 Address Discovery [ACCEPTED]
313 Tor Relay IPv6 Statistics [ACCEPTED]
314 Allow Markdown for proposal format [CLOSED]
315 Updating the list of fields required in directory documents [CLOSED]
316 FlashFlow: A Secure Speed Test for Tor (Parent Proposal) [DRAFT]
317 Improve security aspects of DNS name resolution [NEEDS-REVISION]
318 Limit protover values to 0-63 [CLOSED]
319 RELAY_FRAGMENT cells [OBSOLETE]
320 Removing TAP usage from v2 onion services [REJECTED]
321 Better performance and usability for the MyFamily option (v2) [ACCEPTED]
322 Extending link specifiers to include the directory port [OPEN]
323 Specification for Walking Onions [OPEN]
324 RTT-based Congestion Control for Tor [FINISHED]
325 Packed relay cells: saving space on small commands [OBSOLETE]
326 The "tor-relay" Well-Known Resource Identifier [OPEN]
327 A First Take at PoW Over Introduction Circuits [FINISHED]
328 Make Relays Report When They Are Overloaded [CLOSED]
329 Overcoming Tor's Bottlenecks with Traffic Splitting [FINISHED]
330 Modernizing authority contact entries [OPEN]
331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience [DRAFT]
332 Ntor protocol with extra data, version 3 [CLOSED]
333 Vanguards lite [FINISHED]
334 A Directory Authority Flag To Mark Relays As Middle-only [SUPERSEDED]
335 An authority-only design for MiddleOnly [CLOSED]
336 Randomized schedule for guard retries [CLOSED]
337 A simpler way to decide, "Is this guard usable?" [CLOSED]
338 Use an 8-byte timestamp in NETINFO cells [ACCEPTED]
339 UDP traffic over Tor [ACCEPTED]
340 Packed and fragmented relay messages [OPEN]
341 A better algorithm for out-of-sockets eviction [OPEN]
342 Decoupling hs_interval and SRV lifetime [DRAFT]
343 CAA Extensions for the Tor Rendezvous Specification [OPEN]
344 Prioritizing Protocol Information Leaks in Tor [OPEN]
345 Migrating the tor specifications to mdbook [CLOSED]

Proposals by status:

 DRAFT:
   294 TLS 1.3 Migration
   316 FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
   331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience
   342 Decoupling hs_interval and SRV lifetime
 NEEDS-REVISION:
   212 Increase Acceptable Consensus Age [for 0.2.4.x+]
   219 Support for full DNS and DNSSEC resolution in Tor [for 0.2.5.x]
   245 Deprecating and removing the TAP circuit extension protocol
   248 Remove all RSA identity keys
   269 Transitionally secure hybrid handshakes
   279 A Name System API for Tor Onion Services
   317 Improve security aspects of DNS name resolution
 OPEN:
   239 Consensus Hash Chaining
   240 Early signing key revocation for directory authorities
   265 Load Balancing with Overhead Parameters [for arti-dirauth]
   267 Tor Consensus Transparency
   277 Detect multiple relay instances running with same ID [for 0.3.??]
   287 Reduce circuit lifetime without overloading the network
   295 Using ADL for relay cryptography (solving the crypto-tagging attack)
   303 When and how to remove support for protocol versions
   306 A Tor Implementation of IPv6 Happy Eyeballs
   309 Optimistic SOCKS Data
   322 Extending link specifiers to include the directory port
   323 Specification for Walking Onions
   326 The "tor-relay" Well-Known Resource Identifier
   330 Modernizing authority contact entries
   340 Packed and fragmented relay messages
   341 A better algorithm for out-of-sockets eviction
   343 CAA Extensions for the Tor Rendezvous Specification
   344 Prioritizing Protocol Information Leaks in Tor
 ACCEPTED:
   282 Remove "Named" and "Unnamed" handling from consensus voting [for arti-dirauth]
   285 Directory documents should be standardized as UTF-8 [for arti-dirauth]
   292 Mesh-based vanguards
   311 Tor Relay IPv6 Reachability
   312 Tor Relay Automatic IPv6 Address Discovery
   313 Tor Relay IPv6 Statistics
   321 Better performance and usability for the MyFamily option (v2)
   338 Use an 8-byte timestamp in NETINFO cells
   339 UDP traffic over Tor
 META:
   000 Index of Tor Proposals
   001 The Tor Proposal Process
   202 Two improved relay encryption protocols for Tor cells
   257 Refactoring authorities and making them more isolated from the net
   290 Continuously update consensus methods
 FINISHED:
   260 Rendezvous Single Onion Services [in 0.2.9.3-alpha]
   291 The move to two guard nodes
   324 RTT-based Congestion Control for Tor
   327 A First Take at PoW Over Introduction Circuits
   329 Overcoming Tor's Bottlenecks with Traffic Splitting
   333 Vanguards lite [in 0.4.7.1-alpha]
 CLOSED:
   101 Voting on the Tor Directory System [in 0.2.0.x]
   102 Dropping "opt" from the directory format [in 0.2.0.x]
   103 Splitting identity key from regularly used signing key [in 0.2.0.x]
   104 Long and Short Router Descriptors [in 0.2.0.x]
   105 Version negotiation for the Tor protocol [in 0.2.0.x]
   106 Checking fewer things during TLS handshakes [in 0.2.0.x]
   107 Uptime Sanity Checking [in 0.2.0.x]
   108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x]
   109 No more than one server per IP address [in 0.2.0.x]
   110 Avoiding infinite length circuits [for 0.2.3.x] [in 0.2.1.3-alpha, 0.2.3.11-alpha]
   111 Prioritizing local traffic over relayed traffic [in 0.2.0.x]
   114 Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x]
   117 IPv6 exits [for 0.2.4.x] [in 0.2.4.7-alpha]
   119 New PROTOCOLINFO command for controllers [in 0.2.0.x]
   121 Hidden Service Authentication [in 0.2.1.x]
   122 Network status entries need a new Unnamed flag [in 0.2.0.x]
   123 Naming authorities automatically create bindings [in 0.2.0.x]
   125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x]
   126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x]
   129 Block Insecure Protocols by Default [in 0.2.0.x]
   130 Version 2 Tor connection protocol [in 0.2.0.x]
   135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha]
   136 Mass authority migration with legacy keys [in 0.2.0.x]
   137 Keep controllers informed as Tor bootstraps [in 0.2.1.x]
   138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha]
   139 Download consensus documents only when it will be trusted [in 0.2.1.x]
   140 Provide diffs between consensuses [in 0.3.1.1-alpha]
   148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
   150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
   151 Improving Tor Path Selection [in 0.2.2.2-alpha]
   152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
   155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
   157 Make certificate downloads specific [for 0.2.4.x]
   158 Clients download consensus + microdescriptors [in 0.2.3.1-alpha]
   160 Authorities vote for bandwidth offsets in consensus [for 0.2.1.x]
   161 Computing Bandwidth Adjustments [for 0.2.1.x]
   162 Publish the consensus in multiple flavors [in 0.2.3.1-alpha]
   166 Including Network Statistics in Extra-Info Documents [for 0.2.2]
   167 Vote on network parameters in consensus [in 0.2.2]
   171 Separate streams across circuits by connection metadata [in 0.2.3.3-alpha]
   174 Optimistic Data for Tor: Server Side [in 0.2.3.1-alpha]
   176 Proposed version-3 link handshake for Tor [for 0.2.3]
   178 Require majority of authorities to vote for consensus parameters [in 0.2.3.9-alpha]
   179 TLS certificate and parameter normalization [for 0.2.3.x]
   180 Pluggable transports for circumvention [in 0.2.3.x]
   181 Optimistic Data for Tor: Client Side [in 0.2.3.3-alpha]
   183 Refill Intervals [in 0.2.3.5-alpha]
   184 Miscellaneous changes for a v3 Tor link protocol [for 0.2.3.x]
   186 Multiple addresses for one OR or bridge [for 0.2.4.x+]
   187 Reserve a cell type to allow client authorization [for 0.2.3.x]
   193 Safe cookie authentication for Tor controllers
   196 Extended ORPort and TransportControlPort [in 0.2.5.2-alpha]
   198 Restore semantics of TLS ClientHello [for 0.2.4.x]
   200 Adding new, extensible CREATE, EXTEND, and related cells [in 0.2.4.8-alpha]
   204 Subdomain support for Hidden Service addresses
   205 Remove global client-side DNS caching [in 0.2.4.7-alpha.]
   206 Preconfigured directory sources for bootstrapping [in 0.2.4.7-alpha]
   207 Directory guards [for 0.2.4.x]
   208 IPv6 Exits Redux [for 0.2.4.x] [in 0.2.4.7-alpha]
   214 Allow 4-byte circuit IDs in a new link protocol [in 0.2.4.11-alpha]
   215 Let the minimum consensus method change with time [in 0.2.6.1-alpha]
   216 Improved circuit-creation key exchange [in 0.2.4.8-alpha]
   217 Tor Extended ORPort Authentication [for 0.2.5.x]
   218 Controller events to better understand connection/circuit usage [in 0.2.5.2-alpha]
   220 Migrate server identity keys to Ed25519 [in 0.3.0.1-alpha]
   221 Stop using CREATE_FAST [for 0.2.5.x]
   222 Stop sending client timestamps [in 0.2.4.18]
   224 Next-Generation Hidden Services in Tor [in 0.3.2.1-alpha]
   227 Include package fingerprints in consensus documents [in 0.2.6.3-alpha]
   228 Cross-certifying identity keys with onion keys
   232 Pluggable Transport through SOCKS proxy [in 0.2.6]
   235 Stop assigning (and eventually supporting) the Named flag [in 0.2.6, 0.2.7]
   236 The move to a single guard node
   237 All relays are directory servers [for 0.2.7.x]
   238 Better hidden service stats from Tor relays
   243 Give out HSDir flag only to relays with Stable flag
   244 Use RFC5705 Key Exporting in our AUTHENTICATE calls [in 0.3.0.1-alpha]
   250 Random Number Generation During Tor Voting
   251 Padding for netflow record resolution reduction [in 0.3.1.1-alpha]
   254 Padding Negotiation
   264 Putting version numbers on the Tor subprotocols [in 0.2.9.4-alpha]
   271 Another algorithm for guard selection [in 0.3.0.1-alpha]
   272 Listed routers should be Valid, Running, and treated as such [in 0.2.9.3-alpha, 0.2.9.4-alpha]
   274 Rotate onion keys less frequently [in 0.3.1.1-alpha]
   275 Stop including meaningful "published" time in microdescriptor consensus [for 0.3.1.x-alpha] [in 0.4.8.1-alpha]
   278 Directory Compression Scheme Negotiation [in 0.3.1.1-alpha]
   283 Move IPv6 ORPorts from microdescriptors to the microdesc consensus [for 0.3.3.x] [in 0.3.3.1-alpha]
   284 Hidden Service v3 Control Port
   289 Authenticating sendme cells to mitigate bandwidth attacks [in 0.4.1.1-alpha]
   293 Other ways for relays to know when to publish [for 0.3.5] [in 0.4.0.1-alpha]
   296 Have Directory Authorities expose raw bandwidth list files [in 0.4.0.1-alpha]
   297 Relaxing the protover-based shutdown rules [for 0.3.5.x] [in 0.4.0.x]
   298 Putting family lines in canonical form [for 0.3.6.x] [in 0.4.0.1-alpha]
   301 Don't include package fingerprints in consensus documents
   302 Hiding onion service clients using padding [in 0.4.1.1-alpha]
   304 Extending SOCKS5 Onion Service Error Codes
   305 ESTABLISH_INTRO Cell DoS Defense Extension
   310 Towards load-balancing in Prop 271
   314 Allow Markdown for proposal format
   315 Updating the list of fields required in directory documents [in 0.4.5.1-alpha]
   318 Limit protover values to 0-63 [in 0.4.5.1-alpha]
   328 Make Relays Report When They Are Overloaded
   332 Ntor protocol with extra data, version 3
   335 An authority-only design for MiddleOnly [in 0.4.7.2-alpha]
   336 Randomized schedule for guard retries
   337 A simpler way to decide, "Is this guard usable?"
   345 Migrating the tor specifications to mdbook
 SUPERSEDED:
   112 Bring Back Pathlen Coin Weight
   113 Simplifying directory authority administration
   118 Advertising multiple ORPorts at once
   124 Blocking resistant TLS certificate usage
   143 Improvements of Distributed Storage for Tor Hidden Service Descriptors
   145 Separate "suitable as a guard" from "suitable as a new guard"
   146 Add new flag to reflect long-term stability
   149 Using data from NETINFO cells
   153 Automatic software update protocol
   154 Automatic Software Update Protocol
   156 Tracking blocked ports on the client side
   163 Detecting whether a connection comes from a client
   169 Eliminate TLS renegotiation for the Tor connection handshake
   170 Configuration options regarding circuit building
   185 Directory caches without DirPort
   194 Mnemonic .onion URLs
   210 Faster Headless Consensus Bootstrapping
   225 Strawman proposal: commit-and-reveal shared rng
   242 Better performance and usability for the MyFamily option
   247 Defending Against Guard Discovery Attacks using Vanguards
   249 Allow CREATE cells with >505 bytes of handshake data
   252 Single Onion Services
   266 Removing current obsolete clients from the Tor network
   280 Privacy-Preserving Statistics with Privcount in Tor
   299 Preferring IPv4 or IPv6 based on IP Version Failure Count
   308 Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
   334 A Directory Authority Flag To Mark Relays As Middle-only
 DEAD:
   100 Tor Unreliable Datagram Extension Proposal
   115 Two Hop Paths
   116 Two hop paths from entry guards
   120 Shutdown descriptors when Tor servers stop
   128 Families of private bridges
   142 Combine Introduction and Rendezvous Points
   195 TLS certificate normalization for Tor 0.2.4.x
   213 Remove stream-level sendmes from the design
   253 Out of Band Circuit HMACs
   258 Denial-of-service resistance for directory authorities
   276 Report bandwidth with lower granularity in consensus documents
 REJECTED:
   134 More robust consensus voting with diverse authority sets
   147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.4.x]
   165 Easy migration for voting authority sets
   168 Reduce default circuit window
   175 Automatically promoting Tor clients to nodes
   197 Message-based Inter-Controller IPC Channel
   229 Further SOCKS5 extensions
   233 Making Tor2Web mode faster
   234 Adding remittance field to directory specification
   241 Resisting guard-turnover attacks
   246 Merging Hidden Service Directories and Introduction Points
   286 Controller APIs for hibernation access on mobile
   320 Removing TAP usage from v2 onion services
 OBSOLETE:
   098 Proposals that should be written
   099 Miscellaneous proposals
   127 Relaying dirport requests to Tor download site / website
   131 Help users to verify they are using Tor
   132 A Tor Web Service For Verifying Correct Browser Configuration
   141 Download server descriptors on demand
   144 Increase the diversity of circuits by detecting nodes belonging the same provider
   164 Reporting the status of server votes
   173 GETINFO Option Expansion
   182 Credit Bucket
   189 AUTHORIZE and AUTHORIZED cells
   190 Bridge Client Authorization Based on a Shared Secret
   191 Bridge Detection Resistance against MITM-capable Adversaries
   192 Automatically retrieve and store information about bridges [for 0.2.[45].x]
   199 Integration of BridgeFinder and BridgeFinderHelper
   203 Avoiding censorship by impersonating an HTTPS server
   209 Tuning the Parameters for the Path Bias Defense [for 0.2.4.x+]
   230 How to change RSA1024 relay identity keys [for 0.2.?]
   231 Migrating authority RSA1024 identity keys [for 0.2.?]
   259 New Guard Selection Behaviour
   261 AEZ for relay cryptography
   263 Request to change key exchange protocol for handshake v1.2
   268 New Guard Selection Behaviour
   270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
   319 RELAY_FRAGMENT cells
   325 Packed relay cells: saving space on small commands
 RESERVE:
   133 Incorporate Unreachable ORs into the Tor Network
   172 GETINFO controller option for circuit information
   177 Abstaining from votes on individual flags [for 0.2.4.x]
   188 Bridge Guards and other anti-enumeration defenses
   201 Make bridges report statistics on daily v3 network status requests [for 0.2.4.x]
   211 Internal Mapaddress for Tor Configuration Testing [for 0.2.4.x+]
   223 Ace: Improved circuit-creation key exchange
   226 "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS"
   255 Controller features to allow for load-balancing hidden services
   256 Key revocation for relays and authorities
   262 Re-keying live circuits with new cryptographic material
   273 Exit relay pinning for web services [for n/a]
   281 Downloading microdescriptors in bulk
   288 Privacy-Preserving Statistics with Privcount in Tor (Shamir version)
   307 Onion Balance Support for Onion Service v3
 INFORMATIONAL:
   159 Exit Scanning
   300 Walking Onions: Scaling and Saving Bandwidth

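
The "Proposals by number" entries above follow a regular shape: a three-digit number, a title, and a bracketed status. As a rough illustration of how the by-status grouping could be derived from the by-number list, here is a minimal Python sketch. It is not the actual reindex.py, whose implementation is not shown here; the names `ENTRY` and `group_by_status` are hypothetical, and the regular expression assumes one entry per line.

```python
import re
from collections import defaultdict

# Hypothetical pattern for one "by number" entry, assuming one entry per line:
# three-digit number, title, then a bracketed upper-case status.
ENTRY = re.compile(r"^(\d{3})\s+(.*\S)\s+\[([A-Z-]+)\]$")


def group_by_status(lines):
    """Group (number, title) pairs under their status, keeping input order."""
    groups = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            number, title, status = match.groups()
            groups[status].append((number, title))
    return dict(groups)


# A few entries copied from the index above.
sample = [
    "000 Index of Tor Proposals [META]",
    "001 The Tor Proposal Process [META]",
    "212 Increase Acceptable Consensus Age [NEEDS-REVISION]",
    "239 Consensus Hash Chaining [OPEN]",
    '337 A simpler way to decide, "Is this guard usable?" [CLOSED]',
]

by_status = group_by_status(sample)
for status, entries in by_status.items():
    print(f"{status}:")
    for number, title in entries:
        print(f"  {number} {title}")
```

The sketch ignores the `[for ...]` and `[in ...]` annotations that appear in the by-status list, since those come from proposal headers rather than from the by-number lines.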
Filename: 001-process.txt Title: The Tor Proposal Process Author: Nick Mathewson Created: 30-Jan-2007 Status: Meta Overview: This document describes how to change the Tor specifications, how Tor proposals work, and the relationship between Tor proposals and the specifications. This is an informational document. Motivation: Previously, our process for updating the Tor specifications was maximally informal: we'd patch the specification (sometimes forking first, and sometimes not), then discuss the patches, reach consensus, and implement the changes. This had a few problems. First, even at its most efficient, the old process would often have the spec out of sync with the code. The worst cases were those where implementation was deferred: the spec and code could stay out of sync for versions at a time. Second, it was hard to participate in discussion, since you had to know which portions of the spec were a proposal, and which were already implemented. Third, it littered the specifications with too many inline comments. [This was a real problem -NM] [Especially when it went to multiple levels! -NM] [XXXX especially when they weren't signed and talked about that thing that you can't remember after a year] How to change the specs now: First, somebody writes a proposal document. It should describe the change that should be made in detail, and give some idea of how to implement it. Once it's fleshed out enough, it becomes a proposal. Like an RFC, every proposal gets a number. Unlike RFCs, proposals can change over time and keep the same number, until they are finally accepted or rejected. The history for each proposal will be stored in the Tor repository. Once a proposal is in the repository, we should discuss and improve it until we've reached consensus that it's a good idea, and that it's detailed enough to implement. When this happens, we implement the proposal and incorporate it into the specifications. 
Thus, the specs remain the canonical documentation for the Tor protocol: no proposal is ever the canonical documentation for an implemented feature. (This process is pretty similar to the Python Enhancement Process, with the major exception that Tor proposals get re-integrated into the specs after implementation, whereas PEPs _become_ the new spec.) {It's still okay to make small changes directly to the spec if the code can be written more or less immediately, or cosmetic changes if no code change is required. This document reflects the current developers' _intent_, not a permanent promise to always use this process in the future: we reserve the right to get really excited and run off and implement something in a caffeine-or-m&m-fueled all-night hacking session.} How new proposals get added: Once an idea has been proposed on the development list, a properly formatted (see below) draft exists, and rough consensus within the active development community exists that this idea warrants consideration, the proposal editors will officially add the proposal. To get your proposal in, send it to the tor-dev mailing list. The current proposal editors are Nick Mathewson, George Kadianakis, Damian Johnson, Isis Lovecruft, and David Goulet. What should go in a proposal: Every proposal should have a header containing these fields: Filename, Title, Author, Created, Status. These fields are optional but recommended: Target, Implemented-In, Ticket**. The Target field should describe which version the proposal is hoped to be implemented in (if it's Open or Accepted). The Implemented-In field should describe which version the proposal was implemented in (if it's Finished or Closed). The Ticket field should be a ticket number referring to Tor's canonical bug tracker (e.g. "#7144" refers to https://bugs.torproject.org/7144) or to a publicly accessible URI where one may subscribe to updates and/or retrieve information on implementation status. 
** Proposals with assigned numbers of prop#283 and higher are REQUIRED to have a Ticket field if the Status is OPEN, ACCEPTED, CLOSED, or FINISHED. The body of the proposal should start with an Overview section explaining what the proposal's about, what it does, and about what state it's in. After the Overview, the proposal becomes more free-form. Depending on its length and complexity, the proposal can break into sections as appropriate, or follow a short discursive format. Every proposal should contain at least the following information before it is "ACCEPTED", though the information does not need to be in sections with these names. Motivation: What problem is the proposal trying to solve? Why does this problem matter? If several approaches are possible, why take this one? Design: A high-level view of what the new or modified features are, how the new or modified features work, how they interoperate with each other, and how they interact with the rest of Tor. This is the main body of the proposal. Some proposals will start out with only a Motivation and a Design, and wait for a specification until the Design seems approximately right. Security implications: What effects the proposed changes might have on anonymity, how well understood these effects are, and so on. Specification: A detailed description of what needs to be added to the Tor specifications in order to implement the proposal. This should be in about as much detail as the specifications will eventually contain: it should be possible for independent programmers to write mutually compatible implementations of the proposal based on its specifications. Compatibility: Will versions of Tor that follow the proposal be compatible with versions that do not? If so, how will compatibility be achieved? Generally, we try to not drop compatibility if at all possible; we haven't made a "flag day" change since May 2004, and we don't want to do another one. 
Implementation: If the proposal will be tricky to implement in Tor's current architecture, the document can contain some discussion of how to go about making it work. Actual patches should go on public git branches, or be uploaded to trac. Performance and scalability notes: If the feature will have an effect on performance (in RAM, CPU, bandwidth) or scalability, there should be some analysis on how significant this effect will be, so that we can avoid really expensive performance regressions, and so we can avoid wasting time on insignificant gains. How to format proposals: Proposals may be written in plain text (like this one), or in Markdown. If using Markdown, the header must be wrapped in triple-backtick ("```") lines. Whenever possible, we prefer the Commonmark dialect of Markdown. Proposal status: Open: A proposal under discussion. Accepted: The proposal is complete, and we intend to implement it. After this point, substantive changes to the proposal should be avoided, and regarded as a sign of the process having failed somewhere. Finished: The proposal has been accepted and implemented. After this point, the proposal should not be changed. Closed: The proposal has been accepted, implemented, and merged into the main specification documents. The proposal should not be changed after this point. Rejected: We're not going to implement the feature as described here, though we might do some other version. See comments in the document for details. The proposal should not be changed after this point; to bring up some other version of the idea, write a new proposal. Draft: This isn't a complete proposal yet; there are definite missing pieces. Please don't add any new proposals with this status; put them in the "ideas" sub-directory instead. Needs-Revision: The idea for the proposal is a good one, but the proposal as it stands has serious problems that keep it from being accepted. See comments in the document for details. 
Dead: The proposal hasn't been touched in a long time, and it doesn't look like anybody is going to complete it soon. It can become "Open" again if it gets a new proponent. Needs-Research: There are research problems that need to be solved before it's clear whether the proposal is a good idea. Meta: This is not a proposal, but a document about proposals. Reserve: This proposal is not something we're currently planning to implement, but we might want to resurrect it some day if we decide to do something like what it proposes. Informational: This proposal is the last word on what it's doing. It isn't going to turn into a spec unless somebody copy-and-pastes it into a new spec for a new subsystem. Obsolete: This proposal was flawed and has been superseded by another proposal. See comments in the document for details. The editors maintain the correct status of proposals, based on rough consensus and their own discretion. Proposal numbering: Numbers 000-099 are reserved for special and meta-proposals. 100 and up are used for actual proposals. Numbers aren't recycled.
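The header rules under "What should go in a proposal" above can be sketched as a small checker. This is only an illustration, not a tool the Tor project ships: the names `parse_header`, `proposal_number`, and `missing_fields` are hypothetical, and the sketch assumes a plain-text header of "Field: value" lines that ends at the first blank line. The Ticket rule follows the footnote above (proposals numbered 283 and higher with status OPEN, ACCEPTED, CLOSED, or FINISHED).

```python
# Hypothetical checker for proposal headers; not part of any Tor tooling.
REQUIRED_FIELDS = ("Filename", "Title", "Author", "Created", "Status")
TICKET_STATUSES = {"OPEN", "ACCEPTED", "CLOSED", "FINISHED"}


def parse_header(text):
    """Collect 'Field: value' pairs up to the first blank line."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def proposal_number(fields):
    """Return the leading number of the Filename field, or None."""
    prefix = fields.get("Filename", "").split("-", 1)[0]
    return int(prefix) if prefix.isdigit() else None


def missing_fields(fields):
    """List required fields that are absent from a parsed header."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    number = proposal_number(fields)
    status = fields.get("Status", "").upper()
    if number is not None and number >= 283 and status in TICKET_STATUSES:
        if "Ticket" not in fields:
            missing.append("Ticket")
    return missing


header = """Filename: 001-process.txt
Title: The Tor Proposal Process
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta
"""
print(missing_fields(parse_header(header)))  # prints []
```

The optional Target and Implemented-In fields are deliberately not checked, since the text only recommends them.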
Filename: 098-todo.txt Title: Proposals that should be written Author: Nick Mathewson, Roger Dingledine Created: 26-Jan-2007 Status: Obsolete {Obsolete: This document has been replaced by the tor-spec issue tracker.} Overview: This document lists ideas that various people have had for improving the Tor protocol. These should be implemented and specified if they're trivial, or written up as proposals if they're not. This is an active document, to be edited as proposals are written and as we come up with new ideas for proposals. We should take stuff out as it seems irrelevant. For some later protocol version. - It would be great to get smarter about identity and linkability. It's not crazy to say, "Never use the same circuit for my SSH connections and my web browsing." How far can/should we take this? See ideas/xxx-separate-streams-by-port.txt for a start. - Fix onionskin handshake scheme to be more mainstream, less nutty. Can we just do E(HMAC(g^x), g^x) rather than just E(g^x) ? No, that has the same flaws as before. We should send E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy). Better ask Ian; probably Stephen too. - Length on CREATE and friends - Versioning on circuits and create cells, so we have a clear path to improve the circuit protocol. - SHA1 is showing its age. We should get a design for upgrading our hash once the AHS competition is done, or even sooner. - Not being able to upgrade ciphersuites or increase key lengths is lame. - Paul has some ideas about circuit creation; read his PET paper once it's out. Any time: - Some ideas for revising the directory protocol: - Extend the "r" line in network-status to give a set of buckets (say, comma-separated) for that router. - Buckets are deterministic based on IP address. - Then clients can choose a bucket (or set of buckets) to download and use. - We need a way for the authorities to declare that nodes are in a family. Also, it kinda sucks that family declarations use O(N^2) space in the descriptors. 
- REASON_CONNECTFAILED should include an IP.
- Spec should incorporate some prose from tor-design to be more readable.
- Spec when we should rotate which keys
- Spec how to publish descriptors less often
- Describe pros and cons of non-deterministic path lengths
- We should use a variable-length path length by default -- 3 +/- some distribution. Need to think harder about allowing values less than 3, and there's a tradeoff between having a wide variance and performance.
- Clients currently use certs during TLS. Is this wise? It does make it easier for servers to tell which NATted client is which. We could use a separate set of certs for each guard, I suppose, but generating so many certs could get expensive. Omitting them entirely would make OP->OR easier to tell from OR->OR.

Things that should change...

B.1. ... but which will require backward-incompatible change

- Circuit IDs should be longer.
. IPv6 everywhere.
- Maybe, keys should be longer.
- Maybe, key-length should be adjustable. How to do this without making anonymity suck?
- Drop backward compatibility.
- We should use a 128-bit subgroup of our DH prime.
- Handshake should use HMAC.
- Multiple cell lengths.
- Ability to split circuits across paths (If this is useful.)
- SENDME windows should be dynamic.
- Directory
  - Stop ever mentioning socks ports

B.1. ... and that will require no changes

- Advertised outbound IP?
- Migrate streams across circuits.
- Fix bug 469 by limiting the number of simultaneous connections per IP.

B.2. ... and that we have no idea how to do.

- UDP (as transport)
- UDP (as content)
- Use a better AES mode that has built-in integrity checking, doesn't grow with the number of hops, is not patented, and is implemented and maintained by smart people.

Let onion keys be not just RSA but maybe DH too, for Paul's reply onion design.
Filename: 099-misc.txt
Title: Miscellaneous proposals
Author: Various
Created: 26-Jan-2007
Status: Obsolete

{This document is obsolete; we only used it once, and we have implemented its only idea.}

Overview:

This document is for small proposal ideas that are about one paragraph in length. From here, ideas can be rejected outright, expanded into full proposals, or specified and implemented as-is.

Proposals

1. Directory compression.

Gzip would be easier to work with than zlib; bzip2 would result in smaller data lengths. [Concretely, we're looking at about 10-15% space savings at the expense of 3-5x longer compression time for using bzip2.] Doing on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib. Pre-compressing status documents in multiple formats would force us to use more memory to hold them.

Status: Open -- Nick Mathewson
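The size-versus-time trade-off in the directory compression item above can be explored with a rough sketch using Python's standard `zlib` and `bz2` modules. The sample input is synthetic and is purely an assumption for illustration; it will not reproduce the 10-15% and 3-5x figures quoted above, which refer to real directory documents. The function name `compare_compressors` is hypothetical.

```python
import bz2
import time
import zlib


def compare_compressors(data):
    """Return {name: (compressed_size, seconds)} for zlib and bzip2."""
    results = {}
    for name, compress in (("zlib", zlib.compress), ("bzip2", bz2.compress)):
        start = time.perf_counter()
        compressed = compress(data, 9)  # 9 is the strongest level for both
        elapsed = time.perf_counter() - start
        results[name] = (len(compressed), elapsed)
    return results


# Synthetic stand-in for a status document (an assumption, not real data).
sample = b"".join(
    b"r router%d 192.0.2.%d 9001 0 9030\ns Fast Running Valid\n" % (i, i % 256)
    for i in range(2000)
)

for name, (size, seconds) in compare_compressors(sample).items():
    print(f"{name}: {size} bytes in {seconds:.4f}s (input {len(sample)} bytes)")
```

Running the same function over a real cached status document would give numbers that are comparable to the ones quoted in the proposal; timings vary by machine.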
Filename: 100-tor-spec-udp.txt
Title: Tor Unreliable Datagram Extension Proposal
Author: Marc Liberatore
Created: 23 Feb 2006
Status: Dead

Overview:

This is a modified version of the Tor specification written by Marc Liberatore to add UDP support to Tor. For each TLS link, it adds a corresponding DTLS link: control messages and TCP data flow over TLS, and UDP data flows over DTLS.

This proposal is not likely to be accepted as-is; see comments at the end of the document.

Contents

0. Introduction

Tor is a distributed overlay network designed to anonymize low-latency TCP-based applications. The current tor specification supports only TCP-based traffic. This limitation prevents the use of tor to anonymize other important applications, notably voice over IP software. This document is a proposal to extend the tor specification to support UDP traffic.

The basic design philosophy of this extension is to add support for tunneling unreliable datagrams through tor with as few modifications to the protocol as possible. As currently specified, tor cannot directly support such tunneling, as connections between nodes are built using transport layer security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable to the operation of most UDP-based application level protocols. Thus, we propose the addition of links between nodes using datagram transport layer security (DTLS). These links allow packets to traverse a route through tor quickly, but their unreliable nature requires minor changes to the tor protocol. This proposal outlines the necessary additions and changes to the tor specification to support UDP traffic.

We note that a separate set of DTLS links between nodes creates a second overlay, distinct from that composed of TLS links. This separation and resulting decrease in each anonymity set's size will make certain attacks easier.
However, it is our belief that VoIP support in tor will dramatically increase its appeal, and correspondingly, the size of its user base, number of deployed nodes, and total traffic relayed. These increases should help offset the loss of anonymity that two distinct networks imply.

1. Overview of Tor-UDP and its complications

As described above, this proposal extends the Tor specification to support UDP with as few changes as possible. Tor's overlay network is managed through TLS based connections; we will re-use this control plane to set up and tear down circuits that relay UDP traffic. These circuits will be built atop DTLS, in a fashion analogous to how Tor currently sends TCP traffic over TLS.

The unreliability of DTLS circuits creates problems for Tor at two levels:

1. Tor's encryption of the relay layer does not allow independent decryption of individual records. If record N is not received, then record N+1 will not decrypt correctly, as the counter for AES/CTR is maintained implicitly.

2. Tor's end-to-end integrity checking works under the assumption that all RELAY cells are delivered. This assumption is invalid when cells are sent over DTLS.

The fix for the first problem is straightforward: add an explicit sequence number to each cell. To fix the second problem, we introduce a system of nonces and hashes to RELAY packets.

In the following sections, we mirror the layout of the Tor Protocol Specification, presenting the necessary modifications to the Tor protocol as a series of deltas.

2. Connections

Tor-UDP uses DTLS for encryption of some links. All DTLS links must have corresponding TLS links, as all control messages are sent over TLS. All implementations MUST support the DTLS ciphersuite "[TODO]".

DTLS connections are formed using the same protocol as TLS connections. This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell, as detailed in section 4.6.

Once a paired TLS/DTLS connection is established, the two sides send cells to one another.
All but two types of cells are sent over TLS links. RELAY cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified below, are sent over DTLS links.

[Should all cells still be 512 bytes long? Perhaps upon completion of a preliminary implementation, we should do a performance evaluation for some class of UDP traffic, such as VoIP. - ML]

Cells may be sent embedded in TLS or DTLS records of any size or divided across such records. The framing of these records MUST NOT leak any more information than the above differentiation on the basis of cell type.

[I am uncomfortable with this leakage, but don't see any simple, elegant way around it. -ML]

As with TLS connections, DTLS connections are not permanent.

3. Cell format

Each cell contains the following fields:

    CircID                                [2 bytes]
    Command                               [1 byte]
    Sequence Number                       [2 bytes]
    Payload (padded with 0 bytes)         [507 bytes]
                                          [Total size: 512 bytes]

The 'Command' field holds one of the following values:

     0 -- PADDING          (Padding)                     (See Sec 6.2)
     1 -- CREATE           (Create a circuit)            (See Sec 4)
     2 -- CREATED          (Acknowledge create)          (See Sec 4)
     3 -- RELAY            (End-to-end data)             (See Sec 5)
     4 -- DESTROY          (Stop using a circuit)        (See Sec 4)
     5 -- CREATE_FAST      (Create a circuit, no PK)     (See Sec 4)
     6 -- CREATED_FAST     (Circuit created, no PK)      (See Sec 4)
     7 -- CREATE_UDP       (Create a UDP circuit)        (See Sec 4)
     8 -- CREATED_UDP      (Acknowledge UDP create)      (See Sec 4)
     9 -- CREATE_FAST_UDP  (Create a UDP circuit, no PK) (See Sec 4)
    10 -- CREATED_FAST_UDP (UDP circuit created, no PK)  (See Sec 4)

The sequence number allows for AES/CTR decryption of RELAY cells independently of one another; this functionality is required to support cells sent over DTLS. The sequence number is described in more detail in section 4.5.

[Should the sequence number only appear in RELAY packets? The overhead is small, and I'm hesitant to force more code paths on the implementor.
-ML] [There's already a separate relay header that has other material in it, so it wouldn't be the end of the world to move it there if it's appropriate. -RD] [Having separate commands for UDP circuits seems necessary, unless we can assume a flag day event for a large number of tor nodes. -ML] 4. Circuit management 4.2. Setting circuit keys Keys are set up for UDP circuits in the same fashion as for TCP circuits. Each UDP circuit shares keys with its corresponding TCP circuit. [If the keys are used for both TCP and UDP connections, how does it work to mix sequence-number-less cells with sequenced-numbered cells -- how do you know you have the encryption order right? -RD] 4.3. Creating circuits UDP circuits are created as TCP circuits, using the *_UDP cells as appropriate. 4.4. Tearing down circuits UDP circuits are torn down as TCP circuits, using the *_UDP cells as appropriate. 4.5. Routing relay cells When an OR receives a RELAY cell, it checks the cell's circID and determines whether it has a corresponding circuit along that connection. If not, the OR drops the RELAY cell. Otherwise, if the OR is not at the OP edge of the circuit (that is, either an 'exit node' or a non-edge node), it de/encrypts the payload with AES/CTR, as follows: 'Forward' relay cell (same direction as CREATE): Use Kf as key; decrypt, using sequence number to synchronize ciphertext and keystream. 'Back' relay cell (opposite direction from CREATE): Use Kb as key; encrypt, using sequence number to synchronize ciphertext and keystream. Note that in counter mode, decrypt and encrypt are the same operation. [Since the sequence number is only 2 bytes, what do you do when it rolls over? -RD] Each stream encrypted by a Kf or Kb has a corresponding unique state, captured by a sequence number; the originator of each such stream chooses the initial sequence number randomly, and increments it only with RELAY cells. 
[This counts cells; unlike, say, TCP, tor uses fixed-size cells, so there's no need for counting bytes directly. Right? - ML]

[I believe this is true. You'll find out for sure when you try to build it. ;) -RD]

The OR then decides whether it recognizes the relay cell, by inspecting the payload as described in section 5.1 below. If the OR recognizes the cell, it processes the contents of the relay cell. Otherwise, it passes the decrypted relay cell along the circuit if the circuit continues. If the OR at the end of the circuit encounters an unrecognized relay cell, an error has occurred: the OR sends a DESTROY cell to tear down the circuit.

When a relay cell arrives at an OP, the OP decrypts the payload with AES/CTR as follows:

    OP receives data cell:
       For I=N...1,
           Decrypt with Kb_I, using the sequence number as above. If the payload is recognized (see section 5.1), then stop and process the payload.

For more information, see section 5 below.

4.6. CREATE_UDP and CREATED_UDP cells

Users set up UDP circuits incrementally. The procedure is similar to that for TCP circuits, as described in section 4.1. In addition to the TLS connection to the first node, the OP also attempts to open a DTLS connection. If this succeeds, the OP sends a CREATE_UDP cell, with a payload in the same format as a CREATE cell.

To extend a UDP circuit past the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which instructs the last node in the circuit to send a CREATE_UDP cell to extend the circuit.

The relay payload for an EXTEND_UDP relay cell consists of:

    Address                       [4 bytes]
    TCP port                      [2 bytes]
    UDP port                      [2 bytes]
    Onion skin                    [186 bytes]
    Identity fingerprint          [20 bytes]

The address field and ports denote the IPv4 address and ports of the next OR in the circuit.

The payload for a CREATED_UDP cell or the relay payload for a RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or RELAY_EXTENDED cell. Both circuits are established using the same key.
Note that the existence of a UDP circuit implies the existence of a corresponding TCP circuit, sharing keys, sequence numbers, and any other relevant state.

4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells

As above, the OP must successfully connect using DTLS before attempting to send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in section 4.1.1.

5. Application connections and stream management

5.1. Relay cells

Within a circuit, the OP and the exit node use the contents of RELAY cells to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets across circuits. End-to-end commands and UDP packets can be initiated by either edge; streams are initiated by the OP.

The payload of each unencrypted RELAY cell consists of:

    Relay command           [1 byte]
    'Recognized'            [2 bytes]
    StreamID                [2 bytes]
    Digest                  [4 bytes]
    Length                  [2 bytes]
    Data                    [498 bytes]

The relay commands are:

     1 -- RELAY_BEGIN        [forward]
     2 -- RELAY_DATA         [forward or backward]
     3 -- RELAY_END          [forward or backward]
     4 -- RELAY_CONNECTED    [backward]
     5 -- RELAY_SENDME       [forward or backward]
     6 -- RELAY_EXTEND       [forward]
     7 -- RELAY_EXTENDED     [backward]
     8 -- RELAY_TRUNCATE     [forward]
     9 -- RELAY_TRUNCATED    [backward]
    10 -- RELAY_DROP         [forward or backward]
    11 -- RELAY_RESOLVE      [forward]
    12 -- RELAY_RESOLVED     [backward]
    13 -- RELAY_BEGIN_UDP    [forward]
    14 -- RELAY_DATA_UDP     [forward or backward]
    15 -- RELAY_EXTEND_UDP   [forward]
    16 -- RELAY_EXTENDED_UDP [backward]
    17 -- RELAY_DROP_UDP     [forward or backward]

Commands labelled as "forward" must only be sent by the originator of the circuit. Commands labelled as "backward" must only be sent by other nodes in the circuit back to the originator. Commands marked as either can be sent either by the originator or other nodes.

The 'recognized' field in any unencrypted relay payload is always set to zero.

The 'digest' field can have two meanings.
For all cells sent over TLS connections (that is, all commands and all non-UDP RELAY data), it is computed as the first four bytes of the running SHA-1 digest of all the bytes that have been sent reliably and have been destined for this hop of the circuit or originated from this hop of the circuit, seeded from Df or Db respectively (obtained in section 4.2 above), and including this RELAY cell's entire payload (taken with the digest field set to zero). Cells sent over DTLS connections do not affect this running digest.

Each cell sent over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field set to the SHA-1 digest of the current RELAY cell's entire payload, with the digest field set to zero. Coupled with a randomly-chosen streamID, this provides per-cell integrity checking on UDP cells.

[If you drop malformed UDP relay cells but don't close the circuit, then this 8 bytes of digest is not as strong as what we get in the TCP-circuit side. Is this a problem? -RD]

When the 'recognized' field of a RELAY cell is zero, and the digest is correct, the cell is considered "recognized" for the purposes of decryption (see section 4.5 above).

(The digest does not include any bytes from relay cells that do not start or end at this hop of the circuit. That is, it does not include forwarded data. Therefore if 'recognized' is zero but the digest does not match, the running digest at that node should not be updated, and the cell should be forwarded on.)

All RELAY cells pertaining to the same tunneled TCP stream have the same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY cells that affect the entire circuit rather than a particular stream use a StreamID of zero.

All RELAY cells pertaining to the same UDP tunnel have the same streamID. This streamID is chosen randomly by the OP, but cannot be zero.

The 'Length' field of a relay cell contains the number of bytes in the relay payload which contain real payload data.
The remainder of the payload is padded with NUL bytes.

If the RELAY cell is recognized but the relay command is not understood, the cell must be dropped and ignored. Its contents still count with respect to the digests, though.

[Before 0.1.1.10, Tor closed circuits when it received an unknown relay command. Perhaps this will be more forward-compatible. -RD]

5.2.1. Opening UDP tunnels and transferring data

To open a new anonymized UDP connection, the OP chooses an open circuit to an exit that may be able to connect to the destination address, selects a random streamID not yet used on that circuit, and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address and port of the destination host. The payload format is:

    ADDRESS | ':' | PORT | [00]

where ADDRESS can be a DNS hostname, or an IPv4 address in dotted-quad format, or an IPv6 address surrounded by square brackets; and where PORT is encoded in decimal.

[What is the [00] for? -NM]

[It's so the payload is easy to parse out with string funcs -RD]

Upon receiving this cell, the exit node resolves the address as necessary. If the address cannot be resolved, the exit node replies with a RELAY_END cell. (See 5.4 below.) Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose payload is in one of the following formats:

    The IPv4 address to which the connection was made [4 octets]
    A number of seconds (TTL) for which the address may be cached [4 octets]

or

    Four zero-valued octets [4 octets]
    An address type (6) [1 octet]
    The IPv6 address to which the connection was made [16 octets]
    A number of seconds (TTL) for which the address may be cached [4 octets]

[XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL field. No version of Tor currently generates the IPv6 format.]

The OP waits for a RELAY_CONNECTED cell before sending any data.
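As a rough illustration of the two payloads described above, the following sketch builds a RELAY_BEGIN_UDP payload and decodes a RELAY_CONNECTED payload. The function names `encode_begin_udp` and `parse_connected` are hypothetical. The sketch assumes that the TTL is a big-endian (network order) unsigned integer, which this proposal does not state explicitly, and that the caller has already trimmed the payload to the size given by the relay cell's Length field.

```python
import ipaddress
import struct


def encode_begin_udp(address, port):
    """Build ADDRESS ':' PORT NUL; IPv6 literals get square brackets."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        host = address  # not an IP literal: treat it as a DNS hostname
    else:
        host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"{host}:{port}".encode("ascii") + b"\x00"


def parse_connected(payload):
    """Return (address, ttl) from an exact-length RELAY_CONNECTED payload."""
    if len(payload) == 8:
        # IPv4 form: 4 address octets, then a 4-octet TTL.
        address = ipaddress.IPv4Address(payload[:4])
        (ttl,) = struct.unpack("!I", payload[4:8])  # byte order is assumed
        return address, ttl
    if len(payload) == 25 and payload[:4] == b"\x00" * 4 and payload[4] == 6:
        # IPv6 form: 4 zero octets, type 6, 16 address octets, 4-octet TTL.
        address = ipaddress.IPv6Address(payload[5:21])
        (ttl,) = struct.unpack("!I", payload[21:25])
        return address, ttl
    raise ValueError("unrecognized RELAY_CONNECTED payload")


print(encode_begin_udp("example.com", 53))  # prints b'example.com:53\x00'
```

The trailing NUL byte is what lets a receiver find the end of the string with ordinary C string functions, as the bracketed comment above explains.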
Once a connection has been established, the OP and exit node package UDP data in RELAY_DATA_UDP cells, and upon receiving such cells, echo their contents to the corresponding socket. RELAY_DATA_UDP cells sent to unrecognized streams are dropped. Relay RELAY_DROP_UDP cells are long-range dummies; upon receiving such a cell, the OR or OP must drop it. 5.3. Closing streams UDP tunnels are closed in a fashion corresponding to TCP connections. 6. Flow Control UDP streams are not subject to flow control. 7.2. Router descriptor format. The items' formats are as follows: "router" nickname address ORPort SocksPort DirPort UDPPort Indicates the beginning of a router descriptor. "address" must be an IPv4 address in dotted-quad format. The last three numbers indicate the TCP ports at which this OR exposes functionality. ORPort is a port at which this OR accepts TLS connections for the main OR protocol; SocksPort is deprecated and should always be 0; DirPort is the port at which this OR accepts directory-related HTTP connections; and UDPPort is a port at which this OR accepts DTLS connections for UDP data. If any port is not supported, the value 0 is given instead of a port number. Other sections: What changes need to happen to each node's exit policy to support this? -RD Switching to UDP means managing the queues of incoming packets better, so we don't miss packets. How does this interact with doing large public key operations (handshakes) in the same thread? -RD ======================================================================== COMMENTS ======================================================================== [16 May 2006] I don't favor this approach; it makes packet traffic partitioned from stream traffic end-to-end. The architecture I'd like to see is: A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on TCP/TLS for firewall penetration or something. (This also gives us an upgrade path for routing through legacy servers.) 
B Stream traffic is handled with end-to-end per-stream acks/naks and retries. On failure, the data is retransmitted in a new RELAY_DATA cell; a cell isn't retransmitted. We'll need to do A anyway, to fix our behavior on packet-loss. Once we've done so, B is more or less inevitable, and we can support end-to-end UDP traffic "for free". (Also, there are some details that this draft spec doesn't address. For example, what happens when a UDP packet doesn't fit in a single cell?) -NM
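The cell layout given in section 3 of the proposal above (CircID, Command, Sequence Number, Payload) can be sketched with Python's `struct` module. This is a minimal illustration under stated assumptions, not an implementation of the proposal: the names `pack_cell` and `unpack_cell` are hypothetical, the multi-byte fields are assumed to be big-endian, and sequence numbers outside the 2-byte range are rejected because the proposal leaves rollover behaviour as an open question.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507
# CircID [2 bytes], Command [1 byte], Sequence Number [2 bytes], Payload [507 bytes]
CELL = struct.Struct("!HBH507s")  # big-endian layout is an assumption

CMD_RELAY = 3
CMD_CREATE_UDP = 7


def pack_cell(circ_id, command, seq, payload=b""):
    """Serialize one fixed-size cell, padding the payload with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long for one cell")
    if not 0 <= seq <= 0xFFFF:
        # Rollover handling is not specified by the proposal, so refuse.
        raise ValueError("sequence number does not fit in 2 bytes")
    return CELL.pack(circ_id, command, seq, payload.ljust(PAYLOAD_LEN, b"\x00"))


def unpack_cell(cell):
    """Split a 512-byte cell into (circ_id, command, seq, payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are exactly 512 bytes")
    return CELL.unpack(cell)


cell = pack_cell(0x1234, CMD_RELAY, 7, b"hello")
print(len(cell))  # prints 512
```

Because every cell carries its own sequence number, a receiver can position the AES/CTR keystream for a cell without having seen the cells before it, which is the property the proposal needs for DTLS links.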
Filename: 101-dir-voting.txt Title: Voting on the Tor Directory System Author: Nick Mathewson Created: Nov 2006 Status: Closed Implemented-In: 0.2.0.x Overview This document describes a consensus voting scheme for Tor directories; instead of publishing different network statuses, directories would vote on and publish a single "consensus" network status document. This is an open proposal. Proposal: 0. Scope and preliminaries This document describes a consensus voting scheme for Tor directories. Once it's accepted, it should be merged with dir-spec.txt. Some preliminaries for authority and caching support should be done during the 0.1.2.x series; the main deployment should come during the 0.2.0.x series. 0.1. Goals and motivation: voting. The current directory system relies on clients downloading separate network status statements from the caches signed by each directory. Clients download a new statement every 30 minutes or so, choosing to replace the oldest statement they currently have. This creates a partitioning problem: different clients have different "most recent" networkstatus sources, and different versions of each (since authorities change their statements often). It also creates a scaling problem: most of the downloaded networkstatus are probably quite similar, and the redundancy grows as we add more authorities. So if we have clients only download a single multiply signed consensus network status statement, we can: - Save bandwidth. - Reduce client partitioning - Reduce client-side and cache-side storage - Simplify client-side voting code (by moving voting away from the client) We should try to do this without: - Assuming that client-side or cache-side clocks are more correct than we assume now. - Assuming that authority clocks are perfectly correct. - Degrading badly if a few authorities die or are offline for a bit. We do not have to perform well if: - No clique of more than half the authorities can agree about who the authorities are. 1. The idea. 
Instead of publishing a network status whenever something changes, each authority instead publishes a fresh network status only once per "period" (say, 60 minutes). Authorities either upload this network status (or "vote") to every other authority, or download every other authority's "vote" (see 3.1 below for discussion on push vs pull). After an authority has (or has become convinced that it won't be able to get) every other authority's vote, it deterministically computes a consensus networkstatus, and signs it. Authorities download (or are uploaded; see 3.1) one another's signatures, and form a multiply signed consensus. This multiply-signed consensus is what caches cache and what clients download. If an authority is down, authorities vote based on what they *can* download/get uploaded. If an authority is "a little" down and only some authorities can reach it, authorities try to get its info from other authorities. If an authority computes the vote wrong, its signature isn't included on the consensus. Clients use a consensus if it is "trusted": signed by more than half the authorities they recognize. If clients can't find any such consensus, they use the most recent trusted consensus they have. If they don't have any trusted consensus, they warn the user and refuse to operate (and if DirServers is not the default, beg the user to adapt the list of authorities). 2. Details. 2.0. Versioning All documents generated here have version "3" given in their network-status-version entries. 2.1. Vote specifications Votes in v3 are similar to v2 network status documents. We add these fields to the preamble: "vote-status" -- the word "vote". "valid-until" -- the time when this authority expects to publish its next vote. "known-flags" -- a space-separated list of flags that will sometimes be included on "s" lines later in the vote. 
"dir-source" -- as before, except the "hostname" part MUST be the authority's nickname, which MUST be unique among authorities, and MUST match the nickname in the "directory-signature" entry. Authorities SHOULD cache their most recently generated votes so they can persist them across restarts. Authorities SHOULD NOT generate another document until valid-until has passed. Router entries in the vote MUST be sorted in ascending order by router identity digest. The flags in "s" lines MUST appear in alphabetical order. Votes SHOULD be synchronized to half-hour publication intervals (one hour? XXX say more; be more precise.) XXXX some way to request older networkstatus docs? 2.2. Consensus directory specifications Consensuses are like v3 votes, except for the following fields: "vote-status" -- the word "consensus". "published" is the latest of all the published times on the votes. "valid-until" is the earliest of all the valid-until times on the votes. "dir-source" and "fingerprint" and "dir-signing-key" and "contact" are included for each authority that contributed to the vote. "vote-digest" for each authority that contributed to the vote, calculated as for the digest in the signature on the vote. [XXX re-English this sentence] "client-versions" and "server-versions" are sorted in ascending order based on version-spec.txt. "dir-options" and "known-flags" are not included. [XXX really? why not list the ones that are used in the consensus? For example, right now BadExit is in use, but no servers would be labelled BadExit, and it's still worth knowing that it was considered by the authorities. -RD] The fields MUST occur in the following order: "network-status-version" "vote-status" "published" "valid-until" For each authority, sorted in ascending order of nickname, case- insensitively: "dir-source", "fingerprint", "contact", "dir-signing-key", "vote-digest". 
"client-versions" "server-versions" The signatures at the end of the document appear as multiple instances of directory-signature, sorted in ascending order by nickname, case-insensitively. A router entry should be included in the result if it is included by more than half of the authorities (total authorities, not just those whose votes we have). A router entry has a flag set if it is included by more than half of the authorities who care about that flag. [XXXX this creates an incentive for attackers to DOS authorities whose votes they don't like. Can we remember what flags people set the last time we saw them? -NM] [Which 'we' are we talking here? The end-users never learn which authority sets which flags. So you're thinking the authorities should record the last vote they saw from each authority and if it's within a week or so, count all the flags that it advertised as 'no' votes? Plausible. -RD] The signature hash covers from the "network-status-version" line through the characters "directory-signature" in the first "directory-signature" line. Consensus directories SHOULD be rejected if they are not signed by more than half of the known authorities. 2.2.1. Detached signatures Assuming full connectivity, every authority should compute and sign the same consensus directory in each period. Therefore, it isn't necessary to download the consensus computed by each authority; instead, the authorities only push/fetch each others' signatures. A "detached signature" document contains a single "consensus-digest" entry and one or more directory-signature entries. [XXXX specify more.] 2.3. URLs and timelines 2.3.1. URLs and timeline used for agreement An authority SHOULD publish its vote immediately at the start of each voting period. 
It does this by making it available at http://<hostname>/tor/status-vote/current/authority.z and sending it in an HTTP POST request to each other authority at the URL http://<hostname>/tor/post/vote If, N minutes after the voting period has begun, an authority does not have a current statement from another authority, the first authority retrieves the other's statement. Once an authority has a vote from another authority, it makes it available at http://<hostname>/tor/status-vote/current/<fp>.z where <fp> is the fingerprint of the other authority's identity key. The consensus network status, along with as many signatures as the server currently knows, should be available at http://<hostname>/tor/status-vote/current/consensus.z All of the detached signatures it knows for consensus status should be available at: http://<hostname>/tor/status-vote/current/consensus-signatures.z Once an authority has computed and signed a consensus network status, it should send its detached signature to each other authority in an HTTP POST request to the URL: http://<hostname>/tor/post/consensus-signature [XXXX Store votes to disk.] 2.3.2. Serving a consensus directory Once the authority is done getting signatures on the consensus directory, it should serve it from: http://<hostname>/tor/status/consensus.z Caches SHOULD download consensus directories from an authority and serve them from the same URL. 2.3.3. Timeline and synchronization [XXXX] 2.4. Distributing routerdescs between authorities Consensus will be more meaningful if authorities take steps to make sure that they all have the same set of descriptors _before_ the voting starts. This is safe, since all descriptors are self-certified and timestamped: it's always okay to replace a signed descriptor with a more recent one signed by the same identity. In the long run, we might want some kind of sophisticated process here. 
For now, since authorities already download one another's networkstatus documents and use them to determine what descriptors to download from one another, we can rely on this existing mechanism to keep authorities up to date. [We should do a thorough read-through of dir-spec again to make sure that the authorities converge on which descriptor to "prefer" for each router. Right now the decision happens at the client, which is no longer the right place for it. -RD] 3. Questions and concerns 3.1. Push or pull? The URLs above define a push mechanism for publishing votes and consensus signatures via HTTP POST requests, and a pull mechanism for downloading these documents via HTTP GET requests. As specified, every authority will post to every other. The "download if no copy has been received" mechanism exists only as a fallback. 4. Migration * It would be cool if caches could get ready to download consensus status docs, verify enough signatures, and serve them now. That way once stuff works all we need to do is upgrade the authorities. Caches don't need to verify the correctness of the format so long as it's signed (or maybe multisigned?). We need to make sure that caches back off very quickly from downloading consensus docs until they're actually implemented.
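The inclusion rules from section 2.2 of proposal 101 can be sketched in Python. This is only an illustration under stated assumptions, not the authorities' implementation: the function name `compute_consensus`, the dict layout used for a vote, and the reading of "authorities who care about a flag" as "voting authorities whose known-flags list names that flag" are all hypothetical choices made for the example.

```python
from collections import Counter

def compute_consensus(votes, total_authorities):
    """Return {router_digest: [flags]} for routers that make it into the consensus.

    votes: list of dicts, one per vote we hold (hypothetical layout):
        {"known_flags": set of flag names,
         "routers": {identity_digest: set of flags set for that router}}
    total_authorities: number of authorities that exist, not just those we heard from.
    """
    # Count how many votes list each router.
    listed = Counter()
    for vote in votes:
        listed.update(vote["routers"].keys())

    # Every flag that at least one voter knows about.
    all_flags = set().union(*(v["known_flags"] for v in votes))

    consensus = {}
    # Router entries are emitted in ascending order of identity digest.
    for digest in sorted(listed):
        # Include only if listed by more than half of ALL authorities.
        if listed[digest] * 2 <= total_authorities:
            continue
        flags = []
        # Flags are emitted in alphabetical order.
        for flag in sorted(all_flags):
            carers = [v for v in votes if flag in v["known_flags"]]
            yes = sum(1 for v in carers if flag in v["routers"].get(digest, set()))
            # Set the flag if more than half of the authorities that care agree.
            if yes * 2 > len(carers):
                flags.append(flag)
        consensus[digest] = flags
    return consensus
```

Note that a missing vote counts against router inclusion (the threshold uses the total number of authorities) but not against a flag, which is the incentive problem raised in the bracketed discussion in section 2.2.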
Filename: 102-drop-opt.txt
Title: Dropping "opt" from the directory format
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

  This document proposes a change in the format used to transmit router and directory information.

  This proposal has been accepted, implemented, and merged into dir-spec.txt.

Proposal:

  The "opt" keyword in Tor's directory formats was originally intended to mean, "it is okay to ignore this entry if you don't understand it"; the default behavior has been "discard a routerdesc if it contains entries you don't recognize."

  But so far, every new flag we have added has been marked 'opt'. It would probably make sense to change the default behavior to "ignore unrecognized fields", and add the statement that clients SHOULD ignore fields they don't recognize. As a meta-principle, we should say that clients and servers MUST NOT have to understand new fields in order to use directory documents correctly.

  Of course, this will make it impossible to say, "The format has changed a lot; discard this quietly if you don't understand it." We could do that by adding a version field.

Status:

  * We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it once earlier formats are obsolete.
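The behavior proposed in 102-drop-opt.txt can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the helper names `parse_entry` and `parse_document` are hypothetical, entries are treated as single whitespace-separated lines, and multi-line objects such as signatures are not handled.

```python
def parse_entry(line, known_keywords):
    """Parse one directory line; return (keyword, args) or None if it is ignored."""
    tokens = line.split()
    # A leading "opt" is tolerated but carries no meaning any more.
    if tokens and tokens[0] == "opt":
        tokens = tokens[1:]
    if not tokens:
        return None
    keyword, args = tokens[0], tokens[1:]
    # Unrecognized fields are skipped; the document is NOT discarded.
    if keyword not in known_keywords:
        return None
    return keyword, args

def parse_document(text, known_keywords):
    """Collect every recognized entry, silently ignoring the rest."""
    entries = []
    for line in text.splitlines():
        parsed = parse_entry(line, known_keywords)
        if parsed is not None:
            entries.append(parsed)
    return entries
```

With this default, a field is handled the same way whether or not it is prefixed with "opt", which is what allows the keyword to be dropped once older parsers are obsolete.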
Filename: 103-multilevel-keys.txt Title: Splitting identity key from regularly used signing key Author: Nick Mathewson Created: Jan 2007 Status: Closed Implemented-In: 0.2.0.x Overview: This document proposes a change in the way identity keys are used, so that highly sensitive keys can be password-protected and seldom loaded into RAM. It presents options; it is not yet a complete proposal. Proposal: Replacing a directory authority's identity key in the event of a compromise would be tremendously annoying. We'd need to tell every client to switch their configuration, or update to a new version with an uploaded list. So long as some weren't upgraded, they'd be at risk from whoever had compromised the key. With this in mind, it's a shame that our current protocol forces us to store identity keys unencrypted in RAM. We need some kind of signing key stored unencrypted, since we need to generate new descriptors/directories and rotate link and onion keys regularly. (And since, of course, we can't ask server operators to be on-hand to enter a passphrase every time we want to rotate keys or sign a descriptor.) The obvious solution seems to be to have a signing-only key that lives indefinitely (months or longer) and signs descriptors and link keys, and a separate identity key that's used to sign the signing key. Tor servers could run in one of several modes: 1. Identity key stored encrypted. You need to pick a passphrase when you enable this mode, and re-enter this passphrase every time you rotate the signing key. 1'. Identity key stored separate. You save your identity key to a floppy, and use the floppy when you need to rotate the signing key. 2. All keys stored unencrypted. In this case, we might not want to even *have* a separate signing key. (We'll need to support no-separate-signing-key mode anyway to keep old servers working.) 3. All keys stored encrypted. You need to enter a passphrase to start Tor. (Of course, we might not want to implement all of these.) 
Case 1 is probably most usable and secure, if we assume that people don't forget their passphrases or lose their floppies. We could mitigate this a bit by encouraging people to PGP-encrypt their passphrases to themselves, or keep a cleartext copy of their secret key secret-split into a few pieces, or something like that. Migration presents another difficulty, especially with the authorities. If we use the current set of identity keys as the new identity keys, we're in the position of having sensitive keys that have been stored on media-of-dubious-encryption up to now. Also, we need to keep old clients (who will expect descriptors to be signed by the identity keys they know and love, and who will not understand signing keys) happy. A possible solution: One thing to consider is that router identity keys are not very sensitive: if an OR disappears and reappears with a new key, the network treats it as though an old router had disappeared and a new one had joined the network. The Tor network continues unharmed; this isn't a disaster. Thus, the ideas above are mostly relevant for authorities. The most straightforward solution for the authorities is probably to take advantage of the protocol transition that will come with proposal 101, and introduce a new set of signing _and_ identity keys used only to sign votes and consensus network-status documents. Signing and identity keys could be delivered to users in a separate, rarely changing "keys" document, so that the consensus network-status documents wouldn't need to include N signing keys, N identity keys, and N certifications. Note also that there is no reason that the identity/signing keys used by directory authorities would necessarily have to be the same as the identity keys those authorities use in their capacity as routers. Decoupling these keys would give directory authorities the following set of keys: Directory authority identity: Highly confidential; stored encrypted and/or offline. 
Used to identify directory authorities. Shipped with clients. Used to sign Directory authority signing keys. Directory authority signing key: Stored online, accessible to regular Tor process. Used to sign votes and consensus directories. Downloaded as part of a "keys" document. [Administrators SHOULD rotate their signing keys every month or two, just to keep in practice and keep from forgetting the password to the authority identity.] V1-V2 directory authority identity: Stored online, never changed. Used to sign legacy network-status and directory documents. Router identity: Stored online, seldom changed. Used to sign server descriptors for this authority in its role as a router. Implicitly certified by being listed in network-status documents. Onion key, link key: As in tor-spec.txt Extensions to Proposal 101. Define a new document type, "Key certificate". It contains the following fields, in order: "dir-key-certificate-version": As network-status-version. Must be "3". "fingerprint": Hex fingerprint, with spaces, based on the directory authority's identity key. "dir-identity-key": The long-term identity key for this authority. "dir-key-published": The time when this directory's signing key was last changed. "dir-key-expires": A time after which this key is no longer valid. "dir-signing-key": As in proposal 101. "dir-key-certification": A signature of the above fields, in order. The signed material extends from the beginning of "dir-key-certificate-version" through the newline after "dir-key-certification". The identity key is used to generate this signature. These elements together constitute a "key certificate". These are generated offline when starting a v3 authority. Private identity keys SHOULD be stored offline, encrypted, or both. A running authority only needs access to the signing key. Unlike other keys currently used by Tor, the authority identity keys and directory signing keys MAY be longer than 1024 bits. 
(They SHOULD be 2048 bits or longer; they MUST NOT be shorter than 1024.) Vote documents change as follows: A key certificate MUST be included in-line in every vote document. With the exception of "fingerprint", its elements MUST NOT appear in consensus documents. Consensus network statuses change as follows: Remove dir-signing-key. Change "directory-signature" to take a fingerprint of the authority's identity key and a fingerprint of the authority's current signing key rather than the authority's nickname. Change "dir-source" to take a fingerprint of the authority's identity key rather than the authority's nickname or hostname. Add a new document type: A "keys" document contains all currently known key certificates. All authorities serve it at http://<hostname>/tor/status/keys.z Caches and clients download the keys document whenever they receive a consensus vote that uses a key they do not recognize. Caches download from authorities; clients download from caches. Processing votes: When receiving a vote, authorities check to see if the key certificate for the voter is different from the one they have. If the key certificate _is_ different, and its dir-key-published is more recent than the most recently known one, and it is well-formed and correctly signed with the correct identity key, then authorities remember it as the new canonical key certificate for that voter. A key certificate is invalid if any of the following hold: * The version is unrecognized. * The fingerprint does not match the identity key. * The identity key or the signing key is ill-formed. * The published date is very far in the past or future. * The signature is not a valid signature of the key certificate generated with the identity key. When processing the signatures on consensus, clients and caches act as follows: 1. Only consider the directory-signature entries whose identity key hashes match trusted authorities. 2. 
If any such entries have signing key hashes that match unknown signing keys, download a new keys document. 3. For every entry with a known (identity key, signing key) pair, check the signature on the document. 4. If the document has been signed by more than half of the authorities the client recognizes, treat the consensus as correctly signed. If not, but the number of entries with known identity keys but unknown signing keys might be enough to make the consensus correctly signed, do not use the consensus, but do not discard it until we have a new keys document.
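Steps 1 through 4 above can be sketched in Python. This is a rough illustration, not the client implementation: the function name `evaluate_consensus`, the tuple layout for signatures, the three result strings, and the `verify` callback (which stands in for real signature checking) are all assumptions made for the example.

```python
def evaluate_consensus(signatures, trusted_ids, known_pairs, verify):
    """Decide what to do with a consensus, following steps 1-4.

    signatures:  list of (identity_fp, signing_fp, sig) from directory-signature entries
    trusted_ids: set of identity-key fingerprints of the authorities we recognize
    known_pairs: set of (identity_fp, signing_fp) pairs taken from key certificates
    verify:      hypothetical callable(identity_fp, signing_fp, sig) -> bool

    Returns (decision, need_keys): decision is "accept", "hold" or "reject";
    need_keys says whether a new keys document should be downloaded.
    """
    good = set()      # authorities whose signature checked out
    unknown = set()   # trusted authorities that signed with a key we don't know
    for identity_fp, signing_fp, sig in signatures:
        # Step 1: ignore entries that are not from trusted authorities.
        if identity_fp not in trusted_ids:
            continue
        # Step 2: a trusted identity with an unknown signing key.
        if (identity_fp, signing_fp) not in known_pairs:
            unknown.add(identity_fp)
            continue
        # Step 3: check signatures made with known key pairs.
        if verify(identity_fp, signing_fp, sig):
            good.add(identity_fp)
    unknown -= good
    need_keys = bool(unknown)

    # Step 4: "more than half" of the authorities we recognize.
    needed = len(trusted_ids) // 2 + 1
    if len(good) >= needed:
        return "accept", need_keys
    if len(good) + len(unknown) >= needed:
        # Might become valid once we learn the new signing keys: keep it, don't use it.
        return "hold", need_keys
    return "reject", need_keys
```

The "hold" outcome corresponds to the last sentence of step 4: the consensus is neither used nor discarded until a new keys document has been fetched.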
Filename: 104-short-descriptors.txt Title: Long and Short Router Descriptors Author: Nick Mathewson Created: Jan 2007 Status: Closed Implemented-In: 0.2.0.x Overview: This document proposes moving unused-by-clients information from regular router descriptors into a new "extra info" router descriptor. Proposal: Some of the costliest fields in the current directory protocol are ones that no client actually uses. In particular, the "read-history" and "write-history" fields are used only by the authorities for monitoring the status of the network. If we took them out, the size of a compressed list of all the routers would fall by about 60%. (No other disposable field would save much more than 2%.) We propose to remove these fields from descriptors, and have them uploaded as a part of a separate signed "extra info" to the authorities. This document will be signed. A hash of this document will be included in the regular descriptors. (We considered another design, where routers would generate and upload a short-form and a long-form descriptor. Only the short-form descriptor would ever be used by anybody for routing. The long-form descriptor would be used only for analytics and other tools. We decided against this because well-behaved tools would need to download short-form descriptors too (as these would be the only ones indexed), and hence get redundant info. Badly behaved tools would download only long-form descriptors, and expose themselves to partitioning attacks.) Other disposable fields: Clients don't need these fields, but removing them doesn't help bandwidth enough to be worthwhile. contact (save about 1%) fingerprint (save about 3%) We could represent these fields more succinctly, but removing them would only save 1%. (!) reject accept (Apparently, exit policies are highly compressible.) [Does size-on-disk matter to anybody? Some clients and servers don't have much disk, or have really slow disk (e.g. USB). And we don't store caches compressed right now. 
-RD] Specification: 1. Extra Info Format. An "extra info" descriptor contains the following fields: "extra-info" Nickname Fingerprint Identifies what router this is an extra info descriptor for. Fingerprint is encoded in hex (using upper-case letters), with no spaces. "published" As currently documented in dir-spec.txt. It MUST match the "published" field of the descriptor published at the same time. "read-history" "write-history" As currently documented in dir-spec.txt. Optional. "router-signature" NL Signature NL A signature of the PKCS1-padded hash of the entire extra info document, taken from the beginning of the "extra-info" line, through the newline after the "router-signature" line. An extra info document is not valid unless the signature is performed with the identity key whose digest matches FINGERPRINT. The "extra-info" field is required and MUST appear first. The router-signature field is required and MUST appear last. All others are optional. As for other documents, unrecognized fields must be ignored. 2. Existing formats Implementations that use "read-history" and "write-history" SHOULD continue accepting router descriptors that contain them. (Prior to 0.2.0.x, this information was encoded in ordinary router descriptors; in any case they have always been listed as opt, so they should be accepted anyway.) Add these fields to router descriptors: "extra-info-digest" Digest "Digest" is a hex-encoded digest (using upper-case characters) of the router's extra-info document, as signed in the router's extra-info. (If this field is absent, no extra-info-digest exists.) "caches-extra-info" Present if this router is a directory cache that provides extra-info documents, or an authority that handles extra-info documents. (Since implementations before 0.1.2.5-alpha required that the "opt" keyword precede any unrecognized entry, these keys MUST be preceded with "opt" until 0.1.2.5-alpha is obsolete.) 3. 
New communications rules Servers SHOULD generate and upload one extra-info document after each descriptor they generate and upload; no more, no less. Servers MUST upload the new descriptor before they upload the new extra-info. Authorities receiving an extra-info document SHOULD verify all of the following: * They have a router descriptor for some server with a matching nickname and identity fingerprint. * That server's identity key has been used to sign the extra-info document. * The extra-info-digest field in the router descriptor matches the digest of the extra-info document. * The published fields in the two documents match. Authorities SHOULD drop extra-info documents that do not meet these criteria. Extra-info documents MAY be uploaded as part of the same HTTP post as the router descriptor, or separately. Authorities MUST accept both methods. Authorities SHOULD try to fetch extra-info documents from one another if they do not have one matching the digest declared in a router descriptor. Caches that are running locally with a tool that needs to use extra-info documents MAY download and store extra-info documents. They should do so when they notice that the recommended descriptor has an extra-info-digest not matching any extra-info document they currently have. (Caches not running on a host that needs to use extra-info documents SHOULD NOT download or cache them.) 4. New URLs http://<hostname>/tor/extra/d/... http://<hostname>/tor/extra/fp/... http://<hostname>/tor/extra/all[.z] (As for /tor/server/ URLs: supports fetching extra-info documents by their digest, by the fingerprint of their servers, or all at once. When serving by fingerprint, we serve the extra-info that corresponds to the descriptor we would serve by that fingerprint. Only directory authorities are guaranteed to support these URLs.) http://<hostname>/tor/extra/authority[.z] (The extra-info document for this router.) 
Extra-info documents are uploaded to the same URLs as regular router descriptors. Migration: For extra info approach: * First: * Authorities should accept extra info, and support serving it. * Routers should upload extra info once authorities accept it. * Caches should support an option to download and cache it, once authorities serve it. * Tools should be updated to use locally cached information. These tools include: lefkada's exit.py script. tor26's noreply script and general directory cache. https://nighteffect.us/tns/ for its graphs and check with or-talk for the rest, once it's time. * Set a cutoff time for including bandwidth in router descriptors, so that tools that use bandwidth info know that they will need to fetch extra info documents. * Once tools that want bandwidth info support fetching extra info: * Have routers stop including bandwidth info in their router descriptors.
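The four checks that authorities apply to an uploaded extra-info document (section 3 of proposal 104) can be sketched in Python as follows. The function name `accept_extra_info`, the dict field names, and the precomputed `signature_ok` flag are hypothetical; real code would parse the documents and verify the signature itself.

```python
def accept_extra_info(extra, descriptors, signature_ok):
    """Return True if an authority should keep this extra-info document.

    extra:        dict with "nickname", "fingerprint", "published", "digest"
                  ("digest" is the digest of the extra-info document itself)
    descriptors:  dict mapping identity fingerprint -> router descriptor dict with
                  "nickname", "published" and optionally "extra_info_digest"
    signature_ok: True if the extra-info was signed by the identity key whose
                  digest matches the fingerprint (checked elsewhere)
    """
    # Check 1: a descriptor exists with matching nickname and fingerprint.
    desc = descriptors.get(extra["fingerprint"])
    if desc is None or desc["nickname"] != extra["nickname"]:
        return False
    # Check 2: the server's identity key signed the extra-info document.
    if not signature_ok:
        return False
    # Check 3: the descriptor's extra-info-digest names this document.
    if desc.get("extra_info_digest") != extra["digest"]:
        return False
    # Check 4: the published fields of the two documents match.
    return desc["published"] == extra["published"]
```

Because servers MUST upload the descriptor before the extra-info, check 1 should normally succeed by the time the extra-info document arrives.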
Filename: 105-handshake-revision.txt Title: Version negotiation for the Tor protocol. Author: Nick Mathewson, Roger Dingledine Created: Jan 2007 Status: Closed Implemented-In: 0.2.0.x Overview: This document was extracted from a modified version of tor-spec.txt that we had written before the proposal system went into place. It adds two new cell types to the Tor link connection setup handshake: one used for version negotiation, and another to prevent MITM attacks. This proposal is partially implemented, and partially superseded by proposal 130. Motivation: Tor versions Our *current* approach to versioning the Tor protocol(s) has been as follows: - All changes must be backward compatible. - It's okay to add new cell types, if they would be ignored by previous versions of Tor. - It's okay to add new data elements to cells, if they would be ignored by previous versions of Tor. - For forward compatibility, Tor must ignore cell types it doesn't recognize, and ignore data in those cells it doesn't expect. - Clients can inspect the version of Tor declared in the platform line of a router's descriptor, and use that to learn whether a server supports a given feature. Servers, however, aren't assumed to all know about each other, and so don't know the version of who they're talking to. This system has these problems: - It's very hard to change fundamental aspects of the protocol, like the cell format, the link protocol, any of the various encryption schemes, and so on. - The router-to-router link protocol has remained more-or-less frozen for a long time, since we can't easily have an OR use new features unless it knows the other OR will understand them. We need to resolve these problems because: - Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will not seem like the best idea for all time. - There are many ideas circulating for multiple cell sizes; while it's not obvious whether these are safe, we can't do them at all without a mechanism to permit them. 
- There are many ideas circulating for alternative circuit building and cell relay rules: they don't work unless they can coexist in the current network. - If our protocol changes a lot, it's hard to describe any coherent version of it: we need to say "the version that Tor versions W through X use when talking to versions Y through Z". This makes analysis harder. Motivation: Preventing MITM attacks TLS prevents a man-in-the-middle attacker from reading or changing the contents of a communication. It does not, however, prevent such an attacker from observing timing information. Since timing attacks are some of the most effective against low-latency anonymity nets like Tor, we should take more care to make sure that we're not only talking to who we think we're talking to, but that we're using the network path we believe we're using. Motivation: Signed clock information It's very useful for Tor instances to know how skewed they are relative to one another. The only way to find out currently has been to download directory information, and check the Date header--but this is not authenticated, and hence subject to modification on the wire. Using BEGIN_DIR to create an authenticated directory stream through an existing circuit is better, but that's an extra step and it might be nicer to learn the information in the course of the regular protocol. Proposal: 1.0. Version numbers The node-to-node TLS-based "OR connection" protocol and the multi-hop "circuit" protocol are versioned quasi-independently. Of course, some dependencies will continue to exist: Certain versions of the circuit protocol may require a minimum version of the connection protocol to be used. The connection protocol affects: - Initial connection setup, link encryption, transport guarantees, etc. - The allowable set of cell commands - Allowable formats for cells. 
The circuit protocol determines: - How circuits are established and maintained - How cells are decrypted and relayed - How streams are established and maintained. Version numbers are incremented for backward-incompatible protocol changes only. Backward-compatible changes are generally implemented by adding additional fields to existing structures; implementations MUST ignore fields they do not expect. Unused portions of cells MUST be set to zero. Though versioning the protocol will make it easier to maintain backward compatibility with older versions of Tor, we will nevertheless continue to periodically drop support for older protocols, - to keep the implementation from growing without bound, - to limit the maintenance burden of patching bugs in obsolete Tors, - to limit the testing burden of verifying that many old protocol versions continue to be implemented properly, and - to limit the exposure of the network to protocol versions that are expensive to support. The Tor protocol as implemented through the 0.1.2.x Tor series will be called "version 1" in its link protocol and "version 1" in its relay protocol. Versions of the Tor protocol so old as to be incompatible with Tor 0.1.2.x can be considered to be version 0 of each, and are not supported. 2.1. VERSIONS cells When a Tor connection is established, both parties normally send a VERSIONS cell before sending any other cells. (But see below.) VersionsLen [2 byte] Versions [VersionsLen bytes] "Versions" is a sequence of VersionsLen bytes. Each value between 1 and 127 inclusive represents a single version; current implementations MUST ignore other bytes. Parties should list all of the versions which they are able and willing to support. Parties can only communicate if they have some connection protocol version in common. Version 0.2.0.x-alpha and earlier don't understand VERSIONS cells, and therefore don't support version negotiation. 
Thus, waiting until the other side has sent a VERSIONS cell won't work for these servers: if the other side sends no cells back, it is impossible to tell whether they have sent a VERSIONS cell that has been stalled, or whether they have dropped our own VERSIONS cell as unrecognized. Therefore, we'll change the TLS negotiation parameters so that old parties can still negotiate, but new parties can recognize each other. Immediately after a TLS connection has been established, the parties check whether the other side negotiated the connection in an "old" way or a "new" way. If either party negotiated in the "old" way, we assume a v1 connection. Otherwise, both parties send VERSIONS cells listing all their supported versions. Upon receiving the other party's VERSIONS cell, the implementation begins using the highest-valued version common to both cells. If the first cell from the other party has a recognized command, and is _not_ a VERSIONS cell, we assume a v1 protocol. (For more detail on the TLS protocol change, see forthcoming draft proposals from Steven Murdoch.) Implementations MUST discard VERSIONS cells that are not the first recognized cells sent on a connection. The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1 byte of command, 509 bytes of payload). [NOTE: The VERSIONS cell is assigned the command number 7.] 2.2. MITM-prevention and time checking If we negotiate a v2 connection or higher, the second cell we send SHOULD be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other times. A NETINFO cell contains: Timestamp [4 bytes] Other OR's address [variable] Number of addresses [1 byte] This OR's addresses [variable] Timestamp is the OR's current Unix time, in seconds since the epoch. If an implementation receives time values from many ORs that indicate that its clock is skewed, it SHOULD try to warn the administrator. (We leave the definition of 'many' intentionally vague for now.) 
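The version-selection rule from section 2.1 can be sketched in Python. This covers only the VERSIONS payload, not the surrounding cell header or padding. The function names are hypothetical, and the network (big-endian) byte order for VersionsLen is an assumption, since the text above does not state it.

```python
import struct

def encode_versions_payload(versions):
    """Build VersionsLen [2 bytes] followed by one byte per supported version."""
    for v in versions:
        if not 1 <= v <= 127:
            raise ValueError("versions must be between 1 and 127 inclusive")
    body = bytes(sorted(set(versions)))
    # Assumption: VersionsLen is sent in network (big-endian) byte order.
    return struct.pack("!H", len(body)) + body

def decode_versions_payload(payload):
    """Return the set of versions listed, ignoring bytes outside 1..127."""
    (length,) = struct.unpack("!H", payload[:2])
    body = payload[2:2 + length]
    return {b for b in body if 1 <= b <= 127}

def negotiate(our_versions, their_payload):
    """Pick the highest version common to both sides, or None if there is none."""
    common = set(our_versions) & decode_versions_payload(their_payload)
    return max(common) if common else None
```

If `negotiate` returns None the parties have no connection protocol version in common and cannot communicate.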
Before believing the timestamp in a NETINFO cell, implementations SHOULD compare the time at which they received the cell to the time when they sent their VERSIONS cell. If the difference is very large, it is likely that the cell was delayed long enough that its contents are out of date. Each address contains Type/Length/Value as used in Section 6.4 of tor-spec.txt. The first address is the one that the party sending the NETINFO cell believes the other has -- it can be used to learn what your IP address is if you have no other hints. The rest of the addresses are the advertised addresses of the party sending the NETINFO cell -- we include them to block a man-in-the-middle attack on TLS that lets an attacker bounce traffic through his own computers to enable timing and packet-counting attacks. A Tor instance should use the other Tor's reported address information as part of logic to decide whether to treat a given connection as suitable for extending circuits to a given address/ID combination. When we get an extend request, we use an existing OR connection if the ID matches, and ANY of the following conditions hold: - The IP matches the requested IP. - We know that the IP we're using is canonical because it was listed in the NETINFO cell. - We know that the IP we're using is canonical because it was listed in the server descriptor. [NOTE: The NETINFO cell is assigned the command number 8.] Discussion: Versions versus feature lists Many protocols negotiate lists of available features instead of (or in addition to) protocol versions. While it's possible that some amount of feature negotiation could be supported in a later Tor, we should prefer to use protocol versions whenever possible, for reasons discussed in the "Anonymity Loves Company" paper. Discussion: Bytes per version, versions per cell This document provides for a one-byte count of how many versions a Tor supports, and allows one byte per version. 
Thus, it can support only 254 more versions of the protocol beyond the unallocated v0 and the current v1. If we ever need to split the protocol into 255 incompatible versions, we've probably screwed up badly somewhere.

Nevertheless, here are two ways we could support more versions:

- Change the version count to a two-byte field that counts the number of _bytes_ used, and use a UTF8-style encoding: versions 0 through 127 take one byte to encode, versions 128 through 2047 take two bytes to encode, and so on. We wouldn't need to parse any version higher than 127 right now, since all bytes used to encode higher versions would have their high bit set. We'd still have a limit of 380 simultaneous versions that could be declared in any version. This is probably okay.
- Decide that if we need to support more versions, we can add a MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec above requires Tors to ignore unrecognized cell types that they get before the first VERSIONS cell, and still allows version negotiation to succeed.

[Resolution: Reserve the high bit and the v0 value for later use. If we ever have more live versions than we can fit in a cell, we've made a bad design decision somewhere along the line.]

Discussion: Reducing round-trips

It might be appealing to see if we can cram more information in the initial VERSIONS cell. For example, the contents of NETINFO will pretty soon be sent by everybody before any more information is exchanged, but decoupling them from the version exchange increases round-trips. Instead, we could speculatively include handshaking information at the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore this." This could be extended to opportunistically reduce round trips when possible for future versions when we guess the versions right.
Of course, we'd need to be careful about using a feature like this:

- We don't want to include things that are expensive to compute, like PK signatures or proof-of-work.
- We don't want to speculate as a mobile client: it may leak our experience with the server in question.

Discussion: Advertising versions in routerdescs and networkstatuses.

In network-statuses:

The networkstatus "v" line now has the format:

    "v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST "Circuit" CIRCUIT-VERSION-LIST NL

LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of supported version numbers. IMPLEMENTATION is the name of the implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the version of the implementation.

Examples:

    v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5
    v OtherOR 2000+ Link 3 Circuit 5

Implementations that release independently of the Tor codebase SHOULD NOT use "Tor" as the value of their IMPLEMENTATION.

Additional fields on the "v" line MUST be ignored.

In router descriptors:

The router descriptor should contain a line of the form,

    "protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT-VERSION-LIST

Additional fields on the "protocols" line MUST be ignored.

[Versions of Tor before 0.1.2.5-alpha rejected router descriptors with unrecognized items; the protocols line should be preceded with an "opt" until these Tors are obsolete.]

Security issues:

Client partitioning is the big danger when we introduce new versions; if a client supports some very unusual set of protocol versions, it will stand out from others no matter where it goes. If a server supports an unusual version, it will get a disproportionate amount of traffic from clients who prefer that version.

We can mitigate this somewhat as follows:

- Do not have clients prefer any protocol version by default until that version is widespread. (First introduce the new version to servers, and have clients admit to using it only when configured to do so for testing.
Then, once many servers are running the new protocol version, enable its use by default.)
- Do not multiply protocol versions needlessly.
- Encourage protocol implementors to implement the same protocol version sets as some popular version of Tor.
- Disrecommend very old/unpopular versions of Tor via the directory authorities' RecommendedVersions mechanism, even if it is still technically possible to use them.
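The negotiation rule from section 2.1 (each side lists its versions, and both use the highest version they have in common) can be sketched as follows. This is only an illustration, not Tor's implementation: the function names are hypothetical, and big-endian byte order for VersionsLen is an assumption, since the text above gives only the field width.

```python
def parse_versions_payload(payload: bytes) -> set:
    """Parse a VERSIONS payload: VersionsLen (2 bytes) then Versions bytes.

    Assumption: VersionsLen is big-endian (network order).
    """
    if len(payload) < 2:
        raise ValueError("payload too short for VersionsLen")
    versions_len = int.from_bytes(payload[:2], "big")
    body = payload[2:2 + versions_len]
    if len(body) < versions_len:
        raise ValueError("truncated Versions field")
    # Each value from 1 to 127 names one version; other bytes are ignored.
    return {b for b in body if 1 <= b <= 127}


def negotiate(ours, theirs):
    """Return the highest version common to both sides, or None."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

If `negotiate` returns None, the parties have no connection protocol version in common and cannot communicate.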
Filename: 106-less-tls-constraint.txt Title: Checking fewer things during TLS handshakes Author: Nick Mathewson Created: 9-Feb-2007 Status: Closed Implemented-In: 0.2.0.x Overview: This document proposes that we relax our requirements on the context of X.509 certificates during initial TLS handshakes. Motivation: Later, we want to try harder to avoid protocol fingerprinting attacks. This means that we'll need to make our connection handshake look closer to a regular HTTPS connection: one certificate on the server side and zero certificates on the client side. For now, about the best we can do is to stop requiring things during handshake that we don't actually use. What we check now, and where we check it: tor_tls_check_lifetime: peer has certificate notBefore <= now <= notAfter tor_tls_verify: peer has at least one certificate There is at least one certificate in the chain At least one of the certificates in the chain is not the one used to negotiate the connection. (The "identity cert".) The certificate _not_ used to negotiate the connection has signed the link cert tor_tls_get_peer_cert_nickname: peer has a certificate. certificate has a subjectName. subjectName has a commonName. commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2] tor_tls_peer_has_cert: peer has a certificate. connection_or_check_valid_handshake: tor_tls_peer_has_cert [1] tor_tls_get_peer_cert_nickname [1] tor_tls_verify [1] If nickname in cert is a known, named router, then its identity digest must be as expected. If we initiated the connection, then we got the identity digest we expected. USEFUL THINGS WE COULD DO: [1] We could just not force clients to have any certificate at all, let alone an identity certificate. Internally to the code, we could assign the identity_digest field of these or_connections to a random number, or even not add them to the identity_digest->or_conn map. [so if somebody connects with no certs, we let them. 
and mark them as a client and don't treat them as a server. great. -rd]

[2] Instead of using a restricted nickname character set that makes our commonName structure look unlike typical SSL certificates, we could treat the nickname as extending from the start of the commonName up to but not including the first non-nickname character.

Alternatively, we could stop checking commonNames entirely. We don't actually _do_ anything based on the nickname in the certificate, so there's really no harm in letting every router have any commonName it wants.

[this is the better choice -rd]
[agreed. -nm]

REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS:

Assuming that we removed the above requirements, and then (in a later release) had clients not send certificates and started making our DNs a little less formulaic, client->server OR connections would still be recognizable by:

  having a two-certificate chain sent by the server
  using a particular set of ciphersuites
  traffic patterns
  probing the server later

OTHER IMPLICATIONS:

If we stop verifying the above requirements:

  It will be slightly (but only slightly) more common to connect to a non-Tor server running TLS, and believe that you're talking to a Tor server (until you send the first cell).

  It will be far easier for non-Tor SSL clients to accidentally connect to Tor servers and speak HTTPS or whatever to them.

If, in a later release, we have clients not send certificates, and we make DNs less recognizable:

  If clients don't send certs, servers don't need to verify them: win!

  If we remove these restrictions, it will be easier for people to write clients to fuzz our protocol: sorta win!

  If clients don't send certs, they look slightly less like servers.

OTHER SPEC CHANGES:

When a client doesn't give us an identity, we should never extend any circuits to it (duh), and we should allow it to set circuit ID however it wants.
Filename: 107-uptime-sanity-checking.txt Title: Uptime Sanity Checking Author: Kevin Bauer & Damon McCoy Created: 8-March-2007 Status: Closed Implemented-In: 0.2.0.x Overview: This document describes how to cap the uptime that is used when computing which routers are marked as stable such that highly stable routers cannot be displaced by malicious routers that report extremely high uptime values. This is similar to how bandwidth is capped at 1.5MB/s. Motivation: It has been pointed out that an attacker can displace all stable nodes and entry guard nodes by reporting high uptimes. This is an easy fix that will prevent highly stable nodes from being displaced. Security implications: It should decrease the effectiveness of routing attacks that report high uptimes while not impacting the normal routing algorithms. Specification: So we could patch Section 3.1 of dir-spec.txt to say: "Stable" -- A router is 'Stable' if it is running, valid, not hibernating, and either its uptime is at least the median uptime for known running, valid, non-hibernating routers, or its uptime is at least 30 days. Routers are never called stable if they are running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.) Compatibility: There should be no compatibility issues due to uptime capping. Implementation: Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788). Discussion: Initially, this proposal set the maximum at 60 days, not 30; the 30 day limit and spec wording was suggested by Roger in an or-dev post on 9 March 2007. This proposal also led to 108-mtbf-based-stability.txt
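The capped rule quoted in the Specification can be illustrated with a small sketch. It assumes the input already contains only running, valid, non-hibernating routers, and it leaves out the exclusion for Tor versions known to drop circuits; the function and constant names are hypothetical.

```python
from statistics import median

UPTIME_CAP = 30 * 24 * 60 * 60  # 30 days, in seconds


def stable_routers(uptimes: dict) -> set:
    """Return the routers that would be called Stable.

    `uptimes` maps a router identifier to its declared uptime in seconds.
    """
    if not uptimes:
        return set()
    med = median(uptimes.values())
    # Stable if at or above the median, or at or above the 30-day cap.
    return {r for r, up in uptimes.items() if up >= med or up >= UPTIME_CAP}
```

With the cap in place, routers that report absurdly high uptimes can still raise the median, but they can no longer push a router with 30 or more days of uptime out of the Stable set.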
Filename: 108-mtbf-based-stability.txt
Title: Base "Stable" Flag on Mean Time Between Failures
Author: Nick Mathewson
Created: 10-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x

Overview:

This document proposes that we change how directory authorities set the stability flag from inspection of a router's declared Uptime to the authorities' perceived mean time between failures for the router.

Motivation:

Clients prefer nodes that the authorities call Stable. This flag is (as of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for uptime. This creates an opportunity for malicious nodes to declare falsely high uptimes in order to get more traffic.

Spec changes:

Replace the current rule for setting the Stable flag with:

"Stable" -- A router is 'Stable' if it is active and its observed Stability for the past month is at or above the median Stability for active routers. Routers are never called stable if they are running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.)

Stability shall be defined as the weighted mean length of the runs observed by a given directory authority. A run begins when an authority decides that the server is Running, and ends when the authority decides that the server is not Running. In-progress runs are counted when measuring Stability. When calculating the mean, runs are weighted by $\alpha ^ t$, where $t$ is time elapsed since the end of the run, and $0 < \alpha < 1$. Time when an authority is down does not count toward the length of the run.

Rejected Alternative:

"A router's Stability shall be defined as the sum of $\alpha ^ d$ for every $d$ such that the router was considered reachable for the entire day $d$ days ago."

This allows a simpler implementation: every day, we multiply yesterday's Stability by alpha, and if the router was observed to be available every time we looked today, we add 1.

Instead of "day", we could pick an arbitrary time unit.
We should pick alpha to be high enough that long-term stability counts, but low enough that the distant past is eventually forgotten. Something between .8 and .95 seems right.

(By requiring that routers be up for an entire day to get their stability increased, instead of counting fractions of a day, we capture the notion that stability is more like "probability of staying up for the next hour" than it is like "probability of being up at some randomly chosen time over the next hour." The former notion of stability is far more relevant for long-lived circuits.)

Limitations:

Authorities can have false positives and false negatives when trying to tell whether a router is up or down. So long as these aren't terribly wrong, and so long as they aren't significantly biased, we should be able to use them to estimate stability pretty well.

Probing approaches like the above could miss short incidents of downtime. If we use the router's declared uptime, we could detect these: but doing so would penalize routers who reported their uptime accurately.

Implementation:

For now, the easiest way to store this information at authorities would probably be in some kind of periodically flushed flat file. Later, we could move to Berkeley DB or something if we really had to.

For each router, an authority will need to store:

  The router ID.
  Whether the router is up.
  The time when the current run started, if the router is up.
  The weighted sum length of all previous runs.
  The time at which the weighted sum length was last weighted down.

Authorities should probe at random intervals to test whether servers are running.
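The Stability definition under "Spec changes" (a mean of run lengths, each weighted by alpha raised to the time since the run ended) can be sketched as below. This is an illustration only: the function name, the choice of days as the time unit, and the default alpha are assumptions, and the sketch does not subtract time during which the authority itself was down.

```python
def stability(runs, now, alpha=0.95):
    """Weighted mean run length for one router, as seen by one authority.

    `runs` is a list of (start, end) times; end is None for the run that is
    still in progress. Times are in days (an assumed unit).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for start, end in runs:
        if end is None:
            end = now  # in-progress runs count, measured up to now
        weight = alpha ** (now - end)  # older runs count for less
        weighted_sum += weight * (end - start)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0
```

An authority that stores the weighted sum and the time it was last weighted down, as listed under Implementation, could update the same quantity incrementally instead of keeping every run.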
Filename: 109-no-sharing-ips.txt Title: No more than one server per IP address Author: Kevin Bauer & Damon McCoy Created: 9-March-2007 Status: Closed Implemented-In: 0.2.0.x Overview: This document describes a solution to a Sybil attack vulnerability in the directory servers. Currently, it is possible for a single IP address to host an arbitrarily high number of Tor routers. We propose that the directory servers limit the number of Tor routers that may be registered at a particular IP address to some small (fixed) number, perhaps just one Tor router per IP address. While Tor never uses more than one server from a given /16 in the same circuit, an attacker with multiple servers in the same place is still dangerous because he can get around the per-server bandwidth cap that is designed to prevent a single server from attracting too much of the overall traffic. Motivation: Since it is possible for an attacker to register an arbitrarily large number of Tor routers, it is possible for malicious parties to do this as part of a traffic analysis attack. Security implications: This countermeasure will increase the number of IP addresses that an attacker must control in order to carry out traffic analysis. Specification: For each IP address, each directory authority tracks the number of routers using that IP address, along with their total observed bandwidth. If there are more than MAX_SERVERS_PER_IP servers at some IP, the authority should "disable" all but MAX_SERVERS_PER_IP servers. When choosing which servers to disable, the authority should first disable non-Running servers in increasing order of observed bandwidth, and then should disable Running servers in increasing order of bandwidth. [[ We don't actually do this part here. 
-NM

If the total observed bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP, the authority should "disable" some of the remaining servers until only one server remains, or until the remaining observed bandwidth of non-"disabled" servers is under MAX_BW_PER_IP. ]]

Servers that are "disabled" MUST be marked as non-Valid and non-Running.

MAX_SERVERS_PER_IP is 3.

MAX_BW_PER_IP is 8 MB/s.

Compatibility:

Upon inspection of a directory server, we found that the following IP addresses have more than one Tor router:

  Scruples   68.5.113.81    ip68-5-113-81.oc.oc.cox.net             443
  WiseUp     68.5.113.81    ip68-5-113-81.oc.oc.cox.net             9001
  Unnamed    62.1.196.71    pc01-megabyte-net-arkadiou.megabyte.gr  9001
  Unnamed    62.1.196.71    pc01-megabyte-net-arkadiou.megabyte.gr  9001
  Unnamed    62.1.196.71    pc01-megabyte-net-arkadiou.megabyte.gr  9001
  aurel      85.180.62.138  e180062138.adsl.alicedsl.de             9001
  sokrates   85.180.62.138  e180062138.adsl.alicedsl.de             9001
  moria1     18.244.0.188   moria.mit.edu                           9001
  peacetime  18.244.0.188   moria.mit.edu                           9100

There may exist compatibility issues with this proposed fix. Reasons why more than one server would share an IP address include:

* Testing. moria1, moria2, peacetime, and other morias all run on one computer at MIT, because that way we get testing. Moria1 and moria2 are run by Roger, and peacetime is run by Nick.

* NAT. If there are several servers but they port-forward through the same IP address, ... we can hope that the operators coordinate with each other. Also, we should recognize that while they help the network in terms of increased capacity, they don't help as much as they could in terms of location diversity. But our approach so far has been to take what we can get.

* People who have more than 1.5MB/s and want to help out more. For example, for a while Tonga was offering 10MB/s and its Tor server would only make use of a bit of it. So Roger suggested that he run two Tor servers, to use more.
[Note Roger's tweak to this behavior, in http://archives.seul.org/or/cvs/Oct-2007/msg00118.html]
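The per-IP limit in the Specification (disable non-Running servers first, then Running ones, each in increasing order of observed bandwidth, until at most MAX_SERVERS_PER_IP remain) might look roughly like this. The record layout and function name are hypothetical, and the bracketed MAX_BW_PER_IP step, which the note says is not actually done, is omitted.

```python
MAX_SERVERS_PER_IP = 3


def servers_to_disable(servers):
    """Return the ids of servers an authority would "disable".

    `servers` is a list of dicts with hypothetical keys:
    "id", "ip", "running" (bool) and "bandwidth" (observed bandwidth).
    """
    by_ip = {}
    for server in servers:
        by_ip.setdefault(server["ip"], []).append(server)

    disabled = set()
    for group in by_ip.values():
        excess = len(group) - MAX_SERVERS_PER_IP
        if excess <= 0:
            continue
        # False sorts before True, so non-Running servers come first;
        # within each class, lower bandwidth comes first.
        order = sorted(group, key=lambda s: (s["running"], s["bandwidth"]))
        disabled.update(s["id"] for s in order[:excess])
    return disabled
```

The sketch keys on an "id" field instead of the nickname because, as the list above shows, several routers at one address can share the nickname "Unnamed".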
Filename: 110-avoid-infinite-circuits.txt
Title: Avoiding infinite length circuits
Author: Roger Dingledine
Created: 13-Mar-2007
Status: Closed
Target: 0.2.3.x
Implemented-In: 0.2.1.3-alpha, 0.2.3.11-alpha

History:

  Revised 28 July 2008 by nickm: set K.
  Revised 3 July 2008 by nickm: rename from relay_extend to relay_early. Revise to current migration plan. Allow K cells over circuit lifetime, not just at start.

Overview:

Right now, an attacker can add load to the Tor network by extending a circuit an arbitrary number of times. Every cell that goes down the circuit then adds N times that amount of load in overall bandwidth use. This vulnerability arises because servers don't know their position on the path, so they can't tell how many nodes there are before them on the path.

We propose a new set of relay cells that are distinguishable by intermediate hops as permitting extend cells. This approach will allow us to put an upper bound on circuit length relative to the number of colluding adversary nodes; but there are some downsides too.

Motivation:

The above attack can be used to generally increase load all across the network, or it can be used to target specific servers: by building a circuit back and forth between two victim servers, even a low-bandwidth attacker can soak up all the bandwidth offered by the fastest Tor servers.

The general attacks could be used as a demonstration that Tor isn't perfect (leading to yet more media articles about "breaking" Tor), and the targeted attacks will come into play once we have a reputation system -- it will be trivial to DoS a server so it can't pass its reputation checks, in turn impacting security.

Design:

We should split RELAY cells into two types: RELAY and RELAY_EARLY.

Only K (say, 10) relay_early cells can be sent across a circuit, and only relay_early cells are allowed to contain extend requests.
We still support obscuring the length of the circuit (if more research shows us what to do), because Alice can choose how many of the K to mark as relay_early. Note that relay_early cells *can* contain any sort of data cell; so in effect it's actually the relay type cells that are restricted. By default, she would just send the first K data cells over the stream as relay_early cells, regardless of their actual type.

(Note that a circuit that is out of relay_early cells MUST NOT be cannibalized later, since it can't extend. Note also that it's always okay to use regular RELAY cells when sending non-EXTEND commands targeted at the first hop of a circuit, since there is no intermediate hop to try to learn the relay command type.)

Each intermediate server would pass on the same type of cell that it received (either relay or relay_early), and the cell's destination will be able to learn whether it's allowed to contain an Extend request.

If an intermediate server receives more than K relay_early cells, or if it sees a relay cell that contains an extend request, then it tears down the circuit (protocol violation).

Security implications:

The upside is that this limits the bandwidth amplification factor to K: for an individual circuit to become arbitrary-length, the attacker would need an adversary-controlled node every K hops, and at that point the attack is no worse than if the attacker creates N/K separate K-hop circuits.

On the other hand, we want to pick a large enough value of K that we don't mind the cap.

If we ever want to take steps to hide the number of hops in the circuit or a node's position in the circuit, this design probably makes that more complex.

Migration:

In 0.2.0, servers speaking v2 or later of the link protocol accept RELAY_EARLY cells, and pass them on. If the next OR in the circuit is not speaking the v2 link protocol, the server relays the cell as a RELAY cell.

In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2 connections.
This functionality can be safely backported to 0.2.0.x. Clients should pick a random number of RELAY_EARLY cells to send, between (say) K-2 and K.

In 0.2.1.3-alpha, servers close any circuit in which more than K relay_early cells are sent.

Once all versions that do not send RELAY_EARLY cells are obsolete, servers can begin to reject any EXTEND requests not sent in a RELAY_EARLY cell.

Parameters:

Let K = 8, for no terribly good reason.

Spec:

[We can formalize this part once we think the design is a good one.]

Acknowledgements:

This design has been kicking around since Christian Grothoff and I came up with it at PET 2004. (Nathan Evans, Christian Grothoff's student, is working on implementing a fix based on this design in the summer 2007 timeframe.)
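The per-circuit check from the Design section (count relay_early cells, and refuse extend requests that arrive in ordinary relay cells) can be sketched as below. The class and method names are hypothetical, and the `is_extend` flag is assumed to be known only at the hop that is the cell's destination, since intermediate hops cannot see the relay command.

```python
K = 8  # value from the Parameters section


class ProtocolViolation(Exception):
    """Raised when the circuit should be torn down."""


class CircuitHop:
    """State one server keeps for one circuit (illustrative only)."""

    def __init__(self, k=K):
        self.k = k
        self.early_seen = 0

    def on_cell(self, is_relay_early, is_extend=False):
        if is_relay_early:
            self.early_seen += 1
            if self.early_seen > self.k:
                raise ProtocolViolation("more than K relay_early cells")
        elif is_extend:
            # Extend requests are only allowed inside relay_early cells.
            raise ProtocolViolation("extend request in a relay cell")
```

Because each hop counts independently, a circuit can be extended at most K times before some hop sees too many relay_early cells, which is what bounds the amplification factor.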
Filename: 111-local-traffic-priority.txt Title: Prioritizing local traffic over relayed traffic Author: Roger Dingledine Created: 14-Mar-2007 Status: Closed Implemented-In: 0.2.0.x Overview: We describe some ways to let Tor users operate as a relay and enforce rate limiting for relayed traffic without impacting their locally initiated traffic. Motivation: Right now we encourage people who use Tor as a client to configure it as a relay too ("just click the button in Vidalia"). Most of these users are on asymmetric links, meaning they have a lot more download capacity than upload capacity. But if they enable rate limiting too, suddenly they're limited to the same download capacity as upload capacity. And they have to enable rate limiting, or their upstream pipe gets filled up, starts dropping packets, and now their net connection doesn't work even for non-Tor stuff. So they end up turning off the relaying part so they can use Tor (and other applications) again. So far this hasn't mattered that much: most of our fast relays are being operated only in relay mode, so the rate limiting makes sense for them. But if we want to be able to attract many more relays in the future, we need to let ordinary users act as relays too. Further, as we begin to deploy the blocking-resistance design and we rely on ordinary users to click the "Tor for Freedom" button, this limitation will become a serious stumbling block to getting volunteers to act as bridges. The problem: Tor implements its rate limiting on the 'read' side by only reading a certain number of bytes from the network in each second. If it has emptied its token bucket, it doesn't read any more from the network; eventually TCP notices and stalls until we resume reading. But if we want to have two classes of service, we can't know what class a given incoming cell will be until we look at it, at which point we've already read it. 
Some options: Option 1: read when our token bucket is full enough, and if it turns out that what we read was local traffic, then add the tokens back into the token bucket. This will work when local traffic load alternates with relayed traffic load; but it's a poor option in general, because when we're receiving both local and relayed traffic, there are plenty of cases where we'll end up with an empty token bucket, and then we're back where we were before. More generally, notice that our problem is easy when a given TCP connection either has entirely local circuits or entirely relayed circuits. In fact, even if they are both present, if one class is entirely idle (none of its circuits have sent or received in the past N seconds), we can ignore that class until it wakes up again. So it only gets complex when a single connection contains active circuits of both classes. Next, notice that local traffic uses only the entry guards, whereas relayed traffic likely doesn't. So if we're a bridge handling just a few users, the expected number of overlapping connections would be almost zero, and even if we're a full relay the number of overlapping connections will be quite small. Option 2: build separate TCP connections for local traffic and for relayed traffic. In practice this will actually only require a few extra TCP connections: we would only need redundant TCP connections to at most the number of entry guards in use. However, this approach has some drawbacks. First, if the remote side wants to extend a circuit to you, how does it know which TCP connection to send it on? We would need some extra scheme to label some connections "client-only" during construction. Perhaps we could do this by seeing whether any circuit was made via CREATE_FAST; but this still opens up a race condition where the other side sends a create request immediately. 
The only ways I can imagine to avoid the race entirely are to specify our preference in the VERSIONS cell, or to add some sort of "nope, not this connection, why don't you try another rather than failing" response to create cells, or to forbid create cells on connections that you didn't initiate and on which you haven't seen any circuit creation requests yet -- this last one would lead to a bit more connection bloat but doesn't seem so bad. And we already accept this race for the case where directory authorities establish new TCP connections periodically to check reachability, and then hope to hang up on them soon after. (In any case this issue is moot for bridges, since each destination will be one-way with respect to extend requests: either receiving extend requests from bridge users or sending extend requests to the Tor server, never both.) The second problem with option 2 is that using two TCP connections reveals that there are two classes of traffic (and probably quickly reveals which is which, based on throughput). Now, it's unclear whether this information is already available to the other relay -- he would easily be able to tell that some circuits are fast and some are rate limited, after all -- but it would be nice to not add even more ways to leak that information. Also, it's less clear that an external observer already has this information if the circuits are all bundled together, and for this case it's worth trying to protect it. Option 3: tell the other side about our rate limiting rules. When we establish the TCP connection, specify the different policy classes we have configured. Each time we extend a circuit, specify which policy class that circuit should be part of. Then hope the other side obeys our wishes. (If he doesn't, hang up on him.) Besides the design and coordination hassles involved in this approach, there's a big problem: our rate limiting classes apply to all our connections, not just pairwise connections. 
How does one server we're connected to know how much of our bucket has already been spent by another? I could imagine a complex and inefficient "ok, now you can send me those two more cells that you've got queued" protocol. I'm not sure how else we could do it.

(Gosh. How could UDP designs possibly be compatible with rate limiting with multiple bucket sizes?)

Option 4: put both classes of circuits over a single connection, and keep track of the last time we read or wrote a high-priority cell. If it's been less than N seconds, give the whole connection high priority, else give the whole connection low priority.

Option 5: put both classes of circuits over a single connection, and play a complex juggling game by periodically telling the remote side what rate limits to set for that connection, so you end up giving priority to the right connections but still stick to roughly your intended bandwidthrate and relaybandwidthrate.

Option 6: ?

Prognosis:

Nick really didn't like option 2 because of the partitioning questions.

I've put option 4 into place as of Tor 0.2.0.3-alpha.

In terms of implementation, it will be easy: just add a time_t to or_connection_t that specifies client_used (used by the initiator of the connection to rate limit it differently depending on how recently the time_t was reset). We currently update client_used in three places:

- command_process_relay_cell() when we receive a relay cell for an origin circuit.
- relay_send_command_from_edge() when we send a relay cell for an origin circuit.
- circuit_deliver_create_cell() when we send a create cell.

We could probably remove the third case and it would still work, but hey.
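Option 4 can be illustrated with a small sketch that is not Tor's C code. The `client_used` field mirrors the name used above; the class and method names are hypothetical, and the 10-second window is purely an assumed value, since the proposal leaves N unspecified.

```python
N_SECONDS = 10  # assumed window; the proposal does not fix N


class OrConnection:
    """Tracks when a connection last carried locally originated traffic."""

    def __init__(self):
        self.client_used = 0.0  # time of the last origin-circuit cell

    def note_local_cell(self, now):
        """Call when a cell for an origin circuit is sent or received."""
        self.client_used = now

    def is_high_priority(self, now, window=N_SECONDS):
        """True if a local cell was seen less than `window` seconds ago."""
        return (now - self.client_used) < window
```

A caller would use `is_high_priority` to decide whether a read or write on the connection is charged against the relayed-traffic limit or only against the overall one.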
Filename: 112-bring-back-pathlencoinweight.txt Title: Bring Back Pathlen Coin Weight Author: Mike Perry Created: Status: Superseded Superseded-By: 115 Overview: The idea is that users should be able to choose a weight which probabilistically chooses their path lengths to be 2 or 3 hops. This weight will essentially be a biased coin that indicates an additional hop (beyond 2) with probability P. The user should be allowed to choose 0 for this weight to always get 2 hops and 1 to always get 3. This value should be modifiable from the controller, and should be available from Vidalia. Motivation: The Tor network is slow and overloaded. Increasingly often I hear stories about friends and friends of friends who are behind firewalls, annoying censorware, or under surveillance that interferes with their productivity and Internet usage, or chills their speech. These people know about Tor, but they choose to put up with the censorship because Tor is too slow to be usable for them. In fact, to download a fresh, complete copy of levine-timing.pdf for the Anonymity Implications section of this proposal over Tor took me 3 tries. There are many ways to improve the speed problem, and of course we should and will implement as many as we can. Johannes's GSoC project and my reputation system are longer term, higher-effort things that will still provide benefit independent of this proposal. However, reducing the path length to 2 for those who do not need the (questionable) extra anonymity 3 hops provide not only improves their Tor experience but also reduces their load on the Tor network by 33%, and can be done in less than 10 lines of code. That's not just Win-Win, it's Win-Win-Win. 
Furthermore, when blocking resistance measures insert an extra relay hop into the equation, 4 hops will certainly be completely unusable for these users, especially since it will be considerably more difficult to balance the load across a dark relay net than balancing the load on Tor itself (which today is still not without its flaws). Anonymity Implications: It has long been established that timing attacks against mixed networks are extremely effective, and that regardless of path length, if the adversary has compromised your first and last hop of your path, you can assume they have compromised your identity for that connection. In [1], it is demonstrated that for all but the slowest, lossiest networks, error rates for false positives and false negatives were very near zero. Only for constant streams of traffic over slow and (more importantly) extremely lossy network links did the error rate hit 20%. For loss rates typical to the Internet, even the error rate for slow nodes with constant traffic streams was 13%. When you take into account that most Tor streams are not constant, but probably much more like their "HomeIP" dataset, which consists mostly of web traffic that exists over finite intervals at specific times, error rates drop to fractions of 1%, even for the "worst" network nodes. Therefore, the user has little benefit from the extra hop, assuming the adversary does timing correlation on their nodes. The real protection is the probability of getting both the first and last hop, and this is constant whether the client chooses 2 hops, 3 hops, or 42. Partitioning attacks form another concern. Since Tor uses telescoping to build circuits, it is possible to tell a user is constructing only two hop paths at the entry node. It is questionable if this data is actually worth anything though, especially if the majority of users have easy access to this option, and do actually choose their path lengths semi-randomly. 
Nick has postulated that exits may also be able to tell that you are using only 2 hops by the amount of time between sending their RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they see from the OP. I doubt that they will be able to make much use of this timing pattern, since it will likely vary widely depending upon the type of node selected for that first hop, and the user's connection rate to that first hop. It is also questionable if this data is worth anything, especially if many users are using this option (and I imagine many will). Perhaps most seriously, two hop paths do allow malicious guards to easily fail circuits if they do not extend to their colluding peers for the exit hop. Since guards can detect the number of hops in a path, they could always fail the 3 hop circuits and focus on selectively failing the two hop ones until a peer was chosen. I believe currently guards are rotated if circuits fail, which does provide some protection, but this could be changed so that an entry guard is completely abandoned after a certain ratio of extend or general circuit failures with respect to non-failed circuits. This could possibly be gamed to increase guard turnover, but such a game would be much more noticeable than an individual guard failing circuits, though, since it would affect all clients, not just those who chose a particular guard. Why not fix Pathlen=2?: The main reason I am not advocating that we always use 2 hops is that in some situations, timing correlation evidence by itself may not be considered as solid and convincing as an actual, uninterrupted, fully traced path. Are these timing attacks as effective on a real network as they are in simulation? Would an extralegal adversary or authoritarian government even care? In the face of these situation-dependent unknowns, it should be up to the user to decide if this is a concern for them or not. 
It should probably also be noted that even a false positive rate of 1% for a 200k concurrent-user network could mean that for a given node, a given stream could be confused with something like 10 users, assuming ~200 nodes carry most of the traffic (i.e., 1000 users each). Though of course to really know for sure, someone needs to do an attack on a real network, unfortunately. Implementation: new_route_len() can be modified directly with a check of the PathlenCoinWeight option (converted to percent) and a call to crypto_rand_int(0,100) for the weighted coin. The entry_guard_t structure could have num_circ_failed and num_circ_succeeded members such that if it exceeds N% circuit extend failure rate to a second hop, it is removed from the entry list. N should be sufficiently high to avoid churn from normal Tor circuit failure as determined by TorFlow scans. The Vidalia option should be presented as a boolean, to minimize confusion for the user. Something like a radiobutton with: * "I use Tor for Censorship Resistance, not Anonymity. Speed is more important to me than Anonymity." * "I use Tor for Anonymity. I need extra protection at the cost of speed." and then some explanation in the help for exactly what this means, and the risks involved with eliminating the adversary's need for timing attacks with respect to false positives, etc. Migration: Phase one: Experiment with the proper ratio of circuit failures used to expire garbage or malicious guards via TorFlow. Phase two: Re-enable config and modify new_route_len() to add an extra hop if coin comes up "heads". Phase three: Make radiobutton in Vidalia, along with help entry that explains in layman's terms the risks involved. [1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
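The Implementation section of proposal 112 can be sketched in a few lines. The Python below is only an illustration of the weighted coin and of the guard failure-ratio check; it is not the Tor C code. The function and field names mirror those mentioned in the proposal, but the `rng` parameter, the `should_drop()` method, and the `min_attempts` threshold are assumptions added here so the sketch is testable and does not drop a guard after a handful of circuits.

```python
import random
from dataclasses import dataclass


def new_route_len(pathlen_coin_weight: float, rng=random) -> int:
    """Return 2 or 3 hops; the third hop is added with probability
    pathlen_coin_weight (0 always gives 2 hops, 1 always gives 3)."""
    if not 0.0 <= pathlen_coin_weight <= 1.0:
        raise ValueError("PathlenCoinWeight must be between 0 and 1")
    percent = int(round(pathlen_coin_weight * 100))
    # randrange(100) draws uniformly from 0..99, so "heads" comes up
    # in exactly `percent` out of 100 equally likely outcomes.
    heads = rng.randrange(100) < percent
    return 3 if heads else 2


@dataclass
class EntryGuard:
    """Toy stand-in for entry_guard_t with the two proposed counters."""
    num_circ_failed: int = 0
    num_circ_succeeded: int = 0

    def should_drop(self, max_failure_percent: float,
                    min_attempts: int = 20) -> bool:
        """True if the extend failure rate exceeds N% (max_failure_percent).
        min_attempts is a hypothetical guard against judging too early."""
        total = self.num_circ_failed + self.num_circ_succeeded
        if total < min_attempts:
            return False
        return 100 * self.num_circ_failed > max_failure_percent * total
```

With a weight of P, the expected path length under this sketch is 2 + P hops.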
Filename: 113-fast-authority-interface.txt Title: Simplifying directory authority administration Author: Nick Mathewson Created: Status: Superseded Overview The problem: Administering a directory authority is a pain: you need to go through emails and manually add new nodes as "named". When bad things come up, you need to mark nodes (or whole regions) as invalid, badexit, etc. This means that mostly, authority admins don't: only 2/4 current authority admins actually bind names or list bad exits, and those two have often complained about how annoying it is to do so. Worse, name binding is a common path, but it's a pain in the neck: nobody has done it for a couple of months. Digression: who knows what? It's trivial for Tor to automatically keep track of all of the following information about a server: name, fingerprint, IP, last-seen time, first-seen time, declared contact. All we need to have the administrator set is: - Is this name/fingerprint pair bound? - Is this fingerprint/IP a bad exit? - Is this fingerprint/IP an invalid node? - Is this fingerprint/IP to be rejected? The workflow for authority admins has two parts: - Periodically, go through tor-ops and add new names. This doesn't need to be done urgently. - Less often, mark badly behaved servers as badly behaved. This is more urgent. Possible solution #1: Web-interface for name binding. Deprecate use of the tor-ops mailing list; instead, have operators go to a webform and enter their server info. This would put the information in a standardized format, thus allowing quick, nearly-automated approval and reply. Possible solution #2: Self-binding names. Peter Palfrader has proposed that names be assigned automatically to nodes that have been up and running and valid for a while. Possible solution #3: Self-maintaining approved-routers file Mixminion alpha has a neat feature where whenever a new server is seen, a stub line gets added to a configuration file.
For Tor, it could look something like this:

    ## First seen with this key on 2007-04-21 13:13:14
    ## Stayed up for at least 12 hours on IP 192.168.10.10
    #RouterName AAAABBBBCCCCDDDDEFEF

(Note that the implementation needs to parse commented lines to make sure that it doesn't add duplicates, but that's not so hard.) To add a router as named, administrators would only need to uncomment the entry. This automatically maintained file could be kept separately from a manually maintained one. This could be combined with solution #2, such that Tor would do the hard work of uncommenting entries for routers that should get Named, but operators could override its decisions.

Possible solution #4: A separate mailing list for authority operators. Right now, the tor-ops list is very high volume. There should be another list that's only for dealing with problems that need prompt action, like marking a router as !badexit.

Resolution: Solution #2 is described in "Proposal 123: Naming authorities automatically create bindings", and that approach is implemented. There are remaining issues in the problem statement above that need their own solutions.
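To make solution #3 of proposal 113 more concrete, here is a small Python sketch of the duplicate check it mentions: stub entries are appended commented out, and both commented and uncommented entries are parsed so that a router is never added twice. The function names are hypothetical, and the sketch assumes one "name fingerprint" pair per entry with no spaces inside the fingerprint, which may not match the real approved-routers syntax.

```python
def known_fingerprints(file_text: str) -> set:
    """Collect fingerprints from active entries and from commented-out stubs."""
    seen = set()
    for line in file_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("##"):
            continue  # blank line or "##" annotation, not an entry
        # Drop a single leading "#" so commented stubs are parsed as well.
        fields = stripped.lstrip("#").split()
        if len(fields) == 2:
            seen.add(fields[1].upper())
    return seen


def add_stub(file_text: str, name: str, fingerprint: str,
             first_seen: str, ip: str, hours_up: int = 12) -> str:
    """Append a commented-out stub for a newly seen router, unless an
    entry (commented or not) with the same fingerprint already exists."""
    if fingerprint.upper() in known_fingerprints(file_text):
        return file_text
    stub = (
        f"## First seen with this key on {first_seen}\n"
        f"## Stayed up for at least {hours_up} hours on IP {ip}\n"
        f"#{name} {fingerprint}\n"
    )
    if file_text and not file_text.endswith("\n"):
        file_text += "\n"
    return file_text + stub
```

An administrator would then bind a name by deleting the single leading "#" from the entry line; a later run of `add_stub()` for the same fingerprint leaves the file unchanged.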
Filename: 114-distributed-storage.txt Title: Distributed Storage for Tor Hidden Service Descriptors Author: Karsten Loesing Created: 13-May-2007 Status: Closed Implemented-In: 0.2.0.x Change history: 13-May-2007 Initial proposal 14-May-2007 Added changes suggested by Lasse Øverlier 30-May-2007 Changed descriptor format, key length discussion, typos 09-Jul-2007 Incorporated suggestions by Roger, added status of specification and implementation for upcoming GSoC mid-term evaluation 11-Aug-2007 Updated implementation statuses, included non-consecutive replication to descriptor format 20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2 02-Dec-2007 Closed proposal Overview: The basic idea of this proposal is to distribute the tasks of storing and serving hidden service descriptors from currently three authoritative directory nodes among a large subset of all onion routers. The three reasons to do this are better robustness (availability), better scalability, and improved security properties. Further, this proposal suggests changes to the hidden service descriptor format to prevent new security threats coming from decentralization and to gain even better security properties. Status: As of December 2007, the new hidden service descriptor format is implemented and usable. However, servers and clients do not yet make use of descriptor cookies, because there are open usability issues of this feature that might be resolved in proposal 121. Further, hidden service directories do not perform replication by themselves, because (unauthorized) replica fetch requests would allow any attacker to fetch all hidden service descriptors in the system. As neither issue is critical to the functioning of v2 descriptors and their distribution, this proposal is considered as Closed. 
Motivation: The current design of hidden services exhibits the following performance and security problems: First, the three hidden service authoritative directories constitute a performance bottleneck in the system. The directory nodes are responsible for storing and serving all hidden service descriptors. As of May 2007 there are about 1000 descriptors at a time, but this number is assumed to increase in the future. Further, there is no replication protocol for descriptors between the three directory nodes, so that hidden services must ensure the availability of their descriptors by manually publishing them on all directory nodes. Whenever a fourth or fifth hidden service authoritative directory is added, hidden services will need to maintain an equally increasing number of replicas. These scalability issues have an impact on the current usage of hidden services and put an even higher burden on the development of new kinds of applications for hidden services that might require storing even more descriptors. Second, besides posing a limitation to scalability, storing all hidden service descriptors on three directory nodes also constitutes a security risk. The directory node operators could easily analyze the publish and fetch requests to derive information on service activity and usage and read the descriptor contents to determine which onion routers work as introduction points for a given hidden service and need to be attacked or threatened to shut it down. Furthermore, the contents of a hidden service descriptor offer only minimal security properties to the hidden service. Whoever gets aware of the service ID can easily find out whether the service is active at the moment and which introduction points it has. This applies to (former) clients, (former) introduction points, and of course to the directory nodes. It requires only to request the descriptor for the given service ID, which can be performed by anyone anonymously. 
This proposal suggests two major changes to approach the described performance and security problems: The first change affects the storage location for hidden service descriptors. Descriptors are distributed among a large subset of all onion routers instead of three fixed directory nodes. Each storing node is responsible for a subset of descriptors for a limited time only. It is not able to choose which descriptors it stores at a certain time, because this is determined by its onion ID which is hard to change frequently and in time (only routers which are stable for a given time are accepted as storing nodes). In order to resist single node failures and untrustworthy nodes, descriptors are replicated among a certain number of storing nodes. A first replication protocol makes sure that descriptors don't get lost when the node population changes; therefore, a storing node periodically requests the descriptors from its siblings. A second replication protocol distributes descriptors among non-consecutive nodes of the ID ring to prevent a group of adversaries from generating new onion keys until they have consecutive IDs to create a 'black hole' in the ring and make random services unavailable. Connections to storing nodes are established by extending existing circuits by one hop to the storing node. This also ensures that contents are encrypted. The effect of this first change is that the probability that a single node operator learns about a certain hidden service is very small and that it is very hard to track a service over time, even when it collaborates with other node operators. The second change concerns the content of hidden service descriptors. Obviously, security problems cannot be solved only by decentralizing storage; in fact, they could also get worse if done without caution. At first, a descriptor ID needs to change periodically in order to be stored on changing nodes over time. 
Next, the descriptor ID needs to be computable only for the service's clients, but should be unpredictable for all other nodes. Further, the storing node needs to be able to verify that the hidden service is the true originator of the descriptor with the given ID even though it is not a client. Finally, a storing node should learn as little information as necessary by storing a descriptor, because it might not be as trustworthy as a directory node; for example it does not need to know the list of introduction points. Therefore, a second key is applied that is only known to the hidden service provider and its clients and that is not included in the descriptor. It is used to calculate descriptor IDs and to encrypt the introduction points. This second key can either be given to all clients together with the hidden service ID, or to a group or a single client as an authentication token. In the future this second key could be the result of some key agreement protocol between the hidden service and one or more clients. A new text-based format is proposed for descriptors instead of an extension of the existing binary format for reasons of future extensibility. Design: The proposed design is described by the required changes to the current design. These requirements are grouped by content, rather than by affected specification documents or code files, and numbered for reference below. Hidden service clients, servers, and directories: /1/ Create routing list All participants can filter the consensus status document received from the directory authorities to one routing list containing only those servers that store and serve hidden service descriptors and which are running for at least 24 hours. A participant only trusts its own routing list and never learns about routing information from other parties. 
/2/ Determine responsible hidden service directory All participants can determine the hidden service directory that is responsible for storing and serving a given ID, as well as the hidden service directories that replicate its content. Every hidden service directory is responsible for the descriptor IDs in the interval from its predecessor, exclusive, to its own ID, inclusive. Further, a hidden service directory holds replicas for its n predecessors, where n denotes the number of consecutive replicas. (requires /1/) [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory requests which have not been fulfilled in the course of the implementation of this proposal, but elsewhere.] Hidden service directory nodes: /5/ Advertise hidden service directory functionality Every onion router that has its directory port open can decide whether it wants to store and serve hidden service descriptors by setting a new config option "HidServDirectoryV2" 0|1 to 1. An onion router with this config option being set includes the flag "hidden-service-dir" in its router descriptors that it sends to directory authorities. /6/ Accept v2 publish requests, parse and store v2 descriptors Hidden service directory nodes accept publish requests for hidden service descriptors and store them to their local memory. (It is not necessary to make descriptors persistent, because after disconnecting, the onion router would not be accepted as storing node anyway, because it has not been running for at least 24 hours.) All requests and replies are formatted as HTTP messages. Requests are directed to the router's directory port and are contained within BEGIN_DIR cells. A hidden service directory node stores a descriptor only when it thinks that it is responsible for storing that descriptor based on its own routing table. 
Every hidden service directory node is responsible for the descriptor IDs in the interval of its n-th predecessor in the ID circle up to its own ID (n denotes the number of consecutive replicas). (requires /1/) /7/ Accept v2 fetch requests Same as /6/, but with fetch requests for hidden service descriptors. (requires /2/) /8/ Replicate descriptors with neighbors A hidden service directory node replicates descriptors from its two predecessors by downloading them once an hour. Further, it checks its routing table periodically for changes. Whenever it realizes that a predecessor has left the network, it establishes a connection to the new n-th predecessor and requests its stored descriptors in the interval of its (n+1)-th predecessor and the requested n-th predecessor. Whenever it realizes that a new onion router has joined with an ID higher than its former n-th predecessor, it adds it to its predecessors and discards all descriptors in the interval of its (n+1)-th and its n-th predecessor. (requires /1/) [Dec 02: This function has not been implemented, because arbitrary nodes would have been able to download the entire set of v2 descriptors. An authorized replication request would be necessary. For the moment, the system runs without any directory-side replication. -KL] Authoritative directory nodes: /9/ Confirm a router's hidden service directory functionality Directory nodes include a new flag "HSDir" for routers that decided to provide storage for hidden service descriptors and that are running for at least 24 hours. The last requirement prevents a node from frequently changing its onion key to become responsible for an identifier it wants to target. Hidden service provider: /10/ Configure v2 hidden service Each hidden service provider that has set the config option "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2 descriptors and conform to the v2 connection establishment protocol.
When configuring a hidden service, a hidden service provider checks if it has already created a random secret_cookie and a hostname2 file; if not, it creates both of them. (requires /2/) /11/ Establish introduction points with fresh key If configured to publish only v2 descriptors and no v0/v1 descriptors any more, a hidden service provider that is setting up the hidden service at introduction points does not pass its own public key, but the public key of a freshly generated key pair. It also includes these fresh public keys in the hidden service descriptor together with the other introduction point information. The reason is that the introduction point does not need to and therefore should not know for which hidden service it works, so as to prevent it from tracking the hidden service's activity. (If a hidden service provider supports both, v0/v1 and v2 descriptors, v0/v1 clients rely on the fact that all introduction points accept the same public key, so that this new feature cannot be used.) /12/ Encode v2 descriptors and send v2 publish requests If configured to publish v2 descriptors, a hidden service provider publishes a new descriptor whenever its content changes or a new publication period starts for this descriptor. If the current publication period would only last for less than 60 minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind and the client 30 minutes ahead), the hidden service provider publishes both a current descriptor and one for the next period. Publication is performed by sending the descriptor to all hidden service directories that are responsible for keeping replicas for the descriptor ID. This includes two non-consecutive replicas that are stored at 3 consecutive nodes each. 
(requires /1/ and /2/) Hidden service client: /13/ Send v2 fetch requests A hidden service client that has set the config option "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion addresses by requesting a v2 descriptor from a randomly chosen hidden service directory that is responsible for keeping a replica for the descriptor ID. In total there are six replicas of which the first and the last three are stored on consecutive nodes. The probability of picking one of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the fact that the availability will be the highest on the node with the next higher ID. A hidden service client relies on the hidden service provider to store two sets of descriptors to compensate clock skew between service and client. (requires /1/ and /2/) /14/ Process v2 fetch reply and parse v2 descriptors A hidden service client that has sent a request for a v2 descriptor can parse it and store it to the local cache of rendezvous service descriptors. /15/ Establish connection to v2 hidden service A hidden service client can establish a connection to a hidden service using a v2 descriptor. This includes using the secret cookie for decrypting the introduction points contained in the descriptor. When contacting an introduction point, the client does not use the public key of the hidden service provider, but the freshly-generated public key that is included in the hidden service descriptor. Whether or not a fresh key is used instead of the key of the hidden service depends on the available protocol versions that are included in the descriptor; by this, connection establishment is to a certain extent decoupled from fetching the descriptor. Hidden service descriptor: (Requirements concerning the descriptor format are contained in /6/ and /7/.)
The new v2 hidden service descriptor format looks like this:

    onion-address = h(public-key) + cookie
    descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
    descriptor-content = {
      descriptor-id,
      version,
      public-key,
      h(time-period + cookie + replica),
      timestamp,
      protocol-versions,
      { introduction-points } encrypted with cookie
    } signed with private-key

The "descriptor-id" needs to change periodically in order for the descriptor to be stored on changing nodes over time. It may only be computable by a hidden service provider and all of his clients to prevent unauthorized nodes from tracking the service activity by periodically checking whether there is a descriptor for this service. Finally, the hidden service directory needs to be able to verify that the hidden service provider is the true originator of the descriptor with the given ID. Therefore, "descriptor-id" is derived from the "public-key" of the hidden service provider, the current "time-period" which changes every 24 hours, a secret "cookie" shared between hidden service provider and clients, and a "replica" denoting the number of this non-consecutive replica. (The "time-period" is constructed in a way that time periods do not change at the same moment for all descriptors by deriving a value between 0:00 and 23:59 hours from h(public-key) and making the descriptors of this hidden service provider expire at that time of the day.) The "descriptor-id" is defined to be 160 bits long. [extending the "descriptor-id" length suggested by LØ] Only the hidden service provider and the clients are able to generate future "descriptor-id"s. Hence, the "onion-address", which until now was only the hash value of "public-key", is extended by the secret "cookie". The hash of the "public-key" is determined to be 80 bits long, whereas the "cookie" is dimensioned to be 120 bits long.
This makes a total of 200 bits or 40 base32 chars, which is quite a lot to handle for a human, but necessary to provide sufficient protection against an adversary from generating a key pair with same "public-key" hash or guessing the "cookie". A hidden service directory can verify that a descriptor was created by the hidden service provider by checking if the "descriptor-id" corresponds to the "public-key" and if the signature can be verified with the "public-key". The "introduction-points" that are included in the descriptor are encrypted using the same "cookie" that is shared between hidden service provider and clients. [correction to use another key than h(time-period + cookie) as encryption key for introduction points made by LØ] A new text-based format is proposed for descriptors instead of an extension of the existing binary format for reasons of future extensibility. Security implications: The security implications of the proposed changes are grouped by the roles of nodes that could perform attacks or on which attacks could be performed. Attacks by authoritative directory nodes Authoritative directory nodes are no longer the single places in the network that know about a hidden service's activity and introduction points. Thus, they cannot perform attacks using this information, e.g. track a hidden service's activity or usage pattern or attack its introduction points. Formerly, it would only require a single corrupted authoritative directory operator to perform such an attack. Attacks by hidden service directory nodes A hidden service directory node could misuse a stored descriptor to track a hidden service's activity and usage pattern by clients. Though there is no countermeasure against this kind of attack, it is very expensive to track a certain hidden service over time. 
An attacker would need to run a large number of stable onion routers that work as hidden service directory nodes to have a good probability to become responsible for its changing descriptor IDs. For each period, the probability is: 1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N as total number of hidden service directories, c as compromised nodes, and r as number of replicas The hidden service directory nodes could try to make a certain hidden service unavailable to its clients. Therefore, they could discard all stored descriptors for that hidden service and reply to clients that there is no descriptor for the given ID or return an old or false descriptor content. The client would detect a false descriptor, because it could not contain a correct signature. But an old content or an empty reply could confuse the client. Therefore, the countermeasure is to replicate descriptors among a small number of hidden service directories, e.g. 5. The probability of a group of collaborating nodes to make a hidden service completely unavailable is in each period: (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, with N as total number of hidden service directories, c as compromised nodes, and r as number of replicas A hidden service directory could try to find out which introduction points are working on behalf of a hidden service. In contrast to the previous design, this is not possible anymore, because this information is encrypted to the clients of a hidden service. Attacks on hidden service directory nodes An anonymous attacker could try to swamp a hidden service directory with false descriptors for a given descriptor ID. This is prevented by requiring that descriptors are signed. Anonymous attackers could swamp a hidden service directory with correct descriptors for non-existing hidden services. There is no countermeasure against this attack. 
However, the creation of valid descriptors is more expensive than verification and storage in local memory. This should make this kind of attack unattractive. Attacks by introduction points Current or former introduction points could try to gain information on the hidden service they serve. But due to the fresh key pair that is used by the hidden service, this attack is not possible anymore. Attacks by clients Current or former clients could track a hidden service's activity, attack its introduction points, or determine the responsible hidden service directory nodes and attack them. There is nothing that could prevent them from doing so, because honest clients need the full descriptor content to establish a connection to the hidden service. At the moment, the only countermeasure against dishonest clients is to change the secret cookie and pass it only to the honest clients. Compatibility: The proposed design is meant to replace the current design for hidden service descriptors and their storage in the long run. There should be a first transition phase in which both, the current design and the proposed design are served in parallel. Onion routers should start serving as hidden service directories, and hidden service providers and clients should make use of the new design if both sides support it. Hidden service providers should be allowed to publish descriptors of the current format in parallel, and authoritative directories should continue storing and serving these descriptors. After the first transition phase, hidden service providers should stop publishing descriptors on authoritative directories, and hidden service clients should not try to fetch descriptors from the authoritative directories. However, the authoritative directories should continue serving hidden service descriptors for a second transition phase. As of this point, all v2 config options should be set to a default value of 1. 
After the second transition phase, the authoritative directories should stop serving hidden service descriptors.
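The descriptor-id construction and the directory lookup from requirement /2/ of proposal 114 can be illustrated with a short Python sketch. Several details are assumptions made only for this example and are not fixed by the text above: SHA-1 is used for h() because the descriptor-id is stated to be 160 bits long, "+" is read as byte concatenation, the time period is encoded as a 4-byte big-endian integer, and the replica number as a single byte. The function names are hypothetical.

```python
import hashlib
from bisect import bisect_left


def h(data: bytes) -> bytes:
    """Stand-in for h(); SHA-1 is assumed because it yields 160 bits."""
    return hashlib.sha1(data).digest()


def descriptor_id(public_key: bytes, time_period: int,
                  cookie: bytes, replica: int) -> bytes:
    """descriptor-id = h(h(public-key) + h(time-period + cookie + replica)).
    The byte encodings of time-period and replica are hypothetical."""
    secret_part = h(time_period.to_bytes(4, "big") + cookie + bytes([replica]))
    return h(h(public_key) + secret_part)


def responsible_directories(desc_id: bytes, directory_ids, consecutive: int = 3):
    """Return the directories that should store desc_id: the first directory
    whose ID is >= desc_id (predecessor exclusive, own ID inclusive), followed
    by its successors on the ring, `consecutive` nodes in total."""
    ring = sorted(directory_ids)
    if not ring:
        raise ValueError("routing list is empty")
    # bisect_left finds the first ID >= desc_id; the modulo wraps past the end.
    start = bisect_left(ring, desc_id) % len(ring)
    count = min(consecutive, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

Under these assumptions, a provider following /12/ would call `descriptor_id()` once per non-consecutive replica (replica 0 and replica 1) and publish each result to the three nodes returned by `responsible_directories()`, giving six stored copies in total.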
Filename: 115-two-hop-paths.txt Title: Two Hop Paths Author: Mike Perry Created: Status: Dead Supersedes: 112 Overview: The idea is that users should be able to choose if they would like to have either two or three hop paths through the tor network. Let us be clear: the users who would choose this option should be those that are concerned with IP obfuscation only: ie they would not be targets of a resource-intensive multi-node attack. It is sometimes said that these users should find some other network to use other than Tor. This is a foolish suggestion: more users improves security of everyone, and the current small userbase size is a critical hindrance to anonymity, as is discussed below and in [1]. This value should be modifiable from the controller, and should be available from Vidalia. Motivation: The Tor network is slow and overloaded. Increasingly often I hear stories about friends and friends of friends who are behind firewalls, annoying censorware, or under surveillance that interferes with their productivity and Internet usage, or chills their speech. These people know about Tor, but they choose to put up with the censorship because Tor is too slow to be usable for them. In fact, to download a fresh, complete copy of levine-timing.pdf for the Theoretical Argument section of this proposal over Tor took me 3 tries. Furthermore, the biggest current problem with Tor's anonymity for those who really need it is not someone attacking the network to discover who they are. It's instead the extreme danger that so few people use Tor because it's so slow, that those who do use it have essentially no confusion set. The recent case where the professor and the rogue Tor user were the only Tor users on campus, and thus suspected in an incident involving Tor and that University underscores this point: "That was why the police had come to see me. They told me that only two people on our campus were using Tor: me and someone they suspected of engaging in an online scam. 
The detectives wanted to know whether the other user was a former student of mine, and why I was using Tor"[1]. Not only does Tor provide no anonymity if you use it to be anonymous but are obviously from a certain institution, location or circumstance, it is also dangerous to use Tor for risk of being accused of having something significant enough to hide to be willing to put up with the horrible performance as opposed to using some weaker alternative. There are many ways to improve the speed problem, and of course we should and will implement as many as we can. Johannes's GSoC project and my reputation system are longer term, higher-effort things that will still provide benefit independent of this proposal. However, reducing the path length to 2 for those who do not need the extra anonymity 3 hops provide not only improves their Tor experience but also reduces their load on the Tor network by 33%, and should increase adoption of Tor by a good deal. That's not just Win-Win, it's Win-Win-Win. Who will enable this option? This is the crux of the proposal. Admittedly, there is some anonymity loss and some degree of decreased investment required on the part of the adversary to attack 2 hop users versus 3 hop users, even if it is minimal and limited mostly to up-front costs and false positives. The key questions are: 1. Are these users in a class such that their risk is significantly less than the amount of this anonymity loss? 2. Are these users able to identify themselves? Many many users of Tor are not at risk for an adversary capturing c/n nodes of the network just to see what they do. These users use Tor to circumvent aggressive content filters, or simply to keep their IP out of marketing and search engine databases. Most content filters have no interest in running Tor nodes to catch violators, and marketers certainly would never consider such a thing, both on a cost basis and a legal one. 
In a sense, this represents an alternate threat model against these users who are not at risk for Tor's normal threat model.

It should be evident to these users that they fall into this class. All that should be needed is a radio button

 * "I use Tor for local content filter circumvention and/or IP obfuscation, not anonymity. Speed is more important to me than high anonymity. No one will make considerable efforts to determine my real IP."
 * "I use Tor for anonymity and/or national-level, legally enforced censorship. It is possible effort will be taken to identify me, including but not limited to network surveillance. I need more protection."

and then some explanation in the help for exactly what this means, and the risks involved with eliminating the adversary's need for timing attacks with respect to false positives. Ultimately, the decision is a simple one that can be made without this information, however. The user does not need Paul Syverson to instruct them on the deep magic of Onion Routing to make this decision. They just need to know why they use Tor. If they use it just to stay out of marketing databases and/or bypass a local content filter, two hops is plenty. This is likely the vast majority of Tor users, and many non-users we would like to bring on board.

So, having established this class of users, let us now go on to examine theoretical and practical risks we place them at, and determine if these risks violate the users' needs, or introduce additional risk to node operators who may be subject to requests from law enforcement to track users who need 3 hops, but use 2 because they enjoy the thrill of Russian roulette.

Theoretical Argument:

It has long been established that timing attacks against mix and onion networks are extremely effective, and that regardless of path length, if the adversary has compromised your first and last hop of your path, you can assume they have compromised your identity for that connection.
In fact, it was demonstrated that for all but the slowest, lossiest networks, error rates for false positives and false negatives were very near zero[2]. Only for constant streams of traffic over slow and (more importantly) extremely lossy network links did the error rate hit 20%. For loss rates typical to the Internet, even the error rate for slow nodes with constant traffic streams was 13%.

When you take into account that most Tor streams are not constant, but probably much more like their "HomeIP" dataset, which consists mostly of web traffic that exists over finite intervals at specific times, error rates drop to fractions of 1%, even for the "worst" network nodes.

Therefore, the user has little benefit from the extra hop, assuming the adversary does timing correlation on their nodes. Since timing correlation is simply an implementation issue and is most likely a single up-front cost (and one that is likely quite a bit cheaper than the cost of the machines purchased to host the nodes to mount an attack), the real protection is the low probability of getting both the first and last hop of a client's stream.

Practical Issues:

Theoretical issues aside, there are several practical issues with the implementation of Tor that need to be addressed to ensure that identity information is not leaked by the implementation.

Exit policy issues:

If a client chooses an exit with a very restrictive exit policy (such as an IP or IP range), the first hop then knows a good deal about the destination. For this reason, clients should not select exits that match their destination IP with anything other than "*".

Partitioning:

Partitioning attacks form another concern. Since Tor uses telescoping to build circuits, it is possible to tell a user is constructing only two hop paths at the entry node and on the local network.
An external adversary can potentially differentiate 2 and 3 hop users, and decide that all IP addresses connecting to Tor and using 3 hops have something to hide, and should be scrutinized more closely or outright apprehended.

One solution to this is to use the "leaky-circuit" method of attaching streams: The user always creates 3-hop circuits, but if the option is enabled, they always exit from their 2nd hop. The ideal solution would be to create a RELAY_SHISHKABOB cell which contains onion skins for every host along the path, but this requires protocol changes at the nodes to support.

Guard nodes:

Since guard nodes can rotate due to client relocation, network failure, node upgrades and other issues, if you amortize the risk a mobile, dialup, or otherwise intermittently connected user is exposed to over any reasonable duration of Tor usage (on the order of a year), it is the same with or without guard nodes. Assuming an adversary has a fraction c/n of network bandwidth, and guards rotate on average with period R, statistically speaking, it's merely a question of whether the user wishes their risk to be concentrated with probability c/n over an expected period of R*c, and probability 0 over an expected period of R*(n-c), versus a continuous risk of (c/n)^2. So statistically speaking, guards only create a time-tradeoff of risk over the long run for normal Tor usage. Rotating guards do not reduce risk for normal client usage long term.[3]

On the other hand, assuming a more stable method of guard selection and preservation is devised, or a more stable client side network than my own is typical (which rotates guards frequently due to network issues and moving about), guard nodes provide a tradeoff in the form of a fraction c/n of the users being "sacrificial users" who are exposed to high risk O(c/n) of identification, while the rest of the network is exposed to zero risk.
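The time-tradeoff argument above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the adversary controls a fraction c/n of the network and that guard and exit choices are independent uniform draws (real Tor weights relays by bandwidth). The function names are illustrative and do not come from the Tor source.

```python
from fractions import Fraction


def risk_without_guards(c, n):
    """Per-circuit chance that both the first and last hop are hostile."""
    f = Fraction(c, n)
    return f * f


def risk_with_guards(c, n):
    """Long-run average per-circuit risk when the first hop is a guard.

    For a share c/n of the time the client sits behind a hostile guard,
    and each circuit is then exposed with probability c/n (hostile exit).
    For the rest of the time the risk is zero.
    """
    f = Fraction(c, n)
    exposed_share = f        # share of time spent behind a hostile guard
    risk_while_exposed = f   # chance that the exit is hostile as well
    return exposed_share * risk_while_exposed + (1 - exposed_share) * 0


# Hypothetical numbers: the adversary holds 5 of every 100 units of capacity.
print(risk_without_guards(5, 100))  # 1/400
print(risk_with_guards(5, 100))     # 1/400
```

Both averages come out the same, which is the sense in which guards that rotate only move the risk around in time.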
The nature of Tor makes it likely an adversary will take a "shock and awe" approach to suppressing Tor by rounding up a few users whose browsing activity has been observed to be made into examples, in an attempt to prove that Tor is not perfect.

Since this "shock and awe" attack can be applied with or without guard nodes, stable guard nodes do offer a measure of accountability of sorts. If a user was using a small set of guard nodes and knows them well, and then is suddenly apprehended as a result of Tor usage, having a fixed set of entry points to suspect is a lot better than suspecting the whole network. Conversely, it can also give non-apprehended users comfort that they are likely to remain safe indefinitely with their set of (now presumably trusted) guards. This is probably the most beneficial property of reliable guards: they deter the adversary from mounting "shock and awe" attacks because the surviving users will not be intimidated, but instead made more confident. Of course, guards need to be made much more stable and users need to be encouraged to know their guards for this property to really take effect.

This beneficial property of client vigilance also carries over to an active adversary, except in this case instead of relying on the user to remember their guard nodes and somehow communicate them after apprehension, the code can alert them to the presence of an active adversary before they are apprehended. But only if they use guard nodes.

So let's consider the active adversary: Two hop paths allow malicious guards to get considerably more benefit from failing circuits if they do not extend to their colluding peers for the exit hop. Since guards can detect the number of hops in a path via either timing or by statistical analysis of the exit policy of the 2nd hop, they can perform this attack predominantly against 2 hop users.
This can be addressed by completely abandoning an entry guard after a certain ratio of extend or general circuit failures with respect to non-failed circuits. The proper value for this ratio can be determined experimentally with TorFlow. There is the possibility that the local network can abuse this feature to cause certain guards to be dropped, but they can do that anyways with the current Tor by just making guards they don't like unreachable. With this mechanism, Tor will complain loudly if any guard failure rate exceeds the expected in any failure case, local or remote. Eliminating guards entirely would actually not address this issue due to the time-tradeoff nature of risk. In fact, it would just make it worse. Without guard nodes, it becomes much more difficult for clients to become alerted to Tor entry points that are failing circuits to make sure that they only devote bandwidth to carry traffic for streams which they observe both ends. Yet the rogue entry points are still able to significantly increase their success rates by failing circuits. For this reason, guard nodes should remain enabled for 2 hop users, at least until an IP-independent, undetectable guard scanner can be created. TorFlow can scan for failing guards, but after a while, its unique behavior gives away the fact that its IP is a scanner and it can be given selective service. Consideration of risks for node operators: There is a serious risk for two hop users in the form of guard profiling. If an adversary running an exit node notices that a particular site is always visited from a fixed previous hop, it is likely that this is a two hop user using a certain guard, which could be monitored to determine their identity. Thus, for the protection of both 2 hop users and node operators, 2 hop users should limit their guard duration to a sufficient number of days to verify reliability of a node, but not much more. This duration can be determined experimentally by TorFlow. 
Considering a Tor client builds on average 144 circuits/day (10 minutes per circuit), if the adversary owns a fraction c/n of exits on the network, they can expect to see 144*c/n circuits from this user, or about 14 minutes of usage per day per percentage of network penetration. Since it will take several occurrences of user-linkable exit content from the same predecessor hop for the adversary to have any confidence this is a 2 hop user, it is very unlikely that any sort of demands made upon the predecessor node would be guaranteed to be effective (i.e. it actually was a guard), let alone be executed in time to apprehend the user before they rotated guards.

The reverse risk also warrants consideration. If a malicious guard has orders to surveil Mike Perry, it can determine Mike Perry is using two hops by observing his tendency to choose a 2nd hop with a viable exit policy. This can be done relatively quickly, unfortunately, and indicates Mike Perry should spend some of his time building real 3 hop circuits through the same guards, to require them to at least wait for him to actually use Tor to determine his style of operation, rather than collect this information from his passive building patterns.

However, to actively determine where Mike Perry is going, the guard will need to require logging ahead of time at multiple exit nodes that he may use over the course of the few days while he is at that guard, and correlate the usage times of the exit node with Mike Perry's activity at that guard for the few days he uses it. At this point, the adversary is mounting a scale and method of attack (widespread logging, timing attacks) that works pretty much just as effectively against 3 hops, so exit node operators are exposed to no additional danger than they otherwise normally are.
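The "14 minutes per day per percent" figure in this section follows directly from the 10-minute circuit lifetime. A small worked sketch, assuming a client that is in use around the clock and an adversary whose share of exits equals its chance of being picked as the exit (the names below are illustrative):

```python
from fractions import Fraction

CIRCUIT_LIFETIME_MIN = 10                            # one new circuit every 10 minutes
CIRCUITS_PER_DAY = 24 * 60 // CIRCUIT_LIFETIME_MIN   # 144


def observed_circuits_per_day(exit_fraction):
    """Expected number of a user's daily circuits that end at a hostile exit."""
    return CIRCUITS_PER_DAY * exit_fraction


def observed_minutes_per_day(exit_fraction):
    """Expected minutes of a user's daily traffic seen by hostile exits."""
    return observed_circuits_per_day(exit_fraction) * CIRCUIT_LIFETIME_MIN


one_percent = Fraction(1, 100)
print(float(observed_circuits_per_day(one_percent)))  # 1.44
print(float(observed_minutes_per_day(one_percent)))   # 14.4
```

An adversary with 1% of exits therefore sees roughly one to two circuits per day from a given user, which is why several days of observation would be needed before a predecessor hop stands out.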
Why not fix Pathlen=2?: The main reason I am not advocating that we always use 2 hops is that in some situations, timing correlation evidence by itself may not be considered as solid and convincing as an actual, uninterrupted, fully traced path. Are these timing attacks as effective on a real network as they are in simulation? Maybe the circuit multiplexing of Tor can serve to frustrate them to a degree? Would an extralegal adversary or authoritarian government even care? In the face of these situation dependent unknowns, it should be up to the user to decide if this is a concern for them or not. It should probably also be noted that even a false positive rate of 1% for a 200k concurrent-user network could mean that for a given node, a given stream could be confused with something like 10 users, assuming ~200 nodes carry most of the traffic (ie 1000 users each). Though of course to really know for sure, someone needs to do an attack on a real network, unfortunately. Additionally, at some point cover traffic schemes may be implemented to frustrate timing attacks on the first hop. It is possible some expert users may do this ad-hoc already, and may wish to continue using 3 hops for this reason. Implementation: new_route_len() can be modified directly with a check of the Pathlen option. However, circuit construction logic should be altered so that both 2 hop and 3 hop users build the same types of circuits, and the option should ultimately govern circuit selection, not construction. This improves coverage against guard nodes being able to passively profile users who aren't even using Tor. PathlenCoinWeight, anyone? :) The exit policy hack is a bit more tricky. compare_addr_to_addr_policy needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in circuit_is_acceptable. The leaky exit is trickier still.. handle_control_attachstream does allow paths to exit at a given hop. 
Presumably something similar can be done in connection_ap_handshake_process_socks, and elsewhere? Circuit construction would also have to be performed such that the 2nd hop's exit policy is what is considered, not the 3rd's.

The entry_guard_t structure could have num_circ_failed and num_circ_succeeded members such that if it exceeds F% circuit extend failure rate to a second hop, it is removed from the entry list. F should be sufficiently high to avoid churn from normal Tor circuit failure as determined by TorFlow scans.

The Vidalia option should be presented as a radio button.

Migration:

Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky circuit ability, and 2-3 hop circuit selection logic governed by Pathlen.

Phase 2: Experiment to determine the proper ratio of circuit failures used to expire garbage or malicious guards via TorFlow (pending Bug #440 backport+adoption).

Phase 3: Implement guard expiration code to kick off failure-prone guards and warn the user. Cap 2 hop guard duration to a proper number of days determined sufficient to establish guard reliability (to be determined by TorFlow).

Phase 4: Make radiobutton in Vidalia, along with help entry that explains in layman's terms the risks involved.

Phase 5: Allow user to specify path length by HTTP URL suffix.

[1] http://p2pnet.net/story/11279
[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
[3] Proof available upon request ;)
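The guard expiration rule from the Implementation section of proposal 115 could be sketched roughly as follows. This is not Tor code: GuardStats is a hypothetical stand-in for the counters proposed for entry_guard_t, the threshold argument plays the role of F, and the min_attempts floor is an added assumption so that a guard is not dropped after a single unlucky circuit.

```python
from dataclasses import dataclass


@dataclass
class GuardStats:
    """Hypothetical per-guard counters (num_circ_succeeded, num_circ_failed)."""
    num_circ_succeeded: int = 0
    num_circ_failed: int = 0

    def record(self, succeeded: bool) -> None:
        """Count one attempt to extend a circuit through this guard."""
        if succeeded:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def attempts(self) -> int:
        return self.num_circ_succeeded + self.num_circ_failed

    def failure_rate(self) -> float:
        total = self.attempts()
        return self.num_circ_failed / total if total else 0.0


def should_drop_guard(stats: GuardStats, max_failure_rate: float,
                      min_attempts: int = 20) -> bool:
    """Return True once the observed failure rate exceeds the threshold F.

    max_failure_rate is F expressed as a fraction (0.2 means 20%). The
    proposal leaves its value to be determined by TorFlow scans.
    """
    if stats.attempts() < min_attempts:
        return False
    return stats.failure_rate() > max_failure_rate


guard = GuardStats()
for ok in [True] * 30 + [False] * 10:
    guard.record(ok)
print(guard.failure_rate())            # 0.25
print(should_drop_guard(guard, 0.20))  # True
```

A client following the proposal would also warn the user when a guard is dropped this way, since an unusually high failure rate may indicate an active adversary.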
Filename: 116-two-hop-paths-from-guard.txt Title: Two hop paths from entry guards Author: Michael Lieberman Created: 26-Jun-2007 Status: Dead This proposal is related to (but different from) Mike Perry's proposal 115 "Two Hop Paths." Overview: Volunteers who run entry guards should have the option of using only 2 additional tor nodes when constructing their own tor circuits. While the option of two hop paths should perhaps be extended to every client (as discussed in Mike Perry's thread), I believe the anonymity properties of two hop paths are particularly well-suited to client computers that are also serving as entry guards. First I will describe the details of the strategy, as well as possible avenues of attack. Then I will list advantages and disadvantages. Then, I will discuss some possibly safer variations of the strategy, and finally some implementation issues. Details: Suppose Alice is an entry guard, and wants to construct a two hop circuit. Alice chooses a middle node at random (not using the entry guard strategy), and gains anonymity by having her traffic look just like traffic from someone else using her as an entry guard. Can Alice's middle node figure out that she is initiator of the traffic? I can think of four possible approaches for distinguishing traffic from Alice with traffic through Alice: 1) Notice that communication from Alice comes too fast: Experimentation is needed to determine if traffic from Alice can be distinguished from traffic from a computer with a decent link to Alice. 2) Monitor Alice's network traffic to discover the lack of incoming packets at the appropriate times. If an adversary has this ability, then Alice already has problems in the current system, because the adversary can run a standard timing attack on Alice's traffic. 3) Notice that traffic from Alice is unique in some way such that if Alice was just one of 3 entry guards for this traffic, then the traffic should be coming from two other entry guards as well. 
An example of "unique traffic" could be always sending 117 packets every 3 minutes to an exit node that exits to port 4661. However, if such patterns existed with sufficient precision, then it seems to me that Tor already has a problem. (This "unique traffic" may not be a problem if clients often end up choosing a single entry guard because their other two are down. Does anyone know if this is the case?) 4) First, control the middle node *and* some other part of the traffic, using standard attacks on a two hop circuit without entry nodes (my recent paper on Browser-Based Attacks would work well for this http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With control of the circuit, we can now cause "unique traffic" as in 3). Alternatively, if we know something about Alice independently, and we can see what websites are being visited, we might be able to guess that she is the kind of person that would visit those websites. Anonymity Advantages: -Alice never has the problem of choosing a malicious entry guard. In some sense, Alice acts as her own entry guard. Anonymity Disadvantages: -If Alice's traffic is identified as originating from herself (see above for how hard that might be), then she has the anonymity of a 2 hop circuit without entry guards. Additional advantages: -A discussion of the latency advantages of two hop circuits is going on in Mike Perry's thread already. -Also, we can advertise this change as "Run an entry guard and decrease your own Tor latency." This incentive has the potential to add nodes to the network, improving the network as a whole. Safer variations: To solve the "unique traffic" problem, Alice could use two hop paths only 1/3 of the time, and choose 2 other entry guards for the other 2/3 of the time. All the advantages are now 1/3 as useful (possibly more, if the other 2 entry guards are not always up). 
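A minimal sketch of the 1/3 variation just described, assuming Alice keeps two other entry guards and picks uniformly among herself and those two for each new circuit. The function name and the use of None to mean "start the path at Alice herself" are illustrative only.

```python
import random


def pick_first_hop(other_guards, rng=random):
    """Choose how Alice starts a new circuit.

    Returns None when Alice should build a two hop path from herself,
    or one of her other entry guards for a normal three hop path.
    With two other guards, each outcome has probability 1/3.
    """
    choices = [None] + list(other_guards)
    return rng.choice(choices)


rng = random.Random(1)  # fixed seed so the example is repeatable
draws = [pick_first_hop(["guardA", "guardB"], rng) for _ in range(3000)]
print(draws.count(None) / len(draws))  # close to 1/3
```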
To solve the problem that Alice's responses are too fast, Alice could delay her responses (ideally based on some real data of response time when Alice is used as an entry guard). This loses most of the speed advantages of the two hop path, but if Alice is a fast entry guard, it doesn't lose everything. It also still has the (arguable) anonymity advantage that Alice doesn't have to worry about having a malicious entry guard.

Implementation details:

For Alice to remain anonymous using this strategy, she has to actually be acting as an entry guard for other nodes. This means the two hop option can only be available to whatever high-performance threshold is currently set on entry guards. Alice may need to somehow check her own current status as an entry guard before choosing this two hop strategy.

Another thing to consider: suppose Alice is also an exit node. If the fraction of exit nodes in existence is too small, she may rarely or never be chosen as an entry guard. It would be sad if we offered an incentive to run an entry guard that didn't extend to exit nodes. I suppose clients of Exit nodes could pull the same trick, and bypass using Tor altogether (zero hop paths), though that has additional issues.*

Mike Lieberman
MIT

*Why we shouldn't recommend Exit nodes pull the same trick:
1) Exit nodes would suffer heavily from the problem of "unique traffic" mentioned above.
2) It would give governments an incentive to confiscate exit nodes to see if they are pulling this trick.
Filename: 117-ipv6-exits.txt Title: IPv6 exits Author: coderman Created: 10-Jul-2007 Status: Closed Target: 0.2.4.x Implemented-In: 0.2.4.7-alpha Overview Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6 addresses. This proposal does not imply any IPv6 support for OR traffic, only exit and name resolution. Contents 0. Motivation As the IPv4 address space becomes more scarce there is increasing effort to provide Internet services via the IPv6 protocol. Many hosts are available at IPv6 endpoints which are currently inaccessible for Tor users. Extending Tor to support IPv6 exit streams and IPv6 DNS name resolution will allow users of the Tor network to access these hosts. This capability would be present for those who do not currently have IPv6 access, thus increasing the utility of Tor and furthering adoption of IPv6. 1. Design 1.1. General design overview There are three main components to this proposal. The first is a method for routers to advertise their ability to exit IPv6 traffic. The second is the manner in which routers resolve names to IPv6 addresses. Last but not least is the method in which clients communicate with Tor to resolve and connect to IPv6 endpoints anonymously. 1.2. Router IPv6 exit support In order to specify exit policies and IPv6 capability new directives in the Tor configuration will be needed. If a router advertises IPv6 exit policies in its descriptor this will signal the ability to provide IPv6 exit. There are a number of additional default deny rules associated with this new address space which are detailed in the addendum. When Tor is started on a host it should check for the presence of a global unicast IPv6 address and if present include the default IPv6 exit policies and any user specified IPv6 exit policies. If a user provides IPv6 exit policies but no global unicast IPv6 address is available Tor should generate a warning and not publish the IPv6 policies in the router descriptor. 
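The startup check described above could look roughly like the sketch below. It is not Tor code: the function names are hypothetical, "global unicast" is approximated as membership in 2000::/3, and placing user policies before the defaults is an assumption based on exit policies being evaluated first-match.

```python
import ipaddress

# Assumption: treat 2000::/3 as the global unicast IPv6 range.
GLOBAL_UNICAST_V6 = ipaddress.ip_network("2000::/3")


def has_global_unicast_v6(local_addresses):
    """Return True if any local address is a global unicast IPv6 address."""
    for text in local_addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 6 and addr in GLOBAL_UNICAST_V6:
            return True
    return False


def ipv6_policies_to_publish(local_addresses, default_policies, user_policies):
    """Return (policies, warning) for the router descriptor.

    With a global unicast IPv6 address, publish user policies followed by
    the defaults. Without one, publish nothing, and warn if the user had
    configured IPv6 exit policies.
    """
    if has_global_unicast_v6(local_addresses):
        return list(user_policies) + list(default_policies), None
    warning = None
    if user_policies:
        warning = ("IPv6 exit policies are configured but no global "
                   "unicast IPv6 address was found; not publishing them")
    return [], warning


# Link-local only: nothing is published and a warning is produced.
print(ipv6_policies_to_publish(["192.0.2.1", "fe80::1"],
                               ["reject6 [FC00::]/7"],
                               ["accept6 [2000::]/3:80"]))
```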
It should be noted that IPv4 mapped IPv6 addresses are not valid exit destinations. This mechanism is mainly used to interoperate with both IPv4 and IPv6 clients on the same socket. Any attempts to use an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for IPv4, must be refused. 1.3. DNS name resolution of IPv6 addresses (AAAA records) In addition to exit support for IPv6 TCP connections, a method to resolve domain names to their respective IPv6 addresses is also needed. This is accomplished in the existing DNS system via AAAA records. Routers will perform both A and AAAA requests when resolving a name so that the client can utilize an IPv6 endpoint when available or preferred. To avoid potential problems with caching DNS servers that behave poorly all NXDOMAIN responses to AAAA requests should be ignored if a successful response is received for an A request. This implies that both AAAA and A requests will always be performed for each name resolution. For reverse lookups on IPv6 addresses, like that used for RESOLVE_PTR, Tor will perform the necessary PTR requests via IP6.ARPA. All routers which perform DNS resolution on behalf of clients (RELAY_RESOLVE) should perform and respond with both A and AAAA resources. [NOTE: In a future version, when we extend the behavior of RESOLVE to encapsulate more of real DNS, it will make sense to allow more flexibility here. -nickm] 1.4. Client interaction with IPv6 exit capability 1.4.1. Usability goals There are a number of behaviors which Tor can provide when interacting with clients that will improve the usability of IPv6 exit capability. These behaviors are designed to make it simple for clients to express a preference for IPv6 transport and utilize IPv6 host services. 1.4.2. SOCKSv5 IPv6 client behavior The SOCKS version 5 protocol supports IPv6 connections. 
When using SOCKSv5 with hostnames it is difficult to determine if a client wishes to use an IPv4 or IPv6 address to connect to the desired host if it resolves to both address types. In order to make this more intuitive the SOCKSv5 protocol can be supported on a local IPv6 endpoint, [::1] port 9050 for example. When a client requests a connection to the desired host via an IPv6 SOCKS connection Tor will prefer IPv6 addresses when resolving the host name and connecting to the host. Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS connection will return IPv6 addresses when available, and fall back to IPv4 addresses if not. [NOTE: This means that SocksListenAddress and DNSListenAddress should support IPv6 addresses. Perhaps there should also be a general option to have listeners that default to 127.0.0.1 and 0.0.0.0 listen additionally or instead on ::1 and :: -nickm] 1.4.3. MAPADDRESS behavior The MAPADDRESS capability supports clients that may not be able to use the SOCKSv4a or SOCKSv5 hostname support to resolve names via Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as well. When a client requests an address mapping from the wildcard IPv6 address, [::0], the server will respond with a unique local IPv6 address on success. It is important to note that there may be two mappings for the same name if both an IPv4 and IPv6 address are associated with the host. In this case a CONNECT to a mapped IPv6 address should prefer IPv6 for the connection to the host, if available, while CONNECT to a mapped IPv4 address will prefer IPv4. It should be noted that IPv6 does not provide the concept of a host local subnet, like 127.0.0.0/8 in IPv4. For this reason integration of Tor with IPv6 clients should consider a firewall or filter rule to drop unique local addresses to or from the network when possible. These packets should not be routed, however, keeping them off the subnet entirely is worthwhile. 1.4.3.1. 
Generating unique local IPv6 addresses The usual manner of generating a unique local IPv6 address is to select a Global ID part randomly, along with a Subnet ID, and sharing this prefix among the communicating parties who each have their own distinct Interface ID. In this style a given Tor instance might select a random Global and Subnet ID and provide MAPADDRESS assignments with a random Interface ID as needed. This has the potential to associate unique Global/Subnet identifiers with a given Tor instance and may expose attacks against the anonymity of Tor users. To avoid this potential problem entirely MAPADDRESS must always generate the Global, Subnet, and Interface IDs randomly for each request. It is also highly suggested that explicitly specifying an IPv6 source address instead of the wildcard address not be supported to ensure that a good random address is used. 1.4.4. DNSProxy IPv6 client behavior A new capability in recent Tor versions is the transparent DNS proxy. This feature will need to return both A and AAAA resource records when responding to client name resolution requests. The transparent DNS proxy should also support reverse lookups for IPv6 addresses. It is suggested that any such requests to the deprecated IP6.INT domain should be translated to IP6.ARPA instead. This translation is not likely to be used and is of low priority. It would be nice to support DNS over IPv6 transport as well, however, this is not likely to be used and is of low priority. 1.4.5. TransPort IPv6 client behavior Tor also provides transparent TCP proxy support via the Trans* directives in the configuration. The TransListenAddress directive should accept an IPv6 address in addition to IPv4 so that IPv6 TCP connections can be transparently proxied. 1.5. Additional changes The RedirectExit option should be deprecated rather than extending this feature to IPv6. 2. Spec changes 2.1. Tor specification In '6.2. 
Opening streams and transferring data' the following should be changed to indicate IPv6 exit capability: "No version of Tor currently generates the IPv6 format." In '6.4. Remote hostname lookup' the following should be updated to reflect use of ip6.arpa in addition to in-addr.arpa. "For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an in-addr.arpa address." In 'A.1. Differences between spec and implementation' the following should be updated to indicate IPv6 exit capability: "The current codebase has no IPv6 support at all." [NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2 type that can hold an ipv6 address, since the way we encode ipv6 addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6") is a bit dumb. -nickm] [Actually, the length field lets us distinguish EXITPOLICY. -nickm] 2.2. Directory specification In '2.1. Router descriptor format' a new set of directives is needed for IPv6 exit policy. The existing accept/reject directives should be clarified to indicate IPv4 or wildcard address relevance. The new IPv6 directives will be in the form of: "accept6" exitpattern NL "reject6" exitpattern NL The section describing accept6/reject6 should explain that the presence of accept6 or reject6 exit policies in a router descriptor signals the ability of that router to exit IPv6 traffic (according to IPv6 exit policies). The "[::]/0" notation is used to represent "all IPv6 addresses". "[::0]/0" may also be used for this representation. If a user specifies a 'reject6 [::]/0:*' policy in the Tor configuration this will be interpreted as forcing no IPv6 exit support and no accept6/reject6 policies will be included in the published descriptor. This will prevent IPv6 exit if the router host has a global unicast IPv6 address present. 
It is important to note that a wildcard address in an accept or reject policy applies to both IPv4 and IPv6 addresses.

2.3. Control specification

In '3.8. MAPADDRESS' the potential to have two addresses for a given name should be explained. The method for generating unique local addresses for IPv6 mappings needs explanation as described above.

When IPv6 addresses are used in this document they should include the brackets for consistency. For example, the null IPv6 address should be written as "[::0]" and not "::0". The control commands will expect the same syntax as well.

In '3.9. GETINFO' the "address" command should return both public IPv4 and IPv6 addresses if present. These addresses should be separated via \r\n.

2.4. Tor SOCKS extensions

In '2. Name lookup' a description of IPv6 address resolution is needed for SOCKSv5 as described above. IPv6 addresses should be supported in both the RESOLVE and RESOLVE_PTR extensions.

A new section describing the ability to accept SOCKSv5 clients on a local IPv6 address to indicate a preference for IPv6 transport as described above is also needed. The behavior of Tor SOCKSv5 proxy with an IPv6 preference should be explained, for example, preferring IPv6 transport to a named host with both IPv4 and IPv6 addresses available (A and AAAA records).

3. Questions and concerns

3.1. DNS A6 records

A6 is explicitly avoided in this document. There are potential reasons for implementing this, however, the inherent complexity of the protocol and resolvers make this unappealing. Is there a compelling reason to consider A6 as part of IPv6 exit support?

[IMO not till anybody needs it. -nickm]

3.2. IPv4 and IPv6 preference

The design above tries to infer a preference for IPv4 or IPv6 transport based on client interactions with Tor. It might be useful to provide more explicit control over this preference.
For example, an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts in CONNECT requests while the current implementation would assume an IPv4 preference. Should more explicit control be available, through either configuration directives or control commands?

Many applications support an inet6-only or prefer-family type option that provides the user manual control over address preference. This could be provided as a Tor configuration option.

An explicit preference is still possible by resolving names and then CONNECTing to an IPv4 or IPv6 address as desired; however, not all client applications may have this option available.

3.3. Support for IPv6 only transparent proxy clients

It may be useful to support IPv6 only transparent proxy clients using IPv4 mapped IPv6 like addresses. This would require transparent DNS proxy using IPv6 transport and the ability to map A record responses into IPv4 mapped IPv6 like addresses in the manner described in the "NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The transparent TCP proxy would thus need to detect these mapped addresses and connect to the desired IPv4 host.

The IPv6 prefix used for this purpose must not be the actual IPv4 mapped IPv6 address prefix, though the manner in which IPv4 addresses are embedded in IPv6 addresses would be the same.

The lack of any IPv6 only hosts which would use this transparent proxy method makes this a lot of work for very little gain. Is there a compelling reason to support this NAT-PT like capability?

3.4. IPv6 DNS and older Tor routers

It is expected that many routers will continue to run with older versions of Tor when the IPv6 exit capability is released. Clients who wish to use IPv6 will need to route RELAY_RESOLVE requests to the newer routers which will respond with both A and AAAA resource records when possible.
One way to do this is to route RELAY_RESOLVE requests to routers with IPv6 exit policies published; however, this would not utilize current routers that can resolve IPv6 addresses even if they can't exit such traffic.

There was also concern expressed about the ability of existing clients to cope with new RELAY_RESOLVE responses that contain IPv6 addresses. If this breaks backward compatibility, a new request type may be necessary, like RELAY_RESOLVE6, or some other mechanism of indicating the ability to parse IPv6 responses when making the request.

3.5. IPv4 and IPv6 bindings in MAPADDRESS

It may be troublesome to try and support two distinct address mappings for the same name in the existing MAPADDRESS implementation. If this cannot be accommodated then the behavior should replace existing mappings with the new address regardless of family. A warning when this occurs would be useful to assist clients who encounter problems when both an IPv4 and IPv6 application are using MAPADDRESS for the same names concurrently, causing lost connections for one of them.

4. Addendum

4.1. Sample IPv6 default exit policy

  reject 0.0.0.0/8
  reject 169.254.0.0/16
  reject 127.0.0.0/8
  reject 192.168.0.0/16
  reject 10.0.0.0/8
  reject 172.16.0.0/12
  reject6 [0000::]/8
  reject6 [0100::]/8
  reject6 [0200::]/7
  reject6 [0400::]/6
  reject6 [0800::]/5
  reject6 [1000::]/4
  reject6 [4000::]/3
  reject6 [6000::]/3
  reject6 [8000::]/3
  reject6 [A000::]/3
  reject6 [C000::]/3
  reject6 [E000::]/4
  reject6 [F000::]/5
  reject6 [F800::]/6
  reject6 [FC00::]/7
  reject6 [FE00::]/9
  reject6 [FE80::]/10
  reject6 [FEC0::]/10
  reject6 [FF00::]/8
  reject *:25
  reject *:119
  reject *:135-139
  reject *:445
  reject *:1214
  reject *:4661-4666
  reject *:6346-6429
  reject *:6699
  reject *:6881-6999
  accept *:*
  # accept6 [2000::]/3:* is implied

4.2. Additional resources

  'DNS Extensions to Support IP Version 6'
  http://www.ietf.org/rfc/rfc3596.txt

  'DNS Extensions to Support IPv6 Address Aggregation and Renumbering'
  http://www.ietf.org/rfc/rfc2874.txt

  'SOCKS Protocol Version 5'
  http://www.ietf.org/rfc/rfc1928.txt

  'Unique Local IPv6 Unicast Addresses'
  http://www.ietf.org/rfc/rfc4193.txt

  'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE'
  http://www.iana.org/assignments/ipv6-address-space

  'Network Address Translation - Protocol Translation (NAT-PT)'
  http://www.ietf.org/rfc/rfc2766.txt
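The IPv6 half of the sample policy in section 4.1 can be sanity-checked with a short script. This is only a sketch, not how Tor evaluates exit policies: it ignores the port rules and the IPv4 lines, and the function names are made up for illustration. It checks that the reject6 prefixes together with the implied 2000::/3 accept partition the whole IPv6 address space, and evaluates single addresses on a first-match basis.

```python
import ipaddress
from itertools import combinations

# reject6 prefixes copied from the sample policy in section 4.1.
REJECT6 = [
    "0000::/8", "0100::/8", "0200::/7", "0400::/6", "0800::/5", "1000::/4",
    "4000::/3", "6000::/3", "8000::/3", "A000::/3", "C000::/3", "E000::/4",
    "F000::/5", "F800::/6", "FC00::/7", "FE00::/9", "FE80::/10", "FEC0::/10",
    "FF00::/8",
]
# The policy comment says this accept is implied.
IMPLIED_ACCEPT6 = "2000::/3"


def ipv6_exit_allowed(addr: str) -> bool:
    """First-match evaluation of the IPv6 prefixes only (ports are ignored)."""
    ip = ipaddress.IPv6Address(addr)
    for prefix in REJECT6:
        if ip in ipaddress.IPv6Network(prefix):
            return False
    return ip in ipaddress.IPv6Network(IMPLIED_ACCEPT6)


def policy_partitions_ipv6_space() -> bool:
    """True if the prefixes do not overlap and add up to all 2**128 addresses."""
    nets = [ipaddress.IPv6Network(p) for p in REJECT6 + [IMPLIED_ACCEPT6]]
    if any(a.overlaps(b) for a, b in combinations(nets, 2)):
        return False
    return sum(n.num_addresses for n in nets) == 2 ** 128
```

With these prefixes, a global unicast address such as 2001:db8::1 falls through to the implied accept, while link-local (fe80::/10) and loopback addresses are rejected.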
Filename: 118-multiple-orports.txt
Title: Advertising multiple ORPorts at once
Author: Nick Mathewson
Created: 09-Jul-2007
Status: Superseded
Superseded-By: 186-multiple-orports.txt

[Needs Revision: This proposal needs revision to come up to 2011 standards and take microdescriptors into account.]

Overview:

This document is a proposal for servers to advertise multiple address/port combinations for their ORPort.

Motivation:

Sometimes servers want to support multiple ports for incoming connections, either in order to support multiple address families, to better use multiple interfaces, or to support a variety of FascistFirewallPorts settings. This is easy to set up now, but there's no way to advertise it to clients.

New descriptor syntax:

We add a new line in the router descriptor, "or-address". This line can occur zero, one, or multiple times. Its format is:

  or-address SP ADDRESS ":" PORTLIST NL

  ADDRESS = IPV6ADDR / IPV4ADDR
  IPV6ADDR = an ipv6 address, surrounded by square brackets.
  IPV4ADDR = an ipv4 address, represented as a dotted quad.
  PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
  PORTSPEC = PORT | PORT "-" PORT

[This is the regular format for specifying sets of addresses and ports in Tor.]

New OR behavior:

We add two more options to supplement ORListenAddress: ORPublishedListenAddress, and ORPublishAddressSet. The former listens on an address-port combination and publishes it in addition to the regular address. The latter advertises a set of address-port combinations, but does not listen on them. [To use this option, the server operator should set up port forwarding to the regular ORPort, as for example with firewall rules.]

Servers should extend their testing to include advertised addresses and ports. No address or port should be advertised until it's been tested. [This might get expensive in practice.]

New authority behavior:

Authorities should spot-test descriptors, and reject any where a substantial part of the addresses can't be reached.

New client behavior:

When connecting to another server, clients SHOULD pick an address-port combination at random as supported by their reachableaddresses. If a client has a connection to a server at one address, it SHOULD use that address for any simultaneous connections to that server. Clients SHOULD use the canonical address for any server when generating extend cells.

Not addressed here:

  * There's no reason to listen on multiple dirports; current Tors mostly don't connect directly to the dirport anyway.

  * It could be advantageous to list something about extra addresses in the network-status document. This would, however, eat space there. More analysis is needed, particularly in light of proposal 141 ("Download server descriptors on demand").

Dependencies:

Testing for canonical connections needs to be implemented before it's safe to use this proposal.

Notes 3 July:

  - Write up the simple version of this. No ranges needed yet. No networkstatus changes yet.
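The "or-address" syntax proposed above is small enough to parse by hand. The following is a minimal sketch of such a parser, not code from Tor; the function name and the returned shape (an address string plus a list of inclusive port ranges) are assumptions made for illustration.

```python
import ipaddress
from typing import List, Tuple


def parse_or_address(line: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Parse 'or-address ADDRESS:PORTLIST' into (address, [(low, high), ...])."""
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "or-address" or not rest:
        raise ValueError("not an or-address line")

    if rest.startswith("["):
        # IPV6ADDR: an IPv6 address surrounded by square brackets.
        inner, sep, portlist = rest[1:].partition("]:")
        if not sep:
            raise ValueError("missing ']:' after IPv6 address")
        ipaddress.IPv6Address(inner)  # raises ValueError if malformed
        address = "[" + inner + "]"
    else:
        # IPV4ADDR: a dotted quad.
        address, sep, portlist = rest.partition(":")
        if not sep:
            raise ValueError("missing ':' after IPv4 address")
        ipaddress.IPv4Address(address)  # raises ValueError if malformed

    ranges = []
    for spec in portlist.split(","):
        # PORTSPEC = PORT | PORT "-" PORT
        low_text, dash, high_text = spec.partition("-")
        low = int(low_text)
        high = int(high_text) if dash else low
        if not 1 <= low <= high <= 65535:
            raise ValueError("bad port range: " + spec)
        ranges.append((low, high))
    return address, ranges
```

For instance, `parse_or_address("or-address [2001:db8::1]:9001,9030-9031")` yields the bracketed address together with the ranges (9001, 9001) and (9030, 9031).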
Filename: 119-controlport-auth.txt Title: New PROTOCOLINFO command for controllers Author: Roger Dingledine Created: 14-Aug-2007 Status: Closed Implemented-In: 0.2.0.x Overview: Here we describe how to help controllers locate the cookie authentication file when authenticating to Tor, so we can a) require authentication by default for Tor controllers and b) still keep things usable. Also, we propose an extensible, general-purpose mechanism for controllers to learn about a Tor instance's protocol and authentication requirements before authenticating. The Problem: When we first added the controller protocol, we wanted to make it easy for people to play with it, so by default we didn't require any authentication from controller programs. We allowed requests only from localhost as a stopgap measure for security. Due to an increasing number of vulnerabilities based on this approach, it's time to add authentication in default configurations. We have a number of goals: - We want the default Vidalia bundles to transparently work. That means we don't want the users to have to type in or know a password. - We want to allow multiple controller applications to connect to the control port. So if Vidalia is launching Tor, it can't just keep the secrets to itself. Right now there are three authentication approaches supported by the control protocol: NULL, CookieAuthentication, and HashedControlPassword. See Sec 5.1 in control-spec.txt for details. There are a couple of challenges here. The first is: if the controller launches Tor, how should we teach Tor what authentication approach it should require, and the secret that goes along with it? Next is: how should this work when the controller attaches to an existing Tor, rather than launching Tor itself? Cookie authentication seems most amenable to letting multiple controller applications interact with Tor. 
But that brings in yet another question: how does the controller guess where to look for the cookie file, without first knowing what DataDirectory Tor is using?

Design:

We should add a new controller command PROTOCOLINFO that can be sent as a valid first command (the others being AUTHENTICATE and QUIT). If PROTOCOLINFO is sent as the first command, the second command must be either a successful AUTHENTICATE or a QUIT.

If the initial command sequence is not valid, Tor closes the connection.

Spec:

  C: "PROTOCOLINFO" *(SP PIVERSION) CRLF
  S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF

  InfoLine = AuthLine / VersionLine / OtherLine

  AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
             *(SP "COOKIEFILE=" AuthCookieFile) CRLF
  VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF

  AuthMethod =
    "NULL"           / ; No authentication is required
    "HASHEDPASSWORD" / ; A controller must supply the original password
    "COOKIE"         / ; A controller must supply the contents of a cookie

  AuthCookieFile = QuotedString
  TorVersion = QuotedString

  OtherLine = "250-" Keyword [SP Arguments] CRLF

For example:

  C: PROTOCOLINFO CRLF
  S: "250+PROTOCOLINFO 1" CRLF
  S: "250-AUTH Methods=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF
  S: "250-VERSION Tor=0.2.0.5-alpha" CRLF
  S: "250 OK" CRLF

Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines with keywords they do not recognize. Controllers MUST ignore extraneous data on any InfoLine.

PIVERSION is there in case we drastically change the syntax one day. For now it should always be "1", for the controller protocol. Controllers MAY provide a list of the protocol versions they support; Tor MAY select a version that the controller does not support.

Right now only two "topics" (AUTH and VERSION) are included, but more may be included in the future. Controllers must accept lines with unexpected topics.
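A controller-side reader for this reply might look roughly like the sketch below. It is an illustration only: the function names and the returned dictionary are invented here, the QuotedString handling is simplified to "backslash escapes the next character", and keys are compared case-insensitively so that both "METHODS=" and the "Methods=" spelling in the example are accepted. Unknown topics and trailing data are skipped, as the text above requires.

```python
from typing import Dict, Iterable, List, Optional


def _split_args(text: str) -> Dict[str, str]:
    """Split 'K=V K2="quoted value"' into a dict with upper-cased keys."""
    args: Dict[str, str] = {}
    i, n = 0, len(text)
    while i < n:
        while i < n and text[i] == " ":
            i += 1
        eq = text.find("=", i)
        if eq < 0:
            break  # extraneous data without '=': ignore it
        key = text[i:eq].upper()
        i = eq + 1
        value: List[str] = []
        if i < n and text[i] == '"':
            i += 1
            while i < n and text[i] != '"':
                if text[i] == "\\" and i + 1 < n:
                    i += 1  # simplified escape rule: take next char literally
                value.append(text[i])
                i += 1
            i += 1  # skip the closing quote
        else:
            while i < n and text[i] != " ":
                value.append(text[i])
                i += 1
        args[key] = "".join(value)
    return args


def parse_protocolinfo(reply_lines: Iterable[str]) -> Dict[str, object]:
    """Collect PIVERSION, auth methods, cookie path and Tor version."""
    info: Dict[str, object] = {
        "piversion": None, "methods": [], "cookiefile": None, "tor_version": None,
    }
    for raw in reply_lines:
        line = raw.rstrip("\r\n")
        if line == "250 OK":
            break
        if line[:4] not in ("250+", "250-"):
            continue
        keyword, _, rest = line[4:].partition(" ")
        if keyword == "PROTOCOLINFO":
            info["piversion"] = rest.strip()
        elif keyword == "AUTH":
            args = _split_args(rest)
            methods = args.get("METHODS", "")
            info["methods"] = methods.split(",") if methods else []
            cookie: Optional[str] = args.get("COOKIEFILE")
            info["cookiefile"] = cookie
        elif keyword == "VERSION":
            info["tor_version"] = _split_args(rest).get("TOR")
        # Any other "250-" topic is unknown to this sketch and is skipped.
    return info
```

A controller would then authenticate according to `info["methods"]`, reading the file named by `info["cookiefile"]` only when "COOKIE" is among them.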
AuthCookieFile = QuotedString AuthMethod is used to specify one or more control authentication methods that Tor currently accepts. AuthCookieFile specifies the absolute path and filename of the authentication cookie that Tor is expecting and is provided iff the METHODS field contains the method "COOKIE". Controllers MUST handle escape sequences inside this string. The VERSION line contains the Tor version. [What else might we want to include that could be useful? -RD] Compatibility: Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed command. Earlier Tors don't know about this command but don't hang up. That means controllers will need a mechanism for distinguishing whether they're talking to a Tor that speaks PROTOCOLINFO or not. I suggest that the controllers attempt a PROTOCOLINFO. Then: - If it works, great. Authenticate as required. - If they get hung up on, reconnect and do a NULL AUTHENTICATE. - If it's unrecognized but they're not hung up on, do a NULL AUTHENTICATE. Unsolved problems: If Torbutton wants to be a Tor controller one day... talking TCP is bad enough, but reading from the filesystem is even harder. Is there a way to let simple programs work with the controller port without needing all the auth infrastructure? Once we put this approach in place, the next vulnerability we see will involve an attacker somehow getting read access to the victim's files --- and then we're back where we started. This means we still need to think about how to demand password-based authentication without bothering the user about it.
Filename: 120-shutdown-descriptors.txt Title: Shutdown descriptors when Tor servers stop Author: Roger Dingledine Created: 15-Aug-2007 Status: Dead [Proposal dead as of 11 Jul 2008. The point of this proposal was to give routers a good way to get out of the networkstatus early, but proposal 138 (already implemented) has achieved this.] Overview: Tor servers should publish a last descriptor whenever they shut down, to let others know that they are no longer offering service. The Problem: The main reason for this is in reaction to Internet services that want to treat connections from the Tor network differently. Right now, if a user experiments with turning on the "relay" functionality, he is punished by being locked out of some websites, some IRC networks, etc --- and this lockout persists for several days even after he turns the server off. Design: During the "slow shutdown" period if exiting, or shortly after the user sets his ORPort back to 0 if not exiting, Tor should publish a final descriptor with the following characteristics: 1) Exit policy is listed as "reject *:*" 2) It includes a new entry called "opt shutdown 1" The first step is so current blacklists will no longer list this node as exiting to whatever the service is. The second step is so directory authorities can avoid wasting time doing reachability testing. Authorities should automatically not list as Running any router whose latest descriptor says it shut down. [I originally had in mind a third step --- Advertised bandwidth capacity is listed as "0" --- so current Tor clients will skip over this node when building most circuits. But since clients won't fetch descriptors from nodes not listed as Running, this step seems pointless. -RD] Spec: TBD but should be pretty straightforward. Security issues: Now external people can learn exactly when a node stopped offering relay service. How bad is this? 
I can see a few minor attacks based on this knowledge, but on the other hand as it is we don't really take any steps to keep this information secret. Overhead issues: We are creating more descriptors that want to be remembered. However, since the router won't be marked as Running, ordinary clients won't fetch the shutdown descriptors. Caches will, though. I hope this is ok. Implementation: To make things easy, we should publish the shutdown descriptor only on controlled shutdown (SIGINT as opposed to SIGTERM). That would leave enough time for publishing that we probably wouldn't need any extra synchronization code. If that turns out to be too unintuitive for users, I could imagine doing it on SIGTERMs too, and just delaying exit until we had successfully published to at least one authority, at which point we'd hope that it propagated from there. Acknowledgements: tup suggested this idea. Comments: 2) Maybe add a rule "Don't do this for hibernation if we expect to wake up before the next consensus is published"? - NM 9 Oct 2007
Filename: 121-hidden-service-authentication.txt Title: Hidden Service Authentication Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger, Christoph Weingarten Created: 10-Sep-2007 Status: Closed Implemented-In: 0.2.1.x Change history: 26-Sep-2007 Initial proposal for or-dev 08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007 15-Dec-2007 Rewrote complete proposal for better readability, modified authentication protocol, merged in personal notes 24-Dec-2007 Replaced misleading term "authentication" by "authorization" and added some clarifications (comments by Sven Kaffille) 28-Apr-2008 Updated most parts of the concrete authorization protocol 04-Jul-2008 Add a simple algorithm to delay descriptor publication for different clients of a hidden service 19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay protection for INTRODUCE2 cells (1.3), described limitations for auth protocols (1.6), improved hidden service protocol without client authorization (2.1), added second, more scalable authorization protocol (2.2), rewrote existing authorization protocol (2.3); changes based on discussion with Nick 31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent abuse. 01-Aug-2008 Use first part of Diffie-Hellman handshake for replay protection instead of rendezvous cookie. 01-Aug-2008 Remove improved hidden service protocol without client authorization (2.1). It might get implemented in proposal 142. Overview: This proposal deals with a general infrastructure for performing authorization (not necessarily implying authentication) of requests to hidden services at three points: (1) when downloading and decrypting parts of the hidden service descriptor, (2) at the introduction point, and (3) at Bob's Tor client before contacting the rendezvous point. A service provider will be able to restrict access to his service at these three points to authorized clients only. 
Further, the proposal contains specific authorization protocols as instances that implement the presented authorization infrastructure. This proposal is based on v2 hidden service descriptors as described in proposal 114 and introduced in version 0.2.0.10-alpha. The proposal is structured as follows: The next section motivates the integration of authorization mechanisms in the hidden service protocol. Then we describe a general infrastructure for authorization in hidden services, followed by specific authorization protocols for this infrastructure. At the end we discuss a number of attacks and non-attacks as well as compatibility issues. Motivation: The major part of hidden services does not require client authorization now and won't do so in the future. To the contrary, many clients would not want to be (pseudonymously) identifiable by the service (though this is unavoidable to some extent), but rather use the service anonymously. These services are not addressed by this proposal. However, there may be certain services which are intended to be accessed by a limited set of clients only. A possible application might be a wiki or forum that should only be accessible for a closed user group. Another, less intuitive example might be a real-time communication service, where someone provides a presence and messaging service only to his buddies. Finally, a possible application would be a personal home server that should be remotely accessed by its owner. Performing authorization for a hidden service within the Tor network, as proposed here, offers a range of advantages compared to allowing all client connections in the first instance and deferring authorization to the transported protocol: (1) Reduced traffic: Unauthorized requests would be rejected as early as possible, thereby reducing the overall traffic in the network generated by establishing circuits and sending cells. 
(2) Better protection of service location: Unauthorized clients could not force Bob to create circuits to their rendezvous points, thus preventing the attack described by Øverlier and Syverson in their paper "Locating Hidden Servers" even without the need for guards. (3) Hiding activity: Apart from performing the actual authorization, a service provider could also hide the mere presence of his service from unauthorized clients when not providing hidden service descriptors to them, rejecting unauthorized requests already at the introduction point (ideally without leaking presence information at any of these points), or not answering unauthorized introduction requests. (4) Better protection of introduction points: When providing hidden service descriptors to authorized clients only and encrypting the introduction points as described in proposal 114, the introduction points would be unknown to unauthorized clients and thereby protected from DoS attacks. (5) Protocol independence: Authorization could be performed for all transported protocols, regardless of their own capabilities to do so. (6) Ease of administration: A service provider running multiple hidden services would be able to configure access at a single place uniformly instead of doing so for all services separately. (7) Optional QoS support: Bob could adapt his node selection algorithm for building the circuit to Alice's rendezvous point depending on a previously guaranteed QoS level, thus providing better latency or bandwidth for selected clients. A disadvantage of performing authorization within the Tor network is that a hidden service cannot make use of authorization data in the transported protocol. Tor hidden services were designed to be independent of the transported protocol. Therefore it's only possible to either grant or deny access to the whole service, but not to specific resources of the service. Authorization often implies authentication, i.e. proving one's identity. 
However, when performing authorization within the Tor network, untrusted points should not gain any useful information about the identities of communicating parties, neither server nor client. A crucial challenge is to remain anonymous towards directory servers and introduction points. However, trying to hide identity from the hidden service is a futile task, because a client would never know if he is the only authorized client and therefore perfectly identifiable. Therefore, hiding client identity from the hidden service is not an aim of this proposal. The current implementation of hidden services does not provide any kind of authorization. The hidden service descriptor version 2, introduced by proposal 114, was designed to use a descriptor cookie for downloading and decrypting parts of the descriptor content, but this feature is not yet in use. Further, most relevant cell formats specified in rend-spec contain fields for authorization data, but those fields are neither implemented nor do they suffice entirely. Details: 1. General infrastructure for authorization to hidden services We spotted three possible authorization points in the hidden service protocol: (1) when downloading and decrypting parts of the hidden service descriptor, (2) at the introduction point, and (3) at Bob's Tor client before contacting the rendezvous point. The general idea of this proposal is to allow service providers to restrict access to some or all of these points to authorized clients only. 1.1. Client authorization at directory Since the implementation of proposal 114 it is possible to combine a hidden service descriptor with a so-called descriptor cookie. If done so, the descriptor cookie becomes part of the descriptor ID, thus having an effect on the storage location of the descriptor. 
Someone who has learned about a service, but is not aware of the descriptor cookie, won't be able to determine the descriptor ID and download the current hidden service descriptor; he won't even know whether the service has uploaded a descriptor recently. Descriptor IDs are calculated as follows (see section 1.2 of rend-spec for the complete specification of v2 hidden service descriptors): descriptor-id = H(service-id | H(time-period | descriptor-cookie | replica)) Currently, service-id is equivalent to permanent-id which is calculated as in the following formula. But in principle it could be any public key. permanent-id = H(permanent-key)[:10] The second purpose of the descriptor cookie is to encrypt the list of introduction points, including optional authorization data. Hence, the hidden service directories won't learn any introduction information from storing a hidden service descriptor. This feature is implemented but unused at the moment. So this proposal will harness the advantages of proposal 114. The descriptor cookie can be used for authorization by keeping it secret from everyone but authorized clients. A service could then decide whether to publish hidden service descriptors using that descriptor cookie later on. An authorized client being aware of the descriptor cookie would be able to download and decrypt the hidden service descriptor. The number of concurrently used descriptor cookies for one hidden service is not restricted. A service could use a single descriptor cookie for all users, a distinct cookie per user, or something in between, like one cookie per group of users. It is up to the specific protocol and how it is applied by a service provider. Two or more hidden service descriptors for different groups or users should not be uploaded at the same time. A directory node could conclude easily that the descriptors were issued by the same hidden service, thus being able to link the two groups or users. 
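To make the role of the descriptor cookie concrete, the two formulas above can be sketched as follows. Several details are assumptions not stated in this proposal and recalled from the v2 descriptor format in rend-spec: H is taken to be SHA-1, time-period is encoded as a 4-octet big-endian integer and replica as a single octet, and the time-period formula offsets the day boundary by the first octet of the permanent ID. Function names are illustrative.

```python
import hashlib
import struct


def permanent_id(permanent_key: bytes) -> bytes:
    """permanent-id = H(permanent-key)[:10], with H assumed to be SHA-1."""
    return hashlib.sha1(permanent_key).digest()[:10]


def time_period(now: int, perm_id: bytes) -> int:
    """Assumed rend-spec rule: periods last one day, shifted per service."""
    return (now + perm_id[0] * 86400 // 256) // 86400


def descriptor_id(perm_id: bytes, now: int, descriptor_cookie: bytes,
                  replica: int) -> bytes:
    """descriptor-id = H(service-id | H(time-period | cookie | replica))."""
    inner = hashlib.sha1(
        struct.pack(">I", time_period(now, perm_id))  # assumed 4-octet encoding
        + descriptor_cookie
        + bytes([replica])                            # assumed 1-octet encoding
    ).digest()
    return hashlib.sha1(perm_id + inner).digest()
```

Under these assumptions, two client groups holding different cookies compute unrelated descriptor IDs for the same service and time period, which is why a client without the cookie cannot locate the descriptor.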
Therefore, descriptors for different users or clients that ought to be stored on the same directory are delayed, so that only one descriptor is uploaded to a directory at a time. The remaining descriptors are uploaded with a delay of up to 30 seconds. Further, descriptors for different groups or users that are to be stored on different directories are delayed for a random time of up to 30 seconds to hide relations from colluding directories. Certainly, this does not prevent linking entirely, but it makes it somewhat harder. There is a conflict between hiding links between clients and making a service available in a timely manner. Although this part of the proposal is meant to describe a general infrastructure for authorization, changing the way of using the descriptor cookie to look up hidden service descriptors, e.g. applying some sort of asymmetric crypto system, would require in-depth changes that would be incompatible to v2 hidden service descriptors. On the contrary, using another key for en-/decrypting the introduction point part of a hidden service descriptor, e.g. a different symmetric key or asymmetric encryption, would be easy to implement and compatible to v2 hidden service descriptors as understood by hidden service directories (clients and services would have to be upgraded anyway for using the new features). An adversary could try to abuse the fact that introduction points can be encrypted by storing arbitrary, unrelated data in the hidden service directory. This abuse can be limited by setting a hard descriptor size limit, forcing the adversary to split data into multiple chunks. There are some limitations that make splitting data across multiple descriptors unattractive: 1) The adversary would not be able to choose descriptor IDs freely and would therefore have to implement his own indexing structure. 2) Validity of descriptors is limited to at most 24 hours after which descriptors need to be republished. 
The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data. A large descriptor with 7 introduction points and 5 kilobytes of authorization data would be 11724 bytes in size. The upper size limit of descriptors should be set to 20 kilobytes, which limits the effect of abuse while retaining enough flexibility in designing authorization protocols. 1.2. Client authorization at introduction point The next possible authorization point after downloading and decrypting a hidden service descriptor is the introduction point. It may be important for authorization, because it bears the last chance of hiding presence of a hidden service from unauthorized clients. Further, performing authorization at the introduction point might reduce traffic in the network, because unauthorized requests would not be passed to the hidden service. This applies to those clients who are aware of a descriptor cookie and thereby of the hidden service descriptor, but do not have authorization data to pass the introduction point or access the service (such a situation might occur when authorization data for authorization at the directory is not issued on a per-user basis, but authorization data for authorization at the introduction point is). It is important to note that the introduction point must be considered untrustworthy, and therefore cannot replace authorization at the hidden service itself. Nor should the introduction point learn any sensitive identifiable information from either the service or the client. In order to perform authorization at the introduction point, three message formats need to be modified: (1) v2 hidden service descriptors, (2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells. A v2 hidden service descriptor needs to contain authorization data that is introduction-point-specific and sometimes also authorization data that is introduction-point-independent. 
Therefore, v2 hidden service descriptors as specified in section 1.2 of rend-spec already contain two reserved fields "intro-authorization" and "service-authorization" (originally, the names of these fields were "...-authentication") containing an authorization type number and arbitrary authorization data. We propose that authorization data consists of base64 encoded objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and "-----END MESSAGE-----". This will increase the size of hidden service descriptors, but this is allowed since there is no strict upper limit. The current ESTABLISH_INTRO cells as described in section 1.3 of rend-spec do not contain either authorization data or version information. Therefore, we propose a new version 1 of the ESTABLISH_INTRO cells adding these two issues as follows: V Format byte: set to 255 [1 octet] V Version byte: set to 1 [1 octet] KL Key length [2 octets] PK Bob's public key [KL octets] HS Hash of session info [20 octets] AUTHT The auth type that is supported [1 octet] AUTHL Length of auth data [2 octets] AUTHD Auth data [variable] SIG Signature of above information [variable] From the format it is possible to determine the maximum allowed size for authorization data: given the fact that cells are 512 octets long, of which 498 octets are usable (see section 6.1 of tor-spec), and assuming 1024 bit = 128 octet long keys, there are 215 octets left for authorization data. Hence, authorization protocols are bound to use no more than these 215 octets, regardless of the number of clients that shall be authenticated at the introduction point. Otherwise, one would need to send multiple ESTABLISH_INTRO cells or split them up, which we do not specify here. In order to understand a v1 ESTABLISH_INTRO cell, the implementation of a relay must have a certain Tor version. Hidden services need to be able to distinguish relays being capable of understanding the new v1 cell formats and perform authorization. 
We propose to use the version number that is contained in networkstatus documents to find capable introduction points.

The current INTRODUCE1 cell as described in section 1.8 of rend-spec is not designed to carry authorization data and has no version number either. Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size, seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This makes it impossible to distinguish unversioned INTRODUCE1 cells from any later format. In particular, it is not possible to introduce some kind of format and version byte for newer versions of this cell. That's probably where the comment "[XXX011 want to put intro-level auth info here, but no version. crap. -RD]" that was part of rend-spec some time ago comes from.

We propose that new versioned INTRODUCE1 cells use the new cell type 41 RELAY_INTRODUCE1V (where V stands for versioned):

  Cleartext
    V      Version byte: set to 1            [1 octet]
    PK_ID  Identifier for Bob's PK           [20 octets]
    AUTHT  The auth type that is included    [1 octet]
    AUTHL  Length of auth data               [2 octets]
    AUTHD  Auth data                         [variable]
  Encrypted to Bob's PK:
    (RELAY_INTRODUCE2 cell)

The maximum length of contained authorization data depends on the length of the contained INTRODUCE2 cell. A calculation follows below when describing the INTRODUCE2 cell format we propose to use.

1.3. Client authorization at hidden service

The time when a hidden service receives an INTRODUCE2 cell constitutes the last possible authorization point during the hidden service protocol. Performing authorization here is easier than at the other two authorization points, because there are no possibly untrusted entities involved.

In general, a client that is successfully authorized at the introduction point should be granted access at the hidden service, too. Otherwise, the client would receive a positive INTRODUCE_ACK cell from the introduction point and conclude that it may connect to the service, but the request will be dropped without notice.
This would appear as a failure to clients. Therefore, the number of cases in which a client successfully passes the introduction point but fails at the hidden service should be zero. However, this does not lead to the conclusion that the authorization data used at the introduction point and the hidden service must be the same, but only that both authorization data should lead to the same authorization result.

Authorization data is transmitted from client to server via an INTRODUCE2 cell that is forwarded by the introduction point. There are versions 0 to 2 specified in section 1.8 of rend-spec, but none of these contain fields for carrying authorization data. We propose a slightly modified version of v3 INTRODUCE2 cells that is specified in section 1.8.1 and which is not implemented as of December 2007. In contrast to the specified v3 we avoid specifying (and implementing) IPv6 capabilities, because Tor relays will be required to support IPv4 addresses for a long time in the future, so that this seems unnecessary at the moment. The proposed format of v3 INTRODUCE2 cells is as follows:

     VER    Version byte: set to 3.               [1 octet]
     AUTHT  The auth type that is used            [1 octet]
     AUTHL  Length of auth data                   [2 octets]
     AUTHD  Auth data                             [variable]
     TS     Timestamp (seconds since 1-1-1970)    [4 octets]
     IP     Rendezvous point's address            [4 octets]
     PORT   Rendezvous point's OR port            [2 octets]
     ID     Rendezvous point identity ID          [20 octets]
     KLEN   Length of onion key                   [2 octets]
     KEY    Rendezvous point onion key            [KLEN octets]
     RC     Rendezvous cookie                     [20 octets]
     g^x    Diffie-Hellman data, part 1           [128 octets]

The maximum possible length of authorization data is related to the enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with 1024 bit = 128 octets long public key without any authorization data occupies 306 octets (AUTHL is only used when AUTHT has a value != 0), plus 58 octets for hybrid public key encryption (see section 5.1 of tor-spec on hybrid encryption of CREATE cells).
The surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110 of the 498 available octets free, which must be shared between authorization data to the introduction point _and_ to the hidden service.

When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has provided valid authorization data to him. He also requires that the timestamp is no more than 30 minutes in the past or future and that the first part of the Diffie-Hellman handshake has not been used in the past 60 minutes to prevent replay attacks by rogue introduction points. (The reason for not using the rendezvous cookie to detect replays---even though it is only sent once in the current design---is that it might be desirable to re-use rendezvous cookies for multiple introduction requests in the future.) If all checks pass, Bob builds a circuit to the provided rendezvous point. Otherwise he drops the cell.

1.4. Summary of authorization data fields

In summary, the proposed descriptor format and cell formats provide the following fields for carrying authorization data:

(1) The v2 hidden service descriptor contains:
    - a descriptor cookie that is used for the lookup process, and
    - an arbitrary encryption schema to ensure authorization to access introduction information (currently symmetric encryption with the descriptor cookie).

(2) For performing authorization at the introduction point we can use:
    - the fields intro-authorization and service-authorization in hidden service descriptors,
    - a maximum of 215 octets in the ESTABLISH_INTRO cell, and
    - one part of 110 octets in the INTRODUCE1V cell.

(3) For performing authorization at the hidden service we can use:
    - the fields intro-authorization and service-authorization in hidden service descriptors,
    - the other part of 110 octets in the INTRODUCE2 cell.

It will also still be possible to access a hidden service without any authorization or only use a part of the authorization infrastructure.
However, this requires considering all parts of the infrastructure. For example, authorization at the introduction point relying on confidential intro-authorization data transported in the hidden service descriptor cannot be performed without using an encryption schema for introduction information.

1.5. Managing authorization data at servers and clients

In order to provide authorization data at the hidden service and the authenticated clients, we propose to use files---either the Tor configuration file or separate files. The exact format of these special files depends on the authorization protocol used.

Currently, rend-spec contains the proposition to encode client-side authorization data in the URL, like in x.y.z.onion. This was never used and is also a bad idea, because in case of HTTP the requested URL may be contained in the Host and Referer fields.

1.6. Limitations for authorization protocols

There are two limitations of the current hidden service protocol for authorization protocols that shall be identified here.

  1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2 restrict the amount of data that can be used for authorization. This forces authorization protocols that require per-user authorization data at the introduction point to restrict the number of authorized clients artificially. A possible solution could be to split contents among multiple cells and reassemble them at the introduction points.

  2. The current hidden service protocol does not specify cell types to perform interactive authorization between client and introduction point or hidden service. If there should be an authorization protocol that requires interaction, new cell types would have to be defined and integrated into the hidden service protocol.

2. Specific authorization protocol instances

In the following we present two specific authorization protocols that make use of (parts of) the new authorization infrastructure:

  1.
The first protocol allows a service provider to restrict access to clients with a previously received secret key only, but does not attempt to hide service activity from others.

  2. The second protocol, albeit being feasible for a limited set of about 16 clients, performs client authorization and hides service activity from everyone but the authorized clients.

These two protocol instances extend the existing hidden service protocol version 2. Hidden services that perform client authorization may run in parallel to other services running versions 0, 2, or both.

2.1. Service with large-scale client authorization

The first client authorization protocol aims at performing access control while consuming as few additional resources as possible. A service provider should be able to permit access to a large number of clients while denying access for everyone else. However, the price for scalability is that the service won't be able to hide its activity from unauthorized or formerly authorized clients.

The main idea of this protocol is to encrypt the introduction-point part in hidden service descriptors to authorized clients using symmetric keys. This ensures that nobody else but authorized clients can learn which introduction points a service currently uses, nor can someone send a valid INTRODUCE1 message without knowing the introduction key. Therefore, a subsequent authorization at the introduction point is not required.

A service provider generates symmetric "descriptor cookies" for his clients and distributes them outside of Tor. The suggested key size is 128 bits, so that descriptor cookies can be encoded in 22 base64 chars (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the authorization type (here: "0") and allow a client to distinguish this authorization protocol from others like the one proposed below).
Typically, the contact information for a hidden service using this authorization protocol looks like this:

     v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz

When generating a hidden service descriptor, the service encrypts the introduction-point part with a single randomly generated symmetric 128-bit session key using AES-CTR as described for v2 hidden service descriptors in rend-spec. Afterwards, the service encrypts the session key to all descriptor cookies using AES. An authorized client should be able to efficiently find the session key that is encrypted for him/her; to this end, 4 octet long client IDs are generated from the descriptor cookie and the initialization vector. Descriptors always contain a number of encrypted session keys that is a multiple of 16 by adding fake entries. Encrypted session keys are ordered by client IDs in order to conceal addition or removal of authorized clients by the service provider.

     ATYPE  Authorization type: set to 1.                     [1 octet]
     ALEN   Number of clients := 1 + ((clients - 1) div 16)   [1 octet]
     for each symmetric descriptor cookie:
       ID   Client ID: H(descriptor cookie | IV)[:4]          [4 octets]
       SKEY Session key encrypted with descriptor cookie      [16 octets]
     (end of client-specific part)
     RND    Random data           [(15 - ((clients - 1) mod 16)) * 20 octets]
     IV     AES initialization vector                         [16 octets]
     IPOS   Intro points, encrypted with session key   [remaining octets]

An authorized client needs to configure Tor to use the descriptor cookie when accessing the hidden service. Therefore, a user adds the contact information that she received from the service provider to her torrc file. Upon downloading a hidden service descriptor, Tor finds the encrypted introduction-point part and attempts to decrypt it using the configured descriptor cookie. (In the rare event of two or more client IDs being equal a client tries to decrypt all of them.)
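The size arithmetic of the client-specific part above can be sketched as follows. This is only an illustration: the function names are hypothetical, H is assumed to be SHA-1 (as in rend-spec), and the AES encryption of the session key is left out.

```python
import hashlib

# Each client entry is a 4-octet ID followed by a 16-octet encrypted session key.
ENTRY_LEN = 4 + 16


def client_id(descriptor_cookie: bytes, iv: bytes) -> bytes:
    """ID := H(descriptor cookie | IV)[:4]; H is assumed to be SHA-1 here."""
    return hashlib.sha1(descriptor_cookie + iv).digest()[:4]


def alen(clients: int) -> int:
    """ALEN := 1 + ((clients - 1) div 16), for clients >= 1."""
    return 1 + ((clients - 1) // 16)


def rnd_octets(clients: int) -> int:
    """Length of RND: fake entries padding the list to a multiple of 16."""
    return (15 - ((clients - 1) % 16)) * ENTRY_LEN


def client_part_octets(clients: int) -> int:
    """Total length of real entries plus random padding."""
    return clients * ENTRY_LEN + rnd_octets(clients)
```

Under these assumptions, a service with 17 clients would publish ALEN = 2 and a client-specific part of 32 entries, so an observer only learns the client count rounded up to the next multiple of 16.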
Upon sending the introduction, the client includes her descriptor cookie as auth type "1" in the INTRODUCE2 cell that she sends to the service. The hidden service checks whether the included descriptor cookie is authorized to access the service and either responds to the introduction request, or not. 2.2. Authorization for limited number of clients A second, more sophisticated client authorization protocol goes the extra mile of hiding service activity from unauthorized clients. With all else being equal to the preceding authorization protocol, the second protocol publishes hidden service descriptors for each user separately and gets along with encrypting the introduction-point part of descriptors to a single client. This allows the service to stop publishing descriptors for removed clients. As long as a removed client cannot link descriptors issued for other clients to the service, it cannot derive service activity any more. The downside of this approach is limited scalability. Even though the distributed storage of descriptors (cf. proposal 114) tackles the problem of limited scalability to a certain extent, this protocol should not be used for services with more than 16 clients. (In fact, Tor should refuse to advertise services for more than this number of clients.) A hidden service generates an asymmetric "client key" and a symmetric "descriptor cookie" for each client. The client key is used as replacement for the service's permanent key, so that the service uses a different identity for each of his clients. The descriptor cookie is used to store descriptors at changing directory nodes that are unpredictable for anyone but service and client, to encrypt the introduction-point part, and to be included in INTRODUCE2 cells. Once the service has created client key and descriptor cookie, he tells them to the client outside of Tor. 
The contact information string looks similar to the one used by the preceding authorization protocol (with the only difference that it has "1" encoded as auth-type in the remaining 4 of 132 bits instead of "0" as before).

When creating a hidden service descriptor for an authorized client, the hidden service uses the client key and descriptor cookie to compute secret ID part and descriptor ID:

     secret-id-part = H(time-period | descriptor-cookie | replica)

     descriptor-id = H(client-key[:10] | secret-id-part)

The hidden service also replaces permanent-key in the descriptor with client-key and encrypts introduction-points with the descriptor cookie.

     ATYPE  Authorization type: set to 2.                 [1 octet]
     IV     AES initialization vector                     [16 octets]
     IPOS   Intro points, encr. with descriptor cookie    [remaining octets]

When uploading descriptors, the hidden service needs to make sure that descriptors for different clients are not uploaded at the same time (cf. Section 1.1) which is also a limiting factor for the number of clients.

When a client is requested to establish a connection to a hidden service it looks up whether it has any authorization data configured for that service. If the user has configured authorization data for authorization protocol "2", the descriptor ID is determined as described in the last paragraph. Upon receiving a descriptor, the client decrypts the introduction-point part using its descriptor cookie. Further, the client includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that it sends to the service.

2.3. Hidden service configuration

A hidden service that is meant to perform client authorization adds a new option HiddenServiceAuthorizeClient to its hidden service configuration.
This option contains the authorization type which is either "1" for the protocol described in 2.1 or "2" for the protocol in 2.2 and a comma-separated list of human-readable client names, so that Tor can create authorization data for these clients:

     HiddenServiceAuthorizeClient auth-type client-name,client-name,...

If this option is configured, HiddenServiceVersion is automatically reconfigured to contain only version numbers of 2 or higher.

Tor stores all generated authorization data for the authorization protocols described in Sections 2.1 and 2.2 in a new file using the following file format:

     "client-name" human-readable client identifier NL
     "descriptor-cookie" 128-bit key ^= 22 base64 chars NL

If the authorization protocol of Section 2.2 is used, Tor also generates and stores the following data:

     "client-key" NL a public key in PEM format

2.4. Client configuration

Clients need to make their authorization data known to Tor using another configuration option that contains a service name (mainly for the sake of convenience), the service address, and the descriptor cookie that is required to access a hidden service (the authorization protocol number is encoded in the descriptor cookie):

     HidServAuth service-name service-address descriptor-cookie

Security implications:

In the following we want to discuss possible attacks by dishonest entities in the presented infrastructure and specific protocol. These security implications would have to be verified once more when adding another protocol. The dishonest entities (theoretically) include the hidden service itself, the authenticated clients, hidden service directory nodes, introduction points, and rendezvous points. The relays that are part of circuits used during protocol execution, but never learn about the exchanged descriptors or cells by design, are not considered. Obviously, this list makes no claim to be complete.
The discussed attacks are sorted by the difficulty to perform them, in ascending order, starting with roles that everyone could attempt to take and ending with partially trusted entities abusing the trust put in them. (1) A hidden service directory could attempt to conclude presence of a service from the existence of a locally stored hidden service descriptor: This passive attack is possible only for a single client-service relation, because descriptors need to contain a publicly visible signature of the service using the client key. A possible protection would be to increase the number of hidden service directories in the network. (2) A hidden service directory could try to break the descriptor cookies of locally stored descriptors: This attack can be performed offline. The only useful countermeasure against it might be using safe passwords that are generated by Tor. [passwords? where did those come in? -RD] (3) An introduction point could try to identify the pseudonym of the hidden service on behalf of which it operates: This is impossible by design, because the service uses a fresh public key for every establishment of an introduction point (see proposal 114) and the introduction point receives a fresh introduction cookie, so that there is no identifiable information about the service that the introduction point could learn. The introduction point cannot even tell if client accesses belong to the same client or not, nor can it know the total number of authorized clients. The only information might be the pattern of anonymous client accesses, but that is hardly enough to reliably identify a specific service. (4) An introduction point could want to learn the identities of accessing clients: This is also impossible by design, because all clients use the same introduction cookie for authorization at the introduction point. (5) An introduction point could try to replay a correct INTRODUCE1 cell to other introduction points of the same service, e.g. 
in order to force the service to create a huge number of useless circuits: This attack is not possible by design, because INTRODUCE1 cells are encrypted using a freshly created introduction key that is only known to authorized clients. (6) An introduction point could attempt to replay a correct INTRODUCE2 cell to the hidden service, e.g. for the same reason as in the last attack: This attack is stopped by the fact that a service will drop INTRODUCE2 cells containing a DH handshake they have seen recently. (7) An introduction point could block client requests by sending either positive or negative INTRODUCE_ACK cells back to the client, but without forwarding INTRODUCE2 cells to the server: This attack is an annoyance for clients, because they might wait for a timeout to elapse until trying another introduction point. However, this attack is not introduced by performing authorization and it cannot be targeted towards a specific client. A countermeasure might be for the server to periodically perform introduction requests to his own service to see if introduction points are working correctly. (8) The rendezvous point could attempt to identify either server or client: This remains impossible as it was before, because the rendezvous cookie does not contain any identifiable information. (9) An authenticated client could swamp the server with valid INTRODUCE1 and INTRODUCE2 cells, e.g. in order to force the service to create useless circuits to rendezvous points; as opposed to an introduction point replaying the same INTRODUCE2 cell, a client could include a new rendezvous cookie for every request: The countermeasure for this attack is the restriction to 10 connection establishments per client per hour. Compatibility: An implementation of this proposal would require changes to hidden services and clients to process authorization data and encode and understand the new formats. 
However, both services and clients would remain compatible with regular hidden services without authorization.

Implementation:

The implementation of this proposal can be divided into a number of changes to hidden service and client side. There are no changes necessary on directory, introduction, or rendezvous nodes. All changes are marked with either [service] or [client] to denote on which side they need to be made.

/1/ Configure client authorization [service]

- Parse configuration option HiddenServiceAuthorizeClient containing authorized client names.
- Load previously created client keys and descriptor cookies.
- Generate missing client keys and descriptor cookies, add them to client_keys file.
- Rewrite the hostname file.
- Keep client keys and descriptor cookies of authorized clients in memory.
[- In case of reconfiguration, mark which client authorizations were added and whether any were removed. This can be used later when deciding whether to rebuild introduction points and publish new hidden service descriptors. Not implemented yet.]

/2/ Publish hidden service descriptors [service]

- Create and upload hidden service descriptors for all authorized clients.
[- See /1/ for the case of reconfiguration.]

/3/ Configure permission for hidden services [client]

- Parse configuration option HidServAuth containing service authorization, store authorization data in memory.

/5/ Fetch hidden service descriptors [client]

- Look up client authorization upon receiving a hidden service request.
- Request hidden service descriptor ID including client key and descriptor cookie. Only request v2 descriptors, no v0.

/6/ Process hidden service descriptor [client]

- Decrypt introduction points with descriptor cookie.

/7/ Create introduction request [client]

- Include descriptor cookie in INTRODUCE2 cell to introduction point.
- Pass descriptor cookie around between involved connections and circuits.

/8/ Process introduction request [service]

- Read descriptor cookie from INTRODUCE2 cell.
- Check whether descriptor cookie is authorized for access, including checking access counters.
- Log access for accountability.
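
As an illustration of step /8/, combined with the timestamp and replay rules from Section 1.3 (timestamp within 30 minutes, Diffie-Hellman part not seen in the past 60 minutes), the service-side check could look roughly like the sketch below. The class and parameter names are hypothetical, the access counters are omitted, and this is not the Tor implementation.

```python
import time

MAX_CLOCK_SKEW = 30 * 60   # timestamp may be at most 30 minutes off
REPLAY_WINDOW = 60 * 60    # g^x must not have been seen in the past 60 minutes


class IntroductionChecker:
    """Hypothetical helper deciding whether to act on an INTRODUCE2 cell."""

    def __init__(self, authorized_cookies):
        self.authorized = set(authorized_cookies)
        self.seen_dh = {}  # first DH part (g^x) -> time it was accepted

    def accept(self, cookie, timestamp, dh_part, now=None):
        now = time.time() if now is None else now
        # Forget handshakes that fell out of the replay window.
        self.seen_dh = {k: t for k, t in self.seen_dh.items()
                        if now - t < REPLAY_WINDOW}
        if cookie not in self.authorized:
            return False            # unknown descriptor cookie: drop
        if abs(now - timestamp) > MAX_CLOCK_SKEW:
            return False            # too far in the past or future: drop
        if dh_part in self.seen_dh:
            return False            # replayed DH handshake: drop
        self.seen_dh[dh_part] = now
        return True
```

A caller would build the circuit to the rendezvous point only when accept() returns True and silently drop the cell otherwise.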
Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
Implemented-In: 0.2.0.x

1. Overview:

Tor's directory authorities can give certain servers a "Named" flag in the network-status entry, when they want to bind that nickname to that identity key. This allows clients to specify a nickname rather than an identity fingerprint and still be certain they're getting the "right" server. As dir-spec.txt describes it,

  Name X is bound to identity Y if at least one binding directory lists it, and no directory binds X to some other Y'.

In practice, clients can refer to servers by nickname whether they are Named or not; if they refer to nicknames that aren't Named, a complaint shows up in the log asking them to use the identity key in the future --- but it still works.

The problem? Imagine a Tor server with nickname Bob. Bob and his identity fingerprint are registered in tor26's approved-routers file, but none of the other authorities registered him. Imagine there are several other unregistered servers also with nickname Bob ("the imposters").

While Bob is online, all is well: a) tor26 gives a Named flag to the real one, and refuses to list the other ones; and b) the other authorities list the imposters but don't give them a Named flag. Clients who have all the network-statuses can compute which one is the real Bob.

But when the real Bob disappears and his descriptor expires? tor26 continues to refuse to list any of the imposters, and the other authorities continue to list the imposters. Clients don't have any idea that there exists a Named Bob, so they can ask for server Bob and get one of the imposters. (A warning will also appear in their log, but so what.)

2. The stopgap solution:

tor26 should start accepting and listing the imposters, but it should assign them a new flag: "Unnamed".

This would produce three cases in terms of assigning flags in the consensus networkstatus:

  i) a router gets the Named flag in the v3 networkstatus if
     a) it's the only router with that nickname that has the Named flag out of all the votes, and
     b) no vote lists it as Unnamed
  else,
  ii) a router gets the Unnamed flag if
     a) some vote lists a different router with that nickname as Named, or
     b) at least one vote lists it as Unnamed, or
     c) there are other routers with the same nickname that are Unnamed
  else,
  iii) the router neither gets a Named nor an Unnamed flag.

(This whole proposal is meant only for v3 dir flags; we shouldn't try to backport it to the v2 dir world.)

Then client behavior is:

  a) If there's a Bob with a Named flag, pick that one.
  else b) If the Bobs don't have the Unnamed flag (notice that they should either all have it, or none), pick one of them and warn.
  else c) They all have the Unnamed flag -- no router found.

3. Problems not solved by this stopgap:

3.1. Naming authorities can go offline.

If tor26 is the only authority that provides a binding for Bob, when tor26 goes offline we're back in our previous situation -- the imposters can be referenced with a mere ignorable warning in the client's log.

If some other authority Names a different Bob, and tor26 goes offline, then that other Bob becomes the unique Named Bob.

So be it. We should try to solve these one day, but there's no clear way to do it that doesn't destroy usability in other ways, and if we want to get the Unnamed flag into v3 network statuses we should add it soon.

3.2. V3 dir spec magnifies brief discrepancies.

Another point to notice is if tor26 names Bob(1), doesn't know about Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag even if it should (and Bob(1) is not around).

Right now, in v2 dirs, the case where an authority doesn't know about a server but the other authorities do know is rare.
That's because authorities periodically ask for other networkstatuses and then fetch descriptors that are missing.

With v3, if that window occurs at the wrong time, it is extended for the entire period. We could solve this by making the voting more complex, but that doesn't seem worth it.

[3.3. Tor26 is only one tor26.

We need more naming authorities, possibly with some kind of auto-naming feature. This is out-of-scope for this proposal -NM]

4. Changes to the v2 directory

Previously, v2 authorities that had a binding for a server named Bob did not list any other server named Bob. This will change too:

Version 2 authorities will start listing all routers they know about, whether they conflict with a name-binding or not: Servers for which this authority has a binding will continue to be marked Named, additionally all other servers of that nickname will be listed without the Named flag (i.e. there will be no Unnamed flag in v2 status documents).

Clients already should handle having a named Bob alongside unnamed Bobs correctly, and having the unnamed Bobs in the status file even without the named server is no worse than the current status quo where clients learn about those servers from other authorities.

The benefit of this is that an authority's opinion on a server like Guard, Stable, Fast etc. can now be learned by clients even if that specific authority has reserved that server's name for somebody else.

5. Other benefits:

This new flag will allow people to operate servers that happen to have the same nickname as somebody who registered their server two years ago and left soon after. Right now there are dozens of nicknames that are registered on all three binding directory authorities, yet haven't been running for years. While it's bad that these nicknames are effectively blacklisted from the network, the really bad part is that this logic is really unintuitive to prospective new server operators.
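
To make the client behavior from section 2 concrete, here is a minimal sketch of the three-way choice (Named, neither flag, all Unnamed). The data layout (a list of dicts with "nickname" and "flags" keys) and the function name are assumptions made for this example only; "pick one of them" is simplified to taking the first match.

```python
def pick_router(nickname, routers):
    """Return (router, warn) for a nickname lookup in a v3 consensus.

    routers: iterable of dicts with "nickname" and "flags" (a set of
    flag names); this layout is hypothetical.
    """
    candidates = [r for r in routers if r["nickname"] == nickname]

    # a) A router carrying the Named flag wins outright.
    named = [r for r in candidates if "Named" in r["flags"]]
    if named:
        return named[0], False

    # b) No Named router: use one without the Unnamed flag, but warn.
    usable = [r for r in candidates if "Unnamed" not in r["flags"]]
    if usable:
        return usable[0], True

    # c) Every candidate is Unnamed (or none exists): no router found.
    return None, False
```

For instance, if the consensus lists two Bobs and both carry Unnamed, a lookup for "Bob" yields no router, which is the case that closes the impostor hole described in section 1.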
Filename: 123-autonaming.txt
Title: Naming authorities automatically create bindings
Author: Peter Palfrader
Created: 2007-10-11
Status: Closed
Implemented-In: 0.2.0.x

Overview:

Tor's directory authorities can give certain servers a "Named" flag in the network-status entry, when they want to bind that nickname to that identity key. This allows clients to specify a nickname rather than an identity fingerprint and still be certain they're getting the "right" server.

Authority operators name a server by adding their nickname and identity fingerprint to the 'approved-routers' file. Historically being listed in the file was required for a router, at first for being listed in the directory at all, and later in order to be used by clients as a first or last hop of a circuit.

Adding identities to the list of named routers so far has been a manual, time consuming, and boring job. Given that and the fact that the Tor network works just fine without named routers the last authority to keep a current binding list stopped updating it well over half a year ago.

Naming, if it were done, would serve a useful purpose however in that users can have a reasonable expectation that the exit server Bob they are using in their http://www.google.com.bob.exit/ URL is the same Bob every time.

Proposal:

I propose that identity<->name binding be completely automated:

New bindings should be added after the router has been around for a bit and their name has not been used by other routers, similarly names that have not appeared on the network for a long time should be freed in case a new router wants to use it.

The following rules are suggested:

  i) If a named router has not been online for half a year, the identity<->name binding for that name is removed. The nickname is free to be taken by other routers now.
  ii) If a router claims a certain nickname and
     a) has been on the network for at least two weeks, and
     b) that nickname is not yet linked to a different router, and
     c) no other router has wanted that nickname in the last month,
     a new binding should be created for this router and its desired nickname.

This automaton does not necessarily need to live in the Tor code, it can do its job just as well when it's an external tool.
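
Since the rules above could live in an external tool, a small sketch may help to show how they compose. Everything here is an assumption for illustration: the function names, the data layout, and the concrete durations chosen for "half a year" (182 days) and "a month" (30 days).

```python
from datetime import datetime, timedelta

HALF_YEAR = timedelta(days=182)  # assumed reading of "half a year"
TWO_WEEKS = timedelta(weeks=2)
ONE_MONTH = timedelta(days=30)   # assumed reading of "the last month"


def should_release(last_seen: datetime, now: datetime) -> bool:
    """Rule i): drop the binding of a router unseen for half a year."""
    return now - last_seen >= HALF_YEAR


def may_bind(identity, nickname, first_seen, bindings, claims, now) -> bool:
    """Rule ii): decide whether identity may be bound to nickname.

    bindings: dict nickname -> bound identity
    claims:   list of (identity, time the nickname was last claimed)
    """
    if now - first_seen < TWO_WEEKS:
        return False                      # a) not on the network long enough
    bound = bindings.get(nickname)
    if bound is not None and bound != identity:
        return False                      # b) name already linked elsewhere
    for other, when in claims:
        if other != identity and now - when <= ONE_MONTH:
            return False                  # c) a rival wanted the name recently
    return True
```

Running rule i) before rule ii) on each pass lets a nickname freed by a long-gone router be picked up by a newcomer in the same run.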
Filename: 124-tls-certificates.txt
Title: Blocking resistant TLS certificate usage
Author: Steven J. Murdoch
Created: 2007-10-25
Status: Superseded

Overview:

To be less distinguishable from HTTPS web browsing, only Tor servers should present TLS certificates. This should be done whilst maintaining backwards compatibility with Tor nodes which present and expect client certificates, and while preserving existing security properties. This specification describes the negotiation protocol, what certificates should be presented during the TLS negotiation, and how to move the client authentication within the encrypted tunnel.

Motivation:

In Tor's current TLS [1] handshake, both client and server present a two-certificate chain. Since TLS performs authentication prior to establishing the encrypted tunnel, the contents of these certificates are visible to an eavesdropper. In contrast, during normal HTTPS web browsing, the server presents a single certificate, signed by a root CA and the client presents no certificate. Hence it is possible to distinguish Tor from HTTPS by identifying this pattern.

To resist blocking based on traffic identification, Tor should behave as close to HTTPS as possible, i.e. servers should offer a single certificate and not request a client certificate; clients should present no certificate. This presents two difficulties: clients are no longer authenticated and servers are authenticated by the connection key, rather than identity key. The link protocol must thus be modified to preserve the old security semantics.

Finally, in order to maintain backwards compatibility, servers must correctly identify whether the client supports the modified certificate handling. This is achieved by modifying the cipher suites that clients advertise support for. These cipher suites are selected to be similar to those chosen by web browsers, in order to resist blocking based on client hello.

Terminology:

  Initiator: OP or OR which initiates a TLS connection ("client" in TLS terminology)

  Responder: OR which receives an incoming TLS connection ("server" in TLS terminology)

Version negotiation and cipher suite selection:

In the modified TLS handshake, the responder does not request a certificate from the initiator. This request would normally occur immediately after the responder receives the client hello (the first message in a TLS handshake) and so the responder must decide whether to request a certificate based only on the information in the client hello. This is achieved by examining the cipher suites in the client hello.

List 1: cipher suites lists offered by version 0/1 Tor

  From src/common/tortls.c, revision 12086:
    TLS1_TXT_DHE_RSA_WITH_AES_128_SHA
    TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
    SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA

Client hello sent by initiator:

Initiators supporting version 2 of the Tor connection protocol MUST offer a different cipher suite list from those sent by pre-version 2 Tors, contained in List 1. To maintain compatibility with older Tor versions and common browsers, the cipher suite list MUST include support for:

    TLS_DHE_RSA_WITH_AES_256_CBC_SHA
    TLS_DHE_RSA_WITH_AES_128_CBC_SHA
    SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
    SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA

Client hello received by responder/server hello sent by responder:

Responders supporting version 2 of the Tor connection protocol should compare the cipher suite list in the client hello with those in List 1. If it matches any in the list then the responder should assume that the initiator supports version 1, and thus should maintain the version 1 behavior, i.e. send a two-certificate chain, request a client certificate and not send or expect a VERSIONS cell [2].

Otherwise, the responder should assume version 2 behavior and select a cipher suite following TLS [1] behavior, i.e. select the first entry from the client hello cipher list which is acceptable.
Responders MUST NOT select any suite that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits, or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT allow other SSLv3 ciphersuites.

Should no mutually acceptable cipher suite be found, the connection MUST be closed.

If the responder is implementing version 2 of the connection protocol it SHOULD send a server certificate with random contents. The organizationName field MUST NOT be "Tor", "TOR" or "t o r".

Server certificate received by initiator:

If the server certificate has an organizationName of "Tor", "TOR" or "t o r", the initiator should assume that the responder does not support version 2 of the connection protocol. In that case the initiator should respond following version 1, i.e. send a two-certificate client chain and not send or expect a VERSIONS cell.

[SJM: We could also use the fact that a client certificate request was sent]

If the server hello contains a ciphersuite which does not comply with the key length requirements above, even if it was one offered in the client hello, the connection MUST be closed. This will only occur if the responder is not a Tor server.

Backward compatibility:

  v1 Initiator, v1 Responder: No change
  v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello
  v2 Initiator, v1 Responder: Responder accepts v2 client hello. Initiator detects v1 server certificate and continues with v1 protocol
  v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator detects v2 server certificate and continues with v2 protocol.

Additional link authentication process:

Following VERSION and NETINFO negotiation, both responder and initiator MUST send a certification chain in a CERT cell. If one party does not have a certificate, the CERT cell MUST still be sent, but with a length of zero.
A CERT cell is a variable length cell, of the format

    CircID      [2 bytes]
    Command     [1 byte]
    Length      [2 bytes]
    Payload     [<length> bytes]

  CircID MUST be set to 0x0000
  Command is [SJM: TODO]
  Length is the length of the payload
  Payload contains 0 or more certificates, each is of the format:

    Cert_Length [2 bytes]
    Certificate [<cert_length> bytes]

Each certificate MUST sign the one preceding it. The initiator MUST place its connection certificate first; the responder, having already sent its connection certificate as part of the TLS handshake, MUST place its identity certificate first.

Initiators who send a CERT cell MUST follow that with a LINK_AUTH cell to prove that they possess the corresponding private key.

A LINK_AUTH cell is fixed-length, of the format:

    CircID                          [2 bytes]
    Command                         [1 byte]
    Length                          [2 bytes]
    Payload (padded with 0 bytes)   [PAYLOAD_LEN - 2 bytes]

  CircID MUST be set to 0x0000
  Command is [SJM: TODO]
  Length is the valid portion of the payload
  Payload is of the format:

    Signature version   [1 byte]
    Signature           [<length> - 1 bytes]
    Padding             [PAYLOAD_LEN - <length> - 2 bytes]

  Signature version: Identifies the type of signature, currently 0x00

  Signature: Digital signature under the initiator's connection key of the following item, in PKCS #1 block type 1 [3] format:

    HMAC-SHA1, using the TLS master secret as key, of the following elements concatenated:
    - The signature version (0x00)
    - The NUL terminated ASCII string: "Tor initiator certificate verification"
    - client_random, as sent in the Client Hello
    - server_random, as sent in the Server Hello
    - SHA-1 hash of the initiator connection certificate
    - SHA-1 hash of the responder connection certificate

Security checks:

- Before sending a LINK_AUTH cell, a node MUST ensure that the TLS connection is authenticated by the responder key.
- For the handshake to have succeeded, the initiator MUST confirm:
  - That the TLS handshake was authenticated by the responder connection key
  - That the responder connection key was signed by the first certificate in the CERT cell
  - That each certificate in the CERT cell was signed by the following certificate, with the exception of the last
  - That the last certificate in the CERT cell is the expected identity certificate for the node being connected to

- For the handshake to have succeeded, the responder MUST confirm either:
  A) - A zero length CERT cell was sent and no LINK_AUTH cell was sent
     In which case the responder shall treat the identity of the initiator as unknown
  or
  B) - That the LINK_AUTH MAC contains a signature by the first certificate in the CERT cell
     - That the MAC signed matches the expected value
     - That each certificate in the CERT cell was signed by the following certificate, with the exception of the last
     In which case the responder shall treat the identity of the initiator as that of the last certificate in the CERT cell

Protocol summary:

  1. I(nitiator) <-> R(esponder): TLS handshake, including responder authentication under connection certificate R_c
  2. I <-> R: VERSION and NETINFO negotiation
  3. R -> I: CERT (Responder identity certificate R_i (which signs R_c))
  4. I -> R: CERT (Initiator connection certificate I_c, Initiator identity certificate I_i (which signs I_c))
  5. I -> R: LINK_AUTH (Signature, under I_c, of HMAC-SHA1(master_secret, "Tor initiator certificate verification" || client_random || server_random || I_c hash || R_c hash))

Notes:

  I -> R doesn't need to wait for R_i before sending its own messages (reduces round-trips).
  Certificate hash is calculated like identity hash in CREATE cells.
  Initiator signature is calculated in a similar way to Certificate Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7).
  If I is an OP, a zero length certificate chain may be sent in step 4, in which case step 5 is not performed.

Rationale:

- Version and netinfo negotiation before authentication: The version cell needs to come before the rest of the protocol, since we may choose to alter the rest at some later point, e.g. switch to a different MAC/signature scheme. It is useful to keep the NETINFO and VERSION cells close to each other, since the time between them is used to check if there is a delay-attack. Still, a server might want to not act on NETINFO data from an initiator until the authentication is complete.

Appendix A: Cipher suite choices

This specification intentionally does not put any constraints on the TLS ciphersuite lists presented by clients, other than a minimum required for compatibility. However, to maximize blocking resistance, ciphersuite lists should be carefully selected.

Recommended client ciphersuite list

  Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h

  0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
  0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
  0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA
  0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA
  0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA
  0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA
  0x0035: TLS_RSA_WITH_AES_256_CBC_SHA
  0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
  0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
  0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA
  0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
  0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA
  0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA
  0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA
  0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
  0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA
  0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
  0x0004: SSL_RSA_WITH_RC4_128_MD5
  0x0005: SSL_RSA_WITH_RC4_128_SHA
  0x002f: TLS_RSA_WITH_AES_128_CBC_SHA
  0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA
  0xc012: TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
  0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
  0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
  0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA
  0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
  0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC)
  0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA

  Order specified in: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47

Recommended options:

  0x0000: Server Name Indication [4]
  0x000a: Supported Elliptic Curves [5]
  0x000b: Supported Point Formats [5]

Recommended compression:

  0x00

Recommended server ciphersuite selection:

The responder should select the first entry in this list which is listed in the client hello:

  0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA   [ Common Firefox choice ]
  0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA   [ Tor v1 default ]
  0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA  [ Tor v1 fallback ]
  0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA  [ Valid IE option ]

References:

  [1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF
  [2] Version negotiation for the Tor protocol, Tor proposal 105
  [3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1: RSA Cryptography Specifications Version 1.5", RFC 2313, March 1998.
  [4] TLS Extensions, RFC 3546
  [5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS)
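As a rough illustration of the CERT cell layout and the LINK_AUTH MAC input described in proposal 124 above, here is a minimal Python sketch. It assumes network (big-endian) byte order for the length fields and hashes the raw certificate bytes with SHA-1; the proposal text does not pin down either detail. The Command values are still marked TODO in the proposal, so `command` is left as a caller-supplied parameter, and the function names are hypothetical.

```python
import hashlib
import hmac
import struct

# NUL-terminated ASCII label and signature version, as listed in the proposal.
LABEL = b"Tor initiator certificate verification\x00"
SIGNATURE_VERSION = b"\x00"


def pack_cert_cell(command, certs):
    """Build a CERT cell: CircID (2) | Command (1) | Length (2) | Payload.

    `command` is a placeholder: the proposal leaves its value as TODO.
    """
    # Each certificate is prefixed with its own 2-byte length.
    payload = b"".join(struct.pack("!H", len(cert)) + cert for cert in certs)
    # CircID is always 0x0000. Big-endian encoding is an assumption.
    return struct.pack("!HBH", 0x0000, command, len(payload)) + payload


def link_auth_mac(master_secret, client_random, server_random,
                  initiator_cert, responder_cert):
    """HMAC-SHA1, keyed with the TLS master secret, over the listed elements."""
    message = (
        SIGNATURE_VERSION
        + LABEL
        + client_random
        + server_random
        # Assumption: the certificate hash is SHA-1 over the encoded certificate.
        + hashlib.sha1(initiator_cert).digest()
        + hashlib.sha1(responder_cert).digest()
    )
    return hmac.new(master_secret, message, hashlib.sha1).digest()
```

A party with no certificate would call `pack_cert_cell(command, [])`, which produces a cell with a zero-length payload. The MAC returned by `link_auth_mac` would then be signed under the initiator's connection key; the signing step is omitted here.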
Filename: 125-bridges.txt Title: Behavior for bridge users, bridge relays, and bridge authorities Author: Roger Dingledine Created: 11-Nov-2007 Status: Closed Implemented-In: 0.2.0.x 0. Preface This document describes the design decisions around support for bridge users, bridge relays, and bridge authorities. It acts as an overview of the bridge design and deployment for developers, and it also tries to point out limitations in the current design and implementation. For more details on what all of these mean, look at blocking.tex in /doc/design-paper/ 1. Bridge relays Bridge relays are just like normal Tor relays except they don't publish their server descriptors to the main directory authorities. 1.1. PublishServerDescriptor To configure your relay to be a bridge relay, just add BridgeRelay 1 PublishServerDescriptor bridge to your torrc. This will cause your relay to publish its descriptor to the bridge authorities rather than to the default authorities. Alternatively, you can say BridgeRelay 1 PublishServerDescriptor 0 which will cause your relay to not publish anywhere. This could be useful for private bridges. 1.2. Exit policy Bridge relays should use an exit policy of "reject *:*". This is because they only need to relay traffic between the bridge users and the rest of the Tor network, so there's no need to let people exit directly from them. 1.3. RelayBandwidthRate / RelayBandwidthBurst We invented the RelayBandwidth* options for this situation: Tor clients who want to allow relaying too. See proposal 111 for details. Relay operators should feel free to rate-limit their relayed traffic. 1.4. Helping the user with port forwarding, NAT, etc. Just as for operating normal relays, our documentation and hints for how to make your ORPort reachable are inadequate for normal users. We need to work harder on this step, perhaps in 0.2.2.x. 1.5. 
Vidalia integration Vidalia has turned its "Relay" settings page into a tri-state "Don't relay" / "Relay for the Tor network" / "Help censored users". If you click the third choice, it forces your exit policy to reject *:*. If all the bridges end up on port 9001, that's not so good. On the other hand, putting the bridges on a low-numbered port in the Unix world requires jumping through extra hoops. The current compromise is that Vidalia makes the ORPort default to 443 on Windows, and 9001 on other platforms. At the bottom of the relay config settings window, Vidalia displays the bridge identifier to the operator (see Section 3.1) so he can pass it on to bridge users. 1.6. What if the default ORPort is already used? If the user already has a webserver or some other application bound to port 443, then Tor will fail to bind it and complain to the user, probably in a cryptic way. Rather than just working on a better error message (though we should do this), we should consider an "ORPort auto" option that tells Tor to try to find something that's bindable and reachable. This would also help us tolerate ISPs that filter incoming connections on port 80 and port 443. But this should be a different proposal, and can wait until 0.2.2.x. 2. Bridge authorities. Bridge authorities are like normal directory authorities, except they don't create their own network-status documents or votes. So if you ask an authority for a network-status document or consensus, they behave like a directory mirror: they give you one from one of the main authorities. But if you ask the bridge authority for the descriptor corresponding to a particular identity fingerprint, it will happily give you the latest descriptor for that fingerprint. To become a bridge authority, add these lines to your torrc: AuthoritativeDirectory 1 BridgeAuthoritativeDir 1 Right now there's one bridge authority, running on the Tonga relay. 2.1. 
Exporting bridge-purpose descriptors We've added a new purpose for server descriptors: the "bridge" purpose. With the new router-descriptors file format that includes annotations, it's easy to look through it and find the bridge-purpose descriptors. Currently we export the bridge descriptors from Tonga to the BridgeDB server, so it can give them out according to the policies in blocking.pdf. 2.2. Reachability/uptime testing Right now the bridge authorities do active reachability testing of bridges, so we know which ones to recommend for users. But in the design document, we suggested that bridges should publish anonymously (i.e. via Tor) to the bridge authority, so somebody watching the bridge authority can't just enumerate all the bridges. But if we're doing active measurement, the game is up. Perhaps we should back off on this goal, or perhaps we should do our active measurement anonymously? Answering this issue is scheduled for 0.2.1.x. 2.3. Migrating to multiple bridge authorities Having only one bridge authority is both a trust bottleneck (if you break into one place you learn about every single bridge we've got) and a robustness bottleneck (when it's down, bridge users become sad). Right now if we put up a second bridge authority, all the bridges would publish to it, and (assuming the code works) bridge users would query a random bridge authority. This resolves the robustness bottleneck, but makes the trust bottleneck even worse. In 0.2.2.x and later we should think about better ways to have multiple bridge authorities. 3. Bridge users. Bridge users are like ordinary Tor users except they use encrypted directory connections by default, and they use bridge relays as both entry guards (their first hop) and directory guards (the source of all their directory information). To become a bridge user, add the following line to your torrc: UseBridges 1 and then add at least one "Bridge" line to your torrc based on the format below. 3.1. 
Format of the bridge identifier. The canonical format for a bridge identifier contains an IP address, an ORPort, and an identity fingerprint: bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 However, the identity fingerprint can be left out, in which case the bridge user will connect to that relay and use it as a bridge regardless of what identity key it presents: bridge 128.31.0.34:9009 This might be useful for cases where only short bridge identifiers can be communicated to bridge users. In a future version we may also support bridge identifiers that are only a key fingerprint: bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 and the bridge user can fetch the latest descriptor from the bridge authority (see Section 3.4). 3.2. Bridges as entry guards For now, bridge users add their bridge relays to their list of "entry guards" (see path-spec.txt for background on entry guards). They are managed by the entry guard algorithms exactly as if they were a normal entry guard -- their keys and timing get cached in the "state" file, etc. This means that when the Tor user starts up with "UseBridges" disabled, he will skip past the bridge entries since they won't be listed as up and usable in his networkstatus consensus. But to be clear, the "entry_guards" list doesn't currently distinguish guards by purpose. Internally, each bridge user keeps a smartlist of "bridge_info_t" that reflects the "bridge" lines from his torrc along with a download schedule (see Section 3.5 below). When he starts Tor, he attempts to fetch a descriptor for each configured bridge (see Section 3.4 below). When he succeeds at getting a descriptor for one of the bridges in his list, he adds it directly to the entry guard list using the normal add_an_entry_guard() interface. 
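To make the bridge identifier format from Section 3.1 concrete, the following Python sketch parses the two currently supported forms (address and ORPort, with or without a fingerprint). It is not Tor's own parser: `BridgeLine` and `parse_bridge_line` are hypothetical names, and the fingerprint-only form mentioned as a possible future extension is deliberately rejected.

```python
import re
from typing import NamedTuple, Optional


class BridgeLine(NamedTuple):
    address: str
    port: int
    fingerprint: Optional[str]  # 40 hex digits, or None if left out


def parse_bridge_line(line):
    """Parse 'bridge IP:ORPort [fingerprint]' into a BridgeLine."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0].lower() != "bridge":
        raise ValueError("not a bridge line")
    address, sep, port_text = tokens[1].rpartition(":")
    if not sep or not address or not port_text.isdecimal():
        raise ValueError("expected IP:ORPort")
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError("ORPort out of range")
    # The fingerprint may be written in space-separated groups; join them.
    fingerprint = "".join(tokens[2:]).upper() or None
    if fingerprint is not None and not re.fullmatch(r"[0-9A-F]{40}", fingerprint):
        raise ValueError("fingerprint must be 40 hex digits")
    return BridgeLine(address, port, fingerprint)
```

When the fingerprint is `None`, the caller would accept whatever identity key the relay at that address presents, as described in Section 3.1.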
Once a bridge descriptor has been added, should_delay_dir_fetches() will stop delaying further directory fetches, and the user begins to bootstrap his directory information from that bridge (see Section 3.3). Currently bridge users cache their bridge descriptors to the "cached-descriptors" file (annotated with purpose "bridge"), but they don't make any attempt to reuse descriptors they find in this file. The theory is that either the bridge is available now, in which case you can get a fresh descriptor, or it's not, in which case an old descriptor won't do you much good. We could disable writing out the bridge lines to the state file, if we think this is a problem. As an exception, if we get an application request when we have one or more bridge descriptors but we believe none of them are running, we mark them all as running again. This is similar to the exception already in place to help long-idle Tor clients realize they should fetch fresh directory information rather than just refuse requests. 3.3. Bridges as directory guards In addition to using bridges as the first hop in their circuits, bridge users also use them to fetch directory updates. Other than initial bootstrapping to find a working bridge descriptor (see Section 3.4 below), all further non-anonymized directory fetches will be redirected to the bridge. This means that bridge relays need to have cached answers for all questions the bridge user might ask. This makes the upgrade path tricky --- for example, if we migrate to a v4 directory design, the bridge user would need to keep using v3 so long as his bridge relays only knew how to answer v3 queries. In a future design, for cases where the user has enough information to build circuits yet the chosen bridge doesn't know how to answer a given query, we might teach bridge users to make an anonymized request to a more suitable directory server. 3.4. 
How bridge users get their bridge descriptor Bridge users can fetch bridge descriptors in two ways: by going directly to the bridge and asking for "/tor/server/authority", or by going to the bridge authority and asking for "/tor/server/fp/ID". By default, they will only try the direct queries. If the user sets UpdateBridgesFromAuthority 1 in his config file, then he will try querying the bridge authority first for bridges where he knows a digest (if he only knows an IP address and ORPort, then his only option is a direct query). If the user has at least one working bridge, then he will do further queries to the bridge authority through a full three-hop Tor circuit. But when bootstrapping, he will make a direct begin_dir-style connection to the bridge authority. As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor from the bridge authority and it returns a 404 not found, the user will automatically fall back to trying a direct query. Therefore it is recommended that bridge users always set UpdateBridgesFromAuthority, since at worst it will delay their fetches a little bit and notify the bridge authority of the identity fingerprint (but not location) of their intended bridges. 3.5. Bridge descriptor retry schedule Bridge users try to fetch a descriptor for each bridge (using the steps in Section 3.4 above) on startup. Whenever they receive a bridge descriptor, they reschedule a new descriptor download for 1 hour from then. If on the other hand it fails, they try again after 15 minutes for the first attempt, after 15 minutes for the second attempt, and after 60 minutes for subsequent attempts. In 0.2.2.x we should come up with some smarter retry schedules. 3.6. Vidalia integration Vidalia 0.0.16 has a checkbox in its Network config window called "My ISP blocks connections to the Tor network." Users who click that box change their configuration to: UseBridges 1 UpdateBridgesFromAuthority 1 and should specify at least one Bridge identifier. 3.7. 
Do we need a second layer of entry guards? If the bridge user uses the bridge as its entry guard, then the triangulation attacks from Lasse and Paul's Oakland paper work to locate the user's bridge(s). Worse, this is another way to enumerate bridges: if the bridge users keep rotating through second hops, then if you run a few fast servers (and avoid getting considered an Exit or a Guard) you'll quickly get a list of the bridges in active use. That's probably the strongest reason why bridge users will need to pick second-layer guards. Would this mean bridge users should switch to four-hop circuits? We should figure this out in the 0.2.1.x timeframe.
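The descriptor retry schedule from Section 3.5 can be written down as a small function. This is a sketch under one reading of that section: the first two consecutive failures each wait 15 minutes, later failures wait 60 minutes, and a success schedules a refresh one hour later. The names below are hypothetical and do not correspond to identifiers in the Tor source.

```python
from datetime import timedelta

SUCCESS_REFETCH = timedelta(hours=1)
FAILURE_DELAYS = (timedelta(minutes=15), timedelta(minutes=15))
LATER_FAILURE_DELAY = timedelta(minutes=60)


def next_fetch_delay(consecutive_failures):
    """Return the wait before the next bridge descriptor fetch.

    consecutive_failures == 0 means the most recent fetch succeeded.
    """
    if consecutive_failures < 0:
        raise ValueError("failure count cannot be negative")
    if consecutive_failures == 0:
        return SUCCESS_REFETCH
    if consecutive_failures <= len(FAILURE_DELAYS):
        return FAILURE_DELAYS[consecutive_failures - 1]
    return LATER_FAILURE_DELAY
```

Keeping the delays in a tuple makes it easy to experiment with the smarter schedules that Section 3.5 leaves for later work.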
Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
Implemented-In: 0.2.0.x

0. Status

In 0.2.0.x, this proposal is implemented to the extent needed to address its motivations. See notes below with the text "RESOLUTION" for details.

1. Background and motivation

Right now we can keep a rough count of Tor users, both total and by country, by watching connections to a single directory mirror. Being able to get usage estimates is useful both for our funders (to demonstrate progress) and for our own development (so we know how quickly we're scaling and can design accordingly, and so we know which countries and communities to focus on more). This need for information is the only reason we haven't deployed "directory guards" (think of them like entry guards but for directory information; in practice, it would seem that Tor clients should simply use their entry guards as their directory guards; see also proposal 125).

With the move toward bridges, we will no longer be able to track Tor clients that use bridges, since they use their bridges as directory guards. Further, we need to be able to learn which bridges stop seeing use from certain countries (and are thus likely blocked), so we can avoid giving them out to other users in those countries.

Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays and circuits on its 'network map', and it performs anonymized GeoIP lookups to its central servers to know where to put the dots. Vidalia caches answers it gets -- to reduce delay, to reduce overhead on the network, and to reduce anonymity issues where users reveal their knowledge about the network through which IP addresses they ask about.

But with the advent of bridges, Tor clients are asking about IP addresses that aren't in the main directory.
In particular, bridge users inform the central Vidalia servers about each bridge as they discover it and their Vidalia tries to map it. Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's own IP address, so it can provide a more useful map. Finally, Vidalia's central servers leave users open to partitioning attacks, even if they can't target specific users. Further, as we start using GeoIP results for more operational or security-relevant goals, such as avoiding or including particular countries in circuits, it becomes more important that users can't be singled out in terms of their IP-to-country mapping beliefs. 2. The available GeoIP databases There are at least two classes of GeoIP database out there: "IP to country", which tells us the country code for the IP address but no more details, and "IP to city", which tells us the country code, the name of the city, and some basic latitude/longitude guesses. A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252 bytes. A typical line is: "205500992","208605279","US","USA","UNITED STATES" http://ip-to-country.webhosting.info/node/view/5 Similarly, the maxmind GeoLite Country database is also about 500KB compressed. http://www.maxmind.com/app/geolitecountry The maxmind GeoLite City database gives more finegrained detail like geo coordinates and city name. Vidalia currently makes use of this information. On the other hand it's 16MB compressed. A typical line is: 206.124.149.146,Bellevue,WA,US,47.6051,-122.1134 http://www.maxmind.com/app/geolitecity There are other databases out there, like http://www.hostip.info/faq.html http://www.webconfs.com/ip-to-city.php that want more attention, but for now let's assume that all the db's are around this size. 3. What we'd like to solve Goal #1a: Tor relays collect IP-to-country user stats and publish sanitized versions. Goal #1b: Tor bridges collect IP-to-country user stats and publish sanitized versions. 
Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better mapping. Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user can pick countries for her paths. Goal #3: Vidalia doesn't do external lookups on bridge relay addresses. Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city for better mapping. Goal #5: Reduce partitioning opportunities where Vidalia central servers can give different (distinguishing) responses. 4. Solution overview Our goal is to allow Tor relays, bridges, and clients to learn enough GeoIP information so they can do local private queries. 4.1. The IP-to-country db Directory authorities should publish a "geoip" file that contains IP-to-country mappings. Directory caches will mirror it, and Tor clients and relays (including bridge relays) will fetch it. Thus we can solve goals 1a and 1b (publish sanitized usage info). Controllers could also use this to solve goal 2b (choosing path by country attributes). It also solves goal 4 (learning the Tor client's country), though for huge countries like the US we'd still need to decide where the "middle" should be when we're mapping that address. The IP-to-country details are described further in Sections 5 and 6 below. [RESOLUTION: The geoip file in 0.2.0.x is not distributed through Tor. Instead, it is shipped with the bundle.] 4.2. The IP-to-city db In an ideal world, the IP-to-city db would be small enough that we could distribute it in the above manner too. But for now, it is too large. Here's where the design choice forks. Option A: Vidalia should continue doing its anonymized IP-to-city queries. Thus we can achieve goals 2a and 2b. We would solve goal 3 by only doing lookups on descriptors that are purpose "general" (see Section 4.2.1 for how). We would leave goal 5 unsolved. Option B: Each directory authority should keep an IP-to-city db, lookup the value for each router it lists, and include that line in the router's network-status entry. 
The network-status consensus would then use the line that appears in the majority of votes. This approach also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups at all now), and goal 5 (reduced partitioning risks). Option B has the advantage that Vidalia can simplify its operation, and the advantage that this consensus IP-to-city data is available to other controllers besides just Vidalia. But it has the disadvantage that the networkstatus consensus becomes larger, even though most of the GeoIP information won't change from one consensus to the next. Is there another reasonable location for it that can provide similar consensus security properties? [RESOLUTION: IP-to-city is not supported.] 4.2.1. Controllers can query for router annotations Vidalia needs to stop doing queries on bridge relay IP addresses. It could do that by only doing lookups on descriptors that are in the networkstatus consensus, but that precludes designs like Blossom that might want to map its relay locations. The best answer is that it should learn the router annotations, with a new controller 'getinfo' command: "GETINFO desc-annotations/id/<OR identity>" which would respond with something like @downloaded-at 2007-11-29 08:06:38 @source "128.31.0.34" @purpose bridge [We could also make the answer include the digest for the router in question, which would enable us to ask GETINFO router-annotations/all. Is this worth it? -RD] Then Vidalia can avoid doing lookups on descriptors with purpose "bridge". Even better would be to add a new annotation "@private true" so Vidalia can know how to handle new purposes that we haven't created yet. Vidalia could special-case "bridge" for now, for compatibility with the current 0.2.0.x-alphas. 4.3. Recommendation My overall recommendation is that we should implement 4.1 soon (e.g. 
early in 0.2.1.x), and we can go with 4.2 option A for now, with the hope that later we discover a better way to distribute the IP-to-city info and can switch to 4.2 option B. Below we discuss more how to go about achieving 4.1. 5. Publishing and caching the GeoIP (IP-to-country) database Each v3 directory authority should put a copy of the "geoip" file in its datadirectory. Then its network-status votes should include a hash of this file (Recommended-geoip-hash: %s), and the resulting consensus directory should specify the consensus hash. There should be a new URL for fetching this geoip db (by "current.z" for testing purposes, and by hash.z for typical downloads). Authorities should fetch and serve the one listed in the consensus, even when they vote for their own. This would argue for storing the cached version in a better filename than "geoip". Directory mirrors should keep a copy of this file available via the same URLs. We assume that the file would change at most a few times a month. Should Tor ship with a bootstrap geoip file? An out-of-date geoip file may open you up to partitioning attacks, but for the most part it won't be that different. There should be a config option to disable updating the geoip file, in case users want to use their own file (e.g. they have a proprietary GeoIP file they prefer to use). In that case we leave it up to the user to update his geoip file out-of-band. [XXX Should consider forward/backward compatibility, e.g. if we want to move to a new geoip file format. -RD] [RESOLUTION: Not done over Tor.] 6. Controllers use the IP-to-country db for mapping and for path building Down the road, Vidalia could use the IP-to-country mappings for placing on its map: - The location of the client - The location of the bridges, or other relays not in the networkstatus, on the map. - Any relays that it doesn't yet have an IP-to-city answer for. Other controllers can also use it to set EntryNodes, ExitNodes, etc in a per-country way. 
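For illustration, a local IP-to-country lookup over rows in the ip-to-country.csv style shown in Section 2 could look like the Python sketch below. It assumes the first two columns are the inclusive start and end of an IPv4 range written as integers, and that ranges do not overlap; `load_ranges` and `lookup_country` are hypothetical helper names, not part of Tor or Vidalia.

```python
import bisect
import csv
import ipaddress


def load_ranges(lines):
    """Parse rows of "start","end","CC","CCC","NAME" into sorted tuples."""
    ranges = []
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        ranges.append((int(row[0]), int(row[1]), row[2]))
    ranges.sort()
    return ranges


def lookup_country(ranges, address):
    """Return the two-letter country code for an IPv4 address, or None."""
    value = int(ipaddress.IPv4Address(address))
    starts = [start for start, _end, _code in ranges]
    # Find the last range whose start is <= value.
    index = bisect.bisect_right(starts, value) - 1
    if index >= 0 and value <= ranges[index][1]:
        return ranges[index][2]
    return None
```

Because the lookup runs entirely against a local file, no query about a client or bridge address leaves the machine, which is the property Section 4 is after.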
To support these features, we need to export the IP-to-country data via the Tor controller protocol. Is it sufficient just to add a new GETINFO command? GETINFO ip-to-country/128.31.0.34 250+ip-to-country/128.31.0.34="US","USA","UNITED STATES" [RESOLUTION: Not done now, except for the getinfo command.] 6.1. Other interfaces Robert Hogan has also suggested a GETINFO relays-by-country/cn as well as torrc options for ExitCountryCodes, EntryCountryCodes, ExcludeCountryCodes, etc. [RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.] 7. Relays and bridges use the IP-to-country db for usage summaries Once bridges have a GeoIP database locally, they can start to publish sanitized summaries of client usage -- how many users they see and from what countries. This might also be a more useful way for ordinary Tor relays to convey the level of usage they see, which would allow us to switch to using directory guards for all users by default. But how to safely summarize this information without opening too many anonymity leaks? 7.1 Attacks to think about First, note that we need to have a large enough time window that we're not aiding correlation attacks much. I hope 24 hours is enough. So that means no publishing stats until you've been up at least 24 hours. And you can't publish follow-up stats more often than every 24 hours, or people could look at the differential. Second, note that we need to be sufficiently vague about the IP addresses we're reporting. We are hoping that just specifying the country will be vague enough. But a) what about active attacks where we convince a bridge to use a GeoIP db that labels each suspect IP address as a unique country? We have to assume that the consensus GeoIP db won't be malicious in this way. And b) could such singling-out attacks occur naturally, for example because of countries that have a very small IP space? We should investigate that. 7.2. 
Granularity of users Do we only want to report countries that have a sufficient anonymity set (that is, number of users) for the day? For example, we might avoid listing any countries that have seen fewer than five addresses over the 24 hour period. This approach would be helpful in reducing the singling-out opportunities -- in the extreme case, we could imagine a situation where one blogger from the Sudan used Tor on a given day, and we can discover which entry guard she used. But I fear that especially for bridges, seeing only one hit from a given country in a given day may be quite common. As a compromise, we should start out with an "Other" category in the reported stats, which is the sum of unlisted countries; if that category is consistently interesting, we can think harder about how to get the right data from it safely. But note that bridge summaries will not be made public individually, since doing so would help people enumerate bridges. Whereas summaries from normal relays will be public. So perhaps that means we can afford to be more specific in bridge summaries? In particular, I'm thinking the "other" category should be used by public relays but not for bridges (or if it is, used with a lower threshold). Even for countries that have many Tor users, we might not want to be too specific about how many users we've seen. For example, we might round down the number of users we report to the nearest multiple of 5. My instinct for now is that this won't be that useful. 7.3 Other issues Another note: we'll likely be overreporting in the case of users with dynamic IP addresses: if they rotate to a new address over the course of the day, we'll count them twice. So be it. 7.4. Where to publish the summaries? We designed extrainfo documents for information like this. So they should just be more entries in the extrainfo doc. But if we want to publish summaries every 24 hours (no more often, no less often), aren't we tied to the router descriptor publishing schedule? 
That is, if we publish a new router descriptor at the 18 hour mark, and nothing much has changed at the 24 hour mark, won't the new descriptor get dropped as being "cosmetically similar", and then nobody will know to ask about the new extrainfo document? One solution would be to make and remember the 24 hour summary at the 24 hour mark, but not actually publish it anywhere until we happen to publish a new descriptor for other reasons. If we happen to go down before publishing a new descriptor, then so be it, at least we tried. 7.5. What if the relay is unreachable or goes to sleep? Even if you've been up for 24 hours, if you were hibernating for 18 of them, then we're not getting as much fuzziness as we'd like. So I guess that means that we need a 24-hour period of being "awake" before we're willing to publish a summary. A similar attack works if you've been awake but unreachable for the first 18 of the 24 hours. As another example, a bridge that's on a laptop might be suspended for some of each day. This implies that some relays and bridges will never publish summary stats, because they're not ever reliably working for 24 hours in a row. If a significant percentage of our reporters end up being in this boat, we should investigate whether we can accumulate 24 hours of "usefulness", even if there are holes in the middle, and publish based on that. What other issues are like this? It seems that just moving to a new IP address shouldn't be a reason to cancel stats publishing, assuming we were usable at each address. 7.6. IP addresses that aren't in the geoip db Some IP addresses aren't in the public geoip databases. In particular, I've found that a lot of African countries are missing, but there are also some common ones in the US that are missing, like parts of Comcast. We could just lump unknown IP addresses into the "other" category, but it might be useful to gather a general sense of how many lookups are failing entirely, by adding a separate "Unknown" category. 
We could also contribute back to the geoip db, by letting bridges set a config option to report the actual IP addresses that failed their lookup. Then the bridge authority operators can manually make sure the correct answer will be in later geoip files. This config option should be disabled by default. 7.7 Bringing it all together So here's the plan: 24 hours after starting up (modulo Section 7.5 above), bridges and relays should construct a daily summary of client countries they've seen, including the above "Unknown" category (Section 7.6) as well. Non-bridge relays lump all countries with less than K (e.g. K=5) users into the "Other" category (see Sec 7.2 above), whereas bridge relays are willing to list a country even when it has only one user for the day. Whenever we have a daily summary on record, we include it in our extrainfo document whenever we publish one. The daily summary we remember locally gets replaced with a newer one when another 24 hours pass. 7.8. Some forward secrecy How should we remember addresses locally? If we convert them into country-codes immediately, we will count them again if we see them again. On the other hand, we don't really want to keep a list hanging around of all IP addresses we've seen in the past 24 hours. Step one is that we should never write this stuff to disk. Keeping it only in ram will make things somewhat better. Step two is to avoid keeping any timestamps associated with it: rather than a rolling 24-hour window, which would require us to remember the various times we've seen that address, we can instead just throw out the whole list every 24 hours and start over. We could hash the addresses, and then compare hashes when deciding if we've seen a given address before. We could even do keyed hashes. Or Bloom filters. But if our goal is to defend against an adversary who steals a copy of our ram while we're running and then does guess-and-check on whatever blob we're keeping, we're in bad shape. 
We could drop the last octet of the IP address as soon as we see it. That would cause us to undercount some users from cablemodem and DSL networks that have a high density of Tor users. And it wouldn't really help that much -- indeed, the extent to which it does help is exactly the extent to which it makes our stats less useful. Other ideas?
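The plan in Section 7.7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the "Other" and "Unknown" dictionary keys, and the choice to exempt "Unknown" from the K threshold are not specified by the proposal, and the input is assumed to be an in-memory map that is thrown away every 24 hours (Section 7.8).

```python
from collections import Counter

def summarize_clients(countries_by_ip, is_bridge, k=5):
    """Build a per-country client summary for one 24-hour window.

    countries_by_ip maps each distinct client address seen in the window
    to a country code, or to None when the GeoIP lookup failed.
    (Hypothetical helper; names and key strings are assumptions.)
    """
    counts = Counter()
    for country in countries_by_ip.values():
        # Failed lookups get their own "Unknown" category (Section 7.6).
        counts["Unknown" if country is None else country] += 1

    if is_bridge:
        # Bridges list a country even if it had a single user that day.
        return dict(counts)

    summary = Counter()
    for country, n in counts.items():
        # Public relays fold countries with fewer than k users into
        # "Other" (Section 7.2). Leaving "Unknown" out of the threshold
        # is an assumption of this sketch.
        if country != "Unknown" and n < k:
            summary["Other"] += n
        else:
            summary[country] += n
    return dict(summary)
```

Keying the input by address means a repeat visitor is counted once per window, while a user whose address changes during the day is still counted twice, as Section 7.3 accepts.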
Filename: 127-dirport-mirrors-downloads.txt Title: Relaying dirport requests to Tor download site / website Author: Roger Dingledine Created: 2007-12-02 Status: Obsolete 1. Overview Some countries and networks block connections to the Tor website. As time goes by, this will remain a problem and it may even become worse. We have a big pile of mirrors (google for "Tor mirrors"), but few of our users think to try a search like that. Also, many of these mirrors might be automatically blocked since their pages contain words that might cause them to get banned. And lastly, we can imagine a future where the blockers are aware of the mirror list too. Here we describe a new set of URLs for Tor's DirPort that will relay connections from users to the official Tor download site. Rather than trying to cache a bunch of new Tor packages (which is a hassle in terms of keeping them up to date, and a hassle in terms of drive space used), we instead just proxy the requests directly to Tor's /dist page. Specifically, we should support GET /tor/dist/$1 and GET /tor/website/$1 2. Direct connections, one-hop circuits, or three-hop circuits? We could relay the connections directly to the download site -- but this produces recognizable outgoing traffic on the bridge or cache's network, which will probably surprise our nice volunteers. (Is this a good enough reason to discard the direct connection idea?) Even if we don't do direct connections, should we do a one-hop begindir-style connection to the mirror site (make a one-hop circuit to it, then send a 'begindir' cell down the circuit), or should we do a normal three-hop anonymized connection? If these mirrors are mainly bridges, doing either a direct or a one-hop connection creates another way to enumerate bridges. That would argue for three-hop. On the other hand, downloading a 10+ megabyte installer through a normal Tor circuit can't be fun. 
But if you're already getting throttled a lot because you're in the "relayed traffic" bucket, you're going to have to accept a slow transfer anyway. So three-hop it is. Speaking of which, we would want to label this connection as "relay" traffic for the purposes of rate limiting; see connection_counts_as_relayed_traffic() and or_conn->client_used. This will be a bit tricky though, because these connections will use the bridge's guards. 3. Scanning resistance One other goal we'd like to achieve, or at least not hinder, is making it hard to scan large swaths of the Internet to look for responses that indicate a bridge. In general this is a really hard problem, so we shouldn't demand to solve it here. But we can note that some bridges should open their DirPort (and offer this functionality), and others shouldn't. Then some bridges provide a download mirror while others can remain scanning-resistant. 4. Integrity checking If we serve this stuff in plaintext from the bridge, anybody in between the user and the bridge can intercept and modify it. The bridge can too. If we do an anonymized three-hop connection, the exit node can also intercept and modify the exe it sends back. Are we setting ourselves up for rogue exit relays, or rogue bridges, that trojan our users? Answer #1: Users need to do pgp signature checking. Not a very good answer, a) because it's complex, and b) because they don't know the right signing keys in the first place. Answer #2: The mirrors could exit from a specific Tor relay, using the '.exit' notation. This would make connections a bit more brittle, but would resolve the rogue exit relay issue. We could even round-robin among several, and the list could be dynamic -- for example, all the relays with an Authority flag that allow exits to the Tor website. Answer #3: The mirrors should connect to the main distribution site via SSL. That way the exit relay can't influence anything. 
Answer #4: We could suggest that users only use trusted bridges for fetching a copy of Tor. Hopefully they heard about the bridge from a trusted source rather than from the adversary. Answer #5: What if the adversary is trawling for Tor downloads by network signature -- either by looking for known bytes in the binary, or by looking for "GET /tor/dist/"? It would be nice to encrypt the connection from the bridge user to the bridge. And we can! The bridge already supports TLS. Rather than initiating a TLS renegotiation after connecting to the ORPort, the user should actually request a URL. Then the ORPort can either pass the connection off as a linked conn to the dirport, or renegotiate and become a Tor connection, depending on how the client behaves. 5. Linked connections: at what level should we proxy? Check out the connection_ap_make_link() function, as called from directory.c. Tor clients use this to create a "fake" socks connection back to themselves, and then they attach a directory request to it, so they can launch directory fetches via Tor. We can piggyback on this feature. We need to decide if we're going to be passing the bytes back and forth between the web browser and the main distribution site, or if we're going to be actually acting like a proxy (parsing out the file they want, fetching that file, and serving it back). Advantages of proxying without looking inside: - We don't need to build any sort of http support (including continues, partial fetches, etc etc). Disadvantages: - If the browser thinks it's speaking http, are there easy ways to pass the bytes to an https server and have everything work correctly? At the least, it would seem that the browser would complain about the cert. More generally, ssl wants to be negotiated before the URL and headers are sent, yet we need to read the URL and headers to know that this is a mirror request; so we have an ordering problem here. 
- Makes it harder to do caching later on, if we don't look at what we're relaying. (It might be useful down the road to cache the answers to popular requests, so we don't have to keep getting them again.) 6. Outstanding problems 1) HTTP proxies already exist. Why waste our time cloning one badly? When we clone existing stuff, we usually regret it. 2) It's overbroad. We only seem to need a secure get-a-tor feature, and instead we're contemplating building a locked-down HTTP proxy. 3) It's going to add a fair bit of complexity to our code. We do not currently implement HTTPS. We'd need to refactor lots of the low-level connection stuff so that "SSL" and "Cell-based" were no longer synonymous. 4) It's still unclear how effective this proposal would be in practice. You need to know that this feature exists, which means somebody needs to tell you about a bridge (mirror) address and tell you how to use it. And if they're doing that, they could (e.g.) tell you about a gmail autoresponder address just as easily, and then you'd get better authentication of the Tor program to boot.
Filename: 128-bridge-families.txt Title: Families of private bridges Author: Roger Dingledine Created: 2007-12-xx Status: Dead 1. Overview Proposal 125 introduced the basic notion of how bridge authorities, bridge relays, and bridge users should behave. But it doesn't get into the various mechanisms of how to distribute bridge relay addresses to bridge users. One of the mechanisms we have in mind is called 'families of bridges'. If a bridge user knows about only one private bridge, and that bridge shuts off for the night or gets a new dynamic IP address, the bridge user is out of luck and needs to re-bootstrap manually or wait and hope it comes back. On the other hand, if the bridge user knows about a family of bridges, then as long as one of those bridges is still reachable his Tor client can automatically learn about where the other bridges have gone. So in this design, a single volunteer could run multiple coordinated bridges, or a group of volunteers could each run a bridge. We abstract out the details of how these volunteers find each other and decide to set up a family. 2. Other notes. somebody needs to run a bridge authority it needs to have a torrc option to publish networkstatuses of its bridges it should also do reachability testing just of those bridges people ask for the bridge networkstatus by asking for a url that contains a password. (it's safe to do this because of begin_dir.) so the bridge users need to know a) a password, and b) a bridge authority line. the bridge users need to know the bridge authority line. the bridge authority needs to know the password. 3. Current state I implemented a BridgePassword config option. Bridge authorities should set it, and users who want to use those bridge authorities should set it. Now there is a new directory URL "/tor/networkstatus-bridges" that directory mirrors serve if BridgeAuthoritativeDir is set and it's a begin_dir connection. 
It looks for the header Authorization: Basic %s where %s is the base-64 bridge password. I never got around to teaching clients how to set the header though, so it may or may not work, and may or may not do what we ultimately want. I've marked this proposal dead; it really never should have left the ideas/ directory. Somebody should pick it up sometime and finish the design and implementation.
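For illustration, the header described in Section 3 could be built as in the Python sketch below. The function name is hypothetical, and the sketch follows the text literally by encoding only the password; ordinary HTTP Basic authentication encodes a "user:password" pair instead, so a real client would need to match whatever the bridge authority actually checks.

```python
import base64

def bridge_auth_header(bridge_password: str) -> str:
    """Return the Authorization header line for a bridge networkstatus fetch.

    Assumption: the value is the base-64 encoding of the password alone,
    as the text above describes.
    """
    token = base64.b64encode(bridge_password.encode("utf-8")).decode("ascii")
    return "Authorization: Basic " + token
```

A client would send this header with its request for "/tor/networkstatus-bridges" over a begin_dir connection.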
Filename: 129-reject-plaintext-ports.txt Title: Block Insecure Protocols by Default Author: Kevin Bauer & Damon McCoy Created: 2008-01-15 Status: Closed Implemented-In: 0.2.0.x Overview: Below is a proposal to mitigate insecure protocol use over Tor. This document 1) demonstrates the extent to which insecure protocols are currently used within the Tor network, and 2) proposes a simple solution to prevent users from unknowingly using these insecure protocols. By insecure, we consider protocols that explicitly leak sensitive user names and/or passwords, such as POP, IMAP, Telnet, and FTP. Motivation: As part of a general study of Tor use in 2006/2007 [1], we attempted to understand what types of protocols are used over Tor. While we observed an enormous volume of Web and Peer-to-peer traffic, we were surprised by the number of insecure protocols that were used over Tor. For example, over an 8 day observation period, we observed the following number of connections over insecure protocols: POP and IMAP: 10,326 connections Telnet: 8,401 connections FTP: 3,788 connections Each of the above listed protocols exchanges user name and password information in plain-text. As an upper bound, we could have observed 22,515 user names and passwords. This observation echoes the reports of a Tor router logging and posting e-mail passwords in August 2007 [2]. The response from the Tor community has been to further educate users about the dangers of using insecure protocols over Tor. However, we recently repeated our Tor usage study from last year and noticed that the trend in insecure protocol use has not declined. Therefore, we propose that additional steps be taken to protect naive Tor users from inadvertently exposing their identities (and even passwords) over Tor. Security Implications: This proposal is intended to improve Tor's security by limiting the use of insecure protocols. 
Roger added: By adding these warnings for only some of the risky behavior, users may do other risky behavior, not get a warning, and believe that it is therefore safe. But overall, I think it's better to warn for some of it than to warn for none of it. Specification: As an initial step towards mitigating the use of the above-mentioned insecure protocols, we propose that the default ports for each respective insecure service be blocked at the Tor client's socks proxy. These default ports include: 23 - Telnet 109 - POP2 110 - POP3 143 - IMAP Notice that FTP is not included in the proposed list of ports to block. This is because FTP is often used anonymously, i.e., without any identifying user name or password. This blocking scheme can be implemented as a set of flags in the client's torrc configuration file: BlockInsecureProtocols 0|1 WarnInsecureProtocols 0|1 When the warning flag is activated, a message should be displayed to the user similar to the message given when Tor's socks proxy is given an IP address rather than resolving a host name. We recommend that the default torrc configuration file block insecure protocols and provide a warning to the user to explain the behavior. Finally, there are many popular web pages that do not offer secure login features, such as MySpace, and it would be prudent to provide additional rules to Privoxy to attempt to protect users from unknowingly submitting their login credentials in plain-text. Compatibility: None, as the proposed changes are to be implemented in the client. References: [1] Shining Light in Dark Places: A Study of Anonymous Network Usage. University of Colorado Technical Report CU-CS-1032-07. August 2007. [2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise. http://www.wired.com/politics/security/news/2007/09/embassy_hacks. Wired. September 10, 2007. 
Implementation: Roger added this feature in http://archives.seul.org/or/cvs/Jan-2008/msg00182.html He also added a status event for Vidalia to recognize attempts to use vulnerable-plaintext ports, so it can help the user understand what's going on and how to fix it. Next steps: a) Vidalia should learn to recognize this controller status event, so we don't leave users out in the cold when we enable this feature. b) We should decide which ports to reject by default. The current consensus is 23,109,110,143 -- the same set that we warn for now.
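The blocking scheme in the Specification section amounts to a small port check at the client's socks proxy. The Python sketch below illustrates the logic only; the function name, argument names, and warning text are assumptions, and the two booleans merely mirror the proposed BlockInsecureProtocols and WarnInsecureProtocols flags rather than any option names Tor actually shipped.

```python
# Default ports listed in the Specification section above.
INSECURE_PORTS = {23: "Telnet", 109: "POP2", 110: "POP3", 143: "IMAP"}

def check_socks_request(port, block_insecure=True, warn_insecure=True):
    """Decide how to treat a socks request to the given destination port.

    Returns (allowed, warning), where warning is None when nothing
    should be shown to the user. (Hypothetical helper.)
    """
    name = INSECURE_PORTS.get(port)
    if name is None:
        # Not on the list (FTP on port 21 is deliberately excluded).
        return True, None
    warning = None
    if warn_insecure:
        warning = f"Port {port} ({name}) usually carries passwords in plaintext."
    return (not block_insecure), warning
```

Keeping the block and warn decisions separate lets a controller show the warning even when the user has chosen not to block the port.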
Filename: 130-v2-conn-protocol.txt Title: Version 2 Tor connection protocol Author: Nick Mathewson Created: 2007-10-25 Status: Closed Implemented-In: 0.2.0.x Overview: This proposal describes the significant changes to be made in the v2 Tor connection protocol. This proposal relates to other proposals as follows: It refers to and supersedes: Proposal 124: Blocking resistant TLS certificate usage It refers to aspects of: Proposal 105: Version negotiation for the Tor protocol In summary, The Tor connection protocol has been in need of a redesign for a while. This proposal describes how we can add to the Tor protocol: - A new TLS handshake (to achieve blocking resistance without breaking backward compatibility) - Version negotiation (so that future connection protocol changes can happen without breaking compatibility) - The actual changes in the v2 Tor connection protocol. Motivation: For motivation, see proposal 124. Proposal: 0. Terminology The version of the Tor connection protocol implemented up to now is "version 1". This proposal describes "version 2". "Old" or "Older" versions of Tor are ones not aware that version 2 of this protocol exists; "New" or "Newer" versions are ones that are. The connection initiator is referred to below as the Client; the connection responder is referred to below as the Server. 1. The revised TLS handshake. For motivation, see proposal 124. This is a simplified version of the handshake that uses TLS's renegotiation capability in order to avoid some of the extraneous steps in proposal 124. The Client connects to the Server and, as in ordinary TLS, sends a list of ciphers. Older versions of Tor will send only ciphers from the list: TLS_DHE_RSA_WITH_AES_256_CBC_SHA TLS_DHE_RSA_WITH_AES_128_CBC_SHA SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA Clients that support the revised handshake will send the recommended list of ciphers from proposal 124, in order to emulate the behavior of a web browser. 
If the server notices that the list of ciphers contains only ciphers from this list, it proceeds with Tor's version 1 TLS handshake as documented in tor-spec.txt. (The server may also notice cipher lists used by other implementations of the Tor protocol (in particular, the BouncyCastle default cipher list as used by some Java-based implementations), and whitelist them.) On the other hand, if the server sees a list of ciphers that could not have been sent from an older implementation (because it includes other ciphers, and does not match any known-old list), the server sends a reply containing a single connection certificate, constructed as for the link certificate in the v1 Tor protocol. The subject names in this certificate SHOULD NOT have any strings to identify them as coming from a Tor server. The server does not ask the client for certificates. Old Servers will (mostly) ignore the cipher list and respond as in the v1 protocol, sending back a two-certificate chain. After the Client gets a response from the server, it checks for the number of certificates it received. If there are two certificates, the client assumes a V1 connection and proceeds as in tor-spec.txt. But if there is only one certificate, the client assumes a V2 or later protocol and continues. At this point, the client has established a TLS connection with the server, but the parties have not been authenticated: the server hasn't sent its identity certificate, and the client hasn't sent any certificates at all. To fix this, the client begins a TLS session renegotiation. This time, the server continues with two certificates as usual, and asks for certificates so that the client will send certificates of its own. Because the TLS connection has been established, all of this is encrypted. (The certificate sent by the server in the renegotiated connection need not be the same as the one sent in the original connection.) The server MUST NOT write any data until the client has renegotiated. 
Once the renegotiation is finished, the server and client check one another's certificates as in V1. Now they are mutually authenticated. 1.1. Revised TLS handshake: implementation notes. It isn't so easy to adjust server behavior based on the client's ciphersuite list. Here's how we can do it using OpenSSL. This is a bit of an abuse of the OpenSSL APIs, but it's the best we can do, and we won't have to do it forever. We can use OpenSSL's SSL_set_info_callback() to register a function to be called when the state changes. The type/state tuple of SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A happens when we have completely parsed the client hello, and are about to send a response. From this callback, we can check the cipherlist and act accordingly: * If the ciphersuite list indicates a v1 protocol, we set the verify mode to SSL_VERIFY_NONE with a callback (so we get certificates). * If the ciphersuite list indicates a v2 protocol, we set the verify mode to SSL_VERIFY_NONE with no callback (so we get no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that we send only 1 certificate in the response). Once the handshake is done, the server clears the SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1 protocol. It then starts reading. The other problem to take care of is missing ciphers and OpenSSL's cipher sorting algorithms. The two main issues are a) OpenSSL doesn't support some of the default ciphers that Firefox advertises, and b) OpenSSL sorts the list of ciphers it offers in a different way than Firefox sorts them, so unless we fix that Tor will still look different than Firefox. [XXXX more on this.] 1.2. Compatibility for clients using libraries less hackable than OpenSSL. As discussed in proposal 105, servers advertise which protocol versions they support in their router descriptors. 
Clients can simply behave as v1 clients when connecting to servers that do not support link version 2 or higher, and as v2 clients when connecting to servers that do support link version 2 or higher. (Servers can't use this strategy because we do not assume that servers know one another's capabilities when connecting.) 2. Version negotiation. Version negotiation proceeds as described in proposal 105, except as follows: * Version negotiation only happens if the TLS handshake as described above completes. * The TLS renegotiation must be finished before the client sends a VERSIONS cell; the server sends its VERSIONS cell in response. * The VERSIONS cell uses the following variable-width format: Circuit [2 octets; set to 0] Command [1 octet; set to 7 for VERSIONS] Length [2 octets; big-endian] Data [Length bytes] The Data in the cell is a series of big-endian two-byte integers. * It is not allowed to negotiate V1 connections once the v2 protocol has been used. If this happens, Tor instances should close the connection. 3. The rest of the "v2" protocol Once a v2 protocol has been negotiated, NETINFO cells are exchanged as in proposal 105, and communications begin as per tor-spec.txt. Until NETINFO cells have been exchanged, the connection is not open.
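The variable-width VERSIONS cell layout from Section 2 can be illustrated with a short Python sketch. The function names and the error handling are assumptions for illustration; only the field sizes, the command value 7, and the big-endian two-byte version entries come from the text above.

```python
import struct

VERSIONS_COMMAND = 7  # command value for VERSIONS, per the format above
HEADER = ">HBH"       # Circuit (2 octets), Command (1 octet), Length (2 octets)
HEADER_LEN = struct.calcsize(HEADER)  # 5 bytes, no padding

def encode_versions_cell(versions):
    """Pack a list of link protocol versions into a VERSIONS cell."""
    data = b"".join(struct.pack(">H", v) for v in versions)
    # The circuit ID is always 0 for a VERSIONS cell.
    return struct.pack(HEADER, 0, VERSIONS_COMMAND, len(data)) + data

def decode_versions_cell(cell):
    """Unpack a VERSIONS cell and return its list of versions."""
    if len(cell) < HEADER_LEN:
        raise ValueError("cell too short for header")
    circ_id, command, length = struct.unpack(HEADER, cell[:HEADER_LEN])
    if circ_id != 0 or command != VERSIONS_COMMAND:
        raise ValueError("not a VERSIONS cell")
    data = cell[HEADER_LEN:HEADER_LEN + length]
    if len(data) != length or length % 2 != 0:
        raise ValueError("bad VERSIONS payload length")
    # The payload is a series of big-endian two-byte integers.
    return [v for (v,) in struct.iter_unpack(">H", data)]
```

For example, a cell advertising versions 1 and 2 carries a four-byte payload, so the whole cell is nine bytes long.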
Filename: 131-verify-tor-usage.txt Title: Help users to verify they are using Tor Author: Steven J. Murdoch Created: 2008-01-25 Status: Obsolete Overview: Websites for checking whether a user is accessing them via Tor are a very helpful aid to configuring web browsers correctly. Existing solutions have both false positives and false negatives when checking if Tor is being used. This proposal will discuss how to modify Tor so as to make testing more reliable. Motivation: Currently deployed websites for detecting Tor use work by comparing the client IP address for a request with a list of known Tor nodes. This approach is generally effective, but suffers from both false positives and false negatives. If a user has a Tor exit node installed, or just happens to have been allocated an IP address previously used by a Tor exit node, any web requests will be incorrectly flagged as coming from Tor. If any customer of an ISP which implements a transparent proxy runs an exit node, all other users of the ISP will be flagged as Tor users. Conversely, if the exit node chosen by a Tor user has not yet been recorded by the Tor checking website, requests will be incorrectly flagged as not coming via Tor. The only reliable way to tell whether Tor is being used or not is for the Tor client to flag this to the browser. Proposal: A DNS name should be registered and point to an IP address controlled by the Tor project and likely to remain so for the useful lifetime of a Tor client. A web server should be placed at this IP address. Tor should be modified to treat requests to port 80, at the specified DNS name or IP address specially. Instead of opening a circuit, it should respond to a HTTP request with a helpful web page: - If the request to open a connection was to the domain name, the web page should state that Tor is working properly. - If the request was to the IP address, the web page should state that there is a DNS-leakage vulnerability. 
If the request goes through to the real web server, the page should state that Tor has not been set up properly. Extensions: Identifying proxy server: If needed, other applications between the web browser and Tor (e.g. Polipo and Privoxy) could piggyback on the same mechanism to flag whether they are in use. All three possible web pages should include a machine-readable placeholder, into which another program could insert its own message. For example, the webpage returned by Tor to indicate a successful configuration could include the following HTML: <h2>Connection chain</h2> <ul> <li>Tor 0.1.2.14-alpha</li> <!-- Tor Connectivity Check: success --> </ul> When the proxy server observes this string, in response to a request for the Tor connectivity check web page, it would prepend its own message, resulting in the following being returned to the web browser: <h2>Connection chain</h2> <ul> <li>Tor 0.1.2.14-alpha</li> <li>Polipo version 1.0.4</li> <!-- Tor Connectivity Check: success --> </ul> Checking external connectivity: If Tor intercepts a request, and returns a response itself, the user will not actually confirm whether Tor is able to build a successful circuit. It may then be advantageous to include an image in the web page which is loaded from a different domain. If this is able to be loaded then the user will know that external connectivity through Tor works. Automatic Firefox Notification: All forms of the website should return valid XHTML and have a hidden link with an id attribute "TorCheckResult" and a target property that can be queried to determine the result. 
For example, a hidden link would convey success like this: <a id="TorCheckResult" target="success" href="/"></a> failure like this: <a id="TorCheckResult" target="failure" href="/"></a> and DNS leaks like this: <a id="TorCheckResult" target="dnsleak" href="/"></a> Firefox extensions such as Torbutton would then be able to issue an XMLHttpRequest for the page and query the result with resultXML.getElementById("TorCheckResult").target to automatically report the Tor status to the user when they first attempt to enable Tor activity, or whenever they request a check from the extension preferences window. If the check website is to be themed with heavy graphics and/or extensive documentation, the check result itself should be contained in a separate lightweight iframe that extensions can request via an alternate url. Security and resiliency implications: What attacks are possible? If the IP address used for this feature moves there will be two consequences: - A new website at this IP address will remain inaccessible over Tor - Tor users who are leaking DNS will be informed that Tor is not working, rather than that it is active but leaking DNS We should thus attempt to find an IP address which we reasonably believe can remain static. Open issues: If a Tor version which does not support this extra feature is used, the webpage returned will indicate that Tor is not being used. Can this be safely fixed? Related work: The proposed mechanism is very similar to config.privoxy.org. The most significant difference is that if the web browser is misconfigured, Tor will only get an IP address. Even in this case, Tor should be able to respond with a webpage to notify the user of how to fix the problem. This also implies that Tor must be told of the special IP address, and so must be effectively permanent.
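The "Automatic Firefox Notification" check could also be performed by a non-browser tool. The Python sketch below is a rough equivalent of the getElementById query described above; it assumes the page is well-formed XHTML (as the proposal requires) and that fetching the page is handled elsewhere. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def tor_check_result(xhtml_text):
    """Return the target of the hidden TorCheckResult link, or None.

    Expected values per the proposal: "success", "failure", "dnsleak".
    """
    root = ET.fromstring(xhtml_text)
    for elem in root.iter():
        # Unprefixed attributes carry no namespace, so a plain lookup
        # works even when the document declares the XHTML namespace.
        if elem.get("id") == "TorCheckResult":
            return elem.get("target")
    return None
```

Returning None when the link is absent lets the caller distinguish "no result on this page" from an explicit "failure".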
Filename: 132-browser-check-tor-service.txt Title: A Tor Web Service For Verifying Correct Browser Configuration Author: Robert Hogan Created: 2008-03-08 Status: Obsolete Overview: Tor should operate a primitive web service on the loopback network device that tests the operation of user's browser, privacy proxy and Tor client. The tests are performed by serving unique, randomly generated elements in image URLs embedded in static HTML. The images are only displayed if the DNS and HTTP requests for them are routed through Tor, otherwise the 'alt' text may be displayed. The proposal assumes that 'alt' text is not displayed on all browsers so suggests that text and links should accompany each image advising the user on next steps in case the test fails. The service is primarily for the use of controllers, since presumably users aren't going to want to edit text files and then type something exotic like 127.0.0.1:9999 into their address bar. In the main use case the controller will have configured the actual port for the webservice so will know where to direct the request. It would also be the responsibility of the controller to ensure the webservice is available, and tor is running, before allowing the user to access the page through their browser. Motivation: This is a complementary approach to proposal 131. It overcomes some of the limitations of the approach described in proposal 131: reliance on a permanent, real IP address and compatibility with older versions of Tor. Unlike 131, it is not as useful to Tor users who are not running a controller. Objective: Provide a reliable means of helping users to determine if their Tor installation, privacy proxy and browser are properly configured for anonymous browsing. Proposal: When configured to do so, Tor should run a basic web service available on a configured port on 127.0.0.1. 
The purpose of this web service is to serve a number of basic test images that will allow the user to determine if their browser is properly configured and that Tor is working normally. The service can consist of a single web page with two columns. The left column contains images, the right column contains advice on what the display/non-display of the column means. The rest of this proposal assumes that the service is running on port 9999. The port should be configurable, and configuring the port enables the service. The service must run on 127.0.0.1. In all the examples below [uniquesessionid] refers to a random, base64 encoded string that is unique to the URL it is contained in. Tor only ever stores the most recently generated [uniquesessionid] for each URL, storing 3 in total. Tor should generate a [uniquesessionid] for each of the test URLs below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm. The most suitable image for each test case is an implementation decision. Tor will need to store and serve images for the first and second test images, and possibly the third (see 'Open Issues'). 1. DNS Request Test Image This is a HTML element embedded in the page served by Tor at http://127.0.0.1:9999: <IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see this text, your browser's DNS requests are not being routed through Tor." width="200" height="200" align="middle" border="2"> If the browser's DNS request for [uniquesessionid] is routed through Tor, Tor will intercept the request and return 127.0.0.1 as the resolved IP address. This will shortly be followed by a HTTP request from the browser for http://127.0.0.1:9999/torlogo.jpg. This request should be served with the appropriate image. If the browser's DNS request for [uniquesessionid] is not routed through Tor the browser may display the 'alt' text specified in the html element. 
The HTML served by Tor should also contain text accompanying the image to advise users what it means if they do not see an image. It should also provide a link to click that provides information on how to remedy the problem. This behaviour also applies to the images described in 2. and 3. below, so should be assumed there as well. 2. Proxy Configuration Test Image This is a HTML element embedded in the page served by Tor at http://127.0.0.1:9999: <IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see this text, your browser is not configured to work with Tor." width="200" height="200" align="middle" border="2"> If the HTTP request for the resource [uniquesessionid].jpg is received by Tor it will serve the appropriate image in response. It should serve this image itself, without attempting to retrieve anything from the Internet. If Tor can identify the name of the proxy application requesting the resource then it could store and serve an image identifying the proxy to the user. 3. Tor Connectivity Test Image This is a HTML element embedded in the page served by Tor at http://127.0.0.1:9999: <IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you can see this text, your Tor installation cannot connect to the Internet." width="200" height="200" align="middle" border="2"> The referenced image should actually exist on the Tor project website. If Tor receives the request for the above resource it should remove the random base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt to retrieve the real image. Even on a fully operational Tor client this test may not always succeed. The user should be advised that one or more attempts to retrieve this image may be necessary to confirm a genuine problem. Open Issues: The final connectivity test relies on an externally maintained resource, if this resource becomes unavailable the connectivity test will always fail. 
Either the text accompanying the test should advise of this possibility, or Tor clients should be advised of the location of the test resource in the main network directory listings.

Any number of misconfigurations may make the web service unreachable. It is the responsibility of the user's controller to recognize these and assist the user in eliminating them. Tor can mitigate the specific misconfiguration of routing HTTP traffic for 127.0.0.1 to Tor itself by serving such requests through the SOCKS port as well as the configured web service port.

Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping' them. It already inspects for raw IP addresses (to warn of DNS leaks) but maybe the behaviour proposed here is qualitatively different. Maybe this is an unwelcome precedent that can be used to beat the project over the head in future. Or maybe it's not such a bad thing: Tor is merely attempting to make normally invalid resource requests valid for a given purpose.
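The bookkeeping for the three test URLs can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the proposed implementation: the function names, the dictionary keys, the 12-byte identifier length, and the choice of URL-safe base64 are all hypothetical; the proposal only asks for a random, base64-encoded string per URL.

```python
import base64
import os

# Hypothetical labels for the three tests described in the proposal.
TEST_NAMES = ("dns", "proxy", "connectivity")


def new_session_ids():
    """Generate one fresh identifier per test URL (three in total).

    Assumption: 12 random bytes, URL-safe base64 (16 characters, no padding).
    """
    return {
        name: base64.urlsafe_b64encode(os.urandom(12)).decode("ascii")
        for name in TEST_NAMES
    }


def build_test_urls(ids, port=9999):
    """Build the three image URLs in the shapes given by the proposal."""
    return {
        "dns": f"http://{ids['dns']}:{port}/torlogo.jpg",
        "proxy": f"http://torproject.org/{ids['proxy']}.jpg",
        "connectivity": f"http://torproject.org/{ids['connectivity']}-torlogo.jpg",
    }


def strip_session_prefix(path, ids):
    """Return the real resource path for the connectivity test, or None.

    The stored identifier is matched as a whole prefix, so identifiers that
    happen to contain '-' are still handled correctly.
    """
    prefix = "/" + ids["connectivity"] + "-"
    if path.startswith(prefix):
        return "/" + path[len(prefix):]
    return None


if __name__ == "__main__":
    ids = new_session_ids()  # regenerated on every GET for index.htm
    for name, url in build_test_urls(ids).items():
        print(name, url)
```

Keeping only the most recent dictionary of identifiers matches the proposal's statement that Tor stores three identifiers in total.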
Filename: 133-unreachable-ors.txt
Title: Incorporate Unreachable ORs into the Tor Network
Author: Robert Hogan
Created: 2008-03-08
Status: Reserve

Overview:

Propose a scheme for harnessing the bandwidth of ORs who cannot currently participate in the Tor network because they can only make outbound TCP connections.

Motivation:

Restrictive local and remote firewalls are preventing many willing candidates from becoming ORs on the Tor network. These ORs have a casual interest in joining the network but their operator is not sufficiently motivated or adept to complete the necessary router or firewall configuration. The Tor network is losing out on their bandwidth. At the moment we don't even know how many such 'candidate' ORs there are.

Objective:

1. Establish how many ORs are unable to qualify for publication because they cannot establish that their ORPort is reachable.

2. Devise a method for making such ORs available to clients for circuit building without prejudicing their anonymity.

Proposal:

ORs whose ORPort reachability testing fails a specified number of consecutive times should:

1. Enlist themselves with the authorities setting a 'Fallback' flag. This flag indicates that the OR is up and running but cannot connect to itself.

2. Open an orconn with all ORs whose fingerprint begins with the same byte as their own. The management of this orconn will be transferred entirely to the OR at the other end.

3. Update their router status to contain the 'Running' flag once they have managed to open an orconn with 3/4 of the ORs with an FP beginning with the same byte as their own.

Tor ORs who are contacted by fallback ORs requesting an orconn should:

1. Accept the orconn until they have reached a defined limit of orconn connections with fallback ORs.

2. Accept such orconn requests only from listed fallback ORs whose FP begins with the same byte as their own.

Tor clients can include fallback ORs in the network by doing the following:

1.
When building a circuit, observe the fingerprint of each node they wish to connect to.

2. When randomly selecting a node from the set of all eligible nodes, add all published, running fallback nodes to the set where the first byte of the fingerprint matches the previous node in the circuit.

Anonymity Implications:

At least some, and possibly all, nodes on the network will have a set of nodes that only they and a few others can build circuits on.

1. This means that fallback ORs might be unsuitable for use as middleman nodes, because if the exit node is the attacker it knows that the number of nodes that could be the entry guard in the circuit is reduced to roughly 1/256th of the network, or worse 1/256th of all nodes listed as Guards. For the same reason, fallback nodes would appear to be unsuitable for two-hop circuits.

2. This is not a problem if fallback ORs are always exit nodes. If the fallback OR is an attacker it will not be able to reduce the set of possible nodes for the entry guard any further than a normal, published OR.

Possible Attacks/Open Issues:

1. Gaming Node Selection

Does running a fallback OR customized for a specific set of published ORs improve an attacker's chances of seeing traffic from that set of published ORs? Would such a strategy be any more effective than running published ORs with other 'attractive' properties?

2. DOS Attack

An attacker could prevent all other legitimate fallback ORs with a given byte-1 in their FP from functioning by running 20 or 30 fallback ORs and monopolizing all available fallback slots on the published ORs. This same attacker would then be in a position to monopolize all the traffic of the fallback ORs on that byte-1 network segment. I'm not sure what this would allow such an attacker to do.

3. Circuit-Sniffing

An observer watching exit traffic from a fallback server will know that the previous node in the circuit is one of a very small, identifiable subset of the total ORs in the network.
To establish the full path of the circuit they would only have to watch the exit traffic from the fallback OR and all the traffic from the 20 or 30 ORs it is likely to be connected to. This means it is substantially easier to establish all members of a circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e. 1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560 or so ORs on the network). The same mechanism that allows the client to expect a specific fallback OR to be available from a specific published OR allows an attacker to prepare his ground. Mitigant: In terms of the resources and access required to monitor 2000 to 3000 nodes, the effort of the adversary is not significantly diminished when he is only interested in 20 or 30. It is hard to see how an adversary who can obtain access to a randomly selected portion of the Tor network would face any new or qualitatively different obstacles in attempting to access much of the rest of it. Implementation Issues: The number of ORs this proposal would add to the Tor network is not known. This is because there is no mechanism at present for recording unsuccessful attempts to become an OR. If the proposal is considered promising it may be worthwhile to issue an alpha series release where candidate ORs post a primitive fallback descriptor to the authority directories. This fallback descriptor would not contain any other flag that would make it eligible for selection by clients. It would act solely as a means of sizing the number of Tor instances that try and fail to become ORs. The upper limit on the number of orconns from fallback ORs a normal, published OR should be willing to accept is an open question. Is one hundred, mostly idle, such orconns too onerous?
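The client-side selection rule and the fallback OR's 'Running' threshold from the Proposal section could look roughly like the Python sketch below. It assumes fingerprints are hex strings, so that the first two hex digits are the first byte; the function names and that representation are illustrative assumptions, not part of the proposal.

```python
def first_byte(fingerprint):
    """Return the first byte of a hex-encoded fingerprint as two hex digits."""
    return fingerprint[:2].upper()


def eligible_nodes(previous_fp, published, fallbacks):
    """Return the published nodes plus every running fallback node whose
    fingerprint shares its first byte with the previous node in the circuit."""
    wanted = first_byte(previous_fp)
    matching = [fp for fp in fallbacks if first_byte(fp) == wanted]
    return list(published) + matching


def may_claim_running(open_orconns, same_byte_ors):
    """True once a fallback OR holds orconns to at least 3/4 of the published
    ORs whose fingerprint begins with the same byte as its own."""
    if same_byte_ors <= 0:
        return False
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return open_orconns * 4 >= same_byte_ors * 3


if __name__ == "__main__":
    published = ["AB12", "CD34"]          # shortened, made-up fingerprints
    fallbacks = ["ABFF", "CD00", "ab01"]
    print(eligible_nodes("AB12", published, fallbacks))
    print(may_claim_running(3, 4))
```

With roughly 2560 published ORs, each first-byte bucket holds about ten of them, which is where the 1/256 figures in the Anonymity Implications section come from.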
Filename: 134-robust-voting.txt Title: More robust consensus voting with diverse authority sets Author: Peter Palfrader Created: 2008-04-01 Status: Rejected History: 2009 May 27: Added note on rejecting this proposal -- Nick Overview: A means to arrive at a valid directory consensus even when voters disagree on who is an authority. Motivation: Right now there are about five authoritative directory servers in the Tor network, tho this number is expected to rise to about 15 eventually. Adding a new authority requires synchronized action from all operators of directory authorities so that at any time during the update at least half of all authorities are running and agree on who is an authority. The latter requirement is there so that the authorities can arrive at a common consensus: Each authority builds the consensus based on the votes from all authorities it recognizes, and so a different set of recognized authorities will lead to a different consensus document. Objective: The modified voting procedure outlined in this proposal obsoletes the requirement for most authorities to exactly agree on the list of authorities. Proposal: The vote document each authority generates contains a list of authorities recognized by the generating authority. This will be a list of authority identity fingerprints. Authorities will accept votes from and serve/mirror votes also for authorities they do not recognize. (Votes contain the signing, authority key, and the certificate linking them so they can be verified even without knowing the authority beforehand.) Before building the consensus we will check which votes to use for building: 1) We build a directed graph of which authority/vote recognizes whom. 2) (Parts of the graph that aren't reachable, directly or indirectly, from any authorities we recognize can be discarded immediately.) 3) We find the largest fully connected subgraph. 
(Should there be more than one subgraph of the same size there needs to be some arbitrary ordering so we always pick the same one. E.g. pick the one who has the smaller (XOR of all votes' digests) or something.)

4) If we are part of that subgraph, great. This is the list of votes we build our consensus with.

5) If we are not part of that subgraph, remove all the nodes that are part of it and go to 3.

Using this procedure authorities that are updated to recognize a new authority will continue voting with the old group until a sufficient number has been updated to arrive at a consensus with the recently added authority.

In fact, the old set of authorities will probably be voting among themselves until all but one has been updated to recognize the new authority. Then which set of votes is used for consensus building depends on which of the two equally large sets gets ordered before the other in step (3) above.

It is necessary to continue with the process in (5) even if we are not in the largest subgraph. Otherwise one rogue authority could create a number of extra votes (by new authorities) so that everybody stops at 5 and no consensus is built, even tho it would be trusted by all clients.

Anonymity Implications:

The author does not believe this proposal to have anonymity implications.

Possible Attacks/Open Issues/Some thinking required:

Q: Can a number (less or exactly half) of the authorities cause an honest authority to vote for "their" consensus rather than the one that would result were all authorities taken into account?

Q: Can a set of votes from external authorities, i.e. of whom we trust either none or at least not all, cause us to change the set of consensus makers we pick?
A: Yes, if other authorities decide they would rather build a consensus with them then they'll be thrown out in step 3. But that's ok since those other authorities will never vote with us anyway. If we trust none of them then we throw them out even sooner, so no harm done.
Q: Can this ever force us to build a consensus with authorities we do not recognize?
A: No, we can never build a fully connected set with them in step 3.

------------------------------

I'm rejecting this proposal as insecure.

Suppose that we have a clique of size N, and M hostile members in the clique. If these hostile members stop declaring trust for up to M-1 good members of the clique, the clique with the hostile members in it will be larger than the one without them.

The M hostile members will constitute a majority of this new clique when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our requirement that an adversary must compromise a majority of authorities in order to control the consensus.

-- Nick
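For readers who want to experiment with the (rejected) vote-selection procedure, steps 1 to 5 can be sketched in Python as below. This is a brute-force sketch under stated assumptions: "fully connected" is read as mutual recognition between every pair, ties between equally large subgraphs are broken by lexicographic order of authority names rather than by the XOR of vote digests, and all function names are hypothetical.

```python
from itertools import combinations


def reachable_from(start, recognizes):
    """Step 2: collect every voting authority reachable from `start`.

    `recognizes` maps each authority that cast a vote to the set of
    authorities it recognizes; authorities without a vote are ignored.
    """
    seen, stack = set(), list(start)
    while stack:
        node = stack.pop()
        if node in seen or node not in recognizes:
            continue
        seen.add(node)
        stack.extend(recognizes[node])
    return seen


def is_clique(nodes, recognizes):
    """True if every pair in `nodes` recognizes each other."""
    return all(
        b in recognizes[a] and a in recognizes[b]
        for a, b in combinations(nodes, 2)
    )


def largest_clique(nodes, recognizes):
    """Step 3: largest fully connected subgraph, smallest names first on ties.

    Exponential in the number of authorities; acceptable only because the
    proposal expects about 15 authorities at most.
    """
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        for candidate in combinations(ordered, size):
            if is_clique(candidate, recognizes):
                return set(candidate)
    return set()


def votes_for_consensus(me, recognizes):
    """Steps 4 and 5: peel off cliques until one contains `me`."""
    remaining = reachable_from(recognizes[me] | {me}, recognizes)
    while remaining:
        clique = largest_clique(remaining, recognizes)
        if me in clique:
            return clique
        remaining -= clique
    return set()


if __name__ == "__main__":
    # A has been updated to recognize the new authority D; B and C have not.
    recognizes = {
        "A": {"A", "B", "C", "D"},
        "B": {"A", "B", "C"},
        "C": {"A", "B", "C"},
        "D": {"A", "B", "C", "D"},
    }
    print(sorted(votes_for_consensus("A", recognizes)))
```

In the example, A keeps voting with the old group because B and C do not yet recognize D, which is the transition behaviour the proposal describes. The same code can be used to reproduce the rejection argument by letting hostile members drop recognition of good ones.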
Filename: 135-private-tor-networks.txt Title: Simplify Configuration of Private Tor Networks Author: Karsten Loesing Created: 29-Apr-2008 Status: Closed Target: 0.2.1.x Implemented-In: 0.2.1.2-alpha Change history: 29-Apr-2008 Initial proposal for or-dev 19-May-2008 Included changes based on comments by Nick to or-dev and added a section for test cases. 18-Jun-2008 Changed testing-network-only configuration option names. Overview: Configuring a private Tor network has become a time-consuming and error-prone task with the introduction of the v3 directory protocol. In addition to that, operators of private Tor networks need to set an increasing number of non-trivial configuration options, and it is hard to keep FAQ entries describing this task up-to-date. In this proposal we (1) suggest to (optionally) accelerate timing of the v3 directory voting process and (2) introduce an umbrella config option specifically aimed at creating private Tor networks. Design: 1. Accelerate Timing of v3 Directory Voting Process Tor has reasonable defaults for setting up a large, Internet-scale network with comparably high latencies and possibly wrong server clocks. However, those defaults are bad when it comes to quickly setting up a private Tor network for testing, either on a single node or LAN (things might be different when creating a test network on PlanetLab or something). Some time constraints should be made configurable for private networks. The general idea is to accelerate everything that has to do with propagation of directory information, but nothing else, so that a private network is available as soon as possible. (As a possible safeguard, changing these configuration values could be made dependent on the umbrella configuration option introduced in 2.) 1.1. Initial Voting Schedule When a v3 directory does not know any consensus, it assumes an initial, hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and DistDelay of 5 minutes. 
This is important for multiple, simultaneously restarted directory authorities to meet at a common time and create an initial consensus. Unfortunately, this means that it may take up to half an hour (or even more) for a private Tor network to bootstrap. We propose to make these three time constants configurable (note that V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an effect on the _initial_ voting schedule, but only on the schedule that a directory authority votes for). This can be achieved by introducing three new configuration options: TestingV3AuthInitialVotingInterval, TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay. As first safeguards, Tor should only accept configuration values for TestingV3AuthInitialVotingInterval that divide evenly into the default value of 30 minutes. The effect is that even if people misconfigured their directory authorities, they would meet at the default values at the latest. The second safeguard is to allow configuration only when the umbrella configuration option TestingTorNetwork is set. 1.2. Immediately Provide Reachability Information (Running flag) The default behavior of a directory authority is to provide the Running flag only after the authority is available for at least 30 minutes. The rationale is that before that time, an authority simply cannot deliver useful information about other running nodes. But for private Tor networks this may be different. This is currently implemented in the code as: /** If we've been around for less than this amount of time, our * reachability information is not accurate. */ #define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60) There should be another configuration option TestingAuthDirTimeToLearnReachability with a default value of 30 minutes that can be changed when running testing Tor networks, e.g. to 0 minutes. The configuration value would simply replace the quoted constant. 
Again, changing this option could be safeguarded by requiring the umbrella configuration option TestingTorNetwork to be set. 1.3. Reduce Estimated Descriptor Propagation Time Tor currently assumes that it takes up to 10 minutes until router descriptors are propagated from the authorities to directory caches. This is not very useful for private Tor networks, and we want to be able to reduce this time, so that clients can download router descriptors in a timely manner. /** Clients don't download any descriptor this recent, since it will * probably not have propagated to enough caches. */ #define ESTIMATED_PROPAGATION_TIME (10*60) We suggest to introduce a new config option TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes, but that can be set to any lower non-negative value, e.g. 0 minutes. The same safeguards as in 1.2 could be used here, too. 2. Umbrella Option for Setting Up Private Tor Networks Setting up a private Tor network requires a number of specific settings that are not required or useful when running Tor in the public Tor network. Instead of writing down these options in a FAQ entry, there should be a single configuration option, e.g. TestingTorNetwork, that changes all required settings at once. Newer Tor versions would keep the set of configuration options up-to-date. It should still remain possible to manually overwrite the settings that the umbrella configuration option affects. The following configuration options are set by TestingTorNetwork: - ServerDNSAllowBrokenResolvConf 1 Ignore the situation that private relays are not aware of any name servers. - DirAllowPrivateAddresses 1 Allow router descriptors containing private IP addresses. - EnforceDistinctSubnets 0 Permit building circuits with relays in the same subnet. - AssumeReachable 1 Omit self-testing for reachability. - AuthDirMaxServersPerAddr 0 - AuthDirMaxServersPerAuthAddr 0 Permit an unlimited number of nodes on the same IP address. 
- ClientDNSRejectInternalAddresses 0 Believe in DNS responses resolving to private IP addresses. - ExitPolicyRejectPrivate 0 Allow exiting to private IP addresses. (This one is a matter of taste---it might be dangerous to make this a default in a private network, although people setting up private Tor networks should know what they are doing.) - V3AuthVotingInterval 5 minutes - V3AuthVoteDelay 20 seconds - V3AuthDistDelay 20 seconds Accelerate voting schedule after first consensus has been reached. - TestingV3AuthInitialVotingInterval 5 minutes - TestingV3AuthInitialVoteDelay 20 seconds - TestingV3AuthInitialDistDelay 20 seconds Accelerate initial voting schedule until first consensus is reached. - TestingAuthDirTimeToLearnReachability 0 minutes Consider routers as Running from the start of running an authority. - TestingEstimatedDescriptorPropagationTime 0 minutes Clients try downloading router descriptors from directory caches, even when they are not 10 minutes old. In addition to changing the defaults for these configuration options, TestingTorNetwork can only be set when a user has manually configured DirServer lines. Test: The implementation of this proposal must pass the following tests: 1. Set TestingTorNetwork and see if dependent configuration options are correctly changed. tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" telnet 127.0.0.1 9051 AUTHENTICATE GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability 250-TestingTorNetwork=1 250 TestingAuthDirTimeToLearnReachability=0 QUIT 2. Set TestingTorNetwork and a dependent configuration value to see if the provided value is used for the dependent option. tor DataDirectory . 
ControlPort 9051 TestingTorNetwork 1 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ TestingAuthDirTimeToLearnReachability 5 telnet 127.0.0.1 9051 AUTHENTICATE GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability 250-TestingTorNetwork=1 250 TestingAuthDirTimeToLearnReachability=5 QUIT 3. Start with TestingTorNetwork set and change a dependent configuration option later on. tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" telnet 127.0.0.1 9051 AUTHENTICATE SETCONF TestingAuthDirTimeToLearnReachability=5 GETCONF TestingAuthDirTimeToLearnReachability 250 TestingAuthDirTimeToLearnReachability=5 QUIT 4. Start with TestingTorNetwork set and a dependent configuration value, and reset that dependent configuration value. The result should be the testing-network specific default value. tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ TestingAuthDirTimeToLearnReachability 5 telnet 127.0.0.1 9051 AUTHENTICATE GETCONF TestingAuthDirTimeToLearnReachability 250 TestingAuthDirTimeToLearnReachability=5 RESETCONF TestingAuthDirTimeToLearnReachability GETCONF TestingAuthDirTimeToLearnReachability 250 TestingAuthDirTimeToLearnReachability=0 QUIT 5. Leave TestingTorNetwork unset and check if dependent configuration options are left unchanged. tor DataDirectory . ControlPort 9051 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" telnet 127.0.0.1 9051 AUTHENTICATE GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability 250-TestingTorNetwork=0 250 TestingAuthDirTimeToLearnReachability=1800 QUIT 6. Leave TestingTorNetwork unset, but set dependent configuration option which should fail. tor DataDirectory . 
ControlPort 9051 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ TestingAuthDirTimeToLearnReachability 0 [warn] Failed to parse/validate config: TestingAuthDirTimeToLearnReachability may only be changed in testing Tor networks! 7. Start with TestingTorNetwork unset and change dependent configuration option later on which should fail. tor DataDirectory . ControlPort 9051 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" telnet 127.0.0.1 9051 AUTHENTICATE SETCONF TestingAuthDirTimeToLearnReachability=0 513 Unacceptable option value: TestingAuthDirTimeToLearnReachability may only be changed in testing Tor networks! 8. Start with TestingTorNetwork unset and set it later on which should fail. tor DataDirectory . ControlPort 9051 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" telnet 127.0.0.1 9051 AUTHENTICATE SETCONF TestingTorNetwork=1 553 Transition not allowed: While Tor is running, changing TestingTorNetwork is not allowed. 9. Start with TestingTorNetwork set and unset it later on which should fail. tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" telnet 127.0.0.1 9051 AUTHENTICATE RESETCONF TestingTorNetwork 513 Unacceptable option value: TestingV3AuthInitialVotingInterval may only be changed in testing Tor networks! 10. Set TestingTorNetwork, but do not provide an alternate DirServer which should fail. tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 [warn] Failed to parse/validate config: TestingTorNetwork may only be configured in combination with a non-default set of DirServers.
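The two safeguards from section 1.1 (the interval must divide evenly into the 30-minute default, and it may only be changed when TestingTorNetwork is set) can be expressed as a small check. The Python sketch below is illustrative only: the function name is hypothetical, and the second error string is invented for the example; the first follows the wording of the warnings quoted in the tests above.

```python
# Hard-coded initial VotingInterval from the proposal, in seconds.
DEFAULT_INITIAL_VOTING_INTERVAL = 30 * 60


def check_initial_voting_interval(seconds, testing_tor_network):
    """Return None if the value is acceptable, otherwise an error string."""
    # Leaving the option at its default is always allowed.
    if seconds == DEFAULT_INITIAL_VOTING_INTERVAL:
        return None
    # Second safeguard: only testing networks may change the value.
    if not testing_tor_network:
        return ("TestingV3AuthInitialVotingInterval may only be changed "
                "in testing Tor networks!")
    # First safeguard: the value must divide evenly into 30 minutes, so that
    # misconfigured authorities still meet at the default schedule.
    if seconds <= 0 or DEFAULT_INITIAL_VOTING_INTERVAL % seconds != 0:
        # Hypothetical message, not taken from Tor.
        return ("TestingV3AuthInitialVotingInterval must divide evenly "
                "into 30 minutes.")
    return None


if __name__ == "__main__":
    for value, testing in [(300, True), (300, False), (420, True)]:
        print(value, testing, check_initial_voting_interval(value, testing))
```

A value of 5 minutes (300 seconds), as set by TestingTorNetwork, passes because 1800 is a multiple of 300; 7 minutes would be refused even in a testing network.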
Filename: 136-legacy-keys.txt
Title: Mass authority migration with legacy keys
Author: Nick Mathewson
Created: 13-May-2008
Status: Closed
Implemented-In: 0.2.0.x

Overview:

This document describes a mechanism to change the keys of more than half of the directory servers at once without breaking old clients and caches immediately.

Motivation:

If a single authority's identity key is believed to be compromised, the solution is obvious: remove that authority from the list, generate a new certificate, and treat the new cert as belonging to a new authority. This approach works fine so long as less than 1/2 of the authority identity keys are bad.

Unfortunately, the mass-compromise case is possible if there is a sufficiently bad bug in Tor or in any OS used by a majority of v3 authorities. Let's be prepared for it!

We could simply stop using the old keys and start using new ones, and tell all clients running insecure versions to upgrade. Unfortunately, this breaks our caching system pretty badly, since caches won't cache a consensus that they don't believe in. It would be nice to have everybody become secure the moment they upgrade to a version listing the new authority keys, _without_ breaking upgraded clients until the caches upgrade.

So, let's come up with a way to provide a time window where the consensuses are signed with the new keys and with the old.

Design:

We allow directory authorities to list a single "legacy key" fingerprint in their votes. Each authority may add a single legacy key. The format for this line is:

    legacy-dir-key FINGERPRINT

We describe a new consensus method for generating directory consensuses. This method is consensus method "3".

When the authorities decide to use method "3" (as described in 3.4.1 of dir-spec.txt), for every included vote with a legacy-dir-key line, the consensus includes an extra dir-source line. The fingerprint in this extra line is as in the legacy-dir-key line. The ports and addresses are as in the vote's own dir-source line.
The nickname is as in the dir-source line, with the string "-legacy" appended.

[We need to include this new dir-source line because the code won't accept or preserve signatures from authorities not listed as contributing to the consensus.]

Authorities using legacy dir keys include two signatures on their consensuses: one generated with a signing key signed with their real signing key, and another generated with a signing key signed with another signing key attested to by their identity key. These signing keys MUST be different. Authorities MUST serve both certificates if asked.

Process:

In the event of a mass key failure, we'll use the following (ugly) procedure:

- All affected authorities generate new certificates and identity keys, and circulate their new dirserver lines. They copy their old certificates and old broken keys, but put them in new "legacy key files".
- At the earliest time that can be arranged, the authorities replace their signing keys, identity keys, and certificates with the new uncompromised versions, and update to the new list of dirserver lines.
- They add a "V3DirAdvertiseLegacyKey 1" option to their torrc.
- Now, new consensuses will be generated using the new keys, but the results will also be signed with the old keys.
- Clients and caches are told they need to upgrade, and given a time window to do so.
- At the end of the time window, authorities remove the V3DirAdvertiseLegacyKey option.

Notes:

It might be good to get caches to cache consensuses that they do not believe in. I'm not sure of the best way to do this.

It's a superficially neat idea to have new signing keys and have them signed by the new and by the old authority identity keys. This breaks some code, though, and doesn't actually gain us anything, since we'd still need to include each signature twice.

It's also a superficially neat idea, if identity keys and signing keys are compromised, to at least replace all the signing keys.
I don't think this gains us anything either, though.
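
As an illustration of the Design section above, the following Python sketch shows how the extra dir-source entries could be derived under consensus method 3. The `Vote` and `DirSource` types and their field names are hypothetical and do not come from the Tor source; only the "-legacy" suffix and the reuse of the address and ports follow the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DirSource:
    """Hypothetical in-memory form of one dir-source entry."""
    nickname: str
    fingerprint: str
    address: str
    dir_port: int
    or_port: int


@dataclass(frozen=True)
class Vote:
    """Hypothetical summary of one authority's vote."""
    source: DirSource
    legacy_fingerprint: Optional[str] = None  # from a "legacy-dir-key" line


def consensus_dir_sources(votes: List[Vote]) -> List[DirSource]:
    """List the dir-source entries a method-3 consensus would carry."""
    entries = []
    for vote in votes:
        entries.append(vote.source)
        if vote.legacy_fingerprint is not None:
            src = vote.source
            # Same address and ports; legacy fingerprint; "-legacy" nickname.
            entries.append(DirSource(src.nickname + "-legacy",
                                     vote.legacy_fingerprint,
                                     src.address, src.dir_port, src.or_port))
    return entries
```

An authority that votes without a legacy-dir-key line contributes exactly one entry, so the output is unchanged for authorities that never migrated.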
Filename: 137-bootstrap-phases.txt Title: Keep controllers informed as Tor bootstraps Author: Roger Dingledine Created: 07-Jun-2008 Status: Closed Implemented-In: 0.2.1.x 1. Overview. Tor has many steps to bootstrapping directory information and initial circuits, but from the controller's perspective we just have a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with slow connections or with connectivity problems can wait a long time staring at the yellow onion, wondering if it will ever change color. This proposal describes a new client status event so Tor can give more details to the controller. Section 2 describes the changes to the controller protocol; Section 3 describes Tor's internal bootstrapping phases when everything is going correctly; Section 4 describes when Tor detects a problem and issues a bootstrap warning; Section 5 covers suggestions for how controllers should display the results. 2. Controller event syntax. The generic status event is: "650" SP StatusType SP StatusSeverity SP StatusAction [SP StatusArguments] CRLF So in this case we send 650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \ PROGRESS=num TAG=Keyword SUMMARY=String \ [WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword] The arguments MAY appear in any order. Controllers MUST accept unrecognized arguments. "Progress" gives a number between 0 and 100 for how far through the bootstrapping process we are. "Summary" is a string that can be displayed to the user to describe the *next* task that Tor will tackle, i.e., the task it is working on after sending the status event. "Tag" is an optional string that controllers can use to recognize bootstrap phases from Section 3, if they want to do something smarter than just blindly displaying the summary string. The severity describes whether this is a normal bootstrap phase (severity notice) or an indication of a bootstrapping problem (severity warn). 
If the severity is warn, the event should also include a "warning" argument string with any hints Tor has to offer about why it's having trouble bootstrapping, a "reason" string that lists one of the reasons allowed in the ORConn event, a "count" number that tells how many bootstrap problems there have been so far at this phase, and a "recommendation" keyword to indicate how the controller ought to react.

3. The bootstrap phases.

This section describes the various phases currently reported by Tor. Controllers should not assume that the percentages and tags listed here will continue to match up, or even that the tags will stay in the same order. Some phases might also be skipped (not reported) if the associated bootstrap step is already complete, or if the phase is no longer necessary. Only "starting" and "done" are guaranteed to exist in all future versions.

Current Tor versions enter these phases in order, monotonically; future Tors MAY revisit earlier stages.

Phase 0:
tag=starting summary="starting"

Tor starts out in this phase.

Phase 5:
tag=conn_dir summary="Connecting to directory mirror"

Tor sends this event as soon as Tor has chosen a directory mirror --- one of the authorities if bootstrapping for the first time or after a long downtime, or one of the relays listed in its cached directory information otherwise.

Tor will stay at this phase until it has successfully established a TCP connection with some directory mirror. Problems in this phase generally happen because Tor doesn't have a network connection, or because the local firewall is dropping SYN packets.

Phase 10:
tag=handshake_dir summary="Finishing handshake with directory mirror"

This event occurs when Tor establishes a TCP connection with a relay used as a directory mirror (or its https proxy if it's using one). Tor remains in this phase until the TLS handshake with the relay is finished.

Problems in this phase generally happen because the local firewall is doing more sophisticated MITM attacks on Tor, or doing packet-level keyword recognition of Tor's handshake.

Phase 15:
tag=onehop_create summary="Establishing one-hop circuit for dir info"

Once TLS is finished with a relay, Tor will send a CREATE_FAST cell to establish a one-hop circuit for retrieving directory information. It will remain in this phase until it receives the CREATED_FAST cell back, indicating that the circuit is ready.

Phase 20:
tag=requesting_status summary="Asking for networkstatus consensus"

Once we've finished our one-hop circuit, we will start a new stream for fetching the networkstatus consensus. We'll stay in this phase until we get the 'connected' relay cell back, indicating that we've established a directory connection.

Phase 25:
tag=loading_status summary="Loading networkstatus consensus"

Once we've established a directory connection, we will start fetching the networkstatus consensus document. This could take a while; this phase is a good opportunity for using the "progress" keyword to indicate partial progress.

This phase could stall if the directory mirror we picked doesn't have a copy of the networkstatus consensus so we have to ask another, or it does give us a copy but we don't find it valid.

Phase 40:
tag=loading_keys summary="Loading authority key certs"

Sometimes when we've finished loading the networkstatus consensus, we find that we don't have all the authority key certificates for the keys that signed the consensus. At that point we put the consensus we fetched on hold and fetch the keys so we can verify the signatures.

Phase 45:
tag=requesting_descriptors summary="Asking for relay descriptors"

Once we have a valid networkstatus consensus and we've checked all its signatures, we start asking for relay descriptors. We stay in this phase until we have received a 'connected' relay cell in response to a request for descriptors.

Phase 50:
tag=loading_descriptors summary="Loading relay descriptors"

We will ask for relay descriptors from several different locations, so this step will probably make up the bulk of the bootstrapping, especially for users with slow connections. We stay in this phase until we have descriptors for at least 1/4 of the usable relays listed in the networkstatus consensus. This phase is also a good opportunity to use the "progress" keyword to indicate partial steps.

Phase 80:
tag=conn_or summary="Connecting to entry guard"

Once we have a valid consensus and enough relay descriptors, we choose some entry guards and start trying to build some circuits. This step is similar to the "conn_dir" phase above; the only difference is the context.

If a Tor starts with enough recent cached directory information, its first bootstrap status event will be for the conn_or phase.

Phase 85:
tag=handshake_or summary="Finishing handshake with entry guard"

This phase is similar to the "handshake_dir" phase, but it gets reached if we finish a TCP connection to a Tor relay and we have already reached the "conn_or" phase. We'll stay in this phase until we complete a TLS handshake with a Tor relay.

Phase 90:
tag=circuit_create summary="Establishing circuits"

Once we've finished our TLS handshake with an entry guard, we will set about trying to make some 3-hop circuits in case we need them soon.

Phase 100:
tag=done summary="Done"

A full 3-hop circuit has been established. Tor is ready to handle application connections now.

4. Bootstrap problem events.

When an OR Conn fails, we send a "bootstrap problem" status event, which is like the standard bootstrap status event except with severity warn. We include the same progress, tag, and summary values as we would for a normal bootstrap event, but we also include "warning", "reason", "count", and "recommendation" key/value pairs.

The "reason" values are long-term-stable controller-facing tags to identify particular issues in a bootstrapping step.
The warning strings, on the other hand, are human-readable. Controllers SHOULD NOT rely on the format of any warning string. Currently the possible values for "recommendation" are either "ignore" or "warn" -- if ignore, the controller can accumulate the string in a pile of problems to show the user if the user asks; if warn, the controller should alert the user that Tor is pretty sure there's a bootstrapping problem. Currently Tor uses recommendation=ignore for the first nine bootstrap problem reports for a given phase, and then uses recommendation=warn for subsequent problems at that phase. Hopefully this is a good balance between tolerating occasional errors and reporting serious problems quickly. 5. Suggested controller behavior. Controllers should start out with a yellow onion or the equivalent ("starting"), and then watch for either a bootstrap status event (meaning the Tor they're using is sufficiently new to produce them, and they should load up the progress bar or whatever they plan to use to indicate progress) or a circuit_established status event (meaning bootstrapping is finished). In addition to a progress bar in the display, controllers should also have some way to indicate progress even when no controller window is open. For example, folks using Tor Browser Bundle in hostile Internet cafes don't want a big splashy screen up. One way to let the user keep informed of progress in a more subtle way is to change the task tray icon and/or tooltip string as more bootstrap events come in. Controllers should also have some mechanism to alert their user when bootstrapping problems are reported. Perhaps we should gather a set of help texts and the controller can send the user to the right anchor in a "bootstrapping problems" page in the controller's help subsystem? 6. Getting up to speed when the controller connects. There's a new "GETINFO /status/bootstrap-phase" option, which returns the most recent bootstrap phase status event sent. 
Specifically, it returns a string starting with either "NOTICE BOOTSTRAP ..." or "WARN BOOTSTRAP ...". Controllers should use this getinfo when they connect or attach to Tor to learn its current state.
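
To make the event syntax from Section 2 concrete, here is a minimal Python sketch of how a controller might parse a bootstrap status event. The function names are hypothetical, and the quoting rules (double-quoted values with backslash escapes) are an assumption about how String arguments are encoded. The parser keeps unrecognized arguments and accepts them in any order, as the proposal requires.

```python
import re

# KEY=value, where value is either a double-quoted string or a bare token.
_ARG = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_bootstrap_event(line):
    """Return (severity, args) for a BOOTSTRAP status event, else None."""
    parts = line.rstrip("\r\n").split(" ", 4)
    if len(parts) < 4:
        return None
    if parts[0] != "650" or parts[1] != "STATUS_CLIENT" or parts[3] != "BOOTSTRAP":
        return None
    args = {}
    for key, value in _ARG.findall(parts[4] if len(parts) > 4 else ""):
        if value.startswith('"'):
            # Strip the quotes and undo backslash escapes (assumed encoding).
            value = re.sub(r'\\(.)', r'\1', value[1:-1])
        args[key] = value  # unknown keys are kept, not rejected
    return parts[2], args


def should_alert_user(severity, args):
    """Alert only when Tor itself recommends warning the user."""
    return severity == "WARN" and args.get("RECOMMENDATION") == "warn"
```

A controller could feed each received "650" line to `parse_bootstrap_event`, update its progress bar from `int(args["PROGRESS"])`, and keep events with recommendation "ignore" in a log that the user can open on request.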
Filename: 138-remove-down-routers-from-consensus.txt
Title: Remove routers that are not Running from consensus documents
Author: Peter Palfrader
Created: 11-Jun-2008
Status: Closed
Implemented-In: 0.2.1.2-alpha

1. Overview.

Tor directory authorities hourly vote and agree on a consensus document which lists all the routers on the network together with some of their basic properties, such as whether a router is an exit node, whether it is stable, or whether it is a version 2 directory mirror.

One of the properties given with each router is the 'Running' flag. Clients do not use routers that are not listed as running.

This proposal suggests that routers without the Running flag not be listed at all.

2. Current status

At a typical bootstrap a client downloads a 140KB consensus, about 10KB of certificates to verify that consensus, and about 1.6MB of server descriptors, about 1/4 of which it requires before it will start building circuits.

Another proposal deals with how to get that huge 1.6MB fraction to effectively zero (by downloading only individual descriptors, on demand). If that is implemented successfully, the 140KB compressed consensus will be a large fraction of what a client needs to download in order to work.

About one third of the routers listed in a consensus are not running and will therefore never be used by clients who use this consensus. Not listing those routers will save about 30% to 40% in size.

3. Proposed change

Authority directory servers produce vote documents that include all the servers they know about, running or not, like they currently do. In addition these vote documents also state that the authority supports a new consensus forming method (method number 4).

If more than two thirds of votes that an authority has received claim they support method 4 then this new method will be used: The consensus document is formed like before but a new last step removes all routers from the listing that are not marked as Running.
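
The proposed change can be sketched in a few lines of Python. This is only an illustration under assumed data shapes: each vote is represented by the set of consensus methods it claims to support, and each router by a dict with a "flags" set. Neither representation comes from the Tor source.

```python
def method_4_in_use(supported_methods_per_vote):
    """True if more than two thirds of the received votes support method 4."""
    total = len(supported_methods_per_vote)
    supporting = sum(1 for methods in supported_methods_per_vote if 4 in methods)
    # Integer arithmetic avoids rounding trouble: supporting/total > 2/3.
    return supporting * 3 > total * 2


def drop_not_running(routers):
    """Final step of method 4: keep only routers marked Running."""
    return [r for r in routers if "Running" in r["flags"]]
```

Note that exactly two thirds is not enough: with six votes, four supporters leave the old method in place.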
Filename: 139-conditional-consensus-download.txt Title: Download consensus documents only when it will be trusted Author: Peter Palfrader Created: 2008-04-13 Status: Closed Implemented-In: 0.2.1.x Overview: Servers only provide consensus documents to clients when it is known that the client will trust it. Motivation: When clients[1] want a new network status consensus they request it from a Tor server using the URL path /tor/status-vote/current/consensus. Then after downloading the client checks if this consensus can be trusted. Whether the client trusts the consensus depends on the authorities that the client trusts and how many of those authorities signed the consensus document. If the client cannot trust the consensus document it is disregarded and a new download is tried at a later time. Several hundred kilobytes of server bandwidth were wasted by this single client's request. With hundreds of thousands of clients this will have undesirable consequences when the list of authorities has changed so much that a large number of established clients no longer can trust any consensus document formed. Objective: The objective of this proposal is to make clients not download consensuses they will not trust. Proposal: The list of authorities that are trusted by a client are encoded in the URL they send to the directory server when requesting a consensus document. The directory server then only sends back the consensus when more than half of the authorities listed in the request have signed the consensus. If it is known that the consensus will not be trusted a 404 error code is sent back to the client. This proposal does not require directory caches to keep more than one consensus document. This proposal also does not require authorities to verify the signature on the consensus document of authorities they do not recognize. 
The new URL scheme to download a consensus is /tor/status-vote/current/consensus/<F> where F is a list of fingerprints, sorted in ascending order, and concatenated using a + sign. Fingerprints are uppercase hexadecimal encodings of the authority identity key's digest. Servers should also accept requests that use lower case or mixed case hexadecimal encodings. A .z URL for compressed versions of the consensus will be provided similarly to existing resources and is the URL that usually should be used by clients. Migration: The old location of the consensus should continue to work indefinitely. Not only is it used by old clients, but it is a useful resource for automated tools that do not particularly care which authorities have signed the consensus. Authorities that are known to the client a priori by being shipped with the Tor code are assumed to handle this format. When downloading a consensus document from caches that do not support this new format they fall back to the old download location. Caches support the new format starting with Tor version 0.2.1.1-alpha. Anonymity Implications: By supplying the list of authorities a client trusts to the directory server we leak information (like likely version of Tor client) to the directory server. In the current system we also leak that we are very old - by re-downloading the consensus over and over again, but only when we are so old that we no longer can trust the consensus. Footnotes: 1. For the purpose of this proposal a client can be any Tor instance that downloads a consensus document. This includes relays, directory caches as well as end users.
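
The URL scheme and the server-side check described in this proposal can be sketched as follows. The function names are hypothetical, and how a server treats an empty fingerprint list is an assumption, since the proposal does not cover that case.

```python
def consensus_url(trusted_fingerprints, compressed=True):
    """Build the request path: uppercase hex, ascending order, joined by '+'."""
    fprs = sorted(f.upper() for f in trusted_fingerprints)
    url = "/tor/status-vote/current/consensus/" + "+".join(fprs)
    return url + ".z" if compressed else url


def should_serve(requested_fingerprints, signer_fingerprints):
    """Serve only if more than half of the requested authorities signed.

    Comparison is case-insensitive, since servers should accept lower-case
    or mixed-case fingerprints. An empty request list yields False here
    (an assumption). A False result corresponds to a 404 response.
    """
    requested = {f.upper() for f in requested_fingerprints}
    signers = {f.upper() for f in signer_fingerprints}
    return 2 * len(requested & signers) > len(requested)
```

Because the threshold is "more than half", a consensus signed by exactly half of the listed authorities is not served.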
Filename: 140-consensus-diffs.txt Title: Provide diffs between consensuses Author: Peter Palfrader Created: 13-Jun-2008 Implemented-In: 0.3.1.1-alpha Status: Closed Ticket: https://bugs.torproject.org/13339 0. History 22-May-2009: Restricted the ed format even more strictly for ease of implementation. -nickm 25-May-2014: Adapted to the new dir-spec version 3 and made the diff urls backwards-compatible. -mvdan 1-Mar-2017: Update to new stats, note newer proposals, note flavors, diffs, add parameters, restore diff-only URLs, say what "Digest" means. -nickm 3-May-2017: Add a notion of "digest-as-signed" vs "full digest", since otherwise the fact that there are multiple encodings of the same valid consensus signatures would make clients identify which encodings they had been given as they asked for diffs. 4-May-2017: Remove support for truncated digest prefixes. 1. Overview. Tor clients and servers need a list of which relays are on the network. This list, the consensus, is created by authorities hourly and clients fetch a copy of it, with some delay, hourly. This proposal suggests that clients download diffs of consensuses once they have a consensus instead of hourly downloading a full consensus. This does not only apply to ordinary directory consensuses, but to the newer microdescriptor consensuses added in the third version of the directory specification. 2. Numbers After implementing proposal 138, which removed nodes that are not running from the list, a consensus document was about 92 kilobytes in size after compression... back in 2008 when this proposal was first written. But now in March 2017, that figure is more like 625 kilobytes. The diff between two consecutive consensuses, in ed format, is on average 37 kilobytes compressed. So by making this change, we could save something like 94% of our consensus download bandwidth. 3. Proposal 3.0. Preliminaries. Unless otherwise specified, all digests in this document are SHA3-256 digests, encoded in base64. 
This document also uses "hash" as synonymous with "digest".

A "full digest" of a consensus document covers the entire document, from the "network-status-version" through the newline after the final "-----END SIGNATURE-----".

A "digest as signed" of a consensus document covers the same part that the signatures cover: the "network-status-version" through the space immediately after the "directory-signature" keyword on the first "directory-signature" line.

3.1 Clients

If a client has a consensus that is recent enough it SHOULD try to download a diff to get the latest consensus rather than fetching a full one.

[XXX: what is recent enough?

    time delta in hours / size of compressed diff
         1:  38177
         2:  66955
         3:  93502
         4: 118959
         5: 143450
         6: 167136
        12: 291354
        18: 404008
        24: 416663
        30: 431240
        36: 443858
        42: 454849
        48: 464677
        54: 476716
        60: 487755
        66: 497502
        72: 506421

Data suggests that for the first few hours, diffs are very useful, saving at least 50% for the first 12 hours. After that, returns seem to be more marginal. But note the savings from proposals like 274-276, which make diffs smaller over a much longer timeframe. ]

3.2 Servers

Directory authorities and servers need to keep a number of old consensus documents so they can build diffs. (See section 5 below.) They should offer a diff to the most recent consensus at the following request:

    HTTP/1.0 GET /tor/status-vote/current/consensus{-Flavor}/<FPRLIST>.z
    X-Or-Diff-From-Consensus: HASH1 HASH2...

where the hashes are the digests-as-signed of the consensuses the client currently has, and FPRLIST is a list of (abbreviated) fingerprints of authorities the client trusts.

Servers will only return a consensus if more than half of the requested authorities have signed the document. Otherwise, a 404 error will be sent back.

The advantage of using the same URL that is currently used for consensuses is that the client doesn't need to know whether a server supports consensus diffs.
If it doesn't, it will simply ignore the extra header and return the full consensus.

If a server cannot offer a diff from one of the consensuses identified by one of the hashes but has a current consensus it MUST return the full consensus.

[XXX: what should we do when the client already has the latest consensus? I can think of the following options:

- send back 3xx not modified
- send back 200 ok and an empty diff
- send back 404 nothing newer here.

I currently lean towards the empty diff.]

Additionally, a specific diff for a given consensus digest-as-signed should be available at a URL of the form:

    /tor/status-vote/current/consensus{-Flavor}/diff/<HASH>/<FPRLIST>.z

This differs from the previous request type in that it should never return a whole consensus: if a diff is not available, it should return 404.

4. Diff Format

Diffs start with the token "network-status-diff-version" followed by a space and the version number, currently "1".

If a document does not start with network-status-diff-version it is assumed to be a full consensus download and would therefore currently start with "network-status-version 3".

Following the network-status-diff-version line is another header line, starting with the token "hash" followed by the digest-as-signed of the consensus that this diff applies to, and the full digest that the resulting consensus should have.

Following these header lines is a diff, or patch, in limited ed format. We choose this format because it is easy to create and process with standard tools (patch, diff -e, ed). This will help us in developing and testing this proposal and it should make future debugging easier.

[ If at one point in the future we decide that the space benefits from a custom diff format outweigh these benefits we can always introduce a new diff format and offer it at for instance ../diff2/... ]

We support the following ed commands, each on a line by itself:

- "<n1>d" Delete line n1
- "<n1>,<n2>d" Delete lines n1 through n2, inclusive
- "<n1>,$d" Delete line n1 through the end of the file, inclusive.
- "<n1>c" Replace line n1 with the following block
- "<n1>,<n2>c" Replace lines n1 through n2, inclusive, with the following block.
- "<n1>a" Append the following block after line n1.
- "a" Append the following block after the current line.

Note that line numbers always apply to the file after all previous commands have already been applied. Note also that line numbers are 1-indexed.

The commands MUST apply to the file from back to front, such that lines are only ever referred to by their position in the original file.

If there are any directory signatures on the original document, the first command MUST be a "<n1>,$d" form to remove all of the directory signatures. Using this format ensures that the client will successfully apply the diff even if they have an unusual encoding for the signatures.

The "current line" is either the first line of the file, if this is the first command, the last line of a block we added in an append or change command, or the line immediately following a set of lines we just deleted (or the last line of the file if there are no lines after that).

The replace and append commands take blocks. These blocks are simply appended to the diff after the line with the command. A line with just a period (".") ends the block (and is not part of the lines to add). Note that it is impossible to insert a line with just a single dot.

4.1. Concatenating multiple diffs

Directory caches may, at their discretion, return the concatenation of multiple diffs using the format above. Such diffs are to be applied from first to last. This allows the caches to cache a smaller number of compressed diffs, at the expense of some loss in bandwidth efficiency.

5. Networkstatus parameters

The following parameters govern how relays and clients use this protocol.
    min-consensuses-age-to-cache-for-diff (min 0, max 744, default 6)
    max-consensuses-age-to-cache-for-diff (min 0, max 8192, default 72)

These two parameters determine how much consensus history (in hours) relays should try to cache in order to serve diffs.

    try-diff-for-consensus-newer-than (min 0, max 8192, default 72)

This parameter determines how old a consensus can be (in hours) before a client should no longer try to find a diff for it.
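
To illustrate the limited ed format from section 4, here is a minimal Python sketch that applies the numbered commands to a list of lines. It is a simplification under stated assumptions: the function name is hypothetical, the two header lines are expected to have been stripped and verified by the caller, the bare "a" form (which depends on the "current line") is not supported, and "0a" is accepted as in standard ed even though the proposal does not mention it.

```python
import re

# "<n1>d", "<n1>,<n2>d", "<n1>,$d", "<n1>c", "<n1>,<n2>c", "<n1>a"
_CMD = re.compile(r'(\d+)(?:,(\d+|\$))?([acd])')


def apply_ed_diff(lines, diff_lines):
    """Apply limited-ed commands to `lines` and return the new list of lines."""
    out = list(lines)
    it = iter(diff_lines)
    for cmd in it:
        m = _CMD.fullmatch(cmd)
        if m is None:
            raise ValueError("unsupported command: %r" % cmd)
        start, end_s, op = int(m.group(1)), m.group(2), m.group(3)
        if end_s is not None and (op == "a" or (end_s == "$" and op != "d")):
            raise ValueError("invalid range for command: %r" % cmd)
        if end_s == "$":
            end = len(out)
        else:
            end = int(end_s) if end_s else start
        block = []
        if op in "ac":
            # Collect the block; a line holding only "." terminates it.
            for text in it:
                if text == ".":
                    break
                block.append(text)
            else:
                raise ValueError("unterminated block after %r" % cmd)
        if op == "a":
            if not 0 <= start <= len(out):
                raise ValueError("line number out of range: %r" % cmd)
            out[start:start] = block          # insert after line `start`
        else:
            if not 1 <= start <= end <= len(out):
                raise ValueError("line range out of range: %r" % cmd)
            out[start - 1:end] = block        # empty block for "d"
    return out
```

Because the commands in a valid diff run from back to front, applying them one after another to the evolving list gives the same result as interpreting every line number against the original file. A real client would additionally check the digest-as-signed before applying the diff and the full digest afterwards.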
Filename: 141-jit-sd-downloads.txt Title: Download server descriptors on demand Author: Peter Palfrader Created: 15-Jun-2008 Status: Obsolete 1. Overview Downloading all server descriptors is the most expensive part of bootstrapping a Tor client. These server descriptors currently amount to about 1.5 Megabytes of data, and this size will grow linearly with network size. Fetching all these server descriptors takes a long while for people behind slow network connections. It is also a considerable load on our network of directory mirrors. This document describes proposed changes to the Tor network and directory protocol so that clients will no longer need to download all server descriptors. These changes consist of moving load balancing information into network status documents, implementing a means to download server descriptors on demand in an anonymity-preserving way, and dealing with exit node selection. 2. What is in a server descriptor When a Tor client starts the first thing it will try to get is a current network status document: a consensus signed by a majority of directory authorities. This document is currently about 100 Kilobytes in size, tho it will grow linearly with network size. This document lists all servers currently running on the network. The Tor client will then try to get a server descriptor for each of the running servers. All server descriptors currently amount to about 1.5 Megabytes of downloads. A Tor client learns several things about a server from its descriptor. Some of these it already learned from the network status document published by the authorities, but the server descriptor contains it again in a single statement signed by the server itself, not just by the directory authorities. Tor clients use the information from server descriptors for different purposes, which are considered in the following sections. 
#three ways: One, to determine if a server will be able to handle #this client's request; two, to actually communicate or use the server; #three, for load balancing decisions. # #These three points are considered in the following subsections. 2.1 Load balancing The Tor load balancing mechanism is quite complex in its details, but it has a simple goal: The more traffic a server can handle the more traffic it should get. That means the more traffic a server can handle the more likely a client will use it. For this purpose each server descriptor has bandwidth information which tries to convey a server's capacity to clients. Currently we weigh servers differently for different purposes. There is a weight for when we use a server as a guard node (our entry to the Tor network), there is one weight we assign servers for exit duties, and a third for when we need intermediate (middle) nodes. 2.2 Exit information When a Tor wants to exit to some resource on the internet it will build a circuit to an exit node that allows access to that resource's IP address and TCP Port. When building that circuit the client can make sure that the circuit ends at a server that will be able to fulfill the request because the client already learned of all the servers' exit policies from their descriptors. 2.3 Capability information Server descriptors contain information about the specific version of the Tor protocol they understand [proposal 105]. Furthermore the server descriptor also contains the exact version of the Tor software that the server is running and some decisions are made based on the server version number (for instance a Tor client will only make conditional consensus requests [proposal 139] when talking to Tor servers version 0.2.1.1-alpha or later). 2.4 Contact/key information A server descriptor lists a server's IP address and TCP ports on which it accepts onion and directory connections. 
Furthermore it contains the onion key (a short-lived RSA key to which clients encrypt CREATE cells).

2.5 Identity information

A Tor client learns the digest of a server's key from the network status document. Once it has a server descriptor this descriptor contains the full RSA identity key of the server. Clients verify that 1) the digest of the identity key matches the expected digest it got from the consensus, and 2) that the signature on the descriptor from that key is valid.

3. No longer require clients to have copies of all SDs

3.1 Load balancing info in consensus documents

One of the reasons why clients download all server descriptors is to do proper load balancing as described in 2.1. In order for clients to not require all server descriptors this information will have to move into the network status document.

Consensus documents will have a new line per router similar to the "r", "s", and "v" lines that already exist. This line will convey weight information to clients.

    "w Bandwidth=193"

The bandwidth number is the lesser of observed bandwidth and bandwidth rate limit from the server descriptor that the "r" line referenced by digest (1st and 3rd field of the bandwidth line in the descriptor). It is given in kilobytes per second so the byte value in the descriptor has to be divided by 1024 (and is then truncated, i.e. rounded down).

Authorities will cap the bandwidth number at some arbitrary value, currently 10MB/sec. If a router claims a larger bandwidth an authority's vote will still only show Bandwidth=10240.

The consensus value for bandwidth is the median of all bandwidth numbers given in votes. In case of an even number of votes we use the lower median. (Using this procedure allows us to change the cap value more easily.)

Clients should believe the bandwidth as presented in the consensus, not capping it again.

3.2 Fetching descriptors on demand

As described in 2.4 a descriptor lists IP address, OR- and Dir-Port, and the onion key for a server.
A client already knows the IP address and the ports from the consensus documents, but without the onion key it will not be able to send CREATE/EXTEND cells for that server. Since the client needs the onion key it needs the descriptor.

If a client only downloaded a few descriptors in an observable manner then that would leak which nodes it was going to use.

This proposal suggests the following:

1) when connecting to a guard node for which the client does not yet have a cached descriptor it requests the descriptor it expects by hash. (The consensus document that the client holds has a hash for the descriptor of this server. We want exactly that descriptor, not a different one.)

It does that by sending a RELAY_REQUEST_SD cell.

A client MAY cache the descriptor of the guard node so that it does not need to request it every single time it contacts the guard.

2) when a client wants to extend a circuit that currently ends in server B to a new next server C, the client will send a RELAY_REQUEST_SD cell to server B. This cell contains in its payload the hash of a server descriptor the client would like to obtain (C's server descriptor). The server sends back the descriptor and the client can now form a valid EXTEND/CREATE cell encrypted to C's onion key.

Clients MUST NOT cache such descriptors. If they did they might leak that they already extended to that server at least once before.

Replies to RELAY_REQUEST_SD requests need to be padded to some constant upper limit in order to conceal a client's destination from anybody who might be counting cells/bytes.

RELAY_REQUEST_SD cells contain the following information:

- hash of the server descriptor requested
- hash of the identity digest of the server for which we want the SD
- IP address and OR-port of the server for which we want the SD
- padding factor - the number of cells we want the answer padded to.
  [XXX this just occurred to me and it might be smart. or it might be stupid. clients would learn the padding factor they want to use from the consensus document. This allows us to grow the replies later on should SDs become larger.]

[XXX: figure out a decent padding size]

3.3 Protocol versions

Server descriptors contain optional information of supported link-level and circuit-level protocols in the form of "opt protocols Link 1 2 Circuit 1". These are not currently needed and will probably eventually move into the "v" (version) line in the consensus. This proposal does not deal with them.

Similarly a server descriptor contains the version number of a Tor node. This information is already present in the consensus and is thus available to all clients immediately.

3.4 Exit selection

Currently finding an appropriate exit node for a user's request is easy for a client because it has complete knowledge of all the exit policies of all servers on the network.

The consensus document will once again be extended to contain the information required by clients. This information will be a summary of each node's exit policy. The exit policy summary will only contain the list of ports to which a node exits to most destination IP addresses.

A summary should claim a router exits to a specific TCP port if, ignoring private IP addresses, the exit policy indicates that the router would exit to this port to most IP addresses. ("Most" here means everywhere except at most 2^25 addresses: either two /8 netblocks, or one /8 and a couple of /12s, or any other combination.)
The exact algorithm used is this: Going through all exit policy items
- ignore any accept that is not for all IP addresses ("*"),
- ignore rejects for these netblocks (exactly, no subnetting): 0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8, and 172.16.0.0/12,
- for each reject count the number of IP addresses rejected against the affected ports,
- once we hit an accept for all IP addresses ("*") add the ports in that policy item to the list of accepted ports, if they don't have more than 2^25 IP addresses (that's two /8 networks) counted against them (i.e. if the router exits to a port to everywhere but at most two /8 networks).

An exit policy summary will be included in votes and consensus as a new line attached to each exit node. The line will have the format

"p" <space> "accept"|"reject" <portlist>

where portlist is a comma separated list of single port numbers or port ranges (e.g. "22,80-88,1024-6000,6667").

Whether the summary shows the list of accepted ports or the list of rejected ports depends on which list is shorter (has a shorter string representation). In case of ties we choose the list of accepted ports. As an exception to this rule an allow-all policy is represented as "accept 1-65535" instead of "reject " and a reject-all policy is similarly given as "reject 1-65535".

Summary items are compressed, that is instead of "80-88,89-100" there only is a single item of "80-100", similarly instead of "20,21" a summary will say "20-21".

Port lists are sorted in ascending order.

The maximum allowed length of a policy summary (including the "accept " or "reject ") is 1000 characters. If a summary exceeds that length we use an accept-style summary and list as much of the port list as is possible within these 1000 bytes.

3.4.1 Consensus selection

When building a consensus, authorities have to agree on a digest of the server descriptor to list in the router line for each router. This is documented in dir-spec section 3.4.
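The summary rules of section 3.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authorities' implementation: the tuple layout of a policy item and the function names accepted_ports, compress and summary_line are hypothetical, only IPv4 is considered, and the 1000-character truncation rule is left out.

```python
import ipaddress

# Netblocks whose rejects are ignored (exact match only, as in the text).
PRIVATE_NETS = {"0.0.0.0/8", "169.254.0.0/16", "127.0.0.0/8",
                "192.168.0.0/16", "10.0.0.0/8", "172.16.0.0/12"}
MAX_REJECTED = 2 ** 25  # two /8 networks


def accepted_ports(policy):
    """policy: ordered list of (action, address, first_port, last_port).

    The tuple layout is an assumption made for this sketch.
    """
    rejected = [0] * 65536  # addresses rejected so far, indexed by port
    accepted = set()
    for action, addr, lo, hi in policy:
        if action == "accept":
            if addr != "*":
                continue  # ignore accepts that are not for all addresses
            for port in range(lo, hi + 1):
                if rejected[port] <= MAX_REJECTED:
                    accepted.add(port)
        else:
            if addr in PRIVATE_NETS:
                continue  # ignore rejects for the listed netblocks
            count = (2 ** 32 if addr == "*"
                     else ipaddress.ip_network(addr).num_addresses)
            for port in range(lo, hi + 1):
                rejected[port] += count
    return accepted


def compress(ports):
    """Sort ports and merge adjacent ones into ranges ("20,21" -> "20-21")."""
    ranges = []
    for port in sorted(ports):
        if ranges and port == ranges[-1][1] + 1:
            ranges[-1][1] = port
        else:
            ranges.append([port, port])
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def summary_line(policy):
    """Build the "p" line; ties go to the accept-style list."""
    acc = accepted_ports(policy)
    rej = set(range(1, 65536)) - acc
    if not rej:
        return "p accept 1-65535"
    if not acc:
        return "p reject 1-65535"
    a, r = compress(acc), compress(rej)
    return "p accept " + a if len(a) <= len(r) else "p reject " + r
```

For example, a policy that rejects port 25 everywhere, accepts ports 20-21 and 80-100 everywhere and rejects everything else yields "p accept 20-21,80-100".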
All authorities that listed that agreed upon descriptor digest in their vote should also list the same exit policy summary - or list none at all if the authority has not been upgraded to list that information in their vote.

If we have votes with a matching server descriptor digest of which at least one has an exit policy summary then we distinguish between two cases:
a) all authorities agree (or abstained) on the policy summary, and we use the exit policy summary that they all listed in their vote,
b) something went wrong (or some authority is playing foul) and we have different policy summaries. In that case we pick the one that is most commonly listed in votes with the matching descriptor. We break ties in favour of the lexicographically larger vote.

If none of the votes with a matching server descriptor digest has an exit policy summary we use the most commonly listed one in all votes, breaking ties like in case b above.

3.4.2 Client behaviour

When choosing an exit node for a specific request a Tor client will choose from the list of nodes that exit to the requested port as given by the consensus document. If a client has additional knowledge (like cached full descriptors) that indicates the so chosen exit node will reject the request then it MAY use that knowledge (or not include such nodes in the selection to begin with). However, clients MUST NOT use nodes that do not list the port as accepted in the summary (but for which they know that the node would exit to that address from other sources, like a cached descriptor).

An exception to this is exit enclave behaviour: A client MAY use the node at a specific IP address to exit to any port on the same address even if that node is not listed as exiting to the port in the summary.

4. Migration

4.1 Consensus document changes.

The consensus will need to include
- bandwidth information (see 3.1)
- exit policy summaries (3.4)

A new consensus method (number TBD) will be chosen for this.

5. Future possibilities

This proposal still requires that all servers have the descriptors of every other node in the network in order to answer RELAY_REQUEST_SD cells. These cells are sent when a circuit is extended from ending at node B to a new node C. In that case B would have to answer a RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).

In order to answer that request B obviously needs a copy of C's server descriptor. The RELAY_REQUEST_SD cell already has all the info that B needs to contact C so it can ask about the descriptor before passing it back to the client.
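The selection rule of section 3.4.1 (cases a and b, plus the fallback to all votes) can be sketched as follows. This is a hedged illustration: the function name pick_summary and the representation of a vote as a (digest, summary) pair are hypothetical, and "lexicographically larger vote" is read here as comparing the summary strings, which is an assumption.

```python
from collections import Counter


def pick_summary(votes, agreed_digest):
    """votes: list of (descriptor_digest, summary or None), one per authority.

    Returns the exit policy summary to put in the consensus, or None if
    no vote lists a summary at all.
    """
    # Prefer summaries from votes that list the agreed-upon digest.
    matching = [s for d, s in votes if d == agreed_digest and s is not None]
    # Otherwise fall back to the summaries listed in all votes.
    pool = matching or [s for _, s in votes if s is not None]
    if not pool:
        return None
    counts = Counter(pool)
    best = max(counts.values())
    # Most commonly listed summary; ties go to the larger string (assumed).
    return max(s for s, c in counts.items() if c == best)
```

Case a) needs no special handling: if all authorities with a matching digest list the same summary, that summary is trivially the most common one.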
Filename: 142-combine-intro-and-rend-points.txt
Title: Combine Introduction and Rendezvous Points
Author: Karsten Loesing, Christian Wilms
Created: 27-Jun-2008
Status: Dead

Change history:

27-Jun-2008 Initial proposal for or-dev
04-Jul-2008 Give first security property the new name "Responsibility" and change new cell formats according to rendezvous protocol version 3 draft.
19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of circuits between multiple clients is not supported by Tor.

Overview:

Establishing a connection to a hidden service currently involves two Tor relays, introduction and rendezvous point, and 10 more relays distributed over four circuits to connect to them. The introduction point is established in the mid-term by a hidden service to transfer introduction requests from client to the hidden service. The rendezvous point is set up by the client for a single hidden service request and actually transfers end-to-end encrypted application data between client and hidden service.

There are some reasons for separating the two roles of introduction and rendezvous point:

(1) Responsibility: A relay shall not be held responsible for relaying data for a certain hidden service; in the original design as described in [1] an introduction point relays no application data, and a rendezvous point neither knows the hidden service nor can it decrypt the data.

(2) Scalability: The hidden service shall not have to maintain a number of open circuits proportional to the expected number of client requests.

(3) Attack resistance: The effect of an attack on the only visible parts of a hidden service, its introduction points, shall be as small as possible.

However, elimination of a separate rendezvous connection as proposed by Øverlier and Syverson [2] is the most promising approach to improve the delay in connection establishment. Of all substeps of connection establishment, extending a circuit by only a single hop is responsible for a major part of the delay. Reducing on-demand circuit extensions from two to one results in a decrease of mean connection establishment times from 39 to 29 seconds [3]. Particularly, eliminating the delay on the hidden-service side allows the client to better observe progress of connection establishment, thus allowing it to use smaller timeouts.

Proposal 114 introduced new introduction keys for introduction points and provides for user authorization data in hidden service descriptors; it will be shown in this proposal that introduction keys in combination with new introduction cookies provide for the first security property, responsibility. Further, eliminating the need for a separate introduction connection benefits the overall network load by decreasing the number of circuit extensions. Finally, having only one connection between client and hidden service reduces the overall protocol complexity.

Design:

1. Hidden Service Configuration

Hidden services should be able to choose whether they would like to use this protocol. This might be opt-in for 0.2.1.x and opt-out for later major releases.

2. Contact Point Establishment

When preparing a hidden service, a Tor client selects a set of relays to act as contact points instead of introduction points. The contact point combines both roles of introduction and rendezvous point as proposed in [2]. The only requirement for a relay to be picked as contact point is its capability of performing this role. This can be determined from the Tor version number, which needs to be equal to or higher than the first version that implements this proposal.

The easiest way to implement establishment of contact points is to introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes version 2 ESTABLISH_INTRO cells as requests to establish a contact point rather than an introduction point.
    V      Format byte: set to 255               [1 octet]
    V      Version byte: set to 2                [1 octet]
    KLEN   Key length                            [2 octets]
    PK     Public introduction key               [KLEN octets]
    HS     Hash of session info                  [20 octets]
    SIG    Signature of above information        [variable]

The hidden service does not create a fixed number of contact points, like 3 in the current protocol. It uses a minimum of 3 contact points, but increases this number depending on the history of client requests within the last hour. The hidden service also increases this number depending on the frequency of failing contact points in order to defend against attacks on its contact points. When client authorization as described in proposal 121 is used, a hidden service can also use the number of authorized clients as first estimate for the required number of contact points.

3. Hidden Service Descriptor Creation

A hidden service needs to issue a fresh introduction cookie for each established introduction point. By requiring clients to use this cookie in a later connection establishment, an introduction point cannot access the hidden service that it works for. Together with the fresh introduction key that was introduced in proposal 114, this reduces responsibility of a contact point for a specific hidden service.

The v2 hidden service descriptor format contains an "intro-authentication" field that may contain introduction-point specific keys. The hidden service creates a random string, comparable to the rendezvous cookie, and includes it in the descriptor as introduction cookie for auth-type "1". By convention, clients recognize existence of auth-type 1 as possibility to connect to a hidden service via a contact point rather than an introduction point. Older clients that do not understand this new protocol simply ignore that cookie.

4. Connection Establishment

When establishing a connection to a hidden service a client learns about the capability of using the new protocol from the hidden service descriptor. It may choose whether to use this new protocol or not, whereas older clients cannot understand the new capability and can only use the current protocol. Clients using version 0.2.1.x should be able to opt in to using the new protocol, which should change to opt-out for later major releases.

When using the new capability the client creates a v2 INTRODUCE1 cell that extends an unversioned INTRODUCE1 cell by adding the content of an ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point, because unversioned and versioned INTRODUCE1 cells are indistinguishable:

    Cleartext
      V      Version byte: set to 2              [1 octet]
      PK_ID  Identifier for Bob's PK             [20 octets]
      RC     Rendezvous cookie                   [20 octets]
    Encrypted to introduction key:
      VER    Version byte: set to 3.             [1 octet]
      AUTHT  The auth type that is supported     [1 octet]
      AUTHL  Length of auth data                 [2 octets]
      AUTHD  Auth data                           [variable]
      RC     Rendezvous cookie                   [20 octets]
      g^x    Diffie-Hellman data, part 1         [128 octets]

The cleartext part contains the rendezvous cookie that the contact point remembers just as a rendezvous point would do.

The encrypted part contains the introduction cookie as auth data for the auth type 1. The rendezvous cookie is contained as before, but there is no further rendezvous point information, as there is no separate rendezvous point.

5. Rendezvous Establishment

The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a request to be used in the new protocol. It remembers the contained rendezvous cookie, replies to the client with an INTRODUCE_ACK cell (omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service.

6. Introduction at Hidden Service

The hidden service recognizes an INTRODUCE2 cell containing an introduction cookie as authorization data.
In this case, it does not extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell directly back to its contact point as usual.

7. Rendezvous at Contact Point

The contact point processes a RENDEZVOUS1 cell just as a rendezvous point does. The only difference is that the hidden-service-side circuit is not exclusive for the client connection, but shared among multiple client connections.

[Tor does not allow sharing of a single circuit among multiple client connections easily. We need to think about a smart and efficient way to implement this. Comment by Nick. -KL]

Security Implications:

(1) Responsibility

One of the original reasons for the separation of introduction and rendezvous points is that a relay shall not be held responsible for relaying data for a certain hidden service. In the current design an introduction point relays no application data and a rendezvous point neither knows the hidden service nor can it decrypt the data.

This property is also fulfilled in this new design. A contact point only learns a fresh introduction key instead of the hidden service key, so that it cannot recognize a hidden service. Further, the introduction cookie, which is unknown to the contact point, prevents it from accessing the hidden service itself. The only way for a contact point to access a hidden service is to look up whether it is contained in the descriptors of known hidden services. A contact point cannot directly be held responsible for the hidden service it is working for. In addition to that, it cannot learn the data that it transfers, because all communication between client and hidden service is end-to-end encrypted.

(2) Scalability

Another goal of the existing hidden service protocol is that a hidden service does not have to maintain a number of open circuits proportional to the expected number of client requests. The rationale behind this is better scalability.

The new protocol eliminates the need for a hidden service to extend circuits on demand, which has a positive effect on circuit establishment times and overall network load. The solution presented here, establishing a number of contact points proportional to the history of connection requests, reduces the number of circuits to the minimum that fits the hidden service's needs.

(3) Attack resistance

The third goal of separating introduction and rendezvous points is to limit the effect of an attack on the only visible parts of a hidden service, which are the contact points in this protocol.

In theory, the new protocol is more vulnerable to this attack. An attacker who can take down a contact point does not only eliminate an access point to the hidden service, but also breaks current client connections to the hidden service using that contact point.

Øverlier and Syverson proposed the concept of valet nodes as additional safeguard for introduction/contact points [4]. Unfortunately, this increases hidden service protocol complexity conceptually and from an implementation point of view. Therefore, it is not included in this proposal.

However, in practice attacking a contact point (or introduction point) is not as rewarding as it might appear. The cost for a hidden service to set up a new contact point and publish a new hidden service descriptor is minimal compared to the efforts necessary for an attacker to take a Tor relay down. As a countermeasure to further frustrate this attack, the hidden service raises the number of contact points as a function of previous contact point failures.

Further, the probability of breaking client connections due to attacking a contact point is minimal. It can be assumed that the probability of one of the other five involved relays in a hidden service connection failing or being shut down is higher than that of a successful attack on a contact point.

(4) Resistance against Locating Attacks

Clients are no longer able to force a hidden service to create or extend circuits. This further reduces an attacker's capabilities of locating a hidden server as described by Øverlier and Syverson [5].

Compatibility:

The presented protocol does not raise compatibility issues with current Tor versions. New relay versions support both the existing and the proposed protocol as introduction/rendezvous/contact points. A contact point acts as introduction point simultaneously. Hidden services and clients can opt in to use the new protocol, which might change to opt-out some time in the future.

References:

[1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The Second-Generation Onion Router. In the Proceedings of the 13th USENIX Security Symposium, August 2004.

[2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity of Tor Circuit Establishment and Hidden Services. In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies (PET 2007), Ottawa, Canada, June 2007.

[3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at Better Performance, diploma thesis, June 2008, University of Bamberg.

[4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden Servers with a Personal Touch. In the Proceedings of the Sixth Workshop on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006.

[5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.
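As an illustration of the v2 ESTABLISH_INTRO layout given in section 2 of the Design, the following Python sketch packs and unpacks the fields in the listed order. The function names are hypothetical, big-endian encoding of KLEN is assumed, and the signature is passed in as opaque bytes because this sketch does not cover how it is computed.

```python
import struct

FORMAT_BYTE = 255   # "V Format byte: set to 255"
VERSION_BYTE = 2    # "V Version byte: set to 2"
HASH_LEN = 20       # "HS Hash of session info [20 octets]"


def pack_establish_intro_v2(intro_key: bytes, session_hash: bytes,
                            signature: bytes) -> bytes:
    """Serialize the fields in the order listed in the cell format."""
    if len(session_hash) != HASH_LEN:
        raise ValueError("session hash must be 20 octets")
    # Two 1-octet fields followed by the 2-octet key length (big-endian
    # is an assumption of this sketch).
    header = struct.pack("!BBH", FORMAT_BYTE, VERSION_BYTE, len(intro_key))
    return header + intro_key + session_hash + signature


def unpack_establish_intro_v2(cell: bytes):
    """Split a cell body back into (intro_key, session_hash, signature)."""
    fmt, version, klen = struct.unpack("!BBH", cell[:4])
    if fmt != FORMAT_BYTE or version != VERSION_BYTE:
        raise ValueError("not a v2 ESTABLISH_INTRO cell")
    key_end = 4 + klen
    hash_end = key_end + HASH_LEN
    if len(cell) < hash_end:
        raise ValueError("cell too short")
    return cell[4:key_end], cell[key_end:hash_end], cell[hash_end:]
```

A relay following the convention described above would treat any cell that passes the version check in unpack_establish_intro_v2 as a request to establish a contact point rather than an introduction point.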
Filename: 143-distributed-storage-improvements.txt
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 28-Jun-2008
Status: Superseded

Change history:

28-Jun-2008 Initial proposal for or-dev

Overview:

An evaluation of the distributed storage for Tor hidden service descriptors and subsequent discussions have brought up a few improvements to proposal 114. All improvements are backwards compatible with the implementation of proposal 114.

Design:

1. Report Bad Directory Nodes

Bad hidden service directory nodes could deny existence of previously stored descriptors. A bad directory node that does this with all stored descriptors causes harm to the distributed storage in general, but replication will cope with this problem in most cases. However, an adversary that attempts to make a specific hidden service unavailable by running relays that become responsible for all of a service's descriptors poses a more serious threat. The distributed storage needs to defend against this attack by detecting and removing bad directory nodes.

As a countermeasure hidden services try to download their descriptors every hour at random times from the hidden service directories that are responsible for storing them. If a directory node replies with 404 (Not found), the hidden service reports the supposedly bad directory node to a random selection of half of the directory authorities (with version numbers equal to or higher than the first version that implements this proposal). The hidden service posts a complaint message using HTTP 'POST' to a URL "/tor/rendezvous/complain" with the following message format:

"hidden-service-directory-complaint" identifier NL

[At start, exactly once]

The identifier of the hidden service directory node to be investigated.

"rendezvous-service-descriptor" descriptor NL

[At end, exactly once]

The hidden service descriptor that the supposedly bad directory node does not serve.

The directory authority checks whether the descriptor is valid and whether the hidden service directory is responsible for storing it. It waits for a random time of up to 30 minutes before posting the descriptor to the hidden service directory. If the publication is acknowledged, the directory authority waits another random time of up to 30 minutes before attempting to request the descriptor that it has posted. If the directory node replies with 404 (Not found), it will be blacklisted for being a hidden service directory node for the next 48 hours.

A blacklisted hidden service directory is assigned the new flag BadHSDir instead of the HSDir flag in the vote that a directory authority creates. In a consensus a relay is only assigned a HSDir flag if the majority of votes contains a HSDir flag and no more than one third of votes contains a BadHSDir flag. As a result, clients do not have to learn about the BadHSDir flag. A blacklisted directory node will simply not be assigned the HSDir flag in the consensus.

In order to prevent an attacker from setting up new nodes as replacement for blacklisted directory nodes, all directory nodes in the same /24 subnet are blacklisted, too. Furthermore, if two or more directory nodes are blacklisted in the same /16 subnet concurrently, all other directory nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at most 48 hours.

2. Publish Fewer Replicas

The evaluation has shown that the probability of a directory node to serve a previously stored descriptor is 85.7% (more precisely, this is the 0.001-quantile of the empirical distribution with the rationale that it holds for 99.9% of all empirical cases). If descriptors are replicated to x directory nodes, the probability of at least one of the replicas to be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an overall availability of 99.9%, x = 3.55 replicas need to be stored.
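The figure of 3.55 follows from solving 1 - (1 - p)^x >= 0.999 for x, which gives x >= ln(1 - 0.999) / ln(1 - p). A short Python check, with hypothetical function names, reproduces it:

```python
import math


def availability(p_single: float, replicas: int) -> float:
    """Probability that at least one of `replicas` copies is served."""
    return 1 - (1 - p_single) ** replicas


def replicas_needed(p_single: float, target: float) -> float:
    """Smallest real x with 1 - (1 - p_single)^x >= target."""
    return math.log(1 - target) / math.log(1 - p_single)


x = replicas_needed(0.857, 0.999)
print(round(x, 2))    # 3.55
print(math.ceil(x))   # 4
```

With 3 replicas the availability is about 99.7%, just short of the target; with 4 replicas it is about 99.96%.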
From this follows that 4 replicas are sufficient, rather than the currently stored 6 replicas.

Further, the current design stores 2 sets of descriptors on 3 directory nodes with consecutive identities. Originally, this was meant to facilitate replication between directory nodes, which has not been and will not be implemented (the selection criterion of 24 hours uptime does not make it necessary). As a result, storing descriptors on directory nodes with consecutive identities is not required. In fact, it should be avoided, because it enables an attacker to create "black holes" in the identifier ring.

Hidden services should store their descriptors on 4 non-consecutive directory nodes, and clients should request descriptors from these directory nodes only. For compatibility reasons, hidden services also store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x clients will be able to retrieve 4 out of 6 descriptors, but will fail for the remaining 2 descriptors, which is sufficient for reliability. As soon as 0.2.0.x is deprecated, hidden services can stop publishing the additional 2 replicas.

3. Change Default Value of Being Hidden Service Directory

The requirements for becoming a hidden service directory node are an open directory port and an uptime of at least 24 hours. The evaluation has shown that there are 300 hidden service directory candidates in the mean, but only 6 of them are configured to act as hidden service directories. This is bad, because those 6 nodes need to serve a large share of all hidden service descriptors. Optimally, there should be hundreds of hidden service directories. Having a large number of 0.2.1.x directory nodes also has a positive effect on 0.2.0.x hidden services and clients.

Therefore, the new default of HidServDirectoryV2 should be 1, so that a Tor relay that has an open directory port automatically accepts and serves v2 hidden service descriptors. A relay operator can still opt out of running a hidden service directory by changing HidServDirectoryV2 to 0. The additional bandwidth requirements for running a hidden service directory node in addition to being a directory cache are negligible.

4. Make Descriptors Persistent on Directory Nodes

Hidden service directories that are restarted by their operators or after a failure will not be selected as hidden service directories within the next 24 hours. However, some clients might still think that these nodes are responsible for certain descriptors, because they work on the basis of network consensuses that are up to three hours old. The directory nodes should be able to serve the previously received descriptors to these clients. Therefore, directory nodes make all received descriptors persistent and load previously received descriptors on startup.

5. Store and Serve Descriptors Regardless of Responsibility

Currently, directory nodes only accept descriptors for which they think they are responsible. This may lead to problems when a directory node uses an older or newer network consensus than hidden service or client or when a directory node has been restarted recently. In fact, there are no security issues in storing or serving descriptors for which a directory node thinks it is not responsible. To the contrary, doing so may improve reliability in border cases. As a result, a directory node does not pay attention to responsibility when receiving a publication or fetch request, but stores or serves the requested descriptor. Likewise, the directory node does not remove descriptors when it thinks it is not responsible for them any more.

6. Avoid Periodic Descriptor Re-Publication

In the current implementation a hidden service re-publishes its descriptor either when its content changes or an hour elapses. However, the evaluation has shown that failures of hidden service directory nodes, i.e. of nodes that have not failed within the last 24 hours, are very rare. Together with making descriptors persistent on directory nodes, there is no necessity to re-publish descriptors hourly.

The only two events leading to descriptor re-publication should be a change of the descriptor content and a new directory node becoming responsible for the descriptor. Hidden services should therefore consider re-publication every time they learn about a new network consensus instead of hourly.

7. Discard Expired Descriptors

The current implementation lets directory nodes keep a descriptor for two days before discarding it. However, with the v2 design, descriptors are only valid for at most one day. Directory nodes should determine the validity of stored descriptors and discard them one hour after they have expired (to compensate for wrong clocks on clients).

8. Shorten Client-Side Descriptor Fetch History

When clients try to download a hidden service descriptor, they memorize fetch requests to directory nodes for up to 15 minutes. This allows them to request all replicas of a descriptor to avoid bad or failing directory nodes, but without querying the same directory node twice.

The downside is that a client that has requested a descriptor without success will not be able to find a hidden service that has been started during the 15 minutes following the client's last request.

This can be improved by shortening the fetch history to only 5 minutes. This time should be sufficient to complete requests for all replicas of a descriptor, but without ending in an infinite request loop.

Compatibility:

All proposed improvements are compatible with the currently implemented design as described in proposal 114.
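The voting and subnet rules from section 1 of the Design can be sketched as follows. This is an illustration only: the function names and data layout are hypothetical, only IPv4 is handled, and the /16 rule is assumed to count only nodes that failed the authority's own test (the text leaves open whether nodes blacklisted through the /24 rule count as well).

```python
import ipaddress
from collections import Counter


def consensus_hsdir(votes):
    """votes: one set of flag names per authority vote for a single relay.

    HSDir is assigned if a majority of votes contain HSDir and no more
    than one third of votes contain BadHSDir.
    """
    n = len(votes)
    hsdir = sum("HSDir" in v for v in votes)
    bad = sum("BadHSDir" in v for v in votes)
    return 2 * hsdir > n and 3 * bad <= n


def blacklisted(directly_bad, all_dirs):
    """Expand directly blacklisted nodes to their /24 (and maybe /16).

    Both arguments are iterables of IPv4 address strings.
    """
    bad = list(directly_bad)
    nets24 = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in bad}
    per16 = Counter(ipaddress.ip_network(f"{ip}/16", strict=False)
                    for ip in bad)
    # A /16 is blacklisted once two or more of its nodes are blacklisted.
    nets16 = {net for net, count in per16.items() if count >= 2}
    result = set()
    for ip in all_dirs:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in nets24 | nets16):
            result.add(ip)
    return result
```

Because clients only ever see the outcome of consensus_hsdir, they never need to know that the BadHSDir flag exists, which matches the text above.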
Filename: 144-enforce-distinct-providers.txt
Title: Increase the diversity of circuits by detecting nodes belonging to the same provider
Author: Mfr
Created: 2008-06-15
Status: Obsolete

Overview:

Increase network security by reducing the ability of relay operators or ISPs, acting on their own or under requisition, to monitor a large part of Tor traffic in an attempt to break the privacy of circuits. This is a way to increase the diversity of circuits without killing network performance.

Motivation:

Since Roger and Nick's 2004 publication about diversity [1], very fast Tor relays have become concentrated among half a dozen providers, which control the traffic of some dozens of routers [2].

In the same way, the spread of clonable VMs paid by the hour makes it possible to start, within a few minutes and at small cost, a set of very high-speed relays that can attract a large amount of traffic within a few hours. That traffic can then be analyzed, which increases the vulnerability of the network.

Whether they are ISPs or domU providers, these usually hold several class B IP ranges. The existing EnforceDistinctSubnets restriction, which automatically excludes relays in the same class B subnet, is therefore only partially effective. By contrast, a restriction at the class A level would be too restrictive.

Therefore it seems necessary to consider another approach.

Proposal:

Add a provider check based on the AS number. The router adds the AS number to its descriptor, the directory authorities verify it, and it is used like the declarative family field when creating circuits.

Design:

Step 1:

Add to the router descriptor provider information obtained by request [4] by the router itself.

"provider" name NL

'name' is the AS number of the router formatted like this: 'ASxxxxxx', where AS is fixed and xxxxxx is the AS number, left aligned (e.g. AS98304, AS4096, AS1). If the AS number is missing, the class A network number is used like this: 'ANxxx', where AN is fixed and xxx is the first 3 digits of the IP (e.g. AN1 for the IP 1.1.1.2). An 'L' value is set if it is a local network IP.

If two ORs list one another in their "provider" entries, then OPs should treat them as a single OR for the purpose of path selection.

For example, if node A's descriptor contains "provider B", and node B's descriptor contains "provider A", then node A and node B should never be used on the same circuit.

Add the corresponding torrc config option EnforceDistinctProviders, set to 1 by default. If it is set to 0, building circuits with relays in the same provider is permitted. With regard to proposal 135, if TestingTorNetwork is set, EnforceDistinctProviders needs to be unset.

Verification of the AS numbers by the directory authorities

The directory authorities check the AS number in each newly uploaded node descriptor. If the node runs an old version, this test is bypassed. If the AS number obtained by request differs from the one in the descriptor, the router is flagged as non-Valid by the testing authority for the voting process.

Step 2:

When a 'significant number of nodes' among valid routers are generating descriptors with provider information, add functionality for the circuit user to obtain missing provider information by DNS request:

While computing a circuit, the OP first applies the family check and the EnforceDistinctSubnets directive for performance. Then, if provider info is needed and missing from a router descriptor, it tries to get the AS provider info by DNS request [4]. This information could be cached by DNS. An AN (class A number) is never generated during this process, to prevent DNS blocking problems. If the DNS request fails, the OP ignores the failure and continues building the circuit.

Step 3:

When the 'whole majority' of valid Tor clients are providing DNS requests, older versions are deprecated and marked as non-Valid.

EnforceDistinctProviders replaces the EnforceDistinctSubnets functionality. EnforceDistinctSubnets is removed. The functionality deployed in step 2 is removed.

Security implications:

This provider measure will increase the number of provider addresses that an attacker must use in order to carry out traffic analysis.

Compatibility:

The presented protocol does not raise compatibility issues with current Tor versions. Compatibility is preserved by implementing this functionality in 3 steps, giving network users time to upgrade clients and routers.

Performance and scalability notes:

Changing provider for every router could reduce performance a little if the circuit becomes too long. During step 2, getting missing provider information could increase path-building time and should have a timeout.

Possible Attacks/Open Issues/Some thinking required:

This proposal seems to be compatible with proposal 135, Simplify Configuration of Private Tor Networks.

This proposal does not address owners of multiple ASes or traffic monitoring attacks by top-level providers [5].

Unresolved AS numbers are treated as a class A network. Perhaps they should be marked as invalid, but there were only five such items on the last check; see [2].

Need to define what's a 'significant number of nodes' and 'whole majority' ;-)

References:

[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger Dingledine. In the Proceedings of the Workshop on Privacy in the Electronic Society (WPES 2004), Washington, DC, USA, October 2004 http://freehaven.net/anonbib/#feamster:wpes2004

[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt

[3] see Goodell Tor Exit Page http://cassandra.eecs.harvard.edu/cgi-bin/exit.py

[4] see the great IP to ASN DNS Tool http://www.team-cymru.org/Services/ip-to-asn.html

[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by Steven J. Murdoch and Piotr Zielinski. In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies (PET 2007), Ottawa, Canada, June 2007. http://freehaven.net/anonbib/#murdoch-pet2007

[6] http://bugs.noreply.org/flyspray/index.php?do=details&id=690
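The "provider" value described in Step 1 of the Design, and the path restriction it is meant to support, could look roughly like this in Python. The sketch rests on several assumptions: the function names are hypothetical, only IPv4 is handled, "local network IP" is approximated by the private, loopback and link-local checks of Python's ipaddress module, and the path check simply requires all provider values on a circuit to differ.

```python
import ipaddress


def provider_tag(ip: str, asn=None) -> str:
    """Build the value of the "provider" descriptor line for a router.

    asn is the AS number if it could be resolved, else None.
    """
    addr = ipaddress.IPv4Address(ip)
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        return "L"                      # local network IP
    if asn is not None:
        return f"AS{asn}"               # e.g. AS98304, AS4096, AS1
    return f"AN{int(addr) >> 24}"       # class A number, e.g. AN1


def path_allowed(tags, enforce_distinct_providers=True) -> bool:
    """True if no two relays on the path share a provider value.

    enforce_distinct_providers mirrors the proposed torrc option
    EnforceDistinctProviders (1 by default).
    """
    if not enforce_distinct_providers:
        return True
    return len(set(tags)) == len(tags)
```

How relays tagged 'L' should be treated during path selection is not specified in the text; this sketch treats 'L' like any other provider value.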
Filename: 145-newguard-flag.txt
Title: Separate "suitable as a guard" from "suitable as a new guard"
Author: Nick Mathewson
Created: 1-Jul-2008
Status: Superseded

[This could be obsoleted by proposal 141, which could replace NewGuard with a Guard weight.]

[This _is_ superseded by 236, which adds guard weights for real.]

Overview

Right now, Tor has one flag that clients use both to tell which nodes should be kept as guards, and which nodes should be picked when choosing new guards. This proposal separates this flag into two.

Motivation

Balancing clients among guards is not done well by our current algorithm. When a new guard appears, it is chosen by clients looking for a new guard with the same probability as all existing guards... but new guards are likelier to be under capacity, whereas old guards are likelier to be under more use.

Implementation

We add a new flag, NewGuard. Clients will change so that when they are choosing new guards, they only consider nodes with the NewGuard flag set.

For now, authorities will always set NewGuard if they are setting the Guard flag. Later, it will be easy to migrate authorities to set NewGuard for underused guards.

Alternatives

We might instead have authorities list weights with which nodes should be picked as guards.
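A small sketch of the two halves of this design, under the assumption that a consensus can be modelled as a mapping from router nickname to a set of flag names. The helper names are hypothetical and are not taken from Tor.

```python
from typing import Dict, List, Set


def assign_guard_flags(base_flags: Set[str], suitable_as_guard: bool) -> Set[str]:
    """Authority side, initial rule: NewGuard is set whenever Guard is set."""
    flags = set(base_flags)
    if suitable_as_guard:
        flags |= {"Guard", "NewGuard"}
    return flags


def new_guard_candidates(consensus: Dict[str, Set[str]]) -> List[str]:
    """Client side: only nodes with NewGuard are considered for new guards."""
    return sorted(name for name, flags in consensus.items()
                  if "NewGuard" in flags)
```

A node that carries Guard without NewGuard, as could happen after the later migration mentioned above, would be kept by clients that already use it but would not be picked by clients choosing a new guard.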
Filename: 146-long-term-stability.txt Title: Add new flag to reflect long-term stability Author: Nick Mathewson Created: 19-Jun-2008 Status: Superseded Superseded-by: 206 Status: The applications of this design are achieved by proposal 206 instead. Instead of having the authorities track long-term stability for nodes that might be useful as directories in a fallback consensus, we eliminated the idea of a fallback consensus, and just have a DirSource configuration option. (Nov 2013) Overview This document proposes a new flag to indicate that a router has existed at the same address for a long time, describes how to implement it, and explains what it's good for. Motivation Tor has had three notions of "stability" for servers. Older directory protocols based a server's stability on its (self-reported) uptime: a server that had been running for a day was more stable than a server that had been running for five minutes, regardless of their past history. Current directory protocols track weighted mean time between failure (WMTBF) and weighted fractional uptime (WFU). WFU is computed as the fraction of time for which the server is running, with measurements weighted to exponentially decay such that old days count less. WMTBF is computed as the average length of intervals for which the server runs between downtime, with old intervals weighted to count less. WMTBF is useful in answering the question: "If a server is running now, how long is it likely to stay running?" This makes it a good choice for picking servers for streams that need to be long-lived. WFU is useful in answering the question: "If I try connecting to this server at an arbitrary time, is it likely to be running?" This makes it an important factor for picking guard nodes, since we want guard nodes to be usually-up. There are other questions that clients want to answer, however, for which the current flags aren't very useful. 
The one that this proposal addresses is, "If I found this server in an old consensus, is it likely to still be running at the same address?" This one is useful when we're trying to find directory mirrors in a fallback-consensus file. This property is equivalent to, "If I find this server in a current consensus, how long is it likely to exist on the network?" This one is useful if we're trying to pick introduction points or something and care more about churn rate than about whether every IP will be up all the time. Implementation: I propose we add a new flag, called "Longterm." Authorities should set this flag for routers if their Longevity is in the upper quartile of all routers. A router's Longevity is computed as the total amount of days in the last year or so[*] for which the router has been Running at least once at its current IP:orport pair. Clients should use directory servers from a fallback-consensus only if they have the Longterm flag set. Authority ops should be able to mark particular routers as not Longterm, regardless of history. (For instance, it makes sense to remove the Longterm flag from a router whose op says that it will need to shutdown in a month.) [*] This is deliberately vague, to permit efficient implementations. Compatibility and migration issues: The voting protocol already acts gracefully when new flags are added, so no change to the voting protocol is needed. Tor won't have collected this data, however. It might be desirable to bootstrap it from historical consensuses. Alternatively, we can just let the algorithm run for a month or two. Issues and future possibilities: Longterm is a really awkward name.
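The Longevity measure and the upper-quartile rule might be computed roughly as in the sketch below. The proposal leaves the window deliberately vague, so the 365-day window, the index-based quartile threshold, and all function names are assumptions made for this example only.

```python
from datetime import date, timedelta
from typing import AbstractSet, Dict, Set


def longevity(running_days: AbstractSet[date], today: date,
              window_days: int = 365) -> int:
    """Count the days in the window on which the router was seen Running
    at least once at its current IP:orport pair."""
    cutoff = today - timedelta(days=window_days)
    return sum(1 for d in running_days if cutoff < d <= today)


def longterm_routers(longevities: Dict[str, int],
                     not_longterm: AbstractSet[str] = frozenset()) -> Set[str]:
    """Routers whose Longevity is in the upper quartile, minus any that an
    authority operator has marked as not Longterm."""
    if not longevities:
        return set()
    ordered = sorted(longevities.values())
    # Assumed convention: the threshold is the value three quarters of the
    # way through the sorted list; ties at the threshold all get the flag.
    threshold = ordered[(3 * len(ordered)) // 4]
    return {name for name, value in longevities.items()
            if value >= threshold and name not in not_longterm}
```

The `not_longterm` argument models the operator override described above, for example for a router whose operator has announced a shutdown.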
Filename: 147-prevoting-opinions.txt Title: Eliminate the need for v2 directories in generating v3 directories Author: Nick Mathewson Created: 2-Jul-2008 Status: Rejected Target: 0.2.4.x Overview We propose a new v3 vote document type to replace the role of v2 networkstatus information in generating v3 consensuses. Motivation When authorities vote on which descriptors are to be listed in the next consensus, it helps if they all know about the same descriptors as one another. But a hostile, confused, or out-of-date server may upload a descriptor to only some authorities. In the current v3 directory design, the authorities don't have a good way to tell one another about the new descriptor until they exchange votes... but by the time this happens, they are already committed to their votes, and they can't add anybody they learn about from other authorities until the next voting cycle. That's no good! The current Tor implementation avoids this problem by having authorities also look at v2 networkstatus documents, but we'd like in the long term to eliminate these, once 0.1.2.x is obsolete. Design: We add a new value for vote-status in v3 consensus documents in addition to "consensus" and "vote": "opinion". Authorities generate and sign an opinion document as if they were generating a vote, except that they generate opinions earlier than they generate votes. [This proposal doesn't say what lines must be contained in opinion documents. It seems that an authority that parses an opinion document is only interested in a) relay fingerprint, b) descriptor publication time, and c) descriptor digest; unless there's more information that helps authorities decide whether "they might accept" a descriptor. If not, opinion documents only need to contain a small subset of headers and all the "r" lines that would be contained in a later vote. -KL] [This seems okay. It would however mean that we can't use the same parsing logic as we use for regular votes. 
-NM]

[Authorities should use the same "valid-after", "fresh-until", and "valid-until" lines in opinion documents as they are going to use in their next vote. -KL]

[Maybe these lines should just get ignored on opinions. Or omitted. -NM]

Authorities don't need to generate more than one opinion document per voting interval, but may. They should send it to the other authorities they know about, at http://<hostname>/tor/post/opinion , before the authorities begin voting, so that enough time remains for the authorities to fetch new descriptors.

Additionally, authorities make their opinions available at http://<hostname>/tor/status-vote/next/opinion.z and download opinions from authorities they haven't heard from in a while.

Authorities SHOULD send their opinion document to all other authorities OpinionSeconds seconds before voting and request missing opinion documents OpinionSeconds/2 seconds before voting. OpinionSeconds SHOULD be defined as part of "voting-delay" lines and otherwise default to the same number of seconds as VoteSeconds.

Authorities MAY generate opinions on demand.

Upon receiving an opinion document, authorities scan it for any descriptors that:

- They might accept.
- Are for routers they don't know about, or are published more recently than any descriptor they have for that router.

Authorities then begin downloading such descriptors from authorities that claim to have them.

Authorities also download corresponding extra-info descriptors for any router descriptor they learned from parsing an opinion document.

Authorities MAY cache opinion documents, but don't need to.

Reasons for rejection:

1. Authorities learn about new relays from each other's vote documents. See git commits 2e692bd8 and eaf5487d, which went into 0.2.2.12-alpha:

o Major bugfixes:
- Many relays have been falling out of the consensus lately because not enough authorities know about their descriptor for them to get a majority of votes.
When we deprecated the v2 directory protocol, we got rid of the only way that v3 authorities can hear from each other about other descriptors. Now authorities examine every v3 vote for new descriptors, and fetch them from that authority. Bugfix on 0.2.1.23. 2. Authorities don't serve version 2 statuses anymore. Since January 2013, there was only a single version 3 directory authority left that served version 2 statuses: dizum. moria1 and tor26 have been rejecting version 2 requests for a long time, and it was mostly an oversight that dizum still served them. As of January 2014, dizum does not serve version 2 statuses anymore. The other six authorities have never generated version 2 statuses for others to be used as pre-voting opinions. 3. Vote documents indicate that pre-voting opinions wouldn't help much. From January 1 to 7, 2014, only 0.4 relays on average were not included in a consensus because they were listed in less than 5 votes. These 0.4 relays could probably have been included with pre-voting opinions. (Here's how to find out: extract the votes-2014-01.tar.bz2 tarball, run `grep -R "^r " 0[1-7] | cut -c 4-22,112- | cut -d" " -f1,3 | sort | uniq -c | sort | grep " [1-4] " | wc -l`, result is 63, divide by 7*24 published consensuses, obtain 0.375 as end result.)
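The arithmetic behind the 0.375 figure can be checked directly. The variable names are only labels for the numbers quoted in the text above.

```python
# 63 relay/consensus cases were listed in fewer than 5 votes (the grep result).
missed_cases = 63
# One consensus per hour over the seven days from January 1 to 7, 2014.
consensuses = 7 * 24

average_missed_per_consensus = missed_cases / consensuses
print(average_missed_per_consensus)  # 0.375, rounded to 0.4 in the text
```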
Filename: 148-uniform-client-end-reason.txt
Title: Stream end reasons from the client side should be uniform
Author: Roger Dingledine
Created: 2-Jul-2008
Status: Closed
Implemented-In: 0.2.1.9-alpha

Overview

When a stream closes before it's finished, the end relay cell that's sent includes an "end stream reason" to tell the other end why it closed. It's useful for the exit relay to send a reason to the client, so the client can choose a different circuit, inform the user, etc. But there's no reason to include it from the client to the exit relay, and in some cases it can even harm anonymity.

We should pick a single reason for the client-to-exit-relay direction and always just send that.

Motivation

Back when I first deployed the Tor network, it was useful to have the Tor relays learn why a stream closed, so I could debug both ends of the stream at once. Now that streams have worked for many years, there's no need to continue telling the exit relay whether the client gave up on a stream because of "timeout" or "misc" or what.

Then in Tor 0.2.0.28-rc, I fixed this bug:

- Fix a bug where, when we were choosing the 'end stream reason' to put in our relay end cell that we send to the exit relay, Tor clients on Windows were sometimes sending the wrong 'reason'. The anonymity problem is that exit relays may be able to guess whether the client is running Windows, thus helping partition the anonymity set. Down the road we should stop sending reasons to exit relays, or otherwise prevent future versions of this bug.

It turned out that non-Windows clients were choosing their reason correctly, whereas Windows clients were potentially looking at errno wrong and so always choosing 'misc'.

I fixed that particular bug, but I think we should prevent future versions of the bug too.

(We already fixed it so *circuit* end reasons don't get sent from the client to the exit relay. But we appear to have skipped over stream end reasons thus far.)
Design: One option would be to no longer include any 'reason' field in end relay cells. But that would introduce a partitioning attack ("users running the old version" vs "users running the new version"). Instead I suggest that clients all switch to sending the "misc" reason, like most of the Windows clients currently do and like the non-Windows clients already do sometimes.
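The suggested behaviour amounts to a one-line rule on the sending side. The sketch below is illustrative only: the function name is hypothetical, and the numeric reason codes (1 for "misc", 7 for "timeout") are quoted from memory of tor-spec and should be treated as assumptions.

```python
# Assumed values of the end stream reason codes; see tor-spec for the
# authoritative list.
END_STREAM_REASON_MISC = 1
END_STREAM_REASON_TIMEOUT = 7


def end_reason_to_send(local_reason: int, is_client: bool) -> int:
    """Pick the reason to place in an outgoing end relay cell.

    Clients always send "misc" toward the exit relay, so the cell reveals
    nothing about the client's platform or why it gave up. Exit relays keep
    sending the real reason, which the client can act on.
    """
    return END_STREAM_REASON_MISC if is_client else local_reason
```

Keeping the reason field in the cell, with a fixed value, is what avoids the old-version versus new-version partitioning mentioned above.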
Filename: 149-using-netinfo-data.txt Title: Using data from NETINFO cells Author: Nick Mathewson Created: 2-Jul-2008 Status: Superseded Target: 0.2.1.x [Partially done: we do the anti-MITM part. Not entirely done: we don't do the time part.] Overview Current Tor versions send signed IP and timestamp information in NETINFO cells, but don't use them to their fullest. This proposal describes how they should start using this info in 0.2.1.x. Motivation Our directory system relies on clients and routers having reasonably accurate clocks to detect replayed directory info, and to set accurate timestamps on directory info they publish themselves. NETINFO cells contain timestamps. Also, the directory system relies on routers having a reasonable idea of their own IP addresses, so they can publish correct descriptors. This is also in NETINFO cells. Learning the time and IP address We need to think about attackers here. Just because a router tells us that we have a given IP or a given clock skew doesn't mean that it's true. We believe this information only if we've heard it from a majority of the routers we've connected to recently, including at least 3 routers. Routers only believe this information if the majority includes at least one authority. Avoiding MITM attacks Current Tors use the IP addresses published in the other router's NETINFO cells to see whether the connection is "canonical". Right now, we prefer to extend circuits over "canonical" connections. In 0.2.1.x, we should refuse to extend circuits over non-canonical connections without first trying to build a canonical one.
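One possible reading of the belief rule in "Learning the time and IP address" is sketched below. The proposal's wording leaves some room for interpretation; this example assumes that the agreeing routers must form a strict majority of recently contacted routers and number at least 3, and that a router additionally requires an authority among them. The report format and function name are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple

MIN_AGREEING_ROUTERS = 3

# Each report is (claimed value, whether the reporter is an authority).
Report = Tuple[str, bool]


def believed_value(reports: List[Report],
                   we_are_router: bool = False) -> Optional[str]:
    """Return the value we are willing to believe, or None."""
    if not reports:
        return None
    counts = Counter(value for value, _ in reports)
    value, agreeing = counts.most_common(1)[0]
    if agreeing * 2 <= len(reports):
        return None  # no strict majority
    if agreeing < MIN_AGREEING_ROUTERS:
        return None  # majority too small to trust
    if we_are_router and not any(is_authority
                                 for v, is_authority in reports if v == value):
        return None  # routers need an authority in the majority
    return value
```

The same function could be applied to reported clock skews after bucketing them into a discrete value, which is a further assumption not covered by the proposal text.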
Filename: 150-exclude-exit-nodes.txt
Title: Exclude Exit Nodes from a circuit
Author: Mfr
Created: 2008-06-15
Status: Closed
Implemented-In: 0.2.1.3-alpha

Overview

Right now, Tor users can manually exclude a node from all positions in their circuits using the ExcludeNodes directive. This proposal makes this exclusion less restrictive, allowing users to exclude a node only from the exit position of a circuit.

Motivation

This feature would help tools such as Vidalia (tor exit branch) integrate features to exclude a country for exit without reducing circuit possibilities or privacy. This feature could help people from a country where many sites are blocked to exclude this country for browsing, giving them more stable navigation. It could also add the possibility for the user to exclude a currently used exit node.

Implementation

ExcludeExitNodes is similar to ExcludeNodes, except that the listed nodes are excluded only from the exit position when building circuits. Tor doesn't warn if a node from this list is not an exit node.

Security implications:

This also opens possibilities for future user reporting of bad exits.

Risks:

Use of this option can make users partitionable under certain attack assumptions. However, ExitNodes already creates this possibility, so there isn't much increased risk in ExcludeExitNodes.

We should still encourage people who exclude an exit node because of bad behavior to report it instead of just adding it to their ExcludeExitNodes list. It would be unfortunate if we didn't find out about broken exits because of this option. This issue can probably be addressed sufficiently with documentation.
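The difference between the two options can be expressed as a per-position eligibility check. This is a hedged sketch: the function name, the position labels, and the use of plain sets of node names are assumptions made for illustration.

```python
from typing import AbstractSet


def node_eligible(node: str, position: str,
                  exclude_nodes: AbstractSet[str],
                  exclude_exit_nodes: AbstractSet[str]) -> bool:
    """Decide whether a node may fill a given circuit position.

    position is one of "guard", "middle" or "exit" (labels assumed here).
    """
    if node in exclude_nodes:
        return False  # ExcludeNodes applies to every position
    if position == "exit" and node in exclude_exit_nodes:
        return False  # ExcludeExitNodes applies to the exit position only
    return True
```

A node listed only in ExcludeExitNodes therefore remains usable as a guard or middle hop, which is how the option avoids shrinking the set of possible circuits more than necessary.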
Filename: 151-path-selection-improvements.txt Title: Improving Tor Path Selection Author: Fallon Chen, Mike Perry Created: 5-Jul-2008 Status: Closed In-Spec: path-spec.txt Implemented-In: 0.2.2.2-alpha Overview The performance of paths selected can be improved by adjusting the CircuitBuildTimeout and avoiding failing guard nodes. This proposal describes a method of tracking buildtime statistics at the client, and using those statistics to adjust the CircuitBuildTimeout. Motivation Tor's performance can be improved by excluding those circuits that have long buildtimes (and by extension, high latency). For those Tor users who require better performance and have lower requirements for anonymity, this would be a very useful option to have. Implementation Gathering Build Times Circuit build times are stored in the circular array 'circuit_build_times' consisting of uint32_t elements as milliseconds. The total size of this array is based on the number of circuits it takes to converge on a good fit of the long term distribution of the circuit builds for a fixed link. We do not want this value to be too large, because it will make it difficult for clients to adapt to moving between different links. From our observations, the minimum value for a reasonable fit appears to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep a good fit over the long term, we store 5000 most recent circuits in the array (NCIRCUITS_TO_OBSERVE). The Tor client will build test circuits at a rate of one per minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have a CircuitBuildTimeout estimated within 8 hours after install, upgrade, or network change (see below). Long Term Storage The long-term storage representation is implemented by storing a histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when writing out the statistics to disk. 
The format this takes in the state file is 'CircuitBuildTimeBin <bin-ms> <count>', with the total specified as 'TotalBuildTimes <total>'.

Example:

    TotalBuildTimes 100
    CircuitBuildTimeBin 25 50
    CircuitBuildTimeBin 75 25
    CircuitBuildTimeBin 125 13
    ...

Reading the histogram in will entail inserting <count> values into the circuit_build_times array, each with the value of <bin-ms> milliseconds. In order to evenly distribute the values in the circular array, the Fisher-Yates shuffle will be performed after reading values from the bins.

Learning the CircuitBuildTimeout

Based on studies of build times, we found that the distribution of circuit buildtimes appears to be a Frechet distribution. However, estimators and quantile functions of the Frechet distribution are difficult to work with and slow to converge. So instead, since we are only interested in the accuracy of the tail, we approximate the tail of the distribution with a Pareto curve starting at the mode of the circuit build time sample set.

We will calculate the parameters for a Pareto distribution fitting the data using the estimators at http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.

The timeout itself is calculated by using the quantile function (the inverted CDF) to give us the value on the CDF such that BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is below the timeout value.

Thus, we expect that the Tor client will accept the fastest 80% of the total number of paths on the network.

Detecting Changing Network Conditions

We attempt to detect both network connectivity loss and drastic changes in the timeout characteristics.

We assume that we've had network connectivity loss if 3 circuits time out and we've received no cells or TLS handshakes since those circuits began. We then set the timeout to 60 seconds and stop counting timeouts.
If 3 more circuits time out and the network still has not been live within this new 60 second timeout window, we then discard the previous timeouts during this period from our history.

To detect changing network conditions, we keep a history of the timeout or non-timeout status of the past RECENT_CIRCUITS (20) that successfully completed at least one hop. If more than 75% of these circuits time out, we discard all buildtimes history, reset the timeout to 60, and then begin recomputing the timeout.

Testing

After circuit build times, storage, and learning are implemented, the resulting histogram should be checked for consistency by verifying it persists across successive Tor invocations where no circuits are built. In addition, we can also use the existing buildtime scripts to record build times, and verify that the histogram the python produces matches that which is output to the state file in Tor, and verify that the Pareto parameters and cutoff points also match.

We will also verify that there are no unexpected large deviations from node selection, such as nodes from distant geographical locations being completely excluded.

Dealing with Timeouts

Timeouts should be counted as the expectation of the region of the Pareto distribution beyond the cutoff. This is done by generating a random sample for each timeout at points on the curve beyond the current timeout cutoff.

Future Work

At some point, it may be desirable to change the cutoff from a single hard cutoff that destroys the circuit to a soft cutoff and a hard cutoff, where the soft cutoff merely triggers the building of a new circuit, and the hard cutoff triggers destruction of the circuit.

It may also be beneficial to learn separate timeouts for each guard node, as they will have slightly different distributions. This will take longer to generate initial values though.
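The RECENT_CIRCUITS rule from "Detecting Changing Network Conditions" can be sketched as a small state machine. The class and attribute names are hypothetical, and two details are assumptions not fixed by the text: the check only runs once the window holds a full 20 entries, and the window is cleared after a reset.

```python
from collections import deque
from typing import List, Optional

RECENT_CIRCUITS = 20
INITIAL_TIMEOUT_SECONDS = 60


class BuildTimeoutState:
    def __init__(self) -> None:
        self.recent = deque(maxlen=RECENT_CIRCUITS)  # True means timed out
        self.build_times_ms: List[int] = []
        self.timeout_seconds: float = INITIAL_TIMEOUT_SECONDS

    def record(self, timed_out: bool,
               build_time_ms: Optional[int] = None) -> bool:
        """Record a circuit that completed at least one hop.

        Returns True if the history was discarded and the timeout reset.
        """
        self.recent.append(timed_out)
        if not timed_out and build_time_ms is not None:
            self.build_times_ms.append(build_time_ms)
        window_full = len(self.recent) == RECENT_CIRCUITS
        # "More than 75%" in integer arithmetic: timeouts * 4 > 3 * window.
        if window_full and sum(self.recent) * 4 > 3 * RECENT_CIRCUITS:
            self.build_times_ms.clear()
            self.recent.clear()
            self.timeout_seconds = INITIAL_TIMEOUT_SECONDS
            return True
        return False
```

With a window of 20, the reset triggers at 16 timeouts, since 15 out of 20 is exactly 75% and the text requires more than that.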
Issues Impact on anonymity Since this follows a Pareto distribution, large reductions on the timeout can be achieved without cutting off a great number of the total paths. This will eliminate a great deal of the performance variation of Tor usage.
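The timeout calculation described under "Learning the CircuitBuildTimeout" can be illustrated as follows. This sketch assumes the standard Pareto forms, with CDF F(x) = 1 - (xm/x)^alpha, maximum-likelihood estimate alpha = n / sum(ln(x_i/xm)), and quantile Q(q) = xm / (1-q)^(1/alpha). Taking xm as the midpoint of the most populated histogram bin and fitting only samples at or above xm are simplifying assumptions of this example, not statements about Tor's implementation.

```python
import math
from collections import Counter
from typing import Sequence, Tuple

BUILDTIME_BIN_WIDTH = 50          # milliseconds
BUILDTIME_PERCENT_CUTOFF = 0.80


def mode_bin_midpoint(times_ms: Sequence[float],
                      width: int = BUILDTIME_BIN_WIDTH) -> float:
    """Midpoint of the most populated bin; ties go to the lowest bin."""
    bins = Counter(int(t // width) for t in times_ms)
    best = max(bins, key=lambda b: (bins[b], -b))
    return best * width + width / 2


def fit_pareto(times_ms: Sequence[float]) -> Tuple[float, float]:
    """Return (xm, alpha) for the tail that starts at the sample mode."""
    xm = mode_bin_midpoint(times_ms)
    tail = [t for t in times_ms if t >= xm]
    log_sum = sum(math.log(t / xm) for t in tail)
    if log_sum <= 0:
        raise ValueError("not enough spread in the tail to estimate alpha")
    return xm, len(tail) / log_sum


def pareto_quantile(xm: float, alpha: float, q: float) -> float:
    """Inverse CDF: the value below which a fraction q of the mass lies."""
    return xm / (1.0 - q) ** (1.0 / alpha)


def circuit_build_timeout_ms(times_ms: Sequence[float]) -> float:
    xm, alpha = fit_pareto(times_ms)
    return pareto_quantile(xm, alpha, BUILDTIME_PERCENT_CUTOFF)
```

Because the quantile grows slowly in q when alpha is large, lowering the cutoff can shorten the timeout considerably while excluding comparatively few paths, which is the point made in the anonymity note above.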
Filename: 152-single-hop-circuits.txt Title: Optionally allow exit from single-hop circuits Author: Geoff Goodell Created: 13-Jul-2008 Status: Closed Implemented-In: 0.2.1.6-alpha Overview Provide a special configuration option that adds a line to descriptors indicating that a router can be used as an exit for one-hop circuits, and allow clients to attach streams to one-hop circuits provided that the descriptor for the router in the circuit includes this configuration option. Motivation At some point, code was added to restrict the attachment of streams to one-hop circuits. The idea seems to be that we can use the cost of forking and maintaining a patch as a lever to prevent people from writing controllers that jeopardize the operational security of routers and the anonymity properties of the Tor network by creating and using one-hop circuits rather than the standard three-hop circuits. It may be, for example, that some users do not actually seek true anonymity but simply reachability through network perspectives afforded by the Tor network, and since anonymity is stronger in numbers, forcing users to contribute to anonymity and decrease the risk to server operators by using full-length paths may be reasonable. As presently implemented, the sweeping restriction of one-hop circuits for all routers limits the usefulness of Tor as a general-purpose technology for building circuits. In particular, we should allow for controllers, such as Blossom, that create and use single-hop circuits involving routers that are not part of the Tor network. Design Introduce a configuration option for Tor servers that, when set, indicates that a router is willing to provide exit from one-hop circuits. Routers with this policy will not require that a circuit has at least two hops when it is used as an exit. In addition, routers for which this configuration option has been set will have a line in their descriptors, "opt exit-from-single-hop-circuits". 
Clients will keep track of which routers have this option and allow streams to be attached to single-hop circuits that include such routers.

Security Considerations

This approach seems to eliminate the worry about operational router security, since server operators will not set the configuration option unless they are willing to take on such risk.

To reduce the impact on anonymity of the network resulting from including such "risky" routers in regular Tor path selection, clients may systematically exclude routers with "opt exit-from-single-hop-circuits" when choosing random paths through the Tor network.
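The client-side checks described in the Design and Security Considerations sections could look roughly like this. Descriptors are modelled as plain text, and the function names are hypothetical; only the descriptor line itself comes from the proposal.

```python
from typing import Dict, List, Sequence

SINGLE_HOP_LINE = "opt exit-from-single-hop-circuits"


def allows_single_hop_exit(descriptor_text: str) -> bool:
    """True if the router's descriptor carries the single-hop exit line."""
    return any(line.strip() == SINGLE_HOP_LINE
               for line in descriptor_text.splitlines())


def may_attach_stream(circuit_descriptors: Sequence[str]) -> bool:
    """Allow streams on one-hop circuits only if the sole router opted in."""
    if len(circuit_descriptors) == 0:
        return False
    if len(circuit_descriptors) == 1:
        return allows_single_hop_exit(circuit_descriptors[0])
    return True


def routers_for_random_paths(descriptors: Dict[str, str]) -> List[str]:
    """Optionally exclude opted-in ("risky") routers from random paths."""
    return sorted(name for name, text in descriptors.items()
                  if not allows_single_hop_exit(text))
```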
Filename: 153-automatic-software-update-protocol.txt
Title: Automatic software update protocol
Author: Jacob Appelbaum
Created: 14-July-2008
Status: Superseded

[Superseded by thandy-spec.txt]

Automatic Software Update Protocol Proposal

0.0 Introduction

The Tor project and its users require a robust method to update shipped software bundles. The software bundles often include Vidalia, Privoxy, Polipo, Torbutton and of course Tor itself. It is not inconceivable that an update could include all of the Tor Browser Bundle. It seems reasonable to make this a standalone program that can be called in shell scripts, cronjobs or by various Tor controllers.

0.1 Minimal Tasks To Implement Automatic Updating

At the most minimal, an update must be able to do the following:

0 - Detect the current Tor version, note the working status of Tor.
1 - Detect the latest Tor version.
2 - Fetch the latest version in the form of a platform specific package(s).
3 - Verify the integrity of the downloaded package(s).
4 - Install the verified package(s).
5 - Test that the new package(s) works properly.

0.2 Specific Enumeration Of Minimal Tasks

To implement requirement 0, we need to detect the current Tor version of both the updater and the current running Tor. The update program itself should be versioned internally. This requirement should also test connecting through Tor itself and note if such connections are possible.

To implement requirement 1, we need to learn the consensus from the directory authorities or fall back to a known good URL with cryptographically signed content.

To implement requirement 2, we need to download Tor - hopefully over Tor.

To implement requirement 3, we need to verify the package signature.

To implement requirement 4, we need to use a platform specific method of installation. The Tor controller performing the update will perform these platform specific methods.

To implement requirement 5, we need to be able to extend circuits and reach the internet through Tor.
0.x Implementation Goals

The update system will be cross platform and rely on as little external code as possible. If the update system uses external code, that code must be updated by the update system itself. It will consist only of free software and will not rely on any non-free components until the actual installation phase. If a package manager is in use, it will be platform specific and thus only invoked by the update system implementing the update protocol.

The update system itself will attempt to perform update related network activity over Tor. Possibly it will attempt to use a hidden service first. It will attempt to use novel and not so novel caching when possible, and it will always verify cryptographic signatures before any remotely fetched code is executed. In the event of an unusable Tor system, it will be able to attempt to fetch updates without Tor. This should be user configurable: some users will be unwilling to update without the protection of using Tor - others will simply be unable because of blocking of the main Tor website.

The update system will track current version numbers of Tor and supporting software. The update system will also track known working versions to assist with automatic

The update system itself will be a standalone library. It will be strongly versioned internally to match the Tor bundle it was shipped with. The update system will keep track of the given platform, cpu architecture, lsb_release, package management functionality and any other platform specific metadata.

We have referenced two popular automatic update systems. Though neither fits our needs, both are useful as an idea of what others are doing in the same area.

The first is Sparkle[0], but it is sadly only available for Cocoa environments and is written in Objective C. This doesn't meet our requirements because it is directly tied into the private Apple framework.

The second is the Mozilla Automatic Update System[1].
It is possibly useful as an idea of how other free software projects automatically update. It is however not useful in its currently documented form.

[0] http://sparkle.andymatuschak.org/documentation/
[1] http://wiki.mozilla.org/AUS:Manual

0.x Previous methods of Tor and related software update

Previously, Tor users updated their Tor related software by hand. There has been no fully automatic method for any user to update. In addition, there hasn't been any specific way to find out the most current stable version of Tor or related software as voted on by the directory authority consensus.

0.x Changes to the directory specification

We will want to supplement client-versions and server-versions in the consensus voting with another version identifier known as 'auto-update-versions'. This will keep track of the current consensus of specific versions that are best per platform and per architecture. It should be noted that while the Mac OS X universal binary may be the best for x86 processors with Tiger, it may not be the best for PPC users on Panther. This goes for all of the package updates. We want to prevent updates that cause Tor to break even if the updating program can recover gracefully.

x.x Assumptions About Operating System Package Management

It is assumed that users will use their package manager unless they are on Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows users will have integration with the normal "add/remove program" functionality that said users would expect.

x.x Package Update System Failure Modes

The package update will try to ensure that a user always has a working Tor at the very least. It will keep state to remember versions of Tor that were able to bootstrap properly and reach the rest of the Tor network. It will also keep note of which versions broke. It will select the best Tor that works for the user. It will also allow for anonymized bug reporting on the packages available and tested by the auto-update system.
x.x Package Signature Verification

The update system will be aware of replay attacks against the update signature system itself. It will not allow package update signatures that are radically out of date. It will be a multi-key system to prevent any single party from forging an update. The key will be updated regularly. This is similar to how authority keys are used (see proposal 103).

x.x Package Caching

The update system will iterate over different update methods. Whichever method is picked will have caching functionality. Each Tor server itself should be able to serve cached update files. This will be an option that friendly server administrators can turn on should they wish to support caching. In addition, it is possible to cache the full contents of a package in an authoritative DNS zone. Users can then query the DNS zone for their package. If we wish to further distribute the update load, we can also offer packages with encrypted bittorrent. Clients who wish to share the updates but do not wish to be a server can help distribute Tor updates. This can be tied together with the DNS caching[2][3] if needed.

[2] http://www.netrogenic.com/dnstorrent/
[3] http://www.doxpara.com/ozymandns_src_0.1.tgz

x.x Helping Our Users Spread Tor

There should be a way for a user to participate in the package caching as described in section x.x. This option should be presented by the Tor controller.

x.x Simple HTTP Proxy To The Tor Project Website

It has been suggested that we should provide a simple proxy that allows a user to visit the main Tor website to download packages. This was part of a previous proposal and has not been closely examined.

x.x Package Installation

Platform specific methods for proper package installation will be left to the controller that is calling for an update. Each platform is different; the installation options and user interface will be specific to the controller in question.

x.x Other Things

Other things should be added to this proposal. What are they?
Filename: 154-automatic-updates.txt
Title: Automatic Software Update Protocol
Author: Matt Edman
Created: 30-July-2008
Status: Superseded
Target: 0.2.1.x

Superseded by thandy-spec.txt

Scope

  This proposal specifies the method by which an automatic update client can
  determine the most recent recommended Tor installation package for the
  user's platform, download the package, and then verify that the package was
  downloaded successfully. While this proposal focuses on only the Tor
  software, the protocol defined is sufficiently extensible such that other
  components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be
  managed and updated by the automatic update client as well.

  The initial target platform for the automatic update framework is Windows,
  given that it is the platform used by a majority of our users and that it
  lacks the sane package management system that many Linux distributions
  already have. Our second target platform will be Mac OS X, and so the
  protocol will be designed with this near-future direction in mind.

  Other client-side aspects of the automatic update process, such as user
  interaction, the interface presented, and actual package installation
  procedure, are outside the scope of this proposal.

Motivation

  Tor releases new versions frequently, often with important security,
  anonymity, and stability fixes. Thus, it is important for users to be able
  to promptly recognize when new versions are available and to easily
  download, authenticate, and install updated Tor and Tor-related software
  packages.

  Tor's control protocol [2] provides a method by which controllers can
  identify when the user's Tor software is obsolete or otherwise no longer
  recommended. Currently, however, no mechanism exists for clients to
  automatically download and install updated Tor and Tor-related software for
  the user.

Design Overview

  The core of the automatic update framework is a well-defined file called a
  "recommended-packages" file. The recommended-packages file is accessible via
  HTTP[S] at one or more well-defined URLs. An example recommended-packages
  URL may be:

    https://updates.torproject.org/recommended-packages

  The recommended-packages document is formatted according to Section 1.2
  below and specifies the most recent recommended installation package
  versions for Tor or Tor-related software, as well as URLs at which the
  packages and their signatures can be downloaded.

  An automatic update client process runs on the Tor user's computer and
  periodically retrieves the recommended-packages file according to the method
  described in Section 2.0. As described further in Section 1.2, the
  recommended-packages file is signed and can be verified by the automatic
  update client with one or more public keys included in the client software.
  Since it is signed, the recommended-packages file can be mirrored by
  multiple hosts (e.g., Tor directory authorities), whose URLs are included in
  the automatic update client's configuration.

  After retrieving and verifying the recommended-packages file, the automatic
  update client compares the versions of the recommended software packages
  listed in the file with those currently installed on the end-user's
  computer. If one or more of the installed packages is determined to be out
  of date, an updated package and its signature will be downloaded from one of
  the package URLs listed in the recommended-packages file as described in
  Section 2.2.

  The automatic update system uses a multilevel signing key scheme for package
  signatures. There are a small number of entities we call "packaging
  authorities" that each have their own signing key. A packaging authority is
  responsible for signing and publishing the recommended-packages file.
  Additionally, each individual packager responsible for producing an
  installation package for one or more platforms has their own signing key.
  Every packager's signing key must be signed by at least one of the packaging
  authority keys.

Specification

  1. recommended-packages Specification

  In this section we formally specify the format of the published
  recommended-packages file.

  1.1. Document Meta-format

  The recommended-packages document follows the lightweight extensible
  information format defined in Tor's directory protocol specification [1]. In
  the interest of self-containment, we have reproduced the relevant portions
  of that format's specification in this Section. (Credits to Nick Mathewson
  for much of the original format definition language.)

  The highest level object is a Document, which consists of one or more Items.
  Every Item begins with a KeywordLine, followed by zero or more Objects. A
  KeywordLine begins with a Keyword, optionally followed by whitespace and
  more non-newline characters, and ends with a newline. A Keyword is a
  sequence of one or more characters in the set [A-Za-z0-9-]. An Object is a
  block of encoded data in pseudo-Open-PGP-style armor. (cf. RFC 2440)

  More formally:

    Document ::= (Item | NL)+
    Item ::= KeywordLine Object*
    KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
    Keyword ::= KeywordChar+
    KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
    ArgumentChar ::= any printing ASCII character except NL.
    WS ::= (SP | TAB)+
    Object ::= BeginLine Base-64-encoded-data EndLine
    BeginLine ::= "-----BEGIN " Keyword "-----" NL
    EndLine ::= "-----END " Keyword "-----" NL

  The BeginLine and EndLine of an Object must use the same keyword.

  In our Document description below, we also tag Items with a multiplicity in
  brackets. Possible tags are:

    "At start, exactly once": These items MUST occur in every instance of the
      document type, and MUST appear exactly once, and MUST be the first item
      in their documents.

    "Exactly once": These items MUST occur exactly one time in every instance
      of the document type.

    "Once or more": These items MUST occur at least once in any instance of
      the document type, and MAY occur more than once.

    "At end, exactly once": These items MUST occur in every instance of the
      document type, and MUST appear exactly once, and MUST be the last item
      in their documents.

  1.2. recommended-packages Document Format

  When interpreting a recommended-packages Document, software MUST ignore any
  KeywordLine that starts with a keyword it doesn't recognize; future
  implementations MUST NOT require current automatic update clients to
  understand any KeywordLine not currently described.

  In lines that take multiple arguments, extra arguments SHOULD be accepted
  and ignored.

  The currently defined Items contained in a recommended-packages document
  are:

    "recommended-packages-format" SP number NL

      [Exactly once]

      This Item specifies the version of the recommended-packages format that
      is contained in the subsequent document. The version defined in this
      proposal is version "1". Subsequent iterations of this protocol MUST
      increment this value if they introduce incompatible changes to the
      document format and MAY increment this value if they only introduce
      additional Keywords.

    "published" SP YYYY-MM-DD SP HH:MM:SS NL

      [Exactly once]

      The time, in GMT, when this recommended-packages document was generated.
      Automatic update clients SHOULD ignore Documents over 60 days old.

    "tor-stable-win32-version" SP TorVersion NL

      [Exactly once]

      This Item specifies the latest recommended release of Tor's "stable"
      branch for the Windows platform that has an installation package
      available. Note that this version does not necessarily correspond to the
      most recently tagged stable Tor version, since that version may not yet
      have an installer package available, or may have known issues on
      Windows.

      The TorVersion field is formatted according to Section 2 of Tor's
      version specification [3].

    "tor-stable-win32-package" SP Url NL

      [Once or more]

      This Item specifies the location from which the most recent recommended
      Windows installation package for Tor's stable branch can be downloaded.

      When this Item appears multiple times within the Document, automatic
      update clients SHOULD select randomly from the available package
      mirrors.

    "tor-dev-win32-version" SP TorVersion NL

      [Exactly once]

      This Item specifies the latest recommended release of Tor's
      "development" branch for the Windows platform that has an installation
      package available. The same caveats from the description of
      "tor-stable-win32-version" also apply to this Item.

      The TorVersion field is formatted according to Section 2 of Tor's
      version specification [3].

    "tor-dev-win32-package" SP Url NL

      [Once or more]

      This Item specifies the location from which the most recent recommended
      Windows installation package and its signature for Tor's development
      branch can be downloaded.

      When this Item appears multiple times within the Document, automatic
      update clients SHOULD select randomly from the available package
      mirrors.

    "signature" NL SIGNATURE NL

      [At end, exactly once]

      The "SIGNATURE" Object contains a PGP signature (using a packaging
      authority signing key) of the entire document, taken from the beginning
      of the "recommended-packages-format" keyword, through the newline after
      the "signature" Keyword.

  2. Automatic Update Client Behavior

  The client-side component of the automatic update framework is an
  application that runs on the end-user's machine. It is responsible for
  fetching and verifying a recommended-packages document, as well as
  downloading, verifying, and subsequently installing any necessary updated
  software packages.

  2.1. Download and verify a recommended-packages document

  The first step in the automatic update process is for the client to download
  a copy of the recommended-packages file. The automatic update client
  contains a (hardcoded and/or user-configurable) list of URLs from which it
  will attempt to retrieve a recommended-packages file.
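  Once a recommended-packages file has been fetched, the client has to split
  each line into a Keyword and its arguments following the grammar in Section
  1.1. The following is a minimal sketch of that step only, not part of the
  proposal; the function names are hypothetical, and it is slightly more
  lenient than the grammar in that it tolerates trailing whitespace after a
  Keyword with no arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if c is a KeywordChar: 'A'..'Z', 'a'..'z', '0'..'9' or '-'. */
static int is_keyword_char(char c)
{
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '-';
}

/* Hypothetical helper: split one KeywordLine (given without its trailing NL)
 * into its Keyword and its arguments.
 * On success, copy the Keyword into kw (capacity kw_len), point *args at the
 * first argument character (or at the terminating NUL if there are no
 * arguments), and return 0. Return -1 if the line does not start with a
 * valid Keyword or if kw is too small to hold it. */
static int split_keyword_line(const char *line, char *kw, size_t kw_len,
                              const char **args)
{
  size_t n = 0;
  assert(line && kw && args);
  while (is_keyword_char(line[n]))
    n++;
  if (n == 0 || n >= kw_len)
    return -1;
  /* A Keyword must be followed by end of line or by whitespace. */
  if (line[n] != '\0' && line[n] != ' ' && line[n] != '\t')
    return -1;
  memcpy(kw, line, n);
  kw[n] = '\0';
  while (line[n] == ' ' || line[n] == '\t')
    n++;
  *args = line + n;
  return 0;
}
```

  A caller would look up the returned Keyword in its table of known Items and,
  as Section 1.2 requires, skip any line whose Keyword it does not recognize.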
  Connections to each of the recommended-packages URLs SHOULD be attempted in
  the following order:

    1) HTTPS over Tor
    2) HTTP over Tor
    3) Direct HTTPS
    4) Direct HTTP

  If the client fails to retrieve a recommended-packages document via any of
  the above connection methods from any of the configured URLs, the client
  SHOULD retry its download attempts following an exponential back-off
  algorithm. After the first failed attempt, the client SHOULD delay one hour
  before attempting again, up to a maximum of 24 hours delay between retry
  attempts.

  After successfully downloading a recommended-packages file, the automatic
  update client will verify the signature using one of the public keys
  distributed with the client software. If more than one recommended-packages
  file is downloaded and verified, the file with the most recent "published"
  date that is verified will be retained and the rest discarded.

  2.2. Download and verify the updated packages

  The automatic update client next compares the latest recommended package
  version from the recommended-packages document with the currently installed
  Tor version. If the user currently has installed a Tor version from Tor's
  "development" branch, then the version specified in the "tor-dev-*-version"
  Item is used for comparison. Similarly, if the user currently has installed
  a Tor version from Tor's "stable" branch, then the version specified in the
  "tor-stable-*-version" Item is used for comparison. Version comparisons are
  done according to Tor's version specification [3].

  If the automatic update client determines an installation package newer than
  the user's currently installed version is available, it will attempt to
  download a package appropriate for the user's platform and Tor branch from a
  URL specified by a "tor-[branch]-[platform]-package" Item. If more than one
  mirror for the selected package is available, a mirror will be chosen at
  random from all those available.
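  The retry schedule from Section 2.1 can be sketched as follows. The proposal
  fixes only the first delay (one hour) and the ceiling (24 hours); doubling
  the delay after each failure is an assumption made here for illustration,
  and the function and macro names are hypothetical.

```c
#include <assert.h>

#define RETRY_INITIAL_DELAY_SECS (60L * 60L)        /* one hour (Section 2.1) */
#define RETRY_MAX_DELAY_SECS     (24L * 60L * 60L)  /* 24 hours (Section 2.1) */

/* Return the number of seconds to wait before the next download attempt,
 * given how many consecutive attempts have failed so far (1 = first failure).
 * Assumption: the delay doubles after every failure until it hits the cap. */
static long retry_delay_secs(unsigned int failed_attempts)
{
  long delay = RETRY_INITIAL_DELAY_SECS;
  assert(failed_attempts >= 1);
  /* Stop doubling once the cap is reached, so the value cannot overflow. */
  while (--failed_attempts > 0 && delay < RETRY_MAX_DELAY_SECS)
    delay *= 2;
  return delay > RETRY_MAX_DELAY_SECS ? RETRY_MAX_DELAY_SECS : delay;
}
```

  Under this assumption the delays are 1, 2, 4, 8, and 16 hours, and every
  later retry waits the full 24 hours.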
  The automatic update client must also download a ".asc" signature file for
  the retrieved package. The URL for the package signature is the same as that
  for the package itself, except with the extension ".asc" appended to the
  package URL.

  Connections to download the updated package and its signature SHOULD be
  attempted in the same order described in Section 2.1.

  After completing the steps described in Sections 2.1 and 2.2, the automatic
  update client will have downloaded and verified a copy of the latest Tor
  installation package. It can then take whatever subsequent platform-specific
  steps are necessary to install the downloaded software updates.

  2.3. Periodic checking for updates

  The automatic update client SHOULD maintain a local state file in which it
  records (at a minimum) the timestamp at which it last retrieved a
  recommended-packages file and the timestamp at which the client last
  successfully downloaded and installed a software update.

  Automatic update clients SHOULD check for an updated recommended-packages
  document at most once per day but at least once every 30 days.

  3. Future Extensions

  There are several possible areas for future extensions of this framework.
  The extensions below are merely suggestions and should be the subject of
  their own proposal before being implemented.

  3.1. Additional Software Updates

  There are several software packages often included in Tor bundles besides
  Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and
  download locations of updated installation packages for these bundle
  components can be easily added to the recommended-packages document
  specification above.

  3.2. Including ChangeLog Information

  It may be useful for automatic update clients to be able to display for
  users a summary of the changes made in the latest Tor or Tor-related
  software release, before the user chooses to install the update. In the
  future, we can add keywords to the specification in Section 1.2 that specify
  the location of a ChangeLog file for the latest recommended package
  versions. It may also be desirable to allow localized ChangeLog information,
  so that the automatic update client can fetch release notes in the
  end-user's preferred language.

  3.3. Weighted Package Mirror Selection

  We defined in Section 1.2 a method by which automatic update clients can
  select from multiple available package mirrors. We may want to add a Weight
  argument to the "*-package" Items that allows the recommended-packages file
  to suggest to clients the probability with which a package mirror should be
  chosen. This will allow clients to more appropriately distribute package
  downloads across available mirrors proportional to their approximate
  bandwidth.

Implementation

  Implementation of this proposal will consist of two separate components.

  The first component is a small "au-publish" tool that takes as input a
  configuration file specifying the information described in Section 1.2 and a
  private key. The tool is run by a "packaging authority" (someone responsible
  for publishing updated installation packages), who will be prompted to enter
  the passphrase for the private key used to sign the recommended-packages
  document. The output of the tool is a document formatted according to
  Section 1.2, with a signature appended at the end. The resulting document
  can then be published to any of the update mirrors.

  The second component is an "au-client" tool that is run on the end-user's
  machine. It periodically checks for updated installation packages according
  to Section 2 and fetches the packages if necessary. The public keys used to
  sign the recommended-packages file and any of the published packages are
  included in the "au-client" tool.
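  As an illustration of the weighted mirror selection suggested in Section
  3.3, an "au-client" could pick a mirror as sketched below. The Weight
  argument does not exist in the format defined by this proposal, so the
  package_mirror_t structure and both function names are hypothetical. The
  caller is assumed to supply a random value r drawn uniformly from
  [0, total weight).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one "*-package" Item carrying the proposed Weight argument. */
typedef struct {
  const char *url;      /* package URL from the Item */
  unsigned int weight;  /* relative probability of choosing this mirror */
} package_mirror_t;

/* Sum of all mirror weights; the caller draws r uniformly below this value. */
static unsigned long mirrors_total_weight(const package_mirror_t *m, size_t n)
{
  unsigned long total = 0;
  size_t i;
  assert(m || n == 0);
  for (i = 0; i < n; i++)
    total += m[i].weight;
  return total;
}

/* Return the index of the mirror whose weight interval contains r, so that
 * each mirror is chosen with probability weight/total. Return -1 if r is not
 * below the total weight (including the case where all weights are zero). */
static int pick_weighted_mirror(const package_mirror_t *m, size_t n,
                                unsigned long r)
{
  size_t i;
  assert(m || n == 0);
  for (i = 0; i < n; i++) {
    if (r < m[i].weight)
      return (int)i;
    r -= m[i].weight;
  }
  return -1;
}
```

  With weights 1 and 3, the second mirror is returned for three of the four
  possible values of r. Giving every mirror the same weight reduces this to
  the uniform random selection that Section 1.2 already specifies.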
References

  [1] Tor directory protocol (version 3),
      https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt

  [2] Tor control protocol (version 2),
      https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt

  [3] Tor version specification,
      https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt
Filename: 155-four-hidden-service-improvements.txt
Title: Four Improvements of Hidden Service Performance
Author: Karsten Loesing, Christian Wilms
Created: 25-Sep-2008
Status: Closed
Implemented-In: 0.2.1.x

Change history:

  25-Sep-2008  Initial proposal for or-dev

Overview:

  A performance analysis of hidden services [1] has brought up a few possible
  design changes to reduce advertisement time of a hidden service in the
  network as well as connection establishment time. Some of these design
  changes have side-effects on anonymity or overall network load which had to
  be weighed against individual performance gains. A discussion of seven
  possible design changes [2] has led to a selection of four changes [3] that
  are proposed to be implemented here.

Design:

  1. Shorter Circuit Extension Timeout

  When establishing a connection to a hidden service a client cannibalizes an
  existing circuit and extends it by one hop to one of the service's
  introduction points. In most cases this can be accomplished within a few
  seconds. Therefore, the current timeout of 60 seconds for extending a
  circuit is far too high.

  Assuming that the timeout would be reduced to a lower value, for example 30
  seconds, a second (or third) attempt to cannibalize and extend would be
  started earlier. With the current timeout of 60 seconds, 93.42% of all
  circuits can be established, whereas this fraction would have been only 0.87
  percentage points smaller, at 92.55%, with a timeout of 30 seconds.

  For a timeout of 30 seconds the performance gain would be approximately 2
  seconds in the mean as opposed to the current timeout of 60 seconds. At the
  same time a smaller timeout leads to discarding an increasing number of
  circuits that might have been completed within the current timeout of 60
  seconds.

  Measurements with simulated low-bandwidth connectivity have shown that there
  is no significant effect of client connectivity on circuit extension times.
  The reason for this might be that extension messages are small and thereby
  independent of the client bandwidth. Further, the connection between client
  and entry node only constitutes a single hop of a circuit, so that its
  influence on the whole circuit is limited.

  The exact value of the new timeout does not necessarily have to be 30
  seconds, but might also depend on the results of circuit build timeout
  measurements as described in proposal 151.

  2. Parallel Connections to Introduction Points

  An additional approach to accelerate extension of introduction circuits is
  to extend a second circuit in parallel to a different introduction point.
  Such parallel extension attempts should be started after a short delay of,
  e.g., 15 seconds in order to prevent unnecessary circuit extensions and
  thereby save network resources. Whichever circuit extension succeeds first
  is used for introduction, while the other attempt is aborted.

  An evaluation has been performed for the more resource-intensive approach of
  starting two parallel circuits immediately instead of waiting for a short
  delay. The result was a reduction of connection establishment times from
  27.4 seconds in the original protocol to 22.5 seconds.

  While the effect of the proposed approach of delayed parallelization on mean
  connection establishment times is expected to be smaller, variability of
  connection attempt times can be reduced significantly.

  3. Increase Count of Internal Circuits

  Hidden services need to create or cannibalize and extend a circuit to a
  rendezvous point for every client request. Really popular hidden services
  require more than two internal circuits in the pool to answer multiple
  client requests at the same time. This scenario was not yet analyzed, but
  will probably exhibit worse performance than measured in the previous
  analysis. The number of preemptively built internal circuits should be a
  function of connection requests in the past to adapt to changing needs.
  Furthermore, an increased number of internal circuits on client side would
  allow clients to establish connections to more than one hidden service at a
  time.

  Under the assumption that a popular hidden service cannot make use of
  cannibalization for connecting to rendezvous points, the circuit creation
  time needs to be added to the current results. In the mean, the connection
  establishment time to a popular hidden service would increase by 4.7
  seconds.

  4. Build More Introduction Circuits

  When establishing introduction points, a hidden service should launch 5
  instead of 3 introduction circuits at the same time and use only the first 3
  that could be established. The remaining two circuits could still be used
  for other purposes afterwards.

  The effect has also been simulated using previously measured data. For this
  purpose, circuit establishment times were derived from log files and written
  to an array. Afterwards, a simulation with 10,000 runs was performed picking
  5 (4, 6) random values and using the 3 lowest values in contrast to picking
  only 3 values at random. The result is that the mean time of the 3-out-of-3
  approach is 8.1 seconds, while the mean time of the 3-out-of-5 approach is
  4.4 seconds.

  The effect on network load is minimal, because the hidden service can reuse
  the slower internal circuits for other purposes, e.g., rendezvous circuits.
  The only change is that a hidden service starts establishing more circuits
  at once instead of subsequently doing so.

References:

  [1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf

  [2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf

  [3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf
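  One run of the simulation described in Section 4 above can be sketched as
  follows. This is only an illustration under a stated assumption: the
  proposal does not say how the "3 lowest values" are turned into a single
  time, so the sketch takes the moment at which the third circuit completes.
  The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator for doubles in ascending order. */
static int cmp_double(const void *a, const void *b)
{
  double x = *(const double *)a;
  double y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Given the establishment times of n circuits launched at the same moment,
 * return the time at which the k fastest of them are all established, i.e.
 * the k-th smallest value. Sorts the times array in place.
 * Assumption: this is how one simulation run is scored. */
static double time_until_k_built(double *times, size_t n, size_t k)
{
  assert(times && k >= 1 && k <= n);
  qsort(times, n, sizeof(double), cmp_double);
  return times[k - 1];
}
```

  For the sample times 8, 2, 5, 9 and 3 seconds, waiting for 3 out of 5
  circuits takes 5 seconds, whereas launching only the first three and waiting
  for all of them takes 8 seconds. Averaging this value over many runs with
  times drawn from measured data would reproduce the kind of comparison the
  proposal reports.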
Filename: 156-tracking-blocked-ports.txt
Title: Tracking blocked ports on the client side
Author: Robert Hogan
Created: 14-Oct-2008
Status: Superseded

[Superseded by 156, which recognizes the security issues here.]

Motivation:

  Tor clients that are behind extremely restrictive firewalls can end up
  waiting a while for their first successful OR connection to a node on the
  network. Worse, the more restrictive their firewall the more susceptible
  they are to an attacker guessing their entry nodes. Tor routers that are
  behind extremely restrictive firewalls can only offer a limited,
  'partitioned' service to other routers and clients on the network. Exit
  nodes behind extremely restrictive firewalls may advertise ports that they
  are actually not able to connect to, wasting network resources in circuit
  constructions that are doomed to fail at the last hop on first use.

Proposal:

  When a client attempts to connect to an entry guard it should avoid further
  attempts on ports that fail once until it has connected to at least one
  entry guard successfully. (Maybe it should wait for more than one failure to
  reduce the skew on the first node selection.) Thereafter it should select
  entry guards regardless of port and warn the user if it observes that
  connections to a given port have failed a multiple of 5 times, either
  without any success or since the last success.

  Tor should warn the operators of exit, middleman and entry nodes if it
  observes that connections to a given port have failed a multiple of 5 times,
  either without any success or since the last success. If attempts on a port
  fail 20 or more times without or since success, Tor should add the port to a
  'blocked-ports' entry in its descriptor's extra-info. Some thought needs to
  be given to what the authorities might do with this information.
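  The counting rule in the Proposal section can be illustrated with the
  following minimal sketch. It is not taken from the attached patch: the type,
  enum and function names are hypothetical, and the thresholds 5 and 20 are
  those stated in the text above (the patch itself defines POSSIBLY_BLOCKED as
  5 and PROBABLY_BLOCKED as 10).

```c
#include <assert.h>

#define WARN_EVERY_N_FAILURES   5   /* warn at each multiple of 5 failures */
#define BLOCKED_AFTER_FAILURES 20   /* report the port as blocked from here on */

/* Actions to take after a failure; may be combined as bit flags. */
enum {
  PORT_ACTION_NONE = 0,
  PORT_ACTION_WARN = 1 << 0,            /* warn the user or operator */
  PORT_ACTION_REPORT_BLOCKED = 1 << 1   /* list port under 'blocked-ports' */
};

/* Hypothetical per-port state: failures seen since the last success, or
 * since startup if no connection to this port has succeeded yet. */
typedef struct {
  unsigned int failures_since_success;
} port_stats_t;

/* Record a successful connection to the port: the failure count restarts. */
static void port_note_success(port_stats_t *st)
{
  assert(st);
  st->failures_since_success = 0;
}

/* Record a failed connection to the port and return the actions to take. */
static int port_note_failure(port_stats_t *st)
{
  int actions = PORT_ACTION_NONE;
  assert(st);
  st->failures_since_success++;
  if (st->failures_since_success % WARN_EVERY_N_FAILURES == 0)
    actions |= PORT_ACTION_WARN;
  if (st->failures_since_success >= BLOCKED_AFTER_FAILURES)
    actions |= PORT_ACTION_REPORT_BLOCKED;
  return actions;
}
```

  Keeping a plain counter rather than a set of router digests means repeated
  failures against the same router are counted more than once; the patch
  avoids that by tracking unique digests per port.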
Related TODO item:

  "- Automatically determine what ports are reachable and start using
     those, if circuits aren't working and it's a pattern we
     recognize ("port 443 worked once and port 9001 keeps not
     working")."

I've had a go at implementing all of this in the attached.

Addendum:

Just a note on the patch, storing the digest of each router that uses the port
is a bit of a memory hog, and its only real purpose is to provide a count of
routers using that port when warning the user. That could be achieved when
warning the user by iterating through the routerlist instead.

Index: src/or/connection_or.c
===================================================================
--- src/or/connection_or.c (revision 17104)
+++ src/or/connection_or.c (working copy)
@@ -502,6 +502,9 @@
 connection_or_connect_failed(or_connection_t *conn,
                              int reason, const char *msg)
 {
+  if ((reason == END_OR_CONN_REASON_NO_ROUTE) ||
+      (reason == END_OR_CONN_REASON_REFUSED))
+    or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port);
   control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason);
   if (!authdir_mode_tests_reachability(get_options()))
     control_event_bootstrap_problem(msg, reason);
@@ -580,6 +583,7 @@
     /* already marked for close */
     return NULL;
   }
+
   return conn;
 }
@@ -909,6 +913,7 @@
   control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0);
   if (started_here) {
+    or_port_hist_success(TO_CONN(conn)->port);
     rep_hist_note_connect_succeeded(conn->identity_digest, now);
     if (entry_guard_register_connect_status(conn->identity_digest,
                                             1, now) < 0) {
Index: src/or/rephist.c
===================================================================
--- src/or/rephist.c (revision 17104)
+++ src/or/rephist.c (working copy)
@@ -18,6 +18,7 @@
 static void bw_arrays_init(void);
 static void predicted_ports_init(void);
 static void hs_usage_init(void);
+static void or_port_hist_init(void);
 /** Total number of bytes currently allocated in fields used by rephist.c. */
 uint64_t rephist_total_alloc=0;
@@ -89,6 +90,25 @@
   digestmap_t *link_history_map;
 } or_history_t;
+/** or_port_hist_t contains our router/client's knowledge of
+    all OR ports offered on the network, and how many servers with each port we
+    have succeeded or failed to connect to. */
+typedef struct {
+  /** The port this entry is tracking. */
+  uint16_t or_port;
+  /** Have we ever connected to this port on another OR?. */
+  unsigned int success:1;
+  /** The ORs using this port. */
+  digestmap_t *ids;
+  /** The ORs using this port we have failed to connect to. */
+  digestmap_t *failure_ids;
+  /** Are we excluding ORs with this port during entry selection?*/
+  unsigned int excluded;
+} or_port_hist_t;
+
+static unsigned int still_searching = 0;
+static smartlist_t *or_port_hists;
+
 /** When did we last multiply all routers' weighted_run_length and
  * total_run_weights by STABILITY_ALPHA? */
 static time_t stability_last_downrated = 0;
@@ -164,6 +184,16 @@
   tor_free(hist);
 }
+/** Helper: free storage held by a single OR port history entry. */
+static void
+or_port_hist_free(or_port_hist_t *p)
+{
+  tor_assert(p);
+  digestmap_free(p->ids,NULL);
+  digestmap_free(p->failure_ids,NULL);
+  tor_free(p);
+}
+
 /** Update an or_history_t object <b>hist</b> so that its uptime/downtime
  * count is up-to-date as of <b>when</b>. */
@@ -1639,7 +1669,7 @@
     tmp_time = smartlist_get(predicted_ports_times, i);
     if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) {
       tmp_port = smartlist_get(predicted_ports_list, i);
-      log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port);
+      log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port);
       smartlist_del(predicted_ports_list, i);
       smartlist_del(predicted_ports_times, i);
       rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t);
@@ -1821,6 +1851,12 @@
   tor_free(last_stability_doc);
   built_last_stability_doc_at = 0;
   predicted_ports_free();
+  if (or_port_hists) {
+    SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p,
+      or_port_hist_free(p));
+    smartlist_free(or_port_hists);
+    or_port_hists = NULL;
+  }
 }
 /****************** hidden service usage statistics ******************/
@@ -2356,3 +2392,225 @@
   tor_free(fname);
 }
+/** Create a new entry in the port tracking cache for the or_port in
+ * <b>ri</b>. */
+void
+or_port_hist_new(const routerinfo_t *ri)
+{
+  or_port_hist_t *result;
+  const char *id=ri->cache_info.identity_digest;
+
+  if (!or_port_hists)
+    or_port_hist_init();
+
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+  {
+    /* Cope with routers that change their advertised OR port or are
+       dropped from the networkstatus. We don't discard the failures of
+       dropped routers because they are still valid when counting
+       consecutive failures on a port.*/
+    if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) {
+      digestmap_remove(tp->ids, id);
+    }
+    if (tp->or_port == ri->or_port) {
+      if (!(digestmap_get(tp->ids, id)))
+        digestmap_set(tp->ids, id, (void*)1);
+      return;
+    }
+  });
+
+  result = tor_malloc_zero(sizeof(or_port_hist_t));
+  result->or_port=ri->or_port;
+  result->success=0;
+  result->ids=digestmap_new();
+  digestmap_set(result->ids, id, (void*)1);
+  result->failure_ids=digestmap_new();
+  result->excluded=0;
+  smartlist_add(or_port_hists, result);
+}
+
+/** Create the port tracking cache. */
+/*XXX: need to call this when we rebuild/update our network status */
+static void
+or_port_hist_init(void)
+{
+  routerlist_t *rl = router_get_routerlist();
+
+  if (!or_port_hists)
+    or_port_hists=smartlist_create();
+
+  if (rl && rl->routers) {
+    SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri,
+    {
+      or_port_hist_new(ri);
+    });
+  }
+}
+
+#define NOT_BLOCKED 0
+#define FAILURES_OBSERVED 1
+#define POSSIBLY_BLOCKED 5
+#define PROBABLY_BLOCKED 10
+/** Return the list of blocked ports for our router's extra-info.*/
+char *
+or_port_hist_get_blocked_ports(void)
+{
+  char blocked_ports[2048];
+  char *bp;
+
+  tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports");
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+  {
+    if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED)
+      tor_snprintf(blocked_ports+strlen(blocked_ports),
+                   sizeof(blocked_ports)," %u,",tp->or_port);
+  });
+  if (strlen(blocked_ports) == 13)
+    return NULL;
+  bp=tor_strdup(blocked_ports);
+  bp[strlen(bp)-1]='\n';
+  bp[strlen(bp)]='\0';
+  return bp;
+}
+
+/** Revert to client-only mode if we have seen to many failures on a port or
+ * range of ports.*/
+static void
+or_port_hist_report_block(unsigned int min_severity)
+{
+  or_options_t *options=get_options();
+  char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048];
+  char port[1024];
+
+  memset(failures_observed,0,sizeof(failures_observed));
+  memset(possibly_blocked,0,sizeof(possibly_blocked));
+  memset(probably_blocked,0,sizeof(probably_blocked));
+
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+  {
+    unsigned int failures = digestmap_size(tp->failure_ids);
+    if (failures >= min_severity) {
+      tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the"
+                   " network)",tp->or_port,failures,
+                   (!tp->success)?"and no successes": "since last success",
+                   digestmap_size(tp->ids));
+      if (failures >= PROBABLY_BLOCKED) {
+        strlcat(probably_blocked, port, sizeof(probably_blocked));
+      } else if (failures >= POSSIBLY_BLOCKED)
+        strlcat(possibly_blocked, port, sizeof(possibly_blocked));
+      else if (failures >= FAILURES_OBSERVED)
+        strlcat(failures_observed, port, sizeof(failures_observed));
+    }
+  });
+
+  log_warn(LD_HIST,"%s%s%s%s%s%s%s%s",
+      server_mode(options) &&
+      ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))?
+      "You should consider disabling your Tor server.":"",
+      (min_severity==FAILURES_OBSERVED)?
+      "Tor appears to be blocked from connecting to a range of ports "
+      "with the result that it cannot connect to one tenth of the Tor "
+      "network. ":"",
+      strlen(failures_observed)?
+      "Tor has observed failures on the following ports: ":"",
+      failures_observed,
+      strlen(possibly_blocked)?
+      "Tor is possibly blocked on the following ports: ":"",
+      possibly_blocked,
+      strlen(probably_blocked)?
+      "Tor is almost certainly blocked on the following ports: ":"",
+      probably_blocked);
+
+}
+
+/** Record the success of our connection to <b>digest</b>'s
+ * OR port. */
+void
+or_port_hist_success(uint16_t or_port)
+{
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+  {
+    if (tp->or_port != or_port)
+      continue;
+    /*Reset our failure stats so we can notice if this port ever gets
+      blocked again.*/
+    tp->success=1;
+    if (digestmap_size(tp->failure_ids)) {
+      digestmap_free(tp->failure_ids,NULL);
+      tp->failure_ids=digestmap_new();
+    }
+    if (still_searching) {
+      still_searching=0;
+      SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;);
+    }
+    return;
+  });
+}
+/** Record the failure of our connection to <b>digest</b>'s
+ * OR port. Warn, exclude the port from future entry guard selection, or
+ * add port to blocked-ports in our server's extra-info as appropriate. */
+void
+or_port_hist_failure(const char *digest, uint16_t or_port)
+{
+  int total_failures=0, ports_excluded=0, report_block=0;
+  int total_routers=smartlist_len(router_get_routerlist()->routers);
+
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+  {
+    ports_excluded += tp->excluded;
+    total_failures+=digestmap_size(tp->failure_ids);
+    if (tp->or_port != or_port)
+      continue;
+    /* We're only interested in unique failures */
+    if (digestmap_get(tp->failure_ids, digest))
+      return;
+
+    total_failures++;
+    digestmap_set(tp->failure_ids, digest, (void*)1);
+    if (still_searching && !tp->success) {
+      tp->excluded=1;
+      ports_excluded++;
+    }
+    if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) &&
+        !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED))
+      report_block=POSSIBLY_BLOCKED;
+  });
+
+  if (total_failures >= (int)(total_routers/10))
+    or_port_hist_report_block(FAILURES_OBSERVED);
+  else if (report_block)
+    or_port_hist_report_block(report_block);
+
+  if (ports_excluded >= smartlist_len(or_port_hists)) {
+    log_warn(LD_HIST,"During entry node selection Tor tried every port "
+             "offered on the network on at least one server "
+             "and didn't manage a single "
+             "successful connection. This suggests you are behind an "
+             "extremely restrictive firewall. Tor will keep trying to find "
+             "a reachable entry node.");
+    SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;);
+  }
+}
+
+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */
+void
+or_port_hist_exclude(routerset_t *rt)
+{
+  SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+  {
+    char portpolicy[9];
+    if (tp->excluded) {
+      tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port);
+      log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily "
+               "from entry guard selection.", tp->or_port);
+      routerset_parse(rt, portpolicy, "Ports");
+    }
+  });
+}
+
+/** Allow the exclusion of ports during our search for an entry node. */
+void
+or_port_hist_search_again(void)
+{
+  still_searching=1;
+}
Index: src/or/or.h
===================================================================
--- src/or/or.h (revision 17104)
+++ src/or/or.h (working copy)
@@ -3864,6 +3864,13 @@
 int any_predicted_circuits(time_t now);
 int rep_hist_circbuilding_dormant(time_t now);
+void or_port_hist_failure(const char *digest, uint16_t or_port);
+void or_port_hist_success(uint16_t or_port);
+void or_port_hist_new(const routerinfo_t *ri);
+void or_port_hist_exclude(routerset_t *rt);
+void or_port_hist_search_again(void);
+char *or_port_hist_get_blocked_ports(void);
+
 /** Possible public/private key operations in Tor: used to keep track of where
  * we're spending our time. */
 typedef enum {
Index: src/or/routerparse.c
===================================================================
--- src/or/routerparse.c (revision 17104)
+++ src/or/routerparse.c (working copy)
@@ -1401,6 +1401,8 @@
     goto err;
   }
+  or_port_hist_new(router);
+
   if (!router->platform) {
     router->platform = tor_strdup("<unknown>");
   }
Index: src/or/router.c
===================================================================
--- src/or/router.c (revision 17104)
+++ src/or/router.c (working copy)
@@ -1818,6 +1818,7 @@
   char published[ISO_TIME_LEN+1];
   char digest[DIGEST_LEN];
   char *bandwidth_usage;
+  char *blocked_ports;
   int result;
   size_t len;
@@ -1825,7 +1826,6 @@
                 extrainfo->cache_info.identity_digest, DIGEST_LEN);
   format_iso_time(published, extrainfo->cache_info.published_on);
   bandwidth_usage = rep_hist_get_bandwidth_lines(1);
-
   result = tor_snprintf(s, maxlen,
                         "extra-info %s %s\n"
                         "published %s\n%s",
@@ -1835,6 +1835,16 @@
   if (result<0)
     return -1;
+  blocked_ports = or_port_hist_get_blocked_ports();
+  if (blocked_ports) {
+    result = tor_snprintf(s+strlen(s), maxlen-strlen(s),
+                          "%s",
+                          blocked_ports);
+    tor_free(blocked_ports);
+    if (result<0)
+      return -1;
+  }
+
   if (should_record_bridge_info(options)) {
     static time_t last_purged_at = 0;
     char *geoip_summary;
Index: src/or/circuitbuild.c
===================================================================
--- src/or/circuitbuild.c (revision 17104)
+++ src/or/circuitbuild.c (working copy)
@@ -62,6 +62,7 @@
 static void entry_guards_changed(void);
 static time_t start_of_month(time_t when);
+static int num_live_entry_guards(void);
 /** Iterate over values of circ_id, starting from conn-\>next_circ_id,
  * and with the high bit specified by conn-\>circ_id_type, until we get
@@ -1627,12 +1628,14 @@
   smartlist_t *excluded;
   or_options_t *options = get_options();
   router_crn_flags_t flags = 0;
+  routerset_t *_ExcludeNodes;
   if (state && options->UseEntryGuards &&
       (purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) {
     return choose_random_entry(state);
   }
+  _ExcludeNodes = routerset_new();
   excluded = smartlist_create();
   if (state && (r = build_state_get_exit_router(state))) {
@@ -1670,12 +1673,18 @@
   if (options->_AllowInvalid & ALLOW_INVALID_ENTRY)
     flags |= CRN_ALLOW_INVALID;
+  if (options->ExcludeNodes)
+    routerset_union(_ExcludeNodes,options->ExcludeNodes);
+
+  or_port_hist_exclude(_ExcludeNodes);
+
   choice = router_choose_random_node(
            NULL,
            excluded,
-           options->ExcludeNodes,
+           _ExcludeNodes,
            flags);
   smartlist_free(excluded);
+  routerset_free(_ExcludeNodes);
   return choice;
 }
@@ -2727,6 +2736,7 @@
 entry_guards_update_state(or_state_t *state)
 {
   config_line_t **next, *line;
+  unsigned int have_reachable_entry=0;
   if (! entry_guards_dirty)
     return;
@@ -2740,6 +2750,7 @@
       char dbuf[HEX_DIGEST_LEN+1];
       if (!e->made_contact)
         continue; /* don't write this one to disk */
+      have_reachable_entry=1;
       *next = line = tor_malloc_zero(sizeof(config_line_t));
       line->key = tor_strdup("EntryGuard");
       line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2);
@@ -2785,6 +2796,11 @@
   if (!get_options()->AvoidDiskWrites)
     or_state_mark_dirty(get_or_state(), 0);
   entry_guards_dirty = 0;
+
+  /* XXX: Is this the place to decide that we no longer have any reachable
+     guards? */
+  if (!have_reachable_entry)
+    or_port_hist_search_again();
 }
 /** If <b>question</b> is the string "entry-guards", then dump
Filename: 157-specific-cert-download.txt Title: Make certificate downloads specific Author: Nick Mathewson Created: 2-Dec-2008 Status: Closed Target: 0.2.4.x History: 2008 Dec 2, 22:34 Changed name of cross certification field to match the other authority certificate fields. Status: As of 0.2.1.9-alpha: Cross-certification is implemented for new certificates, but not yet required. Directories support the tor/keys/fp-sk urls. Overview: Tor's directory specification gives two ways to download a certificate: by its identity fingerprint, or by the digest of its signing key. Both are error-prone. We propose a new download mechanism to make sure that clients get the certificates they want. Motivation: When a client wants a certificate to verify a consensus, it has two choices currently: - Download by identity key fingerprint. In this case, the client risks getting a certificate for the same authority, but with a different signing key than the one used to sign the consensus. - Download by signing key fingerprint. In this case, the client risks getting a forged certificate that contains the right signing key signed with the wrong identity key. (Since caches are willing to cache certs from authorities they do not themselves recognize, the attacker wouldn't need to compromise an authority's key to do this.) Current solution: Clients fetch by identity keys, and re-fetch with backoff if they don't get certs with the signing key they want. Proposed solution: Phase 1: Add a URL type for clients to download certs by identity _and_ signing key fingerprint. Unless both fields match, the client doesn't accept the certificate(s). Clients begin using this method when their randomly chosen directory cache supports it. Phase 1A: Simultaneously, add a cross-certification element to certificates. Phase 2: Once many directory caches support phase 1, clients should prefer to fetch certificates using that protocol when available. 
Phase 2A: Once all authorities are generating cross-certified certificates as in phase 1A, require cross-certification.

Specification additions:

The key certificate whose identity key fingerprint is <F> and whose signing key fingerprint is <S> should be available at:

    http://<hostname>/tor/keys/fp-sk/<F>-<S>.z

As usual, clients may request multiple certificates using:

    http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z

Clients SHOULD use this format whenever they know both key fingerprints for a desired certificate.

Certificates SHOULD contain the following field (at most once):

    "dir-key-crosscert" NL CrossSignature NL

where CrossSignature is a signature, made using the certificate's signing key, of the digest of the PKCS1-padded hash of the certificate's identity key. For backward compatibility with broken versions of the parser, we wrap the base64-encoded signature in -----BEGIN ID SIGNATURE----- and -----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow the "ID " portion to be omitted, however.

When encountering a certificate with a dir-key-crosscert entry, implementations MUST verify that the signature is a correct signature of the hash of the identity key using the signing key. (In a future version of this specification, dir-key-crosscert entries will be required.)

Why cross-certify too?

Cross-certification protects clients who haven't updated yet, by reducing the number of caches that are willing to hold and serve bogus certificates.

References:

This is related to part 2 of bug 854.
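As an illustration of the fp-sk URL format in proposal 157, here is a minimal Python sketch of how a client might build the request and then apply the "both fields must match" rule. The function names, the 40-hex-digit check, and the choice to upper-case fingerprints are assumptions made for this example; the proposal itself only fixes the URL layout.

```python
import string


def fp_sk_url(hostname, pairs):
    """Build a /tor/keys/fp-sk/ URL for (identity_fp, signing_fp) pairs.

    Assumption: fingerprints are 40 hex digits and are sent upper-cased.
    """
    if not pairs:
        raise ValueError("need at least one (identity, signing) pair")
    parts = []
    for identity_fp, signing_fp in pairs:
        for fp in (identity_fp, signing_fp):
            if len(fp) != 40 or any(c not in string.hexdigits for c in fp):
                raise ValueError("bad fingerprint: %r" % (fp,))
        # <F>-<S> for one certificate; several certificates are joined by "+".
        parts.append("%s-%s" % (identity_fp.upper(), signing_fp.upper()))
    return "http://%s/tor/keys/fp-sk/%s.z" % (hostname, "+".join(parts))


def cert_matches(cert_identity_fp, cert_signing_fp,
                 want_identity_fp, want_signing_fp):
    """Accept a certificate only if BOTH fingerprints match the request."""
    return (cert_identity_fp.upper() == want_identity_fp.upper()
            and cert_signing_fp.upper() == want_signing_fp.upper())
```

A client would call fp_sk_url() with the identity and signing-key fingerprints taken from the consensus signatures it wants to check, and discard any returned certificate for which cert_matches() is false.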
Filename: 158-microdescriptors.txt Title: Clients download consensus + microdescriptors Author: Roger Dingledine Created: 17-Jan-2009 Status: Closed Implemented-In: 0.2.3.1-alpha 0. History 15 May 2009: Substantially revised based on discussions on or-dev from late January. Removed the notion of voting on how to choose microdescriptors; made it just a function of the consensus method. (This lets us avoid the possibility of "desynchronization.") Added suggestion to use a new consensus flavor. Specified use of SHA256 for new hashes. -nickm 15 June 2009: Cleaned up based on comments from Roger. -nickm 1. Overview This proposal replaces section 3.2 of proposal 141, which was called "Fetching descriptors on demand". Rather than modifying the circuit-building protocol to fetch a server descriptor inline at each circuit extend, we instead put all of the information that clients need either into the consensus itself, or into a new set of data about each relay called a microdescriptor. Descriptor elements that are small and frequently changing should go in the consensus itself, and descriptor elements that are small and relatively static should go in the microdescriptor. If we ever end up with descriptor elements that aren't small yet clients need to know them, we'll need to resume considering some design like the one in proposal 141. Note also that any descriptor element which clients need to use to decide which servers to fetch info about, or which servers to fetch info from, needs to stay in the consensus. 2. Motivation See http://archives.seul.org/or/dev/Nov-2008/msg00000.html and http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially http://archives.seul.org/or/dev/Nov-2008/msg00007.html for a discussion of the options and why this is currently the best approach. 3. Design There are three pieces to the proposal. First, authorities will list in their votes (and thus in the consensus) the expected hash of microdescriptor for each relay. 
Second, authorities will serve microdescriptors, directory mirrors will cache and serve them. Third, clients will ask for them and cache them. 3.1. Consensus changes If the authorities choose a consensus method of a given version or later, a microdescriptor format is implicit in that version. A microdescriptor should in every case be a pure function of the router descriptor and the consensus method. In votes, we need to include the hash of each expected microdescriptor in the routerstatus section. I suggest a new "m" line for each stanza, with the base64 of the SHA256 hash of the router's microdescriptor. For every consensus method that an authority supports, it includes a separate "m" line in each router section of its vote, containing: "m" SP methods 1*(SP AlgorithmName "=" digest) NL where methods is a comma-separated list of the consensus methods that the authority believes will produce "digest". (As with base64 encoding of SHA1 hashes in consensuses, let's omit the trailing =s) The consensus microdescriptor-elements and "m" lines are then computed as described in Section 3.1.2 below. (This means we need a new consensus-method that knows how to compute the microdescriptor-elements and add "m" lines.) The microdescriptor consensus uses the directory-signature format from proposal 162, with the "sha256" algorithm. 3.1.1. Descriptor elements to include for now In the first version, the microdescriptor should contain the onion-key element, and the family element from the router descriptor, and the exit policy summary as currently specified in dir-spec.txt. 3.1.2. Computing consensus for microdescriptor-elements and "m" lines When we are generating a consensus, we use whichever m line unambiguously corresponds to the descriptor digest that will be included in the consensus. (If different votes have different microdescriptor digests for a single <descriptor-digest, consensus-method> pair, then at least one of the authorities is broken. 
If this happens, the consensus should contain whichever microdescriptor digest is most common. If there is no winner, we break ties in favor of the lexically earliest. Either way, we should log a warning: there is definitely a bug.)

The "m" lines in a consensus contain only the digest, not a list of consensus methods.

3.1.3. A new flavor of consensus

Rather than being inserted into the current consensus format, "m" lines should be included in a new consensus flavor (see proposal 162).

This flavor can safely omit descriptor digests.

When we implement this voting method, we can remove the exit policy summaries from the current "ns" flavor of consensus, since no current clients use them, and they take up about 5% of the compressed consensus.

This new consensus flavor should be signed with the sha256 signature format as documented in proposal 162.

3.2. Directory mirrors fetch, cache, and serve microdescriptors

Directory mirrors should fetch, cache, and serve each microdescriptor from the authorities. (They need to continue to serve normal relay descriptors too, to handle old clients.)

The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be available at:

    http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z

(We use base64 for size and for consistency with the consensus format. We use -s instead of +s to separate these items, since the + character is used in base64 encoding.)

All the microdescriptors from the current consensus should also be available at:

    http://<hostname>/tor/micro/all.z

so a client that's bootstrapping doesn't need to send a 70KB URL just to name every microdescriptor it's looking for.

Microdescriptors have no header or footer. The hash of the microdescriptor is simply the hash of the concatenated elements.

Directory mirrors should check to make sure that the microdescriptors they're about to serve match the right hashes (either the hashes from the fetch URL or the hashes from the consensus, respectively).
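The hashing and URL rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for the example, and the microdescriptor is treated as an opaque byte string (the concatenated elements) rather than being parsed.

```python
import base64
import hashlib


def microdesc_digest(microdesc):
    """Base64 SHA256 of a microdescriptor, with the trailing '=' omitted."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def micro_url(hostname, digests):
    """Build the /tor/micro/d/ URL; digests are joined with '-', never '+'."""
    if not digests:
        raise ValueError("need at least one digest")
    return "http://%s/tor/micro/d/%s.z" % (hostname, "-".join(digests))


def matches_requested(microdesc, requested_digest):
    """Check a mirror might run before serving a microdescriptor."""
    return microdesc_digest(microdesc) == requested_digest
```

Because the standard base64 alphabet contains '+' and '/' but not '-', splitting the path component on '-' recovers the individual digests unambiguously.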
We will probably want to consider some sort of smart data structure to be able to quickly convert microdescriptor hashes into the appropriate microdescriptor. Clients will want this anyway when they load their microdescriptor cache and want to match it up with the consensus to see what's missing. 3.3. Clients fetch them and cache them When a client gets a new consensus, it looks to see if there are any microdescriptors it needs to learn. If it needs to learn more than some threshold of the microdescriptors (half?), it requests 'all', else it requests only the missing ones. Clients MAY try to determine whether the upload bandwidth for listing the microdescriptors they want is more or less than the download bandwidth for the microdescriptors they do not want. Clients maintain a cache of microdescriptors along with metadata like when it was last referenced by a consensus, and which identity key it corresponds to. They keep a microdescriptor until it hasn't been mentioned in any consensus for a week. Future clients might cache them for longer or shorter times. 3.3.1. Information leaks from clients If a client asks you for a set of microdescs, then you know she didn't have them cached before. How much does that leak? What about when we're all using our entry guards as directory guards, and we've seen that user make a bunch of circuits already? Fetching "all" when you need at least half is a good first order fix, but might not be all there is to it. Another future option would be to fetch some of the microdescriptors anonymously (via a Tor circuit). Another crazy option (Roger's phrasing) is to do decoy fetches as well. 4. Transition and deployment Phase one, the directory authorities should start voting on microdescriptors, and putting them in the consensus. Phase two, directory mirrors should learn how to serve them, and learn how to read the consensus to find out what they should be serving. 
Phase three, clients should start fetching and caching them instead of normal descriptors.
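The client behaviour in section 3.3 of proposal 158 can be summarized with a short Python sketch. It assumes the tentative "half" threshold mentioned in the text and a one-week retention period; the function names and the dictionary-based cache layout are hypothetical and chosen only for this example.

```python
WEEK = 7 * 24 * 60 * 60  # seconds


def plan_microdesc_fetch(needed, cached, threshold=0.5):
    """Decide what to request after receiving a new consensus.

    needed: digests listed in the consensus; cached: digests already held.
    Returns ("none", []), ("all", []) or ("some", [missing digests]).
    """
    missing = [d for d in needed if d not in cached]
    if not missing:
        return ("none", [])
    # More than the threshold fraction missing: fetch everything at once.
    if len(missing) > threshold * len(needed):
        return ("all", [])
    return ("some", missing)


def expire_microdescs(last_listed, now):
    """Keep entries mentioned in some consensus within the last week.

    last_listed maps digest -> time it was last referenced by a consensus.
    """
    return {d: t for d, t in last_listed.items() if now - t <= WEEK}
```

Requesting "all" once many microdescriptors are missing also reduces how much the request reveals about the client's cache, which is the concern raised in section 3.3.1.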
Filename: 159-exit-scanning.txt Title: Exit Scanning Author: Mike Perry Created: 13-Feb-2009 Status: Informational Overview: This proposal describes the implementation and integration of an automated exit node scanner for scanning the Tor network for malicious, misconfigured, firewalled or filtered nodes. Motivation: Tor exit nodes can be run by anyone with an Internet connection. Often, these users aren't fully aware of limitations of their networking setup. Content filters, antivirus software, advertisements injected by their service providers, malicious upstream providers, and the resource limitations of their computer or networking equipment have all been observed on the current Tor network. It is also possible that some nodes exist purely for malicious purposes. In the past, there have been intermittent instances of nodes spoofing SSH keys, as well as nodes being used for purposes of plaintext surveillance. While it is not realistic to expect to catch extremely targeted or completely passive malicious adversaries, the goal is to prevent malicious adversaries from deploying dragnet attacks against large segments of the Tor userbase. Scanning methodology: The first scans to be implemented are HTTP, HTML, Javascript, and SSL scans. The HTTP scan scrapes Google for common filetype urls such as exe, msi, doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and compares the SHA1 hashes of the resulting content. The SSL scan downloads certificates for all IPs a domain will locally resolve to and compares these certificates to those seen over Tor. The scanner notes if a domain had rotated certificates locally in the results for each scan. The HTML scan checks HTML, Javascript, and plugin content for modifications. Because of the dynamic nature of most of the web, the scanner has a number of mechanisms built in to filter out false positives that are used when a change is noticed between Tor and Non-Tor. 
All tests also share a URL-based false positive filter that automatically removes results retroactively if the number of failures exceeds a certain percentage of nodes tested with the URL. Deployment Stages: To avoid instances where bugs cause us to mark exit nodes as BadExit improperly, it is proposed that we begin use of the scanner in stages. 1. Manual Review: In the first stage, basic scans will be run by a small number of people while we stabilize the scanner. The scanner has the ability to resume crashed scans, and to rescan nodes that fail various tests. 2. Human Review: In the second stage, results will be automatically mailed to an email list of interested parties for review. We will also begin classifying failure types into three to four different severity levels, based on both the reliability of the test and the nature of the failure. 3. Automatic BadExit Marking: In the final stage, the scanner will begin marking exits depending on the failure severity level in one of three different ways: by node idhex, by node IP, or by node IP mask. A potential fourth, less severe category of results may still be delivered via email only for review. BadExit markings will be delivered in batches upon completion of whole-network scans, so that the final false positive filter has an opportunity to filter out URLs that exhibit dynamic content beyond what we can filter. Specification of Exit Marking: Technically, BadExit could be marked via SETCONF AuthDirBadExit over the control port, but this would allow full access to the directory authority configuration and operation. The approved-routers file could also be used, but currently it only supports fingerprints, and it also contains other data unrelated to exit scanning that would be difficult to coordinate. 
Instead, we propose a new badexit-routers file with the following keywords:

    BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
    BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]

BadExitNet lines would follow the codepaths used by AuthDirBadExit to set authdir_badexit_policy, and BadExitFP would follow the codepaths from the approved-routers file's !badexit lines.

The scanner would have exclusive ability to write, append, rewrite, and modify this file. Prior to building a new consensus vote, a participating Tor authority would read in a fresh copy.

Security Implications:

Aside from evading the scanner's detection, there are two additional high-level security considerations:

1. Ensure nodes cannot be marked BadExit by an adversary at will

It is possible individual website owners will be able to target certain Tor nodes, but once they begin to attempt to fail more than the URL filter percentage of the exits, their sites will be automatically discarded.

Failing specific nodes is possible, but scanned results are fully reproducible, and BadExits should be rare enough that humans are never fully removed from the loop.

State (cookies, cache, etc) does not otherwise persist in the scanner between exit nodes, so one exit node cannot bias the results of a later one.

2. Ensure that scanner compromise does not yield authority compromise

Having a separate file that is under the exclusive control of the scanner allows us to heavily isolate the scanner from the Tor authority, potentially even running them on separate machines.
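The URL-based false positive filter that proposal 159 relies on can be illustrated with a small Python sketch. The proposal only says that a URL is dropped once failures exceed "a certain percentage" of the nodes tested with it, so the 10% default, the data layout, and the function names below are all assumptions for the sake of the example.

```python
def filter_false_positive_urls(results, max_failure_fraction=0.10):
    """Drop URLs that fail on too many exits to be a per-node problem.

    results maps url -> {node_id: failed (bool)}.
    max_failure_fraction is a hypothetical threshold, not from the proposal.
    """
    kept = {}
    for url, by_node in results.items():
        if not by_node:
            continue
        failures = sum(1 for failed in by_node.values() if failed)
        if failures / len(by_node) > max_failure_fraction:
            continue  # likely dynamic content: discard retroactively
        kept[url] = by_node
    return kept


def failing_nodes(kept):
    """Nodes that still have at least one failure after filtering."""
    nodes = set()
    for by_node in kept.values():
        nodes.update(n for n, failed in by_node.items() if failed)
    return nodes
```

A site owner who tries to make many exits fail at once pushes the URL over the threshold, so all results for that URL are discarded, which is the property the first security consideration depends on.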
Filename: 160-bandwidth-offset.txt Title: Authorities vote for bandwidth offsets in consensus Author: Roger Dingledine Created: 4-May-2009 Status: Closed Target: 0.2.1.x 1. Motivation As part of proposal 141, we moved the bandwidth value for each relay into the consensus. Now clients can know how they should load balance even before they've fetched the corresponding relay descriptors. Putting the bandwidth in the consensus also lets the directory authorities choose more accurate numbers to advertise, if we come up with a better algorithm for deciding weightings. Our original plan was to teach directory authorities how to measure bandwidth themselves; then every authority would vote for the bandwidth it prefers, and we'd take the median of votes as usual. The problem comes when we have 7 authorities, and only a few of them have smarter bandwidth allocation algorithms. So long as the majority of them are voting for the number in the relay descriptor, the minority that have better numbers will be ignored. 2. Options One fix would be to demand that every authority also run the new bandwidth measurement algorithms: in that case, part of the responsibility of being an authority operator is that you need to run this code too. But in practice we can't really require all current authority operators to do that; and if we want to expand the set of authority operators even further, it will become even more impractical. Also, bandwidth testing adds load to the network, so we don't really want to require that the number of concurrent bandwidth tests match the number of authorities we have. The better fix is to allow certain authorities to specify that they are voting on bandwidth measurements: more accurate bandwidth values that have actually been evaluated. In this way, authorities can vote on the median measured value if sufficient measured votes exist for a router, and otherwise fall back to the median value taken from the published router descriptors. 3. 
Security implications

If only some authorities choose to vote on an offset, then a majority of those voting authorities can arbitrarily change the bandwidth weighting for the relay. At the extreme, if there's only one offset-voting authority, then that authority can dictate which relays clients will find attractive.

This problem isn't entirely new: we already have the same worry with respect to the subset of authorities that vote for BadExit.

To make it not so bad, we should deploy at least three offset-voting authorities.

Also, authorities that know how to vote for offsets should vote for an offset of zero for new nodes, rather than choosing not to vote on any offset in those cases.

4. Design

First, we need a new consensus method to support this new calculation.

Now v3 votes can have an additional value on the "w" line: "w Bandwidth=X Measured=" INT.

Once we're using the new consensus method, the new way to compute the Bandwidth weight is by checking if there are at least 3 "Measured" votes. If so, the median of these is taken. Otherwise, the median of the "Bandwidth=" values is taken, as described in Proposal 141.

Then the actual consensus looks just the same as it did before, so clients never have to know that this additional calculation is happening.

5. Implementation

The Measured values will be read from a file provided by the scanners described in proposal 161. Files with a timestamp older than 3 days will be ignored.

The file will be read in by dirserv_generate_networkstatus_vote_obj() from a location specified by a new config option "V3MeasuredBandwidths". A helper function will be called to populate new 'measured' and 'has_measured' fields of the routerstatus_t 'routerstatuses' list with values read from this file.

An additional for_vote flag will be passed to routerstatus_format_entry() from format_networkstatus_vote(), which will indicate that the "Measured=" string should be appended to the "w Bandwidth=" line with the measured value in the struct.
routerstatus_parse_entry_from_string() will be modified to parse the "Measured=" values into routerstatus_t struct fields. Finally, networkstatus_compute_consensus() will set rs_out.bandwidth to the median of the measured values if there are at least 3 of them (as in section 4); otherwise it will use the bandwidth value median as normal.
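The voting rule from section 4 of proposal 160 can be sketched as follows. This is a Python illustration, not the C implementation: the function names and the tuple representation of a vote are invented for the example, and using the low median for an even number of values is an assumption, since the proposal only says "median".

```python
from statistics import median_low


def parse_w_line(line):
    """Parse a vote's 'w Bandwidth=X [Measured=Y]' line.

    Returns (bandwidth, measured), where measured is None if absent.
    """
    fields = line.split()
    if not fields or fields[0] != "w":
        raise ValueError("not a w line: %r" % (line,))
    values = dict(field.split("=", 1) for field in fields[1:])
    bandwidth = int(values["Bandwidth"])
    measured = int(values["Measured"]) if "Measured" in values else None
    return bandwidth, measured


def consensus_bandwidth(votes, min_measured=3):
    """Pick the consensus bandwidth for one relay.

    votes is a list of (bandwidth, measured_or_None), one per authority.
    """
    measured = [m for _, m in votes if m is not None]
    if len(measured) >= min_measured:
        # Enough measuring authorities: their median wins.
        return median_low(measured)
    # Otherwise fall back to the median of the Bandwidth= values.
    return median_low([b for b, _ in votes])
```

With seven authorities of which three publish Measured= values, the result is the median of those three values, so the minority with better numbers is no longer outvoted.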
Title: Computing Bandwidth Adjustments Filename: 161-computing-bandwidth-adjustments.txt Author: Mike Perry Created: 12-May-2009 Target: 0.2.1.x Status: Closed 1. Motivation There is high variance in the performance of the Tor network. Despite our efforts to balance load evenly across the Tor nodes, some nodes are significantly slower and more overloaded than others. Proposal 160 describes how we can augment the directory authorities to vote on measured bandwidths for routers. This proposal describes what goes into the measuring process. 2. Measurement Selection The general idea is to determine a load factor representing the ratio of the capacity of measured nodes to the rest of the network. This load factor could be computed from three potentially relevant statistics: circuit failure rates, circuit extend times, or stream capacity. Circuit failure rates and circuit extend times appear to be non-linearly proportional to node load. We've observed that the same nodes when scanned at US nighttime hours (when load is presumably lower) exhibit almost no circuit failure, and significantly faster extend times than when scanned during the day. Stream capacity, however, is much more uniform, even during US nighttime hours. Moreover, it is a more intuitive representation of node capacity, and also less dependent upon distance and latency if amortized over large stream fetches. 3. Average Stream Bandwidth Calculation The average stream bandwidths are obtained by dividing the network into slices of 50 nodes each, grouped according to advertised node bandwidth. Two hop circuits are built using nodes from the same slice, and a large file is downloaded via these circuits. The file sizes are set based on node percentile rank as follows: 0-10: 2M 10-20: 1M 20-30: 512k 30-50: 256k 50-100: 128k These sizes are based on measurements performed during test scans. This process is repeated until each node has been chosen to participate in at least 5 circuits. 4. 
Ratio Calculation

The ratios are calculated by dividing each measured value by the network-wide average.

5. Ratio Filtering

After the base ratios are calculated, a second pass is performed to remove any streams with nodes of ratios less than X=0.5 from the results of other nodes. In addition, all outlying streams with a capacity one standard deviation below a node's average are also removed.

The final ratio result will be the greater of the unfiltered ratio and the filtered ratio.

6. Pseudocode for Ratio Calculation Algorithm

Here is the complete pseudocode for the ratio algorithm:

    Slices = {S | S is 50 nodes of similar consensus capacity}
    for S in Slices:
      while exists node N in S with circ_chosen(N) < 7:
        fetch_slice_file(build_2hop_circuit(N, (exit in S)))
      for N in S:
        BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
        Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
      Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
      for N in S:
        Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
        BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))

    Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
    Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)

    for N in all Slices:
      Bw_net_ratio(N) = BW_measured(N)/Bw_net_avg(Slices)
      Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)

      ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))

7. Security implications

The ratio filtering will deal with cases of sabotage by dropping both very slow outliers in stream average calculations and streams that used very slow nodes from the calculation of other nodes.

This scheme will not address nodes that try to game the system by providing better service to scanners. The scanners can be detected at the entry by IP address, and at the exit by the destination fetch IP.
Measures can be taken to obfuscate and separate the scanners' source IP address from the directory authority IP address. For instance, scans can happen offsite and the results can be rsynced into the authorities. The destination server IP can also change.

Neither of these methods is foolproof, but such nodes can already lie about their bandwidth to attract more traffic, so this solution does not set us back any in that regard.

8. Parallelization

Because each slice takes as long as 6 hours to complete, we will want to parallelize as much as possible. This will be done by concurrently running multiple scanners from each authority to deal with different segments of the network. Each scanner piece will continually loop over a portion of the network, outputting files of the form:

    node_id=<idhex> SP strm_bw=<BW_measured(N)> SP filt_bw=<BW_Norm_measured(N)> SP ns_bw=<CurrentConsensusBw(N)> NL

The most recent file from each scanner will be periodically gathered by another script that uses them to produce network-wide averages and calculate ratios as per the algorithm in section 6. Because nodes may shift in capacity, they may appear in more than one slice and/or appear more than once in the file set. The most recently measured line will be chosen in this case.

9. Integration with Proposal 160

The final results will be produced for the voting mechanism described in Proposal 160 by multiplying the derived ratio by the average published consensus bandwidth during the course of the scan, and taking the weighted average with the previous consensus bandwidth:

    Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))

The Alpha parameter is a smoothing parameter intended to prevent rapid oscillation between loaded and unloaded conditions. It is currently fixed at 0.333.

The Round() step consists of rounding to the 3 most significant figures in base 10, and then rounding that result to the nearest 1000, with a minimum value of 1000.
This will produce a new bandwidth value that will be output into a file consisting of lines of the form: node_id=<idhex> SP bw=<Bw_new> NL The first line of the file will contain a timestamp in UNIX time() seconds. This will be used by the authority to decide if the measured values are too old to use. This file can be either copied or rsynced into a directory readable by the directory authority.
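To make the smoothing formula and the Round() step in section 9 of proposal 161 concrete, here is a small Python sketch. It follows the text literally (three significant figures first, then the nearest 1000, then the minimum of 1000). Rounding halves upward and the function names are assumptions of this example, since the proposal does not specify a tie-breaking rule.

```python
ALPHA = 0.333  # smoothing parameter, "currently fixed at 0.333"


def round_bw(value):
    """Round() from section 9: 3 significant figures, then nearest 1000."""
    n = int(value + 0.5)  # assumption: halves round up
    digits = len(str(n)) if n > 0 else 1
    if digits > 3:
        factor = 10 ** (digits - 3)
        n = (n + factor // 2) // factor * factor
    n = (n + 500) // 1000 * 1000
    return max(n, 1000)


def new_bandwidth(bw_current, bw_scan_avg, bw_ratio, alpha=ALPHA):
    """Bw_new = Round((Bw_current*Alpha + Bw_scan_avg*Bw_ratio)/(Alpha+1))."""
    blended = (bw_current * alpha + bw_scan_avg * bw_ratio) / (alpha + 1)
    return round_bw(blended)


def format_result_line(idhex, bw_new):
    """One line of the output file: node_id=<idhex> SP bw=<Bw_new> NL."""
    return "node_id=%s bw=%d\n" % (idhex, bw_new)
```

For example, a node with a current consensus bandwidth of 100000, a scan-period average of 120000 and a ratio of 1.5 blends to roughly 160015, which Round() turns into 160000.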
Filename: 162-consensus-flavors.txt Title: Publish the consensus in multiple flavors Author: Nick Mathewson Created: 14-May-2009 Implemented-In: 0.2.3.1-alpha Status: Closed [Implementation notes: the 'consensus index' feature never got implemented.] Overview: This proposal describes a way to publish each consensus in multiple simultaneous formats, or "flavors". This will reduce the amount of time needed to deploy new consensus-like documents, and reduce the size of consensus documents in the long term. Motivation: In the future, we will almost surely want different fields and data in the network-status document. Examples include: - Publishing hashes of microdescriptors instead of hashes of full descriptors (Proposal 158). - Including different digests of descriptors, instead of the perhaps-soon-to-be-totally-broken SHA1. Note that in both cases, from the client's point of view, this information _replaces_ older information. If we're using a SHA256 hash, we don't need to see the SHA1. If clients only want microdescriptors, they don't (necessarily) need to see hashes of other things. Our past approach to cases like this has been to shovel all of the data into the consensus document. But this is rather poor for bandwidth. Adding a single SHA256 hash to a consensus for each router increases the compressed consensus size by 47%. In comparison, replacing a single SHA1 hash with a SHA256 hash for each listed router increases the consensus size by only 18%. Design in brief: Let the voting process remain as it is, until a consensus is generated. With future versions of the voting algorithm, instead of just a single consensus being generated, multiple consensus "flavors" are produced. Consensuses (all of them) include a list of which flavors are being generated. Caches fetch and serve all flavors of consensus that are listed, regardless of whether they can parse or validate them, and serve them to clients. 
Thus, once this design is in place, we won't need to deploy more cache changes in order to get new flavors of consensus to be cached.

Clients download only the consensus flavor they want.

A note on hashes:

Everything in this document is specified to use SHA256, and to be upgradeable to use better hashes in the future.

Spec modifications:

1. URLs and changes to the current consensus format.

Every consensus flavor has a name consisting of a sequence of one or more alphanumeric characters and dashes. For compatibility, the current descriptor flavor is called "ns".

The supported consensus flavors are defined as part of the authorities' consensus method.

For each supported flavor, every authority calculates another consensus document of as-yet-unspecified format, and exchanges detached signatures for these documents as in the current consensus design.

In addition to the consensus currently served at

    /tor/status-vote/(current|next)/consensus.z

and

    /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z

authorities serve another consensus of each flavor "F" from the locations

    /tor/status-vote/(current|next)/consensus-F.z

and

    /tor/status-vote/(current|next)/consensus-F/<FP1>+....z

When caches serve these documents, they do so from the same locations.

2. Document format: generic consensus.

The format of a flavored consensus is as-yet-unspecified, except that the first line is:

    "network-status-version" SP version SP flavor NL

where version is 3 or higher, and the flavor is a string consisting of alphanumeric characters and dashes, matching the corresponding flavor listed in the unflavored consensus.

3. Document format: detached signatures.

We amend the detached signature format to include more than one consensus-digest line, and more than one set of signatures.
After the consensus-digest line, we allow more lines of the form: "additional-digest" SP flavor SP algname SP digest NL Before the directory-signature lines, we allow more entries of the form: "additional-signature" SP flavor SP algname SP identity SP signing-key-digest NL signature. [We do not use "consensus-digest" or "directory-signature" for flavored consensuses, since this could confuse older Tors.] The consensus-signatures URL should contain the signatures for _all_ flavors of consensus. 4. The consensus index: Authorities additionally generate and serve a consensus-index document. Its format is: Header ValidAfter ValidUntil Documents Signatures Header = "consensus-index" SP version NL ValidAfter = as in a consensus ValidUntil = as in a consensus Documents = Document* Document = "document" SP flavor SP SignedLength 1*(SP AlgorithmName "=" Digest) NL Signatures = Signature* Signature = "directory-signature" SP algname SP identity SP signing-key-digest NL signature There must be one Document line for each generated consensus flavor. Each Document line describes the length of the signed portion of a consensus (the signatures themselves are not included), along with one or more digests of that signed portion. Digests are given in hex. The algorithm "sha256" MUST be included; others are allowed. The algname part of a signature describes what algorithm was used to hash the identity and signing keys, and to compute the signature. The algorithm "sha256" MUST be recognized; signatures with unrecognized algorithms MUST be ignored. (See below). The consensus index is made available at /tor/status-vote/(current|next)/consensus-index.z. Caches should fetch this document so they can check the correctness of the different consensus documents they fetch. They do not need to check anything about an unrecognized consensus document beyond its digest and length. 4.1. The "sha256" signature format. 
The 'SHA256' signature format for directory objects is defined as the RSA signature of the OAEP+-padded SHA256 digest of the item to be signed. When checking signatures, the signature MUST be treated as valid if the signature material begins with SHA256(document); this allows us to add other data later. Considerations: - We should not create a new flavor of consensus when adding a field instead wouldn't be too onerous. - We should not proliferate flavors lightly: clients will be distinguishable based on which flavor they download. Migration: - Stage one: authorities begin generating and serving consensus-index files. - Stage two: Caches begin downloading consensus-index files, validating them, and using them to decide what flavors of consensus documents to cache. They download all listed documents, and compare them to the digests given in the consensus. - Stage three: Once we want to make a significant change to the consensus format, we deploy another flavor of consensus at the authorities. This will immediately start getting cached by the caches, and clients can start fetching the new flavor without waiting a version or two for enough caches to begin supporting it. Acknowledgements: Aspects of this design and its applications to hash migration were heavily influenced by IRC conversations with Marian.
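The naming and URL rules from sections 1 and 2 of this proposal can be sketched in a few lines of Python. This is only an illustration, not code from Tor: the function names `consensus_path` and `parse_header` are hypothetical, the flavor name "microdesc" in the usage note is just an example, and mapping the "ns" flavor onto the existing unflavored location is an assumption drawn from the compatibility remark in section 1.

```python
import re

# Flavor names: one or more alphanumeric characters and dashes (section 1).
FLAVOR_RE = re.compile(r"[A-Za-z0-9-]+")


def consensus_path(flavor, which="current", fingerprints=()):
    """Build the path from which a consensus of the given flavor is served."""
    if which not in ("current", "next"):
        raise ValueError("which must be 'current' or 'next'")
    if not FLAVOR_RE.fullmatch(flavor):
        raise ValueError("bad flavor name: %r" % (flavor,))
    base = "/tor/status-vote/%s/consensus" % which
    # Assumption: the legacy "ns" flavor stays at the unflavored location.
    if flavor != "ns":
        base += "-" + flavor
    if fingerprints:
        return "%s/%s.z" % (base, "+".join(fingerprints))
    return base + ".z"


def parse_header(line):
    """Parse '"network-status-version" SP version SP flavor NL' (section 2)."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 3 or parts[0] != "network-status-version":
        raise ValueError("not a flavored consensus header: %r" % (line,))
    version, flavor = int(parts[1]), parts[2]
    if version < 3 or not FLAVOR_RE.fullmatch(flavor):
        raise ValueError("bad version or flavor: %r" % (line,))
    return version, flavor
```

For example, `consensus_path("microdesc")` yields `/tor/status-vote/current/consensus-microdesc.z`. A cache following this proposal would not need to understand anything past the header line in order to store and serve the document.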
Filename: 163-detecting-clients.txt Title: Detecting whether a connection comes from a client Author: Nick Mathewson Created: 22-May-2009 Target: 0.2.2 Status: Superseded [Note: Actually, this is partially done, partially superseded -nickm, 9 May 2011] Overview: Some aspects of Tor's design require relays to distinguish connections from clients from connections that come from relays. The existing means for doing this is easy to spoof. We propose a better approach. Motivation: There are at least two reasons for which Tor servers want to tell which connections come from clients and which come from other servers: 1) Some exits, proposal 152 notwithstanding, want to disallow their use as single-hop proxies. 2) Some performance-related proposals involve prioritizing traffic from relays, or limiting traffic per client (but not per relay). Right now, we detect client vs server status based on how the client opens circuits. (Check out the code that implements the AllowSingleHopExits option if you want all the details.) This method is depressingly easy to fake, though. This document proposes better means. Goals: To make grabbing relay privileges at least as difficult as just running a relay. In the analysis below, "using server privileges" means taking any action that only servers are supposed to do, like delivering a BEGIN cell to an exit node that doesn't allow single hop exits, or claiming server-like amounts of bandwidth. Passive detection: A connection is definitely a client connection if it takes one of the TLS methods during setup that does not establish an identity key. A circuit is definitely a client circuit if it is initiated with a CREATE_FAST cell, though the node could be a client or a server. A node that's listed in a recent consensus is probably a server. A node to which we have successfully extended circuits from multiple origins is probably a server. 
Active detection:

If a node doesn't try to use server privileges at all, we never need to care whether it's a server.

When a node or circuit tries to use server privileges, if it is "definitely a client" as per above, we can refuse it immediately.

If it's "probably a server" as per above, we can accept it.

Otherwise, we have either a client, or a server that is neither listed in any consensus nor used by any other clients -- in other words, a new or private server.

For these servers, we should attempt to build one or more test circuits through them. If enough of the circuits succeed, the node is a real relay. If not, it is probably a client.

While we are waiting for the test circuits to succeed, we should allow a short grace period in which server privileges are permitted. When a test is done, we should remember its outcome for a while, so we don't need to do it again.

Why it's hard to do good testing:

Doing a test circuit starting with an unlisted router requires only that we have an open connection for it. Doing a test circuit starting elsewhere _through_ an unlisted router -- though more reliable -- would require that we have a known address, port, identity key, and onion key for the router. Only the address and identity key are easily available via the current Tor protocol in all cases.

We could fix this part by requiring that all servers support BEGIN_DIR and support downloading at least a current descriptor for themselves.

Open questions:

What are the thresholds for the needed numbers of circuits for us to decide that a node is a relay?

[Suggested answer: two circuits from two distinct hosts.]

How do we pick grace periods? How long do we remember the outcome of a test?

[Suggested answer: 10 minute grace period; 48 hour memory of test outcomes.]

If we can build circuits starting at a suspect node, but we don't have enough information to try extending circuits elsewhere through the node, should we conclude that the node is "server-like" or not?
[Suggested answer: for now, just try making circuits through the node. Extend this to extending circuits as needed.]
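The passive and active detection rules of this proposal combine into a small decision procedure, sketched below in Python. It is an illustration only: the names `Verdict` and `decide` are hypothetical, and reading "multiple origins" as "at least two" (`min_origins=2`) is an assumption borrowed from the suggested answer in the open questions.

```python
from enum import Enum


class Verdict(Enum):
    REFUSE = "refuse"  # definitely a client: deny server privileges
    ACCEPT = "accept"  # probably a server: allow server privileges
    TEST = "test"      # unknown: build test circuits, allow a grace period


def decide(tls_has_identity_key, circuit_used_create_fast,
           listed_in_recent_consensus, distinct_extend_origins,
           min_origins=2):
    """Decide how to treat a request to use server privileges."""
    # Passive detection: no identity key in the TLS handshake, or a circuit
    # opened with CREATE_FAST, marks the requester as definitely a client.
    if not tls_has_identity_key or circuit_used_create_fast:
        return Verdict.REFUSE
    # Listed in a recent consensus, or successfully extended to from
    # multiple origins: probably a server.
    if listed_in_recent_consensus or distinct_extend_origins >= min_origins:
        return Verdict.ACCEPT
    # Otherwise this is a client or a new/private server: test it actively.
    return Verdict.TEST
```

A caller that receives `Verdict.TEST` would start the test circuits, permit server privileges during the grace period, and cache the outcome for the chosen memory period.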
Filename: 164-reporting-server-status.txt
Title: Reporting the status of server votes
Author: Nick Mathewson
Created: 22-May-2009
Status: Obsolete

Notes: This doesn't work with the current things authorities do, though we could revise it to work if we ever want to do this.

Overview:

When a given node isn't listed in the directory, it isn't always easy to tell why. This proposal suggests a quick-and-dirty way for authorities to export not only how they voted, but why, and a way to collate the information.

Motivation:

Right now, if you want to know the reason why your server was listed a certain way in the Tor directory, the following steps are recommended:

- Look through your log for reports of what the authority said when you tried to upload.
- Look at the consensus; see if you're listed.
- Wait a while, see if things get better.
- Download the votes from all the authorities, and see how they voted. Try to figure out why.
- If you think they'll listen to you, ask some authority operators to look you up in their mtbf files and logs to see why they voted as they did.

This is far too hard.

Solution:

We should add a new vote-like information-only document that authorities serve on request. Call it a "vote info". It is generated at the same time as a vote, but used only for determining why a server voted as it did. It is served from /tor/status-vote-info/current/authority[.z]

It differs from a vote in that:

* Its vote-status field is 'vote-info'.

* It includes routers that the authority would not include in its vote. For these, it includes an "omitted" line with an English message explaining why they were omitted.

* For each router, it includes a line describing its WFU and MTBF. The format is:

    "stability <mtbf> up-since='date'"
    "uptime <wfu> down-since='date'"

* It describes the WFU and MTBF thresholds it requires to vote for a given router in various roles in the header. The format is:

    "flag-requirement <flag-name> <field> <op> <value>"

  e.g.

    "flag-requirement Guard uptime > 80"

* It includes info on routers all of whose descriptors were uploaded but rejected over the past few hours. The "r" lines for these are the same as for regular routers. The other lines are omitted for these routers, and are replaced with a single "rejected" line, explaining (in English) why the router was rejected.

A status site (like Torweather or Torstatus or another tool) can poll these files when they are generated, collate the data, and make it available to server operators.

Risks:

This document makes no provisions for caching these "vote info" documents. If many people wind up fetching them aggressively from the authorities, that would be bad.
Filename: 165-simple-robust-voting.txt
Title: Easy migration for voting authority sets
Author: Nick Mathewson
Created: 2009-05-28
Status: Rejected

Status: rejected as too complex.

Overview:

This proposal describes an easy-to-implement, easy-to-verify way to change the set of authorities without creating a "flag day" situation.

Motivation:

From proposal 134 ("More robust consensus voting with diverse authority sets") by Peter Palfrader:

    Right now there are about five authoritative directory servers in the Tor network, tho this number is expected to rise to about 15 eventually. Adding a new authority requires synchronized action from all operators of directory authorities so that at any time during the update at least half of all authorities are running and agree on who is an authority. The latter requirement is there so that the authorities can arrive at a common consensus: Each authority builds the consensus based on the votes from all authorities it recognizes, and so a different set of recognized authorities will lead to a different consensus document.

In response to this problem, proposal 134 suggested that every candidate authority list in its vote whom it believes to be an authority. These A-says-B-is-an-authority relationships form a directed graph. Each authority then iteratively finds the largest clique in the graph and removes it, until it finds one containing itself. It then votes with this clique.

Proposal 134 had some problems:

- It had a security problem in that M hostile authorities in a clique could effectively kick out M-1 honest authorities. This could enable a minority of the original authorities to take over.

- It was too complex in its implications to analyze well: it took us over a year to realize that it was insecure.

- It tried to solve a bigger problem: general fragmentation of authority trust. Really, all we wanted to have was the ability to add and remove authorities without forcing a flag day.
Proposed protocol design:

A "Voting Set" is a set of authorities. Each authority has a list of the voting sets it considers acceptable. These sets are chosen manually by the authority operators. They must always contain the authority itself. Each authority lists all of these voting sets in its votes.

Authorities exchange votes with every other authority in any of their voting sets.

When it is time to calculate a consensus, an authority picks votes from whichever voting set it lists that is listed by the most members of that set. In other words, given two sets S1 and S2 that an authority lists, that authority will prefer to vote with S1 over S2 whenever the number of other authorities in S1 that themselves list S1 is higher than the number of other authorities in S2 that themselves list S2.

For example, suppose authority A recognizes two sets, "A B C D" and "A E F G H". Suppose that the first set is recognized by all of A, B, C, and D, whereas the second set is recognized only by A, E, and F. Because the first set is recognized by more of the authorities in it than the other one, A will vote with the first set.

Ties are broken in favor of some arbitrary function of the identity keys of the authorities in the set.

How to migrate authority sets:

In steady state, each authority operator should list only the current actual voting set as accepted.

When we want to add an authority, each authority operator configures his or her server to list two voting sets: one containing all the old authorities, and one containing the old authorities and the new authority too. Once all authorities are listing the new set of authorities, they will start voting with that set because of its size.

What if one or two authority operators are slow to list the new set? Then the other operators can stop listing the old set once there are enough authorities listing the new set to make its voting successful.
(Note that these authorities not listing the new set will still have their votes counted, since they themselves will be members of the new set. They will only fail to sign the consensus generated by the other authorities who are using the new set.) When we want to remove an authority, the operators list two voting sets: one containing all the authorities, and one omitting the authority we want to remove. Once enough authorities list the new set as acceptable, we start having authority operators stop listing the old set. Once there are more listing the new set than the old set, the new set will win. Data format changes: Add a new 'voting-set' line to the vote document format. Allow it to occur any number of times. Its format is: voting-set SP 'fingerprint' SP 'fingerprint' ... NL where each fingerprint is the hex fingerprint of an identity key of an authority. Sort fingerprints in ascending order. When the consensus method is at least 'X' (decide this when we implement the proposal), add this line to the consensus format as well, before the first dir-source line. [This information is not redundant with the dir-source sections in the consensus: If an authority is recognized but didn't vote, that authority will appear in the voting-set line but not in the dir-source sections.] We don't need to list other information about authorities in our vote. Migration issues: We should keep track somewhere which Tor client versions recognized which authorities. Acknowledgments: The design came out of an IRC conversation with Peter Palfrader. He had the basic idea first.
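The voting-set selection rule from the "Proposed protocol design" section of this proposal can be sketched as follows. This is an illustration, not an implementation from Tor: `pick_voting_set` is a hypothetical name, authorities are represented by plain strings rather than hex fingerprints, and the tie-break (preferring the lexicographically larger sorted member tuple) is only a stand-in for the "arbitrary function of the identity keys" that the proposal leaves unspecified.

```python
def pick_voting_set(me, my_sets, listed_by):
    """Pick the voting set that `me` should vote with.

    me         -- identifier of this authority
    my_sets    -- iterable of frozensets: the voting sets this authority lists
    listed_by  -- dict mapping each authority to the set of frozensets
                  (voting sets) that it lists in its own vote
    """
    def support(voting_set):
        # Count the *other* members of the set that themselves list it.
        return sum(
            1
            for auth in voting_set
            if auth != me and voting_set in listed_by.get(auth, ())
        )

    # Highest support wins; the second tuple element is a placeholder
    # tie-break (an assumption, see the lead-in above).
    return max(my_sets, key=lambda vs: (support(vs), tuple(sorted(vs))))
```

Running this on the example from the proposal, where A lists "A B C D" (also listed by B, C, and D) and "A E F G H" (also listed only by E and F), selects the first set, since three other members support it against two for the second.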
Filename: 166-statistics-extra-info-docs.txt Title: Including Network Statistics in Extra-Info Documents Author: Karsten Loesing Created: 21-Jul-2009 Target: 0.2.2 Status: Closed Change history: 21-Jul-2009 Initial proposal for or-dev Overview: The Tor network has grown to almost two thousand relays and millions of casual users over the past few years. With growth has come increasing performance problems and attempts by some countries to block access to the Tor network. In order to address these problems, we need to learn more about the Tor network. This proposal suggests to measure additional statistics and include them in extra-info documents to help us understand the Tor network better. Introduction: As of May 2009, relays, bridges, and directories gather the following data for statistical purposes: - Relays and bridges count the number of bytes that they have pushed in 15-minute intervals over the past 24 hours. Relays and bridges include these data in extra-info documents that they send to the directory authorities whenever they publish their server descriptor. - Bridges further include a rough number of clients per country that they have seen in the past 48 hours in their extra-info documents. - Directories can be configured to count the number of clients they see per country in the past 24 hours and to write them to a local file. Since then we extended the network statistics in Tor. These statistics include: - Directories now gather more precise statistics about connecting clients. Fixes include measuring in intervals of exactly 24 hours, counting unsuccessful requests, measuring download times, etc. The directories append their statistics to a local file every 24 hours. - Entry guards count the number of clients per country per day like bridges do and write them to a local file every 24 hours. - Relays measure statistics of the number of cells in their circuit queues and how much time these cells spend waiting there. 
Relays write these statistics to a local file every 24 hours. - Exit nodes count the number of read and written bytes on exit connections per port as well as the number of opened exit streams per port in 24-hour intervals. Exit nodes write their statistics to a local file. The following four sections contain descriptions for adding these statistics to the relays' extra-info documents. Directory request statistics: The first type of statistics aims at measuring directory requests sent by clients to a directory mirror or directory authority. More precisely, these statistics aim at requests for v2 and v3 network statuses only. These directory requests are sent non-anonymously, either via HTTP-like requests to a directory's Dir port or tunneled over a 1-hop circuit. Measuring directory request statistics is useful for several reasons: First, the number of locally seen directory requests can be used to estimate the total number of clients in the Tor network. Second, the country-wise classification of requests using a GeoIP database can help counting the relative and absolute number of users per country. Third, the download times can give hints on the available bandwidth capacity at clients. Directory requests do not give any hints on the contents that clients send or receive over the Tor network. Every client requests network statuses from the directories, so that there are no anonymity-related concerns to gather these statistics. It might be, though, that clients wish to hide the fact that they are connecting to the Tor network. Therefore, IP addresses are resolved to country codes in memory, events are accumulated over 24 hours, and numbers are rounded up to multiples of 4 or 8. "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL [At most once.] YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default). 
A "dirreq-stats-end" line, as well as any other "dirreq-*" line, is only added when the relay has opened its Dir port and after 24 hours of measuring directory requests.

"dirreq-v2-ips" CC=N,CC=N,... NL
    [At most once.]
"dirreq-v3-ips" CC=N,CC=N,... NL
    [At most once.]

List of mappings from two-letter country codes to the number of unique IP addresses that have connected from that country to request a v2/v3 network status, rounded up to the nearest multiple of 8. Only those IP addresses are counted that the directory can answer with a 200 OK status code.

"dirreq-v2-reqs" CC=N,CC=N,... NL
    [At most once.]
"dirreq-v3-reqs" CC=N,CC=N,... NL
    [At most once.]

List of mappings from two-letter country codes to the number of requests for v2/v3 network statuses from that country, rounded up to the nearest multiple of 8. Only those requests are counted that the directory can answer with a 200 OK status code.

"dirreq-v2-share" num% NL
    [At most once.]
"dirreq-v3-share" num% NL
    [At most once.]

The share of v2/v3 network status requests that the directory expects to receive from clients based on its advertised bandwidth compared to the overall network bandwidth capacity. Shares are formatted in percent with two decimal places. Shares are calculated as means over the whole 24-hour interval.

"dirreq-v2-resp" status=num,... NL
    [At most once.]
"dirreq-v3-resp" status=num,... NL
    [At most once.]

List of mappings from response statuses to the number of requests for v2/v3 network statuses that were answered with that response status, rounded up to the nearest multiple of 4. Only response statuses with at least 1 response are reported. New response statuses can be added at any time. The current list of response statuses is as follows:

"ok": a network status request is answered; this number corresponds to the sum of all requests as reported in "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before rounding up.
"not-enough-sigs": a version 3 network status is not signed by a sufficient number of requested authorities.
"unavailable": a requested network status object is unavailable.
"not-found": a requested network status is not found.
"not-modified": a network status has not been modified since the If-Modified-Since time that is included in the request.
"busy": the directory is busy.

"dirreq-v2-direct-dl" key=val,... NL
    [At most once.]
"dirreq-v3-direct-dl" key=val,... NL
    [At most once.]
"dirreq-v2-tunneled-dl" key=val,... NL
    [At most once.]
"dirreq-v3-tunneled-dl" key=val,... NL
    [At most once.]

List of statistics about possible failures in the download process of v2/v3 network statuses. Requests are either "direct" HTTP-encoded requests over the relay's directory port, or "tunneled" requests using a BEGIN_DIR cell over the relay's OR port. The list of possible statistics can change, and statistics can be left out from reporting. The current list of statistics is as follows:

Successful downloads and failures:

"complete": a client has finished the download successfully.
"timeout": a download did not finish within 10 minutes after starting to send the response.
"running": a download is still running at the end of the measurement period, and less than 10 minutes have passed since the directory started to send the response.

Download times:

"min", "max": smallest and largest measured bandwidth in B/s.
"d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured bandwidth in B/s. For a given decile i, i/10 of all downloads had a smaller bandwidth than di, and (10-i)/10 of all downloads had a larger bandwidth than di.
"q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One fourth of all downloads had a smaller bandwidth than q1, one fourth of all downloads had a larger bandwidth than q3, and the remaining half of all downloads had a bandwidth between q1 and q3.
"md": median of measured bandwidth in B/s.
Half of the downloads had a smaller bandwidth than md, the other half had a larger bandwidth than md. Entry guard statistics: Entry guard statistics include the number of clients per country and per day that are connecting directly to an entry guard. Entry guard statistics are important to learn more about the distribution of clients to countries. In the future, this knowledge can be useful to detect if there are or start to be any restrictions for clients connecting from specific countries. The information which client connects to a given entry guard is very sensitive. This information must not be combined with the information what contents are leaving the network at the exit nodes. Therefore, entry guard statistics need to be aggregated to prevent them from becoming useful for de-anonymization. Aggregation includes resolving IP addresses to country codes, counting events over 24-hour intervals, and rounding up numbers to the next multiple of 8. "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL [At most once.] YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default). An "entry-stats-end" line, as well as any other "entry-*" line, is first added after the relay has been running for at least 24 hours. "entry-ips" CC=N,CC=N,... NL [At most once.] List of mappings from two-letter country codes to the number of unique IP addresses that have connected from that country to the relay and which are no known other relays, rounded up to the nearest multiple of 8. Cell statistics: The third type of statistics have to do with the time that cells spend in circuit queues. In order to gather these statistics, the relay memorizes when it puts a given cell in a circuit queue and when this cell is flushed. The relay further notes the life time of the circuit. These data are sufficient to determine the mean number of cells in a queue over time and the mean time that cells spend in a queue. 
Cell statistics are necessary to learn more about possible reasons for the poor network performance of the Tor network, especially high latencies. The same statistics are also useful to determine the effects of design changes by comparing today's data with future data. There are basically no privacy concerns from measuring cell statistics, regardless of a node being an entry, middle, or exit node. "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL [At most once.] YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default). A "cell-stats-end" line, as well as any other "cell-*" line, is first added after the relay has been running for at least 24 hours. "cell-processed-cells" num,...,num NL [At most once.] Mean number of processed cells per circuit, subdivided into deciles of circuits by the number of cells they have processed in descending order from loudest to quietest circuits. "cell-queued-cells" num,...,num NL [At most once.] Mean number of cells contained in queues by circuit decile. These means are calculated by 1) determining the mean number of cells in a single circuit between its creation and its termination and 2) calculating the mean for all circuits in a given decile as determined in "cell-processed-cells". Numbers have a precision of two decimal places. "cell-time-in-queue" num,...,num NL [At most once.] Mean time cells spend in circuit queues in milliseconds. Times are calculated by 1) determining the mean time cells spend in the queue of a single circuit and 2) calculating the mean for all circuits in a given decile as determined in "cell-processed-cells". "cell-circuits-per-decile" num NL [At most once.] Mean number of circuits that are included in any of the deciles, rounded up to the next integer. Exit statistics: The last type of statistics affects exit nodes counting the number of bytes written and read and the number of streams opened per port and per 24 hours. 
Exit port statistics can be measured from looking at headers of BEGIN and DATA cells. A BEGIN cell contains the exit port that is required for the exit node to open a new exit stream. Subsequent DATA cells coming from the client or being sent back to the client contain a length field stating how many bytes of application data are contained in the cell. Exit port statistics are important to measure in order to identify possible load-balancing problems with respect to exit policies. Exit nodes that permit more ports than others are very likely overloaded with traffic for those ports plus traffic for other ports. Improving load balancing in the Tor network improves the overall utilization of bandwidth capacity. Exit traffic is one of the most sensitive parts of network data in the Tor network. Even though these statistics do not require looking at traffic contents, statistics are aggregated so that they are not useful for de-anonymizing users. Only those ports are reported that have seen at least 0.1% of exiting or incoming bytes, numbers of bytes are rounded up to full kibibytes (KiB), and stream numbers are rounded up to the next multiple of 4. "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL [At most once.] YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default). An "exit-stats-end" line, as well as any other "exit-*" line, is first added after the relay has been running for at least 24 hours and only if the relay permits exiting (where exiting to a single port and IP address is sufficient). "exit-kibibytes-written" port=N,port=N,... NL [At most once.] "exit-kibibytes-read" port=N,port=N,... NL [At most once.] List of mappings from ports to the number of kibibytes that the relay has written to or read from exit connections to that port, rounded up to the next full kibibyte. "exit-streams-opened" port=N,port=N,... NL [At most once.] 
List of mappings from ports to the number of opened exit streams to that port, rounded up to the nearest multiple of 4.

Implementation notes:

Right now, relays that are configured accordingly write similar statistics to those described in this proposal to disk every 24 hours. With this proposal being implemented, relays include the contents of these files in extra-info documents.

The following steps are necessary to implement this proposal:

1. The current format of [dirreq|entry|buffer|exit]-stats files needs to be adapted to the description in this proposal. This step basically means renaming keywords.

2. The timing of writing the four *-stats files should be unified, so that they are written exactly 24 hours after starting the relay. Right now, the measurement intervals for dirreq, entry, and exit stats start with the first observed request, and files are written when observing the first request that occurs more than 24 hours after the beginning of the measurement interval. With this proposal, the measurement intervals should all start at the same time, and files should be written exactly 24 hours later.

3. It is advantageous to cache statistics in local files in the data directory until they are included in extra-info documents. The reason is that the 24-hour measurement interval can be very different from the 18-hour publication interval of extra-info documents. When a relay crashes after finishing a measurement interval, but before publishing the next extra-info document, statistics would get lost. Therefore, statistics are written to disk when finishing a measurement interval and read from disk when generating an extra-info document. Only the statistics that were appended to the *-stats files within the past 24 hours are included in extra-info documents. Further, the contents of the *-stats files need to be checked in the process of generating extra-info documents.

4. With the statistics patches being tested, the ./configure options should be removed and the statistics code be compiled by default. It is still required for relay operators to add configuration options (DirReqStatistics, ExitPortStatistics, etc.) to enable gathering statistics. However, in the near future, statistics shall be gathered by all relays by default, where requiring a ./configure option would be a barrier for many relay operators.
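The aggregation rules used throughout this proposal (round counts up to a multiple of 4 or 8, then emit "CC=N" pairs after a keyword) are easy to get subtly wrong, so here is a minimal Python sketch. The function names are hypothetical, and ordering the pairs by country code is an assumption made only to keep the output deterministic; the proposal does not specify an order.

```python
from datetime import datetime


def round_up(n, multiple):
    """Round a non-negative count up to the next multiple (ceiling)."""
    return -(-n // multiple) * multiple


def stats_end_line(prefix, end: datetime, nsec=86400):
    """Format e.g. '"dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL'."""
    stamp = end.strftime("%Y-%m-%d %H:%M:%S")
    return "%s-stats-end %s (%d s)\n" % (prefix, stamp, nsec)


def country_line(keyword, counts, multiple=8):
    """Format a 'keyword CC=N,CC=N,... NL' line with rounded-up counts."""
    # Assumption: pairs are sorted by country code; the proposal is silent.
    pairs = ",".join(
        "%s=%d" % (cc, round_up(n, multiple))
        for cc, n in sorted(counts.items())
    )
    return "%s %s\n" % (keyword, pairs)
```

For instance, raw counts of 13 clients from "us" and 3 from "de" would be reported as `dirreq-v3-ips de=8,us=16`, so an observer of the extra-info document cannot recover the exact counts.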
Filename: 167-params-in-consensus.txt Title: Vote on network parameters in consensus Author: Roger Dingledine Created: 18-Aug-2009 Status: Closed Implemented-In: 0.2.2 0. History 1. Overview Several of our new performance plans involve guessing how to tune clients and relays, yet we won't be able to learn whether we guessed the right tuning parameters until many people have upgraded. Instead, we should have directory authorities vote on the parameters, and teach Tors to read the currently recommended values out of the consensus. 2. Design V3 votes should include a new "params" line after the known-flags line. It contains key=value pairs, where value is an integer. Consensus documents that are generated with a sufficiently new consensus method (7?) then include a params line that includes every key listed in any vote, and the median value for that key (in case of ties, we use the median closer to zero). 2.1. Planned keys. The first planned parameter is "circwindow=101", which is the initial circuit packaging window that clients and relays should use. Putting it in the consensus will let us perform experiments with different values once enough Tors have upgraded -- see proposal 168. Later parameters might include a weighting for how much to favor quiet circuits over loud circuits in our round-robin algorithm; a weighting for how much to prioritize relays over clients if we use an incentive scheme like the gold-star design; and what fraction of circuits we should throw out from proposal 151. 2.2. What about non-integers? I'm not sure how we would do median on non-integer values. Further, I don't have any non-integer values in mind yet. So I say we cross that bridge when we get to it.
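The median rule in section 2 of this proposal (take the median per key, and for an even number of votes use the middle value closer to zero) can be sketched like this. The function names are hypothetical, the space-separated layout of the "params" line is an assumption (the proposal only says it contains key=value pairs), and the handling of two middle values with equal magnitude is a guess.

```python
def low_median(values):
    """Median of integers; with two middle values, pick the one nearer zero."""
    vals = sorted(values)
    n = len(vals)
    if n % 2 == 1:
        return vals[n // 2]
    a, b = vals[n // 2 - 1], vals[n // 2]
    # Assumption: if |a| == |b| (e.g. -3 and 3), the lower value is used.
    return a if abs(a) <= abs(b) else b


def consensus_params(votes):
    """Combine per-vote params (dicts of key -> int) into consensus params."""
    keys = sorted({key for vote in votes for key in vote})
    return {
        key: low_median([vote[key] for vote in votes if key in vote])
        for key in keys
    }


def params_line(params):
    """Format the params line; space-separated pairs are an assumption."""
    return "params " + " ".join("%s=%d" % kv for kv in sorted(params.items()))
```

Given three votes carrying circwindow values of 101, 80, and 90, the consensus value is 90. With only two votes of 100 and 200, the rule picks 100, the middle value closer to zero.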
Filename: 168-reduce-circwindow.txt Title: Reduce default circuit window Author: Roger Dingledine Created: 12-Aug-2009 Status: Rejected 0. History 1. Overview We should reduce the starting circuit "package window" from 1000 to 101. The lower package window will mean that clients will only be able to receive 101 cells (~50KB) on a circuit before they need to send a 'sendme' acknowledgement cell to request 100 more. Starting with a lower package window on exit relays should save on buffer sizes (and thus memory requirements for the exit relay), and should save on queue sizes (and thus latency for users). Lowering the package window will induce an extra round-trip for every additional 50298 bytes of the circuit. This extra step is clearly a slow-down for large streams, but ultimately we hope that a) clients fetching smaller streams will see better response, and b) slowing down the large streams in this way will produce lower e2e latencies, so the round-trips won't be so bad. 2. Motivation Karsten's torperf graphs show that the median download time for a 50KB file over Tor in mid 2009 is 7.7 seconds, whereas the median download time for 1MB and 5MB are around 50s and 150s respectively. The 7.7 second figure is way too high, whereas the 50s and 150s figures are surprisingly low. The median round-trip latency appears to be around 2s, with 25% of the data points taking more than 5s. That's a lot of variance. We designed Tor originally with the goal of maximizing throughput. We figured that would also optimize other network properties like round-trip latency. Looks like we were wrong. 3. Design Wherever we initialize the circuit package window, initialize it to 101 rather than 1000. Reducing it should be safe even when interacting with old Tors: the old Tors will receive the 101 cells and send back a sendme ack cell. They'll still have much higher deliver windows, but the rest of their deliver window will go unused. You can find the patch at arma/circwindow. 
It seems to work. 3.1. Why not 100? Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme ack cell after 101 cells rather than the intended 100 cells. Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But hopefully we'll have moved to some datagram protocol long before 0.2.1.19 becomes obsolete. 3.2. What about stream packaging windows? Right now the stream packaging windows start at 500. The goal was to set the stream window to half the circuit window, to provide a crude load balancing between streams on the same circuit. Once we lower the circuit packaging window, the stream packaging window basically becomes redundant. We could leave it in -- it isn't hurting much in either case. Or we could take it out -- people building other Tor clients would thank us for that step. Alas, people building other Tor clients are going to have to be compatible with current Tor clients, so in practice there's no point taking out the stream packaging windows. 3.3. What about variable circuit windows? Once upon a time we imagined adapting the circuit package window to the network conditions. That is, we would start the window small, and raise it based on the latency and throughput we see. In theory that crude imitation of TCP's windowing system would allow us to adapt to fill the network better. In practice, I think we want to stick with the small window and never raise it. The low cap reduces the total throughput you can get from Tor for a given circuit. But that's a feature, not a bug. 4. Evaluation How do we know this change is actually smart? It seems intuitive that it's helpful, and some smart systems people have agreed that it's a good idea (or said another way, they were shocked at how big the default package window was before). To get a more concrete sense of the benefit, though, Karsten has been running torperf side-by-side on exit relays with the old package window vs the new one. 
The results are mixed currently -- it is slightly faster for fetching 40KB files, and slightly slower for fetching 50KB files. I think it's going to be tough to get a clear conclusion that this is a good design just by comparing one exit relay running the patch. The trouble is that the other hops in the circuits are still getting bogged down by other clients introducing too much traffic into the network. Ultimately, we'll want to put the circwindow parameter into the consensus so we can test a broader range of values once enough relays have upgraded. 5. Transition and deployment We should put the circwindow in the consensus (see proposal 167), with an initial value of 101. Then as more exit relays upgrade, clients should seamlessly get the better behavior. Note that upgrading the exit relay will only affect the "download" package window. An old client that's uploading lots of bytes will continue to use the old package window at the client side, and we can't throttle that window at the exit side without breaking protocol. The real question then is what we should backport to 0.2.1. Assuming this could be a big performance win, we can't afford to wait until 0.2.2.x comes out before starting to see the changes here. So we have two options as I see them: a) once clients in 0.2.2.x know how to read the value out of the consensus, and it's been tested for a bit, backport that part to 0.2.1.x. b) if it's too complex to backport, just pick a number, like 101, and backport that number. Clearly choice (a) is the better one if the consensus parsing part isn't very complex. Let's shoot for that, and fall back to (b) if the patch turns out to be so big that we reconsider.
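To make the round-trip arithmetic from the Overview of proposal 168 concrete, here is a rough sketch. It assumes 498 payload bytes per cell (inferred from the proposal's own figure of 50298 bytes for 101 cells) and uses the proposal's simplified model of one extra round-trip per full window; real sendme cells are sent while data is still flowing, so this is an upper-level estimate, not a model of actual Tor behaviour. The function names are hypothetical.

```python
import math

# Assumption: payload bytes carried per cell, inferred from 50298 / 101.
CELL_PAYLOAD_BYTES = 498


def window_bytes(window_cells):
    """Bytes that fit in one package window of the given size."""
    return window_cells * CELL_PAYLOAD_BYTES


def extra_round_trips(stream_bytes, window_cells):
    """Extra round-trips spent waiting for sendme cells.

    Simplified model from the proposal text: the sender stalls once for
    every additional full window of data beyond the first.
    """
    cells = math.ceil(stream_bytes / CELL_PAYLOAD_BYTES)
    if cells == 0:
        return 0
    return (cells - 1) // window_cells


if __name__ == "__main__":
    for size in (50 * 1000, 1000 * 1000, 5 * 1000 * 1000):
        print(size, extra_round_trips(size, 1000), extra_round_trips(size, 101))
```

Under this model a window of 101 costs nothing extra for a stream that fits in 50298 bytes, while larger streams pay one round-trip per additional window, which is the trade-off the proposal describes.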
Filename: 169-eliminating-renegotiation.txt Title: Eliminate TLS renegotiation for the Tor connection handshake Author: Nick Mathewson Created: 27-Jan-2010 Status: Superseded Target: 0.2.2 Superseded-By: 176 1. Overview I propose a backward-compatible change to the Tor connection establishment protocol to avoid the use of TLS renegotiation. Rather than doing a TLS renegotiation to exchange certificates and authenticate the original handshake, this proposal takes an approach similar to Steven Murdoch's proposal 124, and uses Tor cells to finish authenticating the parties' identities once the initial TLS handshake is finished. Terminological note: I use "client" below to mean the Tor instance (a client or a relay) that initiates a TLS connection, and "server" to mean the Tor instance (a relay) that accepts it. 2. Motivation and history In the original Tor TLS connection handshake protocol ("V1", or "two-cert"), parties that wanted to authenticate provided a two-cert chain of X.509 certificates during the handshake setup phase. Every party that wanted to authenticate sent these certificates. In the current Tor TLS connection handshake protocol ("V2", or "renegotiating"), the parties begin with a single certificate sent from the server (responder) to the client (initiator), and then renegotiate to a two-certs-from-each-authenticating-party. We made this change to make Tor's handshake look like a browser speaking SSL to a webserver. (See proposal 130, and tor-spec.txt.) To tell whether to use the V1 or V2 handshake, servers look at the list of ciphers sent by the client. (This is ugly, but there's not much else in the ClientHello that they can look at.) If the list contains any cipher not used by the V1 protocol, the server sends back a single cert and expects a renegotiation. If the client gets back a single cert, then it withholds its own certificates until the TLS renegotiation phase. 
In other words, initiator behavior now looks like this:

  - Begin TLS negotiation with V2 cipher list; wait for certificate(s).
  - If we get a certificate chain:
     - Then we are using the V1 handshake. Send our own certificate chain as part of this initial TLS handshake if we want to authenticate; otherwise, send no certificates. When the handshake completes, check certificates. We are now mutually authenticated.

    Otherwise, if we get just a single certificate:
     - Then we are using the V2 handshake. Do not send any certificates during this handshake.
     - When the handshake is done, immediately start a TLS renegotiation. During the renegotiation, expect a certificate chain from the server; send a certificate chain of our own if we want to authenticate ourselves.
     - After the renegotiation, check the certificates. Then send (and expect) a VERSIONS cell from the other side to establish the link protocol version.

And V2 responder behavior now looks like this:

  - When we get a TLS ClientHello request, look at the cipher list.
  - If the cipher list contains only the V1 ciphersuites:
     - Then we're doing a V1 handshake. Send a certificate chain. Expect a possible client certificate chain in response.

    Otherwise, if we get other ciphersuites:
     - We're using the V2 handshake. Send back a single certificate and let the handshake complete.
     - Do not accept any data until the client has renegotiated.
     - When the client is renegotiating, send a certificate chain, and expect (possibly multiple) certificates in reply.
     - Check the certificates when the renegotiation is done. Then exchange VERSIONS cells.

Late in 2009, researchers found a flaw in most applications' use of TLS renegotiation: Although TLS renegotiation does not reauthenticate any information exchanged before the renegotiation takes place, many applications were treating it as though it did, and assuming that data sent _before_ the renegotiation was authenticated with the credentials negotiated _during_ the renegotiation.
This problem was exacerbated by the fact that most TLS libraries don't actually give you an obvious good way to tell where the renegotiation occurred relative to the datastream. Tor wasn't directly affected by this vulnerability, but its aftermath hurts us in a few ways: 1) OpenSSL has disabled renegotiation by default, and created a "yes we know what we're doing" option we need to set to turn it back on. (Two options, actually: one for openssl 0.9.8l and one for 0.9.8m and later.) 2) Some vendors have removed all renegotiation support from their versions of OpenSSL entirely, forcing us to tell users to either replace their versions of OpenSSL or to link Tor against a hand-built one. 3) Because of 1 and 2, I'd expect TLS renegotiation to become rarer and rarer in the wild, making our own use stand out more. 3. Design 3.1. The view in the large Taking a cue from Steven Murdoch's proposal 124, I propose that we move the work currently done by the TLS renegotiation step (that is, authenticating the parties to one another) and do it with Tor cells instead of with TLS. Using _yet another_ variant response from the responder (server), we allow the client to learn that it doesn't need to rehandshake and can instead use a cell-based authentication system. Once the TLS handshake is done, the client and server exchange VERSIONS cells to determine link protocol version (including handshake version). If they're using the handshake version specified here, the client and server arrive at link protocol version 3 (or higher), and use cells to exchange further authentication information. 3.2. New TLS handshake variant We already used the list of ciphers from the clienthello to indicate whether the client can speak the V2 ("renegotiating") handshake or later, so we can't encode more information there. We can, however, change the DN in the certificate passed by the server back to the client. Currently, all V2 certificates are generated with CN values ending with ".net". 
I propose that we have the ".net" commonName ending reserved to indicate the V2 protocol, and use commonName values ending with ".com" to indicate the V3 ("minimal") handshake described herein. Now, once the initial TLS handshake is done, the client can look at the server's certificate(s). If there is a certificate chain, the handshake is V1. If there is a single certificate whose subject commonName ends in ".net", the handshake is V2 and the client should try to renegotiate as it would currently. Otherwise, the client should assume that the handshake is V3+. [Servers should _only_ send ".com" addresses, to allow room for more signaling in the future.] 3.3. Authenticating inside Tor Once the TLS handshake is finished, if the client renegotiates, then the server should go on as it does currently. If the client implements this proposal, however, and the server has shown it can understand the V3+ handshake protocol, the client immediately sends a VERSIONS cell to the server and waits to receive a VERSIONS cell in return. We negotiate the Tor link protocol version _before_ we proceed with the negotiation, in case we need to change the authentication protocol in the future. Once either party has seen the VERSIONS cell from the other, it knows which version they will pick (that is, the highest version shared by both parties' VERSIONS cells). All Tor instances using the handshake protocol described in 3.2 MUST support at least link protocol version 3 as described here. On learning the link protocol, the server then sends the client a CERT cell and a NETINFO cell. If the client wants to authenticate to the server, it sends a CERT cell, an AUTHENTICATE cell, and a NETINFO cell; or it may simply send a NETINFO cell if it does not want to authenticate. The CERT cell describes the keys that a Tor instance is claiming to have. It is a variable-length cell.
Its payload format is:

     N: Number of certs in cell            [1 octet]
     N times:
        CLEN                               [2 octets]
        Certificate                        [CLEN octets]

Any extra octets at the end of a CERT cell MUST be ignored.

Each certificate has the form:

     CertType                              [1 octet]
     CertPurpose                           [1 octet]
     PublicKeyLen                          [2 octets]
     PublicKey                             [PublicKeyLen octets]
     NotBefore                             [4 octets]
     NotAfter                              [4 octets]
     SignerID                              [HASH256_LEN octets]
     SignatureLen                          [2 octets]
     Signature                             [SignatureLen octets]

where CertType is 1 (meaning "RSA/SHA256")
      CertPurpose is 1 (meaning "link certificate")
      PublicKey is the DER encoding of the ASN.1 representation of the RSA key of the subject of this certificate
      NotBefore is a time in HOURS since January 1, 1970, 00:00 UTC before which this certificate should not be considered valid.
      NotAfter is a time in HOURS since January 1, 1970, 00:00 UTC after which this certificate should not be considered valid.
      SignerID is the SHA-256 digest of the public key signing this certificate
and   Signature is the signature of all the other fields in this certificate, using SHA256 as described in proposal 158.

While authenticating, a server need send only a self-signed certificate for its identity key. (Its TLS certificate already contains its link key signed by its identity key.) A client that wants to authenticate MUST send two certificates: one containing a public link key signed by its identity key, and one self-signed cert for its identity.

Tor instances MUST ignore any certificate with an unrecognized CertType or CertPurpose, and MUST ignore extra bytes in the cert.

The AUTHENTICATE cell proves to the server that the client with whom it completed the initial TLS handshake is the one possessing the link public key in its certificate. It is a variable-length cell.
Its contents are:

     SignatureType                         [2 octets]
     SignatureLen                          [2 octets]
     Signature                             [SignatureLen octets]

where SignatureType is 1 (meaning "RSA-SHA256") and Signature is an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master secret key as its key, of the following elements:

     - The SignatureType field (0x00 0x01)
     - The NUL terminated ASCII string: "Tor certificate verification"
     - client_random, as sent in the Client Hello
     - server_random, as sent in the Server Hello

Once the above handshake is complete, the client knows (from the initial TLS handshake) that it has a secure connection to an entity that controls a given link public key, and knows (from the CERT cell) that the link public key is a valid public key for a given Tor identity.

If the client authenticates, the server learns from the CERT cell that a given Tor identity has a given current public link key. From the AUTHENTICATE cell, it knows that an entity with that link key knows the master secret for the TLS connection, and hence must be the party with whom it's talking, if TLS works.

3.4. Security checks

If the TLS handshake indicates a V2 or V3+ connection, the server MUST reject any connection from the client that does not begin with either a renegotiation attempt or a VERSIONS cell containing at least link protocol version "3". If the TLS handshake indicates a V3+ connection, the client MUST reject any connection where the server sends anything before the client has sent a VERSIONS cell, and any connection where the VERSIONS cell does not contain at least link protocol version "3".

If link protocol version 3 is chosen:

Clients and servers MUST check that all digests and signatures on the certificates in CERT cells they are given are as described above.

After the VERSIONS cell, clients and servers MUST close the connection if anything besides a CERT or AUTHENTICATE cell is sent before the first NETINFO cell. CERT or AUTHENTICATE cells anywhere after the first NETINFO cell must be rejected.

... [write more here. What else?] ...
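As an illustration of the cell formats in section 3.3, the following sketch splits a CERT cell payload into its certificate blobs and computes the HMAC-SHA256 value that the AUTHENTICATE signature covers. Several points are assumptions rather than part of the proposal: multi-octet fields are read as big-endian (consistent with SignatureType 1 being written 0x00 0x01), the HMAC input is taken to be the listed elements concatenated in the order given, and the function names are hypothetical. The final RSA signature over the HMAC is left out because it needs a cryptography library beyond the standard one.

```python
import hashlib
import hmac
import struct


def parse_cert_cell(payload):
    """Return the raw certificate blobs from a CERT cell payload.

    Layout: N [1 octet], then N times: CLEN [2 octets], Certificate
    [CLEN octets].  Extra octets after the last certificate are ignored.
    """
    if len(payload) < 1:
        raise ValueError("empty CERT cell payload")
    count = payload[0]
    pos = 1
    certs = []
    for _ in range(count):
        if pos + 2 > len(payload):
            raise ValueError("truncated CLEN field")
        (clen,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        if pos + clen > len(payload):
            raise ValueError("truncated certificate")
        certs.append(payload[pos:pos + clen])
        pos += clen
    return certs


def authenticate_mac(master_secret, client_random, server_random,
                     signature_type=1):
    """HMAC-SHA256, keyed with the TLS master secret, that gets signed.

    Assumption: the elements are concatenated in the order listed in
    the proposal.
    """
    message = (struct.pack("!H", signature_type)
               + b"Tor certificate verification\x00"
               + client_random
               + server_random)
    return hmac.new(master_secret, message, hashlib.sha256).digest()
```

A caller would pass the result of authenticate_mac to whatever RSA-SHA256 signing routine it uses, and would still need to parse and verify each certificate blob returned by parse_cert_cell.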
3.5. Summary

We now revisit the protocol outlines from section 2 to incorporate our changes. New or modified steps are marked with a *.

The new initiator behavior now looks like this:

  - Begin TLS negotiation with V2 cipher list; wait for certificate(s).
  - If we get a certificate chain:
     - Then we are using the V1 handshake. Send our own certificate chain as part of this initial TLS handshake if we want to authenticate; otherwise, send no certificates. When the handshake completes, check certificates. We are now mutually authenticated.

    Otherwise, if we get just a single certificate:
     - Then we are using the V2 or the V3+ handshake. Do not send any certificates during this handshake.
     * When the handshake is done, look at the server's certificate's subject commonName.
     * If it ends with ".net", we're doing a V2 handshake:
        - Immediately start a TLS renegotiation. During the renegotiation, expect a certificate chain from the server; send a certificate chain of our own if we want to authenticate ourselves.
        - After the renegotiation, check the certificates. Then send (and expect) a VERSIONS cell from the other side to establish the link protocol version.
     * If it ends with anything else, assume a V3 or later handshake:
        * Send a VERSIONS cell, and wait for a VERSIONS cell from the server.
        * If we are authenticating, send CERT and AUTHENTICATE cells.
        * Send a NETINFO cell. Wait for a CERT and a NETINFO cell from the server.
        * If the CERT cell contains a valid self-identity cert, and the identity key in the cert can be used to check the signature on the X.509 certificate we got during the TLS handshake, then we know we connected to the server with that identity. If any of these checks fail, or the identity key was not what we expected, then we close the connection.
        * Once the NETINFO cell arrives, continue as before.

And V3+ responder behavior now looks like this:

  - When we get a TLS ClientHello request, look at the cipher list.
  - If the cipher list contains only the V1 ciphersuites:
     - Then we're doing a V1 handshake. Send a certificate chain. Expect a possible client certificate chain in response.

    Otherwise, if we get other ciphersuites:
     - We're using the V2 or the V3+ handshake. Send back a single certificate whose subject commonName ends with ".com", and let the handshake complete.
     * If the client does anything besides renegotiate or send a VERSIONS cell, drop the connection.
     - If the client renegotiates immediately, it's a V2 connection:
        - When the client is renegotiating, send a certificate chain, and expect (possibly multiple) certificates in reply.
        - Check the certificates when the renegotiation is done. Then exchange VERSIONS cells.
     * Otherwise we got a VERSIONS cell and it's a V3 handshake.
        * Send a VERSIONS cell, a CERT cell, and a NETINFO cell.
        * Wait for the client to send cells in reply. If the client sends a CERT and an AUTHENTICATE and a NETINFO, use them to authenticate the client. If the client sends a NETINFO, it is unauthenticated. If it sends anything else before its NETINFO, it's rejected.

4. Numbers to assign

We need a version number for this link protocol. I've been calling it "3".

We need to reserve command numbers for CERT and AUTHENTICATE cells. I suggest that in link protocol 3 and higher, we reserve command numbers 128..240 for variable-length cells. (241-255 we can hold for future extensions.)

5. Efficiency

This protocol adds a round-trip step when the client sends a VERSIONS cell to the server and waits for the {VERSIONS, CERT, NETINFO} response in turn. (The server then waits for the client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply, but it would have already been waiting for the client's NETINFO, so that's not an additional wait.)

This is actually fewer round-trip steps than required before for TLS renegotiation, so that's a win.

6. Open questions:

  - Should we use X.509 certificates instead of the certificate-ish things we describe here?
They are more standard, but more ugly. - May we cache which certificates we've already verified? It might leak in timing whether we've connected with a given server before, and how recently. - Is there a better secret than the master secret to use in the AUTHENTICATE cell? Say, a portable one? Can we get at it for other libraries besides OpenSSL? - Does using the client_random and server_random data in the AUTHENTICATE message actually help us? How hard is it to pull them out of the OpenSSL data structure? - Can we give some way for clients to signal "I want to use the V3 protocol if possible, but I can't renegotiate, so don't give me the V2"? Clients currently have a fair idea of server versions, so they could potentially do the V3+ handshake with servers that support it, and fall back to V1 otherwise. - What should servers that don't have TLS renegotiation do? For now, I think they should just get it. Eventually we can deprecate the V2 handshake as we did with the V1 handshake.
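The initiator's decision in sections 3.2 and 3.5 of proposal 169 (certificate chain means V1, a single ".net" certificate means V2, anything else means V3+) can be sketched as below. The function names and the step strings are hypothetical and only mirror the outline in the text; no TLS library is involved, and the caller is assumed to have already extracted the subject commonName of each certificate the server sent.

```python
def classify_handshake(server_cert_common_names):
    """Pick the handshake variant from the server's TLS certificates.

    server_cert_common_names: subject commonNames of the certificates
    the server presented during the initial TLS handshake, in order.
    """
    names = list(server_cert_common_names)
    if not names:
        raise ValueError("server sent no certificate")
    if len(names) > 1:
        return "V1"          # a certificate chain
    if names[0].endswith(".net"):
        return "V2"          # single cert, renegotiate as before
    return "V3+"             # single cert with any other ending


def initiator_steps(handshake, authenticate):
    """List the remaining initiator steps for a handshake variant."""
    if handshake == "V1":
        return ["check certificates"]
    if handshake == "V2":
        return ["renegotiate TLS", "check certificates",
                "exchange VERSIONS"]
    steps = ["send VERSIONS", "wait for VERSIONS"]
    if authenticate:
        steps += ["send CERT", "send AUTHENTICATE"]
    steps += ["send NETINFO", "wait for CERT and NETINFO",
              "check CERT against TLS certificate"]
    return steps
```

Treating every ending other than ".net" as V3+ follows the proposal's note that servers only send ".com" for now, which leaves other endings free for future signaling.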
Title: Configuration options regarding circuit building Filename: 170-user-path-config.txt Author: Sebastian Hahn Created: 01-March-2010 Status: Superseded Overview: This document outlines how Tor handles the user configuration options to influence the circuit building process. Motivation: Tor's treatment of the configuration *Nodes options was surprising to many users, and quite a few conspiracy theories have crept up. We should update our specification and code to better describe and communicate what is going on during circuit building, and how we're honoring configuration. So far, we've been tracking a bugreport about this behaviour ( https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 ) and Nick replied in a thread on or-talk ( http://archives.seul.org/or/talk/Feb-2010/msg00117.html ). This proposal tries to document our intention for those configuration options. Design: Five configuration options are available to users to influence Tor's circuit building. EntryNodes and ExitNodes define a list of nodes that are used for the Entry/Exit position in all circuits. ExcludeNodes is a list of nodes that are used for no circuit, and ExcludeExitNodes is a list of nodes that aren't used as the last hop. StrictNodes defines Tor's behaviour in case of a conflict, for example when a node that is excluded is the only available introduction point. Setting StrictNodes to 1 breaks Tor's functionality in that case, and it will refuse to build such a circuit. Neither Nick's email nor bug 1090 has clear suggestions for how we should behave in each case, so I tried to come up with something that made sense to me. Security implications: Deviating from normal circuit building can break one's anonymity, so the documentation of the above options should contain a warning to make users aware of the pitfalls.
Specification:

It is proposed that the "User configuration" part of path-spec (section 2.2.2) be replaced with this:

Users can alter the default behavior for path selection with configuration options. In case of conflicts (excluding and requiring the same node) the "StrictNodes" option is used to determine behaviour. If a node is both excluded and required via a configuration option, the exclusion takes precedence.

  - If "ExitNodes" is provided, then every request requires an exit node on the ExitNodes list. If a request is supported by no nodes on that list, and "StrictNodes" is false, then Tor treats that request as if ExitNodes were not provided.

  - "EntryNodes" behaves analogously.

  - If "ExcludeNodes" is provided, then no circuit uses any of the nodes listed. If a circuit requires an excluded node to be used, and "StrictNodes" is false, then Tor uses the node in that position while not using any other of the excluded nodes.

  - If "ExcludeExitNodes" is provided, then Tor will not use the nodes listed for the exit position in a circuit. If a circuit requires an excluded node to be used in the exit position and "StrictNodes" is false, then Tor builds that circuit as if ExcludeExitNodes were not provided.

  - If a user tries to connect to or resolve a hostname of the form <target>.<servername>.exit and the "AllowDotExit" configuration option is set to 1, the request is rewritten to a request for <target>, and the request is only supported by the exit whose nickname or fingerprint is <servername>. If "AllowDotExit" is set to 0 (default), any request for <anything>.exit is denied.

  - When any of the *Nodes settings are changed, all circuits are expired immediately, to prevent a situation where a previously built circuit is used even though some of its nodes are now excluded.

Compatibility:

The old Strict*Nodes options are deprecated, and the StrictNodes option is new. Tor users may need to update their configuration file.
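One possible reading of how these rules combine when choosing an exit is sketched below. It is an interpretation, not the behaviour of any Tor release: the function name is hypothetical, ExcludeNodes and ExcludeExitNodes are merged into a single excluded set for the exit position, and "a circuit requires an excluded node" is taken to mean that every node able to serve the request is excluded.

```python
def exit_pool(supporting, exit_nodes=None, excluded=frozenset(),
              strict_nodes=False):
    """Return the set of nodes that may serve as exit for one request.

    supporting: nodes whose exit policy supports the request.
    exit_nodes: the ExitNodes list, or None/empty if not configured.
    excluded:   ExcludeNodes plus ExcludeExitNodes (assumption: merged).
    An empty result means Tor refuses to build the circuit.
    """
    supporting = set(supporting)
    # Exclusion takes precedence over any requirement.
    allowed = supporting - set(excluded)

    if exit_nodes:
        preferred = allowed & set(exit_nodes)
        if preferred:
            return preferred
        if strict_nodes:
            return set()
        # Not strict: behave as if ExitNodes were not provided.

    if allowed:
        return allowed
    if strict_nodes:
        return set()
    # Not strict and only excluded nodes can serve the request:
    # behave as if the exclusion were not provided.
    return supporting
```

For instance, with ExitNodes set to a node that is also excluded, a non-strict configuration falls back to the remaining allowed nodes, while StrictNodes 1 yields an empty pool.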
Filename: 171-separate-streams.txt Title: Separate streams across circuits by connection metadata Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson Created: 21-Oct-2008 Modified: 7-Dec-2010 Status: Closed Implemented-In: 0.2.3.3-alpha Summary: We propose a new set of options to isolate unrelated streams from one another, putting them on separate circuits so that semantically unrelated traffic is not inadvertently made linkable. Motivation: Currently, Tor attaches regular streams (that is, ones not carrying rendezvous or directory traffic) to circuits based only on whether the circuit's current exit node supports the destination, and whether the circuit has been dirty (that is, in use) for too long. This means that traffic that would otherwise be unrelated sometimes gets sent over the same circuit, allowing the exit node to link such streams with certainty, and allowing other parties to link such streams probabilistically. Older versions of onion routing tried to address this problem by sending every stream over a separate circuit; performance issues made this unfeasible. Moreover, in the presence of a localized adversary, separating streams by circuits increases the odds that, for any given linked set of streams, at least one will go over a compromised circuit. Therefore we ought to look for ways to allow streams that ought to be linked to travel over a single circuit, while keeping streams that ought not be linked isolated to separate circuits. Discussion: Let's call a series of inherently-linked streams (like a set of streams downloading objects from the same webpage, or a browsing session where the user requests several related webpages) a "Session". "Sessions" are necessarily a fuzzy concept. While users typically consider some activities as wholly unrelated to each other ("My IM session has nothing to do with my web browsing!"), the boundaries between activities are sometimes hard to determine.
If I'm reading lolcats in one browser tab and reading about treatments for an embarrassing disease in another, those are probably separate sessions. If I search for a forum, log in, read it for a while, and post a few messages on unrelated topics, that's probably all the same session. So with the proviso that no automated process can identify sessions 100% accurately, let's see which options we have available. Generally, all the streams on a session come from a single application. Unfortunately, isolating streams by application automatically isn't feasible, given the lack of any nice cross-platform way to tell which local process originated a given connection. (Yes, lsof works. But a quick review of the lsof code should be sufficient to scare you away from thinking there is a portable option, much less a portable O(1) option.) So instead, we'll have to use some other aspect of a Tor request as a proxy for the application. Generally, traffic from separate applications is not in the same session. With some applications (IRC, for example), each stream is a session. Some applications (most notably web browsing) can't be meaningfully split into sessions without inspecting the traffic itself and maintaining a lot of state. How well do ports correspond to sessions? Early versions of this proposal focused on using destination ports as a proxy for application, since a connection to port 22 for SSH is probably not in the same session as one to port 80. This works better with some applications than others, though: while SSH users typically know when they're on port 22 and when they aren't, a web browser can be coaxed (through img urls or any number of related tricks) into connecting to any port at all. Moreover, when Tor gets a DNS lookup request, it doesn't know in advance which port the resulting address will be used to connect to.
So in summary, each kind of traffic wants to follow different rules, and assuming the existence of a web browser and a hostile web page or exit node, we can't tell one kind of traffic from another by simply looking at the destination:port of the traffic.

Fortunately, we're not doomed.

Design:

When a stream arrives at Tor, we have the following data to examine:

  1) The destination address
  2) The destination port (unless this is a DNS lookup)
  3) The protocol used by the application to send the stream to Tor: SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy" mechanism the kernel gives us.
  4) The port used by the application to send the stream to Tor -- that is, the SOCKSListenAddress or TransListenAddress that the application used, if we have more than one.
  5) The SOCKS username and password, if any.
  6) The source address and port for the application.

We propose to use 3, 4, and 5 as a backchannel for applications to tell Tor about different sessions. Rather than running only one SOCKSPort, a Tor user who would prefer better session isolation should run multiple SOCKSPorts/TransPorts, and configure different applications to use separate ports. Applications that support SOCKS authentication can further be separated on a single port by their choice of username/password. Streams sent to separate ports or using different authentication information should never be sent over the same circuit. We allow each port to have its own settings for isolation based on destination port, destination address, or both.

Handling DNS can be a challenge. We can get hostnames by one of three means:

  A) A SOCKS4a request, or a SOCKS5 request with a hostname. This case is handled trivially using the rules above.
  B) A RESOLVE request on a SOCKSPort. This case is handled using the rules above, except that port isolation can't work to isolate RESOLVE requests into a proper session, since we don't know which port will eventually be used when we connect to the returned address.
  C) A request on a DNSPort. We have no way of knowing which address/port will be used to connect to the requested address.

When B or C is required but problematic, we could favor the use of AutomapHostsOnResolve.

Interface:

We propose that {SOCKS,Natd,Trans,DNS}ListenAddr be deprecated in favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax:

     ClientPortLine = OptionName SP (Addr ":")? Port (SP Options?)
     OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort"
     Addr = An IPv4 address / an IPv6 address surrounded by brackets.
            If omitted, we default to 127.0.0.1
     Port = An integer from 1 through 65535 inclusive
     Options = Option
     Options = Options SP Option
     Option = IsolateOption / GroupOption
     GroupOption = "SessionGroup=" UINT
     IsolateOption = OptNo ("IsolateDestPort" / "IsolateDestAddr" /
             "IsolateSOCKSUser"/ "IsolateClientProtocol" /
             "IsolateClientAddr") OptPlural
     OptNo = "No" ?
     OptPlural = "s" ?
     SP = " "
     UINT = An unsigned integer

All options are case-insensitive.

The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively turn them off. The IsolateDestPort and IsolateDestAddr and IsolateClientProtocol options are off by default. NoIsolateDestPort and NoIsolateDestAddr and NoIsolateClientProtocol have no effect.

Given a set of ClientPortLines, streams must NOT be placed on the same circuit if ANY of the following hold:

  * They were sent to two different client ports, unless the two client ports both specify a "SessionGroup" option with the same integer value.

  * At least one was sent to a client port with the IsolateDestPort active, and they have different destination ports.

  * At least one was sent to a client port with IsolateDestAddr active, and they have different destination addresses.
  * At least one was sent to a client port with IsolateClientProtocol active, and they use different protocols (where SOCKS4, SOCKS4a, SOCKS5, TransPort, NatdPort, and DNS are the protocols in question)
  * At least one was sent to a client port with IsolateSOCKSUser active, and they have different SOCKS username/password configurations. (For the purposes of this option, the username/password pair of ""/"" is distinct from SOCKS without authentication, and both are distinct from any non-SOCKS client's non-authentication.)
  * At least one was sent to a client port with IsolateClientAddr active, and they came from different client addresses. (For the purpose of this option, any local interface counts as the same address. So if the host is configured with addresses 10.0.0.1, 192.0.32.10, and 127.0.0.1, then traffic from those addresses can leave on the same circuit, but traffic from 10.0.0.2 (for example) could not share a circuit with any of them.)

These rules apply regardless of whether the streams are active at the same time. In other words, if the rules say that streams A and B must not be on the same circuit, and stream A is attached to circuit X, then stream B must never be attached to circuit X, even if stream A is closed first.

Alternative Interface:

We're cramming a lot onto one line in the design above. Perhaps instead it would be a better idea to have grouped lines of the form:

  StreamGroup 1
    SOCKSPort 9050
    TransPort 9051
    IsolateDestPort 1
    IsolateClientProtocol 0
  EndStreamGroup

  StreamGroup 2
    SOCKSPort 9052
    DNSPort 9053
    IsolateDestAddr 1
  EndStreamGroup

This would be equivalent to:

  SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
  TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
  SOCKSPort 9052 SessionGroup=2 IsolateDestAddr
  DNSPort 9053 SessionGroup=2 IsolateDestAddr

But it would let us extend the range of allowed options later without having client port lines grow without bound.
For example, we might give different circuit building parameters to different session groups.

Example of use:

Suppose that we want to use a web browser, an IRC client, and an SSH client all at the same time. Let's assume that we want web traffic to be isolated from all other traffic, even if the browser makes connections to ports usually used for IRC or SSH. Let's also assume that IRC and SSH are both used for relatively long-lived connections, and we want to keep all IRC/SSH sessions separate from one another.

In this case, we could say:

  SOCKSPort 9050
  SOCKSPort 9051 IsolateDestAddr IsolateDestPort

We would then configure our browser to use 9050 and our IRC/SSH clients to use 9051.

Advanced example of use, #2:

Suppose that we have a bunch of applications, and we launch them all using torsocks, and we want to keep each application isolated from the others. We just create a shell script, "torlaunch":

  #!/bin/bash
  export TORSOCKS_USERNAME="$1"
  exec torsocks "$@"

And we configure our SOCKSPort with IsolateSOCKSUser.

Or if we're on Linux and we want to isolate by application invocation, we would change the TORSOCKS_USERNAME line to:

  export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`"

Advanced example of use, #2:

Now suppose that we want to achieve the benefits of the first example of use, but we are stuck using transparent proxies. Let's suppose this is Linux.

  TransPort 9090
  TransPort 9091 IsolateDestAddr IsolateDestPort
  DNSPort 5353
  AutomapHostsOnResolve 1

Here we use the iptables --cmd-owner filter to distinguish which command is originating the packets, directing traffic from our IRC client and our SSH client to port 9091, and directing other traffic to 9090. Using AutomapHostsOnResolve will confuse ssh in its default configuration; we'll need to find a way around that.

Security Risks:

Disabling IsolateClientAddr is a pretty bad idea.

Setting up a set of applications to use this system effectively is a big problem.
It's likely that lots of people who try to do this will mess it up. We should try to see which setups are sensible, and see if we can provide good feedback to explain which streams are isolated how.

Performance Risks:

This proposal will result in clients building many more circuits than they do today. To avoid accidentally hammering the network, we should have in-process limits on the maximum circuit creation rate and the total maximum client circuits.

Specification:

The Tor client circuit selection process is not entirely specified. Any client circuit specification must take these changes into account.

Implementation notes:

The more obvious ways to implement the "find a good circuit to attach to" part of this proposal involve doing an O(n_circuits) operation every time we have a stream to attach. We already do such an operation, so it's not as if we need to hunt for fancy ways to make it O(1). What will be harder is implementing the "launch circuits as needed" part of the proposal. Still, it should come down to "a simple matter of programming."

The SOCKS4 spec has the client provide authentication info when it connects; accepting such info is no problem. But the SOCKS5 spec has the client send a list of known auth methods, then has the server send back the authentication method it chooses. We'll need to update the SOCKS5 implementation so it can accept user/password authentication if it's offered.

If we use the second syntax for describing these options, we'll want to add a new "section-based" entry type for the configuration parser. Not a huge deal; we already have kludged up something similar for hidden service configurations.

Opening circuits for predicted ports has the potential to get a little more complicated; we can probably get away with the existing algorithm, though, to see where its weak points are and look for better ones.
Perhaps we can get our next-gen HTTP proxy to communicate browser tab or session info to Tor via authentication, or have torbutton do it directly. More design is needed here, though.

Alternative designs:

The implementation of this option may want to consider cases where the same exit node is shared by two or more circuits and IsolateStreamsByPort is in force. Since one possible use of the option is to reduce the opportunity of Exit Nodes to attack traffic from the same source on multiple ports, the implementation may need to ensure that circuits reserved for the exclusive use of given ports do not share the same exit node. On the other hand, if our goal is only that streams should be unlinkable, deliberately shunting them to different exit nodes is unnecessary and slightly counterproductive.

Earlier versions of this design included a mechanism to isolate _particular_ destination ports and addresses, so that traffic sent to, say, port 22 would never share a circuit with any traffic *not* sent to port 22. You can achieve this here by having all applications that send traffic to one of these ports use a separate SOCKSPort, and then setting IsolateDestPorts on that SOCKSPort.

Future work:

Nikita Borisov suggests that different session profiles -- so long as there aren't too many of them -- could well get different guard node allocations in order to prevent guard profiling. This can be done orthogonally to the rest of this proposal.

Lingering questions:

I suspect there are issues remaining with DNS and TransPort users, and that my "just use AutomapHostsOnResolve" suggestion may be insufficient.
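
The isolation rules listed in the Interface section above can be read as a pairwise predicate over streams. The following is a minimal sketch of that predicate, not Tor's implementation: the ClientPort and Stream types, their field names, and the auth_key helper are all hypothetical, and the client address is assumed to have been normalised already so that every local interface compares equal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClientPort:
    """One configured client port (hypothetical model, not Tor's own types)."""
    name: str                                # e.g. "SOCKSPort 9050"
    session_group: Optional[int] = None      # SessionGroup=N, if given
    isolate_dest_port: bool = False
    isolate_dest_addr: bool = False
    isolate_client_protocol: bool = False
    isolate_socks_user: bool = True          # on by default in the proposal
    isolate_client_addr: bool = True         # on by default in the proposal


@dataclass(frozen=True)
class Stream:
    port: ClientPort
    dest_addr: str
    dest_port: Optional[int]                 # None for a DNS lookup
    protocol: str                            # "SOCKS4", "SOCKS4a", "SOCKS5", "TransPort", "NatdPort" or "DNS"
    socks_auth: Optional[Tuple[str, str]]    # None when no username/password was sent
    client_addr: str                         # assumed normalised: all local interfaces map to one value


def auth_key(s: Stream) -> tuple:
    """Keep ""/"" distinct from SOCKS without auth, and both distinct from non-SOCKS."""
    if not s.protocol.upper().startswith("SOCKS"):
        return ("non-socks",)
    if s.socks_auth is None:
        return ("socks-noauth",)
    return ("socks-auth",) + s.socks_auth


def must_isolate(a: Stream, b: Stream) -> bool:
    """Return True if the two streams must never share a circuit."""
    # Different client ports isolate unless both carry the same SessionGroup.
    if a.port.name != b.port.name:
        group = a.port.session_group
        if group is None or group != b.port.session_group:
            return True

    def active(option: str) -> bool:
        # "At least one was sent to a client port with <option> active"
        return getattr(a.port, option) or getattr(b.port, option)

    if active("isolate_dest_port") and a.dest_port != b.dest_port:
        return True
    if active("isolate_dest_addr") and a.dest_addr != b.dest_addr:
        return True
    if active("isolate_client_protocol") and a.protocol != b.protocol:
        return True
    if active("isolate_socks_user") and auth_key(a) != auth_key(b):
        return True
    if active("isolate_client_addr") and a.client_addr != b.client_addr:
        return True
    return False
```

Because the rules apply even after a stream has closed, a real client would have to remember these attributes per circuit rather than per live stream; the sketch only covers the pairwise comparison.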
Filename: 172-circ-getinfo-option.txt
Title: GETINFO controller option for circuit information
Author: Damian Johnson
Created: 03-June-2010
Status: Reserve

Overview:

This details an additional GETINFO option that would provide information concerning a relay's current circuits.

Motivation:

The original proposal was for connection related information, but Jake made the excellent point that any information retrieved from the control port is...

  1. completely ineffectual for auditing purposes since either (a) these results can be fetched from netstat already or (b) the information would only be provided via tor and can't be validated.
  2. The more useful uses for connection information can be achieved with much less (and safer) information.

Hence the proposal is now for circuit based rather than connection based information. This would strip the most controversial and sensitive data entirely (ip addresses, ports, and connection based bandwidth breakdowns) while still being useful for the following purposes:

  - Basic Relay Usage Questions
    How is the bandwidth I'm contributing broken down? Is it being evenly distributed or is someone hogging most of it? Do these circuits belong to the hidden service I'm running or something else? Now that I'm using exit policy X am I desirable as an exit, or are most people just using me as a relay?

  - Debugging
    Say a relay has a restrictive firewall policy for outbound connections, with the ORPort whitelisted, but the operator doesn't realize that tor needs random high ports. Tor would report success ("your orport is reachable - excellent") yet the relay would be nonfunctional. This proposed information would reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good indicator of what's wrong.

  - Visualization
    A nice benefit of visualizing tor's behavior is that it becomes a helpful tool in puzzling out how tor works. For instance, tor spawns numerous client connections at startup (even if unused as a client).
    As a newcomer to tor, these asymmetric (outbound only) connections mystified me for quite a while until Roger explained their use to me. The proposed TYPE_FLAGS would let controllers clearly label them as being client related, making their purpose a bit clearer.

At the moment connection data can only be retrieved via commands like netstat, ss, and lsof. However, providing an alternative via the control port provides several advantages:

  - scrubbing for private data
    Raw connection data has no notion of what's sensitive and what is not. The relay's flags and cached consensus can be used to take educated guesses concerning which connections could possibly belong to client or exit traffic, but this is both difficult and inaccurate. Anything provided via the control port can be scrubbed to make sure we aren't providing anything we think relay operators should not see.

  - additional information
    All connection querying commands strictly provide the ip address and port of connections, and nothing else. However, for the uses listed above the far more interesting attributes are the circuit's type, bandwidth usage and uptime.

  - improved performance
    Querying connection data is an expensive activity, especially for busy relays or low end processors (such as mobile devices). Tor already internally knows its circuits, allowing for vastly quicker lookups.

  - cross platform capability
    The connection querying utilities mentioned above not only aren't available under Windows, but differ widely among different *nix platforms. FreeBSD in particular takes an unusual approach, dropping important options from netstat and assigning ss to a spreadsheet application instead. A controller interface, however, would provide a uniform means of retrieving this information.

Security Implications:

This is an open question. This proposal lacks the most controversial pieces of information (ip addresses and ports), and insight into potential threats this would pose would be very welcome!
Specification:

The following addition would be made to the control-spec's GETINFO section:

  "rcirc/id/<Circuit identity>" -- Provides entry for the associated relay circuit, formatted as:

    CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag> READ=<bytes> WRITE=<bytes>

    None of the parameters contain whitespace, and additional results must be ignored to allow for future expansion. Parameters are defined as follows:

      CIRC_ID - Unique numeric identifier for the circuit this belongs to.
      CREATED - Unix timestamp (as seconds since the Epoch) for when the circuit was created.
      UPDATED - Unix timestamp for when this information was last updated.
      TYPE - Single character flags indicating attributes in the circuit:
        (E)ntry : has a connection that doesn't belong to a known Tor server, indicating that this is either the first hop or bridged
        E(X)it : has been used for at least one exit stream
        (R)elay : has been extended
        Rende(Z)vous : is being used for a rendezvous point
        (I)ntroduction : is being used for a hidden service introduction
        (N)one of the above: none of the above have happened yet.
      READ - Total bytes transmitted toward the exit over the circuit.
      WRITE - Total bytes transmitted toward the client over the circuit.

  "rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by newlines.

The following would be included for circ info update events.

4.1.X. Relay circuit status changed

The syntax is:

  "650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP Read SP Write] CRLF

  Notice =
    "NEW" / ; first information being provided for this circuit
    "UPDATE" / ; update for a previously reported circuit
    "CLOSED" ; notice that the circuit no longer exists

Notice indicating that queryable information on a relay related circuit has changed. If the Notice parameter is either "NEW" or "UPDATE" then this provides the same fields that would be given by calling "GETINFO rcirc/id/" with the CircID.
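
To illustrate how a controller might consume the proposed "rcirc/id/<Circuit identity>" reply, here is a minimal parsing sketch. It assumes the KEY=value layout shown above; the RelayCircuit type, the parse_rcirc_line name, and the sample values in the usage note are hypothetical. Unknown fields are skipped, as the proposal requires for future expansion.

```python
from dataclasses import dataclass

# Flag letters as defined for the TYPE parameter in the proposal.
TYPE_FLAGS = {
    "E": "entry",
    "X": "exit",
    "R": "relay",
    "Z": "rendezvous",
    "I": "introduction",
    "N": "none",
}


@dataclass
class RelayCircuit:
    circ_id: int
    created: int       # Unix timestamp, seconds since the Epoch
    updated: int       # Unix timestamp, seconds since the Epoch
    flags: frozenset   # decoded TYPE flags
    read_bytes: int    # bytes sent toward the exit
    write_bytes: int   # bytes sent toward the client


def parse_rcirc_line(line: str) -> RelayCircuit:
    """Parse one space-separated KEY=value entry; unknown keys are ignored."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return RelayCircuit(
        circ_id=int(fields["CIRC_ID"]),
        created=int(fields["CREATED"]),
        updated=int(fields["UPDATED"]),
        # Unrecognised flag characters are dropped rather than rejected.
        flags=frozenset(TYPE_FLAGS[c] for c in fields["TYPE"] if c in TYPE_FLAGS),
        read_bytes=int(fields["READ"]),
        write_bytes=int(fields["WRITE"]),
    )
```

For example, parse_rcirc_line("CIRC_ID=42 CREATED=1275550000 UPDATED=1275550060 TYPE=ER READ=1024 WRITE=2048") would yield a circuit flagged as both entry and relay.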
Filename: 173-getinfo-option-expansion.txt
Title: GETINFO Option Expansion
Author: Damian Johnson
Created: 02-June-2010
Status: Obsolete

Overview:

Over the course of developing arm there have been numerous hacks and workarounds to glean pieces of basic, desirable information about the tor process. As per Roger's request I've compiled a list of these pain points to try and improve the control protocol interface.

Motivation:

The purpose of this proposal is to expose additional process and relay related information that is currently unavailable in a convenient, dependable, and/or platform independent way. Examples are:

  - The relay's total contributed bandwidth. This is a highly requested piece of information and, based on the following patch from pipe, looks trivial to include.
    http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html

  - The process ID of the tor process. There is a high degree of guess work in obtaining this. Arm for instance uses pidof, netstat, and ps yet still fails on some platforms, and Orbot recently got a ticket about its own attempt to fetch it with ps:
    https://trac.torproject.org/projects/tor/ticket/1388

This just includes the pieces of missing information I've noticed (suggestions or questions of their usefulness are welcome!).

Security Implications:

None that I'm aware of. From a security standpoint this seems decently innocuous.

Specification:

The following addition would be made to the control-spec's GETINFO section:

  "relay/bw-limit" -- Effective relayed bandwidth limit.

  "relay/burst-limit" -- Effective relayed burst limit.

  "relay/read-total" -- Total bytes relayed (download).

  "relay/write-total" -- Total bytes relayed (upload).

  "relay/flags" -- Space separated listing of flags currently held by the relay as reported by the currently cached consensus.

  "process/user" -- Username under which the tor process is running, or an empty string if none exists. [what do we mean 'if none exists'?] [Implemented in 0.2.3.1-alpha.]
"process/pid" -- Process id belonging to the main tor process, -1 if none exists for the platform. [Implemented in 0.2.3.1-alpha.] "process/uptime" -- Total uptime of the tor process (in seconds). "process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD signal, in seconds). [should clarify exactly which events cause an uptime reset] "process/descriptors-used" -- Count of file descriptors used. "process/descriptor-limit" -- File descriptor limit (getrlimit results). "ns/authority" -- Router status info (v2 directory style) for all recognized directory authorities, joined by newlines. "state/names" -- A space-separated list of all the keys supported by this version of Tor's state. "state/val/<key>" -- Provides the current state value belonging to the given key. If undefined, this provides the key's default value. "status/ports-seen" -- A summary of which ports we've seen connections' circuits connect to recently, formatted the same as the EXITS_SEEN status event described in Section 4.1.XX. This GETINFO option is currently available only for exit relays. 4.1.XX. Per-port exit stats The syntax is: "650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF We just generated a new summary of which ports we've seen exiting circuits connecting to recently. The controller could display this for the user, e.g. in their "relay" configuration window, to give them a sense of how they're being used (popularity of the various ports they exit to). Currently only exit relays will receive this event. TimeStarted is a quoted string indicating when the reported summary counts from (in GMT). The PortSummary keyword has as its argument a comma-separated, possibly empty set of "port=count" pairs. For example (without linebreak), 650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43" PortSummary=80=16,443=8
Filename: 174-optimistic-data-server.txt
Title: Optimistic Data for Tor: Server Side
Author: Ian Goldberg
Created: 2-Aug-2010
Status: Closed
Implemented-In: 0.2.3.1-alpha

Overview:

When a SOCKS client opens a TCP connection through Tor (for an HTTP request, for example), the query latency is about 1.5x higher than it needs to be. Simply, the problem is that the sequence of data flows is this:

  1. The SOCKS client opens a TCP connection to the OP
  2. The SOCKS client sends a SOCKS CONNECT command
  3. The OP sends a BEGIN cell to the Exit
  4. The Exit opens a TCP connection to the Server
  5. The Exit returns a CONNECTED cell to the OP
  6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
  7. The SOCKS client sends some data (the GET request, for example)
  8. The OP sends a DATA cell to the Exit
  9. The Exit sends the GET to the server
  10. The Server returns the HTTP result to the Exit
  11. The Exit sends the DATA cells to the OP
  12. The OP returns the HTTP result to the SOCKS client

Note that the Exit node knows that the connection to the Server was successful at the end of step 4, but is unable to send the HTTP query to the server until step 9.

This proposal (as well as its upcoming sibling concerning the client side) aims to reduce the latency by allowing:

  1. SOCKS clients to optimistically send data before they are notified that the SOCKS connection has completed successfully
  2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT state
  3. Exit nodes to accept and queue DATA cells while in the EXIT_CONN_STATE_CONNECTING state

This particular proposal deals with #3.

In this way, the flow would be as follows:

  1. The SOCKS client opens a TCP connection to the OP
  2. The SOCKS client sends a SOCKS CONNECT command, followed immediately by data (such as the GET request)
  3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA cells
  4. The Exit opens a TCP connection to the Server
  5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET request to the Server
  6. The OP returns a SOCKS CONNECTED notification to the SOCKS client, and the Server returns the HTTP result to the Exit
  7. The Exit sends the DATA cells to the OP
  8. The OP returns the HTTP result to the SOCKS client

Motivation:

This change will save one OP<->Exit round trip (down to one from two). There are still two SOCKS Client<->OP round trips (negligible time) and two Exit<->Server round trips. Depending on the ratio of the Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will decrease the latency by 25 to 50 percent. Experiments validate these predictions. [Goldberg, PETS 2010 rump session; see https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]

Design:

The current code actually correctly handles queued data at the Exit; if there is queued data in an EXIT_CONN_STATE_CONNECTING stream, that data will be immediately sent when the connection succeeds. If the connection fails, the data will be correctly ignored and freed. The problem with the current server code is that the server currently drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state. Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state, bad things happen because streams in that state don't yet have conn->write_event set, and so some existing sanity checks (any stream with queued data is at least potentially writable) are no longer sound.

The solution is to simply not drop received DATA cells while in the EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this state, so that the OP cannot send more than one window's worth of data to be queued at the Exit. Finally, patch the sanity checks so that streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data can pass.

If no clients ever send such optimistic data, the new code will never be executed, and the behaviour of Tor will not change.
When clients begin to send optimistic data, the performance of those clients' streams will improve.

After discussion with nickm, it seems best to just have the server version number be the indicator of whether a particular Exit supports optimistic data. (If a client sends optimistic data to an Exit which does not support it, the data will be dropped, and the client's request will fail to complete.) What do version numbers for hypothetical future protocol-compatible implementations look like, though?

Security implications:

Servers (for sure the Exit, and possibly others, by watching the pattern of packets) will be able to tell that a particular client is using optimistic data. This will be discussed more in the sibling proposal.

On the Exit side, servers will be queueing a little bit extra data, but no more than one window. Clients today can cause Exits to queue that much data anyway, simply by establishing a Tor connection to a slow machine, and sending one window of data.

Specification:

tor-spec section 6.2 currently says:

  The OP waits for a RELAY_CONNECTED cell before sending any data. Once a connection has been established, the OP and exit node package stream data in RELAY_DATA cells, and upon receiving such cells, echo their contents to the corresponding TCP stream. RELAY_DATA cells sent to unrecognized streams are dropped.

It is not clear exactly what an "unrecognized" stream is, but this last sentence would be changed to say that RELAY_DATA cells received on a stream that has processed a RELAY_BEGIN cell and has not yet issued a RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed immediately after a RELAY_CONNECTED cell is issued for the stream, or freed after a RELAY_END cell is issued for the stream.

The earlier part of this section will be addressed in the sibling proposal.

Compatibility:

There are compatibility issues, as mentioned above. OPs MUST NOT send optimistic data to Exit nodes whose version numbers predate (something).
OPs MAY send optimistic data to Exit nodes whose version numbers match or follow that value. (But see the question about independent server reimplementations, above.)

Implementation:

Here is a simple patch. It seems to work with both regular streams and hidden services, but there may be other corner cases I'm not aware of. (Do streams used for directory fetches, hidden services, etc. take a different code path?)

diff --git a/src/or/connection.c b/src/or/connection.c
index 7b1493b..f80cd6e 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
     return;
   }
 
-  connection_start_writing(conn);
+  /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
+   * state, we don't want to try to write it right away, since
+   * conn->write_event won't be set yet.  Otherwise, write data from
+   * this conn as the socket is available. */
+  if (conn->state != EXIT_CONN_STATE_RESOLVING) {
+    connection_start_writing(conn);
+  }
   if (zlib) {
     conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
   } else {
@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
     tor_assert(conn->s < 0);
 
   if (conn->outbuf_flushlen > 0) {
-    tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
+    /* With optimistic data, we may have queued data in
+     * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
+     * */
+    tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
+            connection_is_writing(conn) || conn->write_blocked_on_bw ||
             (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
   }
 
diff --git a/src/or/relay.c b/src/or/relay.c
index fab2d88..e45ff70 100644
--- a/src/or/relay.c
+++ b/src/or/relay.c
@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
   relay_header_t rh;
   unsigned domain = layer_hint?LD_APP:LD_EXIT;
   int reason;
+  int optimistic_data = 0; /* Set to 1 if we receive data on a stream
+                              that's in the EXIT_CONN_STATE_RESOLVING
+                              or EXIT_CONN_STATE_CONNECTING states.*/
 
   tor_assert(cell);
   tor_assert(circ);
@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
   /* either conn is NULL, in which case we've got a control cell, or else
    * conn points to the recognized stream. */
 
-  if (conn && !connection_state_is_open(TO_CONN(conn)))
-    return connection_edge_process_relay_cell_not_open(
-             &rh, cell, circ, conn, layer_hint);
+  if (conn && !connection_state_is_open(TO_CONN(conn))) {
+    if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
+         conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
+        rh.command == RELAY_COMMAND_DATA) {
+      /* We're going to allow DATA cells to be delivered to an exit
+       * node in state EXIT_CONN_STATE_CONNECTING or
+       * EXIT_CONN_STATE_RESOLVING.  This speeds up HTTP, for example. */
+      log_warn(domain, "Optimistic data received.");
+      optimistic_data = 1;
+    } else {
+      return connection_edge_process_relay_cell_not_open(
+               &rh, cell, circ, conn, layer_hint);
+    }
+  }
 
   switch (rh.command) {
     case RELAY_COMMAND_DROP:
@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
       log_debug(domain,"circ deliver_window now %d.", layer_hint ?
                 layer_hint->deliver_window : circ->deliver_window);
 
-      circuit_consider_sending_sendme(circ, layer_hint);
+      if (!optimistic_data) {
+        circuit_consider_sending_sendme(circ, layer_hint);
+      }
 
       if (!conn) {
         log_info(domain,"data cell dropped, unknown stream (streamid %d).",
@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
       stats_n_data_bytes_received += rh.length;
       connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
                               rh.length, TO_CONN(conn));
-      connection_edge_consider_sending_sendme(conn);
+      if (!optimistic_data) {
+        connection_edge_consider_sending_sendme(conn);
+      }
       return 0;
     case RELAY_COMMAND_END:
       reason = rh.length > 0 ?

Performance and scalability notes:

There may be more RAM used at Exit nodes, as mentioned above, but it is transient.
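
The "25 to 50 percent" figure in the Motivation section of proposal 174 follows from simple round-trip counting. The sketch below works through that arithmetic under a deliberately simplified model: the SOCKS client<->OP hops are treated as free, and the function names are hypothetical.

```python
def time_to_response(tor_rtt: float, inet_rtt: float, optimistic: bool) -> float:
    """Latency until the HTTP result arrives, counting only round trips.

    Without optimistic data there are two OP<->Exit round trips; with it
    there is one. Both cases keep two Exit<->Server round trips.
    """
    tor_round_trips = 1 if optimistic else 2
    return tor_round_trips * tor_rtt + 2 * inet_rtt


def latency_saving(tor_rtt: float, inet_rtt: float) -> float:
    """Fraction of latency removed by optimistic data in this model."""
    before = time_to_response(tor_rtt, inet_rtt, optimistic=False)
    after = time_to_response(tor_rtt, inet_rtt, optimistic=True)
    return (before - after) / before
```

In this model the saving is tor_rtt / (2 * tor_rtt + 2 * inet_rtt): it approaches 50 percent when the Exit<->Server RTT is negligible next to the Tor RTT, and is 25 percent when the two RTTs are equal.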
Filename: 175-automatic-node-promotion.txt
Title: Automatically promoting Tor clients to nodes
Author: Steven Murdoch
Created: 12-Mar-2010
Status: Rejected

1. Overview

This proposal describes how Tor clients could determine when they have sufficient bandwidth capacity and are sufficiently reliable to become either bridges or Tor relays. When they meet these criteria, they will automatically promote themselves, based on user preferences. The proposal also defines the new controller messages and options which will control this process.

Note that for the moment, only transitions between client and bridge are being considered. Transitions to public relay will be considered at a future date, but will use the same infrastructure for measuring capacity and reliability.

2. Motivation and history

Tor has a growing user-base and one of the major impediments to the quality of service offered is the lack of network capacity. This is particularly the case for bridges, because these are gradually being blocked, and thus no longer of use to people within some countries. By automatically promoting Tor clients to bridges, and perhaps also to full public relays, this proposal aims to solve these problems.

Only Tor clients which are sufficiently useful should be promoted, and the process of determining usefulness should be performed without reporting the existence of the client to the central authorities. The criteria used for determining usefulness will be in terms of bandwidth capacity and uptime, but parameters should be specified in the directory consensus. State stored at the client should be in no more detail than necessary, to prevent sensitive information being recorded.

3. Design

3.x Opt-in state model

Tor can be in one of five node-promotion states:

  - off (O): Currently a client, and will stay as such
  - auto (A): Currently a client, but will consider promotion
  - bridge (B): Currently a bridge, and will stay as such
  - auto-bridge (AB): Currently a bridge, but will consider promotion
  - relay (R): Currently a public relay, and will stay as such

The state can be fully controlled from the configuration file or controller, but the normal state transitions are as follows:

  Any state -> off: User has opted out of node promotion
  Off -> any state: Only permitted with user consent
  Auto -> auto-bridge: Tor has detected that it is sufficiently reliable to be a *bridge*
  Auto -> bridge: Tor has detected that it is sufficiently reliable to be a *relay*, but the user has chosen to remain a *bridge*
  Auto -> relay: Tor has detected that it is sufficiently reliable to be a *relay*, and will skip being a *bridge*
  Auto-bridge -> relay: Tor has detected that it is sufficiently reliable to be a *relay*

Note that this model does not support automatic demotion. If this is desirable, there should be some memory as to whether the previous state was relay, bridge, or auto-bridge. Otherwise the user may be prompted to become a relay, although he has opted to only be a bridge.

3.x User interaction policy

There are a variety of options in how to involve the user in the decision as to whether and when to perform node promotion. The choice also may be different when Tor is running from Vidalia (and thus can readily prompt the user for information), and standalone (where Tor can only log messages, which may or may not be read).

The option requiring minimal user interaction is to automatically promote nodes according to reliability, and allow the user to opt out, by changing settings in the configuration file or Vidalia user interface.
Alternatively, if a user interface is available, Tor could prompt the user when it detects that a transition is available, and allow the user to choose which of the available options to select. If Vidalia is not available, it still may be possible to solicit an email address on install, and contact the operator to ask whether a transition to bridge or relay is permitted.

Finally, Tor could by default not make any transition, and the user would need to opt in by stating the maximum level (bridge or relay) to which the node may automatically promote itself.

3.x Performance monitoring model

To prevent a large number of clients activating as relays, but being too unreliable to be useful, clients should measure their performance. If this performance meets parameterized acceptance criteria, a client should consider promotion. To measure reliability, this proposal adopts a simple user model:

  - A user decides to use Tor at times which follow a Poisson distribution
  - At each time, the user will be happy if the bridge chosen has adequate bandwidth and is reachable
  - If the chosen bridge is down or slow too many times, the user will consider Tor to be bad

If we additionally assume that the recent history of relay performance matches the current performance, we can measure reliability by simulating this simple user.

The following parameters are distributed to clients in the directory consensus:

  - min_bandwidth: Minimum self-measured bandwidth for a node to be considered useful, in bytes per second
  - check_period: How long, in seconds, to wait between checking reachability and bandwidth (on average)
  - num_samples: Number of recent samples to keep
  - num_useful: Minimum number of recent samples where the node was reachable and had at least min_bandwidth capacity, for a client to consider promoting to a bridge

A different set of parameters may be used for considering when to promote a bridge to a full relay, but this will be the subject of a future revision of the proposal.
3.x Performance monitoring algorithm

The simulation described above can be implemented as follows:

Every 60 seconds:

1. Tor generates a random floating point number x in the interval [0, 1).
2. If x > (1 / (check_period / 60)) GOTO end; otherwise:
3. Tor sets the value last_check to the current_time (in seconds)
4. Tor measures reachability
5. If the client is reachable, Tor measures its bandwidth
6. If the client is reachable and the bandwidth is >= min_bandwidth, the test has succeeded, otherwise it has failed.
7. Tor adds the test result to the end of a ring-buffer containing the last num_samples results: measurements_results
8. Tor saves last_check and measurements_results to disk
9. If the length of measurements_results == num_samples and the number of successes >= num_useful, Tor should consider promotion to a bridge

end.

When Tor starts, it must fill in the samples for which it was not running. This can only happen once the consensus has downloaded, because the value of check_period is needed.

1. Tor generates a random number y from the Poisson distribution [1] with lambda = (current_time - last_check) * (1 / check_period)
2. Tor sets the value last_check to the current_time (in seconds)
3. Add y test failures to the ring buffer measurements_results
4. Tor saves last_check and measurements_results to disk

In this way, a Tor client will measure its bandwidth and reachability every check_period seconds, on average. Provided check_period is sufficiently greater than a minute (say, at least an hour), the check times will follow a Poisson distribution. [2]

While this does require Tor to record the state of a client over time, this does not leak much information. Only a binary reachable/non-reachable is stored, and the timing of samples becomes increasingly fuzzy as the data becomes less recent.
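The two procedures above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name PromotionMonitor, the run_test callback (standing in for steps 4 to 6), and the poisson_sample helper are hypothetical names, the Poisson sampler uses Knuth's multiplication method (one of several possible choices, and only practical for moderate lambda), and saving state to disk is left out.

```python
import math
import random
from collections import deque


def poisson_sample(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method.

    Assumption: lam is moderate; for very large lam, exp(-lam) underflows
    and a different sampler would be needed.
    """
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


class PromotionMonitor:
    """Hypothetical holder for last_check and the ring buffer of results."""

    def __init__(self, check_period, num_samples, num_useful, rng=None):
        self.check_period = check_period   # seconds, from the consensus
        self.num_samples = num_samples
        self.num_useful = num_useful
        self.rng = rng or random.Random()
        self.last_check = None
        # deque with maxlen drops the oldest entry, like a ring buffer
        self.results = deque(maxlen=num_samples)

    def should_consider_promotion(self):
        # Step 9: buffer is full and enough samples succeeded
        return (len(self.results) == self.num_samples
                and sum(self.results) >= self.num_useful)

    def tick(self, now, run_test):
        """Call every 60 seconds; run_test() returns True on success."""
        x = self.rng.random()                      # step 1: x in [0, 1)
        if x > 60.0 / self.check_period:           # step 2
            return False
        self.last_check = now                      # step 3
        self.results.append(bool(run_test()))      # steps 4 to 7
        return self.should_consider_promotion()    # step 9

    def fill_missed(self, now):
        """On startup, record failures for checks missed while not running."""
        if self.last_check is None:
            self.last_check = now
            return 0
        lam = (now - self.last_check) / self.check_period
        y = poisson_sample(lam, self.rng)
        self.last_check = now
        self.results.extend([False] * y)
        return y
```

A caller would invoke tick() from a 60-second timer and fill_missed() once after the consensus has been downloaded. Whether promotion then happens automatically or after prompting the user depends on the user interaction policy chosen.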
On IP address changes, Tor should clear the ring-buffer, because from the perspective of users with the old IP address, this node might as well be a new one with no history. This policy may change once we start allowing the bridge authority to hand out new IP addresses given the fingerprint. [Perhaps another consensus param? Also, this means we save previous IP address in our state file, yes? -RD] 3.x Bandwidth measurement Tor needs to measure its bandwidth to test the usefulness as a bridge. A non-intrusive way to do this would be to passively measure the peak data transfer rate since the last reachability test. Once this exceeds min_bandwidth, Tor can set a flag that this node currently has sufficient bandwidth to pass the bandwidth component of the upcoming performance measurement. For the first version we may simply skip the bandwidth test, because the existing reachability test sends 500 kB over several circuits, and checks whether the node can transfer at least 50 kB/s. This is probably good enough for a bridge, so this test might be sufficient to record a success in the ring buffer. 3.x New options 3.x New controller message 4. Migration plan We should start by setting a high bandwidth and uptime requirement in the consensus, so as to avoid overloading the bridge authority with too many bridges. Once we are confident our systems can scale, the criteria can be gradually shifted down to gain more bridges. 5. Related proposals 6. Open questions: - What user interaction policy should we take? - When (if ever) should we turn a relay into an exit relay? - What should the rate limits be for auto-promoted bridges/relays? Should we prompt the user for this? 
- Perhaps the bridge authority should tell potential bridges whether to enable themselves, by taking into account whether their IP address is blocked - How do we explain the possible risks of running a bridge/relay * Use of bandwidth/congestion * Publication of IP address * Blocking from IRC (even for non-exit relays) - What feedback should we give to bridge relays, to encourage them e.g. number of recent users (what about reserve bridges)? - Can clients back-off from doing these tests (yes, we should do this) [1] For algorithms to generate random numbers from the Poisson distribution, see: http://en.wikipedia.org/wiki/Poisson_distribution#Generating_Poisson-distributed_random_variables [2] "The sample size n should be equal to or larger than 20 and the probability of a single success, p, should be smaller than or equal to .05. If n >= 100, the approximation is excellent if np is also <= 10." http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm (e-Handbook of Statistical Methods) % vim: spell ai et:
Filename: 176-revising-handshake.txt Title: Proposed version-3 link handshake for Tor Author: Nick Mathewson Created: 31-Jan-2011 Status: Closed Target: 0.2.3 Supersedes: 169 1. Overview I propose a (mostly) backward-compatible change to the Tor connection establishment protocol to avoid the use of TLS renegotiation, to avoid certain protocol fingerprinting attacks, and to make it easier to write Tor clients and servers. Rather than doing a TLS renegotiation to exchange certificates and authenticate the original handshake, this proposal takes an approach similar to Steven Murdoch's proposal 124 and my old proposal 169, and uses Tor cells to finish authenticating the parties' identities once the initial TLS handshake is finished. I discuss some alternative design choices and why I didn't make them in section 7; please have a quick look there before telling me that something is pointless or makes no sense. Terminological note: I use "client" or "initiator" below to mean the Tor instance (a client or a bridge or a relay) that initiates a TLS connection, and "server" or "responder" to mean the Tor instance (a bridge or a relay) that accepts it. 2. History and Motivation The _goals_ of the Tor link handshake have remained basically uniform since our earliest versions. They are: * Provide data confidentiality, data integrity * Provide forward secrecy * Allow responder authentication or bidirectional authentication. * Try to look like some popular too-important-to-block-at-whim encryption protocol, to avoid fingerprinting and censorship. * Try to be implementable -- on the client side at least! -- by as many TLS implementations as possible. When we added the v2 handshake, we added another goal: * Remain compatible with older versions of the handshake protocol. In the original Tor TLS connection handshake protocol ("V1", or "two-cert"), parties that wanted to authenticate provided a two-cert chain of X.509 certificates during the handshake setup phase. 
Every party that wanted to authenticate sent these certificates. The security properties of this protocol are just fine; the problem was that our behavior of sending two-certificate chains made Tor easy to identify. In the current Tor TLS connection handshake protocol ("V2", or "renegotiating"), the parties begin with a single certificate sent from the server (responder) to the client (initiator), and then renegotiate to a two-certs-from-each-authenticating party. We made this change to make Tor's handshake look like a browser speaking SSL to a webserver. (See proposal 130, and tor-spec.txt.) So from an observer's point of view, two parties performing the V2 handshake begin by making a regular TLS handshake with a single certificate, then renegotiate immediately. To tell whether to use the V1 or V2 handshake, the servers look at the list of ciphers sent by the client. (This is ugly, but there's not much else in the ClientHello that they can look at.) If the list contains any cipher not used by the V1 protocol, the server sends back a single cert and expects a renegotiation. If the client gets back a single cert, then it withholds its own certificates until the TLS renegotiation phase. In other words, V2-supporting initiator behavior currently looks like this: - Begin TLS negotiation with V2 cipher list; wait for certificate(s). - If we get a certificate chain: - Then we are using the V1 handshake. Send our own certificate chain as part of this initial TLS handshake if we want to authenticate; otherwise, send no certificates. When the handshake completes, check certificates. We are now mutually authenticated. Otherwise, if we get just a single certificate: - Then we are using the V2 handshake. Do not send any certificates during this handshake. - When the handshake is done, immediately start a TLS renegotiation. During the renegotiation, expect a certificate chain from the server; send a certificate chain of our own if we want to authenticate ourselves. 
- After the renegotiation, check the certificates. Then send (and expect) a VERSIONS cell from the other side to establish the link protocol version. And V2-supporting responder behavior now looks like this: - When we get a TLS ClientHello request, look at the cipher list. - If the cipher list contains only the V1 ciphersuites: - Then we're doing a V1 handshake. Send a certificate chain. Expect a possible client certificate chain in response. Otherwise, if we get other ciphersuites: - We're using the V2 handshake. Send back a single certificate and let the handshake complete. - Do not accept any data until the client has renegotiated. - When the client is renegotiating, send a certificate chain, and expect (possibly multiple) certificates in reply. - Check the certificates when the renegotiation is done. Then exchange VERSIONS cells. Late in 2009, researchers found a flaw in most applications' use of TLS renegotiation: Although TLS renegotiation does not reauthenticate any information exchanged before the renegotiation takes place, many applications were treating it as though it did, and assuming that data sent _before_ the renegotiation was authenticated with the credentials negotiated _during_ the renegotiation. This problem was exacerbated by the fact that most TLS libraries don't actually give you an obvious good way to tell where the renegotiation occurred relative to the datastream. Tor wasn't directly affected by this vulnerability, but the aftermath hurts us in a few ways: 1) OpenSSL has disabled renegotiation by default, and created a "yes we know what we're doing" option we need to set to turn it back on. (Two options, actually: one for openssl 0.9.8l and one for 0.9.8m and later.) 2) Some vendors have removed all renegotiation support from their versions of OpenSSL entirely, forcing us to tell users to either replace their versions of OpenSSL or to link Tor against a hand-built one. 
3) Because of 1 and 2, I'd expect TLS renegotiation to become rarer and rarer in the wild, making our own use stand out more. Furthermore, there are other issues related to TLS and fingerprinting that we want to fix in any revised handshake: 1) We should make it easier to use self-signed certs, or maybe even existing HTTPS certificates, for the server side handshake, since most non-Tor SSL handshakes use either self-signed certificates or CA-signed certificates. 2) We should allow other changes in our use of TLS and in our certificates so as to resist fingerprinting based on how our certificates look. (See proposal 179.) 3. Design 3.1. The view in the large Taking a cue from Steven Murdoch's proposal 124 and my old proposal 169, I propose that we move the work currently done by the TLS renegotiation step (that is, authenticating the parties to one another) and do it with Tor cells instead of with TLS alone. This section outlines the protocol; we go into more detail below. To tell the client that it can use the new cell-based authentication system, the server sends a "V3 certificate" during the initial TLS handshake. (More on what makes a certificate "v3" below.) If the client recognizes the format of the certificate and decides to pursue the V3 handshake, then instead of renegotiating immediately on completion of the initial TLS handshake, the client instead sends a VERSIONS cell (and the negotiation begins). So the flowchart on the server side is: Wait for a ClientHello. If the client sends a ClientHello that indicates V1: - Send a certificate chain. - When the TLS handshake is done, if the client sent us a certificate chain, then check it. If the client sends a ClientHello that indicates V2 or V3: - Send a self-signed certificate or a CA-signed certificate - When the TLS handshake is done, wait for renegotiation or data. - If renegotiation occurs, the client is V2: send a certificate chain and maybe receive one. Check the certificate chain as in V1. 
- If the client sends data without renegotiating, it is starting the V3 handshake. Proceed with the V3 handshake as below.

And the client-side flowchart is:

- Send a ClientHello with a set of ciphers that indicates V2/V3.
- After the handshake is done:
  - If the server sent us a certificate chain, check it: we are using the V1 handshake.
  - If the server sent us a single "V2 certificate", we are using the v2 handshake: the client begins to renegotiate and proceeds as before.
  - Finally, if the server sent us a "v3 certificate", we are doing the V3 handshake below.

And the cell-based part of the V3 handshake, in summary, is:

C<->S: TLS handshake where S sends a "v3 certificate"

In TLS:

C->S: VERSIONS cell
S->C: VERSIONS cell, CERT cell, AUTH_CHALLENGE cell, NETINFO cell
C->S: Optionally: CERT cell, AUTHENTICATE cell
C->S: NETINFO cell

A "CERT" cell contains a set of certificates; an "AUTHENTICATE" cell authenticates the client to the server. More on these later.

3.2. Distinguishing V2 and V3 certificates

In the protocol outline above, we require that the client can distinguish between v2 certificates (that is, those sent by current servers) and v3 certificates. We further require that existing clients will accept v3 certificates as they currently accept v2 certificates.

Fortunately, current certificates have a few characteristics that make them fairly well-mannered as it is. We say that a certificate indicates a V2-only server if ALL of the following hold:

* The certificate is not self-signed.
* There is no DN field set in the certificate's issuer or subject other than "commonName".
* The commonNames of the issuer and subject both end with ".net"
* The public modulus is at most 1024 bits long.

Otherwise, the client should assume that the server supports the V3 handshake.
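The four-part test above can be sketched as a small predicate. The CertSummary type is a hypothetical, already-parsed view of an X.509 certificate introduced only for this illustration; a real implementation would pull these values out of whatever its TLS library exposes.

```python
from dataclasses import dataclass


@dataclass
class CertSummary:
    """Hypothetical parsed view of a certificate (not a real library type)."""
    self_signed: bool
    issuer_dn: dict     # DN field name -> value, e.g. {"commonName": "www.abc.net"}
    subject_dn: dict
    modulus_bits: int


def indicates_v2_only(cert):
    """Return True only if ALL four V2-only conditions hold."""
    only_common_name = (set(cert.issuer_dn) <= {"commonName"}
                        and set(cert.subject_dn) <= {"commonName"})
    both_end_in_net = (cert.issuer_dn.get("commonName", "").endswith(".net")
                       and cert.subject_dn.get("commonName", "").endswith(".net"))
    return (not cert.self_signed
            and only_common_name
            and both_end_in_net
            and cert.modulus_bits <= 1024)
```

If indicates_v2_only() returns False, the client should assume that the server supports the V3 handshake.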
To the best of my knowledge, current clients will behave properly on receiving non-v2 certs during the initial TLS handshake so long as they eventually get the correct V2 cert chain during the renegotiation. The v3 requirements are easy to meet: any certificate designed to resist fingerprinting will likely be self-signed, or if it's signed by a CA, then the issuer will surely have more DN fields set. Certificates that aren't trying to resist fingerprinting can trivially become v3 by using a CN that doesn't end with .net, or using a key longer than 1024 bits. 3.3. Authenticating via Tor cells: server authentication Once the TLS handshake is finished, if the client renegotiates, then the server should go on as it does currently. If the client implements this proposal, however, and the server has shown it can understand the V3+ handshake protocol, the client immediately sends a VERSIONS cell to the server and waits to receive a VERSIONS cell in return. We negotiate the Tor link protocol version _before_ we proceed with the negotiation, in case we need to change the authentication protocol in the future. Once either party has seen the VERSIONS cell from the other, it knows which version they will pick (that is, the highest version shared by both parties' VERSIONS cells). All Tor instances using the handshake protocol described in 3.2 MUST support at least link protocol version 3 as described here. If a version lower than 3 is negotiated with the V3 handshake in place, a Tor instance MUST close the connection. On learning the link protocol, the server then sends the client a CERT cell and a NETINFO cell. If the client wants to authenticate to the server, it sends a CERT cell, an AUTHENTICATE cell, and a NETINFO cell; or it may simply send a NETINFO cell if it does not want to authenticate. The CERT cell describes the keys that a Tor instance is claiming to have. It is a variable-length cell. 
Its payload format is:

N: Number of certs in cell [1 octet]
N times:
  CertType [1 octet]
  CLEN [2 octets]
  Certificate [CLEN octets]

Any extra octets at the end of a CERT cell MUST be ignored.

CertType values are:

1: Link key certificate from RSA1024 identity
2: RSA1024 Identity certificate
3: RSA1024 AUTHENTICATE cell link certificate

The certificate format is X509.

To authenticate the server, the client MUST check the following:

* The CERT cell contains exactly one CertType 1 "Link" certificate.
* The CERT cell contains exactly one CertType 2 "ID" certificate.
* Both certificates have validAfter and validUntil dates that are not expired.
* The certified key in the Link certificate matches the link key that was used to negotiate the TLS connection.
* The certified key in the ID certificate is a 1024-bit RSA key.
* The certified key in the ID certificate was used to sign both certificates.
* The link certificate is correctly signed with the key in the ID certificate
* The ID certificate is correctly self-signed.

If all of these conditions hold, then the client knows that it is connected to the server whose identity key is certified in the ID certificate. If any condition does not hold, the client closes the connection. If the client wanted to connect to a server with a different identity key, the client closes the connection.

An AUTH_CHALLENGE cell is a variable-length cell with the following fields:

Challenge [32 octets]
N_Methods [2 octets]
Methods [2 * N_Methods octets]

It is sent from the server to the client. Clients MUST ignore unexpected bytes at the end of the cell. Servers MUST generate every challenge using a strong RNG or PRNG.

The Challenge field is a randomly generated string that the client must sign (a hash of) as part of authenticating. The methods are the authentication methods that the server will accept. Only one authentication method is defined right now; see 3.4 below.

3.4.
Authenticating via Tor cells: Client authentication

A client does not need to authenticate to the server. If it does not wish to, it responds to the server's valid CERT cell by sending a NETINFO cell: once it has gotten a valid NETINFO cell, the client should consider the connection open, and the server should consider the connection as opened by an unauthenticated client.

If a client wants to authenticate, it responds to the AUTH_CHALLENGE cell with a CERT cell and an AUTHENTICATE cell. The CERT cell is as a server would send, except that instead of sending a CertType 1 cert for an arbitrary link certificate, the client sends a CertType 3 cert for an RSA AUTHENTICATE key. (This difference is because we allow any link key type on a TLS link, but the protocol described here will only work for 1024-bit RSA keys. A later protocol version should extend the protocol here to work with non-1024-bit, non-RSA keys.)

An AUTHENTICATE cell contains the following:

AuthType [2 octets]
AuthLen [2 octets]
Authentication [AuthLen octets]

Servers MUST ignore extra bytes at the end of an AUTHENTICATE cell. If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the Authentication contains the following:

TYPE: The characters "AUTH0001" [8 octets]
CID: A SHA256 hash of the client's RSA1024 identity key [32 octets]
SID: A SHA256 hash of the server's RSA1024 identity key [32 octets]
SLOG: A SHA256 hash of all bytes sent from the server to the client as part of the negotiation up to and including the AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERT cell, the AUTH_CHALLENGE cell, and any padding cells. [32 octets]
CLOG: A SHA256 hash of all bytes sent from the client to the server as part of the negotiation so far; that is, the VERSIONS cell and the CERT cell and any padding cells. [32 octets]
SCERT: A SHA256 hash of the server's TLS link certificate.
[32 octets]
TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the secret key, of the following:
  - client_random, as sent in the TLS Client Hello
  - server_random, as sent in the TLS Server Hello
  - the NUL terminated ASCII string: "Tor V3 handshake TLS cross-certification"
  [32 octets]
TIME: The time of day in seconds since the POSIX epoch. [8 octets]
RAND: A 16 byte value, randomly chosen by the client [16 octets]
SIG: A signature of a SHA256 hash of all the previous fields using the client's "Authenticate" key as presented. (As always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt section 0.3.) [variable length]

To check the AUTHENTICATE cell, a server checks that all fields from TYPE through TLSSECRETS contain their unique correct values as described above, and then verifies the signature. The server MUST ignore any extra bytes in the signed data after the SHA256 hash.

3.5. Responding to extra cells, and other security checks.

If the handshake is a V3 TLS handshake, both parties MUST reject any negotiated link version less than 3. Both parties MUST check this and close the connection if it is violated.

If the handshake is not a V3 TLS handshake, both parties MUST still advertise all link protocols they support in their versions cell. Both parties MUST close the link if it turns out they both would have supported version 3 or higher, but they somehow wound up using a v2 or v1 handshake. (More on this in section 6.4.)

Either party may send a VPADDING cell at any time during the handshake, except as the first cell. (See proposal 184.)

A server SHOULD NOT send any sequence of cells when starting a v3 negotiation other than "VERSIONS, CERT, AUTH_CHALLENGE, NETINFO". A client SHOULD drop a CERT, AUTH_CHALLENGE, or NETINFO cell that appears at any other time or out of sequence.

A client should not begin a v3 negotiation with any sequence other than "VERSIONS, NETINFO" or "VERSIONS, CERT, AUTHENTICATE, NETINFO".
A server SHOULD drop a CERT, AUTH_CHALLENGE, or NETINFO cell that appears at any other time or out of sequence. 4. Numbers to assign We need a version number for this link protocol. I've been calling it "3". We need to reserve command numbers for CERT, AUTH_CHALLENGE, and AUTHENTICATE. I suggest that in link protocol 3 and higher, we reserve a separate range of commands for variable-length cells. See proposal 184 for more there. 5. Efficiency This protocol adds a round-trip step when the client sends a VERSIONS cell to the server, and waits for the {VERSIONS, CERT, NETINFO} response in turn. (The server then waits for the client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply, but it would have already been waiting for the client's NETINFO, so that's not an additional wait.) This is actually fewer round-trip steps than required before for TLS renegotiation, so that's a win over v2. 6. Security argument These aren't crypto proofs, since I don't write those. They are meant to be reasonably convincing. 6.1. The server is authenticated TLS guarantees that if the TLS handshake completes successfully, the client knows that it is speaking to somebody who knows the private key corresponding to the public link key that was used in the TLS handshake. Because this public link key is signed by the server's identity key in the CERT cell, the client knows that somebody who holds the server's private identity key says that the server's public link key corresponds to the server's public identity key. Therefore, if the crypto works, and if TLS works, and if the keys aren't compromised, then the client is talking to somebody who holds the server's private identity key. 6.2. The client is authenticated Once the server has checked the client's certificates, the server knows that somebody who knows the client's private identity key says that he is the one holding the private key corresponding to the client's presented link-authentication public key. 
Once the server has checked the signature in the AUTHENTICATE cell, the server knows that somebody holding the client's link-authentication private key signed the data in question. By the standard certification argument above, the server knows that somebody holding the client's private identity key signed the data in question.

So the server's remaining question is: am I really talking to somebody holding the client's identity key, or am I getting a replayed or MITM'd AUTHENTICATE cell that was previously sent by the client?

Because the client includes a TLSSECRET component, and the server is able to verify it, the answer is easy: the server knows for certain that it is talking to the party with whom it did the TLS handshake, since if somebody else generated a correct TLSSECRET, they would have to know the master secret of the TLS connection, which would require them to have broken TLS.

Even if the protocol didn't contain the TLSSECRET component, the server could still trust the client's authentication, but the argument is a little trickier. The server knows that it is not getting a replayed AUTHENTICATE cell, since the cell authenticates (among other stuff) the server's AUTH_CHALLENGE cell, which it has never used before. The server knows that it is not getting a MITM'd AUTHENTICATE cell, since the cell includes a hash of the server's link certificate, which nobody else should have been able to use in a successful TLS negotiation.

6.3. MITM attacks won't work any better than they do against TLS

TLS guarantees that a man-in-the-middle attacker can't read the content of a successfully negotiated encrypted connection, nor alter the content in any way other than truncating it, unless he compromises the session keys or one of the key-exchange secret keys used to establish that connection. Let's make sure we do at least that well.

Suppose that a client Alice connects to an MITM attacker Mallory, thinking that she is connecting to some server Bob.
Let's assume that the TLS handshake between Alice and Mallory finishes successfully and the v3 protocol is chosen. [If the v1 or v2 protocol is chosen, those already resist MITM. If the TLS handshake doesn't complete, then Alice isn't connected to anybody.]

During the v3 handshake, Mallory can't convince Alice that she is talking to Bob, since she should not be able to produce a CERT cell containing a certificate chain signed by Bob's identity key and used to authenticate the link key that Mallory used during TLS. (If Mallory used her own link key for the TLS handshake, it won't match anything Bob signed unless Bob is compromised. Mallory can't use any key that Bob _did_ produce a certificate for, since she doesn't know the private key.)

Even if Alice fails to check the certificates from Bob, Mallory still can't convince Bob that she is really Alice. Assuming that Alice's keys aren't compromised, Mallory can't send a CERT cell with a cert chain from Alice's identity key to a key that Mallory controls, so if Mallory wants to impersonate Alice's identity key, she can only do so by sending an AUTHENTICATE cell really generated by Alice. Because Bob will check that the random bytes in the AUTH_CHALLENGE cell will influence the SLOG hash, Mallory needs to send Bob's challenge to Alice, and can't use any other AUTHENTICATE cell that Alice generated before. But because the AUTHENTICATE cell Alice will generate will include in the SCERT field a hash of the link certificate used by Mallory, Bob will reject it as not being valid to connect to him.

6.4. Protocol downgrade attacks won't work.

Assuming that Alice checks the certificates from Bob, she knows that Bob really sent her the VERSIONS cell that she received.

Because the AUTHENTICATE cell from Alice includes signed hashes of the VERSIONS cells from Alice and Bob, Bob knows that Alice got the VERSIONS cell he sent and sent the VERSIONS cell that he received.
But what about attempts to downgrade the protocol earlier in the handshake? Here TLS comes to the rescue: because the TLS Finished handshake message includes an authenticated digest of everything previously said during the handshake, an attacker can't replace the client's ciphersuite list (to trigger a downgrade to the v1 protocol) or the server's certificate [chain] (to trigger a downgrade to the v1 or v2 protocol). 7. Design considerations I previously considered adding our own certificate format in order to avoid the pain associated with X509, but decided instead to simply use X509 since a correct Tor implementation will already need to have X509 code to handle the other handshake versions and to use TLS. The trickiest part of the design here is deciding what to stick in the AUTHENTICATE cell. Some of it is strictly necessary, and some of it is left there for security margin in case my other security arguments fail. Because of the CID and SID elements you can't use an AUTHENTICATE cell for anything other than authenticating a client ID to a server with an appropriate server ID. The SLOG and CLOG elements are there mostly to authenticate the VERSIONS cells and resist downgrade attacks once there are two versions of this. The presence of the AUTH_CHALLENGE field in the stuff authenticated in SLOG prevents replays and ensures that the AUTHENTICATE cell was really generated by somebody who is reading what the server is sending over the TLS connection. The SCERT element is meant to prevent MITM attacks. When the TLSSECRET field is used, it should prevent the use of the AUTHENTICATE cell for anything other than the TLS connection the client had in mind. A signature of the TLSSECRET element on its own should also be sufficient to prevent the attacks we care about. The redundancy here should come in handy if I've made a mistake somewhere else in my analysis. 
If the client checks the server's certificates and matches them to the TLS connection link key before proceeding with the handshake, then signing the contents of the AUTH_CHALLENGE cell would be sufficient to authenticate the client. But implementers of allegedly compatible Tor clients have in the past skipped certificate verification steps, and I didn't want a client's failure to verify certificates to mean that a server couldn't trust that he was really talking to the client. To prevent this, I added the TLS link certificate to the authenticated data: even if the Tor client code doesn't check any certificates, the TLS library code will still check that the certificate used in the handshake contains a link key that matches the one used in the handshake.

8. Open questions:

- May we cache which certificates we've already verified? It might leak in timing whether we've connected with a given server before, and how recently.
- With which TLS libraries is it feasible to yoink client_random, server_random, and the master secret? If the answer is "All free C TLS libraries", great. If the answer is "OpenSSL only", not so great.
- Should we do anything to check the timestamp in the AUTHENTICATE cell?
- Can we give some way for clients to signal "I want to use the V3 protocol if possible, but I can't renegotiate, so don't give me the V2"? Clients currently have a fair idea of server versions, so they could potentially do the V3 handshake with servers that support it, and fall back to V1 otherwise.
- What should servers that don't have TLS renegotiation do? For now, I think they should just stick with V1. Eventually we can deprecate the V2 handshake as we did with the V1 handshake. When that happens, servers can be V3-only.
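As an illustration of the CERT cell payload layout given in section 3.3, here is a minimal parsing sketch. The function name parse_cert_cell is hypothetical, and the sketch assumes that the two-octet CLEN field is in network (big-endian) byte order, which section 3.3 does not state explicitly. It only splits the payload; none of the certificate checks from section 3.3 are performed.

```python
import struct


def parse_cert_cell(payload):
    """Split a CERT cell payload into a list of (cert_type, cert_bytes).

    Assumption: CLEN is big-endian. Extra octets after the last
    certificate are ignored, as the proposal requires.
    """
    if len(payload) < 1:
        raise ValueError("empty CERT cell payload")
    n_certs = payload[0]                       # N [1 octet]
    pos = 1
    certs = []
    for _ in range(n_certs):
        if pos + 3 > len(payload):
            raise ValueError("truncated certificate header")
        # CertType [1 octet], CLEN [2 octets]
        cert_type, clen = struct.unpack_from("!BH", payload, pos)
        pos += 3
        if pos + clen > len(payload):
            raise ValueError("truncated certificate body")
        certs.append((cert_type, payload[pos:pos + clen]))
        pos += clen
    return certs
```

A client would then check that the result holds exactly one CertType 1 and one CertType 2 entry before validating the certificates themselves.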
Filename: 177-flag-abstention.txt
Title: Abstaining from votes on individual flags
Author: Nick Mathewson
Created: 14 Feb 2011
Status: Reserve
Target: 0.2.4.x

Overview:

We should have a way for authorities to vote on flags in particular instances, without having to vote on that flag for all servers.

Motivation:

Suppose that the status of some router becomes controversial, and an authority wants to vote for or against the BadExit status of that router. Suppose also that the authority is not currently voting on the BadExit flag. If the authority wants to say that the router is or is not "BadExit", it cannot currently do so without voting yea or nay on the BadExit status of all other routers.

Suppose that an authority wants to vote "Valid" or "Invalid" on a large number of routers, but does not have an opinion on some of them. Currently, it cannot do so: if it votes for the Valid flag anywhere, it votes for it everywhere.

Design:

We add a new line "extra-flags" in directory votes, to appear after "known-flags". It lists zero or more flags that an authority has occasional opinions on, but for which the authority will usually abstain. No flag may appear in both extra-flags and known-flags.

In the router-status section for each directory vote, we allow an optional "s2" line to appear after the "s" line. It contains zero or more flag votes. A flag vote is of the form of one of "+", "-", or "/" followed by the name of a flag. "+" denotes a yea vote, "-" denotes a nay vote, and "/" denotes an abstention. Authorities may omit most abstentions, except as noted below. No flag may appear in an s2 line unless it appears in the known-flags or extra-flags line. We retain the rule that no flag may appear in an s line unless it appears in the known-flags line.
When using an appropriate consensus method to vote, we use these new rules to determine flags: A flag is listed in the consensus if it is in the known-flags section of at least one voter, and in the known-flags or extra-flags section of at least three voters (or half the authorities, whichever set is smaller). A single authority's vote for a given flag on a given router is interpreted as follows: - If the authority votes +Flag or -Flag or /Flag in the s2 line for that router, the vote is "yea" or "nay" or "abstain" respectively. - Otherwise, if the flag is listed on the "s" line for the router, then the vote is "yea". - Otherwise, if the flag is listed in the known-flags line, then the vote is "nay". - Otherwise, the vote is "abstain". A router is assigned a flag in the consensus iff the total "yeas" outnumber the total "nays". As an exception, this proposal does not affect the behavior of the "Named" and "Unnamed" flags; these are still treated as before. (An authority can already abstain from a single naming decision by not voting Named on any router with a given name.) Examples: Suppose that it becomes important to know which Tor servers are operated by burrowing marsupials. Some authority operators diligently research this question; others want to vote about individual routers on an ad hoc basis when they learn about a particular router's being e.g. located underground in New South Wales. If an authority usually has no opinions on the RunByWombats flag, it should list it in the "extra-flags" of its votes. If it occasionally wants to vote that a router is (or is not) run by wombats, it should list "s2 +RunByWombats" or "s2 -RunByWombats" for the routers in question. Otherwise it can omit the flag from its s and s2 lines entirely. 
If an authority usually has an opinion on the RunByWombats flag, but wants to abstain in some cases, it should list "RunByWombats" in the "known-flags" part of its votes, and include "RunByWombats" in the s line for every router that it believes is run by wombats. When it wants to vote that a router is not run by wombats, it should list the RunByWombats flag in neither the s nor the s2 line. When it wants to abstain, it should list "s2 /RunByWombats". In both cases, when the new consensus method is used, a router will get listed as "RunByWombats" if there are more authorities that say it is run by wombats than there are authorities saying it is not run by wombats. (As now, "no" votes win ties.)
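The interpretation rules and the tally described in the Design section can be sketched in Python. This is only an illustration under stated assumptions, not Tor's implementation: the `Vote` structure and all function names are hypothetical, "half the authorities" is compared without rounding, and the Named/Unnamed exception is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Vote:
    """One authority's vote (hypothetical structure, for illustration only)."""
    known_flags: set = field(default_factory=set)
    extra_flags: set = field(default_factory=set)
    # router id -> set of flags listed on that router's "s" line
    s: dict = field(default_factory=dict)
    # router id -> {flag: "+", "-" or "/"} taken from that router's "s2" line
    s2: dict = field(default_factory=dict)


def interpret(vote, router, flag):
    """Return "yea", "nay" or "abstain" for one authority, router and flag."""
    mark = vote.s2.get(router, {}).get(flag)
    if mark is not None:
        # An explicit s2 entry takes precedence over everything else.
        return {"+": "yea", "-": "nay", "/": "abstain"}[mark]
    if flag in vote.s.get(router, set()):
        return "yea"
    if flag in vote.known_flags:
        return "nay"
    return "abstain"


def flag_is_listed(votes, flag, n_authorities):
    """Apply the rule for whether a flag appears in the consensus at all."""
    in_known = sum(1 for v in votes if flag in v.known_flags)
    in_either = sum(
        1 for v in votes if flag in v.known_flags or flag in v.extra_flags
    )
    # Assumption: "half the authorities" is taken as n_authorities / 2.
    return in_known >= 1 and in_either >= min(3, n_authorities / 2)


def router_gets_flag(votes, router, flag):
    """A router gets the flag iff yeas strictly outnumber nays (ties lose)."""
    results = [interpret(v, router, flag) for v in votes]
    return results.count("yea") > results.count("nay")
```

With this sketch, an authority that lists RunByWombats only in extra-flags and votes "s2 +RunByWombats" for one router counts as a yea for that router and as an abstention for every other router.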
Filename: 178-param-voting.txt
Title: Require majority of authorities to vote for consensus parameters
Author: Sebastian Hahn
Created: 16-Feb-2011
Status: Closed
Implemented-In: 0.2.3.9-alpha

Overview:

The consensus that the directory authorities create may contain one or more parameters (32-bit signed integers) that influence the behavior of Tor nodes (see proposal 167, "Vote on network parameters in consensus" for more details).

Currently (as of consensus method 11), a consensus will end up containing a parameter if at least one directory authority votes for that parameter. The value of the parameter will be the low-median of all the votes for this parameter.

This proposal aims at changing this voting process to be more secure against tampering by a small fraction of directory authorities.

Motivation:

To prevent a small fraction of the directory authorities from influencing the value of a parameter unduly, a big enough fraction of all directory authorities has to vote for that parameter. This is not currently happening, and it is in fact not uncommon for a single authority to govern the value of a consensus parameter.

Design:

When the consensus is generated, the directory authorities ensure that a param is only included in the list of params if at least three of the authorities (or a simple majority, whichever is the smaller number) vote for that param. The value chosen is the low-median of all the votes. We don't mandate that the authorities have to vote on exactly the same value for it to be included because some consensus parameters could be the result of active measurements that individual authorities make.

Security implications:

This change is aimed at improving the security of Tor nodes against attacks carried out by a small fraction of directory authorities.
It is possible that a consensus parameter that would be helpful to the network is not included because not enough directory authorities voted for it, but since clients are required to have sane defaults in case the parameter is absent this does not carry a security risk. This proposal makes a security vs coordination effort tradeoff. When considering only the security of the design, it would be better to require a simple majority of directory authorities to agree on voting on a parameter, but it would involve requiring more directory authority operators to coordinate their actions to set the parameter successfully. Specification: dir-spec section 3.4 currently says: Entries are given on the "params" line for every keyword on which any authority voted. The values given are the low-median of all votes on that keyword. It is proposed that the above is changed to: Entries are given on the "params" line for every keyword on which a majority of authorities (total authorities, not just those participating in this vote) voted on, or if at least three authorities voted for that parameter. The values given are the low-median of all votes on that keyword. Consensus methods 11 and before, entries are given on the "params" line for every keyword on which any authority voted, the value given being the low-median of all votes on that keyword. The following should be added to the bottom of section 3.4.: * If consensus method 12 or later is used, only consensus parameters that more than half of the total number of authorities voted for are included in the consensus. The following line should be added to the bottom of section 3.4.1.: "12" -- Params are only included if enough auths voted for them Compatibility: A sufficient number of directory authorities must upgrade to the new consensus method used to calculate the params in the way this proposal calls for, otherwise the old mechanism is used. 
Nodes that do not act as directory authorities do not need to be upgraded and should experience no change in behaviour. Implementation: An example implementation of this feature can be found in https://gitweb.torproject.org/sebastian/tor.git, branch safer_params.
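The inclusion rule and the low-median from the Design section can be illustrated with a short Python sketch. It is not the code from the branch mentioned above; the function names and the representation of votes as dictionaries are assumptions made for the example.

```python
def low_median(values):
    """Return the low-median: the lower of the two middle values when the
    number of votes is even, the middle value when it is odd."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def consensus_params(votes, total_authorities):
    """Compute the "params" entries for a consensus.

    votes: one dict per participating authority, mapping keyword -> int.
    total_authorities: number of all authorities, not just those voting.
    """
    # At least three authorities, or a simple majority of all authorities,
    # whichever number is smaller.
    threshold = min(3, total_authorities // 2 + 1)

    by_keyword = {}
    for vote in votes:
        for keyword, value in vote.items():
            by_keyword.setdefault(keyword, []).append(value)

    return {
        keyword: low_median(values)
        for keyword, values in sorted(by_keyword.items())
        if len(values) >= threshold
    }
```

For example, with nine authorities in total, a keyword that only one authority votes on is dropped, while a keyword with the four votes 10, 30, 20 and 40 is included with the value 20.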
Filename: 179-TLS-cert-and-parameter-normalization.txt Title: TLS certificate and parameter normalization Author: Jacob Appelbaum, Gladys Shufflebottom Created: 16-Feb-2011 Status: Closed Target: 0.2.3.x Draft spec for TLS certificate and handshake normalization Overview STATUS NOTE: This document is implemented in part in 0.2.3.x, deferred in part, and rejected in part. See indented bracketed comments in individual sections below for more information. -NM Scope This is a document that proposes improvements to problems with Tor's current TLS (Transport Layer Security) certificates and handshake that will reduce the distinguishability of Tor traffic from other encrypted traffic that uses TLS. It also addresses some of the possible fingerprinting attacks possible against the current Tor TLS protocol setup process. Motivation and history Censorship is an arms race and this is a step forward in the defense of Tor. This proposal outlines ideas to make it more difficult to fingerprint and block Tor traffic. Goals This proposal intends to normalize or remove easy-to-predict or static values in the Tor TLS certificates and with the Tor TLS setup process. These values can be used as criteria for the automated classification of encrypted traffic as Tor traffic. Network observers should not be able to trivially detect Tor merely by receiving or observing the certificate used or advertised by a Tor relay. I also propose the creation of a hard-to-detect covert channel through which a server can signal that it supports the third version ("V3") of the Tor handshake protocol. Non-Goals This document is not intended to solve all of the possible active or passive Tor fingerprinting problems. This document focuses on removing distinctive and predictable features of TLS protocol negotiation; we do not attempt to make guarantees about resisting other kinds of fingerprinting of Tor traffic, such as fingerprinting techniques related to timing or volume of transmitted data. 
Implementation details

Certificate Issues

The CN or commonName ASN1 field

Tor generates certificates with a predictable commonName field; the field is within a given range of values that is specific to Tor. Additionally, the generated host names have other undesirable properties. The host names typically do not resolve in the DNS because the domain names referred to are generated at random. Although they are syntactically valid, they usually refer to domains that have never been registered by any domain name registrar.

An example of the current commonName field: CN=www.s4ku5skci.net

An example of OpenSSL’s asn1parse over a typical Tor certificate:

    0:d=0 hl=4 l= 438 cons: SEQUENCE
    4:d=1 hl=4 l= 287 cons: SEQUENCE
    8:d=2 hl=2 l= 3 cons: cont [ 0 ]
    10:d=3 hl=2 l= 1 prim: INTEGER :02
    13:d=2 hl=2 l= 4 prim: INTEGER :4D3C763A
    19:d=2 hl=2 l= 13 cons: SEQUENCE
    21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
    32:d=3 hl=2 l= 0 prim: NULL
    34:d=2 hl=2 l= 35 cons: SEQUENCE
    36:d=3 hl=2 l= 33 cons: SET
    38:d=4 hl=2 l= 31 cons: SEQUENCE
    40:d=5 hl=2 l= 3 prim: OBJECT :commonName
    45:d=5 hl=2 l= 24 prim: PRINTABLESTRING :www.vsbsvwu5b4soh4wg.net
    71:d=2 hl=2 l= 30 cons: SEQUENCE
    73:d=3 hl=2 l= 13 prim: UTCTIME :110123184058Z
    88:d=3 hl=2 l= 13 prim: UTCTIME :110123204058Z
    103:d=2 hl=2 l= 28 cons: SEQUENCE
    105:d=3 hl=2 l= 26 cons: SET
    107:d=4 hl=2 l= 24 cons: SEQUENCE
    109:d=5 hl=2 l= 3 prim: OBJECT :commonName
    114:d=5 hl=2 l= 17 prim: PRINTABLESTRING :www.s4ku5skci.net
    133:d=2 hl=3 l= 159 cons: SEQUENCE
    136:d=3 hl=2 l= 13 cons: SEQUENCE
    138:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
    149:d=4 hl=2 l= 0 prim: NULL
    151:d=3 hl=3 l= 141 prim: BIT STRING
    295:d=1 hl=2 l= 13 cons: SEQUENCE
    297:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
    308:d=2 hl=2 l= 0 prim: NULL
    310:d=1 hl=3 l= 129 prim: BIT STRING

I propose that we match OpenSSL's default self-signed certificates. I hypothesise that they are the most common self-signed certificates.
If this turns out not to be the case, then we should use whatever the most common turns out to be.

Certificate serial numbers

Currently our generated certificate serial number is set to the number of seconds since the epoch at the time of the certificate's creation. I propose that we should ensure that our serial numbers are unrelated to the epoch, since the generation methods are potentially recognizable as Tor-related.

Instead, I propose that we use a randomly generated number that is subsequently hashed with SHA-512 and then truncate the data to eight bytes[1].

Random sixteen byte values appear to be the high bound for serial numbers as issued by Verisign and DigiCert. RapidSSL serial numbers appear to be three bytes in length. Other common byte lengths appear to be between one and four bytes. The default OpenSSL certificates are eight bytes and we should use this length with our self-signed certificates.

This randomly generated serial number field may now serve as a covert channel that signals to the client that the OR will not support TLS renegotiation; this means that the client can expect to perform a V3 TLS handshake setup. Otherwise, if the serial number is a reasonable time since the epoch, we should assume the OR is using an earlier protocol version and hence that it expects renegotiation.

We also have a need to signal properties with our certificates for a possible v3 handshake in the future. Therefore I propose that we match OpenSSL default self-signed certificates (a 64-bit random number), but reserve the two least-significant bits for signaling. For the moment, these two bits will be zero.

This means that an attacker may be able to distinguish Tor certificates from default OpenSSL certificates with a 75% probability.

As a security note, care must be taken to ensure that supporting this covert channel will not lead to an attacker having a method to downgrade client behavior.
This shouldn't be a risk because the TLS Finished message hashes over all the bytes of the handshake, including the certificates.

[Randomized serial numbers are implemented in 0.2.3.9-alpha. We probably shouldn't do certificate tagging by a covert channel in serial numbers, since doing so would mean we could never have an externally signed cert. -NM]

Certificate fingerprinting issues expressed as base64 encoding

It appears that all deployed Tor certificates have the following strings in common:

    MIIB
    CCA
    gAwIBAgIETU
    ANBgkqhkiG9w0BAQUFADA
    YDVQQDEx
    3d3cu

As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID) properties (sha1WithRSAEncryption, commonName, etc) of how we generate our certificates.

As an illustrated example of the common bytes of all certificates used within the Tor network within a single one hour window, I have replaced the actual value with a wild card ('.') character here:

    -----BEGIN CERTIFICATE-----
    MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3
    d3cu... (variable length and padding)
    -----END CERTIFICATE-----

This fine ASCII art only illustrates the bytes that absolutely match in all cases. In many cases, it's likely that there is a high probability for a given byte to be only a small subset of choices.

Using the above strings, the EFF's certificate observatory may trivially discover all known relays, known bridges and unknown bridges in a single SQL query. I propose that we test our certificates to ensure that they show these kinds of statistical similarities only where they overlap with a very large cross section of the internet's certificates.

Certificate dating and validity issues

TLS certificates found in the wild are generally found to be long-lived; they are frequently old and often even expired. The current Tor certificate validity time is a very small time window starting at generation time and ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME (2*60*60).
I propose that the certificate validity time length is extended to a period of twelve Earth months, possibly with a small random skew to be determined by the implementer. Tor should randomly set the start date in the past or some currently unspecified window of time before the current date. This would more closely track the typical distribution of non-Tor TLS certificate expiration times. The certificate values, such as expiration, should not be used for anything relating to security; for example, if the OR presents an expired TLS certificate, this does not imply that the client should terminate the connection (as would be appropriate for an ordinary TLS implementation). Rather, I propose we use a TOFU style expiration policy - the certificate should never be trusted for more than a two hour window from first sighting. This policy should have two major impacts. The first is that an adversary will have to perform a differential analysis of all certificates for a given IP address rather than a single check. The second is that the server expiration time is enforced by the client and confirmed by keys rotating in the consensus. The expiration time should not be a fixed time that is simple to calculate by any Deep Packet Inspection device or it will become a new Tor TLS setup fingerprint. [Deferred and needs revision; see proposal XXX. -NM] Proposed certificate form The following output from openssl asn1parse results from the proposed certificate generation algorithm. 
It matches the results of generating a default self-signed certificate:

    0:d=0 hl=4 l= 513 cons: SEQUENCE
    4:d=1 hl=4 l= 362 cons: SEQUENCE
    8:d=2 hl=2 l= 9 prim: INTEGER :DBF6B3B864FF7478
    19:d=2 hl=2 l= 13 cons: SEQUENCE
    21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
    32:d=3 hl=2 l= 0 prim: NULL
    34:d=2 hl=2 l= 69 cons: SEQUENCE
    36:d=3 hl=2 l= 11 cons: SET
    38:d=4 hl=2 l= 9 cons: SEQUENCE
    40:d=5 hl=2 l= 3 prim: OBJECT :countryName
    45:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU
    49:d=3 hl=2 l= 19 cons: SET
    51:d=4 hl=2 l= 17 cons: SEQUENCE
    53:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName
    58:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State
    70:d=3 hl=2 l= 33 cons: SET
    72:d=4 hl=2 l= 31 cons: SEQUENCE
    74:d=5 hl=2 l= 3 prim: OBJECT :organizationName
    79:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd
    105:d=2 hl=2 l= 30 cons: SEQUENCE
    107:d=3 hl=2 l= 13 prim: UTCTIME :110217011237Z
    122:d=3 hl=2 l= 13 prim: UTCTIME :120217011237Z
    137:d=2 hl=2 l= 69 cons: SEQUENCE
    139:d=3 hl=2 l= 11 cons: SET
    141:d=4 hl=2 l= 9 cons: SEQUENCE
    143:d=5 hl=2 l= 3 prim: OBJECT :countryName
    148:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU
    152:d=3 hl=2 l= 19 cons: SET
    154:d=4 hl=2 l= 17 cons: SEQUENCE
    156:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName
    161:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State
    173:d=3 hl=2 l= 33 cons: SET
    175:d=4 hl=2 l= 31 cons: SEQUENCE
    177:d=5 hl=2 l= 3 prim: OBJECT :organizationName
    182:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd
    208:d=2 hl=3 l= 159 cons: SEQUENCE
    211:d=3 hl=2 l= 13 cons: SEQUENCE
    213:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
    224:d=4 hl=2 l= 0 prim: NULL
    226:d=3 hl=3 l= 141 prim: BIT STRING
    370:d=1 hl=2 l= 13 cons: SEQUENCE
    372:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
    383:d=2 hl=2 l= 0 prim: NULL
    385:d=1 hl=3 l= 129 prim: BIT STRING

[Rejected pending more evidence; this pattern is trivially detectable, and there is just not enough reason at the moment to think that this particular certificate pattern is common
enough for sites that matter that the censors wouldn't be willing to block it. -NM]

Custom Certificates

It should be possible for a Tor relay operator to use a specifically supplied certificate and secret key. This will allow a relay or bridge operator to use a certificate signed by any member of any geographically relevant certificate authority racket; it will also allow for any other user-supplied certificate. This may be desirable in some kinds of filtered networks or when attempting to avoid attracting suspicion by blending in with the TLS web server certificate crowd.

[Deferred; see proposal XXX]

Problematic Diffie–Hellman parameters

We currently send a static Diffie–Hellman parameter, prime p (or “prime p outlaw”) as specified in RFC2409 as part of the TLS Server Hello response. The use of this prime in TLS negotiations may, as a result, be filtered and effectively banned by certain networks. We do not have to use this particular prime in all cases.

While amusing to have the power to make specific prime numbers into a new class of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p outlaw is not required.

I propose that the function to initialize and generate DH parameters be split into two functions. First, init_dh_param() should be used only for OR-to-OR DH setup and communication. Second, it is proposed that we create a new function init_tls_dh_param() that will have a two-stage development process.

The first stage init_tls_dh_param() will use the same prime that Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be made immediately. This is a known good and safe prime number ((p-1)/2 is also prime) that is currently not known to be blocked.
The second stage init_tls_dh_param() should randomly generate a new prime on a regular basis; this is designed to make the prime difficult to outlaw or filter. Call this a shape-shifting or "Rakshasa" prime. This should be added to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution time and probably does not need to be stored on disk. Rakshasa primes only need to be generated by Tor relays as Tor clients will never send them. Such a prime should absolutely not be shared between different Tor relays nor should it ever be static after the 0.2.3.x release.

As a security precaution, care must be taken to ensure that we do not generate weak primes or known filtered primes. Both weak and filtered primes will undermine the TLS connection security properties. OpenSSH solves this issue dynamically in RFC 4419 [5] and may provide a solution that works reasonably well for Tor. More research in this area is needed, including on the applicability of Miller-Rabin or AKS primality tests[6], and the results will probably need to be added to Tor.

[Randomized DH groups are implemented in 0.2.3.9-alpha. -NM]

Practical key size

Currently we use a 1024 bit long RSA modulus. I propose that we increase the RSA key size to 2048 as an additional channel to signal support for the V3 handshake setup. 2048 appears to be the most common key size[0] above 1024. Additionally, the increase in modulus size provides a reasonable security boost with regard to key security properties.

The implementer should increase the 1024 bit RSA modulus to 2048 bits.

[Deferred and needs performance analysis. See proposal XXX. Additionally, DH group strength seems far more crucial. Still, this is out-of-scope for a "normalization" question. -NM]

Possible future filtering nightmares

At some point it may become cost effective or politically feasible for a network filter to simply block all signed or self-signed certificates without a known valid CA trust chain.
This will break many applications on the internet and hopefully, our option for custom certificates will ensure that this step is simply avoided by the censors.

The Rakshasa prime approach may cause censors to specifically allow only certain known and accepted DH parameters.

Appendix: Other issues

What other obvious TLS certificate issues exist? What other static values are present in the Tor TLS setup process?

[0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html
[1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html
[2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html
[3] To be fair this is hardly a new class of numbers. History is rife with similar examples of inane authoritarian attempts at mathematical secrecy. Probably the most dramatic example is the story of Hippasus of Metapontum, pupil of the famous Pythagoras, who, legend goes, proved the fact that Root2 cannot be expressed as a fraction of whole numbers (now called an irrational number) and was assassinated for revealing this secret. Further reading on the subject may be found on the Wikipedia: http://en.wikipedia.org/wiki/Hippasus
[4] httpd-2.2.17/modules/ssl/ssl_engine_dh.c
[5] http://tools.ietf.org/html/rfc4419
[6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html
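As an illustration of the "Certificate serial numbers" section of proposal 179, the proposed serial number generation can be sketched in Python. The function name and the `signal_bits` parameter are hypothetical. The sketch follows the proposal text only (random data, SHA-512, truncation to eight bytes, two low bits reserved); the bracketed status note in that section advises against actually using the reserved bits as a covert channel.

```python
import hashlib
import os


def make_serial(signal_bits=0):
    """Return a 64-bit certificate serial number as an int.

    Random bytes are hashed with SHA-512 and the digest is truncated to
    eight bytes. The two least-significant bits are reserved for
    signaling and are zero by default, as the proposal suggests.
    """
    if not 0 <= signal_bits <= 0b11:
        raise ValueError("signal_bits must fit in two bits")
    digest = hashlib.sha512(os.urandom(64)).digest()
    serial = int.from_bytes(digest[:8], "big")
    # Clear the two reserved bits, then set the requested signal value.
    return (serial & ~0b11) | signal_bits
```

A default OpenSSL serial number has both low bits clear only one time in four, which is where the proposal's 75% distinguishing probability comes from.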
Filename: 180-pluggable-transport.txt Title: Pluggable transports for circumvention Author: Jacob Appelbaum, Nick Mathewson Created: 15-Oct-2010 Status: Closed Implemented-In: 0.2.3.x Overview This proposal describes a way to decouple protocol-level obfuscation from the core Tor protocol in order to better resist client-bridge censorship. Our approach is to specify a means to add pluggable transport implementations to Tor clients and bridges so that they can negotiate a superencipherment for the Tor protocol. Scope This is a document about transport plugins; it does not cover discovery improvements, or bridgedb improvements. While these requirements might be solved by a program that also functions as a transport plugin, this proposal only covers the requirements and operation of transport plugins. Motivation Frequently, people want to try a novel circumvention method to help users connect to Tor bridges. Some of these methods are already pretty easy to deploy: if the user knows an unblocked VPN or open SOCKS proxy, they can just use that with the Tor client today. Less easy to deploy are methods that require participation by both the client and the bridge. In order of increasing sophistication, we might want to support: 1. A protocol obfuscation tool that transforms the output of a TLS connection into something that looks like HTTP as it leaves the client, and back to TLS as it arrives at the bridge. 2. An additional authentication step that a client would need to perform for a given bridge before being allowed to connect. 3. An information passing system that uses a side-channel in some existing protocol to convey traffic between a client and a bridge without the two of them ever communicating directly. 4. A set of clients to tunnel client->bridge traffic over an existing large p2p network, such that the bridge is known by an identifier in that network rather than by an IP address. 
We could in theory support these almost fine with Tor as it stands today: every Tor client can take a SOCKS proxy to use for its outgoing traffic, so a suitable client proxy could handle the client's traffic and connections on its behalf, while a corresponding program on the bridge side could handle the bridge's side of the protocol transformation. Nevertheless, there are some reasons to add support for transportation plugins to Tor itself: 1. It would be good for bridges to have a standard way to advertise which transports they support, so that clients can have multiple local transport proxies, and automatically use the right one for the right bridge. 2. There are some changes to our architecture that we'll need for a system like this to work. For testing purposes, if a bridge blocks off its regular ORPort and instead has an obfuscated ORPort, the bridge authority has no way to test it. Also, unless the bridge has some way to tell that the bridge-side proxy at 127.0.0.1 is not the origin of all the connections it is relaying, it might decide that there are too many connections from 127.0.0.1, and start paring them down to avoid a DoS. 3. Censorship and anticensorship techniques often evolve faster than the typical Tor release cycle. As such, it's a good idea to provide ways to test out new anticensorship mechanisms on a more rapid basis. 4. Transport obfuscation is a relatively distinct problem from the other privacy problems that Tor tries to solve, and it requires a fairly distinct skill-set from hacking the rest of Tor. By decoupling transport obfuscation from the Tor core, we hope to encourage people working on transport obfuscation who would otherwise not be interested in hacking Tor. 5. Finally, we hope that defining a generic transport obfuscation plugin mechanism will be useful to other anticensorship projects. Non-Goals We're not going to talk about automatic verification of plugin correctness and safety via sandboxing, proof-carrying code, or whatever. 
We need to do more with discovery and distribution, but that's not what this proposal is about. We're pretty convinced that the problems are sufficiently orthogonal that we should be fine so long as we don't preclude a single program from implementing both transport and discovery extensions. This proposal is not about what transport plugins are the best ones for people to write. We do, however, make some general recommendations for plugin authors in an appendix. We've considered issues involved with completely replacing Tor's TLS with another encryption layer, rather than layering it inside the obfuscation layer. We describe how to do this in an appendix to the current proposal, though we are not currently sure whether it's a good idea to implement. We deliberately reject any design that would involve linking the transport plugins into Tor's process space. Design overview To write a new transport protocol, an implementer must provide two pieces: a "Client Proxy" to run at the initiator side, and a "Server Proxy" to run at the server side. These two pieces may or may not be implemented by the same program. Each client may run any number of Client Proxies. Each one acts like a SOCKS proxy that accepts connections on localhost. Each one runs on a different port, and implements one or more transport methods. If the protocol has any parameters, they are passed from Tor inside the regular username/password parts of the SOCKS protocol. Bridges (and maybe relays) may run any number of Server Proxies: these programs provide an interface like stunnel: they get connections from the network (typically by listening for connections on the network) and relay them to the Bridge's real ORPort. To configure one of these programs, it should be sufficient simply to list it in your torrc. The program tells Tor which transports it provides. 
The Tor consensus should carry a new approved version number that is specific for pluggable transport; this will allow Tor to know when a particular transport is known to be unsafe, safe, or non-functional.

Bridges (and maybe relays) report in their descriptors which transport protocols they support. This information can be copied into bridge lines. Bridges using a transport protocol may have multiple bridge lines.

Any methods that are wildly successful, we can bake into Tor.

Specifications: Client behavior

We extend the bridge line format to allow you to say which method to use to connect to a bridge.

The new format is:

    Bridge method address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]

To connect to such a bridge, the Tor program needs to know which SOCKS proxy will support the transport called "method". It then connects to this proxy, and asks it to connect to address:port. If [id-fingerprint] is provided, Tor should expect the public identity key on the TLS connection to match the digest provided in [id-fingerprint]. If any [k=v] items are provided, they are configuration parameters for the proxy: Tor should separate them with semicolons and put them in the user and password fields of the request, splitting them across the fields as necessary. If a key or value must contain a semicolon or a backslash, it is escaped with a backslash.

Method names must be C identifiers.

For reference, the old bridge format was

    Bridge address[:port] [id-fingerprint]

where port defaults to 443 and the id-fingerprint is optional. The new format can be distinguished from the old one by checking if the first argument has any non-C-identifier characters. (Looking for a period should be a simple way.) Also, while the id-fingerprint could optionally include whitespace in the old format, whitespace in the id-fingerprint is not permitted in the new format.
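The encoding of [k=v] items into the SOCKS user and password fields can be sketched in Python as follows. The function names are hypothetical. The 255-byte limit per field comes from SOCKS5 username/password authentication (RFC 1929); filling the username first and spilling the remainder into the password is an assumption, since the text above only says to split "as necessary".

```python
def escape(text):
    """Backslash-escape backslashes and semicolons in a key or value."""
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_args(pairs, field_limit=255):
    """Encode (key, value) pairs for the SOCKS user and password fields.

    Returns a (username, password) tuple of bytes. The joined string is
    placed in the username and overflows into the password (assumed
    splitting strategy).
    """
    joined = ";".join(f"{escape(k)}={escape(v)}" for k, v in pairs)
    data = joined.encode("utf-8")
    if len(data) > 2 * field_limit:
        raise ValueError("arguments do not fit in the SOCKS auth fields")
    return data[:field_limit], data[field_limit:]
```

Called with the pairs ("rocks", "20") and ("height", "5.6m"), this yields the username "rocks=20;height=5.6m" and an empty password. A real client would also need to decide what to send when one of the fields ends up empty.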
Example: if the bridge line is "bridge trebuchet www.example.com:3333 keyid=09F911029D74E35BD84156C5635688C009F909F9 rocks=20 height=5.6m" AND if the Tor client knows that the 'trebuchet' method is supported, the client should connect to the proxy that provides the 'trebuchet' method, ask it to connect to www.example.com, and provide the string "rocks=20;height=5.6m" as the username, the password, or split across the username and password.

There are two ways to tell Tor clients about protocol proxies: external proxies and managed proxies. An external proxy is configured with

    ClientTransportPlugin <method> socks4 <address:port> [auth=X]

or

    ClientTransportPlugin <method> socks5 <address:port> [username=X] [password=Y]

as in "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999". This example tells Tor that another program is already running to handle 'trebuchet' connections, and Tor doesn't need to worry about it.

A managed proxy is configured with

    ClientTransportPlugin <methods> exec <path> [options]

as in "ClientTransportPlugin trebuchet exec /usr/libexec/trebuchet --managed". This example tells Tor to launch an external program to provide a socks proxy for 'trebuchet' connections. The Tor client only launches one instance of each external program with a given set of options, even if the same executable and options are listed for more than one method.

In managed proxies, <methods> can be a comma-separated list of pluggable transport method names, as in: "ClientTransportPlugin pawn,bishop,rook exec /bin/ptproxy --managed".

If instead of a transport method, the torrc lists "*" for a managed proxy, Tor uses that proxy for all transport methods that the plugin supports. So "ClientTransportPlugin * exec /usr/libexec/tor/foobar" tells Tor that Tor should use the foobar plugin for every method that the proxy supports. See the "Managed proxy interface" section below for details on how Tor learns which methods a plugin supports.
If two plugins support the same method, Tor should use whichever one is listed first. The same program can implement a managed or an external proxy: it just needs to take an argument saying which one to be. Server behavior Server proxies are configured similarly to client proxies. When launching a proxy, the server must tell it what ORPort it has configured, and what address (if any) it can listen on. The server must tell the proxy which (if any) methods it should provide if it can; the proxy needs to tell the server which methods it is actually providing, and on what ports. When a client connects to the proxy, the proxy may need a way to tell the server some identifier for the client address. It does this in-band. As before, the server lists proxies in its torrc. These can be external proxies that run on their own, or managed proxies that Tor launches. An external server proxy is configured as ServerTransportPlugin <method> proxy <address:port> <param=val> ... as in "ServerTransportPlugin trebuchet proxy 127.0.0.1:999 rocks=heavy". The param=val pairs and the address are used to make the bridge configuration information that we'll tell users. A managed proxy is configured as ServerTransportPlugin <methods> exec </path/to/binary> [options] or ServerTransportPlugin * exec </path/to/binary> [options] When possible, Tor should launch only one binary of each binary/option pair configured. So if the torrc contains ClientTransportPlugin foo exec /usr/bin/megaproxy --foo ClientTransportPlugin bar exec /usr/bin/megaproxy --bar ServerTransportPlugin * exec /usr/bin/megaproxy --foo then Tor will launch the megaproxy binary twice: once with the option --foo and once with the option --bar. Managed proxy interface When the Tor client or relay launches a managed proxy, it communicates via environment variables. 
At a minimum, it sets (in addition to the normal environment variables inherited from Tor):

{Client and server}

"TOR_PT_STATE_LOCATION" -- A filesystem directory path where the proxy should store state if it wants to. This directory is not required to exist, but the proxy SHOULD be able to create it if it doesn't. The proxy MUST NOT store state elsewhere.
    Example: TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/

"TOR_PT_MANAGED_TRANSPORT_VER" -- Tells the proxy which versions of this configuration protocol Tor supports. Future versions will give a comma-separated list. Clients MUST accept comma-separated lists containing any version that they recognize, and MUST work correctly even if some of the versions they don't recognize are non-numeric. Valid version characters are non-space, non-comma printing ASCII characters.
    Example: TOR_PT_MANAGED_TRANSPORT_VER=1,1a,2,4B

{Client only}

"TOR_PT_CLIENT_TRANSPORTS" -- A comma-separated list of which methods this client should enable, or * if all methods should be enabled. The proxy SHOULD ignore methods that it doesn't recognize.
    Example: TOR_PT_CLIENT_TRANSPORTS=trebuchet,battering_ram,ballista

{Server only}

"TOR_PT_EXTENDED_SERVER_PORT" -- An <address>:<port> where tor should be listening for connections speaking the extended ORPort protocol (See the "The extended ORPort protocol" section below). If tor does not support the extended ORPort protocol, it MUST use the empty string as the value of this environment variable.
    Example: TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:4200

"TOR_PT_ORPORT" -- Our regular ORPort in a form suitable for local connections, i.e. connections from the proxy to the ORPort.
    Example: TOR_PT_ORPORT=127.0.0.1:9001

"TOR_PT_SERVER_BINDADDR" -- A comma-separated list of <key>-<value> pairs, where <key> is a transport name and <value> is the address:port on which it should listen for client proxy connections.
The keys holding transport names must appear in the same order as they appear in TOR_PT_SERVER_TRANSPORTS. This might be the advertised address, or might be a local address that Tor will forward ports to. It MUST be an address that will work with bind().
    Example: TOR_PT_SERVER_BINDADDR=trebuchet-127.0.0.1:1984,ballista-127.0.0.1:4891

"TOR_PT_SERVER_TRANSPORTS" -- A comma-separated list of server methods that the proxy should support, or * if all methods should be enabled. The proxy SHOULD ignore methods that it doesn't recognize.
    Example: TOR_PT_SERVER_TRANSPORTS=trebuchet,ballista

The transport proxy replies by writing NL-terminated lines to stdout. The line metaformat is

    <Line> ::= <Keyword> <OptArgs> <NL>
    <Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar>
    <KeywordChar> ::= <any US-ASCII alphanumeric, dash, and underscore>
    <OptArgs> ::= <Args>*
    <Args> ::= <SP> <ArgChar> | <Args> <ArgChar>
    <ArgChar> ::= <any US-ASCII character but NUL or NL>
    <SP> ::= <US-ASCII whitespace symbol (32)>
    <NL> ::= <US-ASCII newline (line feed) character (10)>

Tor MUST ignore lines with keywords that it doesn't recognize.

First, if there's an error parsing the environment variables, the proxy should write:

    ENV-ERROR <errormessage>

and exit.

If the environment variables were correctly formatted, the proxy should write:

    VERSION <configuration protocol version>

to say that it supports this configuration protocol version (example "VERSION 1"). It must either pick a version that Tor told it about in TOR_PT_MANAGED_TRANSPORT_VER, or pick no version at all, say:

    VERSION-ERROR no-version

and exit.

The proxy should then open its ports. If running as a client proxy, it should not use fixed ports; instead it should autoselect ports to avoid conflicts. A client proxy should by default only listen on localhost for connections.

A server proxy SHOULD try to listen at a consistent port, though it SHOULD pick a different one if the port it last used is now allocated.
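The first steps a managed proxy takes on startup can be sketched as follows. This is a minimal Python illustration, assuming the proxy only implements configuration protocol version "1"; the function names negotiate_version and parse_bindaddr are hypothetical, and the error message texts are made up for the example.

```python
def negotiate_version(env, supported=("1",)):
    """Return the line a proxy would print in reply to Tor's version list."""
    raw = env.get("TOR_PT_MANAGED_TRANSPORT_VER")
    if raw is None:
        # Message text is illustrative; only the ENV-ERROR keyword is fixed.
        return "ENV-ERROR missing TOR_PT_MANAGED_TRANSPORT_VER"
    offered = raw.split(",")  # unknown or non-numeric versions are skipped
    for version in supported:
        if version in offered:
            return f"VERSION {version}"
    return "VERSION-ERROR no-version"


def parse_bindaddr(env):
    """Map each transport name to the (host, port) it should bind to.

    Raises ValueError if the keys are not in the same order as the
    methods listed in TOR_PT_SERVER_TRANSPORTS.
    """
    transports = env["TOR_PT_SERVER_TRANSPORTS"].split(",")
    result = {}
    for item in env["TOR_PT_SERVER_BINDADDR"].split(","):
        # Method names are C identifiers, so the first dash ends the key.
        name, dash, addrport = item.partition("-")
        host, colon, port = addrport.rpartition(":")
        if not dash or not colon or not port.isdigit():
            raise ValueError(f"malformed bind address entry: {item!r}")
        result[name] = (host, int(port))
    if transports != ["*"] and list(result) != transports:
        raise ValueError("bind addresses do not match transport order")
    return result
```

A proxy built along these lines would print the returned line followed by a newline on stdout, and exit after any ENV-ERROR or VERSION-ERROR line.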
A client or server proxy then should tell which methods it has made available and how. It does this by printing zero or more CMETHOD and SMETHOD lines to its stdout. These lines look like: CMETHOD <methodname> socks4/socks5 <address:port> [ARGS=arglist] \ [OPT-ARGS=arglist] as in CMETHOD trebuchet socks5 127.0.0.1:19999 ARGS=rocks,height \ OPT-ARGS=tensile-strength The ARGS field lists mandatory parameters that must appear in every bridge line for this method. The OPT-ARGS field lists optional parameters. If no ARGS or OPT-ARGS field is provided, Tor should not check the parameters in bridge lines for this method. The proxy should print a single "CMETHODS DONE" line after it is finished telling Tor about the client methods it provides. If it tries to supply a client method but can't for some reason, it should say: CMETHOD-ERROR <methodname> <errormessage> A proxy should also tell Tor about the server methods it is providing by printing zero or more SMETHOD lines. These lines look like: SMETHOD <methodname> <address:port> [options] If there's an error setting up a configured server method, the proxy should say: SMETHOD-ERROR <methodname> <errormessage> as in SMETHOD-ERROR trebuchet could not setup 'trebuchet' method The 'address:port' part of an SMETHOD line is the address to put in the bridge line. The Options part is a list of space-separated K:V flags that Tor should know about. Recognized options are: - FORWARD:1 If this option is set (for example, because address:port is not a publicly accessible address), then Tor needs to forward some other address:port to address:port via upnp-helper. Tor would then advertise that other address:port in the bridge line instead. - ARGS:K=V,K=V,K=V If this option is set, the K=V arguments are added to Tor's extrainfo document. - DECLARE:K=V,... If this option is set, the K=V options should be added as extension entries to the router descriptor, so clients and other relays can make use of it. 
See ideas/xxx-triangleboy-transport.txt for an example situation where the plugin would want to declare parameters to other Tors. - USE-EXTENDED-PORT:1 If this option is set, the server plugin is planning to connect to Tor's extended server port. SMETHOD and CMETHOD lines may be interspersed, to allow the proxies to report methods as they become available, even when some methods may require probing your network, connecting to some kind of peers, etc before they are set up. After the final SMETHOD line, the proxy says "SMETHODS DONE". The proxy SHOULD NOT tell Tor about a server or client method unless it is actually open and ready to use. Tor clients SHOULD NOT use any method from a client proxy or advertise any method from a server proxy UNLESS it is listed as a possible method for that proxy in torrc, and it is listed by the proxy as a method it supports. Proxies should respond to a single INT signal by closing their listener ports and not accepting any new connections, but keeping all connections open, then terminating when connections are all closed. Proxies should respond to a second INT signal by shutting down cleanly. The managed proxy configuration protocol version defined in this section is "1". So, for example, if tor supports this configuration protocol it should set the environment variable: TOR_PT_MANAGED_TRANSPORT_VER=1 The Extended ORPort protocol The Extended ORPort protocol is described in proposal 196. Advertising bridge methods Bridges put the 'method' lines in their extra-info documents. transport SP <transportname> SP <address:port> [SP arglist] NL The address:port are as returned from an SMETHOD line (unless they are replaced by the FORWARD: directive). The arglist is a K=V,... list as returned in the ARGS: part of the SMETHOD line's Options component. 
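To illustrate how an SMETHOD line maps to the extra-info "transport" line, here is a minimal Python sketch. The helper names parse_smethod and transport_line are hypothetical, and the forwarded address is passed in by the caller because the port-forwarding step itself is outside this example.

```python
def parse_smethod(line):
    """Split 'SMETHOD <methodname> <address:port> [options]' into parts."""
    words = line.rstrip("\n").split(" ")
    if len(words) < 3 or words[0] != "SMETHOD":
        raise ValueError("not an SMETHOD line")
    options = {}
    for word in words[3:]:
        # Options are K:V flags such as FORWARD:1 or ARGS:k=v,k=v.
        key, colon, value = word.partition(":")
        if not colon:
            raise ValueError(f"malformed option: {word!r}")
        options[key] = value
    return {"method": words[1], "addrport": words[2], "options": options}


def transport_line(smethod, forwarded_addrport=None):
    """Build the extra-info 'transport' line for a parsed SMETHOD line."""
    addrport = smethod["addrport"]
    if smethod["options"].get("FORWARD") == "1":
        # With FORWARD:1, the advertised address is the forwarded one.
        if forwarded_addrport is None:
            raise ValueError("FORWARD:1 set but no forwarded address known")
        addrport = forwarded_addrport
    parts = ["transport", smethod["method"], addrport]
    arglist = smethod["options"].get("ARGS")
    if arglist:
        parts.append(arglist)
    return " ".join(parts) + "\n"
```

The address 203.0.113.5 used when trying this out is a documentation address, not one taken from the proposal.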
If the SMETHOD line includes a DECLARE: part, the router descriptor gets a new line: transport-info SP <transportname> [SP arglist] NL Bridge authority behavior We need to specify a way to test different transport methods that bridges claim to support. We should test as many as possible. We should NOT require that we have a way to test every possible transport method before we allow its use: the point of this design is to remove bottlenecks in transport deployment. Bridgedb behavior Bridgedb can, given a set of router descriptors and their corresponding extrainfo documents, generate a set of bridge lines for each bridge. Bridgedb may want to avoid handing out methods that seem to get bridges blocked quickly. Implementation plan First, we should implement per-bridge proxies via the "external proxy" method described in "Specifications: Client behavior". Also, we'll want to build the extended-server-port mechanism. This will let bridges run transport proxies such that they can generate bridge lines to give to clients for testing, so long as the user configures and launches their proxies on their own. Once that's done, we can see if we need any managed proxies, or if the whole idea there is silly. If we do, the next most important part seems to be getting the client-side automation part written. And once that's done, we can evaluate how much of the server side is easy for people to do and how much is hard. The "obfsproxy" obfuscating proxy is a likely candidate for an initial transport (trac entry #2760), as is Steven Murdoch's http thing (trac entry #2759) or something similar. Notes on plugins to write We should ship a couple of null plugin implementations in one or two popular, portable languages so that people get an idea of how to write the stuff. 1. We should have one that's just a proof of concept that does nothing but transfer bytes back and forth. 2. 
We should implement DNS or HTTP using other software (as Geoff Goodell did years ago with DNS) as an example of wrapping existing code into our plugin model. 3. The obfuscated-ssh superencipherment is pretty trivial and pretty useful. It makes the protocol stringwise unfingerprintable. 4. If we do a raw-traffic proxy, openssh tunnels would be the logical choice. Appendix: recommendations for transports Be free/open-source software. Also, if you think your code might someday do so well at circumvention that it should be implemented inside Tor, it should use the same license as Tor. Tor already uses OpenSSL, Libevent, and zlib. Before you go and decide to use crypto++ in your transport plugin, ask yourself whether OpenSSL wouldn't be a nicer choice. Be portable: most Tor users are on Windows, and most Tor developers are not, so designing your code for just one of these platforms will make it either get a small userbase, or poor auditing. Think secure: if your code is in a C-like language, and it's hard to read it and become convinced it's safe, then it's probably not safe. Think small: we want to minimize the bytes that a Windows user needs to download for a transport client. Avoid security-through-obscurity if possible. Specify. Resist trivial fingerprinting: There should be no good string or regex to search for to distinguish your protocol from protocols permitted by censors. Imitate a real profile: There are many ways to implement most protocols -- and in many cases, most possible variants of a given protocol won't actually exist in the wild.
Filename: 181-optimistic-data-client.txt Title: Optimistic Data for Tor: Client Side Author: Ian Goldberg Created: 2-Jun-2011 Status: Closed Implemented-In: 0.2.3.3-alpha Overview: This proposal (as well as its already-implemented sibling concerning the server side) aims to reduce the latency of HTTP requests in particular by allowing: 1. SOCKS clients to optimistically send data before they are notified that the SOCKS connection has completed successfully 2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT state 3. Exit nodes to accept and queue DATA cells while in the EXIT_CONN_STATE_CONNECTING state This particular proposal deals with #1 and #2. For more details (in general and for #3), see the sibling proposal 174 (Optimistic Data for Tor: Server Side), which has been implemented in 0.2.3.1-alpha. Motivation: This change will save one OP<->Exit round trip (down to one from two). There are still two SOCKS Client<->OP round trips (negligible time) and two Exit<->Server round trips. Depending on the ratio of the Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will decrease the latency by 25 to 50 percent. Experiments validate these predictions. [Goldberg, PETS 2010 rump session; see https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ] Design: Currently, data arriving on the SOCKS connection to the OP on a stream in AP_CONN_STATE_CONNECT_WAIT is queued, and transmitted when the state transitions to AP_CONN_STATE_OPEN. Instead, when data arrives on the SOCKS connection to the OP on a stream in AP_CONN_STATE_CONNECT_WAIT (connection_edge_process_inbuf): - Check to see whether optimistic data is allowed at all (see below). - Check to see whether the exit node for this stream supports optimistic data (according to tor-spec.txt section 6.2, this means that the exit node's version number is at least 0.2.3.1-alpha). 
If you don't know the exit node's version number (because it's not in your hashtable of fingerprints, for example), assume it does *not* support optimistic data. - If both are true, transmit the data on the stream. Also, when a stream transitions *to* AP_CONN_STATE_CONNECT_WAIT (connection_ap_handshake_send_begin), do the above checks, and immediately send any already-queued data if they pass. SOCKS clients (e.g. polipo) will also need to be patched to take advantage of optimistic data. The simplest solution would seem to be to just start sending data immediately after sending the SOCKS CONNECT command, without waiting for the SOCKS server reply. When the SOCKS client starts reading data back from the SOCKS server, it will first receive the SOCKS server reply, which may indicate success or failure. If success, it just continues reading the stream as normal. If failure, it does whatever it used to do when a SOCKS connection failed. Security implications: ORs (for sure the Exit, and possibly others, by watching the pattern of packets), as well as possibly end servers, will be able to tell that a particular client is using optimistic data. This of course has the potential to fingerprint clients, dividing the anonymity set. The usual kind of solution is suggested: - There is a boolean consensus parameter UseOptimisticData. - There is a 3-state (-1, 0, 1) configuration parameter UseOptimisticData (or give it a distinct name if you like) defaulting to -1. - If the configuration parameter is -1, the OP obeys the consensus value; otherwise, it obeys the configuration parameter. It may be wise to set the consensus parameter to 1 at the same time as similar other client protocol changes are made (for example, a new circuit construction protocol) in order to not further subdivide the anonymity set. Specification: The current tor-spec has already been updated by proposal 174 to handle optimistic data. It says, in part: If the exit node does not support optimistic data (i.e. 
its version number is before 0.2.3.1-alpha), then the OP MUST wait for a RELAY_CONNECTED cell before sending any data. If the exit node supports optimistic data (i.e. its version number is 0.2.3.1-alpha or later), then the OP MAY send RELAY_DATA cells immediately after sending the RELAY_BEGIN cell (and before receiving either a RELAY_CONNECTED or RELAY_END cell). Should the "MAY" be more specific, referring to the consensus parameters? Or does the existence of the configuration parameter override mean it's really "MAY", regardless? Compatibility: There are compatibility issues, as mentioned above. OPs MUST NOT send optimistic data to Exit nodes whose version numbers predate 0.2.3.1-alpha. OPs MAY send optimistic data to Exit nodes whose version numbers match or follow that value. Implementation: My git diff is 42 lines long (+17 lines, -1 line), changing only the two functions mentioned above (connection_edge_process_inbuf and connection_ap_handshake_send_begin). This diff does not, however, handle the configuration options, or check the version number of the exit node. I have patched a command-line SOCKS client (webfetch) to use optimistic data. I have not attempted to patch polipo, but I have looked at it a bit, and it seems pretty straightforward. (Of course, if and when polipo is deprecated, whatever else speaks SOCKS to the OP should take advantage of optimistic data.) Performance and scalability notes: OPs may queue a little more data, if the SOCKS client pushes it faster than the OP can write it out. But that's also true today after the SOCKS CONNECT returns success, right?
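The client-side checks from the Design and Security implications sections of this proposal can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the version comparison is a simplification that looks at the numeric components only and ignores status tags such as "-alpha".

```python
import re

# First version whose exits support optimistic data: 0.2.3.1-alpha.
MIN_OPTIMISTIC_VERSION = (0, 2, 3, 1)


def numeric_version(text):
    """Return the numeric part of a Tor version as a tuple, or None."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(group or 0) for group in match.groups())


def optimistic_data_allowed(config_value, consensus_value):
    """Apply the 3-state option: -1 defers to the consensus parameter."""
    if config_value == -1:
        return bool(consensus_value)
    return config_value == 1


def may_send_optimistic_data(config_value, consensus_value, exit_version):
    """Decide whether the OP may send data before RELAY_CONNECTED."""
    if not optimistic_data_allowed(config_value, consensus_value):
        return False
    if exit_version is None:
        return False  # unknown exit version: assume no support
    parsed = numeric_version(exit_version)
    return parsed is not None and parsed >= MIN_OPTIMISTIC_VERSION
```

Both checks are cheap, so they can run each time data arrives on a stream in AP_CONN_STATE_CONNECT_WAIT and again when a stream enters that state.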
Filename: 182-creditbucket.txt Title: Credit Bucket Author: Florian Tschorsch and Björn Scheuermann Created: 22 Jun 2011 Status: Obsolete Note: Obsolete because we no longer have a once-per-second bucket refill. Overview: The following proposal targets the reduction of queuing times in onion routers. In particular, we focus on the token bucket algorithm in Tor and point out that current usage unnecessarily locks cells for long time spans. We propose a non-intrusive change in Tor's design which overcomes the deficiencies. Motivation and Background: Cell statistics from the Tor network [1] reveal that cells reside in individual onion routers' cell queues for up to several seconds. These queuing times increase the end-to-end delay very significantly and are apparently the largest contributor to overall cell latency in Tor. In Tor there exist multiple token buckets on different logical levels. They all work independently. They are used to limit the up- and downstream of an onion router. All token buckets are refilled every second with a constant amount of tokens that depends on the configured bandwidth limits. For example, the so-called RelayedTokenBucket limits relay traffic only. All read data of incoming connections are bound to a dedicated read token bucket. An analogous mechanism exists for written data leaving the onion router. We were able to identify the specific usage and implementation of the token bucket algorithm as one cause for very high (and unnecessary) queuing times in an onion router. We observe that the token buckets in Tor are (surprisingly at a first glance) allowed to take on negative fill levels. This is justified by the TLS connections between onion routers where whole TLS records need to be processed. The token bucket on the incoming side (i.e., the one which determines at which rate it is allowed to read from incoming TCP connections) in particular often runs into non-negligible negative fill levels. 
As a consequence of this behavior, sometimes slightly more data is read than would be admissible upon strict interpretation of the token bucket concept.

However, the token bucket for limiting the outgoing rate does not take on negative fill levels equally often. Consequently, it regularly happens that somewhat more data are read on the incoming side than the outgoing token bucket allows to be written during the same cycle, even if their configured data rates are the same. The respective cells will thus not be allowed to leave the onion router immediately. They will thus necessarily be queued for at least as long as it takes until the token bucket on the outgoing side is refilled again. The refill interval currently is, as mentioned before, one second -- so, these cells are delayed for a very substantial time. In summary, one could say that the two buckets, on the incoming and outgoing side, work like a double door system and frequently lock cells for a full token bucket refill interval length.

General Design:

In order to overcome the described problem, we propose the following changes related to the token bucket algorithm.

We observe that the token bucket on the outgoing connections with its current design is counterproductive with respect to queuing times. We therefore propose modifications to the token bucket algorithm that will eliminate the "double door effect" discussed above.

Let us start from Tor's current approach: we have a regular token bucket on the reading side with a certain rate and a certain burst size. Let x denote the current amount of tokens in the bucket. On the outgoing side we need something appropriate that monitors and constrains the outgoing rate, but at the same time avoids holding back cells (cf. double door effects) whenever possible.

Here we propose something that adopts the role of a token bucket, but realizes this functionality in a slightly different way. We call it a "credit bucket".
Like a token bucket, the credit bucket also has a current fill level, denoted by y. However, the credit bucket is refilled in a different way.

To understand how it works, let us look at the possible operations:

As said, x is the fill level of a regular token bucket on the incoming side and thus gets incremented periodically according to the configured rate. No changes here.

If x<=0, we are obviously not allowed to read. If x>0, we are allowed to read up to x bytes of incoming data. If k bytes are read (k<=x), then we update x and y as follows:

    x = x - k    (1)
    y = y + k    (2)

(1) is the standard token bucket operation on the incoming side. Whenever data is admitted in, though, an additional operation is performed: (2) allocates the same number of bytes on the outgoing side, which will later on allow the same number of bytes to leave the onion router without any delays.

If y + x > -M, we are allowed to write up to y + x + M bytes on the outgoing side, where M is a positive constant. M specifies a burst size for the outgoing side. M should be higher than the number of tokens that get refilled during a refill interval; we would suggest having M on the order of a few seconds' "worth" of data. Now if k bytes are written on the outgoing side, we proceed as follows:

If k <= y then y = y - k

    In this case we use "saved" credits, previously allocated on the incoming side when incoming data has been processed.

If k > y then x = x - (k-y) and y = 0 (x is updated using the old value of y, before y is reset)

    We generated additional traffic in the onion router, so that more data is to be sent than has been read (the credit is not sufficient). We therefore "steal" tokens from the token bucket on the incoming side to compensate for the additionally generated data. This will result in correspondingly less data being read on the incoming side subsequently. As a result of such an operation, the token bucket fill level x on the incoming side may become negative (but it can never fall below -M).

If y + x <= -M then outgoing data will be held back.
This may lead to double-door effects, but only in extreme cases where the outgoing traffic largely exceeds the incoming traffic, so that the outgoing burst size M is exceeded.

Aside from short-term bursts of configurable size (as with every token bucket), this procedure guarantees that the configured rate may never be exceeded (on the application layer, that is; as with the current implementation, an attacker may easily cause the onion router to arbitrarily exceed the limits on the lower layers). Over time, we never send more data than the configured rate: every sent byte needs a corresponding token on the incoming side; this token must either have been consumed by an incoming byte before (it then became a "credit"), or it is "stolen" from the incoming bucket to compensate for data generated within the onion router.

Specific Design Changes:

In the following we briefly point out the specific changes that need to be made in Tor's source code. By doing so one can see how non-intrusive our modifications are.

First we need to address the bucket increment and decrement operations. According to the logic described above, this should be done in the methods connection_bucket_refill and connection_buckets_decrement respectively. In particular, allocating, saving and "stealing" of tokens need to be considered here.

Second, the rate limiting, i.e. the amount we are allowed to write (connection_bucket_write_limit), needs to be adapted in line with the credit bucket logic. That is, in order to avoid the unnecessary queuing of cells identified here, we need to consider the new burst parameter M. Here we also need to take non-rate-limited connections, such as those from the localhost, into account. The rate limiting on the reading side remains the same.

Finally, we need to find good values/ratios for the parameter M such that the trade-off between avoiding "double door effects" and maintaining strict rate limits works as expected.
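The read and write rules from the General Design section can be sketched in Python as follows. This is a standalone illustration rather than a patch against Tor's bucket code: the class name CreditBucket, the read_burst cap used when refilling, and the method names are assumptions made for the example.

```python
class CreditBucket:
    """Token bucket (x) on the incoming side plus credits (y) for output."""

    def __init__(self, burst_m, read_burst, x=0, y=0):
        self.m = burst_m            # M: burst size for the outgoing side
        self.read_burst = read_burst  # assumed cap on x when refilling
        self.x = x                  # incoming token bucket fill level
        self.y = y                  # credits saved for outgoing data

    def refill(self, tokens):
        """Periodic refill of the incoming bucket; y is never refilled."""
        self.x = min(self.x + tokens, self.read_burst)

    def read_limit(self):
        """Bytes we may read now: up to x if x > 0, else nothing."""
        return max(self.x, 0)

    def on_read(self, k):
        """Account for k bytes read: operations (1) and (2)."""
        if k > self.read_limit():
            raise ValueError("read exceeds the token bucket fill level")
        self.x -= k
        self.y += k

    def write_limit(self):
        """Bytes we may write now: y + x + M if positive, else nothing."""
        return max(self.y + self.x + self.m, 0)

    def on_write(self, k):
        """Account for k bytes written, stealing tokens if credit runs out."""
        if k > self.write_limit():
            raise ValueError("write exceeds the allowed amount")
        if k <= self.y:
            self.y -= k
        else:
            self.x -= k - self.y  # uses the old y, then y is reset
            self.y = 0
```

For example, starting from x = 50 and M = 100, reading 30 bytes leaves x = 20 and y = 30; writing 40 bytes then uses all 30 credits and steals 10 tokens, leaving x = 10 and y = 0.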
As future work and after insights about the performance gain of the here described proposal we need to find a way to implement this both using bufferevent rate limiting with libevent 2.3.x and Tor's rate limiting code. Conclusion: This proposal can be implemented with moderate effort and requires changes only at the points where currently the token bucket operations are performed. We feel that this is not the be-all and end-all solution, because it again introduces a feedback loop between the incoming and the outgoing side. We therefore still hope that we will be able to come to a both simpler and more effective design in the future. However, we believe that what we proposed here is a good compromise between avoiding double-door effects to the furthest possible extent, strictly enforcing an application-layer data rate, and keeping the extent of changes to the code small. Feedback is highly appreciated. References: [1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009. [2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011
Filename: 183-refillintervals.txt
Title: Refill Intervals
Author: Florian Tschorsch and Björn Scheuermann
Created: 03-Dec-2010
Status: Closed
Implemented-In: 0.2.3.5-alpha

Overview:

In order to avoid additional queuing and bursty traffic, the refill interval of the token bucket algorithm should be shortened. Thus we propose a configurable parameter that sets the refill interval accordingly.

Motivation and Background:

In Tor there exist multiple token buckets on different logical levels. They all work independently. They are used to limit the up- and downstream of an onion router. All token buckets are refilled every second with a constant amount of tokens that depends on the configured bandwidth limits. The very coarse-grained refill interval of one second has detrimental effects.

First, consider an onion router with multiple TLS connections over which cells arrive. If there is high activity (i.e., many incoming cells in total), then the coarse refill interval will cause unfairness. Consider a circuit C, and assume (just for simplicity) that C doesn't share its TLS connection with any other circuit. Moreover, assume that C hasn't transmitted any data for some time (e.g., due to a typical bursty HTTP traffic pattern). Consequently, there are no cells from this circuit in the incoming socket buffers. When the buckets are refilled, the incoming token bucket will immediately spend all its tokens on other incoming connections. Now assume that cells from C arrive soon after. For fairness' sake, these cells should be serviced in a timely manner -- circuit C hasn't received any bandwidth for a significant time before. However, it will take a very long time (one refill interval) before the current implementation will fetch these cells from the incoming TLS connection, because the token bucket will remain empty for a long time. Just because the cells happened to arrive at the "wrong" point in time, they must wait.
Such situations may occur even though the configured admissible incoming data rate is not exceeded by incoming cells: the long refill intervals often lead to an operational state where all the cells that were admissible during a given one-second period are queued until the end of this second, before the onion router even just starts processing them. This results in unnecessary, long queuing delays in the incoming socket buffers. These delays are not visible in the Tor circuit queue delay statistics [1].

Finally, the coarse-grained refill intervals result in a very bursty outgoing traffic pattern at the onion routers (one large chunk of data once per second, instead of smooth transmission progress). This is undesirable, since such a traffic pattern can interfere with TCP's control mechanisms and can be the source of suboptimal TCP performance on the TLS links between onion routers.

Specific Changes:

The token buckets should be refilled more often, with a correspondingly smaller amount of tokens. For instance, the buckets might be refilled every 10 milliseconds with one-hundredth of the amount of data admissible per second. This will help to overcome the problem of unfairness when reading from the incoming socket buffers. At the same time it smooths the traffic leaving the onion routers. We are aware that this latter change has apparently been discussed before [2]; we are not sure why this change has not been implemented yet.

In particular we need to change the current implementation in Tor, which always triggers refilling after exactly one second. Instead the refill event should fire more frequently. The smaller time intervals between each refill action need to be taken into account for the number of tokens that are added to the bucket.

With libevent 2.x and bufferevents enabled, smaller refill intervals are already considered but hard-coded. This should be changed to a configurable parameter, too.
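The scaling of the refill amount with the interval length can be sketched in Python as follows. The class name RefillSchedule is hypothetical, and carrying the rounding remainder over to the next refill is an assumption of this sketch (so that the per-second total stays exact), not something the proposal specifies.

```python
class RefillSchedule:
    """Splits a per-second token budget over shorter refill intervals."""

    def __init__(self, rate_per_second, interval_ms):
        if not 0 < interval_ms <= 1000:
            raise ValueError("interval must be between 1 and 1000 ms")
        self.rate_per_second = rate_per_second
        self.interval_ms = interval_ms
        self._carry = 0  # remainder, in units of token-milliseconds

    def next_refill(self):
        """Return the number of tokens to add at the next refill event."""
        self._carry += self.rate_per_second * self.interval_ms
        tokens, self._carry = divmod(self._carry, 1000)
        return tokens


# A 10 ms interval adds one-hundredth of the per-second amount each time.
schedule = RefillSchedule(rate_per_second=1_000_000, interval_ms=10)
per_refill = schedule.next_refill()
```

With rates that do not divide evenly, individual refills differ by at most one token while the sum over one second still equals the configured rate.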
Conclusion: This proposal can be implemented with moderate effort and requires changes only at the points where the token bucket operations are currently performed. This change will also be a good starting point for further enhancements to improve queuing times in Tor. I.e. it will pave the ground for other means that tackle this problem. Feedback is highly appreciated. References: [1] Karsten Loesing. Analysis of Circuit Queues in Tor. August 25, 2009. [2] https://trac.torproject.org/projects/tor/wiki/sponsors/SponsorD/June2011
Filename: 184-v3-link-protocol.txt Title: Miscellaneous changes for a v3 Tor link protocol Author: Nick Mathewson Created: 19-Sep-2011 Status: Closed Target: 0.2.3.x Overview: When proposals 176 and 179 are implemented, Tor will have a new link protocol. I propose two simple improvements for the v3 link protocol: a more partitioned set of which types indicate variable-length cells, and a better way to handle link padding if and when we come up with a decent scheme for it. Motivation: We're getting a new link protocol in 0.2.3.x, thanks (again) to TLS fingerprinting concerns. When we do, it'd be nice to take care of some small issues that require a link protocol version increment. First, our system for introducing new variable-length cell types has required a protocol increment for each one. Unlike fixed-length (512 byte) cells, we can't add new variable-length cells in the existing link protocols and just let older clients ignore them, because unless the recipient knows which cells are variable-length, it will treat them as 512-byte cells and discard too much of the stream or too little. In the past, it's been useful to be able to introduce new cell types without having to increment the link protocol version. Second, once we have our new TLS handshake in place, we will want a good way to address the remaining fingerprinting opportunities. Some of those will likely involve traffic volume. We can't fix that easily with our existing PADDING cell type, since PADDING cells are fixed-length, and wouldn't be so easy to use to break up our TLS record sizes. Design: Indicating variable-length cells. Beginning with the v3 link protocol, we specify that all cell types in the range 128..255 indicate variable-length cells. Cell types in the range 0..127 are still used for 512-byte cells, except that the VERSIONS cell type (7) also indicates a variable-length cell (for backward compatibility). As before, all Tor instances must ignore cells with types that they don't recognize. 
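The rule for recognizing variable-length cells can be sketched in Python as follows. The function name is hypothetical, and the behavior for link protocol versions before 3 (only VERSIONS is treated as variable-length) is an assumption of this sketch rather than something stated in this proposal.

```python
VERSIONS_CELL = 7  # the VERSIONS cell type named in the proposal


def is_variable_length_cell(command, link_version):
    """Return True if a cell with this command byte is variable-length."""
    if not 0 <= command <= 255:
        raise ValueError("cell command must fit in one byte")
    if command == VERSIONS_CELL:
        return True  # kept variable-length for backward compatibility
    # From v3 on, the whole range 128..255 is variable-length.
    return link_version >= 3 and command >= 128
```

Because the test depends only on the command byte, a v3 recipient can skip an unrecognized cell of either kind without losing its place in the stream.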
Design: Variable-length padding. We add a new variable-length cell type, "VPADDING", to be used for padding. All Tor instances may send a VPADDING cell at any point that a VERSIONS cell is not required; a VPADDING cell's body may be any length; the body of a VPADDING cell MAY have any content. Upon receiving a VPADDING cell, the recipient should drop it, as with a PADDING cell. (This does not give a way to send fewer than 5 bytes of padding. We could add this in the future, in a new link protocol.) Implementations SHOULD fill the content of all padding cells randomly. A note on padding: We do not specify any situation in which a node ought to generate a VPADDING cell; that's left for future work. Implementors should be aware that many schemes have been proposed for link padding that do not in fact work as well as one would expect. We recommend that no mainstream implementation should produce padding in an attempt to resist traffic analysis, without real research showing that it helps. Interaction with proposal 176: Proposal 176 says that during the v3 handshake, no cells other than VERSIONS, AUTHENTICATE, AUTH_CHALLENGE, CERT, and NETINFO are allowed, and those are only allowed in their standard order. If this proposal is accepted, then VPADDING cells should also be allowed in the handshake at any point after the VERSIONS cell. They should be included when computing the "SLOG" and "CLOG" handshake-digest fields of the AUTHENTICATE cell. Notes on future-proofing: It may be in the future we need a new cell format that is neither the original 512-byte format nor the variable-length format. If we do, we can just increment the link protocol version number again. Right now we have 10 cell types; with this proposal and proposal 176, we will have 14. 
It's unlikely that we'll run out any time soon, but if we start to approach the number 64 with fixed-length cell types or 196 with var-length cell types, we should consider tweaking the link protocol to have a variable-length cell type encoding.
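As a rough illustration of the cell-type rule in the design above, the following Python sketch classifies a cell command as fixed-length or variable-length. The function and constant names are illustrative and not taken from any Tor implementation; the handling of link versions before v3 is an assumption based on the statement that only VERSIONS needed backward-compatible treatment.

```python
# Sketch only: names are illustrative, not from the Tor source.
VERSIONS = 7          # cell type number given in the proposal text
FIXED_CELL_LEN = 512  # fixed-length cell size given in the proposal text


def is_variable_length(command: int, link_version: int = 3) -> bool:
    """Return True if `command` denotes a variable-length cell.

    For link protocol v3 and later, types 128..255 are variable-length,
    and VERSIONS (7) stays variable-length for backward compatibility.
    """
    if not 0 <= command <= 255:
        raise ValueError("cell command must fit in one byte")
    if link_version < 3:
        # Assumption: before v3, VERSIONS is the only variable-length type.
        return command == VERSIONS
    return command == VERSIONS or command >= 128
```

A receiver that does not recognize a command can still use this check to decide how many bytes to skip, which is the point of partitioning the type space.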
Filename: 185-dir-without-dirport.txt Title: Directory caches without DirPort Author: Nick Mathewson Created: 20-Sep-2011 Status: Superseded Superseded-by: 237 Overview: Exposing a directory port is no longer necessary for running as a directory cache. This proposal suggests that we eliminate that requirement, and describes how. Motivation: Now that we tunnel directory connections by default, it is no longer necessary to have a DirPort to be a directory cache. In fact, bridges act as directory caches but do not actually have a DirPort exposed. It would be nice and tidy to expand that property to the rest of the network. Configuration: Add a new torrc option, "DirCache". Its values can be "0", "1", and "auto". If it is 0, we never act as a directory cache, even if DirPort is set. If it is 1, then we act as a directory cache according to the same rules as those used for nodes that set a DirPort. If it is "auto", then Tor decides whether to act as a directory cache based on some future intelligent algorithm. "Auto" should be the new default. Advertising cache status: Nodes that are running as a directory cache should set the entry "dir-cache 1" in their router descriptors. If they do not have a DirPort set, or do not have a working DirPort, they should give their directory port as 0 in their router lines. (Nodes that have a working directory port advertise it as usual, and also include a "dir-cache" line. Nodes that do not serve directory information should set their directory port to 0, and not include any dir-cache line. Implementations should accept and ignore dir-cache lines with values other than "dir-cache 1".) Consensus: Authorities should assign a "DirCache" flag to all nodes running as a directory cache. This does not require a new version of the consensus algorithm.
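The advertising rules above can be summarized in a short Python sketch. The function names and the way a "working DirPort" is represented (an integer, or None when absent) are assumptions made for illustration; the "auto" decision is left out because the proposal defers it to a future algorithm.

```python
from typing import Iterable, Optional, Tuple


def directory_fields(acts_as_cache: bool,
                     working_dirport: Optional[int]) -> Tuple[int, bool]:
    """Return (dir port for the router line, whether to emit "dir-cache 1").

    Hypothetical helper: a node that is not a cache advertises port 0 and
    no dir-cache line; a cache advertises its working DirPort if it has
    one, otherwise 0, and always emits the dir-cache line.
    """
    if not acts_as_cache:
        return 0, False
    return (working_dirport or 0), True


def advertises_dir_cache(descriptor_lines: Iterable[str]) -> bool:
    """Only the exact entry "dir-cache 1" counts; other values are ignored."""
    return any(line.strip() == "dir-cache 1" for line in descriptor_lines)
```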
Filename: 186-multiple-orports.txt Title: Multiple addresses for one OR or bridge Author: Nick Mathewson Created: 19-Sep-2011 Supersedes: 118 Status: Closed Target: 0.2.4.x+ Status: This proposal is partially implemented to the extent needed to allow nodes to have one IPv4 and one IPv6 address. Overview: This document is a proposal for servers to advertise multiple address/port combinations for their ORPort. It supersedes proposal 118. Motivation: Sometimes servers want to support multiple ports for incoming connections, either in order to support multiple address families (i.e., to add IPv6 support), to better use multiple interfaces, or to support a variety of FascistFirewallPorts settings. This is easy to set up now, but there's no way to advertise it to clients. Configuring additional addresses and ports: In consonance with our changes to the (Socks|Trans|NATD|DNS)Port options made in 0.2.3.x for proposal 171, I make a corresponding change to allow multiple ORPort options and deprecate ORListenAddress. The new syntax will be:

    "ORPort" PortDescription Option*
    Option = "NoAdvertise" | "NoListen" | "AllAddrs" | "IPv4Only" | "IPv6Only"
    PortDescription = PORTLIST | ADDRESS ":" PORTLIST | Hostname ":" PORTLIST

(PORTLIST and ADDRESS are defined below.) The 'NoAdvertise' option performs the function of the old ORListenAddress option. If it is set, we bind a port, but don't put it in our descriptor. The 'NoListen' option tells Tor to advertise an address, but not bind to it. The operator needs to use some other mechanism to ensure that ports are redirected to ports that _are_ listened on. The 'AllAddrs' option tells Tor that if no address is given in the PortDescription part, we should bind/advertise every one of our publicly visible unicast addresses; and that if a hostname address is given in the PortDescription, we should bind/advertise every publicly visible unicast address that the hostname resolves to. (Q: Should this be on by default?) 
The 'IPv4Only' and 'IPv6Only' options tell Tor to interpret such situations as applying only to IPv4 addresses or to IPv6 addresses. As with the client *Port options, only the old format or the new format is allowed, not a mix of the two: either a single numeric ORPort and zero or more ORListenAddress options, or a set of one or more ORPorts in the new extended format. In current operating systems (unless we get into crazy nonportable tricks) we need to use one socket for every address:port that Tor binds on. As a sanity check, we can limit the number of such sockets we use to, say, something between 8 and 64. If you want to bind lots of address:port combinations, you'll want to do it at the firewall/routing level.

Example: We want to bind on 0.0.0.0:9001

    ORPort 9001

Example: Our firewall is redirecting ports 80, 443, and 7000 on all hosts in 18.244.2.0 onto our port 2929.

    ORPort 2929 noadvertise
    ORPort 18.244.2.0:80,443,7000 nolisten

Example: We have a dynamic DNS provider that maps tornode.example.com to our current external IPv4 and IPv6 addresses. Our firewall forwards port 443 on those addresses to our port 1337.

    ORPort 1337 noadvertise alladdrs
    ORPort tornode.example.com:443 nolisten alladdrs

Self-testing: Right now, Tor nodes need to check every port that they advertise before they declare themselves reachable. If a Tor node has a lot of advertised ports, that could be prohibitive. Instead, it should try a sample of ports for each address. It should not advertise any given ORPort line until it has tried extending to or connecting to a sample of the address/port combinations. It will now be possible for a Tor node to find that some addresses work and others do not. In this case, the node should only advertise ORPort lines that have been checked. (As a consequence, the node should not advertise any address unless at least one ORPort without nolisten has been specified.) 
{Until support is added for extend cells to IPv6 addresses, it will only be possible to test IPv6 addresses by connecting directly. We might want to just skip self-testing those until we have IPv6 extend support.} New descriptor syntax: We add a new line in the router descriptor, "or-address". This line can occur zero, one, or multiple times. Its format is:

    or-address SP ADDRESS ":" PORTLIST NL
    ADDRESS = IPV6ADDR | IPV4ADDR
    IPV6ADDR = an ipv6 address, surrounded by square brackets.
    IPV4ADDR = an ipv4 address, represented as a dotted quad.
    PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
    PORTSPEC = PORT
    PORT = a number between 1 and 65535 inclusive.

[This is the regular format for specifying sets of addresses and ports in Tor.] A descriptor should not include an or-address line that does nothing but duplicate the address:port pair from its "router" line. A node must not list more than 8 or-address lines. A PORTLIST must have no more than 16 PORTSPEC entries, and its entries must be disjoint. (Q: Any reason to allow more than 2? Multiple interfaces, I guess.) New authority behavior: The same rationale applies as for self-testing. An authority needs to test the main address:port from the router line, and every or-address line. For or-address lines that contain multiple ports, it needs to test all of them if they are few, or a sample if they are not. An authority shouldn't list a node as Running unless every or-address line it advertises looks like it will work. Consensus directories and microdescriptors: We introduce a new line type for microdescriptors and consensuses, "a". Each "a" line has the same format as an or-address line. The "a" lines (if any) appear immediately after the "r" line for a router in the consensus, and immediately after the "onion-key" entry in a microdescriptor. 
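To make the or-address grammar from the "New descriptor syntax" part concrete, here is a minimal Python sketch of a parser for a single line, which by the text above would apply equally to "a" lines apart from the keyword. The function name and error handling are assumptions; only the grammar and the 16-port and disjointness limits come from the proposal. The limit of 8 or-address lines per descriptor would be enforced by the caller.

```python
import ipaddress
from typing import List, Tuple, Union

MAX_PORTS_PER_LINE = 16  # limit stated in the proposal

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_or_address(line: str) -> Tuple[Address, List[int]]:
    """Parse 'or-address SP ADDRESS ":" PORTLIST' (hypothetical helper)."""
    keyword, sep, rest = line.strip().partition(" ")
    if keyword != "or-address" or not sep:
        raise ValueError("not an or-address line")
    # The port list never contains ':', so split on the last one.
    addr_text, sep, portlist = rest.rpartition(":")
    if not sep:
        raise ValueError("missing ':' between address and ports")
    if addr_text.startswith("[") and addr_text.endswith("]"):
        address: Address = ipaddress.IPv6Address(addr_text[1:-1])
    else:
        address = ipaddress.IPv4Address(addr_text)
    ports: List[int] = []
    for item in portlist.split(","):
        if not item.isdigit():
            raise ValueError("port is not a number: %r" % item)
        port = int(item)
        if not 1 <= port <= 65535:
            raise ValueError("port out of range: %d" % port)
        ports.append(port)
    if len(ports) > MAX_PORTS_PER_LINE:
        raise ValueError("too many ports in one line")
    if len(set(ports)) != len(ports):
        raise ValueError("ports must be disjoint")
    return address, ports
```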
Clients that use microdescriptors should consider a node's addresses to be the address:port listed in the "r" line of a consensus, plus all "a" lines for that node in the consensus, plus all "a" lines for that node in its microdescriptor. Clients that use full descriptors should consider a node's addresses to be everything listed in its descriptor. We will have to define a new voting algorithm version; when using this version or later, votes should include a single "a" line for every relay that has an IPv6 address, to include the first IPv6 line in its descriptor. (If there are no IPv6 or-address lines, then they shouldn't include any "a" lines.) The remaining or-address lines will turn into "a" lines in the microdescriptor. As with other data in the vote derived from the descriptor, the consensus will include whichever set of "a" lines are given by the most authorities who voted for the descriptor digest that will be used for the router. Directory authorities with more addresses: We need a way for a client to configure a TrustedDirServer as having multiple OR addresses, specifically so that we can give at least one default authority an IPv6 address for bootstrapping purposes. (Q: Do any of the current authorities have stable IPv6 addresses?) We will want to allow the address in a "dir-source" line in a vote to contain an IPv6 address, and/or allow voters to list themselves with more addresses in votes/consensuses. But right now, nothing actually uses the addresses listed for voters in dir-source lines for anything besides log messages. Client behavior: I propose that initially we shouldn't change client behavior too much here. (Q: Is there any advantage to having a client choose a random address? If so we can do it later. If not, why list any more than one IPv4 and one IPv6 address?) 
Tor clients not running with bridges, and running with IPv4 support, should still use the address and ORPort as advertised in the "router" or "r" line of the appropriate directory object. Tor clients not running with bridges, and running without IPv4 support, should use the first listed IPv6 address for a node, using the lowest-numbered listed port for that address. They should only connect to nodes with an IPv6 address. Clients should accept Bridge lines with IPv6 addresses, and address:port sets, in addition to the lines they currently accept. Clients, for now, should only use the address:port from the router line when making EXTEND cells; see below. Nodes without IPv4 addresses: Currently Tor requires every node or bridge to have an IPv4 address. We will want to maintain this property for the foreseeable future, but we should define how a node without an IPv4 address would advertise itself. Right now, there's no way to do that: if anything but an IPv4 address appears in a router line of a routerdesc, or the "r" line of a consensus, then it won't parse. If something that looks like an IPv4 address appears there, clients will (I believe) try to connect to it. We can make this work, though: let's allow nodes to list themselves with a magic IPv4 address (say, 127.1.1.1) if they have or-address entries containing only IPv6 addresses. We could give these nodes a new flag other than Running to indicate that they're up, and not give them the Running flag. That way, old clients would never try to use them, but new clients could know to treat the new flag as indicating that the node is running, and know not to connect to a node listed with address 127.1.1.1. Interaction with EXTEND and NETINFO: Currently, EXTEND cells only support IPv4 addresses, so we should use only those. There is a proposal draft to support more address types. A server's NETINFO cells must list all configured addresses for a server. Why not extend DirPort this way too? 
Because clients are all using BEGINDIR these days. That is, clients tunnel their directory requests inside OR connections, and don't generally connect to DirPorts at all. Why not have address and port ranges? Earlier drafts of this proposal suggested that servers should provide ranges of addresses, specified with bitmasks. That's a neat idea for circumvention, but if we did that, you wouldn't want to advertise publicly that you have an entire address range. Port ranges are out because I don't think they would actually get used much, and they add a fair bit of complexity. Coding impact: In addition to the obvious changes, we need to audit everything that looks up or compares OR connections and nodes by address:port under the assumptions that each node has only a single address or ORPort. TODO: * Make it so that authorities can vote on which addresses are working somehow. * Specify some way to say "I only want to connect to v4/v6 addresses". * Come up with a better alternative to running6 for the longterm?
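The non-bridge client behavior described in the "Client behavior" section of this proposal can be sketched in Python as follows. The function name, the tuple layout, and the use of unbracketed address strings are assumptions made for the example; the selection rule itself (router line when IPv4 is available, otherwise the first listed IPv6 address with its lowest-numbered port) is taken from the text.

```python
import ipaddress
from typing import List, Optional, Tuple

Endpoint = Tuple[str, int]


def choose_endpoint(router_line: Endpoint,
                    or_addresses: List[Tuple[str, List[int]]],
                    client_has_ipv4: bool) -> Optional[Endpoint]:
    """Pick the address:port a non-bridge client should connect to.

    `or_addresses` holds (address, ports) pairs in the order listed.
    Returns None when an IPv6-only client finds no IPv6 address, since
    such clients should only connect to nodes that have one.
    """
    if client_has_ipv4:
        return router_line
    for address, ports in or_addresses:
        if ipaddress.ip_address(address).version == 6:
            return address, min(ports)
    return None
```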
Filename: 187-allow-client-auth.txt Title: Reserve a cell type to allow client authorization Author: Nick Mathewson Created: 16-Oct-2011 Status: Closed Target: 0.2.3.x Overview: Proposals 176 and 184 introduce a new "v3" handshake, coupled with a new version 3 link protocol. This is a good time to introduce other stuff we might need. One thing we might want is a scanning resistance feature for bridges. This proposal suggests a change we should make right away to enable us to deploy such a feature in future versions of Tor. Motivation: If an adversary has a suspected bridge address/port combination, the easiest way for them to confirm or disconfirm their suspicion is to connect to the address and see whether they can do a Tor handshake. The easiest way to fix this problem seems to be to give out bridge addresses along with some secret that clients should know, but which an adversary shouldn't be able to learn easily. The client should prove to the bridge that it's authorized to know about the bridge, before the bridge acts like a bridge. If the client doesn't show knowledge of the proper secret, the bridge should act like an HTTPS server or a bittorrent tracker or something. This proposal *does not* specify a way for clients to authorize themselves at bridges; rather, it specifies changes that we should make now in order to allow this kind of authorization in the future. Design: Currently, now that proposal 176 is implemented, if a server provides a certificate that indicates a v3 handshake, and the client understands how to do a V3 handshake, we specify that the client's first cell must be a VERSIONS cell. Instead, we make the following specification changes: We reserve a new variable-length cell type, "AUTHORIZE". We specify that any number of PADDING or VPADDING or AUTHORIZE cells may be sent by the client before it sends a VERSIONS cell. 
Servers that do not require client authorization MUST ignore such cells, except to include them when calculating the HMAC that will appear in the CLOG part of a client's AUTHENTICATE cell. We still specify that clients SHOULD send VERSIONS as their first cell; only in some future version of Tor will an AUTHORIZE cell be sent first. Discussion: This change allows future versions of the Tor client to know that some bridges need authorization, and to send them authentication before sending them anything recognizably Tor-like. The authorization cell needs to be received before the server can send any Tor cells, so we can't just patch it in after the VERSIONS cell exchange: the server's VERSIONS cell is unsendable until after the AUTHORIZE has been accepted. Note that to avoid scanning attacks, it's not sufficient to wait for a single cell, and then either handle it as authorization or reject the connection. Instead, we need to decide what kind of server we're impersonating, and respond once the client has provided *either* an authorization cell, *or* a recognizably valid or invalid command in the impersonated protocol. Alternative design: Just use pluggable transports Pluggable transports can do this too, but in general, we want to avoid designing the Tor protocol so that any particular desirable feature can only be done with a pluggable transport. That is, any feature that *every* bridge should want, should be doable in Tor proper. Also, as of 16 Oct 2011, pluggable transports aren't in general use. Past experience IMO suggests that we shouldn't offload architectural responsibilities to our chickens until they've hatched. Alternative design: Out-of-TLS authorization There are features (like port-knocking) designed to allow a client to show that it's authorized to use a bridge before the TLS handshake even happens. These are appropriate for bunches of applications, but they're trickier with an adversary who is MITMing the client. Alternative design: Just use padding. 
Arguably, we could only add the "VPADDING" cell type to the list of those allowed before VERSIONS cells, and say that any client authorization we specify later on will be sent as a VPADDING cell. But that design is kludgy: padding should be padding, not semantically significant. Besides, cell types are still fairly plentiful. Counterargument: specify it later We could, later on, say that if a client learns that a bridge needs authorization, it should send an AUTHORIZE cell. So long as a client never sends an AUTHORIZE to anything other than a bridge that needs authorization, it'll never violate the spec. But all things considered, it seems easier (just a few lines of spec and code) to let bridges eat unexpected authorization now than it does to have stuff fail later when clients think that a bridge needs authorization but it doesn't. Counterargument: it's too late! We've already got the prop176 branch merged and running on a few servers. But as of this writing, it isn't in any Tor version. Even if it *is* out in an alpha before we can get this proposal accepted and implemented, that's not a big disaster. In the worst case, where future clients don't know whom to send authorization to so they need to send it to _all_ v3 servers, they will at worst break their connections only to a couple of alpha versions which one hopes by then will be long-deprecated already.
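As a hedged sketch of the server-side rule in the Design section of this proposal, the following Python function consumes a client's cells up to and including VERSIONS, ignoring PADDING, VPADDING, and AUTHORIZE cells while still feeding them into a handshake log. The numeric values used for VPADDING and AUTHORIZE are not given in this text and are assumed here for illustration, and a plain SHA-256 over raw cell bytes stands in for the real CLOG computation, which is defined by proposal 176 rather than here.

```python
import hashlib
from typing import Iterable, Tuple

PADDING = 0
VERSIONS = 7
VPADDING = 128   # assumed value, not stated in this proposal
AUTHORIZE = 132  # assumed value, not stated in this proposal

ALLOWED_BEFORE_VERSIONS = {PADDING, VPADDING, AUTHORIZE}


def accept_client_preamble(cells: Iterable[Tuple[int, bytes]]) -> Tuple[int, bytes]:
    """Return (number of ignored cells, digest of everything seen so far).

    `cells` yields (command, raw_cell_bytes) pairs in arrival order.
    A server that needs no authorization ignores the early cells but
    still includes them in the log that the client will authenticate.
    """
    log = hashlib.sha256()  # stand-in for the real handshake log
    ignored = 0
    for command, raw in cells:
        log.update(raw)
        if command == VERSIONS:
            return ignored, log.digest()
        if command not in ALLOWED_BEFORE_VERSIONS:
            raise ValueError("cell type %d not allowed before VERSIONS" % command)
        ignored += 1
    raise ValueError("connection ended before a VERSIONS cell arrived")
```

A bridge that does require authorization would replace the "ignore" branch for AUTHORIZE with its own check; that logic is deliberately left unspecified by the proposal.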
Filename: 188-bridge-guards.txt Title: Bridge Guards and other anti-enumeration defenses Author: Nick Mathewson, Isis Lovecruft Created: 14 Oct 2011 Modified: 10 Sep 2015 Status: Reserve [NOTE: This proposal is marked as "reserve" because the enumeration technique it addresses does not currently seem to be in use. See ticket tor#7144 for more information. (2020 July 31)] 1. Overview Bridges are useful against censors only so long as the adversary cannot easily enumerate their addresses. I propose a design to make it harder for an adversary who controls or observes only a few nodes to enumerate a large number of bridges. Briefly: bridges should choose guard nodes, and use the Tor protocol's "Loose source routing" feature to re-route all extend requests from clients through an additional layer of guard nodes chosen by the bridge. This way, only a bridge's guard nodes can tell that it is a bridge, and the attacker needs to run many more nodes in order to enumerate a large number of bridges. I also discuss other ways to avoid enumeration, recommending some. These ideas are due to a discussion at the 2011 Tor Developers' Meeting in Waterloo, Ontario. Practically none of the ideas here are mine; I'm just writing up what I remember. 2. History and Motivation Under the current bridge design, an attacker who runs a node can identify bridges by seeing which "clients" make a large number of connections to it, or which "clients" make connections to it in the same way clients do. This has been a known attack since early versions {XXXX check} of the design document; let's try to fix it. 2.1. Related idea: Guard nodes The idea of guard nodes isn't new: since 0.1.1, Tor has used guard nodes (first designed as "Helper" nodes by Wright et al in {XXXX}) to make it harder for an adversary who controls a smaller number of nodes to eavesdrop on clients. 
The rationale was: an adversary who controls or observes only one entry and one exit will have a low probability of correlating any single circuit, but over time, if clients choose a random entry and exit for each circuit, such an adversary will eventually see some circuits from each client with a probability of 1, thereby building a statistical profile of the client's activities. Therefore, let each client choose its entry node only from among a small number of client-selected "guard" nodes: the client is still correlated with the same probability as before, but now the client has a nonzero chance of remaining unprofiled. 2.2. Related idea: Loose source routing Since the earliest versions of Onion Routing, the protocol has provided "loose source routing". In strict source routing, the source of a message chooses every hop on the message's path. But in loose source routing, the message traverses the selected nodes, but may also traverse other nodes as well. In other words, the client selects nodes N_a, N_b, and N_c, but the message may in fact traverse any sequence of nodes N_1...N_j, so long as N_1=N_a, N_x=N_b, and N_y=N_c, for 1 < x < y. Tor has retained this feature, but has not yet made use of it. 3. Design Every bridge currently chooses a set of guard nodes for its circuits. Bridges should also re-route client circuits through these circuits. Specifically, when a bridge receives a request from a client to extend a circuit, it should first create a circuit to its guard, and then relay that extend cell through the guard. The bridge should add an additional layer of encryption to outgoing cells on that circuit corresponding to the encryption that the guard will remove, and remove a layer of encryption on incoming cells on that circuit corresponding to the encryption that the guard will add. 3.1. 
Loose-Source Routed Circuit Construction Alice, an OP, is using a bridge, Bob, and she has chosen the following path through the network: Alice -> Bob -> Charlie -> Deidra However, Bob has decided to take advantage of the loose-source routing circuit characteristic (for example, in order to use a bridge guard), and Bob has chosen N additional loose-source routed hop(s), through which he will transparently relay cells. NOTE: For the purposes of bridge guards, N is always 1. However, for completeness' sake, the following details of the circuit construction are generalized to include N > 1. Additionally, the following steps should hold for a hop at any position in Alice's circuit that has decided to take advantage of the loose-source routing feature, not only for bridge ORs. From Alice's perspective, her circuit path matches the one diagrammed above. However, the overall path of the circuit is: Alice -> Bob -> Guillaume -> Charlie -> Deidra From Bob's perspective, the circuit's path is: Alice -> Bob -> Guillaume -> Charlie -> UNKNOWN Interestingly, because Bob's behaviour towards Guillaume and choices of cell types is that of a normal OP, Guillaume's perspective of the circuit's path is: Bob -> Guillaume -> Charlie -> UNKNOWN That is, to Guillaume, Bob appears (for the most part) to be a normally connecting client. (See §4.1 for more detailed analysis.) 3.1.1. Detailed Steps of Loose-Source Routed Circuit Construction 1. Connection from OP Alice has connected to Bob, and she has sent to Bob either a CREATE/CREATE_FAST or CREATE2 cell. 2. Loose-Source Path Selection In anticipation of Alice's first RELAY_EARLY cell (which will contain an EXTEND cell to Alice's next hop), Bob begins constructing a loose-source routed circuit. To do so, Bob chooses N additional hop(s): 2.a. For the first additional hop, H_1, Bob chooses a suitable entry guard node, Guillaume, using the same algorithm as OPs. 
See "§5 Guard nodes" of path-spec.txt for additional information on the selection algorithm. 2.b. Each additional hop, [H_2, ..., H_N], is chosen at random from a list of suitable, non-excluded ORs. 3. Loose-Source Routed Circuit Extension and Cell Types Bob now follows the same procedure as OPs use to complete the key exchanges with his chosen additional hop(s). While carrying out the following substeps, Bob SHOULD continue to proceed with Step 4, below, in parallel, as an optimization for speeding up circuit construction. 3.a. Create Cells Bob sends the appropriate type of create cell to Guillaume. For ORs new enough to support the NTor handshake (nearly all of them at this point), Bob sends a CREATE2 cell. Otherwise, for ORs which only support the older TAP handshake, Bob sends either a CREATE or CREATE_FAST cell, using the same decision-making logic as OPs. See §4.1 for more information on the distinguishability of bridges based upon whether they use CREATE versus CREATE_FAST. Also note that the CREATE2 cell has become ubiquitous since this proposal was originally drafted. Thus, because we prefer ORs which use loose-source routing to behave (as much as possible) like OPs, we now prefer to use CREATE2. 3.b. Created Cells Later, when Bob receives a corresponding CREATED/CREATED_FAST or CREATED2 cell from Guillaume, Bob extracts key material for the shared forward and reverse keys, KG_f and KG_b, respectively. 3.c. Extend Cells When N > 1, for each additional hop, H_i, in [H_2, ..., H_N], Bob chooses the appropriate type of extend cell for H_i, and sends this extend cell to H_i-1, who transforms it into a create cell in order to perform the extension. To choose which type of extend cell to send, Bob uses the same algorithm as an OP to determine whether to use EXTEND or EXTEND2. Similar to the CREATE* cells above, for most modern ORs, this will very likely mean an EXTEND2 cell. 3.d. 
Extended Cells When a corresponding EXTENDED/EXTENDED2 cell is received for an additional hop, H_i, Bob extracts the shared forward and reverse keys, Ki_f and Ki_b, respectively. 4. Responding to the OP Now that the additional hops in Bob's loose-source routed circuit are chosen, and construction of the loose-source routed circuit has begun, Bob answers Alice's original CREATE/CREATE_FAST or CREATE2 cell (from Step 1) by sending the corresponding created cell type. Alice has now built a circuit through Bob, and the two share the negotiated forward and reverse keys, KB_n and KB_p, respectively. Note that Bob SHOULD do this step in tandem with the loose-source routed circuit construction procedure outlined in Step 3, above. 5. OP Circuit Extension Alice then wants to extend the circuit to node Charlie. She makes a hybrid-encrypted onionskin, encrypted to Charlie's public key, containing her chosen g^x value. She puts this in an extend cell: "Extend (Charlie's address) (Charlie's OR Port) (Onionskin) (Charlie's ID)". She encrypts this with KB_n and sends it as a RELAY_EARLY cell to Bob. Bob's behaviour is now dependent on whether the loose-source routed circuit construction steps (as outlined in Step 3, above) have already completed. 5.a. The Loose-Source Routed Circuit Construction is Incomplete If Bob has not yet finished the loose-source routed circuit construction, then Bob MUST store the first outgoing (i.e. exitward) RELAY_EARLY cell received from Alice until the loose-source routed circuit construction has been completed. If any incoming (i.e. toward the OP) RELAY* cell is received while the loose-source routed circuit is not fully constructed, Bob MUST drop the cell. If Bob has already stored Alice's first RELAY_EARLY cell, and Alice sends any additional RELAY* cell, then Bob SHOULD mark the entire circuit for close with END_CIRC_REASON_TORPROTOCOL. 5.b. 
The Loose-Source Routed Circuit Construction is Completed Later, when the loose-source routed circuit is fully constructed, Bob MUST send any stored cells from Alice outward by following the procedure described in Step 6.a. 6. Relay Cells When receiving a RELAY* cell in either direction, Bob MAY keep statistics on the number of relay cells encountered, as well as the number of relay cells relayed. 6.a. Outgoing Relay Cells Bob decrypts the RELAY* cell with KB_n. If the cell becomes recognized, Bob should now follow the relay command checks described in Step 6.c. Bob MUST encrypt the relay cell's underlying payload to each additional hop in the loose-source routed circuit, in reverse: for each additional hop, H_i, in [H_N, ..., H_1], Bob encrypts the relay cell payload to Ki_f, the shared forward key for the hop H_i. Bob MUST update the forward digest, DG_f, of the relay cell, regardless of whether or not the cell is recognized. See 6.c. for additional information on recognized cells. Bob now sends the cell outwards through the additional hops. At each hop, H_i, the hop removes a layer of the onionskin by decrypting the cell with Ki_f, and then hop H_i forwards the cell to the next additional hop H_i+1. When the final additional hop, H_N, receives the cell, the OP's cell command and payload should be processed by H_N in the normal manner for an OR. 6.b. Incoming Relay Cells Bob MUST decrypt the relay cell's underlying payload from each additional hop in the loose-source routed circuit (in forward order, this time): For each additional hop, H_i, in [H_1, ..., H_N], Bob decrypts the relay cell payload with Ki_b, the shared backward key for the hop H_i. If the cell becomes recognized after all decryptions, Bob should now follow the relay command checks described in Step 6.c. Bob MUST update the backward digest, DG_b, of the relay cell, regardless of whether or not the cell is recognized. See 6.c. for additional information on recognized cells. 
Bob encrypts the cell towards the OP with KB_p, and sends the cell inwards. 6.c. Recognized Cells If a relay cell, either incoming or outgoing, becomes recognized (i.e. Bob sees that the cell was intended for him) after decryption, and there is no stream attached to the circuit, then Bob SHOULD mark the circuit for close if the relay command contained within the cell is any of the following types: - RELAY_BEGIN - RELAY_CONNECTED - RELAY_END - RELAY_RESOLVE - RELAY_RESOLVED - RELAY_BEGIN_DIR Apart from the above checks, Bob SHOULD essentially treat every cell as "unrecognized" by following the en-/de-cryption procedures in Steps 6.a. and 6.b. regardless of whether the cell is actually recognized or not. That is, since this is a loose-source routed circuit, Bob SHOULD relay cells not intended for him *and* cells intended for him through the leaky pipe, no matter what the cell's underlying payload and command are. 3.1.2. Example Loose-Source Circuit Construction For example, given the following circuit path chosen by Alice: Alice -> Bob -> Charlie -> Deidra when Alice wishes to extend to node Charlie, and Bob the bridge is using only one additional loose-source routed hop, Guillaume, as his bridge guard, the following steps are taken: - Alice packages the extend into a RELAY_EARLY cell and encrypts the RELAY_EARLY cell with KB_f to Bob. - Bob receives the RELAY_EARLY cell from Alice, and he follows the procedure (outlined in §3.1.1. Step 6.a.) by: * Decrypting the cell with KB_f, * Encrypting the cell to the forward key, KG_f, which Bob shares with his guard node, Guillaume, * Updating the cell forward digest, DG_f, and * Sending the cell as a RELAY_EARLY cell to Guillaume. - When Guillaume receives the cell from Bob, he processes it by: * Decrypting the cell with KG_f. Guillaume now sees that it is a RELAY_EARLY cell containing an extend cell "intended" for him, containing: "Extend (Charlie's address) (Charlie's OR Port) (Onionskin) (Charlie's ID)". 
  * Performing the circuit extension to the specified node, Charlie, by acting accordingly: creating a connection to Charlie if he doesn't have one, ensuring that the ID is as expected, and then sending the onionskin in a create cell on that connection. Note that Guillaume is behaving exactly as a regular node would upon receiving an Extend cell.
  * Now the handshake finishes. Charlie receives the onionskin and sends Guillaume "CREATED g^y,KH".
  * Making an extended cell for Bob which contains "E(KG_b, EXTENDED g^y KH)", and
  * Sending the extended cell to Bob. Note that Charlie and Guillaume are both still behaving in a manner identical to regular ORs.

- Bob receives the extended cell from Guillaume, and he follows the procedure (outlined in §3.1.1. Step 6.b.) by:
  * Decrypting the cell with KG_b,
  * Encrypting the cell to Alice with KB_b,
  * Updating the cell backward digest, DG_b, and
  * Sending the cell to Alice.

- Alice receives the cell, and she decrypts it with KB_b, just as she would have if Bob had extended to Charlie directly. She then processes the extended cell contained within to extract shared keys with Charlie. Note that Alice's behaviour is identical to regular OPs.

3.2. Additional Notes on the Construction

Note that this design does not require that our stream cipher operations be commutative, even though they are.

Note also that this design requires no change in behavior from any node other than Bob, and as we can see in the above example in §3.1.2 for Alice's circuit extension, Alice, Guillaume, and Charlie behave identically to a normal OP and normal ORs.

Finally, observe that even though the circuit is N hops longer than it would be otherwise, no relay's count of permissible RELAY_EARLY cells falls lower than it otherwise would. This is because each extra hop that Bob adds is built with RELAY_EARLY cells, and he then continues to relay Alice's cells as RELAY_EARLY until the appropriate maximum number of RELAY_EARLY cells is reached.
Afterwards, further RELAY_EARLY cells from Alice are repackaged by Bob as normal RELAY cells.

4. Alternative designs

4.1. Client-enforced bridge guards

What if Tor didn't have loose source routing? We could have bridges tell clients what guards to use by advertising those guards in their descriptors, and then refusing to extend circuits to any other nodes. This change would require all clients to upgrade in order to be able to use the newer bridges, and would quite possibly cause a fair amount of pain along the way.

Fortunately, we don't need to go down this path. So let's not!

4.2. Separate bridge-guards and client-guards

In the design above, I specify that bridges should use the same guard nodes for extending client circuits as they use for their own circuits. It's not immediately clear whether this is a good idea or not. Having separate sets would seem to make the two kinds of circuits more easily distinguishable (even though we already assume they are distinguishable). Having different sets of guards would also seem like a way to keep the nodes who guard our own traffic from learning that we're a bridge... but another set of nodes will learn that anyway, so it's not clear what we'd gain.

One good reason to keep separate guard lists is to prevent the *client* of the bridge from being able to enumerate the guards that the bridge uses to protect its own traffic (by extending a circuit through the bridge to a node it controls, and finding out where the extend request arrives from).

5. Additional bridge enumeration methods and protections

In addition to the design above, there are more ways to try to prevent enumeration.

Right now, there are multiple ways for the node after a bridge to distinguish a circuit extended through the bridge from one originating at the bridge. (This lets the node after the bridge tell that a bridge is talking to it.)

5.1.
Make it harder to tell clients from bridges

When using the older TAP circuit handshake protocol, one of the giveaways is that the first hop in a circuit is created with CREATE_FAST cells, but all subsequent hops are created with CREATE cells.

However, because nearly everything in the network now uses the newer NTor circuit handshake protocol, clients send CREATE2 cells to all hops, regardless of position. Therefore, in the above design, it's no longer quite so simple to distinguish an OP connecting through a bridge from an actual OP, since all of the circuits that extend through a bridge now reach its guards through CREATE2 cells (whether the bridge originated them or not), and only as a fallback (e.g. if an additional node in the loose-source routed path does not support NTor) will the bridge ever use CREATE/CREATE_FAST. (Additionally, when using the fallback method, the behaviour for choosing either CREATE or CREATE_FAST is identical to normal OP behaviour.)

The CREATE/CREATE_FAST distinction is not the only way for a bridge's guard to tell bridges from ordinary clients, however. Most importantly, a busy bridge will open far more circuits than a client would. More subtly, the timing on responses from the client will be higher and more highly variable than it would be with an ordinary client. I don't think we can make bridges behave wholly indistinguishably from clients: that's why we should go with guard nodes for bridges.

[XXX For further research: we should study the methods by which a bridge guard can determine that they are acting as a guard for a bridge, rather than for a normal OP, and which methods are likely to be more accurate or efficient than others. -IL]

5.2. Bridge Reachability Testing

Currently, a bridge's reachability is tested both by the bridge itself (called "self-testing") and by the BridgeAuthority.

5.2.1.
Bridge Reachability Self-Testing

Before a bridge uploads its descriptors to the BridgeAuthority, it creates a special type of testing circuit which ends at itself:

    Bob -> Guillaume -> Charlie -> Bob

Thus, the benefit of going to all this trouble to later use loose-source routing in order to relay Alice's traffic through Guillaume (rather than connecting directly to Charlie, as Alice intended) is diminished by the fact that Charlie can still passively enumerate bridges by waiting to be asked to connect to a node which is not contained within the consensus.

We could get around this problem by disabling self-testing for bridges entirely, by automatically setting "AssumeReachable 1" for all bridge relays, although I am not sure if this is wise.

Our best idea thus far, for bridge reachability self-testing, is to create a circuit like so:

    Bridge → Guard → Middle → OtherMiddle → Guard → Bridge

While, clearly, that circuit is just a little bit insane, it must be that way because we cannot simply do:

    Bridge → Guard → Middle → Guard → Bridge

because the Middle would refuse to extend back to the previous node (all ORs follow this rule). Similarly, it would be inane to do:

    Bridge → Guard → Middle → OtherMiddle → Bridge

because, obviously, that merely shifts the problem to OtherMiddle and accomplishes nothing.

[XXX Is there something smarter we could do? -IL]

5.2.2. Bridge Reachability Testing by the BridgeAuthority

After receiving Bob's descriptors, the BridgeAuthority attempts to connect to Bob's ORPort by making a direct TLS connection to the bridge's advertised ORPort.

Should we change this behaviour? On the one hand, at least this does not enable any random OR in the entire network to enumerate bridges. On the other hand, any adversary who can observe packets from the BridgeAuthority is capable of enumeration.

6. Other considerations

What fraction of our traffic is bridge traffic? Will this alter our circuit selection weights?
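The layering that Bob performs in §3.1.1 Steps 6.a and 6.b can be illustrated with a minimal sketch. It is not Tor's relay crypto: `toy_stream_xor` is a hypothetical stand-in for the real stream cipher (it derives a keystream from SHA-256 and restarts it for every cell), the function names are invented for illustration, and the digest updates (DG_f, DG_b) and the recognized-cell checks of Step 6.c are left out.

```python
import hashlib


def toy_stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (assumption): XOR data with a SHA-256 counter keystream.

    Encrypting and decrypting are the same operation.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def bob_outgoing(cell: bytes, kb_f: bytes, hop_keys_f: list) -> bytes:
    """Step 6.a: remove Alice's layer, then add one layer per extra hop.

    hop_keys_f holds the forward keys for [H_1, ..., H_N]; layers are
    added in reverse order so that H_1 can peel the outermost one.
    """
    payload = toy_stream_xor(kb_f, cell)
    for key in reversed(hop_keys_f):
        payload = toy_stream_xor(key, payload)
    return payload


def bob_incoming(cell: bytes, kb_b: bytes, hop_keys_b: list) -> bytes:
    """Step 6.b: remove each extra hop's layer in forward order,
    then encrypt the result toward Alice."""
    payload = cell
    for key in hop_keys_b:
        payload = toy_stream_xor(key, payload)
    return toy_stream_xor(kb_b, payload)


def extra_hops_forward(cell: bytes, hop_keys_f: list) -> bytes:
    """Each extra hop H_1..H_N removes its own layer in turn."""
    for key in hop_keys_f:
        cell = toy_stream_xor(key, cell)
    return cell
```

With a single extra hop (Guillaume, as in §3.1.2), `hop_keys_f` would contain only KG_f; what the last extra hop sees after `extra_hops_forward` is exactly what Bob himself would have forwarded had he extended directly.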
Filename: 189-authorize-cell.txt Title: AUTHORIZE and AUTHORIZED cells Author: George Kadianakis Created: 04 Nov 2011 Status: Obsolete 1. Overview Proposal 187 introduced the concept of the AUTHORIZE cell, a cell whose purpose is to make Tor bridges resistant to scanning attacks. This is achieved by having the bridge and the client share a secret out-of-band and then use AUTHORIZE cells to validate that the client indeed knows that secret before proceeding with the Tor protocol. This proposal specifies the format of the AUTHORIZE cell and also introduces the AUTHORIZED cell, a way for bridges to announce to clients that the authorization process is complete and successful. 2. Motivation AUTHORIZE cells should be able to perform a variety of authorization protocols based on a variety of shared secrets. This forces the AUTHORIZE cell to have a dynamic format based on the authorization method used. AUTHORIZED cells are used by bridges to signal the end of a successful bridge client authorization and the beginning of the actual link handshake. AUTHORIZED cells have no other use and for this reason their format is very simple. Both AUTHORIZE and AUTHORIZED cells are to be used under censorship conditions and they should look innocuous to any adversary capable of monitoring network traffic. As an attack example, an adversary could passively monitor the traffic of a bridge host, looking at the packets directly after the TLS handshake and trying to deduce from their packet size if they are AUTHORIZE and AUTHORIZED cells. For this reason, AUTHORIZE and AUTHORIZED cells are padded with a random amount of padding before sending. 3. Design 3.1. AUTHORIZE cell The AUTHORIZE cell is a variable-sized cell. The generic AUTHORIZE cell format is: AuthMethod [1 octet] MethodFields [...] PadLen [2 octets] Padding ['PadLen' octets] where: 'AuthMethod', is the authorization method to be used. 'MethodFields', is dependent on the authorization Method used. 
It's a meta-field hosting an arbitrary amount of fields. 'PadLen', specifies the amount of padding in octets. Implementations SHOULD pick 'PadLen' to be a random integer from 1 to 3141 inclusive. 'Padding', is 'PadLen' octets of random content. 3.2. AUTHORIZED cell format The AUTHORIZED cell is a variable-sized cell. The AUTHORIZED cell format is: 'AuthMethod' [1 octet] 'PadLen' [2 octets] 'Padding' ['PadLen' octets] where all fields have the same meaning as in section 3.1. 3.3. Cell parsing Implementations MUST ignore the contents of 'Padding'. Implementations MUST reject an AUTHORIZE or AUTHORIZED cell where the 'Padding' field is not 'PadLen' octets long. Implementations MUST reject an AUTHORIZE cell with an 'AuthMethod' they don't recognize. 4. Discussion 4.1. What's up with the [1,3141] padding bytes range? The upper limit is larger than the Ethernet MTU so that AUTHORIZE and AUTHORIZED cells are not always transmitted into a single packet. Other than that, it's indeed pretty much arbitrary. 4.2. Why not let the pluggable transports do the padding, like they are supposed to do for the rest of the Tor protocol? The arguments of section "Alternative design: Just use pluggable transports" of proposal 187, apply here as well: All bridges who use client authorization will also need padded AUTHORIZE and AUTHORIZED cells. 4.3. How should multiple round-trip authorization protocols be handled? Protocols that require multiple round trips between the client and the bridge should use AUTHORIZE cells for communication. The format of the AUTHORIZE cell is flexible enough to support messages from the client to the bridge and the reverse. At the end of a successful multiple-round-trip protocol, an AUTHORIZED cell must be issued from the bridge to the client. 4.4. AUTHORIZED seems useless. Why not use VPADDING instead? As noted in proposal 187, the Tor protocol uses VPADDING cells for padding; any other use of VPADDING makes the Tor protocol kludgy. 
In the future, and in the example case of a v3 handshake, a client can optimistically send a VERSIONS cell along with the final AUTHORIZE cell of an authorization protocol. That allows the bridge, in the case of successful authorization, to also process the VERSIONS cell and begin the v3 handshake promptly. 4.5. What should actually happen when a bridge rejects an AUTHORIZE cell? When a bridge detects a badly formed or malicious AUTHORIZE cell, it should assume that the other side is an adversary scanning for bridges. The bridge should then act accordingly to avoid detection. This proposal does not try to specify how a bridge can avoid detection by an adversary.
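The AUTHORIZE cell layout of section 3.1 and the parsing rules of section 3.3 can be sketched as follows. This is only an illustration under stated assumptions: the proposal does not give a byte order for 'PadLen' (big-endian is assumed here), it does not say how a parser learns the length of 'MethodFields' (a hypothetical per-method length table is used), and the variable-length cell header that would carry this body is omitted.

```python
import os
import secrets
import struct

MAX_PAD_LEN = 3141  # upper bound on PadLen from section 3.1


def build_authorize_body(auth_method: int, method_fields: bytes) -> bytes:
    """Assemble AuthMethod | MethodFields | PadLen | Padding."""
    pad_len = secrets.randbelow(MAX_PAD_LEN) + 1  # uniform in [1, 3141]
    return (
        bytes([auth_method])
        + method_fields
        + struct.pack(">H", pad_len)  # assumption: network byte order
        + os.urandom(pad_len)
    )


def parse_authorize_body(body: bytes, known_methods: dict) -> tuple:
    """Return (auth_method, method_fields) or raise ValueError.

    known_methods maps AuthMethod -> MethodFields length in octets
    (a hypothetical registry, not part of the proposal).
    """
    if not body:
        raise ValueError("empty cell body")
    auth_method = body[0]
    if auth_method not in known_methods:
        raise ValueError("unrecognized AuthMethod")
    fields_len = known_methods[auth_method]
    header_end = 1 + fields_len + 2
    if len(body) < header_end:
        raise ValueError("truncated cell")
    method_fields = body[1:1 + fields_len]
    (pad_len,) = struct.unpack(">H", body[1 + fields_len:header_end])
    # Padding content is ignored, but its length must match PadLen.
    if len(body) - header_end != pad_len:
        raise ValueError("Padding is not PadLen octets long")
    return auth_method, method_fields
```

A bridge that catches `ValueError` here would then fall back to whatever detection-avoidance behaviour it uses, which this proposal leaves unspecified.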
Filename: 190-shared-secret-bridge-authorization.txt Title: Bridge Client Authorization Based on a Shared Secret Author: George Kadianakis Created: 04 Nov 2011 Status: Obsolete Notes: This is obsoleted by pluggable transports. 1. Overview Proposals 187 and 189 introduced AUTHORIZE and AUTHORIZED cells. Their purpose is to make bridge relays scanning-resistant against censoring adversaries capable of probing hosts to observe whether they speak the Tor protocol. This proposal specifies a bridge client authorization scheme based on a shared secret between the bridge user and bridge operator. 2. Motivation A bridge client authorization scheme should only allow clients who show knowledge of a shared secret to talk Tor to the bridge. 3. Shared-secret-based authorization 3.1. Where do shared secrets come from? A shared secret is a piece of data known only to the bridge operator and the bridge client. It's meant to be automatically generated by the bridge implementation to avoid issues with insecure and weak passwords. Bridge implementations SHOULD create shared secrets by generating random data using a strong RNG or PRNG. 3.2. AUTHORIZE cell format In shared-secret-based authorization, the MethodFields field of the AUTHORIZE cell becomes: 'shared_secret' [10 octets] where: 'shared_secret', is the shared secret between the bridge operator and the bridge client. 3.3. Cell parsing Bridge implementations MUST reject any AUTHORIZE cells whose 'shared_secret' field does not match the shared secret negotiated between the bridge operator and authorized bridge clients. 4. Tor implementation 4.1. Bridge side Tor bridge implementations MUST create the bridge shared secret by generating 10 octets of random data using a strong RNG or PRNG. Tor bridge implementations MUST store the shared secret in 'DataDirectory/keys/bridge_auth_ss_key' in hexadecimal encoding. 
Tor bridge implementations MUST support the boolean 'BridgeRequireClientSharedSecretAuthorization' configuration file option which enables bridge client authorization based on a shared secret. If 'BridgeRequireClientSharedSecretAuthorization' is set, bridge implementations MUST generate a new shared secret, if 'DataDirectory/keys/bridge_auth_ss_key' does not already exist. 4.2. Client side Tor client implementations must extend their Bridge line format to support bridge shared secrets. The new format is: Bridge [<method>] <address[:port]> [["keyid="]<id-fingerprint>] ["shared_secret="<shared_secret>] where <shared_secret> is the bridge shared secret in hexadecimal encoding. Tor clients who use bridges with shared-secret-based client authorization must specify the bridge's shared secret as in: Bridge 12.34.56.78 shared_secret=934caff420aa7852b855 5. Discussion 5.1. What should actually happen when a bridge rejects an AUTHORIZE cell? When a bridge detects a badly formed or malicious AUTHORIZE cell, it should assume that the other side is an adversary scanning for bridges. The bridge should then act accordingly to avoid detection. This proposal does not try to specify how a bridge can avoid detection by an adversary. 6. Acknowledgements Thanks to Nick Mathewson and Robert Ransom for the help and suggestions while writing this proposal.
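As a rough illustration of sections 4.1 and 4.2, the sketch below generates a 10-octet secret, reads it back from a Bridge line, and compares it against a received value. The function names are hypothetical, and the constant-time comparison is a design choice made for this sketch rather than something the proposal requires.

```python
import hmac
import secrets
from typing import Optional

SECRET_LEN = 10  # octets, per section 4.1
OPTION = "shared_secret="


def new_shared_secret_hex() -> str:
    """Generate a fresh shared secret in the hexadecimal form stored on disk."""
    return secrets.token_bytes(SECRET_LEN).hex()


def shared_secret_from_bridge_line(line: str) -> Optional[bytes]:
    """Extract the shared secret from a Bridge line, if one is present."""
    for token in line.split():
        if token.startswith(OPTION):
            secret = bytes.fromhex(token[len(OPTION):])
            if len(secret) != SECRET_LEN:
                raise ValueError("shared secret must be 10 octets")
            return secret
    return None


def secret_matches(expected: bytes, received: bytes) -> bool:
    """Compare secrets without leaking the mismatch position via timing."""
    return hmac.compare_digest(expected, received)
```

For the example line in section 4.2, `shared_secret_from_bridge_line("Bridge 12.34.56.78 shared_secret=934caff420aa7852b855")` yields the ten octets that the client would place in the 'shared_secret' field.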
Filename: 191-mitm-bridge-detection-resistance.txt Title: Bridge Detection Resistance against MITM-capable Adversaries Author: George Kadianakis Created: 07 Nov 2011 Status: Obsolete 1. Overview Proposals 187, 189 and 190 make the first steps toward scanning resistant bridges. They attempt to block attacks from censoring adversaries who provoke bridges into speaking the Tor protocol. An attack vector that hasn't been explored in those previous proposals is that of an adversary capable of performing Man In The Middle attacks to Tor clients. At the moment, Tor clients using the v3 link protocol have no way to detect such an MITM attack, and will gladly send a VERSIONS or AUTHORIZE cell to the MITMed connection, thereby revealing the Tor protocol and thus the bridge. This proposal introduces a way for clients to detect an MITMed SSL connection, allowing them to protect against the above attack. 2. Motivation When the v3 link handshake protocol is performed, Tor's SSL handshake is performed with the server sending a self-signed certificate and the client blindly accepting it. This allows the adversary to perform an MITM attack. A Tor client must detect the MITM attack before he initiates the Tor protocol by sending a VERSIONS or AUTHORIZE cell. A good moment to detect such an MITM attack is during the SSL handshake. To achieve that, bridge operators provide their bridge users with a hash digest of the public-key certificate their bridge is using for SSL. Bridge clients store that hash digest locally and associate it with that specific bridge. Bridge clients who have "pinned" a bridge to a certificate "fingerprint" can thereafter validate that their SSL connection peer is the intended bridge. Of course, the hash digest must be provided to users out-of-band and before the actual SSL handshake. Usually, the bridge operator gives the hash digest to her bridge users along with the rest of the bridge credentials, like the bridge's address and port. 3. 
Security implications Bridge clients who have pinned a bridge to a certificate fingerprint will be able to detect an MITMing adversary in time. If after detection they act as an innocuous Internet client, they can successfully remove suspicion from the SSL connection and subvert bridge detection. Pinning a certificate fingerprint and detecting an MITMing attacker does not automatically alleviate suspicions from the bridge or the client. Clients must have a behavior to follow after detecting the MITM attack so that they look like innocent Netizens. This proposal does not try to specify such a behavior. Implementation and use of this scheme does not render bridges and clients immune to scanning or DPI attacks. This scheme should be used along with bridge client authorization schemes like the ones detailed in proposal 190. 4. Tor Implementation 4.1. Certificate fingerprint creation The certificate fingerprints used on this scheme MUST be computed by applying the SHA256 cryptographic hash function upon the ASN.1 DER encoding of a public-key certificate, then truncating the hash output to 12 bytes, encoding it to RFC4648 Base32 and omitting any trailing padding '='. 4.2. Bridge side implementation Tor bridge implementations SHOULD provide a command line option that exports a fully equipped Bridge line containing the bridge address and port, the link certificate fingerprint, and any other enabled Bridge options, so that bridge operators can easily send it to their users. In the case of expiring SSL certificates, Tor bridge implementations SHOULD warn the bridge operator a sensible amount of time before the expiration, so that she can warn her clients and potentially rotate the certificate herself. 4.3. Client side implementation Tor client implementations MUST extend their Bridge line format to support bridge SSL certificate fingerprints. 
The new format is: Bridge <method> <address:port> [["keyid="]<id-fingerprint>] \ ["shared_secret="<shared_secret>] ["link_cert_fpr="<fingerprint>] where <fingerprint> is the bridge's SSL certificate fingerprint. Tor clients who use bridges and want to pin their SSL certificates must specify the bridge's SSL certificate fingerprint as in: Bridge 12.34.56.78 shared_secret=934caff420aa7852b855 \ link_cert_fpr=GM4GEMBXGEZGKOJQMJSWINZSHFSGMOBRMYZGCMQ 4.4. Implementation prerequisites Tor bridges currently rotate their SSL certificates every 2 hours. This not only acts as a fingerprint for the bridges, but it also acts as a blocker for this proposal. Tor trac ticket #4390 and proposal YYY were created to resolve this issue. 5. Other ideas 5.1. Certificate tagging using a shared secret Another idea worth considering is having the bridge use the shared secret from proposal 190 to embed a "secret message" on her certificate, which could only be understood by a client who knows that shared secret, essentially authenticating the bridge. Specifically, the bridge would "tag" the Serial Number (or any other covert field) of her certificate with the (potentially truncated) HMAC of her link public key, using the shared secret of proposal 190 as the key: HMAC(shared_secret, link_public_key). A client knowing the shared secret would be able to verify the 'link_public_key' and authenticate the bridge, and since the Serial Number field is usually composed of random bytes a probing attacker would not notice the "tagging" of the certificate. Arguments for this scheme are that it: a) doesn't need extra bridge credentials apart from the shared secret of prop190. b) doesn't need any maintenance in case of certificate expiration. Arguments against this scheme are: a) In the case of self-signed certificates, OpenSSL creates an 8-bytes random Serial number, and we would probably need something more than 8-bytes to tag. 
There are not many other covert fields in SSL certificates mutable by vanilla OpenSSL. b) It complicates the scheme, and if not implemented and researched wisely it might also make it fingerprintable. c) We most probably won't be able to tag CA-signed certificates. 6. Discussion 6.1. In section 4.1, why do you truncate the SHA256 output to 12 bytes?! Bridge credentials are frequently propagated by word of mouth or are physically written down, which renders the occult Base64 encoding unsatisfactory. The 104 characters Base32 encoding or the 64 characters hex representation of the SHA256 output would also be too much bloat. By truncating the SHA256 output to 12 bytes and encoding it with Base32, we get 39 characters of readable and easy to transcribe output, and sufficient security. Finally, dividing '39' by the golden ratio gives us about 24.10! 7. Acknowledgements Thanks to Robert Ransom for his great help and suggestions on devising this scheme and writing this proposal!
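The fingerprint computation of section 4.1 is short enough to sketch directly. Read literally (SHA256 over the DER bytes, truncate to 12 bytes, Base32, drop '='), it produces 20 characters. The 39-character length quoted in sections 4.3 and 6.1 instead matches a Base32 encoding of the 24-character hexadecimal text of those 12 bytes, so that variant is shown too. Which of the two the author intended is an assumption this sketch does not settle, and both function names are hypothetical.

```python
import base64
import hashlib


def link_cert_fpr(cert_der: bytes) -> str:
    """Literal reading of section 4.1: Base32 of the 12 truncated bytes."""
    digest = hashlib.sha256(cert_der).digest()[:12]
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def link_cert_fpr_via_hex(cert_der: bytes) -> str:
    """Variant (assumption): Base32 of the hex text of the 12 truncated bytes."""
    hex_text = hashlib.sha256(cert_der).digest()[:12].hex()
    return base64.b32encode(hex_text.encode("ascii")).decode("ascii").rstrip("=")
```

A client pinning a bridge would compute the same value over the certificate presented in the SSL handshake and compare it with the `link_cert_fpr=` option before sending any VERSIONS or AUTHORIZE cell.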
Filename: 192-store-bridge-information.txt Title: Automatically retrieve and store information about bridges Author: Sebastian Hahn Created: 16-Nov-2011 Status: Obsolete Target: 0.2.[45].x Overview: Currently, tor already stores some information about the bridges it is configured to use locally, but doesn't make great use of the stored data. This data is the Tor configuration information about the bridge (IP address, port, and optionally fingerprint) and the bridge descriptor which gets stored along with the other descriptors a Tor client fetches, as well as an "EntryGuard" line in the state file. That line includes the Tor version we used to add the bridge, and a slightly randomized timestamp (up to a month in the past of the real date). The descriptor data also includes some more accurate timestamps about when the descriptor was fetched. The information we give out about bridges via bridgedb currently only includes the IP address and port, because giving out the fingerprint as well might mean that Tor clients make direct connections to the bridge authority, since we didn't design Tor's UpdateBridgesFromAuthority behaviour correctly. Motivation: The only way to let Tor know about a change affecting the bridge (IP address or port change) is to either ask the bridge authority directly, or reconfigure Tor. The former requires making a non-anonymized direct connection to the bridge authority Tonga and asking it for the current descriptor of the bridge with a given fingerprint - this is unsafe and also requires prior knowledge of the fingerprint. The latter requires user intervention, first to learn that there was an update and second to actually teach Tor about the change. 
This is way too complicated for most users, and should be unnecessary while the user has at least one bridge that remains working: Tonga can give out bridge descriptors when asked for the descriptor for a certain fingerprint, and Tor clients learn the fingerprint either from their torrc file or from the first connection they make to a bridge. For some users, however, this option is not what they want: They might use private bridges or have special security concerns, which would make them want to connect to the IP addresses specified in their configuration only, and not tell Tonga about the set of bridges they know about, even through a Tor circuit. Also see https://blog.torproject.org/blog/different-ways-use-bridge for more information about the different types of bridge users. Design: Tor should provide a new configuration option that allows bridge users to indicate that they wish to contact Tonga anonymously and learn about updates for the bridges that they know about, but can't currently reach. Once those updates have been received, the clients would then hold on to the new information in their state file, and use it across restarts for connection attempts. The option UpdateBridgesFromAuthority should be removed or recycled for this purpose, as it is currently dangerous to set (it makes direct connections to the bridge authority, thus leaking that a user is about to use bridges). Recycling the option is probably the better choice, because current users of the option get a surprising and never useful behaviour. On the other hand, users who downgrade their Tors might get the old behaviour by accident. If configured with this option, tor would make an anonymized connection to Tonga to ask for the descriptors of bridges that it cannot currently connect to, once every few hours. Making more frequent requests would likely not help, as bridge information doesn't typically change that frequently, and may overload Tonga. 
This information needs to be stored in the state file:

- An exact copy of the Bridge stanza in the torrc file, so that tor can detect when the bridge is unconfigured/the configuration is changed
- The IP address, port, and fingerprint we last used when making a successful connection to the bridge, if this differs from/supplements the configured data.
- The IP address, port, and fingerprint we learned from the bridge authority, if this differs from both the configured data and the data we used for the last successful connection.

We don't store more data in the state file to avoid leaking too much if the state file falls into the hands of an adversary.

Security implications:

Storing sensitive data on disk is risky when the computer one uses gets into the wrong hands, and state file entries can be used to identify times the user was online. This is already a problem for the Bridge lines in a user's configuration file, but by storing more information about bridges some timings can be deduced.

Another risk is that this allows long-term tracking of users when the set of bridges a user knows about is known to the attacker, and the set is unique. This is not very hard to achieve for bridgedb, as users typically make requests to it non-anonymized and bridgedb can selectively pick bridges to report. By combining the data about descriptor fetches on Tonga and this fingerprint, a usage pattern can be established. Also, bridgedb could give out a made-up fingerprint to a user that requested bridges, thus easily creating a unique set.

Users of private bridges should not set this option, as it will leak the fingerprints of their bridges to Tonga. This is not a huge concern, as Tonga doesn't know about those descriptors, but private bridge users will likely want to avoid leaking the existence of their bridge.
We might want to figure out a way to indicate that a bridge is private on the Bridge line in the configuration, so fetching the descriptor from Tonga is disabled for those automatically. This warrants more discussion to find a solution that doesn't require bridge users to understand the trade-offs of setting a configuration option. One idea is to indicate that a bridge is private by a special flag in its bridge descriptor, so clients can avoid leaking those to the bridge authority automatically. Also, Bridge lines for private bridges shouldn't include the fingerprint so that users don't accidentally leak the fingerprint to the bridge authority before they have talked to the bridge. Specification: No change/addition to the current specification is necessary, as the data that gets stored at clients is not covered by the specification. This document is supposed to serve as a basis for discussion and to provide hints for implementors. Compatibility: Tonga is already set up to send out descriptors requested by clients, so the bridge authority side doesn't need any changes. The new configuration options governing the behaviour of Tor would be incompatible with previous versions, so the torrc needs to be adapted. The state file changes should not affect older versions.
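The three items that the Design section proposes to keep in the state file can be pictured as a small record. The sketch below is purely illustrative: the proposal defines no on-disk syntax, so the type and field names are hypothetical, and only the "store it if it differs" rule comes from the text.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional


class BridgeEndpoint(NamedTuple):
    address: str
    port: int
    fingerprint: Optional[str]  # may be unknown for a configured bridge


@dataclass
class BridgeStateEntry:
    configured_line: str                            # exact copy of the torrc Bridge stanza
    last_working: Optional[BridgeEndpoint] = None   # only if it differs from the config
    from_authority: Optional[BridgeEndpoint] = None # only if it differs from both


def make_state_entry(
    configured_line: str,
    configured: BridgeEndpoint,
    last_working: Optional[BridgeEndpoint],
    from_authority: Optional[BridgeEndpoint],
) -> BridgeStateEntry:
    """Keep the extra endpoints only when they add information."""
    entry = BridgeStateEntry(configured_line)
    if last_working is not None and last_working != configured:
        entry.last_working = last_working
    if from_authority is not None and from_authority not in (configured, last_working):
        entry.from_authority = from_authority
    return entry
```

Storing nothing beyond these fields follows the proposal's stated aim of limiting what an adversary learns from a captured state file.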
Filename: 193-safe-cookie-authentication.txt Title: Safe cookie authentication for Tor controllers Author: Robert Ransom Created: 2012-02-04 Status: Closed Overview: Not long ago, all Tor controllers which automatically attempted 'cookie authentication' were vulnerable to an information-disclosure attack. (See https://bugs.torproject.org/4303 for slightly more information.) Now, some Tor controllers which automatically attempt cookie authentication are only vulnerable to an information-disclosure attack on any 32-byte files they can read. But the Ed25519 signature scheme (among other cryptosystems) has 32-byte secret keys, and we would like to not worry about Tor controllers leaking our secret keys to whatever can listen on what the controller thinks is Tor's control port. Additionally, we would like to not have to remodel Tor's innards and rewrite all of our Tor controllers to use TLS on Tor's control port this week (or deal with the many design issues which that would raise). Design: From af6bf472d59162428a1d7f1d77e6e77bda827414 Mon Sep 17 00:00:00 2001 From: Robert Ransom <rransom.8774@gmail.com> Date: Sun, 5 Feb 2012 04:02:23 -0800 Subject: [PATCH] Add SAFECOOKIE control-port authentication method --- control-spec.txt | 59 ++++++++++++++++++++++++++++++++++++++++++++++------- 1 files changed, 51 insertions(+), 8 deletions(-) diff --git a/control-spec.txt b/control-spec.txt index 66088f7..3651c86 100644 --- a/control-spec.txt +++ b/control-spec.txt @@ -323,11 +323,12 @@ For information on how the implementation securely stores authentication information on disk, see section 5.1. - Before the client has authenticated, no command other than PROTOCOLINFO, - AUTHENTICATE, or QUIT is valid. If the controller sends any other command, - or sends a malformed command, or sends an unsuccessful AUTHENTICATE - command, or sends PROTOCOLINFO more than once, Tor sends an error reply and - closes the connection. 
+ Before the client has authenticated, no command other than + PROTOCOLINFO, AUTHCHALLENGE, AUTHENTICATE, or QUIT is valid. If the + controller sends any other command, or sends a malformed command, or + sends an unsuccessful AUTHENTICATE command, or sends PROTOCOLINFO or + AUTHCHALLENGE more than once, Tor sends an error reply and closes + the connection. To prevent some cross-protocol attacks, the AUTHENTICATE command is still required even if all authentication methods in Tor are disabled. In this @@ -949,6 +950,7 @@ "NULL" / ; No authentication is required "HASHEDPASSWORD" / ; A controller must supply the original password "COOKIE" / ; A controller must supply the contents of a cookie + "SAFECOOKIE" ; A controller must prove knowledge of a cookie AuthCookieFile = QuotedString TorVersion = QuotedString @@ -970,9 +972,9 @@ methods that Tor currently accepts. AuthCookieFile specifies the absolute path and filename of the - authentication cookie that Tor is expecting and is provided iff - the METHODS field contains the method "COOKIE". Controllers MUST handle - escape sequences inside this string. + authentication cookie that Tor is expecting and is provided iff the + METHODS field contains the method "COOKIE" and/or "SAFECOOKIE". + Controllers MUST handle escape sequences inside this string. The VERSION line contains the Tor version. @@ -1033,6 +1035,47 @@ [TAKEOWNERSHIP was added in Tor 0.2.2.28-beta.] +3.24. AUTHCHALLENGE + + The syntax is: + "AUTHCHALLENGE" SP "AUTHMETHOD=SAFECOOKIE" + SP "COOKIEFILE=" AuthCookieFile + SP "CLIENTCHALLENGE=" 2*HEXDIG / QuotedString + CRLF + + The server will reject this command with error code 512, then close + the connection, if Tor is not using the file specified in the + AuthCookieFile argument as a controller authentication cookie file. 
+ + If the server accepts the command, the server reply format is: + "250-AUTHCHALLENGE" + SP "CLIENTRESPONSE=" 64*64HEXDIG + SP "SERVERCHALLENGE=" 2*HEXDIG + CRLF + + The CLIENTCHALLENGE, CLIENTRESPONSE, and SERVERCHALLENGE values are + encoded/decoded in the same way as the argument passed to the + AUTHENTICATE command. + + The CLIENTRESPONSE value is computed as: + HMAC-SHA256(HMAC-SHA256("Tor server-to-controller cookie authenticator", + CookieString) + ClientChallengeString) + (with the HMAC key as its first argument) + + After a controller sends a successful AUTHCHALLENGE command, the + next command sent on the connection must be an AUTHENTICATE command, + and the only authentication string which that AUTHENTICATE command + will accept is: + HMAC-SHA256(HMAC-SHA256("Tor controller-to-server cookie authenticator", + CookieString) + ServerChallengeString) + + [Unlike other commands besides AUTHENTICATE, AUTHCHALLENGE may be + used (but only once!) before AUTHENTICATE.] + + [AUTHCHALLENGE was added in Tor FIXME.] + 4. Replies Reply codes follow the same 3-character format as used by SMTP, with the -- 1.7.8.3 Rationale: The weird inner HMAC was meant to ensure that whatever impersonates Tor's control port cannot even abuse a secret key meant to be used with HMAC-SHA256. Then I added the server-to-controller challenge-response authentication step, to ensure that the server can only use a controller as an HMAC oracle if it already knows the contents of the cookie file. Now, the inner HMAC is just a not-very-efficient way to keep controllers from using the server as an oracle for its own challenges (it could be replaced with a hash function).
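The two nested HMAC computations described in the patch above can be sketched as follows. This is a minimal illustration that follows the proposal text as written here (key strings, argument order, HMAC key as first argument); the function names are hypothetical, and the control-spec that was eventually deployed may differ from this draft.

```python
import hashlib
import hmac

# Key strings exactly as written in the proposal text above.
SERVER_TO_CONTROLLER = b"Tor server-to-controller cookie authenticator"
CONTROLLER_TO_SERVER = b"Tor controller-to-server cookie authenticator"


def nested_hmac(label: bytes, cookie: bytes, challenge: bytes) -> bytes:
    """HMAC-SHA256(HMAC-SHA256(label, cookie), challenge), HMAC key first."""
    inner = hmac.new(label, cookie, hashlib.sha256).digest()
    return hmac.new(inner, challenge, hashlib.sha256).digest()


def expected_client_response(cookie: bytes, client_challenge: bytes) -> bytes:
    """Value the server sends back as CLIENTRESPONSE (hypothetical helper)."""
    return nested_hmac(SERVER_TO_CONTROLLER, cookie, client_challenge)


def authenticate_string(cookie: bytes, server_challenge: bytes) -> bytes:
    """Only string the following AUTHENTICATE command may carry."""
    return nested_hmac(CONTROLLER_TO_SERVER, cookie, server_challenge)


def server_reply_is_valid(cookie: bytes, client_challenge: bytes,
                          client_response: bytes) -> bool:
    """Controller-side check, done before revealing anything about the cookie."""
    expected = expected_client_response(cookie, client_challenge)
    return hmac.compare_digest(expected, client_response)
```

A controller would call `server_reply_is_valid` on the AUTHCHALLENGE reply and send `authenticate_string(...)` (hex-encoded) only if the check passes; the constant-time comparison is a design choice of this sketch, not something the proposal mandates.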
Filename: 194-mnemonic-urls.txt Title: Mnemonic .onion URLs Author: Sai, Alex Fink Created: 29-Feb-2012 Status: Superseded 1. Overview Currently, canonical Tor .onion URLs consist of a naked 80-bit hash[1]. This is not something that users can even recognize for validity, let alone produce directly. It is vulnerable to partial-match fuzzing attacks[2], where a would-be MITM attacker generates a very similar hash and uses various social engineering, wiki poisoning, or other methods to trick the user into visiting the spoof site. This proposal gives an alternative method for displaying and entering .onion and other URLs, such that they will be easily remembered and generated by end users, and easily published by hidden service websites, without any dependency on a full domain name type system like e.g. namecoin[3]. This makes it easier to implement (requiring only a change in the proxy). This proposal could equally be used for IPv4, IPv6, etc, if normal DNS is for some reason untrusted. This is not a petname system[4], in that it does not allow service providers or users[5] to associate a name of their choosing to an address[6]. Rather, it is a mnemonic system that encodes the 80 bit .onion address into a meaningful[7] and memorable sentence. A full petname system (based on registration of some kind, and allowing for shorter, service-chosen URLs) can be implemented in parallel[8]. This system has the three properties of being secure, distributed, and human-meaningful — it just doesn't also have choice of name (except of course by brute force creation of multiple keys to see if one has an encoding the operator likes). This is inspired by Jonathan Ackerman's "Four Little Words" proposal[9] for doing the same thing with IPv4 addresses. We just need to handle 80+ bits, not just 32 bits. 
It is similar to Markus Jakobsson & Ruj Akavipat's FastWord system[10], except that it does not permit user choice of passphrase, does not know what URL a user will enter (vs verifying against a single stored password), and again has to encode significantly more data. This is also similar to RFC1751[11], RFC2289[12], and multiple other fingerprint encoding systems[13] (e.g. PGPfone[14] using the PGP wordlist[15], and Arturo Filatsò's OnionURL[16]), but we aim to make something that's as easy as possible for users to remember — and significantly easier than just a list of words or pseudowords, which we consider only useful as an active confirmation tool, not as something that can be fully memorized and recalled, like a normal domain name. 2. Requirements 2.1. encodes at least 80 bits of random data (preferably more, eg for a checksum) 2.2. valid, visualizable English sentence — not just a series of words[17] 2.3. words are common enough that non-native speakers and bad spellers will have minimum difficulty remembering and producing (perhaps with some spellcheck help) 2.4. not syntactically confusable (e.g. order should not matter) 2.5. short enough to be easily memorized and fully recalled at will, not just recognized 2.6. no dependency on an external service 2.7. dictionary size small enough to be reasonable for end users to download as part of the onion package 2.8. consistent across users (so that websites can e.g. reinforce their random hash's phrase with a clever drawing) 2.9. not create offensive sentences that service providers will reject 2.10. resistant against semantic fuzzing (e.g. by having uniqueness against WordNet synsets[18]) 3. Possible implementations This section is intentionally left unfinished; full listing of template sentences and the details of their parser and generating implementation is co-dependent on the creation of word class dictionaries fulfilling the above criteria. 
Since that's fairly labor-intensive, we're pausing at this stage for input first, to avoid wasting work. 3.1. Have a fixed number of template sentences, such as: 1. Adj subj adv vtrans adj obj 2. Subj and subj vtrans adj obj 3. … etc For a 6 word sentence, with 8 (3b) templates, we need ~12b (4k word) dictionaries for each word category. If multiple words of the same category are used, they must either play different grammatical roles (eg subj vs obj, or adj on a different item), be chosen from different dictionaries, or there needs to be an order-agnostic way to join them at the bit level. Preferably this should be avoided, just to prevent users forgetting the order. 3.2. As 3.1, but treat sentence generation as decoding a prefix code, and have a Huffman code for each word class. We suppose it’s okay if the generated sentence has a few more words than it might, as long as they’re common lean words. E.g., for adjectives, "good" might cost only six bits while "unfortunate" costs twelve. Choice between different sentence syntaxes could be worked into the prefix code as well, and potentially done separately for each syntactic constituent. 4. Usage To form mnemonic .onion URL, just join the words with dashes or underscores, stripping minimal words like 'a', 'the', 'and' etc., and append '.onion'. This can be readily distinguished from standard hash-style .onion URLs by form. Translation should take place at the client — though hidden service servers should also be able to output the mnemonic form of hashes too, to assist website operators in publishing them (e.g. by posting an amusing drawing of the described situation on their website to reinforce the mnemonic). After the translation stage of name resolution, everything proceeds as normal for an 80-bit hash onion URL. The user should be notified of the mnemonic form of hash URL in some way, and have an easy way in the client UI to translate mnemonics to hashes and vice versa. 
For the purposes of browser URLs and the like though, the mnemonic should be treated on par with the hash; if the user enters a mnemonic URL they should not be redirected to the hash version. (If anything, the opposite may be true, so that users become used to seeing and verifying the mnemonic version of hash URLs, and gain the security benefits against partial-match fuzzing.) Ideally, inputs that don't validly resolve should have a response page served by the proxy that uses a simple spell-check system to suggest alternate domain names that are valid hash encodings. This could hypothetically be done inline in URL input, but would require changes to the browser (normally domain names aren't subject to spellcheck), and this avoids that implementation problem. 5. International support It is not possible for this scheme to support non-English languages without a) (usually) Unicode in domains (which is not yet well supported by browsers), and b) fully customized dictionaries and phrase patterns per language. The scheme must not be used in an attempted 'translation' by simply replacing English words with glosses in the target language. Several of the necessary features would be completely mangled by this (e.g. other languages have different synonym, homonym, etc groupings, not to mention completely different grammar). It is unlikely a priori that URLs constructed using a non-English dictionary/pattern setup would in any sense 'translate' semantically to English; more likely is that each language would have completely unrelated encodings for a given hash. We intend to only make an English version at first, to avoid these issues during testing.
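The URL-forming rule from section 4 and the bit budget from section 3.1 can be sketched as below. This is only an illustration: the function names are hypothetical, the set of "minimal words" contains just the three the text names (the "etc." is left unspecified), and lower-casing and punctuation stripping are assumptions of this sketch.

```python
import math

# Only the minimal words named in section 4; the full list is unspecified.
MINIMAL_WORDS = {"a", "the", "and"}


def mnemonic_url(sentence: str, sep: str = "-") -> str:
    """Join the sentence's words with `sep`, drop minimal words, add .onion."""
    words = [w.strip(".,;:!?").lower() for w in sentence.split()]
    kept = [w for w in words if w and w not in MINIMAL_WORDS]
    return sep.join(kept) + ".onion"


def sentence_bits(templates: int, words: int, dictionary_size: int) -> float:
    """Bits carried by a template choice plus `words` dictionary choices."""
    return math.log2(templates) + words * math.log2(dictionary_size)


print(mnemonic_url("Barnaby thoughtfully mangles simplistic yellow camels"))
print(sentence_bits(8, 6, 4096))  # figures from section 3.1
print(sentence_bits(8, 6, 8192))
```

With the figures quoted in section 3.1 (8 templates, 6 words, 4096-word dictionaries) the budget comes to 3 + 6 × 12 = 75 bits, so covering the full 80-bit hash needs somewhat larger dictionaries (13 bits per word gives 81) or an extra word.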
________________
[1] https://trac.torproject.org/projects/tor/wiki/doc/HiddenServiceNames https://gitweb.torproject.org/torspec.git/blob/HEAD:/address-spec.txt
[2] http://www.thc.org/papers/ffp.html
[3] http://dot-bit.org/Namecoin
[4] https://en.wikipedia.org/wiki/Zooko's_triangle
[5] https://addons.mozilla.org/en-US/firefox/addon/petname-tool/
[6] However, service operators can generate a large number of hidden service descriptors and check whether their hashes result in a desirable phrasal encoding (much like certain hidden services currently use brute force generated hashes to ensure their name is the prefix of their raw hash). This won't get you whatever phrase you want, but will at least improve the likelihood that it's something amusing and acceptable.
[7] "Meaningful" here inasmuch as e.g. "Barnaby thoughtfully mangles simplistic yellow camels" is an absurdist but meaningful sentence. Absurdness is a feature, not a bug; it decreases the probability of mistakes if the scenario described is not one that the user would try to fit into a template of things they have previously encountered IRL. See research into linguistic schema for further details.
[8] https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt
[9] http://blog.rabidgremlin.com/2010/11/28/4-little-words/
[10] http://fastword.me/
[11] https://tools.ietf.org/html/rfc1751
[12] http://tools.ietf.org/html/rfc2289
[13] https://github.com/singpolyma/mnemonicode http://mysteryrobot.com https://github.com/zacharyvoase/humanhash
[14] http://www.mathcs.duq.edu/~juola/papers.d/icslp96.pdf
[15] http://en.wikipedia.org/wiki/PGP_word_list
[16] https://github.com/hellais/Onion-url https://github.com/hellais/Onion-url/blob/master/dev/mnemonic.py
[17] http://www.reddit.com/r/technology/comments/ecllk
[18] http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html
Filename: 195-TLS-normalization-for-024.txt Title: TLS certificate normalization for Tor 0.2.4.x Author: Jacob Appelbaum, Gladys Shufflebottom, Nick Mathewson, Tim Wilde Created: 6-Mar-2012 Status: Dead Target: 0.2.4.x 0. Introduction The TLS (Transport Layer Security) protocol was designed for security and extensibility, not for uniformity. Because of this, it's not hard for an attacker to tell one application's use of TLS from another's. We propose improvements to Tor's current TLS certificates to reduce the distinguishability of Tor traffic. 0.1. History This draft is based on parts of Proposal 179, by Jacob Appelbaum and Gladys Shufflebottom, but removes some already implemented parts and replaces others. 0.2. Non-Goals We do not address making TLS harder to distinguish after the handshake is done. We also do not discuss TLS improvements not related to distinguishability (such as increased key size, algorithm choice, and so on). 1. Certificate Issues Currently, Tor generates certificates according to a fixed pattern, where lifetime is fairly small, the certificate Subject DN is a single randomly generated CN, and the certificate Issuer DN is a different single randomly generated CN. We propose several ways to improve this below. 1.1. Separate initial certificate from link certificate When Tor is using the v2 or v3 link handshake (see tor-spec.txt), it currently presents an initial handshake authenticating the link key with the identity key. We propose instead that Tor should be able to present an arbitrary initial certificate (so long as its key matches the link key used in the actual TLS handshake), and then present the real certificate authenticating the link key during the Tor handshake. (That is, during the v2 handshake's renegotiation step, or in the v3 handshake's CERTS cell.) The TLS protocol and the Tor handshake protocol both allow this, and doing so will give us more freedom for the alternative certificate presentation ideas below. 1.2.
Allow externally generated certificates It should be possible for a Tor relay operator to generate and provide their own certificate and secret key. This will allow a relay or bridge operator to use a certificate signed by any member of the "SSL mafia,"[*] to generate their own self-signed certificate, and so on. For compatibility, we need to require that the key be an RSA secret key, of at least 1024 bits, generated with e=65537. As a proposed interface, let's require that the certificate be stored in ${DataDir}/tls_cert/tls_certificate.crt , that the secret key be stored in ${DataDir}/tls_cert/private_tls_key.key , and that they be used instead of generating our own certificate whenever the new boolean option "ProvidedTLSCert" is set to true. (Alternative interface: Allow the cert and key cert to be stored wherever, and have the user provide their respective locations with TLSCertificateFile and TLSCertificateKeyFile options.) 1.3. Longer certificate lifetimes Tor's current certificates aren't long-lived, which makes them different from most other certificates in the wild. Typically, certificates are valid for a year, so let's use that as our default lifetime. [TODO: investigate whether "a year" for most CAs and self-signed certs have their validity dates running for a calendar year ending at the second of issue, one calendar year ending at midnight, or 86400*(365.5 +/- .5) seconds, or what.] There are two ways to approach this. We could continue our current certificate management approach where we frequently generate new certificates (albeit with longer lifetimes), or we could make a cert, store it to disk, and use it for all or most of its declared lifetime. 
If we continue to use fairly short lifetimes for the _true_ link certificates (the ones presented during the Tor handshake), then presenting long-lived certificates doesn't hurt us much: in the event of a link-key-only compromise, the adversary still couldn't actually impersonate a server for long.[**] Using shorter-lived certificates with long nominal lifetimes doesn't seem to buy us much. It would let us rotate link keys more frequently, but we're already getting forward secrecy from our use of Diffie-Hellman key agreement. Further, it would make our behavior look less like regular TLS behavior, where certificates are typically used for most of their nominal lifetime. Therefore, let's store and use certs and link keys for the full year. 1.4. Self-signed certificates with better DNs When we generate our own certificates, we currently set no DN fields other than the commonName. This behavior isn't terribly common: users of self-signed certs usually/often set other fields too. [TODO: find out frequency.] Unfortunately, it appears that no particular other set of fields or way of filling them out _is_ universal for self-signed certificates, or even particularly common. The most common schemas seem to be for things most censors wouldn't mind blocking, like embedded devices. Even the default openssl schema, though common, doesn't appear to represent a terribly large fraction of self-signed websites. [TODO: get numbers here.] So the best we can do here is probably to reproduce the process that results in self-signed certificates originally: let the bridge and relay operators pick the DN fields themselves. This is an annoying interface issue, and wants a better solution. 1.5. Better commonName values Our current certificates set the commonName to a randomly generated field like www.rmf4h4h.net. This is also a weird behavior: nearly all TLS certs used for web purposes will have a hostname that resolves to their IP.
The simplest way to get a plausible commonName here would be to do a reverse lookup on our IP and try to find a good hostname. It's not clear whether this would actually work out in practice, or whether we'd just get dynamic-IP-pool hostnames everywhere blocked when they appear in certificates. Alternatively, if we are told a hostname in our Torrc (possibly in the Address field), we could try to use that. 2. TLS handshake issues 2.1. Session ID. Currently we do not send an SSL session ID, as we do not support session resumption. However, Apache (and likely other major SSL servers) do have this support, and do send a 32 byte SSLv3/TLSv1 session ID in their Server Hello cleartext. We should do the same to avoid an easy fingerprinting opportunity. It may be necessary to lie to OpenSSL to claim that we are tracking session IDs to cause it to generate them for us. (We should not actually support session resumption.) [*] "Hey buddy, it's a nice website you've got there. Sure would be a shame if somebody started poppin' up warnings on all your user's browsers, tellin' everbody that you're _insecure_..." [**] Furthermore, a link-key-only compromise isn't very realistic atm; nearly any attack that would let an adversary learn a link key would probably let the adversary learn the identity key too. The most plausible way would probably be an implementation bug in OpenSSL or something.
Filename: 196-transport-control-ports.txt Title: Extended ORPort and TransportControlPort Author: George Kadianakis, Nick Mathewson Created: 14 Mar 2012 Status: Closed Implemented-In: 0.2.5.2-alpha 1. Overview Proposal 180 defined Tor pluggable transports, a way to decouple protocol-level obfuscation from the core Tor protocol in order to better resist client-bridge censorship. This is achieved by introducing pluggable transport proxies, programs that obfuscate Tor traffic to resist DPI detection. Proposal 180 defined a way for pluggable transport proxies to communicate with local Tor clients and bridges, so as to exchange traffic. This document extends this communication protocol, so that pluggable transport proxies can exchange arbitrary operational information and metadata with Tor clients and bridges. 2. Motivation The communication protocol specified in Proposal 180 gives a way for transport proxies to announce the IP address of their clients to tor. Still, modern pluggable transports might have more (?) needs than this. For example: 1. Tor might want to inform pluggable transport proxies on how to rate-limit incoming or outgoing connections. 2. Server pluggable transport proxies might want to pass client information to an anti-active-probing system controlled by tor. 3. Tor might want to temporarily stop a transport proxy from obfuscating traffic. To satisfy the above use cases, there must be real-time communication between the tor process and the pluggable transport proxy. To achieve this, this proposal refactors the Extended ORPort protocol specified in Proposal 180, and introduces a new port, TransportControlPort, whose sole role is the exchange of control information between transport proxies and tor. Specifically, transport proxies deliver each connection to the "Extended ORPort", where they provide metadata and agree on an identifier for each tunneled connection. Once this handshake occurs, the OR protocol proceeds unchanged.
Additionally, each transport maintains a single connection to Tor's "TransportControlPort", where it receives instructions from Tor about rate-limiting on individual connections. 3. The new port protocols 3.1. The new extended ORPort protocol 3.1.1. Protocol The extended server port protocol is as follows: COMMAND [2 bytes, big-endian] BODYLEN [2 bytes, big-endian] BODY [BODYLEN bytes] Commands sent from the transport proxy to the bridge are: [0x0000] DONE: There is no more information to give. The next bytes sent by the transport will be those tunneled over it. (body ignored) [0x0001] USERADDR: an address:port string that represents the client's address. [0x0002] TRANSPORT: a string of the name of the pluggable transport currently in effect on the connection. Replies sent from tor to the proxy are: [0x1000] OKAY: Send the user's traffic. (body ignored) [0x1001] DENY: Tor would prefer not to get more traffic from this address for a while. (body ignored) [0x1002] CONTROL: a NUL-terminated "identifier" string. The pluggable transport proxy must use the "identifier" to access the TransportControlPort. See the 'Association and identifier creation' section below. Parties MUST ignore command codes that they do not understand. If the server receives a recognized command that does not parse, it MUST close the connection to the client. 3.1.2. Command descriptions 3.1.2.1. USERADDR An ASCII string holding the TCP/IP address of the client of the pluggable transport proxy. A Tor bridge SHOULD use that address to collect statistics about its clients. Recognized formats are: 1.2.3.4:5678 [1:2::3:4]:5678 (Current Tor versions may accept other formats, but this is a bug: transports MUST NOT send them.) The string MUST NOT be NUL-terminated. 3.1.2.2. TRANSPORT An ASCII string holding the name of the pluggable transport used by the client of the pluggable transport proxy.
A Tor bridge that supports multiple transports SHOULD use that information to collect statistics about the popularity of individual pluggable transports. The string MUST NOT be NUL-terminated. Pluggable transport names are C-identifiers and Tor MUST check them for correctness. 3.2. The new TransportControlPort protocol The TransportControlPort protocol is as follows: CONNECTIONID [16 bytes, big-endian] COMMAND [2 bytes, big-endian] BODYLEN [2 bytes, big-endian] BODY [BODYLEN bytes] Commands sent from the transport proxy to the bridge: [0x0001] RATE_LIMITED: Message confirming that the rate limiting request of the bridge was carried out successfully (body ignored). See the 'Rate Limiting' section below. [0x0002] NOT_RATE_LIMITED: Message notifying that the transport proxy failed to carry out the rate limiting request of the bridge (body ignored). See the 'Rate Limiting' section below. Configuration commands sent from the bridge to the transport proxy are: [0x1001] NOT_ALLOWED: Message notifying that the CONNECTIONID could not be matched with an authorized connection ID. The bridge SHOULD shut down the connection. [0x1001] RATE_LIMIT: Carries information on how the pluggable transport proxy should rate-limit its traffic. See the 'Rate Limiting' section below. CONNECTIONID should carry the connection identifier described in the 'Association and identifier creation' section. Parties should ignore command codes that they do not understand. 3.3. Association and identifier creation For Tor and a transport proxy to communicate using the TransportControlPort, an identifier must have already been negotiated using the 'CONTROL' command of Extended ORPort. The TransportControlPort identifier should not be predictable by a user who hasn't received a 'CONTROL' command from the Extended ORPort. For this reason, the TransportControlPort identifier should not be cryptographically weak or deterministically created. Tor MUST create its identifiers by generating 16 bytes of random data.
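The Extended ORPort framing from section 3.1.1 (2-byte big-endian COMMAND, 2-byte big-endian BODYLEN, then BODY) can be sketched as below. The command codes are the ones listed in that section; the helper names are hypothetical, and the transport name used in the usage note is only an illustrative value.

```python
import struct

# Command codes from section 3.1.1.
DONE, USERADDR, TRANSPORT = 0x0000, 0x0001, 0x0002   # proxy -> bridge
OKAY, DENY, CONTROL = 0x1000, 0x1001, 0x1002         # tor -> proxy

HEADER = struct.Struct(">HH")  # COMMAND, BODYLEN: 2 bytes each, big-endian


def pack_message(command: int, body: bytes = b"") -> bytes:
    """Serialize one Extended ORPort message."""
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a 2-byte BODYLEN")
    return HEADER.pack(command, len(body)) + body


def unpack_message(buf: bytes):
    """Return (command, body, remaining_bytes), or None if buf is incomplete."""
    if len(buf) < HEADER.size:
        return None
    command, bodylen = HEADER.unpack_from(buf)
    end = HEADER.size + bodylen
    if len(buf) < end:
        return None
    return command, buf[HEADER.size:end], buf[end:]


def proxy_handshake(client_addr: str, transport_name: str) -> bytes:
    """Bytes a transport proxy would send before the tunneled traffic."""
    return (pack_message(USERADDR, client_addr.encode("ascii"))
            + pack_message(TRANSPORT, transport_name.encode("ascii"))
            + pack_message(DONE))
```

For example, `proxy_handshake("1.2.3.4:5678", "obfs2")` yields three messages back to back, with neither string NUL-terminated; a receiver can call `unpack_message` in a loop and skip command codes it does not understand.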
4. Configuration commands 4.1. Rate Limiting A Tor relay should be able to inform a transport proxy in real-time about its rate-limiting needs. This can be achieved by using the TransportControlPort and sending a 'RATE_LIMIT' command to the transport proxy. The body of the 'RATE_LIMIT' command should contain two integers, 4 bytes each, in big-endian format. The two numbers should represent the bandwidth rate and bandwidth burst respectively in 'bytes per second' which the transport proxy must set as its overall rate-limiting setting. When the transport proxy sets the appropriate rate limiting, it should send back a 'RATE_LIMITED' command. If it fails while setting up rate limiting, it should send back a 'NOT_RATE_LIMITED' command. After sending a 'RATE_LIMIT' command, the tor bridge MAY want to stop pushing data to the transport proxy until it receives a 'RATE_LIMITED' command. If, instead, it receives a 'NOT_RATE_LIMITED' command it MAY want to shut down its connections to the transport proxy. 5. Authentication To defend against cross-protocol attacks on the Extended ORPort, proposal 213 defines an authentication scheme that should be used to protect it. If the Extended ORPort is enabled, Tor should regenerate the cookie file of proposal 213 on startup and store it in $DataDirectory/extended_orport_auth_cookie. The location of the cookie can be overridden by using the configuration file parameter ExtORPortCookieAuthFile, which is defined as: ExtORPortCookieAuthFile <path> where <path> is a filesystem path. XXX should we also add an ExtORPortCookieFileGroupReadable torrc option? 6. Security Considerations Extended ORPort or TransportControlPort do _not_ provide link confidentiality, authentication or integrity. Sensitive data, like cryptographic material, should not be transferred through them. An attacker with superuser access is able to sniff network traffic, and capture TransportControlPort identifiers and any data passed through those ports.
Tor SHOULD issue a warning if the bridge operator tries to bind Extended ORPort or TransportControlPort to a non-localhost address. Pluggable transport proxies SHOULD issue a warning if they are instructed to connect to a non-localhost Extended ORPort or TransportControlPort. 7. Future In the future, we might have pluggable transports which require the _client_ transport proxy to use the TransportControlPort and exchange control information with the Tor client. The current proposal doesn't yet support this, but we should not add functionality that will prevent it from being possible.
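As an illustration of sections 3.2, 3.3 and 4.1, the sketch below builds a TransportControlPort 'RATE_LIMIT' message: a 16-byte CONNECTIONID, the 2-byte COMMAND and BODYLEN fields, and a body of two 4-byte big-endian integers (rate, then burst). The helper names are hypothetical. The command value 0x1001 is the one section 3.2 lists for RATE_LIMIT; that section lists the same value for NOT_ALLOWED, and this sketch does not try to resolve that.

```python
import os
import struct

RATE_LIMITED, NOT_RATE_LIMITED = 0x0001, 0x0002  # proxy -> bridge
RATE_LIMIT = 0x1001                              # bridge -> proxy, as listed in 3.2

CONNECTION_ID_LEN = 16


def new_connection_id() -> bytes:
    """16 bytes of random data, as section 3.3 requires of Tor."""
    return os.urandom(CONNECTION_ID_LEN)


def pack_control(connection_id: bytes, command: int, body: bytes = b"") -> bytes:
    """CONNECTIONID + COMMAND + BODYLEN + BODY."""
    if len(connection_id) != CONNECTION_ID_LEN:
        raise ValueError("CONNECTIONID must be exactly 16 bytes")
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a 2-byte BODYLEN")
    return connection_id + struct.pack(">HH", command, len(body)) + body


def rate_limit_message(connection_id: bytes, rate: int, burst: int) -> bytes:
    """RATE_LIMIT with bandwidth rate and burst as 4-byte big-endian integers."""
    return pack_control(connection_id, RATE_LIMIT, struct.pack(">II", rate, burst))


def parse_rate_limit_body(body: bytes):
    """Return (rate, burst) from a RATE_LIMIT body."""
    if len(body) != 8:
        raise ValueError("RATE_LIMIT body must be two 4-byte integers")
    return struct.unpack(">II", body)
```

A proxy that applies the limits would answer with `pack_control(connection_id, RATE_LIMITED)`, and with `NOT_RATE_LIMITED` otherwise. Note that 4-byte fields cap both values at 2^32 - 1.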
Filename: 197-postmessage-ipc.txt Title: Message-based Inter-Controller IPC Channel Author: Mike Perry Created: 16-03-2012 Status: REJECTED Overview This proposal seeks to create a means for inter-controller communication using the Tor Control Port. Motivation With the advent of pluggable transports, bridge discovery mechanisms, and tighter browser-Vidalia integration, we're going to have an increasing number of collaborating Tor controller programs communicating with each other. Rather than define new pairwise IPC mechanisms for each case, we will instead create a generalized message-passing mechanism through the Tor Control Port. Control Protocol Specification Changes CONTROLLERNAME command Sent from the client to the server. The syntax is: "CONTROLLERNAME" SP ControllerID ControllerID = 1*(ALNUM / "_") Server returns "250 OK" and records the ControllerID to use for this control port connection for messaging information if successful, or "553 Controller name already in use" if the name is in use by another controller, or if an attempt is made to register the special names "all" or "unset". [CONTROLLERNAME need not be issued to send POSTMESSAGE commands, and CONTROLLERNAME may be unsupported by initial POSTMESSAGE implementations in Tor.] POSTMESSAGE command Sent from the client to the server. The syntax is: "POSTMESSAGE" SP "@" DestControllerID SP LineItem CRLF DestControllerID = "all" / 1*(ALNUM / "_") If DestControllerID is "all", the message will be posted to all controllers that have "SETEVENTS POSTMESSAGE" set. Otherwise, the message should be posted to the controller with the appropriate ControllerID. Server returns "250 OK" if successful, or "552 Invalid destination controller name" if the name is not registered. 
[Initial implementations may require DestControllerID always be "all"] POSTMESSAGE event "650" SP "POSTMESSAGE" SP MessageID SP SourceControllerID SP "@" DestControllerID SP LineItem CRLF MessageID = 1*DIGIT SourceControllerID = "unset" / 1*(ALNUM / "_") DestControllerID = "all" / 1*(ALNUM / "_") MessageID is an incrementing integer identifier that uniquely identifies this message to all controllers. The SourceControllerID is the value from the sending controller's CONTROLLERNAME command, or "unset" if the CONTROLLERNAME command was not used or unimplemented. GETINFO commands "recent-messages" -- Retrieves messages sent to ControllerIDs that match the current controller in POSTMESSAGE event format. This list should be generated on the fly, to handle disconnecting controllers. "new-messages" -- Retrieves the last 10 "unread" messages sent to this controller, in POSTMESSAGE event format. If SETEVENTS POSTMESSAGE was set, this command should always return nothing. "list-controllers" -- Retrieves a list of all connected controllers with either their registered ControllerID or "unset". Implementation plan The POSTMESSAGE protocol is designed to be incrementally deployable. Initial implementations are only expected to implement broadcast capabilities and SETEVENTS based delivery. CONTROLLERNAME need not be supported, nor do non-"@all" POSTMESSAGE destinations. To support command-based controllers (which do not use SETEVENTS) such as Torbutton, at minimum the "GETINFO recent-messages" command is needed. However, Torbutton does not have immediate need for this protocol.
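The POSTMESSAGE command and event grammars above can be exercised with a small sketch like the following. The function names are hypothetical, and because the proposal does not define LineItem here, the sketch simply treats it as the rest of the line without CR or LF; that is an assumption.

```python
import re

# ControllerID = 1*(ALNUM / "_"); "all" and "unset" also match this pattern.
CONTROLLER_ID = r"[A-Za-z0-9_]+"

EVENT_RE = re.compile(
    r"^650 POSTMESSAGE (?P<msgid>[0-9]+) "
    r"(?P<src>" + CONTROLLER_ID + r") "
    r"@(?P<dst>" + CONTROLLER_ID + r") "
    r"(?P<body>.*)$"
)


def postmessage_command(dest: str, text: str) -> str:
    """Build a POSTMESSAGE command line for the control port."""
    if not re.fullmatch(CONTROLLER_ID, dest):
        raise ValueError("invalid DestControllerID")
    if "\r" in text or "\n" in text:
        raise ValueError("message must fit on one line")
    return f"POSTMESSAGE @{dest} {text}\r\n"


def parse_postmessage_event(line: str):
    """Parse a 650 POSTMESSAGE event; return a dict, or None if it does not match."""
    match = EVENT_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return {
        "id": int(match["msgid"]),
        "source": match["src"],
        "dest": match["dst"],
        "body": match["body"],
    }
```

A broadcast-only controller, as in the initial implementation plan, would always pass `"all"` as the destination and would see `"unset"` as the source of messages from controllers that never issued CONTROLLERNAME.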
Filename: 198-restore-clienthello-semantics.txt Title: Restore semantics of TLS ClientHello Author: Nick Mathewson Created: 19-Mar-2012 Status: Closed Target: 0.2.4.x Status: Tor 0.2.3.17-beta implements the client-side changes, and no longer advertises openssl-supported TLS ciphersuites we don't have. Overview: Currently, all supported Tor versions try to imitate an older version of Firefox when advertising ciphers in their TLS ClientHello. This feature is intended to make it harder for a censor to distinguish a Tor client from other TLS traffic. Unfortunately, it makes the contents of the ClientHello unreliable: a server cannot conclude that a cipher is really supported by a Tor client simply because it is advertised in the ClientHello. This proposal suggests an approach for restoring sanity to our use of ClientHello, so that we still avoid ciphersuite-based fingerprinting, but allow nodes to negotiate better ciphersuites than they are allowed to negotiate today. Background reading: Section 2 of tor-spec.txt describes our current baroque link negotiation scheme. Proposals 176 and 184 describe more information about how it got that way. Bug 4744 is a big part of the motivation for this proposal: we want to allow Tors to advertise even more ciphers, some of which we would actually prefer to the ones we are using now. What you need to know about the TLS handshake is that the client sends a list of all the ciphersuites that it supports in its ClientHello message, and then the server chooses one and tells the client which one it picked. Motivation and constraints: We'd like to use some of the ECDHE TLS ciphersuites, since they allow us to get better forward-secrecy at lower cost than our current DH-1024 usage. But right now, we can't ever use them, since Tor will advertise them whether or not it has a version of OpenSSL that supports them. (OpenSSL before 1.0.0 did not support ECDHE ciphersuites; OpenSSL before 1.0.0e or so had some security issues with them.)
We cannot have the rule be "Tors must only advertise ciphersuites that they can use", since current Tors will advertise such ciphersuites anyway. We cannot have the rule be "Tors must support every ECDHE ciphersuite on the following list", since current Tors don't do all that, and since one prominent Linux distribution builds OpenSSL without ECC support because of patent/freedom fears. Fortunately, nearly every ciphersuite that we would like to advertise to imitate FF8 (see bug 4744) is currently supported by OpenSSL 1.0.0 and later. This enables the following proposal to work: Proposed spec changes: I propose that the rules for handling ciphersuites at the server side become the following: If the ciphersuites in the ClientHello contain no ciphers other than the following[*], they indicate that the Tor v1 link protocol is in use. TLS_DHE_RSA_WITH_AES_256_CBC_SHA TLS_DHE_RSA_WITH_AES_128_CBC_SHA SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA If the advertised ciphersuites in the ClientHello are _exactly_[*] the following, they indicate that the Tor v2+ link protocol is in use, AND that the ClientHello may have unsupported ciphers. In this case, the server may choose DHE_RSA_WITH_AES_128_CBC_SHA or DHE_RSA_WITH_AES_256_SHA, but may not choose any other cipher.
   TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
   TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA
   TLS1_DHE_RSA_WITH_AES_256_SHA
   TLS1_DHE_DSS_WITH_AES_256_SHA
   TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA
   TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA
   TLS1_RSA_WITH_AES_256_SHA
   TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA
   TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
   TLS1_ECDHE_RSA_WITH_RC4_128_SHA
   TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA
   TLS1_DHE_RSA_WITH_AES_128_SHA
   TLS1_DHE_DSS_WITH_AES_128_SHA
   TLS1_ECDH_RSA_WITH_RC4_128_SHA
   TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA
   TLS1_ECDH_ECDSA_WITH_RC4_128_SHA
   TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA
   SSL3_RSA_RC4_128_MD5
   SSL3_RSA_RC4_128_SHA
   TLS1_RSA_WITH_AES_128_SHA
   TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA
   TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA
   SSL3_EDH_RSA_DES_192_CBC3_SHA
   SSL3_EDH_DSS_DES_192_CBC3_SHA
   TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA
   TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA
   SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA
   SSL3_RSA_DES_192_CBC3_SHA

[*] The "extended renegotiation is supported" ciphersuite, 0x00ff, is not counted when checking the list of ciphersuites.

Otherwise, the ClientHello has these semantics:

The inclusion of any cipher supported by OpenSSL 1.0.0 means that the client supports it, with the exception of SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA which is never supported.

Clients MUST advertise support for at least one of TLS_DHE_RSA_WITH_AES_256_CBC_SHA or TLS_DHE_RSA_WITH_AES_128_CBC_SHA.

The server MUST choose a ciphersuite with ephemeral keys for forward secrecy; MUST NOT choose a weak or null ciphersuite; and SHOULD NOT choose any cipher other than AES or 3DES.

Discussion and consequences:

Currently, OpenSSL 1.0.0 (in its default configuration) supports every cipher that we would need in order to give the same list as Firefox versions 8 through 11 give in their default configuration, with the exception of the FIPS ciphersuite above. Therefore, we will be able to fake the new ciphersuite list correctly in all of our bundles that include OpenSSL, and on every version of Unix that keeps up-to-date.
However, versions of Tor compiled to use older versions of OpenSSL, or versions of OpenSSL with some ciphersuites disabled, will no longer give the same ciphersuite lists as other versions of Tor. On these platforms, Tor clients will no longer impersonate Firefox. Users who need to do so will have to download one of our bundles, or use a non-system OpenSSL. The proposed spec change above tries to future-proof ourselves by not declaring that we support every declared cipher, in case we someday need to handle a new Firefox version. If a new Firefox version comes out that uses ciphers not supported by OpenSSL 1.0.0, we will need to define whether clients may advertise its ciphers without supporting them; but existing servers will continue working whether we decide yes or no. The restriction to "servers SHOULD only pick AES or 3DES" is meant to reflect our current behavior, not to represent a permanent refusal to support other ciphers. We can revisit it later as appropriate, if for some bizarre reason Camellia or Seed or Aria becomes a better bet than AES. Open questions: Should the client drop connections if the server chooses a bad cipher, or a suite without forward secrecy? Can we get OpenSSL to support the dubious FIPS suite excluded above, in order to remove a distinguishing opportunity? It is not so simple as just editing the SSL_CIPHER list in s3_lib.c, since the nonstandard SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA cipher is (IIUC) defined to use the TLS1 KDF, while declaring itself to be an SSL cipher (!). Can we do anything to eventually allow the IE7+[**] cipher list as well? IE does not support TLS_DHE_RSA_WITH_AES_{256,128}_SHA or SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, and so wouldn't work with current Tor servers, which _only_ support those. It looks like the only forward-secure ciphersuites that IE7+ *does* support are ECDHE ones, and DHE+DSS ones. 
So if we want this flexibility, we could mandate server-side ECDHE, or somehow get DHE+DSS support (which would play havoc with our current certificate generation code IIUC), or say that it is sometimes acceptable to have a non-forward-secure link protocol[***]. None of these answers seems like a great one. Is one best? Are there other options? [**] Actually, I think it's the Windows SChannel cipher list we should be looking at here. [***] If we did _that_, we'd want to specify that CREATE_FAST could never be used on a non-forward-secure link. Even so, I don't like the implications of leaking cell types and circuit IDs to a future compromise.
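The three server-side cases in "Proposed spec changes" can be sketched as a small classifier. This is only an illustration, not Tor's implementation: the names HelloKind and classify_client_hello are hypothetical, ciphersuites are compared by the names used in this proposal (a real server would compare 16-bit ciphersuite IDs), and the sketch assumes that "exactly" means the same ciphers in the same order, which the text does not state.

```python
from enum import Enum

# 0x00ff, the renegotiation signalling value; not counted when matching lists.
RENEGOTIATION_SCSV = "TLS_EMPTY_RENEGOTIATION_INFO_SCSV"

V1_CIPHERS = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
})

# The fixed list from the proposal, in the order given there.
V2_UNRELIABLE_LIST = (
    "TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
    "TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS1_DHE_RSA_WITH_AES_256_SHA",
    "TLS1_DHE_DSS_WITH_AES_256_SHA",
    "TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA",
    "TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA",
    "TLS1_RSA_WITH_AES_256_SHA",
    "TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA",
    "TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA",
    "TLS1_ECDHE_RSA_WITH_RC4_128_SHA",
    "TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS1_DHE_RSA_WITH_AES_128_SHA",
    "TLS1_DHE_DSS_WITH_AES_128_SHA",
    "TLS1_ECDH_RSA_WITH_RC4_128_SHA",
    "TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA",
    "TLS1_ECDH_ECDSA_WITH_RC4_128_SHA",
    "TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA",
    "SSL3_RSA_RC4_128_MD5",
    "SSL3_RSA_RC4_128_SHA",
    "TLS1_RSA_WITH_AES_128_SHA",
    "TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA",
    "TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA",
    "SSL3_EDH_RSA_DES_192_CBC3_SHA",
    "SSL3_EDH_DSS_DES_192_CBC3_SHA",
    "TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA",
    "TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA",
    "SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA",
    "SSL3_RSA_DES_192_CBC3_SHA",
)


class HelloKind(Enum):
    V1 = "v1 link protocol"
    V2_UNRELIABLE = "v2+ link protocol, list may contain unsupported ciphers"
    RELIABLE = "every OpenSSL 1.0.0 cipher listed is really supported"


def classify_client_hello(ciphers):
    """Classify a ClientHello cipher list according to the three cases."""
    # The 0x00ff signalling value is ignored before any comparison.
    names = [c for c in ciphers if c != RENEGOTIATION_SCSV]
    if set(names) <= V1_CIPHERS:
        return HelloKind.V1
    # Assumption: "exactly" is read as an ordered comparison.
    if tuple(names) == V2_UNRELIABLE_LIST:
        return HelloKind.V2_UNRELIABLE
    return HelloKind.RELIABLE
```

In the V2_UNRELIABLE case the server would then restrict its choice to the two DHE_RSA AES ciphersuites; in the RELIABLE case it may pick any advertised suite that satisfies the MUST and SHOULD rules above.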
Filename: 199-bridgefinder-integration.txt Title: Integration of BridgeFinder and BridgeFinderHelper Author: Mike Perry Created: 18-03-2012 Status: OBSOLETE Overview This proposal describes how the Tor client software can interact with an external program that performs bridge discovery based on user input or information extracted from a web page, QR Code, online game, or other transmission medium. Scope and Audience This document describes how all of the components involved in bridge discovery communicate this information to the rest of the Tor software. The mechanisms of bridge discovery are not discussed, though the design aims to be generalized enough to allow arbitrary new discovery mechanisms to be added at any time. This document is also written with the hope that those who wish to implement BridgeFinder components and BridgeFinderHelpers can get started immediately after a read of this proposal, so that development of bridge discovery mechanisms can proceed in parallel to supporting functionality improvements in the Tor client software. Components and Responsibilities 0. Tor Client The Tor Client is the piece of software that connects to the Tor network (optionally using bridges) and provides a SOCKS proxy for use by the user. In initial implementations, the Tor Client will support only standard bridges. In later implementations, it is expected to support pluggable transports as defined by Proposal 180. 1. Tor Control Port The Tor Control Port provides commands to perform operations, configuration, and to obtain status information. It also optionally provides event driven status updates. In initial implementations, it will be used directly by BridgeFinder to configure bridge information via GETINFO and SETCONF. It is covered by control-spec.txt in the tor-specs git repository. In later implementations, it will support the inter-controller POSTMESSAGE IPC protocol as defined by Proposal 197 for use in conveying bridge information to the Primary Controller. 2. 
Primary Controller The Primary Controller is the program that launches and configures the Tor client, and monitors its status. On desktop platforms, this program is Vidalia, and it also launches the Tor Browser. On Android, this program is Orbot. Orbot does not launch a browser. On all platforms, this proposal requires that the Primary Controller will launch one or more BridgeFinder child processes and provide them with authentication information through the environment variables TOR_CONTROL_PORT and TOR_CONTROL_PASSWD. In later implementations, the Primary Controller will be expected to receive Bridge configuration information via the free-form POSTMESSAGE protocol from Proposal 197, validate that information, and hold that information for user approval. 3. BridgeFinder A BridgeFinder is a program that discovers bridges and configures Tor to use them. In initial implementations, it is likely to be very dumb, and its main purpose will be to serve as a layer of abstraction that should free the Primary Controller from having to directly implement numerous ways of retrieving bridges for various pluggable transports. In later implementations, it may perform arbitrary network operations to discover, authenticate to, and/or verify bridges, possibly using informational hints provided by one or more external BridgeFinderHelpers (see next component). It could even go so far as to download new pluggable transport plugins and/or transform definition files from arbitrary urls. It will be launched by the Primary Controller and given access to the Tor Control Port via the environment variables TOR_CONTROL_PORT and TOR_CONTROL_PASSWD. Initial control port interactions can be command driven via GETINFO and SETCONF, and do not need to subscribe to or process control port events. Later implementations will use POSTMESSAGE as defined in Proposal 197 to pass command requests to Vidalia, which will parse them and ask for user confirmation before deploying them. 
Use of POSTMESSAGE may or may not require event driven operation, depending on POSTMESSAGE implementation status (POSTMESSAGE is designed to support both command and event driven operation, but it is possible event driven operation will happen first). 4. BridgeFinderHelper Each BridgeFinder implementation can optionally communicate with one or more BridgeFinderHelpers. BridgeFinderHelpers are plugins to external 3rd party applications that can inspect traffic, handle mime types, or implement protocol handlers for accepting bridge discovery information to pass to BridgeFinder. Example 3rd party applications include Chrome, World of Warcraft, QR Code readers, or simple cut and paste. Due to the arbitrary nature of sandboxing that may be present in various BridgeFinderHelper host applications, we do not mandate the exact nature of the IPC between BridgeFinder instances and external BridgeFinderHelper addons. However, please see the "Security Concerns" section for common pitfalls to avoid. 5. Tor Browser This is the browser the user uses with Tor. It is not useful until Tor is properly configured to use bridges. It fails closed. It is not expected to run BridgeFinderHelper plugin instances, unless those plugin instances exist to ensure the user always has a pool of working bridges available after successfully configuring an initial bridge. Once all bridges fail, the Tor Browser is useless. 6. Non-Tor Browser (aka BridgeFinderHelper host) This is the program the user uses for normal Internet activity to obtain bridges via a BridgeFinderHelper plugin. It does not have to be a browser. In advanced scenarios, this component may not be a browser at all, but may be a program such as World of Warcraft instead. Incremental Deployability The system is designed to be incrementally deployable: Simple designs should be possible to develop and test immediately. 
The design is flexible enough to be easily upgraded as more advanced features become available from both Tor and new pluggable transports. Initial Implementation In the simplest possible initial implementation, BridgeFinder will only discover Tor Bridges as they are deployed today. It will use the Tor Control Port to configure these bridges directly via the SETCONF command. It may or may not receive bridge information from a BridgeFinderHelper. In an even more degenerate case, BridgeFinderHelper may even be Vidalia or Orbot itself, acting upon user input from cut and paste. Initial Implementation: BridgeFinder Launch In the initial implementation, the Primary Controller will launch one or more BridgeFinders, providing control port authentication information to them through the environment variables TOR_CONTROL_PORT and TOR_CONTROL_PASSWD. BridgeFinder will then directly connect to the control port and authenticate. Initial implementations should be able to function without using SETEVENTS, and instead only using command-based status inquiries and configuration (GETINFO and SETCONF). Initial Implementation: Obtaining Bridge Hint Information In the initial implementation, to test functionality, BridgeFinderHelper can simply scrape bridges directly from https://bridges.torproject.org. In slightly more advanced implementations, a BridgeFinderHelper instance may be written for use in the user's Non-Tor Browser. This plugin could extract bridges from images, html comments, and other material present in ad banners and slack space on unrelated pages. BridgeFinderHelper would then communicate with the appropriate BridgeFinder instance over an acceptable IPC mechanism. This proposal does not seek to specify the nature of that IPC channel (because BridgeFinderHelper may be arbitrarily constrained due to host application sandboxing), but we do make several security recommendations under the section "Security Concerns: BridgeFinder and BridgeFinderHelper". 
Initial Implementation: Configuring New Bridges

In the initial implementation, Bridge configuration will be done directly through the control port using the SETCONF command.

Initial implementations will support only retrieval and configuration of standard Tor Bridges. These are configured using SETCONF on the Tor Control Port as follows:

   SETCONF Bridge="IP:ORPort [fingerprint]"

Future Implementations

In future implementations, the system can incrementally evolve in a few different directions. As new pluggable transports are created, it is conceivable that BridgeFinder may want to download new plugin binaries (and/or new transport transform definition files) and provide them to Tor.

Furthermore, it may prove simpler to deploy multiple concurrent BridgeFinder+BridgeFinderHelper pairs as opposed to adding new functionality to existing prototypes.

Finally, it is desirable for BridgeFinder to obtain approval from the user before updating bridge configuration, especially for cases where BridgeFinderHelper is automatically discovering bridges in-band during Non-Tor activity.

The exact mechanisms for accomplishing these improvements are described in the following subsections.

Future Implementations: BridgeFinder Launch and POSTMESSAGE handshake

The nature of the BridgeFinder launch and the environment variables provided is not expected to change. However, future Primary Controller implementations may decide to launch more than one BridgeFinder instance side by side.
Additionally, to negotiate the IPC channel created by Proposal 197 for purposes of providing user confirmation, it is recommended that BridgeFinder and the Primary Controller perform a handshake using POSTMESSAGE upon launch, to establish that all parties properly support the feature:

   Primary Controller: "POSTMESSAGE @all Controller wants POSTMESSAGE v1.1"
   BridgeFinder: "POSTMESSAGE @all BridgeFinder has POSTMESSAGE v1.0"
   Primary Controller: "POSTMESSAGE @all Controller expects POSTMESSAGE v1.0"
   BridgeFinder: "POSTMESSAGE @all BridgeFinder will POSTMESSAGE v1.0"

If this 4 step handshake proceeds with an acceptable version, BridgeFinder must use POSTMESSAGE to transmit SETCONF Bridge lines (see "Future Implementations: Configuring New Bridges" below). If POSTMESSAGE support is expected, but the handshake does not complete for any reason, BridgeFinder should either exit or go dormant.

The exact nature of the version negotiation and exactly how much backwards compatibility must be tolerated is unspecified. "All-or-nothing" is a safe assumption to get started.

Future Implementations: Obtaining Bridge Hint Information

Future BridgeFinder implementations may download additional information based on what is provided by BridgeFinderHelper. They may fetch pluggable transport plugins, transformation parameters, and other material.

Future Implementations: Configuring New Bridges

Future implementations will be concerned with providing two new pieces of functionality with respect to configuring bridges: configuring pluggable transports, and properly prompting the user before altering Tor configuration.

There are two ways to tell Tor clients about pluggable transports (as defined in Proposal 180).

On the control port, an external Proposal 180 transport will be configured with

   SETCONF ClientTransportPlugin=<method> socks5 <addr:port> [auth=X]

as in

   SETCONF ClientTransportPlugin="trebuchet socks5 127.0.0.1:9999".
A managed proxy is configured with SETCONF ClientTransportPlugin=<methods> exec <path> [options] as in SETCONF ClientTransportPlugin="trebuchet exec /usr/libexec/trebuchet --managed". This example tells Tor to launch an external program to provide a socks proxy for 'trebuchet' connections. The Tor client only launches one instance of each external program with a given set of options, even if the same executable and options are listed for more than one method. Pluggable transport bridges discovered for this transport by BridgeFinder would then be set with: SETCONF Bridge="trebuchet 3.2.4.1:8080 keyid=09F911029D74E35BD84156C5635688C009F909F9 rocks=20 height=5.6m". For more information on pluggable transports and supporting Tor configuration commands, see Proposal 180. Future Implementations: POSTMESSAGE and User Confirmation Because configuring even normal bridges alone can expose the user to attacks, it is strongly desired to provide some mechanism to allow the user to approve new bridges prior to their use, especially for situations where BridgeFinderHelper is extracting them transparently while the user performs unrelated activity. If BridgeFinderHelper grows to the point where it is downloading new transform definitions or plugins, user confirmation becomes absolutely required. To achieve user confirmation, we depend upon the POSTMESSAGE command defined in Proposal 197. If the POSTMESSAGE handshake succeeds, instead of sending SETCONF commands directly to the control port, the commands will be wrapped inside a POSTMESSAGE: POSTMESSAGE @all SETCONF Bridge="www.example.com:8284" Upon receiving this POSTMESSAGE, the Primary Controller will validate it, evaluate it, store it to be later enabled by the user, and alert the user that new bridges are available for approval. It is only after the user has approved the new bridges that the Primary Controller should then re-issue the SETCONF commands to configure and deploy them in the tor client. 
Additionally, see "Security Concerns: Primary Controller" for more discussion on potential pitfalls with POSTMESSAGE. Security Concerns While automatic bridge discovery and configuration is quite compelling and powerful, there are several serious security concerns that warrant extreme care. We've broken them down by component. Security Concerns: Primary Controller In the initial implementation, Orbot and Vidalia must take care to transmit the Tor Control password to BridgeFinder in such a way that it does not end up in system logs, process list, or viewable by other system users. The best known strategy for doing this is by passing the information through exported environment variables. Additionally, in future implementations, Orbot and Vidalia will need to validate Proposal 197 POSTMESSAGE input before prompting the user. POSTMESSAGE is a free-form message-passing mechanism. All sorts of unexpected input may be passed through it by any other authenticated Tor Controllers for their own unrelated communication purposes. Minimal validation includes verifying that the POSTMESSAGE data is a valid Bridge or ClientTransportPlugin line and is acceptable input for SETCONF. All unexpected characters should be removed through using a whitelist, and format and structure should be checked against a regular expression. Additionally, the POSTMESSAGE string should not be passed through any string processing engines that automatically decode character escape encodings, to avoid arbitrary control port execution. At the same time, POSTMESSAGE validation should be light. While fully untrusted input is not expected due to the need for control port authentication and BridgeFinder sanitation, complicated manual string parsing techniques during validation should be avoided. Perform simple easy-to-verify whitelist-based checks, and ignore unrecognized input. 
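The "simple easy-to-verify whitelist-based checks" described above might look like the following sketch. It is illustrative only: the function name validate_bridge_value, the character whitelist, and the regular expression are assumptions, and it covers only the standard "IP:ORPort [fingerprint]" form with an IPv4 address and a 40-digit hex fingerprint, not pluggable transport lines. It also rejects a value outright when an unexpected character appears instead of stripping the character, which is a stricter choice than the text requires.

```python
import re

# Hypothetical whitelist: hex digits, dot, colon, and a single-space separator.
ALLOWED_CHARS = frozenset("0123456789ABCDEFabcdef.: ")

# Hypothetical pattern for the value of a standard Bridge line:
#   IPv4:ORPort, optionally followed by a 40-hex-digit fingerprint.
BRIDGE_VALUE_RE = re.compile(
    r"(?P<ip>[0-9]{1,3}(?:\.[0-9]{1,3}){3})"
    r":(?P<port>[0-9]{1,5})"
    r"(?: (?P<fp>[0-9A-Fa-f]{40}))?"
)


def validate_bridge_value(value: str) -> bool:
    """Return True only if value is a well-formed standard bridge spec."""
    # Character whitelist first: no quotes, escapes, CR/LF, or other
    # characters that could smuggle extra control port commands.
    if not value or any(ch not in ALLOWED_CHARS for ch in value):
        return False
    match = BRIDGE_VALUE_RE.fullmatch(value)
    if match is None:
        return False
    # Range checks that the regular expression alone does not enforce.
    if any(int(octet) > 255 for octet in match.group("ip").split(".")):
        return False
    return 1 <= int(match.group("port")) <= 65535
```

A Primary Controller following this approach would call the check on the raw POSTMESSAGE payload before any unescaping, and would silently ignore anything that fails it.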
Beyond POSTMESSAGE validation, the manner in which the Primary Controller achieves consent from the user is absolutely crucial to security under this scheme. A simple "OK/Cancel" dialog is insufficient to protect the user from the dangers of switching bridges and running new plugins automatically. Newly discovered bridge lines from POSTMESSAGE should be added to a disabled set that the user must navigate to as an independent window apart from any confirmation dialog. The user must then explicitly enable recently added plugins by checking them off individually. We need the user's brain to be fully engaged and aware that it is interacting with Tor during this step. If they get an "OK/Cancel" popup that interrupts their online game play, they will almost certainly simply click "OK" just to get back to the game quickly. The Primary Controller should transmit the POSTMESSAGE content to the control port only after obtaining this out-of-band approval. Security Concerns: BridgeFinder and BridgeFinderHelper The unspecified nature of the IPC channel between BridgeFinder and BridgeFinderHelper makes it difficult to make concrete security suggestions. However, from past experience, the following best practices must be employed to avoid security vulnerabilities: 1. Define a non-webby handshake and/or perform authentication The biggest risk is that unexpected applications will be manipulated into posting malformed data to the BridgeFinder's IPC channel as if it were from BridgeFinderHelper. The best way to defend against this is to require a handshake to properly complete before accepting input. If the handshake fails at any point, the IPC channel must be abandoned and closed. Do not continue scanning for good input after any bad input has been encountered. Additionally, if possible, it is wise to establish a shared secret between BridgeFinder and BridgeFinderHelper through the filesystem or any other means available for use in authentication. 
For a good example of how to use such a shared secret properly for authentication, see Trac Ticket #5185 and/or the SafeCookie Tor Control Port authentication mechanism.

2. Perform validation before parsing

Care must be taken before converting BridgeFinderHelper data into Bridge lines, especially for cases where the BridgeFinderHelper data is fed directly to the control port after passing through BridgeFinder.

The input should be subjected to a character whitelist and possibly also validated against a regular expression to verify format, and if any unexpected or poorly-formed data is encountered, the IPC channel must be closed.

3. Fail closed on unexpected input

If the handshake fails, or if any other part of the BridgeFinderHelper input is invalid, the IPC channel must be abandoned and closed. Do *not* continue scanning for good input after any bad input has been encountered.
Filename: 200-new-create-and-extend-cells.txt
Title: Adding new, extensible CREATE, EXTEND, and related cells
Author: Robert Ransom
Created: 2012-03-22
Status: Closed
Implemented-In: 0.2.4.8-alpha

History

The original draft of this proposal was from 2010-12-27; nickm revised it slightly on 2012-03-22 and added it as proposal 200.

Overview and Motivation:

In Tor's current circuit protocol, every field, including the 'onion skin', in the EXTEND relay cell has a fixed meaning and length. This prevents us from extending the current EXTEND cell to support IPv6 relays, efficient UDP-based link protocols, larger 'onion keys', new circuit-extension handshake protocols, or larger identity-key fingerprints. We will need to support all of these extensions in the near future. This proposal specifies a replacement EXTEND2 cell and related cells that provide more room for future extension.

Design:

FIXME - allocate command ID numbers (non-RELAY commands for CREATE2 and CREATED2; RELAY commands for EXTEND2 and EXTENDED2)

The CREATE2 cell contains the following payload:

   Handshake type [2 bytes]
   Handshake data length [2 bytes]
   Handshake data [variable]

The relay payload for an EXTEND2 relay cell contains the following payload:

   Number of link specifiers [1 byte]
   N times:
      Link specifier type [1 byte]
      Link specifier length [1 byte]
      Link specifier [variable]
   Handshake type [2 bytes]
   Handshake data length [2 bytes]
   Handshake data [variable]

The CREATED2 cell and EXTENDED2 relay cell both contain the following payload:

   Handshake data length [2 bytes]
   Handshake data [variable]

All four cell types are padded to 512-byte cells.

When a relay X receives an EXTEND2 relay cell:

* X finds or opens a link to the relay Y using the link target specifiers in the EXTEND2 relay cell; if X fails to open a link, it replies with a TRUNCATED relay cell. (FIXME: what do we do now?)

* X copies the handshake type and data into a CREATE2 cell and sends it along the link to Y.
* If the handshake data is valid, Y replies by sending a CREATED2 cell along the link to X; otherwise, Y replies with a TRUNCATED relay cell. (XXX: we currently use a DESTROY cell?) * X copies the contents of the CREATED2 cell into an EXTENDED2 relay cell and sends it along the circuit to the OP. Link target specifiers: The list of link target specifiers must include at least one address and at least one identity fingerprint, in a format that the extending node is known to recognize. The extending node MUST NOT accept the connection unless at least one identity matches, and should follow the current rules for making sure that addresses match. [00] TLS-over-TCP, IPv4 address A four-byte IPv4 address plus two-byte ORPort [01] TLS-over-TCP, IPv6 address A sixteen-byte IPv6 address plus two-byte ORPort [02] Legacy identity A 20-byte SHA1 identity fingerprint. At most one may be listed. As always, values are sent in network (big-endian) order. Legacy handshake type: The current "onionskin" handshake type is defined to be handshake type [00 00], or "legacy". The first (client->relay) message in a handshake of type “legacy” contains the following data: ‘Onion skin’ (as in CREATE cell) [DH_LEN+KEY_LEN+PK_PAD_LEN bytes] This value is generated and processed as sections 5.1 and 5.2 of tor-spec.txt specify for the current CREATE cell. The second (relay->client) message in a handshake of type “legacy” contains the following data: Relay DH public key [DH_LEN bytes] KH (see section 5.2 of tor-spec.txt) [HASH_LEN bytes] These values are generated and processed as sections 5.1 and 5.2 of tor-spec.txt specify for the current CREATED cell. After successfully completing a handshake of type “legacy”, the client and relay use the current relay cryptography protocol. Bugs: This specification does not accommodate: * circuit-extension handshakes requiring more than one round No circuit-extension handshake should ever require more than one round (i.e. 
more than one message from the client and one reply from the relay). We can easily extend the protocol to handle this, but we will never need to. * circuit-extension handshakes in which either message cannot fit in a single 512-byte cell along with the other required fields This can be handled by specifying a dummy handshake type whose data (sent from the client) consists of another handshake type and the beginning of the data required by that handshake type, and then using several (newly defined) HANDSHAKE_COMPLETION relay cells sent in each direction to transport the remaining handshake data. The specification of a HANDSHAKE_COMPLETION relay cell and its associated dummy handshake type can safely be postponed until we develop a circuit-extension handshake protocol that would require it. * link target specifiers that cause EXTEND2 cells to exceed 512 bytes This can be handled by specifying a LONG_COMMAND relay cell type that can be used to transport a large ‘virtual cell’ in multiple 512-byte cells. The specification of a LONG_COMMAND relay cell can safely be postponed until we develop a link target specifier, a RELAY_BEGIN2 relay cell and stream target specifier, or some other relay cell type that would require it.
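As an illustration of the CREATE2 and EXTEND2 layouts in the Design section, the sketch below packs the fields in network byte order with Python's struct module. The function names are hypothetical, cell padding and the relay cell header are left out, and only link specifier types [00] and [02] are shown.

```python
import ipaddress
import struct

# Link specifier types from the "Link target specifiers" section.
LS_TLS_TCP_IPV4 = 0x00
LS_LEGACY_ID = 0x02

# Handshake type [00 00], "legacy".
HTYPE_LEGACY = 0x0000


def link_spec(ls_type: int, body: bytes) -> bytes:
    """Type [1 byte], length [1 byte], then the specifier body."""
    if len(body) > 255:
        raise ValueError("link specifier body does not fit a 1-byte length")
    return struct.pack("!BB", ls_type, len(body)) + body


def ipv4_spec(addr: str, orport: int) -> bytes:
    """Four-byte IPv4 address plus two-byte ORPort."""
    return link_spec(
        LS_TLS_TCP_IPV4,
        ipaddress.IPv4Address(addr).packed + struct.pack("!H", orport),
    )


def legacy_id_spec(fingerprint: bytes) -> bytes:
    """20-byte SHA1 identity fingerprint."""
    if len(fingerprint) != 20:
        raise ValueError("legacy identity must be 20 bytes")
    return link_spec(LS_LEGACY_ID, fingerprint)


def create2_payload(handshake_type: int, handshake_data: bytes) -> bytes:
    """Handshake type [2], handshake data length [2], handshake data."""
    return struct.pack("!HH", handshake_type, len(handshake_data)) + handshake_data


def extend2_payload(specs, handshake_type: int, handshake_data: bytes) -> bytes:
    """Specifier count [1], the specifiers, then the CREATE2-style fields."""
    if not 0 < len(specs) <= 255:
        raise ValueError("need between 1 and 255 link specifiers")
    return (
        struct.pack("!B", len(specs))
        + b"".join(specs)
        + create2_payload(handshake_type, handshake_data)
    )
```

Because the handshake fields at the tail of an EXTEND2 payload have the same layout as a CREATE2 payload, relay X can forward them to Y without interpreting the handshake type at all.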
Filename: 201-bridge-v3-reqs-stats.txt
Title: Make bridges report statistics on daily v3 network status requests
Author: Karsten Loesing
Created: 10-May-2012
Status: Reserve
Target: 0.2.4.x

Overview:

Our current approach [0] to estimate daily bridge users is based on unique IP addresses reported by bridges, and it is very likely broken. A bridge user can connect to two or more bridges, so that unique IP address sets overlap to an unknown extent. We should instead count requests for v3 network statuses, sum them up for all bridges, and divide by the average number of requests that a bridge client makes per day. This approach is similar to how we estimate directly connecting users.

This proposal describes how bridges would report v3 network status requests in their extra-info descriptors.

Specification:

Bridges include a new keyword line in their extra-info descriptors that contains the number of v3 network status requests by country they saw over a period of 24 hours. The reported numbers refer to the period stated in the "bridge-stats-end" line. The new keyword line would go after the "bridge-ips" line in dir-spec.txt:

   "bridge-v3-reqs" CC=N,CC=N,... NL
   [At most once.]

   List of mappings from two-letter country codes to the number of requests for v3 network statuses from that country as seen by the bridge, rounded up to the nearest multiple of 8. Only those requests are counted that the directory can answer with a 200 OK status code.

[0] https://metrics.torproject.org/papers/countingusers-2010-11-30.pdf
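A minimal sketch of the rounding rule and of the user estimate described in the overview follows. The function names are hypothetical; sorting the entries by country code, using lowercase codes, and omitting countries with zero requests are assumptions that the proposal does not specify. The average number of requests per client per day is a parameter the caller has to supply.

```python
def round_up_to_multiple(n: int, step: int = 8) -> int:
    """Round n up to the nearest multiple of step (ceiling division)."""
    return -(-n // step) * step


def format_bridge_v3_reqs(counts: dict) -> str:
    """Build the "bridge-v3-reqs" line from per-country request counts."""
    parts = [
        f"{cc}={round_up_to_multiple(n)}"
        for cc, n in sorted(counts.items())  # assumption: sorted by code
        if n > 0                             # assumption: zeros omitted
    ]
    return "bridge-v3-reqs " + ",".join(parts) + "\n"


def estimate_daily_users(requests_per_bridge, requests_per_client_per_day: float) -> float:
    """Sum requests over all bridges, then divide by the per-client average."""
    return sum(requests_per_bridge) / requests_per_client_per_day
```

For example, format_bridge_v3_reqs({"us": 9, "de": 1}) yields the line "bridge-v3-reqs de=8,us=16". Note that the rounding biases small counts upward, so an estimate built from many low-traffic bridges will overshoot somewhat.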
Filename: 202-improved-relay-crypto.txt Title: Two improved relay encryption protocols for Tor cells Author: Nick Mathewson Created: 19 Jun 2012 Status: Meta Note: This is an important development step in improving our relay crypto, but it doesn't actually say how to do this. Overview: This document describes two candidate designs for a better Tor relay encryption/decryption protocol, designed to stymie tagging attacks and better accord with best practices in protocol design. My hope is that readers will examine these protocols, evaluate their security and practicality, improve on them, and help to pick one for implementation in Tor. In section 1, I describe Tor's current relay crypto protocol, its drawbacks, how it fits in with the rest of Tor, and some requirements/desiderata for a replacement. In sections 2 and 3, I propose two alternative replacements for this protocol. In section 4, I discuss their pros and cons. 1. Background and Motivation 1.0. A short overview of Tor's current protocols The core pieces of the Tor protocol are the link protocol, the circuit extend protocol, the relay protocol, and the stream protocol. All are documented in [TorSpec]. Briefly: - The link protocol handles all direct pairwise communication between nodes. Everything else is transmitted over it. It uses TLS. - The circuit extend protocol uses public-key crypto to set up multi-node virtual tunnels, called "circuits", from a client through one or more nodes. *** The relay protocol uses established circuits to communicate from a client to a node on a circuit. That's the one we'll be talking about here. *** - The stream protocol is tunneled over relay protocol; clients use it to tell servers to open anonymous TCP connections, to send data, and so forth. Servers use it to report success or failure opening anonymous TCP connections, to send data from TCP connections back to clients, and so forth. 
In more detail: The link protocol's job is to connect two computers with an encrypted, authenticated stream, to authenticate one or both of them to the other, and to provide a channel for passing "cells" between them. The circuit extend protocol's job is to set up circuits: persistent tunnels that connect a Tor client to an exit node through a series of (typically three) hops, each of which knows only the previous and next hop, and each of which has a set of keys that they share only with the client. Finally, the relay protocol's job is to allow a client to communicate bidirectionally with the node(s) on the circuit, once their shared keys have been established. (We'll ignore the stream protocol for the rest of this document.) Note on terminology: Tor nodes are sometimes called "servers", "relays", or "routers". I'll use all these terms more or less interchangeably. For simplicity's sake, I will call the party who is constructing and using a circuit "the client" or "Alice", even though nodes sometimes construct circuits too. Tor's internal packets are called "cells". All the cells we deal with here are 512 bytes long. The nodes in a circuit are called its "hops"; most circuits are 3 hops long. This doesn't include the client: when Alice builds a circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is "Bob_1" and the last hop is "Bob_3". 1.1. The current relay protocol and its drawbacks [This section describes Tor's current relay protocol. It is not a proposal; rather it is what we do now. Sections 2 and 3 have my proposed replacements for it.] A "relay cell" is a cell that is generated by the client to send to a node, or by a node to send to the client. It's called a "relay" cell because a node that receives one may need to relay it to the next or previous node in the circuit (or to handle the cell itself). A relay cell moving towards the client is called "inbound"; a cell moving away is called "outbound". 
When a relay cell is constructed by the client, the client adds one layer of crypto for each node that will process the cell, and gives the cell to the first node in the circuit. Each node in turn then removes one layer of crypto, and either forwards the cell to the next node in the circuit or acts on that cell itself. When a relay cell is constructed by a node, it encrypts it and sends it to the preceding node in the circuit. Each node between the originating node and the client then encrypts the cell and passes it back to the preceding node. When the client receives the cell, it removes layers of crypto until it has an unencrypted cell, which it then acts on. In the current protocol, the body of each relay cell contains, in its unencrypted form: Relay command [1 byte] Zeros [2 bytes] StreamID [2 bytes] Digest [4 bytes] Length [2 bytes] Data [498 bytes] (This only adds up to 509 bytes. That's because the Tor link protocol transfers 512-byte cells, and has a 3 byte overhead per cell. Not how we would do it if we were starting over at this point.) At every node of a circuit, the node relaying a cell encrypts/decrypts it with AES128-CTR. If the cell is outbound and the "zeros" field is set to all-zeros, the node additionally checks whether 'digest' is correct. A correct digest is the first 4 bytes of the running SHA1 digest of: a shared secret, concatenated with all the relay cells destined for this node on this circuit so far, including this cell. If _that's_ true, then the node accepts this cell. (See section 6 of [TorSpec] for full detail; see section A.1 for a more rigorous description.) The current approach has some actual problems. Notably: * It permits tagging attacks. Because AES_CTR is an XOR-based stream cipher, an adversary who controls the first node in a circuit can XOR anything he likes into the relay cell, and then see whether he/she encounters a correspondingly defaced cell at some exit that he also controls. 
That is, the attacker picks some pattern P, and when he would transmit some outbound relay cell C at hop 1, he instead transmits C xor P. If an honest exit receives the cell, the digest will probably be wrong, and the honest exit will reject it. But if the attacker controls the exit, he will notice that he has received a cell C' where the digest doesn't match, but where C' xor P _does_ have a good digest. The attacker will then know that his two nodes are on the same circuit, and thereby be able to link the user (whom the first node sees) to her activities (which the last node sees). Back when we did the Tor design, this didn't seem like a big deal, since an adversary who controls both the first and last node in a circuit is presumed to win already based on traffic correlation attacks. This attack seemed strictly worse than that, since it was trivially detectable in the case where the attacker _didn't_ control both ends. See section 4.4 of the Tor paper [TorDesign] for our early thoughts here; see Xinwen Fu et al's 2009 work for a more full explanation of the in-circuit tagging attack [XF]; and see "The 23 Raccoons'" March 2012 "Analysis of the Relative Severity of Tagging Attacks" mail on tor-dev (and the ensuing thread) for a discussion of why we may want to care after all, due to attacks that use tagging to amplify route capture. [23R] It also has some less practical issues. * The digest portion is too short. Yes, if you're an attacker trying to (say) change an "ls *" into an "rm *", you can only expect to get one altered cell out of 2^32 accepted -- and all future cells on the circuit will be rejected with similar probability due to the change in the running hash -- but 1/2^32 is a pretty high success rate for crypto attacks. * It does MAC-then-encrypt. That makes smart people cringe. * Its approach to MAC is H(Key | Msg), which is vulnerable to length extension attack if you've got a Merkle-Damgard hash (which we do). 
This isn't relevant in practice right now, since the only parties who see the digest are the two parties that rely on it (because of the MAC-then-encrypt). 1.2. Some feature requirements Relay cells can originate at the client or at a relay. Relay cells that originate at the client are given to the first node in the circuit, and constructed so that they will be decrypted and forwarded by the first M-1 nodes in the circuit, and finally decrypted and processed by the Mth node, where the client chooses M. (Usually, the Mth node is the last one, which will be an exit node.) Relay cells that originate at a hop in the circuit are given to the preceding node, and eventually delivered to the client. Tor provides a so-called "leaky pipe" circuit topology [TorDesign]: a client can send a cell to any node in the circuit, not just the last node. I'll try to keep that property, although historically we haven't really made use of it. In order to implement telescoping circuit construction (where the circuit is built by iteratively asking the last node in the circuit to extend the circuit one hop more), it's vital that the last hop of the circuit be able to change. Proposal 188 [Prop188] suggests that we implement a "bridge guards" feature: making some (non-public) nodes insert an extra hop into the circuit after themselves, in a way that will make it harder for other nodes in the network to enumerate them. We therefore want our circuits to be one-hop re-extensible: when the client extends a circuit from Bob1 to Bob2, we want Bob1 to be able to introduce a new node "Bob1.5" into the circuit such that the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has been called "loose source routing".) Any new approach should be able to coexist on a circuit with the old approach.
That is, if Alice wants to build a circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a revised relay protocol, then Alice should be able to build a circuit such that she can have Bob1 and Bob3 process each cell with the current protocol, and Bob2 process it with a revised protocol. (Why? Because if all nodes in a circuit needed to use the same relay protocol, then each node could learn information about the other nodes in the circuit from which relay protocol was chosen. For example, if Bob1 supports the new protocol, and sees that the old relay protocol is in use, and knows that Bob2 supports the new one, then Bob1 has learned that Bob3 is some node that does not support the new relay protocol.) Cell length needs to be constant as cells move through the network. For historical reasons mentioned above in section 1.1, the length of the encrypted part of a relay cell needs to be 509 bytes. 1.3. Some security requirements Two adjacent nodes on a circuit can trivially tell that they are on the same circuit, and the first node can trivially tell who the client is. Other than that, we'd like any attacker who controls a node on the circuit not to have a good way to learn any other nodes, even if he/she controls those nodes. [*] Relay cells should not be malleable: no relay should be able to alter a cell between an honest sender and an honest recipient in a way that they cannot detect. Relay cells should be secret: nobody but the sender and recipient of a relay cell should be able to learn what it says. Circuits should resist transparent, recoverable tagging attacks: if an attacker controls one node in a circuit and alters a relay cell there, no non-adjacent point in the circuit should be able to recover the relay cell as it would have received it had the attacker not altered it. 
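The "transparent, recoverable" tagging that this requirement rules out is the attack from section 1.1. The following is a minimal sketch of it, not Tor code: the keystream function is a toy SHA-256-in-counter-mode stand-in for AES-CTR, and the key names and the tag pattern are hypothetical. It only illustrates that with XOR-based layers, a pattern P added at hop 1 can be removed again at the exit.

```python
import hashlib

CELL_LEN = 509  # length of the encrypted part of a relay cell


def keystream(key: bytes, length: int) -> bytes:
    """Toy XOR keystream (SHA-256 in counter mode); a stand-in, not AES-CTR."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def layer(cell: bytes, key: bytes) -> bytes:
    """Add or remove one layer; for an XOR stream cipher these coincide."""
    return xor(cell, keystream(key, len(cell)))


# Hypothetical per-hop keys for a three-hop circuit.
keys = [b"hop1-key", b"hop2-key", b"hop3-key"]
plaintext = b"relay cell body".ljust(CELL_LEN, b"\x00")

# The client adds one layer per hop.
onion = plaintext
for k in reversed(keys):
    onion = layer(onion, k)

# A malicious first hop removes its layer, then XORs in a pattern P.
tag = bytes([0xAA]) * CELL_LEN
sent_by_hop1 = xor(layer(onion, keys[0]), tag)

# Hop 2 (honest) and the exit each remove their layer as usual.
at_exit = layer(layer(sent_by_hop1, keys[1]), keys[2])

# The exit sees a defaced cell, but XORing P back in recovers the original.
recovered = xor(at_exit, tag)
print(at_exit != plaintext, recovered == plaintext)
```

Because the layers commute with the XOR of P, the honest middle hop has no effect on the tag. A design that meets the requirement above must break this recoverability.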
The above properties should apply to sequences of cells too: an attacker shouldn't be able to change what sequence of cells arrives at a destination (for example, by removing, replaying, or reordering one or more cells) without being detected. (Feel free to substitute whatever formalization of the above requirements makes you happiest, and add whatever caveats are necessary to make you comfortable. I probably missed at least two critical properties.) [*] Of course, an attacker who controls two points on a circuit can probably confirm this through traffic correlation. But we'd prefer that the cryptography not create other, easier ways for them to do this. 1.4. A note on algorithms This document is deliberately agnostic concerning the choice of cryptographic primitives -- not because I have no opinions about good ciphers, MACs, and modes of operation -- but because experience has taught me that mentioning any particular cryptographic primitive will prevent discussion of anything else. Please DO NOT suggest algorithms to use in implementing these protocols yet. It will distract! There will be time later! If somebody _else_ suggests algorithms to use, for goodness' sake DON'T ARGUE WITH THEM! There will be time later! 2. Design 1: Large-block encryption In this approach, we use a tweakable large-block cipher for encryption and decryption, and a tweak-chaining function TC. 2.1. Chained large-block what now? We assume the existence of a primitive that provides the desired properties of a tweakable[Tweak] block cipher, taking blocks of any desired size. (In our case, the block size is 509 bytes[*].) It also takes a Key, and a per-block "tweak" parameter that plays the same role that an IV plays in CBC, or that the counter plays in counter mode. The Tweak-chaining function TC takes as input a previous tweak, a tweak chaining key, and a cell; it outputs a new tweak. Its purpose is to make future cells undecryptable unless you have received all previous cells. 
It could probably be something like a MAC of the old tweak and the cell using the tweak chaining key as the MAC key. (If the initial tweak is secret, I am not sure that TC needs to be keyed.) [*] Some large-block cipher constructions use blocks whose size is the multiple of some underlying cipher's block size. If we wind up having to use one of those, we can use 496-byte blocks instead at the cost of 2.5% wasted space. 2.2. The protocol 2.2.1. Setup phase The circuit construction algorithm needs to produce forward and backward keys Kf and Kb, the forward and backward tweak chaining keys TCKf and TCKb, as well as initial tweak values Tf and Tb. 2.2.2. The cell format We replace the "Digest" and "Zeros" fields of the cell with a single Z-byte "Zeros" field to determine when the cell is recognized and correctly decrypted; its length is a security parameter. 2.2.3. The decryption operations For a relay to handle an inbound RELAY cell, it sets Tb_next to TC(TCKb, Tb, Cell). Then it encrypts the cell using the large block cipher keyed with Kb and tweaked with Tb. Then it sets Tb to Tb_next. For a relay to handle an outbound RELAY cell, it sets Tf_next to TC(TCKf, Tf, Cell). Then it decrypts the cell using the large block cipher keyed with Kf and tweaked with Tf. Then it sets Tf to Tf_next. Then it checks the 'Zeros' field on the cell; if that field is all [00] bytes, the cell is for us. 2.3. Security discussion This approach is fairly simple (at least, no more complex than its primitives) and achieves some of our security goals. Because of the large block cipher approach, any change to a cell will render that cell undecryptable, and indistinguishable from random junk. Because of the tweak chaining approach, if even one cell is missed or corrupted or reordered, future cells will also decrypt into random junk. 
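To make the effect of tweak chaining concrete, here is a minimal sketch that models only the tweak sequence, not the large-block cipher. It uses HMAC-SHA256 as a placeholder for TC; that choice is an assumption made purely so the example runs, not a suggestion of an algorithm, and the key, initial tweak, and cell values are hypothetical.

```python
import hashlib
import hmac


def tc(tc_key: bytes, tweak: bytes, cell: bytes) -> bytes:
    """Placeholder tweak-chaining function: new tweak from old tweak and cell."""
    return hmac.new(tc_key, tweak + cell, hashlib.sha256).digest()


def tweak_for_each_cell(tc_key: bytes, initial_tweak: bytes, cells):
    """Return the tweak in effect when each cell in the sequence is handled."""
    tweak = initial_tweak
    used = []
    for cell in cells:
        used.append(tweak)
        tweak = tc(tc_key, tweak, cell)
    return used


tck = b"hypothetical-tweak-chaining-key"
t0 = b"\x00" * 32
c1, c2, c3 = b"cell-1", b"cell-2", b"cell-3"

# The sender processes all three cells; the receiver never sees c2.
sender = tweak_for_each_cell(tck, t0, [c1, c2, c3])
receiver = tweak_for_each_cell(tck, t0, [c1, c3])

# Both sides agree on the tweak after c1 ...
print(sender[1] == receiver[1])
# ... but the receiver handles c3 under a different tweak than the sender used.
print(sender[2] == receiver[1])
```

Once the tweaks diverge, every later cell is processed under the wrong tweak, which is why a single missed, corrupted, or reordered cell turns the rest of the circuit into junk.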
The tagging attack in this case is turned into a circuit-junking attack: an adversary who tries to mount it can probably confirm that he was indeed first and last node on a target circuit (assuming that circuits which turn to junk in this way are rare), but cannot recover the circuit after that point. As a neat side observation, note that this approach improves upon Tor's current forward secrecy, by providing forward secrecy while circuits are still operational, since each change to the tweak should make previous cells undecryptable if the old tweak value isn't recoverable. The length of Zeros is a parameter for what fraction of "random junk" cells will potentially be accepted by a router or client. If Zeros is Z bytes long, then junk cells will be accepted with P < 2^-(8*Z + 7). (The '+7' is there because the top 7 bits of the Length field must also be 0.) There's no trouble using this protocol in a mixed circuit, where some nodes speak the old protocol and some speak the large-block-encryption protocol. 3. Design 2: short-MAC-and-pad In this design, we behave more similarly to a mix-net design (such as Mixmaster or Mixminion's headers). Each node checks a MAC, and then re-pads the cell to its chosen length before decoding the cell. This design uses as a primitive a MAC and a stream cipher. It might also be possible to use an authenticating cipher mode, if we can find one that works like a stream cipher and allows us to efficiently output authenticators for the stream so far. NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be replaced with an authenticated encryption mode without too much loss of generality. 3.1. The protocol 3.1.1 Setup phase The circuit construction algorithm needs to produce forward and backward keys Kf and Kb, forward and backward stream cipher IVs IVf and IVb, and forward and backward MAC keys Mf and Mb. Additionally, the circuit construction algorithm needs a way for the client to securely (and secretly? 
XXX) tell each hop in the circuit a value 'bf' for the number of bytes of MAC it should expect on outbound cells and 'bb' for the number of bytes it should use on cells it's generating. Each node can get a different 'bf' and 'bb'. These values can be 0: if a node's bf is 0, it doesn't authenticate cells; if its bb is 0, it doesn't originate them. The circuit construction algorithm also needs a way for the client to securely (and secretly? XXX) tell each hop in the circuit whether it is allowed to be the final destination for relay cells. Set the stream Sf and the stream Sb to empty values. 3.1.2. The cell format The Zeros and Digest fields of the cell format are removed. 3.1.3. The relay operations Upon receiving an outbound cell, a node removes the first bf bytes of the cell, and puts them aside as 'M'. The node then computes between 0 and 2 MACs of the stream consisting of all previously MAC'd data, plus the remainder of the cell: If bf>0 and there is a next hop, the node computes M_relay. If this node was told to deliver traffic, or it is the last node in the circuit so far, the node computes M_receive. M_relay is computed as MAC(stream | "relay"); M_receive is computed as MAC(stream | "receive"). If M = M_receive, this cell is for the node; it should process it. If M = M_relay, or if bf == 0, this cell should be relayed. If a MAC was computed and neither of the above cases was met, then the cell is bogus; the node should discard it and destroy the circuit. The node then removes the first bf bytes of the cell, and pads the cell at the end with bf zero bytes. Finally, the node decrypts the whole remaining padded cell with the stream cipher. To handle an inbound cell, the node simply applies the stream cipher with no checking. 3.1.4. Generating inbound cells. To generate an inbound cell, a node makes a 509-bb byte RELAY cell C, encrypts that cell with Kb, appends the encrypted cell to Sb, and prepends M_receive(Sb) to the cell. 3.1.5.
Generating outbound cells Generating an outbound cell is harder, since we need to know what padding the earlier nodes will generate in order to know what padding the later nodes will receive and compute their MACs, but we also need to know what MACs we'll send to the later nodes in order to compute which MACs we'll send to the earlier ones. Mixnet clients have needed to do this for ages, though, so the algorithms are pretty well settled. I'll give one below in A.3. 3.2. Security discussion This approach is also simple and (given good parameter choices) can achieve our goals. The trickiest part is the algorithm that clients must follow to package cells, but that's not so bad. It's not necessary to check MACs on inbound traffic, because nobody but the client can tell scrambled messages from good ones, and the client can be trusted to keep the client's own secrets. With this protocol, if the attacker tries to do a tagging attack, the circuit should get destroyed by the next node in the circuit that has a nonzero "bf" value, with probability 1 - 2^-(8*bf). (If there are further intervening honest nodes, they also have a chance to detect the attack.) Likewise, any attempt to replay or reorder outbound cells should fail in the same way. The "bf" values could reveal to each node its position in the circuit and the client preferences, depending on how we set them; on the other hand, having a fixed bf value would reveal to the last node the length of the circuit. Neither option seems great. This protocol doesn't provide any additional forward secrecy beyond what Tor has today. We could fix that by changing our use of the stream cipher so that instead of running in counter mode between cells, we use a tweaked stream cipher and change the tweak with each cell (possibly based on the unused portion of the MAC). This protocol does support loose source routing, so long as no padding bytes are added by any router-added nodes.
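The detection behavior discussed in this section can be sketched with a small version of the CHECK helper from A.3. This is an illustration under stated assumptions, not a proposed implementation: HMAC-SHA256 stands in for the unspecified MAC, the key and payload are hypothetical, and the stream is kept as a plain byte string where a real implementation would presumably keep a running MAC state instead.

```python
import hashlib
import hmac
from typing import Optional, Tuple


def truncated_mac(key: bytes, stream: bytes, label: bytes, nbytes: int) -> bytes:
    """First nbytes of MAC(key, stream | label); HMAC-SHA256 is a stand-in."""
    return hmac.new(key, stream + label, hashlib.sha256).digest()[:nbytes]


def check(cell: bytes, b: int, key: bytes,
          stream: bytes) -> Tuple[Optional[str], Optional[bytes], bytes]:
    """Return (status, rest, updated_stream), mirroring CHECK in A.3."""
    m, rest = cell[:b], cell[b:]
    if b == 0:
        return None, rest, stream  # this hop does not authenticate cells
    candidate = stream + rest
    for status in ("relay", "receive"):
        expected = truncated_mac(key, candidate, status.encode(), b)
        if hmac.compare_digest(m, expected):
            return status, rest, candidate
    return "BAD", None, stream


mf = b"hypothetical-mac-key"
bf = 8
rest = b"encrypted remainder of the cell"
cell = truncated_mac(mf, rest, b"relay", bf) + rest

# An unmodified cell is accepted and extends the MAC'd stream.
status_ok, _, stream_after = check(cell, bf, mf, b"")

# A tagged cell (one flipped bit in the body) fails both MAC checks.
tagged = cell[:bf] + bytes([cell[bf] ^ 0x01]) + cell[bf + 1:]
status_tagged, _, _ = check(tagged, bf, mf, b"")

# Replaying the accepted cell fails too, because the stream has moved on.
status_replay, _, _ = check(cell, bf, mf, stream_after)

print(status_ok, status_tagged, status_replay)
```

With bf bytes of MAC, a modified cell slips through only if a truncated MAC happens to match, which is where the 2^-(8*bf) term in the probability above comes from.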
In a circuit, every node has at least one relay cell sent to it: even non-exit nodes get a RELAY_EXTEND cell. 4. Discussion I'm not currently seeing a reason to strongly prefer one of these approaches over the other. In favor of large-block encryption: - The space overhead seems smaller: we need to use up fewer bytes in order to get equivalent-looking security. (For example, if we want to have P < 2^-64 that a cell altered by hop 1 could be accepted by hop 2 or hop 3, *and* we want P < 2^-64 that a cell altered by hop 2 could be accepted by hop 3, the large-block approach needs about 8 bytes for the Zeros field, whereas the short-MAC-and-pad approach needs 16 bytes worth of MACs.) - We get forward secrecy pretty easily. - The format doesn't leak anything about the length of the circuit, or limit it. - We don't need complicated logic to set the 'b' parameters. - It doesn't need tricky padding code. In favor of short-MAC-and-pad: - All of the primitives used are much better analyzed and understood. There's nothing esoteric there. The format itself is similar to older, well-analyzed formats. - Most of the constructions for the large-block-cipher approach seem less efficient in CPU cycles than a good stream cipher and a MAC. (But I don't want to discuss that now; see section 1.4 above!) Unclear: - Suppose that an attacker controls the first and last hop of a circuit, and tries an end-to-end tagging attack. With large-block encryption, the tagged cell and all future cells on the circuit turn to junk after the tagging attack, with P ~~ 100%. With short-MAC-and-pad, the circuit is destroyed at the second hop, with P ~~ 1 - 2^-(8*bf). Is one of these approaches materially worse for the attacker? - Can we do better than the "compute two MACs" approach for distinguishing the relay and receive cases of the short-MAC-and-pad protocol? - To what extent can we improve these protocols? - If we do short-MAC-and-pad, should we apply the forward security hack alluded to in section 3.2?
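The space-overhead comparison above (about 8 bytes of Zeros versus 16 bytes of MACs) can be checked with a little arithmetic. This sketch assumes the acceptance bounds given in sections 2.3 and 3.2, a target acceptance probability of 2^-64, and that in the example only hops 2 and 3 need to check a MAC; the function names are made up for illustration.

```python
import math


def zeros_bytes_needed(security_bits: int) -> int:
    """Smallest Z with 8*Z + 7 >= security_bits, from P < 2^-(8*Z + 7)."""
    return max(0, math.ceil((security_bits - 7) / 8))


def mac_bytes_needed(security_bits: int, checking_hops: int) -> int:
    """Each checking hop needs bf bytes with 8*bf >= security_bits."""
    return checking_hops * math.ceil(security_bits / 8)


# Large-block design: one Zeros field covers every hop.
print(zeros_bytes_needed(64))   # 8

# Short-MAC-and-pad: hop 2 and hop 3 each need their own 8-byte MAC.
print(mac_bytes_needed(64, 2))  # 16
```

The large-block cost stays fixed as circuits get longer, while the short-MAC cost grows with the number of hops that check a MAC.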
5. Acknowledgments

   Thanks to the many reviewers of the initial drafts of this proposal. If you can make any sense of what I'm saying, they deserve much of the credit.

A. Formal description

   Note that in all these cases, more efficient descriptions exist.

A.1. The current Tor relay protocol.

   Relay cell format:

      Relay command [1 byte]
      Zeros         [2 bytes]
      StreamID      [2 bytes]
      Digest        [4 bytes]
      Length        [2 bytes]
      Data          [498 bytes]

   Circuit setup:

      (Specified elsewhere; the client negotiates with each router in a circuit the secret AES keys Kf, Kb, and the secret 'digest keys' Df, and Db. They initialize AES counters Cf and Cb to 0. They initialize the digest stream Sf to Df, and Sb to Db.)

   HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]):

      1. If the Zeros field of Cell is not [00 00], return False.
      2. Let Cell' = Cell with its Digest field set to [00 00 00 00].
      3. Let S' = Stream | Cell'.
      4. If the first 4 bytes of SHA1(S') equal the Digest field of Cell, set Stream to S', and return True.
      5. Otherwise return False.

   HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])

      1. Set the Zeros and Digest field of Cell' to [00] bytes.
      2. Set Stream to Stream | Cell'.
      3. Construct Cell from Cell' by setting the Digest field to the first 4 bytes of SHA1(Stream), and taking all other fields as-is.
      4. Return Cell.

   HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])

      1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and counter 'Ctr'. Increment 'Ctr' by the length of the cell.

   HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])

      1. Same as ENCRYPT.

   Router operation, upon receiving an inbound cell -- that is, one sent towards the client.

      1. ENCRYPT(cell, Kb, Cb)
      2. Send the encrypted contents towards the client.

   Router operation, upon receiving an outbound cell -- that is, one sent away from the client.

      1. DECRYPT(cell, Kf, Cf)
      2. If CHECK(Cell, Sf) is true, this cell is for us. Do not relay the cell.
      3. Otherwise, this cell is not for us.
         Send the decrypted cell to the next hop on the circuit, or discard it if there is no next hop.

   Router operation, to create a relay cell that will be delivered to the client.

      1. Construct a Relay cell Cell' with the relay command, length, stream ID, and body fields set as appropriate.
      2. Let Cell = CONSTRUCT(Cell', Sb).
      3. ENCRYPT(Cell, Kb, Cb)
      4. Send the encrypted cell towards the client.

   Client operation, receiving an inbound cell.

      For each hop H in a circuit, starting with the first hop and ending (possibly) with the last:

         1. DECRYPT(Cell, Kb_H, Cb_H)
         2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop H. Exit the loop, and return the cell in its current form.

      If we reach the end of the loop without finding the right hop, the cell is bogus or corrupted.

   Client operation, sending an outbound cell to hop H.

      1. Construct a Relay cell Cell' with the relay command, length, stream ID, and body fields set as appropriate.
      2. Let Cell = CONSTRUCT(Cell', Sf_H)
      3. For i = H..1:
         A. ENCRYPT(Cell, Kf_i, Cf_i)
      4. Deliver Cell to the first hop in the circuit.

A.2. The large-block-cipher protocol

   Same as A.1, except for the following changes.

   The cell format is now:

      Zeros         [Z bytes]
      Length        [2 bytes]
      StreamID      [2 bytes]
      Relay command [1 byte]
      Data          [504-Z bytes]

   Ctr is no longer a counter, but a Tweak value.

   Each key is now a tuple of (Key_Crypt, Key_TC)

   Streams are no longer used.

   HELPER FUNCTION: CHECK(Cell [in], Stream [in,out])

      1. If the Zeros field of Cell contains only [00] bytes, return True.
      2. Otherwise return False.

   HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])

      1. Let Cell be Cell', with its "Zeros" field set to Z [00] bytes.
      2. Return Cell.

   HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out])

      1. Encrypt Cell using Key and Tweak
      2. Let Tweak' = TC(Key_TC, Tweak, Cell)
      3. Set Tweak to Tweak'.

   HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out])

      1. Let Tweak' = TC(Key_TC, Tweak, Cell)
      2. Decrypt Cell using Key and Tweak
      3.
         Set Tweak to Tweak'.

A.3. The short-MAC-and-pad protocol.

   Define M_relay(K,S) as MAC(K, S|"relay").
   Define M_receive(K,S) as MAC(K, S|"receive").
   Define Z(n) as a series of n [00] bytes.
   Define BODY_LEN as 509

   The cell message format is now:

      Relay command [1 byte]
      StreamID      [2 bytes]
      Length        [2 bytes]
      Data          [variable bytes]

   HELPER FUNCTION: CHECK(Cell [in], b [in], K [in], S [in,out]):

      Let M = Cell[0:b]
      Let Rest = Cell[b:...]
      If b == 0:
         Return (nil, Rest)
      Let S' = S | Rest
      If M == M_relay(K,S')[0:b]:
         Set S = S'
         Return ("relay", Rest)
      If M == M_receive(K,S')[0:b]:
         Set S = S'
         Return ("receive", Rest)
      Return ("BAD", nil)

   HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])

      1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and counter 'Ctr'. Increment 'Ctr' by the length of the cell.

   HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])

      1. Same as ENCRYPT.

   Router operation, upon receiving an inbound cell:

      1. ENCRYPT(cell, Kb, Cb)
      2. Send the encrypted contents towards the client.

   Router operation, upon receiving an outbound cell:

      1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf)
      2. If Status == "BAD", drop the cell and destroy the circuit.
      3. Let Cell' = Msg | Z(BODY_LEN - len(Msg))
      4. DECRYPT(Cell', Kf, Cf) [*]
      5. If Status == "receive" or (Status == nil and there is no next hop), Cell' is for us: process it.
      6. Otherwise, send Cell' to the next node.

   Router operation, sending a cell towards the client:

      1. Let Body = a relay cell body of BODY_LEN-bb bytes.
      2. Let Cell' = ENCRYPT(Body, Kb, Cb)
      3. Let Sb = Sb | Cell'
      4. Let M = M_receive(Mb, Sb)[0:bb]
      5. Send the cell M | Cell' back towards the client.

   Client operation, upon receiving an inbound cell:

      For each hop i in the circuit, from first to last:

         1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i)
         2. If Status = "relay", drop the cell and destroy the circuit. (BAD is okay; it means that this hop didn't originate the cell.)
         3. DECRYPT(Msg, Kb_i, Cb_i)
         4.
            If Status = "receive", this cell is from hop i; process it.
         5. Otherwise, set Cell = Msg.

   Client operation, sending an outbound cell:

      Let BF = the total of all bf_i values.

      1. Construct a relay cell body Msg of BODY_LEN-BF bytes.
      2. For each hop i, let Stream_i = ENCRYPT(Kf_i,Z(BODY_LEN),Cf_i)
      3. Let Pad_0 = "".
      4. For i in range 1..N, where N is the number of hops:
            Let Pad_i = Pad_(i-1) | Z(bf_i)
            Let S_last = the last len(Pad_i) bytes of Stream_i.
            Let Pad_i = Pad_i xor S_last
         Now Pad_i is the padding as it will stand after node i has processed it.
      5. For i in range N..1, where N is the number of hops:
            If this is the last hop, let M_* = M_receive. Else let M_* = M_relay.
            Let Body = Msg xor the first len(Msg) bytes of Stream_i
            Let M = M_*(Mf_i, Body | Pad_(i-1))
            Set Msg = M[:bf_i] | Body
      6. Send Msg outbound to the first relay in the circuit.

   [*] Strictly speaking, we could omit the pad-and-decrypt operation once we know we're the final hop.

R. References

   [Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses
      https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-guards.txt

   [TorSpec] The Tor Protocol Specification
      https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.txt

   [TorDesign] Dingledine et al, "Tor: The Second Generation Onion Router",
      https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf

   [Tweak] Liskov et al, "Tweakable Block Ciphers",
      http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf

   [XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity"

   [23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging Attacks"
      http://archives.seul.org/or/dev/Mar-2012/msg00019.html
      (You'll want to read the rest of the thread too.)
Filename: 203-https-frontend.txt Title: Avoiding censorship by impersonating an HTTPS server Author: Nick Mathewson Created: 24 Jun 2012 Status: Obsolete Note: Obsoleted-by pluggable transports. Overview: One frequently proposed approach for censorship resistance is that Tor bridges ought to act like another TLS-based service, and deliver traffic to Tor only if the client can demonstrate some shared knowledge with the bridge. In this document, I discuss some design considerations for building such systems, and propose a few possible architectures and designs. Background: Most of our previous work on censorship resistance has focused on preventing passive attackers from identifying Tor bridges, or from doing so cheaply. But active attackers exist, and exist in the wild: right now, the most sophisticated censors use their anti-Tor passive attacks only as a first round of filtering before launching a secondary active attack to confirm suspected Tor nodes. One idea we've been talking about for a while is that of having a service that looks like an HTTPS service unless a client does some particular secret thing to prove it is allowed to use it as a Tor bridge. Such a system would still succumb to passive traffic analysis attacks (since the packet timings and sizes for HTTPS don't look that much like Tor), but it would be enough to beat many current censors. Goals and requirements: We should make it impossible for a passive attacker who examines only a few packets at a time to distinguish Tor->Bridge traffic from an HTTPS client talking to an HTTPS server. We should make it impossible for an active attacker talking to the server to tell a Tor bridge server from a regular HTTPS server. We should make it impossible for an active attacker who can MITM the server to learn from the client whether it thought it was connecting to an HTTPS server or a Tor bridge. 
   (This implies that an MITM attacker shouldn't be able to learn anything that would help it convince the server to act like a bridge.)

   It would be nice to minimize the required code changes to Tor, and the required code changes to any other software.

   It would be good to avoid any requirement of close integration with any particular HTTP or HTTPS implementation.

   If we're replacing our own profile with that of an HTTPS service, we should do so in a way that lets us use the profile of a popular HTTPS implementation.

   Efficiency would be good: layering TLS inside TLS is best avoided if we can.

Discussion:

   We need an actual web server; HTTP and HTTPS are so complicated that there's no practical way to behave in a bug-compatible way with any popular webserver short of running that webserver.

   More obviously, we need a TLS implementation (or we can't implement HTTPS), and we need a Tor bridge (since that's the whole point of this exercise).

   So from a top-level point of view, the question becomes: how shall we wire these together?

   There are three obvious ways; I'll discuss them in turn below.

Design #1: TLS in Tor

   Under this design, Tor accepts HTTPS connections, decides which ones don't look like the Tor protocol, and relays them to a webserver.

                      +--------------------------------------+
  +------+  TLS       | +------------+ http  +-----------+   |
  | User |<------>    | | Tor Bridge |<----->| Webserver |   |
  +------+            | +------------+       +-----------+   |
                      |         trusted host/network         |
                      +--------------------------------------+

   This approach would let us use a completely unmodified webserver implementation, but would require the most extensive changes in Tor: we'd need to add yet another flavor to Tor's TLS ice cream parlor, and try to emulate a popular webserver's TLS behavior even more thoroughly.

   To authenticate, we would need to take a hybrid approach, and begin forwarding traffic to the webserver as soon as a webserver might respond to the traffic.
   This could be pretty complicated, since it requires us to have a model of how the webserver would respond to any given set of bytes. As a workaround, we might try relaying _all_ input to the webserver, and only replying as Tor in the cases where the website hasn't replied. (This would likely create recognizable timing patterns, though.)

   The authentication itself could use a system akin to Tor proposals 189/190, where an early AUTHORIZE cell shows knowledge of a shared secret if the client is a Tor client.

Design #2: TLS in the web server

                      +----------------------------------+
  +------+  TLS       | +------------+  tor0   +-----+   |
  | User |<------>    | | Webserver  |<------->| Tor |   |
  +------+            | +------------+         +-----+   |
                      |       trusted host/network       |
                      +----------------------------------+

   In this design, we write an Apache module or something that can recognize an authenticator of some kind in an HTTPS header, or recognize a valid AUTHORIZE cell, and respond by forwarding the traffic to a Tor instance.

   To avoid the efficiency issue of doing an extra local encrypt/decrypt, we need to have the webserver talk to Tor over a local unencrypted connection. (I've denoted this as "tor0" in the diagram above.) For implementation convenience, we might want to implement that as a NULL TLS connection, so that the Tor server code wouldn't have to change except to allow local NULL TLS connections in this configuration.

   For the Tor handshake to work properly here, we'll need a way for the Tor instance to know which public key the webserver is configured to use.

   We wouldn't need to support the parts of the Tor link protocol used to authenticate clients to servers: relays shouldn't be using this subsystem at all.

   The Tor client would need to connect and prove its status as a Tor client.
If the client uses some means other than AUTHORIZE cells, or if we want to do the authentication in a pluggable transport, and we therefore decided to offload the responsibility for TLS itself to the pluggable transport, that would scare me: Supporting pluggable transports that have the responsibility for TLS would make it fairly easy to mess up the crypto, and I'd rather not have it be so easy to write a pluggable transport that accidentally makes Tor less secure.

Design #3: Reverse proxy

                    +----------------------------------+
                    |  +-------+  http  +-----------+  |
                    |  |       |<------>| Webserver |  |
    +------+  TLS   |  |       |        +-----------+  |
    | User |<------>   | Proxy |                       |
    +------+        |  |       |  tor0  +-----------+  |
                    |  |       |<------>|    Tor    |  |
                    |  +-------+        +-----------+  |
                    |     trusted host/network         |
                    +----------------------------------+

In this design, we write a server-side proxy to sit in front of Tor and a webserver, or repurpose some existing HTTPS proxy. Its role will be to do TLS, and then forward connections to Tor or the webserver as appropriate. (In the web world, this kind of thing is called a "reverse proxy", so that's the term I'm using here.)

To avoid fingerprinting, we should choose a proxy that's already in common use as a TLS front-end for webservers -- nginx, perhaps. Unfortunately, the more popular tools here seem to be pretty complex, and the simpler tools less widely deployed. More investigation would be needed.

The authorization considerations would be as in Design #2 above; for the reasons discussed there, it's probably a good idea to build the necessary authorization into Tor itself.

I generally like this design best: it lets us isolate the "Check for a valid authenticator and/or a valid or invalid HTTP header, and react accordingly" question to a single program.

How to authenticate: The easiest way

Designing a good MITM-resistant AUTHORIZE cell, or an equivalent HTTP header, is an open problem that we should solve in proposals 190 and 191 and their successors.
I'm calling it out-of-scope here; please see those proposals, their attendant discussion, and their eventual successors. How to authenticate: a slightly harder way Some proposals in this vein have in the past suggested a special HTTP header to distinguish Tor connections from non-Tor connections. This could work too, though it would require substantially larger changes on the Tor client's part, would still require the client take measures to avoid MITM attacks, and would also require the client to implement a particular browser's http profile. Some considerations on distinguishability Against a passive eavesdropper, the easiest way to avoid distinguishability in server responses will be to use an actual web server or reverse web proxy's TLS implementation. (Distinguishability based on client TLS use is another topic entirely.) Against an active non-MITM attacker, the best probing attacks will be ones designed to provoke the system into acting in ways different from those in which a webserver would act: responding earlier than a web server would respond, or later, or differently. We need to make sure that, whatever the front-end program is, it answers anything that would qualify as a well-formed or ill-formed HTTP request whenever the web server would. This must mean, for example, that whatever the correct form of client authorization turns out to be, no prefix of that authorization is ever something that the webserver would respond to. With some web servers (I believe), that's as easy as making sure that any valid authenticator isn't too long, and doesn't contain a CR or LF character. With others, the authenticator would need to be a valid HTTP request, with all the attendant difficulty that would raise. Against an attacker who can MITM the bridge, the best attacks will be to wait for clients to connect and see how they behave. 
In this case, the client probably needs to be able to authenticate the bridge certificate as presented in the initial TLS handshake -- or some other aspect of the TLS handshake if we're feeling insane. If the certificate or handshake isn't as expected, the client should behave as a web browser that's just received a bad TLS certificate. (The alternative there would be to try to impersonate an HTTPS client that has just accepted a self-signed certificate. But that would probably require the Tor client to impersonate a full web browser, which isn't realistic.) Side note: What to put on the webserver? To credibly pretend not to be ourselves, we must pretend to be something else in particular -- and something not easily identifiable or inherently worthless. We should not, for example, have all deployments of this kind use a fixed website, even if that website is the default "Welcome to Apache" configuration: A censor would probably feel that they weren't breaking anything important by blocking all unconfigured websites with nothing on them. Therefore, we should probably conceive of a system like this as "Something to add to your HTTPS website" rather than as a standalone installation. Related work: meek [1] is a pluggable transport that uses HTTP for carrying bytes and TLS for obfuscation. Traffic is relayed through a third-party server (Google App Engine). It uses a trick to talk to the third party so that it looks like it is talking to an unblocked server. meek itself is not really about HTTP at all. It uses HTTP only because it's convenient and the big Internet services we use as cover also use HTTP. meek uses HTTP as a transport, and TLS for obfuscation, but the key idea is really "domain fronting," where it appears to the censor you are talking to one domain (www.google.com), but behind the scenes you are talking to another (meek-reflect.appspot.com). The meek-server program is an ordinary HTTP (not necessarily even HTTPS!) 
server, whose communication is easily fingerprintable; but that doesn't matter because the censor never sees that part of the communication, only the communication between the client and CDN.

One way to think about the difference: if a censor (somehow) learns the IP address of a bridge as described in this proposal, it's easy and low-cost for the censor to block that bridge by IP address. meek aims to make it much more expensive: even if the censor knows a domain is being used (in part) for circumvention, in order to block it the censor has to block something important like the Google frontend or CloudFlare (high collateral damage).

1. https://trac.torproject.org/projects/tor/wiki/doc/meek
Filename: 204-hidserv-subdomains.txt
Title: Subdomain support for Hidden Service addresses
Author: Alessandro Preite Martinez
Created: 6 July 2012
Status: Closed

1. Overview

This proposal aims to extend the .onion naming scheme for Hidden Service addresses with sub-domain components, which will be ignored by the Tor layer but will appear in HTTP Host headers, allowing subdomain-based virtual hosting.

2. Motivation

Sites doing large-scale HTTP virtual hosting on subdomains currently do not have a good option for exposure via Hidden Services, short of creating a separate HS for every subdomain (which in some cases is simply not possible due to the subdomains not being fully known beforehand).

3. Implementation

Tor should ignore any subdomain components besides the Hidden Service key, i.e. "foo.aaaaaaaaaaaaaaaa.onion" should be treated simply as "aaaaaaaaaaaaaaaa.onion".
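The rule in section 3 of proposal 204 can be illustrated with a small sketch. This is not Tor's code: the function name `strip_onion_subdomains` is hypothetical, and the sketch assumes the Hidden Service key is always the label immediately before ".onion".

```python
def strip_onion_subdomains(hostname: str) -> str:
    """Return the bare '<key>.onion' name, dropping any subdomain labels.

    Hypothetical helper: assumes the Hidden Service key is the label
    directly before the final 'onion' label.
    """
    labels = hostname.rstrip(".").lower().split(".")
    if len(labels) < 2 or labels[-1] != "onion":
        return hostname  # not an onion address; leave it untouched
    # Keep only the key label and the 'onion' suffix.
    return ".".join(labels[-2:])


# The Tor layer would connect using the stripped name, while the
# application still sends the full name in its HTTP Host header.
print(strip_onion_subdomains("foo.aaaaaaaaaaaaaaaa.onion"))
```

Only the name used for the Hidden Service lookup changes; the application-level hostname is left alone so that subdomain-based virtual hosting keeps working.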
Filename: 205-local-dnscache.txt
Title: Remove global client-side DNS caching
Author: Nick Mathewson
Created: 20 July 2012
Implemented-In: 0.2.4.7-alpha.
Status: Closed

-1. STATUS

In 0.2.4.7-alpha, client-side DNS caching is off by default; there didn't seem to be much benefit in having per-circuit caches. I'm leaving the original proposal below intact for historical reasons. -Nick

0. Overview

This proposal suggests that, for reasons of security, we move client-side DNS caching from a global cache to a set of per-circuit caches.

This will break some things that used to work. I'll explain how to fix them.

1. Background and Motivation

Since the earliest Tor releases, we've kept a client-side DNS cache. This lets us implement exit policies and exit enclaves -- if we remember that www.mit.edu is 18.9.22.169 the first time we see it, then we can avoid making future requests for www.mit.edu via any node whose exit policy refuses net 18. Also, if there happened to be a Tor node at 18.9.22.169, we could use that node as an exit enclave.

But there are security issues with DNS caches. A malicious exit node or DNS server can lie. And unlike other traffic, where the effect of a lie is confined to the request in question, a malicious exit node can affect the behavior of future circuits when it gives a false DNS reply. This false reply could be used to keep a client connecting to an MITM'd target, or to make a client use a chosen node as an exit enclave for that node, or so on.

With IPv6, tracking attacks will become even more possible: A hostile exit node can give every client a different IPv6 address for every hostname they want to resolve, such that every one of those addresses is under the attacker's control.

And even if the exit node is honest, having a cached DNS result can cause Tor clients to build their future circuits distinguishably: the exit on any subsequent circuit can tell whether the client knew the IP for the address yet or not.
Further, if the site's DNS provides different answers to clients from different parts of the world, then the client's cached choice of IP will reveal where it first learned about the website.

So client-side DNS caching needs to go away.

2. Design

2.1. The basic idea

I propose that clients should cache DNS results in per-circuit DNS caches, not in the global address map.

2.2. What about exit policies?

Microdescriptor-based clients have already dropped the ability to track which nodes declare which exit policies, without much ill effect. As we go forward, I think that remembering the IP address of each request so that we can match it to exit policies will be even less effective, especially if proposals to allow AS-based exit policies can succeed.

2.3. What about exit enclaves?

Exit enclaves are already broken. They need to move towards a cross-certification solution where a node advertises that it can exit to a hostname or domain X.Y.Z, and a signed record at X.Y.Z advertises that the node is an enclave exit for X.Y.Z. That's out-of-scope for this proposal, except to note that nothing proposed here keeps that design from working.

2.4. What about address mapping?

Our current address map algorithm is, more or less:

    N = 0
    while N < MAX_MAPPING && exists map[address]:
        address = map[address]
        N = N + 1
    if N == MAX_MAPPING:
        Give up, it's a loop.

Where 'map' is the union of all mapping entries derived from the controller, the configuration file, trackhostexits maps, virtual-address maps, DNS replies, and so on.

With this proposed design, the DNS cache will not be part of the address map. That means that entries in the address map which relied on happening after the DNS cache entries can no longer work so well. These would include:

    A) Mappings from an IP address to a particular exit, either manually declared or inserted by TrackHostExits.
    B) Mappings from IP addresses to other IP addresses.
    C) Mappings from IP addresses to hostnames.
We can try to solve these by introducing an extra step of address mapping after the DNS cache is applied. In other words, we should apply the address map, then see if we can attach to a circuit. If we can, we try to apply that circuit's dns cache, then apply the address map again. 2.5. What about the performance impact? That all depends on application behavior. If the application continues to make all of its requests with the hostname, there shouldn't be much trouble. Exit-side DNS caches and exit-side DNS will avoid any additional round trips across the Tor network; compared to that, the time to do a DNS resolution at the exit node *should* be small. That said, this will hurt performance a little in the case where the exit node and its resolver don't have the answer cached, and it takes a long time to resolve the hostname. If the application is doing "resolve, then connect to an IP", see 2.6 below. 2.6. What about DNSPort? If the application is doing its own DNS caching, they won't get much security benefit from here. If the application is doing a resolve before each connect, there will be a performance hit when the resolver is using a circuit that hadn't previously resolved the address. Also, DNSPort users: AutomapHostsOnResolve is your friend. 3. Alternate designs and future directions 3.1. Why keep client-side DNS caching at all? A fine question! I am not sure it actually buys us anything any longer, since exits also have DNS caching. Shall we discuss that? It would sure simplify matters. 3.2. The impact of DNSSec Once we get DNSSec support, clients will be able to verify whether an exit's answers are correctly signed or not. When that happens, we could get most of the benefits of global DNS caching back, without most of the security issues, if we restrict it to DNSSec-signed answers.
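The address-map loop from section 2.4 of proposal 205, together with the extra mapping pass suggested there, can be sketched as follows. This is an illustrative sketch only: the function names, the dictionary representation of the map and of the per-circuit cache, and the value chosen for `MAX_MAPPING` are all assumptions, not taken from Tor's source.

```python
MAX_MAPPING = 16  # assumed loop bound; the proposal does not give a value


def apply_address_map(address, address_map):
    """Follow map entries until none applies; return None on a loop."""
    n = 0
    while n < MAX_MAPPING and address in address_map:
        address = address_map[address]
        n += 1
    if n == MAX_MAPPING:
        return None  # give up, it's a loop
    return address


def resolve_for_circuit(address, address_map, circuit_dns_cache):
    """Apply the map, then the circuit's DNS cache, then the map again."""
    mapped = apply_address_map(address, address_map)
    if mapped is None:
        return None
    cached_ip = circuit_dns_cache.get(mapped)
    if cached_ip is None:
        return mapped  # nothing cached on this circuit; exit will resolve
    # Second pass: lets IP-keyed entries (cases A-C) still take effect.
    return apply_address_map(cached_ip, address_map)


address_map = {"alias.example": "www.example.com", "192.0.2.1": "192.0.2.2"}
circuit_cache = {"www.example.com": "192.0.2.1"}
print(resolve_for_circuit("alias.example", address_map, circuit_cache))
```

The second call to `apply_address_map` is what restores mappings keyed on IP addresses, since those can only match once a circuit's cache has supplied an IP.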
Filename: 206-directory-sources.txt Title: Preconfigured directory sources for bootstrapping Author: Nick Mathewson Created: 10-Oct-2012 Status: Closed Implemented-In: 0.2.4.7-alpha Motivation and History: We've long wanted a way for clients to do their initial bootstrapping not from the directory authorities, but from some other set of nodes expected to probably be up when future clients are starting. We tried to solve this a while ago by adding a feature where we could ship a 'fallback' networkstatus file -- one that would get parsed when we had no current networkstatus file, and which we would use to learn about possible directory sources. But we couldn't actually use it, since it turns out that a randomly chosen list of directory caches from 4-5 months ago is a terrible place to go when bootstrapping. Then for a while we considered an "Extra-Stable" flag so that clients could use only nodes with a long history of existence from these fallback networkstatus files. We never built it, though. Instead, we can do this so much more simply. If we want to ship Tor with a list of initial locations to go for directory information, why not just do so? Proposal: In the same way that Tor currently ships with a list of directory authorities, Tor should also ship with a list of directory sources -- places to go for an initial consensus if you don't have a somewhat recent one. These need to include an address for the cache's ORPort, and its identity key. Additionally, they should include a selection weight. They can be configured with a torrc option, just like directory authorities are now. Whenever Tor is starting without a consensus, if it would currently ask a directory authority for a consensus, it should instead ask one of these preconfigured directory sources. I have code for this (see git branch fallback_dirsource_v2) in my public repository. When we deploy this, we can (and should) rip out the Fallback Networkstatus File logic. 
How to find nodes to make into directory sources: We could take any of three approaches for selecting these initial directory sources. First, we could try to vet them a little, with a light variant of the process we use for authorities. We'd want to look for nodes where we knew the operators, verify that they were okay with keeping the same IP for a very long time, and so forth. Second, we could try to pick nodes for listing with each Tor release based entirely on how long those nodes have been up. Anything that's been a high-reliability directory for a long time on the same IP (like, say, a year) could be a good choice. Third, we could blend the approach and start by looking for up-for-a-long-time nodes, and then also ask the operators whether their nodes are likely to stay running for a long time. I think the third model is best. Some notes on security: Directory source nodes have an opportunity to learn about new users connecting to the network for the first time. Once we have directory guards, that's going to be a fairly uncommon ability. We should be careful in any directory guard design to make sure that we don't fall back to the directory sources any more than we need to. See proposal 207.
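As a rough illustration of proposal 206, a preconfigured directory source carries an ORPort address, an identity key, and a selection weight, and a bootstrapping client picks one in proportion to that weight. The sketch below assumes a simple weighted random choice; the `DirSource` type, its field names, and `pick_dir_source` are hypothetical and do not reflect the code in the fallback_dirsource_v2 branch.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DirSource:
    """One preconfigured directory source (hypothetical representation)."""
    address: str         # ORPort address as "host:port"
    identity: str        # identity key fingerprint
    weight: float = 1.0  # relative selection weight


def pick_dir_source(sources, rng=random):
    """Pick one source with probability proportional to its weight."""
    weights = [s.weight for s in sources]
    return rng.choices(sources, weights=weights, k=1)[0]


sources = [
    DirSource("192.0.2.10:9001", "AAAA", weight=3.0),
    DirSource("192.0.2.20:9001", "BBBB", weight=1.0),
]
chosen = pick_dir_source(sources, random.Random(0))
print(chosen.address)
```

A client would only do this when it has no reasonably recent consensus; otherwise it has no need to contact the preconfigured sources at all.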
Filename: 207-directory-guards.txt Title: Directory guards Author: Nick Mathewson Created: 10-Oct-2012 Status: Closed Target: 0.2.4.x Motivation: When we added guard nodes to resist profiling attacks, we made it so that clients won't build general-purpose circuits through just any node. But clients don't use their guard nodes when downloading general-purpose directory information from the Tor network. This allows a directory cache, over time, to learn a large number of IPs for non-bridge-using users of the Tor network. Proposal: In the same way as they currently pick guard nodes as needed, adding more guards as those nodes are down, clients should also pick a small-ish set of directory guard nodes, to persist in Tor's state file. Clients should, as much as possible, use their regular guards as their directory guards. When downloading a regular directory object (that is, not a hidden service descriptor), clients should prefer their directory guards first. Then they should try more directories from a recent consensus (if they have one) and pick one of those as a new guard if the existing guards are down and a new one is up. Failing that, they should fall back to a directory authority (or a directory source, if those get implemented-- see proposal 206). If a client has only one directory guard running, they should add new guards and try them, and then use their directory guards to fetch multiple descriptors in parallel. Open questions and notes: What properties does a node need to be a suitable directory guard? If we require that it have the Guard flag, we'll lose some nodes: only 74% of the directory caches have it (weighted by bandwidth). We may want to tune the algorithm used to update guards. For future-proofing, we may want to have the DirCache flag from 185 be the one that nodes must have in order to be directory guards. For now, we could have authorities set it to Guard && DirPort!=0, with a better algorithm to follow. 
Authorities should never get the DirCache flag.
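The fetch order described in proposal 207, and the interim DirCache rule from its notes, might look roughly like the sketch below. All names here are hypothetical, nodes are plain strings, and list order stands in for whatever random or weighted choice a real client would make.

```python
def dircache_flag(is_guard, dir_port, is_authority):
    """Interim rule from the notes: Guard && DirPort != 0, never authorities."""
    return bool(is_guard and dir_port != 0 and not is_authority)


def choose_directory(dir_guards, consensus_dirs, fallbacks, is_up):
    """Return (node_to_use, updated_dir_guards).

    Order of preference: existing directory guards, then a new guard
    picked from a recent consensus, then an authority or directory source.
    """
    for guard in dir_guards:
        if is_up(guard):
            return guard, list(dir_guards)
    for node in consensus_dirs:
        if node not in dir_guards and is_up(node):
            # Existing guards are down; adopt this node as a new guard.
            return node, list(dir_guards) + [node]
    for node in fallbacks:
        if is_up(node):
            return node, list(dir_guards)
    return None, list(dir_guards)


running = {"guard2", "cache1", "authority1"}
node, guards = choose_directory(
    ["guard1", "guard2"], ["cache1"], ["authority1"], lambda n: n in running
)
print(node, guards)
```

The updated guard list is what a client would persist in its state file, so that the same small set of directories keeps being used across restarts.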
Filename: 208-ipv6-exits-redux.txt
Title: IPv6 Exits Redux
Author: Nick Mathewson
Created: 10-Oct-2012
Status: Closed
Target: 0.2.4.x
Implemented-In: 0.2.4.7-alpha

1. Obligatory Motivation Section

[Insert motivations for IPv6 here. Mention IPv4 address exhaustion. Insert official timeline for official IPv6 adoption here. Insert general desirability of being able to connect to whatever address there is here. Insert profession of firm conviction that eventually there will be something somebody wants to connect to which requires the ability to connect to an IPv6 address.]

2. Proposal

Proposal 117 has been there since coderman wrote it in 2007, and it's still mostly right. Rather than replicate it in full, I'll describe this proposal as a patch to it.

2.1. Exit policies

Rather than specify IPv6 policies in full, we should move (as we have been moving with IPv4 addresses) to summaries of which IPv6 ports are generally permitted. So let's allow server descriptors to include a list of accepted IPv6 ports, using the same format as the "p" line in microdescriptors, using the "ipv6-policy" keyword.

    "ipv6-policy" SP ("accept" / "reject") SP PortList NL

Exits should still, of course, be able to configure more complex policies, but they should no longer need to tell the whole world about them.

After this ipv6-policy line is validated, its numeric ports and ranges should get copied into a "p6" line in microdescriptors.

This change breaks the existing exit enclave idea for IPv6, but the existing exit enclave implementation never worked right in the first place. If we can come up with a good way to support it, we can add that back in.

2.2. Which addresses should we connect to?

One issue that's tripped us up a few times is how to decide whether we can use IPv6 addresses. You can't use them with SOCKS4 or SOCKS4a, IIUC. With SOCKS5, there's no way to indicate that you prefer IPv4 or IPv6. It's possible that some SOCKS5 users won't understand IPv6 addresses.
With this in mind, I'm going to suggest that with SOCKS4 or SOCKS4a, clients should always require IPv4. With SOCKS5, clients should accept IPv6.

If it proves necessary, we can also add per-SOCKSPort configuration flags to override the above default behavior.

See also partitioning discussion in Security Notes below.

2.3. Extending BEGIN cells.

Prop117 (and the section above) says that clients should prefer one address or another, but doesn't give them a means to tell the exit to do so. Here's one.

We define an extension to the BEGIN cell as follows. After the ADDRESS | ':' | PORT | [00] portion, the cell currently contains all [00] bytes. We add a 32-bit flags field, stored as an unsigned 32-bit value, after the [00]. All these flags default to 0, obviously.

We define the following flags:

    bit
      1 -- IPv6 okay. We support learning about IPv6 addresses and connecting to IPv6 addresses.
      2 -- IPv4 not okay. We don't want to learn about IPv4 addresses or connect to them.
      3 -- IPv6 preferred. If there are both IPv4 and IPv6 addresses, we want to connect to the IPv6 one. (By default, we connect to the IPv4 address.)
      4..32 -- Reserved.

As with so much else, clients should look at the platform version of the exit they're using to see if it supports these flags before sending them.

2.4. Minor changes to proposal 117

GETINFO commands that return an address, and which should return two, should not in fact begin returning two addresses separated by CRLF. They should retain their current behavior, and there should be a new "all my addresses" GETINFO target.

3. Security notes:

Letting clients signal that they want or will accept IPv6 addresses creates two partitioning issues that didn't exist before. One is the version partitioning issue: anybody who supports IPv6 addresses is obviously running the new software. Another is option partitioning: anybody who is using a SOCKS4a application will look different from somebody who is using a SOCKS5 application.
We can't do much about version partitioning, I think. If we felt especially clever, we could have a flag day. Is that necessary? For option partitioning, are there many applications whose behavior is indistinguishable except that they are sometimes configured to use SOCKS4a and sometimes to use SOCKS5? If so, the answer may well be to persuade as many users as possible to switch those to SOCKS5, so that they get IPv6 support and have a large anonymity set. IPv6 addresses are plentiful, which makes caching them dangerous if you're hoping to avoid tracking over time. (With IPv4 addresses, it's harder to give every user a different IPv4 address for a target hostname with a long TTL, and then accept connections to those IPv4 addresses from different exits over time. With IPv6, it's easy.) This makes proposal 205 especially necessary here.
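To make the BEGIN cell extension from section 2.3 of proposal 208 concrete, here is a minimal sketch of building and parsing the ADDRESS | ':' | PORT | [00] | FLAGS portion of the cell body. Two things are assumptions rather than statements from the proposal: that "bit 1" means the least significant bit, and that the 32-bit field is stored in network (big-endian) byte order. The function and constant names are hypothetical.

```python
import struct

# Assumption: "bit 1" in the proposal is the least significant bit.
FLAG_IPV6_OK = 1 << 0
FLAG_IPV4_NOT_OK = 1 << 1
FLAG_IPV6_PREFERRED = 1 << 2


def build_begin_payload(address, port, flags=0):
    """Encode ADDRESS ':' PORT [00] followed by the 32-bit flags field."""
    body = f"{address}:{port}".encode("ascii") + b"\x00"
    # Assumption: flags are stored big-endian (network order).
    return body + struct.pack("!I", flags)


def parse_begin_payload(payload):
    """Decode a payload; missing or all-zero trailing bytes mean flags == 0."""
    addrport, sep, rest = payload.partition(b"\x00")
    if not sep:
        raise ValueError("missing NUL terminator after ADDRESS:PORT")
    host, _, port = addrport.decode("ascii").rpartition(":")
    flags = struct.unpack("!I", rest[:4])[0] if len(rest) >= 4 else 0
    return host, int(port), flags


payload = build_begin_payload(
    "www.example.com", 443, FLAG_IPV6_OK | FLAG_IPV6_PREFERRED
)
print(parse_begin_payload(payload))
```

Because older cells are padded with [00] bytes after the terminator, a parser written this way reads them as having no flags set, which matches the proposal's statement that all flags default to 0.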
Filename: 209-path-bias-tuning.txt Title: Tuning the Parameters for the Path Bias Defense Author: Mike Perry Created: 01-10-2012 Status: Obsolete Target: 0.2.4.x+ Overview This proposal describes how we can use the results of simulations in combination with network scans to set reasonable limits for the Path Bias defense, which causes clients to be informed about and ideally rotate away from Guards that provide extremely low circuit success rates. Motivation The Path Bias defense is designed to defend against a type of route capture where malicious Guard nodes deliberately fail circuits that extend to non-colluding Exit nodes to maximize their network utilization in favor of carrying only compromised traffic. This attack was explored in the academic literature in [1], and a variant involving cryptographic tagging was posted to tor-dev[2] in March. In the extreme, the attack allows an adversary that carries c/n of the network capacity to deanonymize c/n of the network connections, breaking the O((c/n)^2) property of Tor's original threat model. In this case, however, the adversary is only carrying circuits for which either the entry and exit are compromised, or all three nodes are compromised. This means that the adversary's Guards will fail all but (c/n) + (c/n)^2 of their circuits for clients that select it. For 10% c/n compromise, such an adversary succeeds only 11% of their circuits that start at their compromised Guards. For 20% c/n compromise, such an adversary would only succeed 24% of their circuit attempts. It is this property which leads me to believe that a simple local accounting defense is indeed possible and worthwhile. Design Description The Path Bias defense is a client-side accounting mechanism in Tor that tracks the circuit failure rate for each of the client's guards. 
Clients maintain two integers for each of their guards: a count of the number of times a circuit was extended at least one hop through that guard, and a count of the number of circuits that successfully complete through that guard. The ratio of these two numbers is used to determine a circuit success rate for that Guard.

The system should issue a notice log message when Guard success rate falls below 70%, a warn when Guard success rate falls below 50%, and should drop the Guard when the success rate falls below 30%.

Circuit build timeouts are only counted as path failures if the circuit fails to complete before the 95% "right-censored" (aka "MEASUREMENT_EXPIRED") timeout interval, not the 80% timeout condition[5]. This was done based on the assumption that destructive cryptographic tagging is the primary vector for the path bias attack, until such time as Tor's circuit crypto can be upgraded. Therefore, being more lenient with timeout makes us more resilient to network conditions.

To ensure correctness, checks are performed to ensure that we do not count successes without also counting the first hop (see usage of path_state_t as well as pathbias_* in the source).

Similarly, to provide a moving average of recent Guard activity while still preserving the ability to ensure correctness, we periodically "scale" the success counts by first multiplying by a numerator (currently 1) and then dividing by an integer divisor (currently 2). Scaling is performed when the counts exceed the moving average window (300) and when the division does not produce integer truncation.

No log messages should be displayed, nor should any Guard be dropped until it has completed at least 150 first hops (inclusive).

Analysis: Simulation

To test the defense in the face of various types of malicious and non-malicious Guard behavior, I wrote a simulation program in Python[3].
The simulation confirmed that without any defense, an adversary that provides c/n of the network capacity is able to observe c/n of the network flows using circuit failure attacks.

It also showed that with the defense, an adversary that wishes to evade detection has compromise rates bounded by:

    P(compromise) <= (c/n)^2 * (100/CUTOFF_PERCENT)
    circs_per_client <= circuit_attempts*(c/n)

In this way, the defense restores the O((c/n)^2) compromise property, but unfortunately only over long periods of time (see Security Considerations below).

The spread between the cutoff values and the normal rate of circuit success has a substantial effect on false positives. From the simulation's results, the sweet spot for the size of this spread appears to be 10%. In other words, we want to set the cutoffs such that they are 10% below the success rate we expect to see in normal usage.

The simulation also demonstrates that larger "scaling window" sizes reduce false positives for instances where non-malicious guards experience some ambient rate of circuit failure.

Analysis: Live Scan

Preliminary Guard node scanning using the txtorcon circuit scanner[4] shows normal circuit completion rates between 80-90% for most Guard nodes.

However, it also showed that CPU overload conditions can easily push success rates as low as 45%. Even more concerning is that for a brief period during the live scan, success rates dropped to 50-60% network-wide (regardless of Guard node choice).

Based on these results, the notice condition should be 70%, the warn condition should be 50%, and the drop condition should be 30%.

However, see the Security Considerations sections for reasons to choose more lenient bounds.

Future Analysis: Deployed Clients

It's my belief that further analysis should be done by deploying loglines for all three thresholds in clients in the live network to utilize user reports on how often high rates of circuit failure are seen before we deploy changes to rotate away from failing Guards.
I believe these log lines should be deployed in 0.2.3.x clients, to maximize the exposure of the code to varying network conditions, so that we have enough data to consider deploying the Guard-dropping cutoff in 0.2.4.x. Security Considerations: DoS Conditions While the scaling window does provide freshness and can help mitigate "bait-and-switch" attacks, it also creates the possibility of conditions where clients can be forced off their Guards due to temporary network-wide CPU DoS. This provides another reason beyond false positive concerns to set the scaling window as large as is reasonable. A DoS directed at specific Guard nodes is unlikely to allow an adversary to cause clients to rotate away from that Guard, because it is unlikely that the DoS can be precise enough to allow first hops to that Guard to succeed, but also cause extends to fail. This leaves network-wide DoS as the primary vector for influencing clients. Simulation results show that in order to cause clients to rotate away from a Guard node that previously succeeded 80% of its circuits, an adversary would need to induce a 25% success rate for around 350 circuit attempts before the client would reject it or a 5% success rate for around 215 attempts, both using a scaling window of 300 circuits. Assuming one circuit per Guard per 10 minutes of active client activity, this is a sustained network-wide DoS attack of 60 hours for the 25% case, or 38 hours for the 5% case. Presumably this is enough time for the directory authorities to respond by altering the pb_disablepct consensus parameter before clients rotate, especially given that most clients are not active for even 38 hours on end, and will tend to stop building circuits while idle. If we raised the scaling window to 500 circuits, it would require 1050 circuits if the DoS brought circuit success down to 25% (175 hours), and 415 circuits if the DoS brought the circuit success down to 5% (69 hours). 
The tradeoff, though, is that larger scaling window values allow Guard nodes to compromise clients for duty cycles of around the size of this window (up to the (c/n)^2 * 100/CUTOFF_PERCENT limit in aggregate), so we do have to find balance between these concerns. Security Considerations: Targeted Failure Attacks If an adversary controls a significant portion of the network, they may be able to target a Guard node by failing their circuits. In the context of cryptographic tagging, both the Middle node and the Exit node are able to recognize their colluding peers. The Middle node sees the Guard directly, and the Exit node simply reverses a non-existent tag, causing a failure. P(EvilMiddle) || P(EvilExit) = 1.0 - P(HonestMiddle) && P(HonestExit) = 1.0 - (1.0-(c/n))*(1.0-(c/n)) For 10% compromise, this works out to the ability to fail an additional 19% of honest Guard circuits, and for 20% compromise, it works out to 36%. When added to the ambient circuit failure rates (10-20%), this is within range of the notice and warn conditions, but not the guard failure condition. However, this attack does become feasible if a network-wide DoS (or simply CPU load) is able to elevate the ambient failure rate to 51% for the 10% compromise case, or 34% for the 20% compromise case. Since both conditions would elicit notices and/or warns from *all* clients, this attack should be detectable. It can also be detected through the bandwidth authorities (who could possibly even set pathbias parameters directly based on measured ambient circuit failure rates), should we deploy #7023. Implementation Notes: Log Messages Log messages need to be chosen with care to avoid alarming users. I suggest: Notice: "Your Guard %s is failing more circuits than usual. Most likely this means the Tor network is overloaded. Success counts are %d/%d." Warn: "Your Guard %s is failing a very large amount of circuits. 
Most likely this means the Tor network is overloaded, but it could also mean an attack against you or potentially the Guard itself. Success counts are %d/%d."

Drop: "Your Guard %s is failing an extremely large amount of circuits. [Tor has disabled use of this Guard.] Success counts are %d/%d."

The second piece of the Drop message would not be present in 0.2.3.x, since the Guard won't actually be dropped.

Implementation Notes: Consensus Parameters

The following consensus parameters reflect the constants listed in the proposal. These parameters should also be available for override in torrc.

  pb_mincircs=150
    The minimum number of first hops before we log or drop Guards.

  pb_noticepct=70
    The threshold of circuit success below which we display a notice.

  pb_warnpct=50
    The threshold of circuit success below which we display a warn.

  pb_disablepct=30
    The threshold of circuit success below which we disable the guard.

  pb_scalecircs=300
    The number of first hops at which we scale the counts down.

  pb_multfactor=1
    The integer numerator by which we scale.

  pb_scalefactor=2
    The integer divisor by which we scale.

  pb_dropguards=0
    If non-zero, we should actually drop guards as opposed to warning.

Implementation Notes: Differences between proposal and current source

This proposal adds a few changes over the implementation currently deployed in origin/master.

The log messages suggested above are different than those in the source.

The following consensus parameters had changes to their default values, based on results from simulation and scanning:

  pb_mincircs=150
  pb_noticepct=70
  pb_disablepct=30
  pb_scalecircs=300

Also, the following consensus parameters are additions:

  pb_multfactor=1
  pb_warnpct=50
  pb_dropguards=0

Finally, 0.2.3.x needs to be synced with origin/master, but should also ignore the pb_dropguards parameter (but ideally still provide the equivalent pb_dropguards torrc option).

1. http://freehaven.net/anonbib/cache/ccs07-doa.pdf
2.
https://lists.torproject.org/pipermail/tor-dev/2012-March/003347.html
3. https://gitweb.torproject.org/torflow.git/tree/HEAD:/CircuitAnalysis/PathBias
4. https://github.com/meejah/txtorcon/blob/exit_scanner/apps/exit_scanner/failure-rate-scanner.py
5. See 2.4.1 of path-spec.txt for further details on circuit timeout calculations.
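The arithmetic in "Security Considerations: Targeted Failure Attacks" can be checked with a short sketch. The function names below are illustrative and do not come from the Tor source. Like the text, the sketch simply adds the induced failures to the ambient failure rate, which is a simplifying assumption.

```python
# Sketch only: function names are hypothetical, not taken from the Tor source.

def extra_failure_fraction(compromised: float) -> float:
    """Fraction of an honest Guard's circuits that meet an evil Middle or Exit.

    Implements 1.0 - (1.0 - c/n) * (1.0 - c/n), where `compromised` is c/n.
    """
    honest = 1.0 - compromised
    return 1.0 - honest * honest


def ambient_failure_needed(compromised: float, disable_pct: int = 30) -> float:
    """Ambient failure rate (in percent) that, added to the induced failures,
    brings circuit success down to the pb_disablepct threshold."""
    target_failure_pct = 100 - disable_pct
    return target_failure_pct - 100.0 * extra_failure_fraction(compromised)


if __name__ == "__main__":
    for c in (0.10, 0.20):
        extra = 100.0 * extra_failure_fraction(c)
        needed = ambient_failure_needed(c)
        print(f"c/n={c:.0%}: extra failures {extra:.0f}%, "
              f"ambient failure needed {needed:.0f}%")
```

With pb_disablepct=30 this reproduces the 19%/36% and 51%/34% figures quoted in that section.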
Filename: 210-faster-headless-consensus-bootstrap.txt
Title: Faster Headless Consensus Bootstrapping
Author: Mike Perry, Tim Wilson-Brown, Peter Palfrader
Created: 01-10-2012
Last Modified: 02-10-2015
Status: Superseded
Target: 0.2.8.x+
Status-notes:

 * This has been partially superseded by the fallback directory code, and partially by the exponential-backoff code.

Overview and Motivation

This proposal describes a way for clients to fetch the initial consensus more quickly in situations where some or all of the directory authorities are unreachable. This proposal is meant to describe a solution for bug #4483.

Design: Bootstrap Process Changes

The core idea is to attempt to establish bootstrap connections in parallel during the bootstrap process, and download the consensus from the first connection that completes.

Connection attempts will be performed on an exponential backoff basis. Initially, connections will be performed to a randomly chosen hard coded directory mirror and a randomly chosen canonical directory authority. If neither of these connections completes, additional mirror and authority connections are tried. Mirror connections are tried at a faster rate than authority connections.

Clients represent the majority of the load on the network. They can use directory mirrors to download their documents, as the mirrors download their documents from the authorities early in the consensus validity period.

We specify that client mirror connections retry after one second, and then double the retry time with every connection attempt:

  0, 1, 2, 4, 8, 16, 32, ...

(The timers currently implemented in Tor increment with every connection failure.)

We specify that client directory authority connections retry after 10 seconds, and then double the retry time with every connection:

  0, 10, 20, ...

If a client has both an IPv4 and IPv6 address, it will try IPv4 and IPv6 mirrors and authorities on the following schedule:

  IPv4, IPv6, IPv4, IPv6, ...
[ TODO: should we add random noise to these scheduled times? - teor
  Tor doesn’t add random noise to the current failure-based timers, but as failures are a network event, they are somewhat random/arbitrary already. These attempt-based timers will go off every few seconds, exactly on the second. ]

(Relays can’t use directory mirrors to download their documents, as they *are* the directory mirrors.)

The maximum retry time for all these timers is 3 days + 1 hour. This places a small load on the mirrors and authorities, while allowing a client that regains a network connection to eventually download a consensus.

We try IPv4 first to avoid overloading IPv6-enabled authorities and mirrors. Each timing schedule uses a separate IPv4/IPv6 schedule. This ensures that clients try an IPv6 authority within the first 10 seconds. This helps implement #8374 and related tickets.

We don't want to keep on trying an IP version that always fails. Therefore, once sufficient IPv4 and IPv6 connections have been attempted, we select an IP version for new connections based on the ratio of their failure rates, up to a maximum of 1:5. This may not make a substantial difference to consensus downloads, as we only need one successful consensus download to bootstrap. However, it is important for future features like #17217, where clients try to automatically determine if they can use IPv4 or IPv6 to contact the Tor network.

The retry timers and IP version schedules must reset on HUP and any network reachability events, so that clients that have unreliable networks can recover from network failures.

[ TODO: Do we do this for any other timers? I think this needs another proposal, it’s out of scope here. - teor ]

The first connection to complete will be used to download the consensus document and the others will be closed, after which bootstrapping will proceed as normal.
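As a rough illustration of the two retry schedules, the sketch below builds the mirror and authority attempt times and merges them into one timeline. It treats the listed values as offsets from the start of bootstrap at which attempts are launched, which is an assumption about how to read the schedules. The function and variable names are illustrative and not taken from the Tor source.

```python
# Sketch only: names are hypothetical; the schedule values are read as
# launch offsets (in seconds) from the start of bootstrap.

MAX_RETRY_SECONDS = 3 * 24 * 60 * 60 + 60 * 60  # 3 days + 1 hour


def schedule(first_retry: int, attempts: int, cap: int = MAX_RETRY_SECONDS):
    """Return `attempts` times: 0, first_retry, then doubling, capped at `cap`."""
    times = [0]
    t = first_retry
    while len(times) < attempts:
        times.append(min(t, cap))
        t *= 2
    return times


def merged_timeline(mirror_attempts: int, authority_attempts: int):
    """Merge both schedules into one list of (time, kind), ordered by time."""
    mirrors = [(t, "mirror") for t in schedule(1, mirror_attempts)]
    auths = [(t, "authority") for t in schedule(10, authority_attempts)]
    # sorted() is stable, so the mirror attempt comes first when times tie.
    return sorted(mirrors + auths, key=lambda item: item[0])


if __name__ == "__main__":
    for when, kind in merged_timeline(7, 3):
        print(f"{when:>3}s  {kind}")
```

Under this reading, the first four seconds contain four mirror attempts (0, 1, 2, 4) and one authority attempt (0).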
We expect the vast majority of clients to succeed within 4 seconds, after making up to 4 connection attempts to mirrors and 1 connection attempt to an authority. Clients which can't connect in the first 10 seconds will try 1 more mirror, then try to contact another directory authority. We expect almost all clients to succeed within 10 seconds. This is a much better success rate than the current Tor implementation, which fails k/n of clients if k of the n directory authorities are down. (Or, if the connection fails in certain ways, it will retry once, failing 1-(1-(k/n)^2) = (k/n)^2 of clients.)

If at any time the total number of outstanding bootstrap connection attempts exceeds 10, no new connection attempts are to be launched until an existing connection attempt experiences full timeout. The retry time is not doubled when a connection is skipped.

A benefit of connecting to directory authorities is that clients are warned if their clock is wrong. Starting the authority and fallback schedules at the same time should ensure that some clients check their clock with an authority at each bootstrap.

Design: Fallback Dir Mirror Selection

The set of hard coded directory mirrors from #572 shall be chosen using the 100 Guard nodes with the longest uptime.

The fallback weights will be set using each mirror's fraction of consensus bandwidth out of the total of all 100 mirrors, adjusted to ensure no fallback directory sees more than 10% of clients. We will also exclude fallback directories that are less than 1/1000 of the consensus weight, as they are not large enough to make it worthwhile including them.

This list of fallback dir mirrors should be updated with every major Tor release. In future releases, the number of dir mirrors should be set at 20% of the current Guard nodes (approximately 200 as of October 2015), rather than fixed at 100.

[TODO: change the script to dynamically calculate an upper limit.]
Performance: Additional Load with Current Parameter Choices

This design and the connection count parameters were chosen such that no additional bandwidth load would be placed on the directory authorities. In fact, the directory authorities should experience less load, because they will not need to serve the entire consensus document for a connection in the event that one of the directory mirrors completes its connection before the directory authority does.

However, the scheme does place additional TLS connection load on the fallback dir mirrors. Because bootstrapping is rare, and all but one of the TLS connections will be very short-lived and unused, this should not be a substantial issue.

The dangerous case is in the event of a prolonged consensus failure that induces all clients to enter into the bootstrap process. In this case, the number of TLS connections to the fallback dir mirrors within the first second would be 2*C/100, or 40,000 for C=2,000,000 users. If no connections complete before the 10 retries, 7 of which go to mirrors, this could reach as high as 140,000 connection attempts, but this is extremely unlikely to happen in full aggregate.

However, in the no-consensus scenario today, the directory authorities would already experience 2*C/9 or 444,444 connection attempts. (Tor currently tries 2 authorities, before delaying the next attempt.) The 10-retry scheme, 3 of which go to authorities, increases their total maximum load to about 666,666 connection attempts, but again this is unlikely to be reached in aggregate. Additionally, with this scheme, even if the dirauths are taken down by this load, the dir mirrors should be able to survive it.

Implementation Notes: Code Modifications

The implementation of the bootstrap process is unfortunately mixed in with many types of directory activity.

The process starts in update_consensus_networkstatus_downloads(), which initiates a single directory connection through directory_get_from_dirserver().
Depending on bootstrap state, a single directory server is selected and a connection is eventually made through directory_initiate_command_rend().

There appear to be a few options for altering this code to retry multiple simultaneous connections. It looks like we can modify update_consensus_networkstatus_downloads() to make connections more often if the purpose is DIR_PURPOSE_FETCH_CONSENSUS and there is no valid (reasonably live) consensus. We can make multiple connections from update_consensus_networkstatus_downloads(), as the sockets are non-blocking. (This socket appears to be non-blocking on Unixes (SOCK_NONBLOCK & O_NONBLOCK) and Windows (FIONBIO).) As long as we can tolerate a timer resolution of ~1 second (due to the use of second_elapsed_callback and time_t), this requires no additional timers or callbacks. We can make 1 connection for each schedule per second, for a maximum of 2 per second.

The schedules can be specified in:

  TestingClientBootstrapConsensusAuthorityDownloadSchedule
  TestingClientBootstrapConsensusFallbackDownloadSchedule
    (Similar to the existing TestingClientConsensusDownloadSchedule.)
  TestingServerIPVersionPreferenceSchedule
    (Consisting of a CSV like “4,6,4,6”, or perhaps “0,1,0,1”.)

update_consensus_networkstatus_downloads() checks the number of pending connections and, if it is 10 or greater, skips the connection attempt and leaves the retry time constant.

The code in directory_send_command() and connection_finished_connecting() would need to be altered to check that we are not already downloading the consensus. If we’re not, then download the consensus on this connection, and close any other pending consensus dircons.

We might also need to make similar changes in authority_certs_fetch_missing(), as we can’t use a consensus until we have enough authority certificates. However, Tor already makes multiple requests (one per certificate), and only needs a majority of certificates to validate a consensus.
Therefore, we will only need to modify authority_certs_fetch_missing() if clients download a consensus, then end up getting stuck downloading certificates. (Current tests show bootstrapping working well without any changes to authority certificate fetches.)

Reliability Analysis

We make the pessimistic assumptions that 50% of connections to directory mirrors fail, and that 20% of connections to authorities fail. (Actual figures depend on relay churn, age of the fallback list, and authority uptime.)

We expect the first 10 connection retry times to be:
(Research shows users tend to lose interest after 40 seconds.)

  Mirror:   0s   1s   2s   4s     8s             16s              32s
  Auth:     0s                           10s             20s
  Success:  90%  95%  97%  98.7%  99.4%  99.89%  99.94%  99.988%  99.994%

97% of clients succeed in the first 2 seconds.
99.4% of clients succeed without trying a second authority.
99.89% of clients succeed in the first 10 seconds.

0.11% of clients remain, but in this scenario, 2 authorities are unreachable, so the client is most likely blocked from the Tor network. Alternately, they will likely succeed on relaunch.

The current implementation makes 1 or 2 authority connections within the first second, depending on exactly how the first connection fails. Under the 20% authority failure assumption, these clients would have a success rate of either 80% or 96% within a few seconds. The scheme above has a greater success rate in the first few seconds, while spreading the load among a larger number of directory mirrors. In addition, if all the authorities are blocked, current clients will inevitably fail, as they do not have a list of directory mirrors.
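The success figures in the Reliability Analysis can be recomputed with a small sketch. It assumes that connection failures are independent and uses the 50% and 20% failure rates stated there. The names are illustrative only.

```python
# Sketch only: assumes independent failures; names are hypothetical.

MIRROR_FAIL = 0.50  # assumed failure rate for mirror connections
AUTH_FAIL = 0.20    # assumed failure rate for authority connections

# (launch time in seconds, kind) for the first 10 connection attempts
ATTEMPTS = [
    (0, "mirror"), (0, "authority"),
    (1, "mirror"), (2, "mirror"), (4, "mirror"), (8, "mirror"),
    (10, "authority"), (16, "mirror"), (20, "authority"), (32, "mirror"),
]


def cumulative_success(attempts):
    """Map each launch time to the probability that at least one attempt
    launched up to and including that time succeeds."""
    still_failing = 1.0
    by_time = {}
    for when, kind in attempts:
        still_failing *= MIRROR_FAIL if kind == "mirror" else AUTH_FAIL
        by_time[when] = 1.0 - still_failing
    return by_time


if __name__ == "__main__":
    for when, p in cumulative_success(ATTEMPTS).items():
        print(f"{when:>3}s  {100 * p:.4f}%")
```

The values agree with the Success row up to rounding. For example, the two simultaneous attempts at 0s give 1 - 0.5 * 0.2 = 90%.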
Filename: 211-mapaddress-tor-status.txt
Title: Internal Mapaddress for Tor Configuration Testing
Author: Mike Perry
Created: 08-10-2012
Status: Reserve
Target: 0.2.4.x+

Overview

This proposal describes a method by which we can replace the https://check.torproject.org/ testing service with an internal XML document provided by the Tor client.

Motivation

The Tor Check service is a central point of failure in terms of Tor usability. If it is ever out of sync with the set of exit nodes on the Tor network or down, user experience is degraded considerably. Moreover, the check itself is very time-consuming. Users must wait seconds or more for the result to come back. Worse still, if the user's software *was* in fact misconfigured, the check.torproject.org DNS resolution and request leaks out on to the network.

Design Overview

The system will have three parts: an internal hard-coded IP address mapping (127.84.111.114:80), a hard-coded mapaddress to a DNS name (selftest.torproject.org:80), and a DirPortFrontPage-style simple HTTP server that serves an XML document for both addresses.

Upon receipt of a request to the IP address mapping, the system will create a new 128 bit randomly generated nonce and provide it in the XML document.

Requests to http://selftest.torproject.org/ must include a valid, recent nonce as the GET url path. Upon receipt of a valid nonce, it is removed from the list of valid nonces. Nonces are only valid for 60 seconds or until SIGNAL NEWNYM, whichever comes first.

The list of pending nonces should not be allowed to grow beyond 10 entries.

The timeout period and nonce limit should be configurable in torrc.

Design: XML document format for http://127.84.111.114

To avoid the need to localize the message in Tor, Tor will only provide an XML object with connectivity information.
Here is an example form:

  <tor-test>
    <tor-bootstrap-percent>100</tor-bootstrap-percent>
    <tor-version-current>true</tor-version-current>
    <dns-nonce>4977eb4842c7c59fa5b830ac4da896d9</dns-nonce>
  </tor-test>

The tor-bootstrap-percent field represents the results of the Tor client bootstrap status as integer percentages from bootstrap_status_t.

The tor-version-current field represents the results of the Tor client consensus version check. If the bootstrap process has not yet downloaded a consensus document, this field will have the value null.

The dns-nonce field contains a 128-bit secret, encoded in base16. This field is only present for requests that list the Host: header as 127.84.111.114.

Design: XML document format for http://selftest.torproject.org/nonce

  <tor-test>
    <tor-bootstrap-percent>100</tor-bootstrap-percent>
    <tor-version-current>true</tor-version-current>
    <dns-nonce-valid>true</dns-nonce-valid>
  </tor-test>

The first two fields are the same as for the IP address version.

The dns-nonce-valid field is only true if the Host header matches selftest.torproject.org and the nonce is current and valid. Upon receipt of a valid nonce, that nonce is removed from the list of valid nonces.

Design: Request Servicing

Care must be taken with the dns-nonce generation and usage, to prevent users from being tracked through leakage of nonce value to application content. While the usage of XML appears to make this impossible due to stricter same-origin policy enforcement than JSON, same-origin enforcement is still fraught with exceptions and loopholes.

In particular:

 * Any requests that contain the Origin: header MUST be ignored, as the Origin: header is only included for third party web content (CORS).

 * dns-nonce fields MUST be omitted if the HTTP Host: header does not match the IP address 127.84.111.114.

 * Requests to selftest.torproject.org MUST return false for the dns-nonce-valid field if the HTTP Host: header does not match selftest.torproject.org, regardless of nonce value.
Further, requests to selftest.torproject.org MUST validate that 'selftest.torproject.org' was the actual hostname provided to SOCKS4A, and not some alternate address mapping (due to DNS rebinding attacks, for example). Design: Application Usage Applications will use the system in two steps. First, they will make an HTTP request to http://127.84.111.114:80/ over Tor's SOCKS port and parse the resulting XML, if any. If the request at this stage fails, the application should inform the user that either their Tor client is too old, or that it is misconfigured, depending upon the nature of the failure. If the request succeeds and valid XML is returned, the application will record the value of the dns-nonce field, and then perform a second request to http://selftest.torproject.org/nonce_value. If the second request succeeds, and the dns-nonce-valid field is true, the application may inform the user that their Tor settings are valid. If the second request fails, or does not provide the correct dns-nonce, the application will inform the user that their Tor DNS proxy settings are incorrect. If either tor-bootstrap-percent is not 100, or tor-version-current is false, applications may choose to inform the user of these facts using properly localized strings and appropriate UI. Security Considerations XML was chosen over JSON due to the risks of the identifier leaking in a way that could enable websites to track the user[1]. Because there are many exceptions and circumvention techniques to the same-origin policy, we have also opted for strict controls on dns-nonce lifetimes and usage, as well as validation of the Host header and SOCKS4A request hostnames. 1. http://www.hpenterprisesecurity.com/vulncat/en/vulncat/dotnet/javascript_hijacking_vulnerable_framework.html
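The nonce handling described in the Design Overview could be sketched as follows. This is not Tor code: the class and method names are hypothetical, and evicting the oldest nonce when the list is full is an assumption, since the proposal only says the list must not grow beyond 10 entries.

```python
# Sketch only: NonceStore and its methods are hypothetical names.
import secrets
import time
from collections import OrderedDict


class NonceStore:
    """Single-use 128-bit nonces with a lifetime and a pending-list limit."""

    def __init__(self, lifetime=60.0, max_pending=10, clock=time.monotonic):
        self._lifetime = lifetime        # seconds a nonce stays valid
        self._max_pending = max_pending  # upper bound on pending nonces
        self._clock = clock
        self._pending = OrderedDict()    # nonce -> issue time, oldest first

    def issue(self) -> str:
        """Create a new nonce for the 127.84.111.114 document."""
        self._expire()
        while len(self._pending) >= self._max_pending:
            self._pending.popitem(last=False)  # assumption: drop the oldest
        nonce = secrets.token_hex(16)          # 128 bits, base16-encoded
        self._pending[nonce] = self._clock()
        return nonce

    def redeem(self, nonce: str) -> bool:
        """Return True once for a valid, unexpired nonce, then forget it."""
        self._expire()
        return self._pending.pop(nonce, None) is not None

    def newnym(self) -> None:
        """Invalidate all pending nonces, as on SIGNAL NEWNYM."""
        self._pending.clear()

    def _expire(self) -> None:
        now = self._clock()
        expired = [n for n, t in self._pending.items()
                   if now - t > self._lifetime]
        for n in expired:
            del self._pending[n]
```

A request to selftest.torproject.org would then report dns-nonce-valid as true only if `redeem()` returns True and the Host and SOCKS4A hostname checks described above also pass.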
Filename: 212-using-old-consensus.txt
Title: Increase Acceptable Consensus Age
Author: Mike Perry
Created: 01-10-2012
Status: Needs-Revision
Target: 0.2.4.x+

Overview

This proposal aims to extend the duration that clients will accept old consensus material under conditions where the directory authorities are either down or fail to produce a valid consensus for an extended period of time.

Motivation

Currently, if the directory authorities are down or fail to consense for 24 hours, the entire Tor network will cease to function. Worse, clients will enter into a state where they all need to re-bootstrap directly from the directory authorities, which will likely exacerbate any potential DoS condition that may have triggered the downtime in the first place.

The Tor network has had such close calls before. In the past, we've been able to mobilize a majority of the directory authority operators within this 24 hour window, but that is only because we've been exceedingly lucky and the DoS conditions were accidental rather than deliberate.

If a DoS attack was deliberately timed to coincide with a major US and European combined holiday such as Christmas Eve, New Year's Eve, or Easter, it is very unlikely we would be able to muster the resources to diagnose and deploy a fix to the authorities in time to prevent network collapse.

Description

Based on the need to survive multi-day holidays and long weekends balanced with the need to ensure clients can't be captured on an old consensus forever, I propose that the consensus liveness constants be set at 3 days rather than 24 hours.

This requires updating two consensus defines in the source, and one descriptor freshness variable. The descriptor freshness should be set to a function of the consensus freshness.

See Implementation Notes for further details.

Security Concerns: Using an Old Consensus

Clients should not trust old consensus data without an attempt to download fresher data from a directory mirror.
As far as I could tell, the code already does this. The minimum consensus age before we try to download new data is two hours.

However, the ability to accept old consensus documents does introduce the ability of malicious directory mirrors to feed their favorite old consensus document to clients to alter their paths until they download a fresher consensus from elsewhere. Directory guards (Proposal 207) may exacerbate this ability.

This proposal does not address such attacks, and seeks only a modest increase in the valid timespan as a compromise.

Future consideration of these and other targeted-consensus attacks will be left to proposals related to ticket #7126[1]. Once those proposals are complete and implemented, raising the freshness limit beyond 3 days should be possible.

Implementation Notes

There appear to be at least four constants in the code involved with using potentially expired consensus data. Two of them (REASONABLY_LIVE_TIME and NS_EXPIRY_SLOP) involve the consensus itself, and two (OLD_ROUTER_DESC_MAX_AGE and TOLERATE_MICRODESC_AGE) deal with descriptor liveness.

Two additional constants ROUTER_MAX_AGE and ROUTER_MAX_AGE_TO_PUBLISH are only used when submitting descriptors for consensus voting.

FORCE_REGENERATE_DESCRIPTOR_INTERVAL is the maximum age a router descriptor will get before a relay will re-publish. It is set to 18 hours.

OLD_ROUTER_DESC_MAX_AGE is set at 5 days. TOLERATE_MICRODESC_AGE is set at 7 days.

The consensus timestamps are used in networkstatus_get_reasonably_live_consensus() and networkstatus_set_current_consensus().

OLD_ROUTER_DESC_MAX_AGE is checked in routerlist_remove_old_routers(), router_add_to_routerlist(), and client_would_use_router().
It is my opinion that we should combine REASONABLY_LIVE_TIME and NS_EXPIRY_SLOP into a single define, and make OLD_ROUTER_DESC_MAX_AGE a function of REASONABLY_LIVE_TIME and FORCE_REGENERATE_DESCRIPTOR_INTERVAL:

  #define REASONABLY_LIVE_TIME (3*24*60*60)
  #define NS_EXPIRY_SLOP REASONABLY_LIVE_TIME
  #define OLD_ROUTER_DESC_MAX_AGE \
          (REASONABLY_LIVE_TIME+FORCE_REGENERATE_DESCRIPTOR_INTERVAL)

Based on my review of the above code paths, these changes should be all we need to enable clients to use older consensuses for longer while still attempting to fetch new ones.

1. https://trac.torproject.org/projects/tor/ticket/7126
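To make the proposed values concrete, the sketch below evaluates the three defines numerically. It mirrors the C defines in Python purely for arithmetic. The 18-hour value for FORCE_REGENERATE_DESCRIPTOR_INTERVAL is the one quoted in the Implementation Notes.

```python
# Sketch only: a numeric restatement of the proposed defines, not Tor source.

HOUR = 60 * 60
DAY = 24 * HOUR

REASONABLY_LIVE_TIME = 3 * DAY                    # proposed: 3 days
NS_EXPIRY_SLOP = REASONABLY_LIVE_TIME             # proposed: same value
FORCE_REGENERATE_DESCRIPTOR_INTERVAL = 18 * HOUR  # value quoted in the text
OLD_ROUTER_DESC_MAX_AGE = (REASONABLY_LIVE_TIME
                           + FORCE_REGENERATE_DESCRIPTOR_INTERVAL)

if __name__ == "__main__":
    print("REASONABLY_LIVE_TIME    =", REASONABLY_LIVE_TIME, "seconds")
    print("OLD_ROUTER_DESC_MAX_AGE =", OLD_ROUTER_DESC_MAX_AGE, "seconds,",
          OLD_ROUTER_DESC_MAX_AGE / DAY, "days")
```

Under these definitions OLD_ROUTER_DESC_MAX_AGE comes to 324000 seconds, or 3.75 days, which can be compared with the 5-day value quoted in the Implementation Notes.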
Filename: 213-remove-stream-sendmes.txt Title: Remove stream-level sendmes from the design Author: Roger Dingledine Created: 4-Nov-2012 Status: Dead 1. Motivation Tor uses circuit-level sendme cells to handle congestion / flow fairness at the circuit level, but it has a second stream-level flow/congestion/fairness layer under that to share a given circuit between multiple streams. The circuit-level flow control, or something like it, is needed because different users are competing for the same resources. But the stream-level flow control has a different threat model, since all the streams belong to the same user. When the circuit has only one active stream, the downsides are a) that we waste 2% of our bandwidth sending stream-level sendmes, and b) because of the circuit-level and stream-level window parameters we picked, we end up sending only half the cells we might otherwise send. When the circuit has two active streams, they each get to send 500 cells for their window, because the circuit window is 1000. We still spend the 2% overhead. When the circuit has three or more active streams, they're all typically limited by the circuit window, since the stream-level window won't kick in. We still spend the 2% overhead though. And depending on their sending pattern, we could experience cases where a given stream might be able to send more data on the circuit, but it chooses not to because its stream-level window is empty. More generally, we don't have a good handle on the interactions between all the layers of congestion control in Tor. It would behoove us to simplify in the case where we're not clear on what it buys us. 2. Design We should strip all aspects of this stream-level flow control from the Tor design and code. 2.1. But doesn't having a lower stream window than circuit window save room for new streams? 
It could be that a feature of the stream window is that there's always space in the circuit window for another begin cell, so new streams will open faster than otherwise. But first, if there are two or more active streams going, there won't be any extra space. Second, since begin cells are client-to-exit, and typical circuits don't fill their outbound circuit windows very often anyway, and also since we're hoping to move to a world where we isolate more activities between circuits, I'm not inclined to worry much about losing this maybe-feature. See also proposal 168, "reduce default circuit window" -- it's interesting to note that proposal 168 was unknowingly dabbling in exactly this question, since reducing the default circuit window to 500 or less made stream windows moot. It might be worth resurrecting the proposal 168 experiments once this proposal is implemented. 2.2. If we dump stream windows, we're effectively doubling them. Right now the circuit window starts at 1000, and the stream window starts at 500. So if we just rip out stream windows, we'll effectively change the stream window default to 1000, doubling the amount of data in flight and potentially clogging up the network more. We could either live with that, or we could change the default circuit window to 500 (which is easy to do even in a backward compatible way, since the edge connection can simply choose to not send as many cells). 3. Evaluation It would be wise to have some plan for making sure we didn't screw up the network too much with this change. The main trouble there is that torperf et al only do one stream at a time, so we really have no good baseline, or measurement tools, to capture network performance for multiple parallel streams. Maybe we should resolve task 7168 before the transition, so we're more prepared. 4. Transition Option one is to do a two-phase transition. In the first phase, edges stop enforcing the deliver window (i.e. 
stop closing circuits when the stream deliver goes negative, but otherwise they send and receive stream-level sendmes as now). In the second phase (once all old versions are gone), we can start disobeying the deliver window, and also stop sending stream-level sendmes back. That approach takes a while before it will matter. As an optimization, since clients can know which relay versions support the new behavior, we could have relays interpret violating the deliver window as signaling support for removed stream-level sendmes: the relay would then stop sending or expecting sendmes. That optimization is somewhat klunky though, first because web-browsing clients don't generally finish out a stream window in the upstream direction (so the klunky trick will probably never happen by accident), and second because if we lower the circuit window to 500 (see Sec 2.2), there's now no way to violate stream deliver windows. Option two is to introduce another relay cell type, which the client sends before opening any streams to let the other side know that it shouldn't use or expect stream-level sendmes. A variation here is to extend either the create cell or the begin cell (ha -- and they thought I was crazy when I included the explicit \0 at the end of the current begin cell payload), so we can specify our circuit preferences without any extra overhead. Option three is to wait until we switch to a new circuit protocol (e.g. when we move to ntor or ace), and use that as the signal to drop stream-level sendmes from the design. And hey, if we're lucky, by then we'll have sorted out the n23 questions (see ticket 4506) and we might be dumping circuit-level sendmes at that point too. Options two or three seem way better than option one. And since it's not super-urgent, I suggest we hold off on option two to see if option three makes sense. 5. Discussion Based on feedback from Andreas Krey on tor-dev, I believe this proposal is flawed, and should likely move to Status: Dead. 
Looking at it from the exit relay's perspective (which is where it matters most, since most use of Tor is sending a little bit and receiving a lot): when a create cell shows up to establish a circuit, that circuit is allowed to send back at most 1000 cells. When a begin relay cell shows up to ask that circuit to open a new stream, that stream is allowed to send back at most 500 cells. Whenever the Tor client has received 100 cells on that circuit, she immediately sends a circuit-level sendme back towards the exit, to let it know to increment its "number of cells it's allowed to send on the circuit" by 100. However, a stream-level sendme is only sent when both a) the Tor client has received 50 cells on a particular stream, *and* b) the application that initiated the stream is willing to accept more data. If we ripped out stream-level sendmes, then as you say, we'd have to choose between "queue all the data for the stream, no matter how big it gets" and "tell the whole circuit to shut up". I believe you have just poked a hole in the n23 ("defenestrator") design as well: http://freehaven.net/anonbib/#pets2011-defenestrator since it lacks any stream-level pushback for streams that are blocking on writes. Nicely done!
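The window arithmetic in the Motivation section can be written out as a small sketch. It uses the 1000-cell circuit window, the 500-cell stream window, and the one-sendme-per-50-cells figure from the text. It also assumes that active streams share the circuit window equally, which is a simplification, and the names are illustrative.

```python
# Sketch only: names are hypothetical; equal sharing between streams is assumed.

CIRCUIT_WINDOW = 1000      # cells allowed in flight on a circuit
STREAM_WINDOW = 500        # cells allowed in flight on one stream
STREAM_SENDME_EVERY = 50   # one stream-level sendme per 50 cells received

# Bandwidth spent on stream-level sendmes: 1 cell in 50, i.e. 2%.
STREAM_SENDME_OVERHEAD = 1 / STREAM_SENDME_EVERY


def circuit_in_flight(active_streams: int) -> int:
    """Cells the circuit can have in flight with this many active streams."""
    return min(CIRCUIT_WINDOW, active_streams * STREAM_WINDOW)


def per_stream_share(active_streams: int) -> float:
    """Cells each stream can have in flight, assuming equal sharing."""
    return min(STREAM_WINDOW, CIRCUIT_WINDOW / active_streams)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, "streams:", circuit_in_flight(n), "cells on the circuit,",
              round(per_stream_share(n), 1), "per stream")
```

One stream is capped at 500 cells, half of what the circuit window would allow. Two streams fill the circuit window at 500 cells each. With three or more streams the circuit window is the binding limit.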
Filename: 214-longer-circids.txt
Title: Allow 4-byte circuit IDs in a new link protocol
Author: Nick Mathewson
Created: 6 Nov 2012
Status: Closed
Implemented-In: 0.2.4.11-alpha

0. Overview

Relays are running out of circuit IDs. It's time to make the field bigger.

1. Background and Motivation

Long ago, we thought that 65535 circuit IDs would be enough for anybody. It wasn't. But our cell format in link protocols is still:

  Cell [512 bytes]
    CircuitID [2 bytes]
    Command   [1 byte]
    Payload   [509 bytes]

  Variable-length cell [Length+5 bytes]
    CircID    [2 bytes]
    Command   [1 byte]
    Length    [2 bytes]
    Payload   [Length bytes]

This means that a relay can run out of circuit IDs pretty easily.

2. Design

I propose a new link cell format for relays that support it. It should be:

  Cell [514 bytes]
    CircuitID [4 bytes]
    Command   [1 byte]
    Payload   [509 bytes]

  Variable cell (Length+7 bytes)
    CircID    [4 bytes]
    Command   [1 byte]
    Length    [2 bytes]
    Payload   [Length bytes]

We need to keep the payload size in fixed-length cells to its current value, since otherwise the relay protocol won't work.

This new cell format should be used only when the link protocol is 4. (To negotiate link protocol 4, both sides need to use the "v3" handshake, and include "4" in their version cells. If version 4 or later is negotiated, this is the cell format to use.)

2.1. Better allocation of circuitID space

In the current Tor design, circuit ID allocation is determined by whose RSA public key has the lower modulus. How ridiculous! Instead, I propose that when the version 4 link protocol is in use, the connection initiator use the low half of the circuit ID space, and the responder use the high half of the circuit ID space.

3. Discussion

 * Why 4 bytes?

   Because 3 would result in an odd cell size, and 8 seems like overkill.

 * Will this be distinguishable from the v3 protocol?

   Yes. Anybody who knows they're seeing the Tor protocol can probably tell by the TLS record sizes which version of the protocol is in use.
Probably not a huge deal though; which approximate range of versions of Tor a client or server is running is not something we've done much to hide in the past. * Why a new link protocol and not a new cell type? Because pretty much every cell has a meaningful circuit ID. * Okay, why a new link protocol and not a new _set of_ cell types? Because it's a bad idea to mix short and long circIDs on the same channel. (That would leak which cells go with what kind of circuits ID, potentially.) * How hard is this to implement? I wasn't sure, so I coded it up. I've got a probably-buggy implementation in branch "wide_circ_ids" in my public repository. Be afraid! More testing is needed!
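To make the size arithmetic in section 2 concrete, here is a minimal sketch of packing both cell layouts. It is not Tor's implementation: the function names are hypothetical, the command number in the usage lines is arbitrary, and network (big-endian) byte order is an assumption, since this proposal does not state one.

```python
import struct

PAYLOAD_LEN = 509  # fixed-length cell payload, unchanged by this proposal


def circid_format(link_version):
    """2-byte circuit IDs before link protocol 4, 4-byte IDs from 4 on."""
    return ">I" if link_version >= 4 else ">H"


def pack_fixed_cell(link_version, circ_id, command, payload=b""):
    """CircuitID | Command | Payload, with the payload zero-padded."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long for a fixed-length cell")
    header = struct.pack(circid_format(link_version), circ_id)
    header += struct.pack(">B", command)
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")


def pack_var_cell(link_version, circ_id, command, payload):
    """CircID | Command | Length | Payload."""
    header = struct.pack(circid_format(link_version), circ_id)
    header += struct.pack(">BH", command, len(payload))
    return header + payload


def circid_range(is_initiator):
    """Section 2.1: initiator takes the low half, responder the high half."""
    half = 1 << 31
    return (0, half - 1) if is_initiator else (half, (1 << 32) - 1)


old_cell = pack_fixed_cell(3, 0x1234, 3, b"hello")
new_cell = pack_fixed_cell(4, 0x12345678, 3, b"hello")
var_cell = pack_var_cell(4, 0, 7, b"\x00\x03\x00\x04")
```

Only the width of the circuit ID field changes between the two layouts, so the fixed-length cell grows by exactly two bytes and the relay payload is untouched.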
Filename: 215-update-min-consensus-ver.txt
Title: Let the minimum consensus method change with time
Author: Nick Mathewson
Created: 15 Nov 2012
Status: Closed
Implemented-In: 0.2.6.1-alpha

0. Overview

   This proposal suggests that we drop the requirement that authorities support the very old consensus method "1", and instead move to a wider window of recognized consensus methods as Tor evolves.

1. Background and Motivation

   When we designed the directory voting system, we added the notion of "consensus method" so that we could smoothly upgrade the voting process over time. We also said that all authorities must support the consensus method '1', and must fall back to it if they don't support the method that the supermajority of authorities will choose.

   Consensus method 1 is no longer viable for the Tor network. It doesn't result in a microdescriptor consensus, and omits other fields that clients need in order to work well. Consensus methods under 12 have security issues, since they let a single authority set a consensus parameter.

   In the future, new consensus methods will be needed so that microdescriptor-using clients can use IPv6 exits and ECC onion-keys. Rolling back from those would degrade functionality.

   We need a way to change the minimum consensus method over time.

2. Design

   I propose that we change the minimum consensus method about once per release cycle, or once per every other release cycle.

   As a rule of thumb, let the minimum consensus method in Tor series X be the highest method supported by the oldest version that "anybody reasonable" would use for running an authority. Typically, that's the stable version of the previous release series.

   For flexibility, it might make sense to choose a slightly older method, if falling back to that method wouldn't cause security problems.

   For example, while Tor 0.2.4.x is under development, authorities should really not be running anything before Tor 0.2.3.x. Tor 0.2.3.x has supported consensus method 13 since 0.2.3.21-rc, so it's okay for 0.2.4.x to require 13 as the minimum method. We even might go back to method 12, since the worst outcome of not using 13 would be some warnings in client logs. Consensus method 12 was a security improvement, so we don't want to roll back before that.

2.1. Behavior when the method used is one we don't know

   The spec currently says that if an authority sees that a method will be used that it doesn't support, it should act as if the consensus method will be "1". This attempt will be doomed, since the other authorities will be computing the consensus with a more recent method, and any attempt to use method "1" won't get enough signatures.

   Instead, let's say that authorities fall back to the most recent method that they *do* support. This isn't any likelier to reach consensus, but it is less likely to result in anybody signing something they don't like.

3. Likely outcomes

   If a bunch of authorities were to downgrade to a much older version, all at once, then newer authorities would not be able to sign the consensus they made. That's probably for the best: if a bunch of authorities were to suddenly start running 0.2.0.x, consensing along with them would be a poor idea.

4. Alternatives

   We might choose a less narrow window of allowable methods, when we can do so securely. Maybe two release series, rather than one, would be a good interval to use when the consensus format isn't changing rapidly.

   We might want to have the behavior when we see that everybody else will be using a method we don't support be "Don't make a consensus at all." That's harder to program, though.
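The fallback rule of section 2.1 can be sketched as follows. The function and parameter names are hypothetical, and how the authorities arrive at the chosen method is outside this sketch; it only shows what a single authority does once it knows which method the others will use.

```python
def method_to_use(chosen_method, supported_methods, min_method):
    """Pick the consensus method this authority computes with.

    chosen_method     -- the method the other authorities will use
    supported_methods -- the methods this authority implements
    min_method        -- the minimum method for this release series

    Assumption for illustration: methods below min_method are treated
    as if they were not supported at all.
    """
    usable = [m for m in supported_methods if m >= min_method]
    if not usable:
        raise ValueError("no usable consensus method")
    if chosen_method in usable:
        return chosen_method
    # Section 2.1: fall back to the most recent method we *do* support,
    # rather than to method 1.
    return max(usable)


# An authority supporting methods 12..16, with a minimum of 13:
same = method_to_use(15, range(12, 17), min_method=13)
fallback = method_to_use(18, range(12, 17), min_method=13)
```

Under the old rule the second call would have produced method 1; here the authority computes with method 16 instead, which will not match the others but is at least a consensus it would be willing to sign.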
Filename: 216-ntor-handshake.txt Title: Improved circuit-creation key exchange Author: Nick Mathewson Created: 11-May-2011 Status: Closed Implemented-In: 0.2.4.8-alpha Summary: This is an attempt to translate the proposed circuit handshake from "Anonymity and one-way authentication in key-exchange protocols" by Goldberg, Stebila, and Ustaoglu, into a Tor proposal format. It assumes that proposal 200 is implemented, to provide an extended CREATE cell format that can indicate what type of handshake is in use. Notation: Let a|b be the concatenation of a with b. Let H(x,t) be a tweakable hash function of output width H_LENGTH bytes. Let t_mac, t_key, and t_verify be a set of arbitrarily-chosen tweaks for the hash function. Let EXP(a,b) be a^b in some appropriate group G where the appropriate DH parameters hold. Let's say elements of this group, when represented as byte strings, are all G_LENGTH bytes long. Let's say we are using a generator g for this group. Let a,A=KEYGEN() yield a new private-public keypair in G, where a is the secret key and A = EXP(g,a). If additional checks are needed to ensure a valid keypair, they should be performed. Let PROTOID be a string designating this variant of the protocol. Let KEYID be a collision-resistant (but not necessarily preimage-resistant) hash function on members of G, of output length H_LENGTH bytes. Let each node have a unique identifier, ID_LENGTH bytes in length. Instantiation: Let's call this PROTOID "ntor-curve25519-sha256-1" (We might want to make this shorter if it turns out to save us a block of hashing somewhere.) Set H(x,t) == HMAC_SHA256 with message x and key t. So H_LENGTH == 32. Set t_mac == PROTOID | ":mac" t_key == PROTOID | ":key_extract" t_verify == PROTOID | ":verify" Set EXP(a,b) == curve25519(.,b,a), and g == 9 . Let KEYGEN() do the appropriate manipulations when generating the secret key (clearing the low bits, twiddling the high bits). Set KEYID(B) == B. 
(We don't need to use a hash function here, since our keys are already very short. It is trivially collision-resistant, since KEYID(A)==KEYID(B) iff A==B.) When representing an element of the curve25519 subgroup as a byte string, use the standard (32-byte, little-endian, x-coordinate-only) representation for curve25519 points. Protocol: Take a router with identity key digest ID. As setup, the router generates a secret key b, and a public onion key B with b, B = KEYGEN(). The router publishes B in its server descriptor. To send a create cell, the client generates a keypair x,X = KEYGEN(), and sends a CREATE cell with contents: NODEID: ID -- ID_LENGTH bytes KEYID: KEYID(B) -- H_LENGTH bytes CLIENT_PK: X -- G_LENGTH bytes The server generates a keypair of y,Y = KEYGEN(), and computes secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID KEY_SEED = H(secret_input, t_key) verify = H(secret_input, t_verify) auth_input = verify | ID | B | Y | X | PROTOID | "Server" The server sends a CREATED cell containing: SERVER_PK: Y -- G_LENGTH bytes AUTH: H(auth_input, t_mac) -- H_LENGTH bytes The client then checks Y is in G^* [see NOTE below], and computes secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID KEY_SEED = H(secret_input, t_key) verify = H(secret_input, t_verify) auth_input = verify | ID | B | Y | X | PROTOID | "Server" The client verifies that AUTH == H(auth_input, t_mac). Both parties check that none of the EXP() operations produced the point at infinity. [NOTE: This is an adequate replacement for checking Y for group membership, if the group is curve25519.] Both parties now have a shared value for KEY_SEED. They expand this into the keys needed for the Tor relay protocol. Key expansion: Currently, the key expansion formula used by Tor here is K = SHA(K0 | [00]) | SHA(K0 | [01]) | SHA(K0 | [02]) | ... where K0==g^xy, and K is divvied up into Df, Db, Kf, and Kb portions. 
Instead, let's have it be HKDF-SHA256 as defined in RFC5869:

      K = K_1 | K_2 | K_3 | ...

      Where K_1     = H(m_expand | INT8(1) , KEY_SEED )
        and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED )
        and m_expand is an arbitrarily chosen value,
        and INT8(i) is an octet with the value "i".

   Ian says this is due to a construction from Krawczyk at http://eprint.iacr.org/2010/264 .

   Let m_expand be PROTOID | ":key_expand"

   In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand, salt == t_key, and IKM == secret_input.

Performance notes:

   In Tor's current circuit creation handshake, the client does:
      One RSA public-key encryption
      A full DH handshake in Z_p
      A short AES encryption
      Five SHA1s for key expansion
   And the server does:
      One RSA private-key decryption
      A full DH handshake in Z_p
      A short AES decryption
      Five SHA1s for key expansion

   While in the revised handshake, the client does:
      A full DH handshake
      A public-half of a DH handshake
      3 H operations for the handshake
      3 H operations for the key expansion
   and the server does:
      A full DH handshake
      A private-half of a DH handshake
      3 H operations for the handshake
      3 H operations for the key expansion

Integrating with the rest of Tor:

   Add a new optional entry to router descriptors and microdescriptors:

      "ntor-onion-key" SP Base64Key NL

   where Base64Key is a base-64 encoded 32-byte value, with padding omitted.

   Add a new consensus method to tell servers to copy "ntor-onion-key" entries from router descriptors to microdescriptors.

   In microdescriptors, "ntor-onion-key" can go right after the "onion-key" line.

   Add a "UseNTorHandshake" configuration option and a corresponding consensus parameter to control whether clients use the ntor handshake. If the configuration option is "auto", clients should obey the consensus parameter. Have the configuration default be "auto" and the consensus value initially be "0".

   Reserve the handshake type [00 02] for this handshake in CREATE2 and EXTEND2 cells.

   Specify that this handshake type can be used in EXTEND/EXTENDED/CREATE/CREATED cells as follows: instead of a 190-byte TAP onionskin, send the 16-byte string "ntorNTORntorNTOR", followed by the client's ntor message. Instead of a 148-byte TAP response, send the server's ntor response. (We need this so that a client can extend from an 0.2.3 server, which doesn't know about CREATE2/CREATED2/EXTEND2/EXTENDED2.)

Test vectors for HKDF-SHA256:

   These are some test vectors for HKDF-SHA256 using the values for M_EXPAND and T_KEY above, taking 100 bytes of key material.

   INPUT: "" (The empty string)
   OUTPUT: d3490ed48b12a48f9547861583573fe3f19aafe3
           f81dc7fc75eeed96d741b3290f941576c1f9f0b2
           d463d1ec7ab2c6bf71cdd7f826c6298c00dbfe67
           11635d7005f0269493edf6046cc7e7dcf6abe0d2
           0c77cf363e8ffe358927817a3d3e73712cee28d8

   INPUT: "Tor" (546f72)
   OUTPUT: 5521492a85139a8d9107a2d5c0d9c91610d0f959
           89975ebee6c02a4f8d622a6cfdf9b7c7edd3832e
           2760ded1eac309b76f8d66c4a3c4d6225429b3a0
           16e3c3d45911152fc87bc2de9630c3961be9fdb9
           f93197ea8e5977180801926d3321fa21513e59ac

   INPUT: "AN ALARMING ITEM TO FIND ON YOUR CREDIT-RATING STATEMENT"
          (414e20414c41524d494e47204954454d20544f2046494e44204f4e20
           594f5552204352454449542d524154494e472053544154454d454e54)
   OUTPUT: a2aa9b50da7e481d30463adb8f233ff06e9571a0
           ca6ab6df0fb206fa34e5bc78d063fc291501beec
           53b36e5a0e434561200c5f8bd13e0f88b3459600
           b4dc21d69363e2895321c06184879d94b18f0784
           11be70b767c7fc40679a9440a0c95ea83a23efbf
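The key expansion described in this proposal can be sketched with Python's standard hmac and hashlib modules. This is a minimal illustration rather than Tor's implementation: the function names are hypothetical, and it follows the RFC5869 reading given above (salt == t_key, info == m_expand, IKM == the input string), with each block's counter equal to that block's index.

```python
import hashlib
import hmac

PROTOID = b"ntor-curve25519-sha256-1"
T_KEY = PROTOID + b":key_extract"
M_EXPAND = PROTOID + b":key_expand"


def H(x, t):
    """H(x, t) == HMAC_SHA256 with message x and key t."""
    return hmac.new(t, x, hashlib.sha256).digest()


def ntor_kdf(secret_input, n_bytes):
    """Expand secret_input into n_bytes of key material (HKDF-SHA256)."""
    if n_bytes > 255 * 32:
        raise ValueError("too much key material requested")
    key_seed = H(secret_input, T_KEY)      # the HKDF "extract" step
    out = b""
    block = b""                            # K_0 is the empty string
    counter = 1
    while len(out) < n_bytes:
        # K_(i+1) = H(K_i | m_expand | INT8(i+1), KEY_SEED)
        block = H(block + M_EXPAND + bytes([counter]), key_seed)
        out += block
        counter += 1
    return out[:n_bytes]


material = ntor_kdf(b"Tor", 100)
```

The hex form of `ntor_kdf(b"", 100)` or `ntor_kdf(b"Tor", 100)` can be compared by hand against the test vectors listed above; the sketch does not embed them.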
Filename: 217-ext-orport-auth.txt Title: Tor Extended ORPort Authentication Author: George Kadianakis Created: 28-11-2012 Status: Closed Target: 0.2.5.x 1. Overview This proposal defines a scheme for Tor components to authenticate to each other using a shared-secret. 2. Motivation Proposal 196 introduced new ways for pluggable transport proxies to communicate with Tor. The communication happens using TCP in the same fashion that controllers speak to the ControlPort. To defend against cross-protocol attacks [0] on the transport ports, we need to define an authentication scheme that will restrict passage to unknown clients. Tor's ControlPort uses an authentication scheme called safe-cookie authentication [1]. Unfortunately, the design of the safe-cookie authentication was influenced by the protocol structure of the ControlPort and the need for backwards compatibility of the cookie-file and can't be easily reused in other use cases. 3. Goals The general goal of Extended ORPort authentication is to authenticate the client based on a shared-secret that only authorized clients should know. Furthermore, its implementation should be flexible and easy to reuse, so that it can be used as the authentication mechanism in front of future Tor helper ports (for example, in proposal 199). Finally, the protocol is able to support multiple authentication schemes and each of them has different goals. 4. Protocol Specification 4.1. Initial handshake When a client connects to the Extended ORPort, the server sends: AuthTypes [variable] EndAuthTypes [1 octet] Where, + AuthTypes are the authentication schemes that the server supports for this session. They are multiple concatenated 1-octet values that take values from 1 to 255. + EndAuthTypes is the special value 0. The client reads the list of supported authentication schemes and replies with the one he prefers to use: AuthType [1 octet] Where, + AuthType is the authentication scheme that the client wants to use for this session. 
A valid authentication type takes values from 1 to 255. A value of 0 means that the client did not like the authentication types offered by the server.

   If the client sent an AuthType of value 0, or an AuthType that the server does not support, the server MUST close the connection.

4.2. Authentication types

4.2.1. SAFE_COOKIE handshake

   Authentication type 1 is called SAFE_COOKIE.

4.2.1.1. Motivation and goals

   The SAFE_COOKIE scheme is pretty much identical to the authentication scheme that was introduced for the ControlPort in proposal 193.

   An additional goal of the SAFE_COOKIE authentication scheme (apart from the goals of section 3), is that it should not leak the contents of the cookie-file to untrusted parties.

   Specifically, the SAFE_COOKIE protocol will never leak the actual contents of the file. Instead, it uses a challenge-response protocol (similar to the HTTP digest authentication of RFC2617) to ensure that both parties know the cookie without leaking it.

4.2.1.2. Cookie-file format

   The format of the cookie-file is:

      StaticHeader [32 octets]
      Cookie       [32 octets]

   Where,
   + StaticHeader is the following string: "! Extended ORPort Auth Cookie !\x0a"
   + Cookie is the shared-secret. During the SAFE_COOKIE protocol, the cookie is called CookieString.

   Extended ORPort clients MUST make sure that the StaticHeader is present in the cookie file, before proceeding with the authentication protocol.

   Details on how Tor locates the cookie file can be found in section 5 of proposal 196. Details on how transport proxies locate the cookie file can be found in pt-spec.txt.

4.2.1.3. Protocol specification

   A client that performs the SAFE_COOKIE handshake begins by sending:

      ClientNonce [32 octets]

   Where,
   + ClientNonce is 32 octets of random data.

   Then, the server replies with:

      ServerHash  [32 octets]
      ServerNonce [32 octets]

   Where,
   + ServerHash is computed as:
        HMAC-SHA256(CookieString, "ExtORPort authentication server-to-client hash" | ClientNonce | ServerNonce)
   + ServerNonce is 32 random octets.

   Upon receiving that data, the client computes ServerHash herself and validates it against the ServerHash provided by the server.

   If the server-provided ServerHash is invalid, the client MUST terminate the connection.

   Otherwise the client replies with:

      ClientHash [32 octets]

   Where,
   + ClientHash is computed as:
        HMAC-SHA256(CookieString, "ExtORPort authentication client-to-server hash" | ClientNonce | ServerNonce)

   Upon receiving that data, the server computes ClientHash herself and validates it against the ClientHash provided by the client.

   Finally, the server replies with:

      Status [1 octet]

   Where,
   + Status is 1 if the authentication was successful. If the authentication failed, Status is 0.

4.3. Post-authentication

   After completing the Extended ORPort authentication successfully, the two parties should proceed with the Extended ORPort protocol on the same TCP connection.

5. Acknowledgments

   Thanks to Robert Ransom for helping with the proposal and designing the original safe-cookie authentication scheme. Thanks to Nick Mathewson for advice and reviews of the proposal.

[0]: http://archives.seul.org/or/announce/Sep-2007/msg00000.html

[1]: https://gitweb.torproject.org/torspec.git/blob/79f488c32c43562522e5592f2c19952dc7681a65:/control-spec.txt#l1069
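The SAFE_COOKIE computations of section 4.2.1 can be sketched as below, with both sides simulated in one process and no networking. The helper names are hypothetical, and treating the first argument of HMAC-SHA256(CookieString, ...) as the HMAC key is an assumption about the notation.

```python
import hashlib
import hmac
import os

STATIC_HEADER = b"! Extended ORPort Auth Cookie !\x0a"
SERVER_LABEL = b"ExtORPort authentication server-to-client hash"
CLIENT_LABEL = b"ExtORPort authentication client-to-server hash"


def read_cookie(file_bytes):
    """Check the StaticHeader and return the 32-octet CookieString."""
    if len(file_bytes) != 64 or not file_bytes.startswith(STATIC_HEADER):
        raise ValueError("not an Extended ORPort cookie file")
    return file_bytes[32:]


def safe_cookie_hash(cookie, label, client_nonce, server_nonce):
    """HMAC-SHA256 keyed with the cookie (assumed argument order)."""
    message = label + client_nonce + server_nonce
    return hmac.new(cookie, message, hashlib.sha256).digest()


# Simulated exchange between a client and a server sharing one cookie file.
cookie = read_cookie(STATIC_HEADER + os.urandom(32))
client_nonce = os.urandom(32)
server_nonce = os.urandom(32)

server_hash = safe_cookie_hash(cookie, SERVER_LABEL, client_nonce, server_nonce)
# Client side: recompute ServerHash and compare in constant time.
client_accepts = hmac.compare_digest(
    server_hash,
    safe_cookie_hash(cookie, SERVER_LABEL, client_nonce, server_nonce))

client_hash = safe_cookie_hash(cookie, CLIENT_LABEL, client_nonce, server_nonce)
# Server side: recompute ClientHash; Status is 1 on success, 0 on failure.
status = 1 if hmac.compare_digest(
    client_hash,
    safe_cookie_hash(cookie, CLIENT_LABEL, client_nonce, server_nonce)) else 0
```

Because the two labels differ, a ServerHash cannot be replayed as a ClientHash for the same pair of nonces, and the cookie itself never crosses the connection.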
Filename: 218-usage-controller-events.txt Title: Controller events to better understand connection/circuit usage Author: Rob Jansen, Karsten Loesing Created: 2013-02-06 Status: Closed Implemented-In: 0.2.5.2-alpha 1. Overview This proposal defines three new controller events that shall help understand connection and circuit usage. These events are designed to be emitted in private Tor networks only. This proposal also defines a tweak to an existing event for the same purpose. 2. Motivation We need to better understand connection and circuit usage in order to better simulate Tor networks. Existing controller events are a fine start, but we need more detailed information about per-connection bandwidth, processed cells by circuit, and token bucket refills. This proposal defines controller events containing the desired information. Most of these usage data are too sensitive to be captured in the public network, unless aggregated sufficiently. That is why we're focusing on private Tor networks first, that is, relays that have TestingTorNetwork set. The new controller events described in this proposal shall all be restricted to private Tor networks. In the next step we might define aggregate statistics to be gathered by public relays, but that will require a new proposal. 3. Design The proposed new event types use Tor's asynchronous event mechanism where a controller registers for events by type and processes events received from the Tor process. Tor controllers can register for any of the new event types, but events will only be emitted if the Tor process is running in TestingTorNetwork mode. 4. Security implications There should be no security implications from the new event types, because they are only emitted in private Tor networks. 5. Specification 5.1. ConnID Token Addition for section 2.4 of the control-spec (General-use tokens). ; Unique identifiers for connections or queues. Only included in ; TestingTorNetwork mode. 
ConnID = 1*16 IDChar QueueID = 1*16 IDChar 5.2. Adding an ID field to ORCONN events The new syntax for ORCONN events is: "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "ID=" ConnID ] [ SP "REASON=" Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF The remaining specification of that event type stays unchanged. 5.3. Bandwidth used on an OR or DIR or EXIT connection The syntax is: "650" SP "CONN_BW" SP "ID=" ConnID SP "TYPE=" ConnType SP "READ=" BytesRead SP "WRITTEN=" BytesWritten CRLF ConnType = "OR" / "DIR" / "EXIT" BytesRead = 1*DIGIT BytesWritten = 1*DIGIT Controllers MUST tolerate unrecognized connection types. BytesWritten and BytesRead are the number of bytes written and read by Tor since the last CONN_BW event on this connection. These events are generated about once per second per connection; no events are generated for connections that have not read or written. These events are only generated if TestingTorNetwork is set. 5.4. Bandwidth used by all streams attached to a circuit The syntax is: "650" SP "CIRC_BW" SP "ID=" CircuitID SP "READ=" BytesRead SP "WRITTEN=" BytesWritten CRLF BytesRead = 1*DIGIT BytesWritten = 1*DIGIT BytesRead and BytesWritten are the number of bytes read and written by all applications with streams attached to this circuit since the last CIRC_BW event. These events are generated about once per second per circuit; no events are generated for circuits that had no attached stream writing or reading. 5.5. 
Per-circuit cell stats The syntax is: "650" SP "CELL_STATS" [ SP "ID=" CircuitID ] [ SP "InboundQueue=" QueueID SP "InboundConn=" ConnID ] [ SP "InboundAdded=" CellsByType ] [ SP "InboundRemoved=" CellsByType SP "InboundTime=" MsecByType ] [ SP "OutboundQueue=" QueueID SP "OutboundConn=" ConnID ] [ SP "OutboundAdded=" CellsByType ] [ SP "OutboundRemoved=" CellsByType SP "OutboundTime=" MsecByType ] CRLF CellsByType, MsecByType = CellType ":" 1*DIGIT 0*( "," CellType ":" 1*DIGIT ) CellType = 1*( "a" - "z" / "0" - "9" / "_" ) Examples are: 650 CELL_STATS ID=14 OutboundQueue=19403 OutboundConn=15 OutboundAdded=create_fast:1,relay_early:2 OutboundRemoved=create_fast:1,relay_early:2 OutboundTime=create_fast:0,relay_early:0 650 CELL_STATS InboundQueue=19403 InboundConn=32 InboundAdded=relay:1,created_fast:1 InboundRemoved=relay:1,created_fast:1 InboundTime=relay:0,created_fast:0 OutboundQueue=6710 OutboundConn=18 OutboundAdded=create:1,relay_early:1 OutboundRemoved=create:1,relay_early:1 OutboundTime=create:0,relay_early:0 ID is the locally unique circuit identifier that is only included if the circuit originates at this node. Inbound and outbound refer to the direction of cell flow through the circuit which is either to origin (inbound) or from origin (outbound). InboundQueue and OutboundQueue are identifiers of the inbound and outbound circuit queues of this circuit. These identifiers are only unique per OR connection. OutboundQueue is chosen by this node and matches InboundQueue of the next node in the circuit. InboundConn and OutboundConn are locally unique IDs of inbound and outbound OR connection. OutboundConn does not necessarily match InboundConn of the next node in the circuit. InboundQueue and InboundConn are not present if the circuit originates at this node. OutboundQueue and OutboundConn are not present if the circuit (currently) ends at this node. InboundAdded and OutboundAdded are total number of cells by cell type added to inbound and outbound queues. 
Only present if at least one cell was added to a queue.

   InboundRemoved and OutboundRemoved are total number of cells by cell type processed from inbound and outbound queues. InboundTime and OutboundTime are total waiting times in milliseconds of all processed cells by cell type. Only present if at least one cell was removed from a queue.

   These events are generated about once per second per circuit; no events are generated for circuits that have not added or processed any cell. These events are only generated if TestingTorNetwork is set.

5.6. Token buckets refilled

   The syntax is:

      "650" SP "TB_EMPTY" SP BucketName [ SP "ID=" ConnID ] SP
         "READ=" ReadBucketEmpty SP "WRITTEN=" WriteBucketEmpty SP
         "LAST=" LastRefill CRLF

      BucketName = "GLOBAL" / "RELAY" / "ORCONN"
      ReadBucketEmpty = 1*DIGIT
      WriteBucketEmpty = 1*DIGIT
      LastRefill = 1*DIGIT

   Examples are:

      650 TB_EMPTY ORCONN ID=16 READ=0 WRITTEN=0 LAST=100
      650 TB_EMPTY GLOBAL READ=93 WRITTEN=93 LAST=100
      650 TB_EMPTY RELAY READ=93 WRITTEN=93 LAST=100

   This event is generated when refilling a previously empty token bucket. The BucketName keywords "GLOBAL" and "RELAY" are used for the global and relay token buckets; "ORCONN" is used for the token buckets of an OR connection. Controllers MUST tolerate unrecognized bucket names.

   ConnID is only included if the BucketName is "ORCONN".

   If both global and relay buckets and/or the buckets of one or more OR connections run out of tokens at the same time, multiple separate events are generated.

   ReadBucketEmpty (WriteBucketEmpty) is the time in milliseconds that the read (write) bucket was empty since the last refill. LastRefill is the time in milliseconds since the last refill.

   If a bucket went negative and if refilling tokens didn't make it go positive again, there will be multiple consecutive TB_EMPTY events for each refill interval during which the bucket contained zero tokens or less. In such a case, ReadBucketEmpty or WriteBucketEmpty are capped at LastRefill in order not to report empty times more than once.

   These events are only generated if TestingTorNetwork is set.

6. Compatibility

   There should not be any compatibility issues with other Tor versions.

7. Implementation

   Most of the implementation should be straightforward.

8. Performance and scalability notes

   Most of the new code won't be executed in normal Tor mode. Wherever we needed new fields in existing structs, we tried hard to keep them as small as possible. Still, we should make sure that memory requirements won't grow significantly on busy relays.
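A controller consuming these events has to split the key=value fields and the per-cell-type lists. The following is a minimal parsing sketch for CELL_STATS lines, based only on the grammar in section 5.5; the function names are hypothetical and no controller library is involved.

```python
BY_TYPE_SUFFIXES = ("Added", "Removed", "Time")


def parse_by_type(value):
    """Turn 'create_fast:1,relay_early:2' into {'create_fast': 1, ...}."""
    result = {}
    for item in value.split(","):
        cell_type, number = item.split(":")
        result[cell_type] = int(number)
    return result


def parse_cell_stats(line):
    """Parse one '650 CELL_STATS ...' line into a dict keyed by field name."""
    parts = line.rstrip("\r\n").split(" ")
    if parts[:2] != ["650", "CELL_STATS"]:
        raise ValueError("not a CELL_STATS event")
    event = {}
    for field in parts[2:]:
        key, _, value = field.partition("=")
        if key.endswith(BY_TYPE_SUFFIXES):
            event[key] = parse_by_type(value)   # CellsByType / MsecByType
        else:
            event[key] = value                  # IDs stay opaque strings
    return event


# First example line from section 5.5.
example = ("650 CELL_STATS ID=14 OutboundQueue=19403 OutboundConn=15 "
           "OutboundAdded=create_fast:1,relay_early:2 "
           "OutboundRemoved=create_fast:1,relay_early:2 "
           "OutboundTime=create_fast:0,relay_early:0\r\n")
parsed = parse_cell_stats(example)
```

Identifiers are kept as strings because the grammar defines ConnID and QueueID as runs of IDChar rather than as numbers; fields the sketch does not recognize are stored unchanged instead of being rejected.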
Filename: 219-expanded-dns.txt Title: Support for full DNS and DNSSEC resolution in Tor Authors: Ondrej Mikle Created: 4 February 2012 Modified: 2 August 2013 Target: 0.2.5.x Status: Needs-Revision 0. Overview Adding support for any DNS query type to Tor. 0.1. Motivation Many applications running over Tor need more than just resolving FQDN to IPv4 and vice versa. Sometimes to prevent DNS leaks the applications have to be hacked around to be supplied necessary data by hand (e.g. SRV records in XMPP). TLS connections will benefit from planned TLSA record that provides certificate pinning to avoid another Diginotar-like fiasco. 0.2. What about DNSSEC? Routine DNSSEC resolution is not practical with this proposal alone, because of round-trip issues: a single name lookup can require dozens of round trips across a circuit, rendering it very slow. (We don't want to add minutes to every webpage load time!) For records like TLSA that need extra signing, this might not be an unacceptable amount of overhead, but routine hostname lookup, it's probably overkill. [Further, thanks to the changes of proposal 205, DNSSEC for routine hostname lookup is less useful in Tor than it might have been back when we cached IPv4 and IPv6 addresses and used them across multiple circuits and exit nodes.] See section 8 below for more discussion of DNSSEC issues. 1. Design 1.1 New cells There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll use DNS_BEGIN and DNS_RESPONSE for short below). 1.1.1. DNS_BEGIN DNS_BEGIN payload: FLAGS [2 octets] DNS packet data (variable length, up to length of relay cell.) The DNS packet must be generated internally by Tor to avoid fingerprinting users by differences in client resolvers' behavior. [XXXX We need to specify the exact behavior here: saying "Just do what Libunbound does!" would make it impossible to implement a Tor-compatible client without reverse-engineering libunbound. 
- NM] The FLAGS field is reserved, and should be set to 0 by all clients. Because of the maximum length of the RELAY cell, the DNS packet may not be longer than 496 bytes. [XXXX Is this enough? -NM] Some fields in the query must be omitted or set to zero: see section 3 below. 1.1.2. DNS_RESPONSE DNS_RESPONSE payload: STATUS [1 octet] CONTENT [variable, up to length of relay cell] If the low bit of STATUS is set, this is the last DNS_RESPONSE that the server will send in response to the given DNS_BEGIN. Otherwise, there will be more DNS_RESPONSE packets. The other bits are reserved, and should be set to zero for now. The CONTENT fields of the DNS_RESPONSE cells contain a DNS record, split across multiple cells as needed, encoded as: total length (2 octets) data (variable) So for example, if the DNS record R1 is only 300 bytes long, then it is sent in a single DNS_RESPONSE cell with payload [01 01 2C] R1. But if the DNS record R2 is 1024 bytes long, it's sent in 3 DNS_RESPONSE cells, with contents: [00 04 00] R2[0:495], [00] R2[495:992], and [01] R2[992:1024] respectively. [NOTE: I'm using the length field and the is-this-the-last-cell field to allow multi-packet responses in the future. -NM] AXFR and IXRF are not supported in this cell by design (see specialized tool below in section 5). 1.1.3. Matching queries to responses. DNS_BEGIN must use a non-zero, distinct StreamID. The client MUST NOT re-use the same stream ID until it has received a complete response from the server or a RELAY_END cell. The client may cancel a DNS_BEGIN request by sending a RELAY_END cell. The server may refused to answer, or abort answering, a DNS_BEGIN cell by sending a RELAY_END cell. 2. Interfaces to applications DNSPort evdns - existing implementation will be updated to use DNS_BEGIN. [XXXX we should add a dig-like tool that can work over the socksport via some extension, as tor-resolve does now. -NM] 3. 
Limitations on DNS query Clients must only set query class to IN (INTERNET), since the only other useful class CHAOS is practical for directly querying authoritative servers (OR in this case acts as a recursive resolver). Servers MUST return REFUSED for any for class other than IN. Multiple questions in a single packet are not supported and OR will respond with REFUSED as the DNS error code. All query RR types are allowed. [XXXX I originally thought about some exit policy like "basic RR types" and "all RRs", but managing such list in deployed nodes with extra directory flags outweighs the benefit. Maybe disallow ANY RR type? -OM] Client as well as OR MUST block attempts to resolve local RFC 1918, 4193, or 4291 adresses (PTR). REFUSED will be returned as DNS error code from OR. [XXXX Must they also refuse to report addresses that resolve to these? -NM] [XXX I don't think so. People often use public DNS records that map to private adresses. We can't effectively separate "truly public" records from the ones client's dnsmasq or similar DNS resolver returns. - OM] [XXX Then do you mean "must be returned as the DNS error from the OP"?] Request for special names (.onion, .exit, .noconnect) must never be sent, and will return REFUSED. The DNS transaction ID field MUST be set to zero in all requests and replies; the stream ID field plays the same function in Tor. 4. Implementation notes Client will periodically purge incomplete DNS replies. Any unexpected DNS_RESPONSE will be dropped. AD flag must be zeroed out on client unless validation is performed. [XXXX libunbound lowlevel API, Tor+libunbound libevent loop libunbound doesn't publicly expose all the necessary parts of low-level API. It can return the received DNS packet, but not let you construct a packet and get it in wire-format, for example. 
Options I see: a) patch libunbound to be able feed wire-format DNS packets and add API to obtain constructed packets instead of sending over network b) replace bufferevents for sockets in unbound with something like libevent's paired bufferevents. This means that data extracted from DNS_RESPONSE/DNS_BEGIN cells would be fed directly to some evbuffers that would be picked up by libunbound. It could possibly result in avoiding background thread of libunbound's ub_resolve_async running separate libevent loop. c) bind to some arbitrary local address like 127.1.2.3:53 and use it as forwarder for libunbound. The code there would pack/unpack the DNS packets from/to libunbound into DNS_BEGIN/DNS_RESPONSE cells. It wouldn't require modification of libunbound code, but it's not pretty either. Also the bind port must be 53 which usually requires superuser privileges. Code of libunbound is fairly complex for me to see outright what would the best approach be. ] 5. Separate tool for AXFR The AXFR tool will have similar interface like tor-resolve, but will return raw DNS data. Parameters are: query domain, server IP of authoritative DNS. The tool will transfer the data through "ordinary" tunnel using RELAY_BEGIN and related cells. This design decision serves two goals: - DNS_BEGIN and DNS_RESPONSE will be simpler to implement (lower chance of bugs) - in practice it's often useful do AXFR queries on secondary authoritative DNS servers IXFR will not be supported (infrequent corner case, can be done by manual tunnel creation over Tor if truly necessary). 6. Security implications As proposal 171 mentions, we need mitigate circuit correlation. One solution would be keeping multiple streams to multiple exit nodes and picking one at random for DNS resolution. Other would be keeping DNS-resolving circuit open only for a short time (e.g. 1-2 minutes). 
   Randomly changing the circuits, however, would probably incur additional latency, since there would likely be a few cache misses on the newly selected exits.

   [This needs more analysis; We need to consider the possible attacks here. It would be good to have a way to tie requests to SocksPorts, perhaps? -NM]

7. TTL normalization idea

   This is a bit complex to implement, because it requires parsing DNS packets at the exit node.

   The TTL in a reply DNS packet MUST be normalized at the exit node so that the client won't learn what other clients queried. The normalization is done in the following way:

   - for an RR, the original TTL value received from the authoritative DNS server should be used when sending DNS_RESPONSE, trimming the values to the interval [5, 600]
   - this does not pose a "ghost-cache-attack", since once an RR is flushed from libunbound's cache, it must be fetched anew

8. DNSSEC notes

8.1. Where to do the resolution?

   DNSSEC is part of the DNS protocol, and the most appropriate place for a DNSSEC API would probably be in OS libraries (e.g. libc). However, it will probably take time until that becomes widespread.

   On Tor's side (as opposed to the application's side), DNSSEC will provide protection against DNS cache-poisoning attacks (provided that the exit is not itself malicious; even so, it reduces the attack surface).

8.2. Round trips and serialization

   Following are two examples of resolving two A records. The one for addons.mozilla.org is an example of a "common" RR without CNAME/DNAME; the other, for www.gov.cn, is an extreme example chained through 5 CNAMEs and 3 TLDs. The examples below show resolution that started with an empty DNS cache.

   Note that multiple queries are made by libunbound as it tries to adjust for the latency of the network. A "Standard query response" below that does not list an RR type is a negative NOERROR reply with NSEC/NSEC3 (usually a reply to a DS query).
The effect of DNS cache plays a great role - once DS/DNSKEY for root and a TLD is cached, at most 3 records usually need to be fetched for a record that does not utilize CNAME/DNAME (3 roundtrips for DS, DNSKEY and the record itself if there are no zone cuts below). Query for addons.mozilla.org, 6 roundtrips (not counting retries): Standard query A addons.mozilla.org Standard query A addons.mozilla.org Standard query A addons.mozilla.org Standard query A addons.mozilla.org Standard query A addons.mozilla.org Standard query response A 63.245.217.112 RRSIG Standard query response A 63.245.217.112 RRSIG Standard query response A 63.245.217.112 RRSIG Standard query A addons.mozilla.org Standard query response A 63.245.217.112 RRSIG Standard query response A 63.245.217.112 RRSIG Standard query A addons.mozilla.org Standard query response A 63.245.217.112 RRSIG Standard query response A 63.245.217.112 RRSIG Standard query DNSKEY <Root> Standard query DNSKEY <Root> Standard query response DNSKEY DNSKEY RRSIG Standard query response DNSKEY DNSKEY RRSIG Standard query DS org Standard query response DS DS RRSIG Standard query DNSKEY org Standard query response DNSKEY DNSKEY DNSKEY DNSKEY RRSIG RRSIG Standard query DS mozilla.org Standard query response DS RRSIG Standard query DNSKEY mozilla.org Standard query response DNSKEY DNSKEY DNSKEY RRSIG RRSIG Query for www.gov.cn, 16 roundtrips (not counting retries): Standard query A www.gov.cn Standard query A www.gov.cn Standard query A www.gov.cn Standard query A www.gov.cn Standard query A www.gov.cn Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A www.gov.cn Standard query 
response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A www.gov.cn Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query response CNAME www.gov.chinacache.net CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A www.gov.chinacache.net Standard query response CNAME www.gov.cncssr.chinacache.net CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A www.gov.cncssr.chinacache.net Standard query response CNAME www.gov.foreign.ccgslb.com CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A www.gov.foreign.ccgslb.com Standard query response CNAME wac.0b51.edgecastcdn.net CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A wac.0b51.edgecastcdn.net Standard query response CNAME gp1.wac.v2cdn.net A 68.232.35.119 Standard query A gp1.wac.v2cdn.net Standard query response A 68.232.35.119 Standard query DNSKEY <Root> Standard query response DNSKEY DNSKEY RRSIG Standard query DS cn Standard query response Standard query DS net Standard query response DS RRSIG Standard query DNSKEY net Standard query response DNSKEY DNSKEY RRSIG Standard query DS chinacache.net Standard query response Standard query DS com 
Standard query response DS RRSIG Standard query DNSKEY com Standard query response DNSKEY DNSKEY RRSIG Standard query DS ccgslb.com Standard query response Standard query DS edgecastcdn.net Standard query response Standard query DS v2cdn.net Standard query response An obvious idea to avoid so many roundtrips is to serialize them together. There has been an attempt to standardize such "DNSSEC stapling" [1], however it's incomplete for the general case, mainly due to various intricacies - proofs of non-existence, NSEC3 opt-out zones, TTL handling (see RFC 4035 section 5). References: [1] https://www.ietf.org/mail-archive/web/dane/current/msg02823.html
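
As a rough illustration of the "Limitations on DNS query" rules and the TTL trimming of section 7 in this proposal, the following sketch checks a wire-format query and clamps a TTL. It is not taken from any Tor code: the function names, the helper that builds a test query, and the choice to check only an uncompressed single question are assumptions made for the example. The numeric values for class IN (1) and RCODE REFUSED (5) come from RFC 1035.

```python
import struct

RCODE_REFUSED = 5  # DNS RCODE "Refused" (RFC 1035)
CLASS_IN = 1       # DNS class IN (RFC 1035)
SPECIAL_SUFFIXES = (".onion", ".exit", ".noconnect")
TTL_MIN, TTL_MAX = 5, 600  # interval from section 7


def build_query(qname, qclass=CLASS_IN, qtype=1, qdcount=1):
    """Hypothetical helper: build a one-question query with transaction ID 0."""
    header = struct.pack("!6H", 0, 0x0100, qdcount, 0, 0, 0)
    labels = b"".join(bytes([len(l)]) + l.encode("ascii")
                      for l in qname.split("."))
    return header + labels + b"\x00" + struct.pack("!2H", qtype, qclass)


def check_query(packet):
    """Return None if the query may be resolved, else the RCODE to answer with."""
    _txid, _flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    if qdcount != 1:
        return RCODE_REFUSED          # multiple questions are not supported
    labels, pos = [], 12
    while packet[pos] != 0:           # walk the uncompressed QNAME labels
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    _qtype, qclass = struct.unpack("!2H", packet[pos + 1:pos + 5])
    if qclass != CLASS_IN:
        return RCODE_REFUSED          # only class IN is allowed
    name = "." + ".".join(labels).lower()
    if name.endswith(SPECIAL_SUFFIXES):
        return RCODE_REFUSED          # special Tor names are never resolved
    return None


def normalize_ttl(ttl):
    """Trim a TTL to the interval [TTL_MIN, TTL_MAX]."""
    return max(TTL_MIN, min(TTL_MAX, ttl))
```

The sketch deliberately leaves out the RFC 1918/4193/4291 PTR checks and any handling of a nonzero transaction ID, since the text above does not say what error (if any) the latter should produce.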
Filename: 220-ecc-id-keys.txt Title: Migrate server identity keys to Ed25519 Authors: Nick Mathewson Created: 12 August 2013 Implemented-In: 0.3.0.1-alpha Status: Closed [Note: This is a draft proposal; I've probably made some important mistakes, and there are parts that need more thinking. I'm publishing it now so that we can do the thinking together.] (Sections 0-5 are currently implemented, except for section 2.3. Sections 6-8 are a work in progress, and may require revision.) 0. Introduction In current Tor designs, router identity keys are limited to 1024-bit RSA keys. Clearly, that should change, because RSA doesn't represent a good performance-security tradeoff nowadays, and because 1024-bit RSA is just plain too short. We've already got an improved circuit extension handshake protocol that uses curve25519 in place of RSA1024, and we're using (where supported) P256 ECDHE in our TLS handshakes, but there are more uses of RSA1024 to replace, including: * Router identity keys * TLS link keys * Hidden service keys This proposal describes how we'll migrate away from using 1024-bit RSA in the first two, since they're tightly coupled. Hidden service crypto changes will be complex, and will merit their own proposal. In this proposal, we'll also (incidentally) be extirpating a number of SHA1 usages. 1. Overview When this proposal is implemented, every router will have an Ed25519 identity key in addition to its current RSA1024 public key. Ed25519 (specifically, Ed25519-SHA-512 as described and specified at http://ed25519.cr.yp.to/) is a desirable choice here: it's secure, fast, has small keys and small signatures, is bulletproof in several important ways, and supports fast batch verification. (It isn't quite as fast as RSA1024 when it comes to public key operations, since RSA gets to take advantage of small exponents when generating public keys.) (For reference: In Ed25519 public keys are 32 bytes long, private keys are 64 bytes long, and signatures are 64 bytes long.) 
   To mirror the way that authority identity keys work, we'll fully support keeping Ed25519 identity keys offline; they'll be used to sign long-ish term signing keys, which in turn will do all of the heavy lifting. A signing key will get used to sign the things that RSA1024 identity keys currently sign.

1.1. 'Personalized' signatures

   Each of the keys introduced here is used to sign more than one kind of document. While these documents should be unambiguous, I'm going to forward-proof the signatures by specifying each signature to be generated, not on the document itself, but on the document prefixed with some distinguishing string.

2. Certificates and Router descriptors.

2.1. Certificates

   When generating a signing key, we also generate a certificate for it. Unlike the certificates for authorities' signing keys, these certificates need to be sent around frequently, in significant numbers. So we'll choose a compact representation.

         VERSION         [1 Byte]
         CERT_TYPE       [1 Byte]
         EXPIRATION_DATE [4 Bytes]
         CERT_KEY_TYPE   [1 byte]
         CERTIFIED_KEY   [32 Bytes]
         N_EXTENSIONS    [1 byte]
         EXTENSIONS      [N_EXTENSIONS times]
         SIGNATURE       [64 Bytes]

   The "VERSION" field holds the value [01]. The "CERT_TYPE" field holds a value depending on the type of certificate. (See appendix A.1.) The CERTIFIED_KEY field is an Ed25519 public key if CERT_KEY_TYPE is [01], or a SHA256 hash of some other key type depending on the value of CERT_KEY_TYPE. The EXPIRATION_DATE is a date, given in HOURS since the epoch, after which this certificate isn't valid. (A four-byte count of hours spans hundreds of thousands of years, so this field will not run out in practice.)

   The EXTENSIONS field contains zero or more extensions, each of the format:

         ExtLength [2 bytes]
         ExtType   [1 byte]
         ExtFlags  [1 byte]
         ExtData   [ExtLength bytes]

   The meaning of the ExtData field in an extension is type-dependent.

   The ExtFlags field holds flags; this flag is currently defined:

      1 -- AFFECTS_VALIDATION.
If this flag is present, then the extension affects whether the certificate is valid; clients must not accept the certificate as valid unless they understand the extension. It is an error for an extension to be truncated; such a certificate is invalid. Before processing any certificate, parties MUST know which identity key it is supposed to be signed by, and then check the signature. The signature is formed by signing the first N-64 bytes of the certificate prefixed with the string "Tor node signing key certificate v1". 2.2. Basic extensions 2.2.1. Signed-with-ed25519-key extension [type 04] In several places, it's desirable to bundle the key signing a certificate along with the certificate. We do so with this extension. ExtLength = 32 ExtData = An ed25519 key [32 bytes] When this extension is present, it MUST match the key used to sign the certificate. 2.3. Revoking keys. We also specify a revocation document for revoking a signing key or an identity key. Its format is: FIXED_PREFIX [8 Bytes] VERSION [1 Byte] KEYTYPE [1 Byte] IDENTITY_KEY [32 Bytes] REVOKED_KEY [32 Bytes] PUBLISHED [8 Bytes] N_EXTENSIONS [1 Byte] N_EXTENSIONS_TIMES: EXTENSIONS [N_EXTENSIONS times] SIGNATURE [64 Bytes] FIXED_PREFIX is "REVOKEID" or "REVOKESK". VERSION is [01]. KEYTYPE is [01] for revoking a signing key, [02] for revoking an identity key, or [03] for revoking an RSA identity key. REVOKED_KEY is the key being revoked or a SHA256 hash of the key if it is an RSA identity key; IDENTITY_KEY is the node's Ed25519 identity key. PUBLISHED is the time that the document was generated, in seconds since the epoch. REV_EXTENSIONS is left for a future version of this document. The SIGNATURE is generated with the same key as in IDENTITY_KEY, and covers the entire revocation, prefixed with "Tor key revocation v1". Using these revocation documents is left for a later specification. 2.4. 
Managing keys By default, we can keep the easy-to-setup key management properties that Tor has now, so that node operators aren't required to have offline public keys: * When a Tor node starts up with no Ed25519 identity keys, it generates a new identity keypair. * When a Tor node has an Ed25519 identity keypair, and it has no signing key, or its signing key is going to expire within the next 48 hours, it generates a new signing key to last 30 days. But we also support offline identity keys: * When a Tor node starts with an Ed25519 public identity key but no private identity key, it checks whether it has a currently valid certified signing keypair. If it does, it starts. Otherwise, it refuses to start. * If a Tor node's signing key is going to expire soon, it starts warning the user. If it is expired, then the node shuts down. 2.5. Router descriptors We specify the following element that may appear at most once in each router descriptor: "identity-ed25519" NL "-----BEGIN ED25519 CERT-----" NL certificate "-----END ED25519 CERT-----" NL The certificate is base64-encoded with terminating =s removed. When this element is present, it MUST appear as the first or second element in the router descriptor. [XXX The rationale here is to allow extracting the identity key and signing key and checking the signature before fully parsing the rest of the document. -NM] The certificate has CERT_TYPE of [04]. It must include a signed-with-ed25519-key extension (see section 2.2.1), so that we can extract the identity key. When an identity-ed25519 element is present, there must also be a "router-sig-ed25519" element. It MUST be the next-to-last element in the descriptor, appearing immediately before the RSA signature. (In future versions of the descriptor format that do not require an RSA identity key, it MUST be last.) 
   It MUST contain an ed25519 signature of a SHA256 digest of the entire document, from the first character up to and including the first space after the "router-sig-ed25519" string, prefixed with the string "Tor router descriptor signature v1". Its format is:

      "router-sig-ed25519" SP signature NL

   Where 'signature' is encoded in base64 with terminating =s removed.

   The signing key in the certificate MUST be the one used to sign the document.

   Note that these keys cross-certify as follows: the ed25519 identity key signs the ed25519 signing key in the certificate. The ed25519 signing key signs itself and the ed25519 identity key and the RSA identity key as part of signing the descriptor. And the RSA identity key also signs all three keys as part of signing the descriptor.

   When an ed25519 signature is present, there MAY be a "master-key-ed25519" element containing the base64 encoded ed25519 master key as a single argument. If it is present, it MUST match the identity key in the certificate.

2.5.1. Checking descriptor signatures.

   Current versions of Tor will handle these new formats by ignoring the new fields, and not checking any ed25519 information.

   New versions of Tor will have a flag that tells them whether to check ed25519 information. When it is set, they must check:

      * All RSA information and signatures that Tor implementations currently check.

      * If the identity-ed25519 line is present, it must be well-formed, and the certificate must be well-formed and correctly signed, and there must be a valid router-sig-ed25519 signature.

      * If we require an ed25519 key for this node (see 3.1 below), the ed25519 key must be present.

   Authorities and directory caches will have this flag always-on. For clients, it will be controlled by a torrc option and consensus option, to be set to "always-on" in the future once enough clients support it.

2.5.2. Extra-info documents

   Extra-info documents now include "identity-ed25519" and "router-sig-ed25519" fields in the same positions in which they appear in router descriptors.

   Additionally, we add the base64-encoded, =-stripped SHA256 digest of a node's extra-info document as an additional field on the extra-info-digest line in the router descriptor. (All versions of Tor that recognize this line allow an extra field there.)

2.5.3. A note on signature verification

   Here and elsewhere, we're receiving a certificate and a document signed with the key certified by that certificate in the same step. This is a fine time to use the batch signature checking capability of Ed25519, so that we can check both signatures at once without (much) additional overhead over checking a single signature.

3. Consensus documents and authority operation

3.1. Handling router identity at the authority

   When receiving router descriptors, authorities must track mappings between RSA and Ed25519 keys.

   Rule 1: Once an authority has seen an Ed25519 identity key and an RSA identity key together on the same (valid) descriptor, it should no longer accept any descriptor signed by that RSA key with a different Ed25519 key, or that Ed25519 key with a different RSA key.

   Rule 2: Once an authority has seen an Ed25519 identity key and an RSA identity key on the same descriptor, it should no longer accept any descriptor signed by that RSA key unless it also has that Ed25519 key.

   These rules together should enforce the property that, even if an attacker manages to steal or factor a node's RSA identity key, the attacker can't impersonate that node to the authorities, even when that node is identified by its RSA key.

   Enforcement of Rule 1 should be advisory-only for a little while (a release or two) while node operators get experience having Ed25519 keys, in case there are any bugs that cause or force identity key replacement.
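
Rules 1 and 2 amount to a two-way pinning table between RSA and Ed25519 identities. A minimal sketch follows, assuming the descriptor has already been validated and ignoring the advisory-only transition period; the class and method names are hypothetical and are not taken from the Tor source.

```python
class KeyPinning:
    """Hypothetical in-memory table of RSA <-> Ed25519 identity pairings."""

    def __init__(self):
        self.ed_by_rsa = {}   # RSA identity -> pinned Ed25519 identity
        self.rsa_by_ed = {}   # Ed25519 identity -> pinned RSA identity

    def accept(self, rsa_id, ed_id=None):
        """Return True if a valid descriptor with these identities is acceptable."""
        pinned_ed = self.ed_by_rsa.get(rsa_id)
        if ed_id is None:
            # Rule 2: an RSA key seen with an Ed25519 key must keep using it.
            return pinned_ed is None
        # Rule 1: neither key may appear with a different partner.
        if pinned_ed is not None and pinned_ed != ed_id:
            return False
        pinned_rsa = self.rsa_by_ed.get(ed_id)
        if pinned_rsa is not None and pinned_rsa != rsa_id:
            return False
        # First sighting (or a repeat of the same pair): record the pairing.
        self.ed_by_rsa[rsa_id] = ed_id
        self.rsa_by_ed[ed_id] = rsa_id
        return True
```

With this table, an attacker who holds only a node's RSA key is rejected both when presenting a new Ed25519 key (Rule 1) and when presenting no Ed25519 key at all (Rule 2).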
   Enforcement of Rule 2 should be advisory-only for a little while, so that node operators can try 0.2.5 but downgrade to 0.2.4 without being de-listed from the consensus.

3.2. Formats

   Vote and microdescriptor documents now contain an optional "id" field for each routerstatus section. Its format is:

      "id" SP "ed25519" SP ed25519-identity NL

   where ed25519-identity is base64-encoded, with trailing = characters omitted. In vote documents, it may be replaced by the format:

      "id" SP "ed25519" SP "none" NL

   which indicates that the node does not have an ed25519 identity. (In a microdescriptor, a lack of "id" line means that the node has no ed25519 identity.)

   A vote or consensus document is ill-formed if it includes the same ed25519 identity key twice.

   A vote listing ed25519 identities must also include a new entry in its "r" lines, containing a base64-encoded SHA256 digest of the entire descriptor (including signature). This kills off another place where we rely on sha1. The format for 'r' lines is now:

      "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort SP DirPort [ SP digest-sha256 ] NL

3.3. Generating votes

   An authority should pick which descriptor to choose for a node as before, and include the ed25519 identity key for the descriptor if it's present.

   As a transition, before Rule 1 and Rule 2 in 3.1 are fully enforced, authorities need a way to deal with the possibility that there might be two nodes with the same ed25519 key but different RSA keys. In that case, the authority votes for the one with the most recent publication date.

   (The existing rules already prevent an authority from voting for two servers with the same RSA identity key.)

3.4. Generating a consensus from votes

   This proposal requires a new consensus vote method. When we deploy it, we'll pick the next available vote method in sequence to use for this.

   When the new consensus method is in use, we must choose nodes first by ECC key, then by RSA key.
[This procedure is analogous to the current one, except that it is aware of multiple kinds of keys.] 3.4.1. Notation for voting We have a set of votes. Each contains either 'old tuples' or 'new tuples'. Old tuples are: <id-RSA, descriptor-digest, published, nickname, IP, ports> New tuples are: <id-Ed, id-RSA, descriptor-digest, dd256, published, nickname, IP, ports> 3.4.2. Validating votes It is an error for a vote to have the same id-RSA or the same id-Ed listed twice. Throw it away if it does. 3.4.3. Decide which ids to include. For each <id-Ed, id-RSA> that is listed by more than half of the total authorities (not just total votes), include it. (No other <id-Ed, id-RSA'> can have as many votes.) Log any other id-RSA values corresponding to an id-Ed we included, and any other id-Ed values corresponding to an id-RSA we included. For each <id-RSA> that is not yet included, if it is listed by more than half of the total authorities, and we do not already have it listed with some <id-Ed>, include it without an id-Ed. 3.4.4. Decide which descriptors to include. A tuple belongs to an <id-RSA, id-Ed> identity if it is a new tuple that matches both ID parts, or if it is an old tuple that matches the RSA part. A tuple belongs to an <id-RSA> identity if its RSA identity matches. A tuple matches another tuple if all the fields that are present in both tuples are the same. For every included identity, consider the tuples belonging to that identity. Group them into sets of matching tuples. Include the tuple that matches the largest set, breaking ties in favor of the most recently published, and then in favor of the smaller server descriptor digest. 4. The link protocol 4.1. Overview of the status quo This section won't make much sense unless you grok the v3 link protocol as described in tor-spec.txt, first proposed in proposal 195. So let's review. 
   In the v3 link protocol, the client completes a TLS handshake with the server, in which the server uses an arbitrary certificate signed with an RSA key. The client then sends a VERSIONS cell. The server replies with a VERSIONS cell to negotiate version 3 or higher. The server also sends a CERTS cell and an AUTH_CHALLENGE cell and a NETINFO cell.

   The CERTS cell from the server contains a set of one or more certificates that authenticate the RSA key used in the TLS handshake. (Right now there's one self-signed RSA identity key certificate, and one certificate signing the RSA link key with the identity key. These certificates are X509.)

   Having received a CERTS cell, the client has enough information to authenticate the server. At this point, the client may send a NETINFO cell to finish the handshake. But if the client wants to authenticate as well, it can send a CERTS cell and an AUTHENTICATE cell.

   The client's CERTS cell also contains certs of the same general kinds as the server's: a self-signed identity certificate, and an authentication certificate signed with the identity key. The AUTHENTICATE cell contains a signature of various fields, including the contents of the AUTH_CHALLENGE which the server sent, using the client's authentication key. These cells allow the client to authenticate to the server.

4.2. Link protocol changes for ECC ID keys

   We add four new CertType values for use in CERTS cells:

      4: Ed25519 signing key
      5: Link key certificate certified by Ed25519 signing key
      6: Ed25519 TLS authentication key certified by Ed25519 signing key
      7: RSA cross-certificate for Ed25519 identity key

   These correspond to types used in the CERT_TYPE field of the certificates.

   The content of certificate type [04] (Ed25519 signing key) is as in section 2.5 above, containing an identity key and the signing key, both signed by the identity key.
Certificate type [05] (Link certificate signed with Ed25519 signing key) contains a SHA256 digest of the X.509 link certificate used on the TLS connection in its key field; it is signed with the signing key. Certificate type [06] (Ed25519 TLS authentication signed with Ed25519 signing key) has the signing key used to sign the AUTHENTICATE cell described later in this section. Certificate type [07] (Cross-certification of Ed25519 identity with RSA key) contains the following data: ED25519_KEY [32 bytes] EXPIRATION_DATE [4 bytes] SIGLEN [1 byte] SIGNATURE [SIGLEN bytes] Here, the Ed25519 identity key is signed with router's RSA identity key, to indicate that authenticating with a key certified by the Ed25519 key counts as certifying with RSA identity key. (The signature is computed on the SHA256 hash of the non-signature parts of the certificate, prefixed with the string "Tor TLS RSA/Ed25519 cross-certificate".) (There's no reason to have a corresponding Ed25519-signed-RSA-key certificate here, since we do not treat authenticating with an RSA key as proving ownership of the Ed25519 identity.) Relays with Ed25519 keys should always send these certificate types in addition to their other certificate types. Non-bridge relays with Ed25519 keys should generate TLS link keys of appropriate strength, so that the certificate chain from the Ed25519 key to the link key is strong enough. We add a new authentication type for AUTHENTICATE cells: "Ed25519-TLSSecret", with AuthType value 2. Its format is the same as "RSA-SHA256-TLSSecret", except that the CID and SID fields support more key types; some strings are different, and the signature is performed with Ed25519 using the authentication key from a type-6 cert. Clients can send this AUTHENTICATE type if the server lists it in its AUTH_CHALLENGE cell. Modified values and new fields below are marked with asterisks. 
      TYPE: The characters "AUTH0002"* [8 octets]

      CID: A SHA256 hash of the initiator's RSA1024 identity key [32 octets]

      SID: A SHA256 hash of the responder's RSA1024 identity key [32 octets]

      *CID_ED: The initiator's Ed25519 identity key [32 octets]

      *SID_ED: The responder's Ed25519 identity key, or all-zero. [32 octets]

      SLOG: A SHA256 hash of all bytes sent from the responder to the initiator as part of the negotiation up to and including the AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERTS cell, the AUTH_CHALLENGE cell, and any padding cells. [32 octets]

      CLOG: A SHA256 hash of all bytes sent from the initiator to the responder as part of the negotiation so far; that is, the VERSIONS cell and the CERTS cell and any padding cells. [32 octets]

      SCERT: A SHA256 hash of the responder's TLS link certificate. [32 octets]

      TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the secret key, of the following:
        - client_random, as sent in the TLS Client Hello
        - server_random, as sent in the TLS Server Hello
        - the NUL terminated ASCII string: "Tor V3 handshake TLS cross-certification with Ed25519"*
        [32 octets]

      RAND: A 24 byte value, randomly chosen by the initiator. [24 octets]

      *SIG: A signature of all previous fields using the initiator's Ed25519 authentication key. [variable length]

   If you've got a consensus that lists an ECC key for a node, but the node doesn't give you an ECC key, then refuse this connection.

5. The extend protocol

   We add a new NSPEC node specifier for use in EXTEND2 cells, with LSTYPE value [03]. Its length must be 32 bytes; its content is the Ed25519 identity key of the target node.

   Clients should use this type only when:

     * They know an Ed25519 identity key for the destination node.
     * The source node supports EXTEND2 cells.
     * A torrc option is set, _or_ a consensus value is set.

   We'll leave the consensus value off for a while until more clients support this, and then turn it on.
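
The new node specifier and the conditions for using it can be sketched as follows. The one-byte type and one-byte length framing is an assumption taken from the general EXTEND2 link specifier layout in tor-spec.txt rather than from this proposal, and the function and parameter names are hypothetical.

```python
LSTYPE_ED25519_ID = 0x03   # LSTYPE value from section 5
ED25519_ID_LEN = 32        # required specifier length in bytes


def encode_ed25519_link_specifier(ed_id):
    """Encode LSTYPE, LSLEN, and the 32-byte Ed25519 identity key."""
    if len(ed_id) != ED25519_ID_LEN:
        raise ValueError("Ed25519 identity key must be exactly 32 bytes")
    return bytes([LSTYPE_ED25519_ID, ED25519_ID_LEN]) + ed_id


def should_send_ed25519_specifier(knows_ed_id, source_supports_extend2,
                                  torrc_option_set, consensus_value_set):
    """Apply the three client-side conditions listed in section 5."""
    return (knows_ed_id
            and source_supports_extend2
            and (torrc_option_set or consensus_value_set))
```

A client would append the encoded specifier to the other link specifiers of an EXTEND2 cell only when `should_send_ed25519_specifier` is true.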
When picking a channel for a circuit, if this NSPEC value is provided, then the RSA identity *and* the Ed25519 identity must match. If we have a channel with a given Ed25519 ID and RSA identity, and we have a request for that Ed25519 ID and a different RSA identity, we do not attempt to make another connection: we just fail and DESTROY the circuit. If we receive an EXTEND or EXTEND2 request for a node listed in the consensus, but that EXTEND/EXTEND2 request does not include an Ed25519 identity key, the node SHOULD treat the connection as failed if the Ed25519 identity key it receives does not match the one in the consensus. For testing, clients may have the ability to configure whether to include Ed25519 identities in EXTEND2 cells. By default, this should be governed by the boolean "ExtendByEd25519ID" consensus parameter, with default value '0'. 6. Naming nodes in the interface Anywhere in the interface that takes an $identity should be able to take an ECC identity too. ECC identities are case-sensitive base64 encodings of Ed25519 identity keys. You can use $ to indicate them as well; we distinguish RSA identity digests by length. When we need to indicate an Ed25519 identity key in a hostname format (as in a .exit address), we use the lowercased version of the name, and perform a case-insensitive match. (This loses us a little less than one bit per byte of name, leaving plenty of bits to make sure we choose the right node.) Nodes must not list Ed25519 identities in their family lines; clients and authorities must not honor them there. (Doing so would make different clients change paths differently in a possibly manipulatable way.) Clients shouldn't accept .exit addresses with Ed25519 names on SOCKS or DNS ports by default, even when AllowDotExit is set. We can add another option for them later if there's a good reason to have this. We need an identity-to-node map for ECC identity and for RSA identity. 
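
Distinguishing the two identity forms by length, as described above, might look like the sketch below. It assumes that RSA identity digests keep their current 40-character hexadecimal form and that a 32-byte Ed25519 key is written as 43 characters of unpadded base64; the function name is hypothetical.

```python
import base64


def parse_identity(text):
    """Return ("rsa", digest) or ("ed25519", key) for an identity string.

    A leading "$" is optional for both forms. Raises ValueError for any
    other length or for characters outside the expected alphabet.
    """
    s = text[1:] if text.startswith("$") else text
    if len(s) == 40:
        # 20-byte RSA identity digest in hexadecimal (case-insensitive).
        return "rsa", bytes.fromhex(s)
    if len(s) == 43:
        # 32-byte Ed25519 key in case-sensitive base64 without the "=".
        return "ed25519", base64.b64decode(s + "=", validate=True)
    raise ValueError("unrecognized identity length: %d" % len(s))
```

The two returned kinds can then be used as keys into two separate identity-to-node maps, one per key type.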
   The controller interface will need to accept and report Ed25519 identity keys as well as (or instead of) RSA identity keys. That's a separate proposal, though.

7. Hidden service changes out of scope

   Hidden services need to be able to identify nodes by ECC keys, just as they will need to include ntor keys as well as TAP keys. Not just yet though. This needs to be part of a bigger hidden service revamping strategy.

8. Proposed migration steps

   Once a few versions have shipped with Ed25519 key support, turn on "Rule 1" on the authorities. (Don't allow an Ed25519<->RSA pairing to change.)

   Once the release with these changes is in beta or rc, turn on the consensus option for everyone who receives descriptors with Ed25519 identity keys to check them.

   Once the release with these changes is in beta or rc, turn on the consensus option for clients to generate EXTEND2 requests with Ed25519 identity keys.

   Once the release with these changes has been stable for a month or two, turn on "Rule 2" on authorities. (Don't allow nodes that have advertised an Ed25519 key to stop.)

9. Future proposals

   * Ed25519 identity support on the controller interface
   * Supporting nodes without RSA keys
   * Remove support for nodes without Ed25519 keys
   * Ed25519 support for hidden services
   * Bridge identity support.
   * Ed25519-aware family support

A.1. List of certificate types

   The values marked with asterisks are not types corresponding to the certificate format of section 2.1. Instead, they are reserved for RSA-signed certificates to avoid conflicts between the certificate type enumeration of the CERTS cell and the certificate type enumeration in our Ed25519 certificates.

   **[00],[01],[02],[03] - Reserved to avoid conflict with types used in CERTS cells.

   [04] - signing a signing key with an identity key (Section 2.5)

   [05] - TLS link certificate signed with ed25519 signing key (Section 4.2)

   [06] - Ed25519 authentication key signed with ed25519 signing key (Section 4.2)

   **[07] - reserved for RSA identity cross-certification (Section 4.2)

A.2. List of extension types

   [01] - signed-with-ed25519-key (section 2.2.1)

A.3. List of signature prefixes

   We describe various documents as being signed with a prefix. Here are those prefixes:

      "Tor router descriptor signature v1" (section 2.5)
      "Tor node signing key certificate v1" (section 2.1)

A.4. List of certified key types

   [01] ed25519 key
   [02] SHA256 hash of an RSA key
   [03] SHA256 hash of an X.509 certificate

A.5. Reserved numbers

   We need a new consensus algorithm number to encompass checking ed25519 keys and putting them in microdescriptors.

   We need new CertType values for use in CERTS cells. These are reserved in section 4.2:

      4: Ed25519 signing key
      5: Link key certificate certified by Ed25519 signing key
      6: TLS authentication key certified by Ed25519 signing key
      7: RSA cross-certificate for Ed25519 identity key

A.6. Related changes

   As we merge this proposal, we should also extend link key size to 2048 bits, and use SHA256 as the x509 cert algorithm for our link keys. This will improve link security, and deliver better fingerprinting resistance. See proposal 179 for an older discussion of this issue.
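
To tie the certificate layout of section 2.1 to the type tables in this appendix, here is a sketch of a parser for the compact certificate format. It assumes big-endian (network order) integers, which the proposal does not state explicitly, and it follows section 2.2.1 in using extension type [04] for signed-with-ed25519-key (appendix A.2 lists [01]). Signature verification is left out because it needs an Ed25519 library; the parser only returns the bytes that the signature covers. All function and field names are hypothetical.

```python
import struct

SIG_LEN = 64
SIG_PREFIX = b"Tor node signing key certificate v1"
EXT_SIGNED_WITH_ED25519_KEY = 0x04   # type given in section 2.2.1
FLAG_AFFECTS_VALIDATION = 0x01


def parse_cert(cert):
    """Split a certificate into fields; raise ValueError if it is malformed."""
    if len(cert) < 40 + SIG_LEN:
        raise ValueError("certificate too short")
    version, cert_type, exp_hours, key_type = struct.unpack("!BBIB", cert[:7])
    if version != 1:
        raise ValueError("unknown certificate version")
    body_end = len(cert) - SIG_LEN
    n_extensions, pos, extensions = cert[39], 40, {}
    for _ in range(n_extensions):
        if pos + 4 > body_end:
            raise ValueError("truncated extension header")
        ext_len, ext_type, ext_flags = struct.unpack("!HBB", cert[pos:pos + 4])
        pos += 4
        if pos + ext_len > body_end:
            raise ValueError("truncated extension data")
        if ext_type == EXT_SIGNED_WITH_ED25519_KEY:
            if ext_len != 32:
                raise ValueError("bad signed-with-ed25519-key length")
        elif ext_flags & FLAG_AFFECTS_VALIDATION:
            # Unknown extension that we are required to understand.
            raise ValueError("unrecognized extension affects validation")
        extensions[ext_type] = cert[pos:pos + ext_len]
        pos += ext_len
    if pos != body_end:
        raise ValueError("unexpected bytes before signature")
    return {
        "cert_type": cert_type,
        "expiration_hours": exp_hours,      # hours since the epoch
        "cert_key_type": key_type,
        "certified_key": cert[7:39],
        "extensions": extensions,
        "signature": cert[body_end:],
        # Input that the Ed25519 signature is computed over (section 2.1).
        "signed_input": SIG_PREFIX + cert[:body_end],
    }
```

A caller would still need to know in advance which identity key the certificate should be signed by, check `signature` over `signed_input` with that key, and compare `expiration_hours * 3600` against the current time.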
Filename: 221-stop-using-create-fast.txt
Title: Stop using CREATE_FAST
Authors: Nick Mathewson
Created: 12 August 2013
Target: 0.2.5.x
Status: Closed

0. Summary

I propose that in 0.2.5.x, Tor clients stop sending CREATE_FAST cells, and use CREATE or CREATE2 cells instead as appropriate.

1. Introduction

The CREATE_FAST cell was created to avoid the performance hit of using the TAP handshake on a TLS session that already provided what TAP provided: authentication with RSA1024 and forward secrecy with DH1024. But thanks to the introduction of the ntor onionskin handshake in Tor 0.2.4.x, for nodes with older versions of OpenSSL, the TLS handshake strength lags behind the strength of the onion handshake, and the arguments against CREATE no longer apply.

Similarly, it's good to have an argument for circuit security that survives possible breakdowns in TLS. But when CREATE_FAST is in use, this is impossible: we can only argue forward-secrecy at the first hop of each circuit by assuming that TLS has succeeded.

So let's simply stop sending CREATE_FAST cells.

2. Proposed design

Currently, only clients will send CREATE_FAST, and only when they have FastFirstHopPK set to its default value, 1.

I propose that we change "FastFirstHopPK" from a boolean to also allow a new default "auto" value that tells Tor to take a value from the consensus. I propose a new consensus parameter, "usecreatefast", default value taken to be 1.

Once enough versions of Tor support this proposal, the authorities should set the value for "usecreatefast" to be 0.

In the series after that (0.2.6.x?), the default value for "FastFirstHopPK" should be 0.

(Note that CREATE_FAST must still be used in the case where a client has connected to a guard node or bridge without knowing any onion keys for it, and wants to fetch directory information from it.)

3. Alternative designs

We might make some choices to preserve CREATE_FAST under some circumstances.
For example, we could say that CREATE_FAST is okay if we have a TLS connection with a cipher, public key, and ephemeral key algorithm of a given strength. We might try to trust the TLS handshake for authentication but not forward secrecy, and come up with a first-hop handshake that did a simple curve25519 diffie-hellman. We might use CREATE_FAST only whenever ntor is not available. I'm rejecting all of the above for complexity reasons. We might just change the default for FastFirstHopPK to 0 in 0.2.5.x-alpha. It would make early users of that alpha easy for their guards to distinguish. 4. Performance considerations This will increase the CPU requirements on guard nodes; their cpuworkers would be more heavily loaded as 0.2.5.x is more adopted. I believe that, if guards upgrade to 0.2.4.x as 0.2.5.x is under development, the commensurate benefits of ntor will outweigh the problems here. This holds even more if we wind up with a better ntor implementation or replacement. 5. Considerations on client detection Right now, in a few places, Tor nodes assume that any connection on which they have received a CREATE_FAST cell is probably from a non-relay node, since relays never do that. Implementing this proposal would make that signal unreliable. We should do this proposal anyway. CREATE_FAST has never been a reliable signal, since "FastFirstHopPK 0" is easy enough to type, and the source code is easy enough to edit. Proposal 163 and its successors have better ideas here anyway.
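The decision rule from section 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Tor's implementation: the enum, the function name, and the dictionary representation of consensus parameters are all hypothetical. Only the option name "FastFirstHopPK", its values (0, 1, auto), the parameter name "usecreatefast", and its default of 1 come from the proposal.

```python
from enum import Enum


class FastFirstHopPK(Enum):
    """Hypothetical model of the torrc option's three values."""
    OFF = "0"
    ON = "1"
    AUTO = "auto"


def should_send_create_fast(option, consensus_params, have_onion_key):
    """Decide whether a client uses CREATE_FAST for a first hop.

    consensus_params is a dict of consensus parameters (an assumption of
    this sketch); "usecreatefast" defaults to 1 when it is absent.
    """
    # Without an onion key the client cannot build a CREATE or CREATE2
    # cell, so CREATE_FAST is the only handshake left (the
    # directory-bootstrap case noted in section 2).
    if not have_onion_key:
        return True
    if option is FastFirstHopPK.AUTO:
        return consensus_params.get("usecreatefast", 1) != 0
    return option is FastFirstHopPK.ON
```

With this shape, flipping "usecreatefast" to 0 in the consensus changes the behavior of every client left on "auto" at once, which is what lets the authorities stage the migration instead of each alpha release standing out to its guards.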
Filename: 222-remove-client-timestamps.txt
Title: Stop sending client timestamps
Authors: Nick Mathewson
Created: 22 August 2013
Status: Closed
Implemented-In: 0.2.4.18

0. Summary

There are a few places in Tor where clients and servers send timestamps. I list them and discuss how to eliminate them.

1. Introduction

Despite this late date, many hosts aren't running NTP and don't have very well synchronized clocks. Even more hosts aren't running a secure NTP; it's probably easy to desynchronize target hosts.

Given all of this, it's probably a fingerprinting opportunity whenever clients send their view of the current time. Let's try to avoid that.

I'm also going to list the places where servers send their view of the current time, and propose that we eliminate some of those.

Scope: This proposal is about eliminating passive timestamp exposure, not about tricky active detection mechanisms where you do something like offering a client a large number of about-to-expire/just-expired certificates to see which ones they accept.

2. The Tor link protocol

2.1. NETINFO (client and server)

NETINFO cells specify that both parties include a 4-byte timestamp.

Instead, let's say that clients should set this timestamp to 0. Nothing currently looks at a client's setting for this field, so this change should be safe.

2.2. AUTHENTICATE (server)

The AUTHENTICATE cell is not ordinarily sent by clients. It contains an 8-byte timestamp and a 16-byte random value. Instead, let's just send 24 bytes of random value.

(An earlier version of this proposal suggested that we replace them both with a 24-byte (truncated) HMAC of the current time, using a random key, in an attempt to retain the allegedly desirable property of avoiding nonce duplication in the event of a bad RNG. But really, a Tor process with a bad RNG is not going to get security in any case, so let's KISS.)

2.3. TLS

2.3.1. ClientRandom in the TLS handshake

See TLS proposal in appendix A.
This presents a TLS fingerprinting/censorship opportunity. I propose that we investigate whether "random" or "zero" is more common on the wire, choose that, and lobby for changes to TLS implementations.

2.3.2. Certificate validity intervals

Servers use the current time in setting certificate validity for their initial certificates. They randomize this value somewhat. I propose that we don't change this, since it's a server-only issue, and already somewhat mitigated.

3. Directory protocol

3.1. Published

This field in descriptors is generated by servers only; I propose no change.

3.2. The Date header

This HTTP header is sent by directory servers only; I propose no change.

4. The hidden service protocol

4.1. Descriptor publication time

Hidden service descriptors include a publication time. I propose that we round this time down to the nearest N minutes, where N=60.

4.2. INTRODUCE2 cell timestamp

INTRODUCE2 cells once limited the duration of their replay caches by including a timestamp in the INTRODUCE2 cells. Since 0.2.3.9-alpha, this timestamp is ignored, and key lifetime is used instead.

When we determine that no hidden services are running on 0.2.2.x (and really, no hidden services should be running on 0.2.2.x!), we can simply send 0 instead. (See ticket #7803.)

We can control this behavior with a consensus parameter (Support022HiddenServices) and a tristate (0/1/auto) torrc option of the same name.

When the timestamp is not completely disabled, it should be rounded to the closest 10 minutes.

I claim this would be suitable for backport to 0.2.4.

5. The application layer

The application layer is mostly out of scope for this proposal, except: TorBrowser already (I hear) drops the timestamp from the ClientRandom field in TLS. We should encourage other TLS applications to do so. (See Appendix A.)
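The two rounding rules in section 4 differ slightly: descriptor publication times are rounded down to a multiple of 60 minutes, while INTRODUCE2 timestamps are rounded to the closest multiple of 10 minutes. A minimal Python sketch, assuming timestamps are integer Unix times in seconds; the function names are hypothetical, and the choice to round exact ties upward is an assumption the proposal does not address:

```python
def round_down(timestamp, minutes):
    """Round a Unix timestamp down to a multiple of `minutes` minutes."""
    step = minutes * 60
    return timestamp - (timestamp % step)


def round_nearest(timestamp, minutes):
    """Round a Unix timestamp to the closest multiple of `minutes` minutes.

    Exact ties round upward (an assumption of this sketch).
    """
    step = minutes * 60
    return ((timestamp + step // 2) // step) * step


# Section 4.1: descriptor publication time, N=60.
published = round_down(5 * 3600 + 1234, 60)

# Section 4.2: INTRODUCE2 timestamp, closest 10 minutes.
intro_time = round_nearest(1800 + 299, 10)
```

Both functions discard the low-order bits of the clock, which is where most of the per-host skew, and so most of the fingerprint, lives.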
================================================================= APPENDIX A: "Let's replace gmt_unix_time in TLS" PROBLEM: The gmt_unix_time field in the Random field in the TLS handshake provides a way for an observer to fingerprint clients. Despite the late date, much of the world is still not synchronized to the second via an ntp-like service. This means that different clients have different views of the current time, which provides a fingerprint that helps to track and distinguish them. This fingerprint is useful for tracking clients as they move around. It can also distinguish clients using a single VPN, NAT, or privacy network. (Tor's modified firefox avoids this by not sending the time.) Worse, some implementations don't send the current time, but the process time, or the computer's uptime, both of which are far more distinguishing than the current time() value. The information fingerprint here is strong enough to uniquely identify some TLS users (the ones whose clocks are hours off). Even for the ones whose clocks are mostly right (within a second or two), the field leaks a bit of information, and it only takes so many bits to make a user unique. WHY gmt_unix_time IN THE FIRST PLACE? According to third-hand reports -- (and correct me if I'm wrong!) it was introduced in SSL 3.0 to prevent complete failure in cases where the PRNG was completely broken, by making a part of the Random field that would definitely vary between TLS handshakes. I doubt that this goal is really achieved: on modern desktop environments, it's not really so strange to start two TLS connections within the same second. WHY ELSE IS gmt_unix_time USED? The consensus among implementors seems to be that it's unwise to depend on any particular value or interpretation for the field. The TLS 1.2 standard, RFC 5246, says that "Clocks are not required to be set correctly by the basic TLS protocol; higher-level or application protocols may define additional requirements." 
Some implementations set the entire field randomly; this appears not to have broken TLS on the internet.

At least one tool (tlsdate) uses the server-side value of the field as an authenticated view of the current time.

PROPOSAL 1:

Declare that implementations MAY replace gmt_unix_time either with four more random bytes, or four bytes of zeroes.

Make your implementation just do that.

(Rationale: some implementations (like TorBrowser) are already doing this in practice. It's sensible and simple. You're unlikely to mess it up, or cause trouble.)

PROPOSAL 2:

Okay, if you really want to preserve the security allegedly provided by gmt_unix_time, allow the following approach instead:

Set the Random field, not to 32 bytes from your PRNG, but to the HMAC-SHA256 of any high resolution timer that you have, using 32 bytes from your PRNG as a key. In other words, replace this:

    Random.gmt_unix_time = time();
    Random.random_bytes = get_random_bytes(28)

with this:

    now = hires_time(); // clock_gettime(), or concatenate time()
                        // with a CPU timer, or process
                        // uptime, or whatever.
    key = get_random_bytes(32);
    Random = hmac_sha256(key, now);

This approach is better than the status quo on the following counts:

   * It doesn't leak your view of the current time, assuming that your PRNG isn't busted.

   * It actually fixes the problem that gmt_unix_time purported to fix, by using a high-resolution time that's much less likely to be used twice. Even if the PRNG is broken, the value is still nonrepeating.

It is not worse than the status quo:

   * It is unpredictable from an attacker's POV, assuming that the PRNG works. (Because an HMAC, even of known data, with an unknown random key is supposed to look random).

CONSIDERATIONS:

I'd personally suggest proposal 1 (just set the field at random) for most users. Yes, it makes things a little worse if your PRNG can generate repeat values... but nearly everything in cryptography fails if your PRNG is broken.
You might want to apply this fix on clients only. With a few exceptions (like hidden services) the server's view of the current time is not sensitive. Implementors might want to make this feature optional and on-by-default, just in case some higher-level application protocol really does depend on it. ==================================================================
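For concreteness, here is one way Proposals 1 and 2 from Appendix A might look in Python using only the standard library. This is a sketch, not code from any TLS implementation: the function names are hypothetical, `time.time_ns()` stands in for the appendix's unspecified `hires_time()`, and encoding the time as 8 big-endian bytes is an arbitrary choice made for this example.

```python
import hashlib
import hmac
import os
import time


def random_field_proposal_1():
    """Proposal 1: fill all 32 bytes of Random from the PRNG."""
    return os.urandom(32)


def random_field_proposal_2(key=None, now_ns=None):
    """Proposal 2: HMAC-SHA256 of a high-resolution time under a random key.

    `key` and `now_ns` are injectable only so the function can be tested;
    normal callers leave both as None.
    """
    if key is None:
        key = os.urandom(32)          # stands in for get_random_bytes(32)
    if now_ns is None:
        now_ns = time.time_ns()       # stands in for hires_time()
    now = now_ns.to_bytes(8, "big")   # encoding chosen for this sketch
    return hmac.new(key, now, hashlib.sha256).digest()
```

Either function yields a 32-byte value that can replace the whole Random structure. The second still varies from call to call when the key repeats, provided the timer advances between calls, which is the property gmt_unix_time was meant to supply.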
Filename: 223-ace-handshake.txt
Title: Ace: Improved circuit-creation key exchange
Author: Esfandiar Mohammadi, Aniket Kate, Michael Backes
Created: 22-July-2013
Status: Reserve

History:

   22-July-2013 -- Submitted
   20-Nov-2013 -- Reformatted slightly, wrapped lines, added references, adjusted the KDF [nickm]
   20-Nov-2013 -- Clarified that there's only one group here [nickm]

Summary:

This is an attempt to translate the proposed circuit handshake from "Ace: An Efficient Key-Exchange Protocol for Onion Routing" by Backes, Kate, and Mohammadi into a Tor proposal format.

The specification assumes an implementation of scalar multiplication and addition of two curve elements, as in Robert Ransom's celator library.

Notation:

Let a|b be the concatenation of a with b.

Let H(x,t) be a tweakable hash function of output width H_LENGTH bytes.

Let t_mac, t_key, and t_verify be a set of arbitrarily-chosen tweaks for the hash function.

Let EXP(a,b) be a^b in some appropriate group G where the appropriate DH parameters hold. Let's say elements of this group, when represented as byte strings, are all G_LENGTH bytes long. Let's say we are using a generator g for this group.

Let MULTIEXPONEN(a,b,c,d) be (a^b)*(c^d) in the same group G as above.

Let PROTOID be a string designating this variant of the protocol.

Let KEYID be a collision-resistant (but not necessarily preimage-resistant) hash function on members of G, of output length H_LENGTH bytes.

Instantiation:

Let's call this PROTOID "ace-curve25519-ed-uncompressed-sha256-1"

Set H(x,t) == HMAC_SHA256 with message x and key t. So H_LENGTH == 32.

Set t_mac == PROTOID | ":mac"
    t_key == PROTOID | ":key"
    t_verify == PROTOID | ":verify"

Set EXP(a,b) == scalar_mult_curve25519(a,b), MULTIEXPONEN(a,b,c,d) == dblscalarmult_curve25519(a,b,c,d), and g == 9.

Set KEYID(B) == B. (We don't need to use a hash function here, since our keys are already very short. It is trivially collision-resistant, since KEYID(A)==KEYID(B) iff A==B.)

Protocol:

Take a router with identity key digest ID.

As setup, the router generates a secret key b, and a public onion key B = EXP(g,b). The router publishes B in its server descriptor.

To send a create cell, the client generates two keypairs of x_1, X_1=EXP(g,x_1) and x_2, X_2=EXP(g,x_2) and sends a CREATE cell with contents:

    NODEID:     ID             -- H_LENGTH bytes
    KEYID:      KEYID(B)       -- H_LENGTH bytes
    CLIENT_PK:  X_1, X_2       -- 2 x G_LENGTH bytes

The server checks X_1, X_2, generates a keypair of y, Y=EXP(g,y) and computes

    point = MULTIEXPONEN(X_1,y,X_2,b)
    secret_input = point | ID | B | X_1 | X_2 | Y | PROTOID
    KEY_SEED = H(secret_input | "Key Seed", t_key)
    KEY_VERIFY = H(secret_input | "HMac Seed", t_verify)
    auth_input = ID | B | Y | X_1 | X_2 | PROTOID | "Server"

The server sends a CREATED cell containing:

    SERVER_PK:  Y                           -- G_LENGTH bytes
    AUTH:       H(auth_input, KEY_VERIFY)   -- H_LENGTH bytes

The client then checks Y, and computes

    point = MULTIEXPONEN(Y,x_1,B,x_2)
    secret_input = point | ID | B | X_1 | X_2 | Y | PROTOID
    KEY_SEED = H(secret_input | "Key Seed", t_key)
    KEY_VERIFY = H(secret_input | "HMac Seed", t_verify)
    auth_input = ID | B | Y | X_1 | X_2 | PROTOID | "Server"

The client verifies that AUTH == H(auth_input, KEY_VERIFY).

Both parties now have a shared value for KEY_SEED. They expand this into the keys needed for the Tor relay protocol.

Key expansion:

When using this handshake, clients and servers should expand keys using HKDF as with the ntor handshake today.

See also:

http://www.infsec.cs.uni-saarland.de/~mohammadi/ace/ace.html for implementations, academic paper, and benchmarking code.
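To see why both sides arrive at the same KEY_SEED, note that the server computes (X_1^y)*(X_2^b) = g^(x_1*y + x_2*b), and the client computes (Y^x_1)*(B^x_2), which is the same element. The Python sketch below walks through that algebra in a toy group, the integers modulo a prime, rather than on curve25519. That substitution is an assumption made purely so the example runs with the standard library; it is not secure and is not the proposal's instantiation. The PROTOID, the 16-byte element encoding, and all function names are likewise invented for the demo, and the validity checks on X_1, X_2, and Y are omitted.

```python
import hashlib
import hmac
import secrets

# Toy stand-in group: integers modulo a Mersenne prime. NOT secure and NOT
# the curve25519 group the proposal specifies; used only to show the algebra.
P = 2**127 - 1
G = 3
G_LENGTH = 16                             # bytes needed for an element mod P
PROTOID = b"ace-toy-modp-sha256-demo"     # hypothetical demo identifier


def H(x, t):
    """Tweakable hash: HMAC-SHA256 with message x and key t."""
    return hmac.new(t, x, hashlib.sha256).digest()


def multiexp(a, b, c, d):
    """(a^b)*(c^d) in the toy group."""
    return (pow(a, b, P) * pow(c, d, P)) % P


def enc(element):
    """Fixed-width big-endian encoding of a group element."""
    return element.to_bytes(G_LENGTH, "big")


def keygen():
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)


def derive(point, ID, B, X1, X2, Y):
    """Compute (KEY_SEED, AUTH) from the shared point, as both sides do."""
    secret_input = enc(point) + ID + enc(B) + enc(X1) + enc(X2) + enc(Y) + PROTOID
    key_seed = H(secret_input + b"Key Seed", PROTOID + b":key")
    key_verify = H(secret_input + b"HMac Seed", PROTOID + b":verify")
    auth_input = ID + enc(B) + enc(Y) + enc(X1) + enc(X2) + PROTOID + b"Server"
    return key_seed, H(auth_input, key_verify)


def run_handshake():
    ID = hashlib.sha256(b"example router").digest()
    b, B = keygen()        # router onion key
    x1, X1 = keygen()      # client ephemeral keys
    x2, X2 = keygen()
    y, Y = keygen()        # server ephemeral key
    server = derive(multiexp(X1, y, X2, b), ID, B, X1, X2, Y)
    client = derive(multiexp(Y, x1, B, x2), ID, B, X1, X2, Y)
    return server, client
```

Calling `run_handshake()` returns two identical (KEY_SEED, AUTH) pairs, so the client's check that AUTH == H(auth_input, KEY_VERIFY) succeeds exactly when both sides derived the same point.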
Filename: 224-rend-spec-ng.txt Title: Next-Generation Hidden Services in Tor Author: David Goulet, George Kadianakis, Nick Mathewson Created: 2013-11-29 Status: Closed Implemented-In: 0.3.2.1-alpha Table of contents: 0. Hidden services: overview and preliminaries. 0.1. Improvements over previous versions. 0.2. Notation and vocabulary 0.3. Cryptographic building blocks 0.4. Protocol building blocks [BUILDING-BLOCKS] 0.5. Assigned relay cell types 0.6. Acknowledgments 1. Protocol overview 1.1. View from 10,000 feet 1.2. In more detail: naming hidden services [NAMING] 1.3. In more detail: Access control [IMD:AC] 1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST] 1.5. In more detail: Scaling to multiple hosts 1.6. In more detail: Backward compatibility with older hidden service 1.7. In more detail: Keeping crypto keys offline 1.8. In more detail: Encryption Keys And Replay Resistance 1.9. In more detail: A menagerie of keys 1.9.1. In even more detail: Client authorization [CLIENT-AUTH] 2. Generating and publishing hidden service descriptors [HSDIR] 2.1. Deriving blinded keys and subcredentials [SUBCRED] 2.2. Locating, uploading, and downloading hidden service descriptors 2.2.1. Dividing time into periods [TIME-PERIODS] 2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC] 2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC] 2.2.4. Using time periods and SRVs to fetch/upload HS descriptors 2.2.5. Expiring hidden service descriptors [EXPIRE-DESC] 2.2.6. URLs for anonymous uploading and downloading 2.3. Publishing shared random values [PUB-SHAREDRANDOM] 2.3.1. Client behavior in the absense of shared random values 2.3.2. Hidden services and changing shared random values 2.4. Hidden service descriptors: outer wrapper [DESC-OUTER] 2.5. Hidden service descriptors: encryption format [HS-DESC-ENC] 2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER] 2.5.1.1. First layer encryption logic 2.5.1.2. 
First layer plaintext format 2.5.1.3. Client behavior 2.5.1.4. Obfuscating the number of authorized clients 2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER] 2.5.2.1. Second layer encryption keys 2.5.2.2. Second layer plaintext format 2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS] 3. The introduction protocol [INTRO-PROTOCOL] 3.1. Registering an introduction point [REG_INTRO_POINT] 3.1.1. Extensible ESTABLISH_INTRO protocol. [EST_INTRO] 3.1.2. Registering an introduction point on a legacy Tor node [LEGACY_EST_INTRO] 3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED] 3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1] 3.2.1. INTRODUCE1 cell format [FMT_INTRO1] 3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK] 3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2] 3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS] 3.3.2. Example encryption handshake: ntor with extra data [NTOR-WITH-EXTRA-DATA] 3.4. Authentication during the introduction phase. [INTRO-AUTH] 3.4.1. Ed25519-based authentication. 4. The rendezvous protocol 4.1. Establishing a rendezvous point [EST_REND_POINT] 4.2. Joining to a rendezvous point [JOIN_REND] 4.2.1. Key expansion 4.3. Using legacy hosts as rendezvous points 5. Encrypting data between client and host 6. Encoding onion addresses [ONIONADDRESS] 7. Open Questions: -1. Draft notes This document describes a proposed design and specification for hidden services in Tor version 0.2.5.x or later. It's a replacement for the current rend-spec.txt, rewritten for clarity and for improved design. Look for the string "TODO" below: it describes gaps or uncertainties in the design. Change history: 2013-11-29: Proposal first numbered. Some TODO and XXX items remain. 2014-01-04: Clarify some unclear sections. 2014-01-21: Fix a typo. 2014-02-20: Move more things to the revised certificate format in the new updated proposal 220. 
2015-05-26: Fix two typos. 0. Hidden services: overview and preliminaries. Hidden services aim to provide responder anonymity for bidirectional stream-based communication on the Tor network. Unlike regular Tor connections, where the connection initiator receives anonymity but the responder does not, hidden services attempt to provide bidirectional anonymity. Participants: Operator -- A person running a hidden service Host, "Server" -- The Tor software run by the operator to provide a hidden service. User -- A person contacting a hidden service. Client -- The Tor software running on the User's computer Hidden Service Directory (HSDir) -- A Tor node that hosts signed statements from hidden service hosts so that users can make contact with them. Introduction Point -- A Tor node that accepts connection requests for hidden services and anonymously relays those requests to the hidden service. Rendezvous Point -- A Tor node to which clients and servers connect and which relays traffic between them. 0.1. Improvements over previous versions. Here is a list of improvements of this proposal over the legacy hidden services: a) Better crypto (replaced SHA1/DH/RSA1024 with SHA3/ed25519/curve25519) b) Improved directory protocol leaking less to directory servers. c) Improved directory protocol with smaller surface for targeted attacks. d) Better onion address security against impersonation. e) More extensible introduction/rendezvous protocol. f) Offline keys for onion services g) Advanced client authorization 0.2. Notation and vocabulary Unless specified otherwise, all multi-octet integers are big-endian. We write sequences of bytes in two ways: 1. A sequence of two-digit hexadecimal values in square brackets, as in [AB AD 1D EA]. 2. A string of characters enclosed in quotes, as in "Hello". The characters in these strings are encoded in their ascii representations; strings are NOT nul-terminated unless explicitly described as NUL terminated. 
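The two byte-sequence notations and the big-endian convention map directly onto byte strings. A small Python illustration follows; the helper names are hypothetical and exist only for this example.

```python
def from_bracketed(text):
    """Parse the bracketed hex notation, e.g. "[AB AD 1D EA]", into bytes."""
    return bytes.fromhex(text.strip("[]"))


def from_quoted(text):
    """Encode a quoted string as ASCII, with no terminating NUL byte."""
    return text.encode("ascii")


# Multi-octet integers are big-endian: most significant byte first.
leet = (1337).to_bytes(4, "big")
```

Here `from_quoted("Hello")` is five bytes long, not six, since nothing is appended unless the text explicitly calls for NUL termination.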
We use the words "byte" and "octet" interchangeably. We use the vertical bar | to denote concatenation. We use INT_N(val) to denote the network (big-endian) encoding of the unsigned integer "val" in N bytes. For example, INT_4(1337) is [00 00 05 39]. Values are truncated like so: val % (2 ^ (N * 8)). For example, INT_4(42) is 42 % 4294967296 (32 bit). 0.3. Cryptographic building blocks This specification uses the following cryptographic building blocks: * A pseudorandom number generator backed by a strong entropy source. The output of the PRNG should always be hashed before being posted on the network to avoid leaking raw PRNG bytes to the network (see [PRNG-REFS]). * A stream cipher STREAM(iv, k) where iv is a nonce of length S_IV_LEN bytes and k is a key of length S_KEY_LEN bytes. * A public key signature system SIGN_KEYGEN()->seckey, pubkey; SIGN_SIGN(seckey,msg)->sig; and SIGN_CHECK(pubkey, sig, msg) -> { "OK", "BAD" }; where secret keys are of length SIGN_SECKEY_LEN bytes, public keys are of length SIGN_PUBKEY_LEN bytes, and signatures are of length SIGN_SIG_LEN bytes. This signature system must also support key blinding operations as discussed in appendix [KEYBLIND] and in section [SUBCRED]: SIGN_BLIND_SECKEY(seckey, blind)->seckey2 and SIGN_BLIND_PUBKEY(pubkey, blind)->pubkey2 . * A public key agreement system "PK", providing PK_KEYGEN()->seckey, pubkey; PK_VALID(pubkey) -> {"OK", "BAD"}; and PK_HANDSHAKE(seckey, pubkey)->output; where secret keys are of length PK_SECKEY_LEN bytes, public keys are of length PK_PUBKEY_LEN bytes, and the handshake produces outputs of length PK_OUTPUT_LEN bytes. * A cryptographic hash function H(d), which should be preimage and collision resistant. It produces hashes of length HASH_LEN bytes. * A cryptographic message authentication code MAC(key,msg) that produces outputs of length MAC_LEN bytes. * A key derivation function KDF(message, n) that outputs n bytes. As a first pass, I suggest: * Instantiate STREAM with AES256-CTR. 
* Instantiate SIGN with Ed25519 and the blinding protocol in [KEYBLIND]. * Instantiate PK with Curve25519. * Instantiate H with SHA3-256. * Instantiate KDF with SHAKE-256. * Instantiate MAC(key=k, message=m) with H(k_len | k | m), where k_len is htonll(len(k)). For legacy purposes, we specify compatibility with older versions of the Tor introduction point and rendezvous point protocols. These used RSA1024, DH1024, AES128, and SHA1, as discussed in rend-spec.txt. As in [proposal 220], all signatures are generated not over strings themselves, but over those strings prefixed with a distinguishing value. 0.4. Protocol building blocks [BUILDING-BLOCKS] In sections below, we need to transmit the locations and identities of Tor nodes. We do so in the link identification format used by EXTEND2 cells in the Tor protocol. NSPEC (Number of link specifiers) [1 byte] NSPEC times: LSTYPE (Link specifier type) [1 byte] LSLEN (Link specifier length) [1 byte] LSPEC (Link specifier) [LSLEN bytes] Link specifier types are as described in tor-spec.txt. Every set of link specifiers MUST include at minimum specifiers of type [00] (TLS-over-TCP, IPv4), [02] (legacy node identity) and [03] (ed25519 identity key). We also incorporate Tor's circuit extension handshakes, as used in the CREATE2 and CREATED2 cells described in tor-spec.txt. In these handshakes, a client who knows a public key for a server sends a message and receives a message from that server. Once the exchange is done, the two parties have a shared set of forward-secure key material, and the client knows that nobody else shares that key material unless they control the secret key corresponding to the server's public key. 0.5. Assigned relay cell types These relay cell types are reserved for use in the hidden service protocol. 32 -- RELAY_COMMAND_ESTABLISH_INTRO Sent from hidden service host to introduction point; establishes introduction point. Discussed in [REG_INTRO_POINT]. 
33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS Sent from client to rendezvous point; creates rendezvous point. Discussed in [EST_REND_POINT]. 34 -- RELAY_COMMAND_INTRODUCE1 Sent from client to introduction point; requests introduction. Discussed in [SEND_INTRO1] 35 -- RELAY_COMMAND_INTRODUCE2 Sent from introduction point to hidden service host; requests introduction. Same format as INTRODUCE1. Discussed in [FMT_INTRO1] and [PROCESS_INTRO2] 36 -- RELAY_COMMAND_RENDEZVOUS1 Sent from hidden service host to rendezvous point; attempts to join host's circuit to client's circuit. Discussed in [JOIN_REND] 37 -- RELAY_COMMAND_RENDEZVOUS2 Sent from rendezvous point to client; reports join of host's circuit to client's circuit. Discussed in [JOIN_REND] 38 -- RELAY_COMMAND_INTRO_ESTABLISHED Sent from introduction point to hidden service host; reports status of attempt to establish introduction point. Discussed in [INTRO_ESTABLISHED] 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED Sent from rendezvous point to client; acknowledges receipt of ESTABLISH_RENDEZVOUS cell. Discussed in [EST_REND_POINT] 40 -- RELAY_COMMAND_INTRODUCE_ACK Sent from introduction point to client; acknowledges receipt of INTRODUCE1 cell and reports success/failure. Discussed in [INTRO_ACK] 0.6. Acknowledgments This design includes ideas from many people, including Christopher Baines, Daniel J. 
Bernstein, Matthew Finkel, Ian Goldberg, George Kadianakis, Aniket Kate, Tanja Lange, Robert Ransom, Roger Dingledine, Aaron Johnson, Tim Wilson-Brown ("teor"), special (John Brooks), and s7r.

It's based on Tor's original hidden service design by Roger Dingledine, Nick Mathewson, and Paul Syverson, and on improvements to that design over the years by people including Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Alessandro Preite Martinez, Robert Ransom, Ferdinand Rieger, Christoph Weingarten, and Christian Wilms.

We wouldn't be able to do any of this work without good attack designs from researchers including Alex Biryukov, Lasse Øverlier, Ivan Pustogarov, Paul Syverson, and Ralf-Philipp Weinmann. See [ATTACK-REFS] for their papers.

Several of these ideas have come from conversations with Christian Grothoff, Brian Warner, and Zooko Wilcox-O'Hearn.

And if this document makes any sense at all, it's thanks to editing help from Matthew Finkel, George Kadianakis, Peter Palfrader, and Tim Wilson-Brown ("teor").

[XXX Acknowledge the huge bunch of people working on 8106.]
[XXX Acknowledge the huge bunch of people working on 8244.]

Please forgive me if I've missed you; please forgive me if I've misunderstood your best ideas here too.

1. Protocol overview

In this section, we outline the hidden service protocol. This section omits some details in the name of simplicity; those are given more fully below, when we specify the protocol in more detail.

1.1. View from 10,000 feet

A hidden service host prepares to offer a hidden service by choosing several Tor nodes to serve as its introduction points. It builds circuits to those nodes, and tells them to forward introduction requests to it using those circuits.

Once introduction points have been picked, the host builds a set of documents called "hidden service descriptors" (or just "descriptors" for short) and uploads them to a set of HSDir nodes.
These documents list the hidden service's current introduction points and describe how to make contact with the hidden service. When a client wants to connect to a hidden service, it first chooses a Tor node at random to be its "rendezvous point" and builds a circuit to that rendezvous point. If the client does not have an up-to-date descriptor for the service, it contacts an appropriate HSDir and requests such a descriptor. The client then builds an anonymous circuit to one of the hidden service's introduction points listed in its descriptor, and gives the introduction point an introduction request to pass to the hidden service. This introduction request includes the target rendezvous point and the first part of a cryptographic handshake. Upon receiving the introduction request, the hidden service host makes an anonymous circuit to the rendezvous point and completes the cryptographic handshake. The rendezvous point connects the two circuits, and the cryptographic handshake gives the two parties a shared key and proves to the client that it is indeed talking to the hidden service. Once the two circuits are joined, the client can send Tor RELAY cells to the server. RELAY_BEGIN cells open streams to an external process or processes configured by the server; RELAY_DATA cells are used to communicate data on those streams, and so forth. 1.2. In more detail: naming hidden services [NAMING] A hidden service's name is its long term master identity key. This is encoded as a hostname by encoding the entire key in Base 32, including a version byte and a checksum, and then appending the string ".onion" at the end. The result is a 56-character domain name. (This is a change from older versions of the hidden service protocol, where we used an 80-bit truncated SHA1 hash of a 1024 bit RSA key.) The names in this format are distinct from earlier names because of their length. 
An older name might look like:

    unlikelynamefora.onion
    yyhws9optuwiwsns.onion

And a new name following this specification might look like:

    l5satjgud6gucryazcyvyvhuxhr74u6ygigiuyixe3a6ysis67ororad.onion

Please see section [ONIONADDRESS] for the encoding specification.

1.3. In more detail: Access control [IMD:AC]

Access control for a hidden service is imposed at multiple points through the process above. Furthermore, there is also the option to impose additional client authorization access control using pre-shared secrets exchanged out-of-band between the hidden service and its clients.

The first stage of access control happens when downloading HS descriptors. Specifically, in order to download a descriptor, clients must know which blinded signing key was used to sign it. (See the next section for more info on key blinding.)

To learn the introduction points, clients must decrypt the body of the hidden service descriptor. To do so, clients must know the _unblinded_ public key of the service, which makes the descriptor unusable by entities without that knowledge (e.g. HSDirs that don't know the onion address).

Also, if optional client authorization is enabled, hidden service descriptors are superencrypted using each authorized user's identity x25519 key, to further ensure that unauthorized entities cannot decrypt it.

In order to make the introduction point send a rendezvous request to the service, the client needs to use the per-introduction-point authentication key found in the hidden service descriptor.

The final level of access control happens at the server itself, which may decide to respond or not respond to the client's request depending on the contents of the request. The protocol is extensible at this point: at a minimum, the server requires that the client demonstrate knowledge of the contents of the encrypted portion of the hidden service descriptor.
If optional client authorization is enabled, the service may additionally require the client to prove knowledge of a pre-shared private key. 1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST] Periodically, hidden service descriptors become stored at different locations to prevent a single directory or small set of directories from becoming a good DoS target for removing a hidden service. For each period, the Tor directory authorities agree upon a collaboratively generated random value. (See section 2.3 for a description of how to incorporate this value into the voting practice; generating the value is described in other proposals, including [SHAREDRANDOM-REFS].) That value, combined with hidden service directories' public identity keys, determines each HSDir's position in the hash ring for descriptors made in that period. Each hidden service's descriptors are placed into the ring in positions based on the key that was used to sign them. Note that hidden service descriptors are not signed with the services' public keys directly. Instead, we use a key-blinding system [KEYBLIND] to create a new key-of-the-day for each hidden service. Any client that knows the hidden service's credential can derive these blinded signing keys for a given period. It should be impossible to derive the blinded signing key lacking that credential. The body of each descriptor is also encrypted with a key derived from the credential. To avoid a "thundering herd" problem where every service generates and uploads a new descriptor at the start of each period, each descriptor comes online at a time during the period that depends on its blinded signing key. The keys for the last period remain valid until the new keys come online. 1.5. In more detail: Scaling to multiple hosts This design is compatible with our current approaches for scaling hidden services. 
Specifically, hidden service operators can use onionbalance to achieve high availability between multiple nodes on the HSDir layer. Furthermore, operators can use proposal 255 to load balance their hidden services on the introduction layer. See [SCALING-REFS] for further discussions on this topic and alternative designs. 1.6. In more detail: Backward compatibility with older hidden service protocols This design is incompatible with the clients, server, and hsdir node protocols from older versions of the hidden service protocol as described in rend-spec.txt. On the other hand, it is designed to enable the use of older Tor nodes as rendezvous points and introduction points. 1.7. In more detail: Keeping crypto keys offline In this design, a hidden service's secret identity key may be stored offline. It's used only to generate blinded signing keys, which are used to sign descriptor signing keys. In order to operate a hidden service, the operator can generate in advance a number of blinded signing keys and descriptor signing keys (and their credentials; see [DESC-OUTER] and [HS-DESC-ENC] below), and their corresponding descriptor encryption keys, and export those to the hidden service hosts. As a result, in the scenario where the Hidden Service gets compromised, the adversary can only impersonate it for a limited period of time (depending on how many signing keys were generated in advance). It's important to not send the private part of the blinded signing key to the Hidden Service since an attacker can derive from it the secret master identity key. The secret blinded signing key should only be used to create credentials for the descriptor signing keys. 1.8. In more detail: Encryption Keys And Replay Resistance To avoid replays of an introduction request by an introduction point, a hidden service host must never accept the same request twice. 
Earlier versions of the hidden service design used an authenticated timestamp here, but including a view of the current time can create a problematic fingerprint. (See proposal 222 for more discussion.) 1.9. In more detail: A menagerie of keys [In the text below, an "encryption keypair" is roughly "a keypair you can do Diffie-Hellman with" and a "signing keypair" is roughly "a keypair you can do ECDSA with."] Public/private keypairs defined in this document: Master (hidden service) identity key -- A master signing keypair used as the identity for a hidden service. This key is long term and not used on its own to sign anything; it is only used to generate blinded signing keys as described in [KEYBLIND] and [SUBCRED]. The public key is encoded in the ".onion" address according to [NAMING]. Blinded signing key -- A keypair derived from the identity key, used to sign descriptor signing keys. It changes periodically for each service. Clients who know a 'credential' consisting of the service's public identity key and an optional secret can derive the public blinded identity key for a service. This key is used as an index in the DHT-like structure of the directory system (see [SUBCRED]). Descriptor signing key -- A key used to sign hidden service descriptors. This is signed by blinded signing keys. Unlike blinded signing keys and master identity keys, the secret part of this key must be stored online by hidden service hosts. The public part of this key is included in the unencrypted section of HS descriptors (see [DESC-OUTER]). Introduction point authentication key -- A short-term signing keypair used to identify a hidden service to a given introduction point. A fresh keypair is made for each introduction point; these are used to sign the request that a hidden service host makes when establishing an introduction point, so that clients who know the public component of this key can get their introduction requests sent to the right service. 
No keypair is ever used with more than one introduction point. (Previously called a "service key" in rend-spec.txt.)

      Introduction point encryption key -- A short-term encryption keypair used when establishing connections via an introduction point. Plays a role analogous to Tor nodes' onion keys. A fresh keypair is made for each introduction point.

Symmetric keys defined in this document:

      Descriptor encryption keys -- A symmetric encryption key used to encrypt the body of hidden service descriptors. Derived from the current period and the hidden service credential.

Public/private keypairs defined elsewhere:

      Onion key -- Short-term encryption keypair

      (Node) identity key

Symmetric key-like things defined elsewhere:

      KH from circuit handshake -- An unpredictable value derived as part of the Tor circuit extension handshake, used to tie a request to a particular circuit.

1.9.1. In even more detail: Client authorization keys [CLIENT-AUTH]

When client authorization is enabled, each authorized client of a hidden service has two more asymmetric keypairs which are shared with the hidden service. An entity without those keys is not able to use the hidden service. Throughout this document, we assume that these pre-shared keys are exchanged between the hidden service and its clients in a secure out-of-band fashion.

Specifically, each authorized client possesses:

   - An x25519 keypair used to compute decryption keys that allow the client to decrypt the hidden service descriptor. See [HS-DESC-ENC].

   - An ed25519 keypair which allows the client to compute signatures which prove to the hidden service that the client is authorized. These signatures are inserted into the INTRODUCE1 cell, and without them the introduction to the hidden service cannot be completed. See [INTRO-AUTH].

The right way to exchange these keys is to have the client generate keys and send the corresponding public keys to the hidden service out-of-band.
An easier but less secure way of doing this exchange would be to have the hidden service generate the keypairs and pass the corresponding private keys to its clients. See section [CLIENT-AUTH-MGMT] for more details on how these keys should be managed. [TODO: Also specify stealth client authorization.] 2. Generating and publishing hidden service descriptors [HSDIR] Hidden service descriptors follow the same metaformat as other Tor directory objects. They are published anonymously to Tor servers with the HSDir flag, HSDir=2 protocol version and tor version >= 0.3.0.8 (because a bug was fixed in this version). 2.1. Deriving blinded keys and subcredentials [SUBCRED] In each time period (see [TIME-PERIODS] for a definition of time periods), a hidden service host uses a different blinded private key to sign its directory information, and clients use a different blinded public key as the index for fetching that information. For a candidate for a key derivation method, see Appendix [KEYBLIND]. Additionally, clients and hosts derive a subcredential for each period. Knowledge of the subcredential is needed to decrypt hidden service descriptors for each period and to authenticate with the hidden service host in the introduction process. Unlike the credential, it changes each period. Knowing the subcredential, even in combination with the blinded private key, does not enable the hidden service host to derive the main credential--therefore, it is safe to put the subcredential on the hidden service host while leaving the hidden service's private key offline. The subcredential for a period is derived as: subcredential = H("subcredential" | credential | blinded-public-key). In the above formula, credential corresponds to: credential = H("credential" | public-identity-key) where public-identity-key is the public identity master key of the hidden service. 2.2. 
Locating, uploading, and downloading hidden service descriptors [HASHRING]

To avoid attacks where a hidden service's descriptor is easily targeted for censorship, we store them at different directories over time, and use shared random values to prevent those directories from being predictable far in advance.

Which Tor servers host a hidden service's descriptor depends on:

   * the current time period,
   * the daily subcredential,
   * the hidden service directories' public keys,
   * a shared random value that changes in each time period,
   * a set of network-wide networkstatus consensus parameters. (Consensus parameters are integer values voted on by authorities and published in the consensus documents, described in dir-spec.txt, section 3.3.)

Below we explain in more detail.

2.2.1. Dividing time into periods [TIME-PERIODS]

To prevent a single set of hidden service directories from becoming a target for adversaries looking to permanently censor a hidden service, hidden service descriptors are uploaded to different locations that change over time.

The length of a "time period" is controlled by the consensus parameter 'hsdir-interval', and is a number of minutes between 30 and 14400 (10 days). The default time period length is 1440 (one day).

Time periods start at the Unix epoch (Jan 1, 1970), and are computed by taking the number of minutes since the epoch and dividing by the time period length. However, we want our time periods to start at 12:00UTC every day, so we subtract a "rotation time offset" of 12*60 minutes from the number of minutes since the epoch, before dividing by the time period length (effectively making "our" epoch start at Jan 1, 1970 12:00UTC).

Example: If the current time is 2016-04-13 11:15:01 UTC, then the number of seconds since the epoch is 1460546101, and the number of minutes since the epoch is 24342435. We then subtract the "rotation time offset" of 12*60 minutes from the minutes since the epoch, to get 24341715.
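The arithmetic of this example, carried through to the time period number, can be sketched as follows. This is only an illustration in Python: the function and constant names are ours and do not appear in the specification, and integer (floor) division is assumed at each step. For the timestamp used in the example it yields time period 16903.

```python
# Hypothetical helper names; only the arithmetic comes from the text above.
ROTATION_OFFSET_MINUTES = 12 * 60  # the "rotation time offset"


def time_period_num(unix_seconds, period_length_minutes=1440):
    """Return the number of the time period containing the given Unix time."""
    minutes_since_epoch = unix_seconds // 60
    return (minutes_since_epoch - ROTATION_OFFSET_MINUTES) // period_length_minutes


def time_period_start(period_num, period_length_minutes=1440):
    """Return the Unix time (in seconds) at which a time period begins."""
    return (period_num * period_length_minutes + ROTATION_OFFSET_MINUTES) * 60


now = 1460546101  # 2016-04-13 11:15:01 UTC
period = time_period_num(now)
print(period, time_period_start(period))
```

In practice the input would be the valid-after time of a live consensus rather than the local clock, and the period length would come from the 'hsdir-interval' consensus parameter.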
If the current time period length is 1440 minutes, by doing the division we see that we are currently in time period number 16903.

Specifically, time period #16903 began 16903*1440*60 + (12*60*60) seconds after the epoch, at 2016-04-12 12:00 UTC, and ended at 16904*1440*60 + (12*60*60) seconds after the epoch, at 2016-04-13 12:00 UTC.

2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC]

Hidden services periodically publish their descriptor to the responsible HSDirs. The set of responsible HSDirs is determined as specified in [WHERE-HSDESC].

Specifically, every time a hidden service publishes its descriptor, it also sets up a timer for a random time between 60 minutes and 120 minutes in the future. When the timer triggers, the hidden service needs to publish its descriptor again to the responsible HSDirs for that time period.

[TODO: Control republish period using a consensus parameter?]

2.2.2.1. Overlapping descriptors

Hidden services need to upload multiple descriptors so that they can be reachable by clients with older or newer consensuses than them. Services need to upload their descriptors to the HSDirs _before_ the beginning of each upcoming time period, so that they are readily available for clients to fetch them. Furthermore, services should keep uploading their old descriptor even after the end of a time period, so that they can be reachable by clients that still have consensuses from the previous time period.

Hence, services maintain two active descriptors at every point. Clients, on the other hand, don't have a notion of overlapping descriptors, and instead always download the descriptor for the current time period and shared random value. It's the job of the service to ensure that descriptors will be available for all clients. See section [FETCHUPLOADDESC] for how this is achieved.

[TODO: What to do when we run multiple hidden services in a single host?]

2.2.3.
Where to publish a hidden service descriptor [WHERE-HSDESC]

This section specifies how the HSDir hash ring is formed at any given time. Whenever a time value is needed (e.g. to get the current time period number), we assume that clients and services use the valid-after time from their latest live consensus.

The following consensus parameters control where a hidden service descriptor is stored:

      hsdir_n_replicas = an integer in range [1,16] with default value 2.
      hsdir_spread_fetch = an integer in range [1,128] with default value 3.
      hsdir_spread_store = an integer in range [1,128] with default value 3.

To determine where a given hidden service descriptor will be stored in a given period, after the blinded public key for that period is derived, the uploading or downloading party calculates:

      for replicanum in 1...hsdir_n_replicas:
          hs_index(replicanum) = H("store-at-idx" |
                                   blinded_public_key |
                                   INT_8(replicanum) |
                                   INT_8(period_length) |
                                   INT_8(period_num) )

where blinded_public_key is specified in section [KEYBLIND], period_length is the length of the time period in minutes, and period_num is calculated using the current consensus "valid-after" as specified in section [TIME-PERIODS].

Then, for each node listed in the current consensus with the HSDirV3 flag, we compute a directory index for that node as:

      hsdir_index(node) = H("node-idx" | node_identity |
                            shared_random_value |
                            INT_8(period_num) |
                            INT_8(period_length) )

where shared_random_value is the shared value generated by the authorities in section [PUB-SHAREDRANDOM], and node_identity is the ed25519 identity key of the node.

Finally, for replicanum in 1...hsdir_n_replicas, the hidden service host uploads descriptors to the first hsdir_spread_store nodes whose indices immediately follow hs_index(replicanum). If any of those nodes have already been selected for a lower-numbered replica of the service, any nodes already chosen are disregarded (i.e. skipped over) when choosing a replica's hsdir_spread_store nodes.
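The selection procedure above can be sketched in Python as follows. This is a hedged illustration, not the reference implementation: it assumes that H is SHA3-256 and that INT_8 is an 8-byte big-endian encoding (both are defined outside this section), that indices are compared as byte strings, and that the search wraps around the end of the ring. The name responsible_hsdirs and its parameter names are ours.

```python
import hashlib
from bisect import bisect_right


def H(*parts):
    # Assumption: H is SHA3-256 over the concatenation of its inputs.
    return hashlib.sha3_256(b"".join(parts)).digest()


def INT_8(n):
    # Assumption: INT_8 is an 8-byte big-endian unsigned integer.
    return n.to_bytes(8, "big")


def hs_index(blinded_public_key, replicanum, period_length, period_num):
    return H(b"store-at-idx", blinded_public_key, INT_8(replicanum),
             INT_8(period_length), INT_8(period_num))


def hsdir_index(node_identity, shared_random_value, period_num, period_length):
    return H(b"node-idx", node_identity, shared_random_value,
             INT_8(period_num), INT_8(period_length))


def responsible_hsdirs(blinded_public_key, node_identities, shared_random_value,
                       period_num, period_length=1440, n_replicas=2, spread=3):
    """Return the node identities a service would upload to (hypothetical helper)."""
    # Place every HSDir on the ring, sorted by its directory index.
    ring = sorted((hsdir_index(node, shared_random_value, period_num,
                               period_length), node)
                  for node in node_identities)
    indices = [index for index, _ in ring]
    chosen = []
    for replicanum in range(1, n_replicas + 1):
        target = hs_index(blinded_public_key, replicanum, period_length, period_num)
        start = bisect_right(indices, target)  # first node index after hs_index
        picked = 0
        for offset in range(len(ring)):
            if picked == spread:
                break
            node = ring[(start + offset) % len(ring)][1]
            if node in chosen:
                continue  # already selected for a lower-numbered replica: skip
            chosen.append(node)
            picked += 1
    return chosen
```

Calling the function with spread set to hsdir_spread_store gives the upload targets; a client would instead use hsdir_spread_fetch and pick randomly among the nodes returned for each replica.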
When choosing an HSDir to download from, clients choose randomly from among the first hsdir_spread_fetch nodes after the indices. (Note that, in order to make the system better tolerate disappearing HSDirs, hsdir_spread_fetch may be less than hsdir_spread_store.) Again, nodes from lower-numbered replicas are disregarded when choosing the spread for a replica.

2.2.4. Using time periods and SRVs to fetch/upload HS descriptors [FETCHUPLOADDESC]

Hidden services and clients need to make correct use of time periods (TP) and shared random values (SRVs) to successfully fetch and upload descriptors. Furthermore, to avoid problems with skewed clocks, both clients and services use the 'valid-after' time of a live consensus when making decisions about uploading and fetching descriptors. By using the consensus times as the ground truth here, we minimize the desynchronization of clients and services due to system clock. Whenever time-based decisions are taken in this section, assume that they are consensus times and not system times.

As [PUB-SHAREDRANDOM] specifies, consensuses contain two shared random values (the current one and the previous one). Hidden services and clients are asked to match these shared random values with descriptor time periods and use the right SRV when fetching/uploading descriptors. This section attempts to precisely specify how this works.

Let's start with an illustration of the system:

      +------------------------------------------------------------------+
      |                                                                  |
      | 00:00      12:00       00:00       12:00       00:00       12:00 |
      | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
      |                                                                  |
      |  $==========|-----------$===========|-----------$===========|    |
      |                                                                  |
      |                                                                  |
      +------------------------------------------------------------------+

                                         Legend: [TP#1 = Time Period #1]
                                                 [SRV#1 = Shared Random Value #1]
                                                 ["$" = descriptor rotation moment]

2.2.4.1.
Client behavior for fetching descriptors [CLIENTFETCH]

And here is how clients use TPs and SRVs to fetch descriptors:

Clients always aim to synchronize their TP with SRV, so they always want to use TP#N with SRV#N. To achieve this with respect to time periods, clients always use the current time period when fetching descriptors. With respect to SRVs, if a client is in the time segment between a new time period and a new SRV (i.e. the segments drawn with "-") it uses the current SRV; otherwise, if the client is in a time segment between a new SRV and a new time period (i.e. the segments drawn with "="), it uses the previous SRV.

Example:

      +------------------------------------------------------------------+
      |                                                                  |
      | 00:00      12:00       00:00       12:00       00:00       12:00 |
      | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
      |                                                                  |
      |  $==========|-----------$===========|-----------$===========|    |
      |              ^           ^                                       |
      |              C1          C2                                      |
      +------------------------------------------------------------------+

If a client (C1) is at 13:00 right after TP#1, then it will use TP#1 and SRV#1 for fetching descriptors. Also, if a client (C2) is at 01:00 right after SRV#2, it will still use TP#1 and SRV#1.

2.2.4.2. Service behavior for uploading descriptors [SERVICEUPLOAD]

As discussed above, services maintain two active descriptors at any time. We call these the "first" and "second" service descriptors. Services rotate their descriptor every time they receive a consensus with a valid_after time past the next SRV calculation time. They rotate their descriptors by discarding their first descriptor, pushing the second descriptor to the first, and rebuilding their second descriptor with the latest data.

Services, like clients, also employ a different logic for picking SRV and TP values based on their position in the graph above. Here is the logic:

2.2.4.2.1. First descriptor upload logic [FIRSTDESCUPLOAD]

Here is the service logic for uploading its first descriptor:

When a service is in the time segment between a new time period and a new SRV (i.e.
the segments drawn with "-"), it uses the previous time period and previous SRV for uploading its first descriptor: that's meant to cover for clients that have a consensus that is still in the previous time period.

Example: Consider in the above illustration that the service is at 13:00 right after TP#1. It will upload its first descriptor using TP#0 and SRV#0. So if a client still has an 11:00 consensus it will be able to access it based on the client logic above.

Now if a service is in the time segment between a new SRV and a new time period (i.e. the segments drawn with "=") it uses the current time period and the previous SRV for its first descriptor: that's meant to cover clients with an up-to-date consensus in the same time period as the service.

Example:

      +------------------------------------------------------------------+
      |                                                                  |
      | 00:00      12:00       00:00       12:00       00:00       12:00 |
      | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
      |                                                                  |
      |  $==========|-----------$===========|-----------$===========|    |
      |                          ^                                       |
      |                          S                                       |
      +------------------------------------------------------------------+

Consider that the service is at 01:00 right after SRV#2: it will upload its first descriptor using TP#1 and SRV#1.

2.2.4.2.2. Second descriptor upload logic [SECONDDESCUPLOAD]

Here is the service logic for uploading its second descriptor:

When a service is in the time segment between a new time period and a new SRV (i.e. the segments drawn with "-"), it uses the current time period and current SRV for uploading its second descriptor: that's meant to cover for clients that have an up-to-date consensus on the same TP as the service.

Example: Consider in the above illustration that the service is at 13:00 right after TP#1: it will upload its second descriptor using TP#1 and SRV#1.

Now if a service is in the time segment between a new SRV and a new time period (i.e.
the segments drawn with "=") it uses the next time period and the current SRV for its second descriptor: that's meant to cover clients with a newer consensus than the service (in the next time period). Example: +------------------------------------------------------------------+ | | | 00:00 12:00 00:00 12:00 00:00 12:00 | | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | | | | $==========|-----------$===========|-----------$===========| | | ^ | | S | +------------------------------------------------------------------+ Consider that the service is at 01:00 right after SRV#2: it will upload its second descriptor using TP#2 and SRV#2. 2.2.5. Expiring hidden service descriptors [EXPIRE-DESC] Hidden services set their descriptor's "descriptor-lifetime" field to 180 minutes (3 hours). Hidden services ensure that their descriptor will remain valid in the HSDir caches, by republishing their descriptors periodically as specified in [WHEN-HSDESC]. Hidden services MUST also keep their introduction circuits alive for as long as descriptors including those intro points are valid (even if that's after the time period has changed). 2.2.6. URLs for anonymous uploading and downloading Hidden service descriptors conforming to this specification are uploaded with an HTTP POST request to the URL /tor/hs/<version>/publish relative to the hidden service directory's root, and downloaded with an HTTP GET request for the URL /tor/hs/<version>/<z> where <z> is a base64 encoding of the hidden service's blinded public key and <version> is the protocol version which is "3" in this case. These requests must be made anonymously, on circuits not used for anything else. 2.2.7. Client-side validation of onion addresses When a Tor client receives a prop224 onion address from the user, it MUST first validate the onion address before attempting to connect or fetch its descriptor. If the validation fails, the client MUST refuse to connect. 
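As a rough illustration of the syntactic part of this validation, the sketch below decodes a 56-character name and checks its version byte and checksum. The byte layout it assumes (32-byte public key, then a 2-byte checksum, then a version byte of 3, with the checksum taken as the first two bytes of SHA3-256 over ".onion checksum", the key, and the version) is our reading of section [ONIONADDRESS], which lies outside this section; the function names are ours. Subdomain labels and the torsion check are not handled here.

```python
import base64
import hashlib

VERSION = b"\x03"  # assumed version byte for this address format


def _checksum(pubkey):
    # Assumed layout: first 2 bytes of SHA3-256(".onion checksum" | key | version).
    return hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]


def encode_onion_address(pubkey):
    """Encode a 32-byte public key as a 56-character .onion name."""
    if len(pubkey) != 32:
        raise ValueError("public key must be 32 bytes")
    raw = pubkey + _checksum(pubkey) + VERSION
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def decode_onion_address(address):
    """Return the 32-byte public key, or raise ValueError if validation fails."""
    label, sep, tld = address.lower().rpartition(".")
    if sep != "." or tld != "onion" or len(label) != 56:
        raise ValueError("not a 56-character .onion name")
    raw = base64.b32decode(label.upper())  # raises a ValueError subclass on bad input
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    if version != VERSION:
        raise ValueError("unsupported version byte")
    if checksum != _checksum(pubkey):
        raise ValueError("checksum mismatch")
    return pubkey
```

A client following the MUST above would treat any ValueError from decode_onion_address as a reason to refuse the connection before fetching a descriptor.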
As part of the address validation, Tor clients should check that the underlying ed25519 key does not have a torsion component. If Tor accepted ed25519 keys with torsion components, attackers could create multiple equivalent onion addresses for a single ed25519 key, which would map to the same service. We want to avoid that because it could lead to phishing attacks and surprising behaviors (e.g. imagine a browser plugin that blocks onion addresses, but could be bypassed using an equivalent onion address with a torsion component).

The right way for clients to detect such fraudulent addresses (which should only occur malevolently and never naturally) is to extract the ed25519 public key from the onion address and multiply it by the ed25519 group order and ensure that the result is the ed25519 identity element. For more details, please see [TORSION-REFS].

2.3. Publishing shared random values [PUB-SHAREDRANDOM]

Our design for limiting the predictability of HSDir upload locations relies on a shared random value (SRV) that isn't predictable in advance or too influenceable by an attacker. The authorities must run a protocol to generate such a value at least once per hsdir period. Here we describe how they publish these values; the procedure they use to generate them can change independently of the rest of this specification. For more information see [SHAREDRANDOM-REFS].

According to proposal 250, we add two new lines in consensuses:

      "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL
      "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL

2.3.1. Client behavior in the absence of shared random values

If the previous or current shared random value cannot be found in a consensus, then Tor clients and services need to generate their own random value for use when choosing HSDirs.
To do so, Tor clients and services use: SRV = H("shared-random-disaster" | INT_8(period_length) | INT_8(period_num)) where period_length is the length of a time period in minutes, period_num is calculated as specified in [TIME-PERIODS] for the wanted shared random value that could not be found originally. 2.3.2. Hidden services and changing shared random values It's theoretically possible that the consensus shared random values will change or disappear in the middle of a time period because of directory authorities dropping offline or misbehaving. To avoid client reachability issues in this rare event, hidden services should use the new shared random values to find the new responsible HSDirs and upload their descriptors there. XXX How long should they upload descriptors there for? 2.4. Hidden service descriptors: outer wrapper [DESC-OUTER] The format for a hidden service descriptor is as follows, using the meta-format from dir-spec.txt. "hs-descriptor" SP version-number NL [At start, exactly once.] The version-number is a 32 bit unsigned integer indicating the version of the descriptor. Current version is "3". "descriptor-lifetime" SP LifetimeMinutes NL [Exactly once] The lifetime of a descriptor in minutes. An HSDir SHOULD expire the hidden service descriptor at least LifetimeMinutes after it was uploaded. The LifetimeMinutes field can take values between 30 and 3000 (50 hours). "descriptor-signing-key-cert" NL certificate NL [Exactly once.] The 'certificate' field contains a certificate in the format from proposal 220, wrapped with "-----BEGIN ED25519 CERT-----". The certificate cross-certifies the short-term descriptor signing key with the blinded public key. The certificate type must be [08], and the blinded public key must be present as the signing-key extension. "revision-counter" SP Integer NL [Exactly once.] The revision number of the descriptor. 
If an HSDir receives a second descriptor for a key that it already has a descriptor for, it should retain and serve the descriptor with the higher revision-counter.

        (Checking for monotonically increasing revision-counter values prevents an attacker from replacing a newer descriptor signed by a given key with a copy of an older version.)

    "superencrypted" NL encrypted-string

        [Exactly once.]

        An encrypted blob, whose format is discussed in [HS-DESC-ENC] below. The blob is base64 encoded and enclosed in -----BEGIN MESSAGE----- and -----END MESSAGE----- wrappers.

    "signature" SP signature NL

        [exactly once, at end.]

        A signature of all previous fields, using the signing key in the descriptor-signing-key-cert line, prefixed by the string "Tor onion service descriptor sig v3". We use a separate key for signing, so that the hidden service host does not need to have its private blinded key online.

HSDirs accept hidden service descriptors of up to 50k bytes (a consensus parameter should also be introduced to control this value).

2.5. Hidden service descriptors: encryption format [HS-DESC-ENC]

Hidden service descriptors are protected by two layers of encryption. Clients need to decrypt both layers to connect to the hidden service.

The first layer of encryption provides confidentiality against entities who don't know the public key of the hidden service (e.g. HSDirs), while the second layer of encryption is only useful when client authorization is enabled and protects against entities that do not possess valid client credentials.

2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER]

The first layer of HS descriptor encryption is designed to protect descriptor confidentiality against entities who don't know the blinded public key of the hidden service.

2.5.1.1.
First layer encryption logic The encryption keys and format for the first layer of encryption are generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization parameters: SECRET_DATA = blinded-public-key STRING_CONSTANT = "hsdir-superencrypted-data" The ciphertext is placed on the "superencrypted" field of the descriptor. Before encryption the plaintext is padded with NUL bytes to the nearest multiple of 10k bytes. 2.5.1.2. First layer plaintext format After clients decrypt the first layer of encryption, they need to parse the plaintext to get to the second layer ciphertext which is contained in the "encrypted" field. If client auth is enabled, the hidden service generates a fresh descriptor_cookie key (32 random bytes) and encrypts it using each authorized client's identity x25519 key. Authorized clients can use the descriptor cookie to decrypt the second layer of encryption. Our encryption scheme requires the hidden service to also generate an ephemeral x25519 keypair for each new descriptor. If client auth is disabled, fake data is placed in each of the fields below to obfuscate whether client authorization is enabled. Here are all the supported fields: "desc-auth-type" SP type NL [Exactly once] This field contains the type of authorization used to protect the descriptor. The only recognized type is "x25519" and specifies the encryption scheme described in this section. If client authorization is disabled, the value here should be "x25519". "desc-auth-ephemeral-key" SP key NL [Exactly once] This field contains an ephemeral x25519 public key generated by the hidden service and encoded in base64. The key is used by the encryption scheme below. If client authorization is disabled, the value here should be a fresh x25519 pubkey that will remain unused. "auth-client" SP client-id SP iv SP encrypted-cookie [Any number] When client authorization is enabled, the hidden service inserts an "auth-client" line for each of its authorized clients. 
If client authorization is disabled, the fields here can be populated with random data of the right size (that's 8 bytes for 'client-id', 16 bytes for 'iv' and 16 bytes for 'encrypted-cookie' all encoded with base64).

        When client authorization is enabled, each "auth-client" line contains the descriptor cookie encrypted to each individual client. We assume that each authorized client possesses a pre-shared x25519 keypair which is used to decrypt the descriptor cookie.

        We now describe the descriptor cookie encryption scheme. Here are the relevant keys:

          client_x = private x25519 key of authorized client
          client_X = public x25519 key of authorized client
          hs_y = private key of ephemeral x25519 keypair of hidden service
          hs_Y = public key of ephemeral x25519 keypair of hidden service
          descriptor_cookie = descriptor cookie used to encrypt the descriptor

        And here is what the hidden service computes:

          SECRET_SEED = x25519(hs_y, client_X)
          KEYS = KDF(SECRET_SEED, 40)
          CLIENT-ID = first 8 bytes of KEYS
          COOKIE-KEY = last 32 bytes of KEYS

        Here is a description of the fields in the "auth-client" line:

        - The "client-id" field is CLIENT-ID from above encoded in base64.

        - The "iv" field is 16 random bytes encoded in base64.

        - The "encrypted-cookie" field contains the descriptor cookie ciphertext as follows and is encoded in base64:

            encrypted-cookie = STREAM(iv, COOKIE-KEY) XOR descriptor_cookie

        See section [FIRST-LAYER-CLIENT-BEHAVIOR] for the client-side logic of how to decrypt the descriptor cookie.

    "encrypted" NL encrypted-string

        [Exactly once]

        An encrypted blob containing the second layer ciphertext, whose format is discussed in [HS-DESC-SECOND-LAYER] below. The blob is base64 encoded and enclosed in -----BEGIN MESSAGE----- and -----END MESSAGE----- wrappers.

2.5.1.3. Client behavior [FIRST-LAYER-CLIENT-BEHAVIOR]

The goal of clients at this stage is to decrypt the "encrypted" field as described in [HS-DESC-SECOND-LAYER].
If client authorization is enabled, authorized clients need to extract the descriptor cookie to proceed with decryption of the second layer as follows: An authorized client parsing the first layer of an encrypted descriptor, extracts the ephemeral key from "desc-auth-ephemeral-key" and calculates CLIENT-ID and COOKIE-KEY as described in the section above using their x25519 private key. The client then uses CLIENT-ID to find the right "auth-client" field which contains the ciphertext of the descriptor cookie. The client then uses COOKIE-KEY and the iv to decrypt the descriptor_cookie, which is used to decrypt the second layer of descriptor encryption as described in [HS-DESC-SECOND-LAYER]. 2.5.1.4. Hiding client authorization data Hidden services should avoid leaking whether client authorization is enabled or how many authorized clients there are. Hence even when client authorization is disabled, the hidden service adds fake "desc-auth-type", "desc-auth-ephemeral-key" and "auth-client" lines to the descriptor, as described in [HS-DESC-FIRST-LAYER]. The hidden service also avoids leaking the number of authorized clients by adding fake "auth-client" entries to its descriptor. Specifically, descriptors always contain a number of authorized clients that is a multiple of 16 by adding fake "auth-client" entries if needed. [XXX consider randomization of the value 16] Clients MUST accept descriptors with any number of "auth-client" lines as long as the total descriptor size is within the max limit of 50k (also controlled with a consensus parameter). 2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER] The second layer of descriptor encryption is designed to protect descriptor confidentiality against unauthorized clients. If client authorization is enabled, it's encrypted using the descriptor_cookie, and contains needed information for connecting to the hidden service, like the list of its introduction points. 
If client authorization is disabled, then the second layer of HS encryption does not offer any additional security, but is still used. 2.5.2.1. Second layer encryption keys The encryption keys and format for the second layer of encryption are generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization parameters as follows: SECRET_DATA = blinded-public-key | descriptor_cookie STRING_CONSTANT = "hsdir-encrypted-data" If client authorization is disabled the 'descriptor_cookie' field is left blank. The ciphertext is placed on the "encrypted" field of the descriptor. 2.5.2.2. Second layer plaintext format After decrypting the second layer ciphertext, clients can finally learn the list of intro points etc. The plaintext has the following format: "create2-formats" SP formats NL [Exactly once] A space-separated list of integers denoting CREATE2 cell format numbers that the server recognizes. Must include at least ntor as described in tor-spec.txt. See tor-spec section 5.1 for a list of recognized handshake types. "intro-auth-required" SP types NL [At most once] A space-separated list of introduction-layer authentication types; see section [INTRO-AUTH] for more info. A client that does not support at least one of these authentication types will not be able to contact the host. Recognized types are: 'password' and 'ed25519'. "single-onion-service" [None or at most once] If present, this line indicates that the service is a Single Onion Service (see prop260 for more details about that type of service). This field has been introduced in 0.3.0 meaning 0.2.9 service don't include this. Followed by zero or more introduction points as follows (see section [NUM_INTRO_POINT] below for accepted values): "introduction-point" SP link-specifiers NL [Exactly once per introduction point at start of introduction point section] The link-specifiers is a base64 encoding of a link specifier block in the format described in BUILDING-BLOCKS. 
"onion-key" SP "ntor" SP key NL [Exactly once per introduction point] The key is a base64 encoded curve25519 public key which is the onion key of the introduction point Tor node used for the ntor handshake when a client extends to it. "auth-key" NL certificate NL [Exactly once per introduction point] The certificate is a proposal 220 certificate wrapped in "-----BEGIN ED25519 CERT-----", cross-certifying the descriptor signing key with the introduction point authentication key, which is included in the mandatory signing-key extension. The certificate type must be [09]. "enc-key" SP "ntor" SP key NL [Exactly once per introduction point] The key is a base64 encoded curve25519 public key used to encrypt the introduction request to service. "enc-key-cert" NL certificate NL [Exactly once per introduction point] Cross-certification of the descriptor signing key by the encryption key. For "ntor" keys, certificate is a proposal 220 certificate wrapped in "-----BEGIN ED25519 CERT-----" armor, cross-certifying the descriptor signing key with the ed25519 equivalent of a curve25519 public encryption key derived using the process in proposal 228 appendix A. The certificate type must be [0B], and the signing-key extension is mandatory. "legacy-key" NL key NL [None or at most once per introduction point] The key is an ASN.1 encoded RSA public key in PEM format used for a legacy introduction point as described in [LEGACY_EST_INTRO]. This field is only present if the introduction point only supports legacy protocol (v2) that is <= 0.2.9 or the protocol version value "HSIntro 3". "legacy-key-cert NL certificate NL [None or at most once per introduction point] MUST be present if "legacy-key" is present. The certificate is a proposal 220 RSA->Ed cross-certificate wrapped in "-----BEGIN CROSSCERT-----" armor, cross-certifying the descriptor signing key with the RSA public key found in "legacy-key". 
To remain compatible with future revisions to the descriptor format, clients should ignore unrecognized lines in the descriptor. Other encryption and authentication key formats are allowed; clients should ignore ones they do not recognize.

Clients who manage to extract the introduction points of the hidden service can proceed with the introduction protocol as specified in [INTRO-PROTOCOL].

2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS]

In this section we present the generic encryption format for hidden service descriptors. We use the same encryption format in both encryption layers, hence we introduce two customization parameters SECRET_DATA and STRING_CONSTANT which vary between the layers.

The SECRET_DATA parameter specifies the secret data that are used during encryption key generation, while STRING_CONSTANT is merely a string constant that is used as part of the KDF.

Here is the key generation logic:

    SALT = 16 bytes from H(random), changes each time we rebuild the descriptor even if the content of the descriptor hasn't changed. (So that we don't leak whether the intro point list etc. changed)

    secret_input = SECRET_DATA | subcredential | INT_8(revision_counter)

    keys = KDF(secret_input | SALT | STRING_CONSTANT, S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN)

    SECRET_KEY = first S_KEY_LEN bytes of keys
    SECRET_IV = next S_IV_LEN bytes of keys
    MAC_KEY = last MAC_KEY_LEN bytes of keys

The encrypted data has the format:

    SALT       hashed random bytes from above  [16 bytes]
    ENCRYPTED  The ciphertext                  [variable]
    MAC        MAC of both above fields        [32 bytes]

The final encryption format is ENCRYPTED = STREAM(SECRET_IV,SECRET_KEY) XOR Plaintext

2.5.4. Number of introduction points [NUM_INTRO_POINT]

This section defines how many introduction points a hidden service descriptor can have at minimum, by default, and at maximum:

    Minimum: 0 - Default: 3 - Maximum: 20

A value of 0 means that the service is still alive but doesn't want to be reached by any client at the moment. Note that the descriptor size increases considerably as more introduction points are added.

The reason for a maximum value of 20 is to give enough scalability to tools like OnionBalance to be able to load balance up to 120 servers (20 x 6 HSDirs), while keeping the descriptor size from overwhelming hidden service directories with user-defined values that could be gigantic.

3. The introduction protocol [INTRO-PROTOCOL]

The introduction protocol proceeds in three steps.

First, a hidden service host builds an anonymous circuit to a Tor node and registers that circuit as an introduction point.

[After 'First' and before 'Second', the hidden service publishes its introduction points and associated keys, and the client fetches them as described in section [HSDIR] above.]

Second, a client builds an anonymous circuit to the introduction point, and sends an introduction request.

Third, the introduction point relays the introduction request along the introduction circuit to the hidden service host, and acknowledges the introduction request to the client.

3.1. Registering an introduction point [REG_INTRO_POINT]

3.1.1. Extensible ESTABLISH_INTRO protocol.
[EST_INTRO] When a hidden service is establishing a new introduction point, it sends an ESTABLISH_INTRO cell with the following contents: AUTH_KEY_TYPE [1 byte] AUTH_KEY_LEN [2 bytes] AUTH_KEY [AUTH_KEY_LEN bytes] N_EXTENSIONS [1 byte] N_EXTENSIONS times: EXT_FIELD_TYPE [1 byte] EXT_FIELD_LEN [1 byte] EXT_FIELD [EXT_FIELD_LEN bytes] HANDSHAKE_AUTH [MAC_LEN bytes] SIG_LEN [2 bytes] SIG [SIG_LEN bytes] The AUTH_KEY_TYPE field indicates the type of the introduction point authentication key and the type of the MAC to use in HANDSHAKE_AUTH. Recognized types are: [00, 01] -- Reserved for legacy introduction cells; see [LEGACY_EST_INTRO below] [02] -- Ed25519; SHA3-256. The AUTH_KEY_LEN field determines the length of the AUTH_KEY field. The AUTH_KEY field contains the public introduction point authentication key. The EXT_FIELD_TYPE, EXT_FIELD_LEN, EXT_FIELD entries are reserved for future extensions to the introduction protocol. Extensions with unrecognized EXT_FIELD_TYPE values must be ignored. The HANDSHAKE_AUTH field contains the MAC of all earlier fields in the cell using as its key the shared per-circuit material ("KH") generated during the circuit extension protocol; see tor-spec.txt section 5.2, "Setting circuit keys". It prevents replays of ESTABLISH_INTRO cells. SIG_LEN is the length of the signature. SIG is a signature, using AUTH_KEY, of all contents of the cell, up to but not including SIG. These contents are prefixed with the string "Tor establish-intro cell v1". Upon receiving an ESTABLISH_INTRO cell, a Tor node first decodes the key and the signature, and checks the signature. The node must reject the ESTABLISH_INTRO cell and destroy the circuit in these cases: * If the key type is unrecognized * If the key is ill-formatted * If the signature is incorrect * If the HANDSHAKE_AUTH value is incorrect * If the circuit is already a rendezvous circuit. * If the circuit is already an introduction circuit. [TODO: some scalability designs fail there.] 
* If the key is already in use by another circuit. Otherwise, the node must associate the key with the circuit, for use later in INTRODUCE1 cells. 3.1.2. Registering an introduction point on a legacy Tor node [LEGACY_EST_INTRO] Tor nodes should also support an older version of the ESTABLISH_INTRO cell, first documented in rend-spec.txt. New hidden service hosts must use this format when establishing introduction points at older Tor nodes that do not support the format above in [EST_INTRO]. In this older protocol, an ESTABLISH_INTRO cell contains: KEY_LEN [2 bytes] KEY [KEY_LEN bytes] HANDSHAKE_AUTH [20 bytes] SIG [variable, up to end of relay payload] The KEY_LEN variable determines the length of the KEY field. The KEY field is the ASN1-encoded legacy RSA public key that was also included in the hidden service descriptor. The HANDSHAKE_AUTH field contains the SHA1 digest of (KH | "INTRODUCE"). The SIG field contains an RSA signature, using PKCS1 padding, of all earlier fields. Older versions of Tor always use a 1024-bit RSA key for these introduction authentication keys. 3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED] After setting up an introduction circuit, the introduction point reports its status back to the hidden service host with an INTRO_ESTABLISHED cell. The INTRO_ESTABLISHED cell has the following contents: N_EXTENSIONS [1 byte] N_EXTENSIONS times: EXT_FIELD_TYPE [1 byte] EXT_FIELD_LEN [1 byte] EXT_FIELD [EXT_FIELD_LEN bytes] Older versions of Tor send back an empty INTRO_ESTABLISHED cell instead. Services must accept an empty INTRO_ESTABLISHED cell from a legacy relay. 3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1] In order to participate in the introduction protocol, a client must know the following: * An introduction point for a service. * The introduction authentication key for that introduction point. * The introduction encryption key for that introduction point. 
The client sends an INTRODUCE1 cell to the introduction point, containing an identifier for the service, an identifier for the encryption key that the client intends to use, and an opaque blob to be relayed to the hidden service host.

In reply, the introduction point sends an INTRODUCE_ACK cell back to the client, either informing it that its request has been delivered, or that its request will not succeed.

[TODO: specify what tor should do when receiving a malformed cell. Drop it? Kill circuit? This goes for all possible cells.]

3.2.1. INTRODUCE1 cell format [FMT_INTRO1]

When a client is connecting to an introduction point, INTRODUCE1 cells should be of the form:

    LEGACY_KEY_ID [20 bytes]
    AUTH_KEY_TYPE [1 byte]
    AUTH_KEY_LEN [2 bytes]
    AUTH_KEY [AUTH_KEY_LEN bytes]
    N_EXTENSIONS [1 byte]
    N_EXTENSIONS times:
      EXT_FIELD_TYPE [1 byte]
      EXT_FIELD_LEN [1 byte]
      EXT_FIELD [EXT_FIELD_LEN bytes]
    ENCRYPTED [Up to end of relay payload]

AUTH_KEY_TYPE is defined as in [EST_INTRO]. Currently, the only value of AUTH_KEY_TYPE for this cell is an Ed25519 public key [02].

The LEGACY_KEY_ID field is used to distinguish between legacy and new style INTRODUCE1 cells. In new style INTRODUCE1 cells, LEGACY_KEY_ID is 20 zero bytes. Upon receiving an INTRODUCE1 cell, the introduction point checks the LEGACY_KEY_ID field. If LEGACY_KEY_ID is non-zero, the INTRODUCE1 cell should be handled as a legacy INTRODUCE1 cell by the intro point.

Upon receiving an INTRODUCE1 cell, the introduction point checks whether AUTH_KEY matches the introduction point authentication key for an active introduction circuit. If so, the introduction point sends an INTRODUCE2 cell with exactly the same contents to the service, and sends an INTRODUCE_ACK response to the client.

3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK]

An INTRODUCE_ACK cell has the following fields:

    STATUS [2 bytes]
    N_EXTENSIONS [1 byte]
    N_EXTENSIONS times:
      EXT_FIELD_TYPE [1 byte]
      EXT_FIELD_LEN [1 byte]
      EXT_FIELD [EXT_FIELD_LEN bytes]

Recognized status values are:

    [00 00] -- Success: cell relayed to hidden service host.
    [00 01] -- Failure: service ID not recognized
    [00 02] -- Bad message format
    [00 03] -- Can't relay cell to service

3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2]

Upon receiving an INTRODUCE2 cell, the hidden service host checks whether the AUTH_KEY or LEGACY_KEY_ID field matches the keys for this introduction circuit.

The service host then checks whether it has received a cell with these contents or rendezvous cookie before. If it has, it silently drops it as a replay. (It must maintain a replay cache for as long as it accepts cells with the same encryption key. Note that the encryption format below should be non-malleable.)

If the cell is not a replay, it decrypts the ENCRYPTED field, establishes a shared key with the client, and authenticates the whole contents of the cell as having been unmodified since they left the client. There may be multiple ways of decrypting the ENCRYPTED field, depending on the chosen type of the encryption key. Requirements for an introduction handshake protocol are described in [INTRO-HANDSHAKE-REQS]. We specify one below in section [NTOR-WITH-EXTRA-DATA].
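To make the INTRODUCE_ACK layout from [INTRO_ACK] concrete, the following Python sketch decodes the STATUS field and the extension list. It assumes network (big-endian) byte order for STATUS and that bytes after the last extension are relay-payload padding to be ignored; neither point is stated in this section. The function names are hypothetical, and the same extension walker would apply to the other cells that carry an N_EXTENSIONS block.

```python
import struct
from typing import List, Tuple

# Status values listed in [INTRO_ACK].
STATUS_MEANINGS = {
    0x0000: "Success: cell relayed to hidden service host",
    0x0001: "Failure: service ID not recognized",
    0x0002: "Bad message format",
    0x0003: "Can't relay cell to service",
}


def parse_extensions(buf: bytes, offset: int) -> Tuple[List[Tuple[int, bytes]], int]:
    """Parse N_EXTENSIONS and its entries; return (extensions, next offset)."""
    if offset >= len(buf):
        raise ValueError("missing N_EXTENSIONS")
    count = buf[offset]
    offset += 1
    extensions = []
    for _ in range(count):
        if offset + 2 > len(buf):
            raise ValueError("truncated extension header")
        ext_type, ext_len = buf[offset], buf[offset + 1]
        offset += 2
        data = buf[offset:offset + ext_len]
        if len(data) != ext_len:
            raise ValueError("truncated extension body")
        extensions.append((ext_type, data))
        offset += ext_len
    return extensions, offset


def parse_introduce_ack(body: bytes) -> Tuple[int, List[Tuple[int, bytes]]]:
    """Return (status, extensions) for an INTRODUCE_ACK cell body."""
    if len(body) < 3:
        raise ValueError("INTRODUCE_ACK too short")
    (status,) = struct.unpack_from("!H", body, 0)  # assumption: big-endian
    extensions, _ = parse_extensions(body, 2)
    return status, extensions
```

A caller can map the status through `STATUS_MEANINGS.get(status, "unrecognized")` and skip any extension types it does not know.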
The decrypted plaintext must have the form: RENDEZVOUS_COOKIE [20 bytes] N_EXTENSIONS [1 byte] N_EXTENSIONS times: EXT_FIELD_TYPE [1 byte] EXT_FIELD_LEN [1 byte] EXT_FIELD [EXT_FIELD_LEN bytes] ONION_KEY_TYPE [1 bytes] ONION_KEY_LEN [2 bytes] ONION_KEY [ONION_KEY_LEN bytes] NSPEC (Number of link specifiers) [1 byte] NSPEC times: LSTYPE (Link specifier type) [1 byte] LSLEN (Link specifier length) [1 byte] LSPEC (Link specifier) [LSLEN bytes] PAD (optional padding) [up to end of plaintext] Upon processing this plaintext, the hidden service makes sure that any required authentication is present in the extension fields, and then extends a rendezvous circuit to the node described in the LSPEC fields, using the ONION_KEY to complete the extension. As mentioned in [BUILDING-BLOCKS], the "TLS-over-TCP, IPv4" and "Legacy node identity" specifiers must be present. The hidden service SHOULD NOT reject any LSTYPE fields which it doesn't recognize; instead, it should use them verbatim in its EXTEND request to the rendezvous point. The ONION_KEY_TYPE field is: [01] NTOR: ONION_KEY is 32 bytes long. The ONION_KEY field describes the onion key that must be used when extending to the rendezvous point. It must be of a type listed as supported in the hidden service descriptor. When using a legacy introduction point, the INTRODUCE cells must be padded to a certain length using the PAD field in the encrypted portion. Upon receiving a well-formed INTRODUCE2 cell, the hidden service host will have: * The information needed to connect to the client's chosen rendezvous point. * The second half of a handshake to authenticate and establish a shared key with the hidden service client. * A set of shared keys to use for end-to-end encryption. 3.3.1. 
Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS] When decoding the encrypted information in an INTRODUCE2 cell, a hidden service host must be able to: * Decrypt additional information included in the INTRODUCE2 cell, to include the rendezvous token and the information needed to extend to the rendezvous point. * Establish a set of shared keys for use with the client. * Authenticate that the cell has not been modified since the client generated it. Note that the old TAP-derived protocol of the previous hidden service design achieved the first two requirements, but not the third. 3.3.2. Example encryption handshake: ntor with extra data [NTOR-WITH-EXTRA-DATA] [TODO: relocate this] This is a variant of the ntor handshake (see tor-spec.txt, section 5.1.4; see proposal 216; and see "Anonymity and one-way authentication in key-exchange protocols" by Goldberg, Stebila, and Ustaoglu). It behaves the same as the ntor handshake, except that, in addition to negotiating forward secure keys, it also provides a means for encrypting non-forward-secure data to the server (in this case, to the hidden service host) as part of the handshake. Notation here is as in section 5.1.4 of tor-spec.txt, which defines the ntor handshake. The PROTOID for this variant is "tor-hs-ntor-curve25519-sha3-256-1". We also use the following tweak values: t_hsenc = PROTOID | ":hs_key_extract" t_hsverify = PROTOID | ":hs_verify" t_hsmac = PROTOID | ":hs_mac" m_hsexpand = PROTOID | ":hs_key_expand" To make an INTRODUCE1 cell, the client must know a public encryption key B for the hidden service on this introduction circuit. 
The client generates a single-use keypair: x,X = KEYGEN() and computes: intro_secret_hs_input = EXP(B,x) | AUTH_KEY | X | B | PROTOID info = m_hsexpand | subcredential hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN) ENC_KEY = hs_keys[0:S_KEY_LEN] MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN] and sends, as the ENCRYPTED part of the INTRODUCE1 cell: CLIENT_PK [PK_PUBKEY_LEN bytes] ENCRYPTED_DATA [Padded to length of plaintext] MAC [MAC_LEN bytes] Substituting those fields into the INTRODUCE1 cell body format described in [FMT_INTRO1] above, we have LEGACY_KEY_ID [20 bytes] AUTH_KEY_TYPE [1 byte] AUTH_KEY_LEN [2 bytes] AUTH_KEY [AUTH_KEY_LEN bytes] N_EXTENSIONS [1 bytes] N_EXTENSIONS times: EXT_FIELD_TYPE [1 byte] EXT_FIELD_LEN [1 byte] EXT_FIELD [EXT_FIELD_LEN bytes] ENCRYPTED: CLIENT_PK [PK_PUBKEY_LEN bytes] ENCRYPTED_DATA [Padded to length of plaintext] MAC [MAC_LEN bytes] (This format is as documented in [FMT_INTRO1] above, except that here we describe how to build the ENCRYPTED portion.) Here, the encryption key plays the role of B in the regular ntor handshake, and the AUTH_KEY field plays the role of the node ID. The CLIENT_PK field is the public key X. The ENCRYPTED_DATA field is the message plaintext, encrypted with the symmetric key ENC_KEY. The MAC field is a MAC of all of the cell from the AUTH_KEY through the end of ENCRYPTED_DATA, using the MAC_KEY value as its key. To process this format, the hidden service checks PK_VALID(CLIENT_PK) as necessary, and then computes ENC_KEY and MAC_KEY as the client did above, except using EXP(CLIENT_PK,b) in the calculation of intro_secret_hs_input. The service host then checks whether the MAC is correct. If it is invalid, it drops the cell. Otherwise, it computes the plaintext by decrypting ENCRYPTED_DATA. The hidden service host now completes the service side of the extended ntor handshake, as described in tor-spec.txt section 5.1.4, with the modified PROTOID as given above. 
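The ENC_KEY / MAC_KEY derivation above is the same on both sides once the Diffie-Hellman output is known, which the short Python sketch below tries to show. It assumes KDF is SHAKE-256 and that S_KEY_LEN and MAC_KEY_LEN are both 32 bytes (these constants are defined elsewhere in this specification), and it takes the result of EXP(B,x) or EXP(X,b) as an input byte string rather than computing it. The function name `derive_intro_keys` is hypothetical.

```python
import hashlib
from typing import Tuple

PROTOID = b"tor-hs-ntor-curve25519-sha3-256-1"
T_HSENC = PROTOID + b":hs_key_extract"
M_HSEXPAND = PROTOID + b":hs_key_expand"

S_KEY_LEN = 32    # assumption: not defined in this section
MAC_KEY_LEN = 32  # assumption: not defined in this section


def derive_intro_keys(dh_output: bytes, auth_key: bytes, client_pk: bytes,
                      enc_key_b: bytes, subcredential: bytes) -> Tuple[bytes, bytes]:
    """Return (ENC_KEY, MAC_KEY) for the ENCRYPTED part of INTRODUCE1.

    dh_output is EXP(B,x) on the client and EXP(X,b) on the service.
    """
    intro_secret_hs_input = dh_output + auth_key + client_pk + enc_key_b + PROTOID
    info = M_HSEXPAND + subcredential
    # Assumption: KDF is SHAKE-256.
    hs_keys = hashlib.shake_256(intro_secret_hs_input + T_HSENC + info).digest(
        S_KEY_LEN + MAC_KEY_LEN)
    return hs_keys[:S_KEY_LEN], hs_keys[S_KEY_LEN:S_KEY_LEN + MAC_KEY_LEN]
```

Because the subcredential is mixed into `info`, a cell built for one service or time period yields different keys when processed under another.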
To be explicit, the hidden service host generates a keypair of y,Y = KEYGEN(), and uses its introduction point encryption key 'b' to compute:

    intro_secret_hs_input = EXP(X,b) | AUTH_KEY | X | B | PROTOID
    info = m_hsexpand | subcredential
    hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN)
    HS_DEC_KEY = hs_keys[0:S_KEY_LEN]
    HS_MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN]

(The above are used to check the MAC and then decrypt the encrypted data.)

    rend_secret_hs_input = EXP(X,y) | EXP(X,b) | AUTH_KEY | B | X | Y | PROTOID
    NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc)
    verify = MAC(rend_secret_hs_input, t_hsverify)
    auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server"
    AUTH_INPUT_MAC = MAC(auth_input, t_hsmac)

(The above are used to finish the ntor handshake.)

The server's handshake reply is:

    SERVER_PK Y [PK_PUBKEY_LEN bytes]
    AUTH AUTH_INPUT_MAC [MAC_LEN bytes]

These fields will be sent to the client in a RENDEZVOUS1 cell using the HANDSHAKE_INFO element (see [JOIN_REND]).

The hidden service host now also knows the keys generated by the handshake, which it will use to encrypt and authenticate data end-to-end between the client and the server. These keys are as computed in tor-spec.txt section 5.1.4.

3.4. Authentication during the introduction phase. [INTRO-AUTH]

Hidden services may restrict access only to authorized users. One mechanism to do so is the credential mechanism, where only users who know the credential for a hidden service may connect at all.

3.4.1. Ed25519-based authentication.

To authenticate with an Ed25519 private key, the user must include an extension field in the encrypted part of the INTRODUCE1 cell with an EXT_FIELD_TYPE type of [02] and the contents:

    Nonce [16 bytes]
    Pubkey [32 bytes]
    Signature [64 bytes]

Nonce is a random value. Pubkey is the public key that will be used to authenticate. [TODO: should this be an identifier for the public key instead?]
Signature is the signature, using Ed25519, of: "hidserv-userauth-ed25519" Nonce (same as above) Pubkey (same as above) AUTH_KEY (As in the INTRODUCE1 cell) The hidden service host checks this by seeing whether it recognizes and would accept a signature from the provided public key. If it would, then it checks whether the signature is correct. If it is, then the correct user has authenticated. Replay prevention on the whole cell is sufficient to prevent replays on the authentication. Users SHOULD NOT use the same public key with multiple hidden services. 4. The rendezvous protocol Before connecting to a hidden service, the client first builds a circuit to an arbitrarily chosen Tor node (known as the rendezvous point), and sends an ESTABLISH_RENDEZVOUS cell. The hidden service later connects to the same node and sends a RENDEZVOUS cell. Once this has occurred, the relay forwards the contents of the RENDEZVOUS cell to the client, and joins the two circuits together. 4.1. Establishing a rendezvous point [EST_REND_POINT] The client sends the rendezvous point a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell containing a 20-byte value. RENDEZVOUS_COOKIE [20 bytes] Rendezvous points MUST ignore any extra bytes in an ESTABLISH_RENDEZVOUS cell. (Older versions of Tor did not.) The rendezvous cookie is an arbitrary 20-byte value, chosen randomly by the client. The client SHOULD choose a new rendezvous cookie for each new connection attempt. If the rendezvous cookie is already in use on an existing circuit, the rendezvous point should reject it and destroy the circuit. Upon receiving an ESTABLISH_RENDEZVOUS cell, the rendezvous point associates the cookie with the circuit on which it was sent. It replies to the client with an empty RENDEZVOUS_ESTABLISHED cell to indicate success. Clients MUST ignore any extra bytes in a RENDEZVOUS_ESTABLISHED cell. The client MUST NOT use the circuit which sent the cell for any purpose other than rendezvous with the given location-hidden service. 
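The cookie handling in [EST_REND_POINT] amounts to a small lookup table at the rendezvous point. The Python sketch below illustrates that bookkeeping only; the class and method names are hypothetical, circuits are represented by opaque identifiers, and rejecting a payload shorter than 20 bytes is an assumption, since the text above does not say how to treat one.

```python
import os
from typing import Dict, Hashable

COOKIE_LEN = 20  # RENDEZVOUS_COOKIE [20 bytes]


def new_rendezvous_cookie() -> bytes:
    """Client side: pick a fresh random cookie for each connection attempt."""
    return os.urandom(COOKIE_LEN)


class RendezvousTable:
    """Rendezvous-point side: remember which circuit announced which cookie."""

    def __init__(self) -> None:
        self._circuit_by_cookie: Dict[bytes, Hashable] = {}

    def establish(self, payload: bytes, circuit_id: Hashable) -> bool:
        """Handle ESTABLISH_RENDEZVOUS.

        Returns True if RENDEZVOUS_ESTABLISHED should be sent, and False if
        the request should be rejected and the circuit destroyed.
        """
        if len(payload) < COOKIE_LEN:
            return False  # assumption: too-short payloads are rejected
        cookie = payload[:COOKIE_LEN]  # extra bytes MUST be ignored
        if cookie in self._circuit_by_cookie:
            return False  # cookie already in use on an existing circuit
        self._circuit_by_cookie[cookie] = circuit_id
        return True
```

A real relay would also drop the table entry when the circuit closes; that cleanup is left out here.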
The client should establish a rendezvous point BEFORE trying to connect to a hidden service.

4.2. Joining to a rendezvous point [JOIN_REND]

To complete a rendezvous, the hidden service host builds a circuit to the rendezvous point and sends a RENDEZVOUS1 cell containing:

    RENDEZVOUS_COOKIE [20 bytes]
    HANDSHAKE_INFO [variable; depends on handshake type used.]

where RENDEZVOUS_COOKIE is the cookie suggested by the client during the introduction (see [PROCESS_INTRO2]) and HANDSHAKE_INFO is defined in [NTOR-WITH-EXTRA-DATA].

If the cookie matches the rendezvous cookie set on any not-yet-connected circuit on the rendezvous point, the rendezvous point connects the two circuits, and sends a RENDEZVOUS2 cell to the client containing the HANDSHAKE_INFO field of the RENDEZVOUS1 cell.

Upon receiving the RENDEZVOUS2 cell, the client verifies that HANDSHAKE_INFO correctly completes a handshake. To do so, the client parses SERVER_PK from HANDSHAKE_INFO and reverses the final operations of section [NTOR-WITH-EXTRA-DATA] as shown here:

    rend_secret_hs_input = EXP(Y,x) | EXP(B,x) | AUTH_KEY | B | X | Y | PROTOID
    NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc)
    verify = MAC(rend_secret_hs_input, t_hsverify)
    auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server"
    AUTH_INPUT_MAC = MAC(auth_input, t_hsmac)

Finally the client verifies that the received AUTH field of HANDSHAKE_INFO is equal to the computed AUTH_INPUT_MAC.

Now both parties use the handshake output to derive shared keys for use on the circuit as specified in the section below:

4.2.1. Key expansion

The hidden service and its client need to derive crypto keys from the NTOR_KEY_SEED part of the handshake output. To do so, they use the KDF construction as follows:

    K = KDF(NTOR_KEY_SEED | m_hsexpand, HASH_LEN * 2 + S_KEY_LEN * 2)

The first HASH_LEN bytes of K form the forward digest Df; the next HASH_LEN bytes form the backward digest Db; the next S_KEY_LEN bytes form Kf, and the final S_KEY_LEN bytes form Kb.
Excess bytes from K are discarded. Subsequently, the rendezvous point passes relay cells, unchanged, from each of the two circuits to the other. When Alice's OP sends RELAY cells along the circuit, it authenticates with Df, and encrypts them with the Kf, then with all of the keys for the ORs in Alice's side of the circuit; and when Alice's OP receives RELAY cells from the circuit, it decrypts them with the keys for the ORs in Alice's side of the circuit, then decrypts them with Kb, and checks integrity with Db. Bob's OP does the same, with Kf and Kb interchanged. [TODO: Should we encrypt HANDSHAKE_INFO as we did INTRODUCE2 contents? It's not necessary, but it could be wise. Similarly, we should make it extensible.] 4.3. Using legacy hosts as rendezvous points The behavior of ESTABLISH_RENDEZVOUS is unchanged from older versions of this protocol, except that relays should now ignore unexpected bytes at the end. Old versions of Tor required that RENDEZVOUS cell payloads be exactly 168 bytes long. All shorter rendezvous payloads should be padded to this length with random bytes, to make them difficult to distinguish from older protocols at the rendezvous point. Relays older than 0.2.9.1 should not be used for rendezvous points by next generation onion services because they enforce too-strict length checks to rendezvous cells. Hence the "HSRend" protocol from proposal#264 should be used to select relays for rendezvous points. 5. Encrypting data between client and host A successfully completed handshake, as embedded in the INTRODUCE/RENDEZVOUS cells, gives the client and hidden service host a shared set of keys Kf, Kb, Df, Db, which they use for sending end-to-end traffic encryption and authentication as in the regular Tor relay encryption protocol, applying encryption with these keys before other encryption, and decrypting with these keys before other decryption. The client encrypts with Kf and decrypts with Kb; the service host does the opposite. 6. 
Encoding onion addresses [ONIONADDRESS]

The onion address of a hidden service includes its identity public key, a version field and a basic checksum. All this information is then base32 encoded as shown below:

    onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
    CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]

where:

- PUBKEY is the 32-byte ed25519 master pubkey of the hidden service.
- VERSION is a one-byte version field (default value '\x03')
- ".onion checksum" is a constant string
- CHECKSUM is truncated to two bytes before inserting it in onion_address

Here are a few example addresses:

    pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd.onion
    sp3k262uwy4r2k3ycr5awluarykdpag6a7y33jxop4cs2lu5uz5sseqd.onion
    xa4r2iadxm55fbnqgwwi5mymqdcofiu3w6rpbtqn7b2dyn7mgwj64jyd.onion

For more information about this encoding, please see our discussion thread at [ONIONADDRESS-REFS].

7. Open Questions:

Scaling hidden services is hard. There are on-going discussions that you might be able to help with. See [SCALING-REFS].

How can we improve the HSDir unpredictability design proposed in [SHAREDRANDOM]? See [SHAREDRANDOM-REFS] for discussion.

How can hidden service addresses become memorable while retaining their self-authenticating and decentralized nature? See [HUMANE-HSADDRESSES-REFS] for some proposals; many more are possible.

Hidden Services are pretty slow, both because of the lengthy setup procedure and because the final circuit has 6 hops. How can we make the Hidden Service protocol faster? See [PERFORMANCE-REFS] for some suggestions.
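Returning to the address encoding in [ONIONADDRESS], the formulas there can be sketched in a few lines of Python. The sketch assumes that H is SHA3-256 (H is defined elsewhere in this specification) and that addresses are written in lowercase, as in the examples. The function names are hypothetical.

```python
import base64
import hashlib

CHECKSUM_CONSTANT = b".onion checksum"
SUFFIX = ".onion"


def _checksum(pubkey: bytes, version: bytes) -> bytes:
    # Assumption: H is SHA3-256; CHECKSUM is the first two bytes of the hash.
    return hashlib.sha3_256(CHECKSUM_CONSTANT + pubkey + version).digest()[:2]


def onion_address(pubkey: bytes, version: int = 3) -> str:
    """Encode base32(PUBKEY | CHECKSUM | VERSION) + ".onion"."""
    if len(pubkey) != 32:
        raise ValueError("PUBKEY must be 32 bytes")
    ver = bytes([version])
    raw = pubkey + _checksum(pubkey, ver) + ver  # 35 bytes, so no base32 padding
    return base64.b32encode(raw).decode("ascii").lower() + SUFFIX


def parse_onion_address(address: str) -> bytes:
    """Return PUBKEY from an address, or raise ValueError if it is invalid."""
    if not address.endswith(SUFFIX):
        raise ValueError("missing .onion suffix")
    raw = base64.b32decode(address[:-len(SUFFIX)].upper())
    if len(raw) != 35:
        raise ValueError("unexpected address length")
    pubkey, checksum, ver = raw[:32], raw[32:34], raw[34:]
    if checksum != _checksum(pubkey, ver):
        raise ValueError("bad checksum")
    return pubkey
```

Since VERSION is the last byte and 35 bytes encode to exactly 56 base32 characters, every version 3 address ends in the letter "d", which matches the example addresses.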
References:

   [KEYBLIND-REFS]:
        https://trac.torproject.org/projects/tor/ticket/8106
        https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html

   [KEYBLIND-PROOF]:
        https://lists.torproject.org/pipermail/tor-dev/2013-December/005943.html

   [SHAREDRANDOM-REFS]:
        https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-consensus.txt
        https://trac.torproject.org/projects/tor/ticket/8244

   [SCALING-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html

   [HUMANE-HSADDRESSES-REFS]:
        https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt
        http://archives.seul.org/or/dev/Dec-2011/msg00034.html

   [PERFORMANCE-REFS]:
        "Improving Efficiency and Simplicity of Tor circuit establishment and hidden services" by Overlier, L., and P. Syverson

        [TODO: Need more here! Do we have any? :( ]

   [ATTACK-REFS]:
        "Trawling for Tor Hidden Services: Detection, Measurement, Deanonymization" by Alex Biryukov, Ivan Pustogarov, Ralf-Philipp Weinmann

        "Locating Hidden Servers" by Lasse Øverlier and Paul Syverson

   [ED25519-REFS]:
        "High-speed high-security signatures" by Daniel J. Bernstein, Niels Duif, Tanja Lange, Peter Schwabe, and Bo-Yin Yang.
        http://cr.yp.to/papers.html#ed25519

   [ED25519-B-REF]:
        https://tools.ietf.org/html/draft-josefsson-eddsa-ed25519-03#section-5

   [PRNG-REFS]:
        http://projectbullrun.org/dual-ec/ext-rand.html
        https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html

   [SRV-TP-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2016-April/010759.html

   [VANITY-REFS]:
        https://github.com/Yawning/horse25519

   [ONIONADDRESS-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2017-January/011816.html

   [TORSION-REFS]:
        https://lists.torproject.org/pipermail/tor-dev/2017-April/012164.html
        https://getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html

Appendix A. Signature scheme with key blinding [KEYBLIND]

A.1. Key derivation overview

As described in [IMD:DIST] and [SUBCRED] above, we require a "key blinding" system that works (roughly) as follows:

   There is a master keypair (sk, pk).

   Given the keypair and a nonce n, there is a derivation function that gives a new blinded keypair (sk_n, pk_n). This keypair can be used for signing.

   Given only the public key and the nonce, there is a function that gives pk_n.

   Without knowing pk, it is not possible to derive pk_n; without knowing sk, it is not possible to derive sk_n.

   It's possible to check that a signature was made with sk_n while knowing only pk_n.

   Someone who sees a large number of blinded public keys and signatures made using those public keys can't tell which signatures and which blinded keys were derived from the same master keypair.

   You can't forge signatures.

   [TODO: Insert a more rigorous definition and better references.]

A.2. Tor's key derivation scheme

We propose the following scheme for key blinding, based on Ed25519.

(This is an ECC group, so remember that scalar multiplication is the trapdoor function, and it's defined in terms of iterated point addition. See the Ed25519 paper [Reference ED25519-REFS] for a fairly clear writeup.)

Let B be the ed25519 basepoint as found in section 5 of [ED25519-B-REF]:

   B = (15112221349535400772501151409588531511454012693041857206046113283949847762202,
        46316835694926478169428394003475163141307993866256225615783033603165251855960)

Assume B has prime order l, so lB=0. Let a master keypair be written as (a,A), where a is the private key and A is the public key (A=aB).
To derive the key for a nonce N and an optional secret s, compute the blinding factor like this:

   h = H(BLIND_STRING | A | s | B | N)
   BLIND_STRING = "Derive temporary signing key"
   N = "key-blind" | INT_8(period-number) | INT_8(period_length)

then clamp the blinding factor 'h' according to the ed25519 spec:

   h[0] &= 248;
   h[31] &= 127;
   h[31] |= 64;

and do the key derivation as follows:

   private key for the period: a' = h a mod l
   public key for the period:  A' = h A = (ha)B

Generating a signature of M: given a deterministic random-looking r (see EdDSA paper), take R=rB, S=r+hash(R,A',M)ah mod l. Send signature (R,S) and public key A'.

Verifying the signature: Check whether SB = R+hash(R,A',M)A'.

   (If the signature is valid,
        SB = (r + hash(R,A',M)ah)B
           = rB + (hash(R,A',M)ah)B
           = R + hash(R,A',M)A' )

See [KEYBLIND-REFS] for an extensive discussion on this scheme and possible alternatives. Also, see [KEYBLIND-PROOF] for a security proof of this scheme.

Appendix B. Selecting nodes [PICKNODES]

   Picking introduction points
   Picking rendezvous points
   Building paths
   Reusing circuits

   (TODO: This needs a writeup)

Appendix C. Recommendations for searching for vanity .onions [VANITY]

EDITORIAL NOTE: The author thinks that it's silly to brute-force the keyspace for a key that, when base-32 encoded, spells out the name of your website. It also feels a bit dangerous to me. If you train your users to connect to

   llamanymityx4fi3l6x2gyzmtmgxjyqyorj9qsb5r543izcwymle.onion

I worry that you're making it easier for somebody to trick them into connecting to

   llamanymityb4sqi0ta0tsw6uovyhwlezkcrmczeuzdvfauuemle.onion

Nevertheless, people are probably going to try to do this, so here's a decent algorithm to use.

To search for a public key with some criterion X:

   Generate a random (sk,pk) pair.

   While pk does not satisfy X:

      Add the number 8 to sk
      Add the point 8*B to pk

   Return sk, pk.

We add 8 and 8*B, rather than 1 and B, so that sk is always a valid Curve25519 private key, with the lowest 3 bits equal to 0.
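The search loop above can be sketched in pure Python as follows. This is only an illustration under stated assumptions: sk is treated directly as the secret scalar (not as an Ed25519 seed), the point arithmetic is a slow, non-constant-time affine implementation, and the names vanity_search and satisfies are hypothetical. It needs Python 3.8 or later for pow(x, -1, P).

```python
import secrets

# Ed25519 field prime, curve constant d, and the basepoint B from Appendix A.2.
P = 2**255 - 19
D = (-121665 * pow(121666, -1, P)) % P
B = (15112221349535400772501151409588531511454012693041857206046113283949847762202,
     46316835694926478169428394003475163141307993866256225615783033603165251855960)
IDENTITY = (0, 1)


def point_add(p1, p2):
    """Add two points in affine twisted-Edwards coordinates."""
    x1, y1 = p1
    x2, y2 = p2
    t = D * x1 * x2 * y1 * y2 % P
    x3 = (x1 * y2 + x2 * y1) * pow((1 + t) % P, -1, P) % P
    y3 = (y1 * y2 + x1 * x2) * pow((1 - t) % P, -1, P) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k*point by double-and-add (illustrative, not constant time)."""
    result = IDENTITY
    while k > 0:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def random_clamped_scalar():
    """Random scalar with the low 3 bits clear and bit 254 set."""
    return 2**254 + 8 * secrets.randbelow(2**251)


def vanity_search(satisfies, sk=None, max_steps=1_000_000):
    """Step (sk, pk) by (8, 8*B) until satisfies(pk) holds."""
    if sk is None:
        sk = random_clamped_scalar()
    pk = scalar_mult(sk, B)
    step = scalar_mult(8, B)
    for _ in range(max_steps):
        if satisfies(pk):
            return sk, pk
        sk += 8                    # keeps the low 3 bits of sk at zero
        pk = point_add(pk, step)   # one point addition per candidate
    raise RuntimeError("no match within max_steps")
```

Each candidate costs one point addition instead of a full scalar multiplication, which is the point of stepping sk and pk together. A real search would test the base32-encoded address rather than an arbitrary predicate on the point.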
This algorithm is safe [source: djb, personal communication] [TODO: Make sure I understood correctly!] so long as only the final (sk,pk) pair is used, and all previous values are discarded.

To parallelize this algorithm, start with an independent (sk,pk) pair generated for each independent thread, and let each search proceed independently.

See [VANITY-REFS] for a reference implementation of this vanity .onion search scheme.

Appendix D. Numeric values reserved in this document

   [TODO: collect all the lists of commands and values mentioned above]

Appendix E. Reserved numbers

We reserve these certificate type values for Ed25519 certificates:

   [08] short-term descriptor signing key, signed with blinded public key. (Section 2.4)
   [09] intro point authentication key, cross-certifying the descriptor signing key. (Section 2.5)
   [0B] ed25519 key derived from the curve25519 intro point encryption key, cross-certifying the descriptor signing key. (Section 2.5)

   Note: The value "0A" is skipped because it's reserved for the onion key cross-certifying ntor identity key from proposal 228.

Appendix F. Hidden service directory format [HIDSERVDIR-FORMAT]

This appendix section specifies the contents of the HiddenServiceDir directory:

   - "hostname" [FILE]

     This file contains the onion address of the onion service.

   - "private_key_ed25519" [FILE]

     This file contains the private master ed25519 key of the onion service. [TODO: Offline keys]

   - "client_authorized_pubkeys" [FILE]

     If client authorization is _enabled_, this is a newline-separated file of "<client name> <pubkeys>" entries for authorized clients. You can think of it as the ~/.ssh/authorized_keys of onion services. See [CLIENT-AUTH-MGMT] for more details.
   - "./client_authorized_privkeys/" [DIRECTORY]
     "./client_authorized_privkeys/alice.privkey" [FILE]
     "./client_authorized_privkeys/bob.privkey" [FILE]
     "./client_authorized_privkeys/charlie.privkey" [FILE]

     If client authorization is _enabled_ _AND_ if the hidden service is responsible for generating and distributing private keys for its clients, then this directory contains files with the clients' private keys. See [CLIENT-AUTH-MGMT] for more details.

Appendix E. Managing authorized client data [CLIENT-AUTH-MGMT]

Hidden services and clients can configure their authorized client data either using the torrc, or using the control port. This section presents a suggested scheme for configuring client authorization. Please see appendix [HIDSERVDIR-FORMAT] for more information about relevant hidden service files.

E.1. Configuring client authorization using torrc

E.1.1. Hidden Service side

A hidden service that wants to perform client authorization adds a new option HiddenServiceAuthorizeClient to its torrc file:

   HiddenServiceAuthorizeClient auth-type client-name,client-name,...

The only recognized auth-type value is "basic", which describes the scheme in section [CLIENT-AUTH]. The rest of the line is a comma-separated list of human-readable authorized client names.

Let's consider that one of the listed client names is "alice". In this case, Tor checks the "client_authorized_pubkeys" file for any entries with client_name being "alice". If an "alice" entry is found, we use the relevant pubkeys to authenticate Alice.

If no "alice" entry is found in the "client_authorized_pubkeys" file, Tor is tasked with generating public/private keys for Alice. To do so, Tor generates x25519 and ed25519 keypairs for Alice, then makes a "client_authorized_privkeys/alice.privkey" file and writes the private keys inside; it also adds an entry for alice to the "client_authorized_pubkeys" file.
In this last case, the hidden service operator has the responsibility to pass the .key file to Alice in a secure out-of-band way. After the file is passed to Alice, it can be shredded from the filesystem, as only the public keys are required for the hidden service to function.

E.1.2. Client side

A client who wants to register client authorization data for a hidden service needs to add the following line to their torrc:

   HidServAuth onion-address x25519-private-key ed25519-private-key

The keys above are either generated by Alice using a key generation utility, or they are extracted from a .key file provided by the hidden service. In the former case, the client is also tasked with transferring the public keys to the hidden service in a secure out-of-band way.

E.2. Configuring client authorization using the control port

E.2.1. Service side

A hidden service also has the option to configure authorized clients using the control port. The idea is that hidden service operators can use controller utilities that manage their access control instead of using the filesystem to register client keys.

Specifically, we require a new control port command ADD_ONION_CLIENT_AUTH which is able to register x25519/ed25519 public keys tied to a specific authorized client.

   [XXX figure out control port command format]

Hidden services that use the control port interface for client auth need to perform their own key management.

E.2.2. Client side

There should also be a control port interface for clients to register authorization data for hidden services without having to use the torrc. It should allow both generation of client authorization private keys and import of client authorization data provided by a hidden service.

This way, Tor Browser can present "Generate client auth keys" and "Import client auth keys" dialogs to users when they try to visit a hidden service that is protected by client authorization.
Specifically, we require two new control port commands:

   IMPORT_ONION_CLIENT_AUTH_DATA
   GENERATE_ONION_CLIENT_AUTH_DATA

which import and generate client authorization data respectively.

   [XXX how does key management work here?]

   [XXX what happens when people use both the control port interface and the filesystem interface?]
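As a closing illustration for Appendix A.2 [KEYBLIND], the following Python sketch exercises the blinded key derivation and the signature check. Several details are assumptions made only for this example and are not fixed by the text above: H is taken to be SHA3-256, INT_8 is an 8-byte big-endian integer, B is hashed as the ASCII string of its decimal coordinates, hash(R,A',M) is SHA-512 reduced mod l, and r is derived by a simple stand-in rather than the real EdDSA procedure. All function names are hypothetical.

```python
import hashlib

P = 2**255 - 19
L = 2**252 + 27742317777372353535851937790883648493
D = (-121665 * pow(121666, -1, P)) % P
B = (15112221349535400772501151409588531511454012693041857206046113283949847762202,
     46316835694926478169428394003475163141307993866256225615783033603165251855960)
B_STRING = ("(%d, %d)" % B).encode("ascii")  # ASSUMPTION: how B is fed to H


def point_add(p1, p2):
    x1, y1 = p1
    x2, y2 = p2
    t = D * x1 * x2 * y1 * y2 % P
    return ((x1 * y2 + x2 * y1) * pow((1 + t) % P, -1, P) % P,
            (y1 * y2 + x1 * x2) * pow((1 - t) % P, -1, P) % P)


def scalar_mult(k, point):
    result = (0, 1)  # identity element
    while k > 0:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def encode_point(point):
    """Standard 32-byte encoding: little-endian y with the parity of x on top."""
    x, y = point
    return (y | ((x & 1) << 255)).to_bytes(32, "little")


def blinding_factor(A, period_number, period_length, secret=b""):
    nonce = (b"key-blind" + period_number.to_bytes(8, "big")
             + period_length.to_bytes(8, "big"))
    h = bytearray(hashlib.sha3_256(b"Derive temporary signing key"
                                   + encode_point(A) + secret
                                   + B_STRING + nonce).digest())
    h[0] &= 248
    h[31] &= 127
    h[31] |= 64
    return int.from_bytes(h, "little")


def blind_keypair(a, period_number, period_length, secret=b""):
    """Return (a', A') from the master private scalar a."""
    A = scalar_mult(a, B)
    h = blinding_factor(A, period_number, period_length, secret)
    return (h * a) % L, scalar_mult(h, A)


def blind_public_key(A, period_number, period_length, secret=b""):
    """Return A' knowing only the master public key A."""
    return scalar_mult(blinding_factor(A, period_number, period_length, secret), A)


def challenge(R, A_blind, message):
    digest = hashlib.sha512(encode_point(R) + encode_point(A_blind) + message).digest()
    return int.from_bytes(digest, "little") % L


def sign(a_blind, A_blind, message):
    # Stand-in for the deterministic r of EdDSA; NOT the real derivation.
    seed = a_blind.to_bytes(32, "little") + message
    r = int.from_bytes(hashlib.sha512(seed).digest(), "little") % L
    R = scalar_mult(r, B)
    return R, (r + challenge(R, A_blind, message) * a_blind) % L


def verify(A_blind, message, signature):
    R, S = signature
    expected = point_add(R, scalar_mult(challenge(R, A_blind, message), A_blind))
    return scalar_mult(S, B) == expected
```

The property worth noticing is that blind_public_key needs only A, while blind_keypair needs a, and both arrive at the same A'. This is the "given only the public key and the nonce" requirement from A.1.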
Filename: 225-strawman-shared-rand.txt
Title: Strawman proposal: commit-and-reveal shared rng
Author: Nick Mathewson
Created: 2013-11-29
Status: Superseded
Superseded-by: 250

1. Introduction

This is a strawman proposal: I don't think we should actually build it. It's just a simple writeup of the more trivial commit-then-reveal protocol for generating a shared random value. It's insecure to the extent that an adversary who controls b of the authorities gets to choose among 2^b outcomes for the result of the protocol.

See proposal 224, section HASHRING for some motivation of why we want one of these in Tor.

Let's do better!

[TODO: Are we really stuck with Tor's nasty metaformat here?]

2. The protocol

Here's a protocol for producing a shared random value. It should run less frequently than the directory consensus algorithm. It runs in these phases.

   1. COMMITMENT
   2. REVEAL
   3. COMPUTE SHARED RANDOM

It should be implemented by software other than Tor, which should be okay for authorities.

Note: This is not a great protocol. It has a number of failure modes. Better protocols seem hard to implement, though, and it ought to be possible to drop in a replacement here, if we do it right.

At the start of phase 1, each participating authority publishes a statement of the form:

   shared-random 1
   shared-random-type commit
   signing-key-certification (certification here; see proposal 220)
   commitment-key-certification (certification here; see proposal 220)
   published YYYY-MM-DD HH:MM:SS
   period-start YYYY-MM-DD HH:MM:SS
   attempt INT
   commitment sha512 C
   signature (made with commitment key; see proposal 220)

The signing key is the one used for consensus votes, signed by the directory authority identity key. The commitment key is used for this protocol only. The signature is made with the commitment key. The period-start value is the start of the period for which the shared random value should be in use. The attempt value starts at 1, and increments by 1 for each time that the protocol fails.
The other fields should be self-explanatory.

The commitment value C is a base64-encoded SHA-512 hash of a 256-bit random value R.

During the rest of phase 1, every authority collects the commitments from other authorities, and publishes them to other authorities, as they do today with directory votes.

At the start of phase 2, each participating authority publishes:

   shared-random 1
   shared-random-type reveal
   signing-key-certification (certification here; see proposal 220)
   commitment-key-certification (certification here; see proposal 220)
   received-commitment ID sig
   received-commitment ID sig
   published YYYY-MM-DD HH:MM:SS
   period-start YYYY-MM-DD HH:MM:SS
   attempt INT
   commitment sha512 C
   reveal R
   signature (made with commitment key; see proposal 220)

The R value is the one used to generate C. The received-commitment lines are the signatures on the documents from other authorities in phase 1. All other fields are as in the commitments.

During the rest of phase 2, every authority collects the reveals from other authorities, as above with commitments.

At the start of phase 3, each participating authority either has a reveal from every authority that it received a commitment from, or it does not. Each participating authority then says

   shared-random 1
   shared-random-type finish
   signing-key-certification (certification here; see proposal 220)
   commitment-key-certification (certification here; see proposal 220)
   received-commitment ID sig R
   received-commitment ID sig R
   ...
   published YYYY-MM-DD HH:MM:SS
   period-start YYYY-MM-DD HH:MM:SS
   attempt INT
   consensus C
   signature (made with commitment key; see proposal 220)

Where C = SHA256(ID | R | ID | R | ID | R | ...), where the ID values appear in ascending order and the R values appear after their corresponding ID values.

See [SHAREDRANDOM-REFS] for more discussion here.

(TODO: should this be its own spec? If so, does it have to use our regular metaformat or can it use something less sucky?)
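The arithmetic of the three phases can be sketched in a few lines of Python. This is a minimal sketch that ignores the document format, certifications and signatures entirely; the byte encodings of ID and R, and all function names, are assumptions made for the example.

```python
import base64
import hashlib
import secrets


def new_reveal() -> bytes:
    """Pick the 256-bit random value R."""
    return secrets.token_bytes(32)


def make_commitment(reveal: bytes) -> str:
    """Phase 1: C is the base64-encoded SHA-512 hash of R."""
    return base64.b64encode(hashlib.sha512(reveal).digest()).decode("ascii")


def check_reveal(commitment: str, reveal: bytes) -> bool:
    """Phase 2: accept R only if it hashes to the earlier commitment."""
    return make_commitment(reveal) == commitment


def shared_random(reveals: dict) -> bytes:
    """Phase 3: SHA256(ID | R | ID | R | ...) with IDs in ascending order.

    `reveals` maps each authority ID (bytes) to its revealed R (bytes).
    """
    h = hashlib.sha256()
    for authority_id in sorted(reveals):
        h.update(authority_id)
        h.update(reveals[authority_id])
    return h.digest()
```

Sorting by ID is what lets every authority arrive at the same value regardless of the order in which reveals were received. The weakness named in the introduction is also visible here: an authority that sees the other reveals first can decide whether to publish its own R, which gives it a choice between outcomes.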
Filename: 226-bridgedb-database-improvements.txt
Title: "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS"
Author: Isis Agora Lovecruft
Created: 12 Oct 2013
Related Proposals: XXX-social-bridge-distribution.txt
Status: Reserve

* I. Overview

BridgeDB is Tor's Bridge Distribution system, which currently has two major Bridge Distribution mechanisms: the HTTPS Distributor and an Email Distributor. [0]

BridgeDB is written largely in Twisted Python, and uses Python2's builtin sqlite3 as its database backend. Unfortunately, this backend system is already showing strain through increased times for queries, and sqlite's memory usage is not up-to-par with modern, more efficient, NoSQL databases.

In order to better facilitate the implementation of newer, more complex Bridge Distribution mechanisms, several improvements should be made to the underlying database system of BridgeDB. Additionally, I propose that a clear distinction in terms, as well as a modularisation of the codebase, be drawn between the mechanisms for Bridge Distribution versus the backend Bridge Database (BridgeDB) storage system.

This proposal covers the design and implementation of a scalable NoSQL (Document-Based and Key-Value Relational) database backend for storing data on Tor Bridge relays, in an efficient manner that is amenable to interfacing with the Twisted Python asynchronous networking code of current and future Bridge Distribution mechanisms.

* II. Terminology

BridgeDistributor := A program which decides when and how to hand out information on a Tor Bridge relay, and to whom.

BridgeDB := The backend system of databases and object-relational mapping servers, which interfaces with the BridgeDistributor in order to hand out bridges to clients, and to obtain and process new, incoming ``@type bridge-server-descriptors``, ``@type bridge-networkstatus`` documents, and ``@type bridge-extrainfo`` descriptors. [3]

BridgeFinder := A client-side program for an Onion Proxy (OP) which handles interfacing with a BridgeDistributor in order to obtain new Bridge relays for a client. A BridgeFinder also interfaces with a local Tor Controller (such as TorButton or ARM) to handle automatic, transparent Bridge configuration (no more copy+pasting into a torrc) without being given any additional privileges over the Tor process, [1] and relies on the Tor Controller to interface with the user for control input and displaying up-to-date information regarding available Bridges, Pluggable Transport methods, and potentially Invite Tickets and Credits (a cryptographic currency without fiat value which is generated automatically by clients whose Bridges remain largely uncensored, and is used to purchase new Bridges), should a Social Bridge Distributor be implemented. [2]

* III. Databases

** III.A. Scalability Requirements

Databases SHOULD be implemented in a manner which is amenable to using a distributed storage system; this is necessary because many potential datatypes required by future BridgeDistributors MUST be stored permanently. For example, in the designs for the Social Bridge Distributor, the list of hash digests of spent Credits, and the list of hash digests of redeemed Invite Tickets MUST be stored forever to prevent either from being replayed (or double-spent) by a malicious user who wishes to block bridges faster. Designing the BridgeDB backend system such that additional nodes may be added in the future will allow the system to freely scale in relation to the storage requirements of future BridgeDistributors.
Additionally, requiring that the implementation allow for distributed database backends promotes modularisation of the components of BridgeDB, such that BridgeDistributors can be separated from the backend storage system, BridgeDB, as all queries will be issued through a simplified, common API, regardless of the number of nodes in the system, or the design of future BridgeDistributors.

*** 1. Distributed Database System

A distributed database system SHOULD be used for BridgeDB, in order to scale resources as the number of Tor bridge users grows. This database system is hereafter referred to as the DDBS.

The DDBS MUST be capable of working within Twisted's asynchronous framework. If possible, an Object-Relational Mapper (ORM) SHOULD be used to abstract the database backend's structure and query syntax from the Twisted Python classes which interact with it, so that the type of database may be swapped out for another with less code refactoring.

The DDBS SHALL be used for persistent storage of complex data structures such as the bridges, which MAY include additional information from both the `@type bridge-server-descriptor`s and the `@type bridge-extra-info` descriptors. [3]

**** 1.a. Choice of DDBS

CouchDB is chosen for its simple HTTP API, ease of use, speed, and official support for Twisted Python applications. [4] Additionally, its document-based data model is very similar to the current architecture of tor's Directory Server/Mirror system, in that an HTTP API is used to retrieve data stored within virtual directories. Internally, it uses JSON to store data and JavaScript as its query language, both of which are likely friendlier to various other components of the Tor Metrics infrastructure which sanitise and analyse portions of the Bridge descriptors. At the very least, friendlier than hardcoding raw SQL queries as Python strings.

**** 1.b. Data Structures which should be stored in a DDBS:

- RedactedDB - The Database of Blocked Bridges

  The RedactedDB will hold entries of bridges which have been discovered to be unreachable from BridgeDB's network vantage point, or have been reported unreachable by clients.

- BridgeDB - The Database of Bridges

  BridgeDB holds information on available Bridges, obtained via bridge descriptors and networkstatus documents from the BridgeAuthority. Because a Bridge may have multiple `ORPort`s and multiple `ServerTransportListenAddress`es, attaching additional data to each of these addresses would quickly become unwieldy, so the RedactedDB and BridgeDB SHOULD be kept separate. This additional data MAY include the following information on a blocking event:

    - Geolocational country code of the reported blocking event
    - Timestamp for when the blocking event was first reported
    - The method used for discovery of the block
    - and the believed mechanism which is causing the block

- User Credentials

  For the Social BridgeDistributor, these are rather complex, increasingly-growing, concatenations (or tuples) of several datatypes, including Non-Interactive Proofs-of-Knowledge (NIPK) of Commitments to k-TAA Blind Signatures, and NIPK of Commitments to a User's current number of Credits and timestamps of requests for Invite Tickets.

*** 2. Key-Value Relational Database Mapping Server

For simpler data structures which must be persistently stored, such as the list of hashes of previously seen Invite Tickets, or the list of previously spent Tokens, a Relational Database Mapping Server (RDBMS) SHALL be used for optimisation of queries.

Redis and Memcached are two examples of RDBMS which are well tested and are known to work well with Twisted. The major difference between the two is that Memcached stores data only within volatile memory, while Redis additionally supports commands for transferring objects into persistent, on-disk storage.
There are several support modules for interfacing with both Memcached and Redis from Twisted Python; see Twisted's MemCacheProtocol class [5] [6] or txyam [7] for Memcached, and txredis [8] or txredisapi [9] for Redis. Additionally, numerous big name projects both use Redis as part of their backend systems, and also provide helpful documentation on their own experience of the process of switching over to the new systems. [17] For non-Twisted Python Redis APIs, there is redis-py, which provides a connection pool that could likely be interfaced with from Twisted Python without too much difficulty. [10] [11]

**** 2.a. Data Structures which should be stored in a RDBMS

Simple, mostly-flat datatypes, and data which must be frequently indexed should be stored in a RDBMS, such as large lists of hashes, or arbitrary strings with assigned point-values (i.e. the "Uniform Mapping" for the current HTTPS BridgeDistributor).

For the Social BridgeDistributor, hash digests of the following datatypes SHOULD be stored in the RDBMS, in order to prevent double-spending and replay attacks:

- Invite Tickets

  These are anonymous, unlinkable, unforgeable, and verifiable tokens which are occasionally handed out to well-behaved Users by the Social BridgeDistributor to permit new Users to be invited into the system. When they are redeemed, the Social BridgeDistributor MUST store a hash digest of their contents to prevent replayed Invite Tickets.

- Spent Credits

  These are Credits which have already been redeemed for new Bridges. The Social BridgeDistributor MUST also store a hash digest of Spent Credits to prevent double-spending.

*** 3. Bloom Filters and Other Database Optimisations

In order to further decrease the need for lookups in the backend databases, Bloom Filters can be used to eliminate extraneous queries. However, this optimization would only be beneficial for string lookups, i.e. querying for a User's Credential, and SHOULD NOT be used for queries within any of the hash lists, i.e. the list of hashes of previously seen Invite Tickets. [14]

**** 3.a. Bloom Filters within Redis

It might be possible to use Redis' GETBIT and SETBIT commands to store a Bloom Filter within a Redis cache system; [15] doing so would offload the severe memory requirements of loading the Bloom Filter into memory in Python when inserting new entries, reducing the time complexity from some polynomial time complexity that is proportional to the integral of the number of bridge users over the rate of change of bridge users over time, to a time complexity of order O(1).

**** 3.b. Expiration of Stale Data

Some types of data SHOULD be safe to expire, such as User Credentials which have not been updated within a certain timeframe. This idea should be further explored to assess the safety and potential drawbacks to removing old data.

If there is data which SHOULD expire, the PEXPIREAT command provided by Redis for the key datatype would allow the RDBMS itself to handle cleanup of stale data automatically. [16]

**** 4. Other potential uses of the improved Bridge database system

Redis provides mechanisms for evaluations to be made on data by calling the sha1 for a serverside Lua script. [15] While not required in the slightest, it is a rather neat feature, as it would allow Tor's Metrics infrastructure to offload some of the computational overhead of gathering data on Bridge usage to BridgeDB (as well as diminish the security implications of storing Bridge descriptors).

Also, if Twisted's IProducer and IConsumer interfaces do not provide needed interface functionality, or it is desired that other components of the Tor software ecosystem be capable of scheduling jobs for BridgeDB, there are well-tested mechanisms for using Redis as a message queue/scheduling system. [16]

* References

[0]: https://bridges.torproject.org
     mailto:bridges@bridges.torproject.org
[1]: See proposals 199-bridgefinder-integration.txt at
     https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/199-bridgefinder-integration.txt
[2]: See XXX-social-bridge-distribution.txt at
     https://gitweb.torproject.org/user/isis/bridgedb.git/blob/refs/heads/feature/7520-social-dist-design:/doc/proposals/XXX-bridgedb-social-distribution.txt
[3]: https://metrics.torproject.org/formats.html#descriptortypes
[4]: https://github.com/couchbase/couchbase-python-client#twisted-api
[5]: https://twistedmatrix.com/documents/current/api/twisted.protocols.memcache.MemCacheProtocol.html
[6]: http://stackoverflow.com/a/5162203
[7]: http://findingscience.com/twisted/python/memcache/2012/06/09/txyam:-yet-another-memcached-twisted-client.html
[8]: https://pypi.python.org/pypi/txredis
[9]: https://github.com/fiorix/txredisapi
[10]: https://github.com/andymccurdy/redis-py/
[11]: http://degizmo.com/2010/03/22/getting-started-redis-and-python/
[12]: http://www.dr-josiah.com/2012/03/why-we-didnt-use-bloom-filter.html
[13]: http://redis.io/topics/data-types §"Strings"
[14]: http://redis.io/commands/pexpireat
[15]: http://redis.io/commands/evalsha
[16]: http://www.restmq.com/
[17]: https://www.mediawiki.org/wiki/Redis
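To make the idea in section III.3.a more concrete, here is a minimal Python sketch of a Bloom filter that touches its storage only through setbit and getbit operations. The InMemoryBitStore class is a hypothetical stand-in for a Redis string so that the example runs without a server; the filter size, the number of hash functions and the SHA-256-based hashing are likewise assumptions chosen for illustration, not values from this proposal.

```python
import hashlib


class InMemoryBitStore:
    """Hypothetical stand-in for a Redis string driven by SETBIT/GETBIT."""

    def __init__(self):
        self._bits = bytearray()

    def setbit(self, offset, value):
        """Set one bit and return its previous value."""
        index, bit = divmod(offset, 8)
        if index >= len(self._bits):
            self._bits.extend(bytes(index + 1 - len(self._bits)))
        mask = 0x80 >> bit  # most significant bit of each byte comes first
        previous = 1 if self._bits[index] & mask else 0
        if value:
            self._bits[index] |= mask
        else:
            self._bits[index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset):
        """Return one bit; offsets past the end read as 0."""
        index, bit = divmod(offset, 8)
        if index >= len(self._bits):
            return 0
        return 1 if self._bits[index] & (0x80 >> bit) else 0


class BloomFilter:
    def __init__(self, store, num_bits=1 << 16, num_hashes=4):
        self.store = store
        self.num_bits = num_bits
        self.num_hashes = num_hashes

    def _offsets(self, item: bytes):
        # Derive num_hashes bit positions by prefixing a counter byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes):
        for offset in self._offsets(item):
            self.store.setbit(offset, 1)

    def might_contain(self, item: bytes) -> bool:
        """False means definitely absent; True means a backend lookup is needed."""
        return all(self.store.getbit(offset) for offset in self._offsets(item))
```

Each insertion or query touches a fixed number of bits, independent of how many entries the filter holds, which is the constant-time behaviour the section is after. A False answer lets the caller skip the backend query; a True answer still has to be confirmed against the database because false positives are possible.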
Filename: 227-vote-on-package-fingerprints.txt
Title: Include package fingerprints in consensus documents
Author: Nick Mathewson, Mike Perry
Created: 2014-02-14
Status: Closed
Implemented-In: 0.2.6.3-alpha

0. Abstract

We propose extending the Tor consensus document to include digests of the latest versions of one or more package files, to allow software using Tor to determine its up-to-dateness, and help users verify that they are getting the correct software.

1. Introduction

To improve the integrity and security of updates, it would be useful to provide a way to authenticate the latest versions of core Tor software through the consensus. By listing a location with this information for each version of each package, we can augment the update process of Tor software to authenticate the packages it downloads through the Tor consensus.

2. Proposal

We introduce a new line for inclusion in votes and consensuses. Its format is:

   "package" SP PACKAGENAME SP VERSION SP URL SP DIGESTS NL

   PACKAGENAME = NONSPACE
   VERSION = NONSPACE
   URL = NONSPACE
   DIGESTS = DIGEST | DIGESTS SP DIGEST
   DIGEST = DIGESTTYPE "=" DIGESTVAL

   NONSPACE = one or more non-space printing characters

   DIGESTVAL = DIGESTTYPE = one or more non-=, non-" " characters.

   SP = " "
   NL = a newline

Votes and consensuses may include any number of "package" lines, but no vote or consensus may include more than one "package" line with the same PACKAGENAME and VERSION values. All "package" lines must be sorted by "PACKAGENAME VERSION", in lexical (strcmp) order.

(If a vote contains multiple entries with the same PACKAGENAME and VERSION, then only the last one is considered.)

If the consensus-method is at least 19, then when computing the consensus, package lines for a given PACKAGENAME/VERSION pair should be included if at least three authorities list such a package in their votes. (Call these lines the "input" lines for PACKAGENAME.) That consensus should contain every "package" line that is listed verbatim by more than half of the authorities listing a line for the PACKAGENAME/VERSION pair, and no others.

These lines appear immediately following the client-versions and server-versions lines.

3. Recommended usage

Programs that want to use this facility should pick their PACKAGENAME values, and arrange to have their versions listed in the consensus by at least three friendly authority operators.

Programs may want to have multiple PACKAGENAME values in order to keep separate lists. These lists could correspond to how the software is used (as tor has client-versions and server-versions); or to a release series (as in tbb-alpha, tbb-beta, and tbb-stable); or to how bad it is to use versions not listed (as in foo-noknownexploits, foo-recommended).

Programs MUST NOT use "package" lines from consensuses that have not been verified and accepted as valid according to the rules in dir-spec.txt, and SHOULD NOT fetch their own consensuses if there is a tor process also running that can fetch the consensus itself.

For safety, programs MAY want to disable functionality until confirming that their versions are acceptable.

To avoid synchronization problems, programs that use the DIGEST field to store a digest of the contents of the URL SHOULD NOT use any URLs whose contents are expected to change while any valid consensus lists them.

3.1. Intended usage by the Tor Browser Bundle

Tor Browser Bundle packages will be listed with package names 'tbb-stable', 'tbb-beta', and 'tbb-alpha'. We will list a line for the latest version of each release series.

When the updater downloads a new update, it always downloads the latest version of the Tor Browser Bundle. Because of this, and because we will only use these lines to authenticate updates, we should not need to list more than one version per series in the consensus.
After completing a package download and verifying the download signatures (which are handled independently from the Tor Consensus), it will consult the appropriate current consensus document through the control port.

If the current consensus timestamp is not yet more recent than the proposed update timestamp, the updater will delay installing the package until a consensus timestamp that is more recent than the update timestamp has been obtained by the Tor client.

If the consensus document has a package line for the current release series with a matching version, it will then download the file at the specified URL, and then compute its hash to make sure it matches the value in the consensus.

If the hash matches, the Tor Browser will download the file and parse its contents, which will be a JSON file which lists information needed to verify the hashes of the downloaded update file.

If the hash does not match, the Tor Browser Bundle should display an error to the user and not install the package.

If there are no package lines in the consensus for the expected version, the updater will delay installing the update (but the bundle should still inform the user they are out of date and may update manually).

If there are no package lines in the consensus for the current release series at all, the updater should install the package using only normal signature verification.

4. Limitations and open questions

This proposal won't tell users how to upgrade, or even exactly what version to upgrade to.

If software is so broken that it won't start at all, or shouldn't be started at all, this proposal can't help with that.

This proposal is not a substitute for a proper software update tool.
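For readers who want to experiment with the rules in section 2, the following Python sketch parses "package" lines and applies the consensus rule to a set of votes. It is a simplified illustration: it does not check the sort order of lines within a vote, it takes lines without the trailing NL, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def parse_package_line(line):
    """Split one "package" line into (name, version, url, digests)."""
    parts = line.split(" ")
    if len(parts) < 5 or parts[0] != "package" or "" in parts:
        raise ValueError("malformed package line")
    name, version, url = parts[1:4]
    digests = {}
    for item in parts[4:]:
        digest_type, sep, digest_val = item.partition("=")
        if not sep or not digest_type or not digest_val:
            raise ValueError("malformed digest: %r" % item)
        digests[digest_type] = digest_val
    return name, version, url, digests


def consensus_package_lines(votes, min_voters=3):
    """Compute the consensus "package" lines from per-authority votes.

    `votes` is a list with one list of package lines per authority.
    """
    listed = defaultdict(list)
    for vote in votes:
        last_seen = {}
        for line in vote:
            name, version, _url, _digests = parse_package_line(line)
            # Within one vote, only the last line for a pair is considered.
            last_seen[(name, version)] = line
        for pair, line in last_seen.items():
            listed[pair].append(line)

    chosen = []
    for (name, version), lines in listed.items():
        if len(lines) < min_voters:
            continue  # fewer than three authorities list this pair
        for line, count in Counter(lines).items():
            # Keep lines listed verbatim by more than half of those voters.
            if 2 * count > len(lines):
                chosen.append((name + " " + version, line))
    return [line for _key, line in sorted(chosen)]
```

Since a line must be listed by more than half of the authorities voting on a pair, at most one line per PACKAGENAME/VERSION pair can survive, which matches the rule that a consensus may not contain two lines for the same pair.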
Filename: 228-cross-certification-onionkeys.txt Title: Cross-certifying identity keys with onion keys Author: Nick Mathewson Created: 25 February 2014 Status: Closed 0. Abstract In current Tor router descriptor designs, routers prove ownership of an identity key (by signing the router descriptors), but not of their onion keys. This document describes a method for them to do so. 1. Introduction. Signing router descriptors with identity keys prevents attackers from impersonating a server and advertising their own onion keys and IP addresses. That's good. But there's nothing in Tor right now that effectively stops you (an attacker) from listing somebody else's public onion key in your descriptor. If you do, you can't actually recover any keys negotiated using that key, and you can't MITM circuits made with that key (since you don't have the private key). (You _could_ do something weird in the TAP protocol where you receive an onionskin that you can't process, relay it to the party who can process it, and receive a valid reply that you could send back to the user. But this makes you a less effective man-in-the-middle than you would be if you had just generated your own onion key. The ntor protocol shuts down this possibility by including the router identity in the material to be hashed, so that you can't complete an ntor handshake unless the client agrees with you about what identity goes with your ntor onion key.) Nonetheless, it's probably undesirable that this is possible at all. Just because it isn't obvious today how to exploit this doesn't mean it will never be possible. 2. Cross-certifying identities with onion keys 2.1. What to certify Once proposal 220 is implemented, we'll sign our Ed25519 identity key as described in proposal 220. Since the Ed25519 identity key certifies the RSA key, there's no strict need to certify both separately. On the other hand, this proposal may be implemented before proposal 220. 
If it is, we'll need a way for it to certify the RSA1024 key too.

2.2. TAP onion keys

We add to each router descriptor a new element, "onion-key-crosscert", containing an RSA signature of:

    A SHA1 hash of the identity key [20 bytes]
    The Ed25519 identity key, if any [32 bytes]

If there is no ed25519 identity key, or if in some future version there is no RSA identity key, the corresponding field must be zero-filled.

Parties verifying this signature MUST allow additional data beyond the 52 bytes listed above.

2.3. ntor onion keys

Here, we need to convert the ntor key to an ed25519 key for signing. See appendix A for how to do that. We'll also need to transmit a sign bit.

We can add an element "ntor-onion-key-crosscert", containing an Ed25519 certificate in the format from proposal 220 section 2.1, with a sign indicator to indicate which ed25519 public key to use to check the key:

    "ntor-onion-key-crosscert" SP SIGNBIT SP CERT NL

    SIGNBIT = "0" / "1"

Note that this cert format has 32 bytes of redundant data, since it includes the identity key an extra time. That seems okay to me.

The signed key here is the master identity key.

The TYPE field in this certificate should be set to

    [0A] - ntor onion key cross-certifying ntor identity key

3. Authority behavior

Authorities should reject any router descriptor with an invalid onion-key-crosscert element or ntor-onion-key-crosscert element.

Both elements should be required on any cert containing an ed25519 identity key.

See section 3.1 of proposal 220 for rules requiring routers to eventually have ed25519 keys.

4. Performance impact

Routers do not generate new descriptors frequently enough for the extra signing operations required here to have an appreciable effect on their performance.

Checking an extra ed25519 signature when parsing a descriptor is very cheap, since we can use batch signature checking.
The point decompression algorithm will require us to calculate 1/(u+1), which costs as much as an exponentiation in GF(2^255-19).

Checking an RSA1024 signature is also cheap, since we use small public exponents.

Adding an extra RSA signature and an extra ed25519 signature to each descriptor will make each descriptor, after compression, about 128+100 bytes longer. (Compressed base64-encoded random bytes are about as long as the original random bytes.) Most clients don't download raw descriptors, though, so it shouldn't matter too much.

A. Converting a curve25519 public key to an ed25519 public key

Given a curve25519 x-coordinate (u), we can get the y coordinate of the ed25519 key using

    y = (u-1)/(u+1)

and then we can apply the usual ed25519 point decompression algorithm to find the x coordinate of the ed25519 point to check signatures with.

Note that we need the sign of the X coordinate to do this operation; otherwise, we'll have two possible X coordinates that might correspond to the key. Therefore, we need the 'sign' of the X coordinate, as used by the ed25519 key expansion algorithm.

To get the sign, the easiest way is to take the same private key, feed it to the ed25519 public key generation algorithm, and see what the sign is.

B. Security notes

It would be very bad for security if we provided a Diffie-Hellman oracle for our curve25519 ntor keys. Fortunately, we don't, since nobody else can influence the certificate contents.

C. Implementation notes

As implemented in Tor, I've decided to make this proposal cross-dependent on proposal 220. A router descriptor must have ALL or NONE of the following:

    * An Ed25519 identity key
    * A TAP cross-certification
    * An ntor cross-certification

Further, if it has the above, it must also have:

    * An ntor onion key.
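The conversion in appendix A of proposal 228 can be sketched in a few lines of Python. This is an illustration only, not Tor's implementation: the function names are hypothetical, and the sketch stops at producing the 32-byte ed25519 public key encoding (y in little-endian order with SIGNBIT in the top bit); recovering the x coordinate is left to the usual ed25519 decompression routine.

```python
# Field prime shared by curve25519 and ed25519.
P = 2**255 - 19


def curve25519_u_to_ed25519_y(u: int) -> int:
    """Map a Montgomery u-coordinate to the Edwards y-coordinate.

    Computes y = (u-1)/(u+1) mod P, as in appendix A.
    """
    if (u + 1) % P == 0:
        raise ValueError("u = -1 has no corresponding Edwards y-coordinate")
    # Modular inverse via Fermat's little theorem; this is the
    # exponentiation whose cost section 4 mentions.
    inverse = pow(u + 1, P - 2, P)
    return ((u - 1) * inverse) % P


def encode_ed25519_public(u_bytes: bytes, signbit: int) -> bytes:
    """Build a 32-byte ed25519 public key from a curve25519 key and SIGNBIT.

    Hypothetical helper: y is stored little-endian and the sign of x
    goes in the most significant bit of the last byte.
    """
    if len(u_bytes) != 32 or signbit not in (0, 1):
        raise ValueError("expected a 32-byte key and a sign bit of 0 or 1")
    # curve25519 ignores the top bit of the encoded u-coordinate.
    u = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)
    y = curve25519_u_to_ed25519_y(u % P)
    encoded = bytearray(y.to_bytes(32, "little"))
    encoded[31] |= signbit << 7
    return bytes(encoded)
```

As a sanity check, the curve25519 base point (u = 9) maps to y = 4/5, which is the y coordinate of the ed25519 base point.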
Filename: 229-further-socks5-extensions.txt Title: Further SOCKS5 extensions Author: Yawning Angel Created: 25-Feb-2014 Status: Rejected Note: These are good ideas, but it's better not to hack SOCKS any further now that we support HTTP CONNECT tunnels. 0. Abstract We propose extending the SOCKS5 protocol to allow passing more per-session metadata, and to allow returning more meaningful response failure codes back to the client. 1. Introduction The SOCKS5 protocol is used by Tor both as the primary interface for applications to transfer data, and as the interface by which Tor communicates with pluggable transport implementations. While the current specifications allow for passing a limited amount of per-session metadata via hijacking the Username/Password authentication method fields, this solution is limited in that the amount of payload that can be conveyed is restricted to 510 bytes, does not allow the SOCKS server to return a response, and precludes using authentication on the SOCKS port. The first part of this proposal defines a new authentication method to overcome both of these limitations. The second part of this proposal defines a range of SOCKS5 response codes that can be used to signal Tor specific error conditions when processing SOCKS requests. 2. Proposal 2.1. Tor Extended SOCKS5 Authentication We introduce a new authentication method to the SOCKS5 protocol. The METHOD number to be returned to indicate support for or select this method is X'97', which belongs to the "RESERVED FOR PRIVATE METHODS" range in RFC 1928. After the authentication method has been negotiated following the standard SOCKS5 protocol, the actual authentication phase begins. If any requirement labeled with a "MUST" below in this protocol is violated, the party receiving the violation MUST close the connection. All multibyte numeric values in this protocol MUST be transmitted in network (big-endian) byte order. 
The initiator will send an Extended Authentication request:

    +----+----------+-------+-------------+-------+-------------+---
    |VER | NR PAIRS | KLEN1 |    KEY1     | VLEN1 |   VALUE1    | ...
    +----+----------+-------+-------------+-------+-------------+---
    | 1  |    2     |   2   | KLEN1 bytes |   2   | VLEN1 bytes | ...
    +----+----------+-------+-------------+-------+-------------+---

VER: 8 bits (unsigned integer)

    This field specifies the version of the authentication method. It MUST be set to X'01'.

NR PAIRS: 16 bits (unsigned integer)

    This field specifies the number of key/value pairs to follow.

KLEN: 16 bits (unsigned integer)

    This field specifies the length of the key in bytes. It MUST be greater than 0.

KEY: variable length

    This field contains the key associated with the subsequent VALUE field as an ASCII string, without a NUL terminator.

VLEN: 16 bits (unsigned integer)

    This field specifies the length of the value in bytes. It MAY be X'0000', in which case the corresponding VALUE field is omitted.

VALUE: variable length, optional

    The value corresponding to the KEY.

The responder will verify the contents of the Extended Authentication request and send the following response:

    +----+--------+----------+-------+-------------+-------+-------------+---
    |VER | STATUS | NR PAIRS | KLEN1 |    KEY1     | VLEN1 |   VALUE1    | ...
    +----+--------+----------+-------+-------------+-------+-------------+---
    | 1  |   1    |    2     |   2   | KLEN1 bytes |   2   | VLEN1 bytes | ...
    +----+--------+----------+-------+-------------+-------+-------------+---

VER: 8 bits (unsigned integer)

    This field specifies the version of the authentication method. It MUST be set to X'01'.

STATUS: 8 bits (unsigned integer)

    The status of the Extended Authentication request where:

    * X'00' SUCCESS
    * X'01' AUTHENTICATION FAILED
    * X'02' INVALID ARGUMENTS

    If a server sends a response indicating failure (STATUS value other than X'00') it MUST close the connection.

    [XXXX What should a client do if it gets a value here it does not recognize?]
NR PAIRS, KLEN, KEY, VLEN, VALUE:

    These fields have the same format as they do in Extended Authentication requests.

The currently defined KEYs are:

    * "USERNAME" The username for authentication.
    * "PASSWD" The password for authentication.

[XXXX What do these do? What is their behavior? Are they client-only? Right now, Tor uses SOCKS5 usernames and passwords in two ways: 1) as a way to control isolation, when receiving them from a SOCKS client. 2) as a way to encode arbitrary data, when sending data to a PT. Neither of these seems necessary any more. We can turn 1 into a new KEY, and we can turn 2 into a new set of keys. -NM]

[XXX - Add some more here, Stream isolation? -YA]

[XXXX What should a client do if it gets a value here it does not recognize? -NM]

[XXXX Should we recommend any namespace conventions for these? -NM]

2.2. Tor Extended SOCKS5 Reply Codes

We introduce the following additional SOCKS5 reply codes to be sent in the REP field of a SOCKS5 message. Implementations MUST NOT send any of the extended codes unless the initiator has indicated that it understands the "Tor Extended SOCKS5 Authentication" as part of the version identifier/method selection SOCKS5 message.

[Actually, should this perhaps be controlled by additional KEY? (I'm not sure.) -NM]

Where:

    * X'E0' Hidden Service Not Found

      The requested Tor Hidden Service was not found.

    * X'E1' Hidden Service Not Reachable

      The requested Tor Hidden Service was not reachable.

    * X'F0' Temporary Pluggable Transport failure, retry immediately

      Pluggable transports SHOULD return this status code if the connection attempt failed, but the pluggable transport believes that subsequent connections with the same parameters are likely to succeed.

      Example: The ScrambleSuit Session Ticket handshake failed, but reconnecting is likely to succeed as it will use the UniformDH handshake.

    * X'F1' Pluggable transport protocol failure, invalid bridge

      Pluggable transports MUST return this status code if the connection attempt failed in a manner that indicates that the remote peer is not likely to accept connections at a later time.

      Example: The obfs3 handshake failed.

    * X'F2' Pluggable transport internal error

      Pluggable transports SHOULD return this status code if the connection attempt failed due to an internal error in the pluggable transport implementation. Tor might wish to restart the pluggable transport executable, or retry after a delay.

3. Compatibility

SOCKS5 negotiates authentication methods, so backward and forward compatibility is obtained for free, assuming a non-broken SOCKS5 implementation on the responder side that ignores unrecognised authentication methods in the negotiation phase.

4. Security Considerations

The same security considerations as for RFC 1929 Username/Password authentication apply when doing Username/Password authentication using the keys reserved for such.

As SOCKS5 is sent in cleartext, this extension (like the rest of the SOCKS5 protocol) MUST NOT be used in scenarios where sniffing is possible. The authors of this proposal note that binding any of the Tor (and associated) SOCKS5 servers to non-loopback interfaces is strongly discouraged currently, so in the current model this is believed to be acceptable.

5. References

Leech, M., Ganis, M., Lee, Y., Kuris, R., Koblas, D., Jones L., "SOCKS Protocol Version 5", RFC 1928, March 1996.

Tor Project, "Tor's extensions to the SOCKS protocol"

Leech, M. "Username/Password Authentication for SOCKS V5", RFC 1929, March 1996.

Appelbaum, J., Mathewson, N., "Pluggable Transport Specification", June 2012.

[XXX - Changelog (Remove when accepted) -YA]

2014-02-28 (Thanks to nickm/arma)

    * Generalize to also support tor
    * Add status codes for bug #6031
    * Switch from always having a username/password field to making them just predefined keys.
    * Change the auth method number to 0x97

2014-02-28 (nickm's fault)

    * check it into git
    * clean text a little, fix redundancy
    * ask some questions
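The Extended Authentication request layout from section 2.1 of proposal 229 can be sketched as a small serializer. This is a minimal illustration, not code from any Tor or pluggable transport implementation; the function name and the choice to pass pairs as a list of (str, bytes) tuples are assumptions made for the example.

```python
import struct

EXT_AUTH_VERSION = 0x01  # VER MUST be X'01'


def encode_ext_auth_request(pairs):
    """Serialize VER | NR PAIRS | (KLEN KEY VLEN VALUE)* for the request.

    `pairs` is a list of (key, value) tuples, with key a str and value
    bytes. All multibyte integers use network (big-endian) byte order.
    """
    if len(pairs) > 0xFFFF:
        raise ValueError("NR PAIRS must fit in 16 bits")
    chunks = [struct.pack("!BH", EXT_AUTH_VERSION, len(pairs))]
    for key, value in pairs:
        key_bytes = key.encode("ascii")  # KEY is ASCII, no NUL terminator
        if not 0 < len(key_bytes) <= 0xFFFF:
            raise ValueError("KLEN must be between 1 and 65535")
        if len(value) > 0xFFFF:
            raise ValueError("VLEN must fit in 16 bits")
        chunks.append(struct.pack("!H", len(key_bytes)) + key_bytes)
        # A VLEN of zero is allowed; the VALUE field is then omitted.
        chunks.append(struct.pack("!H", len(value)) + value)
    return b"".join(chunks)
```

For example, `encode_ext_auth_request([("USERNAME", b"alice"), ("PASSWD", b"")])` yields a 3-byte header followed by two length-prefixed pairs, the second with an empty VALUE.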
Filename: 230-rsa1024-relay-id-migration.txt
Title: How to change RSA1024 relay identity keys
Authors: Nick Mathewson
Created: 7 April 2014
Target: 0.2.?
Status: Obsolete

Note: Obsoleted by Ed25519 ID keys; superseded by 240 and 256.

1. Intro and motivation

Sometimes, a relay would like to migrate from one RSA1024 identity key to another without losing its previous status.

This is especially important because proposal 220 ("Migrate server identity keys to Ed25519") is not yet implemented, and so server identity keys are not kept offline. So when an OpenSSL bug like CVE-2014-0160 makes memory-reading attacks a threat to identity keys, we need a way for routers to migrate ASAP.

This proposal does not cover migrating RSA1024 OR identity keys for authorities.

2. Design

I propose that when a relay changes its identity key, it should include an "old-identity" field in its server descriptor for 60 days after the migration. This old-identity would include the old RSA1024 identity, a signature of the new identity key with the old one, and the date when the migration occurred.

This field would appear as an "old-id" field in microdescriptors, containing a SHA1 fingerprint of the old identity key, if the signature turned out to be valid.

Authorities would store old-identity => new-identity mappings, and:

    * Treat history information (wfu, mtbf, [and what else?]) from old identities as applying to new identities instead.

    * No longer accept any router descriptors signed by the old identity.

Clients would migrate any guard entries for the old identity to the new identity.

(This will break connections for clients who try to connect to the old identity key before learning about the new one, but the window there won't be large for any single router.)

3. Descriptor format details

Router descriptors may contain these new elements:

    "old-rsa1024-id-key" NL RSA_KEY NL

        Contains an old RSA1024 identity key. If this appears, old-rsa1024-id-migration must also appear. [At most once]

    "old-rsa1024-id-migration" SP ISO-TIME NL SIGNATURE NL

        Contains a signature of:

            The bytes "RSA1024 ID MIGRATION" [20 bytes]
            The ISO-TIME field above as an 8 byte field [8 bytes]
            A SHA256 hash of the new identity [32 bytes]

        If this appears, "old-rsa1024-id-key" must also appear. [At most once].

4. Interface

To use this feature, a router should rename its secret_id_key file to secret_id_key_OLD. The first time that Tor starts and finds a secret_id_key_OLD file, it generates a new ID key if one is not present, and generates the text of the old-rsa1024-id-key and old-rsa1024-id-migration fields above. It stores them in a new "old_id_key_migration" file, and deletes the secret_id_key_OLD file. It includes them in its descriptors.

Sixty days after the stored timestamp, the router deletes the "old_id_key_migration" file and stops including its contents in the descriptor.
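The 60-byte body signed in "old-rsa1024-id-migration" (section 3 of proposal 230) could be assembled roughly as follows. The proposal does not say how ISO-TIME becomes an 8-byte field or which encoding of the new identity is hashed, so this sketch assumes a big-endian 64-bit count of seconds since the Unix epoch and a SHA256 over the DER-encoded key; both choices, and the function name, are assumptions for illustration only.

```python
import calendar
import hashlib
import struct
import time

MIGRATION_PREFIX = b"RSA1024 ID MIGRATION"  # exactly 20 bytes


def migration_signed_body(iso_time: str, new_identity_der: bytes) -> bytes:
    """Assemble prefix (20) | time (8) | SHA256(new identity) (32).

    Assumptions: iso_time is "YYYY-MM-DD HH:MM:SS" in UTC, encoded as a
    big-endian 64-bit Unix timestamp; the hash covers the DER-encoded key.
    """
    parsed = time.strptime(iso_time, "%Y-%m-%d %H:%M:%S")
    timestamp = calendar.timegm(parsed)  # interpret the fields as UTC
    digest = hashlib.sha256(new_identity_der).digest()
    return MIGRATION_PREFIX + struct.pack("!Q", timestamp) + digest
```

The old identity key would then sign this body to produce the SIGNATURE object; the signing step is omitted here because it depends on the RSA library in use.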
Filename: 231-migrate-authority-rsa1024-ids.txt
Title: Migrating authority RSA1024 identity keys
Authors: Nick Mathewson
Created: 8 April 2014
Target: 0.2.?
Status: Obsolete

Note: Obsoleted by Ed25519 ID keys; superseded by 240 and 256.

1. Intro and motivation

We'd like for RSA1024 identity keys to die out entirely. But we may need to migrate authority identity keys before that happens.

This is especially important because proposal 220 ("Migrate server identity keys to Ed25519") is not yet implemented, and so server identity keys are not kept offline. So when an OpenSSL bug like CVE-2014-0160 makes memory-reading attacks a threat to identity keys, we need a way for authorities to migrate ASAP.

Migrating authority ID keys is a trickier problem than migrating router ID keys, since the authority RSA1024 keys are hardwired in the source. We use them to authenticate encrypted OR connections to authorities that we use to publish and retrieve directory information.

This proposal does not cover migrating RSA1024 OR identity keys for other nodes; for that, see proposal 230.

2. Design

When an authority is using a new RSA1024 key, it retains the old one in a "legacy_link_id_key" file. It uses this key to perform link protocol handshakes at its old address:port, and it uses the new key to perform link protocol handshakes at a new address:port.

This should be sufficient for all clients that expect the old address:port:fingerprint to work, while allowing new clients to use the correct address:port:fingerprint.

Authorities will sign their own router descriptors with their new identity key, and won't advertise the old port or fingerprint at all in their descriptors. This shouldn't break anything, so far as I know.

3. Implementation

We'll have a new flag on an ORPort: "LegacyIDKey". It implies NoAdvertise. If it is present, we use our LegacyIDKey for that ORPort, and that ORPort only, for all of:

    * The TLS certificate chains used in the v1 and v2 link protocol handshake.

    * The certificate chains and declared identity in the v3 link handshake.

    * Accepting ntor cells.

4. Open questions

On ticket #11448, Robert Ransom suggests that authorities may need to publish extra server descriptors for themselves, signed with the old identity key too. We should investigate whether clients will misbehave if they can't find such descriptors.

If that's the case, authorities should generate these descriptors, but not include them in votes or the consensus; or if they are included, don't assign them flags that will get them used.
Filename: 232-pluggable-transports-through-proxy.txt Title: Pluggable Transport through SOCKS proxy Author: Arturo Filastò Created: 28 February 2012 Status: Closed Implemented-In: 0.2.6 Overview Tor introduced Pluggable Transports in proposal "180 Pluggable Transports for circumvention". The problem is that Tor currently cannot use a pluggable transport proxy and a normal (SOCKS/HTTP) proxy at the same time. This has been noticed by users in #5195, where Tor would be failing saying "Unacceptable option value: You have configured more than one proxy type". Trivia This comes from a discussion that came up with Nick and I promised to write a proposal for it if I wanted to hear what he had to say. Nick spoke and I am writing this proposal. Acknowledgments Most of the credit goes to Nick Mathewson for the main idea and the rest of it goes to George Kadianakis for helping me out in writing it. Motivation After looking at some options we decided to go for this solution since it guarantees backwards compatibility and is not particularly costly to implement. Design overview When Tor is configured to use both a pluggable transport proxy and a normal proxy it should delegate the proxying to the pluggable transport proxy. This can be achieved by specifying the address and port of the normal proxy to the pluggable transport proxy using environment variables: When both a normal proxy and the ClientTransportPlugin directives are set in the torrc, Tor should put the address of the normal proxy in an environment variable and start the pluggable transport proxy. When the pluggable transport proxy starts, it should read the address of the normal proxy and route all its traffic through it. After connecting to the normal proxy, the pluggable transport proxy notifies Tor whether it managed to connect or not. The environment variables also contain the authentication credentials for accessing the proxy. 
Specifications: Tor Pluggable Transport communication When Tor detects a normal proxy directive and a pluggable transport proxy directive, it sets the environment variable: "TOR_PT_PROXY" -- This is the address of the proxy to be used by the pluggable transport proxy. It is in the format: <proxy_type>://[<user_name>][:<password>][@]<ip>:<port> ex. socks5://tor:test1234@198.51.100.1:8000 socks4a://198.51.100.2:8001 Acceptable values for <proxy_type> are: 'socks5', 'socks4a' and 'http'. If no <password> can be specified (e.g. in 'socks4a'), it is left out. If the pluggable transport proxy detects that the TOR_PT_PROXY environment variable is set, it attempts connecting to it. On success it writes to stdout: "PROXY DONE". On failure it writes: "PROXY-ERROR <errormessage>". If Tor does not read a PROXY line or it reads a PROXY-ERROR line from its stdout and it is configured to use both a normal proxy and a pluggable transport it should kill the transport proxy.
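A pluggable transport proxy might parse the TOR_PT_PROXY value along the following lines. This is a sketch, not the parser of any real transport: the function name and the returned dictionary layout are hypothetical, and it relies on Python's standard `urllib.parse.urlsplit` to split the URI.

```python
from urllib.parse import urlsplit

# Acceptable <proxy_type> values listed in the proposal.
ALLOWED_PROXY_TYPES = {"socks5", "socks4a", "http"}


def parse_tor_pt_proxy(value: str) -> dict:
    """Split <proxy_type>://[<user_name>][:<password>][@]<ip>:<port>."""
    parts = urlsplit(value)
    if parts.scheme not in ALLOWED_PROXY_TYPES:
        raise ValueError("unsupported proxy type: %r" % parts.scheme)
    # .port raises ValueError itself if the port is not a valid number.
    if parts.hostname is None or parts.port is None:
        raise ValueError("TOR_PT_PROXY must include an address and a port")
    return {
        "type": parts.scheme,
        "username": parts.username,  # None when no credentials are given
        "password": parts.password,  # None when left out (e.g. socks4a)
        "host": parts.hostname,
        "port": parts.port,
    }
```

A transport would typically call this on `os.environ.get("TOR_PT_PROXY")` at startup and write a PROXY-ERROR line to stdout if parsing or the subsequent connection attempt fails.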
Filename: 233-quicken-tor2web-mode.txt
Title: Making Tor2Web mode faster
Author: Virgil Griffith, Fabio Pietrosanti, Giovanni Pellerano
Created: 2014-03-27
Status: Rejected

1. Introduction

While chatting with the Tor archons at the winter meeting, two speed optimizations for tor2web mode [1] were put forward. This proposal concretizes these two optimizations. As with the current tor2web mode, this quickened tor2web mode negates any client anonymity.

2. Tor2web optimizations

2.1. Self-rendezvous

In the current tor2web mode, the client establishes a 1-hop circuit (direct connection) to a chosen rendezvous point. We propose that, alternatively, the client set *itself* as the rendezvous point. This coincides with ticket #9685 [2].

2.2. Direct-introduction

As in non-tor2web mode, in the current tor2web mode the client establishes a 3-hop circuit to the introduction point. We propose that, alternatively, the client builds a 1-hop circuit to the introduction point.

4. References

[1] Tor2web mode: https://trac.torproject.org/projects/tor/ticket/2553

[2] Self-rendezvous: https://trac.torproject.org/projects/tor/ticket/9685
Filename: 234-remittance-addresses.txt
Title: Adding remittance field to directory specification
Author: Virgil Griffith, Leif Ryge, Rob Jansen
Created: 2014-03-27
Status: Rejected

Note: Rejected. People are doing this with ContactInfo lines.

1. Motivation

We wish to add the ability for individual users to donate to relay operators using a cryptocurrency. We propose adding an optional line to the torrc file which will be published in the directory consensus and listed on https://compass.torproject.org.

2. Proposal

Allow an optional "RemittanceAddresses" line to the torrc file containing comma-delimited cryptocurrency URIs. The format is:

    RemittanceAddresses <currency1>:<address1>,<currency2>:<address2>

For an example using an actual bitcoin and namecoin address, this is:

    RemittanceAddresses bitcoin:19mP9FKrXqL46Si58pHdhGKow88SUPy1V8,namecoin:NAMEuWT2icj3ef8HWJwetZyZbXaZUJ5hFT

The contents of a relay's RemittanceAddresses line will be mirrored in the relay's router descriptor (which is then published in the directory consensus). This line will be treated akin to the ContactInfo field.

A cryptocurrency address may not contain a colon, comma, whitespace, or other nonprintable ASCII.

Like the ContactInfo line, there is no explicit length limit for RemittanceAddresses; the only limit is the length of the entire descriptor. If the relay lists multiple addresses of the same currency type (e.g., two bitcoin addresses), only the first (left-most) one of each currency is published in the directory consensus.
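The parsing rules in section 2 of proposal 234 (comma-delimited entries, one colon per entry, first address per currency wins) can be sketched as below. Since the proposal was rejected, no such parser exists in Tor; the function name and the use of a dictionary for the result are assumptions made for illustration.

```python
def parse_remittance_addresses(value: str) -> dict:
    """Parse "<currency1>:<address1>,<currency2>:<address2>".

    Returns a dict mapping each currency to its first (left-most)
    address; later addresses for the same currency are ignored.
    """
    published = {}
    for entry in value.split(","):
        currency, colon, address = entry.partition(":")
        if not colon or not currency or not address:
            raise ValueError("malformed entry: %r" % entry)
        # Addresses may not contain a colon, whitespace, or
        # nonprintable or non-ASCII characters (commas cannot occur
        # here because the value was already split on them).
        if (":" in address or not address.isascii()
                or not address.isprintable()
                or any(ch.isspace() for ch in address)):
            raise ValueError("invalid address: %r" % address)
        published.setdefault(currency, address)  # keep the first one only
    return published
```

Applied to the example line above, this returns one entry for `bitcoin` and one for `namecoin`; a second `bitcoin:` entry later in the line would be dropped.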
Filename: 235-kill-named-flag.txt Title: Stop assigning (and eventually supporting) the Named flag Authors: Sebastian Hahn Created: 10 April 2014 Implemented-In: 0.2.6, 0.2.7 Status: Closed 1. Intro and motivation Currently, Tor supports the concept of linking a Tor relay's nickname to its identity key. This happens automatically as a new relay joins the network with a unique nickname, and keeps it for a while. To indicate that a nickname is linked to the presented identity, the directory authorities vote on a Named flag for all relays where they have such a link. Not all directory authorities are currently doing this - in fact, there are only two, gabelmoo and tor26. For a long time, we've been telling everyone to not rely on relay nicknames, even if the Named flag is assigned. This has two reasons: First off, it adds another trust requirement on the directory authorities, and secondly naming may change over time as relays go offline for substantial amounts of time. Now that a significant portion of the network is required to rotate their identity keys, few relays will keep their Named flag. We should use this chance to stop assigning Named flags. 2. Design None so far, but we should review older-but-still-supported Tor versions (down to 0.2.2.x) for potential issues. In theory, Tor clients already support consensuses without Named flags, and testing in private Tor networks has never revealed any issues in this regard, but we're unsure if there might be some functionality that isn't typically tested with private networks and could get broken now. 3. Implementation The gabelmoo and tor26 directory authorities can simply remove the NamingAuthoritativeDirectory configuration option to stop giving out Named flags. This will mean the consensus won't include Named and Unnamed flags any longer. The code collecting naming statistics is independent of Tor, so it can run a while longer to ensure Naming can be switched on if unforeseen issues arise. 
Once this has been shown to not cause any issues, support for the Named flag can be removed from the Tor client implementation, and support for the NamingAuthoritativeDirectory can be removed from the Tor directory authority implementation. 4. Open questions None.
Filename: 236-single-guard-node.txt Title: The move to a single guard node Author: George Kadianakis, Nicholas Hopper Created: 2014-03-22 Status: Closed -1. Implementation-status Partially implemented, and partially superseded by proposal 271. 0. Introduction It has been suggested that reducing the number of guard nodes of each user and increasing the guard node rotation period will make Tor more resistant against certain attacks [0]. For example, an attacker who sets up guard nodes and hopes for a client to eventually choose them as their guard will have much less probability of succeeding in the long term. Currently, every client picks 3 guard nodes and keeps them for 2 to 3 months (since 0.2.4.12-alpha) before rotating them. In this document, we propose the move to a single guard per client and an increase of the rotation period to 9 to 10 months. 1. Proposed changes 1.1. Switch to one guard per client When this proposal becomes effective, clients will switch to using a single guard node. That is, in its first startup, Tor picks one guard and stores its identity persistently to disk. Tor uses that guard node as the first hop of its circuits from thereafter. If that Guard node ever becomes unusable, rather than replacing it, Tor picks a new guard and adds it to the end of the list. When choosing the first hop of a circuit, Tor tries all guard nodes from the top of the list sequentially till it finds a usable guard node. A Guard node is considered unusable according to section "5. Guard nodes" in path-spec.txt. The rest of the rules from that section apply here too. XXX which rules specifically? -asn XXX Probably the rules about how to add a new guard (only after contact), when to re-try a guard for reachability, and when to discard a guard? -nickhopper XXX Do we need to specify how already existing clients migrate? 1.1.1. Alternative behavior to section 1.1 Here is an alternative behavior than the one specified in the previous section. 
It's unclear which one is better. Instead of picking a new guard when the old guard becomes unusable, we pick a number of guards in the beginning but only use the top usable guard each time. When our guard becomes unusable, we move to the guard below it in the list. This behavior _might_ make some attacks harder; for example, an attacker who shoots down your guard in the hope that you will pick his guard next, is now forced to have evil guards in the network at the time you first picked your guards. However, this behavior might also influence performance, since a guard that was fast enough 7 months ago, might not be this fast today. Should we reevaluate our opinion based on the last consensus, when we have to pick a new guard? Also, a guard that was up 7 months ago might be down today, so we might end up sampling from the current network anyway. 1.2. Increase guard rotation period When this proposal becomes effective, Tor clients will set the lifetime of each guard to a random time between 9 to 10 months. If Tor tries to use a guard whose age is over its lifetime value, the guard gets discarded (also from persistent storage) and a new one is picked in its place. XXX We didn't do any analysis on extending the rotation period. For example, we don't even know the average age of guards, and whether all guards stay around for less than 9 months anyway. Maybe we should do some analysis before proceeding? XXX The guard lifetime should be controlled using the (undocumented?) GuardLifetime consensus option, right? 1.2.1. Alternative behavior to section 1.2 Here is an alternative behavior than the one specified in the previous section. It's unclear which one is better. Similar to section 1.2, but instead of rotating to completely new guard nodes after 9 months, we pick a few extra guard nodes in the beginning, and after 9 months we delete the already used guard nodes and use the one after them. This has approximately the same tradeoffs as section 1.1.1. 
Also, should we check the age of all of our guards periodically, or only check them when we try to use them?

1.3. Age of guard as a factor on guard probabilities

By increasing the guard rotation period we also increase the lack of utilization for young guards since clients will rotate guards even more infrequently now (see 'Phase three' of [1]).

We can mitigate this phenomenon by treating these recent guards as "fractional" guards:

To do so, every time an authority needs to vote for a guard, it reads a set of consensus documents spanning the past NNN months, where NNN is the number of months in the guard rotation period (10 months if this proposal is adopted in full), and counts in how many of those consensuses the relay has had the guard flag.

Then, in their votes, the authorities include the Guard Fraction of each guard by appending '[SP "GuardFraction=" INT]' in the guard's "w" line. Its value is an integer between 0 and 100, with 0 meaning that it's a brand new guard, and 100 that it has been present in all the inspected consensuses.

A guard N that has been visible for V out of NNN*30*24 consensuses has had the opportunity to be chosen as a guard by approximately F = V/(NNN*30*24) of the clients in the network, and the remaining 1-F fraction of the clients have not noticed this change. So when being chosen for middle or exit positions on a circuit, clients should treat N as if F fraction of its bandwidth is a guard (respectively, dual) node and (1-F) is a middle (resp, exit) node.

Let Wpf denote the weight from the 'bandwidth-weights' line a client would apply to N for position p if it had the guard flag, Wpn the weight if it did not have the guard flag, and B the measured bandwidth of N in the consensus. Then instead of choosing N for position p proportionally to Wpf*B or Wpn*B, clients should choose N proportionally to F*Wpf*B + (1-F)*Wpn*B.
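The guard fraction and the client-side selection weight from section 1.3 can be sketched as follows. This is an illustration under stated assumptions, not Tor's implementation: the function names are hypothetical, and the proposal does not specify how the fraction is rounded to an integer, so this sketch rounds down and caps the result at 100.

```python
def guard_fraction(visible_consensuses: int, rotation_months: int) -> int:
    """Return GuardFraction as an integer percentage between 0 and 100.

    V consensuses with the guard flag out of NNN*30*24 hourly
    consensuses. Rounding down is an assumption, not from the proposal.
    """
    total = rotation_months * 30 * 24
    return min(100, (100 * visible_consensuses) // total)


def selection_weight(fraction_percent: int, w_with_guard: float,
                     w_without_guard: float, bandwidth: float) -> float:
    """Return F*Wpf*B + (1-F)*Wpn*B for one relay and one position."""
    f = fraction_percent / 100.0
    return (f * w_with_guard * bandwidth
            + (1.0 - f) * w_without_guard * bandwidth)
```

With a 10-month rotation period, a relay seen as a guard in 3600 of the 7200 inspected consensuses gets GuardFraction=50, and its weight for a position is the average of its with-flag and without-flag weights.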
Similarly, when calculating the bandwidth-weights line as in section 3.8.3 of dir-spec.txt, directory authorities should treat N as if fraction F of its bandwidth has the guard flag and (1-F) does not. So when computing the totals G,M,E,D, each relay N with guard visibility fraction F and bandwidth B should be added as follows: G' = G + F*B, if N does not have the exit flag M' = M + (1-F)*B, if N does not have the exit flag D' = D + F*B, if N has the exit flag E' = E + (1-F)*B, if N has the exit flag 1.3.1. Guard Fraction voting To pass that information to clients, we introduce consensus method 19, where if 3 or more authorities provided GuardFraction values in their votes, the authorities produce a consensus containing a GuardFraction keyword equal to the low-median of the GuardFraction votes. The GuardFraction keyword is appended in the 'w' line of each router in the consensus, after the optional 'Unmeasured' keyword. Example: w Bandwidth=20 Unmeasured=1 GuardFraction=66 or w Bandwidth=53600 GuardFraction=99 1.4. Raise the bandwidth threshold for being a guard From dir-spec.txt: "Guard" -- A router is a possible 'Guard' if its Weighted Fractional Uptime is at least the median for "familiar" active routers, and if its bandwidth is at least median or at least 250KB/s. When this proposal becomes effective, authorities should change the bandwidth threshold for being a guard node to 2000KB/s instead of 250KB/s. Implications of raising the bandwidth threshold are discussed in section 2.3. XXX Is this insane? It's an 8-fold increase. 2. Discussion 2.1. Guard node set fingerprinting With the old behavior of three guard nodes per user, it was extremely unlikely for two users to have the same guard node set. Hence the set of guard nodes acted as a fingerprint to each user. When this proposal becomes effective, each user will have one guard node. 
We believe that this slightly reduces the effectiveness of this fingerprint since users who pick a popular guard node will now blend in with thousands of other users. However, clients who pick a slow guard will still have a small anonymity set [2]. All in all, this proposal slightly improves the situation of guard node fingerprinting, but does not solve it. See the next section for a suggested scheme that would further fix the guard node set fingerprinting problem 2.1.1. Potential fingerprinting solution: Guard buckets One of the suggested alternatives that moves us closer to solving the guard node fingerprinting problem, would be to split the list of N guard nodes into buckets of K guards, and have each client pick a bucket [3]. This reduces the fingerprint from N-choose-k to N/k guard set choices; it also allows users to have multiple guard nodes which provides reliability and performance. Unfortunately, the implementation of this idea is not as easy and its anonymity effects are not well understood so we had to reject this alternative for now. 2.2. What about 'multipath' schemes like Conflux? By switching to one guard, we rule out the deployment of 'multipath' systems like Conflux [4] which build multiple circuits through the Tor network and attempt to detect and use the most efficient circuits. On the other hand, the 'Guard buckets' idea outlined in section 2.1.1 works well with Conflux-type schemes so it's still worth considering. 2.3. Implications of raising the bandwidth threshold for guards By raising the bandwidth threshold for being a guard we directly affect the performance and anonymity of Tor clients. We performed a brief analysis of the implications of switching to one guard and the results imply that the changes are not tragic [2]. Specifically, it seems that the performance of about half of the clients will degrade slightly, but the performance of the other half will remain the same or even improve. 
Also, it seems that the powerful guard nodes of the Tor network have enough total bandwidth capacity to handle client traffic even if some slow guard nodes get discarded. On the anonymity side, by increasing the bandwidth threshold to 2MB/s we halve our guard nodes; we discard 1000 out of 2000 guards. Even if this seems like a substantial diversity loss, it seems that the 1000 discarded guard nodes had a very small chance of being selected in the first place (7% chance of any of them being selected). However, it's worth noting that the performed analysis was quite brief and the implications of this proposal are complex, so we should be prepared for surprises. 2.4. Should we stop building circuits after a number of guard failures? As described in academic papers like the Sniper attack [5], a powerful attacker can choose to shut down guard nodes until a client is forced to pick an attacker-controlled guard node. Similarly, a local network attacker can kill all connections towards all guards except the ones she controls. This is a very powerful attack that is hard to defend against. A naive way of defending against it would be for Tor to refuse to build any more circuits after a number of guard node failures have been experienced. Unfortunately, we believe that this is not a sufficiently strong countermeasure since puzzled users will not comprehend the confusing warning message about guard node failures and they will instead just uninstall and reinstall TBB to fix the issue. 2.5. What this proposal does not propose Finally, this proposal does not aim to solve all the problems with guard nodes. This proposal only tries to solve some of the problems whose solution is analyzed sufficiently and seems harmless enough to us. For example, this proposal does not try to solve: - Guard enumeration attacks. We need guard layers or virtual circuits for this [6].
- The guard node set fingerprinting problem [7] - The fact that each isolation profile or virtual identity should have its own guards. XXX It would also be nice to have some way to easily revert back to 3 guards if we later decide that a single guard was a very stupid idea. References: [0]: https://blog.torproject.org/blog/improving-tors-anonymity-changing-guard-parameters http://freehaven.net/anonbib/#wpes12-cogs [1]: https://blog.torproject.org/blog/lifecycle-of-a-new-relay [2]: https://lists.torproject.org/pipermail/tor-dev/2014-March/006458.html [3]: https://trac.torproject.org/projects/tor/ticket/9273#comment:4 [4]: http://freehaven.net/anonbib/#pets13-splitting [5]: https://blog.torproject.org/blog/new-tor-denial-service-attacks-and-defenses [6]: https://trac.torproject.org/projects/tor/ticket/9001 [7]: https://trac.torproject.org/projects/tor/ticket/10969
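As an illustration of the voting rule in section 1.3.1, the following sketch computes the consensus GuardFraction for one relay as the low-median of the authorities' votes, provided at least 3 authorities voted a value. The function name and the use of None for "this authority gave no GuardFraction" are assumptions made for the example.

```python
def consensus_guard_fraction(votes, min_votes=3):
    """Return the low-median GuardFraction, or None if too few votes.

    votes: one entry per authority; an int in 0..100, or None when that
           authority's vote carried no GuardFraction for this relay.
    """
    values = sorted(v for v in votes if v is not None)
    if len(values) < min_votes:
        return None  # the consensus "w" line gets no GuardFraction keyword
    # Low-median: for an even count, take the lower of the two middle values.
    return values[(len(values) - 1) // 2]
```

For example, votes of 66, 99 and 70 give a consensus value of 70, while four votes of 10, 20, 30 and 40 give 20.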
Filename: 237-directory-servers-for-all.txt Title: All relays are directory servers Author: Matthew Finkel Created: 29-Jul-2014 Status: Closed Target: 0.2.7.x Implemented-in: 0.2.8.1-alpha Supersedes: 185 Overview: This proposal aims at simplifying how users interact directly with the Tor network by turning all relays into directory servers (also known as directory caches), too. Currently an operator has the options of running a relay, a directory server, or both. With the acceptance (and implementation) of this proposal the options will be simplified by having (nearly) all relays cache and serve directory documents, without additional configuration. Motivation: Fetching directory documents and descriptors is not always a simple operation for a client. This is especially true and potentially dangerous when the client would prefer querying its guard but its guard is not a directory server. When this is the case, the client must choose and query a distinct directory server. At best this should not be necessary and at worst, it seems, this adds another position within the network for profiling and partitioning users. With the orthogonally proposed move to clients using a single guard, the resulting benefits could be reduced by clients using distinct directory servers. In addition, in the case where the client does not use guards, it is important to have the largest possible amount of diversity in the set of directory servers. In a network where (almost) every relay is a directory server, the profiling and partitioning attack vector is reduced to the guard (for clients who use them), which is already in a privileged position for this. In addition, with the increased set size, relay descriptors and documents are more readily available and it diversifies the providers. Design: The changes needed to achieve this should be simple.
Currently all relays download and cache the majority of relay documents in any case, so the slight increased memory usage from downloading all of them should have minimal consequences. There will be necessary logical changes in the client, router, and directory code. Currently directory servers are defined as such if they advertise having an open directory port. We can no longer assume this is true. To this end, we will introduce a new server descriptor line. "tunnelled-dir-server" NL [At most once] [No extra arguments] The presence of this line indicates that the relay accepts tunnelled directory requests. For a relay that implements this proposal, this line MUST be added to its descriptor if it does not advertise a directory port, and the line MAY be added if it also advertises an open directory port. In addition to this, relays will now download and cache all descriptors and documents listed in the consensus, regardless of whether they are deemed useful or usable, exactly like the current directory server behavior. All relays will also accept directory requests when they are tunnelled over a connection established with a BEGIN_DIR cell, the same way these connections are already accepted by bridges and directory servers with an open DirPort. Directory Authorities will now assign the V2Dir flag to a server if it supports a version of the directory protocol which is useful to clients and it has at least an open directory port or it has an open and reachable OR port and advertises "tunnelled-dir-server" in its server descriptor. Clients choose a directory by using the current criteria with the additional criterion that a server only needs the V2Dir status flag instead of requiring an open DirPort. Security Considerations and Implications: Currently all directory servers are explicitly configured. This is necessary because they must have a configured and reachable external port. 
However, within Tor, this requires additional configuration and results in a reduced number of directory servers in the network. As a consequence, this could allow an adversary to control a non-negligible fraction of the servers. By increasing the number of directory servers in the network the likelihood of selecting one that is malicious is reduced. Also, with this proposal, it will be more likely that a client's entry guard is also a directory server (as alluded to in Proposal 207). However, the reduced anonymity set created when the guard does not have, or is unwilling to distribute, a specific document still exists. With the increased diversity in the available servers, the impact of this should be reduced. Another question that may need further consideration is whether we trust bad directories to be good guards and exits. Specification: The version 3 directory protocol specification does not currently document the use of directory guards. This spec should be updated to mention the preferred use of directory guards during directory requests. In addition, the new criteria for assigning the V2Dir flag should be documented. Impact on local resources: Should relays attempt to download documents from another mirror before asking an authority? All relays, with minor exceptions, will now contact the authorities for documents, but this will not scale well and will partition users from relays. If all relays become directory servers, they will choose to download all documents, regardless of whether they are useful, in case another client does want them. This will have very little impact on most relays; however, on memory-constrained relays (BeagleBone, Raspberry Pi, and similar), every megabyte allocated to directory documents is not available for new circuits.
For this reason, a new configuration option will be introduced within Tor for these systems, named DirCache, which the operator may choose to set as 0, thus disabling caching of directory documents and denying client directory requests. Future Considerations: Should the DirPort be deprecated at some point in the future? Write a proposal requiring that a relay must have the V2Dir flag as a criterion for being a guard. Is V2Dir a good name for this? It's the name we currently use, but that's a silly reason to continue using it.
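The V2Dir assignment rule from the Design section of this proposal can be sketched as a simple predicate. The RelayInfo type and its field names are hypothetical and exist only for this illustration; how an authority actually determines each input is outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class RelayInfo:
    # All fields are illustrative inputs an authority is assumed to know.
    supports_useful_dir_protocol: bool
    dir_port_open: bool
    or_port_reachable: bool
    advertises_tunnelled_dir_server: bool  # "tunnelled-dir-server" in descriptor


def assign_v2dir(relay: RelayInfo) -> bool:
    """Return True if the relay should receive the V2Dir flag."""
    if not relay.supports_useful_dir_protocol:
        return False
    # Either an open DirPort, or a reachable ORPort plus the new descriptor line.
    tunnelled_ok = relay.or_port_reachable and relay.advertises_tunnelled_dir_server
    return relay.dir_port_open or tunnelled_ok
```

Under this rule a relay with no DirPort still gets V2Dir as long as its ORPort is reachable and it advertises "tunnelled-dir-server".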
Filename: 238-hs-relay-stats.txt Title: Better hidden service stats from Tor relays Author: George Kadianakis, David Goulet, Karsten Loesing, Aaron Johnson Created: 2014-11-17 Status: Closed 0. Motivation Hidden Services is one of the least understood parts of the Tor network. We don't really know how many hidden services there are and how much they are used. This proposal suggests that Tor relays include some hidden service related stats to their extra info descriptors. No stats are collected from Tor hidden services or clients. While uncertainty might be a good thing in a hidden network, learning more information about the usage of hidden services can be helpful. For example, learning how many cells are sent for hidden service purposes tells us whether hidden service traffic is 2% of the Tor network traffic or 90% of the Tor network traffic. This info can also help us during load balancing, for example if we change the path building of hidden services to mitigate guard discovery attacks [GUARD-DISCOVERY]. Also, learning the number of hidden services, can give us an understanding of how widespread hidden services are. It will also help us understand approximately how much load is put in the network by hidden service logistics, like introduction point circuits etc. 1. Design Tor relays shall add some fields related to hidden service statistics in their extra-info descriptors. Tor relays collect these statistics by keeping track of their hidden service directory or rendezvous point activities, slightly obfuscating the numbers and posting them to the directory authorities. Extra-info descriptors are posted to directory authorities every 24 hours. 2. Implementation 2.1. Hidden service statistics interval We want relays to report hidden-service statistics over a long-enough time period to not put users at risk. Similar to other statistics, we suggest a 24-hour statistics interval. 
All related statistics are collected at the end of that interval and included in the next extra-info descriptors published by the relay. Tor relays will add the following line to their extra-info descriptor: "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL [At most once.] YYYY-MM-DD HH:MM:SS defines the end of the included measurement interval of length NSEC seconds (86400 seconds by default). A "hidserv-stats-end" line, as well as any other "hidserv-*" line, is first added after the relay has been running for at least 24 hours. 2.2. Hidden service traffic statistics We want to learn how much of the total Tor network traffic is caused by hidden service usage. More precisely, we measure hidden service traffic by counting RELAY cells seen on a rendezvous point after receiving a RENDEZVOUS1 cell. These RELAY cells include commands to open or close application streams, and they include application data. Tor relays will add the following line to their extra-info descriptor: "hidserv-rend-relayed-cells" SP num SP key=val SP key=val ... NL [At most once.] Where 'num' is the number of RELAY cells seen in either direction on a circuit after receiving and successfully processing a RENDEZVOUS1 cell. The actual number is obfuscated as detailed in [STAT-OBFUSCATION]. The parameters of the obfuscation are included in the key=val part of the line. The obfuscatory parameters for this statistic are: * delta_f = 2048 * epsilon = 0.3 * bin_size = 1024 (Also see [CELL-LAPLACE-GRAPH] for a graph of the Laplace distribution.) So, an example line could be: hidserv-rend-relayed-cells 19456 delta_f=2048 epsilon=0.30 bin_size=1024 2.3. HSDir hidden service counting We also want to learn how many hidden services exist in the network. The best place to learn this is at hidden service directories where hidden services publish their descriptors. Tor relays will add the following line to their extra-info descriptor: "hidserv-dir-onions-seen" SP num SP key=val SP key=val ... NL [At most once.]
Approximate number of unique hidden-service identities seen in descriptors published to and accepted by this hidden-service directory. The actual number is obfuscated as detailed in [STAT-OBFUSCATION]. The parameters of the obfuscation are included in the key=val part of the line. The obfuscatory parameters for this statistic are: * delta_f = 8 * epsilon = 0.3 * bin_size = 8 (Also see [ONIONS-LAPLACE-GRAPH] for a graph of the Laplace distribution.) So, an example line could be: hidserv-dir-onions-seen 112 delta_f=8 epsilon=0.30 bin_size=8 2.4. Statistics obfuscation [STAT-OBFUSCATION] We believe that publishing the actual measurement values in such a system might have unpredictable effects, so we obfuscate these statistics before publishing: +-----------+ +--------------+ actual value -> | binning | -> |additive noise| -> public statistic +-----------+ +--------------+ We are using two obfuscation methods to better hide the actual numbers even if they remain the same over multiple measurement periods. Specifically, given the actual measurement value, we first apply data binning to it (basically we round it up to the nearest multiple of an integer, see [DATA-BINNING]). And then we apply additive noise to the binned value in a fashion similar to differential privacy. More information about the obfuscation methods follows: 2.4.1. Data binning The first thing we do to the original measurement value is to round it up to the nearest multiple of 'bin_size'. 'bin_size' is an integer security parameter and can be found in the respective statistics sections. This is similar to how Tor keeps bridge user statistics. As an example, if the measurement value is 9 and bin_size is 8, then the final value will be rounded up to 16. This also works for negative values, so for example, if the measurement value is -9 and bin_size is 8, the value will be rounded up to -8. 2.4.2.
Additive noise Then, before publishing the statistics, we apply additive noise to the binned value by adding to it a random value sampled from a Laplace distribution. Following the differential privacy methodology [DIFF-PRIVACY], our obfuscatory Laplace distribution has mu = 0 and b = (delta_f / epsilon). The precise values of delta_f and epsilon are different for each statistic and are defined in the respective statistics sections. 3. Security The main security considerations that need discussion are what an adversary could do with reported statistics that they couldn't do without them. In the following, we're going through things the adversary could learn, how plausible that is, and how much we care. (All these things refer to hidden-service traffic, not to hidden-service counting. We should think about the latter, too.) 3.1. Identify rendezvous point of high-volume and long-lived connection The adversary could identify the rendezvous point of a very large and very long-lived HS connection by observing a relay with an unexpectedly large relay cell count. 3.2. Identify number of users of a hidden service The adversary may be able to identify the number of users of an HS if he knows the amount of traffic on a connection to that HS (which he potentially can determine himself) and knows when that service goes up or down. He can look at the change in the total reported RP traffic to determine about how many fewer HS users there are when that HS is down. 4. Discussion 4.1. Why count only RP cells? Why not count IP cells too? There are three phases in the rendezvous protocol where traffic is generated: (1) when hidden services make themselves available in the network, (2) when clients open connections to hidden services, and (3) when clients exchange application data with hidden services. We expect (3), that is the RP cells, to consume most bytes here, so we're focusing on this only.
Furthermore, introduction points correspond to specific HSes, so publishing IP cell stats could reveal the popularity of specific HSes. 4.2. How to use these stats? 4.2.1. How to use rendezvous cell statistics We plan to extrapolate reported values to network totals by dividing values by the probability of clients picking relays as rendezvous point. This approach should become more precise on faster relays and the more relays report these statistics. We also plan to compare reported values with "cell-*" statistics to learn what fraction of traffic can be attributed to hidden services. Ideally, we'd be able to compare values to "write-history" and "read-history" lines to compute similar fractions of traffic used for hidden services. The goal would be to avoid enabling "cell-*" statistics by default. In order for this to work we'll have to multiply reported cell numbers with the default cell size of 512 bytes (we cannot infer the actual number of bytes, because cells are end-to-end encrypted between client and service). 4.2.2. How to use HSDir HS statistics We plan to extrapolate this value to network totals by calculating what fraction of hidden-service identities this relay was supposed to see. This extrapolation will be very rough, because each hidden-service directory is only responsible for a tiny share of hidden-service descriptors, and there is no way to increase that share significantly. Here are some numbers: there are about 3000 directories, and each descriptor is stored on three directories. So, each directory is responsible for roughly 1/1000 of descriptor identifiers. There are two replicas for each descriptor (that is, each descriptor is stored under two descriptor identifiers), and descriptor identifiers change once per day (which means that, during a 24-hour period, there are two opportunities for each directory to see a descriptor). Hence, each descriptor is stored to four places in identifier space throughout a 24-hour period. 
The probability of any given directory to see a given hidden-service identity is 1-(1-1/1000)^4 = 0.00399 = 1/250. This approximation constitutes an upper threshold, because it assumes that services are running all day. An extrapolation based on this formula will lead to undercounting the total number of hidden services. A possible inaccuracy in the estimation algorithm comes from the fact that a relay may not be acting as hidden-service directory during the full statistics interval. We'll have to look at consensuses to determine when the relay first received the "HSDir" flag, and only consider the part of the statistics interval following the valid-after time of that consensus. 4.3. Why does the obfuscation work? By applying data binning, we smudge the original value making it harder for attackers to guess it. Specifically, an attacker who knows the bin, can only guess the underlying value with probability 1/bin_size. By applying additive noise, we make it harder for the adversary to find out the current bin, which makes it even harder to get the original value. If additive noise was not applied, an adversary could try to detect changes in the original value by checking when we switch bins. 5. Acknowledgements Thanks go to 'pfm' for the helpful Laplace graphs. 6. References [GUARD-DISCOVERY]: https://lists.torproject.org/pipermail/tor-dev/2014-September/007474.html [DIFF-PRIVACY]: http://research.microsoft.com/en-us/projects/databaseprivacy/dwork.pdf [DATA-BINNING]: https://en.wikipedia.org/wiki/Data_binning [CELL-LAPLACE-GRAPH]: https://raw.githubusercontent.com/corcra/pioton/master/vis/laplacePDF_mu0.0_b6826.67.png https://raw.githubusercontent.com/corcra/pioton/master/vis/laplaceCDF_mu0.0_b6826.67.png [ONIONS-LAPLACE-GRAPH]: https://raw.githubusercontent.com/corcra/pioton/master/vis/laplacePDF_mu0.0_b26.67.png https://raw.githubusercontent.com/corcra/pioton/master/vis/laplaceCDF_mu0.0_b26.67.png
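The obfuscation pipeline of section 2.4 (binning, then additive Laplace noise with mu = 0 and b = delta_f / epsilon) can be sketched as follows. This is an illustration under stated assumptions, not the relay implementation: the function names are hypothetical, the Laplace sample is drawn as the difference of two exponential variates (a standard construction, chosen here for simplicity), and rounding the noisy result to an integer is an assumption, since the text above does not say how the published 'num' is made integral.

```python
import random


def bin_value(value, bin_size):
    """Round an integer up to the nearest multiple of bin_size (section 2.4.1)."""
    # -(-a // b) is ceiling division for integers, and also handles negatives.
    return -(-value // bin_size) * bin_size


def laplace_noise(delta_f, epsilon, rng):
    """Sample from Laplace(mu=0, b=delta_f/epsilon) (section 2.4.2)."""
    b = delta_f / epsilon
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


def obfuscate(value, bin_size, delta_f, epsilon, rng=None):
    """Binning followed by additive noise; returns the public statistic."""
    rng = rng or random.Random()
    noisy = bin_value(value, bin_size) + laplace_noise(delta_f, epsilon, rng)
    return int(round(noisy))  # assumption: publish the nearest integer
```

With the HSDir parameters (bin_size = 8, delta_f = 8, epsilon = 0.3), a true count of 9 is first binned to 16 and then shifted by noise with scale b of about 26.67, so the published number can differ noticeably from the true one and may even be negative.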
Filename: 239-consensus-hash-chaining.txt Title: Consensus Hash Chaining Author: Nick Mathewson, Andrea Shepard Created: 06-Jan-2015 Status: Open 1. Introduction and overview To avoid some categories of attacks against directory authorities and their keys, it would be handy to have an explicit hash chain in consensuses. 2. Directory authority operation We add the following field to votes and consensuses: previous-consensus ISOTIME [SP HashName "=" Base16]* NL where HashName is any keyword. This field may occur any number of times. The date in a previous-consensus line in a vote is the valid-after date of the consensus the line refers to. The hash should be computed over the signed portion of the consensus document. A directory authority should include a previous-consensus line for a consensus using all hashes it supports for all consensuses it knows which are still valid, together with the two most recently expired ones. When this proposal is implemented, a new consensus method should be allocated for adding previous-consensus lines to the consensus. A previous-consensus line is included in the consensus if and only if a line with that date was listed by more than half of the authorities whose votes are under consideration. A hash is included in that line if the hash was listed by more than half of the authorities whose votes are under consideration. Hashes are sorted lexically within a line by hashname; dates are sorted in temporal order. If, when computing a consensus, the authorities find that any previous-consensus line is *incompatible* with another, they must issue a loud warning. Two lines are incompatible if they have the same ISOTIME, but different values for the same HashName. The hash "sha256" is mandatory. 3. Client and cache operation All parties receiving consensus documents should validate previous-consensus lines, and complain loudly if a hash fails to match.
When a party receives a consensus document, it SHOULD check all previous-consensus lines against any previous consensuses it has retained, and if a hash fails to match it SHOULD warn loudly in the log mentioning the specific hashes and valid-after times in question, and store both the new consensus containing the mismatching hashes and the old consensus being checked for later analysis. An option SHOULD be provided to disable operation as a client or as a hidden service if this occurs. All relying parties SHOULD by default retain all valid consensuses they download plus two; but see "Security considerations" below. If a hash is not mismatched, the relying party may nonetheless be unable to validate the chain: either because there is a gap in the chain itself, or because the relying party does not have any of the consensuses that the latest consensus mentions. If this happens, the relying party should log a warning stating the specific cause, the hashes and valid-after time of both the consensus containing the unverifiable previous-consensus line and the hashes and valid-after time of the line for each such line, and retain a copy of the consensus document in question. A relying party MAY provide an option to disable operation as a client or hidden service in this event, but due to the risk that breaks in the chain may occur accidentally, such an option SHOULD be disabled by default if provided. If a relying party starts up and finds only very old consensuses such that no previous-consensus lines can be verified, it should log a notice of the gap along the lines of "consensus (date, hash) is quite new. Can't chain back to old consensus (date, hash)". If it has no old consensuses at all, it should log an info-level message of the form "we got consensus (date, hash). We haven't got any older consensuses, so we won't do any hash chain verification" 4. 
Security Considerations: * Retaining consensus documents on clients might leak information about when the client was active if a disk is later stolen or the client compromised. This should be documented somewhere and an option to disable (but thereby also disable verifying previous-consensus hashes) should be provided. * Clients MAY offer the option to retain previous consensuses in memory only to allow for validation without the potential disk leak.
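The client-side check of section 3 can be sketched as below. The sketch assumes the previous-consensus lines have already been parsed into a mapping from valid-after time to {HashName: Base16 value}, and that the relying party keeps the signed portion of each retained consensus keyed by the same valid-after time; these data structures and the function names are hypothetical. Only the mandatory "sha256" hash is checked.

```python
import hashlib


def sha256_hex(signed_portion: bytes) -> str:
    """Lower-case Base16 SHA-256 of the signed portion of a consensus."""
    return hashlib.sha256(signed_portion).hexdigest()


def check_chain(previous_lines, retained):
    """Compare previous-consensus lines against retained consensuses.

    previous_lines: {valid_after: {hash_name: base16_value}}
    retained:       {valid_after: signed_portion_bytes}
    Returns (mismatches, unverifiable), each a list of valid-after times.
    """
    mismatches, unverifiable = [], []
    for valid_after, hashes in sorted(previous_lines.items()):
        signed = retained.get(valid_after)
        claimed = hashes.get("sha256")
        if signed is None or claimed is None:
            # Gap in what we hold: log a warning, but this is not a mismatch.
            unverifiable.append(valid_after)
        elif claimed.lower() != sha256_hex(signed):
            # Warn loudly and keep both documents for later analysis.
            mismatches.append(valid_after)
    return mismatches, unverifiable
```

A caller would treat a non-empty mismatch list as the "warn loudly" case and a non-empty unverifiable list as the lower-severity "cannot validate the chain" case described above.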
Filename: 240-auth-cert-revocation.txt Title: Early signing key revocation for directory authorities Author: Nick Mathewson Created: 09-Jan-2015 Status: Open 1. Overview This proposal describes a simple way for directory authorities to perform signing key revocation. 2. Specification We add the following lines to the authority signing certificate format: revoked-signing-key SP algname SP FINGERPRINT NL This line may appear zero or more times. It indicates that a particular not-yet-expired signing key should not be used. 3. Client and cache operation No client or cache should retain, use, or serve any certificate whose signing key is described in a revoked-signing-key line in a certificate with the same authority identity key. (If the signing key fingerprint appears in a cert with a different identity key, it has no effect: you aren't allowed to revoke other people's keys.) No Tor instance should download a certificate whose signing key,identity key combination is known to be revoked. 4. Authority operator interface. The 'tor-gencert' command will take a number of older certificates to revoke as optional command-line arguments. It will include their keys in revoked-signing-key lines only if they are still valid, or have been expired for no more than a month. 5. Circular revocation My first attempt at writing a proposal here included a lengthy section about how to handle cases where certificate A revokes the key of certificate B, and certificate B revokes the key of certificate A. Instead, I am inclined to say that this is a MUST NOT.
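The rule in section 3 of this proposal, including the restriction that a revocation only applies within the same authority identity key, can be sketched as a filter over a set of certificates. The AuthCert type and its fields are hypothetical, and the 'algname' part of the revoked-signing-key line is ignored here for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthCert:
    identity_fp: str                 # fingerprint of the authority identity key
    signing_fp: str                  # fingerprint of this cert's signing key
    revoked: frozenset = frozenset() # fingerprints from revoked-signing-key lines


def usable_certs(certs):
    """Drop every certificate whose signing key is revoked by its own identity."""
    revoked_pairs = {(c.identity_fp, fp) for c in certs for fp in c.revoked}
    # A revocation listed under a different identity key never matches here.
    return [c for c in certs if (c.identity_fp, c.signing_fp) not in revoked_pairs]
```

In this sketch a certificate from authority B that lists one of authority A's signing keys as revoked has no effect on A's certificates.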
Filename: 241-suspicious-guard-turnover.txt Title: Resisting guard-turnover attacks Author: Aaron Johnson, Nick Mathewson Created: 2015-01-27 Status: Rejected This proposal was made obsolete by the introduction of Proposal #259. Some of the ideas here have been incorporated into Proposal #259. 1. Introduction Tor uses entry guards to prevent an attacker who controls some fraction of the network from observing a fraction of every user's traffic. If users chose their entries and exits uniformly at random from the list of servers every time they build a circuit, then an adversary who had (k/N) of the network would deanonymize F=(k/N)^2 of all circuits... and after a given user had built C circuits, the attacker would see them at least once with probability 1-(1-F)^C. With large C, the attacker would get a sample of every user's traffic with probability 1. To prevent this from happening, Tor clients choose a small number of guard nodes (currently 1: see proposal 236). These guard nodes are the only nodes that the client will connect to directly. If they are not compromised, the user's paths are not compromised. But attacks remain. Consider an attacker who can run a firewall between a target user and the Tor network, and make many of the guards they don't control appear to be unreachable. Or consider an attacker who can identify a user's guards, and mount denial-of-service attacks on them until the user picks a guard that the attacker controls. In the presence of these attacks, we can't continue to connect to the Tor network unconditionally. Doing so would eventually result in the user choosing a hostile node as their guard, and losing anonymity. 2. Proposed behavior Keep a record of all the guards we've tried to connect to, connected to, or extended circuits through in the last PERIOD days. (We have connected to a guard if we authenticate its identity. We have extended a circuit through a guard if we built a multi-hop circuit with it.)
If the number of guards we have *tried* to connect to in the last PERIOD days is greater than CANDIDATE_THRESHOLD, do not attempt to connect to any other guards; only attempt the ones we have previously *tried* to connect to. If the number of guards we *have* connected to in the last PERIOD days is greater than CONNECTED_THRESHOLD, do not attempt to connect to any other guards; only attempt ones we have already *successfully* connected to. If we fail to connect to NET_THRESHOLD guards in a row, conclude that the network is likely down. Stop/notify the user; retry later; add no new guards for consideration. [[ optional If we notice that USE_THRESHOLD guards that we *used for circuits* in the last FAST_REACT_PERIOD days are not working, but some other guards are, assume that an attack is in progress, and stop/notify the user. ]] 2.1. Suggested parameter thresholds. PERIOD -- 60 days FAST_REACT_PERIOD -- 10 days CONNECTED_THRESHOLD -- 8 CANDIDATE_THRESHOLD -- 20 NET_THRESHOLD -- 10 (< CANDIDATE_THRESHOLD) [[ optional USE_THRESHOLD -- 3 (< CONNECTED_THRESHOLD) ]] (Each of the above should have a corresponding consensus parameter.) 2.2. What do we mean by "Stop/warn"? By default, we should probably give warnings in most of the above cases for the first version that deploys them. We can have an on/off/auto setting for whether we will build circuits at all if we're in a "stopped" mode. Default should be auto, meaning off for now. The warning needs to be carefully chosen, and suggest a workaround better than "get a better network" or "clear your state file". 2.3. What's with making USE_THRESHOLD optional? Aaron thinks that getting rid of it might help in the fascistfirewall case. I'm a little unclear whether that makes any of the attacks easier. 3. State storage requirements Right now, we save for each guard that we have made contact with: ID Added is dircache? 
down-since last-attempted bad-since chosen-on-date, chosen-by-version path bias info (circ_attempts, successes, close_success) To implement the above proposal, we'll need to add, for each guard *or guard candidate*: when did we first decide to try connecting to it? when did we last do one of: decide to try connecting to it? connect to it? build a multihop circuit through it? which one was it? Probably round these to the nearest day or so. 4. Future work We need to make this play nicely with mobility. When a user has three guards on port 9001 and they move to a firewall that only allows 80/443, we'd prefer that they not simply grind to a halt. If nodes are configured to stop when too many of their guards have gone away, this will confuse them. If people need to turn FascistFirewall on and off, great. But if they just clear their state file as a workaround, that's not so good. If we could tie guard choice to location, that would help a great deal, but we'd need to answer the question, "Where am I on the network", which is not so easy to do passively if you're behind a NAT. Appendix A. Scenario analysis A.1. Example attacks * Filter Alice's connection so they can only talk to your guards. * Whenever Alice is using a guard you don't control, DOS it. A.2. Example non-attacks * Alice's guard goes down. * Alice is on a laptop that is sometimes behind a firewall that blocks a guard, and sometimes is not. * Alice is on a laptop that's behind a firewall that blocks a lot of the tor network, (like, everything not on 80/443). * Alice has a network connection that sometimes turns off and turns on again. * Alice reboots her computer periodically, and tor starts a little while before the network is live. Appendix B. Acknowledgements Thanks to Rob Jansen and David Goulet for comments on earlier versions of this draft. Appendix C. Desirable revisions Incorporate ideas from proposal 156.
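The threshold rules from section 2 of this proposal can be sketched as a small decision function. This is only an illustration under stated assumptions: the `GuardHistory` structure, the `guard_policy` name, the returned labels, and the order in which the rules are checked are all hypothetical; the proposal specifies only the thresholds themselves.

```python
from dataclasses import dataclass

# Suggested values from section 2.1 of the proposal.
CONNECTED_THRESHOLD = 8
CANDIDATE_THRESHOLD = 20
NET_THRESHOLD = 10


@dataclass
class GuardHistory:
    """Counts over the last PERIOD days (hypothetical bookkeeping structure)."""
    tried: int                 # guards we have tried to connect to
    connected: int             # guards we have successfully connected to
    consecutive_failures: int  # guards we failed to connect to in a row


def guard_policy(h: GuardHistory) -> str:
    """Return which guards may be attempted next.

    The precedence of the checks is an assumption of this sketch.
    """
    if h.consecutive_failures >= NET_THRESHOLD:
        # Network is likely down: stop/notify, retry later, add no new guards.
        return "network-down"
    if h.connected > CONNECTED_THRESHOLD:
        # Only attempt guards we have already successfully connected to.
        return "connected-only"
    if h.tried > CANDIDATE_THRESHOLD:
        # Only attempt guards we have previously tried to connect to.
        return "tried-only"
    return "any"
```

For example, a client that has tried 21 guards but connected to only 2 would be restricted to the guards it has already tried.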
Filename: 242-better-families.txt Title: Better performance and usability for the MyFamily option Author: Nick Mathewson Created: 2015-02-27 Status: Superseded Superseded-by: 321-happy-families.md 1. Problem statement. The current family interface allows well-behaved relays to identify that they all belong to the same 'family', and should not be used in the same circuits. Right now, this interface works by having every family member list every other family member in its server descriptor. This winds up using O(n^2) space in microdescriptors, server descriptors, and RAM. Adding or removing a server from the family requires all the other servers to change their torrc settings. One proposal is to eliminate the use of the Family option entirely; see ticket #6676. But if we don't, let's come up with a way to make it better. (I'm writing this down mainly to get it out of my head.) 2. Design overview. In this design, every family has a master ed25519 key. A node is in the family iff its server descriptor includes a certificate of its ed25519 identity key with the master ed25519 key. The certificate format is as in proposal 220 section 2.1. Note that because server descriptors are signed with the node's ed25519 signing key, this creates a bidirectional relationship where nodes can't be put in families without their consent. 3. Changes to server descriptors We add a new entry to server descriptors: "family-cert" This line contains a base64-encoded certificate as described above. It may appear any number of times. 4. Changes to microdescriptors We add a new entry to microdescriptors: "family-keys" This line contains one or more space-separated strings describing families to which the node belongs. These strings MUST be between 1 and 64 characters long, and sorted in lexical order. Clients MUST NOT depend on any particular property of these strings. 5. Changes to voting algorithm We allocate a new consensus method number for voting on these keys. 
When generating microdescriptors using a suitable consensus method, the authorities include a "family-keys" line if the underlying server descriptor contains any family-cert lines. For each family-cert in the server descriptor, they add a base-64-encoded string of that family-cert's signing key.

6. Client behavior

Clients should treat node A and node B as belonging to the same family if ANY of these is true:

* The client has server descriptors or microdescriptors for A and B, and A's descriptor lists B in its family line, and B's descriptor lists A in its family line.

* The client has a server descriptor for A and one for B, and they both contain valid family-cert lines whose certs are signed by the same family key.

* The client has microdescriptors for A and B, and they both contain some string in common on their family-keys line.

7. Deprecating the old family lines.

Once all clients that support the old family line format are deprecated, servers can stop including family lines in their descriptors, and authorities can stop including them in their microdescriptors.

8. Open questions

The rules in section 6 above leave open the possibility of old clients and new clients reaching different decisions about who is in a family. We should evaluate this for anonymity implications.

It's possible that families are a bad idea entirely; see ticket #6676.
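The client rule in section 6 can be sketched roughly as follows. The `Node` type and its fields are hypothetical. Certificate validation is out of scope here, so `family_keys` is assumed to hold only family key strings that have already been verified (from family-cert lines) or taken from the "family-keys" microdescriptor line described in section 4.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical client-side view of a relay."""
    nickname: str
    # Entries from the legacy family line in the relay's descriptor.
    family_line: set = field(default_factory=set)
    # Family key strings, assumed already validated where applicable.
    family_keys: set = field(default_factory=set)


def same_family(a: Node, b: Node) -> bool:
    """Return True if a and b should not share a circuit."""
    # Legacy rule: the listing must be mutual.
    legacy = b.nickname in a.family_line and a.nickname in b.family_line
    # New rule: any family key string in common.
    keyed = bool(a.family_keys & b.family_keys)
    return legacy or keyed
```

Note that a one-sided legacy listing is not enough, which mirrors the consent property the key-based design is meant to preserve.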
Filename: 243-hsdir-flag-need-stable.txt
Title: Give out HSDir flag only to relays with Stable flag
Author: George Kadianakis
Created: 2015-03-23
Status: Closed
Implemented-in: 0.2.7

1. Introduction

The descriptors of hidden services are stored by hidden service directories. Those are chosen by directory authorities who assign the "HSDir" flag to those relays according to their uptime.

It's important for new relays to not be able to get the HSDir flag too easily, because a few correctly placed HSDirs can launch a denial of service attack on a hidden service. We should make sure that a naive Sybil attacker that injects thousands of new Tor relays to the network cannot position herself like this.

2. Motivation

Currently, directory authorities give out the HSDir flag to relays that volunteer to be hidden service directories by sending a "hidden-service-dir" line in their relay descriptor, which is the default relay behavior.

Furthermore, the HSDir flag is only given to relays that have been up for more than MinUptimeHidServDirectoryV2 hours. MinUptimeHidServDirectoryV2 is a parameter locally set at the directory authorities, and it's somewhere between 25 and 96 hours.

We propose changing that last requirement, and instead giving the HSDir flag only to relays that have the Stable flag.

We believe that this will result in a few benefits:

- We stop using the ad-hoc uptime calculation that we are currently doing (see dirserv_thinks_router_is_hs_dir()). Instead, we use the MTBF uptime calculation that is performed for the Stable flag, which is more robust.

- We increase the time required to get the HSDir flag, making it harder for naive adversaries that flood the network with relays to actually get the HSDir flag.

- After implementing non-deterministic HSDir picks (#8244) we also make it harder for sophisticated adversaries to DoS a hidden service, since at that point their main attack strategy is to flood the network with relays.

- By increasing the stability of HSDirs, we reduce the misses during descriptor fetching that are caused by natural churn of relays on the list of HSDirs.

3. Specification

We are suggesting changing the criteria that directory authorities use to vote for HSDirs to the following:

- The relay has included the "hidden-service-dir\n" line in its descriptor.

- The relay is eligible for having the "Stable" flag.

4. Security considerations

As it currently is, a router is 'Stable' if it is active, and either its Weighted MTBF is at least the median for known active routers or its Weighted MTBF corresponds to at least 7 days.

These are stricter criteria than what's required for HSDir, which means that the number of HSDirs will decrease after the suggested changes. Currently there are about 2400 HSDirs in the consensus, and about 2300 of them are Stable, which means that we will lose about 100 HSDirs. We believe that this is an acceptable temporary loss.

In the short-term future, the number of HSDirs will greatly improve as more directory authorities upgrade to #14202 and more relays upgrade to #12538.

5. Future

Should we give out the HSDir flag only to relays that are Fast? Is being an HSDir a demanding job bandwidth-wise?

With the upcoming keyblinding scheme (#8106) and non-deterministic HSDir selection (#8244), are there any other criteria that we should use when assigning HSDir flags?
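The voting criteria in section 3, combined with the Stable definition quoted in section 4, could be sketched as below. The function names, the use of seconds as the unit for Weighted MTBF, and the exact-match test for the descriptor line are assumptions of this sketch, not a description of the directory authority code.

```python
from statistics import median

SEVEN_DAYS = 7 * 24 * 60 * 60  # seconds; the unit is an assumption of this sketch


def is_stable(active, wmtbf, known_active_wmtbfs):
    """Stable: active, and Weighted MTBF at least the median or at least 7 days.

    known_active_wmtbfs must be non-empty.
    """
    if not active:
        return False
    return wmtbf >= median(known_active_wmtbfs) or wmtbf >= SEVEN_DAYS


def votes_hsdir(descriptor_lines, active, wmtbf, known_active_wmtbfs):
    """Vote for HSDir only if the relay volunteers and is eligible for Stable."""
    volunteers = any(line.strip() == "hidden-service-dir"
                     for line in descriptor_lines)
    return volunteers and is_stable(active, wmtbf, known_active_wmtbfs)
```

Under this rule a freshly started relay that volunteers as an HSDir is refused until its Weighted MTBF catches up with the rest of the network.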
Filename: 244-use-rfc5705-for-tls-binding.txt Title: Use RFC5705 Key Exporting in our AUTHENTICATE calls Author: Nick Mathewson Created: 2015-05-14 Status: Closed Implemented-In: 0.3.0.1-alpha 0. IMPLEMENTATION-NOTES We decided to implement this proposal for the Ed25519 handshake only. 1. Proposal We use AUTHENTICATE cells to bind the connection-initiator's Tor identity to a TLS session. Our current type of authentication ("RSA-SHA256-TLSSecret", see tor-spec.txt section 4.4) does this by signing a document that includes an HMAC of client_random and server_random, using the TLS master secret as a secret key. There is a more standard way to get at this information, by using the facility defined in RFC5705. Further, it is likely to continue to work with more TLS libraries, including TLS libraries like OpenSSL 1.1 that make master secrets and session data opaque. I propose that we introduce a new authentication type, with AuthType and TYPE field to be determined, that works the same as our current "RSA-SHA256-TLSSecret" authentication, except for these fields: TYPE is a different constant string, "AUTH0002". TLSSECRETS is replaced by the output of the Exporter function in RFC5705, using as its inputs: * The label string "EXPORTER FOR TOR TLS CLIENT BINDING " + TYPE * The context value equal to the client's identity key digest. * The length 32. I propose that proposal 220's section on authenticating with ed25519 keys be amended accordingly: TYPE is a different constant string, "AUTH0003". TLSSECRETS is replaced by the output of the Exporter function in RFC5705, using as its inputs: * The label string "EXPORTER FOR TOR TLS CLIENT BINDING " + TYPE * The context value equal to the client's Ed25519 identity key * The length 32.
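As a rough illustration of the exporter inputs listed above, the following sketch assembles the label, context, and length for a given TYPE. The function name is hypothetical, and the actual exporter call is deliberately left out because it depends on the TLS library in use (OpenSSL, for instance, provides SSL_export_keying_material for this purpose).

```python
EXPORTER_LABEL_PREFIX = b"EXPORTER FOR TOR TLS CLIENT BINDING "
EXPORTER_LENGTH = 32


def exporter_inputs(auth_type: bytes, context: bytes):
    """Return (label, context, length) to hand to an RFC5705 exporter.

    auth_type is the TYPE string, e.g. b"AUTH0002" or b"AUTH0003".
    context is the client's identity key digest (for AUTH0002) or the
    client's Ed25519 identity key (for AUTH0003).
    """
    label = EXPORTER_LABEL_PREFIX + auth_type
    return label, context, EXPORTER_LENGTH
```

Including TYPE in the label means the two authentication types can never produce the same exported value for the same session, even when given the same context.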
Filename: 245-tap-out.txt
Title: Deprecating and removing the TAP circuit extension protocol
Author: Nick Mathewson
Created: 2015-06-02
Status: Needs-Revision

0. Introduction

This proposal describes a series of steps necessary for deprecating TAP without breaking functionality.

TAP is the original protocol for one-way authenticated key negotiation used by Tor. Before Tor version 0.2.4, it was the only supported protocol. Its key length is unpleasantly short, however, and it had some design warts. Moreover, it had no name, until Ian Goldberg wrote a paper about the design warts.

Why deprecate and remove it? Because ntor is better in basically every way. It's actually got a proper security proof, the key strength seems to be 21st-century secure, and so on. Meanwhile, TAP is lingering as a zombie, taking up space in descriptors and microdescriptors.

1. TAP is still in (limited) use today for hidden service hops.

The original hidden service protocol only describes a way to tell clients and servers about an introduction point's or a rendezvous point's TAP onion key.

We can do a bit better (see section 4), but we can't break TAP completely until current clients and hidden services are obsolete.

2. The step-by-step process.

Step 1. Adjust the parsing algorithm for descriptors and microdescriptors on servers so that it accepts MDs without a TAP key. See section 3 below. Target: 0.2.7.

Step 1b. Optionally, when connecting to a known IP/RP, extend by ntor. (See section 4 below.)

Step 2. Wait until proposal 224 is implemented. (Clients and hidden services implementing 224 won't need TAP for anything.)

Step 3. Begin throttling TAP answers even more aggressively at relays. Target: prop224 is stable.

Step 4. Wait until all versions of Tor without prop224 support are obsolete/deprecated.

Step 5. Stop generating TAP keys; stop answering TAP requests; stop advertising TAP keys in descriptors; stop including them in microdescriptors.
Target: prop224 has been stable for 12-18 months, and 0.2.7 has been stable for 2-3 years. 3. Accepting descriptors without TAP keys. (Step 1) Our microdescriptor parsing code uses the string "onion-key" at the start of the line to identify the boundary between microdescriptors, so we can't remove it entirely. Instead, we will make the body optional. We will make the following changes to dir-spec: - In router descriptors, make the onion-key field "at most once" instead of "exactly once." - In microdescriptors, make the body of "onion-key" optional. Until Step 4, authorities MUST still reject any descriptor without a TAP key. If we do step 1 before proposal 224 is implemented, we'll need to make sure that we never choose a relay without a TAP key as an introduction point or a rendezvous point. 4. Avoiding TAP earlier for HS usage (Step 1b) We could begin to move more circuits off TAP now by adjusting our behavior for extending circuits to Introduction Points and Rendezvous Points. The new rule would be: If you've been told to extend to an IP/RP, and you know a directory entry for that relay (matching by identity), you extend using the node_t you have instead. This would improve cryptographic security a bit, at the expense of making it possible to probe for whether a given hidden service has an up-to-date consensus or not, and learn whether each client has an up-to-date consensus or not. We need to figure out whether that enables an attack. (For reference, the functions to patch would be rend_client_get_random_intro_impl and find_rp_for_intro.)
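To illustrate the parsing change in section 3, here is a minimal sketch of a splitter that still uses "onion-key" lines as microdescriptor boundaries but treats the key body as optional. It is a simplification for illustration only: the function name and the returned dictionary layout are hypothetical, the body is assumed to be a PEM "RSA PUBLIC KEY" block immediately following the keyword line, and a missing END marker simply raises ValueError.

```python
BEGIN = "-----BEGIN RSA PUBLIC KEY-----"
END = "-----END RSA PUBLIC KEY-----"


def split_microdescriptors(text):
    """Split concatenated microdescriptors at each "onion-key" line.

    Returns a list of dicts with "tap_key" (PEM text, or None when the
    body is absent) and "rest" (the remaining lines of that entry).
    """
    mds = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        if lines[i] != "onion-key":
            if mds:  # lines before the first boundary are ignored
                mds[-1]["rest"].append(lines[i])
            i += 1
            continue
        md = {"tap_key": None, "rest": []}
        i += 1
        if i < len(lines) and lines[i] == BEGIN:
            end = lines.index(END, i)  # raises ValueError if unterminated
            md["tap_key"] = "\n".join(lines[i:end + 1])
            i = end + 1
        mds.append(md)
    return mds
```

A caller choosing introduction or rendezvous points before proposal 224 is deployed would then skip any entry whose "tap_key" is None, as the last paragraph of section 3 requires.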
Filename: 246-merge-hsdir-and-intro.txt Title: Merging Hidden Service Directories and Introduction Points Author: John Brooks, George Kadianakis Created: 2015-07-12 Status: Rejected Change history: 18-Jan-2016 Changed status to "Needs-Research" after discussion in email thread [1]. 1. Overview and Motivation This document describes a modification to proposal 224 ("Next-Generation Hidden Services in Tor"), which simplifies and improves the architecture by combining hidden service directories and introduction points at the same relays. A reader will want to be familiar with the existing hidden service design, and with the changes in proposal 224. If accepted, this proposal should be combined with proposal 224 to make a superseding specification. 1.1. Overview In the existing hidden service design and proposal 224, there are three distinct steps building a connection: fetching the descriptor from a directory, contacting an introduction point listed in the descriptor, and rendezvous as specified during the introduction. The hidden service directories are selected algorithmically, and introduction points are selected at random by the service. We propose to combine the responsibilities of the introduction point and hidden service directory. The list of introduction points responsible for a service will be selected using the algorithm specified for HSDirs [proposal 224, section 2.2.3]. The service builds a long-term introduction circuit to each of these, identified by its blinded public key. Clients can calculate the same set of relays, build an introduction circuit, retrieve the ephemeral keys, and proceed with sending an introduction to the service in the same ways as before. 1.2. Benefits over proposal 224 With this change, client connections are made more efficient by needing only two circuits (for introduction and rendezvous), instead of the three needed previously, and need to contact fewer relays. 
Clients also no longer cache descriptors, which substantially simplifies code and removes a common source of bugs and reliability issues. Hidden services are able to stay online by simply maintaining their introduction circuits; there is no longer a need to periodically update descriptors. This reduces network load and traffic fingerprinting opportunities for a hidden service. The number and churn of relays a hidden service depends on is also reduced. In particular, prior hidden service designs may frequently choose new introduction points, and each of these has an opportunity to observe the popularity or connection behavior of clients. 1.3. Other effects on proposal 224 An adversarial introduction point is not significantly more capable than a hidden service directory under proposal 224. The differences are: 1. The introduction point maintains a long-lived circuit with the service 2. The introduction point can break that circuit and cause the service to rebuild it See section 4 ("Discussion") for other impacts and open discussion questions. 2. Specification 2.1. Picking introduction points for a service Instead of picking HSDirs, hidden services pick their introduction points using the same algorithm as defined in proposal 224 section 2.2 [HASHRING]. To be used as an introduction point, a relay must have the Stable flag in the consensus and an uptime of at least twice the shared random period defined in proposal 224 section 2.3. This also specifies the lifetime of introduction points, since they will be rotated with the change of time period and shared randomness. 2.2. Hidden service sets up introduction points After a hidden service has picked its intro points, it needs to establish long-term introduction circuits to them and also send them an encrypted descriptor that should be forwarded to potential clients. The descriptor contains a service key that should be used by clients to encrypt the INTRODUCE1 cell that will be sent to the hidden service. 
The encrypted parts of the descriptor are encrypted with the symmetric keys specified in prop224 section [ENCRYPTED-DATA]. 2.2.1. Hidden service uploads a descriptor Services post a descriptor by opening a directory stream with BEGIN_DIR, and sending a HTTP POST request as described in proposal 224, section 2.2.4. The relay must verify the signatures of the descriptor, and check whether it is responsible for that blinded public key in the hash ring. Relays should connect the descriptor to the circuit used to upload it, which will be repurposed as the service introduction circuit. The descriptor does not need to be cached by the introduction point after that introduction circuit has closed. It is unexpected and invalid to send more than one descriptor on the same introduction circuit. 2.2.2. Descriptor format The format for the hidden service descriptor is as described in proposal 224 sections 2.4 and 2.5, with the following modifications: * The "revision-counter" field is removed * The introduction-point section is removed * The "auth-key" field is removed * The "enc-key legacy" field is removed * The "enc-key ntor" field must be specified exactly once per descriptor Unlike previous versions, the descriptor does not encode the entire list of introduction points. The descriptor only contains a key for the particular introduction point it was sent to. 2.2.3. ESTABLISH_INTRO cell When a hidden service is establishing a new introduction point, it sends the ESTABLISH_INTRO cell, which is formatted as described by proposal 224 section 3.1.1, except for the following: The AUTH_KEY_TYPE value 02 is changed to: [02] -- Signing key certificate cross-certified with the blinded key, in the same format as in the hidden service descriptor. In this case, SIG is a signature of the cell with the signing key specified in AUTH_KEY. The relay must verify this signature, as well as the certification with the blinded key. 
The relay should also verify that it has received a valid descriptor with this blinded key. [XXX: Other options include putting only the blinded key, or only the signing key in this cell. In either of these cases, we must look up the descriptor to fully validate the contents, but we require the descriptor to be present anyway. -special] [XXX: What happens with the MAINT_INTRO process defined in proposal 224 section 3.1.3? -special] 2.3. Client connection to a service A client that wants to connect to a hidden service should first calculate the responsible introduction points for the onion address as described in section 2.1 above. The client chooses one introduction point at random, builds a circuit, and fetches the descriptor. Once it has received, verified, and decrypted the descriptor, the client can use the same circuit to send the INTRODUCE1 cell. 2.3.1. Client requests a descriptor Clients can request a descriptor by opening a directory stream with BEGIN_DIR, and sending a HTTP GET request as described in proposal 224, section 2.2.4. The client must verify the signatures of the descriptor, and decrypt the encrypted portion to access the "enc-key". This key is used to encrypt the contents of the INTRODUCE1 cell to the service. Because the descriptor is specific to each introduction point, client-side descriptor caching changes significantly. There is little point in caching these descriptors, because they are inexpensive to request and will always be available when a service-side introduction circuit is available. A client that does caching must be prepared to handle INTRODUCE1 failures due to rotated keys. 2.3.2. Client sends INTRODUCE1 After requesting the descriptor, the client can use the same circuit to send an INTRODUCE1 cell, which is forwarded to the service and begins the rendezvous process. 
The INTRODUCE1 cell is the same as proposal 224 section 3.2.1, except that the AUTH_KEYID is the blinded public key, instead of the now-removed introduction point authentication key.

The relay must permit this circuit to change purpose from the directory request to a client or server introduction.

3. Other changes to proposal 224

3.1. Removing proposal 224 legacy relay support

Proposal 224 defines a process for using legacy relays as introduction points; see section 3.1.2 [LEGACY_EST_INTRO], and 3.2.3 [LEGACY-INTRODUCE1]. With the changes to the introduction point in this proposal, it's no longer possible to maintain support for legacy introduction points.

These sections of proposal 224 are removed, along with other references to legacy introduction points and RSA introduction point keys. We will need to handle the migration process to ensure that sufficient relays are available as introduction points. See the discussion in section 4.1 for more details.

3.2. Removing the "introduction point authentication key"

The "introduction point authentication key" defined in proposal 224 is removed. The "descriptor signing key" is used to sign descriptors and the ESTABLISH_INTRO2 cell. Descriptors are unique for each introduction point, and there is no point in generating a new key used only to sign the ESTABLISH_INTRO2 cell.

4. Discussion

4.1. No backwards compatibility with legacy relays

By changing the introduction procedure in such a way, we are unable to maintain backwards compatibility. That is, hidden services will be unable to use old relays as their introduction points, and similarly clients will be unable to introduce through old relays.

To maintain an adequate anonymity set of intro points, clients and hidden services should perform this introduction method only after most relays have upgraded.
For this reason we introduce the consensus parameter HSMergedIntroduction which controls whether hidden services should perform this merged introduction or fall back to the old one. [XXX: Do we? This sounds like we have to implement both in the client, which I thought we wanted to avoid. An alternative is to make sure that the intro point side is done early enough, and that clients know not to rely on the security of 224 services until enough relays are upgraded and the implementation is done. -special] 4.2. Restriction on the number of intro points and impact on load balancing One drawback of this proposal is that the number of introduction points of a hidden service is now a constant global parameter. Hence, a hidden service can no longer adjust how many introduction points it uses, or select the nodes that will serve as its introduction points. While bad, we don't consider this a major drawback since we don't believe that introduction points are a significant bottleneck on hidden services performance. However, our system significantly impacts the way some load balancing schemes for hidden services work. For example, onionbalance is a third-party application that manages the introduction points of a hidden service in a way that allows traffic load-balancing. This is achieved by compiling a master descriptor that mixes and matches the introduction points of underlying hidden service instances. With our system there are no descriptors that onionbalance can use to mix and match introduction points. A variant of the onionbalance idea that could work with our system would involve onionbalance starting a hidden service, not establishing any intro points, and then ordering the underlying hidden service load-balancing instances to establish intro points to all the right introduction points. 4.3. 
Behavior when introduction points go offline or misbehave In this new system, it's the Tor network that decides which relays should be used as the intro points of a hidden service for every time period. This means, that a hidden service is forced to use those relays as intro points if it wants clients to connect to it. This brings up the topic of what should happen when the designated relays go offline or refuse connections. Our behavior here should block guard discovery attacks (as in #8239) while allowing maximum reachability for clients. We should also make sure that an adversary cannot manipulate the hash ring in such a way that forces us to rotate introduction points quickly. This is enforced by the uptime check that is necessary for acquiring the HSDir flag (#8243). For this reason we propose the following rules: - After every consensus and when the blinded public key changes as a result of the time period, hidden services need to recalculate their introduction points and adjust themselves by establishing intro points to the new relays. - When an introduction point goes offline or drops connections, we attempt to re-establish to it INTRO_RETRIES times per consensus. If the intro point failed more than INTRO_RETRIES times for a consensus period, we abandon it and stay with one less intro point. If a new consensus is released and that relay is still listed as online, then we reset our retry counter and start trying again. [XXX: Is this crazy? -asn] [XXX: INTRO_RETRIES = 3? -asn] 4.4. Defining constants; how many introduction points for a service? We keep the same intro point configuration as in proposal 224. That is, each hidden service uses 6 relays and keeps them for a whole time period. [XXX: Are these good constants? We don't have a chance to change them in the future!! -asn] [XXX: 224 makes them consensus parameters, which we can keep, but they can still only be changed on a network-wide basis. 
-special] References: [1] : https://lists.torproject.org/pipermail/tor-dev/2016-January/010203.html
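The retry rule proposed in section 4.3 can be sketched as a small per-introduction-point state machine. The class and method names are hypothetical, and INTRO_RETRIES = 3 is only the tentative value floated in the XXX note, not a settled constant.

```python
INTRO_RETRIES = 3  # tentative value from the XXX note in section 4.3


class IntroPointState:
    """Hypothetical per-relay retry bookkeeping for one consensus period."""

    def __init__(self):
        self.failures = 0
        self.abandoned = False

    def on_failure(self):
        """Record a failed establish attempt; return True if we should retry."""
        self.failures += 1
        if self.failures > INTRO_RETRIES:
            # Abandon the relay and stay with one less intro point.
            self.abandoned = True
        return not self.abandoned

    def on_new_consensus(self, still_listed):
        """Reset the counter if the relay is still listed as online."""
        if still_listed:
            self.failures = 0
            self.abandoned = False
```

Bounding the retries per consensus keeps the service from repeatedly rebuilding circuits to a relay that keeps dropping them, while the reset on each new consensus lets the service recover once the relay behaves again.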
Filename: 247-hs-guard-discovery.txt
Title: Defending Against Guard Discovery Attacks using Vanguards
Authors: George Kadianakis and Mike Perry
Created: 2015-07-10
Status: Superseded
Superseded-by: 292-mesh-vanguards.txt

[This proposal is superseded by proposal 292-mesh-vanguards.txt based on our analysis and experiences while implementing and simulating the vanguard design.]

0. Motivation

A guard discovery attack allows attackers to determine the guard node of a Tor client. The hidden service rendezvous protocol provides an attack vector for a guard discovery attack since anyone can force an HS to construct a 3-hop circuit to a relay (#9001).

Following the guard discovery attack with a compromise and/or coercion of the guard node can lead to the deanonymization of a hidden service.

1. Overview

This document tries to make the above guard discovery + compromise attack harder to launch. It introduces a configuration option which makes the hidden service also pin the second and third hops of its circuits for a longer duration.

With this new path selection, we force the adversary to perform a Sybil attack and two compromise attacks before succeeding. This is an improvement over the current state where the Sybil attack is trivial to pull off, and only a single compromise attack is required.

With this new path selection, an attacker is forced to compromise one or more nodes before learning the guard node of a hidden service. This increases the uncertainty of the attacker, since compromise attacks are costly and potentially detectable, so an attacker will have to think twice before beginning a chain of node compromise attacks that he might not be able to complete.

1.1. Visuals

Here is what a hidden service rendezvous circuit currently looks like:

                 -> middle_1 -> middle_A
                 -> middle_2 -> middle_B
                 -> middle_3 -> middle_C
                 -> middle_4 -> middle_D
   HS -> guard   -> middle_5 -> middle_E -> Rendezvous Point
                 -> middle_6 -> middle_F
                 -> middle_7 -> middle_G
                 -> middle_8 -> middle_H
                 ->   ...    ->   ...
                 -> middle_n -> middle_n

This proposal pins the two middle nodes to a much more restricted set, as follows:

                                -> guard_3A_A
                  -> guard_2_A  -> guard_3A_B
                                -> guard_3A_C -> Rendezvous Point
   HS -> guard_1
                                -> guard_3B_D
                  -> guard_2_B  -> guard_3B_E
                                -> guard_3B_F -> Rendezvous Point

Note that the third level guards are partitioned into buckets such that they are only used with one specific second-level guard. In this way, we ensure that even if an adversary is able to execute a Sybil attack against the third layer, they only get to learn one of the second-layer Guards, and not all of them. This prevents the adversary from gaining the ability to take their pick of the weakest of the second-level guards for further attack.

2. Design

This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.

When a hidden service picks its guard nodes, it also picks an additional NUM_SECOND_GUARDS-sized set of middle nodes for its `second_guard_set`. For each of those middle layer guards, it picks NUM_THIRD_GUARDS that will be used only with a specific middle node. These sets are unique to each hidden service created by a single Tor client, and must be kept separate and distinct.

When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from that second hop's corresponding `third_guard_set` as third hops of the circuit.

A hidden service rotates nodes from the 'second_guard_set' at a random time between MIN_SECOND_GUARD_LIFETIME hours and MAX_SECOND_GUARD_LIFETIME hours.

A hidden service rotates nodes from the 'third_guard_set' at a random time between MIN_THIRD_GUARD_LIFETIME and MAX_THIRD_GUARD_LIFETIME hours.
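The partitioned guard sets described above might be represented as a mapping from each second-level guard to its own bucket of third-level guards. The sketch below is illustrative only: the function names are hypothetical, relays are drawn uniformly rather than with Tor's weighted path selection, the buckets are assumed to be disjoint, and the set sizes are the values the proposal gives in section 2.1.

```python
import random

NUM_SECOND_GUARDS = 4  # value given in section 2.1
NUM_THIRD_GUARDS = 4   # value given in section 2.1


def pick_vanguards(middles, rng):
    """Map each second-level guard to its own bucket of third-level guards.

    Uniform sampling without replacement is an assumption of this sketch.
    """
    needed = NUM_SECOND_GUARDS * (1 + NUM_THIRD_GUARDS)
    chosen = rng.sample(middles, needed)
    second = chosen[:NUM_SECOND_GUARDS]
    rest = chosen[NUM_SECOND_GUARDS:]
    return {
        guard: rest[i * NUM_THIRD_GUARDS:(i + 1) * NUM_THIRD_GUARDS]
        for i, guard in enumerate(second)
    }


def pick_hops(guard_sets, rng):
    """Choose a second hop, then a third hop from that hop's bucket only."""
    second = rng.choice(list(guard_sets))
    third = rng.choice(guard_sets[second])
    return second, third
```

Because a third hop is only ever drawn from the bucket of the chosen second hop, a malicious third-level guard observes a single second-level guard, which is the property the partitioning is meant to provide.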
These extra guard nodes should be picked with the same path selection procedure that is used for regular middle nodes (though see Section 4.3 and Section 5.1 for reasons to restrict this slightly beyond the current path selection rules).

Each node's rotation time is tracked independently, to avoid disclosing the rotation times of the primary and second-level guards.

XXX: IP and RP actually need to be separate 4th hops. On the server side, IP should be separate to better unlink IP from the 3rd layer guards, and on the client side, the RP needs to come from the full network to avoid cross-visit linkability. So it's seven proxies all the time...

XXX: What about hsdir fetch? To avoid targeting and visit linkability, it needs an ephemeral hop too. Unless we believe that linkability is low? It is lower than IP linkability, since the hsdescs can be cached for a bit. But if we are worried about visit linkability, then the client should also add an extra ephemeral hop during IP visits, making that circuit 8 hops long...

XXX: Ephemeral hops for service side before RP?

XXX: Really crazy idea: We can provide multiple path security levels. We could have full 4 hops, or combine Layer2+Layer3, or combine Layer1+Layer2 and Layer3+Layer4 for lower-security HS circs.

XXX: update the load balancing proposal with the outcome of this :/

XXX: how should proposal 241 ("Resisting guard-turnover attacks") be applied here?

2.1. Security parameters

We set NUM_SECOND_GUARDS to 4 nodes and NUM_THIRD_GUARDS to 4 nodes (i.e. four sets of four). However, see Section 5.2 for some performance versus security tradeoffs and discussion.

We set MIN_SECOND_GUARD_LIFETIME to 1 day, and MAX_SECOND_GUARD_LIFETIME to 32 days inclusive, for an average rotation rate of ~11 days, using the min(X,X) distribution specified in Section 3.2.3.
We set MIN_THIRD_GUARD_LIFETIME to 1 hour, and MAX_THIRD_GUARD_LIFETIME to 18 hours inclusive, for an average rotation rate of ~12 hours, using the max(X,X) distribution specified in Section 3.2.3.

The above parameters should be configurable in the Tor consensus and torrc.

See Section 3 for more analysis on these constants.

3. Rationale and Security Parameter Selection

3.1. Threat model, Assumptions, and Goals

Consider an adversary with the following powers:

   - Can launch a Sybil guard discovery attack against any node of a rendezvous circuit. The slower the rotation period of the node, the longer the attack takes. Similarly, the higher the percentage of the network that is compromised, the faster the attack runs.

   - Can compromise any node on the network, but this compromise takes time and potentially even coercive action, and also carries risk of discovery.

We also make the following assumptions about the types of attacks:

   1. A Sybil attack is observable both by people monitoring the network for large numbers of new nodes and by vigilant hidden service operators. It will require either large amounts of traffic sent towards the hidden service, multiple test circuits, or both.

   2. A Sybil attack against the second or first layer Guards will be noisier than a Sybil attack against the third layer guard, since the second and first layer Sybil attack requires a timing side channel in order to determine success, whereas Sybil success is almost immediately obvious to the third layer guard, since it will be instructed to connect to a cooperating malicious rend point by the adversary.

   3. As soon as the adversary is confident they have won the Sybil attack, an even more aggressive circuit building attack will allow them to determine the next node very fast (an hour or less).

   4. The adversary is strongly disincentivized from compromising nodes that may prove useless, as node compromise is even more risky for the adversary than a Sybil attack in terms of being noticed.
Given this threat model, our security parameters were selected so that the first two layers of guards should be hard to attack using a Sybil guard discovery attack and hence require a node compromise attack. Ideally, we want the node compromise attacks to carry a non-negligible probability of being useless to the adversary by the time they complete.

On the other hand, the outermost layer of guards should rotate fast enough to _require_ a Sybil attack.

3.2. Parameter Tuning

3.2.1. Sybil rotation counts for a given number of Guards

The probability of Sybil success for Guard discovery can be modeled as the probability of choosing 1 or more malicious middle nodes for a sensitive circuit over some period of time.

    P(At least 1 bad middle) = 1 - P(All Good Middles)
                             = 1 - P(One Good middle)^(num_middles)
                             = 1 - (1 - c/n)^(num_middles)

    c/n is the adversary compromise percentage

In the case of Vanguards, num_middles is the number of Guards you rotate through in a given time period. This is a function of the number of vanguards in that position (v), as well as the number of rotations (r).

    P(At least one bad middle) = 1 - (1 - c/n)^(v*r)

Here are detailed tables of the number of rotations required to reach a given Sybil success rate for a given number of guards.
    1.0% Network Compromise:
    Sybil Success     One   Two Three  Four  Five   Six Eight  Nine   Ten Twelve Sixteen
         10%           11     6     4     3     3     2     2     2     2      1       1
         15%           17     9     6     5     4     3     3     2     2      2       2
         25%           29    15    10     8     6     5     4     4     3      3       2
         50%           69    35    23    18    14    12     9     8     7      6       5
         60%           92    46    31    23    19    16    12    11    10      8       6
         75%          138    69    46    35    28    23    18    16    14     12       9
         85%          189    95    63    48    38    32    24    21    19     16      12
         90%          230   115    77    58    46    39    29    26    23     20      15
         95%          299   150   100    75    60    50    38    34    30     25      19
         99%          459   230   153   115    92    77    58    51    46     39      29

    5.0% Network Compromise:
    Sybil Success     One   Two Three  Four  Five   Six Eight  Nine   Ten Twelve Sixteen
         10%            3     2     1     1     1     1     1     1     1      1       1
         15%            4     2     2     1     1     1     1     1     1      1       1
         25%            6     3     2     2     2     1     1     1     1      1       1
         50%           14     7     5     4     3     3     2     2     2      2       1
         60%           18     9     6     5     4     3     3     2     2      2       2
         75%           28    14    10     7     6     5     4     4     3      3       2
         85%           37    19    13    10     8     7     5     5     4      4       3
         90%           45    23    15    12     9     8     6     5     5      4       3
         95%           59    30    20    15    12    10     8     7     6      5       4
         99%           90    45    30    23    18    15    12    10     9      8       6

    10.0% Network Compromise:
    Sybil Success     One   Two Three  Four  Five   Six Eight  Nine   Ten Twelve Sixteen
         10%            2     1     1     1     1     1     1     1     1      1       1
         15%            2     1     1     1     1     1     1     1     1      1       1
         25%            3     2     1     1     1     1     1     1     1      1       1
         50%            7     4     3     2     2     2     1     1     1      1       1
         60%            9     5     3     3     2     2     2     1     1      1       1
         75%           14     7     5     4     3     3     2     2     2      2       1
         85%           19    10     7     5     4     4     3     3     2      2       2
         90%           22    11     8     6     5     4     3     3     3      2       2
         95%           29    15    10     8     6     5     4     4     3      3       2
         99%           44    22    15    11     9     8     6     5     5      4       3

The rotation counts in these tables were generated with:

    def num_rotations(c, v, success):
        r = 0
        while 1-math.pow((1-c), v*r) < success:
            r += 1
        return r

3.2.2. Rotation Period

As specified in Section 3.1, the primary driving force for the third layer selection was to ensure that these nodes rotate fast enough that it is not worth trying to compromise them, because it is unlikely for compromise to succeed and yield useful information before the nodes stop being used. For this reason we chose 1 to 18 hours, with a weighted distribution (Section 3.2.3) causing the expected average to be 12 hours.
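As a rough check of how the rotation counts from Section 3.2.1 translate into wall-clock time, the sketch below multiplies the rotation count by an average rotation period. It is a Python 3 restatement of the `num_rotations` function above; the helper `days_to_sybil` and its `period_hours` parameter are hypothetical names introduced here for illustration.

```python
import math


def num_rotations(c, v, success):
    """Smallest r such that 1 - (1 - c)^(v*r) >= success."""
    r = 0
    while 1 - math.pow(1 - c, v * r) < success:
        r += 1
    return r


def days_to_sybil(c, v, success, period_hours):
    """Hypothetical helper: rotations needed times the average rotation period, in days."""
    return num_rotations(c, v, success) * period_hours / 24.0


# Sixteen third-level guards (four bins of four), 99% success, 12-hour average period.
for c in (0.01, 0.05, 0.10):
    print("%4.1f%% adversary: %.1f days" % (c * 100, days_to_sybil(c, 16, 0.99, 12)))
```

With sixteen third-level guards this gives 29, 6 and 3 rotations for the 1%, 5% and 10% adversaries, matching the "Sixteen" column of the 99% rows.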
From the table in Section 3.2.1, with NUM_SECOND_GUARDS=4 and NUM_THIRD_GUARDS=4, it can be seen that this means that the Sybil attack will complete with near-certainty (99%) in 29*12 hours (14.5 days) for the 1% adversary, 3 days for the 5% adversary, and 1.5 days for the 10% adversary.

Since rotation of each node happens independently, the distribution of when the adversary expects to win this Sybil attack in order to discover the next node up is uniform. This means that on average, the adversary should expect that half of the rotation period of the next node is already over by the time that they win the Sybil.

With this fact, we choose our range and distribution for the second layer rotation to be short enough to cause the adversary to risk compromising nodes that are useless, yet long enough to require a Sybil attack to be noticeable in terms of client activity. For this reason, we choose a minimum second-layer guard lifetime of 1 day, since this gives the adversary a minimum expected value of 12 hours during which they can compromise a guard before it might be rotated. If the total expected rotation rate is 11 days, then the adversary can expect overall to have 5.5 days remaining after completing their Sybil attack before a second-layer guard rotates away.

3.2.3. Rotation distributions

In order to skew the distribution of the third layer guard towards higher values, we use max(X,X) for the distribution, where X is a random variable that takes on values from the uniform distribution.

In order to skew the distribution of the second layer guard towards lower values (to increase the risk of compromising useless nodes), we use min(X,X).

Here's a table of expectations (arithmetic means) for relevant ranges of X (sampled from 0..N-1).
The table was generated with the following python functions:

    def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
    def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

    def ExpFn(N, ProbFunc):
        exp = 0.0
        for i in range(N): exp += i*ProbFunc(N, i)
        return exp

The current choice for second-layer guards is noted with **, and the current choice for third-layer guards is noted with ***.

    Range   Exp[Min(X,X)]   Exp[Max(X,X)]
     10        2.85            6.15
     11        3.18            6.82
     12        3.51            7.49
     13        3.85            8.15
     14        4.18            8.82
     15        4.51            9.49
     16        4.84           10.16
     17        5.18           10.82***
     18        5.51           11.49
     19        5.84           12.16
     20        6.18           12.82
     21        6.51           13.49
     22        6.84           14.16
     23        7.17           14.83
     24        7.51           15.49
     25        7.84           16.16
     26        8.17           16.83
     27        8.51           17.49
     28        8.84           18.16
     29        9.17           18.83
     30        9.51           19.49
     31        9.84           20.16
     32       10.17**         20.83
     33       10.51           21.49
     34       10.84           22.16
     35       11.17           22.83
     36       11.50           23.50
     37       11.84           24.16
     38       12.17           24.83
     39       12.50           25.50

The Cumulative Distribution Function (CDF) tells us the probability that a guard will no longer be in use after a given number of time units have passed.

Because the Sybil attack on the third node is expected to complete at any point in the second node's rotation period with uniform probability, if we want to know the probability that a second-level Guard node will still be in use after t days, we first need to compute the probability distribution of the rotation duration of the second-level guard at a uniformly random point in time. Let's call this P(R=r).

For P(R=r), the probability of the rotation duration depends on the selection probability of a rotation duration, and the fraction of total time that rotation is likely to be in use. This can be written as:

    P(R=r) = ProbMinXX(X=r)*r / \sum_{i=1}^N ProbMinXX(X=i)*i

or in Python:

    def ProbR(N, r, ProbFunc=ProbMinXX):
        return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

For the full CDF, we simply sum up the fractional probability density for all rotation durations. For rotation durations less than t days, we add the entire probability mass for that period to the density function.
For durations d greater than t days, we take the fraction of that rotation period's selection probability and multiply it by t/d and add it to the density. In other words:

    def FullCDF(N, t, ProbFunc=ProbR):
        density = 0.0
        for d in range(N):
            if t >= d:
                density += ProbFunc(N, d)
            else:
                # The +1's below compensate for 0-indexed arrays:
                density += ProbFunc(N, d)*(float(t+1))/(d+1)
        return density

Computing this yields the following distribution for our current parameters:

     t    P(SECOND_ROTATION <= t)
     1    0.07701
     2    0.15403
     3    0.22829
     4    0.29900
     5    0.36584
     6    0.42869
     7    0.48754
     8    0.54241
     9    0.59338
    10    0.64055
    11    0.68402
    12    0.72392
    13    0.76036
    14    0.79350
    15    0.82348
    16    0.85043
    17    0.87452
    18    0.89589
    19    0.91471
    20    0.93112
    21    0.94529
    22    0.95738
    23    0.96754
    24    0.97596
    25    0.98278
    26    0.98817
    27    0.99231
    28    0.99535
    29    0.99746
    30    0.99881
    31    0.99958
    32    0.99992
    33    1.00000

This CDF tells us that for the second-level Guard rotation, the adversary can expect that 7.7% of the time, their third-level Sybil attack will provide them with a second-level guard node that has only 1 day remaining before it rotates. 15.4% of the time, there will be only 2 days or less remaining, and 22.8% of the time, 3 days or less.

Note that this distribution is still a day-resolution approximation. The actual numbers are likely even more biased towards lower values.

In this way, we achieve our goal of ensuring that the adversary must do the prep work to compromise multiple second-level nodes before likely being successful, or be extremely fast in compromising a second-level guard after winning the Sybil attack.

4. Security concerns and mitigations

4.1. Mitigating fingerprinting of new HS circuits

By pinning the middle nodes of rendezvous circuits, we make it easier for all hops of the circuit to detect that they are part of a special hidden service circuit with varying degrees of certainty.
The Guard node is able to recognize a Vanguard client with a high degree of certainty because it will observe a client IP creating the overwhelming majority of its circuits to just a few middle nodes in any given 10-18 day time period. The middle nodes will be able to tell with a variable certainty that depends on both its traffic volume and upon the popularity of the service, because they will see a large number of circuits that tend to pick the same Guard and Exit. The final nodes will be able to tell with a similar level of certainty that depends on their capacity and the service popularity, because they will see a lot of rend handshakes that all tend to have the same second hop. The final nodes can also actively confirm that they have been selected for the third hop by creating multiple Rend circuits to a target hidden service, and seeing if they are chosen for the Rend point. The most serious of these is the Guard fingerprinting issue. When proposal 254-padding-negotiation is implemented, services that enable this feature should use those padding primitives to create fake circuits to random middle nodes that are not their guards, in an attempt to look more like a client. Additionally, if Tor Browser implements "virtual circuits" based on SOCKS username+password isolation in order to enforce the re-use of paths when SOCKS username+passwords are re-used, then the number of middle nodes in use during a typical user's browsing session will be proportional to the number of sites they are viewing at any one time. This is likely to be much lower than one new middle node every ten minutes, and for some users, may be close to the number of Vanguards we're considering. This same reasoning is also an argument for increasing the number of second-level guards beyond just two, as it will spread the hidden service's traffic over a wider set of middle nodes, making it both easier to cover, and behave closer to a client using SOCKS virtual circuit isolation. 4.2. 
Hidden service linkability Multiple hidden services on the same Tor instance should use separate second and third level guard sets; otherwise an adversary is trivially able to determine that the two hidden services are co-located by inspecting their current chosen rend point nodes. Unfortunately, if the adversary is still able to determine that two or more hidden services are run on the same Tor instance through some other means, then they are able to take advantage of this fact to execute a Sybil attack more effectively, since there will now be an extra set of guard nodes for each hidden service in use. For this reason, if Vanguards are enabled, and more than one hidden service is configured, the user should be advised to ensure that they do not accidentally leak that the two hidden services are from the same Tor instance. For cases where the user or application wants to deliberately link multiple different hidden services together (for example, to support concurrent file transfer and chat for the same identity), this behavior should be configurable. A torrc option DisjointHSVanguards should be provided that defaults to keeping the Vanguards separate for each hidden service. 4.3. Long term information leaks Due to Tor's path selection constraints, the client will never choose its primary guard node as later positions in the circuit. Over time, the absence of these nodes will give away information to the adversary. Unfortunately, the current solution (from bug #14917) of simply creating a temporary second guard connection to allow the primary guard to appear in some paths will make the hidden service fingerprinting problem worse, since only hidden services will exhibit this behavior on the local network. The simplest mitigation is to require that no Guard-flagged nodes be used for the second and third-level nodes at all, and to allow the primary guard to be chosen as a rend point. 
XXX: Dgoulet suggested using arbitrary subsets here rather than the no Guard-flag restriction, esp since Layer2 inference is still a possibility. XXX: If a Guard-flagged node is chosen for the alls IP or RP, raise protocolerror. Refuse connection. Or allow our guard/other nodes in IP/RP.. Additionally, in order to further limit the exposure of secondary guards to sybil attacks, the bin position of the third-level guards should be stable over long periods of time. When choosing third-level guards, these guards should be given a fixed bin number so that if they are selected at a later point in the future, they are placed after the same second-level guard, and not a different one. A potential stateless way of accomplishing this is to assign third-level guards to a bin number such that H(bin_number | HS addr) is closest to the key for the third-level relay. 4.4. Denial of service Since it will be fairly trivial for the adversary to enumerate the current set of third-layer guards for a hidden service, denial of service becomes a serious risk for Vanguard users. For this reason, it is important to support a large number of third-level guards, to increase the amount of resources required to bring a hidden service offline by DoSing just a few Tor nodes. Even with multiple third-level guards, an adversary is still able to degrade either performance or user experience significantly, simply by taking out a fraction of them. The solution to this is to make use of the circuit build timeout code (Section 5.2) to have the hidden service retry the rend connection multiple times. Unfortunately, it is unwise to simply replace unresponsive third-level guards that fail to complete circuits, as this will accelerate the Sybil attack. 4.5. Path Bias XXX: Re-use Prop#259 here. 5. Performance considerations The switch to a restricted set of nodes will very likely cause significant performance issues, especially for high-traffic hidden services. 
If any of the nodes they select happen to be temporarily overloaded, performance will suffer dramatically until the next rotation period. 5.1. Load Balancing Since the second and third level "guards" are chosen from the set of all nodes eligible for use in the "middle" hop (as per hidden services today), this proposal should not significantly affect the long-term load on various classes of the Tor network, and should not require any changes to either the node weight equations, or the bandwidth authorities. Unfortunately, transient load is another matter, as mentioned previously. It is very likely that this scheme will increase instances of transient overload at nodes selected by high-traffic hidden services. One option to reduce the impact of this transient overload is to restrict the set of middle nodes that we choose from to some percentage of the fastest middle-capable relays in the network. This may have some impact on load balancing, but since the total volume of hidden service traffic is low, it may be unlikely to matter. 5.2. Circuit build timeout and topology The adaptive circuit build timeout mechanism in Tor is what corrects for instances of transient node overload right now. The timeout will naturally tend to select the current fastest and least-loaded paths even through this set of restricted routes, but it may fail to behave correctly if there are a very small set of nodes in each guard set, as it is based upon assumptions about the current path selection algorithm, and it may need to be tuned specifically for Vanguards, especially if the set of possible routes is small. It turns out that a fully-connected/mesh (aka non-binned) second guard to third guard mapping topology is a better option for CBT for performance, because it will create a larger total set of paths for CBT to choose from while using fewer nodes. 
This comes at the expense of exposing all second-layer guards to a single sybil attack, but for small numbers of guard sets, it may be worth the tradeoff. However, it also turns out that this need not block implementation, as worst-case the data structures and storage needed to support a fully connected mesh topology can do so by simply replicating the same set of third-layer guards for each second-layer guard bin. Since we only expect this tradeoff to be worth it when the sets are small, this replication should not be expensive in practice. 5.3. OnionBalance At first glance, it seems that this scheme makes multi-homed hidden services such as OnionBalance[1] even more important for high-traffic hidden services. Unfortunately, if it is equally damaging to the user for any of their multi-homed hidden service locations to be discovered, then OnionBalance is strictly equivalent to simply increasing the number of second-level guard nodes in use, because an active adversary can perform simultaneous Sybil attacks against all of the rend points offered by the multi-homed OnionBalance introduction points. XXX: This actually matters for high-perf censorship resistant publishing. It is better for those users to use onionbalance than to up their guards, since redundancy is useful for them. 5.4. Default vs optional behavior We suggest this torrc option to be optional because it changes path selection in a way that may seriously impact hidden service performance, especially for high traffic services that happen to pick slow guard nodes. However, by having this setting be disabled by default, we make hidden services who use it stand out a lot. For this reason, we should in fact enable this feature globally, but only after we verify its viability for high-traffic hidden services, and ensure that it is free of second-order load balancing effects. 
Even after that point, until Single Onion Services are implemented, there will likely still be classes of very high traffic hidden services for whom some degree of location anonymity is desired, but for which performance is much more important than the benefit of Vanguards, so there should always remain a way to turn this option off. 6. Future directions Here are some more ideas for improvements that should be done sooner or later: - Do we want to consider using Tor's GeoIP country database (if present) to ensure that the second-layer guards are chosen from a different country as the first-layer guards, or does this leak too much information to the adversary? - What does the security vs performance tradeoff actually look like for different amounts of bins? Or for mesh vs bins? We may need to simulate or run CBT tests to learn this. - With this tradeoff information, do we want to provide the user (or application) with a choice of 3 different Vanguard sets? One could imagine "small", "medium", and "large", for example. 7. Acknowledgments Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else who helped with this idea. This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. 
Appendix A: Full Python program for generating tables in this proposal

    #!/usr/bin/python
    import math

    ############ Section 3.2.1 #################

    def num_rotations(c, v, success):
        i = 0
        while 1-math.pow((1-c), v*i) < success: i += 1
        return i

    def rotation_line(c, pct):
        print("         %2d%%       %6d%6d%6d%6d%6d%6d%6d%6d%6d%6d%8d" %
              (pct, num_rotations(c, 1, pct/100.0),
               num_rotations(c, 2, pct/100.0),
               num_rotations(c, 3, pct/100.0),
               num_rotations(c, 4, pct/100.0),
               num_rotations(c, 5, pct/100.0),
               num_rotations(c, 6, pct/100.0),
               num_rotations(c, 8, pct/100.0),
               num_rotations(c, 9, pct/100.0),
               num_rotations(c, 10, pct/100.0),
               num_rotations(c, 12, pct/100.0),
               num_rotations(c, 16, pct/100.0)))

    def rotation_table_321():
        for c in [1,5,10]:
            print("\n    %2.1f%% Network Compromise: " % c)
            print("    Sybil Success     One   Two Three  Four  Five   Six Eight  Nine   Ten Twelve Sixteen")
            for success in [10,15,25,50,60,75,85,90,95,99]:
                rotation_line(c/100.0, success)

    ############ Section 3.2.3 #################

    def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
    def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

    def ExpFn(N, ProbFunc):
        exp = 0.0
        for i in range(N): exp += i*ProbFunc(N, i)
        return exp

    def ProbR(N, r, ProbFunc=ProbMinXX):
        return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

    def FullCDF(N, t, ProbFunc=ProbR):
        density = 0.0
        for d in range(N):
            if t >= d:
                density += ProbFunc(N, d)
            else:
                # The +1's below compensate for 0-indexed arrays:
                density += ProbFunc(N, d)*float(t+1)/(d+1)
        return density

    def expectation_table_323():
        print("\n    Range   Min(X,X)   Max(X,X)")
        for i in range(10,40):
            print("     %2d      %2.2f      %2.2f" %
                  (i, ExpFn(i, ProbMinXX), ExpFn(i, ProbMaxXX)))

    def CDF_table_323():
        print("\n     t    P(SECOND_ROTATION <= t)")
        for i in range(1,34):
            print("    %2d    %2.5f" % (i, FullCDF(33, i-1)))

    ########### Output ############

    # Section 3.2.1
    rotation_table_321()

    # Section 3.2.3
    expectation_table_323()
    CDF_table_323()

----------------------

1. https://onionbalance.readthedocs.org/en/latest/design.html#overview
Filename: 248-removing-rsa-identities.txt
Title: Remove all RSA identity keys
Authors: Nick Mathewson
Created: 15 July 2015
Status: Needs-Revision

1. Summary

With 0.2.7.2-alpha, all relays will have Ed25519 identity keys. Old identity keys are 1024-bit RSA, which should not really be considered adequate. In proposal 220, we describe a migration path to start using Ed25519 keys. This proposal describes an additional migration path, for finally removing our old RSA identity keys.

See also proposal 245, which describes a migration path away from the old TAP RSA1024-based circuit extension protocol.

1.1. Steps of migration

Phase 1. Prepare for routers that do not advertise their RSA identities, by teaching clients and relays and other dependent software how to handle them. Reject such routers at the authority level.

Phase 2. Once all supported routers and clients are updated to phase 1, we can accept routers at the authority level which lack RSA keys.

Phase 3. Once all authorities accept routers without RSA keys, we can finally remove RSA keys from relays.

2. Accepting descriptors without RSA identities

We make the following changes to the descriptor format:

If an ed25519 key and signature are present, then these elements may be omitted: "fingerprint", "signing-key", "router-signature". They must either be all present or all absent. If they are all absent, then the router has no RSA identity key.

Authorities MUST NOT accept router descriptors of this form in phase 1.

3. Accepting handshakes without RSA identities

When performing a new version of our link handshake, only the Ed25519 key and certificates and authentication need to be performed. If the link handshake is performed this way, it is accepted as authenticating the router with an ed25519 key but no RSA key.

A circuit extension EXTEND2 cell may contain an Ed25519 identity but not an RSA identity.
In this case, the relay should connect the circuit to any connection with the correct ed25519 identity, regardless of RSA identity. If an EXTEND2 cell contains an RSA identity fingerprint, however, the relay receiving it should not connect to any relay that has a different RSA identity or that has no RSA identity, even if the Ed25519 identity does match.

4. UI updates

In phase 1 we can update our UIs to refer to all relays that have Ed25519 keys by their Ed25519 keys. We can update our configuration and control port interfaces so that they accept Ed keys as well as RSA keys. During phase 1, we should warn about identifying any dual-identity relays by their Ed identity alone.

For backward compatibility, we should consider a default that refers to Ed25519 relays by the first 160 bits of their key. This would allow many controller-based tools to work transparently with the new key types.

5. Changes to external tools

This is the big one. We need a relatively comprehensive list of tools we can break with the above changes. Anything that refers to relays by SHA1(RSA1024_id) will need to be able to receive, store, and use an Ed25519 key instead.

6. Testing

Before going forward with phase 2 and phase 3, we need to verify that we did phase 1 correctly. To do so, we should create a small temporary testing network, and verify that it works correctly as we make the phase 2 and phase 3 changes.
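The connection-matching rule from Section 3 of this proposal (248) can be sketched as a small predicate. This is only an illustration, assuming that identities are raw byte strings, that `None` means "not present", and that the request always carries an Ed25519 identity; the function name `connection_matches` is hypothetical.

```python
from typing import Optional


def connection_matches(want_ed: bytes,
                       want_rsa: Optional[bytes],
                       conn_ed: Optional[bytes],
                       conn_rsa: Optional[bytes]) -> bool:
    """Decide whether an existing connection may be used for an EXTEND2 request."""
    if conn_ed != want_ed:
        # The Ed25519 identity must always match.
        return False
    if want_rsa is None:
        # No RSA identity requested: any RSA identity (or none) is acceptable.
        return True
    # RSA identity requested: it must be present on the connection and equal.
    return conn_rsa == want_rsa
```

For example, a request that names both identities is refused for a relay that authenticated with the right Ed25519 key but no RSA key, while a request naming only the Ed25519 key is accepted for that relay.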
Filename: 249-large-create-cells.txt
Title: Allow CREATE cells with >505 bytes of handshake data
Authors: Nick Mathewson, Isis Lovecruft
Created: 23 July 15
Updated: 13 December 2017
Status: Superseded
Superseded-By: 319-wide-everything.md

1. Summary

There have been multiple proposals over the last year or so for adding post-quantum cryptography to Tor's circuit extension handshakes. (See for example https://eprint.iacr.org/2015/008 or https://eprint.iacr.org/2015/287 .) These proposals share the property that the request and reply for a handshake message do not fit in a single RELAY cell.

In this proposal I describe a new CREATE2V cell for handshakes that don't fit in a 505-byte CREATE2 cell's HDATA section, and a means for fragmenting these CREATE2V cells across multiple EXTEND2 cells. I also discuss replies, migration, and DoS-mitigation strategies.

2. CREATE2V and CREATED2V

First, we add two variable-width cell types, CREATE2V and CREATED2V.

These cell formats are nearly the same as CREATE2 and CREATED2. (Here specified using Trunnel.)

    struct create2v_cell_body {
       /* Handshake type */
       u16 htype;
       /* Length of handshake data */
       u16 hlen;
       /* Handshake data */
       u8 hdata[hlen];
       /* Padding data to be ignored */
       u8 ignored[];
    };

    struct created2v_cell_body {
       /* Handshake reply length */
       u16 hlen;
       /* Handshake reply data */
       u8 hdata[hlen];
       /* Padding data to be ignored */
       u8 ignored[];
    };

The 'ignored' fields, which extend to the end of the variable-length cells, are reserved. Initiators MAY set them to any length, and MUST fill them with either zero-valued bytes or pseudo-random bytes. Responders MUST ignore them, regardless of what they contain. When a CREATE2V cell is generated in response to a set of EXTEND2 cells, these fields are set by the relay that receives the EXTEND2 cells.

(The purpose of the 'ignored' fields here is future-proofing and padding.)
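To make the `create2v_cell_body` layout concrete, here is a minimal encode/parse sketch. It assumes network (big-endian) byte order for the two u16 fields, as is usual for Tor cells, and zero-valued padding for the 'ignored' field; the function names and the `pad_to` parameter are hypothetical and not part of the proposal.

```python
import struct


def encode_create2v(htype: int, hdata: bytes, pad_to: int = 0) -> bytes:
    """Build a create2v_cell_body: htype, hlen, hdata, then optional zero padding."""
    if len(hdata) > 0xFFFF:
        raise ValueError("hdata does not fit in a u16 length field")
    body = struct.pack("!HH", htype, len(hdata)) + hdata
    if len(body) < pad_to:
        body += bytes(pad_to - len(body))  # the 'ignored' field, zero-filled
    return body


def parse_create2v(body: bytes):
    """Return (htype, hdata); trailing 'ignored' bytes are dropped unread."""
    if len(body) < 4:
        raise ValueError("body too short for htype and hlen")
    htype, hlen = struct.unpack_from("!HH", body)
    if len(body) < 4 + hlen:
        raise ValueError("hlen exceeds the available data")
    return htype, body[4:4 + hlen]
```

A responder built this way never inspects the padding, which matches the requirement that the 'ignored' field be disregarded whatever it contains.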
Protocols MAY wish to pad to a certain multiple of bytes, or wish to pad the initiator/receiver payloads to be of equal length. This is encouraged but NOT REQUIRED. 3. Fragmented EXTEND2 cells Without changing the current EXTEND2 cell format, we change its semantics: If the 'HLEN' field in an EXTEND2 cell describes a handshake data section that would be too long to fit in the EXTEND2 cell's payload, the handshake data of the EXTEND2 cell is to be continued in one or more subsequent EXTEND2 cells. These subsequent cells MUST have zero link specifiers, handshake type 0xFFFF, and handshake data length field set to zero. Similarly, if the 'HLEN' field in an EXTENDED2 cell would be too long to fit into the EXTENDED2 cell's payload, the handshake reply data of the EXTENDED2 cell is to be continued in one or more subsequent EXTENDED2 cells. These subsequent cells must have the handshake data length field set to zero. These cells must be sent on the circuit with no intervening cells. If any intervening cells are received, the receiver SHOULD destroy the circuit. Protocols which make use of CREATE(D)2V cells SHOULD send an equal number of cells in either direction, to avoid trivially disclosing information about the direction of the circuit: for example a relay might use the fact that it saw five EXTEND2 cells in one direction and three in the other to easily determine whether it is the middle relay on the onion service-side or the middle relay on the client-side of a rendezvous circuit. 4. Interacting with RELAY_EARLY cells The first EXTEND2 cell in a batch must arrive in a RELAY_EARLY cell. The others MAY arrive in RELAY_EARLY cells. For many handshakes, for the possible lengths of many types of circuits, sending all EXTEND2 cells inside RELAY_EARLY cells will not be possible. For example, for a fragmented EXTEND2 cell with parts A B C D E, A is the only fragment that MUST be sent within a RELAY_EARLY. For parts B C D E, these are merely sent as EXTEND2{CREATE2V} cells. 
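The fragmentation rule from Section 3 can be sketched as follows. This is a rough illustration only: cells are modelled as dictionaries, and the per-cell capacities are passed in as parameters rather than derived from the cell format. The values 462 and 492 used in the usage line are taken from the proposal's own example in Section 5; the function name `fragment_extend2` is hypothetical.

```python
def fragment_extend2(htype, hdata, first_capacity, cont_capacity):
    """Split handshake data across one initial and zero or more continuation cells."""
    if first_capacity <= 0 or cont_capacity <= 0:
        raise ValueError("capacities must be positive")
    # The first cell carries the real handshake type and the total length.
    cells = [{
        "htype": htype,
        "hlen": len(hdata),
        "hdata": hdata[:first_capacity],
        "relay_early": True,  # the first fragment must arrive in a RELAY_EARLY cell
    }]
    pos = first_capacity
    while pos < len(hdata):
        # Continuation cells: no link specifiers, htype 0xFFFF, hlen 0.
        cells.append({
            "nspec": 0,
            "htype": 0xFFFF,
            "hlen": 0,
            "hdata": hdata[pos:pos + cont_capacity],
        })
        pos += cont_capacity
    return cells


cells = fragment_extend2(htype=2, hdata=bytes(2000), first_capacity=462, cont_capacity=492)
print([len(c["hdata"]) for c in cells])
```

With a 2000-byte handshake and those capacities, the split is 462, 492, 492, 492 and 62 bytes, the same five-cell train as in the proposal's example.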
Note that this change leaks the size of the handshake being used to intermediate relays. We should analyze this and see whether it matters. Clients and relays MAY send RELAY_DROP cells during circuit construction in order to hide the true size of their handshakes (but they can't send these drop cells inside a train of EXTEND2 or EXTENDED2 cells for a given handshake).

5. Example

So for example, if we are a client, and we need to send a 2000-byte handshake to extend a circuit from relay X to relay Y, we might send cells as follows:

    EXTEND2 {
      nspec = 2;
        lstype = [0x01 || 0x02];   (IPv4 or IPv6 node address)
        lslen = [0x04 || 0x16];
        lspec = { node address for Y, taking 8 bytes or 16 bytes};
        lstype = 0x03;             (An ed25519 node identity)
        lslen = 32;
        lspec = { ed25519 node ID for Y, taking 32 bytes }
      htype = {whatever the handshake type is.}
      hlen = 2000
      hdata = { the first 462 bytes of the handshake }
    }
    EXTEND2 {
      nspec = 0;
      htype = 0xffff;
      hlen = 0;
      hdata = { the next 492 bytes of the handshake }
    }
    EXTEND2 {
      nspec = 0;
      htype = 0xffff;
      hlen = 0;
      hdata = { the next 492 bytes of the handshake }
    }
    EXTEND2 {
      nspec = 0;
      htype = 0xffff;
      hlen = 0;
      hdata = { the next 492 bytes of the handshake }
    }
    EXTEND2 {
      nspec = 0;
      htype = 0xffff;
      hlen = 0;
      hdata = { the final 62 bytes of the handshake }
    }

Upon receiving this last cell, the relay X would send a CREATE2V cell to Y, containing the entire handshake.

6. Migration

We can and should implement the EXTEND2 fragmentation feature before we implement anything that uses it. If we can get it widely deployed before it's needed, we can use the new handshake types whenever both of the involved relays support this proposal.

Clients MUST NOT send fragmented EXTEND2 cells to relays that don't support them, since this would cause them to close the circuit.

Relays MAY send CREATE2V and CREATED2V cells to relays that don't support them, since unrecognized cell types are ignored.

6.1.
New Subprotocols and Subprotocol Versions This proposal introduces, following prop#264, the following new subprotocol numbers and their uses. 6.1.1. Relay Subprotocol "Relay 3" -- The OP supports all of "Relay 2", plus support for CREATE2V and CREATED2V cells and their above specification for link-layer authentication specifiers. 6.1.2. Link Subprotocol "Link 5": The OP supports all of "Link 1-4", plus support for the new EXTEND2 semantics. Namely, it understands that an EXTEND2 cell whose "hlen" field is greater than 505 will be followed by further "hdata" in fragmented EXTEND2 cells which MUST follow. It also understands that the following combination of EXTEND2 payload specifiers indicates that the cell is a continuation of the earlier payload portions: nspec = 0; htype = 0xffff; hlen = 0; 6.1.3. Handshake Subprotocol Additionally, we introduce a new subprotocol, "Handshake" and the following number assignments for previously occuring instances: "Handshake 1" -- The OP supports the TAP handshake. "Handshake 2" -- The OP supports the ntor handshake. We also reserve the following assignments for future use: "Handshake 3" -- The OP supports the "hybrid+null" ntor-like handshake from prop#269. "Handshake 4" -- The OP supports a(n as yet unspecified) post-quantum secure hybrid handshake, that is, the "hybrid+null" handshake from "Handshake 3", except with "null" part replaced with another (as yet unspecified) protocol to be composed with the ntor-like ECDH-based handshake. Further handshakes MUST be specified with "Handshake" subprotocol numbers, and MUST NOT be specified with "Relay" subprotocol numbers. The "Relay" subprotocol SHALL be used in the future to denote changes to handshake protocol handling of CREATE* and EXTEND* cells, i.e. CREATE, CREATED, CREATE_FAST, CREATED_FAST, CREATE2, CREATED2, CREATE2V, CREATED2V, EXTEND, EXTENDED, EXTEND2, and EXTENDED2. 
Thus, "Handshake 1" is taken to be synonymous with "Relay 1", and likewise "Handshake 2" with "Relay 2".

6.2. Subprotocol Recommendations

After the subprotocol additions above, we change to recommending the following in the consensus:

   recommended-client-protocols […] Link=5 Relay=3 Handshake=2
   recommended-relay-protocols […] Link=5 Relay=3 Handshake=2
   required-client-protocols […] Link=4-5 Relay=2-3 Handshake=1-2
   required-relay-protocols […] Link=3-5 Relay=1-3 Handshake=1-2

6.3. New Consensus Parameters

We introduce the following new consensus parameters:

   Create2VMaximumData SP int

      The maximum amount of "hlen" data, in bytes, which may be carried in either direction within a set of CREATE(D)2V cells. (default: 10240)

7. Resource management issues

This feature requires relays and clients to buffer EXTEND2 cell bodies for incoming cells until the entire CREATE2V/CREATED2V body has arrived. To avoid memory-related denial-of-service attacks, the buffers allocated for this data need to be counted against the total data usage of the circuit.

Further, circuits which receive and buffer CREATE(D)2V cells MUST store the time the first buffer chunk was allocated, and use it to inform the OOM manager w.r.t. the amount of data used and its staleness.

Appendix A. A rejected idea for migration

In section 6 above, I gave up on the idea of allowing relay A to extend to relay B with a large CREATE cell when relay A does not support this proposal.

There are other ways to do this, but they are impressively kludgey. For example, we could have a fake CREATE cell for new handshake types that always elicits a "yes, keep going!" CREATED cell. Then the client could send the rest of the handshake and receive the rest of the CREATED cell as RELAY cells inside the circuit.
This design would add an extra round-trip to circuit extension whenever it was used, however, and would violate a number of Tor's assumptions about circuits (e.g., by having half-created circuits, where authentication hasn't actually been performed). So I'm guessing we shouldn't do that. Appendix B. Acknowledgements This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.
Filename: 250-commit-reveal-consensus.txt Title: Random Number Generation During Tor Voting Authors: David Goulet, George Kadianakis Created: 2015-08-03 Status: Closed Supersedes: 225 UPDATE 2017/01/26: This proposal now has its own specification file as srv-spec.txt . Table Of Contents: 1. Introduction 1.1. Motivation 1.2. Previous work 2. Overview 2.1. Introduction to our commit-and-reveal protocol 2.2. Ten thousand feet view of the protocol 2.3. How we use the consensus [CONS] 2.3.1. Inserting Shared Random Values in the consensus 2.4. Persistent State of the Protocol [STATE] 2.5. Protocol Illustration 3. Protocol 3.1 Commitment Phase [COMMITMENTPHASE] 3.1.1. Voting During Commitment Phase 3.1.2. Persistent State During Commitment Phase [STATECOMMIT] 3.2 Reveal Phase 3.2.1. Voting During Reveal Phase 3.2.2. Persistent State During Reveal Phase [STATEREVEAL] 3.3. Shared Random Value Calculation At 00:00UTC 3.3.1. Shared Randomness Calculation [SRCALC] 3.4. Bootstrapping Procedure 3.5. Rebooting Directory Authorities [REBOOT] 4. Specification [SPEC] 4.1. Voting 4.1.1. Computing commitments and reveals [COMMITREVEAL] 4.1.2. Validating commitments and reveals [VALIDATEVALUES] 4.1.4. Encoding commit/reveal values in votes [COMMITVOTE] 4.1.5. Shared Random Value [SRVOTE] 4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS] 4.3. Persistent state format [STATEFORMAT] 5. Security Analysis 5.1. Security of commit-and-reveal and future directions 5.2. Predicting the shared random value during reveal phase 5.3. Partition attacks 5.3.1. Partition attacks during commit phase 5.3.2. Partition attacks during reveal phase 6. Discussion 6.1. Why the added complexity from proposal 225? 6.2. Why do you do a commit-and-reveal protocol in 24 rounds? 6.3. Why can't we recover if the 00:00UTC consensus fails? 7. Acknowledgements 1. Introduction 1.1. 
Motivation For the next generation hidden services project, we need the Tor network to produce a fresh random value every day in such a way that it cannot be predicted in advance or influenced by an attacker. Currently we need this random value to make the HSDir hash ring unpredictable (#8244), which should resolve a wide class of hidden service DoS attacks and should make it harder for people to gauge the popularity and activity of target hidden services. Furthermore this random value can be used by other systems in need of fresh global randomness like Tor-related protocols (e.g. OnioNS) or even non-Tor-related (e.g. warrant canaries). 1.2. Previous work Proposal 225 specifies a commit-and-reveal protocol that can be run as an external script and have the results be fed to the directory authorities. However, directory authority operators feel unsafe running a third-party script that opens TCP ports and accepts connections from the Internet. Hence, this proposal aims to embed the commit-and-reveal idea in the Tor voting process which should make it smoother to deploy and maintain. Another idea proposed specifically for Tor is Nick Hopper's "A threshold signature-based proposal for a shared RNG" which was never turned into an actual Tor proposal. 2. Overview This proposal alters the Tor consensus protocol such that a random number is generated every midnight by the directory authorities during the regular voting process. The distributed random generator scheme is based on the commit-and-reveal technique. The proposal also specifies how the final shared random value is embedded in consensus documents so that clients who need it can get it. 2.1. Introduction to our commit-and-reveal protocol Every day, before voting for the consensus at 00:00UTC each authority generates a new random value and keeps it for the whole day. The authority cryptographically hashes the random value and calls the output its "commitment" value. 
The original random value is called the "reveal" value.

The idea is that given a reveal value you can cryptographically confirm that it corresponds to a given commitment value (by hashing it). However, given a commitment value you should not be able to derive the underlying reveal value. The construction of these values is specified in section [COMMITREVEAL].

2.2. Ten thousand feet view of the protocol

Our commit-and-reveal protocol aims to produce a fresh shared random value every day at 00:00UTC. The final fresh random value is embedded in the consensus document at that time.

Our protocol has two phases and uses the hourly voting procedure of Tor. Each phase lasts 12 hours, which means that 12 voting rounds happen in between. In short, the protocol works as follows:

   Commit phase:

      Starting at 00:00UTC and for a period of 12 hours, authorities every hour include their commitment in their votes. They also include any received commitments from other authorities, if available.

   Reveal phase:

      At 12:00UTC, the reveal phase starts and lasts till the end of the protocol at 00:00UTC. In this stage, authorities must reveal the value they committed to in the previous phase. The commitment and revealed values from other authorities, when available, are also added to the vote.

   Shared Randomness Calculation:

      At 00:00UTC, the shared random value is computed from the agreed revealed values and added to the consensus.

This concludes the commit-and-reveal protocol at 00:00UTC every day.

2.3. How we use the consensus [CONS]

The produced shared random values need to be readily available to clients. For this reason we include them in the consensus documents.

Every hour the consensus documents need to include the shared random value of the day, as well as the shared random value of the previous day. That's because either of these values might be needed at a given time for a Tor client to access a hidden service according to section [TIME-OVERLAP] of proposal 224.
This means that both of these values need to be included in votes as well.

Hence, consensuses need to include:

   (a) The shared random value of the current time period.
   (b) The shared random value of the previous time period.

For this, a new SR consensus method will be needed to indicate which authorities support this new protocol.

2.3.1. Inserting Shared Random Values in the consensus

After voting happens, we need to be careful about how we pick which shared random values (SRV) to put in the consensus, to avoid breaking the consensus because of authorities having different views of the commit-and-reveal protocol (because maybe they missed some rounds of the protocol).

For this reason, authorities look at the received votes before creating a consensus and employ the following logic:

   - First of all, they make sure that the agreed upon consensus method is above the SR consensus method.

   - Authorities include an SRV in the consensus if and only if the SRV has been voted for by at least the majority of authorities.

   - For the consensus at 00:00UTC, authorities include an SRV in the consensus if and only if the SRV has been voted for by at least AuthDirNumAgreements authorities (where AuthDirNumAgreements is a newly introduced consensus parameter).

Authorities include in the consensus the most popular SRV that also satisfies the above constraints. Otherwise, no SRV should be included.

The above logic is used to make it harder to break the consensus by natural partitioning causes.

We use the AuthDirNumAgreements consensus parameter to enforce that a _supermajority_ of dirauths supports the SR protocol during SRV creation, so that even if a few of those dirauths drop offline in the middle of the run the SR protocol does not get disturbed. We go to extra lengths to ensure this because changing SRVs in the middle of the day has terrible reachability consequences for hidden service clients.

2.4.
Persistent State of the Protocol [STATE]

A directory authority needs to keep a persistent state on disk of the ongoing protocol run. This allows an authority to join the protocol seamlessly in the case of a reboot.

During the commitment phase, it is populated with the commitments of all authorities. Then during the reveal phase, the reveal values are also stored in the state.

As discussed previously, the shared random values from the current and previous time period must also be present in the state at all times if they are available.

2.5. Protocol Illustration

An illustration for better understanding the protocol can be found here:

   https://people.torproject.org/~asn/hs_notes/shared_rand.jpg

It reads left-to-right.

The illustration displays what the authorities (A_1, A_2, A_3) put in their votes. A chain 'A_1 -> c_1 -> r_1' denotes that authority A_1 committed to the value c_1 which corresponds to the reveal value r_1.

The illustration depicts only a few rounds of the whole protocol. It starts with the first three rounds of the commit phase, then it jumps to the last round of the commit phase. It continues with the first two rounds of the reveal phase and then it jumps to the final round of the protocol run. It finally shows the first round of the commit phase of the next protocol run (00:00UTC) where the final Shared Random Value is computed. In our fictional example, the SRV was computed with 3 authority contributions and its value is "a56fg39h".

We advise you to revisit this after you have read the whole document.

3. Protocol

In this section we give a detailed specification of the protocol. We describe the protocol participants' logic and the messages they send. The encoding of the messages is specified in the next section ([SPEC]).

Now we go through the phases of the protocol:

3.1 Commitment Phase [COMMITMENTPHASE]

The commit phase lasts from 00:00UTC to 12:00UTC.
During this phase, an authority commits a value in its vote and saves it to the permanent state as well.

Authorities also save any received authoritative commits by other authorities in their permanent state. We call a commit by Alice "authoritative" if it was included in Alice's vote.

3.1.1. Voting During Commitment Phase

During the commit phase, each authority includes in its votes:

   - The commitment value for this protocol run.
   - Any authoritative commitments received from other authorities.
   - The two previous shared random values produced by the protocol (if any).

The commit phase lasts for 12 hours, so authorities have multiple chances to commit their values. An authority MUST NOT commit a second value during a subsequent round of the commit phase.

If an authority publishes a second commitment value in the same commit phase, only the first commitment should be taken into account by other authorities. Any subsequent commitments MUST be ignored.

3.1.2. Persistent State During Commitment Phase [STATECOMMIT]

During the commitment phase, authorities save in their persistent state the authoritative commits they have received from each authority. Only one commit per authority must be considered trusted and active at a given time.

3.2 Reveal Phase

The reveal phase lasts from 12:00UTC to 00:00UTC.

Now that the commitments have been agreed on, it's time for authorities to reveal their random values.

3.2.1. Voting During Reveal Phase

During the reveal phase, each authority includes in its votes:

   - Its reveal value that was previously committed in the commit phase.
   - All the commitments and reveals received from other authorities.
   - The two previous shared random values produced by the protocol (if any).

The set of commitments has been decided during the commitment phase and must remain the same. If an authority tries to change its commitment during the reveal phase or introduce a new commitment, the new commitment MUST be ignored.

3.2.2.
Persistent State During Reveal Phase [STATEREVEAL]

During the reveal phase, authorities keep the authoritative commits from the commit phase in their persistent state. They also save any received reveals that correspond to authoritative commits and are valid (as specified in [VALIDATEVALUES]).

An authority that has just received a reveal value from another authority's vote MUST wait till the next voting round before including that reveal value in its votes.

3.3. Shared Random Value Calculation At 00:00UTC

Finally, at 00:00UTC every day, authorities compute a fresh shared random value and this value must be added to the consensus so clients can use it.

Authorities calculate the shared random value using the reveal values in their state as specified in subsection [SRCALC].

Authorities at 00:00UTC start including this new shared random value in their votes, replacing the one from two protocol runs ago. Authorities start including this new shared random value in the consensus as well.

Apart from that, authorities at 00:00UTC proceed voting normally as they would in the first round of the commitment phase (section [COMMITMENTPHASE]).

3.3.1. Shared Randomness Calculation [SRCALC]

An authority that wants to derive the shared random value SRV should use the appropriate reveal values for that time period and calculate SRV as follows.

   HASHED_REVEALS = H(ID_a | R_a | ID_b | R_b | ..)

   SRV = SHA3-256("shared-random" | INT_8(REVEAL_NUM) | INT_4(VERSION) |
                  HASHED_REVEALS | PREVIOUS_SRV)

where the ID_a value is the identity key fingerprint of authority 'a' and R_a is the corresponding reveal value of that authority for the current period.

Also, REVEAL_NUM is the number of revealed values in this construction, VERSION is the protocol version number and PREVIOUS_SRV is the previous shared random value. If no previous shared random value is known, then PREVIOUS_SRV is set to 32 NUL (\x00) bytes.
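The [SRCALC] construction can be sketched in a few lines. This is a minimal sketch under stated assumptions: H is taken to be SHA3-256, INT_8 and INT_4 are taken to be 8-byte and 4-byte big-endian integers, and each ID and reveal value is passed in as raw bytes. None of these encodings is fixed by the text of this section, and the function name is illustrative. The pairs are sorted by reveal value in ascending order, as this section requires for HASHED_REVEALS.

```python
import hashlib
from typing import List, Tuple


def compute_srv(reveals: List[Tuple[bytes, bytes]], version: int = 1,
                previous_srv: bytes = b"\x00" * 32) -> bytes:
    """Derive an SRV from (identity, reveal) pairs.

    Assumptions (not fixed by this section): H = SHA3-256, and
    INT_8 / INT_4 are big-endian fixed-width integers.
    """
    # Order the ID | R pairs by the reveal value, ascending.
    ordered = sorted(reveals, key=lambda pair: pair[1])
    hashed_reveals = hashlib.sha3_256(
        b"".join(ident + reveal for ident, reveal in ordered)).digest()
    message = (b"shared-random"
               + len(ordered).to_bytes(8, "big")   # INT_8(REVEAL_NUM)
               + version.to_bytes(4, "big")        # INT_4(VERSION)
               + hashed_reveals
               + previous_srv)
    return hashlib.sha3_256(message).digest()
```

Because the pairs are sorted before hashing, every authority holding the same set of reveals derives the same value regardless of the order in which it received them.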
To maintain consistent ordering in HASHED_REVEALS, all the ID_a | R_a pairs are ordered based on the R_a value in ascending order. 3.4. Bootstrapping Procedure As described in [CONS], two shared random values are required for the HSDir overlay periods to work properly as specified in proposal 224. Hence clients MUST NOT use the randomness of this system till it has bootstrapped completely; that is, until two shared random values are included in a consensus. This should happen after three 00:00UTC consensuses have been produced, which takes 48 hours. 3.5. Rebooting Directory Authorities [REBOOT] The shared randomness protocol must be able to support directory authorities who leave or join in the middle of the protocol execution. An authority that commits in the Commitment Phase and then leaves MUST have stored its reveal value on disk so that it continues participating in the protocol if it returns before or during the Reveal Phase. The reveal value MUST be stored timestamped to avoid sending it on wrong protocol runs. An authority that misses the Commitment Phase cannot commit anymore, so it's unable to participate in the protocol for that run. Same goes for an authority that misses the Reveal phase. Authorities who do not participate in the protocol SHOULD still carry commits and reveals of others in their vote. Finally, authorities MUST implement their persistent state in such a way that they will never commit two different values in the same protocol run, even if they have to reboot in the middle (assuming that their persistent state file is kept). A suggested way to structure the persistent state is found at [STATEFORMAT]. 4. Specification [SPEC] 4.1. Voting This section describes how commitments, reveals and SR values are encoded in votes. We describe how to encode both the authority's own commitments/reveals and also the commitments/reveals received from the other authorities. Commitments and reveals share the same line, but reveals are optional. 
Participating authorities need to include the line:

   "shared-rand-participate"

in their votes to announce that they take part in the protocol.

4.1.1. Computing commitments and reveals [COMMITREVEAL]

A directory authority that wants to participate in this protocol needs to create a new pair of commitment/reveal values for every protocol run. Authorities SHOULD generate a fresh pair of such values right before the first commitment phase of the day (at 00:00UTC).

The value REVEAL is computed as follows:

   REVEAL = base64-encode( TIMESTAMP || H(RN) )

where RN is the SHA3 hashed value of a 256-bit random value. We hash the random value to avoid exposing raw bytes from our PRNG to the network (see [RANDOM-REFS]).

TIMESTAMP is an 8-byte network-endian time_t value. Authorities SHOULD set TIMESTAMP to the valid-after time of the vote document they first plan to publish their commit into (so usually at 00:00UTC, except if they start up in a later commit round).

The value COMMIT is computed as follows:

   COMMIT = base64-encode( TIMESTAMP || H(REVEAL) )

4.1.2. Validating commitments and reveals [VALIDATEVALUES]

Given a COMMIT message and a REVEAL message it should be possible to verify that they indeed correspond. To do so, the client extracts the random value H(RN) from the REVEAL message, hashes it, and compares it with the H(H(RN)) from the COMMIT message. We say that the COMMIT and REVEAL messages correspond if the comparison is successful.

Participants MUST also check that corresponding COMMIT and REVEAL values have the same timestamp value.

Authorities should ignore reveal values during the Reveal Phase that don't correspond to commit values published during the Commitment Phase.

4.1.4. Encoding commit/reveal values in votes [COMMITVOTE]

An authority puts in its vote the commitments and reveals it has produced and seen from the other authorities.
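The construction in [COMMITREVEAL] and the check in [VALIDATEVALUES] can be sketched as follows. Several choices here are assumptions rather than part of the specification. H is taken to be SHA3-256. The 256-bit random value is hashed once to form the reveal payload. The commitment follows the COMMIT formula literally, hashing the base64-encoded REVEAL string; the prose of [VALIDATEVALUES] words this step differently and the sketch does not try to reconcile the two. The function names are illustrative.

```python
import base64
import hashlib
import hmac
import os
import struct
from typing import Optional, Tuple


def make_commit_reveal(timestamp: int,
                       rn: Optional[bytes] = None) -> Tuple[str, str]:
    """Return (COMMIT, REVEAL) as base64 strings for one protocol run."""
    if rn is None:
        rn = os.urandom(32)                 # fresh 256-bit random value
    ts = struct.pack("!Q", timestamp)       # 8-byte network-endian time
    reveal = base64.b64encode(ts + hashlib.sha3_256(rn).digest())
    # Assumption: H(REVEAL) is taken over the encoded REVEAL string.
    commit = base64.b64encode(ts + hashlib.sha3_256(reveal).digest())
    return commit.decode("ascii"), reveal.decode("ascii")


def commit_matches_reveal(commit: str, reveal: str) -> bool:
    """Check that a REVEAL corresponds to a COMMIT, timestamps included."""
    try:
        c = base64.b64decode(commit, validate=True)
        r = base64.b64decode(reveal, validate=True)
        reveal_bytes = reveal.encode("ascii")
    except ValueError:
        return False
    if len(c) != 40 or len(r) != 40:        # 8-byte timestamp + 32-byte hash
        return False
    if c[:8] != r[:8]:                      # timestamps must be equal
        return False
    expected = hashlib.sha3_256(reveal_bytes).digest()
    return hmac.compare_digest(c[8:], expected)
```

An authority would run a check like this on each reveal it sees during the Reveal Phase, ignoring any reveal that does not correspond to a previously published commit.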
These commitments and reveals are encoded in the vote using the following line:

   "shared-rand-commit" SP VERSION SP ALGNAME SP IDENTITY SP COMMIT [SP REVEAL] NL

where VERSION is the version of the protocol the commit was created with. IDENTITY is the authority's SHA1 identity fingerprint and COMMIT is the encoded commit [COMMITREVEAL]. Authorities during the reveal phase can also optionally include an encoded reveal value REVEAL. There MUST be only one line per authority, else the vote is considered invalid. Finally, the ALGNAME is the hash algorithm that should be used to compute COMMIT and REVEAL, which is "sha3-256" for version 1.

4.1.5. Shared Random Value [SRVOTE]

Authorities include a shared random value (SRV) in their votes using the following encoding for the previous and current value respectively:

   "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL
   "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL

where VALUE is the actual shared random value encoded in hex (computed as specified in section [SRCALC]). NUM_REVEALS is the number of reveal values used to generate this SRV.

To maintain consistent ordering, the shared random values of the previous period should be listed before the values of the current period.

4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS]

Authorities insert the two active shared random values in the consensus following the same encoding format as in [SRVOTE].

4.3. Persistent state format [STATEFORMAT]

As a way to keep ground truth state in this protocol, an authority MUST keep a persistent state of the protocol. This section suggests a format for this state, which is the same as the current state file format.

It contains a preamble, a commitment and reveal section and a list of shared random values.

The preamble (or header) contains the following items. They MUST occur in the order given here:

   "Version" SP version NL

      [At start, exactly once.]

      A document format version. For this specification, version is "1".
"ValidUntil" SP YYYY-MM-DD SP HH:MM:SS NL [Exactly once] After this time, this state is expired and shouldn't be used nor trusted. The validity time period is till the end of the current protocol run (the upcoming noon). The following details the commitment and reveal section. They are encoded the same as in the vote. This makes it easier for implementation purposes. "Commit" SP version SP algname SP identity SP commit [SP reveal] NL [Exactly once per authority] The values are the same as detailed in section [COMMITVOTE]. This line is also used by an authority to store its own value. Finally is the shared random value section. "SharedRandPreviousValue" SP num_reveals SP value NL [At most once] This is the previous shared random value agreed on at the previous period. The fields are the same as in section [SRVOTE]. "SharedRandCurrentValue" SP num_reveals SP value NL [At most once] This is the latest shared random value. The fields are the same as in section [SRVOTE]. 5. Security Analysis 5.1. Security of commit-and-reveal and future directions The security of commit-and-reveal protocols is well understood, and has certain flaws. Basically, the protocol is insecure to the extent that an adversary who controls b of the authorities gets to choose among 2^b outcomes for the result of the protocol. However, an attacker who is not a dirauth should not be able to influence the outcome at all. We believe that this system offers sufficient security especially compared to the current situation. More secure solutions require much more advanced crypto and more complex protocols so this seems like an acceptable solution for now. For alternative approaches on collaborative random number generation also see the discussion at [RNGMESSAGING]. 5.2. Predicting the shared random value during reveal phase The reveal phase lasts 12 hours, and most authorities will send their reveal value on the first round of the reveal phase. 
This means that an attacker can predict the final shared random value about 12 hours before it's generated.

This does not pose a problem for the HSDir hash ring, since we impose a higher uptime restriction on HSDir nodes, so 12 hours of predictability is not an issue.

Any other protocols using the shared random value from this system should be aware of this property.

5.3. Partition attacks

This design is not immune to certain partition attacks. We believe they don't offer much gain to an attacker as they are very easy to detect and difficult to pull off, since an attacker would need to compromise a directory authority at the very least. Also, because of the Byzantine generals problem, it's very hard (even impossible in some cases) to protect against all such attacks. Nevertheless, this section describes all possible partition attacks and how to detect them.

5.3.1. Partition attacks during commit phase

A malicious directory authority could send its commit to only one authority, which results in that authority having an extra commit value for the shared random calculation that the others don't have. Since the consensus needs a majority, this won't affect the final SRV value. However, the attacker, using this attack, could remove a single directory authority from the consensus decision at 00:00UTC when the SRV is computed.

An attacker could also partition the authorities by sending two different commitment values to different authorities during the commit phase.

All of the above is fairly easy to detect. Commitment values in the vote coming from an authority should NEVER be different between authorities. If they are, this means an attack is ongoing or there is a very bad bug (highly unlikely).

5.3.2. Partition attacks during reveal phase

Let's consider Alice, a malicious directory authority. Alice could wait until the last reveal round, and reveal its value to half of the authorities.
That would partition the authorities into two sets: the ones who think that the shared random value should contain this new reveal, and the rest who don't know about it. This would result in a tie and two different shared random values.

A similar attack is possible. For example, two rounds before the end of the reveal phase, Alice could advertise her reveal value to only half of the dirauths. This way, in the last reveal phase round, half of the dirauths will include that reveal value in their votes and the others will not. At the end of the reveal phase, half of the dirauths will calculate a different shared randomness value than the others.

We claim that this attack is not particularly fruitful: Alice ends up having two shared random values to choose from, which is a fundamental problem of commit-and-reveal protocols as well (since the last person can always abort or reveal). The attacker can also sabotage the consensus, but there are other ways this can be done with the current voting system.

Furthermore, we claim that such an attack is very noisy and detectable. First of all, it requires the authority to sabotage two consensuses, which will cause quite some noise. Furthermore, the authority needs to send different votes to different auths, which is detectable. Like the commit phase attack, the detection here is to make sure that the commitment values in a vote coming from an authority are always the same for each authority.

6. Discussion

6.1. Why the added complexity from proposal 225?

The complexity difference between this proposal and prop225 is in part because prop225 doesn't specify how the shared random value gets to the clients. This proposal spends lots of effort specifying how the two shared random values can always be readily accessible to clients.

6.2. Why do you do a commit-and-reveal protocol in 24 rounds?
The reader might be wondering why we span the protocol over the course of a whole day (24 hours), when only 3 rounds would be sufficient to generate a shared random value. We decided to do it this way, because we piggyback on the Tor voting protocol which also happens every hour. We could instead only do the shared randomness protocol from 21:00 to 00:00 every day. Or to do it multiple times a day. However, we decided that since the shared random value needs to be in every consensus anyway, carrying the commitments/reveals as well will not be a big problem. Also, this way we give more chances for a failing dirauth to recover and rejoin the protocol. 6.3. Why can't we recover if the 00:00UTC consensus fails? If the 00:00UTC consensus fails, there will be no shared random value for the whole day. In theory, we could recover by calculating the shared randomness of the day at 01:00UTC instead. However, the engineering issues with adding such recovery logic are too great. For example, it's not easy for an authority who just booted to learn whether a specific consensus failed to be created. 7. Acknowledgements Thanks to everyone who has contributed to this design with feedback and discussion. Thanks go to arma, ioerror, kernelcorn, nickm, s7r, Sebastian, teor, weasel and everyone else! References: [RANDOM-REFS]: http://projectbullrun.org/dual-ec/ext-rand.html https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html [RNGMESSAGING]: https://moderncrypto.org/mail-archive/messaging/2015/002032.html
Filename: 251-netflow-padding.txt Title: Padding for netflow record resolution reduction Authors: Mike Perry Created: 20 August 2015 Status: Closed Implemented-In: 0.3.1.1-alpha NOTE: Please look at section 2 of padding-spec.txt now, not this document. 0. Motivation It is common practice by many ISPs to record data about the activity of endpoints that use their uplink, if nothing else for billing purposes, but sometimes also for monitoring for attacks and general failure. Unfortunately, Tor node operators typically have no control over the data recorded and retained by their ISP. They are often not even informed about their ISP's retention policy, or the associated data sharing policy of those records (which tends to be "give them to whoever asks" in practice[1]). It is also likely that defenses for this problem will prove useful against proposed data retention plans in the EU and elsewhere, since these schemes will likely rely on the same technology. 0.1. Background At the ISP level, this data typically takes the form of Netflow, jFlow, Netstream, or IPFIX flow records. These records are emitted by gateway routers in a raw form and then exported (often over plaintext) to a "collector" that either records them verbatim, or reduces their granularity further[2]. Netflow records and the associated data collection and retention tools are very configurable, and have many modes of operation, especially when configured to handle high throughput. However, at ISP scale, per-flow records are very likely to be employed, since they are the default, and also provide very high resolution in terms of endpoint activity, second only to full packet and/or header capture. Per-flow records record the endpoint connection 5-tuple, as well as the total number of bytes sent and received by that 5-tuple during a particular time period. They can store additional fields as well, but it is primarily timing and bytecount information that concern us. 
When configured to provide per-flow data, routers emit these raw flow records periodically for all active connections passing through them based on two parameters: the "active flow timeout" and the "inactive flow timeout".

The "active flow timeout" causes the router to emit a new record periodically for every active TCP session that continuously sends data. The default active flow timeout for most routers is 30 minutes, meaning that a new record is created for every TCP session at least every 30 minutes, no matter what. This value can be configured to be from 1 minute to 60 minutes on major routers.

The "inactive flow timeout" is used by routers to create a new record if a TCP session is inactive for some number of seconds. It allows routers to avoid the need to track a large number of idle connections in memory, and instead emit a separate record only when there is activity. This value ranges from 10 seconds to 600 seconds on common routers. It appears as though no routers support a value lower than 10 seconds.

0.2. Default timeout values of major routers

For reference, here are default values and ranges (in parentheses when known) for common routers, along with citations to their manuals.

Some routers speak other collection protocols than Netflow, and in the case of Juniper, use different timeouts for these protocols. Where this is known to happen, it has been noted.

                             Inactive Timeout    Active Timeout
    Cisco IOS[3]             15s (10-600s)       30min (1-60min)
    Cisco Catalyst[4]        5min                32min
    Juniper (jFlow)[5]       15s (10-600s)       30min (1-60min)
    Juniper (Netflow)[6,7]   60s (10-600s)       30min (1-30min)
    H3C (Netstream)[8]       60s (60-600s)       30min (1-60min)
    Fortinet[9]              15s                 30min
    MikroTik[10]             15s                 30min
    nProbe[14]               30s                 120s
    Alcatel-Lucent[15]       15s (10-600s)       30min (1-600min)

1.
Proposal Overview

The combination of the active and inactive netflow record timeouts allows us to devise a low-cost padding defense that causes what would otherwise be split records to "collapse" at the router even before they are exported to the collector for storage. So long as a connection transmits data before the "inactive flow timeout" expires, then the router will continue to count the total bytes on that flow before finally emitting a record at the "active flow timeout".

This means that for a minimal amount of padding that prevents the "inactive flow timeout" from expiring, it is possible to reduce the resolution of raw per-flow netflow data to the total amount of bytes sent and received in a 30 minute window. This is a vast reduction in resolution for HTTP, IRC, XMPP, SSH, and other intermittent interactive traffic, especially when all user traffic in that time period is multiplexed over a single connection (as it is with Tor).

2. Implementation

Tor clients currently maintain one TLS connection to their Guard node to carry actual application traffic, and make up to 3 additional connections to other nodes to retrieve directory information.

We propose to pad only the client's connection to the Guard node, and not any other connection. We propose to treat Bridge node connections to the Tor network as client connections, and pad them, but otherwise not pad between normal relays.

Both clients and Guards will maintain a timer for all application (i.e., non-directory) TLS connections. Every time a non-padding packet is sent or received by either end, that endpoint will sample a timeout value from between 1.5 seconds and 9.5 seconds. If the connection becomes active for any reason before this timer expires, the timer is reset to a new random value between 1.5 and 9.5 seconds. If the connection remains inactive until the timer expires, a single CELL_PADDING cell will be sent on that connection.
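The timer rule above, and its effect on a router's flow records, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the router model in count_flow_records is a simplification invented for illustration, the timeout is sampled uniformly (Appendix A describes the distribution actually proposed), and the timer is assumed to be re-armed after each padding cell. The function names are hypothetical and not taken from the Tor source.

```python
import random

def sample_timeout(rng, low_ms=1500, high_ms=9500):
    """Sample an idle timeout in ms (assumption: uniform over the range)."""
    return rng.uniform(low_ms, high_ms)

def padded_send_times(data_times_ms, end_ms, rng, low_ms=1500, high_ms=9500):
    """Return the times (ms) of every cell on the link: data plus padding.

    After each data cell the timer is armed with a fresh sample. If it
    expires before the next data cell, one padding cell is sent and the
    timer is armed again (assumed behavior, needed to keep the link warm).
    """
    times = []
    data = sorted(data_times_ms)
    for idx, t in enumerate(data):
        times.append(t)
        next_data = data[idx + 1] if idx + 1 < len(data) else end_ms
        deadline = t + sample_timeout(rng, low_ms, high_ms)
        while deadline < next_data:
            times.append(deadline)  # padding cell sent on timer expiry
            deadline += sample_timeout(rng, low_ms, high_ms)
    return times

def count_flow_records(times_ms, inactive_ms=15_000, active_ms=30 * 60 * 1000):
    """Count the records a (simplified, hypothetical) router would emit."""
    if not times_ms:
        return 0
    records = 1
    start = prev = times_ms[0]
    for t in times_ms[1:]:
        # A new record starts after an idle gap or once the flow is too old.
        if t - prev >= inactive_ms or t - start >= active_ms:
            records += 1
            start = t
        prev = t
    return records

if __name__ == "__main__":
    rng = random.Random(0)
    data = [0, 60_000, 200_000, 900_000]  # four short bursts of activity
    padded = padded_send_times(data, 1_000_000, rng)
    print("records without padding:", count_flow_records(data))
    print("records with padding:   ", count_flow_records(padded))
```

In this model the unpadded bursts are each split into their own record by the inactive timeout, while the padded link never stays silent for more than 9.5 seconds, so only the active timeout can split it.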
In this way, the connection will only be padded in the event that it is idle, and will always transmit a packet before the minimum 10 second inactive timeout.

2.1. Tunable parameters

We propose that the defense be controlled by the following consensus parameters:

  * nf_ito_low
    - The low end of the range to send padding when inactive, in ms.
    - Default: 1500

  * nf_ito_high
    - The high end of the range to send padding, in ms.
    - Default: 9500

  * nf_pad_relays
    - If set to 1, we also pad inactive relay-to-relay connections
    - Default: 0

  * conn_timeout_low
    - The low end of the range to decide when we should close an idle connection (not counting padding).
    - Default: 900 seconds after last circuit closes

  * conn_timeout_high
    - The high end of the range to decide when we should close an idle connection.
    - Default: 1800 seconds after last circuit closes

If nf_ito_low == nf_ito_high == 0, padding will be disabled.

2.2. Maximum overhead bounds

With the default parameters, we expect a padded connection to send one padding cell every 5.5 seconds (see Appendix A for the statistical analysis of expected padding packet rate on an idle link). This averages to 103 bytes per second full duplex (~52 bytes/sec in each direction), assuming a 512 byte cell and 55 bytes of TLS+TCP+IP headers. For a connection that remains idle for a full 30 minutes of inactivity, this is about 92KB of overhead in each direction.

With 2.5M completely idle clients connected simultaneously, 52 bytes per second still amounts to only 130MB/second in each direction network-wide, which is roughly the current amount of Tor directory traffic[11]. Of course, our 2.5M daily users will neither be connected simultaneously, nor entirely idle, so we expect the actual overhead to be much lower than this.

2.3.
Measuring actual overhead

To measure the actual padding overhead in practice, we propose to export the following statistics in extra-info descriptors for the previous (fixed, non-rolling) 24 hour period:

  * Total cells read (padding and non-padding)
  * Total cells written (padding and non-padding)
  * Total CELL_PADDING cells read
  * Total CELL_PADDING cells written
  * Total RELAY_COMMAND_DROP cells read
  * Total RELAY_COMMAND_DROP cells written

These values will be rounded to 100 cells each, and no values are reported if the relay has read or written less than 10000 cells in the previous period.

RELAY_COMMAND_DROP cells are circuit-level padding not used by this defense, but we may as well start recording statistics about them now, too, to aid in the development of future defenses.

2.4. Load balancing considerations

Eventually, we will likely want to update the consensus weights to properly load balance the selection of Guard nodes that must carry this overhead.

We propose that we use the extra-info documents to get a more accurate value for the total average Guard and Guard+Exit node overhead of this defense in practice, and then use that value to fractionally reduce the consensus selection weights for Guard nodes and Guard+Exit nodes, to reflect their reduced capacity relative to middle nodes.

3. Threat model and adversarial considerations

This defense does not assume fully adversarial behavior on the part of the upstream network administrator, as that administrator typically has no specific interest in trying to deanonymize Tor, but only in monitoring their own network for signs of overusage, attack, or failure.

Therefore, in a manner closer to the "honest but curious" threat model, we assume that the netflow collector will be using standard equipment not specifically tuned to capturing Tor traffic. We want to reduce the resolution of logs that are collected incidentally, so that if they happen to fall into the wrong hands, we can be more certain that they will not be useful.
We feel that this assumption is a fair one because correlation attacks (and statistical attacks in general) will tend to accumulate false positives very quickly if the adversary loses resolution at any observation points. It is especially unlikely for the attacker to benefit from only a few high-resolution collection points while the remainder of the Tor network is only subject to connection-level/per-flow netflow data retention, or even less data retention than that.

Nonetheless, it is still worthwhile to consider what the adversary is capable of, especially in light of looming data retention regulation.

Because no major router appears to have the ability to set the inactive flow timeout below 10 seconds, it would seem as though the adversary is left with three main options: reduce the active record timeout to the minimum (1 minute), begin logging full packet and/or header data, or develop a custom solution.

It is an open question to what degree these approaches would help the adversary, especially if only some of its observation points implemented these changes.

3.1. What about sampled data?

At scale, it is known that some Internet backbone routers at AS boundaries and exchanges perform sampled packet header collection and/or produce netflow records based on a subset of the packets that pass through their infrastructure.

The effects of this against Tor were studied before against the (much smaller) Tor network as it was in 2007[12]. At a sampling rate of 1 out of every 2000 packets, the attack did not achieve high accuracy until over 100MB of data were transmitted, even when correlating only 500 flows in a closed-world lab setting.
We suspect that this type of attack is unlikely to be effective at scale on the Tor network today, but we make no claims that this defense will make any impact upon sampled correlation, primarily because the amount of padding that this defense introduces is comparatively low relative to the amount of transmitted traffic that sampled correlation attacks require to attain any accuracy. 3.2. What about long-term statistical disclosure? This defense similarly does not claim to defeat long-term correlation attacks involving many observations over large amounts of time. However, we do believe it will significantly increase the amount of traffic and the number of independent observations required to attain the same accuracy if the adversary uses default per-flow netflow records. 3.3. What about prior information/confirmation? In truth, the most dangerous aspect of these netflow logs is not actually correlation at all, but confirmation. If the adversary has prior information about the location of a target, and/or when and how that target is expected to be using Tor, then the effectiveness of this defense will be very situation-dependent (on factors such as the number of other tor users in the area at that time, etc). In any case, the odds that there is other concurrent activity (to create a false positive) within a single 30 minute record are much higher than the odds that there is concurrent activity that aligns with a subset of a series of smaller, more frequent inactive timeout records. 4. Synergistic effects with future padding and other changes Because this defense only sends padding when the OR connection is completely idle, it should still operate optimally when combined with other forms of padding (such as padding for website traffic fingerprinting and hidden service circuit fingerprinting). If those future defenses choose to send padding for any reason at any layer of Tor, then this defense automatically will not. 
In addition to interoperating optimally with any future padding defenses, simple changes to the Tor network usage can serve to further reduce the usefulness of any data retention, as well as reduce the overhead from this defense.

For example, if all directory traffic were also tunneled through the main Guard node instead of independent directory guards, then the adversary would lose additional resolution in terms of the ability to differentiate directory traffic from normal usage, especially when it occurs within the same netflow record. As written and specified, the defense will pad such tunneled directory traffic optimally.

Similarly, if bridge guards[13] are implemented such that bridges use their own guard node to route all of their connecting client traffic through, then users who run bridges will also benefit from blending their own client traffic with the concurrent traffic of their connected clients, the sum total of which will also be optimally padded such that it only transmits padding when the connection to the bridge's guard is completely idle.

Appendix A: Padding Cell Timeout Distribution Statistics

It turns out that because the padding is bidirectional, and because both endpoints are maintaining timers, this creates the situation where the time before sending a padding packet in either direction is actually min(client_timeout, server_timeout).

If client_timeout and server_timeout are uniformly sampled, then the distribution of min(client_timeout,server_timeout) is no longer uniform, and the resulting average timeout (Exp[min(X,X)]) is much lower than the midpoint of the timeout range.

To compensate for this, instead of sampling each endpoint timeout uniformly, we instead sample it from max(X,X), where X is uniformly distributed.

If X is a random variable uniform from 0..R-1 (where R=high-low), then the random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).
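The probability mass function for Y above can be checked numerically. The sketch below computes Exp[X], Exp[min(X,X)], and Exp[Y=max(X,X)] exactly for a given R using rational arithmetic. The function names are illustrative only, and the pmf used for the minimum is derived here by symmetry with the stated pmf of the maximum rather than taken from the proposal.

```python
from fractions import Fraction

def pmf_max(i, R):
    """Prob(max of two independent uniform draws on 0..R-1 equals i)."""
    return Fraction(2 * i + 1, R * R)

def pmf_min(i, R):
    """Prob(min of two independent uniform draws on 0..R-1 equals i).

    Derived by mirroring the maximum: min == i exactly when the maximum
    of the reflected draws equals R-1-i.
    """
    return Fraction(2 * (R - 1 - i) + 1, R * R)

def expected_uniform(R):
    """Exp[X] for X uniform on 0..R-1."""
    return Fraction(R - 1, 2)

def expected_max(R):
    """Exp[Y] for Y = max(X, X)."""
    return sum(i * pmf_max(i, R) for i in range(R))

def expected_min(R):
    """Exp[min(X, X)]."""
    return sum(i * pmf_min(i, R) for i in range(R))

if __name__ == "__main__":
    R = 2000
    print("Exp[X]        =", float(expected_uniform(R)))
    print("Exp[min(X,X)] =", round(float(expected_min(R)), 1))
    print("Exp[max(X,X)] =", round(float(expected_max(R)), 1))
```

Because min plus max of two draws always equals their sum, Exp[min(X,X)] + Exp[max(X,X)] = 2*Exp[X], which gives a quick consistency check on any row of such a table.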
Then, when both sides apply timeouts sampled from Y, the resulting bidirectional padding packet rate is now a third random variable: Z = min(Y,Y).

The distribution of Z is slightly bell-shaped, but mostly flat around the mean. It also turns out that Exp[Z] ~= Exp[X]. Here's a table of average values for each random variable:

    R        Exp[X]    Exp[Z]    Exp[min(X,X)]   Exp[Y=max(X,X)]
    2000      999.5    1066        666.2           1332.8
    3000     1499.5    1599.5      999.5           1999.5
    5000     2499.5    2666       1666.2           3332.8
    6000     2999.5    3199.5     1999.5           3999.5
    7000     3499.5    3732.8     2332.8           4666.2
    8000     3999.5    4266.2     2666.2           5332.8
    10000    4999.5    5328       3332.8           6666.2
    15000    7499.5    7995       4999.5           9999.5
    20000    9999.5    10661      6666.2          13332.8

In this way, we maintain the property that the midpoint of the timeout range is the expected mean time before a padding packet is sent in either direction.

1. https://lists.torproject.org/pipermail/tor-relays/2015-August/007575.html
2. https://en.wikipedia.org/wiki/NetFlow
3. http://www.cisco.com/en/US/docs/ios/12_3t/netflow/command/reference/nfl_a1gt_ps5207_TSD_Products_Command_Reference_Chapter.html#wp1185203
4. http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/70974-netflow-catalyst6500.html#opconf
5. https://www.juniper.net/techpubs/software/erx/junose60/swconfig-routing-vol1/html/ip-jflow-stats-config4.html#560916
6. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html
7. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html
8. http://www.h3c.com/portal/Technical_Support___Documents/Technical_Documents/Switches/H3C_S9500_Series_Switches/Command/Command/H3C_S9500_CM-Release1648%5Bv1.24%5D-System_Volume/200901/624854_1285_0.htm#_Toc217704193
9. http://docs-legacy.fortinet.com/fgt/handbook/cli52_html/FortiOS%205.2%20CLI/config_system.23.046.html
10. http://wiki.mikrotik.com/wiki/Manual:IP/Traffic_Flow
11. https://metrics.torproject.org/dirbytes.html
12. http://freehaven.net/anonbib/cache/murdoch-pet2007.pdf
13. https://gitweb.torproject.org/torspec.git/tree/proposals/188-bridge-guards.txt
14. http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf
15. http://infodoc.alcatel-lucent.com/html/0_add-h-f/93-0073-10-01/7750_SR_OS_Router_Configuration_Guide/Cflowd-CLI.html
Filename: 252-single-onion.txt Title: Single Onion Services Author: John Brooks, Paul Syverson, Roger Dingledine Created: 2015-07-13 Status: Superseded Superseded-by: 260 1. Overview Single onion services are a modified form of onion services, which trade service-side location privacy for improved performance, reliability, and scalability. Single onion services have a .onion address identical to any other onion service. The descriptor contains information sufficient to do a relay extend of a circuit to the onion service and to open a stream for the onion address. The introduction point and rendezvous protocols are bypassed for these services. We also specify behavior for a tor instance to publish a single onion service, which requires a reachable OR port, without necessarily acting as a public relay in the network. 2. Motivation Single onion services have a few benefits over double onion services: * Connection latency is much lower by skipping rendezvous * Stream latency is reduced on a 4-hop circuit * Removing rendezvous circuits improves service scalability * A single onion service can use multiple relays for load balancing Single onion services are not location hidden on the service side, but clients retain all of the benefits and privacy of onion services. More details, relation to double onion services, and the rationale for the 'single' and 'double' nomenclature are further described in section 7.4. We believe these improvements, along with the other benefits of onion services, will be a significant incentive for website and other internet service operators to provide these portals to preserve the privacy of their users. 3. Onion descriptors The onion descriptor format is extended to add: "service-extend-locations" NL encrypted-string [At most once] A list of relay extend info, which is used instead of introduction points and rendezvous for single onion services. 
This field is encoded and optionally encrypted in the same way as the "introduction-points" field.

The encoded contents of this field contain no more than 10 entries, each containing the following data:

  "service-extend-location" SP link-specifiers NL
    [At start, exactly once]
    link-specifiers is a base64 encoded link specifier block, in the format described by proposal 224 [BUILDING-BLOCKS] and the EXTEND2 cell.

  "onion-key" SP key-type NL onion-key
    [Exactly once]
    Describes the onion key that must be used when extending to the single onion service relay.
    The key-type field is one of:
      "tap"   onion-key is a PEM-encoded RSA relay onion key
      "ntor"  onion-key is a base64-encoded NTOR relay onion key

[XXX: Should there be some kind of cookie to prove that we have the desc? See also section 7.1. -special]

A descriptor may contain either or both of "introduction-points" and "service-extend-locations"; see section 5.2.

[XXX: What kind of backwards compatibility issues exist here? Will existing relays accept one of those descriptors? -special]

4. Reaching a single onion service as a client

Single onion services use normal onion hostnames, so the client will first request the service's descriptor. If the descriptor contains a "service-extend-locations" field, the client should ignore the introduction points and rendezvous process in favor of the process defined here.

The descriptor's "service-extend-locations" information is sufficient for a client to extend a circuit to the onion service, regardless of whether it is also listed as a relay in the network consensus. This extend info must not be used for any other purpose. If multiple extend locations are specified, the client should randomly select one.

The client uses a 3-hop circuit to extend to the service location from the descriptor. Once this circuit is built, the client sends a BEGIN cell to the relay, with the onion address as hostname and the desired TCP port.
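The client-side behavior in this section (random selection among extend locations, retry on failure, and optional fallback to rendezvous) can be sketched as follows. This is a hypothetical illustration, not Tor code: connect_single_onion and the try_extend callback are invented names, and try_extend stands in for building the 3-hop circuit, extending to the location, and sending the BEGIN cell.

```python
import random

def connect_single_onion(extend_locations, has_intro_points, try_extend, rng=random):
    """Try each extend location once, in random order.

    extend_locations: entries parsed from the descriptor (opaque here).
    has_intro_points: True if the descriptor also lists introduction points.
    try_extend:       hypothetical callback; returns True if the circuit
                      and stream to that location succeeded.
    Returns a (method, location) tuple describing what the client does.
    """
    order = list(extend_locations)
    rng.shuffle(order)  # random choice first, remaining ones as retries
    for location in order:
        if try_extend(location):
            return ("extend", location)
    if has_intro_points:
        # All extend locations failed: the client may fall back.
        return ("rendezvous", None)
    return ("failed", None)

if __name__ == "__main__":
    # Pretend only location "b" is reachable.
    result = connect_single_onion(["a", "b", "c"], True, lambda loc: loc == "b")
    print(result)
```

Shuffling once and walking the list is one simple way to combine "randomly select one" with "retry using another extend location"; other retry orders would satisfy the text equally well.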
If the circuit or stream fails, the client should retry using another extend location from the descriptor. If all extend locations fail, and the descriptor contains an "introduction-points" field, the client may fall back to a full rendezvous operation. 5. Publishing a single onion service To act as a single onion service, a tor instance (or cooperating group of tor instances) must: * Have a publicly accessible OR port * Publish onion descriptors in the same manner as any onion service * Include a "service-extend-locations" section in the onion descriptor * Accept RELAY_BEGIN cells for the service as defined in section 5.3 5.1. Configuration options The tor server operating a single onion service must accept connections as a tor relay, but is not required to be published in the consensus or to allow extending circuits. To enable this, we propose the following configuration option: RelayAllowExtend 0|1 If set, allow clients to extend circuits from this relay. Otherwise, refuse all extend cells. PublishServerDescriptor must also be disabled if this option is disabled. If ExitRelay is also disabled, this relay will not pass through any traffic. 5.2. Publishing descriptors A single onion service must publish descriptors in the same manner as any onion service, as defined by rend-spec and section 3 of this proposal. Optionally, a set of introduction points may be included in the descriptor to provide backwards compatibility with clients that don't support single onion services, or to provide a fallback when the extend locations fail. 5.3. RELAY_BEGIN When a RELAY_BEGIN cell is received with a configured single onion hostname as the destination, the stream should be connected to the configured backend server in the same manner as a service-side rendezvous stream. All relays must reject any RELAY_BEGIN cell with an address ending in ".onion" that does not match a locally configured single onion service. 6. Other considerations 6.1. 
Load balancing

High capacity services can distribute load by including multiple entries in the "service-extend-locations" section of the descriptor, or by publishing several descriptors to different onion service directories, or by a combination of these methods.

6.2. Benefits of also running a Tor relay

If a single onion service also acts as a published tor relay, it will keep connections to many other tor relays. This can significantly reduce the latency of connections to the single onion service, and also helps the tor network.

6.3. Proposal 224 ("Next-Generation Hidden Services")

This proposal is compatible with proposal 224, with small changes to the service descriptor format. In particular:

The "service-extend-location" sections are included in the encrypted portion of the descriptor, adjacent to any "introduction-point" sections.

The "service-extend-locations" field is no longer present. An onion service is also a single onion service if any "service-extend-location" field is present.

6.4. Proposal 246 ("Merging Hidden Service Directories and Intro Points")

This proposal is compatible with proposal 246. The onion service will publish its descriptor to the introduction points in the same manner as any other onion service. The client may choose to build a circuit to the specified relays, or to continue with the rendezvous protocol.

The client should not extend from the introduction point to the single onion service's relay, to avoid overloading the introduction point. The client may truncate the circuit and extend through a new relay.

7. Discussion

7.1. Authorization

Client authorization for a single onion service is possible through encryption of the service-extend-locations section in the descriptor, or "stealth" publication under a new onion address, as with traditional onion services.

One problem with this is that if you suspect a relay is also serving a single onion service, you can connect to it and send RELAY_BEGIN without any further authorization.
To prevent this, we would need to include a cookie from the descriptor in the RELAY_BEGIN information. 7.2. Preventing relays from being unintentionally published Many single onion servers will not want to relay other traffic, and will set 'PublishServerDescriptor 0' to prevent it. Even when they do, they will still generate a relay descriptor, which could be downloaded and published to a directory authority without the relay's consent. To prevent this, we should insert a field in the relay descriptor when PublishServerDescriptor is disabled that instructs relays to never include it as part of a consensus. [XXX: Also see task #16564] 7.3. Ephemeral single onion services (ADD_ONION) The ADD_ONION control port command could be extended to support ephemerally configured single onion services. We encourage this, but specifying its behavior is out of the scope of this proposal. 7.4. Onion service taxonomy and nomenclature Onion services in general provide several benefits. First, by requiring a connection via Tor they provide the client the protections of Tor and make it much more difficult to inadvertently bypass those protections than when connecting to a non .onion site. Second, because .onion addresses are self-authenticating, onion services have look-up, routing, and authentication protections not provided by sites with standard domain addresses. These benefits apply to all onion services. Onion services as originally introduced also provide network location hiding of the service itself: because the client only ever connects through the end of a Tor circuit created by the onion service, the IP address of the onion service also remains protected. Applications and services already exist that use existing onion service protocols for the above described general benefits without the need for network location hiding. 
This Proposal is accordingly motivated by a desire to provide the general benefits, without the complexity and overhead of also protecting the location of the service. Further, as with what had originally been called 'location hidden services', there may be useful and valid applications of this design that are not reflected in our current intent. Just as 'location hidden service' is a misleading name for many current onion service applications, we prefer a name that is descriptive of the system but flexible with respect to applications of it. We also prefer a nomenclature that consistently works for the different types of onion services. It is also important to have short, simple names lest usage efficiencies evolve easier names for us. For example, 'hidden service' has replaced the original 'location hidden service' in Tor Proposals and other writings. For these reasons, we have chosen 'onion services' to refer to both those as set out in this Proposal and those with the client-side and server-side protections of the original---also for referring indiscriminately to any and all onion services. We use 'double-onion service' to refer to services that join two Tor circuits, one from the server and one from the client. We use 'single-onion' when referring to services that use only a client-side Tor circuit. In speech we sometimes use the even briefer, 'two-nion' and 'one-ion' respectively.
Filename: 253-oob-hmac.txt
Title: Out of Band Circuit HMACs
Authors: Mike Perry
Created: 01 September 2015
Status: Dead

0. Motivation

It is currently possible for Guard nodes (and MITM adversaries that steal their identity keys) to perform a "tagging" attack to influence circuit construction and resulting relay usage[1].

Because Tor uses AES as a stream cipher, malicious or intercepted Guard nodes can simply XOR a unique identifier into the circuit cipherstream during circuit setup and usage. If this identifier is not removed by a colluding exit (either by performing another XOR, or making use of known plaintext regions of a cell to directly extract a complete side-channel value), then the circuit will fail. In this way, malicious or intercepted Guard nodes can ensure that all client traffic is directed only to colluding exit nodes, who can observe the destinations and deanonymize users.

Most code paths in the Tor relay source code will emit loud warnings for the most obvious instances of circuit failure caused by this attack. However, it is very difficult to ensure that all such error conditions are properly covered such that warnings will be emitted.

This proposal aims to provide a mechanism to ensure that tagging and related malleability attacks are cryptographically detectable when they happen.

1. Overview

Since Tor Relays are already storing a running hash of all data transmitted on their circuits (via the or_circuit_t::n_digest and or_circuit_t::p_digest properties), it is possible to compute an out-of-band HMAC on circuit data, and verify that it is as expected.

This proposal first defines an OOB_HMAC primitive that can be included standalone in a new relay cell command type, and additionally in other cell types.

Use of the standalone relay cell command serves to ensure that circuits that are successfully built and used were not manipulated at a previous point.
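The malleability described in the Motivation, and the way a running hash exposes it, can be demonstrated with a toy model. This sketch makes several assumptions that do not match Tor itself: the keystream is built from SHA-256 in counter mode as a stand-in for AES-CTR, a single encryption layer is used instead of onion layers, and the running digest is taken over the bytes a hop sees on the wire. All function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Not Tor's cipher."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def crypt(key: bytes, data: bytes) -> bytes:
    """Stream-cipher encrypt or decrypt (the same operation)."""
    return xor(data, keystream(key, len(data)))

def running_digest(cells) -> bytes:
    """SHA1 over every cell seen so far, like a rolling circuit hash."""
    h = hashlib.sha1()
    for cell in cells:
        h.update(cell)
    return h.digest()

if __name__ == "__main__":
    key = b"k" * 16
    cell = b"A" * 64
    tag = b"\x01" + b"\x00" * 63          # identifier XORed in by the guard

    sent = crypt(key, cell)               # what the client put on the wire
    tagged = xor(sent, tag)               # what a tagging guard forwards

    # The tag survives decryption and a colluding exit can strip it again.
    print("tag visible in plaintext:", crypt(key, tagged) == xor(cell, tag))
    print("exit can undo the tag:   ", xor(crypt(key, tagged), tag) == cell)

    # A hop that hashes what it received disagrees with the client's record.
    print("digests differ:          ", running_digest([sent]) != running_digest([tagged]))
```

The point of the sketch is only that XOR tags pass through a stream cipher unchanged while any hash over the tagged bytes diverges from the sender's hash, which is the property the out-of-band HMAC is meant to report back to the client.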
By altering the RELAY_COMMAND_TRUNCATED and CELL_DESTROY cells to also include the OOB_HMAC information, it is similarly possible to detect alteration of circuit contents that cause failures before the point of usage.

2. The OOB_HMAC primitive

The OOB_HMAC primitive uses the existing rolling hashes present in or_circuit_t to provide a Tor OP (aka client) with the hash history of the traffic that a given relay has seen so far.

Note that to avoid storing an additional 64 bytes of SHA256 digest for every circuit at every relay, we use SHA1 for the hash logs, since the circuits are already storing SHA1 hashes. It's not immediately clear how to upgrade the existing SHA1 digests to SHA256 with the current circuit protocol, either, since matching hash algorithms are essential to the 'recognized' relay cell forwarding behavior. The version field exists primarily for this reason, should the rolling circuit hashes ever upgrade to SHA256.

The OOB_HMAC primitive is specified in Trunnel as follows:

    struct oob_hmac_body {
      /* Version of this section. Must be 1 */
      u8 version;

      /* SHA1 hash of all client-originating data on this circuit
         (obtained from or_circuit_t::n_digest). */
      u8 client_hash_log[20];

      /* Number of cells processed in this hash, mod 2^32. Used
         to spot-check hash position */
      u32 client_cell_count;

      /* SHA1 hash of all server-originating data on this circuit
         (obtained from or_circuit_t::p_digest). */
      u8 server_hash_log[20];

      /* Number of cells processed in this hash, mod 2^32. Used
         to spot-check hash position.
         XXX: Technically the server-side is not needed. */
      u32 server_cell_count;

      /* HMAC-SHA-256 of the entire cell contents up to this point,
         using or_circuit_t::p_crypto as the hmac key.
         XXX: Should we use a KDF here instead of p_crypto directly? */
      u8 cell_hmac_256[32];
    };

3. Usage of OOB_HMAC

The OOB_HMAC body will be included in three places:

1.
In a new relay cell command RELAY_COMMAND_HMAC_SEND, which is sent in response to a client-originating RELAY_COMMAND_HMAC_GET on stream 0.
2. In CELL_DESTROY, immediately after the error code
3. In RELAY_COMMAND_TRUNCATED, immediately after the CELL_DESTROY contents

3.1. RELAY_COMMAND_HMAC_GET/SEND relay commands

Clients should use leaky-pipe topology to send RELAY_COMMAND_HMAC_GET to the second-to-last node (typically the middle node) in the circuit at three points during circuit construction and usage:

  1. Immediately after the last RELAY_EARLY cell is sent
  2. Upon any stream detachment, timeout, or failure.
  3. Upon any OP-initiated circuit teardown (including timed-out partially built circuits).

We use RELAY_EARLY as the point at which to send these cells to avoid leaking the path length to the middle hop.

3.2. Alteration of CELL_DESTROY and RELAY_COMMAND_TRUNCATED

In order to provide an HMAC even when a circuit is torn down before use due to failure, the behavior for generating and handling CELL_DESTROY and RELAY_COMMAND_TRUNCATED should be modified as follows:

Whenever an OR sends a CELL_DESTROY for a circuit towards the OP, if that circuit was already properly established, the OR should include the contents of oob_hmac_body immediately after the reason field. The HMAC must cover the error code from CELL_DESTROY.

Upon receipt of a CELL_DESTROY, and in any other case where an OR would generate a RELAY_COMMAND_TRUNCATED due to error, a conformant relay would include the CELL_DESTROY oob_hmac_body, as well as its own locally created oob_hmac_body. The locally created oob_hmac_body must cover the entire payload contents of RELAY_COMMAND_TRUNCATED, including the error code and the CELL_DESTROY oob_hmac_body.

Here is a new Trunnel specification for RELAY_COMMAND_TRUNCATED:

    struct relay_command_truncated {
      /* Error code */
      u8 error_code;

      /* Number of oob_hmacs. Must be 0, 1, or 2 */
      u8 num_hmac;

      /* If there are 2 hmacs, the first one is from the CELL_DESTROY,
         and the second one is from the truncating relay. If num_hmac
         is 0, then this came from a relay without support for oob_hmac. */
      struct oob_hmac_body hmacs[num_hmac];
    };

The usage of a strong HMAC to cover the entire CELL_DESTROY contents also allows an OP to properly authenticate the reason a remote node needed to close a circuit, without relying on the previous hop to be honest about it.

4. Ensuring proper ordering with respect to hashes

4.1. RELAY_COMMAND_HMAC_GET/SEND

The in-order delivery guarantee of circuits will mean that the incoming hashes will match upon receipt of the RELAY_COMMAND_HMAC_SEND cell, but any outgoing traffic the OP sent since RELAY_COMMAND_HMAC_GET will not have been seen by the responding OR.

Therefore, immediately upon sending a RELAY_COMMAND_HMAC_GET, the OP must record and store its current outgoing hash state for that circuit, until the RELAY_COMMAND_HMAC_SEND arrives, and use that stored hash value for comparison against the oob_hmac_body's client_hash_log field.

The server_hash_log should be checked against the corresponding crypt_path_t entry in origin_circuit_t for the relay that the command was sent to.

4.2. RELAY_COMMAND_TRUNCATED

Since RELAY_COMMAND_TRUNCATED may be sent in response to any error condition generated by a cell in either direction, the OP must check that its local cell counts match those present in the oob_hmac_body for that hop.

If the counts do not match, the OP may generate a RELAY_COMMAND_HMAC_GET to the hop that sent RELAY_COMMAND_TRUNCATED, prior to tearing down the circuit.

4.3. CELL_DESTROY

If the cell counts of the destroy cell's oob_hmac_body do not match what the client sent for that hop, unfortunately that hash must be discarded. Otherwise, it may be checked against values held from before processing the RELAY_COMMAND_TRUNCATED envelope.

5. Security concerns and mitigations

5.1.
Silent circuit failure attacks The primary way to game this oob-hmac is to omit or block cells containing HMACs from reaching the OP, or otherwise tear down circuits before responses arrive with proof of tampering. If a large fraction of circuits somehow fail without any RELAY_COMMAND_TRUNCATED oob_hmac_body payloads present, and without any responses to RELAY_COMMAND_HMAC_GET requests, the user should be alerted of this fact as well. This rate of silent circuit failure should be kept as an additional, separate per-Guard Path Bias statistic, and the user should be warned if this failure rate exceeds some (low) threshold for circuits containing relays that should have supported this proposal. 5.2. Malicious/colluding middle nodes If the adversary is prevented from causing silent circuit failure without the client being able to notice and react, their next available vector is to ensure that circuits are only built to middle nodes that are malicious and colluding with them (or that do not support this proposal), so that they may lie about the proper hash values that they see (or omit them). Right now, the current path bias code also does not count circuit failures to the middle hop as circuit attempts. This was done to reduce the effect of ambient circuit failure on the path bias accounting (since an average ambient circuit failure of X per-hop causes the total circuit failure middle+exit circuits to be 2X). Unfortunately, not counting middle hop failure allows the adversary to only allow circuits to colluding middle hops to complete, so that they may lie about their hash logs. All failed circuits to non-colluding middle nodes could be torn down before RELAY_COMMAND_TRUNCATED is sent. For this reason, the per-Guard Path Bias counts should be augmented to additionally track middle-node-only failure as a separate statistic as well, and the user should be warned if middle-node failure drops below a similar threshold as the current end-to-end failure. 5.3. 
Side channel issues, mitigations, and limitations Unfortunately, leaking information about circuit usage to the middle node prevents us from sending RELAY_COMMAND_HMAC_GET cells at more optimal points in circuit usage (such as immediately upon open, immediately after stream usage, etc). As such, we are limited to waiting until RELAY_EARLY cells stop being sent. It is debatable if we should send hashes periodically (perhaps with windowing information updates?) instead. 6. Alternatives A handful of alternatives to this proposal have already been discussed, but have been dismissed for various reasons. Per-hop cell HMACs were ruled out because they will leak the total path length, as well as the current hop's position in the circuit. Wide-block ciphers have been discussed, which would provide the property that attempts to alter a cell at a previous hop would render it completely corrupted upon its final destination, thus preventing untagging and recovery, even by a colluding malicious peer. Unfortunately, performance analysis of modern provably secure versions of wide-block ciphers has shown them to be at least 10X slower than AES-NI[2]. 1. https://lists.torproject.org/pipermail/tor-dev/2012-March/003347.html 2. http://archives.seul.org/tor/dev/Mar-2015/msg00137.html
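As an illustration of the oob_hmac_body layout from Section 2 of proposal 253, the following is a minimal Python sketch of packing the body and checking it on the client side. It is not Tor code. The big-endian byte order, the `prefix` argument (standing for the cell contents that precede the body), and the use of a plain byte string as the HMAC key in place of or_circuit_t::p_crypto are all assumptions made for the example.

```python
import hashlib
import hmac
import struct

# Field order from the Trunnel struct: u8 version, u8[20] client_hash_log,
# u32 client_cell_count, u8[20] server_hash_log, u32 server_cell_count,
# u8[32] cell_hmac_256. Big-endian encoding is an assumption of this sketch.
OOB_HMAC_FMT = ">B20sI20sI32s"
OOB_HMAC_LEN = struct.calcsize(OOB_HMAC_FMT)  # 1 + 20 + 4 + 20 + 4 + 32 = 81


def build_oob_hmac_body(key, prefix, client_hash, client_cells,
                        server_hash, server_cells):
    """Pack the hash logs and append an HMAC-SHA-256 over prefix + fields.

    `key` is a hypothetical stand-in for key material taken from p_crypto.
    """
    fields = struct.pack(">B20sI20sI", 1,
                         client_hash, client_cells % 2**32,
                         server_hash, server_cells % 2**32)
    tag = hmac.new(key, prefix + fields, hashlib.sha256).digest()
    return fields + tag


def check_oob_hmac_body(key, prefix, body, expected_client_hash,
                        expected_client_cells):
    """Return True only if the HMAC verifies and the client log matches."""
    if len(body) != OOB_HMAC_LEN:
        return False
    version, c_hash, c_count, _s_hash, _s_count, tag = struct.unpack(
        OOB_HMAC_FMT, body)
    if version != 1:
        return False
    expected_tag = hmac.new(key, prefix + body[:-32], hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected_tag):
        return False
    # Spot-check the hash position first, then compare the digests.
    return (c_count == expected_client_cells % 2**32
            and hmac.compare_digest(c_hash, expected_client_hash))
```

In this sketch the client passes the outgoing hash state and cell count it recorded when it sent RELAY_COMMAND_HMAC_GET, as Section 4.1 requires; a failed check would be treated as evidence of tampering or of a lying relay.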
Filename: 254-padding-negotiation.txt
Title: Padding Negotiation
Authors: Mike Perry
Created: 01 September 2015
Status: Closed

[See padding-spec.txt for the implemented version of this proposal.]

0. Overview

This proposal aims to describe mechanisms for requesting various types of padding from relays.

These padding primitives are general enough to use to defend against both website traffic fingerprinting as well as hidden service circuit setup fingerprinting.

1. Motivation

Tor already supports both link-level padding (via CELL_PADDING cell types), as well as circuit-level padding (via RELAY_COMMAND_DROP relay cells).

Unfortunately, there is no way for clients to request padding from relays, or request that relays not send them padding to conserve bandwidth. This proposal aims to create a mechanism for clients to do both of these. It also establishes consensus parameters to limit the amount of padding that relays will send, to prevent custom wingnut clients from requesting too much.

2. Link-level padding

Padding is most useful if it can defend against a malicious or compromised guard node. However, link-level padding is still useful to defend against an adversary that can merely observe a Guard node externally, such as for low-resolution netflow-based attacks (see Proposal 251[1]).

In that scenario, the primary negotiation mechanism we need is a way for mobile clients to tell their Guards to stop padding, or to pad less often. The following Trunnel payload should cover the needed parameters:

  const CHANNELPADDING_COMMAND_STOP = 1;
  const CHANNELPADDING_COMMAND_START = 2;

  /* The start command tells the relay to alter its min and max netflow
     timeout range values, and send padding at that rate (resuming
     if stopped). The stop command tells the relay to stop sending
     link-level padding.
*/ struct channelpadding_negotiate { u8 version IN [0]; u8 command IN [CHANNELPADDING_COMMAND_START, CHANNELPADDING_COMMAND_STOP]; /* Min must not be lower than the current consensus parameter nf_ito_low. Ignored if command is stop. */ u16 ito_low_ms; /* Max must not be lower than ito_low_ms. Ignored if command is stop. */ u16 ito_high_ms; }; After the above cell is received, the guard should use the 'ito_low_ms' and 'ito_high_ms' values as the minimum and maximum values (respectively) for inactivity before it decides to pad the channel. The actual timeout value is randomly chosen between those two values through an appropriate probability distribution (see proposal251 for the netflow padding protocol). More complicated forms of link-level padding can still be specified using the primitives in Section 3, by using "leaky pipe" topology to send the RELAY commands to the Guard node instead of to later nodes in the circuit. Because the above link-level padding only sends padding cells if the link is idle, it can be used in combination with the more complicated circuit-level padding below, without compounding overhead effects. 3. End-to-end circuit padding For circuit-level padding, we need the ability to schedule a statistical distribution of arbitrary padding to overlay on top of non-padding traffic (aka "Adaptive Padding"). The statistical mechanisms that define padding are known as padding machines. Padding machines can be hardcoded in Tor, specified in the consensus, and custom research machines can be listed in Torrc. 3.1. Padding Machines Circuits can have either one or two state machines at both the origin and at a specified middle hop. Each state machine can contain up to three states ("Start", "Burst" and "Gap") governing their behavior, as well as an "END" state. Not all states need to be used. 
Each state of a padding machine specifies either:

  * A histogram describing inter-arrival cell delays; OR
  * A parameterized delay probability distribution for inter-arrival cell delays

In either case, the lower bound of the delay probability distribution can be specified as the start_usec parameter, and/or it can be learned by measuring the RTT of the circuit at the middle node. For client-side machines, RTT measurement is always set to 0. RTT measurement at the middle node is calculated by measuring the difference between the time of arrival of a received cell (ie: away from origin) and the time of arrival of a sent cell (ie: towards origin). The RTT is continually updated so long as two cells do not arrive back-to-back in either direction. If the most recent measured RTT value is larger than our measured value so far, this larger value is used. If the most recent measured RTT value is lower than our measured value so far, it is averaged with our current measured value. (We favor longer RTTs slightly in this way, because circuits are growing away from the middle node and becoming longer).

If the histogram is used, it has an additional special "infinity" bin that means "infinite delay".

The state can also provide an optional parameterized distribution that specifies how many total cells (or how many padding cells) can be sent on the circuit while the machine is in this state, before it transitions to a new state.

Each state of a padding machine can react to the following cell events:

  * Non-padding cell received
  * Padding cell received
  * Non-padding cell sent
  * Padding cell sent

Additionally, padding machines emit the following internal events to themselves:

  * Infinity bin was selected
  * The histogram bins are empty
  * The length count for this state was exceeded

Each state of the padding machine specifies a set of these events that cause it to cancel any pending padding, and a set of events that cause it to transition to another state, or transition back to itself.
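The RTT update rule described above (take a larger measurement as-is, average in a smaller one) can be sketched in Python as follows. This is an illustration rather than Tor's implementation: the function name, the use of None for "no estimate yet", and integer averaging are assumptions of the sketch.

```python
def update_rtt_estimate(current_usec, measured_usec):
    """Return the new RTT estimate in microseconds.

    current_usec is None until the first measurement has been taken
    (an assumption of this sketch, not something the text specifies).
    """
    if current_usec is None:
        return measured_usec
    if measured_usec > current_usec:
        # A larger measurement replaces the estimate outright.
        return measured_usec
    # A smaller (or equal) measurement is averaged with the estimate,
    # which biases the estimate slightly towards longer RTTs.
    return (current_usec + measured_usec) // 2
```

For example, starting from an estimate of 300 usec, a new measurement of 100 usec yields 200 usec, whereas starting from 100 usec a measurement of 300 usec yields 300 usec.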
When an event causes a transition to a state (or back to the same state), a delay is sampled from the histogram or delay distribution, and a padding cell is scheduled to be sent after that delay.

If a non-padding cell is sent before the timer, the timer is canceled and a new padding delay is chosen.

3.1.1. Histogram Specification

If a histogram is used by a state (as opposed to a fixed parameterized distribution), then each of the histogram's fields represents a probability distribution that is encoded into bins of exponentially increasing width.

The first bin of the histogram (bin 0) has 0 width, with a delay value of start_usec+rtt_estimate (from the machine definition, and rtt estimate above).

The remaining bins are exponentially spaced, starting at this offset and covering the range of the histogram, which is range_usec.

The intermediate bins thus divide the timespan range_usec with offset start_usec+rtt_estimate, so that smaller bin indexes represent narrower time ranges, doubling up until the last bin. The last bin before the "infinity bin" thus covers [start_usec+rtt_estimate+range_usec/2, start_usec+rtt_estimate+range_usec).

This exponentially increasing bin width allows the histograms to most accurately represent small interpacket delay (where accuracy is needed), and devote less accuracy to larger timescales (where accuracy is not as important).

To sample the delay time to send a padding packet, perform the following:

  * Select a bin weighted by the number of tokens in its index compared to the total.
  * If the infinity bin is selected, do not schedule padding.
  * If bin 0 is selected, schedule padding at exactly its time value.
  * For other bins, uniformly sample a time value between this bin and the next bin, and schedule padding then.

3.1.1.1. Histogram Token Removal

Tokens can be optionally removed from histogram bins whenever a padding or non-padding packet is sent.
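The bin layout and the sampling steps of Section 3.1.1 can be sketched in Python as follows. This is a hedged illustration, not the circuit padding implementation: the function names are hypothetical, the token list is assumed to hold one count per bin with the infinity bin last, and the edge of bin i (for bins other than 0 and infinity) is assumed to be range_usec divided by a power of two, so that the last finite bin covers the upper half of the range as the text states.

```python
import random


def bin_lower_edge(i, n_bins, start_usec, rtt_estimate, range_usec):
    """Lower edge in usec of bin i; index n_bins - 1 is the infinity bin."""
    offset = start_usec + rtt_estimate
    if i <= 0:
        return offset                      # bin 0: zero width, at the offset
    if i >= n_bins - 1:
        return offset + range_usec         # where the infinity bin begins
    # Each bin is twice as wide as the one below it.
    return offset + range_usec // (2 ** (n_bins - 1 - i))


def sample_delay(tokens, start_usec, rtt_estimate, range_usec, rng=None):
    """Return a padding delay in usec, or None if no padding is scheduled."""
    rng = rng or random.Random()
    total = sum(tokens)
    if total == 0:
        return None                        # all bins empty: nothing to sample
    # Select a bin with probability proportional to its token count.
    pick = rng.randrange(total)
    chosen = 0
    for chosen, count in enumerate(tokens):
        if pick < count:
            break
        pick -= count
    n_bins = len(tokens)
    if chosen == n_bins - 1:
        return None                        # infinity bin: do not pad
    low = bin_lower_edge(chosen, n_bins, start_usec, rtt_estimate, range_usec)
    if chosen == 0:
        return low                         # bin 0: exactly its time value
    high = bin_lower_edge(chosen + 1, n_bins, start_usec, rtt_estimate,
                          range_usec)
    # Uniform sample between this bin's edge and the next bin's edge.
    return low if high <= low else rng.randrange(low, high)
```

With five bins, start_usec=100, rtt_estimate=0 and range_usec=800, the lower edges under these assumptions are 100, 200, 300 and 500 usec, with the infinity bin beginning at 900 usec.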
With this token removal, the histogram functions as an overall target delay distribution for the machine while it is in that state.

If token removal is enabled, when a padding packet is sent, a token is removed from the bin corresponding to the target delay. When a non-padding packet is sent, the actual delay from the previous packet is calculated, and the histogram bin corresponding to that delay is inspected. If that bin has tokens remaining, it is decremented.

If the bin has no tokens left, the state removes a token from a different bin, as specified in its token removal rule. The following token removal options are defined:

  * None -- Never remove any tokens
  * Exact -- Only remove from the target bin; if it is empty, ignore it.
  * Higher -- Remove from the next higher non-empty bin
  * Lower -- Remove from the next lower non-empty bin
  * Closest -- Remove from the closest non-empty bin by index
  * Closest_time -- Remove from the closest non-empty bin by time

When all bins except the infinity bin are empty in a histogram, the padding machine emits the internal "bins empty" event to itself.

Bin 0 and the bin before the infinity bin both have special rules for purposes of token removal. While removing tokens, all values less than bin 0 are treated as part of bin 0, and all values greater than start_usec+rtt_estimate+range_usec are treated as part of the bin before the infinity bin. Tokens are not removed from the infinity bin when non-padding is sent. (They are only removed when an "infinite" delay is chosen).

3.1.2. Delay Probability Distribution

Alternatively, a delay probability distribution can be used instead of a histogram, to sample padding delays.
In this case, the designer also needs to specify the appropriate distribution parameters, and when a padding cell needs to be scheduled, the padding subsystem will sample a positive delay value (in microseconds) from that distribution (where the minimum delay is range_usec+start_usec as is the case for histograms). We currently support the following probability distributions: Uniform, Logistic, Log-logistic, Geometric, Weibull, Pareto 3.2. State Machine Selection Clients will select which of the defined available padding machines to use based on the conditions that these machines specify. These conditions include: * How many hops the circuit must be in order for the machine to apply * If the machine requires vanguards to be enabled to apply * The state the circuit must be in for machines to apply (building, relay early cells remaining, opened, streams currently attached). * If the circuit purpose matches a set of purposes for the machine. * If the target hop of the machine supports circuit padding. Clients will only select machines whose conditions fully match given circuits. A machine is represented by a positive number that can be thought of as a "menu option" through the list of padding machines. The currently supported padding state machines are: [1]: CIRCPAD_MACHINE_CIRC_SETUP A padding machine that obscures the initial circuit setup in an attempt to hide onion services. 3.3. Machine Negotiation When a machine is selected, the client uses leaky-pipe delivery to send a RELAY_COMMAND_PADDING_NEGOTIATE to the target hop of the machine, using the following trunnel relay cell payload format: /** * This command tells the relay to alter its min and max netflow * timeout range values, and send padding at that rate (resuming * if stopped). 
*/ struct circpad_negotiate { u8 version IN [0]; u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP]; /** Machine type is left unbounded because we can specify * new machines in the consensus */ u8 machine_type; }; Upon receipt of a RELAY_COMMAND_PADDING_NEGOTIATE cell, the middle node sends a RELAY_COMMAND_PADDING_NEGOTIATED with the following format: /** * This command tells the relay to alter its min and max netflow * timeout range values, and send padding at that rate (resuming * if stopped). */ struct circpad_negotiated { u8 version IN [0]; u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP]; u8 response IN [CIRCPAD_RESPONSE_OK, CIRCPAD_RESPONSE_ERR]; /** Machine type is left unbounded because we can specify * new machines in the consensus */ u8 machine_type; }; The 'machine_type' field should be the same as the one from the PADDING_NEGOTIATE cell. This is because, as an optimization, new machines can be installed at the client side immediately after tearing down an old machine. If the response machine type does not match the current machine type, the response was for a previous machine, and can be ignored. If the response field is CIRCPAD_RESPONSE_OK, padding was successfully negotiated. If it is CIRCPAD_RESPONSE_ERR, the machine is torn down and we do not pad. 4. Examples of Padding Machines In the original WTF-PAD design[2], the state machines are used as follows: The "Burst" histogram specifies the delay probabilities for sending a padding packet after the arrival of a non-padding data packet. The "Gap" histogram specifies the delay probabilities for sending another padding packet after a padding packet was just sent from this node. This self-triggering property of the "Gap" histogram allows the construction of multi-packet padding trains using a simple statistical distribution. Both "Gap" and "Burst" histograms each have a special "Infinity" bin, which means "We have decided not to send a packet". 
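Returning briefly to the negotiation in Section 3.3, the client-side handling of a RELAY_COMMAND_PADDING_NEGOTIATED payload can be sketched in Python as below. The sketch is illustrative only: the numeric values of the CIRCPAD_* constants are assumptions (the text above names them without assigning values), and the function names and the returned labels are hypothetical.

```python
import struct

# Assumed numeric values; the proposal text only names these constants.
CIRCPAD_COMMAND_STOP = 1
CIRCPAD_COMMAND_START = 2
CIRCPAD_RESPONSE_OK = 1
CIRCPAD_RESPONSE_ERR = 2


def encode_negotiate(command, machine_type):
    """Encode circpad_negotiate: u8 version (0), u8 command, u8 machine_type."""
    return struct.pack("BBB", 0, command, machine_type)


def handle_negotiated(payload, current_machine_type):
    """Classify a circpad_negotiated payload as 'ignore', 'padding' or 'teardown'."""
    if len(payload) < 4:
        return "ignore"
    version, _command, response, machine_type = struct.unpack(
        "BBBB", payload[:4])
    if version != 0:
        return "ignore"
    if machine_type != current_machine_type:
        # The response refers to a machine that was already replaced.
        return "ignore"
    if response == CIRCPAD_RESPONSE_OK:
        return "padding"
    # CIRCPAD_RESPONSE_ERR: tear the machine down and do not pad.
    return "teardown"
```

Comparing machine_type before acting on the response is what lets a client install a new machine immediately after tearing down an old one without confusing a late reply for the old machine with a reply for the new one.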
Intuitively, the burst state is used to detect when the line is idle (and should therefore have few or no tokens in low histogram bins). The lack of tokens in the low histogram bins causes the system to remain in the burst state until the actual application traffic either slows, stalls, or has a gap.

The gap state is used to fill in otherwise idle periods with artificial payloads from the server (and should have many tokens in low bins, and possibly some also at higher bins). In this way, the gap state either generates entirely fake streams of cells, or extends real streams with additional cells.

The Adaptive Padding Early implementation[3] uses parameterized distributions instead of histograms, but otherwise uses the states in the same way.

It should be noted that due to our generalization of these states and their transition possibilities, more complicated interactions are also possible. For example, it is possible to simulate circuit extension, so that all circuits appear to continue to extend up until the RELAY_EARLY cell count is depleted. It is also possible to create machines that simulate traffic on unused circuits, or mimic onion service activity on clients that aren't otherwise using onion services.

5. Security considerations and mitigations

The risks from this proposal are primarily DoS/resource exhaustion, and side channels.

5.1. Rate limiting

Current research[2,3] indicates that padding should be effective against website traffic fingerprinting at overhead rates of 50-60%. Circuit setup behavior can be concealed with far less overhead.

We recommend that the following consensus parameters be used in the event that the network is being overloaded from padding to such a degree that padding requests should be ignored:

  * circpad_global_max_padding_pct - The maximum percent of sent padding traffic out of total traffic to allow in a tor process before ceasing to pad. Ex: 75 means 75 padding packets for every 100 non-padding+padding packets.
This definition is consistent with the overhead values in Proposal #265, though it does not take node position into account. * circpad_global_allowed_cells - The number of padding cells that must be transmitted before the global ratio limit is applied. Additionally, each machine can specify its own per-machine limits for the allowed cell counters and padding overhead percentages. When either global or machine limits are reached, padding is no longer scheduled. The machine simply becomes idle until the overhead drops below the threshold. Finally, the consensus can also be used to specify that clients should use only machines that are flagged as reduced padding, or disable circuit padding entirely, with the following two parameters: * circpad_padding_reduced=1 - Tells clients to only use padding machines with the 'reduced_padding_ok' machine condition flag set. * circpad_padding_disabled=1 - Tells clients to stop circuit padding immediately, and not negotiate any further padding machines. 5.2. Overhead accounting In order to monitor the quantity of padding to decide if we should alter these limits in the consensus, every node will publish the following values in a padding-counts line in its extra-info descriptor: * read_drop_cell_count - The number of RELAY_DROP cells read by this relay. * write_drop_cell_count - The number of RELAY_DROP cells sent by this relay. Each of these counters will be rounded to the nearest 10,000 cells. This rounding parameter will also be listed in the extra-info descriptor line, in case we change it in a later release. In the future, we may decide to introduce Laplace Noise in a similar manner to the hidden service statistics, to further obscure padding quantities. 5.3. Side channels In order to prevent relays from introducing side channels by requesting padding from clients, all of the padding negotiation commands are only valid in the outgoing (from the client/OP) direction. 
Similarly, to prevent relays from sending malicious padding from arbitrary circuit positions, if RELAY_DROP cells arrive from a hop other than that with which padding was negotiated, this cell is counted as invalid for purposes of CIRC_BW control port fields, allowing the vanguards addon to close the circuit upon detecting this activity. ------------------- 1. https://gitweb.torproject.org/torspec.git/tree/proposals/251-netflow-padding.txt 2. https://www.cs.kau.se/pulls/hot/thebasketcase-wtfpad/ 3. https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
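As an illustration of the global rate limit from Section 5.1 and the counter rounding from Section 5.2 of proposal 254, here is a small Python sketch. It is an interpretation, not the implementation: the function names are hypothetical, padding is assumed to stop once the padding percentage reaches the limit, and ties in rounding are assumed to round up.

```python
def padding_allowed(padding_sent, nonpadding_sent,
                    max_padding_pct, allowed_cells):
    """Return True if another padding cell may be scheduled.

    max_padding_pct and allowed_cells stand for the consensus parameters
    circpad_global_max_padding_pct and circpad_global_allowed_cells.
    """
    if padding_sent < allowed_cells:
        # The ratio limit is not applied until this many padding cells
        # have been transmitted.
        return True
    total = padding_sent + nonpadding_sent
    if total == 0:
        return True
    # Padding percentage = padding / (padding + non-padding) * 100.
    return padding_sent * 100 < max_padding_pct * total


def round_cell_count(count, granularity=10_000):
    """Round a cell counter to the nearest multiple of `granularity`."""
    return ((count + granularity // 2) // granularity) * granularity
```

With a limit of 75 and 100 allowed cells, 750 padding cells out of 1000 total cells reach the limit and padding pauses; it becomes allowed again once further non-padding traffic lowers the ratio.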
Filename: 255-hs-load-balancing.txt
Title: Controller features to allow for load-balancing hidden services
Author: Tom van der Woerdt
Created: 2015-10-12
Status: Reserve

1. Overview and motivation

To address scaling concerns with the onion web, we want to be able to spread the load of hidden services across multiple machines. OnionBalance is a great stab at this, and it can currently give us 60x the capacity by publishing 6 separate descriptors, each with 10 introduction points, but more is better.

This proposal aims to address hidden service scaling up to a point where we can handle millions of concurrent connections.

The basic idea involves splitting the 'introduce' from the 'rendezvous', in the tor implementation, and adding new events and commands to the control specification to allow intercepting introductions and transmitting them to different nodes, which will then take care of the actual rendezvous. External controller code could relay the data to another node or a pool of nodes, all of which are run by the hidden service operator, effectively distributing the load of hidden services over multiple processes.

By cleverly utilizing the current descriptor methods through OnionBalance, we could publish up to sixty unique introduction points, which could translate to many thousands of parallel tor workers after implementing this proposal. This should allow hidden services to go multi-threaded with a few small changes, and continue scaling for a long time.

2. Specification

We propose two additions to the control specification, of which one is an event and the other is a new command. We also introduce two new configuration options.

2.1. HiddenServiceAutomaticRendezvous configuration option

The syntax is:

  "HiddenServiceAutomaticRendezvous" SP [1|0] CRLF

This configuration option is defined to be a boolean toggle which, if zero, stops the tor implementation from automatically doing a rendezvous when an INTRODUCE2 cell is received.
Instead, an event will be sent to the controllers. If no controllers are present, the introduction cell should be dropped, as acting on it instead of dropping it could open a window for a DoS.

This configuration option can be specified on a per-hidden service level, and can be set through the controller for ephemeral hidden services as well.

2.2. HiddenServiceTag configuration option

The syntax is:

  "HiddenServiceTag" SP [a-zA-Z0-9] CRLF

To identify groups of hidden services more easily across nodes, a name/tag can be given to a hidden service. Defaults to the storage path of the hidden service (HiddenServiceDir).

2.3. The "INTRODUCE" event

The syntax is:

  "650" SP "INTRODUCE" SP HSTag SP RendezvousData CRLF

  HSTag = the tag of the hidden service
  RendezvousData = implementation-specific, but must not contain whitespace, must only contain human-readable characters, and should be no longer than 2048 bytes

The INTRODUCE event should contain sufficient data to allow continuing the rendezvous from another Tor instance. The exact format is left unspecified and left up to the implementation. From this follows that only matching versions can be used safely to coordinate the rendezvous of hidden service connections.

2.4. "PERFORM-RENDEZVOUS" command

The syntax is:

  "PERFORM-RENDEZVOUS" SP HSTag SP RendezvousData CRLF

This command allows a controller to perform a rendezvous using data received through an INTRODUCE event. The format of RendezvousData is not specified other than that it must not contain whitespace, and should be no longer than 2048 bytes.

2.5. The RendezvousData blob

The "RendezvousData" blob is opaque to the controller, however the tor implementation should of course know how to deal with it. Its contents are the minimal amount of data required to process the INTRODUCE2 cell on another machine.
Before proposal 224 is implemented, this could consist of the INTRODUCE2 cell payload, the key to decrypt the cell if the cell is not already decrypted (which may be preferable, for performance reasons), and data necessary for other machines to recognize what to do with the cell. After proposal 224 is implemented, the blob would contain any additional keys needed to perform the rendezvous handshake. Implementations do not need to handle blobs generated by other versions of the software. Because of this, it is recommended to include a version number which can be used to verify that the blob is from a compatible implementation. 3. Compatibility and security The implementation of these methods should, ideally, not change anything in the network, and all control changes are opt-in, so this proposal is fully backwards compatible. Controllers handling this data must be careful to not leak rendezvous data to untrusted parties, as it could be used to intercept and manipulate hidden services traffic. 4. Example Let's take an example where a client (Alice) tries to contact Bob's hidden service. To do this, Bob follows the normal hidden service specification, except he sets up ten servers to do this. One of these publishes the descriptor, the others have this disabled. When the INTRODUCE2 cell arrives at the node which published the descriptor, it does not immediately try to perform the rendezvous, but instead outputs this to the controller. Through an out-of-band process this message is relayed to a controller of another node of Bob's, and this transmits the "PERFORM-RENDEZVOUS" command to that node. This node performs the rendezvous, and will continue to serve data to Alice, whose client will now not have to talk to the introduction point anymore. 5. 
Other considerations We have left the actual format of the rendezvous data in the control protocol unspecified, so that controllers do not need to worry about the various types of hidden service connections, most notably proposal 224. The decision to not implement the actual cell relaying in the tor implementation itself was taken to allow more advanced configurations, and to leave the actual load-balancing algorithm to the implementor of the controller. The developer of the tor implementation should not have to choose between a round-robin algorithm and something that could pull CPU load averages from a centralized monitoring system.
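To make the controller's role in proposal 255 concrete, the following Python sketch parses an INTRODUCE event line (Section 2.3), builds the matching PERFORM-RENDEZVOUS command (Section 2.4), and hands introductions to workers in round-robin order. It is only one possible controller design: the class and function names, the worker identifiers, and the choice to reject blobs longer than 2048 bytes are assumptions of this sketch, and no real control-port library is used.

```python
import itertools

MAX_RENDEZVOUS_DATA = 2048  # "should be no longer than 2048 bytes"


def parse_introduce_event(line):
    """Return (hs_tag, rendezvous_data), or None if the line does not match."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 4 or parts[0] != "650" or parts[1] != "INTRODUCE":
        return None
    hs_tag, data = parts[2], parts[3]
    if not hs_tag or not data or len(data) > MAX_RENDEZVOUS_DATA:
        return None
    # The blob must be printable ASCII without whitespace; it is otherwise
    # opaque to the controller.
    if not data.isascii() or not data.isprintable():
        return None
    return hs_tag, data


def perform_rendezvous_command(hs_tag, data):
    """Format the command to send to the Tor instance doing the rendezvous."""
    return "PERFORM-RENDEZVOUS {} {}\r\n".format(hs_tag, data)


class RoundRobinDispatcher:
    """Assign each introduction to the next worker in a fixed rotation."""

    def __init__(self, workers):
        self._cycle = itertools.cycle(list(workers))

    def dispatch(self, event_line):
        parsed = parse_introduce_event(event_line)
        if parsed is None:
            return None
        return next(self._cycle), perform_rendezvous_command(*parsed)
```

A dispatcher created with two hypothetical workers, "node-a" and "node-b", would alternate between them for successive INTRODUCE events; a load-aware policy could replace the rotation without touching the parsing code.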
Filename: 256-key-revocation.txt Title: Key revocation for relays and authorities Authors: Nick Mathewson Created: 27 October 2015 Status: Reserve 1. Introduction This document examines the different kinds of long-lived public keys in Tor, and discusses a way to revoke each. The kind of keys at issue are: * Authority identity keys * Authority signing keys * OR identity keys (ed25519) * OR signing keys (ed25519) * OR identity keys (RSA) Additionally, we need to make sure that all other key types, if they are compromised, can be replaced or rendered unusable. 2. When to revoke keys Revoking keys should happen when the operator of an authority or relay believes that the key has been compromised, or has a significant risk of having been compromised. In this proposal we deliberately leave this decision up to the authority/relay operators. (If a third party believes that a key has been compromised, they should attempt to get the key-issuing party to revoke their key. If that can't be done, the uncompromised authorities should block the relay or de-list the authority in question.) Our key-generation code (for authorities and relays) should generate preemptive revocation documents at the same time it generates the original keys, so that operators can retain those documents in the event that access to the original keys is lost. The operators should keep these revocation documents private and available enough so that they can issue the revocation if necessary, but nobody else can. Additionally, the key generation code should be able to generate retrospective revocation documents for existing keys and certificates. (This approach can be more useful when a subkey is revoked, but the operator still has ready access to the issuing key.) 3. Authority keys Authority identity keys are the most important keys in Tor. They are kept offline and encrypted. They are used to sign authority signing keys, and for no other purpose. Authority signing keys are kept online. 
They are authenticated by authority identity keys, and used to sign votes and consensus documents.

(For the rest of section 3, all keys mentioned will be authority keys.)

3.1. Revocation certificates for authorities

We add the following extensions to authority key certificates (see dir-spec.txt section 3.1), for use in key revocation.

  "dir-key-revocation-type" SP "master" | "signing" NL

    Specifies which kind of revocation document this is.

    If dir-key-revocation-type is absent, this is not a revocation.

    [At most once]

  "dir-key-revocation-notes" SP (any non-NL text) NL

    An explanation of why the key was revoked.

    Must be absent unless dir-key-revocation-type is set.

    [Any number of times]

  "dir-key-revocation-signing-key-unusable" NL

    Present if the signing key in this document will not actually be used to sign votes and consensuses.

    [At most once]

  "dir-key-revoked-signing-key" SP DIGEST NL

    Fingerprints of signing keys being explicitly revoked by this certificate. (All keys published before this one are _implicitly_ revoked.)

    [Any number of times]

  "dir-key-revocation-published" SP YYYY-MM-DD SP HH:MM:SS NL

    The actual time when the revocation was generated. (Used since the 'published' field in the certificate will lie; see below.)

    [At most once]

3.2. Generating revocations

Revocations for signing keys should be generated with:

  * A 'published' time immediately following the published date on the key that they are revoking.

  * An 'expires' time at least 48 hours after the expires date on the key that they are revoking, and at least one week in the future.

(Note that this ensures as-correct-as-possible behavior for existing Tor clients and servers. For Tor versions before 0.2.6, having a more recent published date than the older key will cause the revoked key certificate to be removed by trusted_dirs_remove_old_certs() if it is published at least 7 days in the past. For Tor versions 0.2.6 or later, the interval is reduced to 2 days.)
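The two timing rules above for signing key revocations can be sketched in Python as follows. The function name is hypothetical, and treating "immediately following" as one second later is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone


def signing_revocation_times(revoked_published, revoked_expires, now):
    """Return (published, expires) for a signing key revocation certificate."""
    # 'published': immediately after the revoked key's published time
    # (assumed here to mean one second later).
    published = revoked_published + timedelta(seconds=1)
    # 'expires': at least 48 hours after the revoked key's expiry AND
    # at least one week in the future, so take the later of the two.
    expires = max(revoked_expires + timedelta(hours=48),
                  now + timedelta(days=7))
    return published, expires


# Example: a key that expired long ago still gets a revocation that
# remains valid for a week from now.
old_pub = datetime(2015, 1, 1, tzinfo=timezone.utc)
old_exp = datetime(2015, 4, 1, tzinfo=timezone.utc)
now = datetime(2015, 10, 27, tzinfo=timezone.utc)
published, expires = signing_revocation_times(old_pub, old_exp, now)
```

Taking the maximum of the two candidate expiry times satisfies both constraints at once, whichever of them happens to be the binding one.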
If generating a signing key revocation ahead of time, the revocation document should include a dummy signing key, to be thrown away immediately after it is generated and used to make the revocation document. The "dir-key-revocation-signing-key-unusable" element should be present.

If generating a signing key revocation in response to an event, the revocation document should include the new signing key to be used. The "dir-key-revocation-signing-key-unusable" element must be absent.

All replacement certificates generated for the lifetime of the original revoked certificate should be generated as revocations.

Revocations for master keys should be generated with:

    * A 'published' time immediately following the published date on the most recently generated certificate, if possible.

    * An 'expires' time equal to 18 Jan 2038. (The next-to-last day encodable in time_t, to avoid Y2038 problems.)

    * A dummy signing key, as above.

3.3. Submitting revocations

In the event of a key revocation, authority operators should upload the revocation document to every other authority.

If there is a replacement signing key, it should be included in the authority's votes (as any new key certificate would be).

3.4. Handling revocations

We add these additional rules for caching and storing revocations on Tor servers and clients.

    * Master key revocations should be stored indefinitely.

    * If we have a master key revocation, no other certificates for that key should be fetched, stored, or served.

    * If we have a master key revocation, we should replace any DirAuthority entry for that master key with a 'null' entry -- an authority with no address and no keys, from which nothing can be downloaded and nothing can be trusted, but which still counts against the total number of authorities.

    * Signing key revocations should be retained until their 'expires' date.
    * If we have a signing key revocation document, we should not trust any signature generated with any key in an older signing key certificate for the same master key. We should not serve such key certificates.

    * We should not attempt to fetch any certificate document matching an <identity, signing> pair for which a revocation document exists.

We add these additional rules for directory authorities:

    * When generating or serving a consensus document, an authority should include a dir-source entry based on the most recent revocation cert it has from an authority, unless that authority has a more recent valid key cert. (This will require a new consensus method.)

    * When generating or serving a consensus document, if no valid signature exists from a given authority, and that authority has a currently valid key revocation document with a signing key in it, it should include a bogus signature purporting to be made with that signing key. (All-zeros is suggested.) (Doing this will make old Tor clients download the revocation certificates.)

4. Router identity key revocations

4.1. RSA identity keys

If the RSA key is compromised but the ed25519 identity and signing keys are not, simply disable the router. Key pinning should take care of the rest.

(This isn't ideal when key pinning isn't deployed yet, but I'm betting that key pinning comes online before this proposal does.)

4.2. Ed25519 master identity keys

(We use the data format from proposal 220, section 2.3 here.)

To revoke a master identity key, create a revocation for the master key and upload it to every authority.

Authorities should accept these documents as meaning that the signing key should never be allowed to appear on the Tor network. This can be enforced with the key pinning mechanism.

4.3. Ed25519 signing keys

(We use the data format from proposal 220, section 2.3 here.)
To revoke a signing key, include the revocation for every not-yet-expired signing key in your routerinfo document, as in:

    "revoked-signing-key" SP Base64-Ed25519-Key NL

        Note that this doesn't need to be authenticated, since the newer signing key certificate creates a trust path from the master identity key to the revocation.

        [Up to 32 times.]

Upon receiving these entries, authorities store up to 32 such entries per router per year. (If you have more than 32 keys compromised, give up and take your router down. Start it with a new master key.)

When voting, authorities include a "rev" line in the microdescriptor for every revoked-signing-key in the routerinfo:

    "rev" SP "ed25519" SP Base64-Ed25519-Key NL

(This will require a new microdescriptor version.)

Upon receiving such a line in the microdescriptor, Tor instances MUST NOT trust any signing key certificate with a matching key.
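As a rough illustration of section 4.3, the sketch below collects "revoked-signing-key" entries from a routerinfo document, enforces the 32-entry cap, and emits the matching "rev" microdescriptor lines. The function names are hypothetical, and the assumption that keys are unpadded base64 encodings of 32-byte Ed25519 keys is ours, not something this proposal states.

```python
import base64

MAX_REVOKED_KEYS = 32  # cap taken from the "[Up to 32 times.]" rule


def is_ed25519_key(b64_text):
    """Check that the text decodes to 32 bytes (assumed unpadded base64)."""
    try:
        padded = b64_text + "=" * (-len(b64_text) % 4)
        return len(base64.b64decode(padded, validate=True)) == 32
    except ValueError:
        return False


def parse_revoked_signing_keys(routerinfo_text):
    """Return the keys listed on "revoked-signing-key" lines, in order."""
    keys = []
    for line in routerinfo_text.splitlines():
        parts = line.split(" ")
        if len(parts) == 2 and parts[0] == "revoked-signing-key":
            if not is_ed25519_key(parts[1]):
                raise ValueError("malformed revoked-signing-key entry")
            keys.append(parts[1])
    if len(keys) > MAX_REVOKED_KEYS:
        raise ValueError("more than 32 revoked-signing-key entries")
    return keys


def rev_lines(keys):
    """Build the "rev" lines an authority would put in the microdescriptor."""
    return ["rev ed25519 " + key for key in keys]


if __name__ == "__main__":
    example_key = base64.b64encode(bytes(range(32))).decode().rstrip("=")
    document = "router example 192.0.2.1 9001 0 0\n"
    document += "revoked-signing-key " + example_key + "\n"
    for line in rev_lines(parse_revoked_signing_keys(document)):
        print(line)
```

Rejecting a document with more than 32 entries is one possible reading of the cap; an authority could instead keep only the first 32.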
Filename: 257-hiding-authorities.txt Title: Refactoring authorities and making them more isolated from the net Authors: Nick Mathewson, Andrea Shepard Created: 2015-10-27 Status: Meta 0. Meta status This proposal is 'accepted' as an outline for future proposals, though it doesn't actually specify itself in enough detail to be implemented as it stands. 1. Introduction Directory authorities are critical to the Tor network, and represent a DoS target to anybody trying to disable the network. This document describes a strategy for making directory authorities in general less vulnerable to DoS by separating the different parts of their functionality. 2. Design 2.1. The status quo This proposal is about splitting up the roles of directory authorities. But, what are these roles? Currently, directory authorities perform the following functions. Some of these functions require large amounts of bandwidth; I've noted that with a (BW). Some of these functions require a publicly known address; I've marked those with a (PUB). Some of these functions inevitably leak the location from which they are performed. I've marked those with a (LOC). Not everything in this list is something that needs to be done by an authority permanently! This list is, again, just what authorities do now. * Authorities receive uploaded server descriptors and extrainfo descriptors from regular Tor servers and from each other. (BW, PUB?) * Authorities periodically probe the routers they know about in order to determine whether they are running or not. By remembering the past behavior of nodes, they also build a view of each node's fractional uptime and mean time between failures. (LOC, unless we're clever) * Authorities perform the consensus protocol by: * Generating 'vote' documents describing their view of the network, along with a set of microdescriptors for later client use. * Uploading these votes to one another. * Computing a 'consensus' of these votes. 
    * Authorities serve as a location for distributing consensus documents, descriptors, extrainfo documents, and microdescriptors...

        * To directory mirrors. (BW?, PUB?, LOC?)

        * To clients that do not yet know a directory mirror. (BW!!, PUB)

These functions are tied to directory authorities, but done out-of-process:

    * Bandwidth measurement (BW)

    * Sybil detection

    * 'Guardiness' measurement, possibly.

2.2. Design goals

Principally, we attempt to isolate the security-critical, high-resource, and availability-critical pieces of our directory infrastructure from one another.

We would like to make the security-critical pieces of our infrastructure easy to relocate, and the communications between them easy to secure.

We require that the Tor network remain able to bootstrap itself in the event of catastrophic failure. So, while we can _use_ a running Tor network to communicate, we should not _require_ that a running Tor network exist in order to perform the voting process.

2.3. Division of responsibility

We propose dividing directory authority operations into these modules:

    ----------       ----------    --------------    ----------------
    | Upload |======>| Voting |===>| Publishing |===>| Distribution |
    ----------       ----------    --------------    ----------------
       I                 ^
       I    -----------  I
       ====>| Metrics |===
            -----------

A number of 'upload' servers are responsible for receiving router descriptors. These are publicly known, and responsible for collecting descriptors.

Information from these servers is used by 'metrics' modules (which check Tor servers for reliability and measure their history), and fed into the voting process.

The voting process involves only communication (indirectly) from authorities to authorities, to produce a consensus and a set of microdescriptors.

When voting is complete, consensuses, descriptors, and microdescriptors must be made available to the rest of the world. This is done by the 'publishing' module.
The consensuses, descriptors, and mds are then taken up by the directory caches, and distributed.

The rest of this proposal will describe means of communication between these modules.

3. The modules in more detail

This section will outline possibilities for communication between the various parts of the system to isolate them. There will be plenty of "may"s and "could"s and "might"s here: these are possibilities, in need of further consideration.

3.1. Sending descriptors to the Upload module

We retain the current mechanism: a set of well-known IP addresses with well-known OR keys to which each relay should upload a copy of its descriptors.

The worst that a hostile upload server can do is to drop descriptors. (It could also generate large numbers of spurious descriptors in order to increase the load on the metrics system. But an attacker could do that without running an upload server.)

With respect to dropping, upload servers can use an anytrust model: so long as a single server receives and honestly reports descriptors to the rest of the system, those descriptors will arrive correctly.

To avoid DoS attacks, we can require that any relay not previously known to an upload module perform some kind of a proof of work as it first registers itself. (Details TBD)

If we're using TLS here, we should also consider a check-then-start TLS design as described in A.1 below.

The location of Upload servers can change over time; they can be published in the consensus.

(Note also that as an alternative, we could distribute this functionality across the whole network.)

3.2. Transferring descriptors to the metrics server and the voters

The simplest option here would be for the authorities and metrics servers to mirror them using Tor. rsync-over-ssh-over-Tor is a possibility, if we don't feel like building more machinery.

(We could use hidden services here, but it is probably okay for upload servers to be known by the voters and metrics.)
A fallback to a non-Tor connection could be done manually, or could require explicit buy-in from the voter/metrics operator.

3.3. Transferring information from metrics server to voters

The same approaches as 3.2 should work fine.

3.4. Communication between voters

Voters can, we hope, communicate to each other over authenticated hidden services. But we'll need a fallback mechanism here.

Another option is to have public ledgers available for voters to talk to anonymously. This is probably a better idea. We could re-use the upload servers for this purpose, perhaps.

Giving voters each others' addresses seems like a bad idea.

3.5. Communication from voters to directory nodes

We should design a flood-distribution mechanism for microdescriptors, listed descriptors, and consensuses so that authorities can each upload to a few targets anonymously, and have them propagate through the rest of the network.

4. Migration

To support old clients and old servers, the current authority IP addresses should remain as Upload and Distribution points. The current authority identity keys should remain as the keys for voters.

A.1. Check-then-start TLS

Current TLS protocols are quite susceptible to denial-of-service attacks, with large asymmetries in resource consumption. (Client sends junk, forcing server to perform private-key operation on junk.) We could hope for a good TLS replacement to someday emerge, or for TLS to improve its properties. But as a replacement, I suggest that we wrap TLS in a preliminary challenge-response protocol to establish that the use is authorized before we allow the TLS handshake to begin.

(We shouldn't do this for all the TLS in Tor: only for the cases where we want to restrict the users of a given TLS server.)
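The proposal does not say what the preliminary challenge-response in A.1 should look like. Purely as a sketch of the idea, the example below assumes a pre-shared key between the server and each authorized client and uses HMAC-SHA256 over a random nonce; the function names and the choice of primitive are assumptions for illustration only.

```python
import hashlib
import hmac
import os


def make_challenge():
    """Server side: a fresh random nonce, cheap to generate."""
    return os.urandom(32)


def answer_challenge(shared_key, challenge):
    """Client side: prove knowledge of the (assumed) pre-shared key."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def may_start_tls(shared_key, challenge, response):
    """Server side: one HMAC and a constant-time compare, no private-key work."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    key = b"hypothetical pre-shared key"
    challenge = make_challenge()
    if may_start_tls(key, challenge, answer_challenge(key, challenge)):
        print("authorized: begin TLS handshake")
    if not may_start_tls(key, challenge, b"\x00" * 32):
        print("rejected: drop connection before TLS")
```

The point of the design is that the server spends only a symmetric-key operation on unauthorized peers, and reaches the expensive TLS handshake only after the check passes.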
Filename: 258-dirauth-dos.txt Title: Denial-of-service resistance for directory authorities Author: Andrea Shepard Created: 2015-10-27 Status: Dead 1. Problem statement The directory authorities are few in number and vital for the functioning of the Tor network; threats of denial of service attacks against them have occurred in the past. They should be more resistant to unreasonably large connection volumes. 2. Design overview There are two possible ways a new connection to a directory authority can be established, directly by a TCP connection to the DirPort, or tunneled inside a Tor circuit and initiated with a begindir cell. The client can originate the former as direct connections or from a Tor exit, and the latter either as fully anonymized circuits or one-hop links to the dirauth's ORPort. The dirauth will try to heuristically classify incoming requests as one of these four indirection types, and then in the two non-anonymized cases further sort them into hash buckets on the basis of source IP. It will use an exponentially-weighted moving average to measure the rate of connection attempts in each bucket, and also separately limit the number of begindir cells permitted on each circuit. It will periodically scan the hash tables and forget counters which have fallen below a threshold to prevent memory exhaustion. 3. Classification of incoming connections Clients can originate connections as one of four indirection types: - DIRIND_ONEHOP: begindir cell on a single-hop Tor circuit - DIRIND_ANONYMOUS: begindir cell on a fully anonymized Tor circuit - DIRIND_DIRECT_CONN: direct TCP connection to dirport - DIRIND_ANON_DIRPORT: TCP connection to dirport from an exit relay The directory authority can always tell a dirport connection from a begindir, but it must use its knowledge of the current consensus and exit policies to disambiguate whether the connection is anonymized. 
It should treat a begindir as DIRIND_ANONYMOUS when the previous hop in the circuit it appears on is in the current consensus, and as DIRIND_ONEHOP otherwise; it should treat a dirport connection as DIRIND_ANON_DIRPORT if the source address appears in the consensus and allows exits to the dirport in question, or as DIRIND_DIRECT_CONN otherwise. In the case of relays which also act as clients, these heuristics may falsely classify direct/onehop connections as anonymous, but will never falsely classify anonymous connections as direct/onehop.

4. Exponentially-weighted moving average counters and hash table

The directory authority implements a set of exponentially-weighted moving averages to measure the rate of incoming connections in each bucket. The two anonymous connection types are each a single bucket, but the two non-anonymous cases get a single bucket per source IP each, stored in a hash table. The directory authority must periodically scan this hash table for counters which have decayed close to zero and free them to avoid permitting memory exhaustion.

This introduces five new configuration parameters:

    - DirDoSFilterEWMATimeConstant: the time for an EWMA counter to decay by a factor of 1/e, in seconds.

    - DirDoSFilterMaxAnonConnectRate: the threshold to trigger the DoS filter on DIRIND_ANONYMOUS connections.

    - DirDoSFilterMaxAnonDirportConnectRate: the threshold to trigger the DoS filter on DIRIND_ANON_DIRPORT connections.

    - DirDoSFilterMaxBegindirRatePerIP: the threshold per source IP to trigger the DoS filter on DIRIND_ONEHOP connections.

    - DirDoSFilterMaxDirectConnRatePerIP: the threshold per source IP to trigger the DoS filter on DIRIND_DIRECT_CONN connections.

When incrementing a counter would put it over the relevant threshold, the filter is said to be triggered. In this case, the directory authority does not update the counter, but instead suppresses the incoming request.
In the DIRIND_ONEHOP and DIRIND_ANONYMOUS cases, the directory authority must kill the circuit rather than merely refusing the request, to prevent an unending stream of client retries on the same circuit.

5. Begindir cap

Directory authorities limit the number of begindir cells permitted in the lifetime of a particular circuit, separately from the EWMA counters. This can only affect the DIRIND_ANONYMOUS and DIRIND_ONEHOP connection types. A sixth configuration variable, DirDoSFilterMaxBegindirPerCircuit, controls this feature.

6. Limitations

Widely distributed DoS attacks with many source IPs may still be able to avoid raising any single DIRIND_ONEHOP or DIRIND_DIRECT_CONN counter above threshold.
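The classification heuristic and the per-IP EWMA filter described in sections 3 and 4 can be sketched as below. The class and function names are hypothetical. The sketch also assumes that each connection adds 1/tau to its counter, so that the counter approximates connections per second; the proposal does not fix the units of the thresholds.

```python
import math


def classify(is_begindir, peer_in_consensus, allows_exit_to_dirport=False):
    """Pick one of the four indirection types from section 3."""
    if is_begindir:
        return "DIRIND_ANONYMOUS" if peer_in_consensus else "DIRIND_ONEHOP"
    if peer_in_consensus and allows_exit_to_dirport:
        return "DIRIND_ANON_DIRPORT"
    return "DIRIND_DIRECT_CONN"


class EwmaCounter:
    """Counter that decays by a factor of 1/e every `tau` seconds."""

    def __init__(self, tau):
        self.tau = tau
        self.value = 0.0
        self.last = None  # time of the last accepted connection

    def decayed(self, now):
        if self.last is None:
            return 0.0
        return self.value * math.exp(-(now - self.last) / self.tau)

    def try_increment(self, now, threshold):
        candidate = self.decayed(now) + 1.0 / self.tau  # assumed increment
        if candidate > threshold:
            return False  # filter triggered: counter is left untouched
        self.value = candidate
        self.last = now
        return True


class PerIpFilter:
    """Hash table of counters for one of the non-anonymized types."""

    def __init__(self, tau, threshold, forget_below=1e-3):
        self.tau = tau
        self.threshold = threshold
        self.forget_below = forget_below  # hypothetical cleanup cutoff
        self.table = {}

    def allow(self, source_ip, now):
        counter = self.table.setdefault(source_ip, EwmaCounter(self.tau))
        return counter.try_increment(now, self.threshold)

    def scan(self, now):
        """Forget counters that have decayed close to zero."""
        stale = [ip for ip, counter in self.table.items()
                 if counter.decayed(now) < self.forget_below]
        for ip in stale:
            del self.table[ip]


if __name__ == "__main__":
    direct = PerIpFilter(tau=10.0, threshold=0.25)
    kind = classify(is_begindir=False, peer_in_consensus=False)
    for attempt in range(3):
        print(kind, "attempt", attempt, direct.allow("192.0.2.7", now=0.0))
```

Because decay is computed lazily from the last accepted timestamp, leaving a counter untouched on a rejected request matches the rule that a triggered filter does not update it.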
Filename: 259-guard-selection.txt Title: New Guard Selection Behaviour Author: Isis Lovecruft, George Kadianakis Created: 2015-10-28 Status: Obsolete Extends: 241-suspicious-guard-turnover.txt This proposal was made obsolete by proposal #271. §1. Overview In addition to the concerns regarding path bias attacks, namely that the space from which guards are selected by some specific client should not consist of the entirety of nodes with the Guard flag (cf. §1 of proposal #247), several additional concerns with respect to guard selection behaviour remain. This proposal outlines a new entry guard selection algorithm, which additionally addresses the following concerns: - Heuristics and algorithms for determining how and which guard(s) is(/are) chosen should be kept as simple and easy to understand as possible. - Clients in censored regions or who are behind a fascist firewall who connect to the Tor network should not experience any significant disadvantage in terms of reachability or usability. - Tor should make a best attempt at discovering the most appropriate behaviour, with as little user input and configuration as possible. §2. Design Alice, an OP attempting to connect to the Tor network, should undertake the following steps to determine information about the local network and to select (some) appropriate entry guards. In the following scenario, it is assumed that Alice has already obtained a recent, valid, and verifiable consensus document. Before attempting the guard selection procedure, Alice initialises the guard data structures and prepopulates the guardlist structures, including the UTOPIC_GUARDLIST and DYSTOPIC_GUARDLIST (cf. §XXX). Additionally, the structures have been designed to make updates efficient both in terms of memory and time, in order that these and other portions of the code which require an up-to-date guard structure are capable of obtaining such. 0. Determine if the local network is potentially accessible. 
Alice should attempt to discover if the local network is up or down, based upon information such as the availability of network interfaces and configured routing tables. See #16120. [0] [XXX: This section needs to be fleshed out more. I'm ignoring it for now, but since others have expressed interest in doing this, I've added this preliminary step. —isis] 1. Check that we have not already attempted to add too many guards (cf. proposal #241). 2. Then, if the PRIMARY_GUARDS on our list are marked offline, the algorithm attempts to retry them, to ensure that they were not flagged offline erroneously when the network was down. This retry attempt happens only once every 20 mins to avoid infinite loops. [Should we do an exponential decay on the retry as s7r suggested? —isis] 3. Take the list of all available and fitting entry guards and return the top one in the list. 4. If there were no available entry guards, the algorithm adds a new entry guard and returns it. [XXX detail what "adding" means] 5. Go through the steps 1-4 above algorithm, using the UTOPIC_GUARDLIST. 5.a. When the GUARDLIST_FAILOVER_THRESHOLD of the UTOPIC_GUARDLIST has been tried (without success), Alice should begin trying steps 1-4 with entry guards from the DYSTOPIC_GUARDLIST as well. Further, if no nodes from UTOPIC_GUARDLIST work, and it appears that the DYSTOPIC_GUARDLIST nodes are accessible, Alice should make a note to herself that she is possibly behind a fascist firewall. 5.b. If no nodes from either the UTOPIC_GUARDLIST or the DYSTOPIC_GUARDLIST are working, Alice should make a note to herself that the network has potentially gone down. Alice should then schedule, at exponentially decaying times, to rerun steps 0-5. [XXX Should we do step 0? Or just 1-4? Should we retain any previous assumptions about FascistFirewall? —isis] 6. [XXX Insert potential other fallback mechanisms, e.g. switching to using bridges? —isis] §3. New Data Structures, Consensus Parameters, & Configurable Variables §3.1. 
Consensus Parameters & Configurable Variables Variables marked with an asterisk (*) SHOULD be consensus parameters. DYSTOPIC_GUARDS ¹ All nodes listed in the most recent consensus which are marked with the Guard flag and which advertise their ORPort(s) on 80, 443, or any other addresses and/or ports controllable via the FirewallPorts and ReachableAddresses configuration options. UTOPIC_GUARDS All nodes listed in the most recent consensus which are marked with the Guard flag and which do NOT advertise their ORPort(s) on 80, 443, or any other addresses and/or ports controllable via the FirewallPorts and ReachableAddresses configuration options. PRIMARY_GUARDS * The number of first, active, PRIMARY_GUARDS on either the UTOPIC_GUARDLIST or DYSTOPIC_GUARDLIST as "primary". We will go to extra lengths to ensure that we connect to one of our primary guards, before we fall back to a lower priority guard. By "active" we mean that we only consider guards that are present in the latest consensus as primary. UTOPIC_GUARDS_ATTEMPTED_THRESHOLD * DYSTOPIC_GUARDS_ATTEMPTED_THRESHOLD * These thresholds limit the amount of guards from the UTOPIC_GUARDS and DYSTOPIC_GUARDS which should be partitioned into a single UTOPIC_GUARDLIST or DYSTOPIC_GUARDLIST respectively. Thus, this represents the maximum percentage of each of UTOPIC_GUARDS and DYSTOPIC_GUARDS respectively which we will attempt to connect to. If this threshold is hit we assume that we are offline, filtered, or under a path bias attack by a LAN adversary. There are currently 1600 guards in the network. We allow the user to attempt 80 of them before failing (5% of the guards). With regards to filternet reachability, there are 450 guards on ports 80 or 443, so the probability of picking such a guard here should be high. This logic is not based on bandwidth, but rather on the number of relays which possess the Guard flag. 
This is for three reasons: First, because each possible *_GUARDLIST is roughly equivalent to others of the same category in terms of bandwidth, it should be unlikely [XXX How unlikely? —isis] for an OP to select a guardset which contains less nodes of high bandwidth (or vice versa). Second, the path-bias attacks detailed in proposal #241 are best mitigated through limiting the number of possible entry guards which an OP might attempt to use, and varying the level of security an OP can expect based solely upon the fact that the OP picked a higher number of low-bandwidth entry guards rather than a lower number of high-bandwidth entry guards seems like a rather cruel and unusual punishment in addition to the misfortune of already having slower entry guards. Third, we favour simplicity in the redesign of the guard selection algorithm, and introducing bandwidth weight fraction computations seems like an excellent way to overcomplicate the design and implementation. §3.2. Data Structures UTOPIC_GUARDLIST DYSTOPIC_GUARDLIST These lists consist of a subset of UTOPIC_GUARDS and DYSTOPIC_GUARDS respectively. The guards in these guardlists are the only guards to which we will attempt connecting. When an OP is attempting to connect to the network, she will construct hashring structure containing all potential guard nodes from both UTOPIC_GUARDS and DYSTOPIC_GUARDS. The nodes SHOULD BE inserted into the structure some number of times proportional to their consensus bandwidth weight. From this, the client will hash some information about themselves [XXX what info should we use? —isis] and, from that, choose #P number of points on the ring, where #P is {UTOPIC,DYSTOPIC}_GUARDLIST_ATTEMPTED_THRESHOLD proportion of the total number of unique relays inserted (if a duplicate is selected, it is discarded). These selected nodes comprise the {UTOPIC,DYSTOPIC}_GUARDLIST for (first) entry guards. 
(We say "first" in order to distinguish between entry guards and the vanguards proposed for hidden services in proposal #247.) [Perhaps we want some better terminology for this. Suggestions welcome. -- isis]

Each GUARDLIST SHOULD have the property that the total sum of bandwidth weights for the nodes contained within it is roughly equal to each other guardlist of the same type (i.e. one UTOPIC_GUARDLIST is roughly equivalent in terms of bandwidth to another UTOPIC_GUARDLIST, but not necessarily equivalent to a DYSTOPIC_GUARDLIST).

For space and time efficiency reasons, implementations of the GUARDLISTs SHOULD support prepopulation(), update(), insert(), and remove() functions. A second data structure design consideration is that the amount of "shifting" (that is, the differential between constructed hashrings as nodes are inserted or removed, read: ORs falling in and out of the network consensus) SHOULD be minimised in order to reduce the resources required for hashring update upon receiving a newer consensus.

The implementation we propose is to use a Consistent Hashring, modified to dynamically allocate replications in proportion to fraction of total bandwidth weight. As with a normal Consistent Hashring, replications determine the number of times the relay is inserted into the hashring.
The algorithm goes like this:

    router ← ⊥
    key ← 0
    replications ← 0
    bw_weight_total ← 0

    for router ∈ GUARDLIST:
     | bw_weight_total ← bw_weight_total + BW(router)

    for router ∈ GUARDLIST:
     | replications ← FLOOR(CONSENSUS_WEIGHT_FRACTION(BW(router), bw_weight_total) * T)
     | factor ← (S / replications)
     | while replications != 0:
     |  | key ← (TOINT(HMAC(ID)[:X]) * replications * factor) mod S
     |  | INSERT(key, router)
     |  | replications ← replications - 1

where:

    - BW is a function for extracting the value of an OR's `w bandwidth=` weight line from the consensus,
    - GUARDLIST is either UTOPIC_GUARDLIST or DYSTOPIC_GUARDLIST,
    - CONSENSUS_WEIGHT_FRACTION is a function for computing a router's consensus weight in relation to the summation of consensus weights (bw_weight_total),
    - T is some arbitrary number for translating a router's consensus weight fraction into the number of replications,
    - H is some collision-resistant hash digest,
    - S is the total possible hash space of H (e.g. for SHA-1, with digest sizes of 160 bits, this would be 2^160),
    - HMAC is a keyed message authentication code which utilises H,
    - ID is a hexadecimal string containing the hash of the router's public identity key,
    - X is some (arbitrary) number of bytes to (optionally) truncate the output of the HMAC to,
    - A[:X] signifies truncation of A, some array of bytes, to a sub-array containing X bytes, starting from the first byte and continuing up to and including the Xth byte, such that the returned sub-array is X bytes in length,
    - INSERT is an algorithm for inserting items into the hashring,
    - TOINT converts hexadecimal to decimal integers.

For routers A and B, where B has a little bit more bandwidth than A, this gets you a hashring which looks like this:

          B-´¯¯`-BA
       A,`        `.
       /            \
     B|              |B
       \            /
        `.        ,´A
        AB--__--´B

When B disappears, A remains in the same positions:

          _-´¯¯`-_A
       A,`        `.
       /            \
      |              |
       \            /
        `.        ,´A
         A`--__--´

And similarly if A disappears:

          B-´¯¯`-B
        ,`        `.
       /            \
     B|              |B
       \            /
        `.        ,´
         B--__--´B

Thus, no "shifting" problems, and recalculation of the hashring when a new consensus arrives via the update() function is much more time efficient.

Alternatively, for a faster and simpler algorithm, but non-uniform distribution of the keys, one could remove the "factor" and replace the derivation of "key" in the algorithm above with:

    key ← HMAC(ID || replications)[:X]

A reference implementation in Python is available². [1]

§4. Footnotes

¹ "Dystopic" was chosen because those are the guards you should choose from if you're behind a FascistFirewall.

² One tiny caveat being that the ConsistentHashring class doesn't dynamically assign replication count by bandwidth weight; it gets initialised with the number of replications. However, nothing in the current implementation prevents you from doing:

    >>> h = ConsistentHashring('SuperSecureKey', replications=6)
    >>> h.insert(A)
    >>> h.replications = 23
    >>> h.insert(B)
    >>> h.replications = 42
    >>> h.insert(C)

§5. References

[0]: https://trac.torproject.org/projects/tor/ticket/16120
[1]: https://gitweb.torproject.org/user/isis/bridgedb.git/tree/bridgedb/hashring.py?id=949d33e8#n481
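The "no shifting" property can be checked with a small self-contained sketch. This is not the bridgedb reference implementation: the class and helper names are hypothetical, it uses the simpler key derivation HMAC(ID || replications)[:X] with HMAC-SHA1 and X = 8 bytes as assumed parameters, and it computes replication counts as FLOOR(weight fraction * T).

```python
import bisect
import hashlib
import hmac


def replication_counts(bandwidths, t):
    """FLOOR(bw / total * T) per router, in exact integer arithmetic."""
    total = sum(bandwidths.values())
    return {router: (bw * t) // total for router, bw in bandwidths.items()}


class GuardHashring:
    """Consistent hashring whose keys depend only on (router, replica)."""

    def __init__(self, secret, truncate=8):
        self.secret = secret
        self.truncate = truncate  # X: bytes kept from the HMAC output
        self.keys = []            # sorted ring positions
        self.owner = {}           # ring position -> router id

    def _key(self, router_id, replica):
        message = ("%s%d" % (router_id, replica)).encode("ascii")
        digest = hmac.new(self.secret, message, hashlib.sha1).digest()
        return int.from_bytes(digest[:self.truncate], "big")

    def insert(self, router_id, replications):
        for replica in range(1, replications + 1):
            key = self._key(router_id, replica)
            if key not in self.owner:  # skip the (unlikely) collision
                bisect.insort(self.keys, key)
                self.owner[key] = router_id

    def remove(self, router_id):
        self.keys = [k for k in self.keys if self.owner[k] != router_id]
        self.owner = {k: r for k, r in self.owner.items() if r != router_id}

    def positions(self, router_id):
        return [k for k in self.keys if self.owner[k] == router_id]

    def lookup(self, point):
        """Return the router owning the first position at or after `point`."""
        if not self.keys:
            raise LookupError("empty hashring")
        index = bisect.bisect_left(self.keys, point) % len(self.keys)
        return self.owner[self.keys[index]]


if __name__ == "__main__":
    counts = replication_counts({"A": 400, "B": 600}, t=10)
    ring = GuardHashring(b"hypothetical client secret")
    for router, replications in counts.items():
        ring.insert(router, replications)
    before = ring.positions("A")
    ring.remove("B")
    print("A unchanged after B leaves:", ring.positions("A") == before)
```

Since each position is derived from the router's own identifier and replica index, removing one router never moves another router's positions, which is the property the ring diagrams illustrate.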
Filename: 260-rend-single-onion.txt Title: Rendezvous Single Onion Services Author: Tim Wilson-Brown, John Brooks, Aaron Johnson, Rob Jansen, George Kadianakis, Paul Syverson, Roger Dingledine Created: 2015-10-17 Status: Finished Implemented-In: 0.2.9.3-alpha 1. Overview Rendezvous single onion services are an alternative design for single onion services, which trade service-side location privacy for improved performance, reliability, and scalability. Rendezvous single onion services have a .onion address identical to any other onion service. The descriptor contains the same information as the existing double onion (hidden) service descriptors. The introduction point and rendezvous protocols occur as in double onion services, with one modification: one-hop connections are made from the onion server to the introduction and rendezvous points. This proposal is a revision of the unnumbered proposal Direct Onion Services: Fast-but-not-hidden services by Roger Dingledine, and George Kadianakis at https://lists.torproject.org/pipermail/tor-dev/2015-April/008625.html It incorporates much of the discussion around hidden services since April 2015, including content from Single Onion Services (Proposal #252) by John Brooks, Paul Syverson, and Roger Dingledine. 2. 
Motivation Rendezvous single onion services are best used by sites which: * Don't require location anonymity * Would appreciate lower latency or self-authenticated addresses * Would like to work with existing tor clients and relays * Can't accept connections to an open ORPort Rendezvous single onion services have a few benefits over double onion services: * Connection latency is lower, as one-hop circuits are built to the introduction and rendezvous points, rather than three-hop circuits * Stream latency is reduced on a four-hop circuit * Less Tor network capacity is consumed by the service, as there are fewer hops (4 rather than 6) between the client and server via the rendezvous point Rendezvous single onion services have a few benefits over single onion services: * A rendezvous single onion service can load-balance over multiple rendezvous backends (see proposal #255) * A rendezvous single onion service doesn't need an accessible ORPort (it works behind a NAT, and in server enclaves that only allow outward connections) * A rendezvous single onion service is compatible with existing tor clients, hidden service directories, introduction points, and rendezvous points Rendezvous single onion services have a few drawbacks over single onion services: * Connection latency is higher, as one-hop circuits are built to the introduction and rendezvous points. Single onion services perform one extend to the single onion service's ORPort only It should also be noted that, while single onion services receive many incoming connections from different relays, rendezvous single onion services make many outgoing connections to different relays. This should be taken into account when planning the connection capacity of the infrastructure supporting the onion service. Rendezvous single onion services are not location hidden on the service side, but clients retain all of the benefits and privacy of onion services. 
(The rationale for the 'single' and 'double' nomenclature is described in section 7.4 of proposal #252.) We believe that it is important for the Tor community to be aware of the alternative single onion service designs, so that we can reach consensus on the features and tradeoffs of each design. However, we recognise that each additional flavour of onion service splits the anonymity set of onion service users. Therefore, it may be best for user anonymity that not all designs are adopted, or that mitigations are implemented along with each additional flavour. (See sections 8 & 9 for a further discussion.) 3. Onion descriptors The rendezvous single onion descriptor format is identical to the double onion descriptor format. 4. Reaching a rendezvous single onion service as a client Clients reach rendezvous single onion services in an identical fashion to double onion services. The rendezvous design means that clients do not know whether they are talking to a double or rendezvous single onion service, unless that service tells them. (This may be a security issue.) However, the use of a four-hop path between client and rendezvous single onion service may be statistically distinguishable. (See section 8 for further discussion of security issues.) (Please note that this proposal follows the hop counting conventions in the tor source code. A circuit with a single connection between the client and the endpoint is one-hop; a circuit with 4 connections (and 3 nodes) between the client and endpoint is four-hop.) 5. Publishing a rendezvous single onion service To act as a rendezvous single onion service, a tor instance (or cooperating group of tor instances) must: * Publish onion descriptors in the same manner as any onion service, using three-hop circuits. This avoids service blocking by IP address. Proposal #224 (next-generation hidden services) avoids blocking by onion address. 
* Perform the rendezvous protocol in the same manner as a double onion service, but make the intro and rendezvous circuits one-hop. (This may allow intro and rendezvous points to block the service.) 5.1. Configuration options 5.1.1 RendezvousSingleOnionServiceNonAnonymousServer The tor instance operating a rendezvous single onion service must make one-hop circuits to the introduction and rendezvous points: RendezvousSingleOnionServiceNonAnonymousServer 0|1 If set, make one-hop circuits between the Rendezvous Single Onion Service server, and the introduction and rendezvous points. This option makes every onion service instance hosted by this tor instance a Rendezvous Single Onion Service. (Default: 0) Because of the grave consequences of misconfiguration here, we have added 'NonAnonymous' to the name of the torrc option. Furthermore, Tor MUST issue a startup warning message to operators of the onion service if this feature is enabled. [Should the name start with 'NonAnonymous' instead?] As RendezvousSingleOnionServiceNonAnonymousServer modifies the behaviour of every onion service on a tor instance, it is impossible to run hidden (double onion) services and rendezvous single onion services on the same tor instance. This is considered a feature, as it prevents hidden services from being discovered via rendezvous single onion services on the same tor instance. 5.1.2 Recommended Additional Options: Correctness Based on the experiences of Tor2Web with one-hop paths, operators should consider using the following options with every rendezvous single onion service, and every single onion service: UseEntryGuards 0 One-hop paths do not use entry guards. This also deactivates the entry guard pathbias code, which is not compatible with one-hop paths. Entry guards are a security measure against Sybil attacks. Unfortunately, they also act as the bottleneck of busy onion services and overload those Tor relays. 
LearnCircuitBuildTimeout 0 Learning circuit build timeouts is incompatible with one-hop paths. It also creates additional, unnecessary connections. Perhaps these options should be set automatically on (rendezvous) single onion services. Tor2Web sets these options automatically: UseEntryGuards 0 LearnCircuitBuildTimeout 0 5.1.3 Recommended Additional Options: Performance LongLivedPorts The default LongLivedPorts setting creates additional, unnecessary connections. This specifies no long-lived ports (the empty list). PredictedPortsRelevanceTime 0 seconds The default PredictedPortsRelevanceTime setting creates additional, unnecessary connections. High-churn / quick-failover RSOS using descriptor competition strategies should consider setting the following option: RendPostPeriod 600 seconds Refresh onion service descriptors, choosing an interval between 0 and 2*RendPostPeriod. Tor also posts descriptors on bootstrap, and when they change. (Strictly, 30 seconds after they first change, for descriptor stability.) XX - Reduce the minimum RendPostPeriod for RSOS to 1 minute? XX - Make the initial post 30 + rand(1*rendpostperiod) ? (Avoid thundering herd, but don't hide startup time) However, we do NOT recommend setting the following option to 1, unless bug #17359 is resolved so tor onion services can bootstrap without predicted circuits. __DisablePredictedCircuits 0 This option disables all predicted circuits. It is equivalent to: LearnCircuitBuildTimeout 0 LongLivedPorts PredictedPortsRelevanceTime 0 seconds And turning off hidden service server preemptive circuits, which is currently unimplemented (#17360) 5.1.4 Recommended Additional Options: Security We recommend that no other services are run on a rendezvous single onion service tor instance. Since tor runs as a client (and not a relay) by default, rendezvous single onion service operators should set: XX - George says we don't allow operators to run HS/Relay any more, or that we warn them. 
SocksPort 0 Disallow connections from client applications to the tor network via this tor instance. ClientOnly 1 Even if the defaults file configures this instance to be a relay, never relay any traffic or serve any descriptors. 5.2. Publishing descriptors A single onion service must publish descriptors in the same manner as any onion service, as defined by rend-spec. 5.3. Authorization Client authorization for a rendezvous single onion service is possible via the same methods used for double onion services. 6. Related Proposals, Tools, and Features 6.1. Load balancing High capacity services can distribute load and implement failover by: * running multiple instances that publish to the same onion service directories, * publishing descriptors containing multiple introduction points (OnionBalance), * publishing different introduction points to different onion service directories (OnionBalance upcoming(?) feature), * handing off rendezvous to a different tor instance via control port messages (proposal #255), or by a combination of these methods. 6.2. Ephemeral single onion services (ADD_ONION) The ADD_ONION control port command could be extended to support ephemerally configured rendezvous single onion services. Given that RendezvousSingleOnionServiceNonAnonymousServer modifies the behaviour of all onion services on a tor instance, if it is set, any ephemerally configured onion service should become a rendezvous single onion service. 6.3. Proposal 224 ("Next-Generation Hidden Services") This proposal is compatible with proposal 224, with onion services acting just like a next-generation hidden service, but making one-hop paths to the introduction and rendezvous points. 6.4. Proposal 246 ("Merging Hidden Service Directories and Intro Points") This proposal is compatible with proposal 246. The onion service will publish its descriptor to the introduction points in the same manner as any other onion service. 
Clients will use the merged hidden service directory and introduction point just as they do for other onion services. 6.5. Proposal 252 ("Single Onion Services") This proposal is compatible with proposal 252. The onion service will publish its descriptor to the introduction points in the same manner as any other onion service. Clients can then choose to extend to the single onion service, or continue with the rendezvous protocol. Running a rendezvous single onion service and single onion service allows older clients to connect via rendezvous, and newer clients to connect via extend. This is useful for the transition period where not all clients support single onion services. 6.6. Proposal 255 ("Hidden Service Load Balancing") This proposal is compatible with proposal 255. The onion service will perform the rendezvous protocol in the same manner as any other onion service. Controllers can then choose to handoff the rendezvous point connection to another tor instance, which should also be configured as a rendezvous single onion service. 7. Considerations 7.1 Modifying RendezvousSingleOnionServiceNonAnonymousServer at runtime Implementations should not reuse introduction points or introduction point circuits if the value of RendezvousSingleOnionServiceNonAnonymousServer is different than it was when the introduction point was selected. This is because these circuits will have an undesirable length. There is specific code in tor that preserves introduction points on a HUP, if RendezvousSingleOnionServiceNonAnonymousServer has changed, all circuits should be closed, and all introduction points must be discarded. 7.2 Delaying connection expiry Tor clients typically expire connections much faster than tor relays [citation needed]. (Rendezvous) single onion service operators may find that keeping connections open saves on connection latency. However, it may also place an additional load on the service. 
(This could be implemented by increasing the configured connection expiry time.)

7.3. (No) Benefit to also running a Tor relay

In tor Trac ticket #8742, running a relay and hidden onion service on the same tor instance was disabled for security reasons. While there may be benefits to running a relay on the same instance as a rendezvous single onion service (existing connections mean lower latency, it helps the tor network overall), a security analysis of this configuration has not yet been performed. In addition, a potential drawback is overloading a busy single onion service.

7.4 Predicted circuits

We should look at whether we can further optimize the predicted circuits that Tor makes as an onion service for this mode.

8. Security Implications

8.1 Splitting the Anonymity Set

Each additional flavour of onion service, and each additional externally visible onion service feature, provides opportunities for fingerprinting. Also, each additional type of onion service shrinks the anonymity set for users of double onion (hidden) services who require server location anonymity. These users benefit from the cover provided by current users of onion services, who use them for client anonymity, self-authentication, NAT-punching, or other benefits.

For this reason, features that shrink the double onion service anonymity set should be carefully considered. The benefits and drawbacks of additional features also often depend on a particular threat model.

It may be that a significant number of users and sites adopt (rendezvous) single onion services due to their benefits. This could increase the traffic on the tor network, therefore increasing anonymity overall. However, the unique behaviour of each type of onion service may still be distinguishable on both the client and server ends of the connection.
8.2 Hidden Service Designs can potentially be more secure As a side-effect, by optimizing for performance in this feature, it allows us to lean more heavily towards security decisions for regular onion services. 8.3 One-hop onion service paths may encourage more attacks There's a possible second-order effect here since both RSOS and double onion services will have foo.onion addresses and it's not clear based on the address which one the service uses: if *some* .onion addresses are easy to track down, are we encouraging adversaries to attack all rendezvous points just in case? 9. Further Work Further proposals or research could attempt to mitigate the anonymity-set splitting described in section 8. Here are some initial ideas. 9.1 Making Client Exit connections look like Client Onion Service Connections A mitigation to this fingerprinting is to make each (or some) exit connections look like onion service connections. This provides cover for particular types of onion service connections. Unfortunately, it is not possible to make onion service connections look like exit connections, as there are no suitable dummy servers to exit to on the Internet. 9.1.1 Making Client Exit connections perform Descriptor Downloads (Some) exit connections could perform a dummy descriptor download. (However, descriptors for recently accessed onion services are cached, so dummy downloads should only be performed occasionally.) Exit connections already involve a four-hop "circuit" to the server (including the connection between the exit and the server on the Internet). The server on the Internet is not included in the consensus. Therefore, this mitigation would effectively cover single onion services which are not relays. 9.1.2 Making Client Exit connections perform the Rendezvous Protocol (Some) exit connections could perform a dummy rendezvous protocol. 
Exit connections already involve a four-hop "circuit" to the server (including the connection between the exit and the server on the Internet). Therefore, this mitigation would effectively cover rendezvous single onion services, as long as a dummy descriptor download was also performed occasionally. 9.1.3 Making Single Onion Service rendezvous points perform name resolution Currently, Exits perform DNS name resolution, and changing this behaviour would cause unacceptable connection latency. Therefore, we could make onion service connections look like exit connections by making the rendezvous point do name resolution (that is, descriptor fetching), and, if needed, the introduction part of the protocol. This could potentially *reduce* the latency of single onion service connections, depending on the length of the paths used by the rendezvous point. However, this change makes rendezvous points almost as powerful as Exits, a careful security analysis will need to be performed before this is implemented. There is also a design issue with rendezvous name resolution: a client wants to leave resolution (descriptor download) to the RP, but it doesn't know whether it can use the exit-like protocol with an RP until it has downloaded the descriptor. This might mean that single onion services of both flavours need a different address style or address namespace. We could use .single.onion or something. (This would require an update to the HSDir code.) 9.2 Performing automated and common queries over onion services Tor could create cover traffic for a flavour of onion service by performing automated or common queries via an onion service of that type. In addition, onion service-based checks have security benefits over DNS-based checks. 
See Genuine Onion, Syverson and Boyce, 2015, at http://www.nrl.navy.mil/itd/chacs/syverson-genuine-onion-simple-fast-flexible-and-cheap-website-authentication Here are some examples of automated queries that could be performed over an onion service: 9.2.1 torcheck over onion service torcheck ("Congratulations! This browser is configured to use Tor.") could be retrieved from an onion service. Incidentally, this would resolve the exitmap issues in #17297, but it would also fail to check that exit connections work, which is important for many Tor Browser users. 9.2.2 Tor Browser version checks over onion service Running tor browser version checks over an onion service seems to be an excellent use-case for onion services. It would also have the Tor Project "eating its own dogfood", that is, using onion services for its essential services. 9.2.3 Tor Browser downloads over onion service Running tor browser downloads over an onion service might require some work on the onion service codebase to support high loads, load-balancing, and failover. It is a good use case for a (rendezvous) single onion service, as the traffic over the tor network is only slightly higher than for Tor Browser downloads over tor. (4 hops for [R]SOS, 3 hops for Exit.) 9.2.4 SSL Observatory submissions over onion service HTTPS certificates could be submitted to HTTPS Everywhere's SSL Observatory over an onion service. This option is disabled in Tor Browser by default. Perhaps some users would be more comfortable enabling submission over an onion service, due to the additional security benefits.
Filename: 261-aez-crypto.txt
Title: AEZ for relay cryptography
Author: Nick Mathewson
Created: 28 Oct 2015
Status: Obsolete

0. History

I wrote the first draft of this around October. This draft takes a more concrete approach to the open questions from last time around.

1. Summary and preliminaries

This proposal describes an improved algorithm for circuit encryption, based on the wide-block SPRP AEZ. I also describe the attendant bookkeeping, including CREATE cells, and several variants of the proposal.

For more information about AEZ, see http://web.cs.ucdavis.edu/~rogaway/aez/

For motivations, see proposal 202.

2. Specifications

2.1. New CREATE cell types.

We add a new CREATE cell type that behaves as an ntor cell but which specifies that the circuit will be created to use this mode of encryption.

[TODO: Can/should we make this unobservable?]

The ntor handshake is performed as usual, but a different PROTOID is used:

    "ntor-curve25519-sha256-aez-1"

To derive keys under this handshake, we use SHAKE256 to derive the following output:

    struct shake_output {
        u8 aez_key[48];
        u8 chain_key[32];
        u8 chain_val_forward[16];
        u8 chain_val_backward[16];
    };

The first two fields are constant for the lifetime of the circuit.

2.2. New relay cell payload

We specify the following relay cell payload format, to be used when the exit node circuit hop was created with the CREATE format in 2.1 above:

    struct relay_cell_payload {
        u32 zero_1;
        u16 zero_2;
        u16 stream_id;
        u16 length IN [0..498];
        u8 command;
        u8 data[498]; // payload_len - 11
    };

Note that the payload length is unchanged. The fields are now rearranged to be aligned. The 'recognized' and 'length' fields are replaced with zero_1, zero_2, and the high 7 bits of length, for a minimum of 55 bits of unambiguous verification. (Additional verification can be done by checking the other fields for correctness; AEZ users can exploit plaintext redundancy for additional cryptographic checking.)
When encrypting a cell for a hop that was created using one of these circuits, clients and relays encrypt them using the AEZ algorithm with the following parameters:

    Let Chain denote chain_val_forward if this is a forward cell
    or chain_val_backward otherwise.

    tau = 0

    # We set tau=0 because we want no per-hop ciphertext expansion. Instead
    # we use redundancy in the plaintext to authenticate the data.

    Nonce =
      struct {
        u64 cell_number;
        u8 is_forward;
        u8 is_early;
      }

    # The cell number is the number of relay cells that have
    # traveled in this direction on this circuit before this cell.
    # ie, it's zero for the first cell, one for the second, etc.
    #
    # is_forward is 1 for outbound cells, 0 for inbound cells.
    # is_early is 1 for cells packaged as RELAY_EARLY, 0 for
    # cells packaged as RELAY.
    #
    # Technically these two values would be more at home in AD
    # than in Nonce; but AEZ doesn't actually distinguish N and AD
    # internally.

    Define CELL_CHAIN_BYTES = 32

    AD = [ XOR(prev_plaintext[:CELL_CHAIN_BYTES],
               prev_ciphertext[:CELL_CHAIN_BYTES]),
           Chain ]

    # Using the previous cell's plaintext/ciphertext as additional data
    # guarantees that any corrupt ciphertext received will corrupt the
    # plaintext, which will corrupt all future plaintexts.

    Set Chain = AES(chain_key, Chain) xor Chain.

    # This 'chain' construction is meant to provide forward
    # secrecy. Each chain value is replaced after each cell with a
    # (hopefully!) hard-to-reverse construction.

This instantiates a wide-block cipher, tweaked based on the cell index and direction. It authenticates part of the previous cell's plaintext, thereby ensuring that if the previous cell was corrupted, this cell will be unrecoverable.

3. Design considerations

3.1. Wide-block pros and cons?

See proposal 202, section 4.

3.2. Given wide-block, why AEZ?

It's a reasonably fast, probably secure wide-block cipher. In particular, it's performance-competitive with AES_CTR, and far better than what we're doing now. See performance appendix.
It seems secure-ish too. Several cryptographers I know seem to think it's likely secure enough, and almost surely at least as good as AES.

[There are many other competing wide-block SPRP constructions if you like. Many require that the input be an integer number of blocks, or aren't tweakable. Some are slow. Do you know a good one?]

3.3. Why _not_ AEZ?

There are also some reasons to consider avoiding AEZ, even if we do decide to use a wide-block cipher.

FIRST, it is complicated to implement. As the specification says, "The easiness claim for AEZ is with respect to ease and versatility of use, not implementation."

SECOND, it's still more complicated to implement well (fast, side-channel-free) on systems without AES acceleration. We'll need to pull the round functions out of fast assembly AES, which is everybody's favorite hobby.

THIRD, it's really horrible to try to do it in hardware.

FOURTH, it is comparatively new. Although several cryptographers like it, and it is closely related to a system with a security proof, you never know.

FIFTH, something better may come along.

4. Alternative designs

4.1. Two keys.

We could have a separate AEZ key for forward and backward encryption. This would use more space, however.

4.2. Authenticating things differently

In computing the AD, we could replace xor with concat.

In computing the AD, we could replace CELL_CHAIN_BYTES with 16, or 509.

(Another thing we might dislike about the current proposal is that it appears to require us to remember 32 bytes of plaintext until we get another cell. But that part is fixable: note that in the structure of AEZ, the AD is processed in the AEZ-hash() function, and then no longer used. We can compute the AEZ-hash() to be used for the next cell after each cell is en/de crypted.)

4.3. Other hashes.

We could update the ntor definition used in this to use a better hash than SHA256 inside.

4.4. Less predictable plaintext.
A positively silly option would be to reserve the last X bytes of each relay cell's plaintext for random bytes, if they are not used for payload. This might help a little, in a really doofy way.

A.1. Performance notes: memory requirements

Let's ignore Tor overhead here, but not openssl overhead.

IN THE CURRENT PROTOCOL, the total memory required at each relay is: 2 sha1 states, 2 aes states.

Each sha1 state uses 96 bytes. Each aes state uses 244 bytes. (Plus 32 bytes counter-mode overhead.) This works out to 704 bytes at each hop.

IN THE PROTOCOL ABOVE, using an optimized AEZ implementation, we'll need 128 bytes for the expanded AEZ key schedule. We'll need another 244 bytes for the AES key schedule for the chain key. And there's 32 bytes of chaining values. This gives us 404 bytes at each hop, for a savings of 42%.

If we used separate AES and AEZ keys in each direction, we would be looking at 776 bytes, for a space increase of 10%.

A.2. Performance notes: CPU requirements on AESNI hosts

The cell_ops benchmark in bench.c purports to tell us how long it takes to encrypt a tor cell. But it wasn't really telling the truth, since it only did one SHA1 operation every 2^16 cells, when entries and exits really do one SHA1 operation every end-to-end cell.

I expanded it to consider the slow (SHA1) case as well. I ran this on my friendly OSX laptop (2.9 GHz Intel Core i5) with AESNI support:

    Inbound cells: 169.72 ns per cell.
    Outbound cells: 175.74 ns per cell.
    Inbound cells, slow case: 934.42 ns per cell.
    Outbound cells, slow case: 928.23 ns per cell.

Note that for an n-hop circuit, each cell does the slow case and (n-1) fast cases at the entry; the slow case at the exit, and the fast case at each middle relay.

So for 3 hop circuits, the total current load on the network is roughly 425 ns per hop, concentrated at the exit.

Then I started messing around with AEZ benchmarks, using the aesni-optimized version of AEZ on the AEZ website.
(Further optimizations are probably possible.) For the AES256, I used the usual aesni-based aes construction.

Assuming xor is free in comparison to other operations, and CELL_CHAIN_BYTES=32, I get roughly 270 ns per cell for the entire operation.

If we were to pick CELL_CHAIN_BYTES=509, we'd be looking at around 303 ns per cell.

If we were to pick CELL_CHAIN_BYTES=509 and replace XOR with CONCAT, it would be around 355 ns per cell.

If we were to keep CELL_CHAIN_BYTES=32, and remove the AES256-chaining, I see values around 260 ns per cell.

(These are all very spotty measurements, with some xors left off, and not much effort put into optimization beyond what the default optimized AEZ does today.)

A.3. Performance notes: what if we don't have AESNI?

Here I'll test on a host with sse2 and ssse3, but no aesni instructions. From Tor's benchmarks I see:

    Inbound cells: 1218.96 ns per cell.
    Outbound cells: 1230.12 ns per cell.
    Inbound cells, slow case: 2099.97 ns per cell.
    Outbound cells, slow case: 2107.45 ns per cell.

For a 3-hop circuit, that gives on average 1520 ns per cell.

[XXXX Do a real benchmark with a fast AEZ backend. First, write one. Preliminary results are a bit disappointing, though, so I'll need to investigate alternatives as well.]
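The per-hop memory figures in appendix A.1 can be rechecked with a few lines of arithmetic. The sketch below takes the 704-byte figure for the current protocol as given, and assumes that "separate AES and AEZ keys in each direction" means two AEZ key schedules and two AES key schedules plus the same 32 bytes of chaining values. The constant and function names are illustrative only.

```python
# Per-hop memory arithmetic from appendix A.1 (names are illustrative).
CURRENT_BYTES_PER_HOP = 704   # figure quoted in A.1, taken as given
AEZ_KEY_SCHEDULE = 128        # expanded AEZ key schedule
AES_KEY_SCHEDULE = 244        # AES key schedule for the chain key
CHAIN_VALUES = 32             # forward + backward chaining values


def percent_change(new: int, old: int) -> float:
    """Signed percentage change from old to new (negative means savings)."""
    return 100.0 * (new - old) / old


# One AEZ key and one chain key shared by both directions.
shared_keys = AEZ_KEY_SCHEDULE + AES_KEY_SCHEDULE + CHAIN_VALUES

# Assumption: separate keys per direction doubles both key schedules.
separate_keys = 2 * (AEZ_KEY_SCHEDULE + AES_KEY_SCHEDULE) + CHAIN_VALUES

if __name__ == "__main__":
    print(shared_keys, percent_change(shared_keys, CURRENT_BYTES_PER_HOP))
    print(separate_keys, percent_change(separate_keys, CURRENT_BYTES_PER_HOP))
```

Under these assumptions the totals come out to 404 and 776 bytes, matching the roughly 42% savings and 10% increase quoted in A.1.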
Filename: 262-rekey-circuits.txt
Title: Re-keying live circuits with new cryptographic material
Author: Nick Mathewson
Created: 28 Dec 2015
Status: Reserve

[NOTE: This proposal is in "Reserve" status because the issue it addresses should be solved by any future relay encryption protocol. (2020 July 31)]

1. Summary and Motivation

Cryptographic primitives have an upper limit on how much data should be encrypted with the same key. But currently Tor circuits have no upper limit on how much data they will deliver.

While the upper limit of our AES-CTR crypto is ridiculously high (on the order of hundreds of exabytes), the AEZ crypto we're considering suggests we should rekey after the equivalent in cells of around 280 TB. 280 TB is still high, but not ridiculously high.

So in this proposal I explain a general mechanism for rekeying a circuit. We shouldn't actually build this unless we settle on

2. RELAY_REKEY cell operation

To rekey, the circuit initiator ("client") can send a new RELAY_REKEY cell type:

    struct relay_rekey {
        u16 rekey_method IN [0, 1];
        u8 rekey_data[];
    }
    const REKEY_METHOD_ACK = 0;
    const REKEY_METHOD_SHAKE256_CLIENT = 1;

This cell means "I am changing the key." The new key material will be derived from SHAKE256 of the aez_key concatenated with the rekey_data field, to fill a new shake_output structure. The client should set rekey_data at random.

After sending one of these RELAY_REKEY cells, the client uses the new aez_key to encrypt all of its data to this hop, but retains the old aez_key for decrypting the data coming back from the relay.

When the relay receives a RELAY_REKEY cell, it sends a RELAY_REKEY cell back towards the client, with empty rekey_data, and rekey_method==0, and then updates its own key material for all additional data it sends and receives to the client.

When the client receives this reply, it can discard the old AEZ key, and begin decrypting subsequent inbound cells with the new key.
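The derivation described above, SHAKE256 over the current aez_key concatenated with rekey_data, filling a new shake_output structure, could look roughly like the following. This is a minimal sketch, not Tor code: the field sizes come from the shake_output structure in proposal 261, the 32-byte length for rekey_data follows the "32-byte random input" mentioned in section 2.1 of this proposal, and the names ShakeOutput, derive_rekeyed_material, and new_rekey_data are hypothetical.

```python
import hashlib
import os
from typing import NamedTuple

# Field sizes from the shake_output structure in proposal 261, section 2.1.
AEZ_KEY_LEN = 48
CHAIN_KEY_LEN = 32
CHAIN_VAL_LEN = 16
SHAKE_OUTPUT_LEN = AEZ_KEY_LEN + CHAIN_KEY_LEN + 2 * CHAIN_VAL_LEN  # 112 bytes


class ShakeOutput(NamedTuple):
    """Hypothetical container mirroring struct shake_output."""
    aez_key: bytes
    chain_key: bytes
    chain_val_forward: bytes
    chain_val_backward: bytes


def derive_rekeyed_material(old_aez_key: bytes, rekey_data: bytes) -> ShakeOutput:
    """Compute SHAKE256(aez_key || rekey_data) and split it into fields."""
    if len(old_aez_key) != AEZ_KEY_LEN:
        raise ValueError("aez_key must be 48 bytes")
    out = hashlib.shake_256(old_aez_key + rekey_data).digest(SHAKE_OUTPUT_LEN)
    a = AEZ_KEY_LEN
    b = a + CHAIN_KEY_LEN
    c = b + CHAIN_VAL_LEN
    return ShakeOutput(out[:a], out[a:b], out[b:c], out[c:])


def new_rekey_data(length: int = 32) -> bytes:
    """Random rekey_data chosen by the client (length is an assumption)."""
    return os.urandom(length)
```

Both sides would run the same derivation on the same inputs, so the relay arrives at the same new key material as the client once it has read rekey_data from the RELAY_REKEY cell.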
So in summary: the client sends a series of cells encrypted with the old key, and then sends a REKEY cell, followed by relay cells encrypted with the new key: OldKey[data data data ... data rekey] NewKey[data data data...] And after the server receives the REKEY cell, it stops sending relay cells encrypted with the old keys, sends its own REKEY cell with the ACK method, and starts sending cells encrypted with the new key. REKEY arrives somewhere in here I V OldKey[data data data data rekey-ack] NewKey[data data data ...] 2.1. Supporting other cryptography types Each relay cipher must specify its own behavior in the presence of a REKEY cell of each type that it supports. In general, the behavior of method 1 ("shake256-client") is "regenerate keys as if we were calling the original KDF after a CREATE handshake, using SHAKE256 on our current static key material and on a 32-byte random input." The behavior of any unsupported REKEY method must be to close the circuit with an error. The AES-CTR relay cell crypto doesn't support rekeying. See 3.2 below if you disagree. 2.2. How often to re-key? Clients should follow a deterministic algorithm in deciding when to re-key, so as not to leak client differences. This algorithm should be type-specific. For AEZ, I recommend that clients conservatively rekey every 2**32 cells (about 2 TB). And to make sure that this code actually works, the schedule should be after 2**15 cells, and then every 2**32 cells thereafter. It may be beneficial to randomize these numbers. If so, let's try subtracting between 0 and 25% at random. 2.3. How often to allow re-keying? We could define a lower bound to prevent too-frequent rekeying. I'm not sure I see the point here; the process described above is not that costly. 3. Alternative designs 3.1. Should we add some public key cryptography here? We could change the body of a REKEY cell and its ack to be more like CREATE/CREATED. 
Then we'd have to add a third step from the client to the server to acknowledge receipt of the 'CREATED' cell and changing of the key. So, what would this added complexity and computational load buy us? It would defend against the case where an adversary had compromised the shared key material for a circuit, but was not able to compromise the rekey process. I'm not sure that this is reasonable; the likeliest cases I can think of here seem to be "get compromised, stay compromised" for a circuit. 3.2. Hey, could we use this for forward secrecy with AES-CTR? We could, but the best solution to AES-CTR's limitations right now is to stop using our AES-CTR setup. Anything that supports REKEY will also presumably support AEZ or something better. 3.3. We could upgrade ciphers with this! Yes we could. We could define this not only to change the key, but to upgrade to a better ciphersuite. For example, we could start by negotiating AES-CTR, and then "secretly" upgrade to AEZ. I'm not sure that's worth the complexity, or that it would really be secret in the presence of traffic analysis.
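The rekeying schedule from section 2.2 (a first rekey after 2**15 cells, then one every 2**32 cells, optionally shortened by a random 0 to 25%) can be sketched as below. This is an illustration under stated assumptions: the 2**32 interval is counted from the previous rekey rather than from the start of the circuit, the random reduction is drawn independently for each interval, and the function name rekey_points is hypothetical.

```python
import random
from typing import List, Optional

FIRST_REKEY_AFTER = 2 ** 15   # early rekey, to exercise the code path
REKEY_INTERVAL = 2 ** 32      # conservative AEZ interval (about 2 TB)
MAX_REDUCTION = 0.25          # subtract between 0 and 25% at random


def rekey_points(count: int, rng: Optional[random.Random] = None) -> List[int]:
    """Return the cumulative cell counts at which the first `count` rekeys occur.

    With rng=None the schedule is fully deterministic; with an rng, each
    interval is shortened by a random fraction in [0, MAX_REDUCTION].
    """
    points = []
    position = 0
    for i in range(count):
        interval = FIRST_REKEY_AFTER if i == 0 else REKEY_INTERVAL
        if rng is not None:
            interval -= int(interval * rng.uniform(0.0, MAX_REDUCTION))
        position += interval
        points.append(position)
    return points


if __name__ == "__main__":
    print(rekey_points(3))
    print(rekey_points(3, random.Random()))
```

A real client would use a cryptographically strong random source rather than the random module; it is used here only to keep the sketch short.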
Filename: 263-ntru-for-pq-handshake.txt Title: Request to change key exchange protocol for handshake v1.2 Author: John SCHANCK, William WHYTE and Zhenfei ZHANG Created: 29 Aug 2015 Updated: 4 Feb 2016 Status: Obsolete This proposal was made obsolete by proposal #269. 1. Introduction Recognized handshake types are: 0x0000 TAP -- the original Tor handshake; 0x0001 reserved 0x0002 ntor -- the ntor+curve25519+sha256 handshake; Request for a new (set of) handshake type: 0x010X ntor+qsh -- the hybrid of ntor+curve25519+sha3 handshake and a quantum-safe key encapsulation mechanism where 0X0101 ntor+qsh -- refers to this modular design; no specific Key Encapsulation Mechanism (KEM) is assigned. 0X0102 ntor+ntru -- the quantum safe KEM is based on NTRUEncrypt, with parameter ntrueess443ep2 0X0103 ntor+rlwe -- the quantum safe KEM is based on ring learning with error encryption scheme; parameter not specified DEPENDENCY: Proposal 249: Allow CREATE cells with >505 bytes of handshake data 1.1 Motivation: Quantum-safe forward-secure key agreement We are trying to add Quantum-safe forward-secrecy to the key agreement in tor handshake. (Classical) forward-secrecy means that if the long-term key is compromised, the communication prior to this compromise still stays secure. Similarly, Quantum-safe forward-secrecy implies if the long-term key is compromised due to attackers with quantum-computing capabilities, the prior communication still remains secure. Current approaches for handling key agreement, for instance the ntor handshake protocol, do not have this feature. ntor uses ECC, which will be broken when quantum computers become available. This allows the simple yet very effective harvest-then-decrypt attack, where an adversary with significant storage capabilities harvests Tor handshakes now and decrypts them in the future. 
The proposed handshake protocol achieves quantum-safe forward-secrecy and stops those attacks by introducing a secondary short-term pre-master secret that is transported via a quantum-safe method. In the case where the long-term key is compromised via quantum algorithm, the attacker still needs to recover the second pre-master secret to be able to decrypt the communication. 1.2 Motivation: Allowing plug & play for quantum-safe encryption algorithms We would like to be conservative on the selection of quantum-safe encryption algorithm. For this purpose, we propose a modular design that allows any quantum-safe encryption algorithm to be included in this handshake framework. We will illustrate the proposal with NTRUEncrypt encryption algorithm. 2. Proposal 2.1 Overview In Tor, authentication is one-way in the authenticated key-exchange protocol. This proposed new handshake protocol is consistent with that approach. We aim to provide quantum-safe forward-secrecy and modular design to the Tor handshake, with the minimum impact on the current version. We aim to use as many existing mechanisms as possible. For purposes of comparison, proposed modifications are indicated with * at the beginning of the corresponding line, the original approaches in ntor are marked with # when applicable. In order to enable variant quantum-safe algorithms for Tor handshake, we propose a modular approach that allows any quantum-safe encryption algorithm to be adopted in this framework. Our approach is a hybridization of ntor protocol and a KEM. We instantiate this framework with NTRUEncrypt, a lattice-based encryption scheme that is believed to be quantum resistant. This framework is expandable to other quantum-safe encryptions such as Ring Learning with Error (R-LWE) based schemes. 
2.1.1 Achieved Property:

  1) The proposed key exchange method is quantum-safe forward-secure: two secrets are exchanged, one protected by ECC, one protected by NTRUEncrypt, and then put through the native Tor Key Derivation Function (KDF) to derive the encryption and authentication keys. Both secrets are protected with one-time keys for their respective public key algorithms.

  2) The proposed key exchange method provides one-way authentication: The server is authenticated, while the client remains anonymous.

  3) The protocol is at least as secure as ntor. In the case where the quantum-safe encryption algorithm fails, the protocol is identical to the ntor protocol.

2.1.2 General idea:

When a client wishes to establish a one-way authenticated key K with a server, a session key is established through the following steps:

  1) Establish a common secret E (classical cryptography, i.e., ECC) using a one-way authenticated key exchange protocol. #ntor currently uses this approach#;

  2) Establish a common "parallel" secret P using a key encapsulation mechanism similar to TLS_RSA. In this feature request we use NTRUEncrypt as an example.

  3) Establish a new session key k = KDF(E|P, info, i), where KDF is a Key Derivation Function.

2.1.3 Building Blocks

  1) ntor: ECDH-type key agreement protocol with one-way authentication; ##existing approach: See 5.1.4 tor-spec.txt##

  2) A quantum-safe encryption algorithm: we use QSE to refer to the quantum-safe encryption algorithm, and use NTRUEncrypt as our example; **new approach**

  3) SHA3-256 hash function (see FIPS 202), and SHAKE256 KDF; ##previous approach: HMAC-based Extract-and-Expand KDF-RFC5869##

2.2 The protocol

2.2.1 Initialization

  H(x,t) as SHA3-256 with message x and key t.
  H_LENGTH = 32
  ID_LENGTH = 20
  G_LENGTH = 32
* QSPK_LENGTH = XXX    length of QSE public key
* QSC_LENGTH = XXX     length of QSE cipher

* PROTOID = "ntor-curve25519-sha3-1-[qseid]"
#pre PROTOID = "ntor-curve25519-sha256-1"
  t_mac = PROTOID | ":mac"
  t_key = PROTOID | ":key_extract"
  t_verify = PROTOID | ":verify"

These three variables define three different cryptographic hash functions:

  hash1 = H(*, t_mac);
  hash2 = H(*, t_key);
  hash3 = H(*, t_verify);

  MULT(A,b) = the multiplication of the curve25519 point 'A' by the scalar 'b'.
  G = The preferred base point for curve25519
  KEYGEN() = The curve25519 key generation algorithm, returning a private/public keypair.
  m_expand = PROTOID | ":key_expand"

  curve25519
    b, B = KEYGEN();

* QSH
*   QSSK,QSPK = QSKEYGEN();
*   cipher = QSENCRYPT (*, PK);
*   message = QSDECRYPT (*, SK);

2.2.2 Handshake

To perform the handshake, the client needs to know an identity key digest for the server, and an ntor onion key (a curve25519 public key) for that server. Call the ntor onion key "B".
The client generates a temporary key pair:

    x, X = KEYGEN();

and a QSE temporary key pair:

  * QSSK, QSPK = QSKEYGEN();

================================================================================
and generates a client-side handshake with contents:

    NODEID     Server identity digest  [ID_LENGTH bytes]
    KEYID      KEYID(B)                [H_LENGTH bytes]
    CLIENT_PK  X                       [G_LENGTH bytes]
  * QSPK       QSPK                    [QSPK_LENGTH bytes]

================================================================================
The server generates an ephemeral curve25519 keypair:

    y, Y = KEYGEN();

and an ephemeral "parallel" secret for encryption with QSE:

  * PAR_SEC    P                       [H_LENGTH bytes]

and computes:

  * C = ENCRYPT( P | B | Y, QSPK);

Then it uses its ntor private key 'b' to compute an ECC secret

    E = EXP(X,y) | EXP(X,b) | B | X | Y

and computes:

  * secret_input = E | P | QSPK | ID | PROTOID
  #pre secret_input = E | ID | PROTOID

    KEY_SEED = H(secret_input, t_key)
    verify = H(secret_input, t_verify)
  * auth_input = verify | B | Y | X | C | QSPK | ID | PROTOID | "Server"
  #pre auth_input = verify | B | Y | X | ID | PROTOID | "Server"

================================================================================
The server's handshake reply is:

    AUTH       H(auth_input, t_mac)    [H_LENGTH bytes]
  * QSCIPHER   C                       [QSC_LENGTH bytes]

Note: in the previous ntor protocol the server also needs to send

  #pre SERVER_PK  Y                    [G_LENGTH bytes]

This value is now encrypted in C, so the server does not need to send Y.
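The server-side derivation above can be sketched in Python as follows. This is only an illustration under stated assumptions, not reference code for the proposal: the text does not say how the key t enters SHA3-256, so the sketch assumes HMAC with SHA3-256; `server_outputs` is a hypothetical helper name; PROTOID uses the NTRU instantiation named elsewhere in the proposal; and all inputs are treated as opaque byte strings, so no curve25519 or QSE operation is actually performed.

```python
import hashlib
import hmac

# Lengths from Section 2.2.1.
H_LENGTH = 32
ID_LENGTH = 20
G_LENGTH = 32

# PROTOID with [qseid] filled in for the NTRU instantiation.
PROTOID = b"ntor-curve25519-sha3-1-ntrueess443ep2"
T_MAC = PROTOID + b":mac"
T_KEY = PROTOID + b":key_extract"
T_VERIFY = PROTOID + b":verify"


def H(x: bytes, t: bytes) -> bytes:
    """Keyed SHA3-256 of message x under key t.

    ASSUMPTION: the keying construction is not specified in the proposal;
    HMAC is used here only to make the sketch concrete.
    """
    return hmac.new(t, x, hashlib.sha3_256).digest()


def server_outputs(E, P, QSPK, ID, B, Y, X, C):
    """Return (KEY_SEED, AUTH) computed from the server's view.

    E is the concatenated ECC secret, P the parallel secret, C the QSE
    ciphertext; "|" in the proposal is byte-string concatenation.
    """
    secret_input = E + P + QSPK + ID + PROTOID
    key_seed = H(secret_input, T_KEY)
    verify = H(secret_input, T_VERIFY)
    auth_input = verify + B + Y + X + C + QSPK + ID + PROTOID + b"Server"
    return key_seed, H(auth_input, T_MAC)
```

Because P is part of secret_input, an attacker who later recovers E alone still cannot recompute KEY_SEED; this is the property the "parallel" secret is meant to provide.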
================================================================================
The client decrypts C, then checks Y is in G^*, and computes

    E = EXP(Y,x) | EXP(B,x) | B | X | Y
  * P' = DECRYPT(C, QSSK)

extracts P, B and Y from P' (P' = P | B | Y), verifies B, and computes

  * secret_input = E | P | QSPK | ID | PROTOID
  #pre secret_input = E | ID | PROTOID

    KEY_SEED = H(secret_input, t_key)
    verify = H(secret_input, t_verify)
  * auth_input = verify | B | Y | X | C | QSPK | ID | PROTOID | "Server"
  #pre auth_input = verify | B | Y | X | ID | PROTOID | "Server"

The client verifies that AUTH == H(auth_input, t_mac).

Both parties now have a shared value for KEY_SEED. This value will be used during Key Derivation Function.

2.3 Instantiation with NTRUEncrypt

The example uses the NTRU parameter set NTRU_EESS443EP2. This has keys and ciphertexts of length 610 bytes. This parameter set delivers 128 bits classical security and quantum security. This parameter set uses product form NTRU polynomials. For 256 bits classical and quantum security, use NTRU_EESS743EP2.

We adjust the following parameters:

  handshake type:
    0X0102 ntor+ntru  the quantum safe KEM is based on NTRUEncrypt, with parameter ntrueess443ep2
  PROTOID = "ntor-curve25519-sha3-1-ntrueess443ep2"
  QSPK_LENGTH = 610  length of NTRU_EESS443EP2 public key
  QSC_LENGTH = 610   length of NTRU_EESS443EP2 cipher

NTRUEncrypt can be adopted in our framework without further modification.

3. Security Concerns

The proof of security can be found at https://eprint.iacr.org/2015/287 We highlight some desired features.

3.1 One-way Authentication

The one-way authentication feature is inherited from the ntor protocol.

3.2 Multiple Encryption

The technique to combine two encryption schemes used in 2.2.4 is named Multiple Encryption. Discussion of appropriate security models can be found in [DK05]. Proof that the proposed handshake is secure under this model can be found at https://eprint.iacr.org/2015/287.
3.3 Cryptographic hash function

The default hash function HMAC_SHA256 from Tor suffices to provide the desired security for the present day. However, to be more future proof, we propose to use SHA3 when Tor starts to migrate to SHA3.

3.4 Key Encapsulation Mechanism

The KEM in our protocol can be proved to be KEM-CCA-2 secure.

3.5 Quantum-safe Forward Secrecy

Quantum-safe forward secrecy is achieved.

3.6 Quantum-safe authentication

The proposed protocol is secure only until a quantum computer is developed that is capable of breaking the onion keys in real time. Such a computer can compromise the authentication of ntor online; the security of this approach depends on the authentication being secure at the time the handshake is executed. This approach is intended to provide security against the harvest-then-decrypt attack while an acceptable quantum-safe approach with security against an active attacker is developed.

4. Candidate quantum-safe encryption algorithms

Two candidate quantum-safe encryption algorithms are under consideration.

NTRUEncrypt, with parameter set ntrueess443ep2, provides 128 bits classical and quantum security. The parameter set is available for use now.

LWE-based key exchange, based on Peikert's idea [Pei14]. Parameter sets suitable for this framework (the newerhop variant) are still under development.

5. Bibliography

[DK05] Y. Dodis, J. Katz, "Chosen-Ciphertext Security of Multiple Encryption", Theory of Cryptography Conference, 2005. http://link.springer.com/chapter/10.1007%2F978-3-540-30576-7_11 (conference version) or http://cs.nyu.edu/~dodis/ps/2enc.pdf (preprint)

[Pei14] C. Peikert, "Lattice Cryptography for the Internet", PQCrypto 2014.
Filename: 264-subprotocol-versions.txt Title: Putting version numbers on the Tor subprotocols Author: Nick Mathewson Created: 6 Jan 2016 Status: Closed Implemented-In: 0.2.9.4-alpha 1. Introduction At various points in Tor's history, we've needed to migrate from one protocol to another. In the past, we've mostly done this by allowing relays to advertise support for various features. We've done this in an ad-hoc way, though. In some cases, we've done it entirely based on the relays' advertised Tor version. That's a pattern we shouldn't continue. We'd like to support more live Tor relay implementations, and that means that tying "features" to "tor version" won't work going forwards. This proposal describes an alternative method that we can use to simplify the advertisement and discovery of features, and the transition from one set of features to another. 1.1. History: "Protocols" vs version-based gating. For ages, we've specified a "protocols" line in relay descriptors, with a list of supported versions for "relay" and "link" protocols. But we've never actually looked at it, and instead we've relied on tor version numbers to determine which features we could rely upon. We haven't kept the relay and link protocols listed there up-to-date either. Clients have used version checks for three purposes historically: checking relays for bugs, checking relays for features, and implementing bug-workarounds on their own state files. In this design, feature checks are now performed directly with subprotocol versions. We only need to keep using Tor versions specifically for bug workarounds. 2. Design: Advertising protocols. We revive the "Protocols" design above, in a new form. 
"proto" SP Entries NL Entries = Entries = Entry Entries = Entry SP Entries Entry = Keyword "=" Values Values = Value Values = Value "," Values Value = Int Value = Int "-" Int Int = NON_ZERO_DIGIT Int = Int DIGIT Each 'Entry' in the "proto" line indicates that the Tor relay supports one or more versions of the protocol in question. Entries should be sorted by keyword. Values should be numerically ascending within each entry. (This implies that there should be no overlapping ranges.) Ranges should be represented as compactly as possible. Ints must be no more than 2^32 - 1. The semantics for each keyword must be defined in a Tor specification. Extension keywords are allowed if they begin with "x-" or "X-". Keywords are case-sensitive. During voting, authorities copy these lines immediately below the "v" lines, using "pr" as the keyword instead of "proto". When a descriptor does not contain a "proto" entry, the authorities should reconstruct it using the approach described below in section A.1. They are included in the consensus using the same rules as currently used for "v" lines, if a sufficiently late consensus method is in use. 2.1. An alternative: Moving 'v' lines into microdescriptors. [[[[[ Here's an alternative: we could put "v" and "proto" lines into microdescriptors. When building microdescriptors, authorities could copy all valid "proto" entries verbatim if a sufficiently late consensus method is in use. When a descriptor does not contain a "proto" entry, the authorities should reconstruct it using the approach described below in section A.1. Tor clients that want to use "v" lines should prefer those in microdescriptors if present, and ignore those in the consensus. (Existing maintained client versions can be adapted to never look at "v" lines at all; the only versions that they still check for are ones not allowed on the network. The "v" line can be dropped from the consensus entirely when current clients have upgraded.) 
]]]]] [I am rejecting this alternative for now, since proto lines should compress very well, given that the proto line is completely inferrable from the v line. Removing all the v lines from the current consensus would save only 1.7% after gzip compression.] 3. Using "proto"/"pr" and "v" lines Whenever possible, clients and relays should use the list of advertised protocols instead of version numbers. Version numbers should only be used when implementing bug-workarounds for specific Tor versions. Every new feature in tor-spec.txt, dir-spec.txt, and rend-spec.txt should be gated on a particular protocol version. 4. Required protocols The consensus may contain four lines: "recommended-relay-protocols", "required-relay-protocols", "recommended-client-protocols", and "required-client-protocols". Each has the same format as the "proto" line. To vote on these entries, a protocol/version combination is included only if it is listed by a majority of the voters. When a relay lacks a protocol listed in recommended-relay-protocols, it should warn its operator that the relay is obsolete. When a relay lacks a protocol listed in required-relay-protocols, it must not attempt to join the network. When a client lacks a protocol listed in recommended-client-protocols, it should warn the user that the client is obsolete. When a client lacks a protocol listed in required-client-protocols, it must not connect to the network. This implements a "safe forward shutdown" mechanism for zombie clients. If a client or relay has a cached consensus telling it that a given protocol is required, and it does not implement that protocol, it SHOULD NOT try to fetch a newer consensus. [[XXX I propose we remove this idea: The above features should be backported to 0.2.4 and later, or all the versions we expect to continue supporting.]] These lines should be voted on. 
A majority of votes is sufficient to make a protocol un-supported, and it should require a supermajority of authorities (2/3) to make a protocol required. The required protocols should not be torrc-configurable, but rather should be hardwired in the Tor code.

5. Current protocols

(See "6. Maintaining the protocol list" below for information about how I got these, and why version 0.2.4.19 comes up so often.)

5.1. "Link"

The "link" protocols are those used by clients and relays to initiate and receive OR connections and to handle cells on OR connections. The "link" protocol versions correspond 1:1 to those versions.

Two Tor instances can make a connection to each other only if they have at least one link protocol in common.

The current "link" versions are: "1" through "4"; see tor-spec.txt for more information. All current Tor versions support "1-3"; versions from 0.2.4.11-alpha onward support "1-4". Eventually we will drop "1" and "2".

5.2. "LinkAuth"

LinkAuth protocols correspond to varieties of Authenticate cells used for the v3+ link protocols.

The current version is "1".

"2" is unused, and reserved by proposal 244.

"3" is the ed25519 link handshake of proposal 220.

5.3. "Relay"

The "relay" protocols are those used to handle CREATE cells, and those that handle the various RELAY cell types received after a CREATE cell. (Except, relay cells used to manage introduction and rendezvous points are managed with the "HSIntro" and "HSRend" protocols respectively.)

  "1" -- supports the TAP key exchange, with all features in Tor 0.2.3. Support for CREATE and CREATED and CREATE_FAST and CREATED_FAST and EXTEND and EXTENDED.

  "2" -- supports the ntor key exchange, and all features in Tor 0.2.4.19. Includes support for CREATE2 and CREATED2 and EXTEND2 and EXTENDED2.

5.4. "HSIntro"

The "HSIntro" protocol handles introduction points.

  "3" -- supports authentication as of proposal 121 in Tor 0.2.1.6-alpha.

5.5. "HSRend"

The "HSRend" protocol handles rendezvous points.
"1" -- supports all features in Tor 0.0.6. "2" -- supports RENDEZVOUS2 cells of arbitrary length as long as they have 20 bytes of cookie in Tor 0.2.9.1-alpha. 5.6. "HSDir" The HSDir protocols are the set of hidden service document types that can be uploaded to, understood by, and downloaded from a tor relay, and the set of URLs available to fetch them. "1" -- supports all features in Tor 0.2.0.10-alpha. 5.7. "DirCache" The "DirCache" protocols are the set of documents available for download from a directory cache via BEGIN_DIR, and the set of URLs available to fetch them. (This excludes URLs for hidden service objects.) "1" -- supports all features in Tor 0.2.4.19. 5.8. "Desc" Describes features present or absent in descriptors. Most features in descriptors don't require a "Desc" update -- only those that need to someday be required. For example, someday clients will need to understand ed25519 identities. "1" -- supports all features in Tor 0.2.4.19. "2" -- cross-signing with onion-keys, signing with ed25519 identities. 5.9. "Microdesc" Describes features present or absent in microdescriptors. Most features in descriptors don't require a "MicroDesc" update -- only those that need to someday be required. These correspond more or less with consensus methods. "1" -- consensus methods 9 through 20. "2" -- consensus method 21 (adds ed25519 keys to microdescs). 5.10. "Cons" Describes features present or absent in consensus documents. Most features in consensus documents don't require a "Cons" update -- only those that need to someday be required. These correspond more or less with consensus methods. "1" -- consensus methods 9 through 20. "2" -- consensus method 21 (adds ed25519 keys to microdescs). 6. Maintaining the protocol lists What makes a good fit for a "protocol" type? Generally, it's a set of communications functionality that tends to upgrade in tandem, and in isolation from other parts of the Tor protocols. 
It tends to be functionality where it doesn't make sense to implement only part of it -- though omitting the whole thing might make sense.

(Looking through our suite of protocols, you might make a case for splitting DirCache into sub-protocols.)

We shouldn't add protocols for features where others can remain oblivious to their presence or absence. For example, if some directory caches start supporting a new header, and clients can safely send that header without knowing whether the directory cache will understand it, then a new protocol version is not required.

Because all relays currently on the network are 0.2.4.19 or later, we can require 0.2.4.19, and use 0.2.4.19 as the minimal version so we don't need to do code archaeology to determine how many no-longer-relevant versions of each protocol once existed.

Adding new protocol types is pretty cheap, given compression.

A.1. Inferring missing proto lines

The directory authorities no longer allow versions of Tor before 0.2.4.18-rc. But right now, there is no version of Tor in the consensus before 0.2.4.19. Therefore, we should disallow versions of Tor earlier than 0.2.4.19, so that we can have the protocol list for all current Tor versions include:

  Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2

For Desc, Tor versions before 0.2.7.stable should be taken to have Desc=1 and versions 0.2.7.stable or later should have Desc=1-2.

For Microdesc and Cons, Tor versions before 0.2.7.stable should be taken to support version 1; 0.2.7.stable and later should have 1-2.

A.2. Initial required protocols

For clients we will Recommend and Require these.

  Cons=1-2 Desc=1-2 DirCache=1 HSDir=2 HSIntro=3 HSRend=1 Link=4 LinkAuth=1 Microdesc=1-2 Relay=2

For relays we will Require:

  Cons=1 Desc=1 DirCache=1 HSDir=2 HSIntro=3 HSRend=1 Link=3-4 LinkAuth=1 Microdesc=1 Relay=1-2

For relays, we will additionally Recommend all protocols which we recommend for clients.

A.3.
Example integration with other open proposals In this appendix, I try to show that this proposal is viable by showing how it can integrate with other open proposals to avoid version-gating. I'm looking at open/draft/accepted proposals only. 140 Provide diffs between consensuses This proposal doesn't affect interoperability, though we could add a DirCache protocol version for it if we think we might want to require it someday. 164 Reporting the status of server votes Interoperability not affected; no new protocol. 165 Easy migration for voting authority sets Authority-only; no new protocol. 168 Reduce default circuit window Interoperability slightly affected; could be a new Relay protocol. 172 GETINFO controller option for circuit information 173 GETINFO Option Expansion Client/Relay interop not affected; no new protocol. 177 Abstaining from votes on individual flags Authority-only; no new protocol. 182 Credit Bucket No new protocol. 188 Bridge Guards and other anti-enumeration defenses No new protocol. 189 AUTHORIZE and AUTHORIZED cells This would be a new protocol, either a Link protocol or a new LinkAuth protocol. 191 Bridge Detection Resistance against MITM-capable Adversaries No new protocol. 192 Automatically retrieve and store information about bridges No new protocol. 195 TLS certificate normalization for Tor 0.2.4.x Interop not affected; no new protocol. 201 Make bridges report statistics on daily v3 network status requests No new protocol. 202 Two improved relay encryption protocols for Tor cells This would be a new Relay protocol. 203 Avoiding censorship by impersonating an HTTPS server Bridge/PT only; no new protocol. 209 Tuning the Parameters for the Path Bias Defense Client behavior only; no new protocol. 210 Faster Headless Consensus Bootstrapping Client behavior only; no new protocol. 212 Increase Acceptable Consensus Age Possibly add a new DirCache protocol version to describe the "will hold older descriptors" property. 
219 Support for full DNS and DNSSEC resolution in Tor New relay protocol, or new protocol class (DNS=2?) 220 Migrate server identity keys to Ed25519 Once link authentication is supported, that's a new LinkAuth protocol version. No new protocol version is required for circuit extension, since it's a backward-compatible change. 224 Next-Generation Hidden Services in Tor Adds new HSDir and HSIntro and HSRend protocols. 226 "Scalability and Stability Improvements to BridgeDB: Switching to a Distributed Database System and RDBMS" No new protocol. 229 Further SOCKS5 extensions Client-only; no new protocol. 233 Making Tor2Web mode faster No new protocol. 234 Adding remittance field to directory specification Could be a new protocol; or not. 237 All relays are directory servers No new protocol. 239 Consensus Hash Chaining No new protocol. 242 Better performance and usability for the MyFamily option New Desc protocol. 244 Use RFC5705 Key Exporting in our AUTHENTICATE calls Part of prop220. Also adds another LinkAuth protocol version. 245 Deprecating and removing the TAP circuit extension protocol Removes Linkauth protocol 1. Removes a Desc protocol. 246 Merging Hidden Service Directories and Introduction Points Possibly adds a new HSIntro or HSDir protocol. 247 Defending Against Guard Discovery Attacks using Vanguards No new protocol. 248 Remove all RSA identity keys Adds a new Desc protocol version and a new Cons protocol version; eventually removes a version of each. 249 Allow CREATE cells with >505 bytes of handshake data Adds a new Link protocol version for CREATE2V. Adds a new Relay protocol version for new EXTEND2 semantics. 250 Random Number Generation During Tor Voting No new protocol. 251 Padding for netflow record resolution reduction No new protocol. 252 Single Onion Services No new protocol. 253 Out of Band Circuit HMACs New Relay protocol. 254 Padding Negotiation New Link protocol, new Relay protocol. 
255 Controller features to allow for load-balancing hidden services No new protocol. 256 Key revocation for relays and authorities New Desc protocol. 257 Refactoring authorities and taking parts offline No new protocol. 258 Denial-of-service resistance for directory authorities No new protocol. 259 New Guard Selection Behaviour No new protocol 260 Rendezvous Single Onion Services No new protocol 261 AEZ for relay cryptography New Relay protocol version. 262 Re-keying live circuits with new cryptographic material New Relay protocol version 263 Request to change key exchange protocol for handshake New Relay protocol version.
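The "proto" grammar of section 2 and the required-protocol checks of section 4 can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than Tor's implementation: the function names `parse_int`, `parse_entries`, `supports` and `missing` are hypothetical, the parser takes only the Entries part of the line (after the "proto" or "pr" keyword), and it does not enforce the "should" rules about sorting or compact ranges.

```python
from typing import Dict, List, Tuple

MAX_INT = 2**32 - 1  # "Ints must be no more than 2^32 - 1."
Ranges = List[Tuple[int, int]]  # inclusive (low, high) pairs


def parse_int(text: str) -> int:
    """Parse an Int: one NON_ZERO_DIGIT followed by zero or more DIGITs."""
    if not (text.isascii() and text.isdigit()) or text[0] == "0":
        raise ValueError(f"bad Int: {text!r}")
    value = int(text)
    if value > MAX_INT:
        raise ValueError(f"Int too large: {text!r}")
    return value


def parse_entries(entries: str) -> Dict[str, Ranges]:
    """Parse space-separated Keyword=Values entries into version ranges."""
    result: Dict[str, Ranges] = {}
    if entries == "":
        return result  # the grammar allows an empty Entries production
    for entry in entries.split(" "):
        keyword, eq, values = entry.partition("=")
        if not keyword or not eq:
            raise ValueError(f"bad Entry: {entry!r}")
        ranges: Ranges = []
        for value in values.split(","):
            low_text, dash, high_text = value.partition("-")
            low = parse_int(low_text)
            high = parse_int(high_text) if dash else low
            if high < low:
                raise ValueError(f"bad range: {value!r}")
            ranges.append((low, high))
        result[keyword] = ranges
    return result


def supports(protos: Dict[str, Ranges], keyword: str, version: int) -> bool:
    """True if `version` of `keyword` falls inside any advertised range."""
    return any(lo <= version <= hi for lo, hi in protos.get(keyword, []))


def missing(required: Dict[str, Ranges],
            supported: Dict[str, Ranges]) -> List[Tuple[str, int]]:
    """List required (keyword, version) pairs that `supported` lacks.

    Expands ranges one version at a time, so it is only suitable for the
    short lists expected in required-*-protocols lines.
    """
    return [(kw, v)
            for kw, ranges in required.items()
            for lo, hi in ranges
            for v in range(lo, hi + 1)
            if not supports(supported, kw, v)]
```

For instance, a relay whose own list parses from "Link=1-4 Relay=1-2" and which sees a required list of "Link=4 Relay=2-3" would get a non-empty result from `missing`, which under section 4 means it must not attempt to join the network. Ranges are kept as pairs instead of being expanded into sets so that an entry such as "1-4294967295" cannot exhaust memory during parsing.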
Filename: 265-load-balancing-with-overhead.txt
Title: Load Balancing with Overhead Parameters
Authors: Mike Perry
Created: 01 January 2016
Status: Open
Target: arti-dirauth

NOTE: This is one way to address several load balancing problems in Tor, including padding overhead and Exit+Guard issues. However, before attempting this, we should see if we can simplify the equations further by changing how we assign Guard, Fast and Stable flags in the first place. If we assign Guard flags such that Guards are properly allocated with respect to Middle and Fast, and avoid assigning Guard to Exit, this will become simpler.

Unfortunately, this is literally impossible to fix with C-Tor. In addition to numerous overrides and disparate safety checks that prevent changes, several bugs mean that Guard, Stable, and Fast flags are randomly assigned. See:

  https://gitlab.torproject.org/tpo/core/tor/-/issues/40230
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40395
  https://gitlab.torproject.org/tpo/core/tor/-/issues/19162
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40733
  https://gitlab.torproject.org/tpo/network-health/analysis/-/issues/45
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/100
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/160
  https://gitlab.torproject.org/tpo/core/torspec/-/issues/158

Other approaches to flag equations that have been proposed:

  https://github.com/frochet/wf_proposal/blob/master/waterfilling-balancing-with-max-diversity.txt
  https://petsymposium.org/popets/2023/popets-2023-0127.pdf

0. Motivation

In order to properly load balance in the presence of padding and non-negligible amounts of directory and hidden service traffic, the load balancing equations in Section 3.8.3 of dir-spec.txt are in need of some modifications.

In addition to supporting the idea of overhead, the load balancing equations can also be simplified by treating Guard+Exit nodes as Exit nodes in all cases.
This causes the 9 sub-cases of the current load balancing equations to consolidate into a single solution, which also will greatly simplify the consensus process, and eliminate edge cases such as #16255[1]. 1. Overview For padding overhead due to Proposals 251 and 254, and changes to hidden service path selection in Proposal 247, it will be useful to be able to specify a pair of parameters that represents the additional traffic present on Guard and Middle nodes due to these changes. The current load balancing equations unfortunately make this excessively complicated. With overhead factors included, each of the 9 subcases goes from being a short solution to over a page of calculations for each subcase. Moreover, out of 8751 hourly consensus documents produced in 2015[2], only 78 of them had a non-zero weight for using Guard+Exit nodes in the Guard position (weight Wgd), and most of those were well under 1%. The highest weight for using Guard+Exits in the Guard position recorded in 2015 was 2.62% (on December 10th, 2015). This means clients that chose a Guard node during that particular hour used only 2.62% of Guard+Exit flagged nodes' bandwidth when performing a bandwidth-weighted Guard selection. All clients that chose a Guard node during any other hour did not consider Guard+Exit nodes at all as potential candidates for their Guards. This indicates that we can greatly simplify these load balancing equations with little to no change in diversity to the network. 2. Simplified Load Balancing Equations Recall that the point of the load balancing equations in section 3.8.3 of dir-spec.txt is to ensure that an equal amount of client traffic is distributed between Guards, Middles, Exits, and Guard+Exits, where each flag type can occupy one or more positions in a path. This allocation is accomplished by solving a system of equations for weights for flag position selection to ensure equal allocation of client traffic for each position in a circuit. 
If we ignore overhead for the moment and treat Guard+Exit nodes as Exit nodes, then this allows the simplified system of equations to become:

  Wgg*G == M + Wme*E + Wmg*G  # Guard position == middle position
  Wgg*G == Wee*E              # Guard position == exit position
  Wmg*G + Wgg*G == G          # Guard allocation weights sum to 1
  Wme*E + Wee*E == E          # Exit allocation weights sum to 1

This system has four equations and four unknowns, and by transitivity we ensure that allocated capacity for guard, middle, and exit positions are all equal. Unlike the equations in 3.8.3 of dir-spec.txt, there are no special cases to the solutions of these equations because there is no shortage of constraints and no decision points for allocation based on scarcity. Thus, there is only one solution. Using SymPy's symbolic equation solver (see attached script) we obtain:

  Wee = (E + G + M) / (3*E)
  Wgg = (E + G + M) / (3*G)
  Wme = (2*E - G - M) / (3*E)
  Wmg = (2*G - E - M) / (3*G)

For the rest of the flags weights, we will do the following:

  Dual-flagged (Guard+Exit) nodes should be treated as Exits:
    Wgd = 0, Wmd = Wme, Wed = Wee

  Directory requests use middle weights:
    Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm

  Handle bridges and strange exit policies:
    Wgm=Wgg, Wem=Wee, Weg=Wed

2.1. Checking for underflow and overflow

In the old load balancing equations, we required a case-by-case proof to guard against overflow and underflow, and to decide what to do in the event of various overflow and underflow conditions[3]. Even so, the system proved fragile to changes, such as the implementation of Guard uptime fractions[1].

Here, with the simplified equations, we can plainly see that the only time that a negative weight can arise is in Wme and Wmg, when 2*E < G+M or when 2*G < E+M. In other words, only when Exits or Guards are scarce.

Similarly, the only time that a weight exceeding 1.0 can arise is in Wee and Wgg, which also happens when 2*E < G+M or 2*G < E+M.
This means that parameters will always overflow in pairs (Wee and Wme, and/or Wgg and Wmg).

In both these cases, simply clipping the parameters at 1 and 0 provides as close of a balancing condition as is possible, given the scarcity.

3. Load balancing with Overhead Parameters

Intuitively, overhead due to padding and path selection changes can be represented as missing capacity in the relevant position. This means that in the presence of a Guard overhead fraction of G_o and a Middle overhead fraction of M_o, the total fraction of actual client traffic carried in those positions is (1-G_o) and (1-M_o), respectively.

Then, to achieve a balanced allocation of traffic, we consider only the actual client capacity carried in each position:

  # Guard position minus overhead matches middle position minus overhead:
  (1-G_o)*(Wgg*G) == (1-M_o)*(M + Wme*E + Wmg*G)
  # Guard position minus overhead matches exit position:
  (1-G_o)*(Wgg*G) == 1*(Wee*E)
  # Guard weights still sum to 1:
  Wmg*G + Wgg*G == G
  # Exit weights still sum to 1:
  Wme*E + Wee*E == E

Solving this system with SymPy unfortunately yields some unintuitively simplified results.
For each weight, we first show the SymPy solution, and then factor that solution into a form analogous to Section 2:

               -(G_o - 1)*(M_o - 1)*(E + G + M)
  Wee: ---------------------------------------
       E*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

               (1 - G_o)*(1 - M_o)*(E + G + M)
  Wee: ---------------------------------------
       E*(2 - G_o - M_o + (1 - G_o)*(1 - M_o))

                    (M_o - 1)*(E + G + M)
  Wgg: ---------------------------------------
       G*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

                    (1 - M_o)*(E + G + M)
  Wgg: ---------------------------------------
       G*(2 - G_o - M_o + (1 - G_o)*(1 - M_o))

       -E*(M_o - 1) + G*(G_o - 1)*(-M_o + 2) - M*(M_o - 1)
  Wmg: ---------------------------------------------------
             G*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

        (2 - M_o)*G*(1 - G_o) - M*(1 - M_o) - E*(1 - M_o)
  Wmg: ---------------------------------------------------
             G*(2 - G_o - M_o + (1 - G_o)*(1 - M_o))

       E*(G_o + M_o - 2) + G*(G_o - 1)*(M_o - 1) + M*(G_o - 1)*(M_o - 1)
  Wme: -----------------------------------------------------------------
                    E*(G_o + M_o - (G_o - 1)*(M_o - 1) - 2)

       (2 - G_o - M_o)*E - G*(1 - G_o)*(1 - M_o) - M*(1 - G_o)*(1 - M_o)
  Wme: -----------------------------------------------------------------
                    E*(2 - G_o - M_o + (1 - G_o)*(1 - M_o))

A simple spot check with G_o = M_o = 0 shows us that with zero overhead, these solutions become identical to the solutions in Section 2 of this proposal.

The final condition that we need to ensure is that these weight values never become negative or greater than 1.0[3].

3.1. Ensuring against underflow and overflow

Note that if M_o = G_o = 0, then the solutions and the overflow conditions are the same as in Section 2.

Unfortunately, SymPy is unable to solve multivariate inequalities, which prevents us from directly deriving overflow conditions for each variable independently (at least easily and without mistakes).
Wolfram Alpha is able to derive closed form solutions to some degree for this, but they are more complicated than checking the weights for underflow and overflow directly.

In any case, for every overflow and underflow condition, it is equivalent to simply check the weight solutions above and warn when one of them falls outside [0, 1.0]. Optimal load balancing given this scarcity should still result if we clip the resulting solutions to [0, 1.0].

It will be wise in the implementation to test the overflow conditions both with M_o = G_o = 0 and with their actual values. This will allow us to know whether the overflow is a result of inherent unbalancing, or is due to input overhead values that are too large (and need to be reduced by, for example, reducing padding).

4. Consensus integration

4.1. Making use of the Overhead Factors

In order to keep the consensus process simple on the Directory Authorities, the overhead parameters represent the combined overhead from many factors.

The G_o variable is meant to account for the sum of directory overhead, netflow padding overhead, future two-hop padding overhead, and future hidden service overhead (for cases where Guard->Middle->Exit circuits are not used).

The M_o variable is meant to account for multi-hop padding overhead, hidden service overhead, as well as an overhead for any future two-hop directory connections (so that we can consolidate Guards and Directory guard functionality into a single Guard node).

There is no need for an E_o variable, because even if there were Exit-specific overhead, it could be represented by equivalent reductions in both G_o and M_o instead.

Since all relevant padding and directory overhead information is included in the extra-info documents for each relay, the M_o and G_o variables could be computed automatically from these extra-info documents during the consensus process.
However, it is probably wiser to keep humans in the loop and set them manually as consensus parameters instead, especially since we have not previously had to deal with serious adversarial consequences from malicious extra-info reporting.

For clarity, though, it may be a good idea to separate all of the components of M_o and G_o into separate consensus parameters, and combine them (via addition) in the final equations. That way it will be easier to pinpoint the source of any potential overflow issues. This separation will also enable us to potentially govern padding's contribution to the overhead via a single tunable value.

4.2. Accounting for hidden service overhead with Prop 247

XXX: Hidden service path selection and 247 complicates this. With 247, we want paths only of G M M, where the Ms exclude Guard-flagged nodes. This means that M_o needs to add the total hidden service *network bytecount* overhead (2X the hidden service end-to-end traffic bytecount). We also need to *subtract* 4*Wmg*hs_e2e_bytecount from the G_o overhead, to account for not using Guard-flagged nodes for the four M's in full prop-247 G M M M M G circuits.

4.3. Accounting for RSOS overhead

XXX: We also need to separately account for RSOS (and maybe SOS?) path usage in M_o. This will require separate accounting for these service types in extra-info descriptors.

4.4. Integration with Guardfraction

The GuardFraction changes in Proposal 236 and #16255 should continue to work with these new equations, so long as the total T, G, and M values are counted after the GuardFraction multiplier has been applied.

4.5. Guard flag assignment

Ideally, the Guard flag assignment process would also not count Exit-flagged nodes when determining the Guard flag uptime and bandwidth cutoffs, since we will not be using Guard+Exit flagged nodes as Guard nodes at all when this change is applied.
This will result in more accurate thresholds for Guard node status, as well as better control over the true total amount of Guard bandwidth in the consensus.

4.6. Cannibalization

XXX: It sucks and complicates everything. kill it, except for hsdirs.

1. https://trac.torproject.org/projects/tor/ticket/16255
2. https://collector.torproject.org/archive/relay-descriptors/consensuses/
3. http://tor-dev.torproject.narkive.com/17H9FewJ/correctness-proof-for-new-bandwidth-weights-bug-1952

Appendix A: SymPy Script for Balancing Equation Solutions

#!/usr/bin/python3
# Written for Python 3; the print statements of the original script
# have been converted to print() calls.

from sympy import Symbol, pprint
from sympy.solvers import solve

# SymPy variable declarations
(G, M, E, D) = (Symbol('G'), Symbol('M'), Symbol('E'), Symbol('D'))
(Wgd, Wmd, Wed, Wme, Wmg, Wgg, Wee) = (
    Symbol('Wgd'), Symbol('Wmd'), Symbol('Wed'), Symbol('Wme'),
    Symbol('Wmg'), Symbol('Wgg'), Symbol('Wee'))
(G_o, M_o) = (Symbol('G_o'), Symbol('M_o'))

# Each list entry is an expression that the solver sets equal to zero.
print("Current Load Balancing Equation Solutions, Case 1:")
pprint(solve(
    [Wgg*G + Wgd*D - (M + Wmd*D + Wme*E + Wmg*G),
     Wgg*G + Wgd*D - (Wee*E + Wed*D),
     Wed*D + Wmd*D + Wgd*D - D,
     Wmg*G + Wgg*G - G,
     Wme*E + Wee*E - E,
     Wmg - Wmd,
     3*Wed - 1],
    Wgd, Wmd, Wed, Wme, Wmg, Wgg, Wee))

print()
print("Case 1 with guard and middle overhead: ")
pprint(solve(
    [(1-G_o)*(Wgg*G + Wgd*D) - (1-M_o)*(M + Wmd*D + Wme*E + Wmg*G),
     (1-G_o)*(Wgg*G + Wgd*D) - (Wee*E + Wed*D),
     Wed*D + Wmd*D + Wgd*D - D,
     Wmg*G + Wgg*G - G,
     Wme*E + Wee*E - E,
     Wmg - Wmd,
     3*Wed - 1],
    Wgd, Wmd, Wed, Wme, Wmg, Wgg, Wee))

print("\n\n")
print("Elimination of combined Guard+Exit flags (no overhead): ")
pprint(solve(
    [(Wgg*G) - (M + Wme*E + Wmg*G),
     (Wgg*G) - 1*(Wee*E),
     Wmg*G + Wgg*G - G,
     Wme*E + Wee*E - E],
    Wme, Wmg, Wgg, Wee))

print()
print("Elimination of combined Guard+Exit flags (Guard+middle overhead): ")
combined = solve(
    [(1-G_o)*(Wgg*G) - (1-M_o)*(M + Wme*E + Wmg*G),
     (1-G_o)*(Wgg*G) - 1*(Wee*E),
     Wmg*G + Wgg*G - G,
     Wme*E + Wee*E - E],
    Wme, Wmg, Wgg, Wee)
pprint(combined)
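The closed-form weights from Sections 2 and 3, together with the clipping suggested in Sections 2.1 and 3.1, can be sketched numerically in a few lines of Python 3. This is only an illustration under stated assumptions: the function names balanced_weights and clip_weights are hypothetical and not part of Tor, the inputs are plain numbers rather than consensus bandwidth totals, and no attempt is made to handle G == 0 or E == 0.

```python
def balanced_weights(G, M, E, G_o=0.0, M_o=0.0):
    """Return the unclipped weights using the factored forms of Section 3.

    G, M, E are total Guard, Middle, and Exit capacities; G_o and M_o are
    the guard and middle overhead fractions. With G_o == M_o == 0 the
    results reduce to the Section 2 solutions.
    """
    total = E + G + M
    denom = 2 - G_o - M_o + (1 - G_o) * (1 - M_o)
    return {
        "Wgg": (1 - M_o) * total / (G * denom),
        "Wee": (1 - G_o) * (1 - M_o) * total / (E * denom),
        "Wmg": ((2 - M_o) * G * (1 - G_o) - (M + E) * (1 - M_o)) / (G * denom),
        "Wme": ((2 - G_o - M_o) * E
                - (G + M) * (1 - G_o) * (1 - M_o)) / (E * denom),
    }


def clip_weights(weights):
    """Clip each weight to [0, 1.0]; also list the weights that were out of range."""
    clipped = {name: min(1.0, max(0.0, w)) for name, w in weights.items()}
    out_of_range = sorted(n for n, w in weights.items() if not 0.0 <= w <= 1.0)
    return clipped, out_of_range


if __name__ == "__main__":
    # Hypothetical capacities with scarce Exits: Wee and Wme overflow as a pair.
    raw = balanced_weights(G=5.0, M=5.0, E=1.0)
    weights, warnings = clip_weights(raw)
    print(weights, warnings)
```

Running the check once with G_o = M_o = 0 and once with the real overhead values, as Section 3.1 suggests, would distinguish inherent scarcity from overhead values that are too large.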
Filename: 266-removing-current-obsolete-clients.txt
Title: Removing current obsolete clients from the Tor network
Author: Nick Mathewson
Created: 14 Jan 2016
Status: Superseded
Superseded-by: 264, 272.

1. Introduction

Frequently, we find that very old versions of Tor should no longer be supported on the network. To remove relays is easy enough: we simply update the directory authorities to stop listing relays that advertise versions that are too old.

But to disable clients is harder.

In another proposal I describe a system for letting future clients go gracefully obsolete. This proposal explains how we can safely disable the obsolete clients we have today (and all other client versions of Tor to date, assuming that they will someday become obsolete).

1.1. Why disable clients?

  * Security. Anybody who hasn't updated their Tor client in 5 years is probably vulnerable to who-knows-what attacks. They aren't likely to get much anonymity either.

  * Withstand zombie installations. Some Tors out there were once configured to start on boot on systems that are now unmaintained. (See 1.4 below.) They put needless load on the network, and help nobody.

  * Be able to remove backward-compatibility code. Currently, Tor supports some truly ancient protocols in order to avoid breaking ancient versions of Tor. This code needs to be maintained and tested. Some of it depends on undocumented or deprecated or non-portable OpenSSL features, and makes it hard to produce a conforming Tor server implementation.

  * Make it easier to write a conforming Tor relay. If a Tor relay needs to support every Tor client back through the beginning of time, that makes it harder to develop and test compatible implementations.

1.2. Is this dangerous?

I don't think so. This proposal describes a way to make older clients gracefully disconnect from the network only when a majority of authorities agree that they should.
A majority of authorities already have the ability to inflict arbitrary degrees of sabotage on the consensus document. 1.3. History The earliest versions of Tor checked the recommended-versions field in the directory to see whether they should keep running. If they saw that their version wasn't recommended, they'd shut down. There was an "IgnoreVersion" option that let you keep running anyway. Later, around 2004, the rule changed to "shut down if the version is _obsolete_", where obsolete was defined as "not recommended, and older than a version that is recommended." In 0.1.1.7-alpha, we made obsolete versions only produce a warning, and removed IgnoreVersion. (See 3ac34ae3293ceb0f2b8c49.) We have still disabled old tor versions. With Tor 0.2.0.5-alpha, we disabled Tor versions before 0.1.1.6-alpha by having the v1 authorities begin publishing empty directories only. In version 0.2.5.2-alpha, we completely removed support for the v2 directory protocol used before Tor 0.2.0; there are no longer any v2 authorities on the network. Tor versions before 0.2.1 will currently not progress past fetching an initial directory, because they believe in a number of directory authority identity keys that no longer sign the directory. Tor versions before 0.2.4 are (lightly) throttled in multihop circuit creation, because we prioritize ntor CREATE cells over TAP ones when under load. 1.4. The big problem: slow zombies and fast zombies It would be easy enough to 'disable' old clients by simply removing server support for the obsolete protocols that they use. But there's a problem with that approach: what will the clients do when they fail to make connections, or to extend circuits, or whatever else they are no longer able to do? * Ideally, I'd like such clients to stop functioning _quietly_. If they stop contacting the network, that would be best. * Next best would be if these clients contacted the network only occasionally and at different times. 
I'll call these clients "slow zombies".

  * Worse would be if the clients contact the network frequently, over and over. I'll call these clients "fast zombies". They would be at their worst when they focus on authorities, or when they act in synchrony to all strike at once.

One goal of this proposal is to ensure that future clients do not become zombies at all; and that ancient clients become slow zombies at worst.

2. Some ideas that don't work.

2.1. Dropping connections based on link protocols.

Tor versions before 0.2.3.6-alpha use a renegotiation-based handshake instead of our current handshake. We could detect these handshakes and close the connection at the relay side if the client attempts to renegotiate.

I've tested these changes on versions maint-0.2.0 through maint-0.2.2. They result in zombies with the following behavior:

  The client contacts each authority it knows about, attempting to make a one-hop directory connection. It fails, detects a failure, then reconnects more and more slowly ... but one hour later, it resets its connection schedule and starts again.

In the steady state this appears to result in about two connections per client per authority per hour. That is probably too many.

(Most authorities would be affected: of the authorities that existed in 0.2.2, gabelmoo has moved and turtles has shut down. The authorities Faravahar and longclaw are new. The authorities moria1, tor26, dizum, dannenberg, urras, maatuska and maatuska would all get hit here.) [two maatuskas? -RD]

(We could simply remove the renegotiation-detection code entirely, and reply to all connections with an immediate VERSIONS cell. The behavior would probably be the same, though.)

If we throttled connections rather than closing them, we'd only get one connection per authority per hour, but authorities would have to keep open a potentially huge number of sockets.

2.2.
Blocking circuit creation under certain circumstances In tor 0.2.5.1-alpha, we began ignoring the UseNTorHandshake option, and always preferring the ntor handshake where available. Unfortunately, we can't simply drop all TAP handshakes, since clients and relays can still use them in the hidden service protocol. But we could detect these versions by: Looking for use of a TAP handshake from an IP not associated with any known relay, or on a connection where the client did not authenticate. (This could be from a bridge, but clients don't build circuits that go to an IntroPoint or RendPoint directly after a bridge.) This would still result in clients not having directories, however, and retrying once an hour. 3. Ideas that might work 3.1. Move all authorities to new ports We could have each authority known to older clients start listening for connections at a new port P. We'd forward the old port to the new port. Once sufficiently many clients were using the new ports, we could disable the forwarding. This would result in the old clients turning into zombies as above, but they would only be scrabbling at nonexistent ports, causing less load on the authorities. [This proposal would probably be easiest to implement.] 3.2. Start disabling old link protocols on relays We could have new relays start dropping support for the old link protocols, while maintaining support on the authorities and older relays. The result here would be a degradation of older client performance over time. They'd still behave zombieishly if the authorities dropped support, however. 3.3. Changing the consensus format. We could allow 'f' (short for "flag") as a synonym for 's' in consensus documents. Later, if we want to disable all Tor versions before today, we can change the consensus algorithm so that the consensus (or perhaps only the microdesc consensus) is spelled with 'f' lines instead of 's' lines. 
This will create a consensus which older clients and relays parse as having all nodes down, which will make them not connect to the network at all.

We could similarly replace "r" with "n", or replace Running with Online, and so on.

In doing this, we could also rename fresh-until and valid-until, so that new clients would have the real expiration date, and old clients would see "this consensus never expires". This would prevent them from downloading new consensuses.

[This proposal would result in the quietest shutdown.]

A. How to "pull the switch."

This is an example timeline of how we could implement 3.3 above, along with proposal 264.

  TIME 0: Implement the client/relay side of proposal 264, backported to every currently extant Tor version that we still support.

    At the same time, add support for the new consensus type to all the same Tor versions.

    Don't disable anything yet.

  TIME 1....N: Encourage all distributions shipping packages for those old tor versions to upgrade to ones released at Time 0 or later.

    Keep informed of the upgrade status of the clients and relays on the Tor network.

  LATER: At some point after nearly all clients and relays have upgraded to the versions released at Time 0 or later, we could make the switchover to publishing the new consensus type.

B. Next steps.

We should verify what happens when currently extant client versions get an empty consensus. This will determine whether 3.3 can work. Will they try to fetch a new one from the authorities at the end of the validity period?

Another option is from Roger: we could add a flag meaning "ignore this consensus; it is a poison consensus to kill old Tor versions." And maybe we could have it signed only by keys that the current clients won't accept. And we could serve it to old clients rather than serving them the real consensus. And we could give it a really high expiration time. New clients wouldn't believe it. We'd need to flesh this out.
Another option is also from Roger: Tell new clients about new locations to fetch directories from. Keep the old locations working for as long as we want to support them. We'd need to flesh this out too.

The timeline above requires us to keep informed of the status of the different clients and relays attempting to connect to the tor network. We should make sure we'll actually be able to do so.

http://meetbot.debian.net/tor-dev/2016/tor-dev.2016-02-12-15.01.log.html has a fuller discussion of the above ideas.
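As a toy illustration of the idea in section 3.3, the following Python sketch models a parser that only understands 's' lines next to one that also accepts 'f'. It is a deliberately simplified assumption, not Tor's parser: real consensus entries carry many more fields, the names parse_flags and usable_relays are hypothetical, and the sketch assumes (as the proposal's argument does) that old parsers skip keywords they do not recognize. Whether real legacy clients then stay quiet is exactly the open question raised in "Next steps".

```python
def parse_flags(consensus_text, flag_keywords=("s",)):
    """Collect per-relay flags from a heavily simplified consensus body.

    Lines starting with 'r' begin a relay entry; lines whose keyword is in
    flag_keywords add flags to the current relay. Any other keyword is
    silently ignored, mimicking a parser that skips unknown lines.
    """
    routers = {}
    current = None
    for line in consensus_text.splitlines():
        keyword, _, rest = line.partition(" ")
        if keyword == "r" and rest:
            current = rest.split()[0]
            routers[current] = set()
        elif keyword in flag_keywords and current is not None:
            routers[current].update(rest.split())
    return routers


def usable_relays(routers):
    """Relays a client would consider up: those carrying the Running flag."""
    return sorted(name for name, flags in routers.items() if "Running" in flags)


# Hypothetical two-relay consensus fragments, old and new spelling.
OLD_STYLE = "r relayA\ns Fast Running Valid\nr relayB\ns Running Valid\n"
NEW_STYLE = OLD_STYLE.replace("\ns ", "\nf ")

if __name__ == "__main__":
    print(usable_relays(parse_flags(NEW_STYLE)))              # legacy view
    print(usable_relays(parse_flags(NEW_STYLE, ("s", "f"))))  # updated view
```

In this model the legacy view of the 'f'-spelled document contains no Running relays, while a parser that treats 'f' as a synonym for 's' sees both.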
Filename: 267-tor-consensus-transparency.txt
Title: Tor Consensus Transparency
Author: Linus Nordberg
Created: 2014-06-28
Status: Open

0. Introduction

This document describes how to provide and use public, append-only, verifiable logs containing Tor consensus and vote status documents, much like what Certificate Transparency [CT] does for TLS certificates, making it possible for log monitors to detect false consensuses and votes.

Tor clients and relays can refuse using a consensus not present in a set of logs of their choosing, as well as provide possible evidence of misissuance by submitting such a consensus to any number of logs.

1. Overview

Tor status documents, consensuses as well as votes, are stored in one or more public, append-only, externally verifiable logs using a history tree like the one described in [CrosbyWallach].

Consensus-users, i.e. Tor clients and relays, expect to receive one or more "proofs of inclusion" with new consensus documents. A proof of inclusion is a hash sum representing the tree head of a log, signed by the log's private key, and an audit path listing the nodes in the tree needed to recreate the tree head. Consensus-users are configured to use one or more logs by listing a log address and a public key for each log. This is enough for verifying that a given consensus document is present in a given log.

Submission of status documents to a log can be done by anyone with an internet connection (and the Tor network, in case of logs only on a .onion address). The submitter gets a signed tree head and a proof of inclusion in return. Directory authorities are expected to submit to one or more logs and include the proofs when serving consensus documents. Directory caches and consensus-users receiving a consensus not including a proof of inclusion may submit the document and use the proof they receive in return.

Auditing log behaviour and monitoring the contents of logs is performed in cooperation between the Tor network and external services.
Relays act as log auditors with help from Tor clients gossiping about what they see. Directory authorities are good candidates for monitoring log content since they know what votes they have sent and received as well as what consensus documents they have issued.

Anybody can run both an auditor and a monitor though, which is an important property of the proposed system.

2. Motivation

Popping a handful of boxes (currently five) or factoring the same number of RSA keys should not be ruled out as a possible attack against a subset of Tor users. An attacker controlling a majority of the directory authorities' signing keys can, using man-in-the-middle or man-on-the-side attacks, serve consensus documents listing relays under their control. If mounted on a small subset of Tor users on the internet, the chance of detection is probably low. Implementation of this proposal increases the cost for such an attack by raising the chances of it being detected.

Note that while the proposed solution gives each individual some degree of protection against using a false consensus, this is not the primary goal but more of a nice side effect. The primary goal is to detect correctly signed consensus documents which differ from the consensus of the directory authorities. This raises the risk of exposure for an attacker capable of producing such a consensus and feeding it to users.

The complexity of the proposed solution is motivated by the fact that the log key is not just another key on top of the directory authority keys, since the log doesn't have to be trusted. Another value is the decentralisation given -- anybody can run their own log and use it. Anybody can audit all existing logs and verify their correct behaviour. This empowers people outside the group of Tor directory authority operators and the people who trust them for one reason or the other.

3. Design

Communication with logs is done over HTTP using TLS or Tor onion services for transport, similar to what is defined in [rfc6962-bis-12].
Parameters for POSTs and all responses are encoded as name/value pairs in JSON objects [RFC4627].

Summary of proposed changes to Tor:

  - Configuration is added for listing known logs and for describing policy for using them.

  - Directory authorities start submitting newly created consensuses to at least one public log.

  - Tor clients and relays receiving a consensus not accompanied by a proof of inclusion start submitting that consensus to at least one public log.

  - Consensus-users start rejecting consensuses accompanied by an invalid proof of inclusion.

  - A new cell type LOG_STH is defined, for clients and relays to exchange information about seen tree heads and their validity.

  - Consensus-users send seen tree heads to relays acting as log auditors.

  - Relays acting as log auditors validate tree heads (section 3.2.2) received from consensus-users and send results back.

  - Consensus-users start rejecting consensuses for which valid proofs of inclusion can not be obtained.

Definitions:

  - Log id: The SHA-256 hash of the log's public key, to be treated as an opaque byte string identifying the log.

3.1. Consensus submission

Logs accept consensus submissions from anyone as long as the consensus is signed by a majority of the Tor directory authorities of the Tor network that it's logging.

Consensus documents are POSTed to a well-known URL as defined in section 5.2.

The output is what we call a proof of inclusion.

3.2. Verification

3.2.1. Log entry membership verification

Calculate a tree head from the hash of the received consensus and the audit path in the accompanying proof. Verify that the calculated tree head is identical to the tree head in the proof. This can easily be done by consensus-users for each received consensus.

We now know that the consensus is part of a tree which the log claims to be The Tree. Whether this tree is the same tree that everybody else sees is unknown at this point.

3.2.2.
Log consistency verification

Ask the log for a consistency proof between the tree head to verify and a previously known good tree head from the pool. Section 5.3 specifies how to fetch a consistency proof.

[[TBD require auditors to fetch and store the tree head for the empty tree as part of bootstrapping, in order to avoid the case where there's no older tree to verify against?]]

[[TODO description of verification of consistency goes here]]

Relays acting as auditors cache results to minimise calculations and communication with log servers.

[[TBD have clients verify consistency as well? NOTE: we still want relays to see tree heads in order to catch a lying log (the split-view attack)]]

We now know that the verified tree is a superset of a known good tree.

3.3. Log auditing

A log auditor verifies two things:

  - A log's append-only property, i.e. that no entries once accepted by a log are ever altered or removed.

  - That a log presents the same view to all of its users [[TODO describe the Tor network's role in auditing more than what's found in section 3.2.2]]

A log auditor typically doesn't care about the contents of the log entries, other than calculating their hash sums for auditing purposes.

Tor relays should act as log auditors.

3.4. Log monitoring

A log monitor downloads and investigates each entry in a log searching for anomalies according to its monitoring policy.

This document doesn't define monitoring policies but does outline a few strategies for monitoring in section [[TBD]].

Note that there can be more than one valid consensus document for a given point in time. One reason for this is that the number of signatures can differ due to consensus voting timing details. [[TODO Are there more reasons?]]

[[TODO expand on monitoring strategies -- even if this is not part of the proposed extensions to the Tor network it's good for understanding.
a) dirauths can verify consensus documents byte for byte; b) anyone can look for diffs larger than D per time T, where "diffs" certainly can be smarter than a plain text diff]]

3.5. Consensus-user behaviour

[[TODO move most of this to section 5]]

Keep an on-disk cache of consensus documents. Mark them as being in one of three states:

  LOG_STATE_UNKNOWN -- don't know whether it's present in enough logs or not
  LOG_STATE_LOGGED -- have seen good proof(s) of inclusion
  LOG_STATE_LOGGED_GOOD -- confident about the tree head representing a good tree

Newly arrived consensus documents start in UNKNOWN or LOGGED depending on whether they are accompanied by enough proofs or not.

There are two possible state transitions:

  - UNKNOWN --> LOGGED: When enough correctly verifying proofs of inclusion (section 3.2.1) have been seen. The number of good proofs required is a policy setting in the configuration of the consensus-user.

  - LOGGED --> LOGGED_GOOD: When the tree heads in enough of the inclusion proofs have been verified (section 3.2.2) or enough LOG_STH cells vouching for the same tree heads have been seen. The number of verifications required is a policy setting in the configuration of the consensus-user.

Consensuses in state UNKNOWN are not used but are instead submitted to one or more logs. If the submission succeeds, this will take the consensus to state LOGGED.

Consensuses in state LOGGED are used despite not being fully verified with regard to logging. LOG_STH cells containing tree heads from received proofs are sent to relays for verification. Clients send to all relays that they have a circuit to, i.e. their guard relay(s). Relays send to three random relays that they have a circuit to.

3.6. Relay behaviour when acting as an auditor

In order to verify the append-only property of a log, relays acting as log auditors verify the consistency of tree heads received in LOG_STH cells.
An auditor keeps a copy of 2+N known good tree heads in a pool stored on persistent media [[TBD where N is either a fixed number in the range 32-128 or is a function of the log size]]. Two of them are the oldest and newest tree heads seen, respectively. The rest, N, are randomly chosen from the tree heads seen.

[[TODO describe or refer to an algorithm for "randomly chosen", hopefully not subject to flushing attacks (or other attacks)]].

3.7. Notable differences from Certificate Transparency

  - The data logged is "strictly time-stamped", i.e. ordered.

  - Much shorter lifetime of logged data -- a day rather than a year. Are the effects of this difference of importance only for "one-shot attacks"?

  - Directory authorities have consensus about what they're signing -- there are no "web sites knowing better".

  - Submitters are not in the same hurry as CAs and can wait minutes rather than seconds for a proof of inclusion.

4. Security implications

TODO

5. Specification

5.0. Data structures

Data structures are defined as described in [RFC5246] section 4, i.e. TLS 1.2 presentation language. While it is tempting to try to avoid yet another format, the cost of redefining the data structures in [rfc6962-bis-12] outweighs this consideration. The burden of redefining, reimplementing and testing is especially heavy for those structures which need precise definitions because they are to be signed.

5.1. Signed Tree Head (STH)

An STH is a TransItem structure of type "signed_tree_head" as defined in [rfc6962-bis-12] section 5.8.

5.2. Submitting a consensus document to a log

  POST https://<log server>/tct/v1/add-consensus

  Input:

    consensus: A consensus status document as defined in [dir-spec] section 3.4.1 [[TBD gzipped and base64 encoded to save 50%?]]

  Output:

    sth: A signed tree head as defined in section 5.1 referring to a tree in which the submitted document is included.

    inclusion: An inclusion proof as specified for the "inclusion" output in [rfc6962-bis-12] section 6.5.

5.3.
Getting a consistency proof from a log

  GET https://<log server>/tct/v1/get-sth-consistency

  Input and output as specified in [rfc6962-bis-12] section 6.4.

5.x. LOG_STH cells

A LOG_STH cell is a variable-length cell with the following fields:

  TBDname [TBD octets]
  TBDname [TBD octets]
  TBDname [TBD octets]

6. Compatibility

TBD

7. Implementation

TBD

8. Performance and scalability notes

TBD

A. Open issues / TODOs

  - TODO: Add SCTs from CT, at least as a practical "cookie" (i.e. no need to send them around or include them anywhere). Logs should be given more time for distributing than we're willing to wait on an HTTP response for.

  - TODO: explain why no hash function and signing algorithm agility, [rfc6962-bis-12] section 10

  - TODO: add a blurb about the values of publishing logs as onion services

  - TODO: discuss compromise of log keys

B. Acknowledgements

This proposal leans heavily on [rfc6962-bis-12]. Some definitions are copied verbatim from that document. Valuable feedback has been received from Ben Laurie, Karsten Loesing and Ximin Luo.

C. References

  [CrosbyWallach] http://static.usenix.org/event/sec09/tech/full_papers/crosby.pdf
  [dir-spec] https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt
  [RFC4627] https://tools.ietf.org/html/rfc4627
  [rfc6962-bis-12] https://datatracker.ietf.org/doc/draft-ietf-trans-rfc6962-bis/12
  [CT] https://www.certificate-transparency.org/
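The membership check of section 3.2.1 (recompute a tree head from the hash of the consensus plus the audit path, then compare it with the tree head in the proof) can be sketched in Python as follows. This is a minimal sketch under assumptions, not the proposal's specification: it uses the SHA-256 leaf and node hashing of the Certificate Transparency Merkle tree (0x00 and 0x01 prefixes), treats the raw consensus bytes as the leaf input rather than a TransItem structure, and omits checking the log's signature on the tree head. The helper names tree_head, audit_path, and verify_inclusion are hypothetical; the first two only exist so the example is self-contained.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly smaller than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def tree_head(entries) -> bytes:
    """Merkle tree hash over a list of byte strings (log side, for the demo)."""
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = _split(len(entries))
    return node_hash(tree_head(entries[:k]), tree_head(entries[k:]))


def audit_path(index: int, entries) -> list:
    """Sibling hashes from the leaf at `index` up to the root (log side)."""
    if len(entries) <= 1:
        return []
    k = _split(len(entries))
    if index < k:
        return audit_path(index, entries[:k]) + [tree_head(entries[k:])]
    return audit_path(index - k, entries[k:]) + [tree_head(entries[:k])]


def verify_inclusion(entry: bytes, index: int, tree_size: int,
                     path, expected_head: bytes) -> bool:
    """Consensus-user side: rebuild the tree head from entry + audit path."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(entry)
    for p in path:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = node_hash(p, r)
            # Skip levels where this node has no right sibling.
            while fn and not fn & 1:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == expected_head


if __name__ == "__main__":
    log = [b"consensus-%d" % i for i in range(5)]  # stand-in documents
    head = tree_head(log)
    print(verify_inclusion(log[3], 3, len(log), audit_path(3, log), head))
```

A successful check only shows that the document is in the tree the log presented; the consistency verification of section 3.2.2 is still needed to learn whether that tree is the one everybody else sees.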
Filename: 268-guard-selection.txt
Title: New Guard Selection Behaviour
Author: Isis Lovecruft, George Kadianakis, [Ola Bini]
Created: 2015-10-28
Status: Obsolete

(Editorial note: this was originally written as a revision of proposal 259, but it diverges so substantially that it seemed better to assign it a new number for reference, so that we aren't always talking about "The old 259" and "the new 259". -NM)

This proposal has been obsoleted by proposal #271.

§1. Overview

Tor uses entry guards to prevent an attacker who controls some fraction of the network from observing a fraction of every user's traffic. If users chose their entries and exits uniformly at random from the list of servers every time they build a circuit, then an adversary who had (k/N) of the network would deanonymize F=(k/N)^2 of all circuits... and after a given user had built C circuits, the attacker would see them at least once with probability 1-(1-F)^C. With large C, the attacker would get a sample of every user's traffic with probability 1.

To prevent this from happening, Tor clients choose a small number of guard nodes (currently 3). These guard nodes are the only nodes that the client will connect to directly. If they are not compromised, the user's paths are not compromised.

But attacks remain. Consider an attacker who can run a firewall between a target user and the Tor network, and make many of the guards they don't control appear to be unreachable. Or consider an attacker who can identify a user's guards, and mount denial-of-service attacks on them until the user picks a guard that the attacker controls.

In the presence of these attacks, we can't continue to connect to the Tor network unconditionally. Doing so would eventually result in the user choosing a hostile node as their guard, and losing anonymity.
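The estimate at the start of this overview, F = (k/N)^2 and 1-(1-F)^C, is easy to evaluate numerically. The sketch below is only a worked example of that arithmetic; the function name and the sample values for k, N, and C are hypothetical, and it models the no-guards scenario in which entries and exits are chosen uniformly at random for every circuit.

```python
def compromise_probability(k: int, n: int, circuits: int) -> float:
    """Chance that an adversary holding k of n relays sees at least one
    of `circuits` circuits at both ends, with uniform entry/exit choice."""
    f = (k / n) ** 2          # fraction of circuits deanonymized
    return 1 - (1 - f) ** circuits


if __name__ == "__main__":
    # Hypothetical adversary controlling 10% of the relays.
    for c in (1, 100, 1000):
        print(c, round(compromise_probability(100, 1000, c), 4))
```

With a 10% adversary, a single circuit is compromised with probability 0.01, but after 100 circuits the chance of at least one compromise is already roughly 0.63, which is the growth in C that motivates using guards.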
  This proposal outlines a new entry guard selection algorithm, which
  addresses the following concerns:

    - Heuristics and algorithms for determining how and which guards
      are chosen should be kept as simple and easy to understand as
      possible.

    - Clients in censored regions or who are behind a fascist
      firewall who connect to the Tor network should not experience
      any significant disadvantage in terms of reachability or
      usability.

    - Tor should make a best attempt at discovering the most
      appropriate behaviour, with as little user input and
      configuration as possible.

§2. Design

  Alice, an OP attempting to connect to the Tor network, should
  undertake the following steps to determine information about the
  local network and to select (some) appropriate entry guards. In the
  following scenario, it is assumed that Alice has already obtained a
  recent, valid, and verifiable consensus document.

  The algorithm is divided into four components such that the full
  algorithm is implemented by first invoking START, then repeatedly
  calling NEXT for as long as SHOULD_CONTINUE advises continuing, and
  finally calling END. For an example usage see §A. Appendix.

  Several components of NEXT can be invoked asynchronously.
  SHOULD_CONTINUE lets the algorithm tell the caller whether it
  considers the work done or not. This can be used, for example, to
  retry primary guards when we are finally able to connect to a guard
  after a long network outage.

  This algorithm keeps track of the unreachability status for guards
  in state global to the system, so that repeated runs will not have
  to rediscover unreachability over and over again. However, this
  state does not need to be persisted permanently - it is purely an
  optimization.

  The algorithm expects several arguments to guide its behavior.
  These will be defined in §2.1.

  The goal of this algorithm is to strongly prefer connecting to the
  same guards we have connected to before, while also trying to detect
  conditions such as a network outage.
The way it does this is by keeping track of how many guards we have exposed ourselves to, and if we have connected to too many we will fall back to only retrying the ones we have already tried. The algorithm also decides on sample set that should be persisted - in order to minimize the risk of an attacker forcing enumeration of the whole network by triggering rebuilding of circuits. §2.1. Definitions Bad guard: a guard is considered bad if it conforms with the function IS_BAD (see §G. Appendix for details). Dead guard: a guard is considered dead if it conforms with the function IS_DEAD (see §H. Appendix for details). Obsolete guard: a guard is considered obsolete if it conforms with the function IS_OBSOLETE (see §I. Appendix for details). Live entry guard: a guard is considered live if it conforms with the function IS_LIVE (see §D. Appendix for details). §2.1. The START algorithm In order to start choosing an entry guard, use the START algorithm. This takes four arguments that can be used to fine tune the workings: USED_GUARDS This is a list that contains all the guards that have been used before by this client. We will prioritize using guards from this list in order to minimize our exposure. The list is expected to be sorted based on priority, where the first entry will have the highest priority. SAMPLED_GUARDS This is a set that contains all guards that should be considered for connection. This set should be persisted between runs. It should be filled by using NEXT_BY_BANDWIDTH with GUARDS as an argument if it's empty, or if it contains less than SAMPLE_SET_THRESHOLD guards after winnowing out older guards. N_PRIMARY_GUARDS The number of guards we should consider our primary guards. These guards will be retried more frequently and will take precedence in most situations. By default the primary guards will be the first N_PRIMARY_GUARDS guards from USED_GUARDS. 
      When the algorithm is used in constrained mode (bridges or entry
      nodes are set in the configuration file), this value should be
      1; otherwise the proposed value is 3.

  DIR
      If this argument is set, we should only consider guards that
      can be directory guards. If not set, we will consider all
      guards.

  The primary work of START is to initialize the state machine
  depicted in §2.2. The initial state of the machine is defined by:

  GUARDS
      This is a set of all guards from the consensus. It will
      primarily be used to fill in SAMPLED_GUARDS.

  FILTERED_SAMPLED
      This is a set that contains all guards that we are willing to
      connect to. It will be obtained from calling FILTER_SET with
      SAMPLED_GUARDS as argument.

  REMAINING_GUARDS
      This is a running set of the guards we have not yet tried to
      connect to. It should be initialized to be FILTERED_SAMPLED
      without USED_GUARDS.

  STATE
      A variable that keeps track of which state in the state machine
      we are currently in. It should be initialized to
      STATE_PRIMARY_GUARDS.

  PRIMARY_GUARDS
      This list keeps track of our primary guards. These are guards
      that we will prioritize when trying to connect, and will also
      retry more often in case of failure with other guards. It
      should be initialized by calling algorithm NEXT_PRIMARY_GUARD
      repeatedly until PRIMARY_GUARDS contains N_PRIMARY_GUARDS
      elements.

§2.2. The NEXT algorithm

  The NEXT algorithm is composed of several different possible flows.
  The first one is a simple state machine that can transfer between
  two different states. Every time NEXT is invoked, it will resume at
  the state where it left off previously. In the course of selecting
  an entry guard, a new consensus can arrive. When that happens we
  need to update the data structures used, but nothing else should
  change.

  Before jumping into the state machine, we should first check whether
  at least PRIMARY_GUARDS_RETRY_INTERVAL minutes have passed since we
  last tried any of the PRIMARY_GUARDS.
  If this is the case, and we are not in STATE_PRIMARY_GUARDS, we
  should save the previous state and set the state to
  STATE_PRIMARY_GUARDS.

§2.2.1. The STATE_PRIMARY_GUARDS state

  Return each entry in PRIMARY_GUARDS in turn. For each entry, if the
  guard should be retried and is considered suitable, use it. A guard
  is considered eligible for retry if it is marked for retry, or if it
  is live and not bad. A guard is considered suitable if it is live
  and, if it is a directory, it should not be a cache.

  If all entries have been tried, transition to STATE_TRY_REMAINING.

§2.2.2. The STATE_TRY_REMAINING state

  Return each entry in USED_GUARDS that is not in PRIMARY_GUARDS in
  turn. For each entry, if a guard is found, return it.

  Return each entry from REMAINING_GUARDS in turn. For each entry, if
  the guard should be retried and is considered suitable, use it and
  mark it as unreachable. A guard is considered eligible for retry if
  it is marked for retry, or if it is live and not bad. A guard is
  considered suitable if it is live and, if it is a directory, it
  should not be a cache.

  If no entries remain in REMAINING_GUARDS, transition to
  STATE_PRIMARY_GUARDS.

§2.2.3. ON_NEW_CONSENSUS

  First, ensure that all guard profiles are updated with information
  about whether they were in the newest consensus or not.

  Update the bad status for all guards in USED_GUARDS and
  SAMPLED_GUARDS. Remove all dead guards from USED_GUARDS and
  SAMPLED_GUARDS. Remove all obsolete guards from USED_GUARDS and
  SAMPLED_GUARDS.

§2.3. The SHOULD_CONTINUE algorithm

  This algorithm takes as an argument a boolean indicating whether the
  circuit was successfully built or not.

  After the caller has tried to build a circuit with a returned guard,
  they should invoke SHOULD_CONTINUE to find out whether the algorithm
  is finished or not. SHOULD_CONTINUE will always return true if the
  circuit failed.
  If the circuit succeeded, SHOULD_CONTINUE will always return false,
  unless the guard that succeeded was the first guard to succeed after
  INTERNET_LIKELY_DOWN_INTERVAL minutes - in that case it will set the
  state to STATE_PRIMARY_GUARDS and return true.

§2.4. The END algorithm

  The goal of this algorithm is simply to make sure that we keep track
  of successful connections made. This algorithm should be invoked
  with the guard that was used to correctly set up a circuit.

  Once invoked, this algorithm will mark the guard as used, and make
  sure it is in USED_GUARDS, by adding it at the end if it was not
  there.

§2.5. Helper algorithms

  These algorithms are used in the above algorithms, but have been
  separated out here in order to make the flow clearer.

  NEXT_PRIMARY_GUARD
      - Return the first entry from USED_GUARDS that is not in
        PRIMARY_GUARDS and that is in the most recent consensus.
      - If USED_GUARDS is empty, use NEXT_BY_BANDWIDTH with
        REMAINING_GUARDS as the argument.

  NEXT_BY_BANDWIDTH
      - Takes G as an argument, which should be a set of guards to
        choose from.
      - Return a randomly selected element from G, weighted by
        bandwidth.

  FILTER_SET
      - Takes G as an argument, which should be a set of guards to
        filter.
      - Filter out guards in G that don't comply with IS_LIVE (see
        §D. Appendix for details).
      - If the filtered set is smaller than
        MINIMUM_FILTERED_SAMPLE_SIZE and G is smaller than
        MAXIMUM_SAMPLE_SIZE_THRESHOLD, expand G and filter again. G
        is expanded by adding one new guard at a time using
        NEXT_BY_BANDWIDTH with GUARDS as an argument.
      - If G is not smaller than MAXIMUM_SAMPLE_SIZE_THRESHOLD, G
        should not be expanded. Abort execution of this function by
        returning null and report an error to the user.

§3. Consensus Parameters, & Configurable Variables

  This proposal introduces several new parameters that ideally should
  be set in the consensus but that should also be possible to set or
  override in the client configuration file.
Some of these have proposed values, but for others more simulation and trial needs to happen. PRIMARY_GUARDS_RETRY_INTERVAL In order to make it more likely we connect to a primary guard, we would like to retry the primary guards more often than other types of guards. This parameter controls how many minutes should pass before we consider retrying primary guards again. The proposed value is 3. SAMPLE_SET_THRESHOLD In order to allow us to recognize completely unreachable network, we would like to avoid connecting to too many guards before switching modes. We also want to avoid exposing ourselves to too many nodes in a potentially hostile situation. This parameter, expressed as a fraction, determines the number of guards we should keep as the sampled set of the only guards we will consider connecting to. It will be used as a fraction for the sampled set. If we assume there are 1900 guards, a setting of 0.02 means we will have a sample set of 38 guards. This limits our total exposure. Proposed value is 0.02. MINIMUM_FILTERED_SAMPLE_SIZE The minimum size of the sampled set after filtering out nodes based on client configuration (FILTERED_SAMPLED). Proposed value is ???. MAXIMUM_SAMPLE_SIZE_THRESHOLD In order to guarantee a minimum size of guards after filtering, we expand SAMPLED_GUARDS until a limit. This fraction of GUARDS will be used as an upper bound when expanding SAMPLED_GUARDS. Proposed value is 0.03. INTERNET_LIKELY_DOWN_INTERVAL The number of minutes since we started trying to find an entry guard before we should consider the network down and consider retrying primary guards before using a functioning guard found. Proposed value 5. §4. Security properties and behavior under various conditions Under normal conditions, this algorithm will allow us to quickly connect and use guards we have used before with high likelihood of working. Assuming the first primary guard is reachable and in the consensus, this algorithm will deterministically always return that guard. 
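  The sample-size figures quoted in §3 (38 guards out of an assumed
  1900 at a threshold of 0.02) follow directly from the proposed
  fractions. The sketch below reproduces that arithmetic. The class and
  method names are hypothetical, and rounding down to a whole number of
  guards is an assumption: the proposal does not say how a fractional
  result should be rounded.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class GuardSelectionParams:
    """Proposed values from §3; exact fractions avoid float rounding."""
    primary_guards_retry_interval_minutes: int = 3
    internet_likely_down_interval_minutes: int = 5
    sample_set_threshold: Fraction = Fraction("0.02")
    maximum_sample_size_threshold: Fraction = Fraction("0.03")

    def sample_set_size(self, n_guards: int) -> int:
        # Assumption: round down to a whole number of guards.
        return int(self.sample_set_threshold * n_guards)

    def maximum_sample_size(self, n_guards: int) -> int:
        # Upper bound used when expanding SAMPLED_GUARDS (same assumption).
        return int(self.maximum_sample_size_threshold * n_guards)


if __name__ == "__main__":
    params = GuardSelectionParams()
    n = 1900  # guard count assumed in the text
    print(params.sample_set_size(n), params.maximum_sample_size(n))
```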
  Under dystopic conditions (when a firewall is in place that blocks
  all ports except for potentially port 80 and 443), this algorithm
  will try to connect to 2% of all guards before switching modes to
  try dystopic guards. Currently, that means trying to connect to
  circa 40 guards before getting a successful connection. If we assume
  a connection attempt takes at most 10 seconds, that means it will
  take roughly 6 to 7 minutes to get a working connection.

  When the network is completely down, we will try to connect to 2% of
  all guards plus 2% of all dystopic guards before realizing we are
  down. This means circa 50 guards tried, assuming there are 1900
  guards in the network.

  In terms of exposure, we will connect to a maximum of 2% of all
  guards plus 2% of all dystopic guards, or 3% of all guards,
  whichever is lower. If N is the number of guards, and k is the
  number of guards an attacker controls, that means an attacker would
  have a probability of 1-(1-(k/N)^2)^(N * 0.03) to have one of their
  guards selected before we fall back. In real terms, this means an
  attacker would need to control over 10% of all guards in order to
  have a larger than 50% chance of controlling a guard for any given
  client.

  In addition, since the sampled set changes slowly (the suggestion
  here is that guards in it expire every month) it is not possible for
  an attacker to force a connection to an entry guard that isn't
  already in the user's sampled set.

§A. Appendix: An example usage

  In order to clarify how this algorithm is supposed to be used, this
  pseudo code illustrates the building of a circuit:

    ESTABLISH_CIRCUIT:

      if chosen_entry_node = NULL
        if context = NULL
          context = ALGO_CHOOSE_ENTRY_GUARD_START(used_guards,
                        sampled_guards=[], options,
                        n_primary_guards=3, dir=false,
                        guards_in_consensus)

        chosen_entry_node = ALGO_CHOOSE_ENTRY_GUARD_NEXT(context)
        if not IS_SUITABLE(chosen_entry_node)
          try another entry guard

      circuit = composeCircuit(chosen_entry_node)
      return circuit

    ON_FIRST_HOP_CALLBACK(channel):

      if !SHOULD_CONTINUE:
        ALGO_CHOOSE_ENTRY_GUARD_END(entryGuard)
      else
        chosen_entry_node = NULL

§B. Appendix: Entry Points in Tor

  In order to clarify how this algorithm is supposed to be integrated
  with Tor, here are some entry points to trigger actions mentioned in
  the spec:

    When establish_circuit:

      If *chosen_entry_node* doesn't exist
        If *context* exists, populate the first one as *context*
        Otherwise, use ALGO_CHOOSE_ENTRY_GUARD_START to initialize a
        new *context*.

        After this, when we want to choose_good_entry_server, we will
        use ALGO_CHOOSE_ENTRY_GUARD_NEXT to get a candidate.

      Use chosen_entry_node to build_circuit and handle_first_hop,
      return this circuit

    When entry_guard_register_connect_status(should_continue):

      if !should_continue:
        Call ALGO_CHOOSE_ENTRY_GUARD_END(chosen_entry_node)
      else:
        Set chosen_entry_node to NULL

    When new directory_info_has_arrived:

      Do ON_NEW_CONSENSUS

§C. Appendix: IS_SUITABLE helper function

  A guard is suitable if it satisfies all of the following conditions:

    - It's considered to be live, according to IS_LIVE.
    - It's a directory cache if a directory guard is requested.
    - It's not the chosen exit node.
    - It's not in the family of the chosen exit node.

  This conforms to the existing conditions in
  "populate_live_entry_guards()".

§D. Appendix: IS_LIVE helper function

  A guard is considered live if it satisfies all of the following
  conditions:

    - It's not disabled because of path bias issues
      (path_bias_disabled).
    - It was not observed to become unusable according to the
      directory or the user configuration (bad_since).
    - It's marked for retry (can_retry) or it's been unreachable for
      some time (unreachable_since) but enough time has passed since
      we last tried to connect to it (entry_is_time_to_retry).
    - It's in our node list, meaning it's present in the latest
      consensus.
    - It has a usable descriptor (either a routerdescriptor or a
      microdescriptor) unless a directory guard is requested.
    - It's a general-purpose router unless UseBridges is configured.
    - It's reachable by the configuration
      (fascist_firewall_allows_node).

  This conforms to the existing conditions in "entry_is_live()".

  A guard is observed to become unusable according to the directory or
  the user configuration if it satisfies any of the following
  conditions:

    - It's not in our node list, meaning it's not present in the
      latest consensus.
    - It's not currently running (is_running).
    - It's not a bridge and not a configured bridge
      (node_is_a_configured_bridge) and UseBridges is True.
    - It's not a possible guard and is not in EntryNodes and
      UseBridges is False.
    - It's in ExcludeNodes. Nevertheless this is ignored when loading
      from config.
    - It's not reachable by the configuration
      (fascist_firewall_allows_node).
    - It's disabled because of path bias issues (path_bias_disabled).

  This conforms to the existing conditions in
  "entry_guards_compute_status()".

§E. Appendix: UseBridges and Bridges configurations

  This is mutually exclusive with EntryNodes.

  If options->UseBridges OR options->EntryNodes:

    - guards = populate_live_entry_guards() - this is the "bridge
      flavour" of IS_SUITABLE as mentioned before.
    - return node_sl_choose_by_bandwidth(guards, WEIGHT_FOR_GUARD)

  This is "choose a guard from S by bandwidth weight".

  UseBridges and Bridges must be set together. Bridges go to
  bridge_list (via bridge_add_from_config()), but how is it used?
  learned_bridge_descriptor() adds the bridge to the global
  entry_guards if UseBridges = True.
  We either keep the existing global entry_guards OR incorporate
  bridges in the proposal (remove non bridges from USED_GUARDS, and
  REMAINING_GUARDS = bridges?)

  If UseBridges is set as true, we need to fill the SAMPLED_GUARDS
  with bridges specified and learned from consensus.

§F. Appendix: EntryNodes configuration

  This is mutually exclusive with Bridges.

  The global entry_guards will be updated with entries in EntryNodes
  (see entry_guards_set_from_config()).

  If EntryNodes is set, we need to fill the SAMPLED_GUARDS with
  EntryNodes specified in options.

§G. Appendix: IS_BAD helper function

  A guard is considered bad if it is not included in the newest
  consensus.

§H. Appendix: IS_DEAD helper function

  A guard is considered dead if it has been marked as bad for the
  ENTRY_GUARD_REMOVE_AFTER period (30 days), unless it has been
  disabled because of path bias issues (path_bias_disabled).

§I. Appendix: IS_OBSOLETE helper function

  A guard is considered obsolete if it was chosen by a Tor version we
  can't recognize or it was chosen more than GUARD_LIFETIME ago.
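  The three helper predicates from §G, §H and §I can be sketched as
  plain functions over a per-guard record. This is a minimal
  illustration under stated assumptions, not Tor's implementation: the
  GuardProfile type and its field names are hypothetical, timestamps
  are seconds, and GUARD_LIFETIME is passed in as a parameter because
  the proposal does not give it a value.

```python
from dataclasses import dataclass
from typing import Optional

# 30 days, per §H, expressed in seconds.
ENTRY_GUARD_REMOVE_AFTER = 30 * 24 * 60 * 60


@dataclass
class GuardProfile:
    """Hypothetical per-guard record; field names are illustrative."""
    in_latest_consensus: bool
    bad_since: Optional[float] = None      # when the guard was first marked bad
    path_bias_disabled: bool = False
    chosen_on: float = 0.0                 # when the guard was first chosen
    chosen_by_known_version: bool = True


def is_bad(guard: GuardProfile) -> bool:
    # §G: bad means absent from the newest consensus.
    return not guard.in_latest_consensus


def is_dead(guard: GuardProfile, now: float) -> bool:
    # §H: bad for the whole removal period, unless disabled for path bias.
    if guard.path_bias_disabled or guard.bad_since is None:
        return False
    return now - guard.bad_since >= ENTRY_GUARD_REMOVE_AFTER


def is_obsolete(guard: GuardProfile, now: float, guard_lifetime: float) -> bool:
    # §I: chosen by an unrecognized version, or chosen too long ago.
    if not guard.chosen_by_known_version:
        return True
    return now - guard.chosen_on > guard_lifetime
```

  ON_NEW_CONSENSUS (§2.2.3) would then update the bad status first and
  drop every guard for which is_dead or is_obsolete holds.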
Filename: 269-hybrid-handshake.txt Title: Transitionally secure hybrid handshakes Author: John Schanck, William Whyte, Zhenfei Zhang, Nick Mathewson, Isis Lovecruft, Peter Schwabe Created: 7 June 2016 Updated: 2 Sept 2016 Status: Needs-Revision 1. Introduction This document describes a generic method for integrating a post-quantum key encapsulation mechanism (KEM) into an ntor-like handshake. A full discussion of the protocol and its proof of security may be found in [SWZ16]. 1.1 Motivation: Transitional forward-secret key agreement All currently deployed forward-secret key agreement protocols are vulnerable to quantum cryptanalysis. The obvious countermeasure is to switch to a key agreement mechanism that uses post-quantum primitives for both authentication and confidentiality. This option should be explored, but providing post-quantum router authentication in Tor would require a new consensus method and new microdescriptor elements. Since post-quantum public keys and signatures can be quite large, this may be a very expensive modification. In the near future it will suffice to use a "transitional" key agreement protocol -- one that provides pre-quantum authentication and post-quantum confidentiality. Such a protocol is secure in the transition between pre- and post-quantum settings and provides forward secrecy against adversaries who gain quantum computing capabilities after session negotiation. 1.2 Motivation: Fail-safe plug & play for post-quantum KEMs We propose a modular design that allows any post-quantum KEM to be included in the handshake. As there may be some uncertainty as to the security of the currently available post-quantum KEMs, and their implementations, we ensure that the scheme safely degrades to ntor in the event of a complete break on the KEM. 2. Proposal 2.1 Overview We re-use the public key infrastructure currently used by ntor. Each server publishes a static Diffie-Hellman (DH) onion key. 
Each client is assumed to have a certified copy of each server's public onion key and each server's "identity digest". To establish a session key, we propose that the client send two ephemeral public keys to the server. The first is an ephemeral DH key, the second is an ephemeral public key for a post-quantum KEM. The server responds with an ephemeral DH public key and an encapsulation of a random secret under the client's ephemeral KEM key. The two parties then derive a shared secret from: 1) the static-ephemeral DH share, 2) the ephemeral-ephemeral DH share, 3) the encapsulated secret, 4) the transcript of their communication. 2.2 Notation Public, non-secret, values are denoted in UPPER CASE. Private, secret, values are denoted in lower case. We use multiplicative notation for Diffie-Hellman operations. 2.3 Parameters DH A Diffie-Hellman primitive KEM A post-quantum key encapsulation mechanism H A cryptographic hash function LAMBDA (bits) Pre-quantum bit security parameter MU (bits) 2*LAMBDA KEY_LEN (bits) Length of session key material to output H_LEN (bytes) Length of output of H ID_LEN (bytes) Length of server identity digest DH_LEN (bytes) Length of DH public key KEM_PK_LEN (bytes) Length of KEM public key KEM_C_LEN (bytes) Length of KEM ciphertext PROTOID (string) "hybrid-[DH]-[KEM]-[H]-[revision]" T_KEY (string) PROTOID | ":key" T_AUTH (string) PROTOID | ":auth" Note: [DH], [KEM], and [H] are strings that uniquely identify the primitive, e.g. "x25519" 2.4 Subroutines HMAC(key, msg): The pseudorandom function defined in [RFC2104] with H as the underlying hash function. EXTRACT(salt, secret): A randomness extractor with output of length >= MU bits. For most choices of H one should use the HMAC based randomness extractor defined in [RFC5869]: EXTRACT(salt, secret) := HMAC(salt, secret). If MU = 256 and H is SHAKE-128 with MU bit output, or if MU = 512 and H is SHAKE-256 with MU bit output, then one may instead define: EXTRACT(salt, secret) := H(salt | secret). 
EXPAND(seed, context, len): The HMAC based key expansion function defined in [RFC5869]. Outputs the first len bits of K = K_1 | K_2 | K_3 | ... where K_0 = empty string (zero bits) K_i = HMAC(seed, K_(i-1) | context | INT8(i)). Alternatively, an eXtendable Output Function (XOF) may be used. In which case, EXPAND(seed, context, len) = XOF(seed | context, len) DH_GEN() -> (x, X): Diffie-Hellman keypair generation. Secret key x, public key X. DH_MUL(P,x) -> xP: Scalar multiplication in the DH group of the base point P by the scalar x. KEM_GEN() -> (sk, PK): Key generation for KEM. KEM_ENC(PK) -> (m, C): Encapsulation, C, of a uniform random message m under public key PK. KEM_DEC(C, sk): Decapsulation of the ciphertext C with respect to the secret key sk. KEYID(A) -> A or H(A): For DH groups with long element presentations it may be desirable to identify a key by its hash. For typical elliptic curve groups this should be the identity map. 2.5 Handshake To perform the handshake, the client needs to know the identity digest and an onion key for the router. The onion key must be for the specified DH scheme (e.g. x25519). Call the router's identity digest "ID" and its public onion key "A". The following Client Init / Server Response / Client Finish sequence defines the hybrid-DH-KEM protocol. See Fig. 1 for a schematic depiction of the same operations. 
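  For reference, the HMAC based EXTRACT and EXPAND subroutines from
  section 2.4 can be sketched as follows. This is a minimal
  illustration rather than part of the proposal: it assumes H is
  SHA-256, restricts len to a multiple of 8 bits, and uses hypothetical
  function names.

```python
import hashlib
import hmac


def hmac_h(key: bytes, msg: bytes) -> bytes:
    """HMAC per RFC 2104, with H assumed to be SHA-256 for this sketch."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def extract(salt: bytes, secret: bytes) -> bytes:
    """EXTRACT(salt, secret) := HMAC(salt, secret)."""
    return hmac_h(salt, secret)


def expand(seed: bytes, context: bytes, length_bits: int) -> bytes:
    """First length_bits of K_1 | K_2 | ..., where K_0 is empty and
    K_i = HMAC(seed, K_(i-1) | context | INT8(i))."""
    if length_bits < 0 or length_bits % 8 != 0:
        raise ValueError("this sketch only handles whole bytes")
    n_bytes = length_bits // 8
    output = b""
    block = b""  # K_0 is the empty string
    counter = 1
    while len(output) < n_bytes:
        if counter > 255:
            raise ValueError("INT8 counter exhausted; length too large")
        block = hmac_h(seed, block + context + bytes([counter]))
        output += block
        counter += 1
    return output[:n_bytes]
```

  With these definitions the authentication key and the session key
  come from the same seed but different context strings, for example
  expand(seed, T_AUTH, MU) and expand(seed, T_KEY, KEY_LEN).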
- Client Init ------------------------------------------------------------

  The client generates ephemeral key pairs:

      x, X     := DH_GEN()
      esk, EPK := KEM_GEN()

  The client sends a CREATE cell with contents:

      ID          [ID_LEN bytes]
      KEYID(A)    [H_LEN bytes]
      X           [DH_LEN bytes]
      EPK         [KEM_PK_LEN bytes]

- Server Response --------------------------------------------------------

  The server generates an ephemeral DH keypair:

      y, Y := DH_GEN()

  The server computes the three secret shares:

      s0    := H(DH_MUL(X,a))
      s1    := DH_MUL(X,y)
      s2, C := KEM_ENC(EPK)

  The server extracts the seed:

      SALT   := ID | A | X | EPK
      secret := s0 | s1 | s2
      seed   := EXTRACT(SALT, secret)

  The server derives the authentication tag:

      verify     := EXPAND(seed, T_AUTH, MU)
      TRANSCRIPT := ID | A | X | EPK | Y | C | PROTOID
      AUTH       := HMAC(verify, TRANSCRIPT)

  The server sends a CREATED cell with contents:

      Y           [DH_LEN bytes]
      C           [KEM_C_LEN bytes]
      AUTH        [CEIL(MU/8) bytes]

- Client Finish ----------------------------------------------------------

  The client computes the three secret shares:

      s0 := H(DH_MUL(A,x))
      s1 := DH_MUL(Y,x)
      s2 := KEM_DEC(C, esk)

  The client then derives the seed:

      SALT   := ID | A | X | EPK
      secret := s0 | s1 | s2
      seed   := EXTRACT(SALT, secret)

  The client derives the authentication tag:

      verify     := EXPAND(seed, T_AUTH, MU)
      TRANSCRIPT := ID | A | X | EPK | Y | C | PROTOID
      AUTH       := HMAC(verify, TRANSCRIPT)

  The client verifies that AUTH matches the tag received from the
  server.

  If the authentication check fails the client aborts the session.

- Key derivation ---------------------------------------------------------

  Both parties derive the shared key from the seed:

      key := EXPAND(seed, T_KEY, KEY_LEN).

  --------------------------------------------------------------------------
   Fig. 1: The hybrid-DH-KEM handshake.
  --------------------------------------------------------------------------

   Initiator                    Responder with identity key ID
   ---------                    --------- and onion key A

   x, X := DH_GEN()
   esk, EPK := KEM_GEN()
   CREATE_DATA := ID | A | X | EPK

                 --- CREATE_DATA --->

                        y, Y := DH_GEN()
                        s0 := H(DH_MUL(X,a))
                        s1 := DH_MUL(X,y)
                        s2, C := KEM_ENC(EPK)
                        SALT := ID | A | X | EPK
                        secret := s0 | s1 | s2
                        seed := EXTRACT(SALT, secret)
                        verify := EXPAND(seed, T_AUTH, MU)
                        TRANSCRIPT := ID | A | X | EPK | Y | C | PROTOID
                        AUTH := HMAC(verify, TRANSCRIPT)
                        key := EXPAND(seed, T_KEY, KEY_LEN)
                        CREATED_DATA := Y | C | AUTH

                 <-- CREATED_DATA ---

   s0 := H(DH_MUL(A,x))
   s1 := DH_MUL(Y,x)
   s2 := KEM_DEC(C, esk)
   SALT := ID | A | X | EPK
   secret := s0 | s1 | s2
   seed := EXTRACT(SALT, secret)
   verify := EXPAND(seed, T_AUTH, MU)
   TRANSCRIPT := ID | A | X | EPK | Y | C | PROTOID

   assert AUTH == HMAC(verify, TRANSCRIPT)
   key := EXPAND(seed, T_KEY, KEY_LEN)
  --------------------------------------------------------------------------

3. Changes from ntor

  The hybrid-null handshake differs from ntor in a few ways. First
  there are some superficial differences.

  The protocol IDs differ:

      ntor          PROTOID    "ntor-curve25519-sha256-1",
      hybrid-null   PROTOID    "hybrid-x25519-null-sha256-1",

  and the context strings differ:

      ntor          T_MAC      PROTOID | ":mac",
      ntor          T_KEY      PROTOID | ":key_extract",
      ntor          T_VERIFY   PROTOID | ":verify",
      ntor          M_EXPAND   PROTOID | ":key_expand",
      hybrid-null   T_KEY      PROTOID | ":key",
      hybrid-null   T_AUTH     PROTOID | ":auth".

  Then there are significant differences in how the authentication tag
  (AUTH) and key (key) are derived. The following description uses the
  HMAC based definitions of EXTRACT and EXPAND.
In ntor the server computes secret_input := EXP(X,y) | EXP(X,a) | ID | A | X | Y | PROTOID seed := HMAC(T_KEY, secret_input) verify := HMAC(T_VERIFY, seed) auth_input := verify | ID | A | Y | X | PROTOID | "Server" AUTH := HMAC(T_MAC, auth_input) key := EXPAND(seed, M_EXPAND, KEY_LEN) In hybrid-null the server computes SALT := ID | A | X secret_input := H(EXP(X,a)) | EXP(X,y) seed := EXTRACT(SALT, secret_input) verify := EXPAND(seed, T_AUTH, MU) TRANSCRIPT := ID | A | X | Y | PROTOID AUTH := HMAC(verify, TRANSCRIPT) key := EXPAND(seed, T_KEY, KEY_LEN) First, note that hybrid-null hashes EXP(X,a). This is due to the fact that weaker assumptions were used to prove the security of hybrid-null than were used to prove the security of ntor. While this may seem artificial we recommend keeping it. Second, ntor uses fixed HMAC keys for all sessions. This is unlikely to be a real-world security issue, but it requires stronger assumptions about HMAC than if the order of the arguments were reversed. Finally, ntor uses a mixture of public and secret data in auth_input, whereas the equivalent term in hybrid-null is the public transcript. 4. Versions [XXX rewrite section w/ new versioning proposal] Recognized handshake types are: 0x0000 TAP -- the original Tor handshake; 0x0001 reserved 0x0002 ntor -- the ntor-x25519-sha256 handshake; Request for new handshake types: 0x010X hybrid-XX -- a hybrid of a x25519 handshake and a post-quantum key encapsulation mechanism where 0x0101 hybrid-null -- No post-quantum key encapsulation mechanism. 0x0102 hybrid-ees443ep2 -- Using NTRUEncrypt parameter set ntrueess443ep2 0x0103 hybrid-newhope -- Using the New Hope R-LWE scheme DEPENDENCY: Proposal 249: Allow CREATE cells with >505 bytes of handshake data 5. Bibliography [SWZ16] Schanck, J., Whyte, W., and Z. Zhang, "Circuit extension handshakes for Tor achieving forward secrecy in a quantum world", PETS 2016, DOI 10.1515/popets-2016-0037, June 2016. [RFC2104] Krawczyk, H., Bellare, M., and R. 
      Canetti, "HMAC: Keyed-Hashing for Message Authentication",
      RFC 2104, DOI 10.17487/RFC2104, February 1997

  [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
      Key Derivation Function (HKDF)", RFC 5869,
      DOI 10.17487/RFC5869, May 2010

A1. Instantiation with NTRUEncrypt

  This example uses the NTRU parameter set EES443EP2 [XXX cite] which
  is estimated at the 128 bit security level for both pre- and
  post-quantum settings.

  EES443EP2 specifies three algorithms:

      EES443EP2_GEN() -> (sk, PK),
      EES443EP2_ENCRYPT(m, PK) -> C,
      EES443EP2_DECRYPT(C, sk) -> m.

  The m parameter for EES443EP2_ENCRYPT can be at most 49 bytes.
  We define EES443EP2_MAX_M_LEN := 49.

  0x0102 hybrid-x25519-ees443ep2-shake128-1
  --------------------
    DH         := x25519
    KEM        := EES443EP2
    H          := SHAKE-128 with 256 bit output

    LAMBDA     := 128
    MU         := 256
    H_LEN      := 32
    ID_LEN     := 20
    DH_LEN     := 32
    KEM_PK_LEN := 615
    KEM_C_LEN  := 610
    KEY_LEN    := XXX

    PROTOID    := "hybrid-x25519-ees443ep2-shake128-1"
    T_KEY      := "hybrid-x25519-ees443ep2-shake128-1:key"
    T_AUTH     := "hybrid-x25519-ees443ep2-shake128-1:auth"

  Subroutines
  -----------
    HMAC(key, message)         := SHAKE-128(key | message, MU)
    EXTRACT(salt, secret)      := SHAKE-128(salt | secret, MU)
    EXPAND(seed, context, len) := SHAKE-128(seed | context, len)
    KEM_GEN()                  := EES443EP2_GEN()
    KEM_ENC(PK)                := (s, C)
                                  where s = RANDOMBYTES(EES443EP2_MAX_M_LEN)
                                  and C = EES443EP2_ENCRYPT(s, PK)
    KEM_DEC(C, sk)             := EES443EP2_DECRYPT(C, sk)

A2. Instantiation with NewHope

  [XXX write intro]

  0x0103 hybrid-x25519-newhope-shake128-1
  --------------------
    DH         := x25519
    KEM        := NEWHOPE
    H          := SHAKE-128 with 256 bit output

    LAMBDA     := 128
    MU         := 256
    H_LEN      := 32
    ID_LEN     := 20
    DH_LEN     := 32
    KEM_PK_LEN := 1824
    KEM_C_LEN  := 2048
    KEY_LEN    := XXX

    PROTOID    := "hybrid-x25519-newhope-shake128-1"
    T_KEY      := "hybrid-x25519-newhope-shake128-1:key"
    T_AUTH     := "hybrid-x25519-newhope-shake128-1:auth"

  Subroutines
  -----------
    HMAC(key, message)         := SHAKE-128(key | message, MU)
    EXTRACT(salt, secret)      := SHAKE-128(salt | secret, MU)
    EXPAND(seed, context, len) := SHAKE-128(seed | context, len)
    KEM_GEN()                  -> (sk, PK)
                                  where A_SEED := RANDOMBYTES(MU)
                                  (sk,B) := NEWHOPE_KEYGEN(A_SEED)
                                  PK := B | A_SEED
    KEM_ENC(PK)                -> NEWHOPE_ENCAPS(PK)
    KEM_DEC(C, sk)             -> NEWHOPE_DECAPS(C, sk)
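  The SHAKE-128 subroutines shared by both instantiations can be
  sketched with a standard XOF implementation. This is an illustrative
  sketch only: the function names are hypothetical, lengths are
  restricted to whole bytes, and the 256-bit session key length in the
  usage note is an assumption, since KEY_LEN is still marked XXX above.

```python
import hashlib

MU_BITS = 256  # MU from the parameter tables above
PROTOID = b"hybrid-x25519-newhope-shake128-1"
T_KEY = PROTOID + b":key"
T_AUTH = PROTOID + b":auth"


def shake128(data: bytes, out_bits: int) -> bytes:
    """SHAKE-128 with out_bits of output (whole bytes only in this sketch)."""
    if out_bits < 0 or out_bits % 8 != 0:
        raise ValueError("output length must be a whole number of bytes")
    return hashlib.shake_128(data).digest(out_bits // 8)


def hmac_shake(key: bytes, message: bytes) -> bytes:
    # HMAC(key, message) := SHAKE-128(key | message, MU)
    return shake128(key + message, MU_BITS)


def extract(salt: bytes, secret: bytes) -> bytes:
    # EXTRACT(salt, secret) := SHAKE-128(salt | secret, MU)
    return shake128(salt + secret, MU_BITS)


def expand(seed: bytes, context: bytes, length_bits: int) -> bytes:
    # EXPAND(seed, context, len) := SHAKE-128(seed | context, len)
    return shake128(seed + context, length_bits)
```

  Given a seed from extract(SALT, secret), a caller would derive
  verify = expand(seed, T_AUTH, MU_BITS) and, under the assumed key
  length, key = expand(seed, T_KEY, 256). Note that with these
  definitions HMAC and EXTRACT are the same function of the
  concatenated input.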
Filename: 270-newhope-hybrid-handshake.txt Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope Author: Isis Lovecruft, Peter Schwabe Created: 16 Apr 2016 Updated: 22 Jul 2016 Status: Obsolete Depends: prop#220 prop#249 prop#264 prop#270 §0. Introduction RebelAlliance is a post-quantum secure hybrid handshake, comprised of an alliance between the X25519 and NewHope key exchanges. NewHope is a post-quantum-secure lattice-based key-exchange protocol based on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid handshake for Tor, based on a combination of Tor's current NTor handshake and a shared key derived through a NewHope ephemeral key exchange. For further details on the NewHope key exchange, the reader is referred to "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and Schwabe [0][1]. For the purposes of brevity, we consider that NTor is currently the only handshake protocol in Tor; the older TAP protocol is ignored completely, due to the fact that it is currently deprecated and nearly entirely unused. §1. Motivation An attacker currently monitoring and storing circuit-layer NTor handshakes who later has the ability to run Shor's algorithm on a quantum computer will be able to break Tor's current handshake protocol and decrypt previous communications. It is unclear if and when such attackers equipped with large quantum computers will exist, but various estimates by researchers in quantum physics and quantum engineering give estimates of only 1 to 2 decades. Clearly, the security requirements of many Tor users include secrecy of their messages beyond this time span, which means that Tor needs to update the key exchange to protect against such attackers as soon as possible. §2. Design An initiator and responder, in parallel, conduct two handshakes: - An X25519 key exchange, as described in the description of the NTor handshake in Tor proposal #216. - A NewHope key exchange. 
The shared keys derived from these two handshakes are then concatenated and used as input to the SHAKE-256 extendable output function (XOF), as described in FIPS-PUB-202 [2], in order to produce a shared key of the desired length. The test vectors in §C assume that this key has a length of 32 bytes, but the use of a XOF allows arbitrary lengths, to easily support future updates of the symmetric primitives using the key. See also §3.3.1.

§3. Specification

§3.1. Notation

Let `a || b` be the concatenation of a with b.

Let `a^b` denote the exponentiation of a to the bth power.

Let `a == b` denote the equality of a with b, and vice versa.

Let `a := b` be the assignment of the value of b to the variable a.

Let `H(x)` be 32 bytes of output of the SHAKE-256 XOF (as described in FIPS-PUB-202) applied to message x.

Let X25519 refer to the curve25519-based key agreement protocol described in RFC7748 §6.1. [3]

Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do the appropriate manipulations when generating the secret key (clearing the low bits, twiddling the high bits). Additionally, EXP() MUST include the check for all-zero output due to the input point being of small order (cf. RFC7748 §6).

Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.

When representing an element of the Curve25519 subgroup as a byte string, use the standard (32-byte, little-endian, x-coordinate-only) representation for Curve25519 points.

Let `ID` be a router's identity key taken from the router microdescriptor. For relays possessing Ed25519 identity keys (cf. Tor proposal #220), this is a 32-byte string representing the public Ed25519 identity key. For backwards and forwards compatibility with routers which do not possess Ed25519 identity keys, this is a 32-byte string created via the output of H(ID).

We refer to the router as the handshake "responder", and the client (which may be an OR or an OP) as the "initiator".

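Two pieces of the notation above are easy to make concrete. The sketch below shows H(x) as 32 bytes of SHAKE-256 output, and the secret-key manipulation that X25519_KEYGEN() alludes to, using the scalar clamping from RFC 7748 §5. The function names are hypothetical, and computing the matching public key (a scalar multiplication by the base point 9) is left to an X25519 library.

```python
import hashlib
import os


def H(x: bytes) -> bytes:
    """32 bytes of SHAKE-256 output applied to message x."""
    return hashlib.shake_256(x).digest(32)


def clamp_x25519_scalar(raw: bytes) -> bytes:
    """Clamp 32 random bytes into an X25519 secret scalar (RFC 7748 §5)."""
    if len(raw) != 32:
        raise ValueError("X25519 secret keys are 32 bytes")
    k = bytearray(raw)
    k[0] &= 248    # clear the three low bits
    k[31] &= 127   # clear the top bit
    k[31] |= 64    # set the second-highest bit
    return bytes(k)


# Example: a fresh secret scalar drawn from the system RNG.
secret_scalar = clamp_x25519_scalar(os.urandom(32))
```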
    ID_LENGTH      [32 bytes]
    H_LENGTH       [32 bytes]
    G_LENGTH       [32 bytes]

    PROTOID  := "pqtor-x25519-newhope-shake256-1"
    T_MAC    := PROTOID || ":mac"
    T_KEY    := PROTOID || ":key_extract"
    T_VERIFY := PROTOID || ":verify"

    (X25519_SK, X25519_PK) := X25519_KEYGEN()

§3.2. Protocol

========================================================================================
|                                                                                      |
|                     Fig. 1: The NewHope-X25519 Hybrid Handshake.                     |
|                                                                                      |
|  Before the handshake the Initiator is assumed to know Z, a public X25519 key for    |
|  the Responder, as well as the Responder's ID.                                       |
----------------------------------------------------------------------------------------
|                                                                                      |
|  Initiator                           Responder                                       |
|                                                                                      |
|  SEED := H(randombytes(32))                                                          |
|  x, X := X25519_KEYGEN()                                                             |
|  a, A := NEWHOPE_KEYGEN(SEED)                                                        |
|  CLIENT_HDATA := ID || Z || X || A                                                   |
|                                                                                      |
|                              --- CLIENT_HDATA --->                                   |
|                                                                                      |
|                                      y, Y := X25519_KEYGEN()                         |
|                                      NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B)  |
|                                      M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A)            |
|                                      SERVER_HDATA := Y || AUTH || M                  |
|                                      sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY)        |
|                                                                                      |
|                              <-- SERVER_HDATA ----                                   |
|                                                                                      |
|  NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)                                      |
|  NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a)                                                |
|  sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY)                                            |
|                                                                                      |
========================================================================================

§3.2.1. The NTor Handshake

§3.2.1.1. Prologue

Take a router with identity ID. As setup, the router generates a secret key z, and a public onion key Z with:

    z, Z := X25519_KEYGEN()

The router publishes Z in its server descriptor in the "ntor-onion-key" entry. Henceforward, we refer to this router as the "responder".

§3.2.1.2. Initiator

To send a create cell, the initiator generates a keypair:

    x, X := X25519_KEYGEN()

and creates the NTor portion of a CREATE2V cell's HDATA section:

    CLIENT_NTOR := ID || Z || X                 [96 bytes]

The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in the event the responder OR has recently rotated keys, the responder can determine which keypair to use.

The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2), to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1) to the responder.

    CLIENT_NEWHOPE                                  [1824 bytes] (see §3.2.2)
    CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE   [1920 bytes]

If the responder does not respond with a CREATED2V cell, the initiator SHOULD NOT attempt to extend the circuit through the responder by sending fragmented EXTEND2 cells, since the responder's lack of support for CREATE2V cells is assumed to imply the responder also lacks support for fragmented EXTEND2 cells. Alternatively, for initiators with a sufficiently late consensus method, the initiator MUST check that the "proto" line in the responder's descriptor (cf. Tor proposal #264) advertises support for the "Relay" subprotocol version 3 (see §5).

§3.2.1.3. Responder

The responder generates a keypair y, Y := X25519_KEYGEN(), and does NTOR_SHAREDB() as follows:

    (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
      secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID
      NTOR_KEY := H(secret_input, T_KEY)
      verify := H(secret_input, T_VERIFY)
      auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
      AUTH := H(auth_input, T_MAC)

The responder creates a CREATED2V cell containing:

    SERVER_NTOR := Y || AUTH                        [64 bytes]
    SERVER_NEWHOPE                                  [2048 bytes] (see §3.2.2)
    SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE   [2112 bytes]

and sends this to the initiator.

§3.2.1.4. Finalisation

The initiator then checks Y is in G^* [see NOTE below], and does NTOR_SHAREDA() as follows:

    (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
      secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
      NTOR_KEY := H(secret_input, T_KEY)
      verify := H(secret_input, T_VERIFY)
      auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
      if AUTH == H(auth_input, T_MAC)
        return NTOR_KEY

Both parties now have a shared value for NTOR_KEY. They expand this into the keys needed for the Tor relay protocol.

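The byte counts quoted in §3.2.1 follow directly from the field lengths in §3.1 and the NewHope sizes listed in §3.2.2.1. The following sketch only re-derives that arithmetic; the variable names are chosen for illustration.

```python
# Field lengths from §3.1 (bytes).
ID_LENGTH = 32
H_LENGTH = 32
G_LENGTH = 32  # one X25519 public key

# NewHope sizes from §3.2.2.1 (bytes).
SEED_LENGTH = 32
NEWHOPE_POLY_LENGTH = 1792
NEWHOPE_REC_LENGTH = 256

# A packed polynomial holds n = 1024 coefficients of 14 bits each.
assert 1024 * 14 // 8 == NEWHOPE_POLY_LENGTH

# Client side: CLIENT_NTOR := ID || Z || X, CLIENT_NEWHOPE := NEWHOPE_MSGA.
CLIENT_NTOR = ID_LENGTH + G_LENGTH + G_LENGTH
CLIENT_NEWHOPE = NEWHOPE_POLY_LENGTH + SEED_LENGTH
CLIENT_HDATA = CLIENT_NTOR + CLIENT_NEWHOPE

# Server side: SERVER_NTOR := Y || AUTH, SERVER_NEWHOPE := NEWHOPE_MSGB.
SERVER_NTOR = G_LENGTH + H_LENGTH
SERVER_NEWHOPE = NEWHOPE_POLY_LENGTH + NEWHOPE_REC_LENGTH
SERVER_HDATA = SERVER_NTOR + SERVER_NEWHOPE

print(CLIENT_NTOR, CLIENT_NEWHOPE, CLIENT_HDATA)
print(SERVER_NTOR, SERVER_NEWHOPE, SERVER_HDATA)
```

Both payloads exceed the 505 bytes available in a CREATE2 cell's HDATA section, which is why the handshake depends on larger cells.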
[XXX We think we want to omit the final hashing in the production of NTOR_KEY here, and instead put all the inputs through SHAKE-256. --isis, peter]

[XXX We probably want to remove ID and B from the input to the shared key material, since they serve for authentication but, as pre-established "prologue" material to the handshake, they should not be used in attempts to strengthen the cryptographic suitability of the shared key. Also, their inclusion is implicit in the DH exponentiations. I should probably ask Ian about the reasoning for the original design choice. --isis]

§3.2.2. The NewHope Handshake

§3.2.2.1. Parameters & Mathematical Structures

Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to a polynomial, we mean an element of Rq.

    n := 1024
    q := 12289

    SEED         [32 Bytes]
    NEWHOPE_POLY [1792 Bytes]
    NEWHOPE_REC  [256 Bytes]
    NEWHOPE_KEY  [32 Bytes]

    NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
    NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)

§3.2.2.2. High-level Description of Newhope API Functions

For a description of internal functions, see §B.

    (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
      â  := gen_a(seed)
      s  := poly_getnoise()
      e  := poly_getnoise()
      ŝ  := poly_ntt(s)
      ê  := poly_ntt(e)
      b̂  := poly_pointwise(â, ŝ) + ê
      sp := poly_tobytes(ŝ)
      bp := poly_tobytes(b̂)
      return (sp, (bp || seed))

    (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
      s' := poly_getnoise()
      e' := poly_getnoise()
      e" := poly_getnoise()
      b̂  := poly_frombytes(bp)
      â  := gen_a(seed)
      ŝ' := poly_ntt(s')
      ê' := poly_ntt(e')
      û  := poly_pointwise(â, ŝ') + ê'
      v  := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
      r  := helprec(v)
      up := poly_tobytes(û)
      k  := rec(v, r)
      return ((up || r), k)

    NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
      û  := poly_frombytes(up)
      ŝ  := poly_frombytes(sp)
      v' := poly_invntt(poly_pointwise(û, ŝ))
      k  := rec(v', r)
      return k

When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further discussion.

§3.3. Key Expansion

The client and server derive a shared key, SHARED, by:

    HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
    SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey)

§3.3.1. Note on the Design Choice

The reader may wonder why one would use SHAKE-256 to produce a 256-bit output, since the security strength in bits for SHAKE-256 is min(d/2,256) for collision resistance and min(d,256) for first and second preimages, where d is the output length.

The reasoning is that we should be aiming for 256-bit security for all of our symmetric cryptography. One could then argue that we should just use SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an easy way to derive longer shared secrets in the future without requiring a new handshake. The construction is odd, but the future is bright. As we are already using SHAKE-256 for the 32-byte output hash, we are also using it for all other 32-byte hashes involved in the protocol.
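The key expansion of §3.3 can be sketched with hashlib as below. The 32-byte default output length is taken from the test-vector assumption in §2; the function name and the placeholder key values are hypothetical and stand in for real handshake outputs.

```python
import hashlib

HKDFID = b"THESE ARENT THE DROIDS YOURE LOOKING FOR"


def derive_shared(ntor_key: bytes, newhope_key: bytes, out_len: int = 32) -> bytes:
    """SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey), out_len bytes."""
    return hashlib.shake_256(HKDFID + ntor_key + newhope_key).digest(out_len)


# Placeholder inputs for illustration only; real values come from
# NTOR_SHAREDA/NTOR_SHAREDB and NEWHOPE_SHAREDA/NEWHOPE_SHAREDB.
ntor_key = bytes(32)
newhope_key = bytes(range(32))

shared_32 = derive_shared(ntor_key, newhope_key)
shared_64 = derive_shared(ntor_key, newhope_key, 64)

# An XOF produces one output stream, so a longer request extends the
# shorter one rather than replacing it.
assert shared_64[:32] == shared_32
```

This prefix behaviour is what makes longer secrets available without a new handshake. It also means that a longer output is not independent of a shorter one derived from the same inputs.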
Note that the only difference between SHA3-256 and SHAKE-256 with 32-byte output is one domain-separation byte.

[XXX why would you want 256-bit security for the symmetric side? Are you talking pre- or post-quantum security? --peter]

§4. Security & Anonymity Implications

This handshake protocol is one-way authenticated. That is, the server is authenticated, while the client remains anonymous.

The client MUST NOT cache and reuse SEED. Doing so gives non-trivial adversarial advantages w.r.t. all-for-the-price-of-one attacks during the caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA is reused for handshakes along the same circuit or multiple different circuits, an adversary conducting a sybil attack somewhere along the path(s) will be able to correlate the identity of the client across circuits or hops.

§5. Compatibility

Because our proposal requires both the client and server to send more than the 505 bytes possible within a CREATE2 cell's HDATA section, it depends upon the implementation of a mechanism for allowing larger CREATE cells (cf. Tor proposal #249).

We reserve the following handshake type for use in CREATE2V/CREATED2V and EXTEND2V/EXTENDED2V cells:

    0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE]

We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264 §5.3) to signify support for this handshake, and hence for the CREATE2V and fragmented EXTEND2 cells which it requires.

There are no additional entries or changes required within either router descriptors or microdescriptors to support this handshake method, due to the NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519 public keys already being included within the "ntor-onion-key" entry.

Add a "UseNewHopeKEX" configuration option and a corresponding consensus parameter to control whether clients prefer using this NewHope hybrid handshake or some previous handshake protocol.
If the configuration option is "auto", clients SHOULD obey the consensus parameter. The default configuration SHOULD be "auto" and the consensus value SHOULD initially be "0". §6. Implementation The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software implementations of NewHope, one C reference implementation and an optimized implementation using AVX2 vector instructions. Those implementations are available at [1]. Additionally, there are implementations in Go by Yawning Angel, available from [4] and in Rust by Isis Lovecruft, available from [5]. The software used to generate the test vectors in §C is based on the C reference implementation and available from: https://code.ciph.re/isis/newhope-tor-testvectors https://github.com/isislovecruft/newhope-tor-testvectors §7. Performance & Scalability The computationally expensive part in the current NTor handshake is the X25519 key-pair generation and the X25519 shared-key computation. The current implementation in Tor is a wrapper to support various highly optimized implementations on different architectures. On Intel Haswell processors, the fastest implementation of X25519, as reported by the eBACS benchmarking project [6], takes 169920 cycles for key-pair generation and 161648 cycles for shared-key computation; these add up to a total of 331568 cycles on each side (initiator and responder). The C reference implementation of NewHope, also benchmarked on Intel Haswell, takes 358234 cycles for the initiator and 402058 cycles for the Responder. The core computation of the proposed combination of NewHope and X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator and a slowdown by a factor of 2.2 for the Responder compared to the current NTor handshake. These numbers assume a fully optimized implementation of the NTor handshake and a C reference implementation of NewHope. 
With optimized implementations of NewHope, such as the one for Intel Haswell described in [0], the computational slowdown will be considerably smaller than a factor of 2. §8. References [0]: https://cryptojedi.org/papers/newhope-20160328.pdf [1]: https://cryptojedi.org/crypto/#newhope [2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061 [3]: https://tools.ietf.org/html/rfc7748#section-6.1 [4]: https://github.com/Yawning/newhope [5]: https://code.ciph.re/isis/newhopers [6]: http://bench.cr.yp.to §A. Cell Formats §A.1. CREATE2V Cells The client portion of the handshake should send CLIENT_HDATA, formatted into a CREATE2V cell as follows: CREATE2V { [2114 bytes] HTYPE := 0x0003 [2 bytes] HLEN := 0x0780 [2 bytes] HDATA := CLIENT_HDATA [1920 bytes] IGNORED := 0x00 [194 bytes] } [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the same number of bytes as SERVER_HDATA? --isis] §A.2. CREATED2V Cells The server responds to the client's CREATE2V cell with SERVER_HDATA, formatted into a CREATED2V cell as follows: CREATED2V { [2114 bytes] HLEN := 0x0800 [2 bytes] HDATA := SERVER_HDATA [2112 bytes] IGNORED := 0x00 [0 bytes] } §A.3. 
Fragmented EXTEND2 Cells When the client wishes to extend a circuit, the client should fragment CLIENT_HDATA into four EXTEND2 cells: EXTEND2 { NSPEC := 0x02 { [1 byte] LINK_ID_SERVER [22 bytes] XXX LINK_ADDRESS_SERVER [8 bytes] XXX } HTYPE := 0x0003 [2 bytes] HLEN := 0x0780 [2 bytes] HDATA := CLIENT_HDATA[0,461] [462 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := CLIENT_HDATA[462,954] [492 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := CLIENT_HDATA[955,1447] [492 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := 0x00[172] [172 bytes] } The client sends this to the server to extend the circuit from, and that server should format the fragmented EXTEND2 cells into a CREATE2V cell, as described in §A.1. §A.4. Fragmented EXTENDED2 Cells EXTENDED2 { NSPEC := 0x02 { [1 byte] LINK_ID_SERVER [22 bytes] XXX LINK_ADDRESS_SERVER [8 bytes] XXX } HTYPE := 0x0003 [2 bytes] HLEN := 0x0800 [2 bytes] HDATA := SERVER_HDATA[0,461] [462 bytes] } EXTENDED2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := SERVER_HDATA[462,954] [492 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := SERVER_HDATA[955,1447] [492 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := SERVER_HDATA[1448,1939] [492 bytes] } EXTEND2 { NSPEC := 0x00 [1 byte] HTYPE := 0xFFFF [2 bytes] HLEN := 0x0000 [2 bytes] HDATA := SERVER_HDATA[1940,2112] [172 bytes] } §B. 
NewHope Internal Functions gen_a(SEED): returns a uniformly random poly poly_getnoise(): returns a poly sampled from a centered binomial poly_ntt(poly): number-theoretic transform; returns a poly poly_invntt(poly): inverse number-theoretic transform; returns a poly poly_pointwise(poly, poly): pointwise multiplication; returns a poly poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly helprec(poly): returns a NEWHOPE_REC byte array rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY --- Description of the Newhope internal functions --- gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128 is considered a sequence of 16-bit little-endian integers. This sequence is used to initialize the coefficients of the returned polynomial from the least significant (coefficient of X^0) to the most significant (coefficient of X^1023) coefficient. For each of the 16-bit integers first eliminate the highest two bits (to make it a 14-bit integer) and then use it as the next coefficient if it is smaller than q=12289. Note that the amount of output required from SHAKE to initialize all 1024 coefficients of the polynomial varies depending on the input seed. Note further that this function does not process any secret data and thus does not need any timing-attack protection. poly_getnoise() first generates 4096 bytes of uniformly random data. This can be done by reading these bytes from the system's RNG; efficient implementations will typically only read a 32-byte seed from the system's RNG and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR mode). The output of the PRG is considered an array of 2048 16-bit integers r[0],...,r[2047]. The coefficients of the output polynomial are computed as HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW stands for Hamming weight. 
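The sampling step just described can be sketched as follows. This is an illustration only: it reads the random bytes directly from the system RNG rather than expanding a seed through a PRG, the byte order used to form the 16-bit integers is an assumption (it does not affect Hamming weights), and the code makes no attempt at constant-time behaviour.

```python
import os

N = 1024  # number of coefficients in a poly


def hamming_weight(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def poly_getnoise(random_bytes=None):
    """Return N coefficients HW(r[2i]) - HW(r[2i+1]) from 4096 random bytes."""
    if random_bytes is None:
        random_bytes = os.urandom(4 * N)
    if len(random_bytes) != 4 * N:
        raise ValueError("poly_getnoise needs exactly 4096 bytes")
    # Interpret the input as 2048 16-bit integers r[0], ..., r[2047].
    r = [int.from_bytes(random_bytes[2 * i:2 * i + 2], "little")
         for i in range(2 * N)]
    return [hamming_weight(r[2 * i]) - hamming_weight(r[2 * i + 1])
            for i in range(N)]


noise = poly_getnoise()
```

Each coefficient is the difference of two Hamming weights of 16-bit values, so it lies in the range -16 to 16.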
Note that the choice of RNG is a local decision; different implementations are free to use different RNGs. Note further that the output of this function is secret; the PRG (and the computation of HW) need to be protected against timing attacks.

poly_ntt(poly f): For a mathematical description of poly_ntt see [0]; a pseudocode description of a very naive in-place transformation of an input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the following code (all arithmetic on coefficients performed modulo q):

    psi = 7
    omega = 49

    for i in range(0,n):
      t[i] = f[i] * psi^i

    for i in range(0,n):
      f[i] = 0
      for j in range(0,n):
        f[i] += t[j] * omega^((i*j)%n)

Note that this is not how poly_ntt should be implemented if performance is an issue; in particular, efficient algorithms for the number-theoretic transform take time O(n*log(n)) and not O(n^2). Note further that all arithmetic in poly_ntt has to be protected against timing attacks.

poly_invntt(poly f): For a mathematical description of poly_invntt see [0]; a pseudocode description of a very naive in-place transformation of an input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the following code (all arithmetic on coefficients performed modulo q):

    invpsi = 8778;
    invomega = 1254;
    invn = 12277;

    for i in range(0,n):
      t[i] = f[i];

    for i in range(0,n):
      f[i]=0;
      for j in range(0,n):
        f[i] += t[j] * invomega^((i*j)%n)
      f[i] *= invpsi^i
      f[i] *= invn

Note that this is not how poly_invntt should be implemented if performance is an issue; in particular, efficient algorithms for the inverse number-theoretic transform take time O(n*log(n)) and not O(n^2). Note further that all arithmetic in poly_invntt has to be protected against timing attacks.

poly_pointwise(poly f, poly g) performs pointwise multiplication of the two polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ...
+ g1023*X^1023) it computes and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0, h1 = f1*g1,..., h1023 = f1023*g1023.

poly_tobytes(poly f) first reduces all coefficients of f modulo q, i.e., brings them to the interval [0,q-1]. Denote these reduced coefficients as f0,..., f1023; note that they all fit into 14 bits. The function then packs those coefficients into an array of 1792 bytes r[0],..., r[1791] in "packed little-endian representation", i.e.,

    r[0] = f[0] & 0xff;
    r[1] = (f[0] >> 8) | ((f[1] & 0x03) << 6)
    r[2] = (f[1] >> 2) & 0xff;
    r[3] = (f[1] >> 10) | ((f[2] & 0x0f) << 4)
    .
    .
    .
    r[1790] = (f[1022] >> 12) | ((f[1023] & 0x3f) << 2)
    r[1791] = f[1023] >> 6

Note that this function needs to be protected against timing attacks. In particular, avoid non-constant-time conditional subtractions (or other non-constant-time expressions) in the reduction modulo q of the coefficients.

poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives as input an array of 1792 bytes and converts it into the internal representation of a poly. Note that poly_frombytes does not need to check whether the coefficients are reduced modulo q or reduce coefficients modulo q. Note further that the function must not leak any information about its inputs through timing information, as it is also applied to the secret key of the initiator.

helprec(poly f) computes 256 bytes of reconciliation information from the input poly f. Internally, one byte of reconciliation information is computed from four coefficients of f by a function helprec4. Let the input polynomial f = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be r[0],...,r[255]. This output byte array is computed as

    r[0] = helprec4(f0,f256,f512,f768)
    r[1] = helprec4(f1,f257,f513,f769)
    r[2] = helprec4(f2,f258,f514,f770)
    .
    .
    .
    r[255] = helprec4(f255,f511,f767,f1023),

where helprec4 does the following:

    helprec4(x0,x1,x2,x3):
      b = randombit()
      r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
      r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
      return r

The function CVPD4 does the following:

    CVPD4(y0,y1,y2,y3):
      v00 = round(y0/2q)
      v01 = round(y1/2q)
      v02 = round(y2/2q)
      v03 = round(y3/2q)
      v10 = round((y0-1)/2q)
      v11 = round((y1-1)/2q)
      v12 = round((y2-1)/2q)
      v13 = round((y3-1)/2q)
      t  = abs(y0 - 2q*v00)
      t += abs(y1 - 2q*v01)
      t += abs(y2 - 2q*v02)
      t += abs(y3 - 2q*v03)
      if(t < 2q):
        v0 = v00
        v1 = v01
        v2 = v02
        v3 = v03
        k  = 0
      else
        v0 = v10
        v1 = v11
        v2 = v12
        v3 = v13
        k  = 1
      return (v0-v3,v1-v3,v2-v3,k+2*v3)

In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to the largest integer that does not exceed x; abs() returns the absolute value.

Note that all computations involved in helprec operate on secret data and must be protected against timing attacks.

rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from f and r. Specifically, it computes one bit of key from 4 coefficients of f and one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r = r[0],r[1],...,r[255]. Let the bytes of the output be k[0],...,k[31] and let the bits of the output be k0,...,k255, where

    k0 = k[0] & 0x01
    k1 = (k[0] >> 1) & 0x01
    k2 = (k[0] >> 2) & 0x01
    .
    .
    .
    k8 = k[1] & 0x01
    k9 = (k[1] >> 1) & 0x01
    .
    .
    .
    k255 = (k[31] >> 7)

The function rec computes k0,...,k255 as

    k0 = rec4(f0,f256,f512,f768,r[0])
    k1 = rec4(f1,f257,f513,f769,r[1])
    .
    .
    .
    k255 = rec4(f255,f511,f767,f1023,r[255])

The function rec4 does the following:

    rec4(y0,y1,y2,y3,r):
      r0 = r & 0x03
      r1 = (r >> 2) & 0x03
      r2 = (r >> 4) & 0x03
      r3 = (r >> 6) & 0x03
      return Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)

The function Decode does the following:

    Decode(v0,v1,v2,v3):
      t0 = round(v0/8q)
      t1 = round(v1/8q)
      t2 = round(v2/8q)
      t3 = round(v3/8q)
      t  = abs(v0 - 8q*t0)
      t += abs(v1 - 8q*t1)
      t += abs(v2 - 8q*t2)
      t += abs(v3 - 8q*t3)
      if(t > 1)
        return 1
      else
        return 0

§C. Test Vectors
Filename: 271-another-guard-selection.txt
Title: Another algorithm for guard selection
Author: Isis Lovecruft, George Kadianakis, Ola Bini, Nick Mathewson
Created: 2016-07-11
Supersedes: 259, 268
Status: Closed
Implemented-In: 0.3.0.1-alpha

0.0. Preliminaries

This proposal derives from proposals 259 and 268; it is meant to supersede both. It is in part a restatement of it, in part a simplification, and in part a refactoring so that it does not have the serialization problems noted by George Kadianakis. It makes numerous other small changes. Isis, George, and Ola should all get the credit for the well-considered ideas.

Whenever I say "Y is a subset of X" you can think in terms of "Y-membership is a flag that can be set on members of X" or "Y-membership is a predicate that can be evaluated on members of X."

"More work is needed." There's a to-do at the end of the document.

0.1. Notation: identifiers

We mention identifiers of these kinds:

    [SECTIONS]
    {INPUTS}, {PERSISTENT_DATA}, and {OPERATING_PARAMETERS}.
    {non_persistent_data}
    <states>.

Each named identifier receives a type where it is defined, and is used by reference later on.

I'm using this convention to make it easier to tell for certain whether every thingy we define is used, and vice versa.

1. Introduction and motivation

Tor uses entry guards to prevent an attacker who controls some fraction of the network from observing a fraction of every user's traffic. If users chose their entries and exits uniformly at random from the list of servers every time they build a circuit, then an adversary who had (k/N) of the network would deanonymize F=(k/N)^2 of all circuits... and after a given user had built C circuits, the attacker would see them at least once with probability 1-(1-F)^C. With large C, the attacker would get a sample of every user's traffic with probability 1.

To prevent this from happening, Tor clients choose a small number of guard nodes (currently 3).
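The probability argument above can be worked through numerically. The sketch below evaluates F=(k/N)^2 and 1-(1-F)^C for an adversary holding 10% of the network; the function names and the example figures are chosen purely for illustration.

```python
def compromised_circuit_fraction(k: int, n: int) -> float:
    """F = (k/N)^2: both entry and exit belong to the adversary."""
    return (k / n) ** 2


def prob_seen_at_least_once(k: int, n: int, circuits: int) -> float:
    """1 - (1 - F)^C: at least one of C circuits is fully observed."""
    f = compromised_circuit_fraction(k, n)
    return 1 - (1 - f) ** circuits


# Adversary controls 10 of every 100 relays (illustrative figures).
for c in (1, 100, 1000):
    print(c, round(prob_seen_at_least_once(10, 100, c), 4))
```

With these figures a single circuit is observed with probability 0.01, about 0.63 after 100 circuits, and more than 0.9999 after 1000, which is the sense in which the probability tends to 1 for large C.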
These guard nodes are the only nodes that the client will connect to directly. If they are not compromised, the user's paths are not compromised.

But attacks remain. Consider an attacker who can run a firewall between a target user and the Tor network, and make many of the guards they don't control appear to be unreachable. Or consider an attacker who can identify a user's guards, and mount denial-of-service attacks on them until the user picks a guard that the attacker controls.

In the presence of these attacks, we can't continue to connect to the Tor network unconditionally. Doing so would eventually result in the user choosing a hostile node as their guard, and losing anonymity.

This proposal outlines a new entry guard selection algorithm, which tries to meet the following goals:

  - Heuristics and algorithms for determining how and which guards are chosen should be kept as simple and easy to understand as possible.

  - Clients in censored regions or who are behind a fascist firewall who connect to the Tor network should not experience any significant disadvantage in terms of reachability or usability.

  - Tor should make a best attempt at discovering the most appropriate behaviour, with as little user input and configuration as possible.

  - Tor clients should discover usable guards without too much delay.

  - Tor clients should resist (to the extent possible) attacks that try to force them onto compromised guards.

2. State instances

In the algorithm below, we describe a set of persistent and non-persistent state variables. These variables should be treated as an object, of which multiple instances can exist.

In particular, we specify the use of three particular instances:

  A. UseBridges

     If UseBridges is set, then we replace the {GUARDS} set in [Sec:GUARDS] below with the list of configured bridges. We maintain a separate persistent instance of {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived values for the UseBridges case.
     In this case, we impose no upper limit on the sample size.

  B. EntryNodes / ExcludeNodes / Reachable*Addresses / FascistFirewall / ClientUseIPv4=0

     If one of the above options is set, and UseBridges is not, then we compare the fraction of usable guards in the consensus to the total number of guards in the consensus.

     If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC}, we use a separate instance of the state.

     (While Tor is running, we do not change back and forth between the separate instance of the state and the default instance unless the fraction of usable guards is 5% higher than, or 5% lower than, {MEANINGFUL_RESTRICTION_FRAC}. This prevents us from flapping back and forth between instances if we happen to hit {MEANINGFUL_RESTRICTION_FRAC} exactly.)

     If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a separate instance of the state, and warn the user.

     [TODO: should we have a different instance for each set of heavily restricted options?]

  C. Default

     If neither of the above variant-state instances is used, we use a default instance.

3. Circuit Creation, Entry Guard Selection (1000 foot view)

A circuit in Tor is a path through the network connecting a client to its destination. At a high-level, a three-hop exit circuit will look like this:

    Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination

Entry guards are the only nodes which a client will connect to directly. Exit relays are the nodes by which traffic exits the Tor network in order to connect to an external destination.

3.1 Path selection

For any circuit, at least one entry guard and middle node(s) are required. An exit node is required if traffic will exit the Tor network. Depending on its configuration, a relay listed in a consensus could be used for any of these roles. However, this proposal defines how entry guards specifically should be selected and managed, as opposed to middle or exit nodes.

3.1.1 Entry guard selection

At a high level, a relay listed in a consensus will move through the following states in the process from initial selection to eventual usage as an entry guard:

      relays listed in consensus
                 |
               sampled
               |     |
         confirmed   filtered
               |     |      |
               primary      usable_filtered

Relays listed in the latest consensus can be sampled for guard usage if they have the "Guard" flag. Sampling is random but weighted by bandwidth.

[Paul Syverson in a conversation at the Wilmington Meeting 2017 says that we should look into how we're doing this sampling. Essentially, his concern is that, since we are sampling by bandwidth at first (when we choose the `sampled` set), and then later there is another bias (when trying to build circuits, and hence marking guards as confirmed, we select those which completed a usable circuit first, and hence have the lowest latency), this sort of "doubly skewed" selection may "snub" some low-consensus-weight guards and leave them unused completely. Thus the issue is primarily that we're not allocating network resources efficiently. Nick's and my guard algorithm simulation code never checked what percentage of possible guards the algorithm reasonably allowed clients to use; this would be an interesting thing to check in simulation at some point. If it does turn out to be a problem, Paul's intuition for a fix is to select uniformly at random to obtain the `sampled` set, then weight by bandwidth when trying to build circuits and marking guards as confirmed. --isis]

Once a path is built and a circuit established using this guard, it is marked as confirmed. Until this point, guards are first sampled and then filtered based on information such as our current configuration (see SAMPLED and FILTERED sections) and later marked as usable_filtered if the guard is not primary but can be reached.

It is always preferable to use a primary guard when building a new circuit in order to reduce guard churn; only on failure to connect to existing primary guards will new guards be used. 3.1.2 Middle and exit node selection Middle nodes are selected at random from relays listed in the latest consensus, weighted by bandwidth. Exit nodes are chosen similarly but restricted to relays with an exit policy. 3.2 Circuit Building Once a path is chosen, Tor will use this path to build a new circuit. If the circuit is built successfully, it either can be used immediately or wait for a better guard, depending on whether other circuits already exist with higher-priority guards. If at any point the circuit fails, the guard is marked as unreachable, the circuit is closed, and waiting circuits are updated. 4. The algorithm. 4.0. The guards listed in the current consensus. [Section:GUARDS] By {set:GUARDS} we mean the set of all guards in the current consensus that are usable for all circuits and directory requests. (They must have the flags: Stable, Fast, V2Dir, Guard.) **Rationale** We require all guards to have the flags that we potentially need from any guard, so that all guards are usable for all circuits. 4.1. The Sampled Guard Set. [Section:SAMPLED] We maintain a set, {set:SAMPLED_GUARDS}, that persists across invocations of Tor. It is an unordered subset of the nodes that we have seen listed as a guard in the consensus at some point. For each such guard, we record persistently: - {pvar:ADDED_ON_DATE}: The date on which it was added to sampled_guards. We base this value on RAND(now, {GUARD_LIFETIME}/10). See Appendix [RANDOM] below. - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to sampled_guards. - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in the _most recent_ consensus we have seen. 
- {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date of the earliest consensus in which this guard was listed such that we have not seen it listed in any later consensus. Otherwise "None." We randomize this, based on RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5).

For each guard in {SAMPLED_GUARDS}, we also record this data, non-persistently:

- {tvar:last_tried_connect}: A 'last tried to connect at' time. Default 'never'.

- {tvar:is_reachable}: an "is reachable" tristate, with possible values { <state:yes>, <state:no>, <state:maybe> }. Default '<maybe>.'

  [Note: "yes" is not strictly necessary, but I'm making it distinct from "maybe" anyway, to make our logic clearer. A guard is "maybe" reachable if it's worth trying. A guard is "yes" reachable if we tried it and succeeded.]

- {tvar:failing_since}: The first time when we failed to connect to this guard. Defaults to "never". Reset to "never" when we successfully connect to this guard.

- {tvar:is_pending}: A "pending" flag. This indicates that we are trying to build an exploratory circuit through the guard, and we don't know whether it will succeed.

We require that {SAMPLED_GUARDS} contain at least {MIN_FILTERED_SAMPLE} guards from the consensus (if possible), but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards in the consensus, and not more than {MAX_SAMPLE_SIZE} in total. (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we set the maximum at {MIN_FILTERED_SAMPLE}.)

To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from ({GUARDS} - {SAMPLED_GUARDS}), weighted by bandwidth.

We remove an entry from {SAMPLED_GUARDS} if:

* We have a live consensus, and {IS_LISTED} is false, and {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER} days in the past.

  OR

* We have a live consensus, and {ADDED_ON_DATE} is over {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago.
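The two removal conditions above can be sketched as a single predicate. This is only an illustration, not Tor's implementation: the `SampledGuard` class and the `should_remove` function are hypothetical names, "never" is modelled as `None`, and the durations are the suggested values from [PARAM_VALS].

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Suggested values from [PARAM_VALS]; treat them as placeholders.
GUARD_LIFETIME = timedelta(days=120)
REMOVE_UNLISTED_GUARDS_AFTER = timedelta(days=20)
GUARD_CONFIRMED_MIN_LIFETIME = timedelta(days=60)


@dataclass
class SampledGuard:
    """Hypothetical record holding the persistent fields named in the text."""
    added_on_date: datetime
    is_listed: bool = True
    first_unlisted_at: Optional[datetime] = None   # None means "None."
    confirmed_on_date: Optional[datetime] = None   # None means "never"


def should_remove(guard: SampledGuard, now: datetime,
                  have_live_consensus: bool) -> bool:
    """Return True if either expiration rule applies to this guard."""
    if not have_live_consensus:
        # Both rules require a live consensus.
        return False

    # Rule 1: unlisted for longer than REMOVE_UNLISTED_GUARDS_AFTER.
    if (not guard.is_listed
            and guard.first_unlisted_at is not None
            and now - guard.first_unlisted_at > REMOVE_UNLISTED_GUARDS_AFTER):
        return True

    # Rule 2: sampled too long ago, and either never confirmed or
    # confirmed longer ago than GUARD_CONFIRMED_MIN_LIFETIME.
    if now - guard.added_on_date > GUARD_LIFETIME:
        confirmed = guard.confirmed_on_date
        if confirmed is None or now - confirmed > GUARD_CONFIRMED_MIN_LIFETIME:
            return True

    return False
```

Under this reading, a guard sampled 121 days ago but confirmed 30 days ago stays in the sample, while the same guard with no confirmation date is removed.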
Note that {SAMPLED_GUARDS} does not depend on our configuration. It is possible that we can't actually connect to any of these guards.

**Rationale**

The {SAMPLED_GUARDS} set is meant to limit the total number of guards that a client will connect to in a given period. The upper limit on its size prevents us from considering too many guards.

The first expiration mechanism is there so that our {SAMPLED_GUARDS} list does not accumulate so many dead guards that we cannot add new ones.

The second expiration mechanism makes us rotate our guards slowly over time.

4.2. The Usable Sample [Section:FILTERED]

We maintain another set, {set:FILTERED_GUARDS}, that does not persist. It is derived from:

- {SAMPLED_GUARDS}
- our current configuration,
- the path bias information.

A guard is a member of {set:FILTERED_GUARDS} if and only if all of the following are true:

- It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to true.
- It is not disabled because of path bias issues.
- It is not disabled because of the ReachableAddresses policy, the ClientUseIPv4 setting, the ClientUseIPv6 setting, the FascistFirewall setting, or some other option that prevents using some addresses.
- It is not disabled because of ExcludeNodes.
- It is a bridge if UseBridges is true; or it is not a bridge if UseBridges is false.
- It is included in EntryNodes if EntryNodes is set and UseBridges is not. (But see 2.B above).

We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which is defined to be the subset of {FILTERED_GUARDS} where {is_reachable} is <yes> or <maybe>.

We try to maintain a requirement that {USABLE_FILTERED_GUARDS} contain at least {MIN_FILTERED_SAMPLE} elements:

Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we add new elements to {SAMPLED_GUARDS} until one of the following is true:

* {USABLE_FILTERED_GUARDS} is large enough, OR
* {SAMPLED_GUARDS} is at its maximum size.
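As a rough sketch of how the filtered sets and the "grow the sample until it is large enough" rule fit together, consider the following. Everything here is illustrative: the `Guard` class, the `allowed_by_config` flag (a stand-in for all configuration and path-bias checks), and the function names are assumptions, and guard bandwidths are assumed to be positive.

```python
import random
from dataclasses import dataclass
from typing import List

# Suggested values from [PARAM_VALS].
MIN_FILTERED_SAMPLE = 20
MAX_SAMPLE_SIZE = 60
MAX_SAMPLE_THRESHOLD_PERCENT = 20


@dataclass
class Guard:
    """Hypothetical guard record; not Tor's actual data structure."""
    name: str
    bandwidth: int                  # assumed > 0
    is_listed: bool = True
    is_reachable: str = "maybe"     # "yes", "no", or "maybe"
    allowed_by_config: bool = True  # stand-in for every filter in 4.2


def max_sample_size(n_guards_in_consensus: int) -> int:
    """Upper bound on len(SAMPLED_GUARDS), never below MIN_FILTERED_SAMPLE."""
    limit = min(n_guards_in_consensus * MAX_SAMPLE_THRESHOLD_PERCENT // 100,
                MAX_SAMPLE_SIZE)
    return max(limit, MIN_FILTERED_SAMPLE)


def filtered_guards(sampled: List[Guard]) -> List[Guard]:
    return [g for g in sampled if g.is_listed and g.allowed_by_config]


def usable_filtered_guards(sampled: List[Guard]) -> List[Guard]:
    return [g for g in filtered_guards(sampled)
            if g.is_reachable in ("yes", "maybe")]


def expand_sample(sampled: List[Guard], guards: List[Guard],
                  rng: random.Random) -> List[Guard]:
    """Add bandwidth-weighted picks from GUARDS - SAMPLED_GUARDS until the
    usable filtered set is large enough or the sample is at its maximum."""
    limit = max_sample_size(len(guards))
    while (len(usable_filtered_guards(sampled)) < MIN_FILTERED_SAMPLE
           and len(sampled) < limit):
        already = {g.name for g in sampled}
        candidates = [g for g in guards if g.name not in already]
        if not candidates:
            break
        weights = [g.bandwidth for g in candidates]
        sampled.append(rng.choices(candidates, weights=weights, k=1)[0])
    return sampled
```

Note that the filters are evaluated on the sample after it is drawn, so a restrictive configuration makes the sample grow rather than biasing which guards are drawn.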
** Rationale **

These filters are applied _after_ sampling: if we applied them before the sampling, then our sample would reflect the set of filtering restrictions that we had in the past.

4.3. The confirmed-guard list. [Section:CONFIRMED]

[formerly USED_GUARDS]

We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}. It contains guards that we have used before, in our preference order of using them. It is a subset of {SAMPLED_GUARDS}. For each guard in this list, we store persistently:

- {pvar:IDENTITY} Its fingerprint

- {pvar:CONFIRMED_ON_DATE} When we added this guard to {CONFIRMED_GUARDS}.

  Randomized as RAND(now, {GUARD_LIFETIME}/10).

We add new members to {CONFIRMED_GUARDS} when we mark a circuit built through a guard as "for user traffic."

That is, a circuit is considered for use for client traffic when we have decided that we could attach a stream to it; at that point the guard for that circuit SHOULD be added to {CONFIRMED_GUARDS}.

Whenever we remove a member from {SAMPLED_GUARDS}, we also remove it from {CONFIRMED_GUARDS}.

[Note: You can also regard the {CONFIRMED_GUARDS} list as a total ordering defined over a subset of {SAMPLED_GUARDS}.]

Definition: we call Guard A "higher priority" than another Guard B if, when A and B are both reachable, we would rather use A. We define priority as follows:

* Every guard in {CONFIRMED_GUARDS} has a higher priority than every guard not in {CONFIRMED_GUARDS}.

* Among guards in {CONFIRMED_GUARDS}, the one appearing earlier on the {CONFIRMED_GUARDS} list has a higher priority.

* Among guards that do not appear in {CONFIRMED_GUARDS}, {is_pending}==true guards have higher priority.

* Among those, the guard with the earlier {last_tried_connect} time has higher priority.

* Finally, among guards that do not appear in {CONFIRMED_GUARDS} with {is_pending}==false, all have equal priority.

** Rationale **

We add elements to this ordering when we have actually used them for building a usable circuit.
We could mark them at some other time (such as when we attempt to connect to them, or when we actually connect to them), but this approach keeps us from committing to a guard before we actually use it for sensitive traffic.

4.4. The Primary guards [Section:PRIMARY]

We keep a run-time non-persistent ordered list of {list:PRIMARY_GUARDS}. It is a subset of {FILTERED_GUARDS}. It contains {N_PRIMARY_GUARDS} elements.

To compute primary guards, take the ordered intersection of {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first {N_PRIMARY_GUARDS} elements. If there are fewer than {N_PRIMARY_GUARDS} elements, add additional elements to PRIMARY_GUARDS chosen _uniformly_ at random from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).

Once an element has been added to {PRIMARY_GUARDS}, we do not remove it until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed elements always precede unconfirmed ones in the {PRIMARY_GUARDS} list.

Note that {PRIMARY_GUARDS} do not have to be in {USABLE_FILTERED_GUARDS}: they might be unreachable.

** Rationale **

These guards are treated differently from other guards. If one of them is usable, then we use it right away. For other guards in {FILTERED_GUARDS}, if it's usable, then before using it we might first double-check whether perhaps one of the primary guards is usable after all.

4.5. Retrying guards. [Section:RETRYING]

(We run this process as frequently as needed. It can be done once a second, or just-in-time.)

If a primary sampled guard's {is_reachable} status is <no>, then we decide whether to update its {is_reachable} status to <maybe> based on its {last_tried_connect} time, its {failing_since} time, and the {PRIMARY_GUARDS_RETRY_SCHED} schedule.

If a non-primary sampled guard's {is_reachable} status is <no>, then we decide whether to update its {is_reachable} status to <maybe> based on its {last_tried_connect} time, its {failing_since} time, and the {GUARDS_RETRY_SCHED} schedule.
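The text does not spell out exactly how the two timestamps combine with a schedule, so the following is one plausible reading, offered as a sketch: the time since {failing_since} selects a retry interval from the schedule, and the guard becomes <maybe> again once {last_tried_connect} is at least that interval in the past. The function names are hypothetical, "never" is modelled as `None`, and the schedule values are the suggested ones from [PARAM_VALS].

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Each entry is (upper bound on time spent failing, retry interval).
# A bound of None means "thereafter".
Schedule = List[Tuple[Optional[timedelta], timedelta]]

SIX_HOURS = timedelta(hours=6)
FOUR_DAYS = SIX_HOURS + timedelta(days=3.75)   # first 6 hours + next 3.75 days
SEVEN_DAYS = FOUR_DAYS + timedelta(days=3)     # ... + next 3 days

PRIMARY_GUARDS_RETRY_SCHED: Schedule = [
    (SIX_HOURS, timedelta(minutes=30)),
    (FOUR_DAYS, timedelta(hours=2)),
    (SEVEN_DAYS, timedelta(hours=4)),
    (None, timedelta(hours=9)),
]

GUARDS_RETRY_SCHED: Schedule = [
    (SIX_HOURS, timedelta(hours=1)),
    (FOUR_DAYS, timedelta(hours=4)),
    (SEVEN_DAYS, timedelta(hours=18)),
    (None, timedelta(hours=36)),
]


def retry_interval(failing_for: timedelta, sched: Schedule) -> timedelta:
    """Pick the retry interval for a guard that has been failing this long."""
    for bound, interval in sched:
        if bound is None or failing_for < bound:
            return interval
    return sched[-1][1]


def may_retry(now: datetime,
              failing_since: Optional[datetime],
              last_tried_connect: Optional[datetime],
              sched: Schedule) -> bool:
    """Should a guard with is_reachable == <no> be reset to <maybe>?"""
    if failing_since is None or last_tried_connect is None:
        return True  # no recorded failure or attempt: worth trying
    interval = retry_interval(now - failing_since, sched)
    return now - last_tried_connect >= interval
```

A caller would pass PRIMARY_GUARDS_RETRY_SCHED for guards in {PRIMARY_GUARDS} and GUARDS_RETRY_SCHED for all other sampled guards.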
** Rationale ** An observation that a guard has been 'unreachable' only lasts for a given amount of time, since we can't infer that it's unreachable now from the fact that it was unreachable a few minutes ago. 4.6. Selecting guards for circuits. [Section:SELECTING] Every origin circuit is now in one of these states: <state:usable_on_completion>, <state:usable_if_no_better_guard>, <state:waiting_for_better_guard>, or <state:complete>. You may only attach streams to <complete> circuits. (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO cells, and INTRODUCE cells on <complete> circuits.) The per-circuit state machine is: New circuits are <usable_on_completion> or <usable_if_no_better_guard>. A <usable_on_completion> circuit may become <complete>, or may fail. A <usable_if_no_better_guard> circuit may become <usable_on_completion>; may become <waiting_for_better_guard>; or may fail. A <waiting_for_better_guard> circuit will become <complete>, or will be closed, or will fail. A <complete> circuit remains <complete> until it fails or is closed. Each of these transitions is described below. We keep, as global transient state: * {tvar:last_time_on_internet} -- the last time at which we successfully used a circuit or connected to a guard. At startup we set this to "infinitely far in the past." When we want to build a circuit, and we need to pick a guard: * If any entry in PRIMARY_GUARDS has {is_reachable} status of <maybe> or <yes>, return the first such guard. The circuit is <usable_on_completion>. [Note: We do not use {is_pending} on primary guards, since we are willing to try to build multiple circuits through them before we know for sure whether they work, and since we will not use any non-primary guards until we are sure that the primary guards are all down. 
(XX is this good?)] * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS} and {USABLE_FILTERED_GUARDS} is nonempty, return the first entry in that intersection that has {is_pending} set to false. Set its value of {is_pending} to true. The circuit is now <usable_if_no_better_guard>. (If all entries have {is_pending} true, pick the first one.) * Otherwise, if there is no such entry, select a member at random from {USABLE_FILTERED_GUARDS}. Set its {is_pending} field to true. The circuit is <usable_if_no_better_guard>. We update the {last_tried_connect} time for the guard to 'now.' In some cases (for example, when we need a certain directory feature, or when we need to avoid using a certain exit as a guard), we need to restrict the guards that we use for a single circuit. When this happens, we remember the restrictions that applied when choosing the guard for that circuit, since we will need them later (see [UPDATE_WAITING].). ** Rationale ** We're getting to the core of the algorithm here. Our main goals are to make sure that 1. If it's possible to use a primary guard, we do. 2. We probably use the first primary guard. So we only try non-primary guards if we're pretty sure that all the primary guards are down, and we only try a given primary guard if the earlier primary guards seem down. When we _do_ try non-primary guards, however, we only build one circuit through each, to give it a chance to succeed or fail. If ever such a circuit succeeds, we don't use it until we're pretty sure that it's the best guard we're getting. (see below). [XXX timeout.] 4.7. When a circuit fails. [Section:ON_FAIL] When a circuit fails in a way that makes us conclude that a guard is not reachable, we take the following steps: * We set the guard's {is_reachable} status to <no>. If it had {is_pending} set to true, we make it non-pending. * We close the circuit, of course. (This removes it from consideration by the algorithm in [UPDATE_WAITING].) 
* Update the list of waiting circuits. (See [UPDATE_WAITING] below.)

[Note: the existing Tor logic will cause us to create more circuits in response to some of these steps; and also see [ON_CONSENSUS].]

** Rationale **

See [SELECTING] above for rationale.

4.8. When a circuit succeeds [Section:ON_SUCCESS]

When a circuit succeeds in a way that makes us conclude that a guard _was_ reachable, we take these steps:

* We set its {is_reachable} status to <yes>.
* We set its {failing_since} to "never".
* If the guard was {is_pending}, we clear the {is_pending} flag.
* If the guard was not a member of {CONFIRMED_GUARDS}, we add it to the end of {CONFIRMED_GUARDS}.

* If this circuit was <usable_on_completion>, this circuit is now <complete>. You may attach streams to this circuit, and use it for hidden services.

* If this circuit was <usable_if_no_better_guard>, it is now <waiting_for_better_guard>. You may not yet attach streams to it. Then check whether the {last_time_on_internet} is more than {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:

  * If it is, then mark all {PRIMARY_GUARDS} as "maybe" reachable.

  * If it is not, update the list of waiting circuits. (See [UPDATE_WAITING] below)

[Note: the existing Tor logic will cause us to create more circuits in response to some of these steps; and see [ON_CONSENSUS].]

** Rationale **

See [SELECTING] above for rationale.

4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]

We run this procedure whenever it's possible that a <waiting_for_better_guard> circuit might be ready to be called <complete>.

* If any circuit C1 is <waiting_for_better_guard>, AND:
  * All primary guards have reachable status of <no>.
  * There is no circuit C2 that "blocks" C1.

  Then, upgrade C1 to <complete>.
Definition: In the algorithm above, C2 "blocks" C1 if:

* C2 obeys all the restrictions that C1 had to obey, AND
* C2 has higher priority than C1, AND
* Either C2 is <complete>, or C2 is <waiting_for_better_guard>, or C2 has been <usable_if_no_better_guard> for no more than {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds.

We run this procedure periodically:

* If any circuit stays in <waiting_for_better_guard> for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds, time it out.

**Rationale**

If we open a connection to a guard, we might want to use it immediately (if we're sure that it's the best we can do), or we might want to wait a little while to see if some other circuit which we like better will finish.

When we mark a circuit <complete>, we don't close the lower-priority circuits immediately: we might decide to use them after all if the <complete> circuit goes down before {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.

4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]

We update {GUARDS}.

For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and {FIRST_UNLISTED_AT}.

[**] We remove entries from {SAMPLED_GUARDS} if appropriate, according to the sampled-guards expiration rules. If they were in {CONFIRMED_GUARDS}, we also remove them from {CONFIRMED_GUARDS}.

We recompute {FILTERED_GUARDS}, and everything that derives from it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}.

(Whenever one of the configuration options that affects the filter is updated, we repeat the process above, starting at the [**] line.)

4.11. Deciding whether to generate a new circuit. [Section:NEW_CIRCUIT_NEEDED]

In current Tor, we generate a new circuit when we don't have enough circuits either built or in-progress to handle a given stream, or an expected stream.

For the purpose of this rule, we say that <waiting_for_better_guard> circuits are neither built nor in-progress; that <complete> circuits are built; and that the other states are in-progress.

A. Appendices

A.1.
Parameters with suggested values. [Section:PARAM_VALS]

(All suggested values chosen arbitrarily)

   {param:MAX_SAMPLE_THRESHOLD} -- 20%

   {param:MAX_SAMPLE_SIZE} -- 60

   {param:GUARD_LIFETIME} -- 120 days

   {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days
     [previously ENTRY_GUARD_REMOVE_AFTER]

   {param:MIN_FILTERED_SAMPLE} -- 20

   {param:N_PRIMARY_GUARDS} -- 3

   {param:PRIMARY_GUARDS_RETRY_SCHED}
      -- every 30 minutes for the first 6 hours.
      -- every 2 hours for the next 3.75 days.
      -- every 4 hours for the next 3 days.
      -- every 9 hours thereafter.

   {param:GUARDS_RETRY_SCHED} -- 1 hour
      -- every hour for the first 6 hours.
      -- every 4 hours for the next 3.75 days.
      -- every 18 hours for the next 3 days.
      -- every 36 hours thereafter.

   {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes

   {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds

   {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes

   {param:MEANINGFUL_RESTRICTION_FRAC} -- .2

   {param:EXTREME_RESTRICTION_FRAC} -- .01

   {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days

A.2. Random values [Section:RANDOM]

Frequently, we want to randomize the expiration time of something so that it's not easy for an observer to match it to its start time. We do this by randomizing its start date a little, so that we only need to remember a fixed expiration interval.

By RAND(now, INTERVAL) we mean a time between now and INTERVAL in the past, chosen uniformly at random.

A.3. Why not a sliding scale of primaryness? [Section:CVP]

At one meeting, I floated the idea of having "primaryness" be a continuous variable rather than a boolean.

I'm no longer sure this is a great idea, but I'll try to outline how it might work.

To begin with: being "primary" gives it a few different traits:

1) We retry primary guards more frequently. [Section:RETRYING]

2) We don't even _try_ building circuits through lower-priority guards until we're pretty sure that the higher-priority primary guards are down.
(With non-primary guards, on the other hand, we launch exploratory circuits which we plan not to use if higher-priority guards succeed.) [Section:SELECTING] 3) We retry them all one more time if a circuit succeeds after the net has been down for a while. [Section:ON_SUCCESS] We could make each of the above traits continuous: 1) We could make the interval at which a guard is retried depend continuously on its position in CONFIRMED_GUARDS. 2) We could change the number of guards we test in parallel based on their position in CONFIRMED_GUARDS. 3) We could change the rule for how long the higher-priority guards need to have been down before we call a <usable_if_no_better_guard> circuit <complete> based on a possible network-down condition. For example, we could retry the first guard if we tried it more than 10 seconds ago, the second if we tried it more than 20 seconds ago, etc. I am pretty sure, however, that if these are worth doing, they need more analysis! Here's why: * They all have the potential to leak more information about a guard's exact position on the list. Is that safe? Is there any way to exploit that? I don't think we know. * They all seem like changes which it would be relatively simple to make to the code after we implement the simpler version of the algorithm described above. A.3. Controller changes We will add to control-spec.txt a new possible circuit state, GUARD_WAIT, that can be given as part of circuit events and GETINFO responses about circuits. A circuit is in the GUARD_WAIT state when it is fully built, but we will not use it because a circuit with a better guard might become built too. A.4. Persistent state format The persistent state format doesn't need to be part of this proposal, since different implementations can do it differently. Nonetheless, here's the one Tor uses: The "state" file contains one Guard entry for each sampled guard in each instance of the guard state (see section 2). 
The value of this Guard entry is a set of space-separated K=V entries, where K contains any nonspace character except =, and V contains any nonspace characters.

Implementations must retain any unrecognized K=V entries for a sampled guard when they regenerate the state file.

The order of K=V entries is not allowed to matter.

Recognized fields (values of K) are:

   "in" -- the name of the guard state instance that this sampled guard is in. If a sampled guard is in two guard state instances, it appears twice, with a different "in" field each time. Required.

   "rsa_id" -- the RSA id digest for this guard, encoded in hex. Required.

   "bridge_addr" -- If the guard is a bridge, its configured address and OR port. Optional.

   "nickname" -- the guard's nickname, if any. Optional.

   "sampled_on" -- the date when the guard was sampled. Required.

   "sampled_by" -- the Tor version that sampled this guard. Optional.

   "unlisted_since" -- the date since which the guard has been unlisted. Optional.

   "listed" -- 0 if the guard is not listed ; 1 if it is. Required.

   "confirmed_on" -- date when the guard was confirmed. Optional.

   "confirmed_idx" -- position of the guard in the confirmed list. Optional.

   "pb_use_attempts", "pb_use_successes", "pb_circ_attempts", "pb_circ_successes", "pb_successful_circuits_closed", "pb_collapsed_circuits", "pb_unusable_circuits", "pb_timeouts" -- state for the circuit path bias algorithm, given in decimal fractions. Optional.

All dates here are given as a (spaceless) ISO8601 combined date and time in UTC (e.g., 2016-11-29T19:39:31).

I do not plan to build a migration mechanism from the old format to the new.

TODO. Still non-addressed issues [Section:TODO]

Simulate to answer: Will this work in a dystopic world?

Simulate actual behavior.

For all lifetimes: instead of storing the "this began at" time, store the "remove this at" time, slightly randomized.
Clarify that when you get a <complete> circuit, you might need to relaunch circuits through that same guard immediately, if they are circuits that have to be independent.

Fix all items marked XX or TODO.

"Directory guards" -- do they matter?

   Suggestion: require that all guards support downloads via BEGINDIR. We don't need to worry about directory guards for relays, since we aren't trying to prevent relay enumeration.

IP version preferences via ClientPreferIPv6ORPort

   Suggestion: Treat it as a preference when adding to {CONFIRMED_GUARDS}, but not otherwise.
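Returning to the persistent state format in appendix A.4: the K=V encoding of a Guard entry can be handled with a few lines of code. The sketch below is an assumption-laden illustration, not Tor's parser; the function names are made up, and the sample entry in the usage note is invented for demonstration. It keeps every field it does not recognize, as the format requires.

```python
from datetime import datetime
from typing import Dict

# Fields the text marks as "Required."
REQUIRED_FIELDS = ("in", "rsa_id", "sampled_on", "listed")


def parse_guard_entry(value: str) -> Dict[str, str]:
    """Split a Guard entry value into a K -> V mapping.

    Unrecognized keys are kept, so that they survive a later rewrite
    of the state file.
    """
    fields: Dict[str, str] = {}
    for token in value.split():
        # K cannot contain '=', so the first '=' separates K from V.
        key, sep, val = token.partition("=")
        if not sep or not key:
            raise ValueError("malformed K=V entry: %r" % token)
        fields[key] = val
    missing = [k for k in REQUIRED_FIELDS if k not in fields]
    if missing:
        raise ValueError("missing required fields: %s" % ", ".join(missing))
    return fields


def format_guard_entry(fields: Dict[str, str]) -> str:
    """Serialize the mapping back to space-separated K=V entries."""
    return " ".join("%s=%s" % (k, v) for k, v in fields.items())


def parse_state_date(text: str) -> datetime:
    """Parse a spaceless ISO8601 date such as 2016-11-29T19:39:31 (UTC)."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
```

For example, parsing `in=default rsa_id=9B94CD0B7B8057EAF21BA7F023B7A1C8CA9CE645 sampled_on=2016-11-29T19:39:31 listed=1 future_field=xyz` and formatting the result again preserves `future_field`, even though this code knows nothing about it.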
Filename: 272-valid-and-running-by-default.txt
Title: Listed routers should be Valid, Running, and treated as such
Created: 26 Aug 2016
Author: Nick Mathewson
Status: Closed
Implemented-In: 0.2.9.3-alpha, 0.2.9.4-alpha

1. Introduction and proposal.

This proposal describes a change in how clients understand consensus flags, and how authorities vote on consensuses.

1.1. Authority-side changes

Back in proposal 138, we made it so that non-Running routers were not included in the consensus documents. We should do the same with the Valid flag. Specifically, after voting, if the authorities find that a router would not receive the Valid flag, they should not include it at all.

This will require the allocation of a new consensus method, since it is a change in how consensuses are made from votes.

In the most recent consensus, it will affect exactly 1 router.

1.2. Client-side changes

I propose that clients should consider every listed router to be listed as Running and Valid if any consensus method above or higher is in use.

2. Benefits

Removing the notion of listed but invalid routers will remove an opportunity for error, and let us remove some client side code.

More interestingly, the above changes would allow us to eventually stop including the Running and Valid flags, thereby providing an authority-side way to feature-gate clients off of the Tor network without a fast-zombie problem. (See proposal 266 for discussion.)

A. An additional possible change

Perhaps authorities might also treat BadExit like they treat the absence of Valid and Running: as sufficient reason to not include a router in the consensus. Right now, there are only 4 listed BadExit routers in the consensus, amounting to a small fraction of total bandwidth. Making this change would allow us to remove the client-side badexit logic.

B. Does this solve the zombie problem?
I tested it a little, and it does seem to be a way to make even the most ancient consensus-understanding Tors stop fetching descriptors and using the network. More testing needed though.
Filename: 273-exit-relay-pinning.txt
Title: Exit relay pinning for web services
Author: Philipp Winter, Tobias Pulls, Roya Ensafi, and Nick Feamster
Created: 2016-09-22
Status: Reserve
Target: n/a

0. Overview

To mitigate the harm caused by malicious exit relays, this proposal presents a novel scheme -- exit relay pinning -- to allow web sites to express that Tor connections should preferably originate from a set of predefined exit relays. This proposal is currently in draft state. Any feedback is appreciated.

1. Motivation

Malicious exit relays are increasingly becoming a problem. We have been witnessing numerous opportunistic attacks, but also highly sophisticated, targeted attacks that are financially motivated. So far, we have been looking for malicious exit relays using active probing and a number of heuristics, but since it is inexpensive to keep setting up new exit relays, we are facing an uphill battle.

Similar to the now-obsolete concept of exit enclaves, this proposal enables web services to express that Tor clients should prefer a predefined set of exit relays when connecting to the service. We encourage sensitive sites to set up their own exit relays and have Tor clients prefer these relays, thus greatly mitigating the risk of man-in-the-middle attacks.

2. Design

2.1 Overview

A simple analogy helps in explaining the concept behind exit relay pinning: HTTP Public Key Pinning (HPKP) allows web servers to express that browsers should pin certificates for a given time interval. Similarly, exit relay pinning (ERP) allows web servers to express that Tor Browser should prefer a predefined set of exit relays. This makes it harder for malicious exit relays to be selected as last hop for a given website.

Web servers advertise support for ERP in a new HTTP header that points to an ERP policy. This policy contains one or more exit relays, and is signed by the respective relay's master identity key.
Once Tor Browser has obtained a website's ERP policy, it will try to select the site's preferred exit relays for subsequent connections. The following subsections discuss this mechanism in greater detail.

2.2 Exit relay pinning header

Web servers support ERP by advertising it in the "Tor-Exit-Pins" HTTP header. The header contains two directives, "url" and "max-age":

  Tor-Exit-Pins: url="https://example.com/pins.txt"; max-age=2678400

The "url" directive points to the full policy, which MUST be HTTPS. Tor Browser MUST NOT fetch the policy if it is not reachable over HTTPS. Also, Tor Browser MUST abort the ERP procedure if the HTTPS certificate is not signed by a trusted authority. The "max-age" directive determines the time in seconds for how long Tor Browser SHOULD cache the ERP policy.

After seeing a Tor-Exit-Pins header in an HTTP response, Tor Browser MUST fetch and interpret the policy unless it already has it cached and the cached policy has not yet expired.

2.3 Exit relay pinning policy

An exit relay pinning policy MUST be formatted in JSON. The root element is called "erp-policy" and it points to a list of pinned exit relays. Each list element MUST contain two elements, "fingerprint" and "signature". The "fingerprint" element points to the hex-encoded, uppercase, 40-digit fingerprint of an exit relay, e.g., 9B94CD0B7B8057EAF21BA7F023B7A1C8CA9CE645. The "signature" element points to an Ed25519 signature, uppercase and hex-encoded. The following JSON shows a conceptual example:

  {
    "erp-policy": [
      "start-policy",
      {
        "fingerprint": Fpr1,
        "signature": Sig_K1("erp-signature" || "example.com" || Fpr1)
      },
      {
        "fingerprint": Fpr2,
        "signature": Sig_K2("erp-signature" || "example.com" || Fpr2)
      },
      ...
      {
        "fingerprint": Fprn,
        "signature": Sig_Kn("erp-signature" || "example.com" || Fprn)
      },
      "end-policy"
    ]
  }

Fpr refers to a relay's fingerprint as discussed above. In the signature, K refers to a relay's master private identity key.
The || operator refers to string concatenation, i.e., "foo" || "bar" results in "foobar". "erp-signature" is a constant and denotes the purpose of the signature. "start-policy" and "end-policy" are both constants and meant to prevent an adversary from serving a client only a partial list of pins. The signatures over fingerprint and domain are necessary to prove that an exit relay agrees to being pinned. The website's domain -- in this case example.com -- is part of the signature, so third parties such as evil.com cannot coerce exit relays they don't own to serve as their pinned exit relays. After having fetched an ERP policy, Tor Browser MUST first verify that the two constants "start-policy" and "end-policy" are present, and then validate the signature over all list elements. If any element does not validate, Tor Browser MUST abort the ERP procedure. If an ERP policy contains more than one exit relay, Tor Browser MUST select one at random, weighted by its bandwidth. That way, we can balance load across all pinned exit relays. Tor Browser could enforce the mapping from domain to exit relay by adding the following directive to its configuration file: MapAddress example.com example.com.Fpr_n.exit 2.4 Defending against malicious websites The purpose of exit relay pinning is to protect a website's users from malicious exit relays. We must further protect the same users from the website, however, because it could abuse ERP to reduce a user's anonymity set. The website could group users into arbitrarily-sized buckets by serving them different ERP policies on their first visit. For example, the first Tor user could be pinned to exit relay A, the second user could be pinned to exit relay B, etc. This would allow the website to link together the sessions of anonymous users. We cannot prevent websites from serving client-specific policies, but we can detect it by having Tor Browser fetch a website's ERP policy over multiple independent exit relays. 
If the policies are not identical, Tor Browser MUST ignore the ERP policies.

If Tor Browser attempted to fetch the ERP policy over n circuits as quickly as possible, the website would receive n connections within a narrow time interval, suggesting that all these connections originated from the same client. To impede such time-based correlation attacks, Tor Browser MUST wait for a randomly determined time span before fetching the ERP policy. Tor Browser SHOULD randomly sample a delay from an exponential distribution. The disadvantage of this defence is that it can take a while until Tor Browser knows that it can trust an ERP policy.

2.5 Design trade-offs

We now briefly discuss alternative design decisions, and why we defined ERP the way we did.

Instead of having a web server *tell* Tor Browser about pinned exit relays, we could have Tor Browser *ask* the web server, e.g., by making it fetch a predefined URL, similar to robots.txt. We believe that this would involve too much overhead because only a tiny fraction of sites that Tor users visit will have an ERP policy.

ERP implies that adversaries get to learn all the exit relays from which all users of a pinned site come. These exit relays could then become a target for traffic analysis or compromise. Therefore, websites that pin exit relays SHOULD have a proper HTTPS setup and host their exit relays topologically close to the content servers, to mitigate the threat of network-level adversaries.

It's possible to work around the bootstrapping problem (i.e., the very first website visit cannot use pinned exits) by having an infrastructure that allows us to pin exits out-of-band, e.g., by hard-coding them in Tor Browser, or by providing a lookup service prior to connecting to a site, but the additional complexity does not seem to justify the added security or reduced overhead.

2.6 Open questions

o How should we deal with selective DoS or otherwise unavailable exit relays?
  That is, what if an adversary takes pinned exit relays offline? Should Tor Browser give up, or fall back to non-pinned exit relays that are potentially malicious? Should we give site operators an option to express a fallback if they care more about availability than security?

o Are there any aspects that are unnecessarily tricky to implement in Tor Browser? If so, let's figure out how to make it easier to build.

o Is a domain-level pinning granularity sufficient?

o Should we use the Ed25519 master or signing key?

o Can cached ERP policies survive a Tor Browser restart? After all, we are not supposed to write to disk, and ERP policies are basically like a browsing history.

o Should we have some notion of "freshness" in an ERP policy? The problem is that an adversary could save my ERP policy for example.com, and if I ever give up example.com, the adversary could register it, and use my relays for pinning. This could easily be mitigated by rotating my relay identity keys, and might not be that big a problem.

o Should we support non-HTTP services? For example, do we want to support, say, SSH? And if so, how would we go about it?

o HPKP also defines a "report-uri" directive to which errors should be reported. Do we want something similar, so site operators can detect issues such as attempted DoS attacks?

o It is wasteful to send a 60-70 byte header to all browsers while only a tiny fraction of them will want it. Web servers could send the header only to IP addresses that run an exit relay, but that adds quite a bit of extra complexity.

o We currently defend against malicious websites by fetching the ERP policy over several exit relays, spread over time. In doing so, we are making assumptions on the number of visits the website sees. Is there a better solution that isn't significantly more complex?
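To make the client-side checks from sections 2.2 and 2.3 concrete, here is a small sketch of how a client might parse the Tor-Exit-Pins header and walk an ERP policy. It is an illustration under several assumptions: the function names are hypothetical, the header is split naively on ";" (so a URL containing ";" would not parse), and Ed25519 verification is left to a caller-supplied `verify` callback because the Python standard library does not provide it.

```python
import json
from typing import Callable, List, Tuple
from urllib.parse import urlparse


def parse_tor_exit_pins(header_value: str) -> Tuple[str, int]:
    """Parse 'url="..."; max-age=N' and return (url, max_age_seconds)."""
    directives = {}
    for part in header_value.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            directives[name.strip().lower()] = value.strip().strip('"')
    if "url" not in directives or "max-age" not in directives:
        raise ValueError("both url and max-age directives are needed")
    url = directives["url"]
    if urlparse(url).scheme != "https":
        # The policy MUST be served over HTTPS.
        raise ValueError("ERP policy URL is not HTTPS")
    return url, int(directives["max-age"])


def pinned_fingerprints(policy_json: str,
                        verify: Callable[[str, str], bool]) -> List[str]:
    """Return the pinned fingerprints, or raise if the policy is unusable.

    verify(fingerprint, signature) is a hypothetical callback that checks
    the relay's signature over "erp-signature" || domain || fingerprint.
    """
    items = json.loads(policy_json)["erp-policy"]
    # Both constants must be present, to detect a truncated list of pins.
    if len(items) < 3 or items[0] != "start-policy" or items[-1] != "end-policy":
        raise ValueError("missing start-policy or end-policy marker")
    relays = items[1:-1]
    for relay in relays:
        # If any element fails to validate, the whole procedure is aborted.
        if not verify(relay["fingerprint"], relay["signature"]):
            raise ValueError("signature check failed")
    return [relay["fingerprint"] for relay in relays]
```

Selecting one relay at random, weighted by bandwidth, and the multi-circuit consistency check from section 2.4 would sit on top of these helpers.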
Filename: 274-rotate-onion-keys-less.txt Title: Rotate onion keys less frequently Author: Nick Mathewson Created: 20-Feb-2017 Status: Closed Implemented-In: 0.3.1.1-alpha 1. Overview This document proposes that, in order to limit the bandwidth needed for microdescriptor listing and transmission, we reduce the onion key rotation rate from the current value (7 days) to something closer to 28 days. Doing this will reduce the total microdescriptor download volume by approximately 70%. 2. Motivation Currently, clients must download a networkstatus consensus document once an hour, and must download every unfamiliar microdescriptor listed in that document. Therefore, we can reduce client directory bandwidth if we can cause microdescriptors to change less often. Furthermore, we are planning (in proposal 140) to implement a diff-based mechanism for clients to download only the parts of each consensus that have changed. If we do that, then by having the microdescriptor for each router change less often, we can make these consensus diffs smaller as well. 3. Analysis I analyzed microdescriptor changes over the month of January 2017, and found that 94.5% of all microdescriptor transitions were changes in onion key alone. Therefore, we could reduce the number of changed "m" lines in consensus diffs by approximately 94.5% * (3/4) =~ 70%, if we were to rotate onion keys one-fourth as often. The number of microdescriptors to actually download should decrease by a similar number. This amounts to a significant reduction: currently, by back-of-the-envelope estimates, an always-on client that downloads all the directory info in a month downloads about 449MB of compressed consensuses and something around 97 MB of compressed microdescriptors. This proposal would save that user about 12% of their total directory bandwidth. If we assume that consensus diffs are implemented (see proposal 140), then the user's compressed consensus downloads fall to something closer to 27 MB. 
Under that analysis, the microdescriptors will dominate again at 97 MB -- so lowering the number of microdescriptors to fetch would save more like 55% of the remaining bandwidth. [Back-of-the-envelope technique: assume every consensus is downloaded, and every microdesc is downloaded, and microdescs are downloaded in groups of 61, which works out to a constant rate.] We'll need to do more analysis to assess the impact on clients that connect to the network infrequently enough to miss microdescriptors: nonetheless, the 70% figure above ought to apply to clients that connect at least weekly. (XXXX Better results pending feedback from ahf's analysis.) 4. Security analysis The onion key is used to authenticate a relay to a client when the client is building a circuit through that relay. The only reason to limit their lifetime is to limit the impact if an attacker steals an onion key without being detected. If an attacker steals an onion key and is detected, the relay can issue a new onion key ahead of schedule, with little disruption. But if the onion key theft is _not_ detected, then the attacker can use that onion key to impersonate the relay until clients start using the relay's next key. In order to do so, the attacker must also impersonate the target relay at the link layer: either by stealing the relay's link keys, which rotate more frequently, or by compromising the previous relay in the circuit. Therefore, onion key rotation provides a small amount of protection only against an attacker who can compromise relay keys very intermittently, and who controls only a small portion of the network. Against an attacker who can steal keys regularly it does little, and an attacker who controls a lot of the network can already mount other attacks. 5. Proposal I propose that we move the default onion key rotation interval from 7 days to 28 days, as follows. There should be a new consensus parameter, "onion-key-rotation-days", measuring the key lifetime in days. 
Its minimum should be 1, its maximum should be 90, and its default should be 28. There should also be a new consensus parameter, "onion-key-grace-period-days", measuring the interval for which older onion keys should still be accepted. Its minimum should be 1, its maximum should be onion-key-rotation-days, and its default should be 7. Every relay should list each onion key it generates for onion-key-rotation-days days after generating it, and then replace it. Relays should continue to accept their most recent previous onion key for an additional onion-key-grace-period-days days after it is replaced.
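
The back-of-the-envelope figures in section 3 and the parameter bounds in section 5 of proposal 274 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are made up here, and clamping out-of-range values into the stated bounds is an assumption about how the limits would be enforced.

```python
def clamp(value, low, high):
    """Force value into the inclusive range [low, high]."""
    return max(low, min(high, value))


def effective_onion_key_params(rotation_days=28, grace_days=7):
    """Apply the bounds from section 5 (clamping is an assumption)."""
    rotation = clamp(rotation_days, 1, 90)
    grace = clamp(grace_days, 1, rotation)  # grace may not exceed rotation
    return rotation, grace


def microdesc_savings(onion_key_share=0.945, old_days=7, new_days=28):
    """Fraction of microdescriptor changes avoided by rotating less often."""
    return onion_key_share * (1 - old_days / new_days)


# Monthly download volumes quoted in section 3 (megabytes, compressed).
CONSENSUS_MB = 449
CONSENSUS_DIFF_MB = 27
MICRODESC_MB = 97

saved_mb = MICRODESC_MB * microdesc_savings()
share_today = saved_mb / (CONSENSUS_MB + MICRODESC_MB)
share_with_diffs = saved_mb / (CONSENSUS_DIFF_MB + MICRODESC_MB)
```

With the quoted inputs, microdesc_savings() comes out near 0.71, share_today near 0.126, and share_with_diffs near 0.55, which matches the proposal's "approximately 70%", "about 12%", and "more like 55%".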
Filename: 275-md-published-time-is-silly.txt
Title: Stop including meaningful "published" time in microdescriptor consensus
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Closed
Target: 0.3.1.x-alpha
Implemented-In: 0.4.8.1-alpha

0. Status:

As of 0.2.9.11 / 0.3.0.7 / 0.3.1.1-alpha, Tor no longer takes any special action on "future" published times, as proposed in section 4.

As of 0.4.0.1-alpha, we implemented a better mechanism for relays to know when to publish. (See proposal 293.)

1. Overview

This document proposes that, in order to limit the bandwidth needed for networkstatus diffs, we remove the "published" part of the "r" lines in microdescriptor consensuses.

The more extreme, compatibility-breaking version of this idea will reduce ed consensus diff download volume by approximately 55-75%. A less-extreme interim version would still reduce volume by approximately 5-6%.

2. Motivation

The current microdescriptor consensus "r" line format is:

    r Nickname Identity Published IP ORPort DirPort

as in:

    r moria1 lpXfw1/+uGEym58asExGOXAgzjE 2017-01-10 07:59:25 \
      128.31.0.34 9101 9131

As I'll show below, there's not much use for the "Published" part of these lines. By omitting them or replacing them with something more compressible, we can save space.

What's more, changes in the Published field are one of the most frequent changes between successive networkstatus consensus documents. If we were to remove this field, then networkstatus diffs (see proposal 140) would be smaller.

3. Compatibility notes

Above I've talked about "removing" the published field. But of course, doing this would make all existing consensus consumers stop parsing the consensus successfully.

Instead, let's look at how this field is used currently in Tor, and see if we can replace the value with something else.

  * Published is used in the voting process to decide which descriptor should be considered. But that is taken from vote networkstatus documents, not consensuses.

  * Published is used in mark_my_descriptor_dirty_if_too_old() to decide whether to upload a new router descriptor. If the published time in the consensus is more than 18 hours in the past, we upload a new descriptor. (Relays are potentially looking at the microdesc consensus now, since #6769 was merged in 0.3.0.1-alpha.) Relays have plenty of other ways to notice that they should upload new descriptors.

  * Published is used in client_would_use_router() to decide whether a routerstatus is one that we might possibly use. We say that a routerstatus is not usable if its published time is more than OLD_ROUTER_DESC_MAX_AGE (5 days) in the past, or if it is more than TestingEstimatedDescriptorPropagationTime (10 minutes) in the future. [***] Note that this is the only case where anything is rejected because it comes from the future.

      * client_would_use_router() decides whether we should download a router descriptor (not a microdescriptor) in routerlist.c

      * client_would_use_router() is used from count_usable_descriptors() to decide which relays are potentially usable, thereby forming the denominator of our "have descriptors / usable relays" fraction.

So we have fairly limited constraints on which Published values we can safely advertise with today's Tor implementations. If we advertise anything more than 10 minutes in the future, client_would_use_router() will consider routerstatuses unusable. If we advertise anything more than 18 hours in the past, relays will upload their descriptors far too often.

4. Proposal

Immediately, in 0.2.9.x-stable (our LTS release series), we should stop caring about published_on dates in the future. This is a two-line change.

As an interim solution: We should add a new consensus method number that changes the process by which Published fields in consensuses are generated. It should set all Published fields in the consensus to be the same value.
These fields should be taken to rotate every 15 hours, by taking consensus valid-after time, and rounding down to the nearest multiple of 15 hours since the epoch. As a longer-term solution: Once all Tor versions earlier than 0.2.9.x are obsolete (in mid 2018), we can update with a new consensus method, and set the published_on date to some safe time in the future. 5. Analysis To consider the impact on consensus diffs: I analyzed consensus changes over the month of January 2017, using scripts at [1]. With the interim solution in place, compressed diff sizes fell by 2-7% at all measured intervals except 12 hours, where they increased by about 4%. Savings of 5-6% were most typical. With the longer-term solution in place, and all published times held constant permanently, the compressed diff sizes were uniformly at least 56% smaller. With this in mind, I think we might want to only plan to support the longer-term solution. [1] https://github.com/nmathewson/consensus-diff-analysis
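
The interim rule from section 4 of proposal 275 (take the consensus valid-after time and round it down to a multiple of 15 hours since the epoch) amounts to one modulo operation. A minimal sketch, assuming valid-after is available as an integer count of seconds since the Unix epoch; the function name is chosen here for illustration:

```python
FIFTEEN_HOURS = 15 * 60 * 60  # seconds


def interim_published(valid_after):
    """Round a valid-after timestamp down to a multiple of 15 hours.

    valid_after is assumed to be an integer number of seconds since the
    Unix epoch.
    """
    return valid_after - (valid_after % FIFTEEN_HOURS)
```

Every consensus whose valid-after time falls in the same 15-hour window gets the same Published value, so that field only changes between successive consensuses once every 15 hours.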
Filename: 276-lower-bw-granularity.txt
Title: Report bandwidth with lower granularity in consensus documents
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Dead
Target: 0.3.1.x-alpha

[NOTE: We're calling this proposal dead for now: the benefits are small compared to the possible loss in routing correctness. If/when proposal 300 is built, it will have even less benefit. (2020 July 31)]

1. Overview

This document proposes that, in order to limit the bandwidth needed for networkstatus diffs, we lower the granularity with which bandwidth is reported in consensus documents.

Making this change will reduce the total compressed ed diff download volume by around 10%.

2. Motivation

Consensus documents currently report bandwidth values as the median of the measured bandwidth values in the votes. (Or as the median of all votes' values if there are not enough measurements.) And when voting, in turn, authorities simply report whatever measured value they most recently encountered, clipped to 3 significant base-10 figures.

This means that, from one consensus to the next, these weights change very often and with little significance: A large fraction of bandwidth transitions are under 2% in magnitude.

As we begin to use consensus diffs, each change will take space to transmit. So lowering the number of changes will lower client bandwidth requirements significantly.

3. Proposal

I propose that we round the bandwidth values, as they are placed in votes, to no more than two significant digits. In addition, for values beginning with decimal "2" through "4", we should round the first two digits to the nearest multiple of 2. For values beginning with decimal "5" through "9", we should round to the nearest multiple of 5.

The change will take effect progressively as authorities upgrade: since the median value is used, when one authority upgrades, 1/5 of the bandwidths will be rounded (on average). Once all authorities upgrade, all bandwidths will be rounded like this.

4. Analysis

The rounding proposed above will not round any value by more than 5% more than current rounding, so the overall impact on bandwidth balancing should be small.

In order to assess the bandwidth savings of this approach, I smoothed the January 2017 consensus documents' Bandwidth fields, using scripts from [1]. I found that if clients download consensus diffs once an hour, they can expect 11-13% mean savings after xz or gz compression. For two-hour intervals, the savings are 8-10%; for three-hour or four-hour intervals, the savings are only 6-8%. After that point, we start seeing diminishing returns, with only 1-2% savings on a 72-hour interval's diff.

[1] https://github.com/nmathewson/consensus-diff-analysis

5. Open questions:

Is there a greedier smoothing algorithm that would produce better results?

Is there any reason to think this amount of smoothing would not be safe?

Would a time-aware smoothing mechanism work better?
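
The rounding rule from section 3 of proposal 276 could be sketched as below. Several details are assumptions made for this sketch, since the proposal does not spell them out: ties round upward, single-digit values are left unchanged, and a carry out of the two leading digits (such as 996 becoming 1000) is simply renormalized. This is not the code from the analysis scripts referenced in [1].

```python
def round_to_multiple(n, m):
    """Round non-negative integer n to the nearest multiple of m (ties go up)."""
    return ((n + m // 2) // m) * m


def smooth_bandwidth(value):
    """Reduce a non-negative integer bandwidth to the proposed granularity."""
    if value < 10:
        return value  # assumption: nothing to round for one digit
    scale = 10 ** (len(str(value)) - 2)
    two = round_to_multiple(value, scale) // scale  # two leading digits
    if two >= 100:  # rounding carried over, e.g. 996 -> 100 tens
        two //= 10
        scale *= 10
    leading = two // 10
    if 2 <= leading <= 4:
        two = round_to_multiple(two, 2)
    elif leading >= 5:
        two = round_to_multiple(two, 5)
    return two * scale
```

Under these assumptions 1234 becomes 1200, 2345 becomes 2400, and 5678 becomes 5500, so nearby measurements collapse onto the same reported value and fewer lines change between consensuses.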
Filename: 277-detect-id-sharing.txt
Title: Detect multiple relay instances running with same ID
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Open
Target: 0.3.??

1. Overview

This document proposes that we detect multiple relay instances running with the same ID, and block them all, or block all but one of each.

2. Motivation

While analyzing microdescriptor and relay status transitions (see proposal XXXX), I found that something like 16/10631 router identities from January 2017 were apparently shared by two or more relays, based on their excessive number of onion key transitions. This is probably accidental; if it is intentional, it's probably not achieving whatever the relay operators intended.

Sharing identities causes all the relays in question to "flip" back and forth onto the network, depending on which one uploaded its descriptor most recently. One relay's address will be listed; and so will that relay's onion key. Routers connected to one of the other relays will believe its identity, but be suspicious of its address. Attempts to extend to the relay will fail because of the incorrect onion key. No more than one of the relays' bandwidths will actually get significant use.

So clearly, it would be best to prevent this.

3. Proposal 1: relay-side detection

Relays should themselves try to detect whether another relay is using their identity. If a relay, while running, finds that it is listed in a fresh consensus using an onion key other than its current or previous onion key, it should tell its operator about the problem.

(This proposal borrows from Mike Perry's ideas related to key theft detection.)

4. Proposal 2: offline detection

Any relay that has a large number of onion-key transitions over time, but only a small number of distinct onion keys, is probably two or more relays in conflict with one another.

In this case, the operators can be contacted, or the relay blacklisted.

We could build support for blacklisting all but one of the addresses, but it's probably best to treat this as a misconfiguration serious enough that it needs to be resolved.
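
Both detection ideas in proposal 277 reduce to simple checks over observed onion keys. The sketch below is a rough illustration only: the function names and the two thresholds are hypothetical, and the proposal does not say what counts as a "large" number of transitions or a "small" number of distinct keys.

```python
def count_transitions(observed_keys):
    """Count how often the listed onion key changes between observations."""
    return sum(1 for a, b in zip(observed_keys, observed_keys[1:]) if a != b)


def looks_shared(observed_keys, min_transitions=8, max_distinct=3):
    """Offline check: many transitions among only a few distinct keys.

    Both thresholds are placeholders, not values from the proposal.
    """
    return (count_transitions(observed_keys) >= min_transitions
            and len(set(observed_keys)) <= max_distinct)


def relay_sees_foreign_key(listed_key, current_key, previous_key=None):
    """Relay-side check: the consensus lists a key this relay never issued."""
    return listed_key not in (current_key, previous_key)
```

A relay that rotates normally produces one new distinct key per transition, so it stays below the transition threshold; two relays flipping between the same pair of keys cross it quickly.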
Filename: 278-directory-compression-scheme-negotiation.txt
Title: Directory Compression Scheme Negotiation
Author: Alexander Færøy
Created: 2017-03-06
Status: Closed
Implemented-In: 0.3.1.1-alpha

0. Overview

This document describes a method to provide and use different compression schemes in Tor's directory specification[0], leaving it up to the client and server to negotiate a mutually supported scheme using the semantics of the HTTP protocol.

Furthermore this proposal also extends Tor's directory protocol with support for the LZMA and Zstandard compression schemes.

1. Motivation

Currently Tor serves each directory client with its different document flavours in either an uncompressed format or, if the client adds a ".z"-suffix to the URL file path, a zlib-compressed document.

This has historically been non-problematic, but it prevents us from easily extending the set of supported compression schemes.

Some of the problems this proposal tries to address:

  - We currently only support zlib-based compression schemes and there is no way for directory servers or clients to announce which compression schemes they support. Zlib might not be the ideal compression scheme for all purposes.

  - It is not easily possible to add support for additional compression schemes without adding additional file extensions or flavours of the directory documents.

  - In low-bandwidth and/or low-memory client scenarios it is useful to be able to limit the number of supported compression schemes to have a client only support the most efficient compression scheme for the given use-case and have the directory servers support the most commonly available compression schemes used throughout the network.

  - We add support for the LZMA compression scheme, which yields better compressed size and decompression time at the expense of higher compression time and higher memory usage.

  - We add support for the Zstandard compression scheme, which yields better compression ratio than GZip, but slightly worse than LZMA, but with a smaller CPU and memory footprint than LZMA.

2. Analysis

We investigated the compression ratio, memory usage, memory allocation strategies, and execution time for compression and decompression of the GZip, BZip2, LZMA, and Zstandard compression schemes at compression levels 1 through 9.

The data used in this analysis can be found in [1] and the `bench` tool for generating the data can be found in [2].

During the preparation for this proposal Nick has analysed compressing consensus diffs using GZip, LZMA, and Zstandard. The result of Nick's analysis can be found in [3].

We must continue to support "gzip", "deflate", and "identity", which are the currently available compression schemes in the Tor network.

To further enhance the compression ratio, Nick has also worked on proposals #274 (Rotate onion keys less frequently), #275 (Stop including meaningful "published" time in microdescriptor consensus), #276 (Report bandwidth with lower granularity in consensus documents), and #277 (Detect multiple relay instances running with same ID), which all aid in making our consensus documents less dynamic.

3. Proposal

We extend the directory client requests to include the "Accept-Encoding" header as part of its request. The "Accept-Encoding" header should contain a comma-separated list of names of the compression schemes that the client supports. For example:

    GET / HTTP/1.0
    Accept-Encoding: x-zstd, x-tor-lzma, gzip, deflate

When a directory server receives a request with the "Accept-Encoding" header included, to either the ".z" compressed or the uncompressed version of any given document, it must decide on a mutually supported compression scheme and add the "Content-Encoding" header to its response, thus notifying the client of its decision. The "Content-Encoding" header can at most contain one supported compression scheme.
If no mutual compression scheme can be negotiated the server must respond with an HTTP error status code of 406 "Not Acceptable".

For example, a successful negotiation yields a response such as:

    HTTP/1.0 200 OK
    Content-Length: 1337
    Connection: close
    Content-Encoding: x-zstd

Currently supported compression scheme names include "identity", "gzip", and "deflate". This proposal adds two additional compression schemes named "x-tor-lzma" (LZMA) and "x-zstd" (Zstandard). All compression scheme names are case-insensitive.

The "deflate", "gzip", and "identity" compression schemes must be supported by directory servers for backwards compatibility.

We use the name "x-tor-lzma" instead of just "x-lzma" because we require a defined upper bound of memory usage that is available for decompression of LZMA compressed data. The upper bound for memory available for LZMA decompression is defined as 16 MB. This currently means that we will not use the LZMA compression scheme with a "preset" value higher than 6.

Additionally, when a client that supports this proposal makes a request for a directory document with the ".z"-suffix, it must send an ordered set of supported compression schemes where the last elements in the set contain compression schemes that are supported by all of the currently available Tor nodes ("gzip", "deflate", "identity"). In this way older relays will simply respond with the document compressed using zlib deflate without any prior knowledge of the newly added compression schemes.

If a directory server receives a request to a document with the ".z" suffix, where the client does not include an "Accept-Encoding" header, the server should respond with the zlib compressed version of the document for backwards compatibility with clients that do not support this proposal.

The "Content-Length" header contains the number of compressed bytes sent to the client.

The new compression schemes will be available for directory clients over both clearnet and BEGIN_DIR-style connections.

4. Security Implications

4.1 Compression and Decompression Bombs

We currently detect compression and decompression "bombs" and must continue to do so with any additional compression schemes that we add. The detection of compression and decompression bombs is handled in `is_compression_bomb()` in torgzip.c and the same functionality is used both for compression and decompression. This functionality must be extended to support LZMA and Zstandard.

4.2 Detection of Compression Algorithms

To ensure that we do not pass compressed data through the incorrect decompression handler, when we have received data from another peer, Tor tries to detect the compression scheme in `detect_compression_method()` in torgzip.c. This function should be extended to also detect the LZMA and Zstandard formats. Possible starting points for implementing this detection are xz-tools, zstd's CLI, and the libmagic 'compress' module.

4.3 Fingerprinting

All clients should aim at supporting the same set of supported compression schemes to avoid fingerprinting.

5. Compatibility

This proposal does not break any backwards compatibility. Tor will continue to support serving uncompressed and zlib-compressed objects using the method defined in the directory specification[0], but will allow newer clients to negotiate a mutually supported compression scheme.

6. Performance and Scalability

Each newly added compression scheme adds to the compression cache of a relay, which increases the memory requirements of a relay.

The LZMA compression scheme yields better compression ratio at the expense of higher memory and CPU requirements for compression and slightly higher memory and CPU requirements for decompression.

The Zstandard compression scheme yields better compression ratio than GZip does, but does not suffer from the same high CPU and memory requirements for compression as LZMA does.
Because of the high requirements for CPU and memory usage for LZMA it is possible that we do not support this scheme for all available documents or that we only support it in situations where it is possible to pre-compute and cache the compressed document. 7. References [0]: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt [1]: https://docs.google.com/spreadsheets/d/1devQlUOzMPStqUl9mPawFWP99xSsRM8xWv7DNcqjFdo [2]: https://gitlab.com/ahf/tor-sponsor4-compression [3]: https://github.com/nmathewson/consensus-diff-analysis 8. Acknowledgements This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.
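
The server-side decision described in section 3 of proposal 278 could look roughly like the sketch below. It is not Tor's implementation: the function names are invented here, quality values ("q=") in the header are ignored, and picking the first client-listed scheme that the server supports is an assumption, since the proposal only requires that a mutually supported scheme be chosen.

```python
ALL_SCHEMES = frozenset({"x-zstd", "x-tor-lzma", "gzip", "deflate", "identity"})


def parse_accept_encoding(header):
    """Split an Accept-Encoding value into lower-cased scheme names."""
    return [tok.strip().lower() for tok in header.split(",") if tok.strip()]


def negotiate(accept_encoding, wants_z_suffix, supported=ALL_SCHEMES):
    """Return (status_code, content_encoding) for a directory request.

    accept_encoding is the raw header value, or None if the client sent
    no such header. wants_z_suffix tells whether the URL ended in ".z".
    """
    if accept_encoding is None:
        # Legacy client: keep the old ".z" behaviour.
        return (200, "deflate" if wants_z_suffix else "identity")
    for scheme in parse_accept_encoding(accept_encoding):
        if scheme in supported:
            return (200, scheme)
    return (406, None)  # no mutually supported scheme
```

For the request shown in section 3, negotiate("x-zstd, x-tor-lzma, gzip, deflate", True) selects "x-zstd" on a server that supports every scheme, while a server limited to "gzip", "deflate", and "identity" falls through to "gzip".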
Filename: 279-naming-layer-api.txt Title: A Name System API for Tor Onion Services Author: George Kadianakis, Yawning Angel, David Goulet Created: 04-Oct-2016 Status: Needs-Revision Table Of Contents: 1. Introduction 1.1. Motivation 1.2. Design overview and rationale 2. System Specification 2.1. System overview [SYSTEMOVERVIEW] 2.2. System Illustration 2.3. System configuration [TORRC] 2.3.1. Tor name resolution logic 2.4. Name system initialization [INITPROTOCOL] 2.5. Name resolution using NS API 2.5.1. Message format 2.5.2. RESOLVED status codes 2.5.3. Further name resolution behavior 2.6. Cancelling a name resolution request 2.7. Launching name plugins [INITENVVARS] 2.8. Name plugin workflow [NSBEHAVIOR] 2.8.1. Name plugin shutdown [NSSHUTDOWN] 2.9. Further details of stdin/stdout communication 2.9.1. Message Format 3. Discussion 3.1. Using second-level domains instead of tld 3.2. Name plugins handling all tlds '*' 3.3. Deployment strategy 3.4. Miscellaneous discussion topics 4. Acknowledgements A.1: Example communication Tor <-> name plugin [PROTOEXAMPLE] A.2: Example plugins [PLUGINEXAMPLES] 1. Introduction This proposal specifies a modular API for integrating name systems with Tor. 1.1. Motivation Tor onion service addresses are decentralized and self-authenticated but they are not human-memorable (e.g. 3g2upl4pq6kufc4m.onion). This is a source of poor usability, since Internet users are familiar with the convenient naming of DNS and are not used to addresses being random text. In particular, onion addresses are currently composed of 16 random base32 characters, and they look like this: 3g2upl4pq6kufc4m.onion vwakviie2ienjx7t.onion idnxcnkne4qt76tg.onion vwakviie2ienjx6t.onion When Proposal 224 gets deployed, onion addresses will become even bigger: 53 base32 characters. 
That's:

    llamanymityx4fi3l6x2gyzmtmgxjyqyorj9qsb5r543izcwymle.onion
    lfels7g3rbceenuuqmpsz45z3lswakqf56n5i3bvqhc22d5rrsza.onion
    odmmeotgcfx65l5hn6ejkaruvai222vs7o7tmtllszqk5xbysola.onion
    qw3yvgovok3dsarhqephpu2pkiwzjdk2fgdfwwf3tb3vgzxr5kba.onion

Over the years Tor users have come up with various ad-hoc ways of handling onion addresses. For example people memorize them, or use third-party centralized directories, or just google them every time.

We believe that the UX problem of non-human-memorable addresses is not actually solved with the above ad-hoc solutions and remains a critical usability barrier that prevents onion services from being used by a wider audience.

1.2. Design overview and rationale

During the past years there has been lots of research on secure naming and various such systems have been proposed (e.g. GNS, Namecoin, etc.).

It turns out that securely naming things is a very hard research problem, and hence none of the proposed systems is a clear winner: all of them carry various trade-offs. Furthermore, none of the proposed systems has seen widespread use so far, which makes it even harder to pick a single project.

Given the plenitude of options, one approach to decide which system is best is to make various decent name systems available and let the Tor community and the sands of time pick the winner. Also, it might be that there is no single winner, and perhaps different specialized name systems should be used in different situations. We believe that by getting secure name systems actually utilized by real users, the whole field will mature and existing systems will get battle-hardened.

Hence, the contribution of this proposal is a modular Name System API (NS API) that allows developers to integrate their own name systems in Tor. The interface design is name-system-agnostic, and it's heavily based on the pluggable transports API (proposal 180). It should be flexible enough to accommodate all sorts of name systems (see [PLUGINEXAMPLES]).

2. System Specification

A developer that wants to integrate a name system with Tor needs to first write a wrapper that understands the Tor Name System API (NS API). Using the Name System API, Tor asks the name system to perform name queries, and receives the query results. The NS API works using environment variables and stdin/stdout communication. It aims to be portable and easy to implement.

2.1. System overview [SYSTEMOVERVIEW]

Here is an overview of the Tor name system:

Alice, a Tor user, can activate various name systems by editing her torrc file and specifying which tld each name system is responsible for. For this section, let's consider a simple fictional name system, unicorn, which magically maps domains with the .corn tld to the correct onion address. Here it is:

    OnionNamePlugin 0 .corn /usr/local/bin/unicorn

After Alice enables the unicorn plugin, she attempts connecting to elephantforum.corn. Tor will intercept the SOCKS request, and use the executable at /usr/local/bin/unicorn to query the unicorn name system for elephantforum.corn. Tor communicates with the unicorn plugin using the Tor NS API through which name queries and their results can be transported using stdin/stdout.

If elephantforum.corn corresponds to an onion address in the unicorn name system, unicorn should return the onion address to Tor using the Tor NS API. Tor must then internally rewrite the elephantforum.corn address to the actual onion address, and initiate a connection to it.

2.2. System Illustration

Here is a diagram illustrating how the Tor Name System API works. The name system used in this example is GNS, but there is nothing GNS-specific here and GNS could be swapped for any other name system (like hosts files, or Namecoin).

The example below illustrates how a user who types debian.zkey in their Tor Browser gets redirected to sejnfjrq6szgca7v.onion after Tor consults the GNS network.

Please revisit this illustration after reading the rest of the proposal.

        |                                    $~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$
        | 1.                                 $   4. GNS magic!!                     $
        | User: SOCKS CONNECT to             $ debian.zkey -> sejnfjrq6szgca7v.onion$
        |   http://debian.zkey/              $~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$~~~~~~~$
        |                                                                   $
  +-----|-----------------------------------------+                         $
  |+----v-----+         2.             +---------+|      3.                 $
  ||Tor       |     debian.zkey        |Tor      ||  debian.zkey          +-$-------+
  ||Networking------------------------->Naming   ------------------------->         |
  ||Submodule |                        |Submodule||  Tor Name System API  |  GNS    |
  ||          <-------------------------         <-------------------------  wrapper|
  ||          |         6.             |         ||5.                     |         |
  |+----|-----+ sejnfjrq6szgca7v.onion +---------+|sejnfjrq6szgca7v.onion +---------+
  +-----|-----------------------------------------+
        | 7.
        | Tor: Connect to
        |   http://sejnfjrq6szgca7v.onion/
        v

2.3. System configuration [TORRC]

As demonstrated in [SYSTEMOVERVIEW], a Tor user who wants to use a name system has to edit their configuration file appropriately. Here is the torrc line format:

    OnionNamePlugin <priority> <tld> <path>

where <priority> is a positive integer denoting the priority with which this name plugin should be consulted. <tld> is a string which restricts the scope of this plugin to a particular tld. Finally, <path> is a filesystem path to an executable that speaks the Tor Name System API and can act as an intermediary between Tor and the name system.

For example here is a snippet from a torrc file:

    OnionNamePlugin 0 .hosts /usr/local/bin/local-hosts-file
    OnionNamePlugin 1 .zkey /usr/local/bin/gns-tor-wrapper
    OnionNamePlugin 2 .bit /usr/local/bin/namecoin-tor-wrapper
    OnionNamePlugin 3 .scallion /usr/local/bin/community-hosts-file

2.3.1. Tor name resolution logic

When Tor receives a SOCKS request to an address that has a name plugin assigned to it, it needs to perform a query for that address using that name plugin.

If there are multiple name plugins that correspond to the requested address, Tor queries all relevant plugins sorted by their priority value, until one of them returns a successful result.
If two plugins have the same priority value, Tor MUST abort. If all plugins fail to successfuly perform the name resolution, Tor SHOULD default to using the exit node for name resolution. XXX or not? because of leaks? 2.4. Name system initialization [INITPROTOCOL] When Tor finds OnionNamePlugin lines in torrc, it launches and initializes their respective executables. When launching name plugins, Tor sets various environment variables to pass data to the name plugin (e.g. NS API version, state directory, etc.). More information on the environment variables at [INITENVVARS]. After a name plugin initializes and parses all needed environment variables, it communicates with Tor using its stdin/stdout. The first line that a name plugin sends to stdout signifies that it's ready to receive name queries. This line looks like this: INIT <VERSION> <STATUS_CODE> [<STATUS_MSG>] where VERSION is the Tor NS API protocol version that the plugin supports, STATUS_CODE is an integer status code, and STATUS_MSG is an optional string error message. STATUS_CODE value 0 is reserved for "success", and all other integers are error codes. See [PROTOEXAMPLE] for an example of this protocol. 2.5. Name resolution using NS API Here is how actual name resolution requests are performed in NS API. 2.5.1. Message format When Tor receives a SOCKS request to an address with a tld that has a name plugin assigned to it, Tor performs an NS API name query for that address. Tor does this by printing lines on the name plugin stdout as follows: RESOLVE <QUERY_ID> <NAME_STRING> where QUERY_ID is a unique integer corresponding to this query, and NAME_STRING is the name to be queried. When the name plugin completes the name resolution, it prints the following line in its stdout: RESOLVED <QUERY_ID> <STATUS_CODE> <RESULT> where QUERY_ID is the corresponding query ID and STATUS_CODE is an integer status code. 
RESULT is the resolution result (an onion address) or an error message if the resolution was not successful.

See [PROTOEXAMPLE] for an example of this protocol.

XXX Should <RESULT> be optional in the case of failure?

2.5.2. RESOLVED status codes

Name plugins can deliver the following status codes:

    0 -- The name resolution was successful.
    1 -- Name resolution generic failure.
    2 -- Name tld not recognized.
    3 -- Name not registered.
    4 -- Name resolution timeout exceeded.

XXX add more status codes here as needed

2.5.3. Further name resolution behavior

Tor and name plugins MAY cache name resolution results in memory as needed. Caching results on disk should be avoided.

Tor SHOULD abort (or cancel) an ongoing name resolution request, if it takes more than NAME_RESOLUTION_TIMEOUT seconds.

XXX NAME_RESOLUTION_TIMEOUT = ???

Tor MUST validate that the resolution result is a valid .onion name.

XXX should we also accept IPs and regular domain results???

XXX perhaps we should make sure that results are not names that need additional name resolution to avoid endless loops. e.g. imagine some sort of loop like this: debian.zkey -> debian-bla.zkey -> debian.zkey -> etc.

2.6. Cancelling a name resolution request

Tor might need to cancel an ongoing name resolution request (e.g. because a timeout passed, or the client is not interested in that address anymore). In this case, Tor sends the following line to the plugin's stdin:

    CANCEL <QUERY_ID>

to which the name plugin, after performing the cancellation, SHOULD answer with:

    CANCELED <QUERY_ID>

2.7. Launching name plugins [INITENVVARS]

As described in [INITPROTOCOL], when Tor launches a name plugin, it sets certain environment variables. At a minimum, it sets (in addition to the normal environment variables inherited from Tor):

    "TOR_NS_STATE_LOCATION" -- A filesystem directory path where the plugin should store state if it wants to.
This directory is not required to exist, but the plugin SHOULD be able to create it if it doesn't. The plugin MUST NOT store state elsewhere. Example: TOR_NS_STATE_LOCATION=/var/lib/tor/ns_state/ "TOR_NS_PROTO_VERSION" -- To tell the plugin which versions of this configuration protocol Tor supports. Future versions will give a comma-separated list. Plugins MUST accept comma-separated lists containing any version that they recognize, and MUST work correctly even if some of the versions they don't recognize are non-numeric. Valid version characters are non-space, non-comma printing ASCII characters. Example: TOR_NS_PROTO_VERSION=1,1a,2,4B "TOR_NS_PLUGIN_OPTIONS" -- Specifies configuration options for this name plugin as a semicolon-separated list of k=v strings with options that are to be passed to the plugin. Colons, semicolons, equal signs and backslashes MUST be escaped with a backslash. If there are no arguments that need to be passed to any of the plugins, "TOR_NS_PLUGIN_OPTIONS" MAY be omitted. For example consider the following options for the "banana" name plugin: TOR_NS_PLUGIN_OPTIONS=timeout=5;url=https://bananacake.com Will pass to banana the parameters 'timeout=5' and 'url=https://bananacake.com'. XXX Do we like this option-passing interface? Do we have any lessons from our PT experiences? XXX Add ControlPort/SocksPort environment variables. See [PROTOEXAMPLE] for an example of this environment 2.8. Name plugin workflow [NSBEHAVIOR] Name plugins follow the following workflow: 1) Tor sets the required environment values and launches the name plugin as a sub-process (fork()/exec()). See [INITENVVARS]. 2) The name plugin checks its environment, and determines the supported NS API versions using the env variable TOR_NS_PROTO_VERSION. 2.1) If there are no compatible versions, the name plugin writes an INIT message with a failure status code as in [INITPROTOCOL], and then shuts down. 3) The name plugin parses and handles the rest of the environment values. 
   3.1) If the environment variables are malformed, or otherwise invalid, the name plugin writes an INIT message with a failure status code as in [INITPROTOCOL], and then shuts down.

4) After the name plugin completely initializes, it sends a successful INIT message to stdout as in [INITPROTOCOL]. Then it continues monitoring its stdin for incoming RESOLVE messages.

5) When the name plugin receives a RESOLVE message, it performs the name resolution and replies with the appropriate RESOLVED message.

6) Upon being signaled to terminate by the parent process [NSSHUTDOWN], the name plugin gracefully shuts down.

2.8.1. Name plugin shutdown [NSSHUTDOWN]

To ensure clean shutdown of all name plugins when Tor shuts down, the following rules apply for name plugins:

Name plugins MUST handle OS specific mechanisms to gracefully terminate (e.g. SIGTERM).

Name plugins SHOULD monitor their stdin and exit gracefully when it is closed.

2.9. Further details of stdin/stdout communication

2.9.1. Message Format

Tor and its name plugins communicate by exchanging NL-terminated lines over the plugin's stdin and stdout. The line metaformat is

    <Line> ::= <Keyword> <OptArgs> <NL>
    <Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar>
    <KeywordChar> ::= <any US-ASCII alphanumeric, dash, and underscore>
    <OptArgs> ::= <Args>*
    <Args> ::= <SP> <ArgChar> | <Args> <ArgChar>
    <ArgChar> ::= <any US-ASCII character but NUL or NL>
    <SP> ::= <US-ASCII whitespace symbol (32)>
    <NL> ::= <US-ASCII newline (line feed) character (10)>

Tor MUST ignore lines with keywords that it doesn't recognize.

3. Discussion

3.1. Using second-level domains instead of tld

People have suggested that users should try to connect to reddit.zkey.onion instead of reddit.zkey. That is, we should always preserve .onion as the tld, and only utilize second-level domains for naming.

The argument for this is that this way users cannot accidentally leak addresses to DNS, as the .onion domain is reserved by RFC 7686.
The counter-argument here is that this might be confusing to users since they are not used to the second-level domain being special (e.g. co.uk). Also, what happens when someone registers a 16-character name, that matches the length of a vanilla onion address? We should consider the concerns here and take the right decision. 3.2. Name plugins handling all tlds '*' In [TORRC], we assigned a single tld to each name plugin. Should we also accept catch-all tlds using '*'? I'm afraid that this way a name system could try to resolve even normal domains like reddit.com . Perhaps we trust the name plugin itself, but maybe the name system network could exploit this? Also, the catch-all tld will probably cause some engineering complications in this proposal (as it did for PTs). 3.3. Deployment strategy We need to devise a deployment strategy that will allow us to improve the UX of our users as soon as possible, but without taking hasty, sloppy or uneducated decisions. For starters, we should make it easy for developers to write wrappers around their secure name systems. We should develop libraries that speak the NS API protocol and can be used to quickly write wrappers. Similar libraries were quite successful during pluggable transport deployment; see pyptlib and goptlib. In the beginning, name plugins should be third-party applications that can be installed by interested users manually or through package managers. Users will also have to add the appropriate OnionNamePlugin line to their torrc. This will be a testing phase, and also a community-growing phase. After some time, and when we get a better idea of how name plugins work for the community, we can start considering how to make them more easily usable. For example, we can start by including some name plugins into TBB in an optional opt-in fashion. We should be careful here, as people have real incentives for attacking name systems and we should not put our users unwillingly in danger. 3.4. 
Miscellaneous discussion topics

1. The PT spec tries hard so that a single executable can expose multiple PTs. In this spec, it's clear that each executable is a single name plugin. Is this OK or a bad idea? Should we change interfaces so that each name plugin has an identifier, and then use that identifier for things?

2. Should we make our initialization protocol _identical_ to the PT API initialization protocol? That is, use ENV-ERROR etc. instead of INIT?

3. Does it make sense to support reverse queries, from .onion to names? So that people auto-learn the names of the onions they use?

4. Acknowledgements

Proposal details discussed during Tor hackfest in Seattle between Yawning, David and me. Thanks to Lunar and indolering for more discussion and feedback.

This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

Appendix A.1: Example communication Tor <-> name plugin [PROTOEXAMPLE]

Environment variables:

    TOR_NS_STATE_LOCATION=/var/lib/tor/ns_state
    TOR_NS_PROTO_VERSION=1
    TOR_NS_PLUGIN_OPTIONS=timeout=5;cache=/home/user/name_cache

Messages between Tor and the banana name plugin:

    Name plugin (banana) -> Tor:
        INIT 1 0

    Tor -> Name plugin (banana):
        RESOLVE 1 daewonskate.banana
    Name plugin (banana) -> Tor:
        RESOLVED 1 0 jqkscnkne4qt91iq.onion

    Tor -> Name plugin (banana):
        RESOLVE 2 architecturedirect.zkey
    Name plugin (banana) -> Tor:
        RESOLVED 2 2 "zkey not recognized tld"

    Tor -> Name plugin (banana):
        RESOLVE 3 origamihub.banana
    Name plugin (banana) -> Tor:
        RESOLVED 3 0 wdxfpaxar4dg12vd.onion

Appendix A.2: Example plugins [PLUGINEXAMPLES]

Here are a few examples of name plugins for brainstorming:

a) Simplest plugin: A local hosts file. Basically a local petname system that maps names to onion addresses.

b) A remote hosts file. A centralized community hosts file that people trust.

c) Multiple remote hosts files. People can add their own favorite community hosts file.
d) Multiple remote hosts files with notaries and reputation trust. Like moxie's convergence tool but for names.

e) GNS

f) OnioNS

g) Namecoin/Blockstack
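As an illustration of example (a), the following Python sketch shows roughly what a local-hosts-file name plugin speaking the NS API from sections 2.4 to 2.8 might look like. It is a minimal sketch under assumptions, not a reference implementation: the HOSTS table, the helper name handle_line, the failure text in the INIT line, and the quoted error string are all hypothetical, and the proposal leaves several of these details open (see the XXX notes above).

```python
import os
import sys

# Hypothetical petname table. A real plugin would more likely load this
# from a file under TOR_NS_STATE_LOCATION.
HOSTS = {"debian.hosts": "sejnfjrq6szgca7v.onion"}


def handle_line(line, hosts):
    """Return the reply line for one NS API message, or None to ignore it."""
    parts = line.strip().split(" ")
    if len(parts) == 3 and parts[0] == "RESOLVE":
        _, query_id, name = parts
        if name in hosts:
            # Status code 0: the name resolution was successful.
            return "RESOLVED %s 0 %s" % (query_id, hosts[name])
        # Status code 3: name not registered (message text is an assumption).
        return 'RESOLVED %s 3 "name not registered"' % query_id
    if len(parts) == 2 and parts[0] == "CANCEL":
        return "CANCELED %s" % parts[1]
    return None  # unrecognized keyword: ignore the line


def main():
    # Step 2 of the workflow: check which NS API versions Tor offers.
    versions = os.environ.get("TOR_NS_PROTO_VERSION", "").split(",")
    if "1" not in versions:
        # Non-zero status signals failure; the message text is an assumption.
        print("INIT 1 1 no-compatible-version", flush=True)
        return
    print("INIT 1 0", flush=True)
    # The loop ends when Tor closes our stdin, which doubles as shutdown.
    for line in sys.stdin:
        reply = handle_line(line, HOSTS)
        if reply is not None:
            print(reply, flush=True)


if __name__ == "__main__":
    main()
```

Because this sketch answers each RESOLVE synchronously, a CANCEL can only arrive after the answer has already been sent; replying CANCELED in that case is one possible reading of section 2.6, not something the proposal settles.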
Filename: 280-privcount-in-tor.txt Title: Privacy-Preserving Statistics with Privcount in Tor Author: Nick Mathewson, Tim Wilson-Brown Created: 02-Aug-2017 Status: Superseded Superseded-By: 288 0. Acknowledgments Tariq Elahi, George Danezis, and Ian Goldberg designed and implemented the PrivEx blinding scheme. Rob Jansen and Aaron Johnson extended PrivEx's differential privacy guarantees to multiple counters in PrivCount: https://github.com/privcount/privcount/blob/master/README.markdown#research-background Rob Jansen and Tim Wilson-Brown wrote the majority of the experimental PrivCount code, based on the PrivEx secret-sharing variant. This implementation includes contributions from the PrivEx authors, and others: https://github.com/privcount/privcount/blob/master/CONTRIBUTORS.markdown This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. 1. Introduction and scope PrivCount is a privacy-preserving way to collect aggregate statistics about the Tor network without exposing the statistics from any single Tor relay. This document describes the behavior of the in-Tor portion of the PrivCount system. It DOES NOT describe the counter configurations, or any other parts of the system. (These will be covered in separate proposals.) 2. PrivCount overview Here follows an oversimplified summary of PrivCount, with enough information to explain the Tor side of things. The actual operation of the non-Tor components is trickier than described below. All values in the scheme below are 64-bit unsigned integers; addition and subtraction are modulo 2^64. In PrivCount, a Data Collector (in this case a Tor relay) shares numeric data with N different Tally Reporters. (A Tally Reporter performs the summing and unblinding roles of the Tally Server and Share Keeper from experimental PrivCount.) 
All N Tally Reporters together can reconstruct the original data, but no (N-1)-sized subset of the Tally Reporters can learn anything about the data.

(In reality, the Tally Reporters don't reconstruct the original data at all! Instead, they will reconstruct a _sum_ of the original data across all participating relays.)

To share data, for each value X to be shared, the relay generates random values B_1 through B_n, and shares each B_i secretly with a single Tally Reporter. The relay then publishes Y = X + SUM(B_i) + Z, where Z is a noise value taken at random from a Gaussian distribution. The Tally Reporters can reconstruct X+Z by securely computing SUM(B_i) across all contributing Data Collectors. (Tally Reporters MUST NOT share individual B_i values: that would expose the underlying relay totals.)

In order to prevent bogus data from corrupting the tally, the Tor relays and the Tally Reporters perform multiple "instances" of this algorithm, randomly sampling the relays in each instance. Each relay sends multiple Y values for each measurement, built with different sets of B_i. These "instances" are numbered in order from 1 to R.

So that the system will still produce results in the event of a single Tally Reporter failure, these instances are distributed across multiple subsets of Tally Reporters.

Below we describe a data format for this.

3. The document format

This document format builds on the line-based directory format used for other tor documents, described in Tor's dir-spec.txt.

Using this format, we describe two kinds of documents here: a "counters" document that publishes all the Y values, and a "blinding" document that describes the B_i values. But see "An optimized alternative" below.

The "counters" document has these elements:

   "privctr-dump-format" SP VERSION SP SigningKey

        [At start, exactly once]

        Describes the version of the dump format, and provides an ed25519 signing key to identify the relay. The signing key is encoded in base64 with padding stripped.
VERSION is "alpha" now, but should be "1" once this document is finalized. [[[TODO: Do we need a counter version as well? Noise is distributed across a particular set of counters, to provide differential privacy guarantees for those counters. Reducing noise requires a break in the collection. Adding counters is ok if the noise on each counter monotonically increases. (Removing counters always reduces noise.) We also need to work out how to handle instances with mixed Tor versions, where some Data Collectors report a different set of counters than other Data Collectors. (The blinding works if we substitute zeroes for missing counters on Tally Reporters. But we also need to add noise in this case.) -teor ]]] "starting-at" SP IsoTime [Exactly once] The start of the time period when the statistics here were collected. "ending-at" SP IsoTime [Exactly once] The end of the time period when the statistics here were collected. "num-instances" SP Number [Exactly once] The number of "instances" that the relay used (see above.) "tally-reporter" SP Identifier SP Key SP InstanceNumbers [At least twice] The curve25519 public key of each Tally Reporter that the relay believes in. (If the list does not match the list of participating tally reporters, they won't be able to find the relay's values correctly.) The identifiers are non-space, non-nul character sequences. The Key values are encoded in base64 with padding stripped; they must be unique within each counters document. The InstanceNumbers are comma-separated lists of decimal integers from 0 to (num-instances - 1), in ascending order. Keyword ":" SP Int SP Int SP Int ... [Any number of times] The Y values for a single measurement. There are num-instances such Y values for each measurement. They are 64-bit unsigned integers, expressed in decimal. The "Keyword" denotes which measurement is being shared. 
        Keyword MAY be any sequence of characters other than colon, nul, space, and newline, though implementers SHOULD avoid getting too creative here. Keywords MUST be unique within a single document. Tally Reporters MUST handle unrecognized keywords. Keywords MAY appear in any order.

        It is safe to send the blinded totals for each instance to every Tally Reporter. To unblind the totals, a Tally Reporter needs:

          * a blinding document from each relay in the instance, and
          * the per-counter blinding sums from the other Tally Reporters in their instance.

        [[[TODO: But is it safer to create a per-instance counters document? -- teor]]]

        The semantics of individual measurements are not specified here.

   "signature" SP Signature

        [At end, exactly once]

        The Ed25519 signature of all the fields in the document, from the first byte, up to but not including the "signature" keyword here. The signature is encoded in base64 with padding stripped.

The "blinding" document has these elements:

   "privctr-secret-offsets" SP VERSION SP SigningKey

        [At start, exactly once.]

        The VERSION and SigningKey parameters are the same as for "privctr-dump-format".

   "instances" SP Numbers

        [Exactly once]

        The instances that this Tally Reporter handles. They are given as comma-separated decimal integers, as in the "tally-reporter" entry in the counters document. They MUST match the instances listed in the counters document.

        [[[TODO: this is redundant. Specify the constraint instead? --teor]]]

   "num-counters" SP Number

        [Exactly once]

        The number of counters that the relay used in its counters document. This MUST be equal to the number of keywords in the counters document.

        [[[TODO: this is redundant. Specify the constraint instead? --teor]]]

   "tally-reporter-pubkey" SP Key

        [Exactly once]

        The curve25519 public key of the tally reporter who is intended to receive and decrypt this document. The key is base64-encoded with padding stripped.
"count-document-digest" SP "sha3" Digest NL "-----BEGIN ENCRYPTED DATA-----" NL Data "-----END ENCRYPTED DATA-----" NL [Exactly once] The SHA3-256 digest of the count document corresponding to this blinding document. The digest is base64-encoded with padding stripped. The data encodes the blinding values (See "The Blinding Values") below, and is encrypted to the tally reporter's public key using the hybrid encryption algorithm described below. "signature" SP Signature [At end, exactly once] The Ed25519 signature of all the fields in the document, from the first byte, up to but not including the "signature" keyword here. The signature is encoded in base64 with padding stripped. 4. The Blinding Values The "Data" field of the blinding documents above, when decrypted, yields a sequence of 64-bit binary values, encoded in network (big-endian) order. There are C * R such values, where C is the number of keywords in the count document, and R is the number of instances that the Tally Reporter participates in. The client generates all of these values uniformly at random. For each keyword in the count document, in the order specified by the count document, the decrypted data holds R*8 bytes for the specified instance of that keyword's blinded counter. For example: if the count document lists the keywords "b", "x", "g", and "a" (in that order), and lists instances "0" and "2", then the decrypted data will hold the blinding values in this order: b, instance 0 b, instance 2 x, instance 0 x, instance 2 g, instance 0 g, instance 2 a, instance 0 a, instance 2 4. Implementation Notes A relay should, when starting a new round, generate all the blinding values and noise values in advance. The relay should then use these values to compute Y_0 = SUM(B_i) + Z for each instance of each counter. Having done this, the relay MUST encrypt the blinding values to the public key of each tally reporter, and wipe them from memory. 5. 
The hybrid encryption algorithm We use a hybrid encryption scheme above, where items can be encrypted to a public key. We instantiate it as follows, using curve25519 public keys. To encrypt a plaintext M to a public key PK1 1. the sender generates a new ephemeral keypair sk2, PK2. 2. The sender computes the shared diffie hellman secret SEED = (sk2 * PK1). 3. The sender derives 64 bytes of key material as SHAKE256(TEXT | SEED)[...64] where "TEXT" is "Expand curve25519 for privcount encryption". The first 32 bytes of this is an aes key K1; the second 32 bytes are a mac key K2. 4. The sender computes a ciphertext C as AES256_CTR(K1, M) 5. The sender computes a MAC as SHA3_256([00 00 00 00 00 00 00 20] | K2 | C) 6. The hybrid-encrypted text is PK2 | MAC | C. 6. An optimized alternative As an alternative, the sequences of blinding values are NOT transmitted to the tally reporters. Instead the client generates a single ephemeral keypair sk_c, PK_c, and places the public key in its counts document. It does this each time a new round begins. For each tally reporter with public key PK_i, the client then does the handshake sk_c * PK_i to compute SEED_i. The client then generates the blinding values for that tally reporter as SHAKE256(SEED_i)[...R*C*8]. After initializing the counters to Y_0, the client can discard the blinding values and sk_c. Later, the tally reporters can reconstruct the blinding values as SHAKE256(sk_i * PK_c)[...] This alternative allows the client to transmit only a single public key, when previously it would need to transmit a complete set of blinding factors for each tally reporter. Further, the alternative does away with the need for blinding documents altogether. It is, however, more sensitive to any defects in SHAKE256 than the design above. Like the rest of this design, it would need rethinking if we want to expand this scheme to work with anonymous data collectors, such as Tor clients.
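The blinding arithmetic from sections 2 and 6 can be illustrated with a short Python sketch. It is not the PrivCount implementation: the seeds are random stand-ins for the curve25519 handshake outputs SEED_i, the noise Z is a fixed placeholder rather than a Gaussian sample, and the helper names (blinding_values, initial_counter, unblind) are hypothetical.

```python
import hashlib
import secrets

MASK = (1 << 64) - 1  # all values are 64-bit unsigned; arithmetic is mod 2^64


def blinding_values(seed, num_counters, num_instances):
    """Expand one seed into C*R blinding values via SHAKE256(seed)[...R*C*8]."""
    count = num_counters * num_instances
    stream = hashlib.shake_256(seed).digest(count * 8)
    # Each value is 8 bytes in network (big-endian) order.
    return [int.from_bytes(stream[i * 8:(i + 1) * 8], "big")
            for i in range(count)]


def initial_counter(blinds, noise):
    """Y_0 = SUM(B_i) + Z for one instance of one counter."""
    return (sum(blinds) + noise) & MASK


def unblind(y, blinds):
    """Recover X + Z from Y once SUM(B_i) is known."""
    return (y - sum(blinds)) & MASK


# Assumption: random bytes stand in for SEED_i = sk_c * PK_i, one per reporter.
seeds = [secrets.token_bytes(32) for _ in range(3)]
# One counter, one instance: take the single value derived for each reporter.
blinds = [blinding_values(seed, 1, 1)[0] for seed in seeds]

noise = 5                       # placeholder for the Gaussian noise Z
y = initial_counter(blinds, noise)
y = (y + 1000) & MASK           # the relay counts X = 1000 events
recovered = unblind(y, blinds)  # equals X + Z
```

In the scheme described above the Tally Reporters only ever combine sums across relays; unblinding a single relay's counter here is purely to show that the modular arithmetic cancels.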
Filename: 281-bulk-md-download.txt
Title: Downloading microdescriptors in bulk
Author: Nick Mathewson
Created: 11-Aug-2017
Status: Reserve

1. Introduction

This proposal describes a way to download more microdescriptors at a time, using fewer bytes.

Right now, to download N microdescriptors, the client must send about 44*N bytes in its HTTP request. Because clients can request microdescriptors in any combination, the directory caches cannot pre-compress responses to these requests, and need to use less space-efficient on-the-fly compression algorithms.

Under this proposal, clients simply say "Send me the microdescriptors I need, given what I know."

2. Combined microdescriptor downloads

2.1. By diff

If a client has a consensus with base64 sha3-256 digest X, and it previously had a consensus with base64 sha3-256 digest Y, then it may request all the microdescriptors listed in X but not Y, by asking for the resource:

    /tor/micro/diff/X/Y

Clients SHOULD only ask for this resource compressed.

Caches MUST NOT answer this request unless they recognize both the consensus with digest X and the consensus with digest Y. If answering, caches MUST reply with all of the microdescriptors that the cache holds that were listed by consensus X, and MUST omit all the microdescriptors that were listed in consensus Y. (For the purposes of this proposal, microdescriptors are "the same" if they are textually identical and have the same digest.)

2.2. By consensus

If a client has fewer than NMNM% of the microdescriptors listed in a consensus X, it should fetch the resource

    /tor/micro/full/X

Clients SHOULD only ask for this resource compressed.

Caches MUST NOT answer this request unless they recognize the consensus with digest X. They should send all the microdescriptors they have that are listed in that consensus.

2.3.
When to make these requests

Clients should decide to use this format in preference to the old download-by-digest format if the consensus X lists their preferred directory cache as using a new DirCache subprotocol version. (See 5 below.)

When a client has some preferred directory caches that support this subprotocol and some that do not, it chooses one at random, and uses these requests if that one supports this subprotocol.

(A client always has a consensus when it requests microdescriptors, so it will know whether any given cache supports these requests.)

3. Performance analysis

This is a back-of-the-envelope analysis using a month's worth of consensus documents, and a randomly chosen sample of microdescriptors.

On average, about 0.5% of the microdescriptors change between any two consensuses. Call it 50. That means 50*43 bytes == 2150 bytes to request the microdescriptors. It means ~24530 bytes of microdescriptors downloaded, compressed to ~13687 bytes by zstd.

With this proposal, we're down to 86 bytes for the request, and we can precompute the compressed output, making it safe to use lzma2, getting a compressed result more like 13362.

It appears that this change would save about 15% for incremental microdescriptor downloads, most of that coming from the reduction in request size.

For complete downloads, a complete set of microdescriptors is about 7700 microdescriptors long. That makes the total number of bytes for the requests 7700*43 == 331100 bytes. The response, if compressed with lzma instead of zstd, would fall from 1659682 to 1587804 bytes, for a total savings of 20%.

5. Compatibility

Caches supporting this download protocol need to advertise support of a new DirCache subprotocol version.
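The savings figures in the performance analysis can be rechecked with a few lines of Python. This is only a worked version of the arithmetic in section 3: the function name is made up for illustration, and the 43-byte size assumed for a single /tor/micro/full/X request is an assumption (the proposal gives only the 86-byte figure for the two-digest diff request).

```python
def fractional_savings(old_request, old_response, new_request, new_response):
    """Fraction of total bytes saved when moving from the old to the new scheme."""
    old_total = old_request + old_response
    new_total = new_request + new_response
    return (old_total - new_total) / old_total


# Incremental download: about 50 changed microdescriptors, 43 request bytes each.
incremental = fractional_savings(
    old_request=50 * 43,    # 2150 bytes of digests in the request
    old_response=13687,     # zstd, compressed on the fly
    new_request=86,         # one diff request naming two digests
    new_response=13362,     # lzma2, precomputed
)

# Complete download: about 7700 microdescriptors.
full = fractional_savings(
    old_request=7700 * 43,  # 331100 bytes of digests in the requests
    old_response=1659682,   # zstd
    new_request=43,         # assumption: one digest in the request
    new_response=1587804,   # lzma
)

print("incremental: %.1f%%" % (100 * incremental))
print("full:        %.1f%%" % (100 * full))
```

Under these inputs the incremental case comes out at roughly 15% and the complete case at roughly 20%, matching the rounded figures in the text; in the incremental case most of the difference comes from the request size.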
Filename: 282-remove-named-from-consensus.txt
Title: Remove "Named" and "Unnamed" handling from consensus voting
Author: Nick Mathewson
Created: 12-Sep-2017
Status: Accepted
Target: arti-dirauth

1. Summary

Authorities no longer vote for the "Named" and "Unnamed" flags, and we have begun to remove the client code that supports them. (See proposal 235.) The next logical step is to remove the special handling of these flags from the consensus voting algorithm.

We specify this here.

2. Proposal

We add a new consensus method, here represented as M, to be allocated when this proposal's implementation is merged. We specify that the Named and Unnamed flags are only handled specially when the negotiated consensus method is earlier than M. If the negotiated method is M or later, then the Named and Unnamed flags are handled as if they were any other consensus flags.
Filename: 283-ipv6-in-micro-consensus.txt Title: Move IPv6 ORPorts from microdescriptors to the microdesc consensus Author: Tim Wilson-Brown (teor), Nick Mathewson Created: 18-Oct-2017 Status: Closed Target: 0.3.3.x Implemented-In: 0.3.3.1-alpha Ticket: #20916 1. Summary Moving IPv6 ORPorts from microdescs to the microdesc consensus will make it easier for IPv6 clients to bootstrap and select reachable guards. Tor clients on IPv6-only connections currently have to use IPv6 Fallback Directory Mirrors to fetch their microdescriptors. This does not scale well. After this change, they will be able to fetch microdescriptors from any IPv6-enabled directory mirror in the consensus. Tor clients on versions 0.2.8.x and 0.2.9.x are currently unable to bootstrap over IPv6-only connections when using microdescriptors. After this consensus change, they will be able to bootstrap without any client code changes. For clients that use microdescriptors (the default), IPv6 ORPorts are always placed in microdescriptors. So these clients can only tell if an IPv6 ORPort is unreachable when a majority of voting authorities mark the relay as not Running. After this proposal, clients will be able to discover unreachable ORPorts, even if a minority of voting authorities set AuthDirHasIPv6Connectivity 1. 2. Proposal We add two new consensus methods, here represented as M and N (M < N), to be allocated when this proposal's implementation is merged. These consensus methods move IPv6 ORPorts from microdescs to the microdesc consensus. We use two different methods because this allows us to modify client code based on each method. Also, if a bug is discovered in one of the methods, authorities can be patched to stop voting for it, and then we can implement a fix in a later method. 2.1. Add Reachable IPv6 ORPorts to the Microdesc Consensus We specify that microdescriptor consensuses created with methods M or later contain reachable IPv6 ORPorts. 2.2. 
Remove IPv6 ORPorts from Microdescriptors We specify that microdescriptors created with methods N or later start omitting IPv6 ORPorts. 3. Retaining Existing Behaviour The following existing behaviour will be retained: 3.1. Authority IPv6 Reachability Only authorities configured with AuthDirHasIPv6Connectivity 1 will test IPv6 ORPort reachability, and vote for IPv6 ORPorts. This means that: * if no voting authorities set AuthDirHasIPv6Connectivity 1, there will be no IPv6 ORPorts in the consensus, * if a minority of voting authorities set AuthDirHasIPv6Connectivity 1: unreachable IPv6 ORPort lines will be dropped from the consensus, but the relay will still be listed as Running, and reachable IPv6 ORPort lines will be included in the consensus. * if a majority of voting authorities set AuthDirHasIPv6Connectivity 1, relays with unreachable IPv6 ORPorts will not be listed as Running. Reachable IPv6 ORPort lines will be included in the consensus. (To ensure that any valid majority will vote relays with unreachable IPv6 ORPorts not Running, 75% of authorities must set AuthDirHasIPv6Connectivity 1.) We will document this behaviour in the tor manual page, see #23870. 3.2. NS Consensus IPv6 ORPorts The NS consensus will continue to contain reachable IPv6 ORPorts. 4. Impact and Related Changes 4.1. Directory Authority Configuration We will work to get a super-majority (75%) of authorities checking relay IPv6 reachability, to avoid Running-flag flapping. To do this, authorities need to get IPv6 connectivity, and set AuthDirHasIPv6Connectivity 1. 4.2. Relays and Bridges Tor relays and bridges do not currently use IPv6 ORPorts from the consensus. We expect that 2/3 of authorities will be voting for consensus method N before future Tor relay or bridge versions use IPv6 ORPorts from the consensus. 4.3. Clients 4.3.1. Legacy Clients 4.3.1.1. 
IPv6 ORPort Circuits Tor clients on versions 0.2.8.x to 0.3.2.x check directory documents for ORPorts in the following order: * descriptors (routerinfo, available if using bridges or full descriptors) * consensus (routerstatus) * microdescriptors (IPv6 ORPorts only) Their behaviour will be identical to the current behaviour for consensus methods M and earlier. When consensus method N is used, they will ignore unreachable IPv6 ORPorts without any code changes, as long as they are using microdescriptors. 4.3.1.2. IPv6 ORPort Bootstrap Tor clients on versions 0.2.8.x and 0.2.9.x are currently unable to bootstrap over IPv6-only connections when using microdescriptors. This happens because the microdesc consensus does not contain IPv6 ORPorts. (IPv6-only Tor clients on versions 0.3.0.2-alpha and later use fallback directory mirrors to fetch their microdescriptors.) When consensus method M is used, 0.2.8.x and 0.2.9.x clients will be able to bootstrap over IPv6-only connections using microdescriptors, without any code changes. 4.3.2. Future Clients 4.3.2.1. Ignoring IPv6 ORPorts in Microdescs Tor clients on versions 0.3.3.x and later will ignore unreachable IPv6 ORPorts once consensus method M or later is in use. This requires some code changes, see #23827. 4.3.2.2. IPv6 ORPort Bootstrap If a bootstrapping IPv6-only client has a consensus made with method M or later, it should download microdescriptors from one of the IPv6 ORPorts in that consensus. This requires some code changes, see #23827. Previously, IPv6-only clients would use fallback directory mirrors to download microdescs, because there were no IPv6 ORPorts in the microdesc consensus. 4.3.2.3. Ignoring Addresses in Unused Directory Documents If a client doesn't use a particular directory document type for a node, it should ignore any addresses in that document type. This requires some code changes, see #23975. 5. 
Data Size This change removes 7-50 bytes from the microdescriptors of relays that have an IPv6 ORPort, and adds them to reachable IPv6 relays' microdesc consensus entries. As of October 2017, 600 relays (9%) have IPv6 ORPorts in the NS consensus. Their "a" lines take up 19 KB, or 33 bytes each on average. The gzip-compressed microdesc consensus is 564 KB, and adding the existing IPv6 addresses makes it 576 KB (a 2.1% increase). Adding IPv6 addresses to every relay makes it 644 KB (a 14% increase). zstd-compressed microdesc consensuses show smaller increases of 1.7% and 8.0%, respectively. Most tor clients are already running 0.3.1.7, which implements consensus diffs and zstd compression. We expect that most directory mirrors will also implement consensus diffs and zstd compression by the time 2/3 of authorities are voting for consensus method M. Consensus diffs will reduce the worst-case impact of this change for clients and relays that have a recent consensus. 6. External Impacts We don't expect this change to impact Onionoo and similar projects, because they typically use the NS consensus. 7. Monitoring OnionOO has implemented an "unreachable IPv6 address" attribute: https://trac.torproject.org/projects/tor/ticket/21637 Metrics is working on IPv6 relay graphs: https://trac.torproject.org/projects/tor/ticket/23761 Consensus-health implements a ReachableIPv6 pseudo-flag for authorities and relays: https://consensus-health.torproject.org/
Filename: 284-hsv3-control-port.txt Title: Hidden Service v3 Control Port Author: David Goulet Created: 02-November-2017 Status: Closed 1. Summary This document extends the hidden service control port events and commands to version 3 (rend-spec-v3.txt). No new commands or events are added in this document; it only describes how the current commands and events are extended to support v3. 2. Format The formatting of this document follows section 2 of control-spec.txt. It is split into two sections, the Commands and the Events for hidden service version 3. We define the alphabet of a Base64 encoded value to be: Base64Character = "A"-"Z" / "a"-"z" / "0"-"9" / "+" / "/" For a command or event, if nothing is mentioned, the behavior doesn't change from the control port specification. 3. Specification: 3.1. Commands As specified in the control specification, all commands are case-insensitive but the keywords are case-sensitive. 3.1.1. GETINFO Hidden service commands are: "hs/client/desc/id/<ADDR>" The <ADDR> can be a v3 address without the ".onion" part. The rest is as is. "hs/service/desc/id/<ADDR>" The <ADDR> can be a v3 address without the ".onion" part. The rest is as is. "onions/{current,detached}" No change. This command supports v3 hidden services without changes, returning v3 address(es). 3.1.2. HSFETCH The syntax of this command supports either an HSAddress or a versioned descriptor ID. However, version 3 doesn't have the same concept of a descriptor ID as v2, so for v3 the descriptor ID is the blinded key of a descriptor, which is used as an index to query the HSDir. The syntax becomes: "HSFETCH" SP (HSAddress / "v" Version "-" DescId) *[SP "SERVER=" Server] CRLF HSAddress = (16*Base32Character / 56*Base32Character) Version = "2" / "3" DescId = (32*Base32Character / 32*Base64Character) Server = LongName The "HSAddress" key is extended to accept 56 base32 characters which is the format of a version 3 onion address. 
The "DescId" of the form 32*Base64Character is the descriptor blinded key used as an index to query the directory. It can only be used with "Version=3". 3.1.5. HSPOST To support version 3, the command needs an extra parameter that is the onion address of the given descriptor. With v2, the address could have been deduced from the given descriptor but with v3, this is not possible. In order to fire up the HS_DESC event correctly, we need the address so the request can be linked on the control port. Furthermore, the given descriptor will be validated with the given address and an error will be returned if they are not matching. The syntax becomes: "+HSPOST" *[SP "SERVER=" Server] [SP "HSADDRESS=" HSAddress] CRLF Descriptor CRLF "." CRLF HSAddress = 56*Base32Character The "HSAddress" key is optional and only applies for v3 descriptors. A 513 error is returned if used with v2. 3.1.3. ADD_ONION For this command to support version 3, new values are added but the syntax is unchanged: "ADD_ONION" SP KeyType ":" KeyBlob [SP "Flags=" Flag *("," Flag)] 1*(SP "Port=" VirtPort ["," Target]) *(SP "ClientAuth=" ClientName [":" ClientBlob]) CRLF New "KeyType" value to "ED25519-V3" which identifies the key type to be a v3 ed25519 key. With the KeyType == "ED25519-V3", the "KeyBlob" should be a base64 encoded ed25519 private key. The "NEW:BEST" option will still return a version 2 address as long as the HiddenServiceVersion torrc option default is 2. To ask for a new v3 key, this should be used: "NEW:ED25519-V3". Because client authentication is not yet implemented, the "ClientAuth" field is ignored as well as "Flags=BasicAuth". A 513 error is returned if "ClientAuth" is used with an ED25519-V3 key type. 3.1.4. DEL_ONION The syntax of this command is: "DEL_ONION" SP ServiceID CRLF ServiceID = The Onion Service address without the trailing ".onion" suffix The "ServiceID" can simply be a v3 address. Nothing else changes. 3.2. Events 3.2.1. 
HS_DESC For this event to support version 3, one optional field and new values are added: "650" SP "HS_DESC" SP Action SP HSAddress SP AuthType SP HsDir [SP DescriptorID] [SP "REASON=" Reason] [SP "REPLICA=" Replica] [SP "HSDIR_INDEX=" HSDirIndex] Action = "REQUESTED" / "UPLOAD" / "RECEIVED" / "UPLOADED" / "IGNORE" / "FAILED" / "CREATED" HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN" AuthType = "NO_AUTH" / "BASIC_AUTH" / "STEALTH_AUTH" / "UNKNOWN" HsDir = LongName / Fingerprint / "UNKNOWN" DescriptorID = 32*Base32Character / 43*Base64Character Reason = "BAD_DESC" / "QUERY_REJECTED" / "UPLOAD_REJECTED" / "NOT_FOUND" / "UNEXPECTED" / "QUERY_NO_HSDIR" Replica = 1*DIGIT HSDirIndex = 64*HEXDIG The "HSDIR_INDEX=" field is optional and only used for version 3; it contains the computed index of the HsDir the descriptor was uploaded to or fetched from. The "HSAddress" key is extended to accept 56 base32 characters which is the format of a version 3 onion address. The "DescriptorID" key is extended to accept 43 base64 characters which is the descriptor blinded key used for the index value at the "HsDir". The "REPLICA=" field is not used for the "CREATED" event because v3 doesn't use the replica number in the descriptor ID computation. Because client authentication is not yet implemented, the "AuthType" field is always "NO_AUTH". 3.2.2. HS_DESC_CONTENT For this event to support version 3, new values are added but the syntax is unchanged: "650" "+" "HS_DESC_CONTENT" SP HSAddress SP DescId SP HsDir CRLF Descriptor CRLF "." CRLF "650" SP "OK" CRLF HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN" DescId = 32*Base32Character / 32*Base64Character HsDir = LongName / "UNKNOWN" Descriptor = The text of the descriptor formatted as specified in rend-spec-v3.txt section 2.4 or empty string on failure. The "HSAddress" key is extended to accept 56 base32 characters which is the format of a version 3 onion address. 
The "DescId" key is extended to accept 32 base64 characters which is the descriptor blinded key used for the index value at the "HsDir". 3.2.3. CIRC and CIRC_MINOR These circuit events have an optional field named "REND_QUERY" which takes an "HSAddress". This field is extended to support v3 addresses: HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN"
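A controller could recognize the extended "HSAddress" and "DescriptorID" forms of the HS_DESC event roughly as follows. This is a minimal sketch, not taken from any existing controller library: the function names are hypothetical, and it assumes that Base32Character is the lowercase alphabet a-z, 2-7 (this document only defines Base64Character explicitly).

```python
import re

# Assumption: Base32Character is the lowercase base32 alphabet (a-z, 2-7).
BASE32 = "[a-z2-7]"
BASE64 = "[A-Za-z0-9+/]"

V2_ADDRESS = re.compile(f"{BASE32}{{16}}")
V3_ADDRESS = re.compile(f"{BASE32}{{56}}")
DESCRIPTOR_ID = re.compile(f"(?:{BASE32}{{32}}|{BASE64}{{43}})")
OPTIONAL_KEYS = ("REASON", "REPLICA", "HSDIR_INDEX")


def classify_hs_address(address):
    """Return 2 or 3 for a well-formed onion address, or None otherwise."""
    if V2_ADDRESS.fullmatch(address):
        return 2
    if V3_ADDRESS.fullmatch(address):
        return 3
    return None


def parse_hs_desc(line):
    """Split one '650 HS_DESC ...' event line into its fields."""
    parts = line.strip().split(" ")
    if len(parts) < 6 or parts[:2] != ["650", "HS_DESC"]:
        raise ValueError("not an HS_DESC event")
    event = {
        "action": parts[2],
        "address": parts[3],
        "auth_type": parts[4],
        "hsdir": parts[5],
        "descriptor_id": None,
        "fields": {},
    }
    for token in parts[6:]:
        key, sep, value = token.partition("=")
        if sep and key in OPTIONAL_KEYS:
            event["fields"][key] = value
        elif DESCRIPTOR_ID.fullmatch(token):
            event["descriptor_id"] = token
        # Unrecognized tokens are skipped so that future fields do not break parsing.
    return event
```

For example, `classify_hs_address` returns 3 for any 56-character base32 string, and `parse_hs_desc` keeps "HSDIR_INDEX=" in the `fields` dictionary when it is present.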
Filename: 285-utf-8.txt Title: Directory documents should be standardized as UTF-8 Author: Nick Mathewson Created: 13 November 2017 Status: Accepted Target: arti-dirauth Ticket: https://gitlab.torproject.org/tpo/core/tor/-/issues/40131 1. Summary and motivation People frequently want to include non-ASCII text in their router descriptors. The Contact line is a favorite place to do this, but in principle the platform line would also be pretty logical. Unfortunately, there's no specified way to encode non-ASCII in our directory documents. Fortunately, almost everybody who does it uses UTF-8 anyway. As we move towards Rust support in Tor, we gain another motivation for standardizing on UTF-8, since Rust's native strings strongly prefer UTF-8. So, in this proposal, we describe a migration path to having all directory documents be fully UTF-8. (See 2.3 below for a discussion of what exactly we mean by "non-UTF-8".) 2. Proposal First, we should have Tor relays reject ContactInfo lines (and any other lines copied directly into router descriptors) that are not UTF-8. At the same time, we should have authorities reject any router descriptors or extrainfo documents that are not valid UTF-8. Simultaneously, we can have all Tor instances reject all non-directory-descriptor directory documents that are not UTF-8, since none should exist today. Finally, once the authorities have updated, we should have all Tor instances reject all directory documents that are not UTF-8. (We should not take this step until the authorities have upgraded, or else the behavior of updated and non-updated clients could be distinguished.) 2.1. Hidden service descriptors' encrypted bodies For the encrypted bodies of hidden service descriptors, we cannot reject them at the authority level, and so we need to take a slightly different approach to prevent client fingerprinting attacks. 
First, we should make Tor instances start warning about any hidden service descriptors whose bodies, post-decryption, contain non-utf-8 plaintext. At the same time, we add a consensus parameter to indicate that hidden service descriptors with non-utf-8 plaintexts should be rejected entirely: "reject-encrypted-non-utf-8". If that parameter is set to 1, then hidden service clients will not only warn, but reject the descriptors. Once the vast majority of clients are running versions that support the "reject-encrypted-non-utf-8" parameter, that parameter can be set to 1. 2.2. Bridge descriptors Since clients download bridge descriptors directly from the bridges, they also need a two-phase plan as for hidden service descriptors above. Here we take the same approach as in section 2.1 above, except using the parameter "reject-bridge-descriptor-non-utf-8". 2.3. Which UTF-8 exactly? We define the allowable set of UTF-8 as: * Zero or more Unicode scalar values (as defined by The Unicode Standard, Version 3.1 or later), that is: * Unicode code points U+00 through U+10FFFF, * but excluding the code points U+D800 through U+DFFF, * Excluding the scalar value U+00 (for compatibility with NUL-terminated C strings), * Serialized using the UTF-8 encoding scheme (as defined by The Unicode Standard, Version 3.1 or later), in particular: * each code point is encoded with the shortest possible encoding, * Without a Unicode byte order mark (BOM, U+FEFF) at the start of the descriptor. (BOMs are optional and not recommended in UTF-8. Allowing a BOM would break backwards compatibility with ASCII-only Tor implementations.) Byte-swapped BOMs (U+FFFE) must also be rejected. In order to remain compatible with future versions of The Unicode Standard, we allow all possible code points, including Reserved code points. 
For languages with a conforming UTF-8 implementation (as defined by The Unicode Standard, Version 3.1 or later), this is equivalent to well-formed UTF-8, with the following additional rules: * reject a BOM (U+FEFF) or byte-swapped BOM (U+FFFE) at the start of the descriptor, * reject U+00 at any point in the descriptor, * accept all code point types used in UTF-8, including Control, Private-Use, Noncharacter, and Reserved. (The Surrogate code point type is not used in UTF-8.) For languages without a conforming UTF-8 implementation, we recommend checking UTF-8 conformity based on the "Well-Formed UTF-8 Byte Sequences" table from The Unicode Standard, Version 11 (or later). Note that U+00 is serialized to 0x00, but U+FEFF is serialized to 0xEFBBBF, and U+FFFE is serialized to 0xEFBFBE. 3. References The Unicode Standard, Version 11, Chapter 3. In particular: * Unicode scalar values: D76, page 120. * UTF-8 encoding form: D92, pages 125-127. * Well-Formed UTF-8 Byte Sequences: Table 3-7, page 126. * Byte order mark: C11, page 83; D94, page 130. * UTF-8 encoding scheme: D96, page 130.
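The rules in section 2.3 can be sketched for a language with a conforming UTF-8 decoder. The example below uses Python's strict "utf-8" codec, which rejects overlong encodings and surrogates, and adds the extra NUL and BOM checks on top. The function name is hypothetical and this is an illustration of the rules, not code from Tor.

```python
BOM = b"\xef\xbb\xbf"          # U+FEFF serialized as UTF-8
SWAPPED_BOM = b"\xef\xbf\xbe"  # U+FFFE serialized as UTF-8


def is_acceptable_utf8(document: bytes) -> bool:
    """Apply the section 2.3 rules to the raw bytes of a directory document."""
    # Reject a BOM or byte-swapped BOM at the start of the descriptor.
    if document.startswith((BOM, SWAPPED_BOM)):
        return False
    # Reject U+00 anywhere (it is serialized as the single byte 0x00).
    if b"\x00" in document:
        return False
    # Strict decoding rejects ill-formed sequences: overlong forms,
    # surrogates (U+D800..U+DFFF), and values above U+10FFFF.
    try:
        document.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Noncharacters and private-use code points pass this check, as the proposal requires; U+FFFE is only rejected when it appears at the very start of the document.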
Filename: 286-hibernation-api.txt Title: Controller APIs for hibernation access on mobile Author: Nick Mathewson Created: 30-November-2017 Status: Rejected Notes: This proposal was useful for our early thinking, but a simpler solution (DisableNetwork) proved much more useful. 1. Introduction On mobile platforms, battery life is achieved by reducing needless network access and CPU access. Tor currently provides few ways for controllers and operating systems to tune its behavior. This proposal describes controller APIs for better management of Tor's hibernation mechanisms, and extensions to those mechanisms, for better power management in mobile environments. 1.1. Background: hibernation and idling in Tor today We have an existing "hibernation" mechanism that we use to implement "bandwidth accounting" and "slow shutdown" mechanisms: When a Tor instance is close to its bandwidth limit, it stops accepting new connections or circuits, and only processes those it has, until the bandwidth limit is reached. Once the bandwidth limit is reached, Tor closes all connections and circuits, and all non-controller listeners, until a new accounting limit begins. Tor handles the INT signal on relays similarly: it stops accepting new connections or circuits, and gives the existing ones a short interval in which to shut down. Then Tor closes all connections and exits the process entirely. Tor's "idle" mechanism is related to hibernation, though its implementation is separate. When a Tor client has passed a certain amount of time without any user activity, it declares itself "idle" and stops performing certain background tasks, such as fetching directory information, or building circuits in anticipation of future needs. (This is tied in the codebase to the "predicted ports" mechanism, but it doesn't have to be.) 1.2. Background: power-management signals on mobile platforms (I'm not a mobile developer, so I'm about to wildly oversimplify. Please let me know where I'm wrong.) 
Mobile platforms achieve long battery life by turning off the parts they don't need. The most important parts to turn off are the antenna(s) and the screen; the CPU can be run in a slower mode. But it doesn't do much good turning things off when they're unused, if some background app is going to make sure that they're always in use! So mobile platforms use signals of various kinds to tell applications "okay, shut up now". Some apps need to do online background activities periodically; to help this out, mobile platforms give them a signal "Hey, now is a good time if you want to do that" and "stop now!" 1.3. Mostly out-of-scope: limiting CPU wakeups when idle. The changes described here will be of limited use if we do not also alter Tor so that, when it's idle, the CPU is pretty quiet. That isn't the case right now: we have large numbers of callbacks that happen periodically (every second, every minute, etc) whether they need to or not. We're hoping to limit those, but that's not what this proposal is about. 2. Improvements to the hibernation model To present a consistent interface that applications and controllers can use to manage power consumption, we make these enhancements to our hibernation model. First, we add four new hibernation states: "IDLE", "IDLE_UPDATING", "SLEEP", and "SLEEP_UPDATING". "IDLE" is like the current "idle" or "no predicted ports" state: Tor doesn't launch circuits or start any directory activity, but its listeners are still open. Tor clients can enter the IDLE state on their own when they are LIVE, but haven't gotten any client activity for a while. Existing connections and circuits are not closed. If the Tor instance receives any new connections, it becomes LIVE. "IDLE_UPDATING" is like IDLE, except that Tor should check for directory updates as appropriate. If there are any, it should fetch directory information, and then become IDLE again. 
"SLEEP" is like the current "dormant" state we use for bandwidth exhaustion, but it is controller-initiated: it begins when Tor is told to enter it, and ends when Tor is told to leave it. Existing connections and circuits are closed; listeners are closed too. "SLEEP_UPDATING" is like SLEEP, except that Tor should check for directory updates as appropriate. If there are any, it should fetch directory information, and then SLEEP again. 2.1. Relay operation Relays and bridges should not automatically become IDLE on their own. 2.2. Onion service operation When a Tor instance that is running an onion service is IDLE, it does the minimum to try to remain responsive on the onion service: It keeps its introduction points open if it can. Once a day, it fetches new directory information and opens new introduction points. 3. Controller hibernation API 3.1. Examining the current hibernation state We define a new "GETINFO status/hibernation" to inspect the current hibernation state. Possible values are: - "live" - "idle:control" - "idle:no-activity" - "sleep:control" - "sleep:accounting" - "idle-update:control" - "sleep-update:control" - "shutdown:exiting" - "shutdown:accounting" - "shutdown:control" The first part of each value indicates Tor's current state: "live" -- completely awake "idle" -- waiting to see if anything happens "idle-update" -- waiting to see if anything happens; probing for directory information "sleep" -- completely unresponsive "shutdown" -- unresponsive to new requests; still processing existing requests. The second part of each value indicates the reason that Tor entered this state: "control" -- a controller told us to do this. "no-activity" -- Tor became idle on its own due to not noticing any requests. "accounting" -- the bandwidth system told us to enter this state. "exiting" -- Tor is in this state because it's getting ready to exit. We add a STATUS_GENERAL hibernation event as follows: HIBERNATION "STATUS=" (one of the status pairs above.) 
Indicates that Tor's hibernation status has changed. Note: Controllers MUST accept status values here that they don't recognize. The "GETINFO accounting/hibernating" value and the "STATUS_SERVER HIBERNATION_STATUS" event keep their old meaning. 3.2. Changing the hibernation state We add the following new possible values to the SIGNAL controller command: "SLEEP" -- enter the sleep state, after an appropriate shutdown interval. "IDLE" -- enter the idle state "SLEEPWALK" -- If in sleep or idle, start probing for directory information in the sleep-update or idle-update state respectively. Remain in that state until we've probed for directory information, or until we're told to IDLE or SLEEP again, or (if we're idle) until we get client activity. Has no effect if not in sleep or idle. "WAKEUP" -- If in sleep, sleep-update, idle, idle-update, or shutdown:sleep state, enter the live state. Has no effect in any other state. 3.3. New configuration parameters StartIdle -- Boolean. If set to 1, Tor begins in IDLE mode.
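Since controllers MUST accept status values they don't recognize, a parser for the proposed "GETINFO status/hibernation" values should split the state from the reason without failing on unknown strings. The sketch below illustrates that rule; the function name and the returned dictionary layout are hypothetical, and the proposal itself was rejected, so no Tor version emits these values.

```python
# State and reason names taken from the value list in section 3.1.
KNOWN_STATES = {"live", "idle", "idle-update", "sleep", "sleep-update", "shutdown"}
KNOWN_REASONS = {"control", "no-activity", "accounting", "exiting"}


def parse_hibernation_status(value):
    """Split 'state[:reason]' and flag, rather than reject, unknown parts."""
    state, sep, reason = value.strip().partition(":")
    reason = reason if sep else None
    recognized = state in KNOWN_STATES and (reason is None or reason in KNOWN_REASONS)
    return {"state": state, "reason": reason, "recognized": recognized}


def is_awake(value):
    """Treat only the 'live' state as fully awake; unknown states count as not awake."""
    return parse_hibernation_status(value)["state"] == "live"
```

A value such as "nap:battery" (made up here) parses without error and is reported with `recognized` set to False, which lets a controller log it and carry on.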
Filename: 287-reduce-lifetime.txt Title: Reduce circuit lifetime without overloading the network Author: Fernando Fernandez Mancera Created: 30-Nov-2017 Status: Open Motivation: Currently, Tor users reuse a given circuit for ten minutes (by default) after it is first used. This is too long, because a malicious Exit relay can trace a user's pseudonymous profile, especially if connections from multiple protocols are put on the same circuit. This time is set by the MaxCircuitDirtiness parameter, whose default value is ten minutes. I have been thinking about a way to fix this. The first idea that came to my mind was to use StreamIsolationByHost and StreamIsolationByPort, but I wasn't able to sort it out. One day, I thought "Why is time so important?" and later on I realized that focusing on the number of bytes running through the circuit could be a better approach to this problem. Design: I propose two options to reduce this problem, both based on the number of bytes running through a circuit. MaxCircuitSizeDirtiness (temporary parameter name) will take an integer within an interval, representing the maximum number of bytes that can be written/read (we need to discuss whether to use one limit for both) by the circuit. If the circuit exceeds that amount, new streams won't use this circuit anymore. MaxCircuitSizeDirtinessByPort (temporary parameter name) will take an array of integers within an interval, representing the maximum number of bytes that can be written/read (we need to discuss whether to use one limit for both) by the circuit per port (StreamIsolationByPort). This array is parallel to the array of ports from StreamIsolationByPort. If the circuit exceeds that amount, new streams won't use this circuit anymore. Regarding default values, it would be useful to set one a bit lower than the average number of bytes per circuit. 
For MaxCircuitSizeDirtinessByPort, after discussion, we shouldn't set a default value, because someone could identify the port used. As for MaxCircuitDirtiness, if the other parameters are set by default it could be larger, like thirty minutes, so that if the user doesn't send/receive a significant amount of data the circuit will be changed anyway. Security Implications: It is believed that the proposed changes will improve anonymity for end users. The end user won't reuse a given circuit if they have sent a considerable number of bytes, making it more difficult for malicious Exit relays to trace a user's pseudonymous profile. Obviously this is a matter of probability: sensitive data can leak in a small amount of data, but it is even more likely to leak in a large amount. Specification: To implement this feature we will need to add some new functionality. We need to parse MaxCircuitSizeDirtiness and MaxCircuitSizeDirtinessByPort from the torrc config file. We need to create a function, or improve an existing one, to check the number of bytes running through the circuit and, if this number is higher than the configured value, consider the circuit dirty. Compatibility: The proposed changes should not create any compatibility issues. New Tor clients will be able to take advantage of this without any modification to the network. Implementation: It is proposed that MaxCircuitSizeDirtiness be enabled by default, and that MaxCircuitDirtiness be increased to thirty minutes. It is proposed that MaxCircuitSizeDirtinessByPort not be enabled by default for ports 22, 53, and 80, as with StreamIsolationByPort. TorBrowser, or any other Tor application that manages circuits on its own because the KeepAliveIsolateSOCKSAuth option is active by default, shouldn't be affected by this new feature, in the same way that it currently ignores the MaxCircuitDirtiness parameter. 
Performance and scalability notes: The proposed changes will reduce Tor network stress, because users who do not exceed the configured byte limit will reduce circuit generation by a factor of three (if the default MaxCircuitDirtiness value is thirty minutes). I want to demonstrate this through research, but first it would be good to get the idea accepted. References: Tor project research ideas [https://research.torproject.org/ideas.html] Enhancing Tor's Performance using Real-time Traffic Classification [https://www.cypherpunks.ca/~iang/pubs/difftor-ccs.pdf] (It's not exactly about that, but they talk about circuit lifetime and the ten minutes problem a few times. Also it's an interesting paper.)
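The combined time-and-size check described in the Specification section could look roughly like this. It is a sketch under stated assumptions: the `Circuit` class and `is_dirty` function are hypothetical, read and written bytes are counted together (the proposal leaves that open), and the 10 MiB limit is an arbitrary placeholder because the proposal does not give a default.

```python
from dataclasses import dataclass

MAX_CIRCUIT_DIRTINESS_SECS = 30 * 60             # thirty minutes, as proposed
MAX_CIRCUIT_SIZE_DIRTINESS = 10 * 1024 * 1024    # placeholder value, not from the proposal


@dataclass
class Circuit:
    first_used_at: float        # seconds, on any monotonic clock
    bytes_transferred: int = 0  # assumption: read + written, counted together


def is_dirty(circuit, now,
             max_age=MAX_CIRCUIT_DIRTINESS_SECS,
             max_bytes=MAX_CIRCUIT_SIZE_DIRTINESS):
    """Return True if new streams should no longer be attached to this circuit."""
    too_old = now - circuit.first_used_at > max_age
    too_big = circuit.bytes_transferred > max_bytes
    return too_old or too_big
```

With these limits, a lightly used circuit stays usable for the full thirty minutes, while a circuit that carries a large download is retired as soon as it exceeds the byte limit.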
Filename: 288-privcount-with-shamir.txt Title: Privacy-Preserving Statistics with Privcount in Tor (Shamir version) Author: Nick Mathewson, Tim Wilson-Brown, Aaron Johnson Created: 1-Dec-2017 Supercedes: 280 Status: Reserve 0. Acknowledgments Tariq Elahi, George Danezis, and Ian Goldberg designed and implemented the PrivEx blinding scheme. Rob Jansen and Aaron Johnson extended PrivEx's differential privacy guarantees to multiple counters in PrivCount: https://github.com/privcount/privcount/blob/master/README.markdown#research-background Rob Jansen and Tim Wilson-Brown wrote the majority of the experimental PrivCount code, based on the PrivEx secret-sharing variant. This implementation includes contributions from the PrivEx authors, and others: https://github.com/privcount/privcount/blob/master/CONTRIBUTORS.markdown This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. The use of a Shamir secret-sharing-based approach is due to a suggestion by Aaron Johnson (iirc); Carolin Zöbelein did some helpful analysis here. Aaron Johnson and Tim Wilson-Brown made improvements to the draft proposal. 1. Introduction and scope PrivCount is a privacy-preserving way to collect aggregate statistics about the Tor network without exposing the statistics from any single Tor relay. This document describes the behavior of the in-Tor portion of the PrivCount system. It DOES NOT describe the counter configurations, or any other parts of the system. (These will be covered in separate proposals.) 2. PrivCount overview Here follows an oversimplified summary of PrivCount, with enough information to explain the Tor side of things. The actual operation of the non-Tor components is trickier than described below. In PrivCount, a Data Collector (DC, in this case a Tor relay) shares numeric data with N different Tally Reporters (TRs). 
(A Tally Reporter performs the summing and unblinding roles of the Tally Server and Share Keeper from experimental PrivCount.) All N Tally Reporters together can reconstruct the original data, but no (N-1)-sized subset of the Tally Reporters can learn anything about the data. (In reality, the Tally Reporters don't reconstruct the original data at all! Instead, they will reconstruct a _sum_ of the original data across all participating relays.) In brief, the system works as follows: To share data, for each counter value V to be shared, the Data Collector first adds Gaussian noise to V in order to produce V', uses (K,N) Shamir secret-sharing to generate N shares of V' (K<=N, K being the reconstruction threshold), encrypts each share to a different Tally Reporter, and sends each encrypted share to the Tally Reporter it is encrypted for. The Tally Reporters then agree on the set S of Data Collectors that sent data to all of them, and each Tally Reporter forms a share of the aggregate value by decrypting the shares it received from the Data Collectors in S and adding them together. The Tally Reporters then, collectively, perform secret reconstruction, thereby learning the sum of all the different values V'. The use of Shamir secret sharing lets us survive up to N-K crashing TRs. Waiting until the end to agree on a set S of surviving relays lets us survive an arbitrary number of crashing DCs. In order to prevent bogus data from corrupting the tally, the Tally Reporters can perform the aggregation step multiple times, each time proceeding with a different subset of S and taking the median of the resulting values. Relay subsets should be chosen at random to avoid relays manipulating their subset membership(s). If a shared random value is required, all relays must submit their results, and then the next revealed shared random value can be used to select relay subsets. (Tor's shared random value can be calculated as soon as all commits have been revealed. 
So all relay results must be received *before* any votes are cast in the reveal phase for that shared random value.) Below we describe the algorithm in more detail, and describe the data format to use. 3. The algorithm All values below are B-bit integers modulo some prime P; we suggest B=62 and P = 2**62 - 2**30 - 1 (hex 0x3fffffffbfffffff). The size of this field is an upper limit on the largest sum we can calculate; it is not a security parameter. There are N Tally Reporters: every participating relay must agree on which N exist, and on their current public keys. We suggest listing them in the consensus networkstatus document. All parties must also agree on some ordering of the Tally Reporters. Similarly, all parties must also agree on some value K<=N. There are a number of well-known "counters", identified by ASCII identifiers. Each counter is a value that the participating relays will know how to count. Let C be the number of counters. 3.1. Data Collector (DC) side At the start of each period, every Data Collector ("client" below) initializes their state as follows: 1. For every Tally Reporter with index i, the client constructs a random 32-byte value SEED_i. The client then generates a pseudorandom bitstream using the SHAKE-256 XOF with SEED_i as its input, and divides this stream into C values, with the c'th value denoted by MASK(i, c). [To divide the stream into values, consider the stream 8 bytes at a time as unsigned integers in network (big-endian) order. For each such integer, clear the top (64-B) bits. If the result is less than P, then include the integer as one of the MASK(i, .) values. Otherwise, discard this 8-byte segment and proceed to the next value.] 2. The client encrypts SEED_i using the public key of Tally Reporter i, and remembers this encrypted value. It discards SEED_i. 3. For every counter c, the client generates a noise value Z_c from an appropriate Gaussian distribution. 
If the noise value is negative, the client adds P to bring Z_c into the range 0...(P-1). (The noise MUST be sampled using the procedure in Appendix C.) The client then uses Shamir secret sharing to generate N shares (x,y) of Z_c, 1 <= x <= N, with the x'th share to be used by the x'th Tally Reporter. See Appendix A for more on Shamir secret sharing. See Appendix B for another idea about X coordinates. The client picks a random value CTR_c and stores it in the counter, which serves to locally blind the counter. The client then subtracts (MASK(x, c)+CTR_c) from y, giving "encrypted shares" of (x, y0) where y0 = y-CTR_c-MASK(x,c). The client then discards all MASK values, all original shares (x,y), and the noise value Z_c. For each counter c, it remembers CTR_c, and N shares of the form (x, y0). To increment a counter by some value "inc": 1. The client adds "inc" to the counter value, modulo P. (This step is chosen to be optimal, since it will happen more frequently than any other step in the computation.) Aggregate counter values that are close to P/2 MUST be scaled to avoid overflow. See Appendix D for more information. (We do not think that any counters on the current Tor network will require scaling.) To publish the counter values: 1. The client publishes, in the format described below: The list of counters it knows about The list of TRs it knows about For each TR: For each counter c: A list of (i, y-CTR_c-MASK(x,c)), which corresponds to the share for the i'th TR of counter c. SEED_i as encrypted earlier to the i'th TR's public key. 3.2. Tally Reporter (TR) side This section is less completely specified than the Data Collector's behavior: I expect that the TRs will be easier to update as we proceed. (Each TR has a long-term identity key (ed25519). It also has a sequence of short-term curve25519 keys, each associated with a single round of data collection.) 1. 
When a group of TRs receives information from the Data Collectors, they collectively choose a set S of DCs and a set of counters such that every TR in the group has a valid entry for every counter, from every DC in the set. To be valid, an entry must not only be well-formed, but must also have the x coordinate in its shares corresponding to the TR's position in the list of TRs. 2. For each Data Collector's report, the i'th TR decrypts its part of the client's report using its curve25519 key. It uses SEED_i and SHAKE-256 to regenerate MASK(0) through MASK(C-1). Then for each share (x, y-CTR_c-MASK(x,c)) (note that x=i), the TR reconstructs the true share of the value for that DC and counter c by adding V+MASK(x,c) to the y coordinate to yield the share (x, y_final). 3. For every counter in the set, each TR computes the sum of the y_final values from all clients. 4. For every counter in the set, each TR publishes its share of the sum as (x, SUM(y_final)). 5. If at least K TRs publish correctly, then the sum can be reconstructed using Lagrange polynomial interpolation. (See Appendix A). 6. If the reconstructed sum is greater than P/2, it is probably a negative value. The value can be obtained by subtracting P from the sum. (Negative values are generated when negative noise is added to small signals.) 7. If scaling has been applied, the sum is scaled by the scaling factor. (See Appendix D.) 4. The document format 4.1. The counters document. This document format builds on the line-based directory format used for other tor documents, described in Tor's dir-spec.txt. Using this format, we describe a "counters" document that publishes the shares collected by a given DC, for a single TR. The "counters" document has these elements: "privctr-dump-format" SP VERSION SP SigningKey [At start, exactly once] Describes the version of the dump format, and provides an ed25519 signing key to identify the relay. The signing key is encoded in base64 with padding stripped. 
VERSION is "alpha" now, but should be "1" once this document is finalized. "starting-at" SP IsoTime [Exactly once] The start of the time period when the statistics here were collected. "ending-at" SP IsoTime [Exactly once] The end of the time period when the statistics here were collected. "share-parameters" SP Number SP Number [Exactly once] The number of shares needed to reconstruct the client's measurements (K), and the number of shares produced (N), respectively. "tally-reporter" SP Identifier SP Integer SP Key [At least twice] The curve25519 public key of each Tally Reporter that the relay believes in. (If the list does not match the list of participating Tally Reporters, they won't be able to find the relay's values correctly.) The identifiers are non-space, non-nul character sequences. The Key values are encoded in base64 with padding stripped; they must be unique within each counters document. The Integer values are the X coordinate of the shares associated with each Tally Reporter. "encrypted-to-key" SP Key [Exactly once] The curve25519 public key to which the report below is encrypted. Note that it must match one of the Tally Reporter options above. "report" NL "----- BEGIN ENCRYPTED MESSAGE-----" NL Base64Data "----- END ENCRYPTED MESSAGE-----" NL [Exactly once] An encrypted document, encoded in base64. The plaintext format is described in section 4.2. below. The encryption is as specified in section 5 below, with STRING_CONSTANT set to "privctr-shares-v1". "signature" SP Signature [At end, exactly once] The Ed25519 signature of all the fields in the document, from the first byte, up to but not including the "signature" keyword here. The signature is encoded in base64 with padding stripped. 4.2. The encrypted "shares" document. The shares document is sent, encrypted, in the "report" element above. 
Its plaintext contents include these fields:

  "encrypted-seed" NL
  "----- BEGIN ENCRYPTED MESSAGE-----" NL
  Base64Data
  "----- END ENCRYPTED MESSAGE-----" NL

     [At start, exactly once.]

     An encrypted document, encoded in base64. The plaintext value is the 32-byte value SEED_i for this TR. The encryption is as specified in section 5 below, with STRING_CONSTANT set to "privctr-seed-v1".

  "d" SP Keyword SP Integer

     [Any number of times]

     For each counter, the name of the counter, and the obfuscated Y coordinate of this TR's share for that counter. (The Y coordinate is calculated as y-CTR_c as in 3.1 above.) The order of counters must correspond to the order used when generating the MASK() values; different clients do not need to choose the same order.

5. Hybrid encryption

This scheme is taken from rend-spec-v3.txt, section 2.5.3, replacing "secret_input" and "STRING_CONSTANT". It is a hybrid encryption method for encrypting a message to a curve25519 public key PK.

We generate a new curve25519 keypair (sk,pk).

We run the algorithm of rend-spec-v3.txt 2.5.3, replacing "secret_input" with Curve25519(sk,PK) | SigningKey, where SigningKey is the DC's signing key. (Including the DC's SigningKey here prevents one DC from replaying another one's data.)

We transmit the encrypted data as in rend-spec-v3.txt 2.5.3, prepending pk.

Appendix A. Shamir secret sharing for the impatient

In Shamir secret sharing, you want to split a value in a finite field into N shares, such that any K of the N shares can reconstruct the original value, but K-1 shares give you no information at all.

The key insight here is that you can reconstruct a K-degree polynomial given K+1 distinct points on its curve, but not given K points.

So, to split a secret, we're going to generate a (K-1)-degree polynomial. We'll make the Y intercept of the polynomial be our secret, and choose all the other coefficients at random from our field.

Then we compute the (x,y) coordinates for x in [1, N].
Now we have N points, any K of which can be used to find the original polynomial.

Moreover, we can do what PrivCount wants here, because adding the y coordinates of N shares gives us shares of the sum: If P1 is the polynomial made to share secret A and P2 is the polynomial made to share secret B, and if (x,y1) is on P1 and (x,y2) is on P2, then (x,y1+y2) will be on P1+P2 ... and moreover, the y intercept of P1+P2 will be A+B.

To reconstruct a secret from a set of shares, you have to either go learn about Lagrange polynomials, or just blindly copy a formula from your favorite source.

Here is such a formula, as pseudocode^Wpython, assuming that each share is an object with a _x field and a _y field, and that FE is a finite-field element type.

    def interpolate(shares):
        # Evaluate the interpolated polynomial at x = 0 (the Y intercept).
        # All arithmetic, including the division, is done in the field.
        accumulator = FE(0)
        for sh in shares:
            product_num = FE(1)
            product_denom = FE(1)
            for sh2 in shares:
                if sh2 is sh:
                    continue
                product_num *= sh2._x
                product_denom *= (sh2._x - sh._x)

            accumulator += (sh._y * product_num) / product_denom

        return accumulator

Appendix B. An alternative way to pick X coordinates

Above we describe a system where everybody knows the same TRs and puts them in the same order, and then does Shamir secret sharing using "x" as the x coordinate for the x'th TR.

But what if we remove that requirement by having x be based on a hash of the public key of the TR? Everything would still work, so long as all users chose the same K value. It would also let us migrate TR sets a little more gracefully.

Appendix C. Sampling floating-point Gaussian noise for differential privacy

Background:

When we add noise to a counter value (signal), we want the added noise to protect all of the bits in the signal, to ensure differential privacy.

But because noise values are generated from random double(s) using floating-point calculations, the resulting low bits are not distributed evenly enough to ensure differential privacy.

As implemented in the C "double" type, IEEE 754 double-precision floating-point numbers contain 53 significant bits in their mantissa.
This means that noise calculated using doubles can not ensure differential privacy for client activity larger than 2**53: * if the noise is scaled to the magnitude of the signal using multiplication, then the low bits are unprotected, * if the noise is not scaled, then the high bits are unprotected. But the operations in the noise transform also suffer from floating-point inaccuracy, further affecting the low bits in the mantissa. So we can only protect client activity up to 2**46 with Laplacian noise. (We assume that the limit for Gaussian noise is similar.) Our noise generation procedure further reduces this limit to 2**42. For byte counters, 2**42 is 4 Terabytes, or the observed bandwidth of a 1 Gbps relay running at full speed for 9 hours. It may be several years before we want to protect this much client activity. However, since the mitigation is relatively simple, we specify that it MUST be implemented. Procedure: Data collectors MUST sample noise as follows: 1. Generate random double(s) in [0, 1] that are integer multiples of 2**-53. TODO: the Gaussian transform in step 2 may require open intervals 2. Generate a Gaussian floating-point noise value at random with sigma 1, using the random double(s) generated in step 1. 3. Multiply the floating-point noise by the floating-point sigma value. 4. Truncate the scaled noise to an integer to remove the fractional bits. (These bits can never correspond to signal bits, because PrivCount only collects integer counters.) 5. If the floating-point sigma value from step 3 is large enough that any noise value could be greater than or equal to 2**46, we need to randomise the low bits of the integer scaled noise value. (This ensures that the low bits of the signal are always hidden by the noise.) If we use the sample_unit_gaussian() transform in nickm/privcount_nm: A. The maximum r value is sqrt(-2.0*ln(2**-53)) ~= 8.57, and the maximal sin(theta) values are +/- 1.0. 
Therefore, the generated noise values can be greater than or equal to 2**46 when the sigma value is greater than 2**42. B. Therefore, the number of low bits that need to be randomised is: N = floor(sigma / 2**42) C. We randomise the lowest N bits of the integer noise by replacing them with a uniformly distributed N-bit integer value in 0...(2**N)-1. 6. Add the integer noise to the integer counter, before the counter is incremented in response to events. (This ensures that the signal value is always protected.) This procedure is security-sensitive: changing the order of multiplications, truncations, or bit replacements can expose the low or high bits of the signal or noise. As long as the noise is sampled using this procedure, the low bits of the signal are protected. So we do not need to "bin" any signals. The impact of randomising more bits than necessary is minor, but if we fail to randomise an unevenly distributed bit, client activity can be exposed. Therefore, we choose to randomise all bits that could potentially be affected by floating-point inaccuracy. Justification: Although this analysis applies to Laplacian noise, we assume a similar analysis applies to Gaussian noise. (If we add Laplacian noise on DCs, the total ends up with a Gaussian distribution anyway.) TODO: check that the 2**46 limit applies to Gaussian noise. This procedure results in a Gaussian distribution for the higher ~42 bits of the noise. We can safely ignore the value of the lower bits of the noise, because they are insignificant for our reporting. This procedure is based on section 5.2 of: "On Significance of the Least Significant Bits For Differential Privacy" Ilya Mironov, ACM CCS 2012 https://www.microsoft.com/en-us/research/wp-content/uploads/2012/10/lsbs.pdf We believe that this procedure is safe, because we neither round nor smooth the noise values. The truncation in step 4 has the same effect as Mironov's "safe snapping" procedure. 
Randomising the low bits removes the 2**46 limit on the sigma value, at the cost of departing slightly from the ideal infinite-precision Gaussian distribution. (But we already know that these bits are distributed poorly, due to floating-point inaccuracy.) Mironov's analysis assumes that a clamp() function is available to clamp large signal and noise values to an infinite floating-point value. Instead of clamping, PrivCount's arithmetic wraps modulo P. We believe that this is safe, because any reported values this large will be meaningless modulo P. And they will not expose any client activity, because "modulo P" is an arithmetic transform of the summed noised signal value. Alternatives: We could round the encrypted value to the nearest multiple of the unprotected bits. But this relies on the MASK() value being a uniformly distributed random value, and it is less generic. We could also simply fail when we reach the 2**42 limit on the sigma value, but we do not want to design a system with a limit that low. We could use a pure-integer transform to create Gaussian noise, and avoid floating-point issues entirely. But we have not been able to find an efficient pure-integer Gaussian or Laplacian noise transform. Nor do we know if such a transform can be used to ensure differential privacy. Appendix D. Scaling large counters We do not believe that scaling will be necessary to collect PrivCount statistics in Tor. As of November 2017, the Tor network advertises a capacity of 200 Gbps, or 2**51 bytes per day. We can measure counters as large as ~2**61 before reaching the P/2 counter limit. If scaling becomes necessary, we can scale event values (and noise sigmas) by a scaling factor before adding them to the counter. Scaling may introduce a bias in the final result, but this should be insignificant for reporting. Appendix Z. Remaining client-side uncertainties [These are the uncertainties at the client side. 
I'm not considering TR-only operations here unless they affect clients.] Should we do a multi-level thing for the signing keys? That is, have an identity key for each TR and each DC, and use those to sign short-term keys? How to tell the DCs the parameters of the system, including: - who the TRs are, and what their keys are? - what the counters are, and how much noise to add to each? - how do we impose a delay when the noise parameters change? (this delay ensures differential privacy even when the old and new counters are compared) - or should we try to monotonically increase counter noise? - when the collection intervals start and end? - what happens in networks where some relays report some counters, and other relays report other counters? - do we just pick the latest counter version, as long as enough relays support it? (it's not safe to report multiple copies of counters) How the TRs agree on which DCs' counters to collect? How data is uploaded to DCs? What to say about persistence on the DC side?
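As a companion to Appendix A, here is a minimal Python sketch of splitting, adding, and reconstructing shares over a prime field. It is an illustration only, not the PrivCount implementation: the modulus below is a placeholder prime (2**61 - 1) chosen for the example, and the function names split, add_shares, interpolate, and to_signed are hypothetical. Substitute the value of P that this proposal defines.

```python
import secrets

# Assumption: an illustrative prime modulus (the Mersenne prime 2**61 - 1).
# Replace it with the P defined by the proposal.
P = 2**61 - 1


def split(secret, k, n):
    """Split `secret` into n shares (x, y); any k of them reconstruct it."""
    # A degree k-1 polynomial: the constant term is the secret, and the
    # other coefficients are chosen at random from the field.
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for coeff in reversed(coeffs):  # Horner's rule, modulo P
            y = (y * x + coeff) % P
        shares.append((x, y))
    return shares


def add_shares(a, b):
    """Add two share lists pointwise; their x coordinates must line up."""
    assert [x for x, _ in a] == [x for x, _ in b]
    return [(x, (ya + yb) % P) for (x, ya), (_, yb) in zip(a, b)]


def interpolate(shares):
    """Evaluate the interpolated polynomial at x = 0 from distinct shares."""
    total = 0
    for xi, yi in shares:
        num, denom = 1, 1
        for xj, _ in shares:
            if xj == xi:
                continue
            num = (num * xj) % P
            denom = (denom * (xj - xi)) % P
        # Field division is multiplication by the modular inverse.
        total = (total + yi * num * pow(denom, -1, P)) % P
    return total


def to_signed(value):
    """Treat values above P/2 as negative, as in step 6 of section 3.2."""
    return value - P if value > P // 2 else value
```

For example, adding the shares of 1000 and of -40 pointwise and interpolating any 3 of the 5 resulting shares (with K=3) yields 960, which is the property the Tally Reporters rely on when they sum y_final values. The three-argument pow with exponent -1 requires Python 3.8 or later.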
Filename: 289-authenticated-sendmes.txt Title: Authenticating sendme cells to mitigate bandwidth attacks Author: Rob Jansen, Roger Dingledine, David Goulet Created: 2016-12-01 Status: Closed Implemented-In: 0.4.1.1-alpha 1. Overview and Motivation In Rob's "Sniper attack", a malicious Tor client builds a circuit, fetches a large file from some website, and then refuses to read any of the cells from the entry guard, yet sends "sendme" (flow control acknowledgement) cells down the circuit to encourage the exit relay to keep sending more cells. Eventually enough cells queue at the entry guard that it runs out of memory and exits [0, 1]. We resolved the "runs out of memory and exits" part of the attack with our Out-Of-Memory (OOM) manager introduced in Tor 0.2.4.18-rc. But the earlier part remains unresolved: a malicious client can launch an asymmetric bandwidth attack by creating circuits and streams and sending a small number of sendme cells on each to cause the target relay to receive a large number of data cells. This attack could be used for general mischief in the network (e.g., consume Tor network bandwidth resources or prevent access to relays), and it could probably also be leveraged to harm anonymity a la the "congestion attack" designs [2, 3]. This proposal describes a way to verify that the client has seen all of the cells that its sendme cell is acknowledging, based on the authenticated sendmes design from [1]. 2. Sniper Attack Variations There are some variations on the attack involving the number and length of the circuits and the number of Tor clients used. We explain them here to help understand which of them this proposal attempts to defend against. We compare the efficiency of these attacks in terms of the number of cells transferred by the adversary and by the network, where receiving and sending a cell counts as two transfers of that cell. 
2.1 Single Circuit, without Sendmes The simplest attack is where the adversary starts a single Tor client, creates one circuit and two streams to some website, and stops reading from the TCP connection to the entry guard. The adversary gets 1000 "attack" cells "for free" (until the stream and circuit windows close). The attack data cells are both received and sent by the exit and the middle, while being received and queued by the guard. Adversary: 6 transfers to create the circuit 2 to begin the two exit connections 2 to send the two GET requests --- 10 total Network: 18 transfers to create the circuit 22 to begin the two exit connections (assumes two for the exit TCP connect) 12 to send the two GET requests to the website 5000 for requested data (until the stream and circuit windows close) --- 5052 total 2.2 Single Circuit, with Sendmes A slightly more complex version of the attack in 2.1 is where the adversary continues to send sendme cells to the guard (toward the exit), and then gets another 100 attack data cells sent across the network for every three additional exitward sendme cells that it sends (two stream-level sendmes and one circuit-level sendme). The adversary also gets another three clientward sendme cells sent by the exit for every 100 exitward sendme cells it sends. If the adversary sends N sendmes, then we have: Adversary: 10 for circuit and stream setup N for circuit and stream sendmes --- 10+N Network: 5052 for circuit and stream setup and initial depletion of circuit windows N*100/3*5 for transferring additional data cells from the website N*3/100*4 for transferring sendmes from exit to client --- 5052 + N*166.79 It is important to note that once the adversary stops reading from the guard, it will no longer get feedback on the speed at which the data cells are able to be transferred through the circuit from the exit to the guard. 
It needs to approximate when it should send sendmes to the exit; if too many sendmes are sent such that the circuit window would open farther than 1000 cells (500 for streams), then the circuit may be closed by the exit. In practice, the adversary could take measurements during the circuit setup process and use them to estimate a conservative sendme sending rate. 2.3 Multiple Circuits The adversary could parallelize the above attacks using multiple circuits. Because the adversary needs to stop reading from the TCP connection to the guard, they would need to do a pre-attack setup phase during which they construct the attack circuits. Then, they would stop reading from the guard and send all of the GET requests across all of the circuits they created. The number of cells from 2.1 and 2.2 would then be multiplied by the number of circuits C that the adversary is able to build and sustain during the attack. 2.4 Multiple Guards The adversary could use the "UseEntryGuards 0" torrc option, or build custom circuits with stem to parallelize the attack across multiple guard nodes. This would slightly increase the bandwidth usage of the adversary, since it would be creating additional TCP connections to guard nodes. 2.5 Multiple Clients The adversary could run multiple attack clients, each of which would choose its own guard. This would slightly increase the bandwidth usage of the adversary, since it would be creating additional TCP connections to guard nodes and would also be downloading directory info, creating testing circuits, etc. 2.6 Short Two-hop Circuits If the adversary uses two-hop circuits, there is less overhead involved with the circuit setup process. 
Adversary: 4 transfers to create the circuit 2 to begin the two exit connections 2 to send the two GET requests --- 8 Network: 8 transfers to create the circuit 14 to begin the two exit connections (assumes two for the exit TCP connect) 8 to send the two GET requests to the website 5000 for requested data (until the stream and circuit windows close) --- 5030 2.7 Long >3-hop Circuits The adversary could use a circuit longer than three hops to cause more bandwidth usage across the network. Let's use an 8 hop circuit as an example. Adversary: 16 transfers to create the circuit 2 to begin the two exit connections 2 to send the two GET requests --- 20 Network: 128 transfers to create the circuit 62 to begin the two exit connections (assumes two for the exit TCP connect) 32 to send the two GET requests to the website 15000 for requested data (until the stream and circuit windows close) --- 15222 The adversary could also target a specific relay, and use it multiple times as part of the long circuit, e.g., as hop 1, 4, and 7. Target: 54 transfers to create the circuit 22 to begin the two exit connections (assumes two for the exit TCP connect) 12 to send the two GET requests to the website 5000 for requested data (until the stream and circuit windows close) --- 5088 3. Design This proposal aims to defend against the versions of the attack that utilize sendme cells without reading. It does not attempt to handle the case of multiple circuits per guard, or try to restrict the number of guards used by a client, or prevent a sybil attack across multiple client instances. The proposal involves three components: first, the client needs to add a token to the sendme payload, to prove that it knows the contents of the cells that it has received. Second, the exit relay needs to verify this token. 
Third, to resolve the case where the client already knows the contents of the file so it only pretends to read the cells, the exit relay needs to be able to add unexpected randomness to the circuit.

(Note: this proposal talks about clients and exit relays, but since sendmes go in both directions, both sides of the circuit should do these changes.)

3.1. Changing the sendme payload to prove receipt of cells

In short: clients put the latest received relay cell digest in the payload of their circuit-level sendme cells.

Each relay cell header includes a 4-byte digest which represents the rolling hash of all bytes received on that circuit. So knowledge of that digest is an indication that you've seen the bytes that go into it.

We pick circuit-level sendme cells, as opposed to stream-level sendme cells, because we think modifying just circuit-level sendmes is sufficient to accomplish the properties we need, and modifying just stream-level sendmes is not sufficient: a client could send a bunch of begin cells and fake their circuit-level sendmes, but never send any stream-level sendmes, attracting 500*n queued cells to the entry guard for the n streams that it opens.

Which digest should the client put in the sendme payload? Right now circuit-level sendmes are sent whenever one window worth of relay cells (100) has arrived. So the client should use the digest from the cell that triggers the sendme.

In order to achieve this, we need to version the SENDME cell so we can differentiate the original protocol versus the new authenticated cell. Right now, the SENDME payload is empty, which translates to a version value of 0 with this proposed change. The version to achieve authenticated SENDMEs of this proposal would be 1.

The SENDME cell payload would contain the following:

   VERSION   [1 byte]
   DATA_LEN  [2 bytes]
   DATA      [DATA_LEN bytes]

The VERSION tells us what is expected in the DATA section of length DATA_LEN. The recognized values are:

   0x00: The rest of the payload should be ignored.
   0x01: Authenticated SENDME. The DATA section should contain:

      DIGEST   [20 bytes]

      If the DATA_LEN value is less than 20 bytes, the cell should be dropped and the circuit closed. If the value is more than 20 bytes, then the first 20 bytes should be read to get the correct value.

      The DIGEST is the digest value from the cell that triggered this SENDME as mentioned above. This value is matched on the other side from the previous cell.

If a VERSION is unrecognized, the SENDME cell should be treated as version 0 meaning the payload is ignored.

3.2. Verifying the sendme payload

In the current Tor, the exit relay keeps no memory of the cells it has sent down the circuit, so it won't be in a position to verify the digest that it gets back.

But fortunately, the exit relay can count also, so it knows which cell is going to trigger the sendme response. Each circuit can have at most 10 sendmes worth of data outstanding. So the exit relay will keep a per-circuit fifo queue of the digests from the appropriate cells, and when a new sendme arrives, it pulls off the next digest in line, and verifies that it matches.

If a sendme payload has a payload version of 1 yet its digest doesn't match the expected digest, or if the sendme payload has an unexpected payload version (see below about deployment phases), the exit relay must tear down the circuit. (If we later find that we need to introduce a newer payload version in an incompatible way, we would do that by bumping the circuit protocol version.)

3.3. Making sure there are enough unpredictable bytes in the circuit

So far, the design as described fails to a very simple attacker: the client fetches a file whose contents it already knows, and it uses that knowledge to calculate the correct digests and fake its sendmes just like in the original attack.

The fix is that the exit relay needs to be able to add some randomness into its cells.
It can add this randomness, in a way that's completely orthogonal to the rest of this design, simply by choosing one relay cell every so often and not using the entire relay cell payload for actual data (i.e. using a Length field of less than 498), and putting some random bytes in the remainder of the payload. How many random bytes should the exit relay use, and how often should it use them? There is a tradeoff between security when under attack, and efficiency when not under attack. We think 1 byte of randomness every 1000 cells is a good starting plan, and we can always improve it later without needing to change any of the rest of this design. (Note that the spec currently says "The remainder of the payload is padded with NUL bytes." We think "is" doesn't mean MUST, so we should just be sure to update that part of the spec to reflect our new plans here.) 4. Deployment Plan This section describes how we will be able to deploy this new mechanism on the network. Alas, this deployment plan leaves a pretty large window until relays are protected from attack. It's not all bad news though, since we could flip the switches earlier than intended if we encounter a network-wide attack. There are 4 phases to this plan detailed in the following subsections. 4.1. Phase One - Remembering Digests Both sides begin remembering their expected digests, and they learn how to parse sendme version 1 payloads. When they receive a version 1 SENDME, they verify its digest and tear down the circuit if it's wrong. But they continue to send and accept payload version 0 sendmes. 4.2. Phase Two - Sending Version 1 We flip a switch in the consensus, and everybody starts sending payload version 1 sendmes. Payload version 0 sendmes are still accepted. The newly proposed consensus parameter to achieve this is: "sendme_emit_min_version" - Minimum SENDME version that can be sent. 4.3. 
Phase Three - Protover

In phase four (section 4.4), the new consensus parameter that tells us which minimum version to accept, once flipped to version 1, causes every tor that does not support that version to fail to operate on the network. Such a tor cannot even download a consensus.

It is essentially a "false-kill" switch because tor will still run but will simply not work. It will retry over and over to download a consensus.

In order to help us transition before only accepting v1 on the network, a new protover value is proposed (see section 9 of tor-spec.txt for protover details).

Tor clients and relays that don't support this protover version from the consensus "required-client-protocols" or "required-relay-protocols" lines will exit and thus not try to join the network. Here is the proposed value:

"FlowCtrl"

   Describes the flow control protocol at the circuit and stream level. If there is no FlowCtrl protocol version, tor supports the unauthenticated flow control features from its supported Relay protocols.

   "1" -- supports authenticated circuit level SENDMEs as of proposal 289 in Tor 0.4.1.1-alpha.

4.4. Phase Four - Accepting Version 1

We flip a different switch in the consensus, and everybody starts refusing payload version 0 sendmes.

The newly proposed consensus parameter to achieve this is:

   "sendme_accept_min_version" - Minimum SENDME version that is accepted.

It has to be two separate switches, not one unified one, because otherwise we'd have a race where relays learn about the update before clients know to start the new behavior.

4.5. Timeline

The proposed timeline for the deployment phases:

Phase 1: Once this proposal is merged into tor (expected: 0.4.1.1-alpha), v1 SENDMEs can be accepted on a circuit.

Phase 2: Once Tor Browser releases a stable version containing 0.4.1, we consider that we have a very large portion of clients supporting v1 and thus limit the partition problem.
We can safely emit v1 SENDMEs in the network because the payload is ignored for version 0 thus sending a v1 right now will not affect older tor's behavior and will be considered a v0. Phase 3: This phase will effectively exit() all tor not supporting "FlowCtrl=1". The earliest date we can do that is when all versions not supporting v1 are EOL. According to our release schedule[4], this can happen when our latest LTS (0.3.5) goes EOL that is on Feb 1st, 2022. Phase 4: We recommend to pass at least one version after Phase 3 so we can take the time to see the effect that it had on the network. Considering 6 months release time frame we expect to do this phase around July 2022. 5. Security Discussion Does our design enable any new adversarial capabilities? An adversarial middle relay could attempt to trick the exit into killing an otherwise valid circuit. An adversarial relay can already kill a circuit, but here it could make it appear that the circuit was killed for a legitimate reason (invalid or missing sendme), and make someone else (the exit) do the killing. There are two ways it might do this: by trying to make a valid sendme appear invalid; and by blocking the delivery of a valid sendme. Both of these depend on the ability for the adversary to guess which exitward cell is a sendme cell, which it could do by counting clientward cells. * Making a valid sendme appear invalid A malicious middle could stomp bits in the exitward sendme so that the exit sendme validation fails. However, bit stomping would be detected at the protocol layer orthogonal to this design, and unrecognized exitward cells would currently cause the circuit to be torn down. Therefore, this attack has the same end result as blocking the delivery of a valid sendme. (Note that, currently, clientward unrecognized cells are dropped but the circuit is not torn down.) 
* Blocking delivery of a valid sendme

  A malicious middle could simply drop an exitward sendme, so that the exit is unable to verify the digest in the sendme payload. The following exitward sendme cell would then be misaligned with the sendme that the exit is expecting to verify. The exit would kill the circuit because the client failed to prove it has read all of the clientward cells.

The benefits of such an attack over just directly killing the circuit seem low, and we feel that the added benefits of the defense outweigh the risks.

6. Open problems

With the proposed defenses in place, an adversary will be unable to successfully use the "continue sending sendmes" part of these attacks.

But this proposal won't resolve the "build up many circuits over time, and then use them to attack all at once" issue, nor will it stop sybil attacks, such as when an attacker makes many parallel connections to a single target relay, or reaches out to many guards in parallel.

We spent a while trying to figure out if we can enforce some upper bound on how many circuits a given connection is allowed to have open at once, to limit every connection's potential for launching a bandwidth attack. But there are plausible situations where well-behaving clients accumulate many circuits over time: Ricochet clients with many friends, popular onion services, or even Tor Browser users with a bunch of tabs open.

Even though a per-conn circuit limit would produce many false positives, it might still be useful to have it deployed and available as a consensus parameter, as another tool for combatting a wide-scale attack on the network: a parameter to limit the total number of open circuits per conn (viewing each open circuit as a threat) would complement the current work in #24902 to rate limit circuit creates per client address.
But we think the threat of parallel attacks might be best handled by teaching relays to react to actual attacks, like we've done in #24902: we should teach Tor relays to recognize when somebody is *doing* this attack on them, and to squeeze down or outright block the client IP addresses that have tried it recently.

An alternative direction would be to await research ideas on how guards might coordinate to defend against attacks while still preserving user privacy.

In summary, we think authenticating the sendme cells is a useful building block for these future solutions, and it can be (and should be) done orthogonally to whatever sybil defenses we pick later.

7. References

[0] https://blog.torproject.org/blog/new-tor-denial-service-attacks-and-defenses
[1] https://www.freehaven.net/anonbib/#sniper14
[2] https://www.freehaven.net/anonbib/#torta05
[3] https://www.freehaven.net/anonbib/#congestion-longpaths
[4] https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases

8. Acknowledgements

This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.
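The payload format of section 3.1 and the FIFO check of section 3.2 can be sketched in Python as follows. This is an illustration under stated assumptions, not tor's implementation. The names build_sendme_v1, parse_sendme, and SendmeVerifier are hypothetical. The sketch assumes network byte order for DATA_LEN, treats a payload too short to carry a header as the legacy empty (version 0) payload, and treats a sendme that arrives when no digest is queued as grounds for closing the circuit. How the 20-byte digests are computed is outside its scope.

```python
import hmac
import struct
from collections import deque

DIGEST_LEN = 20           # DIGEST size for version 1 (section 3.1)
CIRCUIT_WINDOW_INC = 100  # cells acknowledged by one circuit-level sendme


def build_sendme_v1(digest):
    """Encode VERSION [1 byte], DATA_LEN [2 bytes], DATA [DATA_LEN bytes]."""
    return struct.pack("!BH", 1, len(digest)) + digest


def parse_sendme(payload):
    """Return (version, digest). ValueError means: close the circuit."""
    if len(payload) < 3:
        return 0, None  # assumption: empty legacy payload is version 0
    version, data_len = struct.unpack("!BH", payload[:3])
    if version != 1:
        return 0, None  # unrecognized versions are treated as version 0
    data = payload[3:3 + data_len]
    if data_len < DIGEST_LEN or len(data) < DIGEST_LEN:
        raise ValueError("version 1 sendme with short DATA")
    return 1, data[:DIGEST_LEN]  # extra bytes beyond the digest are ignored


class SendmeVerifier:
    """Per-circuit FIFO of the digests that incoming sendmes must echo."""

    def __init__(self, accept_min_version=0):
        self.expected = deque()
        self.cells_sent = 0
        self.accept_min_version = accept_min_version

    def note_cell_sent(self, digest):
        # Only the cell that will trigger a sendme needs to be remembered.
        self.cells_sent += 1
        if self.cells_sent % CIRCUIT_WINDOW_INC == 0:
            self.expected.append(digest)

    def check_sendme(self, payload):
        """Return True to accept; False means tear down the circuit."""
        try:
            version, digest = parse_sendme(payload)
        except ValueError:
            return False
        if version < self.accept_min_version or not self.expected:
            return False
        want = self.expected.popleft()
        if version == 0:
            return True  # unauthenticated sendme, still accepted for now
        return hmac.compare_digest(digest, want)
```

Setting accept_min_version to 1 models the switch described in section 4.4: version 0 sendmes are then refused.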
Filename: 290-deprecate-consensus-methods.txt
Title: Continuously update consensus methods
Author: Nick Mathewson
Created: 2018-02-21
Status: Meta

1. Background

Directory authorities use the "consensus method" mechanism to achieve forward compatibility during voting. When each authority publishes its vote, it includes a list of numbered consensus methods that it supports. Each authority chooses to calculate the consensus according to the highest consensus method it knows that is supported by more than 2/3 of the voting authorities. So long as all the authorities have a method in common, they will all reach the same consensus.

Consensus method 1 was first introduced in the Tor 0.2.0 series around 2008. But by 2012, we realized that we had a problem: we were stuck documenting and supporting old consensus methods indefinitely. With proposal 215, we deprecated and removed support for all consensus methods before method 13.

That was good as far as it went, but it didn't solve the problem going forward: the latest consensus method is now 28.

This proposal describes a policy for removing older consensus methods going forward, so we won't have to keep supporting them forever.

2. Proposal

I propose that from time to time, old consensus methods should be deprecated. Specifically, I propose that we deprecate all methods older than the highest method supported in the first stable release of the oldest LTS (long-term support) release series.

For example, the current oldest LTS series is 0.2.5.x. The first stable release in that series was 0.2.5.10. The highest consensus method listed by 0.2.5.10 is 18. Therefore, we should currently consider ourselves free to deprecate all methods before 18.

Once 0.2.5.x is deprecated, 0.2.9.x will become the oldest LTS series. The first stable release in that series was 0.2.9.8. The highest consensus method listed by 0.2.9.8 is 25. Therefore, once 0.2.5.x is deprecated (in May 2018), we may deprecate all methods before 25.
When a consensus method is deprecated, it should no longer be listed or implemented by the latest Tor releases. (It's okay for older authorities to keep advertising it.)

Most consensus methods add a feature that is used in "method M or later". Deprecating method M-1 means that the feature is used in all supported consensus methods. Therefore, we can remove any code that makes the feature conditional on a consensus method, and any code for previous implementations of the feature.

Some consensus methods remove a feature that was used up to method M. Deprecating method M means that the feature is no longer used by any supported consensus methods. Therefore, we can remove any code that implements the feature.

A. Acknowledgments

Thanks to isis and teor for the discussion that led to this proposal. I believe that teor first suggested the policy described in section 2 above.

B. Client and relay compatibility notes

Dear reader: you may be worrying that this proposal will cause old clients or relays to stop working prematurely. That is not the case. Consensus methods determine how the authorities behave, but they do not represent backward-incompatible changes in how they generate their consensuses.
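The selection rule from Section 1 and the deprecation rule from Section 2 can be sketched as follows. This is only an illustration under stated assumptions: the function names and the representation of each vote as a Python set of method numbers are hypothetical and do not mirror Tor's implementation.

```python
def choose_consensus_method(votes, known_methods):
    """Return the highest method we know that is listed by more than 2/3
    of the votes, or None if there is no such method.

    votes: list of sets, one per voting authority (hypothetical format).
    known_methods: iterable of methods this authority implements.
    """
    n = len(votes)
    best = None
    for method in sorted(known_methods):
        supporters = sum(1 for vote in votes if method in vote)
        # "More than 2/3" written with integers to avoid rounding issues.
        if 3 * supporters > 2 * n:
            best = method
    return best


def methods_to_deprecate(known_methods, oldest_lts_highest_method):
    """Methods older than the highest method listed by the first stable
    release of the oldest LTS series (the rule of Section 2)."""
    return sorted(m for m in known_methods if m < oldest_lts_highest_method)


# Illustrative votes: nine authorities, seven of which know methods 13..28.
new_vote = set(range(13, 29))
old_vote = set(range(13, 26))
print(choose_consensus_method([new_vote] * 7 + [old_vote] * 2, new_vote))  # 28
print(choose_consensus_method([new_vote] * 6 + [old_vote] * 3, new_vote))  # 25
print(methods_to_deprecate(new_vote, 18))  # [13, 14, 15, 16, 17]
```

In the second call only six of nine votes list methods above 25, which is exactly 2/3 and therefore not enough, so the authorities fall back to the highest method they all share.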
Filename: 291-two-guard-nodes.txt
Title: The move to two guard nodes
Author: Mike Perry
Created: 2018-03-22
Supersedes: Proposal 236
Status: Finished

0. Background

Back in 2014, Tor moved from three guard nodes to one guard node[1,2,3].

We made this change primarily to limit points of observability of entry into the Tor network for clients and onion services, as well as to reduce the ability of an adversary to track clients as they move from one internet connection to another by their choice of guards.

1. Proposed changes

1.1. Switch to two guards per client

When this proposal becomes effective, clients will switch to using two guard nodes. The guard node selection algorithms of Proposal 271 will remain unchanged. Instead of having one primary guard "in use", Tor clients will always use two.

This will be accomplished by setting the guard-n-primary-guards-to-use consensus parameter to 2, as well as guard-n-primary-guards to 2. (Section 3.1 covers the reason for both parameters). This is equivalent to using the torrc option NumEntryGuards=2, which can be used for testing behavior prior to the consensus update.

1.2. Enforce Tor's path restrictions across this guard layer

In order to ensure that Tor can always build circuits using two guards without resorting to a third, they must be chosen such that Tor's path restrictions could still build a path with at least one of them, regardless of the other nodes in the path.

In other words, we must ensure that both guards are not chosen from the same /16 or the same node family. In this way, Tor will always be able to build a path using these guards, preventing the use of a third guard.

2. Discussion

2.1. Why two guards?

The main argument for switching to two guards is that because of Tor's path restrictions, we're already using two guards, but we're using them in a suboptimal and potentially dangerous way.
Tor's path restrictions enforce the condition that the same node cannot appear twice in the same circuit, nor can nodes from the same /16 subnet or node family be used in the same circuit.

Tor's paths are also built such that the exit node is chosen first and held fixed during guard node choice, as are the IP, HSDIR, and RPs for onion services. This means that whenever one of these nodes happens to be the guard[4], or be in the same /16 or node family as the guard, Tor will build that circuit using a second "primary" guard, as per proposal 271[7].

Worse still, the choice of RP, IP, and exit can all be controlled by an adversary (to varying degrees), enabling them to force the use of a second guard at will.

Because this happens somewhat infrequently in normal operation, a fresh TLS connection will typically be created to the second "primary" guard, and that TLS connection will be used only for the circuit for that particular request. This property makes all sorts of traffic analysis attacks easier, because this TLS connection will not benefit from any multiplexing.

This is more serious than traffic injection via an already in-use guard because the lack of multiplexing means that the data retention level required to gain information from this activity is very low, and may exist for other reasons. To gain information from this behavior, an adversary needs only connection 5-tuples + timestamps, as opposed to detailed timeseries data that is polluted by other concurrent activity and padding.

In the most severe form of this attack, the adversary can take a suspect list of Tor client IP addresses (or the list of all Guard node IP addresses) and observe when secondary Tor connections are made to them at the time when they cycle through all guards as RPs for connections to an onion service.
This adversary does not require collusion on the part of observers beyond the ability to provide 5-tuple connection logs (which ISPs may retain for reasons such as netflow accounting, IDS, or DoS protection systems).

A fully passive adversary can also make use of this behavior. Clients unlucky enough to pick guard nodes in heavily used /16s or in large node families will tend to make use of a second guard more frequently even without effort from the adversary. In these cases, the lack of multiplexing also means that observers along the path to this secondary guard gain more information per observation.

2.2. Why not MORE guards?

We do not want to increase the number of observation points for client activity into the Tor network[1]. We merely want better multiplexing for the cases where this already happens.

2.3. Can you put some numbers on that?

The Changing of the Guards[13] paper studies this from a few different angles, but one of the crucially missing graphs is how long a client can expect to run with N guards before it chooses a malicious guard.

However, we do have tables in section 3.2.1 of proposal 247 that cover this[14]. There are three tables there: one for a 1% adversary, one for a 5% adversary, and one for a 10% adversary. You can see the probability of adversary success for one and two guards in terms of the number of rotations needed before the adversary's node is chosen.

Not surprisingly, with two guards the adversary gets to compromise clients roughly twice as quickly, but the timescales are still rather large even for the 10% adversary: they only have a 50% chance of success after 4 rotations, which will take about 14 months with Tor's 3.5 month guard rotation.

2.4. What about guard fingerprinting?

More guards also means more fingerprinting[8]. However, even one guard may be enough to fingerprint a user who moves around in the same area, if that guard is low bandwidth or there are not many Tor users in that area.
Furthermore, our use of separate directory guards (and three of them) means that we're not really changing the situation much with the addition of another regular guard. Right now, directory guard use alone is enough to track all Tor users across the entire world.

While the directory guard problem could be fixed[12] (and should be fixed), it is still the case that another mechanism should be used for the general problem of guard-vs-location management[9].

3. Alternatives

There are two other solutions that also avoid the use of a secondary guard in the path restriction case.

3.1. Eliminate path restrictions entirely

If Tor decided to stop enforcing the /16 and node family restrictions, and also allowed the guard node to be chosen twice in the path, then under normal conditions, it should retain the use of its primary guard.

This approach is not as extreme as it seems on its face. In fact, it is hard to come up with arguments against removing these restrictions. Tor's /16 restriction is of questionable utility against monitoring, and it can be argued that since only good actors use node family, it gives influence over path selection to bad actors in ways that are worse than the benefit it provides to paths through good actors[10,11].

However, while removing path restrictions will solve the immediate problem, it will not address other instances where Tor temporarily opts to use a second guard due to congestion, OOM, or failure of its primary guard, and we're still running into bugs where this can be adversarially controlled or just happen randomly[5].

While using two guards means twice the surface area for these types of bugs, it also means that instances where they happen simultaneously on both guards (thus forcing a third guard) are much less likely than with just one guard. (In the passive adversary model, consider that one guard fails at any point with probability P1. If we assume that such passive failures are independent events, both guards would fail concurrently with probability P1*P2.
Even if the events are correlated, the maximum chance of concurrent failure is still MIN(P1,P2)).

Note that for this analysis to hold, we have to ensure that nodes that are at RESOURCELIMIT or otherwise temporarily unresponsive do not cause us to consider other primary guards beyond the two we have chosen. This is accomplished by setting guard-n-primary-guards to 2 (in addition to setting guard-n-primary-guards-to-use to 2). With this parameter set, the proposal 271 algorithm will avoid considering more than our two guards, unless *both* are down at once.

3.2. No Guard-flagged nodes as exit, RP, IP, or HSDIRs

Similar to 3.1, we could instead forbid the use of Guard-flagged nodes for the exit, IP, RP, and HSDIR positions.

This solution has two problems: First, like 3.1, it also does not handle the case where resource exhaustion could force the use of a second guard. Second, it requires clients to upgrade to the new behavior and stop using Guard-flagged nodes before it can be deployed.

4. The future is confluxed

An additional benefit of using a second guard is that it enables us to eventually use conflux[6].

Conflux works by giving circuits a 256-bit cookie that is sent to the exit/RP, and circuits that are then built to the same exit/RP with the same cookie can then be fused together. Throughput estimates are used to balance traffic between these circuits, depending on their performance.

We have unfortunately signaled to the research community that conflux is not worth pursuing, because of our insistence on a single guard. While not relevant to this proposal (indeed, conflux requires its own proposal and also concurrent research), it is worth noting that whichever way we go here, the door remains open to conflux because of its utility against similar issues.
If our conflux implementation includes packet acking, then circuits can still survive the loss of one guard node due to DoS, OOM, or other failures because the second half of the path will remain open and usable (see the probability of concurrent failure arguments in Section 3.1).

If exits remember this cookie for a short period of time after the last circuit is closed, the technique can be used to protect against DoS/OOM/guard downtime conditions that take down both guard nodes or destroy many circuits to confirm both guard node choices. In these cases, circuits could be rebuilt along an alternate path and resumed without end-to-end circuit connectivity loss.

This same technique will also make things like ephemeral bridges (i.e., Snowflake/Flashproxy) more usable, because bridge uptime will no longer be so crucial to usability. It will also improve mobile usability by allowing us to resume connections after mobile Tor apps are briefly suspended, or if the user switches between cell and wifi networks.

Furthermore, it is likely that conflux will also be useful against traffic analysis and congestion attacks. Since the load balancing is dynamic and hard to predict by an external observer and also increases overall traffic multiplexing, traffic correlation and website traffic fingerprinting attacks will become harder, because the adversary can no longer be sure what percentage of the traffic they have seen (depending on their position and other potential concurrent activity). Similarly, it should also help dampen congestion attacks, since traffic will automatically shift away from a congested guard.

5. Acknowledgements

This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

References:

1. https://blog.torproject.org/improving-tors-anonymity-changing-guard-parameters
2. https://trac.torproject.org/projects/tor/ticket/12206
3. https://gitweb.torproject.org/torspec.git/tree/proposals/236-single-guard-node.txt
4. https://trac.torproject.org/projects/tor/ticket/14917
5. https://trac.torproject.org/projects/tor/ticket/25347#comment:14
6. https://www.cypherpunks.ca/~iang/pubs/conflux-pets.pdf
7. https://gitweb.torproject.org/torspec.git/tree/proposals/271-another-guard-selection.txt
8. https://trac.torproject.org/projects/tor/ticket/9273#comment:3
9. https://tails.boum.org/blueprint/persistent_Tor_state/
10. https://trac.torproject.org/projects/tor/ticket/6676#comment:3
11. https://bugs.torproject.org/15060
12. https://trac.torproject.org/projects/tor/ticket/10969
13. https://www.freehaven.net/anonbib/cache/wpes12-cogs.pdf
14. https://gitweb.torproject.org/torspec.git/tree/proposals/247-hs-guard-discovery.txt
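The estimate quoted in Section 2.3 of this proposal can be reproduced with a few lines of Python. This is a sketch that assumes a simple model: every guard chosen in a rotation is malicious, independently, with probability equal to the adversary's share of the network (the parameter c below). The function and constant names are illustrative, and the 3.5 month rotation period is the figure quoted in Section 2.3.

```python
import math

GUARD_ROTATION_MONTHS = 3.5  # rotation period quoted in Section 2.3


def num_rotations(c, guards, success):
    """Smallest number of rotations r such that the chance of having
    picked at least one malicious guard, 1 - (1 - c)**(guards * r),
    reaches the given success probability."""
    r = 0
    while 1 - math.pow(1 - c, guards * r) < success:
        r += 1
    return r


def months_until(c, guards, success):
    """Time, in months, for the adversary to reach that success rate."""
    return num_rotations(c, guards, success) * GUARD_ROTATION_MONTHS


# 10% adversary, 50% chance of success:
print(num_rotations(0.10, 1, 0.50), months_until(0.10, 1, 0.50))  # 7 24.5
print(num_rotations(0.10, 2, 0.50), months_until(0.10, 2, 0.50))  # 4 14.0
```

Under this model the second guard roughly halves the time the adversary needs, from about 24.5 months to about 14 months, which matches the "4 rotations" and "about 14 months" figures in the text.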
Filename: 292-mesh-vanguards.txt
Title: Mesh-based vanguards
Authors: George Kadianakis and Mike Perry
Created: 2018-05-08
Status: Accepted
Supersedes: 247

0. Motivation

A guard discovery attack allows attackers to determine the guard node of a Tor client. The hidden service rendezvous protocol provides an attack vector for a guard discovery attack since anyone can force an HS to construct a 3-hop circuit to a relay (#9001).

Following the guard discovery attack with a compromise and/or coercion of the guard node can lead to the deanonymization of a hidden service.

1. Overview

This document tries to make the above guard discovery + compromise attack harder to launch. It introduces a configuration option which makes the hidden service also pin the second and third hops of its circuits for a longer duration.

With this new path selection, we force the adversary to perform a Sybil attack and two compromise attacks before succeeding. This is an improvement over the current state where the Sybil attack is trivial to pull off, and only a single compromise attack is required.

With this new path selection, an attacker is forced to do one or more node compromise attacks before learning the guard node of a hidden service. This increases the uncertainty of the attacker, since compromise attacks are costly and potentially detectable, so an attacker will have to think twice before beginning a chain of node compromise attacks that they might not be able to complete.

1.1. Tor integration

The mechanisms introduced in this proposal are currently implemented partially in Tor and partially through an external Python script:

  https://github.com/mikeperry-tor/vanguards

The Python script uses the new Tor configuration options HSLayer2Nodes and HSLayer3Nodes to be able to select nodes for the guard layers. The Python script is tasked with maintaining and rotating the guard nodes as needed based on the lifetimes described in this proposal.
In the future, we are aiming to include the whole functionality into Tor, with no need for external scripts.

1.2. Visuals

Here is what a hidden service rendezvous circuit currently looks like:

                 -> middle_1 -> middle_A
                 -> middle_2 -> middle_B
                 -> middle_3 -> middle_C
                 -> middle_4 -> middle_D
   HS -> guard   -> middle_5 -> middle_E
                 -> middle_6 -> middle_F
                 -> middle_7 -> middle_G
                 -> middle_8 -> middle_H
                 ->   ...    ->  ...
                 -> middle_n -> middle_n

This proposal pins the two middle positions into much more restricted sets, as follows:

                  -> guard_2A -> guard_3A
      -> guard_1A -> guard_2B -> guard_3B
   HS                         -> guard_3C
      -> guard_1B -> guard_2C -> guard_3D
                              -> guard_3E
                  -> guard_2D -> guard_3F

Additionally, to avoid linkability, we insert an extra middle node after the third layer guard for client side intro and hsdir circuits, and service-side rendezvous circuits. This means that the set of paths for Client (C) and Service (S) side look like this:

   C - G - L2 - L3 - R
   S - G - L2 - L3 - HSDIR
   S - G - L2 - L3 - I
   C - G - L2 - L3 - M - I
   C - G - L2 - L3 - M - HSDIR
   S - G - L2 - L3 - M - R

1.3. Threat model, Assumptions, and Goals

Consider an adversary with the following powers:

   - Can launch a Sybil guard discovery attack against any node of a rendezvous circuit. The slower the rotation period of the node, the longer the attack takes. Similarly, the higher the percentage of the network that is compromised, the faster the attack runs.

   - Can compromise any node on the network, but this compromise takes time and potentially even coercive action, and also carries risk of discovery.

We also make the following assumptions about the types of attacks:

   1. A Sybil attack is observable by both people monitoring the network for large numbers of new nodes, as well as vigilant hidden service operators. It will require either large amounts of traffic sent towards the hidden service, multiple test circuits, or both.

   2. A Sybil attack against the second or first layer Guards will be more noisy than a Sybil attack against the third layer guard, since the second and first layer Sybil attack requires a timing side channel in order to determine success, whereas the Sybil success is almost immediately obvious to the third layer guard, since it will be instructed to connect to a cooperating malicious rend point by the adversary.

   3. As soon as the adversary is confident they have won the Sybil attack, an even more aggressive circuit building attack will allow them to determine the next node very fast (an hour or less).

   4. The adversary is strongly disincentivized from compromising nodes that may prove useless, as node compromise is even more risky for the adversary than a Sybil attack in terms of being noticed.

Given this threat model, our security parameters were selected so that the first two layers of guards should be hard to attack using a Sybil guard discovery attack and hence require a node compromise attack. Ideally, we want the node compromise attacks to carry a non-negligible probability of being useless to the adversary by the time they complete.

On the other hand, the outermost layer of guards should rotate fast enough to _require_ a Sybil attack.

See our vanguard simulator project for a simulation of the above adversary model and a motivation for the parameters selected within this proposal:

  https://github.com/asn-d6/vanguard_simulator
  https://github.com/asn-d6/vanguard_simulator/wiki/Optimizing-vanguard-topologies

2. Design

When a hidden service picks its guard nodes, it also picks an additional NUM_LAYER2_GUARDS-sized set of middle nodes for its `second_guard_set`, as well as a NUM_LAYER3_GUARDS-sized set of middle nodes for its `third_guard_set`.
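As a rough illustration of the set selection just described, the sketch below picks the two sets from a list of candidate middle nodes. It is not how Tor or the vanguards script chooses nodes: uniform sampling is an assumption made for brevity, the relay names are hypothetical, and whether the two sets may overlap is left open here. The set sizes are the values given in Section 2.1.

```python
import random

NUM_LAYER2_GUARDS = 4  # value from Section 2.1
NUM_LAYER3_GUARDS = 6  # value from Section 2.1


def pick_vanguard_sets(middle_nodes, rng=random):
    """Pick `second_guard_set` and `third_guard_set` from the candidates.

    Assumption: nodes are sampled uniformly and without replacement
    within each set; the two sets are sampled separately.
    """
    second_guard_set = rng.sample(middle_nodes, NUM_LAYER2_GUARDS)
    third_guard_set = rng.sample(middle_nodes, NUM_LAYER3_GUARDS)
    return second_guard_set, third_guard_set


# Hypothetical relay names, for illustration only.
middles = ["relay_%02d" % i for i in range(50)]
second_guard_set, third_guard_set = pick_vanguard_sets(middles, random.Random(0))
print(len(second_guard_set), len(third_guard_set))  # 4 6
```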
When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from `third_guard_set` as the third hop of the circuit.

A hidden service rotates nodes from the `second_guard_set` at a random time between MIN_SECOND_GUARD_LIFETIME hours and MAX_SECOND_GUARD_LIFETIME hours.

A hidden service rotates nodes from the `third_guard_set` at a random time between MIN_THIRD_GUARD_LIFETIME and MAX_THIRD_GUARD_LIFETIME hours.

Each node's rotation time is tracked independently, to avoid disclosing the rotation times of the primary and second-level guards.

2.1. Security parameters

We set NUM_LAYER2_GUARDS to 4 nodes and NUM_LAYER3_GUARDS to 6 nodes.

We set MIN_SECOND_GUARD_LIFETIME to 1 day, and MAX_SECOND_GUARD_LIFETIME to 45 days inclusive, for an average rotation rate of 29.5 days, using the max(X,X) distribution specified in Section 3.3.

We set MIN_THIRD_GUARD_LIFETIME to 1 hour, and MAX_THIRD_GUARD_LIFETIME to 48 hours inclusive, for an average rotation rate of 31.5 hours, using the max(X,X) distribution specified in Section 3.3.

See Section 3 for more analysis on these constants.

2.2. Path restriction changes

In order to avoid information leaks and ensure paths can be built, path restrictions must be loosened.

In particular, we allow the following:

   1. Nodes from the same /16 and same family for any/all hops
   2. Guard nodes can be chosen for RP/IP/HSDIR
   3. Guard nodes can be chosen for hop before RP/IP/HSDIR.

The first change prevents the situation where paths cannot be built if two layers all share the same subnet and/or node family. It also prevents the use of a different entry guard based on the family or subnet of the IP, HSDIR, or RP.

The second change prevents an adversary from forcing the use of a different entry guard by enumerating all guard-flagged nodes as the RP.
The third change prevents an adversary from learning the guard node by way of noticing which nodes were not chosen for the hop before it.

3. Rationale and Security Parameter Selection

3.1. Sybil rotation counts for a given number of Guards

The probability of Sybil success for Guard discovery can be modeled as the probability of choosing 1 or more malicious middle nodes for a sensitive circuit over some period of time.

  P(At least 1 bad middle) = 1 - P(All Good Middles)
                           = 1 - P(One Good middle)^(num_middles)
                           = 1 - (1 - c/n)^(num_middles)

  c/n is the adversary compromise percentage

In the case of Vanguards, num_middles is the number of Guards you rotate through in a given time period. This is a function of the number of vanguards in that position (v), as well as the number of rotations (r).

  P(At least one bad middle) = 1 - (1 - c/n)^(v*r)

Here are detailed tables of the number of rotations required for a given Sybil success rate for certain numbers of guards.

  1.0% Network Compromise:
  Sybil Success     One     Two   Three    Four    Five     Six   Eight    Nine     Ten  Twelve Sixteen
      10%            11       6       4       3       3       2       2       2       2       1       1
      15%            17       9       6       5       4       3       3       2       2       2       2
      25%            29      15      10       8       6       5       4       4       3       3       2
      50%            69      35      23      18      14      12       9       8       7       6       5
      60%            92      46      31      23      19      16      12      11      10       8       6
      75%           138      69      46      35      28      23      18      16      14      12       9
      85%           189      95      63      48      38      32      24      21      19      16      12
      90%           230     115      77      58      46      39      29      26      23      20      15
      95%           299     150     100      75      60      50      38      34      30      25      19
      99%           459     230     153     115      92      77      58      51      46      39      29

  5.0% Network Compromise:
  Sybil Success     One     Two   Three    Four    Five     Six   Eight    Nine     Ten  Twelve Sixteen
      10%             3       2       1       1       1       1       1       1       1       1       1
      15%             4       2       2       1       1       1       1       1       1       1       1
      25%             6       3       2       2       2       1       1       1       1       1       1
      50%            14       7       5       4       3       3       2       2       2       2       1
      60%            18       9       6       5       4       3       3       2       2       2       2
      75%            28      14      10       7       6       5       4       4       3       3       2
      85%            37      19      13      10       8       7       5       5       4       4       3
      90%            45      23      15      12       9       8       6       5       5       4       3
      95%            59      30      20      15      12      10       8       7       6       5       4
      99%            90      45      30      23      18      15      12      10       9       8       6

  10.0% Network Compromise:
  Sybil Success     One     Two   Three    Four    Five     Six   Eight    Nine     Ten  Twelve Sixteen
      10%             2       1       1       1       1       1       1       1       1       1       1
      15%             2       1       1       1       1       1       1       1       1       1       1
      25%             3       2       1       1       1       1       1       1       1       1       1
      50%             7       4       3       2       2       2       1       1       1       1       1
      60%             9       5       3       3       2       2       2       1       1       1       1
      75%            14       7       5       4       3       3       2       2       2       2       1
      85%            19      10       7       5       4       4       3       3       2       2       2
      90%            22      11       8       6       5       4       3       3       3       2       2
      95%            29      15      10       8       6       5       4       4       3       3       2
      99%            44      22      15      11       9       8       6       5       5       4       3

The rotation counts in these tables were generated with:

     def num_rotations(c, v, success):
       r = 0
       while 1-math.pow((1-c), v*r) < success: r += 1
       return r

3.2. Rotation Period

As specified in Section 1.3, the primary driving force for the third layer selection was to ensure that these nodes rotate fast enough that it is not worth trying to compromise them, because it is unlikely for compromise to succeed and yield useful information before the nodes stop being used.

From the table in Section 3.1, with NUM_LAYER2_GUARDS=4 and NUM_LAYER3_GUARDS=6, it can be seen that this means that the Sybil attack on layer3 will complete with 50% chance in 12*31.5 hours (15.75 days) for the 1% adversary, ~4 days for the 5% adversary, and 2.62 days for the 10% adversary.

Since rotation of each node happens independently, the distribution of when the adversary expects to win this Sybil attack in order to discover the next node up is uniform. This means that on average, the adversary should expect that half of the rotation period of the next node is already over by the time that they win the Sybil.

With this fact, we choose our range and distribution for the second layer rotation to be short enough to cause the adversary to risk compromising nodes that are useless, yet long enough to require a Sybil attack to be noticeable in terms of client activity. For this reason, we choose a minimum second-layer guard lifetime of 1 day, since this gives the adversary a minimum expected value of 12 hours during which they can compromise a guard before it might be rotated. If the total expected rotation rate is 29.5 days, then the adversary can expect overall to have 14.75 days remaining after completing their Sybil attack before a second-layer guard rotates away.

3.3. Rotation distributions

In order to skew the distribution of the third layer guard towards higher values, we use max(X,X) for the distribution, where X is a random variable that takes on values from the uniform distribution.

Here's a table of expectation (arithmetic means) for relevant ranges of X (sampled from 0..N-1). The table was generated with the following python functions:

     def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
     def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

     def ExpFn(N, ProbFunc):
       exp = 0.0
       for i in xrange(N): exp += i*ProbFunc(N, i)
       return exp

The current choice for second-layer guards is noted with **, and the current choice for third-layer guards is noted with ***.

   Range  Min(X,X)   Max(X,X)
   40      12.84       26.16
   41      13.17       26.83
   42      13.50       27.50
   43      13.84       28.16
   44      14.17       28.83
   45      14.50       29.50**
   46      14.84       30.16
   47      15.17       30.83
   48      15.50       31.50***

The Cumulative Distribution Function (CDF) tells us the probability that a guard will no longer be in use after a given number of time units have passed.

Because the Sybil attack on the third node is expected to complete at any point in the second node's rotation period with uniform probability, if we want to know the probability that a second-level Guard node will still be in use after t days, we first need to compute the probability distribution of the rotation duration of the second-level guard at a uniformly random point in time. Let's call this P(R=r).

For P(R=r), the probability of the rotation duration depends on the selection probability of a rotation duration, and the fraction of total time that rotation is likely to be in use. This can be written as:

  P(R=r) = ProbMaxXX(X=r)*r / \sum_{i=1}^N ProbMaxXX(X=i)*i

or in Python:

     def ProbR(N, r, ProbFunc=ProbMaxXX):
       return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

For the full CDF, we simply sum up the fractional probability density for all rotation durations. For rotation durations less than t days, we add the entire probability mass for that period to the density function.
For durations d greater than t days, we take the fraction of that rotation period's selection probability and multiply it by t/d and add it to the density. In other words:

     def FullCDF(N, t, ProbFunc=ProbR):
       density = 0.0
       for d in xrange(N):
         if t >= d: density += ProbFunc(N, d)
         # The +1's below compensate for 0-indexed arrays:
         else: density += ProbFunc(N, d)*(float(t+1))/(d+1)
       return density

Computing this yields the following distribution for our current parameters:

      t          P(SECOND_ROTATION <= t)
      1               0.03247
      2               0.06494
      3               0.09738
      4               0.12977
      5               0.16207
     10               0.32111
     15               0.47298
     20               0.61353
     25               0.73856
     30               0.84391
     35               0.92539
     40               0.97882
     45               1.00000

This CDF tells us that for the second-level Guard rotation, the adversary can expect that 3.3% of the time, their third-level Sybil attack will provide them with a second-level guard node that has only 1 day remaining before it rotates. 6.5% of the time, there will be only 2 days or less remaining, and 9.7% of the time, 3 days or less.

Note that this distribution is still a day-resolution approximation.

4. Security concerns and mitigations

4.1. Mitigating fingerprinting of new HS circuits

By pinning the middle nodes of rendezvous circuits, we make it easier for all hops of the circuit to detect that they are part of a special hidden service circuit with varying degrees of certainty.

The Guard node is able to recognize a Vanguard client with a high degree of certainty because it will observe a client IP creating the overwhelming majority of its circuits to just a few middle nodes in any given 31.5 day time period.

The middle nodes will be able to tell with a variable certainty that depends on both their traffic volume and upon the popularity of the service, because they will see a large number of circuits that tend to pick the same Guard and Exit.
The final nodes will be able to tell with a similar level of certainty that depends on their capacity and the service popularity, because they will see a lot of handshakes that all tend to have the same second hops.

The most serious of these is the Guard fingerprinting issue. When proposal 254-padding-negotiation is implemented, services that enable this feature should use those padding primitives to create fake circuits to random middle nodes that are not their guards, in an attempt to look more like a client.

Additionally, if Tor Browser implements "virtual circuits" based on SOCKS username+password isolation in order to enforce the re-use of paths when SOCKS username+passwords are re-used, then the number of middle nodes in use during a typical user's browsing session will be proportional to the number of sites they are viewing at any one time. This is likely to be much lower than one new middle node every ten minutes, and for some users, may be close to the number of Vanguards we're considering.

This same reasoning is also an argument for increasing the number of second-level guards beyond just two, as it will spread the hidden service's traffic over a wider set of middle nodes, making it both easier to cover, and behave closer to a client using SOCKS virtual circuit isolation.

5. Default vs optional behavior

We suggest that this torrc option be optional because it changes path selection in a way that may seriously impact hidden service performance, especially for high traffic services that happen to pick slow guard nodes.

However, by having this setting be disabled by default, we make hidden services that use it stand out a lot. For this reason, we should in fact enable this feature globally, but only after we verify its viability for high-traffic hidden services, and ensure that it is free of second-order load balancing effects.
Even after that point, until Single Onion Services are implemented, there will likely still be classes of very high traffic hidden services for whom some degree of location anonymity is desired, but for which performance is much more important than the benefit of Vanguards, so there should always remain a way to turn this option off.

In the meantime, a reference implementation is available at:
  https://github.com/mikeperry-tor/vanguards/blob/master/vanguards/vanguards.py

6. Acknowledgements

This research was supported in part by NSF grants CNS-1111539, CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

Appendix A: Full Python program for generating tables in this proposal

  #!/usr/bin/python
  import math

  ############ Section 3.1 #################

  def num_rotations(c, v, success):
    i = 0
    while 1-math.pow((1-c), v*i) < success: i += 1
    return i

  def rotation_line(c, pct):
    print " %2d%% %6d%6d%6d%6d%6d%6d%6d%6d%6d%6d%8d" % \
      (pct, num_rotations(c, 1, pct/100.0), num_rotations(c, 2, pct/100.0), \
       num_rotations(c, 3, pct/100.0), num_rotations(c, 4, pct/100.0),
       num_rotations(c, 5, pct/100.0), num_rotations(c, 6, pct/100.0),
       num_rotations(c, 8, pct/100.0), num_rotations(c, 9, pct/100.0),
       num_rotations(c, 10, pct/100.0), num_rotations(c, 12, pct/100.0),
       num_rotations(c, 16, pct/100.0))

  def rotation_table_31():
    for c in [1,5,10]:
      print "\n %2.1f%% Network Compromise: " % c
      print " Sybil Success One Two Three Four Five Six Eight Nine Ten Twelve Sixteen"
      for success in [10,15,25,50,60,75,85,90,95,99]:
        rotation_line(c/100.0, success)

  ############ Section 3.3 #################

  def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
  def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

  def ExpFn(N, ProbFunc):
    exp = 0.0
    for i in xrange(N): exp += i*ProbFunc(N, i)
    return exp

  def ProbUniformX(N, i): return 1.0/N

  def ProbR(N, r, ProbFunc=ProbMaxXX):
    return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

  def FullCDF(N, t, ProbFunc=ProbR):
    density = 0.0
    for d in xrange(N):
      if t >= d: density += ProbFunc(N, d)
      # The +1's below compensate for 0-indexed arrays:
      else: density += ProbFunc(N, d)*float(t+1)/(d+1)
    return density

  def expectation_table_33():
    print "\n Range Min(X,X) Max(X,X)"
    for i in xrange(10,49):
      print " %2d %2.2f %2.2f" % (i, ExpFn(i,ProbMinXX), ExpFn(i, ProbMaxXX))

  def CDF_table_33():
    print "\n t P(SECOND_ROTATION <= t)"
    for i in xrange(1,46):
      print " %2d %2.5f" % (i, FullCDF(45, i-1))

  ########### Output ############

  # Section 3.1
  rotation_table_31()

  # Section 3.3
  expectation_table_33()
  CDF_table_33()

----------------------

1. https://onionbalance.readthedocs.org/en/latest/design.html#overview
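
The Appendix A program is written for Python 2. For readers who only have Python 3 available, the following is a minimal sketch of the Section 3.3 CDF computation. It is a port for illustration, not part of the original proposal; the snake_case function names are ours, and the logic is intended to mirror ProbMaxXX, ExpFn, ProbR and FullCDF above.

```python
def prob_max_xx(n, i):
    """P(max(X, X) == i) for two uniform draws from 0..n-1 (ProbMaxXX)."""
    return (2.0 * i + 1) / (n * n)


def expectation(n, prob_func):
    """Expected value of a distribution over 0..n-1 (ExpFn)."""
    return sum(i * prob_func(n, i) for i in range(n))


def prob_r(n, r, prob_func=prob_max_xx):
    """Length-weighted probability of a rotation period of r (ProbR)."""
    return prob_func(n, r) * r / expectation(n, prob_func)


def full_cdf(n, t, prob_func=prob_r):
    """Day-resolution CDF of the remaining rotation time (FullCDF)."""
    density = 0.0
    for d in range(n):
        if t >= d:
            density += prob_func(n, d)
        else:
            # The +1's compensate for 0-indexed arrays, as in the original.
            density += prob_func(n, d) * float(t + 1) / (d + 1)
    return density


if __name__ == "__main__":
    print(" t  P(SECOND_ROTATION <= t)")
    for t in (1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45):
        print("%2d  %.5f" % (t, full_cdf(45, t - 1)))
```

As in CDF_table_33(), the row labelled t is computed as full_cdf(45, t - 1).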
Filename: 293-know-when-to-publish.txt
Title: Other ways for relays to know when to publish
Author: Nick Mathewson
Created: 30-May-2018
Status: Closed
Target: 0.3.5
Implemented-In: 0.4.0.1-alpha

[IMPLEMENTATION NOTES: Mechanism one is implemented; mechanism two is rejected.]

1. Motivation

In proposal 275, we give reasons for dropping the published-on field from consensus documents, to improve the performance of consensus diffs. We've already changed Tor (as of 0.2.9.11) to allow us to set those fields far in the future -- but unfortunately, there is still one use case that requires them: relays use the published-on field to tell if they are about to fall out of the consensus and need to make new descriptors.

Here we propose two alternative mechanisms for relays to know that they should publish descriptors, so we can enact proposal 275 and set the published-on field to some time in the distant future.

2. Mechanism One: The StaleDesc flag

Authorities should begin voting on a new StaleDesc flag. When authorities vote, if the most recent published_on date for a descriptor is over DESC_IS_STALE_INTERVAL in the past, the authorities should vote to give the StaleDesc flag to that relay.

If any relay sees that it has the StaleDesc flag, it should upload some time in the first half of the voting interval. (Implementors should take care not to re-upload over and over, though: Relays won't lose the flag until the next voting interval is reached.)

(Define DESC_IS_STALE_INTERVAL as equal to FORCE_REGENERATE_DESCRIPTOR_INTERVAL.)

3. Mechanism Two: Uploading more frequently when rejected.

Tor relays should remember the last time at which they uploaded a descriptor that was accepted by a majority of dirauths. If this time is more than FAST_RETRY_DESCRIPTOR_INTERVAL in the past, we mark our descriptor as dirty from mark_my_descriptor_dirty_if_too_old().

4. Implications for proposal 275

Once most relays are running versions that support the features above, and once authorities are generating consensuses with the StaleDesc flag, there will no longer be a need to keep the published time in consensus documents accurate -- we can start setting it to some time in the distant future, per proposal 275.
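
The StaleDesc rule from Mechanism One can be sketched as follows. This is an illustration only, not Tor's implementation: the function names are hypothetical, times are plain Unix timestamps in seconds, and the 18-hour value for FORCE_REGENERATE_DESCRIPTOR_INTERVAL is an assumption that should be checked against the Tor source.

```python
import random

# Assumption: 18 hours, in seconds. Verify against the Tor source tree.
FORCE_REGENERATE_DESCRIPTOR_INTERVAL = 18 * 60 * 60
DESC_IS_STALE_INTERVAL = FORCE_REGENERATE_DESCRIPTOR_INTERVAL


def should_vote_stale_desc(published_on, now,
                           stale_interval=DESC_IS_STALE_INTERVAL):
    """Authority side: vote StaleDesc if the newest descriptor's
    published_on time is more than stale_interval in the past."""
    return now - published_on > stale_interval


def pick_upload_time(interval_start, voting_interval, rng=random):
    """Relay side (hypothetical helper): choose a single upload time in the
    first half of the current voting interval. The caller should schedule
    only one upload per interval, since the flag persists until the next
    vote."""
    return interval_start + rng.uniform(0, voting_interval / 2.0)
```

Picking a random point in the first half of the interval, rather than uploading immediately, is one way to avoid every flagged relay uploading at the same instant; the proposal itself only requires that the upload happen in the first half.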
Filename: 294-tls-1.3.txt Title: TLS 1.3 Migration Authors: Isis Lovecruft Created: 11 December 2017 Updated: 23 January 2018 Status: Draft This proposal is currently in draft state and should be periodically revised as we research how much of our idiosyncratic older TLS uses can be removed. 1. Motivation TLS 1.3 is a substantial redesign over previous versions of TLS, with several significant protocol changes which should likely provide Tor implementations with not only greater security but an improved ability to blend into "normal" TLS traffic on the internet, due to its improvements in encrypting more portions of the handshake. Tor implementations may utilise the new TLS 1.3 EncryptedExtensions feature to define arbitrary encrypted TLS extensions to encompass our less standard (ab)uses of TLS. Additionally, several new Elliptic Curve (EC) based signature algorithms, including Ed25519 and Ed448, are included within the base specification including a single specification for EC point compression for each supported curve, further decreasing our reliance on Tor-protocol-specific uses and extensions (and implementation details). Other new features which Tor implementations might take advantage of include improved (server-side) stateless session resumption, which might be usable for OPs to resume sessions with their guards, for example after network disconnection or router IP address reassignment. 2. Summary Everything that's currently TLS 1.2: make it use TLS 1.3. KABLAM. DONE. For an excellent summary of differences between TLS 1.2 and TLS 1.3, see [TLS-1.3-DIFFERENCES]. 3. Specification 3.1. Link Subprotocol 4 (We call it "Link v4" here, but reserve whichever is the subsequently available subprotocol version at the time.) 3.2. TLS Session Resumption & Compression As before, implementations MUST NOT allow TLS session resumption. 
In the event that it might be decided in the future that OR implementations would benefit from 0-RTT, we can re-evaluate this decision and its security considerations in a separate proposal.

Compression has been removed from TLS in version 1.3, so we no longer need to make recommendations against its usage.

3.3. Handshake Protocol

3.3.1. Negotiation

The initiator sends the following four sets of options, as defined in §4.1.1 of [TLS-1.3-NEGOTIATION]:

  >
  > - A list of cipher suites which indicates the AEAD algorithm/HKDF hash
  >   pairs which the client supports.
  > - A “supported_groups” (Section 4.2.7) extension which indicates the
  >   (EC)DHE groups which the client supports and a “key_share” (Section 4.2.8)
  >   extension which contains (EC)DHE shares for some or all of these groups.
  > - A “signature_algorithms” (Section 4.2.3) extension which indicates the
  >   signature algorithms which the client can accept.
  > - A “pre_shared_key” (Section 4.2.11) extension which contains a list of
  >   symmetric key identities known to the client and a “psk_key_exchange_modes”
  >   (Section 4.2.9) extension which indicates the key exchange modes that may be
  >   used with PSKs.

In our case, the initiator MUST leave the PSK section blank and MUST include the "key_share" extension, and the responder proceeds to select an ECDHE group, including its "key_share" in the response ServerHello.

3.3.2. ClientHello

To initiate a v4 handshake, the client sends a TLS1.3 ClientHello with the following options:

  - The "legacy_version" field MUST be set to "TLS 1.2 (0x0303)". TLS 1.3 REQUIRES this. (Actual version negotiation is done via the "supported_versions" extension. See §5.1 of this proposal for details of the case where a TLS-1.3 capable initiator finds themself talking to a node which does not support TLS 1.3 and/or doesn't support v4.)

  - The "random" field MUST be filled with 32 bytes of securely generated randomness.

  - The "legacy_session_id" MUST be set to a new pseudorandom value each time, regardless of whether the initiator has previously opened either a TLS1.2 or TLS1.3 connection to the other side.

  - The "legacy_compression_methods" MUST be set to a single null byte, indicating no compression is supported. (This is the only valid setting for this field in TLS1.3, since there is no longer any compression support.)

  - The "cipher_suites" should be set to "TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-128-GCM-SHA256:TLS13-AES-256-GCM-SHA384:" This is the DEFAULT cipher suite list for OpenSSL 1.1.1. While an argument could be made for customisation to remove the AES-128 option, we choose to attempt to blend in with the majority of other TLS-1.3 clients, since this portion of the handshake is unencrypted.

    (If the initiator actually means to begin a v3 protocol connection, they send these ciphersuites anyway, cf. §5.2 of this proposal.)

  - The "supported_groups" MUST include "X25519" and SHOULD NOT include any of the NIST P-* groups.

  - The "signature_algorithms" MUST include "ed25519 (0x0807)". Implementations MAY advertise support for other signature schemes, including "ed448 (0x0808)"; however, they MUST NOT advertise support for ECDSA schemes due to the perils of secure implementation.

The initiator MUST NOT send any "pre_shared_key" or "psk_key_exchange_modes" extensions.

The details of the "signature_algorithms" choice depend upon the final standardisation of PKIX. [IETF-PKIX]

3.3.2.1. ClientHello Extensions

From [TLS-1.3_SIGNATURE_ALGOS]:

  >
  > The “signature_algorithms_cert” extension was added to allow implementations
  > which supported different sets of algorithms for certificates and in TLS itself
  > to clearly signal their capabilities. TLS 1.2 implementations SHOULD also
  > process this extension.
In order to support cross-certification, the initiator's ClientHello MUST include the "signature_algorithms_cert" extension, in order to signal that some certificate chains (one in particular) will include a certificate signed using RSA-PKCSv1-SHA1: - The "signature_algorithms_cert" MUST include the legacy algorithm "rsa_pkcs1_sha1(0x0201)". 3.3.3. ServerHello To respond to a TLS 1.3 ClientHello which supports the v4 link handshake protocol, the responder sends a ServerHello with the following options: - The "legacy_version" field MUST be set to "TLS 1.2 (0x0303)". TLS 1.3 REQUIRES this. (Actual version negotiation is done via the "supported_versions" extension. See §5.1 of this proposal for details of the case where a TLS-1.3 capable initiator finds themself talking to a node which does not support TLS 1.3 and/or doesn't support v4.) - The "random" field MUST be filled with 32 bytes of securely generated randomness. - The "legacy_session_id_echo" field MUST be filled with the contents of the "legacy_session_id" from the initiator's ClientHello. - The "cipher_suite" field MUST be set to "TLS13-CHACHA20-POLY1305-SHA256". - The "legacy_compression_method" MUST be set to a single null byte, indicating no compression is supported. (This is the only valid setting for this field in TLS1.3, since there is no longer any compression support.) XXX structure and "key_share" response (XXX can we pre-generate a cache of XXX key_shares?) 3.3.3.1 ServerHello Extensions XXX what extensions do we need? 4. Implementation Details 4.1. Certificate Chains and Cross-Certifications TLS 1.3 specifies that a certificate in a chain SHOULD be directly certified by the preceding certificate in the chain. This seems to imply that OR implementations SHOULD NOT do the DAG-like construction normally implied by cross-certification between the master Ed25519 identity key and the master RSA-1024 identity key. 
Instead, since certificate chains are expected to be linear, we'll need three certificate chains included in the same handshake:

  1. EdMaster->EdSigning, EdSigning->Link
  2. EdMaster->RSALegacy
  3. RSALegacy->EdMaster

where A->B denotes that the certificate containing B has been signed with key A.

4.2. Removal of AUTHENTICATE, CLIENT_AUTHENTICATE, and CERTS cells

XXX see prop#224 and RFC5705 and compare

XXX when can we remove our "renegotiation" handshake completely?

5. Compatibility

5.1. TLS 1.2 version negotiation

From [TLS-1.3-DIFFERENCES]:

  >
  > The “supported_versions” ClientHello extension can be used to
  > negotiate the version of TLS to use, in preference to the
  > legacy_version field of the ClientHello.

If an OR does not receive a ClientHello with a "supported_versions" extension, it MUST fall back to using the Tor Link subprotocol v3. That is, the OR MUST immediately fall back to TLS 1.2 (or v3 with TLS 1.3, cf. the next section) and, following both Tor's "renegotiation" and "in-protocol" version negotiation mechanisms, immediately send a VERSIONS cell.

Otherwise, upon seeing a "supported_versions" in the ClientHello set to 0x0304, the OR should proceed with Tor's Link subprotocol 4.

5.2. Preparing Tor's v3 Link Subprotocol for TLS 1.3

Some changes to the current v3 Link protocol are required, and these MUST be backported, since implementations which are currently compiled against TLS1.3-supporting OpenSSLs fail to establish any connections due to:

  - failing to include any ciphersuite candidates which are TLS1.3 compatible

This is likely to be accomplished by:

  1. Prefacing our v3 ciphersuite lists with TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-128-GCM-SHA256:TLS13-AES-256-GCM-SHA384: (We could also retroactively change our custom cipher suite list to be the HIGH cipher suites, since this includes all TLS 1.3 suites.)

  2. Calling SSL_CTX_set1_groups() to set the supported groups (should be set to "X25519:P-256"). [TLS-1.3_SET1_GROUPS]

  3. Taking care that older OpenSSLs, which instead have the concept of "curves" not groups, should have their equivalent TLS context settings in place. [TLS-1.3_SET1_GROUPS] mentions that "The curve functions were first added to OpenSSL 1.0.2. The equivalent group functions were first added to OpenSSL 1.1.1".

However more steps may need to be taken.

[XXX are there any more steps necessary? --isis]

6. Security Implications

XXX evaluate the static RSA attack and its effects on TLS1.2/TLS1.3
XXX dual-operable protocols and determine if they apply
XXX
XXX Jager, T., Schwenk, J. and J. Somorovsky, "On the Security
XXX of TLS 1.3 and QUIC Against Weaknesses in PKCS#1 v1.5 Encryption",
XXX Proceedings of ACM CCS 2015 , 2015.
XXX https://www.nds.rub.de/media/nds/veroeffentlichungen/2015/08/21/Tls13QuicAttacks.pdf

7. Performance and Scalability

8. Availability and External Deployment

8.1. OpenSSL Availability and Interoperability

Implementation should be delayed until the stable release of OpenSSL 1.1.1.

OpenSSL 1.1.1 will be binary and API compatible with OpenSSL 1.1.0, so in preparation we might wish to revise our current usage to OpenSSL 1.1.0.

From Matt Caswell in [OPENSSL-BLOG-TLS-1.3]:

  >
  > OpenSSL 1.1.1 will not be released until (at least) TLSv1.3 is
  > finalised. In the meantime the OpenSSL git master branch contains
  > our development TLSv1.3 code which can be used for testing purposes
  > (i.e. it is not for production use). You can check which draft
  > TLSv1.3 version is implemented in any particular OpenSSL checkout
  > by examining the value of the TLS1_3_VERSION_DRAFT_TXT macro in the
  > tls1.h header file. This macro will be removed when the final
  > version of the standard is released.
  >
  > In order to compile OpenSSL with TLSv1.3 support you must use the
  > “enable-tls1_3” option to “config” or “Configure”.
  >
  > Currently OpenSSL has implemented the “draft-20” version of
  > TLSv1.3. Many other libraries are still using older draft versions in
  > their implementations. Notably many popular browsers are using
  > “draft-18”. This is a common source of interoperability
  > problems. Interoperability of the draft-18 version has been tested
  > with BoringSSL, NSS and picotls.
  >
  > Within the OpenSSL git source code repository there are two branches:
  > “tls1.3-draft-18” and “tls1.3-draft-19”, which implement the older
  > TLSv1.3 draft versions. In order to test interoperability with other
  > TLSv1.3 implementations you may need to use one of those
  > branches. Note that those branches are considered temporary and are
  > likely to be removed in the future when they are no longer needed.

At the time of its release, we may wish to test interoperability with other implementation(s).

9. Future Directions

The implementation of this proposal would greatly ease the implementation difficulty and maintenance requirements for some other possible future beneficial areas of work.

9.1. TLS Handshake Composability

Handshake composition (i.e. hybrid handshakes) in TLS 1.3 is incredibly straightforward.

For example, provided we had a Supersingular Isogeny Diffie-Hellman (SIDH) based implementation with a sane API, composition of Elliptic Curve Diffie-Hellman (ECDH) and SIDH handshakes would be a trivial codebase addition (~10-20 lines of code, for others who have implemented this).

Our current circuit-layer protocol safeguards the majority of our security and anonymity guarantees, while our TLS layer has historically been either a stop-gap and/or an attempted (albeit usually not-so-successful) obfuscation mechanism. However, in many cases our TLS usage, in combination with the circuit-layer cryptography, has successfully prevented more than a few otherwise horrendous bugs.
After our circuit-layer protocol is upgraded to a hybrid post-quantum secure protocol (prop#269 and prop#XXX), and in order to ensure that our TLS layer continues to act in this manner as a stop gap — including within threat models which include adversaries capable of recording traffic now and decrypting with a potential quantum computer in the future — our TLS layer should also provide safety against such a quantum-capable adversary. A. References [TLS-1.3-DIFFERENCES]: https://tlswg.github.io/tls13-spec/draft-ietf-tls-tls13.html#rfc.section.1.3 [OPENSSL-BLOG-TLS-1.3]: https://www.openssl.org/blog/blog/2017/05/04/tlsv1.3/ [TLS-1.3-NEGOTIATION]: https://tlswg.github.io/tls13-spec/draft-ietf-tls-tls13.html#rfc.section.4.1.1 [IETF-PKIX]: https://datatracker.ietf.org/doc/draft-ietf-curdle-pkix/ [TLS-1.3_SET1_GROUPS]: https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set1_groups.html [TLS-1.3_SIGNATURE_ALGOS]: https://tlswg.github.io/tls13-spec/draft-ietf-tls-tls13.html#signature-algorithms
Filename: 295-relay-crypto-with-adl.txt
Title: Using ADL for relay cryptography (solving the crypto-tagging attack)
Author: Tomer Ashur, Orr Dunkelman, Atul Luykx
Created: 22 Feb 2018
Last-Modified: 13 Jan. 2020
Status: Open

0. Context

Although Crypto Tagging Attacks were identified already in the original Tor design, it was not until the rise of the Procyonidae in 2012 that their severity was fully realized. In Proposal 202 (Two improved relay encryption protocols for Tor cells) Nick Mathewson discussed two approaches to stymie tagging attacks and generally improve Tor's cryptography. In Proposal 261 (AEZ for relay cryptography) Mathewson puts forward a concrete approach which uses the tweakable wide-block cipher AEZ.

This proposal suggests an alternative approach to Proposal 261 using the notion of Release (of) Unverified Plaintext (RUP) security. It describes an improved algorithm for circuit encryption based on CTR-mode, which is already used in Tor, and an additional component for hashing.

Incidentally, and similar to Proposal 261, this proposal employs the ENCODE-then-ENCIPHER approach; thus it improves Tor's E2E integrity by using (sufficient) redundancy.

For more information about the scheme and a security proof for its RUP-security see

   Tomer Ashur, Orr Dunkelman, Atul Luykx: Boosting Authenticated Encryption Robustness with Minimal Modifications. CRYPTO (3) 2017: 3-33

available online at https://eprint.iacr.org/2017/239 .

For authentication between the OP and the edge node we use the PIV scheme: https://eprint.iacr.org/2013/835 .

A recent paper presented a birthday bound distinguisher against the ADL scheme, thus showing that the RUP security proof is tight: https://eprint.iacr.org/2019/1359 .

2. Preliminaries

2.1 Motivation

For motivation, see proposal 202.

2.2. Notation

   Symbol     Meaning
   ------     -------
   M          Plaintext
   C_I        Ciphertext
   CTR        Counter Mode
   N_I        A de/encryption nonce (to be used in CTR-mode)
   T_I        A tweak (to be used to de/encrypt the nonce)
   Tf'_I      A running digest (forward direction)
   Tb'_I      A running digest (backward direction)
   ^          XOR
   ||         Concatenation
              (This is more readable than a single | but must be adapted
              before integrating the proposal into tor-spec.txt)

2.3. Security parameters

   HASH_LEN -- The length of the hash function's output, in bytes.

   PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509)

   DIG_KEY_LEN -- The key length used to digest messages (e.g., using GHASH). Since GHASH is only defined for 128-bit keys, we recommend DIG_KEY_LEN = 128.

   ENC_KEY_LEN -- The key length used for encryption (e.g., AES). We recommend ENC_KEY_LEN = 256.

2.4. Key derivation (replaces Section 5.2.2 in Tor-spec.txt)

For newer KDF needs, Tor uses the key derivation function HKDF from RFC5869, instantiated with SHA256. The generated key material is:

   K = K_1 | K_2 | K_3 | ...

where, if H(x,t) denotes HMAC_SHA256 with value x and key t, and m_expand denotes an arbitrarily chosen value, and INT8(i) is an octet with the value "i", then

   K_1     = H(m_expand | INT8(1) , KEY_SEED )

and

   K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED ).

In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand, salt == t_key, and IKM == secret_input.

When used in the ntor handshake a string of key material is generated and is used in the following way:

   Length        Purpose                                         Notation
   ------        -------                                         --------
   HASH_LEN      forward authentication digest IV                AF
   HASH_LEN      forward digest IV                               DF
   HASH_LEN      backward digest IV                              DB
   ENC_KEY_LEN   encryption key                                  Kf
   ENC_KEY_LEN   decryption key                                  Kb
   DIG_KEY_LEN   forward digest key                              Khf
   DIG_KEY_LEN   backward digest key                             Khb
   ENC_KEY_LEN   forward tweak key                               Ktf
   ENC_KEY_LEN   backward tweak key                              Ktb
   DIGEST_LEN    nonce to use in the hidden service protocol(*)

   (*) I am not sure whether this is still needed.

Excess bytes from K are discarded.

2.6. Ciphers

For hashing(*) we use GHASH(**) with a DIG_KEY_LEN-bit key. We write this as Digest(K,M) where K is the key and M the message to be hashed.

We use AES with an ENC_KEY_LEN-bit key. For AES encryption (resp., decryption) we write E(K,X) (resp., D(K,X)) where K is an ENC_KEY_LEN-bit key and X the block to be encrypted (resp., decrypted).

For a stream cipher, unless otherwise specified, we use ENC_KEY_LEN-bit AES in counter mode, with a nonce that is generated as explained below. We write this as Encrypt(K,N,X) (resp., Decrypt(K,N,X)) where K is the key, N the nonce, and X the message to be encrypted (resp., decrypted).

   (*) The terms hash and digest are used interchangeably.
   (**) Proposal 308 suggested that using POLYVAL [GLL18] would be more efficient here. This proposal will work just the same if POLYVAL is used instead of GHASH.

3. Routing relay cells

Let n denote the integer representing the destination node. For I = 1...n, we set Tf'_{I} = DF_I, Tb'_{I} = DB_I, and Ta'_I = AF_I where DF_I, DB_I, and AF_I are generated according to Section 2.4.

3.1. Forward Direction

The forward direction is the direction that CREATE/CREATE2 cells are sent.

3.1.1. Routing from the origin

When an OP sends a relay cell, they prepare the cell as follows:

The OP prepares the authentication part of the message:

      C_{n+1} = M
      Ta_I = Digest(Khf_n,Ta'_I||C_{n+1})
      N_{n+1} = Ta_I ^ E(Ktf_n,Ta_I ^ 0)
      Ta'_{I} = Ta_I

Then, the OP prepares the multi-layered encryption:

   For I=n...1:
      C_I = Encrypt(Kf_I,N_{I+1},C_{I+1})
      T_I = Digest(Khf_I,Tf'_I||C_I)
      N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
      Tf'_I = T_I

The OP sends C_1 and N_1 to node 1.

3.1.2. Relaying forward at onion routers

When a forward relay cell is received by OR_I, it decrypts the payload with the stream cipher, as follows:

   'Forward' relay cell:

      T_I = Digest(Khf_I,Tf'_I||C_I)
      N_{I+1} = T_I ^ D(Ktf_I,T_I ^ N_I)
      C_{I+1} = Decrypt(Kf_I,N_{I+1},C_I)
      Tf'_I = T_I

The OR then decides whether it recognizes the relay cell as described below.
If the OR recognizes the cell, it processes the contents of the relay cell. Otherwise, it passes C_{I+1}||N_{I+1} along the circuit if the circuit continues.

For more information, see section 4 below.

3.2. Backward direction

The backward direction is the opposite direction from CREATE/CREATE2 cells.

3.2.1. Relaying backward at onion routers

When a backward relay cell is received by OR_I, it encrypts the payload with the stream cipher, as follows:

   'Backward' relay cell:

      T_I = Digest(Khb_I,Tb'_I||C_{I+1})
      N_I = T_I ^ E(Ktb_I,T_I ^ N_{I+1})
      C_I = Encrypt(Kb_I,N_I,C_{I+1})
      Tb'_I = T_I

with C_{n+1} = M and N_{n+1}=0. Once encrypted, the node passes C_I and N_I along the circuit towards the OP.

3.2.2. Routing to the origin

When a relay cell arrives at an OP, the OP decrypts the payload with the stream cipher as follows:

   OP receives relay cell from node 1:

      For I=1...n, where n is the end node on the circuit:
         C_{I+1} = Decrypt(Kb_I,N_I,C_I)
         T_I = Digest(Khb_I,Tb'_I||C_{I+1})
         N_{I+1} = T_I ^ D(Ktb_I,T_I ^ N_I)
         Tb'_I = T_I

      If the payload is recognized (see Section 4.1), then:

         The sending node is I. Stop, process the payload and authenticate.

4. Application connections and stream management

4.1. Relay cells

Within a circuit, the OP and the end node use the contents of RELAY packets to tunnel end-to-end commands and TCP connections ("Streams") across circuits. End-to-end commands can be initiated by either edge; streams are initiated by the OP.

The payload of each unencrypted RELAY cell consists of:

      Relay command           [1 byte]
      StreamID                [2 bytes]
      Length                  [2 bytes]
      Data                    [PAYLOAD_LEN-21 bytes]

The old Digest field is removed since sufficient information for authentication is now included in the nonce part of the payload.

The old 'Recognized' field is removed and the node always tries to authenticate the message as follows.

4.1.1 forward direction (executed by the end node):

      Ta_I = Digest(Khf_n,Ta'_I||C_{n+1})
      Tag = Ta_I ^ D(Ktf_n,Ta_I ^ N_{n+1})

      If Tag = 0:
         Ta'_I = Ta_I
         The message is authenticated.
      Otherwise:
         Ta'_I remains unchanged.
         The message is not authenticated.

4.1.2 backward direction (executed by the OP):

The message is recognized and authenticated (i.e., C_{n+1} = M) if and only if N_{n+1} = 0.

The 'Length' field of a relay cell contains the number of bytes in the relay payload which contain real payload data. The remainder of the payload is padding bytes.

4.2. Appending the encrypted nonce and dealing with version-homogenic and version-heterogenic circuits

When a cell is prepared to be routed from the origin (see Section 3.1.1 above) the encrypted nonce N is appended to the encrypted cell (occupying the last 16 bytes of the cell). If the cell is prepared to be sent to a node supporting the new protocol, N is used to generate the layer's nonce. Otherwise, if the node only supports the old protocol, N is still appended to the encrypted cell (so that following nodes can still recover their nonce), but a synchronized nonce (as per the old protocol) is used in CTR-mode.

When a cell is sent along the circuit in the 'backward' direction, nodes supporting the new protocol always assume that the last 16 bytes of the input are the nonce used by the previous node, which they process as per Section 3.2.1. If the previous node also supports the new protocol, these cells are indeed the nonce. If the previous node only supports the old protocol, these bytes are either encrypted padding bytes or encrypted data.

5. Security

5.1. Resistance to crypto-tagging attacks

A crypto-tagging attack involves a circuit with two colluding nodes and at least one honest node between them. The attack works when one node makes a change to the cell (tagging) in a way that can be undone by the other colluding party. In between, the tagged cell is processed by honest nodes which do not detect the change.
The attack is possible due to the malleability property of CTR-mode: a change to a ciphertext bit affects only the respective plaintext bit in a predictable way. This proposal frustrates the crypto-tagging attack by linking the nonce to the encrypted message such that any change to the ciphertext results in a random nonce and hence, random plaintext.

Let us consider the following 3-hop scenario: the entry and end nodes are malicious and colluding and the middle node is honest.

5.1.1. forward direction

Suppose that node I tags the ciphertext part of the message (C'_{I+1} != C_{I+1}) then forwards it to the next node (I+1). As per Section 3.1.2, Node I+1 digests C'_{I+1} to generate T_{I+1} and N_{I+2}. Since C'_{I+1} is different from what it should be, so are the resulting T_{I+1} and N_{I+2}. Hence, decrypting C'_{I+1} using these values results in a random string for C_{I+2}. Since C_{I+2} is now just a random string, it is decrypted into a random string and cannot be authenticated. Furthermore, since C'_{I+1} is different than what it should be, Tf'_{I+1} (i.e., the running digest of the middle node) is now out of sync with that of the OP, which means that all future cells sent through this node will decrypt into garbage (random strings).

Likewise, suppose that instead of tagging the ciphertext, Node I tags the encrypted nonce N'_{I+1} != N_{I+1}. Now, when Node I+1 digests the payload the tweak T_{I+1} is fine, but using it to decrypt N'_{I+1} again results in a random nonce for N_{I+2}. This random nonce is used to decrypt C_{I+1} into a random C'_{I+2} which cannot be authenticated by the end node. Since C_{I+2} is a random string, the running digest of the end node is now out of sync with that of the OP, which prevents the end node from decrypting further cells.

5.1.2. Backward direction

In the backward direction the tagging is done by Node I+2 and the untagging by Node I. Suppose first that Node I+2 tags the ciphertext C_{I+2} and sends it to Node I+1.
As per Section 3.2.1, Node I+1 first digests C_{I+2} and uses the resulting T_{I+1} to generate a nonce N_{I+1}. From this it is clear that any change introduced by Node I+2 influences the entire payload and cannot be removed by Node I.

Unlike in Section 5.1.1., the cell is blindly delivered by Node I to the OP which decrypts it. However, since the payload leaving the end node was modified, the message cannot be authenticated by the OP which can be trusted to tear down the circuit.

Suppose now that tagging is done by Node I+2 to the nonce part of the payload, i.e., N_{I+2}. Since this value is encrypted by Node I+1 to generate its own nonce N_{I+1}, again, a random nonce is used which affects the entire keystream of CTR-mode. The cell again cannot be authenticated by the OP and the circuit is torn down.

We note that the end node can modify the plain message before ever encrypting it and this cannot be discovered by the Tor protocol. This vulnerability is outside the scope of this proposal and users should always use TLS to make sure that their application data is encrypted before it enters the Tor network.

5.2. End-to-end authentication

Similar to the old protocol, this proposal only offers end-to-end authentication rather than per-hop authentication. However, unlike the old protocol, the ADL-construction is non-malleable and hence, once a non-authentic message has been processed by an honest node supporting the new protocol, it is effectively destroyed for all nodes further down the circuit. This is because the nonce used to de/encrypt all messages is linked to (a digest of) the payload data.

As a result, while honest nodes cannot detect non-authentic messages, such nodes still destroy the message, thus invalidating its authentication tag when it is checked by edge nodes. Security against crypto-tagging attacks is therefore ensured as long as an honest node supporting the new protocol processes the message between two dishonest ones.

5.3. The running digest

Unlike the old protocol, the running digest is now computed as the output of a GHASH call instead of a hash function call (SHA256). Since GHASH does not provide the same type of security guarantees as SHA256, it is worth discussing why security is not lost from computing the running digest differently.

The running digest is used to ensure that if the same payload is encrypted twice, then the resulting ciphertext does not remain the same. Therefore, all that is needed is that the digest should repeat with low probability. GHASH is a universal hash function, hence it gives such a guarantee assuming its key is chosen uniformly at random.

6. Forward secrecy

Inspired by the approach of Proposal 308, a small modification to this proposal makes it forward secure. The core idea is to replace the encryption key Kf_n after de/encrypting the cell. As an added benefit, this would allow the authentication layer to be kept stateless (i.e., without keeping a running digest for this layer).

Below we present the required changes to the sections above.

6.1. Routing from the Origin (replacing 3.1.1 above)

When an OP sends a relay cell, they prepare the cell as follows:

The OP prepares the authentication part of the message:

      C_{n+1} = M
      T_{n+1} = Digest(Khf_n,C_{n+1})
      N_{n+1} = T_{n+1} ^ E(Ktf_n,T_{n+1} ^ 0)

Then, the OP prepares the multi-layered encryption:

   For the final layer n:
      (C_n,Kf'_n) = Encrypt(Kf_n,N_{n+1},C_{n+1}||0||0) (*)
      T_n = Digest(Khf_n,Tf'_n||C_n)
      N_n = T_n ^ E(Ktf_n,T_n ^ N_{n+1})
      Tf'_n = T_n
      Kf_n = Kf'_n

   (*) CTR mode is used to generate two additional blocks. This 256-bit value is denoted Kf'_n and is used in subsequent steps to replace the encryption key of this layer. To achieve forward secrecy it is important that the obsolete Kf_n is erased in a non-recoverable way.

   For layer I=(n-1)...1:
      C_I = Encrypt(Kf_I,N_{I+1},C_{I+1})
      T_I = Digest(Khf_I,Tf'_I||C_I)
      N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
      Tf'_I = T_I

The OP sends C_1 and N_1 to node 1.

Alternatively, if we want all nodes to use the same functionality, the OP prepares the cell as follows:

   For layer I=n...1:
      (C_I,Kf'_I) = Encrypt(Kf_I,N_{I+1},C_{I+1}||0||0) (*)
      T_I = Digest(Khf_I,Tf'_I||C_I)
      N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
      Tf'_I = T_I
      Kf_I = Kf'_I

   (*) CTR mode is used to generate two additional blocks. This 256-bit value is denoted Kf'_I and is used in subsequent steps to replace the encryption key of this layer. To achieve forward secrecy it is important that the obsolete Kf_I is erased in a non-recoverable way.

This scheme offers forward secrecy in all levels of the circuit.

6.2. Relaying Forward at Onion Routers (replacing 3.1.2 above)

When a forward relay cell is received by OR I, it decrypts the payload with the stream cipher, as follows:

   'Forward' relay cell:

      T_I = Digest(Khf_I,Tf'_I||C_I)
      N_{I+1} = T_I ^ D(Ktf_I,T_I ^ N_I)
      C_{I+1} = Decrypt(Kf_I,N_{I+1},C_I||0||0)
      Tf'_I = T_I

The OR then decides whether it recognizes the relay cell as described below. Depending on the choice of scheme from 6.1 the OR uses the last two blocks of C_{I+1} to update the encryption key or discards them.

If the cell is recognized the OR also processes the contents of the relay cell. Otherwise, it passes C_{I+1}||N_{I+1} along the circuit if the circuit continues.

For more information about recognizing and authenticating relay cells, see 5.4.5 below.

6.3. Relaying Backward at Onion Routers (replacing 3.2.1 above)

When an edge node receives a message M to be routed back to the origin, it encrypts it as follows:

      T_n = Digest(Khb_n,Tb'_n||M)
      N_n = T_n ^ E(Ktb_n,T_n ^ 0)
      (C_n,K'b_n) = Encrypt(Kb_n,N_n,M||0||0) (*)
      Tb'_n = T_n
      Kb_n = K'b_n

   (*) CTR mode is used to generate two additional blocks. This 256-bit value is denoted K'b_n and will be used in subsequent steps to replace the encryption key of this layer. To achieve forward secrecy it is important that the obsolete Kb_n is erased in a non-recoverable way.
Once encrypted, the edge node sends C_n and N_n along the circuit towards the OP. When a backward relay cell is received by OR_I (I<n), it encrypts the payload with the stream cipher, as follows: 'Backward' relay cell: T_I = Digest(Khb_I,Tb'_I||C_{I+1}) N_I = T_I ^ E(Ktb_I,T_I ^ N_{I+1}) C_I = Encrypt(Kb_I,N_I,C_{I+1}) Tb'_I = T_I Each node passes C_I and N_I along the circuit towards the OP. If forward security is desired for all layers in the circuit, all OR's encrypt as follows: T_I = Digest(Khb_I,Tb'_I||C_{I+1}) N_I = T_I ^ E(Ktb_I,T_I ^ 0) (C_I,K'b_I) = Encrypt(Kb_n,N_n,M||0||0) Tb'_I = T_I Kb_I = K'b_I 6.4. Routing to the Origin (replacing 3.2.2 above) When a relay cell arrives at an OP, the OP decrypts the payload with the stream cipher as follows: OP receives relay cell from node 1: For I=1...n, where n is the end node on the circuit: C_{I+1} = Decrypt(Kb_I,N_I,C_I) T_I = Digest(Khb_I,Tb'_I||C_{I+1}) N_{I+1} = T_I ^ D(Ktb_I,T_I ^ N_I) Tb'_I = T_I And updates the encryption keys according to the strategy chosen for 6.3. If the payload is recognized (see Section 4.1), then: The sending node is I. Process the payload! 6.5. Recognizing and authenticating a relay cell (replacing 4.1.1 above): Authentication in the forward direction is done as follows: T_{n+1} = Digest(Khf_n,C_{n+1}) Tag = T_{n+1} ^ D(Ktf_n,T_{n+1} ^ N_{n+1}) The message is recognized and authenticated (i.e., M = C_{n+1}) if and only if Tag = 0. No changes are required to the authentication process when the relay cell is sent backwards.
Filename: 296-expose-bandwidth-files.txt Title: Have Directory Authorities expose raw bandwidth list files Author: Tom Ritter Created: 11-December-2017 Status: Closed Ticket: https://trac.torproject.org/projects/tor/ticket/21377 Implemented-In: 0.4.0.1-alpha 1. Introduction Bandwidth Authorities (bwauths) perform scanning of the Tor Network and calculate observed bandwidths for each relay. They produce a bandwidth list file that is given to a Directory Authority. The Directory Authority uses the bw (bandwidth) value from this file in its vote file denoting its view of the bandwidth of the relay. After collecting all of the votes from other Authorities, a consensus is calculated, and the consensus's view of a relay's speed is determined by choosing the low-median value of all the authorities' values for each relay. Only a single metric from the bandwidth list file is exposed by a Directory Authority's vote, however the original file contains considerably more diagnostic information about how the bwauth arrives at that measurement for that relay. For more details, see the bandwidth list file specification in bandwidth-file-spec.txt. 2. Motivation The bandwidth list file contains more information than is exposed in the overall vote file. This information is useful to debug: * anomalies in relays' utilization, * suspected bugs in the (decrepit) bwauth code, and * the transition to a replacement bwauth implementation. Currently, all bwauths expose the bandwidth list file through various (non- standard) means, and that file is downloaded (hourly) by a single person (as long as his home internet connection and home server is working) and archived (with a small amount of robustness.) It would be preferable to have this exposed in a standard manner. Doing so would no longer require bwauths to run HTTP servers to expose the file, no longer require them to take additional manual steps to provide it, and would enable public consumption by any interested parties. 
We hope that Collector will begin archiving the files.

3. Specification

An authority SHOULD publish the bandwidth list file used to calculate its next vote. It SHOULD make the bandwidth list file available whenever the corresponding vote is available, at the corresponding URL. (See dir-spec for the exact details.)

It SHOULD make the file available at

    http://<hostname>/tor/status-vote/next/bandwidth.z
    http://<hostname>/tor/status-vote/current/bandwidth.z

It MUST NOT attempt to send its bandwidth list file in an HTTP POST to other authorities and it SHOULD NOT make bandwidth list files from other authorities available.

Clients interested in consuming these documents should download them from each authority's:

  * next URL when votes are created. (In the public Tor network, this is after HH:50 during normal operation, and after HH:20 during a consensus failure.)
  * current URL after the valid-after time in the consensus. (After HH:00, and HH:30 during consensus failure.)

4. Security Implications

The raw bandwidth list file does not [really: is not believed to] expose any sensitive information. All authorities currently make this document public already; an example is at https://bwauth.ritter.vg/bwauth/bwscan.V3BandwidthsFile

5. Compatibility

Exposing the document presents no compatibility concerns.

Applications that parse the document should follow the bandwidth list file specification in bandwidth-file-spec.txt.

If a new bandwidth list format version is added, the applications MAY need to upgrade to that version.
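As a small illustration of the two URLs named in section 3, the following sketch builds them for one authority. The function name and the example host are hypothetical and not part of this proposal; only the two path suffixes come from the text above.

```python
def bandwidth_file_urls(hostname):
    """Return the 'next' and 'current' bandwidth list file URLs for one authority.

    `hostname` is whatever host (and optional port) the authority's directory
    service answers on. The name of this helper is an assumption for illustration.
    """
    base = f"http://{hostname}/tor/status-vote"
    return {
        "next": f"{base}/next/bandwidth.z",
        "current": f"{base}/current/bandwidth.z",
    }


# Hypothetical host, used only to show the shape of the result.
urls = bandwidth_file_urls("authority.example")
print(urls["next"])
print(urls["current"])
```

A consumer would fetch the "next" URL once votes have been created and the "current" URL after the consensus valid-after time, as described in section 3.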
Filename: 297-safer-protover-shutdowns.txt Title: Relaxing the protover-based shutdown rules Author: Nick Mathewson Created: 19-Sep-2018 Status: Closed Target: 0.3.5.x Implemented-In: 0.4.0.x IMPLEMENTATION NOTE: We went with the proposed change in section 2. The "release date" is now updated by the "make update-versions" target whenever the version number is incremented. Maintainers may also manually set the "release date" to the future. 1. Introduction In proposal 264 (now implemented) we introduced the subprotocol versioning mechanism to better handle forward-compatibility in the Tor network. Included was a mechanism for safely disabling obsolete versions of Tor that no longer ran any supported protocols. If a version of Tor receives a consensus that lists as "required" any protocol version that it cannot speak, Tor will not start--even if the consensus is in its cache. The intended use case for this is that once some protocol has been provided by all supported versions for a long time, the authorities can mark it as "required". We had thought about the "adding a requirement" case mostly. This past weekend, though, we found an unwanted side-effect: it is hard to safely *un*-require a currently required protocol. Here's what happened: - Long ago, we created the LinkAuth=1 protocol, which required direct access to the ClientRandom and ServerRandom fields. (0.2.3.6-alpha) - Later, once we implemented Ed25519 identity keys, we added an improved LinkAuth=3 protocol, which uses the RFC5705 "key export" mechanism. (0.3.0.1-alpha) - When we added the subprotocols mechanism, we listed LinkAuth=1 as required. (backported to 0.2.9.x) - While porting Tor to NSS, we found that LinkAuth=1 couldn't be supported, because NSS wisely declines to expose the TLS fields it uses. So we removed "LinkAuth=1" from the required list (backported to 0.3.2.x), and got a bunch of authorities to upgrade. 
- In 0.3.5.1-alpha, once enough authorities had upgraded, we removed "LinkAuth=1" from the supported subprotocols list when Tor is running with NSS. [*] - We found, however, that this change could cause a bug when Tor+NSS started with a cached consensus that was created before LinkAuth=1 was removed from the requirements. Tor would decline to start, because the (old) consensus told it that LinkAuth=1 was required. This proposal discusses two alternatives for making it safe to remove required subprotocol versions in the future. [*] There was actually a bug here where OpenSSL removed LinkAuth=1 too, but that's mostly beside the point for this timeline, other than the fact it would have made things waaay worse if people hadn't caught it. 2. Recommended change: consider the consensus date. I propose that when deciding whether to shut down because of subprotocol requirements, a Tor implementation should only shut down if the consensus is dated to some time after the implementation's release date. With this change, an old cached consensus cannot cause the implementation to shut down, but a newer one can. This makes it safe to put out a release that does not support a formerly required protocol, so long as the authorities have upgraded to stop requiring that protocol. (It is safe to use the *scheduled* release date for the implementation, plus a few months -- just so long as we don't plan to start requiring a subprotocol that's not supported by the latest version of Tor.) 3. Not-recommended change: ignore the cached consensus. Was it a mistake to have Tor consider a cached consensus when deciding whether to shut down? The rationale for considering the cached consensus was that when a Tor implementation is obsolete, we don't want it hammering on the network, probing for new consensuses, and possibly reconnecting aggressively as its handshakes fail. 
That still seems compelling to me, though it's possible that if we find some problem with the methodology from section 2 above, we'll need to find some other way to achieve this goal.
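The rule recommended in section 2 can be sketched as follows. This is a minimal illustration, not Tor's actual code: the function name, the representation of subprotocols as (name, version) pairs, and the use of a single "consensus date" value are all assumptions made for the example.

```python
from datetime import datetime, timezone


def should_shut_down(required, supported, consensus_date, release_date):
    """Decide whether missing required subprotocols should stop this implementation.

    `required` and `supported` are iterables of (name, version) pairs
    (a hypothetical representation). Following section 2, a consensus only
    triggers a shutdown if it is dated after the implementation's release date.
    """
    missing = set(required) - set(supported)
    if not missing:
        return False
    # An old cached consensus (dated at or before our release) is ignored.
    return consensus_date > release_date


release = datetime(2018, 9, 1, tzinfo=timezone.utc)
old_consensus = datetime(2018, 8, 1, tzinfo=timezone.utc)
new_consensus = datetime(2018, 10, 1, tzinfo=timezone.utc)
required = {("LinkAuth", 1)}
supported = {("LinkAuth", 3)}

print(should_shut_down(required, supported, old_consensus, release))
print(should_shut_down(required, supported, new_consensus, release))
```

With these example dates, the cached consensus from before the release does not cause a shutdown, while a newer consensus that still requires LinkAuth=1 does.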
Filename: 298-canonical-families.txt
Title: Putting family lines in canonical form
Author: Nick Mathewson
Created: 31-Oct-2018
Status: Closed
Target: 0.3.6.x
Implemented-In: 0.4.0.1-alpha

1. Introduction

With ticket #27359, we begin encoding microdescriptor families in memory in a reference-counted form, so that if 10 relays all list the same family, their family only needs to be stored once. For large families, this has the potential to save a lot of RAM -- but only if the families are the same across those relays.

Right now, family lines are often encoded in different ways, and placed into consensuses and microdescriptor lines in whatever format the relay reported.

This proposal describes an algorithm that authorities should use while voting to place families into a canonical format.

This algorithm is forward-compatible, so that new family line formats can be supported in the future.

2. The canonicalizing algorithm

To make the family listed in a router descriptor canonical:

   For all entries of the form $hexid=name or $hexid~name, remove the =name or ~name portion.

   Remove all entries of the form $hexid, where hexid is not 40 hexadecimal characters long.

   If an entry is a valid nickname, put it into lower case.

   If an entry is a valid $hexid, put it into upper case.

   If there are any entries, add a single $hexid entry for the relay in question, so that it is a member of its own family.

   Sort all entries in lexical order.

   Remove duplicate entries.

Note that if an entry is not of the form "nickname", "$hexid", "$hexid=nickname" or "$hexid~nickname", then it will be unchanged: this is what makes the algorithm forward-compatible.

3. When to apply this algorithm

We allocate a new consensus method number. When building a consensus using this method or later, before encoding a family entry into a microdescriptor, the authorities should apply the algorithm above.

Relays MAY apply this algorithm to their own families before publishing them.
Unlike authorities, relays SHOULD warn about unrecognized family items.
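The canonicalizing algorithm from section 2 can be sketched as below. This is an illustrative reading, not the authorities' implementation. The function name is hypothetical, the nickname pattern (1 to 19 alphanumeric characters) is an assumption taken from the usual Tor nickname rule, and "if there are any entries" is interpreted as "if any entries remain after the earlier steps".

```python
import re

# "$" followed by hex digits, optionally followed by =name or ~name.
HEXID_FORM = re.compile(r"^\$([0-9A-Fa-f]*)(?:[=~][A-Za-z0-9]{1,19})?$")
# Assumed nickname syntax: 1 to 19 alphanumeric characters.
NICKNAME_FORM = re.compile(r"^[A-Za-z0-9]{1,19}$")


def canonicalize_family(entries, own_hexid):
    """Return the canonical form of a family list.

    `entries` is the list of family items from a router descriptor and
    `own_hexid` is the relay's own 40-character hex identity (without "$").
    """
    result = []
    for entry in entries:
        m = HEXID_FORM.match(entry)
        if m:
            hexid = m.group(1)
            if len(hexid) != 40:
                continue  # drop $hexid entries of the wrong length
            result.append("$" + hexid.upper())  # strip =name/~name, upper-case
        elif NICKNAME_FORM.match(entry):
            result.append(entry.lower())
        else:
            result.append(entry)  # unrecognized form: leave unchanged
    if result:
        result.append("$" + own_hexid.upper())  # relay joins its own family
    # Sort lexically and remove duplicates.
    return sorted(set(result))


family = ["$" + "ab" * 20 + "=Foo", "Bar", "$1234", "bar"]
print(canonicalize_family(family, "cd" * 20))
```

Because unrecognized entries pass through untouched, a future family-line format would survive canonicalization by authorities that do not understand it yet.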
Filename: 299-ip-failure-count.txt
Title: Preferring IPv4 or IPv6 based on IP Version Failure Count
Author: Neel Chauhan
Created: 25-Jan-2019
Status: Superseded
Superseded-by: 306
Ticket: https://trac.torproject.org/projects/tor/ticket/27491

1. Introduction

As IPv4 address space becomes scarce, ISPs and organizations will deploy IPv6 in their networks. Right now, Tor clients connect to guards using IPv4 connectivity by default.

When networks first transition to IPv6, both IPv4 and IPv6 will be enabled on most networks in a so-called "dual-stack" configuration. This avoids breaking existing IPv4-only applications while enabling IPv6 connectivity. However, IPv6 connectivity may be unreliable and clients should be able to connect to the guard using the most reliable technology, whether IPv4 or IPv6.

In ticket #27490, we introduced the option ClientAutoIPv6ORPort which adds preliminary "happy eyeballs" support. If set, this lets a client randomly choose between IPv4 and IPv6. However, this random decision does not take into account unreliable connectivity or network failures of an IP family. A successful Tor implementation of the happy eyeballs algorithm requires that unreliable connectivity on IPv4 and IPv6 is taken into consideration.

This proposal describes an algorithm to take into account network failures in the random decision used for choosing an IP family and the data fields used by the algorithm.

2. Options To Enable The Failure Counter

To enable the failure counter, we will add flags to ClientAutoIPv6ORPort. The new format for ClientAutoIPv6ORPort is:

    ClientAutoIPv6ORPort 0|1 [flags]

The first argument is to enable the automatic selection between IPv4 and IPv6 if it is 1. The second argument is a list of optional flags.

The only flag so far is "TrackFailures", which enables the tracking of failures to make a better decision when selecting between IPv4 and IPv6. The tracking of failures will be described in the rest of this proposal.
However, we should be open to more flags from future proposals as they are written and implemented.

3. Failure Counter Design

I propose that the failure counter uses the following fields:

  * IPv4 failure points
  * IPv6 failure points

These entries will exist as internal counters for the current session, and as a calculated value from the previous session in the statefile.

These values will be stored as 32-bit unsigned integers for the current session and in the statefile.

When a new session is loaded, we will load the failure count from the statefile, and when a session is closed, the failure counts from the current session will be stored in the statefile.

4. Failure Probability Calculation

The failure count of one IP version will increase the probability of the other IP version. For instance, a failure of IPv4 will increase the IPv6 probability, and vice versa.

When the IP version is being chosen, I propose that these values will be included in the guard selection code:

  * IPv4 failure points
  * IPv6 failure points
  * Total failure points

These values will be stored as 32-bit unsigned integers.

A generic failure of an IP version will add one point to the failure point count values of the particular IP version which failed.

A failure of an IP version from a "no route" error which happens when connections automatically fail will be counted as two failure points for the automatically failed version.

The failure points for both IPv4 and IPv6 are the sum of the values in the state file plus the current session's failure values. The total failure points is a sum of the IPv4 and IPv6 failure points, and is updated when the failure point count of an IP version is updated.

The probability of a particular IP version is the failure points of the other version divided by the total number of failure points, multiplied by 4 and stored as an integer. We will call this value the summarized failure point value (SFPV).
The reason for this summarization is to emulate a probability in 1/4 intervals by the random number generator.

In the random number generator, we will choose a random number between 0 and 4. If the random number is less than the IPv6 SFPV, we will choose IPv4. If it is equal to or greater, we will choose IPv6.

If the probability is 0/4 with a SFPV value of 0, it will be rounded to 1/4 with a SFPV of 1. Also, if the probability is 4/4 with a SFPV of 4, it will be rounded to 3/4 with a SFPV of 3. The reason for this is to accommodate mobile clients which could change networks at any time (e.g. WiFi to cellular) which may be more or less reliable in terms of a particular IP family when compared to the previous network of the client.

5. Initial Failure Point Calculation

When a client starts without failure points or if the FP value drops to 0, we need a SFPV value to start with. The initial SFPV value will be counted based on whether the client is using a bridge or not and if the relays in the bridge configuration or consensus have IPv6.

For clients connecting directly to Tor, we will:

  * During Bootstrap: use the number of IPv4 and IPv6 capable fallback directory mirrors during bootstrap.
  * After the initial consensus is received: use the number of IPv4 and IPv6 capable guards in the consensus.

The reason why the consensus will be used to calculate the initial failure point value is because using the number of guards would bias the SFPV value with whatever's dominant on the network rather than what works on the client.

For clients connecting through bridges, we will use the number of bridges configured and the IP versions supported.
The initial value of the failure points in the scenarios described in this section would be:

  * IPv4 Failure Points: Count the number of IPv6-capable relays
  * IPv6 Failure Points: Count the number of IPv4-capable relays

If the consensus or bridge configuration changes during a session, we should not update the failure point counters to generate an SFPV.

If we are starting a new session, we should use the existing failure points to generate an SFPV unless the counts for IPv4 or IPv6 are zero.

6. Forgetting Old Sessions

We should be able to forget old failures as clients could change networks. For instance, a mobile phone could switch between WiFi and cellular. Keeping an exact failure history would have privacy implications, so we should store an approximate history.

One way we could forget old sessions is by halving all the failure point (FP) values before adding when:

  * One or more failure point values are a multiple of a random number between 1 and 5
  * One or more failure point values are greater than or equal to 100

The reason for halving the values at regular intervals is to forget old sessions while keeping an approximate history. We halve all FP values so that one IP version doesn't dominate the failure count if the other is halved. This keeps an approximate scale of the failures on a client.

The reason for halving at a multiple of a random number instead of a fixed interval is so we can halve regularly while not making it too predictable. This prevents a situation where we would be halving too often to keep an approximate failure history.

If we halve, we add the FP value for the failed IP version after halving all FPs, to account for the failure. If halving is not done, we will just add the FP.

If the FP value for one IP version goes down to zero, we will re-calculate the SFPV for that version using the methods described in Section 4.

7. Separate Concurrent Connection Limits

Right now, there is a limit of three concurrent connections from a client
at any given time. This limit includes both IPv4 and IPv6 connections. This is to prevent denial of service attacks. I propose that a separate connection limit is used for IPv4 and IPv6. This means we can have three concurrent IPv4 connections and three concurrent IPv6 connections at the same time.

Having separate connection limits allows us to deal with networks dropping packets for a particular IP family while still preventing potential denial of service attacks.

8. Pathbias and Failure Probability

If ClientAutoIPv6ORPort is in use, and pathbias is triggered, we should ignore "no route" warnings. The reason for this is because we would be adding two failure points for the failed version as described in Section 4 of this proposal. Adding two failure points would make us more likely to prefer the competing IP family over the failed one than adding a single failure point on a normal failure.

9. Counting Successful Connections

If a connection to a particular IP version is successful, we should use it. This ensures that clients have a reliable connection to Tor. Accounting for successful connections can be done by adding one failure point to the competing IP version of the successful connection. For instance, if we have a successful IPv6 connection, we add one IPv4 failure point.

Why use failure points for successful connections? This reduces the need for separate counters for successes and allows for code reuse. Why add to the competing version's failure point? Similar to how we should prefer IPv4 if IPv6 fails, we should also prefer IPv4 if it is successful. We should also prefer IPv6 if it is successful.

Even on adding successes, we will still halve the failure counters as described in Section 6.

10. Acknowledgements

Thank you teor for aiding me with the implementation of Happy Eyeballs in Tor. This would not have been possible if it weren't for you.
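As an illustration of the SFPV calculation in Section 4, here is a minimal sketch under stated assumptions. The function names are hypothetical. The random draw is assumed to be uniform over 0, 1, 2, 3, and the sketch picks IPv6 with probability (IPv6 SFPV)/4, which is one reading of "the failure count of one IP version will increase the probability of the other IP version"; the proposal's wording of the comparison step could be read differently.

```python
import random


def sfpv(other_version_fp, total_fp):
    """Summarized failure point value for one IP version, clamped to 1..3.

    The value is the other version's failure points divided by the total,
    multiplied by 4 and truncated to an integer. Results of 0 and 4 are
    moved to 1 and 3 so that neither version is ever ruled out.
    """
    if total_fp <= 0:
        raise ValueError("no failure points yet; Section 5 covers initial values")
    value = (other_version_fp * 4) // total_fp
    return min(3, max(1, value))


def choose_ip_version(ipv4_fp, ipv6_fp, draw=None):
    """Pick "IPv4" or "IPv6" given the failure points of each version.

    `draw` is an integer in 0..3; if omitted, one is drawn at random.
    (Assumed interpretation: IPv6 is chosen when draw < IPv6 SFPV.)
    """
    total = ipv4_fp + ipv6_fp
    ipv6_sfpv = sfpv(ipv4_fp, total)  # IPv4 failures favour IPv6
    if draw is None:
        draw = random.randrange(4)
    return "IPv6" if draw < ipv6_sfpv else "IPv4"


# Three IPv4 failure points against one IPv6 failure point.
print(sfpv(3, 4), sfpv(1, 4))
print(choose_ip_version(3, 1))
```

The clamp to the range 1..3 is what keeps a client that has only ever seen one family fail from abandoning that family entirely after it moves to a different network.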
Filename: 300-walking-onions.txt Title: Walking Onions: Scaling and Saving Bandwidth Author: Nick Mathewson Created: 5-Feb-2019 Status: Informational 0. Status This proposal describes a mechanism called "Walking Onions" for scaling the Tor network and reducing the amount of client bandwidth used to maintain a client's view of the Tor network. This is a draft proposal; there are problems left to be solved and questions left to be answered. Proposal 323 tries to fill in all the gaps. 1. Introduction In the current Tor network design, we assume that every client has a complete view of all the relays in the network. To achieve this, clients download consensus directories at regular intervals, and download descriptors for every relay listed in the directory. The substitution of microdescriptors for regular descriptors (proposal 158) and the use of consensus diffs (proposal 140) have lowered the bytes that clients must dedicate to directory operations. But we still face the problem that, if we force each client to know about every relay in the network, each client's directory traffic will grow linearly with the number of relays in the network. Another drawback in our current system is that client directory traffic is front-loaded: clients need to fetch an entire directory before they begin building circuits. This places extra delays on clients, and extra load on the network. To anonymize the world, we will need to scale to a much larger number of relays and clients: requiring clients to know about every relay in the set simply won't scale, and requiring every new client to download a large document is also problematic. There are obvious responses here, and some other anonymity tools have taken them. It's possible to have a client only use a fraction of the relays in a network--but doing so opens the client to _epistemic attacks_, in which the difference in clients' views of the network is used to partition their traffic. 
It's also possible to move the problem of selecting relays from the client to the relays themselves, and let each relay select the next relay in turn--but this choice opens the client to _route capture attacks_, in which a malicious relay selects only other malicious relays. In this proposal, I'll describe a design for eliminating up-front client directory downloads. Clients still choose relays at random, but without ever having to hold a list of all the relays. This design does not require clients to trust relays any more than they do today, or open clients to epistemic attacks. I hope to maintain feature parity with the current Tor design; I'll list the places in which I haven't figured out how to do so yet. I'm naming this design "walking onions". The walking onion (Allium x proliferum) reproduces by growing tiny little bulbs at the end of a long stalk. When the stalk gets too top-heavy, it flops over, and the little bulbs start growing somewhere new. The rest of this document will run as follows. In section 2, I'll explain the ideas behind the "walking onions" design, and how they can eliminate the need for regular directory downloads. In section 3, I'll answer a number of follow-up questions that arise, and explain how to keep various features in Tor working. Section 4 (not yet written) will elaborate all the details needed to turn this proposal into a concrete set of specification changes. 2. Overview 2.1. Recapping proposal 141 Back in Proposal 141 ("Download server descriptors on demand"), Peter Palfrader proposed an idea for eliminating ahead-of-time descriptor downloads. Instead of fetching all the descriptors in advance, a client would fetch the descriptor for each relay in its path right before extending the circuit to that relay. For example, if a client has a circuit from A->B and wants to extend the circuit to C, the client asks B for C's descriptor, and then extends the circuit to C. 
(Note that the client needs to fetch the descriptor every time it extends the circuit, so that an observer can't tell whether the client already had the descriptor or not.)

There are a couple of limitations for this design:

  * It still requires clients to download a consensus.
  * It introduces an extra round-trip to each hop of the circuit extension process.

I'll show how to solve these problems in the two sections below.

2.2. An observation about the ntor handshake.

I'll start with an observation about our current circuit extension handshake, ntor: it should not actually be necessary to know a relay's onion key before extending to it.

Right now, the client sends:

    NODEID     (The relay's identity)
    KEYID      (The relay's public onion key)
    CLIENT_PK  (a Diffie-Hellman public key)

and the relay responds with:

    SERVER_PK  (a Diffie-Hellman public key)
    AUTH       (a function of the relay's private keys and *all* of the public keys.)

Both parties generate shared symmetric keys from the same inputs that are used to create the AUTH value.

The important insight here is that we could easily change this handshake so that the client sends only CLIENT_PK, and receives NODEID and KEYID as part of the response.

In other words, the client needs to know the relay's onion key to _complete_ the handshake, but doesn't actually need to know the relay's onion key in order to _initiate_ the handshake.

This is the insight that will let us save a round trip: When the client goes to extend a circuit from A->B to C, it can send B a request to extend to C and retrieve C's descriptor in a single step. Specifically, the client sends only CLIENT_PK, and relay B can include C's keys as part of the EXTENDED cell.

2.3. Extending by certified index

Now I'll explain how the client can avoid having to download a list of relays entirely.

First, let's look at how a client chooses a random relay today. First, the client puts all of the relays in a list, and computes a weighted bandwidth for each one.
For example, suppose the relay identities are R1, R2, R3, R4, and R5, and their bandwidth weights are 50, 40, 30, 20, and 10. The client makes a table like this:

    Relay   Weight   Range of index values
    R1      50       0..49
    R2      40       50..89
    R3      30       90..119
    R4      20       120..139
    R5      10       140..149

To choose a random relay, the client picks a random index value between 0 and 149 inclusive, and looks up the corresponding relay in the table. For example, if the client's random number is 77, it will choose R2. If its random number is 137, it chooses R4.

The key observation for the "walking onions" design is that the client doesn't actually need to construct this table itself. Instead, we will have this table be constructed by the authorities and distributed to all the relays.

Here's how it works: let's have the authorities make a new kind of consensus-like thing. We'll call it an Efficient Network Directory with Individually Verifiable Entries, or "ENDIVE" for short. This will differ from the client's index table above in two ways. First, every entry in the ENDIVE is normalized so that the maximum index is 2^32-1:

    Relay   Normalized weight   Range of index values
    R1      0x55555546          0x00000000..0x55555545
    R2      0x44444438          0x55555546..0x9999997d
    R3      0x3333332a          0x9999997e..0xcccccca7
    R4      0x2222221c          0xcccccca8..0xeeeeeec3
    R5      0x1111113c          0xeeeeeec4..0xffffffff

Second, every entry in the ENDIVE is timestamped and signed by the authorities independently, so that when a client sees a line from the table above, it can verify that it came from an authentic ENDIVE. When a client has chosen a random index, one of these entries will prove to the client that a given relay corresponds to that index. Because of this property, we'll be calling these entries "Separable Network Index Proofs", or "SNIP"s for short.
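The index-table lookup described above can be sketched as follows. This is only an illustration of the table mechanics; the function names are hypothetical, and signatures, timestamps, and the way authorities normalize weights are left out. The ENDIVE ranges are copied from the table above rather than recomputed.

```python
from bisect import bisect_right


def build_index_table(weights):
    """Turn [(relay, weight), ...] into [(relay, low, high), ...] with inclusive ranges."""
    table = []
    start = 0
    for relay, weight in weights:
        table.append((relay, start, start + weight - 1))
        start += weight
    return table


def lookup(table, index):
    """Return the relay whose inclusive index range contains `index`."""
    starts = [low for _, low, _ in table]
    pos = bisect_right(starts, index) - 1  # last range starting at or before index
    relay, low, high = table[pos]
    if not low <= index <= high:
        raise KeyError(f"index {index} is outside the table")
    return relay


# The client-side table from the example: weights 50, 40, 30, 20, 10.
client_table = build_index_table(
    [("R1", 50), ("R2", 40), ("R3", 30), ("R4", 20), ("R5", 10)]
)
print(lookup(client_table, 77))   # R2
print(lookup(client_table, 137))  # R4

# The normalized ENDIVE ranges, copied from the table in the text.
endive_table = [
    ("R1", 0x00000000, 0x55555545),
    ("R2", 0x55555546, 0x9999997D),
    ("R3", 0x9999997E, 0xCCCCCCA7),
    ("R4", 0xCCCCCCA8, 0xEEEEEEC3),
    ("R5", 0xEEEEEEC4, 0xFFFFFFFF),
]
print(lookup(endive_table, 0xF0000000))  # R5
```

The same lookup works for both tables because each is just a sorted list of disjoint inclusive ranges; only the scale of the index space differs.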
For example, a single SNIP from the table above might consist of:

   * A range of times during which this SNIP is valid
   * R1's identity
   * R1's ntor onion key
   * R1's address
   * The index range 0x00000000..0x55555545
   * A signature of all of the above, by a number of authorities

Let's put it together. Suppose that the client has a circuit from A->B, and it wants to extend to a random relay, chosen at random and weighted by bandwidth.

   1. The client picks a random index value between 0 and 2^32-1. It sends that index to relay B in its EXTEND cell, along with a g^x value for the ntor handshake.

      Note: the client doesn't send an address or identity for the next relay, since it doesn't know what relay it has chosen! (The combination of an index and a g^x value is what I'm calling a "walking onion".)

   2. Now, relay B looks up the index in its most recent ENDIVE, to learn which relay the client selected.

      (For example, suppose that the client's random index value is 0x50000001. This index value falls within the range 0x00000000..0x55555545 in the table above, so the relay B sees that the client has chosen R1 as its next hop.)

   3. Relay B sends a CREATE cell to R1 as usual. When it gets a CREATED reply, it includes the authority-signed SNIP for R1 as part of the EXTENDED cell.

   4. As part of verifying the handshake, the client verifies that the SNIP was signed by enough authorities, that its timestamp is recent enough, and that it actually corresponds to the random index that the client selected.

Notice the properties we have with this design:

   - Clients can extend circuits without having a list of all the relays.

   - Because the client's random index needs to match a routing entry signed by the authorities, the client is still selecting a relay randomly by weight. A hostile relay cannot choose which relay to send the client to.

On a failure to extend, a relay should still report the routing entry for the other relay that it couldn't connect to.
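The client-side checks in step 4 can be sketched as follows. This is a minimal illustration, not a specified format: the Snip fields, the function name, and the idea of passing in the set of authorities whose signatures have already been cryptographically verified are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snip:
    """Hypothetical in-memory form of a SNIP; the field names are assumptions."""
    relay_id: str
    index_lo: int          # first index covered, inclusive
    index_hi: int          # last index covered, inclusive
    valid_after: int       # start of validity, seconds since the epoch
    valid_until: int       # end of validity, seconds since the epoch
    verified_signers: frozenset  # authorities whose signatures checked out


def snip_is_acceptable(snip, chosen_index, now, min_signers):
    """Apply the three checks from step 4 to an already signature-checked SNIP."""
    enough_signatures = len(snip.verified_signers) >= min_signers
    currently_valid = snip.valid_after <= now <= snip.valid_until
    covers_index = snip.index_lo <= chosen_index <= snip.index_hi
    return enough_signatures and currently_valid and covers_index
```

Signature verification itself is deliberately left out; section 3.2 discusses several ways it could be done. The point of the sketch is that a relay cannot substitute a different next hop, because a SNIP whose index range does not cover the client's chosen index fails the last check.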
As before, a client will start a new circuit if a partially constructed circuit fails to extend.

We could achieve a reliability/security tradeoff by letting clients offer the relay a choice of two or more indices to extend to. This would help reliability, but give the relay more influence over the path. We'd need to analyze this impact.

In the next section, I'll discuss a bunch of details that we need to straighten out in order to make this design work.

3. Sorting out the details.

3.1. Will these routing entries fit in EXTEND2 and EXTENDED2 cells?

The EXTEND2 cell is probably big enough for this design. The random index that the client sends can be a new "link specifier" type, replacing the IP and identity link specifiers.

The EXTENDED2 cell is likely to need to grow here. We'll need to implement proposal 249 ("Allow CREATE cells with >505 bytes of handshake data") so that EXTEND2 and EXTENDED2 cells can be larger.

3.2. How should SNIPs be signed?

We have a few options, and I'd like to look into the possibilities here more closely.

The simplest possibility is to use **multiple signatures** on each SNIP, the way we do today for consensuses. These signatures should be made using medium-term Ed25519 keys from the authorities. At a cost of 64 bytes per signature, at 9 authorities, we would need 576 bytes for each SNIP. These signatures could be batch-verified to save time at the client side. Since generating a signature takes around 20 usec on my mediocre laptop, authorities should be able to generate this many signatures fairly easily.

Another possibility is to use a **threshold signature** on each SNIP, so that the authorities collaboratively generate a short signature that the clients can verify. There are multiple threshold signature schemes that we could consider here, though I haven't yet found one that looks perfect.

Another possibility is to organize the SNIPs in a **merkle tree with a signed root**.
For this design, clients could download the signed root periodically, and receive the hash-path from the signed root to the SNIP. This design might help with certificate-transparency-style designs, and it would be necessary if we ever want to move to a postquantum signature algorithm that requires large signatures. Another possibility (due to a conversation among Chelsea Komlo, Sajin Sasy, and Ian Goldberg), is to *use SNARKs*. (Why not? All the cool kids are doing it!) For this, we'd have the clients download a signed hash of the ENDIVE periodically, and have the authorities generate a SNARK for each SNIP, proving its presence in that document. 3.3. How can we detect authority misbehavior? We might want to take countermeasures against the possibility that a quorum of corrupt or compromised authorities give some relays a different set of SNIPs than they give other relays. If we incorporate a merkle tree or a hash chain in the design, we can use mechanisms similar to certificate transparency to ensure that the authorities have a consistent log of all the entries that they have ever handed out. 3.4. How many types of weighted node selection are there, and how do we handle them? Right now, there are multiple weights that we use in Tor: * Weight for exit * Weight for guard * Weight for middle node We also filter nodes for several properties, such as flags they have. To reproduce this behavior, we should enumerate the various weights and filters that we use, and (if there are not too many) create a separate index for each. For example, the Guard index would weight every node for selection as guard, assigning 0 weight to non-Guard nodes. The Exit index would weight every node for selection as an exit, assigning 0 weight to non-Exit nodes. When choosing a relay, the client would have to specify which index to use. We could either have a separate (labeled) set of SNIPs entries for each index, or we could have each SNIP have a separate (labeled) index range for each index. 
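The per-position indices described in section 3.4 can be sketched like this. It is a simplification under stated assumptions: the function name and relay tuples are hypothetical, the same raw weight is reused for every position (real position weights would differ), and the ranges are not normalized to 2^32.

```python
def build_position_index(relays, required_flag):
    """Build (relay, lo, hi) index ranges for one position (hypothetical helper).

    `relays` is a list of (name, weight, flags) tuples. A relay lacking
    `required_flag` gets zero weight, so it receives no index range.
    """
    ranges = []
    next_lo = 0
    for name, weight, flags in relays:
        effective = weight if required_flag in flags else 0
        if effective == 0:
            continue  # zero weight: this relay can never be selected here
        ranges.append((name, next_lo, next_lo + effective - 1))
        next_lo += effective
    return ranges


# Made-up example relays; the weights and flags are illustrative only.
EXAMPLE_RELAYS = [
    ("R1", 50, {"Guard"}),
    ("R2", 40, {"Exit"}),
    ("R3", 30, {"Guard", "Exit"}),
]
GUARD_INDEX = build_position_index(EXAMPLE_RELAYS, "Guard")
EXIT_INDEX = build_position_index(EXAMPLE_RELAYS, "Exit")
```

Here the Guard index covers R1 (0..49) and R3 (50..79), while the Exit index covers R2 (0..39) and R3 (40..69). Either labeling scheme from the text would work with this: one set of SNIPs per index, or one SNIP per relay carrying a labeled range for each index in which it has nonzero weight.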
REGRESSION: the client's choice of which index to use would leak the next router's position and purpose in the circuit. This information is something that we believe relays can infer now, but it's not a desired feature that they can.

3.5. Does this design break onion service introduce handshakes?

In rend-spec-v3.txt section 3.3.2, we specify a variant of ntor for use in INTRODUCE2 handshakes. It allows the client to send encrypted data as part of its initial ntor handshake, but requires the client to know the onion service's onion key before it sends its initial handshake.

That won't be a problem for us here, though: we still require clients to fetch onion service descriptors before contacting an onion service.

3.6. How does the onion service directory work here?

The onion service directory is implemented as a hash ring, where each relay's position in the hash ring is decided by a hash of its identity, the current date, and a shared random value that the authorities compute each day.

To implement this hash ring using walking onions, we would need to have an extra index based not on bandwidth, but on position in the hash ring. Then onion services and clients could build a circuit, then extend it one more hop specifying their desired index in the hash ring.

We could either have a command to retrieve a trio of hashring-based routing entries by index, or to retrieve (or connect to?) the n'th item after a given hashring entry.

3.7. How can clients choose guard nodes?

We can reuse the fallback directories here. A newly bootstrapping client would connect to a fallback directory, then build a three-hop circuit, and finally extend the three-hop circuit by indexing to a random guard node. The random guard node's SNIP would contain the information that the client needs to build real circuits through that guard in the future. Because the client would be building a three-hop circuit, the fallback directory would not learn the client's guards.
(Note that even if the extend attempt fails, we should still pick the node as a possible guard based on its router entry, so that other nodes can't veto our choice of guards.)

3.8. Does the walking onions design preclude postquantum circuit handshakes?

Not at all! Both proposal 263 (ntru) and proposal 270 (newhope) work by having the client generate an ephemeral key as part of its initial handshake. The client does not need to know the relay's onion key to do this, so we can still integrate those proposals with this one.

3.9. Does the walking onions design stop us from changing the network topology?

For Tor to continue to scale, we will someday need to accept that not every relay can be simultaneously connected to every other relay. Therefore, we will need to move from our current clique topology assumption to some other topology.

There are also proposals to change node selection rules to generate routes providing better performance, or improved resistance to local adversaries.

We can, I think, implement this kind of proposal by changing the way that ENDIVEs are generated. Instead of giving every relay the same ENDIVE, the authorities would generate different ENDIVEs for different relays, depending on the probability distribution of which relay should be chosen after which in the network topology. In the extreme case, this would produce O(n) ENDIVEs and O(n^2) SNIPs. In practice, I hope that we could do better by having the network topology be non-clique, and by having many relays share the same distribution of successors.

3.10. How can clients handle exit policies?

This is an unsolved challenge. If the client tells the middle relay its target port, it leaks information inappropriately.

One possibility is to try to gather exit policies into common categories, such as "most ports supported" and "most common ports supported".

Another (inefficient) possibility is for clients to keep trying exits until they find one that works.
Another (inefficient) possibility is to require that clients who use unusual ports fall back to the old mechanism for route selection. 3.11. Can this approach support families? This is an unsolved challenge. One (inefficient) possibility is for clients to generate circuits and discard those that use multiple relays in the same family. One (not quite compatible) possibility is for the authorities to sort the ENDIVE so that relays in the same family are adjacent to one another. The index-bounds part of each SNIP would also have to include the bounds of the family. This approach is not quite compatible with the status quo, because it prevents relays from belonging to more than one family. One interesting possibility (due to Chelsea Komlo, Sajin Sasy, and Ian Goldberg) is for the middle node to take responsibility for family enforcement. In this design, the client might offer the middle node multiple options for the next relay's index, and the middle node would choose the first such relay that is neither in its family nor its predecessor's family. We'd need to look for a way to make sure that the middle node wasn't biasing the path selection. (TODO: come up with more ideas here.) 3.12. Can walking onions support IP-based and country-based restrictions? This is an unsolved challenge. If the user's restrictions do not exclude most paths, one (inefficient) possibility is for the user to generate paths until they generate one that they like. This idea becomes inefficient if the user is excluding most paths. Another (inefficient and fingerprintable) possibility is to require that clients who use complex path restrictions fall back to the old mechanism for route selection. (TODO: come up with better ideas here.) 3.13. What scaling problems have we not solved with this design? The walking onions design doesn't solve (on its own) the problem that the authorities need to know about every relay, and arrange to have every relay tested. 
The walking onions design doesn't solve (on its own) the problem that relays need to have a list of all the relays. (But see section 3.9 above.) 3.14. Should we still have clients download a consensus when they're using walking onions? There are some fields in the current consensus directory documents that the clients will still need, like the list of supported protocols and network parameters. A client that uses walking onions should download a new flavor of consensus document that contains only these fields, and does not list any relays. In some signature schemes, this consensus would contain a digest of the ENDIVE -- see 3.2 above. (Note that this document would be a "consensus document" but not a "consensus directory", since it doesn't list any relays.) 4. Putting it all together [This is the section where, in a later version of this proposal, I would specify the exact behavior and data formats to be used here. Right now, I'd say we're too early in the design phase.] A.1. Acknowledgments Thanks to Peter Palfrader for his original design in proposal 141, and to the designers of PIR-Tor, both of which inspired aspects of this Walking Onions design. Thanks to Chelsea Komlo, Sajin Sasy, and Ian Goldberg for feedback on an earlier version of this design. Thanks to David Goulet, Teor, and George Kadianakis for commentary on earlier versions of this draft. This research was supported by NSF grants CNS-1526306 and CNS-1619454. A.2. Additional ideas Teor notes that there are ways to try to get this idea to apply to one-pass circuit construction, something like the old onion design. We might be able to derive indices and keys from the same seeds, even. I don't see a way to do this without losing forward secrecy, but it might be worth looking at harder.
Filename: 301-dont-vote-on-package-fingerprints.txt Title: Don't include package fingerprints in consensus documents Author: Iain R. Learmonth Created: 2019-02-21 Status: Closed Ticket: #28465 0. Abstract I propose modifying the Tor consensus document to remove digests of the latest versions of package files. These "package" lines were never used by any directory authority and so add additional complexity to the consensus voting mechanisms while adding no additional value. 1. Introduction In proposal 227 [1], to improve the integrity and security of updates, a way to authenticate the latest versions of core Tor software through the consensus was described. By listing a location with this information for each version of each package, we can augment the update process of Tor software to authenticate the packages it downloads through the Tor consensus. This was implemented in tor 0.2.6.3-alpha. When looking at modernising our network archive recently [2], I came across this line for votes and consensuses. If packages are referenced by the consensus then ideally we should archive those packages just as we archive referenced descriptors. However, this line was never present in any vote archived. 2. Proposal We deprecate the "package" line in the specification for votes. Directory authorities stop voting for "package" lines in their votes. Changes to votes do not require a new consensus method, so this part of the proposal can be implemented separately. We allocate a consensus method when this proposal is implemented. Let's call it consensus method N. Authorities will continue computing consensus package lines in the consensus if the consensus method is between 19 and (N-1). If the consensus method is N or later, they omit these lines. 3. Security Considerations This proposal removes a feature that could be used for improved security but currently isn't. 
As such it is extra code in the codebase that may have unknown bugs or lead to bugs in the future due to unexpected interactions. Overall, removing it should be a good thing for the security of Core Tor.

4. Compatibility Considerations

A new consensus method is required for this proposal. The "package" line was always optional and so no client should be depending on it. There are no known consumers of the "package" lines (there are none to consume anyway).

A. References

[1] Nick Mathewson, Mike Perry. "Include package fingerprints in consensus documents". Tor Proposal 227, February 2014.

[2] Iain Learmonth, Karsten Loesing. "Towards modernising data collection and archive for the Tor network". Technical Report 2018-12-001, December 2018.

B. Acknowledgements

Thanks to teor and Nick Mathewson for their comments and suggestions on this proposal.
Filename: 302-padding-machines-for-onion-clients.txt Title: Hiding onion service clients using padding Author: George Kadianakis, Mike Perry Created: Thursday 16 May 2019 Status: Closed Implemented-In: 0.4.1.1-alpha NOTE: Please look at section 3 of padding-spec.txt now, not this document. 0. Overview Tor clients use "circuits" to do anonymous communications. There are various types of circuits. Some of them are for navigating the normal Internet, others are for fetching Tor directory information, others are for connecting to onion services, while others are simply for measurements and testing. It's currently possible for MITM type of adversaries (like tor-network-level and local-area-network adversaries) to distinguish Tor circuit types from each other using a wide array of metadata and distinguishers. In this proposal, we study various techniques that can be used to distinguish client-side onion service circuits and provide WTF-PAD circuit padding machines (using prop#254) to hide them against certain adversaries. 1. Motivation We are writing this proposal for various reasons: 1) We believe that in an ideal setting MITM adversaries should not be able to distinguish circuit types by inspecting traffic. Tor traffic should look amorphous to an outside observer to maximize uncertainty and anonymity properties. Client-side onion service circuits are an easy target for this proposal, because we believe we can improve their privacy with low bandwidth overhead. 2) We want to start experimenting with the WTF-PAD subsystem of Tor, and this use-case provides us with a good testbed. 3) We hope that by actually starting to use the WTF-PAD subsystem of Tor, we will encourage more researchers to start experimenting with it. 2. Scope of the proposal [SCOPE] Given the above, this proposal sets forth to use the WTF-PAD system to hide client-side onion service circuits against the classifiers of paper by Kwon et al. above. 
By client-side onion service circuits we refer to these two types of circuits: - Client-side introduction circuits: Circuit from client to the introduction point - Client-side rendezvous circuits: Circuit from client to the rendezvous point Service-side onion service circuits are not in scope for this proposal, and this is because hiding those would require more bandwidth and also more advanced WTF-PAD features. Furthermore, this proposal only aims to cloak the naive distinguishing features mentioned in the [KNOWN_DISTINGUISHERS] section, and can by no means guarantee that client-side onion service circuits are totally indistinguishable by other means. The machines specified in this proposal are meant to be lightweight and created for a specific purpose. This means that they can be easily extended with additional states to do more advanced hiding. 3. Known distinguishers against onion service circuits [KNOWN_DISTINGUISHERS] Over the past years it's been assumed that motivated adversaries can distinguish onion-service traffic from normal Tor traffic given their special characteristics. As far as we know, there has been relatively little research-level work done to this direction. The main article published in this area is the USENIX paper "Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services" by Kwon et al. [0] The above paper deals with onion service circuits in sections 3.2 and 5.1. It uses the following three "naive" circuit features to distinguish circuits: 1) Circuit construction sequence 2) Number of incoming and outgoing cells 3) Duration of Activity ("DoA") All onion service circuits have particularly loud signatures to the above characteristics, but WTF-PAD (prop#254) gives us tools to effectively silence those signatures to the point where the paper's classifiers won't work. 4. Hiding circuit features using WTF-PAD According to section [KNOWN_DISTINGUISHERS] there are three circuit features we are attempting to hide. 
Here is how we plan to do this using the WTF-PAD system:

1) Circuit construction sequence

   The USENIX paper uses the directions of the first 10 cells sent in a circuit to fingerprint it. Client-side onion service circuits have unique circuit construction sequences and hence they can be fingerprinted using just the first 10 cells.

   We use WTF-PAD to destroy this feature of onion service circuits by carefully sending padding cells (relay DROP cells) during circuit construction and making them look exactly like most general Tor circuits up until the end of the circuit construction sequence.

2) Number of incoming and outgoing cells

   The USENIX paper uses the number of incoming and outgoing cells to distinguish circuit types. For example, client-side introduction circuits have the same number of incoming and outgoing cells, whereas client-side rendezvous circuits have more incoming than outgoing cells.

   We use WTF-PAD to destroy this feature by changing the number of cells sent in introduction circuits. We leave rendezvous circuits as is, since the actual rendezvous traffic flow usually resembles that of normal Tor circuits.

3) Duration of Activity ("DoA")

   The USENIX paper uses the period of time during which circuits send and receive cells to distinguish circuit types. For example, client-side introduction circuits are really short lived, whereas service-side introduction circuits are very long lived. OTOH, rendezvous circuits have the same median lifetime as general Tor circuits, which is 10 minutes.

   We use WTF-PAD to destroy this feature of client-side introduction circuits by setting a special WTF-PAD option, which keeps the circuits open for 10 minutes, completely mimicking the DoA of general Tor circuits.

4.1. A dive into general circuit construction sequences [CIRCCONSTRUCTION]

In this section we give an overview of what circuit construction looks like to a network or guard-level adversary.
We use this knowledge to make the right padding machines that can make intro and rend circuits look like these general circuits.

In particular, most general Tor circuits used to surf the web or download directory information start with the following 6-cell relay cell sequence (cells surrounded in [brackets] are outgoing, the others are incoming):

    [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED

When this is done, the client has established a 3-hop circuit and also opened a stream to the other end. Usually after this comes a series of DATA cells that either fetch pages, establish an SSL connection, or fetch directory information:

    [DATA] -> [DATA] -> DATA -> DATA

The above stream of 10 relay cells defines the vast majority of general circuits that came out of Tor Browser during our testing, and it's what we are going to use to make introduction and rendezvous circuits blend in.

Please note that in this section we only investigate relay cells and not connection-level cells like CREATE/CREATED or AUTHENTICATE/etc. that are used during the link-layer handshake. The rationale is that connection-level cells depend on the type of guard used and are not an effective fingerprint for a network/guard-level adversary.

5. WTF-PAD machines

For the purposes of this proposal we will make use of four WTF-PAD machines as follows:

   - Client-side introduction circuit hiding machine (origin-side)
   - Client-side introduction circuit hiding machine (relay-side)
   - Client-side rendezvous circuit hiding machine (origin-side)
   - Client-side rendezvous circuit hiding machine (relay-side)

In the following sections we will analyze these machines.

5.1. Client-side introduction circuit hiding machines [INTRO_CIRC_HIDING]

These two machines are meant to hide client-side introduction circuits.
The origin-side machine sits on the client and sends padding towards the introduction circuit, whereas the relay-side machine sits on the middle-hop (second hop of the circuit) and sends padding towards the client. The padding from the origin-side machine terminates at the middle-hop and does not get forwarded to the actual introduction point.

Both of these machines only get activated for introduction circuits, and only after an INTRODUCE1 cell has been sent out.

This means that before the machine gets activated our cell flow looks like this:

    [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [INTRODUCE1]

Comparing the above with section [CIRCCONSTRUCTION], we see that the above cell sequence matches the one from general circuits up to the first 7 cells.

However, in normal introduction circuits this is followed by an INTRODUCE_ACK and then the circuit gets torn down, which does not match the sequence from [CIRCCONSTRUCTION].

Hence when our machine is used, after sending an [INTRODUCE1] cell, we also send a [PADDING_NEGOTIATE] cell, which gets answered by a PADDING_NEGOTIATED cell and an INTRODUCE_ACK cell. This makes us match the [CIRCCONSTRUCTION] sequence up to the first 10 cells.

After that, we continue sending padding from the relay-side machine so as to fake a directory download, or an SSL connection setup. We also want to continue sending padding so that the connection stays up longer to destroy the "Duration of Activity" fingerprint.

To calculate the padding overhead, we see that the origin-side machine just sends a single [PADDING_NEGOTIATE] cell, whereas the relay-side machine sends a PADDING_NEGOTIATED cell and between 7 and 10 DROP cells. This means that the average overhead of this machine is 11 padding cells.

In terms of WTF-PAD terminology, these machines have three states (START, OBF, END).
They move from the START to OBF state when the first non-padding cell is received on the circuit, and they stay in the OBF state until all the padding gets depleted. The OBF state is controlled by a histogram which specifies the parameters described in the paragraphs above. After all the padding finishes, it moves to END state.

We also set a special WTF-PAD flag which keeps the circuit open even after the introduction is performed. In particular, with this feature the circuit will stay alive for the same duration as normal web circuits before they expire (usually 10 minutes).

5.2. Client-side rendezvous circuit hiding machines

The rendezvous circuit machines apply on client-side rendezvous circuits and only after the rendezvous point has been established (REND_ESTABLISHED has been received). Up to that point, the following cell sequence has been observed on the circuit:

    [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [ESTABLISH_REND] -> REND_ESTABLISHED

which matches the general circuit construction sequence [CIRCCONSTRUCTION] up to the first 6 cells. However after that, normal rendezvous circuits receive a RENDEZVOUS2 cell followed by a [BEGIN] and a CONNECTED, which does not fit the circuit construction sequence we are trying to imitate.

Hence our machine gets activated right after REND_ESTABLISHED is received, and continues by sending a [PADDING_NEGOTIATE] and a [DROP] cell, before receiving a PADDING_NEGOTIATED and a DROP cell, effectively blending into the general circuit construction sequence on the first 10 cells.

After that our machine gets deactivated, and we let the actual rendezvous circuit shape the traffic flow. Since rendezvous circuits usually imitate general circuits (their purpose is to surf the web), we can expect that they will look alike.

In terms of overhead, this machine is quite light. Both sides send 2 padding cells, for a total of 4 padding cells.

6. Overhead analysis

Given the parameters above, intro circuit machines have an overhead of 11 padding cells, and rendezvous circuit machines have an overhead of 4 padding cells. This means that for every intro and rendezvous circuit there will be an overhead of 15 padding cells on average, which is about 7.5KB.

In the PrivCount paper [1] we learn that the Tor network sees about 12 million successful descriptor fetches per day. We can use this figure to assume that the Tor network also sees about 12 million intro and rendezvous circuits per day. Given the 7.5KB overhead of each of these circuits, we get that our padding machines incur an additional 94GB overhead per day on the network, which is about 3.9GB per hour.

XXX Isn't this kinda intense????? Using the graphs from metrics we see that the Tor network has a total capacity of 300 Gbit/s which is about 135000GB per hour, so 3.9GB per hour is not that much, but still...

7. Discussion

7.1. Alternative approaches

These machines try to hide onion service client-side circuits by obfuscating their looks. This is a reasonable approach, but if the resulting circuits look unlike any other Tor circuits, they would still be fingerprintable just by that fact.

Another approach we could take is to make normal client circuits look like onion service circuits, or just make normal clients establish fake onion service circuits periodically. The hope here is that the adversary won't be able to distinguish fake onion service circuits from real ones. This approach has not been taken yet, mainly because it requires additional WTF-PAD features and poses greater overhead risks.

7.2. Future work

As discussed in [SCOPE], this proposal only aims to hide some very specific features of client-side onion service circuits. There is lots of work to be done here to see what other features can be used to distinguish such circuits, and also what other classifiers can be built using deep learning and whatnot.

A. Acknowledgements

This research was supported by NSF grants CNS-1526306 and CNS-1619454.

---

[0]: https://www.usenix.org/node/190967
     https://blog.torproject.org/technical-summary-usenix-fingerprinting-paper

[1]: "Understanding Tor Usage with Privacy-Preserving Measurement" by Akshaya Mani, T Wilson-Brown, Rob Jansen, Aaron Johnson, and Micah Sherr. In Proceedings of the Internet Measurement Conference 2018 (IMC 2018).
Filename: 303-protover-removal-policy.txt Title: When and how to remove support for protocol versions Author: Nick Mathewson Created: 21 May 2019 Status: Open 1. Background With proposal 264, added support for "subprotocol versions" -- a means to declare which features are required for participation in the Tor network. We also created a mechanism (refined later in proposal 297) for telling Tor clients and relays that they cannot participate effectively in the Tor network, and they need to shut down. In this document, we describe a policy according to which these decisions should be made in practice. 2. Recommending features (for clients and relays) A subprotocol version SHOULD become recommended soon after all release series that did not provide it become unsupported (within a month or so). For example, the current oldest LTS release series is 0.2.9; when it becomes unsupported in 2020, the oldest supported release series will be 0.3.5. Suppose that 0.2.9 supports a subprotocol Cupcake=1, and that all stable 0.3.5.x versions support Cupcake=1-3. Around one month after the end of 0.2.9 support, Cupcake=3 should become a _recommended_ protocol for clients and relays. Additionally, a feature can become _recommended_ because of security reasons. If we believe that it is a terrible idea to run an old protocol, we can make it _recommended_ for relays or clients or both. We should not do this lightly, since it will be annoying. 3. Requiring features (for relays) We regularly update the directory authorities to require relays to run certain versions of Tor or later. We generally do this after a short outreach campaign to get as many relays as possible to upgrade. We MAY make a feature required for relays one month after every version without it is obsolete and unsupported, though it is better to wait three months if possible. We SHOULD make a feature required for relays within 12 months after every version without it is obsolete and unsupported. 4. 
Requiring features (for clients) Clients take the longest time to update, and are often the least able to fetch upgrades. Because of this, we should be very careful about making subprotocol versions required on clients, and should only do so for fairly compelling reasons. We SHOULD NOT make a feature required for clients until it has been _recommended_ for clients for at least 9 months. We SHOULD make a feature required for clients if it has been _recommended_ for clients for at least 18 months.
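The timing rules in sections 2 through 4 can be sketched as a small date calculation. This is only an illustration, not part of the policy: the function names, the dictionary keys, and the input date are hypothetical, and "within a month or so" is modelled as exactly one month.

```python
from datetime import date


def add_months(d, months):
    """Shift a date forward by whole months, clamping the day to 28."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def protover_schedule(unsupported_since):
    """Return the policy milestones for one subprotocol version.

    `unsupported_since` is the (hypothetical) date on which the last
    release series lacking the subprotocol version became unsupported.
    """
    # Section 2: recommended "within a month or so" (assumed: one month).
    recommended = add_months(unsupported_since, 1)
    return {
        "recommended": recommended,
        # Section 3: MAY require after one month, better after three,
        # SHOULD require within twelve.
        "relay_required_earliest": add_months(unsupported_since, 1),
        "relay_required_preferred": add_months(unsupported_since, 3),
        "relay_required_deadline": add_months(unsupported_since, 12),
        # Section 4: counted from the date the feature became recommended.
        "client_required_earliest": add_months(recommended, 9),
        "client_required_expected": add_months(recommended, 18),
    }


schedule = protover_schedule(date(2020, 1, 1))  # made-up end-of-support date
```

Note that the client milestones are counted from the recommendation date, while the relay milestones are counted from the end of support for the older release series.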
Filename: 304-socks5-extending-hs-error-codes.txt Title: Extending SOCKS5 Onion Service Error Codes Author: David Goulet, George Kadianakis Created: 22-May-2019 Status: Closed Note: We are extending SOCKS5 here but in terms, when Tor Browser supports HTTPCONNECT, we should not do that anymore. 0. Abstract We propose extending the SOCKS5 protocol to allow returning more meaningful response failure onion service codes back to the client. This is inspired by proposal 229 [PROP229] minus the new authentication method. 1. Introduction The motivation behind this proposal is because we need a synchronous way to return a reason on why the SOCKS5 connection failed. The alternative is to use a control port event but then the caller needs to match the SOCKS failure to the control event. And tor provides no guarantee that a control event will be emitted before the SOCKS failure or vice versa. With this proposal, the client can get the reason on why the onion service connection failed with the SOCKS5 returned error code. 2. Proposal 2.1. New SocksPort Flag In order to have backward compatibility with third party applications that do not support the new proposed SOCKS5 error code, we propose a new SocksPort flag that needs to be set in the tor configuration file in order for those code to be sent back. The new SocksPort flag is: "ExtendedErrors" -- Tor will report new SOCKS5 error code detailed below in section 2.2 (once merged, they will end up in socks-extension.txt). It is possible that more codes will be added in the future so an application using this flag should possibly expect unknown codes to be returned. 2.2. Onion Service Extended SOCKS5 Error Code We introduce the following additional SOCKS5 reply codes to be sent in the REP field of a SOCKS5 message iff the "ExtendedErrors" on the SocksPort is set (see section 2.1 above). 
The SOCKS5 specification [RFC1928] defines a range of codes that are "unassigned", so we'll be using those on the far end of the range in order to inform the client of onion service failures:

Where:

   * X'F0' Onion Service Descriptor Can Not be Found

     The requested onion service descriptor can't be found on the hashring and is thus not reachable by the client.

   * X'F1' Onion Service Descriptor Is Invalid

     The requested onion service descriptor can't be parsed or signature validation failed.

   * X'F2' Onion Service Introduction Failed

     Client failed to introduce to the service, meaning the descriptor was found but the service is no longer at the introduction points. The service has likely changed its descriptor or is not running.

   * X'F3' Onion Service Rendezvous Failed

     Client failed to rendezvous with the service, which means that the client is unable to finalize the connection.

   * X'F4' Onion Service Missing Client Authorization

     Tor was able to download the requested onion service descriptor but is unable to decrypt its content because it is missing client authorization information for it.

   * X'F5' Onion Service Wrong Client Authorization

     Tor was able to download the requested onion service descriptor but is unable to decrypt its content using the client authorization information it has. This means the client's access was revoked.

3. Compatibility

No new field or extension has been added. Only new code values from the unassigned range are being used. We expect these to not be a problem for backward compatibility.

These codes are only sent back if the new proposed SocksPort flag, "ExtendedErrors", is set, which makes backward and forward compatibility easier.

References:

[PROP229] https://gitweb.torproject.org/torspec.git/tree/proposals/229-further-socks5-extensions.txt
[RFC1928] https://www.ietf.org/rfc/rfc1928.txt
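As an illustration of how an application might consume the codes from section 2.2, here is a minimal sketch that maps the REP byte of a SOCKS5 reply to a message. The table and function names are hypothetical. It covers only the success code and the six extended codes, and treats everything else as unknown, in line with the advice in section 2.1 that applications should expect codes they do not recognise.

```python
# REP values proposed in section 2.2; the names are illustrative only.
ONION_SOCKS5_ERRORS = {
    0xF0: "onion service descriptor can not be found",
    0xF1: "onion service descriptor is invalid",
    0xF2: "onion service introduction failed",
    0xF3: "onion service rendezvous failed",
    0xF4: "onion service missing client authorization",
    0xF5: "onion service wrong client authorization",
}


def describe_socks5_reply(reply):
    """Describe the REP field of a raw SOCKS5 reply (RFC 1928, section 6).

    The reply starts with VER (1 byte, always 0x05) followed by REP (1 byte).
    """
    if len(reply) < 2 or reply[0] != 0x05:
        raise ValueError("not a SOCKS5 reply")
    rep = reply[1]
    if rep == 0x00:
        return "succeeded"
    # Standard RFC 1928 failure codes and any future extended codes
    # fall through to the generic branch in this sketch.
    return ONION_SOCKS5_ERRORS.get(rep, "unknown reply code 0x%02X" % rep)
```

A caller would only see the 0xF0 to 0xF5 values when the "ExtendedErrors" flag is set on the SocksPort it is talking to.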
Filename: 305-establish-intro-dos-defense-extention.txt
Title: ESTABLISH_INTRO Cell DoS Defense Extension
Author: David Goulet, George Kadianakis
Created: 06-June-2019
Status: Closed

0. Abstract

We propose introducing a new cell extension to the onion service version 3 ESTABLISH_INTRO cell in order for a service operator to send directives to the introduction point.

1. Introduction

The idea behind this proposal is to provide a way for a service operator to give to the introduction points Denial of Service (DoS) defense parameters through the ESTABLISH_INTRO cell.

We are currently developing onion service DoS defenses at the introduction point layer which for now has consensus parameter values for the defenses' knobs. This proposal would allow the service operator more flexibility for tuning these knobs and/or future parameters.

2. ESTABLISH_INTRO Cell DoS Extension

We introduce a new extension to the ESTABLISH_INTRO cell. The EXTENSIONS field will be leveraged and a new protover will be introduced to reflect that change.

As a reminder, this is the content of an ESTABLISH_INTRO cell (taken from rend-spec-v3.txt section 3.1.1):

   AUTH_KEY_TYPE    [1 byte]
   AUTH_KEY_LEN     [2 bytes]
   AUTH_KEY         [AUTH_KEY_LEN bytes]
   N_EXTENSIONS     [1 byte]
   N_EXTENSIONS times:
     EXT_FIELD_TYPE [1 byte]
     EXT_FIELD_LEN  [1 byte]
     EXT_FIELD      [EXT_FIELD_LEN bytes]
   HANDSHAKE_AUTH   [MAC_LEN bytes]
   SIG_LEN          [2 bytes]
   SIG              [SIG_LEN bytes]

We propose a new EXT_FIELD_TYPE value:

   [01] -- DOS_PARAMETERS.

           If this flag is set, the extension should be used by the introduction point to learn what values the denial of service subsystem should be using.

The EXT_FIELD content format is:

   N_PARAMS    [1 byte]
   N_PARAMS times:
     PARAM_TYPE  [1 byte]
     PARAM_VALUE [8 bytes]

The PARAM_TYPE proposed values are:

   [01] -- DOS_INTRODUCE2_RATE_PER_SEC
           The rate per second of INTRODUCE2 cells relayed to the service.

   [02] -- DOS_INTRODUCE2_BURST_PER_SEC
           The burst per second of INTRODUCE2 cells relayed to the service.
The PARAM_VALUE size is 8 bytes in order to accommodate 64-bit values (uint64_t). It MUST match the specified limit for the following PARAM_TYPE:

   [01] -- Min: 0, Max: 2147483647
   [02] -- Min: 0, Max: 2147483647

A value of 0 means the defense is disabled. If the rate per second is set to 0 (param 0x01) then the burst value should be ignored. And vice-versa, if the burst value is 0 (param 0x02), then the rate value should be ignored. In other words, setting one single parameter to 0 disables the INTRODUCE2 rate limiting defense.

The burst can NOT be smaller than the rate. If so, the parameters should be ignored by the introduction point.

The maximum is set to INT32_MAX, meaning (2^31 - 1). Our consensus parameters are capped to that limit, and these parameters happen to also be consensus parameters, hence the common limit.

Any valid value does have precedence over the network wide consensus parameter.

This will increase the payload size by 21 bytes:

   This extension type and length is 2 extra bytes, the N_EXTENSIONS field is always present and currently set to 0.

   Then the EXT_FIELD is 19 bytes because one parameter is 9 bytes so for two parameters, it is 18 bytes plus 1 byte for the N_PARAMS for a total of 19.

The ESTABLISH_INTRO v3 cell currently uses 134 bytes for its payload. With this increase, 343 bytes remain unused (498 maximum payload size minus 155 bytes new payload).

3. Protocol Version

We introduce a new protocol version in order for onion service that wants to specifically select introduction points supporting this new extension. But also, it should be used to know when to send this extension or not.

The new version for the "HSIntro" protocol is:

   "5" -- support ESTABLISH_INTRO cell DoS parameters extension for onion service version 3 only.

4. Configuration Options

We also propose new torrc options in order for the operator to control those values passed through the ESTABLISH_INTRO cell.
"HiddenServiceEnableIntroDoSDefense 0|1" If this option is set to 1, the onion service will always send to an introduction point, supporting this extension (using protover), the denial of service defense parameters regardless if the consensus enables them or not. The values are taken from HiddenServiceEnableIntroDoSRatePerSec and HiddenServiceEnableIntroDoSBurstPerSec torrc option. (Default: 0) "HiddenServiceEnableIntroDoSRatePerSec N sec" Controls the introduce rate per second the introduction point should impose on the introduction circuit. The default values are only used if the consensus param is not set. (Default: 25, Min: 0, Max: 4294967295) "HiddenServiceEnableIntroDoSBurstPerSec N sec" Controls the introduce burst per second the introduction point should impose on the introduction circuit. The default values are only used if the consensus param is not set. (Default: 200, Min: 0, Max: 4294967295) They respectively control the parameter type 0x01 and 0x02 in the ESTABLISH_INTRO cell detailed in section 2. The default values of the rate and burst are taken from ongoing anti-DoS implementation work [1][2]. They aren't meant to be defined with this proposal. 5. Security Considerations Using this new extension leaks to the introduction point the service's tor version. This could in theory help any kind of de-anonymization attack on a service since at first it partitions it in a very small group of running tor. Furthermore, when the first tor version supporting this extension will be released, very few introduction points will be updated to that version. Which means that we could end up in a situation where many services want to use this feature and thus will only select a very small subset of relays supporting it overloading them but also making it an easier vector for an attacker that whishes to be the service introduction point. 
For the above reasons, we propose a new consensus parameter that will provide a "go ahead" for all service out there to start using this extension only if the introduction point supports it. "HiddenServiceEnableIntroDoSDefense" If set to 1, this makes tor start using this new proposed extension if available by the introduction point (looking at the new protover). This parameter should be switched on when a majority of relays have upgraded to a tor version that supports this extension for which we believe will also give enough time for most services to move to this new stable version making the anonymity set much bigger. We believe that there are services that do not care about anonymity on the service side and thus could benefit from this feature right away if they wish to use it. 5. Discussions One possible new avenue to explore is for the introduction point to send back a new type of cell which would tell the service that the DoS defenses have been triggered. It could include some statistics in the cell which can ultimately be reported back to the service operator to use those for better decisions for the parameters. But also for the operator to be noticed that their service is under attack or very popular which could mean time to increase or disable the denial of service defenses. A. Acknowledgements This research was supported by NSF grants CNS-1526306 and CNS-1619454. References: [1] https://lists.torproject.org/pipermail/tor-dev/2019-May/013837.html [2] https://trac.torproject.org/15516
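The DOS_PARAMETERS extension described in section 2 can be sketched as follows. This is an illustration under stated assumptions, not the tor implementation: the function names are hypothetical, integers are assumed to be encoded in network (big-endian) byte order, and the zero check is applied before the burst-versus-rate check, which is one possible reading of the rules in section 2.

```python
import struct

DOS_PARAMETERS = 0x01                # EXT_FIELD_TYPE
DOS_INTRODUCE2_RATE_PER_SEC = 0x01   # PARAM_TYPE
DOS_INTRODUCE2_BURST_PER_SEC = 0x02  # PARAM_TYPE
INT32_MAX = 2**31 - 1


def encode_dos_extension(rate, burst):
    """Build EXT_FIELD_TYPE | EXT_FIELD_LEN | EXT_FIELD for the two params."""
    for value in (rate, burst):
        if not 0 <= value <= INT32_MAX:
            raise ValueError("parameter out of range: %r" % (value,))
    ext_field = struct.pack(">B", 2)  # N_PARAMS
    # Each parameter is a 1-byte type plus an 8-byte value (9 bytes).
    ext_field += struct.pack(">BQ", DOS_INTRODUCE2_RATE_PER_SEC, rate)
    ext_field += struct.pack(">BQ", DOS_INTRODUCE2_BURST_PER_SEC, burst)
    return struct.pack(">BB", DOS_PARAMETERS, len(ext_field)) + ext_field


def intro_point_decision(rate, burst):
    """Return 'apply', 'disabled', or 'ignore' for received parameters."""
    if not (0 <= rate <= INT32_MAX and 0 <= burst <= INT32_MAX):
        return "ignore"
    if rate == 0 or burst == 0:
        return "disabled"  # a single zero disables the defense
    if burst < rate:
        return "ignore"    # burst can not be smaller than the rate
    return "apply"


extension = encode_dos_extension(25, 200)  # defaults quoted in section 4
```

The encoded extension is 21 bytes long, matching the payload increase computed in section 2: 2 bytes of type and length, 1 byte of N_PARAMS, and two 9-byte parameters.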
Filename: 306-ipv6-happy-eyeballs.txt Title: A Tor Implementation of IPv6 Happy Eyeballs Author: Neel Chauhan Created: 25-Jun-2019 Supercedes: 299 Status: Open Ticket: https://trac.torproject.org/projects/tor/ticket/29801 1. Introduction As IPv4 address space becomes scarce, ISPs and organizations will deploy IPv6 in their networks. Right now, Tor clients connect to entry nodes using IPv4 connectivity by default. When networks first transition to IPv6, both IPv4 and IPv6 will be enabled on most networks in a so-called "dual-stack" configuration. This is to not break existing IPv4-only applications while enabling IPv6 connectivity. However, IPv6 connectivity may be unreliable and clients should be able to connect to the entry node using the most reliable technology, whether IPv4 or IPv6. In ticket #27490, we introduced the option ClientAutoIPv6ORPort which lets a client randomly choose between IPv4 or IPv6. However, this random decision does not take into account unreliable connectivity or falling back to the alternate IP version should one be unreliable or unavailable. One way to select between IPv4 and IPv6 on a dual-stack network is a so-called "Happy Eyeballs" algorithm as per RFC 8305. In one, a client attempts the preferred IP family, whether IPv4 or IPv6. Should it work, the client sticks with the preferred IP family. Otherwise, the client attempts the alternate version. This means if a dual-stack client has both IPv4 and IPv6, and IPv6 is unreliable, preferred or not, the client uses IPv4, and vice versa. However, if IPv4 and IPv6 are both equally reliable, and IPv6 is preferred, we use IPv6. In Proposal 299, we have attempted an IP fallback mechanism using failure counters and preferring IPv4 and IPv6 based on the state of the counters. However, Prop299 was not standard Happy Eyeballs and an alternative, standards-compliant proposal was requested in [P299-TRAC] to avoid issues from complexity caused by randomness.
This proposal describes a Tor implementation of Happy Eyeballs and is intended as a successor to Proposal 299. 2. Address/Relay Selection This section describes the necessary changes for address selection to implement Prop306. 2.1. Address Handling Changes To be able to handle Happy Eyeballs in Tor, we will need to modify the data structures used for connections to entry nodes, namely the extend info structure. Entry nodes are usually guards, but some clients don't use guards: * Bootstrapping clients can connect to fallback directory mirrors or authorities * v3 single onion services can use IPv4 or IPv6 addresses to connect to introduction and rendezvous points, and * Clients can be configured to disable entry guards Bridges are out of scope for this proposal, because Tor does not support multiple IP addresses in a single bridge line. The extend info structure should contain both an IPv4 and an IPv6 address. This will allow us to try IPv4 and the IPv6 addresses should both be available on a relay and the client is dual-stack. When processing: * relay descriptors, * hard-coded authority and fallback directory lists, * onion service descriptors, or * onion service introduce cells, and filling in the extend info data structure, we need to fill in both the IPv4 and IPv6 address if both are available. If only one family is available for a relay (IPv4 or IPv6), we should leave the other family null. 2.2 Bootstrap Changes Tor's hard-coded authority and fallback directory mirror lists contain some entries with IPv6 ORPorts. As of January 2020, 56% of authorities and 47% of fallback directories have IPv6. During bootstrapping, we should have an option for the maximum number of IPv4-only nodes, before the next node must have an IPv6 ORPort. The parameter is as follows: * MaxNumIPv4BootstrapAttempts NUM During bootstrap, the minimum fraction of nodes with IPv6 ORPorts will be 1/(1 + MaxNumIPv4BootstrapAttempts). 
And the average fraction will be larger than both minimum fraction, and the actual proportion of IPv6 ORPorts in the fallback directory list. (Clients mainly use fallback directories for bootstrapping.) Since this option is used during bootstrapping, it can not have a corresponding consensus parameter. The default value for MaxNumIPv4BootstrapAttempts should be 2. This means that every third bootstrap node must have an IPv6 ORPort. And on average, just over half of bootstrap nodes chosen by clients will have an IPv6 ORPort. This change won't have much impact on load-balancing, because almost half the fallback directory mirrors have IPv6 ORPorts. The minimum value of MaxNumIPv4BootstrapAttempts is 0. (Every bootstrap node must have an IPv6 ORPort. This setting is equivalent to ClientPreferIPv6ORPort 1.) The maximum value of MaxNumIPv4BootstrapAttempts should be 100. (Since most clients only make a few bootstrap connections, bootstrap nodes will be chosen at random, regardless of their IPv6 ORPorts.) 2.3. Guard Selection Changes When we select guard candidates, we should have an option for the number of primary IPv6 entry guards. The parameter is as follows: * NumIPv6Guards NUM If UseEntryGuards is set to 1, we will select exactly this number of IPv6 relays for our primary guard list, which is the set of relays we strongly prefer when connecting to the Tor network. (This number should also apply to all of Tor's other guard lists, scaled up based on the relative size of the list.) If NUM is -1, we try to learn the number from the NumIPv6Guards consensus parameter. If the consensus parameter isn't set, we should default to 1. The default value for NumIPv6Guards should be -1. (Use the consensus parameter, or the underlying default value of 1.) As of September 2019, approximately 20% of Tor's guards supported IPv6, by consensus weight. (Excluding exits that are also guards, because clients avoid choosing exits in their guard lists.) 
If all Tor clients implement NumIPv6Guards, then these 20% of guards will handle approximately 33% of Tor's traffic. (Because the default value of NumPrimaryGuards is 3.) This may have a significant impact on Tor's load-balancing. Therefore, we should deploy this feature gradually, and try to increase the number of relays that support IPv6 to at least 33%. To minimise the impact on load-balancing, IPv6 support should only be required for exactly NumIPv6Guards during guard list selection. All other guards should be IPv4-only guards. Once approximately 50% of guards support IPv6, NumIPv6Guards can become a minimum requirement, rather than an exact requirement. The minimum configurable value of NumIPv6Guards is -1. (Use the consensus parameter, or the underlying default.) The minimum resulting value of NumIPv6Guards is 0. (Guards will be chosen at random, regardless of their IPv6 ORPorts.) The maximum value of NumIPv6Guards should be the configured value of NumPrimaryGuards. (Every guard must have an IPv6 ORPort. This setting is equivalent to ClientPreferIPv6ORPort 1.) 3. Relay Connections If there is an existing authenticated connection, we must use it similar to how we used it pre-Prop306. If there is no existing authenticated connection for an entry node, tor currently attempts to connect using the first available, allowed, and preferred address. (Determined using the existing Client IPv4 and IPv6 options.) We should also allow falling back to the alternate address. For this, a design will be given in Section 3.1. 3.1. TCP Connection to Preferred Address On First TCP Success In this design, we will connect via TCP to the first preferred address. On a failure or after a 250 msec delay, we attempt to connect via TCP to the alternate address. On a success, Tor attempts to authenticate and closes the other connection. This design is close to RFC 8305 and is similar to how Happy Eyeballs is implemented in a web browser. 3.2. 
Handling Connection Successes And Failures Should a connection to an entry node succeed and be authenticated via TLS, we can then use the connection. In this case, we should cancel all other connection timers and in-progress connections. Cancelling the timers is necessary so we don't attempt new unnecessary connections when our existing connection is successful, preventing denial-of-service risks. However, if we fail all available and allowed connections, we should tell the rest of Tor that the connection has failed. This is so we can attempt another entry node. 3.3. Connection Attempt Delays As mentioned in [TEOR-P306-REP], initially, clients should prefer IPv4 by default. The Connection Attempt Delay, or delay between IPv4 and IPv6 connections should be 250 msec. This is to avoid the overhead from tunneled IPv6 connections. The Connection Attempt Delay should not be dynamically adjusted, as it adds privacy risks. This value should be fixed, and could be manually adjusted using this torrc option or consensus parameter: * ConnectionAttemptDelay N [msec|second] The Minimum and Maximum Connection Attempt Delays should also not be dynamically adjusted for privacy reasons. The Minimum should be fixed at 10 msec as per RFC 8305. But the maximum should be higher than the RFC 8305 recommendation of 2 seconds. For Tor, we should make this timeout value 30 seconds to match Tor's existing timeout. We need to make it possible for users to set the Maximum Connection Attempt Delay value higher for slower and higher-latency networks such as dial-up and satellite. 4. Option Changes As we enable IPv6-enabled clients to connect out of the box, we should adjust the default options to enable IPv6 while not breaking IPv4-only clients.
The new default options should be: * ClientUseIPv4 1 (to enable IPv4) * ClientUseIPv6 1 (to enable IPv6) * ClientPreferIPv6ORPort 0 (for load-balancing reasons so we don't overload IPv6-only guards) * ConnectionAttemptDelay 250 msec (the recommended delay between IPv4 and IPv6, as per RFC 8305) One thing to note is that clients should be able to connect with the above options on IPv4-only, dual-stack, and IPv6-only networks, and they should also work if ClientPreferIPv6ORPort is 1. But we shouldn't expect IPv4 or IPv6 to work if ClientUseIPv4 or ClientUseIPv6 is set to 0. When the majority of clients and relay are IPv6-capable, we could set the default value of ClientPreferIPv6ORPort to 1, in order to take advantage of IPv6. We could add a ClientPreferIPv6ORPort consensus parameter, so we can make this change network-wide. 5. Relay Statistics Entry nodes could measure the following statistics for both IPv4 and IPv6: * Number of successful connections * Number of extra Prop306 connections (unsuccessful or cancelled) * Client closes the connection before completing TLS * Client closes the connection before sending any circuit or data cells * Number of client and relay connections * We can distinguish between authenticated (relay, authority reachability) and unauthenticated (client, bridge) connections Should we implement Section 5: * We can send this information to the directory authorities using relay extra-info descriptors * We should consider the privacy implications of these statistics, and how much noise we need to add to them * We can include these statistics in the Heartbeat logs 6. Initial Feasibility Testing We should test this proposal with the following scenarios: * Different combinations of values for the options ClientUseIPv4, ClientUseIPv6, and ClientPreferIPv6ORPort on IPv4-only, IPv6-only, and dual-stack connections * Dual-stack connections of different technologies, including high-bandwidth and low-latency (e.g. 
FTTH), moderate-bandwidth and moderate-latency (e.g. DSL, LTE), and high-latency and low-bandwidth (e.g. satellite, dial-up) to see if Prop306 is reliable and feasible 7. Minimum Viable Prop306 Product The minimum viable product for Prop306 must include the following: * The address handling, bootstrap, and entry guard changes described in Section 2. (Single Onion Services are optional, Bridge Clients are out of scope. The consensus parameter and torrc options are optional.) * The alternative address retry algorithm in Section 3.1. * The Connection Success/Failure mechanism in Section 3.2. * The Connection Delay mechanism in Section 3.3. (The ConnectionAttemptDelay torrc option and consensus parameter are optional.) * A default setup capable of both IPv4 and IPv6 connections with the options described in Section 4. (The ClientPreferIPv6ORPort consensus parameter is optional.) 8. Optional Features Some features which are optional include: * Single Onion services: extend info address changes for onion service descriptors and introduce cells. (Section 2.1.) * Bridge clients are out of scope: they would require bridge line format changes, internal bridge data structure changes, and extend info address changes. (Section 2.1.) * MaxNumIPv4BootstrapAttempts torrc option. We may need this option if the proposed default doesn't work for some clients. (Section 2.2.) * NumIPv6Guards torrc option and consensus parameter. We may need this option if the proposed default doesn't work for some clients. (Section 2.3.) * ConnectionAttemptDelay torrc option and consensus parameter. We will need this option if the Connection Attempt Delay needs to be manually adjusted, for instance, if clients often fail IPv6 connections. (Section 3.3.) * ClientPreferIPv6ORPort consensus parameter. (Section 4.) * IPv4, IPv6, client, relay, and extra Prop306 connection statistics. While optional, these statistics may be useful for debugging and reliability testing, and metrics on IPv4 vs IPv6.
(Section 5.) 9. Acknowledgments Thank you so much to teor for your discussion on this happy eyeballs proposal. I wouldn't have been able to do this had it not been for your help. 10. References [P299-TRAC]: https://trac.torproject.org/projects/tor/ticket/29801 [TEOR-P306-REP]: https://lists.torproject.org/pipermail/tor-dev/2019-July/013919.html
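The connection strategy of Sections 3.1 and 3.2 can be sketched with asyncio. This is a simplified illustration, not tor's design: `happy_eyeballs` and the `connect` callback are hypothetical names, `connect` stands in for "open a TCP connection and authenticate", and the sketch handles only one preferred and one alternate address.

```python
import asyncio


async def happy_eyeballs(connect, preferred, alternate, delay=0.25):
    """Try `preferred`; on failure or after `delay` seconds also try `alternate`.

    `connect` is a hypothetical coroutine function that takes an address and
    returns a connection object, or raises OSError on failure.
    Returns (address, connection) for the first attempt that succeeds.
    """
    first = asyncio.ensure_future(connect(preferred))
    pending = {first: preferred}
    # Give the preferred address a head start of `delay` seconds.
    await asyncio.wait([first], timeout=delay)
    if first.done():
        if first.exception() is None:
            return preferred, first.result()
        del pending[first]  # preferred failed before the delay expired
    second = asyncio.ensure_future(connect(alternate))
    pending[second] = alternate
    while pending:
        done, _ = await asyncio.wait(
            list(pending), return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            address = pending.pop(task)
            if task.exception() is None:
                # Section 3.2: cancel the other in-progress attempt.
                for other in pending:
                    other.cancel()
                return address, task.result()
    # Every available address failed; the caller can try another entry node.
    raise OSError("all connection attempts failed")
```

The default `delay` of 0.25 seconds mirrors the 250 msec Connection Attempt Delay from Section 3.3. A caller would run it with something like `asyncio.run(happy_eyeballs(connect, ipv4_addr, ipv6_addr))`, where both addresses come from the extend info structure described in Section 2.1.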
Filename: 307-onionbalance-v3.txt Title: Onion Balance Support for Onion Service v3 Author: Nick Mathewson Created: 03-April-2019 Status: Reserve [This proposal is currently in reserve status because bug tor#29583 makes it unnecessary. (2020 July 31)] 0. Draft Notes 2019-07-25: At this point in time, the cross-certification is not implemented correctly in >= tor-0.3.2.1-alpha. See https://trac.torproject.org/29583 for more details. This proposal assumes that this bug is fixed. 1. Introduction The OnionBalance tool allows several independent Tor instances to host an onion service, while clients can access that onion service without having to take its distributed status into account. OnionBalance works by having each instance run a separate onion service. Then, a management server periodically downloads the descriptors from those onion services, and generates a new descriptor containing the introduction points from each instance's onion service. OnionBalance is used by several high-profile onion services, including Facebook and The Tor Project. Unfortunately, because of the cross-certification features in v3 onion services, OnionBalance no longer works for them. To a certain extent, this breakage is because of a security improvement: It's probably a good thing that random third parties can no longer grab an onion service's introduction points and claim that they are introduction points for a different service. But nonetheless, a lack of a working OnionBalance remains an obstacle for v3 onion service migration. This proposal describes extensions to v3 onion service design to accommodate OnionBalance. 2. Background and Solution If an OnionBalance management server wants to provide an aggregate descriptor for a v3 onion service, it faces several obstacles that it didn't have in v2.
When the management server goes to construct an aggregated descriptor, it will have a mismatch on the "auth-key", "enc-key-cert", and "legacy-key-cert" fields: these fields are supposed to certify the onion service's current descriptor-signing key, but each of these keys will be generated independently by each instance. Because they won't match each other, there is no possible key that the aggregated descriptor could use for its descriptor signing key. In this design, we require that each instance should know in advance about a descriptor-signing public key that the aggregate descriptor will use for each time period. (I'll explain how they can do this later, in section 3 below.) They don't have to know the corresponding private key. When generating their own onion service descriptors for a given time period, the instances generate these additional fields to be used for the aggregate descriptor: "meta-auth-key" "meta-enc-key-cert" "meta-legacy-key-cert" These fields correspond to "auth-key", "enc-key-cert", and "legacy-key-cert" respectively, but differ in one regard: the descriptor-signing public key that they certify is _not_ the instance's own descriptor-signing key, but rather the aggregate public key for the time period. Ordinary clients ignore these new fields. When the management server creates the aggregate descriptor, it checks that the signing key for each of these "meta" fields matches the signing key for its corresponding non-"meta" field, and that they certify the correct descriptor-signing key-- and then uses these fields in place of their corresponding non-"meta" variants. 2.1. A quick note on synchronization In the design above, and in the section below, I frequently refer to "the current time period". By this, I mean the time period for which the descriptor is encoded, not the time period in which it is generated. 
Instances and management servers should generate descriptors for the two closest time periods, as they do today: no additional synchronization should be needed here. 3. How to distribute descriptor-signing keys The design requires that every instance of the onion service knows about the public descriptor-signing key that will be used for the aggregate onion service. Here I'll discuss how this can be achieved. 3.1. If the instances are trusted. If the management server trusts each of the instances, it can distribute a shared secret to each one of them, and use this shared secret to derive each time period's private key. For example, if the shared secret is SK, then the private descriptor- signing key for each time period could be derived as: H("meta-descriptor-signing-key-deriv" | onion_service_identity | INT_8(period_num) | INT_8(period_length) | SK ) (Remember that in the terminology of rend-spec-v3, INT_8() denotes a 64-bit integer, see section 0.2 in rend-spec-v3.txt.) If the shared secret is ever compromised, then an attacker can impersonate the onion service until the shared secret is changed, and can correlate all past descriptors for the onion service. 3.2. If the instances are not trusted: Option One If the management server does not trust the instances with descriptor-signing public keys, another option for it is to simply distribute a load of public keys in advance, and use them according to a schedule. In this design, the management server would pre-generate the "descriptor-signing-key-cert" fields for a long time in advance, and distribute them to the instances offline. Each one would be associated with its corresponding time period. If these certificates were revealed to an attacker, the attacker could correlate descriptors for the onion service with one another, but could not impersonate the service. 3.3.
If the instances are not trusted: Option Two Another option for the trust model of 3.2 above is to use the same key-blinding method as used for v3 onion services. The management server would hold a private descriptor-signing key, and use it to derive a different private descriptor-signing key for each time period. The instance servers would hold the corresponding public key, and use it to derive a different public descriptor-signing key for each time period. (For security, the key-blinding function in this case should use a different nonce than used in the) This design would allow the instances to only be configured once, which would be simpler than 3.2 above-- but at a cost. The management server's use of a long-term private descriptor-signing key would require it to keep that key online. (It could keep the derived private descriptor-signing keys online, but the parent key could be derived from them.) Here, if the instance's knowledge were revealed to an attacker, the attacker could correlate descriptors for the onion service with one another, but could not impersonate the service. 4. Some features of this proposal We retain the property that each instance service remains accessible as a working onion service. However, anyone who can access it can identify it as an instance of an OnionBalance service, and correlate its descriptor to the aggregate descriptor. Instances could use client authorization to ensure that only the management server can decrypt their introduction points. However, because of the key-blinding features of v3 onion services, nobody who doesn't know the onion addresses for the instances can access them anyway: It would be sufficient to keep these addresses secret. Although anybody who successfully accesses an instance can correlate its descriptor to the meta-descriptor, this only works for two descriptors within a single time period: You can't match an instance descriptor from one time period to a meta-descriptor from another. A.
Acknowledgments Thanks to the network team for helping me clarify my ideas here, explore options, and better understand some of the implementations and challenges in this problem space. This research was supported by NSF grants CNS-1526306 and CNS-1619454.
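As an illustration of the derivation in section 3.1 of the proposal above, here is a minimal Python sketch. It assumes that H is SHA3-256 and that INT_8() is an 8-byte big-endian encoding, following the conventions of rend-spec-v3; the function names are hypothetical. The proposal does not say how the 32-byte output is turned into an actual signing key, so the sketch stops at the hash output.

```python
import hashlib


def int_8(value: int) -> bytes:
    """Encode an unsigned integer as 8 bytes, big-endian (assumed INT_8 encoding)."""
    return value.to_bytes(8, "big")


def derive_period_key_material(onion_service_identity: bytes,
                               period_num: int,
                               period_length: int,
                               shared_secret: bytes) -> bytes:
    """Hash the fields of section 3.1 in order; H is assumed to be SHA3-256."""
    h = hashlib.sha3_256()
    h.update(b"meta-descriptor-signing-key-deriv")
    h.update(onion_service_identity)
    h.update(int_8(period_num))
    h.update(int_8(period_length))
    h.update(shared_secret)
    return h.digest()
```

Because the period number is an input, every instance that holds SK can compute the same per-period value without contacting the management server, and two different periods yield unrelated outputs.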
Filename: 308-counter-galois-onion.txt
Title: Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
Authors: Jean Paul Degabriele, Alessandro Melloni, Martijn Stam
Created: 13 Sep 2019
Last-Modified: 13 Sep 2019
Status: Superseded

NOTE: This proposal is superseded by an improved version of the Counter Galois Onion design based on the authors' forthcoming paper, "Counter Galois Onion: Fast, Forward-Secure, and Non-Malleable Onion Encryption for Tor". The improved proposal will be publicly available once the paper is closer to being ready for publication. -nickm

1. Background and Motivation

In Proposal 202, Mathewson expressed the need to update Tor's Relay cryptography and protect against tagging attacks. Towards this goal he outlined two possible approaches for constructing an onion encryption scheme that should be able to withstand tagging attacks. Later, in Proposal 261, Mathewson proposed a concrete scheme based on the tweakable wide-block cipher AEZ. The security of Proposal 261 was analysed in [DS18]. An alternative scheme was suggested in Proposal 295 which combines an instantiation of the PIV construction from [ST13] and a variant of the GCM-RUP construction from [ADL17].

In this document we propose yet another scheme, Counter Galois Onion (CGO), which improves over proposals 261 and 295 in a number of ways. CGO has a minimalistic design requiring only a block cipher in counter-mode and a universal hash function. To take advantage of Intel's AES-NI and PCLMULQDQ instructions we recommend using AES and POLYVAL [GLL18]. In terms of security, it protects against tagging attacks while simultaneously providing forward security with respect to end-to-end authenticity and confidentiality. Furthermore CGO performs better than proposal 295 in terms of efficiency and its support of "leaky pipes".

1.2 Design Overview

CGO makes do with a universal hash function while simultaneously satisfying forward security.
It employs two distinct types of encryption, a dynamic encryption scheme DEnc and a static encryption scheme SEnc. DEnc is used for end-to-end encryption (layer n) and SEnc is used for the intermediate layers (n-1 to 1). DEnc is a Forward-Secure Authenticated Encryption scheme for securing end-to-end communication and SEnc provides the non-malleability for protecting against tagging attacks. In order to provide forward security, the key material in DEnc is updated with every encryption whereas in SEnc the key material is static. To support leaky pipes, in the forward direction each OR first attempts a partial decryption using DEnc and if it fails it reverts to decrypting using SEnc.

The rest of the document describes the scheme's operation in terms of the low-level primitives and we make no further mention of DEnc and SEnc. However, on an intuitive level it can be helpful to think of:

  a) the combinations of E(KSf_I, *) and PH(HSf_I, *) as well as E(KDf_I, *) and PH(HDf_I, *) as two instances of a tweakable block cipher,

  b) the operation E(Sf_I, <0>) | E(Sf_I, <1>) | E(Sf_I, <2>) | ... as a PRG with seed Sf_I,

  c) and E(JSf_I, <IV>) | E(JSf_I, <IV+1>) | ... | E(JSf_I, <IV+31>) as counter-mode encryption with <IV> as the initial vector.

2. Preliminaries

2.1.
Notation

  Symbol      Meaning
  ------      -------
  M           Plaintext
  Sf_I        PRG Seed, forward direction, layer I
  Sb_I        PRG Seed, backward direction, layer I
  Cf_I        Ciphertext, forward direction, layer I
  Cb_I        Ciphertext, backward direction, layer I
  Tf_I        Tag, forward direction, layer I
  LTf_I       Last Tag, forward direction, layer I
  Tb_I        Tag, backward direction, layer I
  LTb_I       Last Tag, backward direction, layer I
  Nf_I        Nonce, forward direction, layer I
  LNf_I       Last Nonce, forward direction, layer I
  Nb_I        Nonce, backward direction, layer I
  LNb_I       Last Nonce, backward direction, layer I
  JSf_I       Static Block Cipher Key, forward direction, layer I
  JSb_I       Static Block Cipher Key, backward direction, layer I
  KSf_I       Static Block Cipher Key, forward direction, layer I
  KSb_I       Static Block Cipher Key, backward direction, layer I
  KDf_I       Dynamic Block Cipher Key, forward direction, layer I
  KDb_I       Dynamic Block Cipher Key, backward direction, layer I
  HSf_I       Static Poly-Hash Key, forward direction, layer I
  HSb_I       Static Poly-Hash Key, backward direction, layer I
  HDf_I       Dynamic Poly-Hash Key, forward direction, layer I
  HDb_I       Dynamic Poly-Hash Key, backward direction, layer I
  ^           Bitwise XOR operator
  |           Concatenation
  &&          Logical AND operator
  Z[a, b]     For a string Z, the substring from byte a to byte b (indexing starts at 1)
  INT(X)      Translate string X into an unsigned integer

2.2. Security parameters

  POLY_HASH_LEN -- The length of the polynomial hash function's output, in bytes. For POLYVAL, POLY_HASH_LEN = 16.

  PAYLOAD_LEN -- The longest allowable cell payload, in bytes (509).

  HASH_KEY_LEN -- The key length used to digest messages, in bytes. For POLYVAL, HASH_KEY_LEN = 16.

  BC_KEY_LEN -- The key length, in bytes, of the block cipher used. For AES we recommend BC_KEY_LEN = 16.

  BC_BLOCK_LEN -- The block length, in bytes, of the block cipher used. For AES, BC_BLOCK_LEN = 16.

2.3. Primitives

The polynomial hash function is POLYVAL with a HASH_KEY_LEN-byte key. We write this as PH(H, M) where H is the key and M the message to be hashed.
We use AES with a BC_KEY_LEN-byte key. For AES encryption (resp., decryption) we write E(K, X) (resp., D(K, X)) where K is a BC_KEY_LEN-byte key and X the block to be encrypted (resp., decrypted). For an integer j, we use <j> to denote the string of length BC_BLOCK_LEN representing that integer.

2.4 Key derivation and initialisation (replaces Section 5.2.2)

For newer KDF needs, Tor uses the key derivation function HKDF from RFC5869, instantiated with SHA256. (This is due to a construction from Krawczyk.) The generated key material is:

    K = K_1 | K_2 | K_3 | ...

    Where H(x, t) is HMAC_SHA256 with value x and key t
      and K_1     = H(m_expand | INT8(1) , KEY_SEED )
      and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED )
      and m_expand is an arbitrarily chosen value,
      and INT8(i) is an octet with the value "i".

In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand, salt == t_key, and IKM == secret_input.

2.4.1. Key derivation using the KDF

When used in the ntor handshake, for each layer I, the key material is split into the following sequence of contiguous values:

  Length         Purpose                    Notation
  ------         -------                    --------
  BC_KEY_LEN     forward Seed               Sf_I
  BC_KEY_LEN     backward Seed              Sb_I

  if (I < n) in addition derive the following static keys:

  BC_KEY_LEN     forward BC Key             KSf_I
  BC_KEY_LEN     backward BC Key            KSb_I
  BC_KEY_LEN     forward CTR Key            JSf_I
  BC_KEY_LEN     backward CTR Key           JSb_I
  HASH_KEY_LEN   forward poly hash key      HSf_I
  HASH_KEY_LEN   backward poly hash key     HSb_I

Excess bytes from K are discarded.

2.4.2. Initialisation from Seed

For each layer I compute E(Sf_I, <0>) | E(Sf_I, <1>) | E(Sf_I, <2>) | ... and parse the output as:

  Length         Purpose                    Notation
  ------         -------                    --------
  BC_BLOCK_LEN   forward Nonce              Nf_I
  BC_KEY_LEN     forward BC Key             KDf_I
  HASH_KEY_LEN   forward poly hash key      HDf_I
  BC_KEY_LEN     new forward Seed           Sf'_I

Discard excess bytes, replace Sf_I with Sf'_I, and set LNf_I and LTf_I to the zero string.

Similarly for the backward direction, compute E(Sb_I, <0>) | E(Sb_I, <1>) | E(Sb_I, <2>) | ...
and parse the output as:

  Length         Purpose                    Notation
  ------         -------                    --------
  BC_BLOCK_LEN   backward Nonce             Nb_I
  BC_KEY_LEN     backward BC Key            KDb_I
  HASH_KEY_LEN   backward poly hash key     HDb_I
  BC_KEY_LEN     new backward Seed          Sb'_I

Discard excess bytes, replace Sb_I with Sb'_I, and set LNb_I and LTb_I to the zero string.

NOTE: For layers n-1 to 1 the values Nf_I, KDf_I, HDf_I, Sf_I and their backward counterparts are only required in order to support leaky pipes. If leaky pipes is not required these values can be safely omitted.

3. Routing relay cells

Let n denote the number of nodes in the circuit. Then encryption layer n corresponds to the encryption between the OP and the exit/destination node.

3.1. Forward Direction

The forward direction is the direction that CREATE/CREATE2 cells are sent.

3.1.1. Routing From the Origin

When an OP sends a relay cell, the cell is produced as follows:

The OP computes E(Sf_n, <0>) | E(Sf_n, <1>) | E(Sf_n, <2>) | ... and parses the output as

  Length         Purpose                    Notation
  ------         -------                    --------
  509            encryption pad             Z
  BC_BLOCK_LEN   forward Nonce              Nf'_n
  BC_KEY_LEN     forward BC Key             KDf'_n
  HASH_KEY_LEN   forward poly hash key      HDf'_n
  BC_KEY_LEN     new forward Seed           Sf'_n

Excess bytes are discarded. It then computes the n'th layer ciphertext (Tf_n, Cf_n) as follows:

    Cf_n = M ^ Z
    X_n = PH(HDf_n, (LNf_n | Cf_n))
    Y_n = Nf_n ^ X_n
    Tf_n = E(KDf_n, Y_n) ^ X_n

and updates its state by overwriting the old variables with the new ones.

    LNf_n = Nf_n
    Nf_n = Nf'_n
    KDf_n = KDf'_n
    HDf_n = HDf'_n
    Sf_n = Sf'_n

It then applies the remaining n-1 layers of encryption to (Tf_n, Cf_n) as follows:

    For I = n-1 to 1:
        IV = INT(Tf_{I+1})
        Z = E(JSf_I, <IV>) | E(JSf_I, <IV+1>) | ... | E(JSf_I, <IV+31>)   % BC_BLOCK_LEN = 16
        Cf_I = Cf_{I+1} ^ Z[1, 509]
        X_I = PH(HSf_I, (LTf_{I+1} | Cf_I))
        Y_I = Tf_{I+1} ^ X_I
        Tf_I = E(KSf_I, Y_I) ^ X_I
        LTf_{I+1} = Tf_{I+1}

Upon completion the OP sends (Tf_1, Cf_1) to node 1.

3.1.2.
Relaying Forward at Onion Routers

When a forward relay cell (Tf_I, Cf_I) is received by OR I, it performs the following set of steps:

'Forward' relay cell:

    X_I = PH(HDf_I, (LNf_I | Cf_I))
    Y_I = Tf_I ^ X_I
    if (Nf_I == D(KDf_I, Y_I) ^ X_I)   % cell recognized and authenticated
        compute E(Sf_I, <0>) | E(Sf_I, <1>) | E(Sf_I, <2>) | ... and parse the
        output as Z, Nf'_I, KDf'_I, HDf'_I, Sf'_I

        M = Cf_I ^ Z
        LNf_I = Nf_I
        Nf_I = Nf'_I
        KDf_I = KDf'_I
        HDf_I = HDf'_I
        Sf_I = Sf'_I

        return M

    else if (I == n)   % last node, decryption has failed
        send DESTROY cell to tear down the circuit

    else   % decrypt and forward cell
        X_I = PH(HSf_I, (LTf_{I+1} | Cf_I))
        Y_I = Tf_I ^ X_I
        Tf_{I+1} = D(KSf_I, Y_I) ^ X_I
        IV = INT(Tf_{I+1})
        Z = E(JSf_I, <IV>) | E(JSf_I, <IV+1>) | ... | E(JSf_I, <IV+31>)   % BC_BLOCK_LEN = 16
        Cf_{I+1} = Cf_I ^ Z[1, 509]

        forward (Tf_{I+1}, Cf_{I+1}) to OR I+1

3.2. Backward Direction

The backward direction is the opposite direction from CREATE/CREATE2 cells.

3.2.1. Routing From the Exit Node

At OR n encryption proceeds as follows:

It computes E(Sb_n, <0>) | E(Sb_n, <1>) | E(Sb_n, <2>) | ... and parses the output as

  Length         Purpose                    Notation
  ------         -------                    --------
  509            encryption pad             Z
  BC_BLOCK_LEN   backward Nonce             Nb'_n
  BC_KEY_LEN     backward BC Key            KDb'_n
  HASH_KEY_LEN   backward poly hash key     HDb'_n
  BC_KEY_LEN     new backward Seed          Sb'_n

Excess bytes are discarded. It then computes the ciphertext (Tb_n, Cb_n) as follows:

    Cb_n = M ^ Z
    X_n = PH(HDb_n, (LNb_n | Cb_n))
    Y_n = Nb_n ^ X_n
    Tb_n = E(KDb_n, Y_n) ^ X_n

and updates its state by overwriting the old variables with the new ones.

    LNb_n = Nb_n
    Nb_n = Nb'_n
    KDb_n = KDb'_n
    HDb_n = HDb'_n
    Sb_n = Sb'_n

3.2.2. Relaying Backward at the Onion Routers

At OR I (for I < n) when a ciphertext (Tb_I, Cb_I) in the backward direction is received it is processed as follows:

    X_I = PH(HSb_I, (LTb_{I-1} | Cb_I))
    Y_I = Tb_I ^ X_I
    Tb_{I-1} = D(KSb_I, Y_I) ^ X_I
    IV = INT(Tb_{I-1})
    Z = E(JSb_I, <IV>) | E(JSb_I, <IV+1>) | ...
        | E(JSb_I, <IV+31>)   % BC_BLOCK_LEN = 16
    Cb_{I-1} = Cb_I ^ Z[1, 509]

The ciphertext (Tb_{I-1}, Cb_{I-1}) is then passed along the circuit towards the OP.

3.2.3. Routing to the Origin

When a ciphertext (Tb_1, Cb_1) arrives at an OP, the OP decrypts it in two stages. It first reverses the layers from 1 to n-1 as follows:

    For I = 1 to n-1:
        X_I = PH(HSb_I, (LTb_{I+1} | Cb_I))
        Y_I = Tb_I ^ X_I
        Tb_{I+1} = E(KSb_I, Y_I) ^ X_I
        IV = INT(Tb_{I+1})
        Z = E(JSb_I, <IV>) | E(JSb_I, <IV+1>) | ... | E(JSb_I, <IV+31>)   % BC_BLOCK_LEN = 16
        Cb_{I+1} = Cb_I ^ Z[1, 509]

Upon completion the n'th layer of encryption is removed as follows:

    X_n = PH(HDb_n, (LNb_n | Cb_n))
    Y_n = Tb_n ^ X_n
    if (Nb_n == D(KDb_n, Y_n) ^ X_n)   % authentication is successful
        compute E(Sb_n, <0>) | E(Sb_n, <1>) | E(Sb_n, <2>) | ... and parse the
        output as Z, Nb'_n, KDb'_n, HDb'_n, Sb'_n

        M = Cb_n ^ Z
        LNb_n = Nb_n
        Nb_n = Nb'_n
        KDb_n = KDb'_n
        HDb_n = HDb'_n
        Sb_n = Sb'_n

        return M

    else
        send DESTROY cell to tear down the circuit

4. Application connections and stream management

4.1. Amendments to the Relay Cell Format

Within a circuit, the OP and the end node use the contents of RELAY packets to tunnel end-to-end commands and TCP connections ("Streams") across circuits. End-to-end commands can be initiated by either edge; streams are initiated by the OP.

The payload of each unencrypted RELAY cell consists of:

    Relay command   [1 byte]
    StreamID        [2 bytes]
    Length          [2 bytes]
    Data            [PAYLOAD_LEN-21 bytes]

The old Digest field is removed since sufficient information for authentication is now included in the nonce part of the payload.

The old 'Recognized' field is removed.
Instead a cell is recognized via a partial decryption using the node's dynamic keys - namely the following steps (already included in Section 3):

Forward direction:

    X_I = PH(HDf_I, (LNf_I | Cf_I))
    Y_I = Tf_I ^ X_I
    if (Nf_I == D(KDf_I, Y_I) ^ X_I)   % cell is recognized and authenticated

Backward direction (executed by the OP):

If the OP is aware of the number of layers present in the cell there is no need to attempt to recognize the cell. Otherwise the OP can, for each layer, first attempt a partial decryption using the dynamic keys for that layer as follows:

    X_I = PH(HDb_I, (LNb_I | Cb_I))
    Y_I = Tb_I ^ X_I
    if (Nb_I == D(KDb_I, Y_I) ^ X_I)   % cell is recognized and authenticated

The 'Length' field of a relay cell contains the number of bytes in the relay payload which contain real payload data. The remainder of the payload is padding bytes.

4.2. Appending the encrypted nonce and dealing with version-homogenic and version-heterogenic circuits

When a cell is prepared to be routed from the origin (see Section 3.1.1) the encrypted nonce N is appended to the encrypted cell (occupying the last 16 bytes of the cell). If the cell is prepared to be sent to a node supporting the new protocol, S is combined with other sources to generate the layer's nonce. Otherwise, if the node only supports the old protocol, N is still appended to the encrypted cell (so that following nodes can still recover their nonce), but a synchronized nonce (as per the old protocol) is used in CTR-mode.

When a cell is sent along the circuit in the 'backward' direction, nodes supporting the new protocol always assume that the last 16 bytes of the input are the nonce used by the previous node, which they process as per Section 3.2.1. If the previous node also supports the new protocol, these bytes are indeed the nonce. If the previous node only supports the old protocol, these bytes are either encrypted padding bytes or encrypted data.

5.
Security and Design Rationale We are currently working on a security proof to better substantiate our security claims. Below is a short informal summary on the security of CGO and its design rationale. 5.1. Resistance to crypto-tagging attacks Protection against crypto-tagging attacks is provided by layers n-1 to 1. This part of the scheme is based on the paradigm from [ADL17] which has the property that if any single bit of the OR's input is changed then all of the OR's output will be randomised. Specifically, if (Tf_I, Cf_I) is travelling in the forward direction and is processed by an honest node I, a single bit flip to either Tf_I or Cf_I will result in both Tf_{I+1} and Cf_{I+1} being completely randomised. In addition, the processing of (Tf_I, Cf_I) includes LTf_{I+1} so that any modification to (Tf_I, Cf_I) at time j will in turn randomise the value (Tf_{I+1}, Cf_{I+1}) at any time >= j . Thus once a circuit is tampered with it is not possible to recover from it at a later stage. This helps to protect against the standard crypto-tagging attack and variations thereof (Section 5.2 in [DS18]). A similar argument holds in the backward direction. 5.2. End-to-end authenticated encryption Layer n provides end-to-end authenticated encryption. Similar to the old protocol, this proposal only offers end-to-end authentication rather than per-hop authentication. However, CGO provides 128-bit authentication as opposed to the 32-bit authentication provided by the old protocol. A main observation underpinning the design of CGO is that the n'th layer does not need to be secure against the release of unverified plaintext (RUP). RUP security is only needed to protect against tagging attacks and the n'th layer does not help in that regard (but the layers below do). Consequently we employ a different scheme at the n'th layer which is designed to provide forward-secure authenticated encryption. 
5.3 Forward Security

As mentioned in the previous section CGO provides end-to-end authenticated encryption that is also forward secure. Our notion of forward security follows the definitions of Bellare and Yee [BY03] for both confidentiality and authenticity. Forward-secure confidentiality says that upon corrupting either the sender (or the receiver), the secrecy of the messages that have already been sent (or received) is still guaranteed. As for forward-secure authentication, upon corrupting the sender the authenticity of previously authenticated messages is still guaranteed (even if they have not yet been received).

In order to achieve forward-secure authenticated encryption, CGO updates the key material of the n'th layer encryption with every cell that is processed. In order to support leaky pipes the lower layers also need to maintain a set of dynamic keys that are used to recognize cells that are intended for them. This key material is only used for partial processing, i.e. recognizing the cell, and is only updated if verification is successful. If the cell is not recognized, the node reverts to processing the cell with the static key material. If support for leaky pipes is not required this extra processing can be omitted.

6. Efficiency Considerations

Although we have not carried out any experiments to verify this, we expect CGO to perform relatively well in terms of efficiency. Firstly, it manages to achieve forward security with just a universal hash as opposed to other proposals which suggested the use of SHA2 or SHA3. In this respect we recommend using POLYVAL [GLL18], a variant of GHASH that is more compatible with Intel's PCLMULQDQ instruction. Furthermore CGO admits a certain degree of parallelisability. Supporting leaky pipes requires an OR to first verify the cell using the dynamic key material and if the cell is unrecognised it goes on to process the cell with the static key material.
The important thing to note (see for instance Section 3.1.2) is that the initial processing of the cell using the static key material is almost identical to the verification using the dynamic key material, and the two computations are independent of each other. As such, although in Section 3 these were described as being evaluated sequentially, they can in fact be computed in parallel. In particular the two polynomial hashes could be computed in parallel by using the new vectorised VPCMULQDQ instruction. We are currently looking into further optimisations of the scheme as presented here. One such optimisation is the possibility of removing KDf_I and KDb_I while retaining forward security. This would further improve the efficiency of the scheme by reducing the amount of dynamic key material that needs to be updated with every cell that is processed. References [ADL17] Tomer Ashur, Orr Dunkelman, Atul Luykx, "Boosting Authenticated Encryption Robustness with Minimal Modifications", CRYPTO 2017. [BY03] Mihir Bellare, Bennett Yee, "Forward-Security in Private-Key Cryptography", CT-RSA 2003. [DS18] Jean Paul Degabriele, Martijn Stam, "Untagging Tor: A Formal Treatment of Onion Encryption", EUROCRYPT 2018. [GLL18] Shay Gueron, Adam Langley, Yehuda Lindell, "AES-GCM-SIV: Nonce Misuse-Resistant Authenticated Encryption", RFC 8452, April 2019. [ST13] Thomas Shrimpton, R. Seth Terashima, "A Modular Framework for Building Variable-Input Length Tweakable Ciphers", ASIACRYPT 2013.
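To make the key expansion of Sections 2.4 and 2.4.1 of the proposal above more concrete, here is a minimal Python sketch using only the standard library. It assumes BC_KEY_LEN = HASH_KEY_LEN = 16, as recommended in Section 2.2; the function names and the example m_expand value are hypothetical and are not taken from the proposal or from any Tor implementation.

```python
import hashlib
import hmac

BC_KEY_LEN = 16    # AES-128 key length, per Section 2.2
HASH_KEY_LEN = 16  # POLYVAL key length, per Section 2.2


def kdf_expand(key_seed: bytes, m_expand: bytes, length: int) -> bytes:
    """Produce K = K_1 | K_2 | ... truncated to `length` bytes.

    K_1     = HMAC_SHA256(key=KEY_SEED, msg=m_expand | INT8(1))
    K_(i+1) = HMAC_SHA256(key=KEY_SEED, msg=K_i | m_expand | INT8(i+1))
    """
    if length > 255 * 32:
        raise ValueError("INT8 counter would overflow")
    out = b""
    block = b""  # K_0 is the empty string
    counter = 1
    while len(out) < length:
        block = hmac.new(key_seed, block + m_expand + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def split_layer_keys(material: bytes, is_last_layer: bool) -> dict:
    """Split key material into the contiguous values listed in Section 2.4.1."""
    layout = [("Sf", BC_KEY_LEN), ("Sb", BC_KEY_LEN)]
    if not is_last_layer:  # static keys are derived only when I < n
        layout += [("KSf", BC_KEY_LEN), ("KSb", BC_KEY_LEN),
                   ("JSf", BC_KEY_LEN), ("JSb", BC_KEY_LEN),
                   ("HSf", HASH_KEY_LEN), ("HSb", HASH_KEY_LEN)]
    if len(material) < sum(size for _, size in layout):
        raise ValueError("not enough key material")
    keys, offset = {}, 0
    for name, size in layout:
        keys[name] = material[offset:offset + size]
        offset += size
    return keys  # any excess bytes are simply ignored
```

With these parameters an intermediate layer needs 128 bytes of key material and the last layer needs 32, so a call such as `split_layer_keys(kdf_expand(seed, b"example-expand", 128), is_last_layer=False)` yields all eight values for one hop.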
Filename: 309-optimistic-socks-in-tor.txt
Title: Optimistic SOCKS Data
Author: Tom Ritter
Created: 21-June-2019
Status: Open
Ticket: #5915

0. Abstract

We propose that tor should have a SocksPort option that causes it to lie to the application that the SOCKS Handshake has succeeded immediately, allowing the application to begin sending data optimistically.

1. Introduction

In the past, Tor Browser had a patch that allowed it to send data optimistically. This effectively eliminated a round trip through the entire circuit, reducing latency.

This feature was buggy, and specifically caused problems with MOAT, as described in [0], and with Tor Messenger, as described in [1]. The other issues observed with it may have been the same issue, or they may have been different.

Rather than trying to identify and fix the problem in Tor Browser, an alternate idea is to have tor lie to the application, causing it to send the data optimistically. This can benefit all users of tor. This proposal documents that idea.

[0] https://trac.torproject.org/projects/tor/ticket/24432#comment:19
[1] https://trac.torproject.org/projects/tor/ticket/19910#comment:3

2. Proposal

2.1. Behavior

When the SocksPort flag defined below is present, Tor will immediately report a successful SOCKS handshake for non-onion connections. If, later, tor receives an end cell rather than a connected cell, it will hang up the SOCKS connection.

The requirement to omit this for onion connections is because in #30382 we implemented a mechanism to return a special SOCKS error code if we are connecting to an onion site that requires authentication. Returning an early success would prevent this from working.

Redesigning the mechanism to communicate auth-required onion sites to the browser, while also supporting optimistic data, is left to a future proposal.

2.2.
New SocksPort Flag

In order to have backward compatibility with third party applications that do not support or do not want to use optimistic data, we propose a new SocksPort flag that needs to be set in the tor configuration file in order for the optimistic behavior to occur.

The new SocksPort flag is:

  "OptimisticData" -- Tor will immediately report a successful SOCKS handshake for non-onion connections and hang up if it gets an end cell rather than a connected cell.

3. Application Error Handling

This behavior will cause the application talking to Tor to potentially behave abnormally as it will believe that it has completed a TCP connection. If no such connection can be made by tor, the program may behave in a way that does not accurately represent the behavior of the connection.

Applications SHOULD test various connection failure modes and ensure their behavior is acceptable before using this feature.

References:

[RFC1928] https://www.ietf.org/rfc/rfc1928.txt
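The decision logic described in sections 2.1 and 2.2 of the proposal above can be sketched in a few lines of Python. This is only an illustration: the function names and the returned action strings are hypothetical, and the sketch models the decisions only, not tor's actual SOCKS or stream code.

```python
def action_on_socks_request(optimistic_data_flag: bool,
                            target_is_onion: bool) -> str:
    """Decide what to tell the application when its SOCKS request arrives."""
    if optimistic_data_flag and not target_is_onion:
        # Report success at once so the application starts sending data.
        return "reply-success-now"
    # Onion targets keep the normal path so that the special
    # auth-required SOCKS error code can still be returned.
    return "wait-for-stream-result"


def action_on_stream_result(replied_early: bool, cell: str) -> str:
    """Decide what to do once a 'connected' or 'end' cell arrives."""
    if cell not in ("connected", "end"):
        raise ValueError("unexpected cell type")
    if replied_early:
        # Success was already reported; all that is left is to keep
        # going or to hang up the SOCKS connection.
        return "continue" if cell == "connected" else "hang-up"
    return "reply-success" if cell == "connected" else "reply-failure"
```

The second function shows why section 3 asks applications to test failure modes: once success has been reported early, a failed connection can only appear to the application as a dropped connection, not as a SOCKS error.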
Filename: 310-bandaid-on-guard-selection.txt Title: Towards load-balancing in Prop 271 Author: Florentin Rochet, Aaron Johnson et al. Created: 2019-10-27 Supersedes: 271 Status: Closed 1. Motivation and Context Prop 271 causes guards to be selected with probabilities different than their weights due to the way it samples many guards and then chooses primary guards from that sample. We are suggesting a straightforward fix to the problem, which is, roughly speaking, to choose primary guards in the order in which they were sampled. In more detail, Prop 271 chooses guards via a multi-step process: 1. It chooses 20 distinct guards (and sometimes more) by sampling without replacement with probability proportional to consensus weight. 2. It produces two subsets of the sample: (1) "filtered" guards, which are guards that satisfy various torrc constraints and path bias, and (2) "confirmed" guards, which are guards through which a circuit has been constructed. 3. The "primary" guards (i.e. the actual guards used for circuits) are chosen from the confirmed and/or filtered subsets. I'm ignoring the additional "usable" subsets for clarity. This description is based on Section 4.6 of the specification (https://gitweb.torproject.org/torspec.git/tree/guard-spec.txt). 1.1 Picturing the problem when Tor starts the first time The primary guards are selected *uniformly at random* from the filtered guards when no confirmed guards exist. No confirmed guards appear to exist until some primary guards have been selected, and so when Tor is started the first time the primary guards always come only from the filtered set. The uniformly-random selection causes a bias in primary-guard selection away from consensus weights and towards a more uniform selection of guards. As just an example of the problem, if there were only 20 guards in the network, the sampled set would be all guards and primary guard selection would be entirely uniformly random, ignoring weights entirely. 
This bias is worse the larger the sampled set is relative to the entire set of guards, and it has a significant effect on Tor simulations in Shadow, which are typically on smaller networks.

2. Solution Design

We propose a solution that fits well within the existing guard-selection algorithm. Our solution is to select primary guards in the order they were sampled. This ordering should be applied after the filtering and/or confirmed guard sets are constructed as normal. That is, primary guards should be selected from the filtered guards (if no guards are both confirmed and filtered) or from the set of confirmed and filtered guards (if such guards exist) in the order they were initially sampled. This solution guarantees that each primary guard is selected (without replacement) from all guards with a probability that is proportional to its consensus weight.

2.1 Performance implications

This proposal is a straightforward fix to the unbalanced network that may arise from the uniform selection of sampled relays. It fixes the correctness of performance results in Shadow, whose simulations cover only a short timeframe. However, it does not solve all the load-balancing problems of Proposal 271. One other load-balancing issue arises when we choose our guards on one date but then make decisions about them on a different date. Building a sampled list of relays on day 0, most of which we intend to use only much later, risks slowly unbalancing the network.

2.2 Security implications

This proposal solves the following problems:

Prop271 reduces Tor's security by increasing the number of clients that an adversary running small relays can observe. In addition, an adversary has to wait less time than it should after it starts a malicious guard to be chosen by a client.
This weakness occurs because the malicious guard only needs to enter the sampled list to have a chance to be chosen as primary, rather than having to wait until all previously-sampled guards have already expired.

2.3 Implementation notes

The code used for ordering the confirmed list by confirmed idx should be removed, and a sampled order should be applied throughout the various lists. The next sampled idx should be recalculated from the state file, and the sampled_idx values should be recalculated to be a dense array when we save the state.

3. Going Further -- Let's not choose our History (future work)

A deeper refactoring of Prop 271 would try to solve the load balancing problem of choosing guards on a date but then making decisions about them on a different date. One suggestion is to remove the sampled list, which we can picture as a "forward history" and to have instead a real history of previously sampled guards. When moving to the next guard, we could consider *current* weights and make the decision. The history should resist attacks that try to force clients onto compromised guards, using relays that are part of the history if they're still available (in sampled order), and by tracking its size. This should maintain the initial goals of Prop 271.
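The selection rule of section 2 and the dense re-indexing mentioned in section 2.3 of the proposal above can be sketched as follows. This is a simplified Python illustration, not tor's guard code: the `Guard` class, the function names, and the default of three primary guards are assumptions, and the sketch does not top up the primary list from the filtered set when too few confirmed guards exist.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Guard:
    name: str
    sampled_idx: int   # position in which the guard was sampled
    filtered: bool     # passes torrc constraints and path bias
    confirmed: bool    # a circuit has been built through it


def choose_primary_guards(sample: List[Guard],
                          n_primary: int = 3) -> List[Guard]:
    """Pick primary guards in sampled order rather than uniformly at random."""
    candidates = [g for g in sample if g.filtered and g.confirmed]
    if not candidates:
        # No guard is both confirmed and filtered: use the filtered set.
        candidates = [g for g in sample if g.filtered]
    candidates.sort(key=lambda g: g.sampled_idx)
    return candidates[:n_primary]


def densify_sampled_idx(sample: List[Guard]) -> None:
    """Renumber sampled_idx as 0, 1, 2, ... while keeping the relative order."""
    for new_idx, guard in enumerate(sorted(sample, key=lambda g: g.sampled_idx)):
        guard.sampled_idx = new_idx
```

Because the sample itself was drawn with probability proportional to consensus weight, taking the earliest-sampled candidates preserves that weighting, which a uniform draw from the filtered set does not.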
Filename: 311-relay-ipv6-reachability.txt Title: Tor Relay IPv6 Reachability Author: teor, Nick Mathewson Created: 22-January-2020 Status: Accepted Ticket: #24404 0. Abstract We propose that Tor relays (and bridges) should check the reachability of their IPv6 ORPort, before deciding whether to publish their descriptor. To check IPv6 ORPort reachability, relays and bridges need to be able to extend circuits via other relays, and back to their own IPv6 ORPort. 1. Introduction Tor relays (and bridges) currently check the reachability of their IPv4 ORPort and DirPort before publishing them in their descriptor. But relays and bridges do not test the reachability of their IPv6 ORPorts. However, directory authorities make direct connections to relay IPv4 and IPv6 ORPorts, to test each relay's reachability. Once a relay has been confirmed as reachable by a majority of authorities, it is included in the consensus. (Currently, 6 out of 9 directory authorities perform IPv4 and IPv6 reachability checks. The others just check IPv4.) The Bridge authority makes direct connections to bridge IPv4 ORPorts, to test each bridge's reachability. Depending on its configuration, it may also test IPv6 ORPorts. Once a bridge has been confirmed as reachable by the bridge authority, it is included in the bridge networkstatus used by BridgeDB. Many relay (and bridge) operators don't know when their relay's IPv6 ORPort is unreachable. They might not find out until they check [Relay Search], or their traffic may drop. For new operators, it might just look like Tor simply isn't working, or it isn't using much traffic. IPv6 ORPort issues are a significant source of relay operator support requests. Implementing IPv6 ORPort reachability checks will provide immediate, direct feedback to operators in the relay's logs. It also enables future work, such as automatically discovering relay and bridge addresses for IPv6 ORPorts (see [Proposal 312: Relay Auto IPv6 Address]). 2. 
Scope This proposal modifies Tor's behaviour as follows: Relays (including directory authorities): * circuit extension, * OR connections for circuit extension, * reachability testing. Bridges: * reachability testing only. This proposal does not change client behaviour. Throughout this proposal, "relays" includes directory authorities, except where they are specifically excluded. "relays" does not include bridges, except where they are specifically included. (The first mention of "relays" in each section should specifically exclude or include these other roles.) When this proposal describes Tor's current behaviour, it covers all supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except where another version is specifically mentioned. 3. Allow Relay IPv6 Extends To check IPv6 ORPort reachability, relays and bridges need to be able to extend circuits via other relays, and back to their own IPv6 ORPort. We propose that relays start to extend some circuits over IPv6 connections. We do not propose any changes to bridge extend behaviour. 3.1. Current IPv6 ORPort Implementation Currently, all relays (and bridges) must have an IPv4 ORPort. IPv6 ORPorts are optional. Tor supports making direct IPv6 OR connections: * from directory authorities to relay ORPorts, * from the bridge authority to bridge ORPorts, * from clients to relay and bridge ORPorts. Tor relays and bridges accept IPv6 ORPort connections. But IPv6 ORPorts are not currently included in extend requests to other relays. And even if an extend cell contains an IPv6 ORPort, bridges and relays will not extend via an IPv6 connection to another relay. Instead, relays will extend circuits: * Using an existing authenticated connection to the requested relay (which is typically over IPv4), or * Over a new connection via the IPv4 ORPort in an extend cell. If a relay receives an extend cell that only contains an IPv6 ORPort, the extend typically fails. 3.2. 
Relays Extend to IPv6 ORPorts We propose that relays make some connections via the IPv6 ORPorts in extend cells. Relays will extend circuits: * using an existing authenticated connection to the requested relay (which may be over IPv4 or IPv6), or * over a new connection via the IPv4 or IPv6 ORPort in an extend cell. Since bridges try to imitate client behaviour, they will not adopt this new behaviour, until clients begin routinely connecting via IPv6. (See [Proposal 306: Client Auto IPv6 Connections].) 3.2.1. Making IPv6 ORPort Extend Connections Relays may make a new connection over IPv6 when: * they have an IPv6 ORPort, * there is no existing authenticated connection to the requested relay, and * the extend cell contains an IPv6 ORPort. If these conditions are satisfied, and the extend cell also contains an IPv4 ORPort, we propose that the relay choose between an IPv4 and an IPv6 connection at random. If the extend cell does not contain an IPv4 ORPort, we propose that the relay connects over IPv6. (Relays should support IPv6-only extend cells, even though they are not used to test relay reachability in this proposal.) A successful IPv6 connection also requires that: * the requested relay has an IPv6 ORPort. But extending relays must not check the consensus for other relays' IPv6 information. Consensuses may be out of date, particularly when relays are doing reachability checks for new IPv6 ORPorts. See section 3.3.2 for other situations where IPv6 information may be incorrect or unavailable. 3.2.2. No Tor Client Changes Tor clients currently include IPv4 ORPorts in their extend cells, but they do not include IPv6 ORPorts. We do not propose any client IPv6 extend cell changes at this time. The Tor network needs more IPv6 relays, before clients can safely use IPv6 extends. (Relays do not require anonymity, so they can safely use IPv6 extends to test their own reachability.) 
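The connection choice from section 3.2.1 can be sketched as follows. This is a minimal illustration, not tor's implementation: the `ExtendCell` class, the `choose_extend_target` function, and the string form of the ORPorts are all hypothetical names introduced for this example. Note that the sketch never consults the consensus, only local information and the contents of the extend cell.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ExtendCell:
    """Hypothetical, simplified view of the ORPorts carried in an extend cell."""
    ipv4_orport: Optional[str] = None  # e.g. "192.0.2.1:9001"
    ipv6_orport: Optional[str] = None  # e.g. "[2001:db8::1]:9001"


def choose_extend_target(cell: ExtendCell,
                         have_ipv6_orport: bool,
                         existing_connection: Optional[str] = None,
                         rng=random) -> Tuple[str, Optional[str]]:
    """Decide how an extending relay reaches the requested relay.

    Returns ("existing", conn), ("new", orport), or ("fail", None).
    """
    # An existing authenticated connection is always used first,
    # whether it is over IPv4 or IPv6.
    if existing_connection is not None:
        return ("existing", existing_connection)

    # IPv6 is only possible if this relay has an IPv6 ORPort and the
    # extend cell contains one.
    can_use_ipv6 = have_ipv6_orport and cell.ipv6_orport is not None

    if can_use_ipv6 and cell.ipv4_orport is not None:
        # Both address families are available: choose at random.
        return ("new", rng.choice([cell.ipv4_orport, cell.ipv6_orport]))
    if can_use_ipv6:
        # IPv6-only extend cell.
        return ("new", cell.ipv6_orport)
    if cell.ipv4_orport is not None:
        return ("new", cell.ipv4_orport)
    return ("fail", None)
```

Passing a seeded `random.Random` instance as `rng` makes the random choice reproducible in tests.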
We also recommend prioritising client to relay IPv6 connections (see [Proposal 306: Client Auto IPv6 Connections]) over relay to relay IPv6 connections, because client IPv6 connections have a direct impact on users. 3.3. Alternative Extend Designs We briefly mention some potential extend designs, and the reasons that they were not used in this proposal. (Some designs may be proposed for future Tor versions, but are not necessary at this time.) 3.3.1. Future Relay IPv6 Extend Behaviour Random selection of extend ORPorts is a simple design, which supports IPv6 ORPort reachability checks. However, it is not the most efficient design when: * both relays meet the requirements for IPv4 and IPv6 extends, * a new connection is required, * the relays have either IPv4 or IPv6 connectivity, but not both. In this very specific case, this proposal results in an average of 1 circuit extend failure per new connection. (Because relays do not try to connect to the other ORPort when the first one fails.) If relays try both the IPv4 and IPv6 ORPorts, then the circuit would succeed. For example, relays could try the alternative port after a 250ms delay, as in [Proposal 306: Client Auto IPv6 Connections]. That alternative design would result in an average circuit delay of up to 125ms (250ms / 2) per new connection, rather than a failure. However, partial relay connectivity should be uncommon. And relays keep connections open long-term, so new relay connections are a small proportion of extend requests. Therefore, we defer implementing any more complex designs. Since we propose to use IPv6 extends to test relay reachability, occasional circuit extend failures have a very minor impact. 3.3.2. Future Bridge IPv6 Extend Behaviour When clients automatically connect to relay IPv4 and IPv6 ORPorts by default, bridges should also adopt this behaviour. (For example, see [Proposal 306: Client Auto IPv6 Connections].) 3.3.3. 
Allowing Extends to Prefer IPv4 or IPv6 Here is an alternate design, which allows extending clients (or relays doing reachability tests) to prefer either IPv4 or IPv6: Suppose that a relay's extend cell contains the IPv4 address and the IPv6 address in their _preferred order_. So if the party generating the extend cell would prefer an IPv4 connection, it puts the IPv4 address first; if it would prefer an IPv6 connection, it puts the IPv6 address first. The relay that receives the extend cell could respond in several ways: * One possibility (similar to section 3.2.1) is to choose at random, with a higher probability given to the first option. * Another possibility (similar to section 3.3.1) is to try the first, and then try the second if the first one fails. This scheme has some advantages, in that it lets the self-testing relay say "please try IPv6 if you can" or "please try IPv4 if you can" in a reliable way, and lets us migrate from the current behavior to the 3.3.1 behavior down the road. However, it might not be necessary: clients should not care if their extends are over IPv4 or IPv6, they just want to get to an exit safely. (And clients should not depend on using IPv4 or IPv6, because relays may use an existing authenticated connection to extend.) The only use case where extends might want to prefer IPv4 or IPv6 is relay reachability tests. But we want our reachability test design to succeed, without depending on the specific extend implementation. 3.4. Rejected Extend Designs Some designs may never be suitable for the Tor network. We rejected designs where relays check the consensus to see if other relays support IPv6, because: * relays may have different consensuses, * the extend cell may have been created using a version of the [Onion Service Protocol] which supports IPv6, or * the extend cell may be from a relay that has just added IPv6, and is testing the reachability of its own ORPort (see Section 4). 
We avoided designs where relays try to learn if other relays support IPv6, because these designs: * are more complex than random selection, * potentially leak information between different client circuits, * may enable denial of service attacks, where a flood of incorrect extend cells causes a relay to believe that another relay is unreachable on an ORPort that actually works, and * require careful tuning to match the typical interval at which network connectivity is actually changing. 4. Check Relay and Bridge IPv6 ORPort Reachability We propose that relays (and bridges) check their own IPv6 ORPort reachability. To check IPv6 ORPort reachability, relays (and bridges) extend circuits via other relays (but not other bridges), and back to their own IPv6 ORPort. If IPv6 reachability checks fail, relays (and bridges) should refuse to publish their descriptors, if they believe IPv6 reachability checks are reliable, and their IPv6 address was explicitly configured. (See [Proposal 312: Relay Auto IPv6 Address] for the ways relays can guess their IPv6 addresses.) Directory authorities always publish their descriptors. 4.1. Current Reachability Implementation Relays and bridges check the reachability of their IPv4 ORPorts and DirPorts, and refuse to publish their descriptor if either reachability check fails. (Directory authorities test their own reachability, but they only warn, and publish their descriptor regardless of reachability.) IPv4 ORPort reachability checks succeed when any create cell is received on any inbound OR connection. The check succeeds, even if the cell is from an IPv6 ORPort, or a circuit built by a client. Directory authorities make direct connections to relay IPv4 and IPv6 ORPorts, to test each relay's reachability. Relays that fail either reachability test, on enough directory authorities, are excluded from the consensus. The Bridge authority makes direct connections to bridge IPv4 ORPorts, to test each bridge's reachability. 
Depending on its configuration, it may also test IPv6 ORPorts. Bridges that fail either reachability test are excluded from BridgeDB. 4.2. Checking IPv6 ORPort Reachability We propose that testing relays (and bridges) select some IPv6 extend-capable relays for their reachability circuits, and include their own IPv4 and IPv6 ORPorts in the final extend cells on those circuits. The final extending relay will extend to the testing relay: * using an existing authenticated connection to the testing relay (which may be over IPv4 or IPv6), or * over a new connection via the IPv4 or IPv6 ORPort in the extend cell. The testing relay will confirm that test circuits can extend to both its IPv4 and IPv6 ORPorts. Checking IPv6 ORPort reachability will create extra IPv6 connections on the tor network. (See [Proposal 313: Relay IPv6 Statistics].) It won't directly create much extra traffic, because reachability circuits don't send many cells. But some client circuits may use the IPv6 connections created by relay reachability self-tests. 4.2.1. Selecting the Final Extending Relay IPv6 ORPort reachability checks require an IPv6 extend-capable relay as the second-last hop of reachability circuits. (The testing relay is the last hop.) IPv6-extend capable relays must have: * Relay subprotocol version 3 (or later), and * an IPv6 ORPort. (See section 5.1 for the definition of Relay subprotocol version 3.) The other relays in the path do not require any particular protocol versions. 4.2.2. Extending from the Second-Last Hop IPv6 ORPort reachability circuits should put the IPv4 and IPv6 ORPorts in the extend cell for the final extend in reachability circuits. Supplying both ORPorts makes these extend cells indistinguishable from future client extend cells. If reachability succeeds, the testing relay (or bridge) will accept the final extend on one of its ORPorts, selected at random by the extending relay (see section 3.2.1). 4.2.3. 
Separate IPv4 and IPv6 Reachability Flags Testing relays (and bridges) will record reachability separately for IPv4 and IPv6 ORPorts, based on the ORPort that the test circuit was received on. Here is a reliable way to do reachability self-tests for each ORPort: 1. Check for create cells on inbound ORPort connections from other relays Check for a cell on any IPv4 and any IPv6 ORPort. (We can't know which listener(s) correspond to the advertised ORPorts, particularly when using port forwarding.) Make sure the cell was received on an inbound OR connection, and make sure the connection is authenticated to another relay. (Rather than to a client: clients don't authenticate.) 2. Check for created cells from testing circuits on outbound OR connections Check for a returned created cell on our IPv4 and IPv6 self-test circuits. Make sure those circuits were on outbound OR connections. By combining these tests, we confirm that we can: * reach our own ORPorts with testing circuits, * send and receive cells via inbound OR connections to our own ORPorts from other relays, and * send and receive cells via outbound OR connections to other relays' ORPorts. Once we validate the created cell, we have confirmed that the final remote relay has our private keys. Therefore, this test reliably detects ORPort reachability, in most cases. There are a few exceptions: A. Duplicate Relay Keys Duplicate keys are only possible if a relay's private keys have been copied to another relay. That's either a misconfiguration, or a security issue. Directory authorities ensure that only one relay with each key is included in the consensus. If a relay was set up using a copy of another relay's keys, then its reachability self-tests might connect to that other relay. (If the second hop in a testing circuit has an existing OR connection to the other relay.) Relays could test if the inbound create cells they receive, match the create cells that they have sent on self-test circuits. 
But this seems like unnecessary complexity, because duplicate keys are rare. At best, it would provide a warning for some operators who have accidentally duplicated their keys. (But it doesn't provide any extra security, because operators can disable self-tests using AssumeReachable.) B. Multiple ORPorts in an Address Family Some relays have multiple IPv4 ORPorts, or multiple IPv6 ORPorts. In some cases, only some ports are reachable. (This configuration is uncommon, but multiple ORPorts are supported.) Here is how these tests can pass, even if the advertised ORPort is unreachable: * the final extend cell contains the advertised IPv6 address of the self-testing relay, * if the extending relay already has a connection to a working NoAdvertise ORPort, it may use that connection instead. 4.2.4. No Changes to DirPort Reachability We do not propose any changes to relay IPv4 DirPort reachability checks at this time. The following configurations are currently not supported: * bridge DirPorts, and * relay IPv6 DirPorts. Therefore, they are also out of scope for this proposal. 4.3. Refusing to Publish Descriptor if IPv6 ORPort is Unreachable If an IPv6 ORPort reachability check fails, relays (and bridges) should log a warning. If IPv6 reachability checks fail, relays (and bridges) should refuse to publish their descriptors, if they believe IPv6 reachability checks are reliable, and their IPv6 address was explicitly configured. (See [Proposal 312: Relay Auto IPv6 Address] for the ways relays can guess their IPv6 addresses.) Directory authorities always publish their descriptors. 4.3.1. Refusing to Publish the Descriptor If IPv6 reachability checks fail, relays (and bridges) should refuse to publish their descriptors, if: * enough existing relays support IPv6 extends, and * the IPv6 address was explicitly configured by the operator (rather than guessed using [Proposal 312: Relay Auto IPv6 Address]). 
Directory authorities may perform reachability checks, and warn if those checks fail. But they always publish their descriptors. We set a threshold of consensus relays for reliable IPv6 ORPort checks: * at least 30 relays, and * at least 1% of the total consensus weight, must support IPv6 extends. We chose these parameters so that the number of relays is triple the number of directory authorities, and the consensus weight is high enough to support occasional reachability circuits. In small networks with: * less than 2000 relays, or * a total consensus weight of zero, the threshold should be the minimum tor network size to test reachability: * at least 2 relays, excluding this relay. (Note: we may increase this threshold to 3 or 4 relays if we discover a higher minimum during testing.) If the current consensus satisfies this threshold, testing relays (and bridges, but not directory authorities) that fail IPv6 ORPort reachability checks should refuse to publish their descriptors. To ensure an accurate threshold, testing relays should exclude: * the testing relay itself, and * relays that they will not use in testing circuits, from the: * relay count, and * the numerator of the threshold percentage. Typically, relays will be excluded if they are in the testing relay's: * family, * IPv4 address /16 network, * IPv6 address /32 network (a requirement as of Tor 0.4.0.1-alpha), unless EnforceDistinctSubnets is 0. As a useful side-effect, these different thresholds for each relay family will reduce the likelihood of the network flapping around the threshold. If flapping has an impact on the network health, directory authorities should set the AssumeIPv6Reachable consensus parameter. (See the next section.) 4.3.2. Add AssumeIPv6Reachable Option We add an AssumeIPv6Reachable torrc option and consensus parameter. If IPv6 ORPort checks have bugs that impact the health of the network, they can be disabled by setting AssumeIPv6Reachable=1 in the consensus parameters. 
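The threshold from section 4.3.1 can be sketched as follows. This is an illustration under stated assumptions, not tor's implementation: the `Relay` class and the `ipv6_checks_reliable` function are hypothetical, and the sketch assumes that the small-network minimum of 2 relays counts only relays that support IPv6 extends, which the text above does not state explicitly. Excluded relays (the testing relay itself, its family, and relays in the same subnets) are removed from the relay count and from the numerator of the percentage, but not from the total consensus weight.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Relay:
    """Hypothetical summary of one consensus entry, as seen by the testing relay."""
    weight: int
    supports_ipv6_extends: bool          # Relay=3 and an IPv6 ORPort
    usable_in_test_circuit: bool = True  # False for self, family, same /16 or /32


def ipv6_checks_reliable(consensus: Iterable[Relay]) -> bool:
    """Return True if IPv6 ORPort reachability checks are considered reliable."""
    relays = list(consensus)
    total_weight = sum(r.weight for r in relays)

    # Only relays that could actually be the second-last hop are counted.
    capable = [r for r in relays
               if r.supports_ipv6_extends and r.usable_in_test_circuit]

    # Small networks: fall back to the minimum network size.
    # (Assumption: the 2 relays must support IPv6 extends.)
    if len(relays) < 2000 or total_weight == 0:
        return len(capable) >= 2

    # Normal networks: at least 30 relays and at least 1% of the total
    # consensus weight. Integer arithmetic avoids rounding problems.
    capable_weight = sum(r.weight for r in capable)
    return len(capable) >= 30 and capable_weight * 100 >= total_weight
```

A relay (or bridge) would only refuse to publish its descriptor after a failed IPv6 check when this function returns True and its IPv6 address was explicitly configured.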
If IPv6 ORPort checks have bugs that impact a particular relay (or bridge), they can be disabled by setting "AssumeIPv6Reachable 1" in the relay's torrc. This option disables IPv6 ORPort reachability checks, so relays publish their descriptors if their IPv4 ORPort reachability checks succeed. (Unlike AssumeReachable, AssumeIPv6Reachable has no effect on the existing dirauth IPv6 reachability checks, which connect directly to relay ORPorts.) The default for the torrc option is "auto", which checks the consensus parameter. If the consensus parameter is not set, the default is "0". "AssumeReachable 1" overrides all values of "AssumeIPv6Reachable", disabling both IPv4 and IPv6 ORPort reachability checks. Tor should warn if AssumeReachable is 1, but AssumeIPv6Reachable is 0. (On directory authorities, "AssumeReachable 1" also disables dirauth IPv4 and IPv6 reachability checks, which connect directly to relay ORPorts. AssumeIPv6Reachable does not disable directory authority to relay IPv6 checks.) 4.4. Optional Efficiency and Reliability Changes We propose some optional changes for efficiency and reliability, and describe their impact. Some of these changes may be more appropriate in future releases, or along with other proposed features. 4.4.1. Extend IPv6 From All Supported Second-Last Hops The testing relay (or bridge) puts both IPv4 and IPv6 ORPorts in its final extend cell, and the receiving ORPort is selected at random by the extending relay (see sections 3.2.1 and 4.2). Therefore, approximately half of IPv6 ORPort reachability circuits will actually end up confirming IPv4 ORPort reachability. We propose this optional change, to improve the rate of IPv6 ORPort reachability checks: If the second-last hop of an IPv4 ORPort reachability circuit supports IPv6 extends, testing relays may put the IPv4 and IPv6 ORPorts in the extend cell for the final extend. 
As the number of relays that support IPv6 extends increases, this change will increase the number of IPv6 reachability confirmations. In the ideal case, where the entire network supports IPv4 and IPv6 extends, IPv4 and IPv6 ORPort reachability checks would require a similar number of circuits. 4.4.2. Close Existing Connections Before Testing Reachability When a busy relay is performing reachability checks, it may already have established inbound or outbound connections to the second-last hop in its reachability test circuits. The extending relay may use these connections for the extend, rather than opening a connection to the target ORPort (see sections 3.2 and 4.2.2). Bridges only establish outbound connections to other relays, and only over IPv4 (except for reachability test circuits). So they are still potentially affected by this issue. We propose these optional changes, to improve the efficiency of IPv4 and IPv6 ORPort reachability checks: Testing relays (and bridges): * close any outbound connections to the second-last hop of reachability circuits, and * close inbound connections to the second-last hop of reachability circuits, if those connections are not using the target ORPort. Even though it is unlikely that bridges will have inbound connections to a non-target ORPort, bridges should still do inbound connection checks, for consistency. These changes are particularly important if a relay is connected to all other relays in the network, but only over IPv4. (Or in the future, only over IPv6.) We expect that these changes will slightly increase the number of relay re-connections, but reduce the number of reachability test circuits required to confirm reachability. 4.4.3. Accurately Identifying Test Circuits The testing relay (or bridge) may confirm that the create cells it is receiving are from its own test circuits, and that test circuits are capable of returning created cells to the origin. 
Currently, relays confirm reachability if any create cell is received on any inbound connection (see section 4.1). Relays do not check that the circuit is a reachability test circuit, and they do not wait to receive the return created cell. This behaviour has resulted in difficult-to-diagnose bugs on some rare relay configurations. We propose these optional changes, to improve the efficiency of IPv4 and IPv6 ORPort reachability checks: Testing relays may: * check that the create cell is received from a test circuit (by comparing the received cell to the cells sent by test circuits), * check that the create cell is received on an inbound connection (this is existing behaviour), * if the create cell from a test circuit is received on an outbound connection, destroy the circuit (rather than returning a created cell), and * check that the created cell is returned to the relay on a test circuit (by comparing the remote address of the final hop on the circuit, to the local IPv4 and IPv6 ORPort addresses). Relays can efficiently match inbound create cells to test circuits by storing a set of the g^X values from their test circuits' extend cells, and then checking incoming create cells against that set. If we make these changes, relays should track whether they are "maybe reachable" (under the current definition of 'reachable') or "definitely reachable" (based on the new definition). They should log different messages depending on whether they are "maybe reachable" but these new tests fail, or whether they are completely unreachable. 4.4.4. Allowing More Relay IPv6 Extends Currently, clients, relays, and bridges do not include IPv6 ORPorts in their extend cells. In this proposal, we only make relays (and bridges) extend over IPv6 on the final hop of test circuits. This limited use of IPv6 extends means that IPv6 connections will still be uncommon. 
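The create cell matching suggested in section 4.4.3 can be sketched as a set lookup. This is a minimal illustration, not tor's implementation: the `SelfTestTracker` class, its method names, and the returned strings are hypothetical, and the handshake value is treated as an opaque byte string.

```python
class SelfTestTracker:
    """Remembers the handshake values sent on our own test circuits."""

    def __init__(self):
        self._pending = set()

    def record_sent(self, handshake: bytes) -> None:
        """Store the g^X value from a test circuit's final extend cell."""
        self._pending.add(handshake)

    def classify_create(self, handshake: bytes, inbound: bool) -> str:
        """Classify a received create cell.

        Returns "other" for cells that are not from our test circuits,
        "destroy" for test cells that arrive on an outbound connection,
        and "confirmed" for test cells that arrive on an inbound one.
        """
        if handshake not in self._pending:
            return "other"
        if not inbound:
            # The extend reused our own outbound connection, so it does
            # not show that our ORPort is reachable.
            return "destroy"
        self._pending.discard(handshake)
        return "confirmed"
```

A relay that receives only "other" create cells would stay "maybe reachable" under the current definition, and would become "definitely reachable" after a "confirmed" result and a matching created cell.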
We propose these optional changes, to increase the number of IPv6 connections between relays: To increase the number of IPv6 connections, relays that support IPv6 extends may want to use them for all hops of their own circuits. Relays make their own circuits for reachability tests, bandwidth tests, and ongoing preemptive circuits. (Bridges cannot change their behaviour, because they try to imitate clients.) We propose a torrc option and consensus parameter RelaySendIPv6Extends, which is only supported on relays (and not bridges or clients). This option makes relays send IPv4 and IPv6 ORPorts in all their extend cells, when supported by the extending and receiving relay. (See section 3.2.1.) The default value for this option is "auto", which checks the consensus parameter. If the consensus parameter is not set, it defaults to "0" in the initial release. Once IPv6 extends have had enough testing, we may enable RelaySendIPv6Extends on the network. The consensus parameter will be set to 1. The default will be changed to "1" (if the consensus parameter is not set). We defer any client (and bridge) changes to a separate proposal, to be implemented when there are more IPv6 relays in the network. But we note that relay IPv6 extends will provide some cover traffic when clients eventually use IPv6 extends in their circuits. As a useful side effect, increasing the number of IPv6 connections in the network makes it more likely that an existing connection can be used for the final hop of a relay IPv6 ORPort reachability check. 4.4.5. Relay Bandwidth Self-Tests Over IPv4 and IPv6 In this proposal, we only make relays (and bridges) use IPv6 for their reachability self-tests. We propose this optional change, to improve the accuracy of relay (and bridge) bandwidth self-tests: Relays (and bridges) perform bandwidth self-tests over IPv4 and IPv6. If we implement good abstractions for relay self-tests, then this change will not need much extra code. 
If we implement IPv6 extends for all relay circuits (see section 4.4.4), then this change will effectively be redundant. Doing relay bandwidth self-tests over IPv6 will create extra IPv6 connections and IPv6 bandwidth on the tor network. (See [Proposal 313: Relay IPv6 Statistics].) In addition, some client circuits may use the IPv6 connections created by relay bandwidth self-tests. 4.5. Alternate Reachability Designs We briefly mention some potential reachability designs, and the reasons that they were not used in this proposal. 4.5.1. Removing IPv4 ORPorts from Extend Cells We avoid designs that only include IPv6 ORPorts in extend cells, and remove IPv4 ORPorts. Only including the IPv6 ORPort would provide slightly more specific reachability check circuits. However, we don't need IPv6-only designs, because relays continue trying different reachability circuits until they confirm reachability. IPv6-only designs also make it easy to distinguish relay reachability extend cells from other extend cells. This distinguisher will become more of an issue as IPv6 extends become more common in the network (see sections 4.2.2 and 4.4.4). Removing the IPv4 ORPort also provides no fallback, if the IPv6 ORPort is actually unreachable. IPv6-only failures do not affect reachability checks, but they will become important in the future, as other circuit types start using IPv6 extends. IPv6-only reachability designs also increase the number of special cases in the implementation. (And the likelihood of subtle bugs.) These designs may be appropriate in future, when there are IPv6-only bridges or relays. 5. New Relay Subprotocol Version We reserve Tor subprotocol "Relay=3" for tor versions where: * relays may perform IPv6 extends, and * bridges might not perform IPv6 extends, as described in this proposal. 5.1. Tor Specification Changes We propose the following changes to the [Tor Specification], once this proposal is implemented. 
Adding a new Relay subprotocol version lets testing relays identify other relays that support IPv6 extends. It also allows us to eventually recommend or require support for IPv6 extends on all relays. Append to the Relay version 2 subprotocol specification: Relay=2 has limited IPv6 support: * Clients might not include IPv6 ORPorts in EXTEND2 cells. * Relays (and bridges) might not initiate IPv6 connections in response to EXTEND2 cells containing IPv6 ORPorts, even if they are configured with an IPv6 ORPort. However, relays accept inbound connections to their IPv6 ORPorts, and will extend circuits via those connections. "3" -- relays support extending over IPv6 connections in response to an EXTEND2 cell containing an IPv6 ORPort. Bridges might not extend over IPv6, because they try to imitate client behaviour. A successful IPv6 extend requires: * Relay subprotocol version 3 (or later) on the extending relay, * an IPv6 ORPort on the extending relay, * an IPv6 ORPort for the accepting relay in the EXTEND2 cell, and * an IPv6 ORPort on the accepting relay. (Because different tor instances can have different views of the network, these checks should be done when the path is selected. Extending relays should only check local IPv6 information, before attempting the extend.) When relays receive an EXTEND2 cell containing both an IPv4 and an IPv6 ORPort, and there is no existing authenticated connection with the target relay, the extending relay may choose between IPv4 and IPv6 at random. The extending relay might not try the other address, if the first connection fails. (TODO: check final behaviour after code is merged.) As is the case with other subprotocol versions, tor advertises, recommends, or requires support for this protocol version, regardless of its current configuration. 
In particular: * relays without an IPv6 ORPort, and * tor instances that are not relays, have the following behaviour, regardless of their configuration: * advertise support for "Relay=3" in their descriptor (if they are a relay, bridge, or directory authority), and * react to consensuses recommending or requiring support for "Relay=3". This subprotocol version is described in proposal 311, and implemented in Tor 0.4.4.1-alpha. (TODO: check version after code is merged). 6. Test Plan We provide a quick summary of our testing plans. 6.1. Test IPv6 ORPort Reachability and Extends We propose to test these changes using chutney networks with AssumeReachable disabled. (Chutney currently enables AssumeReachable by default.) We also propose to test these changes on the public network with a small number of relays and bridges. Once these changes are merged, volunteer relay and bridge operators will be able to test them by: * compiling from source, * running nightly builds, or * running alpha releases. 6.2. Test Existing Features We will modify and test these existing features: * IPv4 ORPort reachability checks We do not plan on modifying these existing features: * relay reachability retries TODO: Do relays re-check their own reachability? How often? * relay canonical connections * "too many connections" warning logs But we will test that they continue to function correctly, and fix any bugs triggered by the modifications in this proposal. 6.3. Test Legacy Relay Compatibility We will also test IPv6 extends from newer relays (which implement this proposal) to older relays (which do not). Although this proposal does not create these kinds of circuits, we need to check for bugs and excessive logs in older tor versions. 7. Ongoing Monitoring To monitor the impact of these changes: * relays should collect basic IPv6 connection statistics, and * relays and bridges should collect basic IPv6 bandwidth statistics. (See [Proposal 313: Relay IPv6 Statistics]). 
Some of these statistics may be included in tor's heartbeat logs, making them accessible to relay operators. We do not propose to collect additional statistics on: * circuit counts, or * failure rates. Collecting statistics like these could impact user privacy. We also plan to write a script to calculate the number of IPv6 relays in the consensus. This script will help us monitor the network during the deployment of these new IPv6 features. 8. Changes to Other Proposals [Proposal 306: Client Auto IPv6 Connections] needs to be modified to keep bridge IPv6 behaviour in sync with client IPv6 behaviour. (See section 3.3.2.) References: [Onion Service Protocol]: In particular, Version 3 of the Onion Service Protocol supports IPv6: https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt [Proposal 306: Client Auto IPv6 Connections]: One possible design for automatic client IPv4 and IPv6 connections is at: https://gitweb.torproject.org/torspec.git/tree/proposals/306-ipv6-happy-eyeballs.txt (TODO: modify to include bridge changes with client changes) [Proposal 312: Relay Auto IPv6 Address]: https://gitweb.torproject.org/torspec.git/tree/proposals/312-relay-auto-ipv6-addr.txt [Proposal 313: Relay IPv6 Statistics]: https://gitweb.torproject.org/torspec.git/tree/proposals/313-relay-ipv6-stats.txt [Relay Search]: https://metrics.torproject.org/rs.html [Tor Specification]: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt
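The monitoring script mentioned in section 7 of proposal 311 could start from a sketch like the one below. It assumes the consensus format from the Tor directory protocol, where each relay entry begins with an "r" line and any IPv6 ORPort appears on a following "a" line with a bracketed address. The function name `count_ipv6_relays` is hypothetical, and a real script would probably use a proper consensus parser instead of line prefixes.

```python
from typing import Tuple


def count_ipv6_relays(consensus_text: str) -> Tuple[int, int]:
    """Return (total relays, relays with at least one IPv6 ORPort)."""
    total = 0
    with_ipv6 = 0
    counted_current = False  # has the current relay already been counted?

    for line in consensus_text.splitlines():
        if line.startswith("r "):
            # A new relay entry starts here.
            total += 1
            counted_current = False
        elif line.startswith("a [") and total > 0 and not counted_current:
            # An "a" line with a bracketed address is an IPv6 ORPort.
            # Count each relay once, even if it lists several.
            with_ipv6 += 1
            counted_current = True

    return total, with_ipv6
```

Running this against consensuses published during deployment would give the proportion of relays that advertise an IPv6 ORPort, without collecting any per-circuit data.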

Filename: 312-relay-auto-ipv6-addr.txt
Title: Tor Relay Automatic IPv6 Address Discovery
Author: teor, Nick Mathewson, s7r
Created: 28-January-2020
Status: Accepted
Ticket: #33073

0. Abstract

We propose that Tor relays (and bridges) should automatically find their IPv6 address.

Like tor's existing IPv4 address auto-detection, the chosen IPv6 address will be published as an IPv6 ORPort in the relay's descriptor.

Clients, relays, and authorities connect to relay descriptor IP addresses. Therefore, IP addresses in descriptors need to be publicly routable. (If the relay is running on the public tor network.)

To discover their IPv6 address, some relays may fetch directory documents over IPv6. (For anonymity reasons, bridges are unable to fetch directory documents over IPv6, until clients start to do so.)

1. Introduction

Tor relays (and bridges) currently find their IPv4 address, and use it as their ORPort and DirPort address when publishing their descriptor. But relays and bridges do not automatically find their IPv6 address.

However, relay operators can manually configure an ORPort with an IPv6 address, and that ORPort is published in their descriptor in an "or-address" line (see [Tor Directory Protocol]).

Many relay operators don't know their relay's IPv4 or IPv6 addresses. So they rely on Tor's IPv4 auto-detection, and don't configure an IPv6 address. When operators do configure an IPv6 address, it's easy for them to make mistakes. IPv6 ORPort issues are a significant source of relay operator support requests.

Implementing IPv6 address auto-detection, and IPv6 ORPort reachability checks (see [Proposal 311: Relay IPv6 Reachability]) will increase the number of working IPv6-capable relays in the tor network.

2. Scope

This proposal modifies Tor's behaviour as follows:

Relays, bridges, and directory authorities:
  * automatically find their IPv6 address, and
  * for consistency between IPv4 and IPv6 detection:
    * start using IPv4 ORPort for IPv4 address detection, and
    * re-order IPv4 address detection methods.

Relays (but not bridges, or directory authorities):
  * fetch some directory documents over IPv6.

For anonymity reasons, bridges are unable to fetch directory documents over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].)

For security reasons, directory authorities must only use addresses that are explicitly configured in their torrc.

This proposal makes a small, optional change to existing client behaviour:
  * clients also check IPv6 addresses when rotating TLS keys for new networks.
This client change is in addition to the changes to IPv4 address resolution, most of which won't affect clients. (Because they do not set Address or ORPort.)

Throughout this proposal, "relays" includes directory authorities, except where they are specifically excluded. "relays" does not include bridges, except where they are specifically included. (The first mention of "relays" in each section should specifically exclude or include these other roles.)

When this proposal describes Tor's current behaviour, it covers all supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except where another version is specifically mentioned.

3. Finding Relay IPv6 Addresses

We propose that Tor relays (and bridges) should automatically find their IPv6 address.

Like tor's existing IPv4 address auto-detection, the chosen IPv6 address will be published as an IPv6 ORPort in the relay's descriptor.

Clients, relays, and authorities connect to relay descriptor IP addresses. Therefore, IP addresses in descriptors need to be publicly routable. (If the relay is running on the public tor network.)

Relays should ignore any addresses that are reserved for private networks, and check the reachability of addresses that appear to be public (see [Proposal 311: Relay IPv6 Reachability]). Relays should only publish IP addresses in their descriptor, if they are public and reachable. (If the relay is not running on the public tor network, it may use any IP address.)

To discover their IPv6 address, some relays may fetch directory documents over IPv6. (For anonymity reasons, bridges are unable to fetch directory documents over IPv6, until clients start to do so. For security reasons, directory authorities only use addresses that are explicitly configured in their torrc.)

3.1. Current Relay IPv4 Address Discovery

Currently, all relays (and bridges) must have an IPv4 address. IPv6 addresses are optional for relays.

Tor currently tries to find relay IPv4 addresses in this order:
  1. the Address torrc option
  2. the address of the hostname (resolved using DNS, if needed)
  3. a local interface address (by making an unused socket, if needed)
  4. an address reported by a directory server (using X-Your-Address-Is)

When using the Address option, or the hostname, tor supports:
  * an IPv4 address literal, or
  * resolving an IPv4 address from a hostname.

If tor is running on the public network, and an address isn't globally routable, tor ignores it. (If it was explicitly set in Address, tor logs an error.)

If there are multiple valid addresses, tor chooses:
  * the first address returned by the resolver,
  * the first address returned by the local interface API, and
  * the latest address(es) returned by a directory server, DNS, or the local interface API.

3.1.1. Current Relay IPv4 and IPv6 Address State Management

Currently, relays (and bridges) manage their IPv4 address discovery state, as described in the following table:

                          a  b  c  d  e  f
  1. Address literal      .  .  .  .  .  .
  1. Address hostname     S  N  .  .  .  T
  2. auto hostname        S  N  .  .  F  T
  3. auto interface       ?  ?  .  .  F  ?
  3. auto socket          ?  ?  .  .  F  ?
  4. auto dir header      D  N  D  D  F  A

IPv6 address discovery only uses the first IPv6 ORPort address:

                          a  b  c  d  e  f
  1. ORPort listener      .  .  C  .  F  .
  1. ORPort literal       .  .  C  C  F  .
  1. ORPort hostname      S  N  C  C  F  T

The tables are structured as follows:
  * rows are address resolution stage variants
    * each address resolution stage has a number, and a description
    * the description includes any variants (for example: IP address literal, or hostname)
  * columns describe each variant's state management.

The state management key is:

  a. What kind of API is used to perform the address resolution?
     * .  a CPU-bound API
     * S  a synchronous query API
     * ?  an API that is probably CPU-bound, but may be synchronous on some platforms
     * D  tor automatically updates the stored directory address, whenever a directory document is received

  b. What does the API depend on?
     * .  a CPU-bound API
     * N  a network-bound API
     * ?  an API that is probably CPU-bound, but may be network-bound on some platforms

  c. How are any discovered addresses stored?
     * .  addresses are not stored (but they may be cached by some higher-level tor modules)
     * D  addresses are stored in the directory address suggestion variable
     * C  addresses are stored in the port config listener list

  d. What event makes the address resolution happen?
     * .  when tor wants to know its own address
     * D  when a directory document is received
     * C  when tor parses its config at startup, and during reconfiguration

  e. What conditions make tor attempt this address resolution method?
     * .  this method is always attempted
     * F  this method is only attempted when all other higher-priority methods fail to return an address

  f. Can this method timeout?
     * .  can't time out
     * T  might time out
     * ?  probably doesn't time out, but might time out on some platforms
     * A  can't time out, because it is asynchronous. If a stored address is available, it is returned immediately.

3.2. Finding Relay IPv6 Addresses

We propose that relays (and bridges) try to find their IPv6 address.
For consistency, we also propose to change the address resolution order for IPv4 addresses.

We use the following general principles to choose the order of IP address methods:
  * Explicit is better than Implicit,
  * Local Information is better than a Remote Dependency,
  * Trusted is better than Untrusted, and
  * Reliable is better than Unreliable.
Within these constraints, we try to find the simplest working design.

If a relay is given the wrong address by an attacker, the attacker can direct all inbound relay traffic to their own address. They can't decrypt the traffic without the relay's private keys, but they can monitor traffic patterns.

Therefore, relays should only use untrusted address discovery methods, if every other method has failed. Any method that uses DNS is potentially untrusted, because DNS is often a remote, unauthenticated service. And addresses provided by other directory servers are also untrusted.

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc.

Based on these principles, we propose that tor tries to find relay IPv4 and IPv6 addresses in this order:
  1. the Address torrc option
  2. the advertised ORPort address
  3. a local interface address (by making an unused socket, if needed)
  4. the address of the host's own hostname (resolved using DNS, if needed)
  5. an address reported by a directory server (using X-Your-Address-Is)

Each of these address resolution steps is described in more detail, in its own subsection.

For anonymity reasons, bridges are unable to fetch directory documents over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].)

We avoid using advertised DirPorts for address resolution, because:
  * they are not supported on bridges,
  * they are not supported on IPv6,
  * they may not be configured on a relay, and
  * it is unlikely that a relay operator would configure an ORPort without an IPv4 address, but configure a DirPort with an IPv4 address.
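
The proposed resolution order can be sketched as a simple ordered fallback. This is a minimal illustration in Python, not tor's implementation: the names find_address, is_usable, and example_methods are hypothetical, the example addresses are made up, and "publicly routable" is approximated with the is_global property from Python's ipaddress module. The is_authority shortcut reflects the rule that directory authorities only use explicitly configured addresses.

```python
import ipaddress
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical signature: a method takes an address family (4 or 6) and
# returns candidate address strings, in the order it discovered them.
Method = Callable[[int], Iterable[str]]


def is_usable(candidate: str, family: int, public_network: bool) -> bool:
    """Accept addresses of the right family; on public networks, also
    require a globally routable address."""
    try:
        ip = ipaddress.ip_address(candidate)
    except ValueError:
        return False
    if ip.version != family:
        return False
    return ip.is_global or not public_network


def find_address(
    family: int,
    methods: List[Tuple[str, Method]],
    public_network: bool = True,
    is_authority: bool = False,
) -> Optional[Tuple[str, str]]:
    """Try each method in priority order; return (method name, address)
    for the first usable candidate, or None if every method fails."""
    # Assumption: authorities stop after the first two steps
    # (the Address option and the advertised ORPort).
    if is_authority:
        methods = methods[:2]
    for name, method in methods:
        for candidate in method(family):
            if is_usable(candidate, family, public_network):
                return name, candidate
    return None


# Example inputs (made up for illustration), in the proposed order.
example_methods: List[Tuple[str, Method]] = [
    ("Address option", lambda family: []),
    ("advertised ORPort", lambda family: ["192.168.1.10"] if family == 4 else []),
    ("local interface",
     lambda family: ["10.0.0.5"] if family == 4 else ["2001:4860:4860::8888"]),
    ("own hostname", lambda family: []),
    ("directory header", lambda family: ["1.1.1.1"] if family == 4 else []),
]
```

With these example inputs, the IPv4 lookup skips the private ORPort and interface addresses and falls through to the directory header, while the IPv6 lookup stops at the local interface. IPv4 and IPv6 are resolved independently, so each family can be satisfied by a different step.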

While making these changes, we want to preserve tor's existing behaviour:
  * resolve Address using the local resolver, if needed,
  * ignore private addresses on public tor networks, and
  * when there are multiple valid addresses:
    * if a list of addresses is received, choose the first address, and
    * if different addresses are received over time, choose the most recent address.

3.2.1. Make the Address torrc Option Support IPv6

First, we propose that relays (and bridges) use the Address torrc option to find their IPv4 and IPv6 addresses.

There are two cases we need to cover:

  1. Explicit IP addresses:
     * allow the option to be specified up to two times,
     * use the IPv4 address for IPv4, and
     * use the IPv6 address for IPv6.
     Configuring two addresses in the same address family is a config error.

  2. Hostnames / DNS names:
     * allow the option to be specified up to two times,
     * look up the configured name, and
     * use the first IPv4 and IPv6 address returned by the resolver.
     Resolving multiple addresses in the same address family is not a runtime error, but only the first address from each family will be used.

These lookups should ignore private addresses on public tor networks. If multiple IPv4 or IPv6 addresses are returned, the first public address from each family should be used.

We should support the following combinations of address literals and hostnames:

Legacy configurations:
  A. No configured Address option
  B. Address IPv4 literal
  C. Address hostname (use IPv4 and IPv6 DNS addresses)

New configurations:
  D. Address IPv6 literal
  E. Address IPv4 literal / Address IPv6 literal
  F. Address hostname / Address hostname (use IPv4 and IPv6 DNS addresses)
  G. Address IPv4 literal / Address hostname (only use IPv6 DNS addresses)
  H. Address hostname (only use IPv4 DNS addresses) / Address IPv6 literal

If we can't find an IPv4 or IPv6 address using the configured Address options:
  No IPv4: guess IPv4, and its reachability must succeed.
  No IPv6: guess IPv6, publish if reachability succeeds.
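
The combinations A to H can be checked with a small classifier. The sketch below is illustrative Python, not tor's config parser: plan_address_options is a hypothetical name, and it assumes address literals are written without brackets. For each address family it reports whether the address comes from a literal, from resolving the configured hostname(s), or must be guessed by a later resolution step.

```python
import ipaddress
from typing import Dict, List, Tuple


def plan_address_options(values: List[str]) -> Dict[int, Tuple[str, object]]:
    """Map each family (4 or 6) to one of:
    ("literal", address), ("hostname", [names]), or ("guess", None)."""
    if len(values) > 2:
        raise ValueError("Address may be specified at most two times")

    literals: Dict[int, str] = {}
    hostnames: List[str] = []
    for value in values:
        try:
            ip = ipaddress.ip_address(value)
        except ValueError:
            # Not an address literal: treat it as a hostname / DNS name.
            hostnames.append(value)
            continue
        if ip.version in literals:
            raise ValueError(
                "two Address literals in the same address family")
        literals[ip.version] = str(ip)

    plan: Dict[int, Tuple[str, object]] = {}
    for family in (4, 6):
        if family in literals:
            # A literal claims its own family (cases B, D, E, G, H).
            plan[family] = ("literal", literals[family])
        elif hostnames:
            # Hostnames are only resolved for families that have no
            # literal (cases C, F, G, H).
            plan[family] = ("hostname", list(hostnames))
        else:
            # Nothing configured for this family: guess it later.
            plan[family] = ("guess", None)
    return plan
```

For example, plan_address_options(["1.1.1.1", "relay.example.org"]) corresponds to combination G: the IPv4 address is the literal, and the (made-up) hostname is only used for IPv6. Two literals in the same family raise an error, matching the config error described above.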

Combinations A and B are the most common legacy configurations. We want to support the following outcomes for all legacy configurations:
  * automatic upgrades to guessed and reachable IPv6 addresses,
  * continuing to operate on IPv4 when the IPv6 address can't be guessed, and
  * continuing to operate on IPv4 when the IPv6 address has been guessed, but it is unreachable.

At this time, we do not propose guessing multiple IPv4 or IPv6 addresses and testing their reachability (see section 3.4.2).

It is an error to configure an Address option with a private IPv4 or IPv6 address. Tor should warn if a configured Address hostname does not resolve to any publicly routable IPv4 or IPv6 addresses. (In both these cases, if tor is configured with a custom set of directory authorities, private addresses should be allowed, with a notice-level log.)

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Therefore, we propose that directory authorities only accept IPv4 or IPv6 address literals in their Address option. They must not attempt to resolve their Address using DNS. It is a config error to provide a hostname as a directory authority's Address.

If the Address option is not configured for IPv4 or IPv6, or the hostname lookups do not provide both IPv4 and IPv6 addresses, address resolution should go to the next step.

3.2.2. Use the Advertised ORPort IPv4 and IPv6 Addresses

Next, we propose that relays (and bridges) use the first advertised ORPort IPv4 and IPv6 addresses, as configured in their torrc.

The ORPort address may be a hostname. If it is, tor should try to use it to resolve an IPv4 and IPv6 address, and open ORPorts on the first available IPv4 and IPv6 address. Tor should respect the IPv4Only and IPv6Only port flags, if specified. (Tor currently resolves IPv4 and IPv6 addresses from hostnames in ORPort lines.)

Relays (and bridges) currently use the first advertised ORPort IPv6 address as their IPv6 address.
We propose to use the first advertised IPv4 ORPort address in a similar way, for consistency.

Therefore, this change may affect existing relay IPv4 addresses. We expect that a small number of relays may change IPv4 address, from a guessed IPv4 address, to their first advertised IPv4 ORPort address.

In rare cases, relays may have been using non-advertised ORPorts for their addresses. This change may also change their addresses.

Tor currently uses its listener port list to look up its IPv6 ORPort for its descriptor. We propose that tor's address discovery uses the listener port list for both IPv4 and IPv6. (And does not attempt to independently parse or resolve ORPort configs.)

This design decouples ORPort option parsing, ORPort listener opening, and address discovery. It also implements a form of caching: IPv4 and IPv6 addresses resolved from hostnames are stored in the listener port list, then used to open listeners. Therefore, tor should continue to use the same address, while the listener remains open. (See also sections 3.2.7 and 3.2.8.)

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Therefore, we propose that directory authorities only accept IPv4 or IPv6 address literals in the address part of the ORPort and DirPort options. They must not attempt to resolve these addresses using DNS. It is a config error to provide a hostname as a directory authority's ORPort or DirPort.

If directory authorities don't have an IPv4 address literal in their Address or ORPort, they should issue a configuration error, and refuse to launch. If directory authorities don't have an IPv6 address literal in their Address or ORPort, they should issue a notice-level log, and fall back to only using IPv4.

For the purposes of address resolution, tor should ignore private configured ORPort addresses on public tor networks.
(Binding to private ORPort addresses is supported, even on public tor networks, for relays that use NAT to reach the Internet.) If an ORPort address is private, address resolution should go to the next step.

3.2.3. Use Local Interface IPv6 Address

Next, we propose that relays (and bridges) use publicly routable addresses from the OS interface addresses or routing table, as their IPv4 and IPv6 addresses.

Tor has local interface address resolution functions, which support most major OSes. Tor uses these functions to guess its IPv4 address. We propose using them to also guess tor's IPv6 address.

We also propose modifying the address resolution order, so interface addresses are used before the local hostname. This decision is based on our principles: interface addresses are local, trusted, and reliable; hostname lookups may be remote, untrusted, and unreliable.

Some developer documentation also recommends using interface addresses, rather than resolving the host's own hostname. For example, on recent versions of macOS, the man pages tell developers to use interface addresses (getifaddrs) rather than look up the host's own hostname (gethostname and getaddrinfo). Unfortunately, these man pages don't seem to be available online, except for short quotes (see [getaddrinfo man page] for the relevant quote).

If the local interface addresses are unavailable, tor opens a UDP socket to a publicly routable address, but doesn't actually send any packets. Instead, it uses the socket APIs to discover the interface address for the socket. (UDP is used because it is stateless, so the OS will not send any packets to open a connection.)

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Since local interface addresses are implicit, and may depend on DHCP, directory authorities do not use this address resolution method (or any of the other, lower-priority address resolution methods).
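
The UDP socket technique described above can be sketched with the standard socket API. This is an illustrative Python version, not tor's C code: guess_interface_address is a hypothetical name, and the probe destinations are arbitrary publicly routable addresses chosen for the example. Calling connect() on a UDP socket only selects a route and a local address; it does not send any packets.

```python
import ipaddress
import socket
from typing import Optional

# Arbitrary publicly routable destinations, chosen for illustration only.
# Nothing is ever sent to them.
PROBE_TARGETS = {
    socket.AF_INET: ("1.1.1.1", 9),
    socket.AF_INET6: ("2606:4700:4700::1111", 9),
}


def guess_interface_address(family: int) -> Optional[str]:
    """Return the local address the OS would use to reach the public
    Internet for this family, or None if it is unavailable or private."""
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            # For UDP, connect() picks a route and binds a local
            # address, without transmitting anything.
            sock.connect(PROBE_TARGETS[family])
            local = sock.getsockname()[0]
    except OSError:
        # No route, no support for this family, or a local policy
        # that blocks the call.
        return None
    # Drop any IPv6 zone identifier (for example "%eth0").
    local = local.split("%", 1)[0]
    try:
        ip = ipaddress.ip_address(local)
    except ValueError:
        return None
    # Ignore private addresses, as on a public relay behind NAT.
    return str(ip) if ip.is_global else None
```

On a machine behind NAT, this typically returns None for IPv4, because the interface address is private. In that case, address resolution would go to the next step.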

Relays that use NAT to reach the Internet may have no publicly routable local interface addresses, even on the public tor network. The NAT box has the publicly routable addresses, and it may be a separate machine.

Relays may also be unable to detect any local interface addresses. The required APIs may be unavailable, due to:
  * missing OS or library features, or
  * local security policies.

Tor already ignores private IPv4 interface addresses on public relays. We propose to also ignore private IPv6 interface addresses. If all IPv4 or IPv6 interface addresses are private, address resolution should go to the next step.

3.2.4. Use Own Hostname IPv6 Addresses

Next, we propose that relays (and bridges) get their local hostname, look up its addresses, and use them as its IPv4 and IPv6 addresses.

We propose to use the same underlying lookup functions to look up the IPv4 and IPv6 addresses for:
  * the Address torrc option (see section 3.2.1), and
  * the local hostname.
However, OS APIs typically only return a single hostname. (Rather than a separate hostname for IPv4 and IPv6.)

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Since hostname lookups may use DNS, directory authorities do not use this address resolution method.

The hostname lookup should ignore private addresses on public relays. If multiple IPv4 or IPv6 addresses are returned, the first public address from each family should be used. If all IPv4 or IPv6 hostname addresses are private, address resolution should go to the next step.

3.2.5. Use Directory Header IPv6 Addresses

Finally, we propose that relays get their IPv4 and IPv6 addresses from the X-Your-Address-Is HTTP header in tor directory documents. To support this change, we propose that relays start fetching directory documents over IPv4 and IPv6.

We propose that bridges continue to only fetch directory documents over IPv4, because they try to imitate clients.
(Most clients only fetch directory documents over IPv4; a few clients are configured to only fetch over IPv6.) When client behaviour changes to use both IPv4 and IPv6 for directory fetches, bridge behaviour can also change to match. (See section 3.4.1 and [Proposal 306: Client Auto IPv6 Connections].)

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Since directory headers are provided by other directory servers, directory authorities do not use this address resolution method.

We propose to use a simple load balancing scheme for IPv4 and IPv6 directory requests:
  * choose between IPv4 and IPv6 directory requests at random.

We do not expect this change to have any load-balancing impact on the public tor network, because the number of relays is much smaller than the number of clients. However, the 6 directory authorities with IPv6 enabled may see slightly more directory load, particularly over IPv6.

To support this change, tor should also change how it handles IPv6 directory failures on relays:
  * avoid recording IPv6 directory failures as remote relay failures, because they may actually be due to a lack of IPv6 connectivity on the local relay, and
  * issue IPv6 directory failure logs at notice level, and rate-limit them to one per hour.

If a relay:
  * is explicitly configured with an IPv6 address, or
  * has discovered a publicly routable, reachable IPv6 address in an earlier step,
then tor should start issuing IPv6 directory failure logs at warning level. Tor may also record these directory failures as remote relay failures. (Rather than ignoring them, as described in the previous paragraph.)

(Alternately, tor could stop doing IPv6 directory requests entirely. But we prefer designs where all relays behave in a similar way, regardless of their internal state.)

For some more complex directory load-balancing schemes, see section 3.5.4.

Tor already ignores private IPv4 addresses in directory headers.
We propose to also ignore private IPv6 addresses in directory headers. If all IPv4 and IPv6 addresses in directory headers are private, address resolution should return a temporary error.

Whenever address resolution fails, tor should warn the operator to set the Address torrc option for IPv4 and IPv6. (If IPv4 is available, and only IPv6 is missing, the log should be at notice level.) These logs may need to be rate-limited.

The next time tor receives a directory header containing a public IPv4 or IPv6 address, tor should use that address for reachability checks. If the reachability checks succeed, tor should use that address in its descriptor.

Doing relay directory fetches over IPv6 will create extra IPv6 connections and IPv6 bandwidth on the tor network. (See [Proposal 313: Relay IPv6 Statistics].) In addition, some client circuits may use the IPv6 connections created by relay directory fetches.

3.2.6. Disabling IPv6 Address Resolution

Some relays (and bridges) have a reachable IPv6 address that is unsuitable for the relay. These relays need to be able to disable IPv6 address resolution.

Based on [Proposal 311: Relay IPv6 Reachability], and this proposal, those relays would:
  * discover their IPv6 address,
  * open an IPv6 ORPort,
  * find it reachable,
  * publish a descriptor containing that IPv6 ORPort,
  * have the directory authorities find it reachable,
  * have it published in the consensus, and
  * have it used by clients,
regardless of how the operator configures their tor instance.

Currently, relays are required to have an IPv4 address. So if the guessed IPv4 address is unsuitable, operators can set the Address option to a suitable IPv4 address. But IPv6 addresses are optional, so relay operators may need to disable IPv6 entirely.

We propose a new torrc-only option, AddressDisableIPv6. This option is set to 0 by default.
If the option is set to 1, tor disables IPv6 address resolution, IPv6 ORPorts, IPv6 reachability checks, and publishing an IPv6 ORPort in its descriptor.

3.2.6.1. Disabling IPv6 Address Resolution: Alternative Design

As an alternative design, tor could change its interpretation of the IPv4Only flag, so that the following configuration lines disable IPv6: (In the absence of any non-IPv4Only ORPort lines.)
  * ORPort 9999 IPv4Only
  * ORPort 1.1.1.1:9999 IPv4Only

However, we believe that this is a confusing design, because we want to enable IPv6 address resolution on this similar, very common configuration:
  * ORPort 1.1.1.1:9999

Therefore, we avoid this design, because it changes the meaning of existing flags and options.

3.2.7. Automatically Enabling an IPv6 ORPort

We propose that relays (and bridges) that discover their IPv6 address, should open an IPv6 ORPort, and test its reachability (see [Proposal 311: Relay IPv6 Reachability], particularly section 4.3.1).

The ORPort should be opened on the port configured in the relay's ORPort torrc option. Relay operators can use the IPv4Only and IPv6Only options to configure different ports for IPv4 and IPv6.

If the ORPort is auto-detected, there will not be any specific bind address. (And the detected address may actually be on a NAT box, rather than the local machine.) Therefore, relays should attempt to bind to all IPv4 and IPv6 addresses (or all interfaces).

Some operating systems expect applications to bind to IPv4 and IPv6 addresses using separate API calls. Others don't support binding only to IPv4 or IPv6, and will bind to all addresses whenever there is no specified IP address (in a single API call). Tor should support both styles of networking API.

In particular, if binding to all IPv6 addresses fails, relays should still try to discover their public IPv6 address, and check the reachability of that address. Some OSes may not support the IPV6_V6ONLY flag, but they may instead bind to all addresses at runtime.
(The tor install may also have compile-time / runtime flag mismatches.)

If both reachability checks succeed, relays should publish their IPv4 and IPv6 ORPorts in their descriptor.

If only the IPv4 ORPort check succeeds, and the IPv6 address was guessed (rather than being explicitly configured), then relays should:
  * publish their IPv4 ORPort in their descriptor,
  * stop publishing their IPv6 ORPort in their descriptor, and
  * log a notice about the failed IPv6 ORPort reachability check.

3.2.8. Proposed Relay IPv4 and IPv6 Address State Management

We propose that relays (and bridges) manage their IPv4 and IPv6 address discovery state, as described in the following table:

                          a  b  c  d  e  f
  1. Address literal      .  .  .  .  .  .
  1. Address hostname     S  N  .  .  .  T
  2. ORPort listener      .  .  C  .  F  .
  2. ORPort literal       .  .  C  C  F  .
  2. ORPort hostname      S  N  C  C  F  T
  3. auto interface       ?  ?  .  .  F  ?
  3. auto socket          ?  ?  .  .  F  ?
  4. auto hostname        S  N  .  .  F  T
  5. auto dir header      D  N  D  D  F  A

See section 3.1.1 for a description and key for this table. See the rest of section 3.2 for a detailed description of each method and variant.

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Therefore, they stop after step 2. (And don't use the "hostname" variants in steps 1 and 2.)

For anonymity reasons, bridges are unable to fetch directory documents over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].)

3.3. Consequential Tor Client Changes

We do not propose any required client address resolution changes at this time.

However, clients will use the updated address resolution functions to detect when they are on a new connection, and therefore need to rotate their TLS keys.

This minor client change allows us to avoid keeping an outdated version of the address resolution functions, which is only for client use.

Clients should skip address resolution steps that don't apply to them, such as:
  * the ORPort option, and
  * the Address option, if it becomes a relay module option.

3.4. Alternative Address Resolution Designs

We briefly mention some potential address resolution designs, and the reasons that they were not used in this proposal.

(Some designs may be proposed for future Tor versions, but are not necessary at this time.)

3.4.1. Future Bridge IPv6 Address Resolution Behaviour

When clients automatically fetch directory documents via relay IPv4 and IPv6 ORPorts by default, bridges should also adopt this dual-stack behaviour. (For example, see [Proposal 306: Client Auto IPv6 Connections].)

When bridges fetch directory documents via IPv6, they will be able to find their IPv6 address using directory headers (see 3.2.5).

3.4.2. Guessing Multiple IPv4 or IPv6 Addresses

We avoid designs which guess (or configure) multiple IPv4 or IPv6 addresses, test them all for reachability, and choose one that works.

Using multiple addresses is rare, and the code to handle it is complex. It also requires careful design to avoid:
  * conflicts between multiple relays (or bridges) on the same address (tor allows up to 2 relays per IPv4 address),
  * relay flapping,
  * race conditions, and
  * relay address switching.

3.4.3. Rejected Address Resolution Designs

We reject designs that try all the different address resolution methods, score addresses, and then choose the address with the highest score.

These designs are a generalisation of designs that try different methods in a set order (like this proposal). They are more complex than required.

Complex designs can confuse operators, particularly when they fail.

Operators should not need complex address resolution in tor: most relay (and bridge) addresses are fixed, or change occasionally. And most relays can reliably discover their address using directory headers, if all other methods fail.
(Bridges won't discover their IPv6 address from directory headers, see section 3.2.5.)

If complex address resolution is required, it can be configured using a dynamic DNS name in the Address torrc option, or via the control port.

We also avoid designs that use any addresses other than the first (or latest) valid IPv4 and IPv6 address. These designs are more complex, and they don't have clear benefits:
  * sort addresses numerically (avoid address flipping)
  * sort addresses by length, then numerically (also minimise consensus size)
  * store a list of previous addresses in the state file, and use the most recently used address that's currently available.

Operators who want to avoid address flipping should set the Address option in the torrc. Operators who want to minimise the size of the consensus should use all-zero IPv6 host identifiers.

3.5. Optional Efficiency and Reliability Changes

We propose some optional changes for efficiency and reliability, and describe their impact.

Some of these changes may be more appropriate in future releases, or along with other proposed features.

Some of these changes make tor ignore some potential IP addresses. Ignoring addresses risks relays having no available ORPort addresses, and refusing to publish their descriptor. So before we ignore any addresses, we should make sure that:
  * tor's other address detection methods are robust and reliable, and
  * we would prefer relays to shut down, rather than use the ignored address.

As a less severe alternative, low-quality methods can be put last in the address resolution order. (See section 3.2.)

If relays prefer addresses from particular sources (for example: ORPorts), they should try these sources regularly, so that their addresses do not become too old.

If relays ignore addresses from some sources (for example: DirPorts), they must regularly try other sources (for example: ORPorts).

3.5.1. Using Authenticated IPv4 and IPv6 Addresses

We propose this optional change, to improve relay (and bridge) address accuracy and reliability.

Relays should try to use authenticated connections to discover their own IPv4 and IPv6 addresses.

Tor supports two kinds of authenticated address information:
  * authenticated directory connections, and
  * authenticated NETINFO cells.

See the following sections for more details.

See also sections 3.5.2 to 3.5.4.

3.5.1.1. Authenticated Directory Connections

We propose this optional change, to improve relay address accuracy and reliability. (Bridges are not affected, because they already use authenticated directory connections, just like clients.)

Tor supports authenticated, encrypted directory fetches using BEGINDIR over ORPorts (see the [Tor Specification] for details).

Relays currently fetch unencrypted directory documents over DirPorts. The directory document itself is signed, but the HTTP headers are not authenticated. (Clients and bridges only fetch directory documents using authenticated directory fetches.)

Using authenticated directory headers for relay addresses:
  * provides authenticated address information,
  * reduces the number of attackers that can deliberately give a relay an incorrect IP address, and
  * avoids caches (or other machines) accidentally mangling, deleting, or repeating X-Your-Address-Is headers.

To make this change, we need to modify tor's directory connection code:
  * when making directory requests, relays should fetch some directory documents using BEGINDIR over ORPorts.

Once tor regularly gets authenticated X-Your-Address-Is headers, relays can change how they handle unauthenticated addresses. When they receive an unauthenticated address suggestion, relays can:
  * ignore the address, or
  * use the address as the lowest priority address method.

See section 3.5 for some factors to consider when making this design decision.

For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Since directory headers are provided by other directory servers, directory authorities do not use this address resolution method.

For anonymity reasons, bridges are unable to fetch directory documents over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].) Bridges currently use authenticated IPv4 connections for all their directory fetches, to imitate default client behaviour.

We describe a related change, which is also optional:

We can increase the number of ORPort directory fetches:
  * if tor has an existing ORPort connection to a relay that it has selected for a directory fetch, it should use an ORPort fetch, rather than opening an additional DirPort connection.

Using an existing ORPort connection:
  * saves one DirPort connection and file descriptor,
  * but slightly increases the cryptographic processing done by the relay, and by the directory server it is connecting to.
However, the most expensive cryptographic operations have already happened, when the ORPort connection was opened.

This change does not increase the number of NETINFO cells, because it re-uses existing OR connections. See the next section for more details.

3.5.1.2. Authenticated NETINFO Cells

We propose this optional change, to improve relay (and bridge) address accuracy and reliability. (Bridge IPv6 addresses are not affected, because bridges only make OR connections over IPv4, to imitate default client behaviour.)

Tor supports authenticated IPv4 and IPv6 address information, using the NETINFO cells exchanged at the beginning of each ORPort connection (see the [Tor Specification] for details).

Relays do not currently use any address information from NETINFO cells.
Using authenticated NETINFO cells for relay addresses: * provides authenticated address information, * reduces the number of attackers that can deliberately give a relay an incorrect IP address, and * does not require a directory fetch (NETINFO cells are sent during connection setup). To make this change, we need to modify tor's cell processing: * when processing NETINFO cells, tor should store the OTHERADDR field, like it currently does for X-Your-Address-Is HTTP headers, and * IPv4 and IPv6 addresses should be stored separately. See the previous section, and section 3.2.5 for more details about the X-Your-Address-Is HTTP header. Once tor uses NETINFO cell addresses, relays can change how they handle unauthenticated X-Your-Address-Is headers. When they receive an unauthenticated address suggestion, relays can: * ignore the address, or * use the address as the lowest priority address method. See section 3.5 for some factors to consider when making this design decision. We propose that tor continues to use the X-Your-Address-Is header, and adds support for addresses in NETINFO cells. X-Your-Address-Is headers are sent once per directory document fetch, but NETINFO cells are only sent once per OR connection. If a relay: * only gets addresses from NETINFO cells from authorities, and * has an existing, long-term connection to every authority, then it may struggle to detect address changes. Once all supported tor versions use NETINFO cells for address detection, we should review this design decision. If we are confident that almost all relays will be forced to make new connections when their address changes, then tor may be able to stop using X-Your-Address-Is HTTP headers. For security reasons, directory authorities only use addresses that are explicitly configured in their torrc. Since NETINFO cells are provided by other directory servers, directory authorities do not use this address resolution method. 
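The storage rules in this section can be sketched roughly as follows. This is only an illustration, not tor's implementation: the AddressSuggestions class and its method names are hypothetical, and it assumes the "lowest priority" option for unauthenticated suggestions (the alternative is to ignore them entirely).

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class AddressSuggestions:
    """Hypothetical store: one suggested address per family (4 or 6)."""
    # family -> (address, authenticated)
    best: dict = field(default_factory=dict)

    def record(self, text: str, authenticated: bool) -> None:
        # Accept both bare and bracketed IPv6 literals.
        addr = ipaddress.ip_address(text.strip("[]"))
        current = self.best.get(addr.version)
        # Assumption: an unauthenticated suggestion (for example, an
        # X-Your-Address-Is header fetched over a DirPort) never displaces
        # an authenticated one (for example, a NETINFO OTHERADDR field).
        if current is not None and current[1] and not authenticated:
            return
        self.best[addr.version] = (addr, authenticated)

    def get(self, family: int):
        entry = self.best.get(family)
        return entry[0] if entry else None


suggestions = AddressSuggestions()
suggestions.record("203.0.113.7", authenticated=False)
suggestions.record("[2001:db8::1]", authenticated=True)
suggestions.record("2001:db8::2", authenticated=False)  # kept out: lower priority
print(suggestions.get(4), suggestions.get(6))
```

Keeping the IPv4 and IPv6 entries in separate slots means a new suggestion for one family cannot overwrite the other family's address.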
Bridges only make OR connections, and those OR connections are only over IPv4, to imitate default client behaviour. For anonymity reasons, bridges are unable to make regular connections over IPv4 and IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].) As an alternative design, if tor's addresses are stale, it could close some of its open directory authority connections. (Similar to section 4.4.2 in [Proposal 311: Relay IPv6 Reachability], where relays close existing OR connections, before testing their own reachability.) However, this design is more complicated, because it involves tracking address age, as well as the address itself. 3.5.2. Preferring IPv4 and IPv6 Addresses from Directory Authorities We propose this optional change, to improve relay (but not bridge) address accuracy and reliability. Relays prefer IPv4 and IPv6 address suggestions received from Directory Authorities. Directory authorities do not use these address detection methods to discover their own addresses, for security reasons. When they receive an address suggestion from a directory mirror, relays can: * ignore the address, or * use the address as the lowest priority address method. See section 3.5 for some factors to consider when making this design decision. Bridges only make OR connections, and those OR connections are only over IPv4, to imitate default client behaviour. For anonymity reasons, bridges are unable to make regular connections over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].) See also sections 3.5.1 to 3.5.4. 3.5.3. Ignoring Addresses on Inbound Connections We propose this optional change, to improve relay (and bridge) address accuracy and reliability. Relays ignore IPv4 and IPv6 address suggestions received on inbound connections. 
We make this change, because we want to detect the IP addresses of the relay's outbound routes, rather than the addresses that other relays believe they are connecting to for inbound connections.

If we make this change, relays may need to close some inbound connections, before doing address detection. If we also make the changes in sections 3.5.1 and 3.5.2, busy relays could have persistent, inbound OR connections from all directory authorities. (Currently, there are 9 directory authorities with IPv4 addresses, and 6 directory authorities with IPv6 addresses.)

Directory authorities do not use these address detection methods to discover their own addresses, for security reasons.

See also sections 3.5.1 to 3.5.4.

3.5.4. Load Balancing

We propose some optional changes to improve relay (and bridge) load-balancing across directory authorities.

Directory authorities do not use these address detection methods to discover their own addresses, for security reasons.

See also sections 3.5.1 to 3.5.3.

3.5.4.1. Directory Authority Load Balancing

Relays may prefer:
  * authenticated connections (section 3.5.1).

Relays and bridges may prefer:
  * connecting to Directory Authorities (section 3.5.2), or
  * ignoring addresses on inbound connections (section 3.5.3) (and therefore, they may close some inbound connections, leading to extra connection re-establishment load).

All these changes are optional, so they might not be implemented.

Directory authorities do not use these address detection methods to discover their own addresses, for security reasons.

If both changes are implemented, we would like all relays (and bridges) to do frequent directory fetches:
  * using BEGINDIR over ORPorts,
  * to directory authorities.

However, this extra load from relays may be unsustainable during high network load (see [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s]).
For anonymity reasons, bridges should avoid connecting to directory authorities too frequently, to imitate default client behaviour. Therefore, we propose a simple load-balancing scheme between address resolution and non-address resolution requests: * when relays first start up, they should make two directory authority ORPort fetch attempts, one on IPv4, and one on IPv6, * relays should also make occasional directory authority ORPort directory fetch attempts, on IPv4 and IPv6, to learn if their addresses have changed. We propose a new torrc option and consensus parameter: RelayMaxIntervalWithoutAddressDetectionRequest N seconds|minutes|hours Relays make most of their directory requests via directory mirror DirPorts, to reduce the load on directory authorities. When this amount of time has passed since a relay last connected to a directory authority ORPort, the relay makes its next directory request via a directory authority ORPort. (Default: 15 minutes) The final name and description for this option will depend on which optional changes are actually implemented in tor. In particular, this option should only consider requests that tor may use to discover its IP addresses. For example: * if tor uses NETINFO cells for addresses (section 3.5.1.2), then all OR connections to an authority should be considered, * if tor does not use NETINFO cells for addresses, and only uses X-Your-Address-Is headers, then only directory fetches from authorities should be considered. We set the default value of this option to 15 minutes, because: * tor's reachability tests fail if the ORPort is unreachable after 20 minutes. So we want to do at least two address detection requests in the first 20 minutes; * the minimum consensus period is 30 minutes, and we want to do at least one address detection per consensus period. (Consensuses are usually created every hour. 
But if there is no fresh consensus, directory authorities will try to create a consensus every 30 minutes); and * the default value for TestingAuthDirTimeToLearnReachability is 30 minutes. So directory authorities will make reachability test OR connections to each relay, at least every 30 minutes. Therefore, relays will see NETINFO cells from directory authorities about this often. (Relays may use NETINFO cells for address detection, see section 3.5.1.2.) See also section 3.5.4.3, for some general load balancing criteria, that may help when tuning the address detection interval. We propose a related change, which is also optional: If relays use address suggestions from directory mirrors, they may choose between ORPort and DirPort connections to directory mirrors at random. Directory mirrors typically have enough spare CPU and bandwidth to handle ORPort directory requests. (And the most expensive cryptography happens when the ORPort connection is opened.) See also sections 3.5.1 to 3.5.3. 3.5.4.2. Load Balancing Between IPv4 and IPv6 Directories We propose this optional change, to improve the load-balancing between IPv4 and IPv6 directories, when used by relays to find their IPv4 and IPv6 addresses (see section 3.2.5). For anonymity reasons, bridges are unable to make regular connections over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].) Directory authorities do not use these address detection methods to discover their own addresses, for security reasons. This change may only be necessary if the following changes result in poor load-balancing, or other relay issues: * randomly selecting IPv4 or IPv6 directories (see section 3.2.5), * preferring addresses from directory authorities, via an authenticated connection (see sections 3.5.1 and 3.5.2), or * ignoring addresses on inbound connections, and therefore closing and re-opening some connections (see section 3.5.3). 
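The RelayMaxIntervalWithoutAddressDetectionRequest option from section 3.5.4.1 amounts to a timer check before each directory request. A minimal sketch, assuming a single timer and the proposed 15 minute default; the AddressDetectionTimer class, its method names, and the returned strings are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_INTERVAL = 15 * 60  # seconds; the default proposed in section 3.5.4.1


@dataclass
class AddressDetectionTimer:
    """Hypothetical helper that decides where the next directory request goes."""
    max_interval: float = DEFAULT_INTERVAL
    last_authority_request: Optional[float] = None

    def next_request_target(self, now: float) -> str:
        # At startup, or once the interval has passed, use an authority ORPort
        # so the relay can learn whether its address has changed.
        if (self.last_authority_request is None
                or now - self.last_authority_request >= self.max_interval):
            return "authority-orport"
        # Otherwise keep load off the authorities and use a mirror.
        return "mirror-dirport"

    def note_authority_request(self, now: float) -> None:
        self.last_authority_request = now


timer = AddressDetectionTimer()
print(timer.next_request_target(now=0.0))    # no authority contact yet
timer.note_authority_request(now=0.0)
print(timer.next_request_target(now=600.0))  # 10 minutes later
print(timer.next_request_target(now=900.0))  # 15 minutes later
```

The caller passes in the current time, which keeps the sketch deterministic; a real implementation would read a monotonic clock instead.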
We propose that the RelayMaxIntervalWithoutAddressDetectionRequest option is counted separately for IPv4 and IPv6 (see the previous section for details). For example:
  * if 30 minutes has elapsed since the last IPv4 address detection request, then the next directory request should be an IPv4 address detection request, and
  * if 30 minutes has elapsed since the last IPv6 address detection request, then the next directory request should be an IPv6 address detection request.

If both intervals have elapsed at the same time, the relay should choose between IPv4 and IPv6 at random.

See also section 3.5.4.3, for some general load balancing criteria, that may help when tuning the address detection interval.

Alternately, we could wait until [Proposal 306: Client Auto IPv6 Connections] is implemented, and use the directory fetch design from that proposal.

See also sections 3.5.1 to 3.5.3.

3.5.4.3. General Load Balancing Criteria

We propose the following criteria for choosing load-balancing intervals:

The selected interval should be chosen based on the following factors:
  * relays need to discover their IPv4 and IPv6 addresses to publish their descriptors,
  * it only takes one successful directory fetch from one authority for a relay to discover its IP address (see section 3.5.2),
  * whether relays fall back to addresses discovered from directory mirrors, when directory authorities are unavailable (see section 3.5.2),
  * BEGINDIR over ORPort requires a TLS connection, and some additional tor cryptography, so it is more expensive for authorities than a DirPort fetch (and it cannot be cached by an HTTP cache) (see section 3.5.1),
  * closing and re-opening some OR connections (see section 3.5.3),
  * minimising wasted CPU (and bandwidth) for IPv6 connection attempts on IPv4-only relays, and
  * other potential changes to relay directory fetches (see [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s])

The selected interval should allow almost all relays to update both their IPv4 and IPv6
addresses: * at least twice when they bootstrap and test reachability (to allow for fetch failures), * at least once per consensus interval (that is, every 30 minutes), and * from a directory authority (if required). For anonymity reasons, bridges are unable to make regular connections over IPv6, until clients start to do so. (See [Proposal 306: Client Auto IPv6 Connections].) Directory authorities do not use these address detection methods to discover their own addresses, for security reasons. In this proposal, relays choose between IPv4 and IPv6 directory fetches at random (see section 3.2.5 for more detail). But if this change causes issues on IPv4-only relays, we may have to try IPv6 less often. See also sections 3.5.1 to 3.5.3. 3.5.5. Detailed Address Resolution Logs We propose this optional change, to help diagnose relay address resolution issues. Relays (and bridges) should log the address chosen using each address resolution method, when: * address resolution succeeds, * address resolution fails, * reachability checks fail, or * publishing the descriptor fails. These logs should be rate-limited separately for successes and failures. The logs should tell operators to set the Address torrc option for IPv4 and IPv6 (if available). 3.5.6. Add IPv6 Support to is_local_addr() We propose this optional change, to improve the accuracy of IPv6 address detection from directory documents. Directory servers use is_local_addr() to detect if the requesting tor instance is on the same local network. If it is, the directory server does not include the X-Your-Address-Is HTTP header in directory documents. Currently, is_local_addr() checks for: * an internal IPv4 or IPv6 address, or * the same IPv4 /24 as the directory server. We propose also checking for: * the same IPv6 /48 as the directory server. We choose /48 because it is typically the smallest network in the global IPv6 routing tables, and it was previously the recommended per-customer network block. 
(See [RFC 6177: IPv6 End Site Address Assignment].)

Tor currently uses:
  * IPv4 /8 and IPv6 /16 for port summaries,
  * IPv4 /16 and IPv6 /32 for path selection (avoiding relays in the same network block).

See also the next section, which uses IPv6 /64 for sybils.

3.5.7. Add IPv6 Support to AuthDirMaxServersPerAddr

We propose this optional change, to improve the health of the network, by rejecting too many relays on the same IPv6 address.

Modify get_possible_sybil_list() so it takes an address family argument, and returns a list of IPv4 or IPv6 sybils.

Use the modified get_possible_sybil_list() to exclude relays from the authority's vote, if there are more than:
  * AuthDirMaxServersPerAddr on the same IPv4 address, or
  * AuthDirMaxServersPerIPv6Site in the same IPv6 /64.

We choose IPv6 /64 as the IPv6 site size, because:
  * provider site allocations range between /48 and /64 (with a recommendation of /56),
  * /64 is the typical host allocation (see [RFC 6177: IPv6 End Site Address Assignment]),
  * we don't want to discourage IPv6 address adoption on the tor network.

Tor currently uses:
  * IPv4 /8 and IPv6 /16 for port summaries,
  * IPv4 /16 and IPv6 /32 for path selection (avoiding relays in the same network block).

See also the previous section, which uses IPv6 /48 for the local network.

This change allows:
  * up to AuthDirMaxServersPerIPv6Site relays on the smallest IPv6 site (/64, which is also the typical IPv6 host), and
  * thousands of relays on the recommended IPv6 site size of /56.

The number of relays in an IPv6 block was previously unlimited, and sybils were only limited by the scarcity of IPv4 addresses.

We propose choosing a default value for AuthDirMaxServersPerIPv6Site by analysing the current IPv6 addresses on the tor network. Reasonable default values are likely in the range 4 to 50.

If tor ever allows IPv6-only relays, we should review the default value of AuthDirMaxServersPerIPv6Site.
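The prefix comparisons in sections 3.5.6 and 3.5.7 can be illustrated with a short sketch. It is not tor's C code: the function names below are hypothetical, the local-network check covers only the prefix comparison (not the internal-address check), and the sybil grouping assumes that the first relays listed in each group are kept and the rest are flagged.

```python
import ipaddress
from collections import defaultdict

LOCAL_PREFIX = {4: 24, 6: 48}  # section 3.5.6: same IPv4 /24 or IPv6 /48
SYBIL_PREFIX = {4: 32, 6: 64}  # section 3.5.7: same IPv4 address or IPv6 /64


def network_of(addr_text, prefix_by_family):
    """Return the network containing addr_text, at the family's prefix length."""
    addr = ipaddress.ip_address(addr_text)
    prefix = prefix_by_family[addr.version]
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)


def same_local_network(a, b):
    """Prefix part of a local-network check; different families never match."""
    net_a = network_of(a, LOCAL_PREFIX)
    net_b = network_of(b, LOCAL_PREFIX)
    return net_a.version == net_b.version and net_a == net_b


def possible_sybils(addresses, family, max_per_site):
    """Return addresses beyond max_per_site in each site, for one family."""
    groups = defaultdict(list)
    for text in addresses:
        if ipaddress.ip_address(text).version == family:
            groups[network_of(text, SYBIL_PREFIX)].append(text)
    excess = []
    for members in groups.values():
        excess.extend(members[max_per_site:])
    return excess


relays = ["2001:db8::1", "2001:db8::2", "2001:db8::3",
          "2001:db8:0:1::1", "192.0.2.1"]
print(same_local_network("2001:db8:1:aaaa::1", "2001:db8:1:bbbb::1"))
print(possible_sybils(relays, family=6, max_per_site=2))
```

Passing the address family as an argument mirrors the proposed change to get_possible_sybil_list(), so that IPv4 and IPv6 limits can be applied independently.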
Since these relay exclusions happen at voting time, they do not require a new consensus method. 3.5.8. Use a Local Interface Address on the Default Route We propose this optional change, to improve the accuracy of local interface IPv4 and IPv6 address detection (see section 3.2.3), on relays (and bridges). Directory authorities do not use this address detection method to discover their own addresses, for security reasons. Rewrite the get_interface_address*() functions to choose an interface address on the default route, or to sort default route addresses first in the list of addresses. (If the platform API allows us to find the default route.) For more information, see [Ticket 12377: Prefer default route when checking local interface addresses]. This change might not be necessary, because the directory header IP address method will find the IP address of the default route, in most cases (see section 3.2.5). 3.5.9. Add IPv6 Support via Other DNS APIs We propose these optional changes, to add IPv6 support to hostname resolution on older OSes. These changes affect: * the Address torrc option, when it is a hostname (see section 3.2.1), and * automatic hostname resolution (see section 3.2.4), on relays and bridges. Directory authorities do not use this address detection method to discover their own addresses, for security reasons. Tor currently uses getaddrinfo() on most systems, which supports IPv6 DNS. But tor also supports the legacy gethostbyname() DNS API, which does not support IPv6. There are two alternative APIs we could use for IPv6 DNS, if getaddrinfo() is not available: * libevent DNS API, and * gethostbyname2(). 
But this change may be unnecessary, because:
  * Linux has used getaddrinfo() by default since glibc 2.20 (2014),
  * macOS has recommended getaddrinfo() since before 2006,
  * since macOS adopts BSD changes, most BSDs would have switched to getaddrinfo() in a similar timeframe, and
  * Windows has supported getaddrinfo() since Windows Vista; tor's minimum supported Windows version is Vista.

See [Tor Supported Platforms] for more details.

If a large number of systems do not support getaddrinfo(), we propose implementing one of these alternatives:

The libevent DNS API supports IPv6 DNS, and tor already has a dependency on libevent. Therefore, we should prefer the libevent DNS API. (Unless we find it difficult to implement.)

We could also use gethostbyname2() to add IPv6 support to hostname resolution on older OSes, which don't support getaddrinfo().

Handling multiple addresses:

When looking up hostnames using libevent, the DNS callbacks provide a list of all addresses received. Therefore, we should ignore any private addresses, and then choose the first address in the list.

When looking up hostnames using gethostbyname() or gethostbyname2(), if the first address is a private address, we may want to look at the entire list of addresses. Some struct hostent versions (example: current macOS) also have an h_addr_list rather than h_addr. (They define h_addr as h_addr_list[0], for backwards compatibility.)

However, having private and public addresses resolving from the same hostname is a rare configuration, so we might not need to make this change.

(On OSes that support getaddrinfo(), tor searches the list of addresses for a publicly routable address.)

Alternative change: remove gethostbyname():

As an alternative, if we believe that all supported OSes have getaddrinfo(), we could simply remove the gethostbyname() code, rather than trying to modify it to work with IPv6.

Most relays can reliably discover their address using directory headers, if all other methods fail.
Or operators can set the Address torrc option to an IPv4 or IPv6 literal. 3.5.10. Change Relay OutboundBindAddress Defaults We propose this optional change, to improve the reliability of IP address-based filters in tor. These filters typically affect relays and directory authorities. But we propose that bridges and clients also make this change, for consistency. For example, the tor network treats relay IP addresses differently when: * resisting denial of service, and * selecting canonical, long-term connections. (See [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s] for the initial motivation for this change: resisting significant bandwidth load on directory authorities.) Now that tor knows its own addresses, we propose that relays (and bridges) set their IPv4 and IPv6 OutboundBindAddress to these discovered addresses, by default. If binding fails, tor should fall back to an unbound socket. Operators would still be able to set a custom IPv4 and IPv6 OutboundBindAddress, if needed. Currently, tor doesn't bind to a specific address, unless OutboundBindAddress is configured. So on relays with multiple IP addresses, the outbound address comes from the chosen route for each TCP connection or UDP packet (usually the default route). 3.5.11. IPv6 Address Privacy Extensions We propose this optional change, to improve the reliability of relays (and bridges) that use IPv6 address privacy extensions (see section 3.5 of [RFC 4941: Privacy Extensions for IPv6]). Directory authorities: * should not use IPv6 address privacy extensions, because their addresses need to stay the same over time, and * do not use address detection methods that would automatically select an IPv6 address with privacy extensions, for security reasons. We propose that tor should avoid using IPv6 addresses generated using privacy extensions, unless no other publicly routable addresses are available. 
In practice, each operating system has a different way of detecting IPv6 address privacy extensions. And some operating systems may not tell applications if a particular address is using privacy extensions. So implementing this change may be difficult. On operating systems that provide IPv6 address privacy extension state, IPv6 addresses may be: * "public" - these addresses do not change * "temporary" - these addresses change due to IPv6 privacy extensions. Therefore, tor should prefer "public" IPv6 addresses, when they are available. However, even if we do not make this change, tor should be compatible with the RFC 4941 defaults: * a new IPv6 address is generated each day * deprecated addresses are removed after one week * temporary addresses should be disabled, unless an application opts in to using them (See sections 3.5 and 3.6 of [RFC 4941: Privacy Extensions for IPv6].) In particular, it can take up to 4.5 hours for a client to receive a new address for a relay. Here are the maximum times: * 30 minutes for directory authorities to do reachability checks (see TestingAuthDirTimeToLearnReachability in the [Tor Manual Page]). * 1 hour for a reachable relay to be included in a vote * 10 minutes for votes to be turned into a consensus * 2 hours and 50 minutes for clients (See the [Tor Directory Protocol], sections 1.4 and 5.1, and the corresponding Directory Authority options in the [Tor Manual Page].) But 4.5 hours is much less than 1 week, and even significantly less than 1 day. So clients and relays should be compatible with the IPv6 privacy extensions defaults, even if they are used for all applications. However, bandwidth authorities may reset a relay's bandwidth when its IPv6 address changes. (The tor network currently uses torflow and sbws as bandwidth authorities, neither implementation resets bandwidth when IPv6 addresses change.) 
Since bandwidth authorities only scan the whole tor network about once a day, resetting a relay's bandwidth causes a huge penalty. Therefore, we propose that sbws should not reset relay bandwidths when IPv6 addresses change. (See [Ticket 28725: Reset relay bandwidths when their IPv6 address changes].) 3.5.12. Quick Extends After Relay Restarts We propose this optional change, to reduce client circuit failures, after a relay restarts. We propose that relays (and bridges) should open their ORPorts, and support client extends, as soon as possible after they start up. (Clients may already have the relay's addresses from a previous consensus.) Automatically enabling an IPv6 ORPort creates a race condition with IPv6 extends (see section 3.2.7 of this proposal, and [Proposal 311: Relay IPv6 Reachability]). This race condition has the most impact when: 1. a relay has outbound IPv6 connectivity, 2. the relay detects a publicly routable IPv6 address, 3. the relay opens an IPv6 ORPort, 4. but the IPv6 ORPort is not reachable. Between steps 3 and 4, the relay could successfully extend over IPv6, even though its IPv6 ORPort is unreachable. However, we expect this case to be rare. A far more common case is that a working relay has just restarted, and clients still have its addresses, therefore they continue to try to extend through it. If the relay refused to extend, all these clients would have to retry their circuits. To support this case, tor relays should open IPv4 and IPv6 ORPorts, and perform extends, as soon as they can after startup. Relays can extend to other relays, as soon as they have validated the directory documents containing other relays' public keys. In particular, relays which automatically detect their IPv6 address, should support IPv6 extends as soon as they detect an IPv6 address. (Relays may also attempt to bind to all IPv6 addresses on all interfaces. 
If that bind is successful, they may choose to extend over IPv6, even before they know their own IPv6 address.) Relays should not wait for reachable IPv4 or IPv6 ORPorts before they start performing client extends. DirPort requests are less critical, because relays and clients will retry directory fetches using multiple mirrors. However, DirPorts may also open as early as possible, for consistency. (And for simpler code.) Tor's existing code handles this use case, so the code changes required to support IPv6 may be quite small. But we should still test this use case for clients connecting over IPv4 and IPv6, and extending over IPv4 and IPv6. Directory authorities do not rely on their own reachability checks, so they should be able to perform extends (and serve cached directory documents) shortly after startup. 3.5.13. Using Authority Addresses for Socket-Based Address Detection We propose this optional change, to avoid issues with firewalls during relay (and bridge) address detection. (And to reduce user confusion about firewall notifications which show a strange IP address, particularly on clients.) Directory authorities do not use a UDP socket to discover their own addresses, for security reasons. Therefore, we are free to use any directory address for this check, without the risk of a directory authority making a UDP socket to itself, and discovering its own private address. We propose that tor should use a directory authority IPv4 and IPv6 address, for any sockets that it opens to detect local interface addresses (see section 3.2.3). We propose that this change is applied regardless of the role of the current tor instance (relay, bridge, directory authority, or client). Tor currently uses the arbitrary IP addresses 18.0.0.1 and [2002::], which may be blocked by firewalls. These addresses may also cause user confusion, when they appear in logs or notifications. The relevant function is get_interface_address6_via_udp_socket_hack() in lib/net. 
The hard-coded addresses are in app/config. Directly using these addresses would break tor's module layering rules, so we propose: * copying one directory authority's hard-coded IPv4 and IPv6 addresses to an ADDRESS_PRIVATE macro or variable in lib/net/address.h * writing a unit test that makes sure that the address used by get_interface_address6_via_udp_socket_hack() is still in the list of hard-coded directory authority addresses. When we choose the directory authority, we should avoid using a directory authority that has different hard-coded and advertised IP addresses. (To avoid user confusion.) 4. Directory Protocol Specification Changes We propose explicitly supporting IPv6 X-Your-Address-Is HTTP headers in the tor directory protocol. We propose the following changes to the [Tor Directory Protocol] specification, in section 6.1: Servers MAY include an X-Your-Address-Is: header, whose value is the apparent IPv4 or IPv6 address of the client connecting to them. IPv6 addresses SHOULD/MAY (TODO) be formatted enclosed in square brackets. TODO: require brackets? What does Tor currently do? For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD report the IP from which the circuit carrying the BEGIN_DIR stream reached them. Servers SHOULD disable caching of multiple network statuses or multiple server descriptors. Servers MAY enable caching of single descriptors, single network statuses, the list of all server descriptors, a v1 directory, or a v1 running routers document, with appropriate expiry times (around 30 minutes). Servers SHOULD disable caching of X-Your-Address-Is headers. 5. Test Plan We provide a quick summary of our testing plans. 5.1. Testing Relay IPv6 Addresses Discovery We propose to test these changes using chutney networks. However, chutney creates a limited number of configurations, so we also need to test these changes with relay operators on the public network. 
Therefore, we propose to test these changes on the public network with a small number of relays and bridges. Once these changes are merged, volunteer relay and bridge operators will be able to test them by: * compiling from source, * running nightly builds, or * running alpha releases. 5.2. Test Existing Features We will modify and test these existing features: * Find Relay IPv4 Addresses We do not plan on modifying these existing features: * relay address retries * existing warning logs But we will test that they continue to function correctly, and fix any bugs triggered by the modifications in this proposal. 6. Ongoing Monitoring To monitor the impact of these changes: * relays should collect basic IPv6 connection statistics, and * relays and bridges should collect basic IPv6 bandwidth statistics. (See [Proposal 313: Relay IPv6 Statistics]). Some of these statistics may be included in tor's heartbeat logs, making them accessible to relay operators. We do not propose to collect additional statistics on: * circuit counts, or * failure rates. Collecting statistics like these could impact user privacy. We also plan to write a script to calculate the number of IPv6 relays in the consensus. This script will help us monitor the network during the deployment of these new IPv6 features. 7. Changes to Other Proposals [Proposal 306: Client Auto IPv6 Connections] needs to be modified to keep bridge IPv6 behaviour in sync with client IPv6 behaviour. (See section 3.2.5.) 
References: [getaddrinfo man page]: See the quoted section in: https://stackoverflow.com/a/42351676 [Proposal 306: Client Auto IPv6 Connections]: One possible design for automatic client IPv4 and IPv6 connections is at https://gitweb.torproject.org/torspec.git/tree/proposals/306-ipv6-happy-eyeballs.txt (TODO: modify to include bridge changes with client changes) [Proposal 311: Relay IPv6 Reachability]: https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt [Proposal 313: Relay IPv6 Statistics]: https://gitweb.torproject.org/torspec.git/tree/proposals/313-relay-ipv6-stats.txt [RFC 4941: Privacy Extensions for IPv6]: https://tools.ietf.org/html/rfc4941 Or the older RFC 3041: https://tools.ietf.org/html/rfc3041 [RFC 6177: IPv6 End Site Address Assignment]: https://tools.ietf.org/html/rfc6177#page-7 [Ticket 12377: Prefer default route when checking local interface addresses]: https://trac.torproject.org/projects/tor/ticket/12377 [Ticket 28725: Reset relay bandwidths when their IPv6 address changes]: https://trac.torproject.org/projects/tor/ticket/29725#comment:3 [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s]: https://trac.torproject.org/projects/tor/ticket/33018 [Tor Directory Protocol]: (version 3) https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt [Tor Manual Page]: https://2019.www.torproject.org/docs/tor-manual.html.en [Tor Specification]: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt [Tor Supported Platforms]: https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/SupportedPlatforms#OSSupportlevels
Filename: 313-relay-ipv6-stats.txt Title: Tor Relay IPv6 Statistics Author: teor, Karsten Loesing, Nick Mathewson Created: 10-February-2020 Status: Accepted Ticket: #33159 0. Abstract We propose that: * tor relays should collect statistics on IPv6 connections, and * tor relays and bridges should collect statistics on consumed bandwidth. Like tor's existing connection and consumed bandwidth statistics, these new IPv6 statistics will be published in each relay's extra-info descriptor. We also plan to write a script that shows the number of relays in the consensus that support: * IPv6 extends, and * IPv6 client connections. This script will be used for medium-term monitoring, during the deployment of tor's IPv6 changes in 2020. (See [Proposal 311: Relay IPv6 Reachability] and [Proposal 312: Relay Auto IPv6 Address].) 1. Introduction Tor relays (and bridges) can accept IPv6 client connections via their ORPort. But current versions of tor need to have an explicitly configured IPv6 address (see [Proposal 312: Relay Auto IPv6 Address]), and they don't perform IPv6 reachability self-checks (see [Proposal 311: Relay IPv6 Reachability]). As we implement these new IPv6 features in tor, we want to monitor their impact on the IPv6 connections and bandwidth in the tor network. Tor developers also need to know how many relays support these new IPv6 features, so they can test tor's IPv6 reachability checks. (In particular, see section 4.3.1 in [Proposal 311: Relay IPv6 Reachability]: Refusing to Publish the Descriptor.) 2. Scope This proposal modifies Tor's behaviour as follows: Relays, bridges, and directory authorities collect statistics on: * IPv6 connections, and * IPv6 consumed bandwidth. The design of these statistics will be based on tor's existing connection and consumed bandwidth statistics. Tor's existing consumed bandwidth statistics truncate their totals to the nearest kilobyte. The existing connection statistics do not perform any binning. 
We do not propose to add any extra noise or binning to these statistics. Instead, we expect to leave these changes until we have a consistent privacy-preserving statistics framework for tor. As an example of this kind of framework, see [Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)]. We avoid: * splitting connection statistics into clients and relays, and * collecting circuit statistics. These statistics are more sensitive, so we want to implement privacy-preserving statistics, before we consider adding them. Throughout this proposal, "relays" includes directory authorities, except where they are specifically excluded. "relays" does not include bridges, except where they are specifically included. (The first mention of "relays" in each section should specifically exclude or include these other roles.) Tor clients do not collect any statistics for public reporting. Therefore, clients are out of scope in this proposal. When this proposal describes Tor's current behaviour, it covers all supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except where another version is specifically mentioned. This proposal also includes a medium-term monitoring script, which calculates the number of relays in the consensus that support IPv6 extends, and IPv6 client connections. 3. Monitoring IPv6 Relays in the Consensus We propose writing a script that calculates: * the number of relays, and * the consensus weight fraction of relays, in the consensus that: * have an IPv6 ORPort, * support IPv6 reachability checks, * support IPv6 clients, and * support IPv6 reachability checks, and IPv6 clients. In order to provide easy access to these statistics, we propose that the script should: * download a consensus (or read an existing consensus), and * calculate and report these statistics.
The following consensus weight fractions should divide by the total consensus weight: * have an IPv6 ORPort (all relays have an IPv4 ORPort), and * support IPv6 reachability checks (all relays support IPv4 reachability). The following consensus weight fractions should divide by the "usable Guard" consensus weight: * support IPv6 clients, and * support IPv6 reachability checks and IPv6 clients. "Usable Guards" have the Guard flag, but do not have the Exit flag. If the Guard also has the BadExit flag, the Exit flag should be ignored. Note that this definition of "Usable Guards" is only valid when the consensus contains many more guards than exits. That is, Wgd must be 0 in the consensus. (See the [Tor Directory Protocol] for more details.) Therefore, the script should check that Wgd is 0. If it is not, the script should log a warning about the accuracy of the "Usable Guard" statistics. 4. Collecting IPv6 Consumed Bandwidth Statistics We propose that relays (and bridges) collect IPv6 consumed bandwidth statistics. To minimise development and testing effort, we propose re-using the existing "bw_array" code in rephist.c. In particular, tor currently counts these bandwidth statistics: * read, * write, * dir_read, and * dir_write. We propose adding the following bandwidth statistics: * ipv6_read, and * ipv6_write. (The IPv4 statistics can be calculated by subtracting the IPv6 statistics from the existing total consumed bandwidth statistics.) We believe that collecting IPv6 consumed bandwidth statistics is about as safe as the existing IPv4+IPv6 total consumed bandwidth statistics. See also section 7.5, which adds a BandwidthStatistics torrc option and consensus parameter. BandwidthStatistics is an optional change. 5. Collecting IPv6 Connection Statistics We propose that relays (but not bridges) collect IPv6 connection statistics. Bridges refuse to collect the existing ConnDirectionStatistics, so we do not believe it is safe to collect the smaller IPv6 totals on bridges. 
To minimise development and testing effort, we propose re-using the existing "bidi" code in rephist.c. (This code may require some refactoring, because the "bidi" totals are globals, rather than a struct.) In particular, tor currently counts these connection statistics: * below threshold, * mostly read, * mostly written, and * both read and written. We propose adding IPv6 variants of all these statistics. (The IPv4 statistics can be calculated by subtracting the IPv6 statistics from the existing total connection statistics.) See also section 7.6, which adds a ConnDirectionStatistics consensus parameter. This consensus parameter is an optional change. 6. Directory Protocol Specification Changes We propose adding IPv6 variants of the consumed bandwidth and connection direction statistics to the tor directory protocol. We propose the following additions to the [Tor Directory Protocol] specification, in section 2.1.2. Each addition should be inserted below the existing consumed bandwidth and connection direction specifications. "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL [At most once] "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL [At most once] Declare how much bandwidth the OR has used recently, on IPv6 connections. See "read-history" and "write-history" for more details. (The migration notes do not apply to IPv6.) "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL [At most once] Number of IPv6 connections, that are used uni-directionally or bi-directionally. See "conn-bi-direct" for more details. We also propose the following replacement, in the same section: "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL [At most once] "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL [At most once] Declare how much bandwidth the OR has spent on answering directory requests. See "read-history" and "write-history" for more details.
(The migration notes do not apply to dirreq.) This replacement is optional, but it may avoid the 3 *read-history definitions getting out of sync. 7. Optional Changes We propose some optional changes to help relay operators, tor developers, and tor's network health. We also expect that these changes will drive IPv6 relay adoption. Some of these changes may be more appropriate as future work, or along with other proposed features. 7.1. Log IPv6 Statistics in Tor's Heartbeat Logs We propose this optional change, so relay operators can see their own IPv6 statistics: We propose that tor logs its IPv6 consumed bandwidth and connection statistics in its regular "heartbeat" logs. These heartbeat statistics should be collected over the lifetime of the tor process, rather than using the state file, like the statistics in sections 4 and 5. Tor's existing heartbeat logs already show its consumed bandwidth and connections (in the link protocol counts). We may also want to show IPv6 consumed bandwidth and connections as a proportion of the total consumed bandwidth and connections. These statistics only show a relay's local bandwidth usage, so they can't be used for reporting. 7.2. Show IPv6 Relay Counts on Consensus Health The [Consensus Health] website displays a wide range of tor statistics, based on the most recent consensus. We propose this optional change, to: * help tor developers improve IPv6 support on the tor network, * help diagnose issues with IPv6 on the tor network, and * drive IPv6 adoption on tor relays. Consensus Health adds an IPv6 section, with relays in the consensus that: * have an IPv6 ORPort, and * support IPv6 reachability checks. The definitions of these statistics are in section 3. These changes can be tested using the script proposed in section 3. 7.3. Add an IPv6 Reachability Pseudo-Flag on Relay Search The [Relay Search] website displays tor relay information, based on the current consensus and relay descriptors.
We propose this optional change, to: * help relay operators diagnose issues with IPv6 on their relays, and * drive IPv6 adoption on tor relays. Relay Search adds a pseudo-flag for relay IPv6 reachability support. This pseudo-flag should be given to relays that have: * a reachable IPv6 ORPort (in the consensus), and * support tor subprotocol version "Relay=3" (or later). See [Proposal 311: Relay IPv6 Reachability] for details. TODO: Is this a useful change? Are there better ways of driving IPv6 adoption? 7.4. Add IPv6 Connections and Consumed Bandwidth Graphs to Tor Metrics The [Tor Metrics: Traffic] website displays connection and bandwidth information for the tor network, based on relay extra-info descriptors. We propose these optional changes, to: * help tor developers improve IPv6 support on the tor network, * help diagnose issues with IPv6 on the tor network, and * drive IPv6 adoption on tor relays. Tor Metrics adds the following information to the graphs on the Traffic page: Consumed Bandwidth by IP version * added to the existing [Tor Metrics: Advertised bandwidth by IP version] page * as a stacked graph, like [Tor Metrics: Advertised and consumed bandwidth by relay flags] Fraction of connections used uni-/bidirectionally by IP version * added to the existing [Tor Metrics: Fraction of connections used uni-/bidirectionally] page * as a stacked graph, like [Tor Metrics: Advertised and consumed bandwidth by relay flags] 7.5. Add a BandwidthStatistics option We propose adding a new BandwidthStatistics torrc option and consensus parameter, which activates reporting of all these statistics. Currently, the existing statistics are controlled by ExtraInfoStatistics, but we propose using the new BandwidthStatistics option for them as well. The default value of this option should be "auto", which checks the consensus parameter. If there is no consensus parameter, the default should be 1. (The existing bandwidth statistics are reported by default.) 7.6. 
Add a ConnDirectionStatistics consensus parameter We propose using the existing ConnDirectionStatistics torrc option, and adding a consensus parameter with the same name. This option will control the new and existing connection statistics. The default value of this option should be "auto", which checks the consensus parameter. If there is no consensus parameter, the default should be 0. Bridges refuse to collect the existing ConnDirectionStatistics, so we do not believe it is safe to collect the smaller IPv6 totals on bridges. The new consensus parameter should also be ignored on bridges. If we implement the ConnDirectionStatistics consensus parameter, we can set the consensus parameter to 1 for a week or two, so we can collect these statistics. 8. Test Plan We provide a quick summary of our testing plans. 8.1. Testing IPv6 Relay Consensus Calculations We propose to test the IPv6 Relay consensus script using chutney networks. However, chutney creates a limited number of relays, so we also need to test these changes on consensuses from the public tor network. Some of these calculations are similar to the calculations that tor will do, to find out if IPv6 reachability checks are reliable. So we may be able to check the script against tor's reachability logs. (See section 4.3.1 in [Proposal 311: Relay IPv6 Reachability]: Refusing to Publish the Descriptor.) The Tor Metrics team may also independently check these calculations. Once the script is completed, its output will be monitored by tor developers, as more volunteer relay operators deploy the relevant tor versions. (And as the number of IPv6 relays in the consensus increases.) 8.2. Testing IPv6 Extra-Info Statistics We propose to test the connection and consumed bandwidth statistics using chutney networks. However, chutney runs for a short amount of time, and creates a limited amount of traffic, so we also need to test these changes on the public tor network. 
In particular, we have struggled to test statistics using chutney, because tor's hard-coded statistics period is 24 hours. (And most chutney networks run for under 1 minute.) Therefore, we propose to test these changes on the public network with a small number of relays and bridges. During 2020, the Tor Metrics team will analyse these statistics on the public tor network, and provide IPv6 progress reports. We expect that we may discover some bugs during the first analysis. Once these changes are merged, they will be monitored by tor developers, as more volunteer relay operators deploy the relevant tor versions. (And as the number of IPv6 relays in the consensus increases.) References: [Consensus Health]: https://consensus-health.torproject.org/ [Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)]: https://gitweb.torproject.org/torspec.git/tree/proposals/288-privcount-with-shamir.txt [Proposal 311: Relay IPv6 Reachability]: https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt [Proposal 312: Relay Auto IPv6 Address]: https://gitweb.torproject.org/torspec.git/tree/proposals/312-relay-auto-ipv6-addr.txt [Relay Search]: https://metrics.torproject.org/rs.html [Tor Directory Protocol]: (version 3) https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt [Tor Manual Page]: https://2019.www.torproject.org/docs/tor-manual.html.en [Tor Metrics: Advertised and consumed bandwidth by relay flags]: https://metrics.torproject.org/bandwidth-flags.html [Tor Metrics: Advertised bandwidth by IP version]: https://metrics.torproject.org/advbw-ipv6.html [Tor Metrics: Fraction of connections used uni-/bidirectionally]: https://metrics.torproject.org/connbidirect.html [Tor Metrics: Traffic]: https://metrics.torproject.org/bandwidth-flags.html [Tor Specification]: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt
Filename: 314-allow-markdown-proposals.md Title: Allow Markdown for proposal format. Author: Nick Mathewson Created: 23 April 2020 Status: Closed

Introduction

This document proposes a change in our proposal format: to allow Markdown.

Motivation

Many people, particularly researchers, have found it difficult to write text in the format that we prefer. Moreover, we have often wanted to add more formatting in proposals, and found it nontrivial to do so.

Markdown is an emerging "standard" (albeit not actually a standardized one), and we're using it in several other places. It seems like a natural fit for our purposes here.

Details

We should pick a particular Markdown dialect. "CommonMark" seems like a good choice, since it's the basis of what github and gitlab use.

We should also pick a particular tool to use for validating Markdown proposals.

We should continue to allow text proposals.

We should continue to require headers for our proposals, and do so using the format at the head of this document: wrapping the headers inside triple backticks.

Filename: 315-update-dir-required-fields.txt Title: Updating the list of fields required in directory documents Author: Nick Mathewson Created: 23 April 2020 Status: Closed Implemented-In: 0.4.5.1-alpha Notes: The "hidden-service-dir" field was not made assumed-present; all other fields were updated. 1. Introduction When we add a new field to a directory document, we must at first describe it as "optional", since older Tor implementations will not generate it. When those implementations are obsolete and unsupported, however, we can safely describe those fields as "required", since they are always included in practice. Making fields required is not just a matter of bookkeeping: it helps prevent bugs in two ways. First, it simplifies our code. Second, it makes our code's requirements match our assumptions about the network. Here I'll describe a general policy for making fields required when LTS versions become unsupported, and include a list of fields that should become required today. This document does not require us to make all optional fields required -- only those which we intend that all Tor instances should always generate and expect. When we speak of making a field "required", we are talking about describing it as "required" in dir-spec.txt, so that any document missing that field is no longer considered well-formed. 2. When fields should become required We have four relevant kinds of directory documents: those generated by public relays, those generated by bridges, those generated by authorities, and those generated by onion services. Relays generate extrainfo documents and routerdesc documents. For these, we can safely make a field required when it is always generated by all relay versions that the authorities allow to join the network. To avoid partitioning, authorities should start requiring the field before any relays or clients do.
(If a relay field indicates the presence of a now-required feature, then instead of making the field mandatory, we may change the semantics so that the field is assumed to be present. Later we can remove the option.) Bridge relays have their descriptors processed by clients without necessarily passing through authorities. We can make fields mandatory in bridge descriptors once we can be confident that no bridge lacking them will actually connect to the network-- or that all such bridges are safe to stop using. For bridges, when a field becomes required, it will take some time before all clients require that field. This would create a partitioning opportunity, but partitioning at the first-hop position is not so strong: the bridge already knows the client's IP, which is a much better identifier than the client's Tor version. Authorities generate authority certificates, votes, consensus documents, and microdescriptors. For these, we can safely make a field required once all authorities are generating it, and we are confident that we do not plan to downgrade those authorities. Onion services generate service descriptors. Because of the risk of partitioning attacks, we should not make features in service descriptors required without a phased process, described in the following section. 2.1. Phased addition of onion service descriptor changes Phase one: we add client and service support for the new field, but have this support disabled by default. By default, services should not generate the new field, and clients should not parse it when it is present. This behavior is controlled by a pair of network parameters. (If the feature is at all complex, the network parameters should describe a _minimum version_ that should enable the feature, so that we can later enable it only in the versions where the feature is not buggy.) During this phase, we can manually override the defaults on particular clients and services to test the new field. 
Phase two: authorities use the network parameters to enable the client support and the service support. They should only do this once enough clients and services have upgraded to a version that supports the feature. Phase three: once all versions that support the feature are obsolete and unsupported, the feature may be marked as required in the specifications, and the network parameters ignored. Phase four: once all versions that used the network parameters are obsolete and unsupported, authorities may stop including those parameters in their votes. 3. Directory fields that should become required. These fields in router descriptors should become required: * identity-ed25519 * master-key-ed25519 * onion-key-crosscert * ntor-onion-key * ntor-onion-key-crosscert * router-sig-ed25519 * proto These fields in router descriptors should become "assumed present": * hidden-service-dir These fields in extra-info documents should become required: * identity-ed25519 * router-sig-ed25519 The following fields in microdescriptors should become required: * ntor-onion-key The following fields in votes and consensus documents should become required: * pr
Filename: 316-flashflow.md Title: FlashFlow: A Secure Speed Test for Tor (Parent Proposal) Author: Matthew Traudt, Aaron Johnson, Rob Jansen, Mike Perry Created: 23 April 2020 Status: Draft

1. Introduction

FlashFlow is a new distributed bandwidth measurement system for Tor that consists of a single authority node ("coordinator") instructing one or more measurement nodes ("measurers") when and how to measure Tor relays. A measurement consists of the following steps:

  1. The measurement nodes demonstrate to the target relay permission to perform measurements.
  2. The measurement nodes open many TCP connections to the target relay and create a one-hop circuit to the target relay on each one.
  3. For 30 seconds the measurement nodes send measurement cells to the target relay and verify that the cells echoed back match the ones sent. During this time the relay caps the amount of background traffic it transfers. Background and measurement traffic are handled separately at the relay. Measurement traffic counts towards all the standard existing relay statistics.
  4. For every second during the measurement, the measurement nodes report to the authority node how much traffic was echoed back. The target relay also reports the amount of per-second background (non-measurement) traffic.
  5. The authority node sums the per-second reported throughputs into 30 sums (one for each second) and calculates the median. This is the estimated capacity of the relay.
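The aggregation in step 5 can be sketched as follows. This is only an illustration: the function and argument names are hypothetical, and adding the relay's reported background traffic to the measurers' totals is an assumption of the sketch (any checks or clamping a real coordinator applies to background reports are not shown).

```python
from statistics import median


def estimate_capacity(measurer_reports, background_report=None):
    """Return the median of the per-second aggregate throughputs.

    measurer_reports: one list per measurer, holding the amount of traffic
        echoed back in each second of the measurement.
    background_report: optional per-second background traffic reported by
        the relay (assumption: simply added to the per-second totals).
    """
    seconds = len(measurer_reports[0])
    if any(len(report) != seconds for report in measurer_reports):
        raise ValueError("all measurers must report the same number of seconds")

    # One sum per second, across all measurers.
    totals = [sum(report[i] for report in measurer_reports) for i in range(seconds)]

    if background_report is not None:
        if len(background_report) != seconds:
            raise ValueError("background report has the wrong length")
        totals = [t + b for t, b in zip(totals, background_report)]

    # The median of the per-second sums is the capacity estimate.
    return median(totals)
```

For a 30 second measurement each inner list would hold 30 values, giving 30 per-second sums.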

FlashFlow performs a measurement of every relay according to a schedule described later in this document. Periodically it produces relay capacity estimates in the form of a v3bw file, which is suitable for direct consumption by a Tor directory authority. Alternatively an existing load balancing system such as Simple Bandwidth Scanner could be modified to use FlashFlow's v3bw file as input.

It is envisioned that each directory authority that wants to use FlashFlow will run their own FlashFlow deployment consisting of a coordinator that they run and one or more measurers that they trust (e.g. because they run them themselves), similar to how each runs their own Torflow/sbws. Section 5 of this proposal describes long term plans involving multiple FlashFlow deployments. FlashFlow coordinators do not need to communicate with each other.

FlashFlow is more performant than Torflow: FlashFlow takes 5 hours to measure the entire existing Tor network from scratch (with 3 Gbit/s measurer capacity) while Torflow takes 2 days; FlashFlow measures relays it hasn't seen recently as soon as it learns about them (i.e. every new consensus) while Torflow can take a day or more; and FlashFlow accurately measures new high-capacity relays the first time and every time while Torflow takes days/weeks to assign them their full fair share of bandwidth (especially for non-exits). FlashFlow is more secure than Torflow: FlashFlow allows a relay to inflate its measured capacity by up to 1.33x (configured by a parameter) while Torflow allows weight inflation by a factor of 89x [0] or even 177x [1].

After an overview in section 2 of the planned deployment stages, sections 3, 4, and 5 discuss the short, medium, and long term deployment plans in more detail.

2. Deployment Stages

FlashFlow's deployment shall be broken up into three stages.

In the short term we will implement a working FlashFlow measurement system. This requires code changes in little-t tor and an external FlashFlow codebase. The majority of the implementation work will be done in the short term, and the product is a complete FlashFlow measurement system. Remaining pieces (e.g. better authentication) are added later for enhanced security and network performance.

In the medium term we will begin collecting data with a FlashFlow deployment. The intermediate results and v3bw files produced will be made available (semi?) publicly for study.

In the long term experiments will be performed to study ways of using FF v3bw files to improve load balancing. Two examples: (1) using FF v3bw files instead of sbws's (and eventually phasing out torflow/sbws), and (2) continuing to run sbws but use FF's results as a better estimate of relay capacity than observed bandwidth. Authentication and other FlashFlow features necessary to make it completely ready for full production deployment will be worked on during this long term phase.

3. FlashFlow measurement system: Short term

The core measurement mechanics will be implemented in little-t tor, but a separate codebase for the FlashFlow side of the measurement system will also be created. This section is divided into three parts: first a discussion of changes/additions that logically reside entirely within tor (essentially: relay-side modifications), second a discussion of the separate FlashFlow code that also requires some amount of tor changes (essentially: measurer-side and coordinator-side modifications), and third a security discussion.

3.1 Little-T Tor Components

The primary additions/changes that entirely reside within tor on the relay side:

  • New torrc options/consensus parameters.
  • New cell commands.
  • Pre-measurement handshaking (with a simplified authentication scheme).
  • Measurement mode, during which the relay will echo traffic with measurers, set a cap on the amount of background traffic it transfers, and report the amount of transferred background traffic.

3.1.1 Parameters

FlashFlow will require some consensus parameters/torrc options. Each has some default value if nothing is specified; the consensus parameter overrides this default value; the torrc option overrides both.
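The precedence rule above (built-in default, then consensus parameter, then torrc option) can be sketched as a small helper. The function name is hypothetical and `None` is used here, as an assumption, to mean "not specified".

```python
def resolve_param(default, consensus=None, torrc=None):
    """Pick the effective value of a FlashFlow parameter.

    The torrc option wins over the consensus parameter, which in turn
    wins over the built-in default. None means "not specified".
    """
    if torrc is not None:
        return torrc
    if consensus is not None:
        return consensus
    return default
```

For example, `resolve_param(0, consensus=1, torrc=0)` yields 0: the operator's torrc setting overrides the consensus.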

FFMeasurementsAllowed: A global toggle on whether or not to allow measurements. Even if all other settings would allow a measurement, if this is turned off, then no measurement is allowed. Possible values: 0, 1. Default: 0 (disallowed).

FFAllowedCoordinators: The list of coordinator TLS certificate fingerprints that are allowed to start measurements. Relays check their torrc when they receive a connection from a FlashFlow coordinator to see if it's on the list. If they have no list, they check the consensus parameter. If neither exists, then no FlashFlow deployment is allowed to measure this relay. Default: empty list.

FFMeasurementPeriod: A relay should expect, on average, to be measured by each FlashFlow deployment once each measurement period. A relay will not allow itself to be measured more than twice by a FlashFlow deployment in any time window of this length. Relays should not change this option unless they really know what they're doing. Changing it at the relay will not change how often FlashFlow will attempt to measure the relay. Possible values are in the range [1 hour, 1 month] inclusive. Default: 1 day.
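One way a relay could enforce the "no more than twice per period" rule is a sliding window of accepted measurement times per deployment. The sketch below is an assumption about how this might be done, not a description of tor's code; the class name and the treatment of the exact window boundary are illustrative choices.

```python
from collections import deque


class MeasurementLimiter:
    """Allow at most `max_per_period` measurements per deployment in any
    window of `period_seconds` (sliding-window check)."""

    def __init__(self, period_seconds=24 * 60 * 60, max_per_period=2):
        self.period = period_seconds
        self.max_per_period = max_per_period
        self.accepted = {}  # deployment id -> deque of accepted timestamps

    def allow(self, deployment, now):
        times = self.accepted.setdefault(deployment, deque())
        # Forget measurements that have fallen out of the window.
        while times and now - times[0] >= self.period:
            times.popleft()
        if len(times) >= self.max_per_period:
            return False
        times.append(now)
        return True
```

Each FlashFlow deployment is tracked separately, so a refusal for one coordinator does not affect another.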

FFBackgroundTrafficPercent: The maximum amount of regular non-measurement traffic a relay should handle while being measured, as a percent of total traffic (measurement + non-measurement). This parameter is a trade off between having to limit background traffic and limiting how much a relay can inflate its result by handling no background traffic but reporting that it has done so. Possible values are in the range [0, 99] inclusive. Default: 25 (a maximum inflation factor of 1.33).
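The relationship between this parameter and the maximum inflation factor can be checked with a few lines of arithmetic. If a relay handles only measurement traffic but claims that background traffic made up the full allowed share p percent of the total, its reported total is 100/(100 - p) times its real throughput. The helper name below is hypothetical.

```python
def max_inflation_factor(background_percent):
    """Upper bound on capacity inflation for a given FFBackgroundTrafficPercent.

    A relay that really carries only measurement traffic M, but reports the
    maximum allowed background share, appears to carry M * 100 / (100 - p).
    """
    if not 0 <= background_percent <= 99:
        raise ValueError("background_percent must be in [0, 99]")
    return 100 / (100 - background_percent)
```

With the default of 25 this gives 100/75, or about 1.33, matching the figure quoted above.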

FFMaxMeasurementDuration: The maximum amount of time, in seconds, that is allowed to pass between the moment the relay is notified that a measurement will begin soon and the end of the measurement. If this amount of time passes, the relay shall close all measurement connections and exit its measurement mode. Note that this duration includes handshake time, so it is necessarily larger than the expected actual measurement duration. Possible values are in the range [10, 120] inclusive. Default: 45.

3.1.2 New Cell Types

FlashFlow will introduce a new cell command MEASUREMENT.

The payload of each MEASUREMENT cell consists of:

- Measure command [1 byte]
- Data [varied]

The measure commands are:

- 0 -- MEAS_PARAMS [forward]
- 1 -- MEAS_PARAMS_OK [backward]
- 2 -- MEAS_BG [backward]
- 3 -- MEAS_ERR [forward and backward]

Forward cells are sent from the measurer/coordinator to the relay. Backward cells are sent from the relay to the measurer/coordinator.
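As a rough illustration of the MEASUREMENT payload layout and the direction rules, the sketch below encodes and decodes the one-byte measure command followed by data. The 509-byte payload length is the PAYLOAD_LEN value used elsewhere in this proposal; padding with zero bytes and the helper names are assumptions made for the example.

```python
from enum import IntEnum

PAYLOAD_LEN = 509  # cell payload length, as used elsewhere in this proposal


class MeasureCommand(IntEnum):
    MEAS_PARAMS = 0
    MEAS_PARAMS_OK = 1
    MEAS_BG = 2
    MEAS_ERR = 3


# Which commands may travel in which direction.
FORWARD = {MeasureCommand.MEAS_PARAMS, MeasureCommand.MEAS_ERR}
BACKWARD = {
    MeasureCommand.MEAS_PARAMS_OK,
    MeasureCommand.MEAS_BG,
    MeasureCommand.MEAS_ERR,
}


def encode_payload(command, data=b""):
    """Build a MEASUREMENT payload: 1 command byte, data, zero padding."""
    body = bytes([command]) + data
    if len(body) > PAYLOAD_LEN:
        raise ValueError("data too long for one cell")
    return body.ljust(PAYLOAD_LEN, b"\x00")  # zero padding is an assumption


def decode_payload(payload):
    """Split a MEASUREMENT payload into (command, remaining bytes)."""
    if len(payload) != PAYLOAD_LEN:
        raise ValueError("unexpected payload length")
    return MeasureCommand(payload[0]), payload[1:]
```

A relay could, for instance, reject any received cell whose command is not in `FORWARD`.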

MEAS_PARAMS and MEAS_PARAMS_OK are used during the pre-measurement stage to tell the target what to expect and for the relay to positively acknowledge the message. The target sends a MEAS_BG cell once per second to report the amount of background traffic it is handling. MEAS_ERR cells are used to signal to the other party that there has been some sort of problem and that the measurement should be aborted. These measure commands are described in more detail in the next section.

FlashFlow also introduces a new relay command, MEAS_ECHO. Relay cells with this relay command are the measurement traffic. The measurer generates and encrypts them and sends them to the target; the target decrypts them, then sends them back. A variation where the measurer skips encryption of MEAS_ECHO cells in most cases is described in Appendix A, and was found to be necessary in paper prototypes to save CPU load at the measurer.

MEASUREMENT cells, on the other hand, are not encrypted (beyond the regular TLS on the connection).

3.1.3 Pre-Measurement Handshaking/Starting a Measurement

The coordinator establishes a one-hop circuit with the target relay and sends it a MEAS_PARAMS cell. If the target is unwilling to be measured at this time or if the coordinator didn't use a TLS certificate that the target trusts, it responds with an error cell and closes the connection. Otherwise it checks that the parameters of the measurement are acceptable (e.g. the version is acceptable, the duration isn't too long, etc.). If the target is happy, it sends a MEAS_PARAMS_OK, otherwise it sends a MEAS_ERR and closes the connection.
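The relay's decision in this handshake can be sketched as a single check. This is a simplified illustration: the function name and arguments are hypothetical, only the duration is validated (the version check is omitted because no version field is defined in this section), and comparing the requested duration directly against FFMaxMeasurementDuration is an assumption.

```python
def check_meas_params(meas_duration, coordinator_fingerprint, *,
                      measurements_allowed, allowed_coordinators,
                      max_duration):
    """Return the measure command the relay would answer with.

    measurements_allowed, allowed_coordinators and max_duration stand in for
    the effective values of FFMeasurementsAllowed, FFAllowedCoordinators and
    FFMaxMeasurementDuration.
    """
    if not measurements_allowed:
        return "MEAS_ERR"  # relay is unwilling to be measured
    if coordinator_fingerprint not in allowed_coordinators:
        return "MEAS_ERR"  # coordinator's TLS certificate is not trusted
    if not 1 <= meas_duration <= max_duration:
        return "MEAS_ERR"  # requested duration is not acceptable
    return "MEAS_PARAMS_OK"
```

In every error case the relay would also close the connection after sending its reply.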

Upon learning the IP addresses of the measurers from the coordinator in the MEAS_PARAMS cell, the target whitelists their IPs in its DoS detection subsystem until the measurement ends (successfully or otherwise), at which point the whitelist is cleared.

Upon receiving a MEAS_PARAMS_OK from the target, the coordinator will instruct the measurers to open their circuits (one circuit per connection) with the target. If the coordinator receives a MEAS_ERR, or if any measurer receives one and reports it to the coordinator, the measurement is considered a failure. It is also a failure if any measurer is unable to open at least half of its circuits with the target.

The payload of MEAS_PARAMS cells [XXX more may need to be added]:

- meas_duration [2 bytes] [1, 600]
- num_measurers [1 byte] [1, 10]
- measurer_info [num_measurers times]

meas_duration is the duration, in seconds, that the actual measurement will last. num_measurers is how many link_specifier structs follow containing information on the measurers that the relay should expect. Future versions of FlashFlow and MEAS_PARAMS will use TLS certificates instead of IP addresses. [XXX probably need diff layout to allow upgrade to TLS certs instead of link_specifier structs. probably using ext-type-length-value like teor suggests] [XXX want to specify number of conns to expect from each measurer here?]

MEAS_PARAMS_OK has no payload: it's just padding bytes to make the cell PAYLOAD_LEN (509) bytes long.

The payload of MEAS_ECHO cells:

- arbitrary bytes [PAYLOAD_LEN bytes]

The payload of MEAS_BG cells [XXX more for extra info? like CPU usage]:

- second [2 bytes] [1, 600]
- sent_bg_bytes [4 bytes] [0, 2^32-1]
- recv_bg_bytes [4 bytes] [0, 2^32-1]

second is the number of seconds since the measurement began. MEAS_BG cells are sent once per second from the relay to the FlashFlow coordinator. The first cell will have this set to 1, and each subsequent cell will increment it by one. sent_bg_bytes is the number of background traffic bytes sent in the last second (since the last MEAS_BG cell). recv_bg_bytes is the same but for received bytes.
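The three MEAS_BG fields fit in 10 bytes. The sketch below packs and unpacks them with Python's `struct` module; network (big-endian) byte order is an assumption, since this section does not state the byte order, and the function names are hypothetical.

```python
import struct

# second (2 bytes), sent_bg_bytes (4 bytes), recv_bg_bytes (4 bytes).
# Assumption: network byte order, unsigned integers.
MEAS_BG_FORMAT = ">HII"
MEAS_BG_LEN = struct.calcsize(MEAS_BG_FORMAT)  # 10 bytes


def pack_meas_bg(second, sent_bg_bytes, recv_bg_bytes):
    """Serialize the MEAS_BG fields."""
    if not 1 <= second <= 600:
        raise ValueError("second must be in [1, 600]")
    return struct.pack(MEAS_BG_FORMAT, second, sent_bg_bytes, recv_bg_bytes)


def unpack_meas_bg(data):
    """Parse the MEAS_BG fields from the start of `data`."""
    return struct.unpack(MEAS_BG_FORMAT, data[:MEAS_BG_LEN])
```

The remaining bytes of the cell payload would be padding.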

The payload of MEAS_ERR cells [XXX need field for more info]:

- err_code [1 byte] [0, 255]

The error code is one of:

[... XXX TODO ...] 255 -- OTHER

3.1.4 Measurement Mode

The relay considers the measurement to have started the moment it receives the first MEAS_ECHO cell from any measurer. At this point, the relay

  • Starts a repeating 1s timer on which it will report the amount of background traffic to the coordinator over the coordinator's connection.
  • Enters "measurement mode" and limits the amount of background traffic it handles according to the torrc option/consensus parameter.

The relay decrypts and echoes back all MEAS_ECHO cells it receives on measurement connections until it has reported its amount of background traffic the same number of times as there are seconds in the measurement (e.g. 30 per-second reports for a 30 second measurement). After sending the last MEAS_BG cell, the relay drops all buffered MEAS_ECHO cells, closes all measurement connections, and exits measurement mode.

During the measurement the relay targets a ratio of background traffic to measurement traffic as specified by a consensus parameter/torrc option. For a given ratio r, if the relay has handled x cells of measurement traffic recently, Tor then limits itself to y = xr/(1-r) cells of non-measurement traffic this scheduling round. If x is very small, the relay will perform the calculation s.t. x is the number of cells required to produce 10 Mbit/s of measurement traffic, thus ensuring some minimum amount of background traffic is allowed.

[XXX teor suggests in [4] that the number 10 Mbit/s could be derived more intelligently. E.g. based on AuthDirFastGuarantee or AuthDirGuardBWGuarantee]
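The background limit y = xr/(1-r) described above can be sketched as follows. The function and parameter names are hypothetical, and the floor on x is passed in as a cell count because the text does not say how 10 Mbit/s maps onto one scheduling round.

```python
def background_cell_limit(meas_cells, ratio, min_meas_cells):
    """Cells of background traffic allowed this scheduling round.

    meas_cells: measurement cells handled recently (x in the text).
    ratio: target share r of background traffic, 0 <= r < 1.
    min_meas_cells: assumed floor for x, i.e. the number of cells that
        corresponds to 10 Mbit/s over the same period.
    """
    if not 0 <= ratio < 1:
        raise ValueError("ratio must be in [0, 1)")
    x = max(meas_cells, min_meas_cells)
    # y = x * r / (1 - r), rounded down to whole cells.
    return int(x * ratio / (1 - ratio))
```

With r = 0.25, 300 measurement cells allow 100 background cells, so background traffic is 100 of 400 cells in total, or 25%.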

3.2 FlashFlow Components

The FF coordinator and measurer code will reside in a FlashFlow repository separate from little-t tor.

There are three notable parameters for which a FF deployment must choose values. They are:

  • The number of sockets, s, the measurers should open, in aggregate, with the target relay. We suggest s=160 based on the FF paper.
  • The bandwidth multiplier, m. Given an existing capacity estimate for a relay, z, the coordinator will instruct the measurers to, in aggregate, send m*z Mbit/s to the target relay. We recommend m=2.25.
  • The measurement duration, d. Based on the FF paper, we recommend d=30 seconds.

The rest of this section first discusses notable functions of the FlashFlow coordinator, then goes on to discuss FF measurer code that will require supporting tor code.

3.2.1 FlashFlow Coordinator

The coordinator is responsible for scheduling measurements, aggregating results, and producing v3bw files. It needs continuous access to new consensus files, which it can obtain by running an accompanying Tor process in client mode.

The coordinator has the following functions, which will be described in this section:

  • aggregating results.
  • scheduling measurements.
  • generating v3bw files.

3.2.1.1 Aggregating Results

Every second during a measurement, the measurers send the amount of verified measurement traffic they have received back from the relay. Additionally, the relay sends a MEAS_BG cell each second to the coordinator with the amount of non-measurement background traffic it is sending and receiving.

For each second's reports, the coordinator sums the measurers' reports. The coordinator takes the minimum of the relay's reported sent and received background traffic. If, when compared to the measurers' reports for this second, the relay's claimed background traffic is more than what's allowed by the background/measurement traffic ratio, then the coordinator further clamps the relay's report down. The coordinator adds this final adjusted amount of background traffic to the sum of the measurers' reports.

Once the coordinator has done the above for each second in the measurement (e.g. 30 times for a 30 second measurement), the coordinator takes the median of the 30 per-second throughputs and records it as the estimated capacity of the target relay.
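The aggregation steps above can be sketched as below. The function name and input shapes are hypothetical, and the clamp assumes the ratio r is the share of background traffic in the total, which matches the r=25% example in section 3.3.1.

```python
from statistics import median


def estimate_capacity(measurer_reports, bg_reports, ratio):
    """Median per-second throughput, following the steps above.

    measurer_reports: one list per second with each measurer's verified bytes.
    bg_reports: one (sent_bg_bytes, recv_bg_bytes) tuple per second.
    ratio: allowed share r of background traffic in the total.
    """
    if len(measurer_reports) != len(bg_reports):
        raise ValueError("need one background report per second")
    per_second = []
    for meas, (sent, recv) in zip(measurer_reports, bg_reports):
        meas_total = sum(meas)
        bg = min(sent, recv)
        # Clamp so that bg / (meas_total + bg) never exceeds ratio.
        bg = min(bg, meas_total * ratio / (1 - ratio))
        per_second.append(meas_total + bg)
    return median(per_second)
```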

3.2.1.2 Measurement Schedule

The short term implementation of measurement scheduling will be simpler than the long term one due to (1) there only being one FlashFlow deployment, and (2) there being very few relays that support being measured by FlashFlow. In fact the FF coordinator will maintain a list of the relays that have updated to support being measured and have opted in to being measured, and it will only measure them.

The coordinator divides time into a series of 24 hour periods, commonly referred to as days. Each period has measurement slots that are longer than a measurement lasts (30s), say 60s, to account for pre- and post-measurement work. Thus with 60s slots there are 1,440 slots in a day.

At the start of each day the coordinator considers the list of relays that have opted in to being measured. From this list of relays, it repeatedly takes the relay with the largest existing capacity estimate. It selects a random slot. If the slot has existing relays assigned to it, the coordinator makes sure there is enough additional measurer capacity to handle this relay. If so, it assigns this relay to this slot. If not, it keeps picking new random slots until one has sufficient additional measurer capacity.

Relays without existing capacity estimates are assumed to have the 75th percentile capacity of the current network.

If a relay is not online when it's scheduled to be measured, it doesn't get measured that day.

3.2.1.2.1 Example

Assume the FF deployment has 1 Gbit/s of measurer capacity. Assume the chosen multiplier m=2. Assume there are only 5 slots in a measurement period.

Consider a set of relays with the following existing capacity estimates and that have opted in to being measured by FlashFlow.

  • 500 Mbit/s
  • 300 Mbit/s
  • 250 Mbit/s
  • 200 Mbit/s
  • 100 Mbit/s
  • 50 Mbit/s

The coordinator takes the largest relay, 500 Mbit/s, and picks a random slot for it. It picks slot 3. The coordinator takes the next largest, 300, and randomly picks slot 2. The slots are now:

       0   |   1   |   2   |   3   |   4
    -------|-------|-------|-------|-------
           |       |  300  |  500  |
           |       |       |       |

The coordinator takes the next largest, 250, and randomly picks slot 2. Slot 2 already has 600 Mbit/s of measurer capacity reserved (300*m); given just 1000 Mbit/s of total measurer capacity, there is just 400 Mbit/s of spare capacity while this relay requires 500 Mbit/s. There is not enough room in slot 2 for this relay. The coordinator picks a new random slot, 0.

       0   |   1   |   2   |   3   |   4
    -------|-------|-------|-------|-------
      250  |       |  300  |  500  |
           |       |       |       |

The next largest is 200 and the coordinator randomly picks slot 2 again (wow!). As there is just enough spare capacity, the coordinator assigns this relay to slot 2.

       0   |   1   |   2   |   3   |   4
    -------|-------|-------|-------|-------
      250  |       |  300  |  500  |
           |       |  200  |       |

The coordinator randomly picks slot 4 for each of the two remaining relays, 100 and then 50.

       0   |   1   |   2   |   3   |   4
    -------|-------|-------|-------|-------
      250  |       |  300  |  500  |  100
           |       |  200  |       |  50
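The short term scheduling procedure walked through in this example can be sketched as follows. The function name and return shape are hypothetical. One deliberate difference from the text: instead of re-picking random slots indefinitely, the sketch visits the slots in a shuffled order, which gives the same distribution but terminates when no slot has room.

```python
import random


def schedule(estimates, num_slots, measurer_capacity, multiplier, rng=None):
    """Assign relays to slots, largest capacity estimate first.

    Returns (slots, unscheduled): slots[i] lists the estimates placed
    in slot i; unscheduled lists relays that fit in no slot.
    """
    rng = rng or random.Random()
    slots = [[] for _ in range(num_slots)]
    used = [0] * num_slots  # measurer capacity reserved per slot
    unscheduled = []
    for est in sorted(estimates, reverse=True):
        need = est * multiplier
        # Visiting slots in shuffled order is equivalent to re-picking
        # random slots until one fits, but it always terminates.
        order = list(range(num_slots))
        rng.shuffle(order)
        for i in order:
            if used[i] + need <= measurer_capacity:
                slots[i].append(est)
                used[i] += need
                break
        else:
            unscheduled.append(est)
    return slots, unscheduled
```

Calling `schedule([500, 300, 250, 200, 100, 50], 5, 1000, 2)` reproduces the setting of the example; the exact slot layout depends on the random choices.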

3.2.1.3 Generating V3BW files

Every hour the FF coordinator produces a v3bw file in which it stores the latest capacity estimate for every relay it has measured in the last week. The coordinator will create this file on the host's local file system. Previously-generated v3bw files will not be deleted by the coordinator. A symbolic link at a static path will always point to the latest v3bw file.

    $ ls -l
    v3bw -> v3bw.2020-03-01-05-00-00
    v3bw.2020-03-01-00-00-00
    v3bw.2020-03-01-01-00-00
    v3bw.2020-03-01-02-00-00
    v3bw.2020-03-01-03-00-00
    v3bw.2020-03-01-04-00-00
    v3bw.2020-03-01-05-00-00
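One way to keep the static symlink pointing at the latest file is sketched below. It assumes a POSIX file system, UTC timestamps in the file names, and a hypothetical temporary link name; none of these are specified by the proposal, and the v3bw contents are treated as an opaque string.

```python
import os
import time


def publish_v3bw(directory, contents, now=None):
    """Write a timestamped v3bw file and repoint the 'v3bw' symlink."""
    # Assumption: timestamps in file names are UTC.
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime(now))
    name = "v3bw." + stamp
    with open(os.path.join(directory, name), "w") as f:
        f.write(contents)
    # Build the new link under a temporary name, then rename it over the
    # old one so readers never observe a missing link.
    tmp_link = os.path.join(directory, "v3bw.tmp-link")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(name, tmp_link)
    os.replace(tmp_link, os.path.join(directory, "v3bw"))
    return name
```

Older files are left in place, as the text requires; pruning them is a separate concern.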

[XXX Either FF should auto-delete old ones, logrotate config should be provided, a script provided, or something to help bwauths not accidentally fill up their disk]

[XXX What's the approximate disk usage for, say, a few years of these?]

3.2.2 FlashFlow Measurer

The measurers take commands from the coordinator, connect to target relays with many sockets, send them traffic, and verify the received traffic is the same as what was sent.

Notable new things that internal tor code will need to do on the measurer (client) side:

  1. Open many TLS+TCP connections to the same relay on purpose.

3.2.2.1 Open many connections

FlashFlow prototypes needed to "hack in" a flag in the open-a-connection-with-this-relay function call chain that indicated whether or not we wanted to force a new connection to be created. Most of Tor doesn't care if it reuses an existing connection, but FF does want to create many different connections. The cleanest way to accomplish this will be investigated.

On the relay side, these measurer connections do not count towards DoS detection algorithms.

3.3 Security

In this section we discuss the security of various aspects of FlashFlow and the tor changes it requires.

3.3.1 Weight Inflation

Target relays are an active part of the measurement process; they know they are getting measured. While a relay cannot fake the measurement traffic, it can trivially stop transferring client background traffic for the duration of the measurement yet claim it carried some. More generally, there is no verification of the claimed amount of background traffic during the measurement. The relay can claim whatever it wants, but it will not be trusted above the ratio the FlashFlow deployment is configured to know. This places an easy to understand, firm, and (if set as we suggest) low cap on how much a relay can inflate its measured capacity.

Consider a background/measurement ratio of 1/4, or 25%. Assume the relay in question has a hard limit on capacity (e.g. from its NIC) of 100 Mbit/s. The relay is supposed to use up to 25% of its capacity for background traffic and the remaining 75%+ capacity for measurement traffic. Instead the relay ceases carrying background traffic, uses all 100 Mbit/s of capacity to handle measurement traffic, and reports ~33 Mbit/s of background traffic (33/133 = ~25%). FlashFlow would trust this and consider the relay capable of 133 Mbit/s. (If the relay were to report more than ~33 Mbit/s, FlashFlow limits it to just ~33 Mbit/s.) With r=25%, FlashFlow only allows 1.33x weight inflation.

Prior work shows that Torflow allows weight inflation by a factor of 89x [0] or even 177x [1].

The ratio chosen is a trade-off between impact on background traffic and security: r=50% allows a relay to double its weight but won't impact client traffic for relays with steady state throughput below 50%, while r=10% allows a very low inflation factor but will cause throttling of client traffic at far more relays. We suggest r=25% (and thus 1/(1-0.25)=1.33x inflation) for a reasonable trade-off between performance and security.
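The relationship between the ratio r and the worst-case inflation factor 1/(1-r) can be checked with a few lines. The function name is hypothetical; the formula is the one used in the text.

```python
def max_inflation(ratio):
    """Largest factor by which a relay can inflate its measured capacity.

    A relay that spends all of its capacity c on measurement traffic can
    claim at most b = c * r / (1 - r) of background traffic, so the
    accepted total is c + b = c / (1 - r).
    """
    if not 0 <= ratio < 1:
        raise ValueError("ratio must be in [0, 1)")
    return 1 / (1 - ratio)


for r in (0.10, 0.25, 0.50):
    print(f"r={r:.2f}: up to {max_inflation(r):.2f}x")
```

For the three ratios discussed above this prints factors of 1.11x, 1.33x and 2.00x.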

It may be possible to catch relays performing this attack, especially if they literally drop all background traffic during the measurement: have the measurer (or some party on its behalf) create a regular stream through the relay and measure the throughput on the stream before/during/after the measurement. This can be explored longer term.

3.3.2 Incomplete Authentication

The short term FlashFlow implementation has the relay set two torrc options if they would like to allow themselves to be measured: a flag allowing measurement, and the list of coordinator TLS certificates that are allowed to start a measurement.

The relay drops MEAS_PARAMS cells from coordinators it does not trust, and immediately closes the connection after that. A FF coordinator cannot convince a relay to enter measurement mode unless the relay trusts its TLS certificate.

A trusted coordinator specifies in the MEAS_PARAMS cell the IP addresses of the measurers the relay shall expect to connect to it shortly. The target adds the measurer IP addresses to a whitelist in the DoS connection limit system, exempting them from any configured connection limit. If a measurer is behind a NAT, an adversary behind the same NAT can DoS the relay's available sockets until the end of the measurement. The adversary could also pretend to be the measurer. Such an adversary could induce measurement failures and inaccuracies. (Note: the whitelist is cleared after the measurement is over.)

4. FlashFlow measurement system: Medium term

The medium term deployment stage begins after FlashFlow has been implemented and relays are starting to update to a version of Tor that supports it.

New link- and relay-subprotocol versions will be used by the relay to indicate FF support. E.g. at the time of writing, the next relay subprotocol version is 4 [3].

We plan to host a FlashFlow deployment consisting of a FF coordinator and a single FF measurer on a single 1 Gbit/s machine. Data produced by this deployment will be made available (semi?) publicly, including both v3bw files and intermediate results.

Any development changes needed during this time would go through separate proposals.

5. FlashFlow measurement system: Long term

In the long term, finishing-touch development work will be done, including adding better authentication and measurement scheduling, and experiments will be run to determine the best way to integrate FlashFlow into the Tor ecosystem.

Any development changes needed during this time would go through separate proposals.

5.1 Authentication to Target Relay

Short term deployment already had FlashFlow coordinators using TLS certificates when connecting to relays, but in the long term, directory authorities will vote on the consensus parameter for which coordinators should be allowed to perform measurements. The voting is done in the same way they currently vote on recommended tor versions.

FlashFlow measurers will be updated to use TLS certificates when connecting to relays too. FlashFlow coordinators will update the contents of MEAS_PARAMS cells to contain measurer TLS certificates instead of IP addresses, and relays will update to expect this change.

5.2 Measurement Scheduling

Short term deployment only has one FF deployment running. Long term this may no longer be the case because, for example, more than one directory authority decides to adopt it and they each want to run their own deployment. FF deployments will need to coordinate between themselves to not measure the same relay at the same time, and to handle new relays as they join during the middle of a measurement period (during the day).

The measurement scheduling process shall be non-interactive. All the inputs (e.g. the shared random value, the identities of the coords, the relays currently in the network) are publicly known to (at least) the bwauths, thus each individual bwauth can calculate the same multi-coord measurement schedule.

The following is quoted from Section 4.3 of the FlashFlow paper.

To measure all relays in the network, the BWAuths periodically determine the measurement schedule. The schedule determines when and by whom a relay should be measured. We assume that the BWAuths have sufficiently synchronized clocks to facilitate coordinating their schedules. A measurement schedule is created for each measurement period, the length p of which determines how often a relay is measured. We use a measurement period of p = 24 hours. To help avoid active denial-of-service attacks on targeted relays, the measurement schedule is randomized and known only to the BWAuths. Before the next measurement period starts, the BWAuths collectively generate a random seed (e.g. using Tor’s secure-randomness protocol). Each BWAuth can then locally determine the shared schedule using pseudorandom bits extracted from that seed. The algorithm to create the schedule considers each measurement period to be divided into a sequence of t-second measurement slots. For each old relay, slots for each BWAuth to measure it are selected uniformly at random without replacement from all slots in the period that have sufficient unallocated measurement capacity to accommodate the measurement. When a new relay appears, it is measured separately by each BWAuth in the first slots with sufficient unallocated capacity. Note that this design ensures that old relays will continue to be measured, with new relays given secondary priority in the order they arrive.

[XXX Teor leaves good ideas in his tor-dev@ post [5], including a good plain language description of what the FF paper quote says, and a recommendation on which consensus to use when making a new schedule]

A problem arises when two relays are hosted on the same machine but measured at different times: they both will be measured to have the full capacity of their host. At the very least, the scheduling algo should schedule relays with the same IP to be measured at the same time. Perhaps better is measuring all relays in the same MyFamily, same ipv4/24, and/or same ipv6/48 at the same time. What specifically to do here is left for medium/long term work.

5.3 Experiments

[XXX todo]

5.4 Other Changes/Investigations/Ideas

  • How can FlashFlow data be used in a way that doesn't lead to poor load balancing given the following items that lead to non-uniform client behavior:
    • Guards that high-traffic HSs choose (for 3 months at a time)
    • Guard vs middle flag allocation issues
    • New Guard nodes (Guardfraction)
    • Exit policies other than default/all
    • Directory activity
    • Total onion service activity
    • Super long-lived circuits
  • Add a cell that the target relay sends to the coordinator indicating its CPU and memory usage, whether it has a shortage of sockets, how much bandwidth load it has been experiencing lately, etc. Use this information to lower a relay's weight, never increase.
  • If FlashFlow and sbws work together (as opposed to FlashFlow replacing sbws), consider logic for how much sbws can increase/decrease FF results
  • Coordination of multiple FlashFlow deployments: scheduling of measurements, seeding schedule with shared random value.
  • Other background/measurement traffic ratios. Dynamic? (known slow relay => more allowed bg traffic?)
  • Catching relays inflating their measured capacity by dropping background traffic.
  • What to do about co-located relays. Can they be detected reliably? Should we just add a torrc option a la MyFamily for co-located relays?
  • What is the explanation for dennis.jackson's scary graphs in this [2] ticket? Was it because of the speed test? Why? Will FlashFlow produce the same behavior?

Citations

[0] F. Thill. Hidden Service Tracking Detection and Bandwidth Cheating in Tor Anonymity Network. Master’s thesis, Univ. Luxembourg, 2014. https://www.cryptolux.org/images/b/bc/Tor_Issues_Thesis_Thill_Fabrice.pdf

[1] A. Johnson, R. Jansen, N. Hopper, A. Segal, and P. Syverson. PeerFlow: Secure Load Balancing in Tor. Proceedings on Privacy Enhancing Technologies (PoPETs), 2017(2), April 2017. https://ohmygodel.com/publications/peerflow-popets2017.pdf

[2] Mike Perry: Graph onionperf and consensus information from Rob's experiments https://trac.torproject.org/projects/tor/ticket/33076

[3] tor-spec.txt Section 9.3 "Relay" Subprotocol versioning https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n2132

[4] Teor's second response to FlashFlow proposal https://lists.torproject.org/pipermail/tor-dev/2020-April/014251.html

[5] Teor's first response to FlashFlow proposal https://lists.torproject.org/pipermail/tor-dev/2020-April/014246.html

Appendix A: Save CPU at measurer by not encrypting all MEAS_ECHO cells

Verify echo cells

A parameter will exist to tell the measurers with what frequency they shall verify that cells echoed back to them match what was sent. This parameter does not need to exist outside of the FF deployment (e.g. it doesn't need to be a consensus parameter).

The parameter instructs the measurers to check 1 out of every N cells.

The measurer keeps a count of how many measurement cells it has sent. It also logically splits its output stream of cells into buckets of size N. At the start of each bucket (when num_sent % N == 0), the measurer chooses a random index in the bucket. Upon sending the cell at that index (num_sent % N == chosen_index), the measurer records the cell.

The measurer also counts cells that it receives. When it receives a cell at an index that was recorded, it verifies that the received cell matches the recorded sent cell. If they match, no special action is taken. If they don't match, the measurer indicates failure to the coordinator and target relay and closes all connections, ending the measurement.

Example

Consider bucket_size is 1000. For the moment ignore cell encryption.

We start at idx=0 and pick an idx in [0, 1000) to record, say 640. At idx=640 we record the cell. At idx=1000 we choose a new idx in [1000, 2000) to record, say 1236. At idx=1236 we record the cell. At idx=2000 we choose a new idx in [2000, 3000). Etc.

There are 2000+ cells in flight and the measurer has recorded two items:

- (640, contents_of_cellA)
- (1236, contents_of_cellB)

Consider the receive side now. It counts the cells it receives. At receive idx=640, it checks the received cell matches the saved cell from before. At receive idx=1236, it again checks the received cell matches. Etc.
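The bucket-based bookkeeping described above can be sketched as a small class. The class and method names are hypothetical and, as in the example, cell encryption is ignored: cells are compared as opaque byte strings.

```python
import random


class EchoSampler:
    """Record one sent cell per bucket and check it when it is echoed."""

    def __init__(self, bucket_size, rng=None):
        self.bucket_size = bucket_size
        self.rng = rng or random.Random()
        self.num_sent = 0
        self.num_recv = 0
        self.chosen = None   # absolute index to record in this bucket
        self.recorded = {}   # absolute index -> cell contents

    def on_send(self, cell):
        if self.num_sent % self.bucket_size == 0:
            # Start of a bucket: choose which of its cells to record.
            self.chosen = self.num_sent + self.rng.randrange(self.bucket_size)
        if self.num_sent == self.chosen:
            self.recorded[self.num_sent] = cell
        self.num_sent += 1

    def on_receive(self, cell):
        """Return False if a recorded cell came back altered."""
        expected = self.recorded.pop(self.num_recv, None)
        self.num_recv += 1
        return expected is None or expected == cell
```

A caller would treat a `False` result from `on_receive` as the failure case: report to the coordinator and close all connections.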

Motivation

A malicious relay may want to skip decryption of measurement cells to save CPU cycles and obtain a higher capacity estimate. More generally, it could generate fake measurement cells locally, ignore the measurement traffic it is receiving, and flood the measurer with more traffic than it (the measurer) is even sending.

The security of echo cell verification is discussed in section 3.3.1.

Security

A smaller bucket size means more cells are checked and FF is more likely to detect a malicious target. It also means more bookkeeping overhead (CPU/RAM).

An adversary that knows bucket_size and cheats on one item out of every bucket_size items will have a 1/bucket_size chance of getting caught in the first bucket. This is the worst case adversary. While cheating on just a single item per bucket yields very little advantage, cheating on more items per bucket increases the likelihood the adversary gets caught. Thus only the worst case is considered here.

In general, the odds the adversary can successfully cheat in a single bucket are

(bucket_size-1)/bucket_size

Thus the odds the adversary can cheat in X consecutive buckets are

[(bucket_size-1)/bucket_size]^X

In our case, X will be highly varied: Slow relays won't see very many buckets, but fast relays will. The damage to the network a very slow relay can do by faking being only slightly faster is limited. Nonetheless, for now we motivate the selection of bucket_size with a slow relay:

  • Assume a very slow relay of 1 Mbit/s capacity that will cheat 1 cell in each bucket. Assume a 30 second measurement.
  • The relay will handle 1*30 = 30 Mbit of traffic during the measurement, or 3.75 MB, or 3.75 million bytes.
  • Cells are 514 bytes. Approximately (e.g. ignoring TLS) 7300 cells will be sent/recv over the course of the measurement.
  • A bucket_size of 50 results in about 146 buckets over the course of the 30s measurement.
  • Therefore, the odds of the adversary cheating successfully are (49/50)^(146), or about 5.2%.

This sounds high, but a relay capable of double the bandwidth (2 Mbit/s) will have (49/50)^(2*146) or about 0.27% odds of success, which is quite low.

Wanting a <1% chance that a 10 Mbit/s relay can successfully cheat results in a bucket size of approximately 125:

  • 10*30 = 300 Mbit of traffic during 30s measurement. 37.5 million bytes.
  • 37,500,000 bytes / 514 bytes/cell = ~73,000 cells
  • bucket_size of 125 cells means 73,000 / 125 = 584 buckets
  • (124/125)^(584) = 0.918% chance of successfully cheating

Slower relays can cheat more easily but the amount of extra weight they can obtain is insignificant in absolute terms. Faster relays are essentially unable to cheat.
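The estimates above can be reproduced with a short calculation. The function name is hypothetical; like the text, it ignores TLS overhead and assumes the adversary cheats on exactly one cell per bucket. It does not round the cell and bucket counts, so results differ slightly from the rounded figures above.

```python
CELL_LEN = 514  # bytes per cell, as used in the estimates above


def cheat_success_probability(rate_mbit, duration_s, bucket_size):
    """Odds of cheating on one cell per bucket without being caught."""
    total_bytes = rate_mbit * 1_000_000 / 8 * duration_s
    buckets = total_bytes / CELL_LEN / bucket_size
    return ((bucket_size - 1) / bucket_size) ** buckets
```

For example, `cheat_success_probability(1, 30, 50)` is roughly 0.05 and `cheat_success_probability(10, 30, 125)` is a little under 0.01, in line with the two cases worked through above.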

Filename: 317-secure-dns-name-resolution.txt
Title: Improve security aspects of DNS name resolution
Author: Christian Hofer
Created: 21-Mar-2020
Status: Needs-Revision

Overview:

This document proposes a solution for handling DNS name resolution within Tor in a secure manner. In order to achieve this the responsibility for name resolution is moved from the exit relays to the clients. Therefore a security aware DNS resolver is required that is able to operate using Tor.

The advantages are:

* Users no longer have to trust exit relays but can choose trusted nameservers.
* DNS requests are kept confidential from exit relays in case the nameservers are running behind onion services.
* The authenticity and integrity of DNS records is verified by means of DNSSEC.

Motivation:

The way Tor resolves DNS names has always been a hot topic within the Tor community and it seems that the discussion is not over yet.

One example is this recent blog posting that addresses the importance of avoiding public DNS resolvers in order to mitigate analysis attacks.

https://blog.torproject.org/new-low-cost-traffic-analysis-attacks-mitigations

Then there is the paper "The Effect of DNS on Tor’s Anonymity" that discusses how to use DNS traffic for correlation attacks and what countermeasures should be taken. Based on this, there is this interesting medium article evaluating the situation two years after it was published.

https://medium.com/@nusenu/who-controls-tors-dns-traffic-a74a7632e8ca

Furthermore, there was already a proposal to improve the way DNS resolution is done within Tor. Unfortunately, it seems that it has been abandoned, so this proposal picked up the presented ideas.

https://gitweb.torproject.org/torspec.git/tree/proposals/219-expanded-dns.txt

Design:

The key aspect is the introduction of a DNS resolver module on the client side. It has to comply with the well known DNS standards as described in a series of RFCs. Additional requirements are the ability to communicate through the Tor network for ensuring confidentiality and the implementation of DNS security extensions (DNSSEC) for verifying the authenticity and integrity. Furthermore it has to cover two distinct scenarios, which are described in subsequent sections.

The resolution scenario, the most common scenario for a DNS resolver, is applicable for connections handled by the SocksPort. After successful socks handshakes the target address is resolved before attaching the connection.

The proxy scenario is a more unusual use case, however it is required for connections handled by the DNSPort. In this case requests are forwarded as they are received without employing any resolution or verification means.

In both scenarios the most noticeable change in terms of interactions between the resolver and the rest of Tor concerns the entry and exit points for passing connections forth and back. Additionally, the entry_connection needs to be extended so that it is capable of holding related state information.

Security implications:

This improves the security aspects of DNS name resolution by reducing the significance of exit relays. In particular:

* Operating nameservers behind onion services allows end-to-end encryption for DNS lookups.
* Employing DNSSEC verification prevents tampering with DNS records.
* Configuring trusted nameservers on the client side reduces the number of entities that must be trusted.

Specification:

DNS resolver general implementation:

The security aware DNS resolver module has to comply with existing DNS and DNSSEC specifications. A list of related RFCs:

RFC883, RFC973, RFC1035, RFC1183, RFC1876, RFC1996, RFC2065, RFC2136, RFC2230, RFC2308, RFC2535, RFC2536, RFC2539, RFC2782, RFC2845, RFC2874, RFC2930, RFC3110, RFC3123, RFC3403, RFC3425, RFC3596, RFC3658, RFC3755, RFC3757, RFC3986, RFC4025, RFC4033, RFC4034, RFC4035, RFC4255, RFC4398, RFC4431, RFC4509, RFC4635, RFC4701, RFC5011, RFC5155, RFC5702, RFC5933, RFC6605, RFC6672, RFC6698, RFC6725, RFC6840, RFC6844, RFC6891, RFC7129, RFC7344, RFC7505, RFC7553, RFC7929, RFC8005, RFC8078, RFC8080, RFC8162.

DNS resolver configuration settings:

DNSResolver: If True use DNS resolver module for name resolution, otherwise Tor's behavior should be unchanged.

DNSResolverIPv4: If True names should be resolved to IPv4 addresses.

DNSResolverIPv6: If True names should be resolved to IPv6 addresses. In case IPv4 and IPv6 are enabled prefer IPv6 and use IPv4 as fallback.

DNSResolverRandomizeCase: If True apply 0x20 hack to DNS names for outgoing requests.

DNSResolverNameservers: A list of comma separated nameservers, can be an IPv4, an IPv6, or an onion address. Should allow means to configure the port and supported zones.

DNSResolverHiddenServiceZones: A list of comma separated hidden service zones.

DNSResolverDNSSECMode: Should support at least four modes.

    Off: No validation is done. The DO bit is not set in the header of outgoing requests.
    Trust: Trust validation of DNS recursor. The CD and DO bits are not set in the header of outgoing requests.
    Process: Employ DNSSEC validation but ignore the result.
    Validate: Employ DNSSEC validation and reject insecure data.

DNSResolverTrustAnchors: A list of comma separated trust anchors in DS record format. https://www.iana.org/dnssec/files

DNSResolverMaxCacheEntries: Specifies the maximum number of cache entries.

DNSResolverMaxCacheTTL: Specifies the maximum age of cache entries in seconds.

DNS resolver state (dns_lookup_st.h):

action: Defines the active action. Available actions are: forward, resolve, validate.

qname: Specifies the name that should be resolved or forwarded.

qtype: Specifies the type that should be resolved or forwarded.

start_time: Holds the initiation time.

nameserver: Specifies the chosen nameserver.

validation: Holds the DNSSEC validation state only applicable for the validate action.

server_request: The original DNSPort request required for delivering responses in the proxy scenario.

ap_conn: The original SocksPort entry_connection required for delivering responses in the resolution scenario.

SocksPort related changes (resolution scenario):

The entry point is directly after a successful socks handshake in connection_ap_handshake_process_socks (connection_edge.c). Based on the target address type the entry_connection is either passed to the DNS resolver (hostname) or handled as usual (IPv4, IPv6, onion).

In the former case the DNS resolver creates a new DNS lookup connection and attaches it instead of the given entry_connection. This connection is responsible for resolving the hostname of the entry_connection and verifying the response.

Once the result is verified and the hostname is resolved, the DNS resolver replaces the target address in the entry_connection with the resolved address and attaches it. From this point on the entry_connection is processed as usual.

DNSPort related changes (proxy scenario):

The entry point is in evdns_server_callback (dnsserv.c). Instead of creating a dummy connection the received server_request is passed to the DNS resolver. It creates a DNS lookup connection with the action type forward and applies the name and type from the server_request. When the DNS resolver receives the answer from the nameserver it resolves the server_request by adding all received resource records.

Compatibility:

Compatibility issues are not expected since there are no changes to the Tor protocol. The significant part takes place on the client side before attaching connections.

Implementation:

A complete implementation of this proposal can be found here:

https://github.com/torproject/tor/pull/1869

The following steps should suffice to test the implementation:

* check out the branch
* build Tor as usual
* enable the DNS resolver module by adding `DNSResolver 1` to torrc

Useful services for verifying DNSSEC validation:

* http://www.dnssec-or-not.com/
* https://enabled.dnssec.hkirc.hk/
* https://www.cloudflare.com/ssl/encrypted-sni/

Dig is useful for testing the DNSPort related changes:

    dig -p9053 torproject.org

Performance and scalability:

Since there are no direct changes to the protocol and this is an alternative approach for an already existing requirement, there are no performance issues expected. Additionally, the encoding and decoding of DNS message handling as well as the verification takes place on the client side.

In terms of scalability the availability of nameservers might be one of the key concerns. However, this is the same issue as for nameservers on the clearweb. If it turns out that it is not feasible to operate nameservers as onion service in a performant manner it is always possible to fallback to clearweb nameservers by changing a configuration setting.

Filename: 318-limit-protovers.md
Title: Limit protover values to 0-63.
Author: Nick Mathewson
Created: 11 May 2020
Status: Closed
Implemented-In: 0.4.5.1-alpha

Limit protover values to 0-63.

I propose that we no longer accept protover values higher than 63, so that they can all fit nicely into 64-bit fields.

(This proposal is part of the Walking Onions spec project.)

Motivation

Doing this will simplify our implementations and our protocols. Right now, an efficient protover implementation needs to use ranges to represent possible protocol versions, and needs workarounds to prevent an attacker from constructing a protover line that would consume too much memory. With Walking Onions, we need lists of protocol versions to be represented in an extremely compact format, which also would benefit from a limited set of possible versions.

I believe that we will lose nothing by making this change. Currently, after nearly two decades of Tor development and 3.5 years of experience with protovers in production, we have no protocol version higher than 5.

Even if we did someday need to implement higher protocol versions, we could simply add a new subprotocol name instead. For example, instead of "HSIntro=64", we could say "HSIntro2=1".
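
To illustrate why the 0-63 limit is convenient, here is a minimal sketch (not taken from any Tor implementation) that stores the supported versions of one subprotocol as a single 64-bit mask. The function names `parse_versions` and `supports` are hypothetical, and the parser only handles the comma-and-range version syntax.

```python
# Sketch: store the supported versions of one subprotocol as a 64-bit mask.
MAX_PROTOVER = 63  # upper bound proposed in the text


def parse_versions(spec: str) -> int:
    """Parse a version list such as "1-3,5" into a 64-bit bitmask.

    Bit n is set when version n is supported. Values above 63 are rejected.
    """
    mask = 0
    for part in spec.split(","):
        lo_s, _, hi_s = part.partition("-")
        lo = int(lo_s)
        hi = int(hi_s) if hi_s else lo
        if lo > hi or hi > MAX_PROTOVER:
            raise ValueError(f"bad protover range: {part!r}")
        # Set bits lo..hi inclusive.
        mask |= ((1 << (hi - lo + 1)) - 1) << lo
    return mask


def supports(mask: int, version: int) -> bool:
    """Return True when `version` is present in the bitmask."""
    return 0 <= version <= MAX_PROTOVER and bool((mask >> version) & 1)
```

With this representation, no range bookkeeping is needed and an attacker cannot make a version list consume more than one machine word.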

Migration

Immediately, authorities should begin rejecting relays with protocol versions above 63. (There are no such relays in the consensus right now.)

Once this change is deployed to a majority of authorities, we can remove support in other Tor environments for protocol versions above 63.

Filename: 319-wide-everything.md
Title: RELAY_FRAGMENT cells
Author: Nick Mathewson
Created: 11 May 2020
Status: Obsolete

(Proposal superseded by proposal 340)

(This proposal is part of the Walking Onions spec project.)

Introduction

Proposal 249 described a system for CREATE cells to become wider, in order to accommodate hybrid crypto. And in order to send those cell bodies across circuits, it described a way to split CREATE cells into multiple EXTEND cells.

But there are other cell types that may need to be wider too. For example, INTRODUCE and RENDEZVOUS cells also contain key material used for a handshake: if handshakes need to grow larger, then so do these cells.

This proposal describes an encoding for arbitrary "wide" relay cells, that can be used to send a wide variant of anything.

To be clear, although this proposal describes a way that all relay cells can become "wide", I do not propose that wide cells should actually be allowed for all relay cell types.

Proposal

We add a new relay cell type: RELAY_FRAGMENT. This cell type contains part of another relay cell. A RELAY_FRAGMENT cell can either introduce a new fragmented cell, or can continue one that is already in progress.

The format of a RELAY_FRAGMENT body is one of the following:

    // First body in a series
    struct fragment_begin {
       // What relay_command is in use for the underlying cell?
       u8 relay_command;
       // What will the total length of the cell be once it is reassembled?
       u16 total_len;
       // Bytes for the cell body
       u8 body[];
    }

    // all other cells.
    struct fragment_continued {
       // More bytes for the cell body.
       u8 body[];
    }

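To send a fragmented cell, first a party sends a RELAY_FRAGMENT cell containing a "fragment_begin" payload. This payload describes the total length of the cell and the relay command, and carries the first bytes of the cell body. The rest of the body follows in RELAY_FRAGMENT cells containing "fragment_continued" payloads.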

Fragmented cells other than the last one in sequence MUST be sent full of as much data as possible. Parties SHOULD close a circuit if they receive a non-full fragmented cell that is not the last fragment in a sequence.

Fragmented cells MUST NOT be interleaved with other relay cells on a circuit, other than cells used for flow control. (Currently, this is only SENDME cells.) If any party receives any cell on a circuit, other than a flow control cell or a RELAY_FRAGMENT cell, before the fragmented cell is complete, then it SHOULD close the circuit.

Parties MUST NOT send extra data in fragmented cells beyond the amount given in the first 'total_len' field.

Not every relay command may be sent in a fragmented cell. In this proposal, we allow the following cell types to be fragmented: EXTEND2, EXTENDED2, INTRODUCE1, INTRODUCE2, RENDEZVOUS1, and RENDEZVOUS2. Any party receiving a command that they believe should not be fragmented should close the circuit.

Not all lengths up to 65535 are valid lengths for a fragmented cell. Any length under 499 bytes SHOULD cause the circuit to close, since that could fit into a non-fragmented RELAY cell. Parties SHOULD enforce maximum lengths for cell types that they understand.

All RELAY_FRAGMENT cells for the fragmented cell must have the same Stream ID. (For those cells allowed above, the Stream ID is always zero.) Implementations SHOULD close a circuit if they receive fragments with mismatched Stream ID.
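
The splitting and reassembly rules above can be sketched as follows. This is only an illustration under stated assumptions: the 498-byte capacity of a relay cell body is inferred from the "under 499 bytes" rule, network byte order is assumed for the header fields, and the helper names `fragment` and `reassemble` are hypothetical. Checks on the allowed relay commands, Stream IDs, and interleaving are left out.

```python
import struct

# Assumption: a relay cell body can carry up to 498 bytes, which is consistent
# with the text's rule that lengths under 499 need no fragmentation.
CELL_BODY_LEN = 498
# relay_command (u8) followed by total_len (u16); big-endian is assumed.
BEGIN_HEADER = struct.Struct("!BH")


def fragment(relay_command: int, body: bytes) -> list[bytes]:
    """Split `body` into RELAY_FRAGMENT payloads (fragment_begin first)."""
    if not CELL_BODY_LEN < len(body) <= 0xFFFF:
        raise ValueError("body length is not valid for a fragmented cell")
    first_len = CELL_BODY_LEN - BEGIN_HEADER.size
    out = [BEGIN_HEADER.pack(relay_command, len(body)) + body[:first_len]]
    # Every fragment_continued payload is filled completely, except the last.
    for i in range(first_len, len(body), CELL_BODY_LEN):
        out.append(body[i:i + CELL_BODY_LEN])
    return out


def reassemble(payloads: list[bytes]) -> tuple[int, bytes]:
    """Inverse of fragment(); enforces the fullness and total_len rules."""
    relay_command, total_len = BEGIN_HEADER.unpack_from(payloads[0])
    # Every fragment except the last must be completely full.
    if any(len(p) != CELL_BODY_LEN for p in payloads[:-1]):
        raise ValueError("non-final fragment is not full")
    body = payloads[0][BEGIN_HEADER.size:] + b"".join(payloads[1:])
    # No extra data beyond total_len is allowed, and none may be missing.
    if len(body) != total_len:
        raise ValueError("fragments do not match total_len")
    return relay_command, body
```

A real receiver would reassemble incrementally and close the circuit on the first violation rather than raising an exception after collecting every fragment.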

Onion service concerns.

We allocate a new extension for use in the ESTABLISH_INTRO by onion services, to indicate that they can receive a wide INTRODUCE2 cell. This extension contains:

struct wide_intro2_ok { u16 max_len; }

We allocate a new extension for use in the ESTABLISH_RENDEZVOUS cell, to indicate acceptance of wide RENDEZVOUS2 cells. This extension contains:

struct wide_rend2_ok { u16 max_len; }

(Note that ESTABLISH_RENDEZVOUS cells do not currently have an extension mechanism. They should be extended to use the same extension format as ESTABLISH_INTRO cells, with extensions placed after the rendezvous cookie.)

Handling RELAY_EARLY

The first fragment of each EXTEND cell should be tagged with RELAY_EARLY. The remaining fragments should not. Relays should accept EXTEND cells if and only if their first fragment is tagged with RELAY_EARLY.

Rationale: We could allow any fragment to be tagged, but that would give hostile guards an opportunity to move RELAY_EARLY tags around and build a covert channel. But if we later move to a relay encryption method that lets us authenticate RELAY_EARLY, we could then require only that any fragment has RELAY_EARLY set.

Compatibility

This proposal will require the allocation of a new 'Relay' protocol version, to indicate understanding of the RELAY_FRAGMENT command.

Filename: 320-tap-out-again.md
Title: Removing TAP usage from v2 onion services
Author: Nick Mathewson
Created: 11 May 2020
Status: Rejected

NOTE: we rejected this proposal in favor of simply deprecating v2 onion services entirely.

(This proposal is part of the Walking Onions spec project. It updates proposal 245.)

Removing TAP from v2 onion services

As we implement walking onions, we're faced with a problem: what to do with TAP keys? They are bulky and insecure, and not used for anything besides v2 onion services. Keeping them in SNIPs would consume bandwidth, and keeping them indirectly would consume complexity. It would be nicer to remove TAP keys entirely.

But although v2 onion services are obsolescent and their cryptographic parameters are disturbing, we do not want to drop support for them as part of the Walking Onions migration. If we did so, then we would force some users to choose between Walking Onions and v2 onion services, which we do not want to do.

Instead, we describe here a phased plan to replace TAP in v2 onion services with ntor. This change improves the forward secrecy of some of the session keys used with v2 onion services, but does not improve their authentication, which is strongly tied to truncated SHA1 hashes of RSA1024 keys.

Implementing this change is more complex than similar changes elsewhere in the Tor protocol, since we do not want clients or services to leak whether they have support for this proposal, until support is widespread enough that revealing it is no longer a privacy risk.

We define these entries that may appear in v2 onion service descriptors, once per introduction point.

    "identity-ed25519"
    "ntor-onion-key"

        [at most once each per intro point.]

        These values are in the same format as and follow the same rules as their equivalents in router descriptors.

    "link-specifiers"

        [at most once per introduction point]

        This value is the same as the link specifiers in a v3 onion service descriptor, and follows the same rules.

Services should not include any of these fields unless a new network parameter, "hsv2-intro-updated" is set to 1. Clients should not parse these fields or use them unless "hsv2-use-intro-updated" is set to 1.

We define a new field that can be used for hsv2 descriptors with walking onions:

    "snip"

        [at most once]

        This value is the same as the snip field introduced to a v3 onion service descriptor by proposal (XXX) and follows the same rules.

Services should not include this field unless a new network parameter, "hsv2-intro-snip" is set to 1. Clients should not parse this field or use it unless the parameter "hsv2-use-intro-snip" is set to 1.

Additionally, relays SHOULD omit the following legacy intro point parameters when a new network parameter, "hsv2-intro-legacy" is set to 0: "ip-address", "onion-port", and "onion-key". Clients should treat them as optional when "hsv2-tolerate-no-legacy" is set to 1.

INTRODUCE cells, RENDEZVOUS cells, and ntor.

We allow clients to specify the rendezvous point's ntor key in the INTRODUCE2 cell instead of the TAP key. To do this, the client simply sets KLEN to 32, and includes the ntor key for the relay.

Clients should only use ntor keys in this way if the network parameter "hsv2-client-rend-ntor" is set to 1, and if the entry "allow-rend-ntor" is present in the onion service descriptor.

Services should only advertise "allow-rend-ntor" in this way if the network parameter "hsv2-service-rend-ntor" is set to 1.

Migration steps

First, we implement all of the above, but set it to be disabled by default. We use torrc fields to selectively enable them for testing purposes, to make sure they work.

Once all non-LTS versions of Tor without support for this proposal are obsolete, we can safely enable "hsv2-client-rend-ntor", "hsv2-service-rend-ntor", "hsv2-intro-updated", and "hsv2-use-intro-updated".

Once all non-LTS versions of Tor without support for walking onions are obsolete, we can safely enable "hsv2-intro-snip", "hsv2-use-intro-snip", and "hsv2-tolerate-no-legacy".

Once all non-LTS versions of Tor without support for both of the above implementations are obsolete, we can finally set "hsv2-intro-legacy" to 0.

Future work

There is a final TAP-like protocol used for v2 hidden services: the client uses RSA1024 and DH1024 to send information about the rendezvous point and to start negotiating the session key to be used for end-to-end encryption.

In theory we could get a benefit to forward secrecy by using ntor instead of TAP here, but we would get no corresponding benefit for authentication, since authentication is still ultimately tied to HSv2's scary RSA1024-plus-truncated-SHA1 combination.

Given that, it might be just as good to allow the client to insert a curve25519 key in place of their DH1024 key, and use that for the DH handshake instead. That would be a separate proposal, though: this proposal is enough to allow all relays to drop TAP support.

Filename: 321-happy-families.md
Title: Better performance and usability for the MyFamily option (v2)
Author: Nick Mathewson
Created: 27 May 2020
Status: Accepted

Problem statement.

The current family mechanism allows well-behaved relays to identify that they all belong to the same 'family', and should not be used in the same circuits.

Right now, families work by having every family member list every other family member in its server descriptor. This winds up using O(n^2) space in microdescriptors and server descriptors. (For RAM, we can de-duplicate families, which sometimes helps.) Adding or removing a server from the family requires all the other servers to change their torrc settings.

This growth in size is not just a theoretical problem. Family declarations currently make up a little over 55% of the microdescriptors in the directory--around 24% after compression. The largest family has around 270 members. With Walking Onions, 270 members times a 160-bit hashed identifier leads to over 5 kilobytes per SNIP, which is much greater than we'd want to use.

This is an updated version of proposal 242. It differs by clarifying requirements and providing a more detailed migration plan.

Design overview.

In this design, every family has a master ed25519 "family key". A node is in the family iff its server descriptor includes a certificate of its ed25519 identity key with the family key. The certificate format is the one in the tor-certs.txt spec; we would allocate a new certificate type for this usage. These certificates would need to include the signing key in the appropriate extension.

Note that because server descriptors are signed with the node's ed25519 signing key, this creates a bidirectional relationship between the two keys, so that nodes can't be put in families without their consent.

Changes to router descriptors

We add a new entry to server descriptors:

    "family-cert" NL
    "-----BEGIN FAMILY CERT-----" NL
    cert
    "-----END FAMILY CERT-----".

This entry contains a base64-encoded certificate as described above. It may appear any number of times; authorities MAY reject descriptors that include it more than three times.

Changes to microdescriptors

We add a new entry to microdescriptors: family-keys.

This line contains one or more space-separated strings describing families to which the node belongs. These strings MUST be sorted in lexicographic order. These strings MAY be base64-formatted nonpadded ed25519 family keys, or may represent some future encoding.

Clients SHOULD accept unrecognized key formats.

Changes to voting algorithm

We allocate a new consensus method number for voting on these keys.

When generating microdescriptors using a suitable consensus method, the authorities include a "family-keys" line if the underlying server descriptor contains any valid family-cert lines. For each valid family-cert in the server descriptor, they add a base64-encoded string of that family-cert's signing key.
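
As a rough sketch of how an authority might build this line, assuming the keyword is followed by a single space and that duplicate keys are collapsed (neither detail is stated above), consider the hypothetical helper `family_keys_line`. Certificate validation is assumed to have happened already; the input is the list of raw 32-byte signing keys taken from the valid family-cert entries.

```python
import base64


def family_keys_line(signing_keys: list[bytes]) -> str:
    """Build a "family-keys" microdescriptor line from raw ed25519 keys.

    Returns an empty string when there are no valid family certs, since the
    line is only included when at least one is present.
    """
    if not signing_keys:
        return ""
    encoded = {
        base64.b64encode(k).decode("ascii").rstrip("=")  # unpadded base64
        for k in signing_keys
    }
    # The entries must appear in lexicographic order.
    return "family-keys " + " ".join(sorted(encoded))
```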

See also "deriving family lines from family-keys?" below for an interesting but more difficult extension mechanism that I would not recommend.

Relay configuration

There are several ways that we could configure relays to let them include family certificates in their descriptors.

The easiest would be putting the private family key on each relay, so that the relays could generate their own certificates. This is easy to configure, but slightly risky: if the private key is compromised on any relay, anybody can claim membership in the family. That isn't so very bad, however -- all the relays would need to do in this event would be to move to a new private family key.

A more orthodox method would be to keep the private key somewhere offline, and use it to generate a certificate for each relay in the family as needed. These certificates should be made with long-enough lifetimes, and relays should warn when they are going to expire soon.

Changes to relay behavior

Each relay should track which other relays it has seen using the same family-key as itself. When generating a router descriptor, each relay should list all of these relays on the legacy 'family' line. This keeps the "family" lines up-to-date with "family-keys" lines for compliant relays.

Relays should continue listing relays in their family lines if they have seen a relay with that identity using the same family-key at any time in the last 7 days.

The presence of this line should be configured by a network parameter, derive-family-line.

Relays whose family lines do not stay at least mostly in sync with their family keys should be marked invalid by the authorities.

Client behavior

Clients should treat node A and node B as belonging to the same family if ANY of these is true:

  • The client has descriptors for A and B, and A's descriptor lists B in its family line, and B's descriptor lists A in its family line.

  • The client has descriptors for A and B, and they both contain the same entry in their family-keys or family-cert. (Note that a family-cert key may match a base64-encoded entry in the family-keys entry.)
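
The two rules above can be sketched as a single predicate. The `NodeInfo` type is a hypothetical simplification: it assumes that the client has already normalized family-keys entries and family-cert signing keys into one comparable set of strings per node.

```python
from dataclasses import dataclass, field


@dataclass
class NodeInfo:
    """Hypothetical, simplified view of what a client knows about a relay."""
    identity: str
    family: set[str] = field(default_factory=set)       # legacy family line
    family_keys: set[str] = field(default_factory=set)  # family-keys/certs


def same_family(a: NodeInfo, b: NodeInfo) -> bool:
    """Return True if either family rule puts a and b in the same family."""
    # Rule 1: both legacy family lines must list each other.
    mutual = b.identity in a.family and a.identity in b.family
    # Rule 2: any shared family key is enough.
    shared_key = bool(a.family_keys & b.family_keys)
    return mutual or shared_key
```

Note the asymmetry between the rules: a one-sided legacy listing is not enough, whereas a shared family key already proves consent from both sides.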

Migration

For some time, existing relays and clients will not support family certificates. Because of this, the relay behavior described above tries to make sure that well-behaved relays list the same entries in both places.

Once enough clients have migrated to using family certificates, authorities SHOULD disable derive-family-line.

Security

Listing families remains as voluntary in this design as in today's Tor, though bad-relay hunters can continue to look for families that have not adopted a family key.

A hostile relay family could list a "family" line that did not match its "family-certs" values. However, the only reason to do so would be in order to launch a client partitioning attack, which is probably less valuable than the kinds of attacks that they could run by simply not listing families at all.

Appendix: deriving family lines from family-keys?

As an alternative, we might declare that authorities should keep family lines in sync with family-certs. Here is a design sketch of how we might do that, but I don't think it's actually a good idea, since it would require major changes to the data flow of the voting system.

In this design, authorities would include a "family-keys" line in each router section in their votes corresponding to a relay with any family-cert. When generating final microdescriptors using this method, the authorities would use these lines to add entries to the microdescriptors' family lines:

  1. For every relay appearing in a routerstatus's family-keys, the authorities calculate a consensus family-keys value by including all those keys that are listed by a majority of those voters listing the same router with the same descriptor. (This is the algorithm we use for voting on other values derived from the descriptor.)

  2. The authorities then compute a set of "expanded families": one for each family key. Each "expanded family" is a set containing every router in the consensus associated with that key in its consensus family-keys value.

  3. The authorities discard all "expanded families" of size 1 or smaller.

  4. Every router listed for the "expanded family" has every other router added to the "family" line in its microdescriptor. (The "family" line is then re-canonicalized according to the rules of proposal 298 to remove its )

  5. Note that the final microdescriptor consensus will include the digest of the derived microdescriptor in step 4, rather than the digest of the microdescriptor listed in the original votes. (This calculation is deterministic.)
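
Steps 2 through 4 of this sketch amount to grouping routers by key and then cross-listing the members of each group. A minimal illustration follows, assuming step 1 has already produced a mapping from router identity to its consensus family-keys value. The function name `derive_family_lines` is hypothetical, and the re-canonicalization of step 4 is not shown.

```python
from collections import defaultdict


def derive_family_lines(
    consensus_family_keys: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Map each router to the routers added to its "family" line.

    `consensus_family_keys` maps a router identity to its consensus
    family-keys value (the output of step 1).
    """
    # Step 2: one "expanded family" per family key.
    expanded = defaultdict(set)
    for router, keys in consensus_family_keys.items():
        for key in keys:
            expanded[key].add(router)
    additions = {router: set() for router in consensus_family_keys}
    for members in expanded.values():
        # Step 3: discard families of size 1 or smaller.
        if len(members) <= 1:
            continue
        # Step 4: every member lists every other member.
        for router in members:
            additions[router] |= members - {router}
    return additions
```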

The problem with this approach is that authorities would have to fetch microdescriptors they do not have in order to replace their family lines. Currently, voting never requires an authority to fetch a microdescriptor from another authority. If we implement vote compression and diffs as in the Walking Onions proposal, however, we might suppose that votes could include microdescriptors directly.

Still, this is likely more complexity than we want for a transition mechanism.

Appendix: Deriving family-keys from families??

We might also imagine that authorities could infer which families exist from the graph of family relationships, and then include synthetic "family-keys" entries for routers that belong to the same family.

This has two challenges: first, to compute these synthetic family keys, the authorities would need to have the same graph of family relationships to begin with, which once again would require them to include the complete list of families in their votes.

Secondly, finding all the families is equivalent to finding all maximal cliques in a graph. This problem is NP-hard in its general case. Although polynomial solutions exist for nice well-behaved graphs, we'd still need to worry about hostile relays including strange family relationships in order to drive the algorithm into its exponential cases.

Appendix: New assigned values

We need a new assigned value for the certificate type used for family signing keys.

We need a new consensus method for placing family-keys lines in microdescriptors.

Appendix: New network parameters

  • derive-family-line: If 1, relays should derive family lines from observed family-keys. If 0, they do not. Min: 0, Max: 1. Default: 1.

Filename: 322-dirport-linkspec.md
Title: Extending link specifiers to include the directory port
Author: Nick Mathewson
Created: 27 May 2020
Status: Open

Motivation

Directory ports remain the only way to contact a (non-bridge) Tor relay that isn't expressible as a Link Specifier. We haven't specified a link specifier of this kind so far, since it isn't a way to contact a relay to create a channel.

But authorities still expose directory ports, and encourage relays to use them preferentially for uploading and downloading. And with Walking Onions, it would be convenient to try to make every kind of "address" a link specifier -- we'd like authorities to be able to specify a list of link specifiers that can be used to contact them for uploads and downloads.

It is possible that after revision, Walking Onions won't need a way to specify this information. If so, this proposal should be moved to "Reserve" status as generally unuseful.

Proposal

We reserve a new link specifier type "dir-url", for use with the directory system. This is a variable-length link specifier, containing a URL prefix. The only currently supported URL scheme is "http://". Implementations SHOULD ignore unrecognized schemes. IPv4 and IPv6 addresses MAY be used directly; hostnames are also allowed. Implementations MAY ignore hostnames and only use raw addresses.

The URL prefix includes everything through the string "tor" in the directory hierarchy.

A dir-url link specifier SHOULD NOT appear in an EXTEND cell; implementations SHOULD reject them if they do appear.

Filename: 323-walking-onions-full.md
Title: Specification for Walking Onions
Author: Nick Mathewson
Created: 3 June 2020
Status: Open

Introduction: A Specification for Walking Onions

In Proposal 300, I introduced Walking Onions, a design for scaling Tor and simplifying clients, by removing the requirement that every client know about every relay on the network.

This proposal will elaborate on the original Walking Onions idea, and should provide enough detail to allow multiple compatible implementations. In this introduction, I'll start by summarizing the key ideas of Walking Onions, and then outline how the rest of this proposal will be structured.

Remind me about Walking Onions again?

With Tor's current design, every client downloads and refreshes a set of directory documents that describe the directory authorities' views about every single relay on the Tor network. This requirement makes directory bandwidth usage grow quadratically, since the directory size grows linearly with the number of relays, and it is downloaded a number of times that grows linearly with the number of clients. Additionally, low-bandwidth clients and bootstrapping clients spend a disproportionate amount of their bandwidth loading directory information.

With these drawbacks, why does Tor still require clients to download a directory? It does so in order to prevent attacks that would be possible if clients let somebody else choose their paths through the network, or if each client chose its paths from a different subset of relays.

Walking Onions is a design that resists these attacks without requiring clients ever to have a complete view of the network.

You can think of the Walking Onions design like this: Imagine that with the current Tor design, the client covers a wall with little pieces of paper, each representing a relay, and then throws a dart at the wall to pick a relay. Low-bandwidth relays get small pieces of paper; high-bandwidth relays get large pieces of paper. With the Walking Onions design, however, the client throws its dart at a blank wall, notes the position of the dart, and asks for the relay whose paper would be at that position on a "standard wall". These "standard walls" are mapped out by directory authorities in advance, and are authenticated in such a way that the client can receive a proof of a relay's position on the wall without actually having to know the whole wall.

Because the client itself picks the position on the wall, and because the authorities must vote together to build a set of "standard walls", nobody else controls the client's path through the network, and all clients can choose their paths in the same way. But since clients only probe one position on the wall at a time, they don't need to download a complete directory.

(Note that there has to be more than one wall at a time: the client throws darts at one wall to pick guards, another wall to pick middle relays, and so on.)
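
The dart-and-wall picture can be made concrete with a small sketch. It is an illustration only: the 32-bit position space, the ordering of relays by name, and the function names are all assumptions, and the authentication that lets a client check a relay's position without seeing the whole wall is omitted entirely.

```python
import bisect
import secrets

INDEX_SPACE = 2**32  # assumption: positions are 32-bit integers


def build_index(weights: dict[str, int]) -> tuple[list[int], list[str]]:
    """Lay relays out on the "wall" in proportion to their weight.

    Returns (upper_bounds, relays): relay i owns positions up to and including
    upper_bounds[i], starting just above the previous relay's bound.
    """
    total = sum(weights.values())
    bounds, relays, acc = [], [], 0
    for relay in sorted(weights):  # every party must use the same order
        acc += weights[relay]
        bounds.append(acc * INDEX_SPACE // total - 1)
        relays.append(relay)
    return bounds, relays


def lookup(index: tuple[list[int], list[str]], position: int) -> str:
    """Return the relay whose range contains `position`."""
    bounds, relays = index
    return relays[bisect.bisect_left(bounds, position)]


def throw_dart() -> int:
    """The client picks the position itself, uniformly at random."""
    return secrets.randbelow(INDEX_SPACE)
```

Here `build_index` plays the role of the authorities mapping out a standard wall, `throw_dart` is the only step the client performs locally, and `lookup` is what a relay holding the full wall does on the client's behalf.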

In Walking Onions, we call a collection of standard walls an "ENDIVE" (Efficient Network Directory with Individually Verifiable Entries). We call each of the individual walls a "routing index", and we call each of the little pieces of paper describing a relay and its position within the routing index a "SNIP" (Separable Network Index Proof).

For more details about the key ideas behind Walking Onions, see proposal 300. For more detailed analysis and discussion, see "Walking Onions: Scaling Anonymity Networks while Protecting Users" by Komlo, Mathewson, and Goldberg.

The rest of this document

This proposal is unusually long, since Walking Onions touches on many aspects of Tor's functionality. It requires changes to voting, directory formats, directory operations, circuit building, path selection, client operations, and more. These changes are described in the sections listed below.

Here in section 1, we briefly reintroduce Walking Onions, and talk about the rest of this proposal.

Section 2 will describe the formats for ENDIVEs, SNIPs, and related documents.

Section 3 will describe new behavior for directory authorities as they vote on and produce ENDIVEs.

Section 4 describes how relays fetch and reconstruct ENDIVEs from the directory authorities.

Section 5 has the necessary changes to Tor's circuit extension protocol so that clients can extend to relays by index position.

Section 6 describes new behaviors for clients as they use Walking Onions, to retain existing Tor functionality for circuit construction.

Section 7 explains how to implement onion services using Walking Onions.

Section 8 describes small alterations in client and relay behavior to strengthen clients against some kinds of attacks based on relays picking among multiple ENDIVEs, while still making the voting system robust against transient authority failures.

Section 9 closes with a discussion of how to migrate from the existing Tor design to the new system proposed here.

Appendices

Additionally, this proposal has several appendices:

Appendix A defines commonly used terms.

Appendix B provides definitions for CDDL grammar productions that are used elsewhere in the documents.

Appendix C lists the new elements in the protocol that will require assigned values.

Appendix D lists new network parameters that authorities must vote on.

Appendix E gives a sorting algorithm for a subset of the CBOR object representation.

Appendix F gives an example set of possible "voting rules" that authorities could use to produce an ENDIVE.

Appendix G lists the different routing indices that will be required in a Walking Onions deployment.

Appendix H discusses partitioning TCP ports into a small number of subsets, so that relays' exit policies can be represented only as the group of ports that they support.

Appendix Z closes with acknowledgments.

The following proposals are not part of the Walking Onions proposal, but they were written at the same time, and are either helpful or necessary for its implementation.

318-limit-protovers.md restricts the allowed version numbers for each subprotocol to the range 0..63.

319-wide-everything.md gives a general mechanism for splitting relay commands across more than one cell.

320-tap-out-again.md attempts to remove the need for TAP keys in the HSv2 protocol.

321-happy-families.md lets families be represented with a single identifier, rather than a long list of keys.

322-dirport-linkspec.md allows a directory port to be represented with a link specifier.

Document Formats: ENDIVEs and SNIPs

Here we specify a pair of related document formats that we will use for specifying SNIPs and ENDIVEs.

Recall from proposal 300 that a SNIP is a set of information about a single relay, plus proof from the directory authorities that the given relay occupies a given range in a certain routing index. For example, we can imagine that a SNIP might say:

  • Relay X has the following IP, port, and onion key.
  • In the routing index Y, it occupies index positions 0x20002 through 0x23000.
  • This SNIP is valid on 2020-12-09 00:00:00, for one hour.
  • Here is a signature of all the above text, using a threshold signature algorithm.

You can think of a SNIP as a signed combination of a routerstatus and a microdescriptor... together with a little bit of the randomized routing table from Tor's current path selection code, all wrapped in a signature.
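
As a mental model only, the example SNIP above could be pictured as the following in-memory record. The field names are illustrative and are not the wire format specified later in this document; the UTC timezone is an assumption, and the signature is treated as an opaque value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Snip:
    """Hypothetical in-memory view of a SNIP; field names are illustrative."""
    relay: str
    routing_index: str
    index_lo: int          # first index position covered (inclusive)
    index_hi: int          # last index position covered (inclusive)
    valid_after: datetime
    lifetime: timedelta
    signature: bytes       # opaque here; verification is out of scope

    def covers(self, routing_index: str, position: int) -> bool:
        """Does this SNIP answer a request for `position` in that index?"""
        return (routing_index == self.routing_index
                and self.index_lo <= position <= self.index_hi)

    def is_live(self, now: datetime) -> bool:
        """Is `now` inside the SNIP's validity interval?"""
        return self.valid_after <= now < self.valid_after + self.lifetime


# The example from the text: relay X, routing index Y, positions
# 0x20002 through 0x23000, valid on 2020-12-09 00:00:00 for one hour.
example = Snip("X", "Y", 0x20002, 0x23000,
               datetime(2020, 12, 9, tzinfo=timezone.utc),
               timedelta(hours=1), b"")
```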

Every relay keeps a set of SNIPs, and serves them to clients when the client is extending by a routing index position.

An ENDIVE is a complete set of SNIPs. Relays download ENDIVEs, or diffs between ENDIVEs, once every voting period. We'll accept some complexity in order to make these diffs small, even though some of the information in them (particularly SNIP signatures and index ranges) will tend to change with every period.

Preliminaries and scope

Goals for our formats

We want SNIPs to be small, since they need to be sent on the wire one at a time, and won't get much benefit from compression. (To avoid a side-channel, we want CREATED cells to all be the same size, which means we need to pad up to the largest size possible for a SNIP.)

We want to place as few requirements on clients as possible, and we want to preserve forward compatibility.

We want ENDIVEs to be compressible, and small. We want successive ENDIVEs to be textually similar, so that we can use diffs to transmit only the parts that change.

We should preserve our policy of requiring only loose time synchronization between clients and relays, and allow even looser synchronization when possible. Where possible, we'll make the permitted skew explicit in the protocol: for example, rather than saying "you can accept a document 10 minutes before it is valid", we will just make the validity interval start 10 minutes earlier.

Notes on Metaformat

In the format descriptions below, we will describe a set of documents in the CBOR metaformat, as specified in RFC 7049. If you're not familiar with CBOR, you can think of it as a simple binary version of JSON, optimized first for simplicity of implementation and second for space.

I've chosen CBOR because it's schema-free (you can parse it without knowing what it is), terse, dumpable as text, extensible, standardized, and very easy to parse and encode.

We will choose to represent many size-critical types as maps whose keys are short integers: this is slightly shorter in its encoding than string-based dictionaries. In some cases, we make types even shorter by using arrays rather than maps, but only when we are confident we will not have to make changes to the number of elements in the future.
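
To make the size argument concrete, here is a tiny hand-written CBOR encoder covering only non-negative integers, text strings, arrays, and maps. It is a sketch for comparing encoded sizes, not a complete or canonical encoder (map keys are emitted in insertion order), and the field names "port" and "weight" are made up for the example.

```python
def _head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte plus argument for major type `major`."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("integer too large")


def encode(obj) -> bytes:
    """Encode non-negative ints, text strings, lists, and dicts as CBOR."""
    if isinstance(obj, int) and not isinstance(obj, bool) and obj >= 0:
        return _head(0, obj)                      # major type 0: unsigned int
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw           # major type 3: text string
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(encode(x) for x in obj)
    if isinstance(obj, dict):
        return _head(5, len(obj)) + b"".join(
            encode(k) + encode(v) for k, v in obj.items())
    raise TypeError(f"unsupported type: {type(obj).__name__}")


by_name = {"port": 9001, "weight": 100}   # string-keyed map
by_int = {0: 9001, 1: 100}                # small-integer-keyed map
as_list = [9001, 100]                     # fixed-position array
sizes = (len(encode(by_name)), len(encode(by_int)), len(encode(as_list)))
```

For these two fields, the three encodings take 18, 8, and 6 bytes respectively, which is the trade-off described above: integer keys keep extensibility at a small cost, while arrays save a little more but fix the number of elements.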

We'll use CDDL (defined in RFC 8610) to describe the data in a way that can be validated -- and hopefully, in a way that will make it comprehensible. (The state of CDDL tooling is a bit lacking at the moment, so my CDDL validation will likely be imperfect.)

We make the following restrictions to CBOR documents that Tor implementations will generate:

  • No floating-point values are permitted.

  • No tags are allowed unless otherwise specified.

  • All items must follow the rules of RFC 7049 section 3.9 for canonical encoding, unless otherwise specified.

Implementations SHOULD accept and parse documents that are not generated according to these rules, for future extensibility. However, implementations SHOULD reject documents that are not "well-formed" and "valid" by the definitions of RFC 7049.

Design overview: signing documents

We try to use a single document-signing approach here, using a hash function parameterized to accommodate lifespan information and an optional nonce.

All the signed CBOR data used in this format is represented as a binary string, so that CBOR-processing tools are less likely to re-encode or transform it. We denote this below with the CDDL syntax bstr .cbor Object, which means "a binary string that must hold a valid encoding of a CBOR object whose type is Object".

Design overview: SNIP Authentication

I'm going to specify a flexible authentication format for SNIPs that can handle threshold signatures, multisignatures, and Merkle trees. This will give us flexibility in our choice of authentication mechanism over time.

  • If we use Merkle trees, we can make ENDIVE diffs much much smaller, and save a bunch of authority CPU -- at the expense of requiring slightly larger SNIPs.

  • If Merkle tree root signatures are in SNIPs, SNIPs get a bit larger, but they can be used by clients that do not have the latest signed Merkle tree root.

  • If we use threshold signatures, we need to depend on not-yet-quite-standardized algorithms. If we use multisignatures, then either SNIPs get bigger, or we need to put the signed Merkle tree roots into a consensus document.

Of course, flexibility in signature formats is risky, since the more code paths there are, the more opportunities there are for nasty bugs. With this in mind, I'm structuring our authentication so that there should (to the extent possible) be only a single validation path for different uses.

Accordingly, our format is structured so that "not using a Merkle tree" is considered, from the client's point of view, the same as "using a Merkle tree of depth 1".

The authentication on a single SNIP is structured, in the abstract, as:

  • ITEM: The item to be authenticated.
  • PATH: A string of N bits, representing a path through a Merkle tree from its root, where 0 indicates a left branch and 1 indicates a right branch. (Note that in a left-leaning tree, the 0th leaf will have path 000..0, the 1st leaf will have path 000..1, and so on.)
  • BRANCH: A list of N digests, representing the digests for the branches in the Merkle tree that we are not taking.
  • SIG: A generalized signature (either a threshold signature or a multisignature) of a top-level digest.
  • NONCE: an optional nonce for use with the hash functions.

Note that PATH here is a bitstring, not an integer! "0001" and "01" are different paths, and "" is a valid path, indicating the root of the tree.

We assume two hash functions here: H_leaf() to be used with leaf items, and H_node() to be used with intermediate nodes. These functions are parameterized with a path through the tree, with the lifespan of the object to be signed, and with a nonce.

To validate the authentication on a SNIP, the client proceeds as follows:

Algorithm: Validating SNIP authentication

    Let N = the length of PATH, in bits.

    Let H = H_leaf(PATH, LIFESPAN, NONCE, ITEM).

    While N > 0:
       Remove the last bit of PATH; call it P.
       Remove the last digest of BRANCH; call it B.

       If P is zero:
           Let H = H_node(PATH, LIFESPAN, NONCE, H, B)
       else:
           Let H = H_node(PATH, LIFESPAN, NONCE, B, H)

       Let N = N - 1

    Check whether SIG is a correct (multi)signature over H with the
    correct key(s).
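The validation loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: PATH is modeled as a string of "0" and "1" characters (so that "0001" and "01" stay distinct), the hash functions are passed in as parameters, and toy_leaf and toy_node are SHA-256 stand-ins invented for the demo rather than the real H_leaf() and H_node(). The final signature check over the returned digest is left to the caller.

```python
import hashlib

def merkle_root(item, path, branch, lifespan, nonce, h_leaf, h_node):
    """Recompute the top-level digest that SIG is expected to cover."""
    if set(path) - {"0", "1"}:
        raise ValueError("PATH must contain only 0 and 1")
    if len(branch) != len(path):
        raise ValueError("BRANCH needs one digest per bit of PATH")
    remaining = list(branch)
    h = h_leaf(path, lifespan, nonce, item)
    while path:
        path, p = path[:-1], path[-1]   # remove the last bit of PATH
        b = remaining.pop()             # remove the last digest of BRANCH
        if p == "0":
            h = h_node(path, lifespan, nonce, h, b)
        else:
            h = h_node(path, lifespan, nonce, b, h)
    return h

# Stand-in hash functions for the demo only (NOT the proposal's digests).
def _toy(tag, path, lifespan, nonce, *parts):
    h = hashlib.sha256()
    for field in (tag, path.encode(), repr(lifespan).encode(), nonce, *parts):
        h.update(len(field).to_bytes(4, "big") + field)
    return h.digest()

def toy_leaf(path, lifespan, nonce, item):
    return _toy(b"leaf", path, lifespan, nonce, item)

def toy_node(path, lifespan, nonce, left, right):
    return _toy(b"node", path, lifespan, nonce, left, right)

# Build a four-leaf tree by hand, then validate the leaf at path "01".
LIFESPAN = (1_600_000_000, 600, 7200)  # (published, pre, post), made up
NONCE = b""
items = [b"snip-0", b"snip-1", b"snip-2", b"snip-3"]
leaves = {p: toy_leaf(p, LIFESPAN, NONCE, it)
          for p, it in zip(("00", "01", "10", "11"), items)}
node0 = toy_node("0", LIFESPAN, NONCE, leaves["00"], leaves["01"])
node1 = toy_node("1", LIFESPAN, NONCE, leaves["10"], leaves["11"])
root = toy_node("", LIFESPAN, NONCE, node0, node1)

# BRANCH is ordered from the root downward, so its last digest is the
# sibling of the leaf itself.
computed = merkle_root(items[1], "01", [node1, leaves["00"]],
                       LIFESPAN, NONCE, toy_leaf, toy_node)
print(computed == root)
```

With an empty PATH and an empty BRANCH the loop never runs, and the function returns the leaf digest directly, which is the "no Merkle tree" case.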

Parameterization on this structure is up to the authorities. If N is zero, then we are not using a Merkle tree. The generalized signature SIG can either be given as part of the SNIP, or as part of a consensus document. I expect that in practice, we will converge on a single set of parameters here quickly (I'm favoring BLS signatures and a Merkle tree), but using this format will give clients the flexibility to handle other variations in the future.

For our definition of H_leaf() and H_node(), see "Digests and parameters" below.

Design overview: timestamps and validity.

For future-proofing, SNIPs and ENDIVEs have separate time ranges indicating when they are valid. Unlike with current designs, these validity ranges should take clock skew into account, and should not require clients or relays to deliberately add extra tolerance to their processing. (For example, instead of saying that a document is "fresh" for three hours and then telling clients to accept documents for 24 hours before they are valid and 24 hours after they are expired, we will simply make the documents valid for 51 hours.)

We give each lifespan as a (PUBLISHED, PRE, POST) triple, such that objects are valid from (PUBLISHED - PRE) through (PUBLISHED + POST). (The "PUBLISHED" time is provided so that we can more reliably tell which of two objects is more recent.)
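As a small sketch of how a client might apply such a triple, the snippet below checks validity with no extra skew tolerance and compares two objects by their PUBLISHED time. The class and function names are illustrative only, the timestamp is made up, and I am reading "through" as inclusive on both ends.

```python
from typing import NamedTuple

HOUR = 3600

class Lifespan(NamedTuple):
    published: int   # seconds since the epoch
    pre_valid: int   # seconds before `published` that we accept
    post_valid: int  # seconds after `published` that we accept

    def first_valid(self) -> int:
        return self.published - self.pre_valid

    def last_valid(self) -> int:
        return self.published + self.post_valid

    def is_valid_at(self, now: int) -> bool:
        # No additional tolerance is added: skew is already built in.
        return self.first_valid() <= now <= self.last_valid()

def more_recent(a: Lifespan, b: Lifespan) -> Lifespan:
    """Pick the newer of two objects by publication time."""
    return a if a.published >= b.published else b

# The 51-hour example: three "fresh" hours plus 24 hours on each side.
published = 1_700_000_000
doc = Lifespan(published, pre_valid=24 * HOUR, post_valid=27 * HOUR)
print((doc.last_valid() - doc.first_valid()) // HOUR)  # 51
```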

Later (see section 08), we'll explain measures to ensure that hostile relays do not take advantage of multiple overlapping SNIP lifetimes to attack clients.

Design overview: how the formats work together

Authorities, as part of their current voting process, will produce an ENDIVE.

Relays will download this ENDIVE (either directly or as a diff), validate it, and extract SNIPs from it. Extracting these SNIPs may be trivial (if they are signed individually), or more complex (if they are signed via a Merkle tree, and the Merkle tree needs to be reconstructed). This complexity is acceptable only to the extent that it reduces compressed diff size.

Once the SNIPs are reconstructed, relays will hold them and serve them to clients.

What isn't in this section

This section doesn't tell you what the different routing indices are or mean. For now, we can imagine there being one routing index for guards, one for middles, one for exits, and one for each hidden service directory ring. (See section 06 for more on regular indices, and section 07 for more on onion services.)

This section doesn't give an algorithm for computing ENDIVEs from votes, and doesn't give an algorithm for extracting SNIPs from an ENDIVE. Those come later. (See sections 03 and 04 respectively.)

SNIPs

Each SNIP has three pieces: the part that describes the router, the part that describes the SNIP's place within an ENDIVE, and the part that authenticates the whole SNIP.

Why two separate authenticated pieces? Because one (the router description) is taken verbatim from the ENDIVE, and the other (the location within the ENDIVE) is computed from the ENDIVE by the relays. Separating them like this helps ensure that the part generated by the relay and the part generated by the authorities can't interfere with each other.

    ; A SNIP, as it is sent from the relay to the client.  Note that
    ; this is represented as a three-element array.
    SNIP = [
        ; First comes the signature.  This is computed over
        ; the concatenation of the two bstr objects below.
        auth: SNIPSignature,

        ; Next comes the location of the SNIP within the ENDIVE.
        index: bstr .cbor SNIPLocation,

        ; Finally comes the information about the router.
        router: bstr .cbor SNIPRouterData,
    ]

(Computing the signature over a concatenation of objects is safe, since the objects' content is self-describing CBOR, and isn't vulnerable to framing issues.)

SNIPRouterData: information about a single router.

Here we talk about the type that tells a client about a single router. For cases where we are just storing information about a router (for example, when using it as a guard), we can remember this part, and discard the other pieces.

The only required parts here are those that identify the router and tell the client how to build a circuit through it. The others are all optional. In practice, I expect they will be encoded in most cases, but clients MUST behave properly if they are absent.

More than one SNIPRouterData may exist in the same ENDIVE for a single router. For example, there might be a longer version to represent a router to be used as a guard, and another to represent the same router when used as a hidden service directory. (This is not possible in the voting mechanism that I'm working on, but relays and clients MUST NOT treat this as an error.)

This representation is based on the routerstatus and microdescriptor entries of today, but tries to omit a number of obsolete fields, including RSA identity fingerprint, TAP key, published time, etc.

    ; A SNIPRouterData is a map from integer keys to values for
    ; those keys.
    SNIPRouterData = {
        ; identity key.
        ? 0 => Ed25519PublicKey,

        ; ntor onion key.
        ? 1 => Curve25519PublicKey,

        ; list of link specifiers other than the identity key.
        ; If a client wants to extend to the same router later on,
        ; they SHOULD include all of these link specifiers verbatim,
        ; whether they recognize them or not.
        ? 2 => [ LinkSpecifier ],

        ; The software that this relay says it is running.
        ? 3 => SoftwareDescription,

        ; protovers.
        ? 4 => ProtoVersions,

        ; Family.  See below for notes on dual encoding.
        ? 5 => [ * FamilyId ],

        ; Country Code
        ? 6 => Country,

        ; Exit policies describing supported port _classes_.  Absent exit
        ; policies are treated as "deny all".
        ? 7 => ExitPolicy,

        ; NOTE: Properly speaking, there should be a CDDL 'cut'
        ; here, to indicate that the rules below should only match
        ; if one of the previous rules hasn't matched.
        ; Unfortunately, my CDDL tool doesn't seem to support cuts.

        ; For future tor extensions.
        * int => any,

        ; For unofficial and experimental extensions.
        * tstr => any,
    }

    ; For future-proofing, we are allowing multiple ways to encode
    ; families.  One is as a list of other relays that are in your
    ; family.  One is as a list of authority-generated family
    ; identifiers. And one is as a master key for a family (as in
    ; Tor proposal 242).
    ;
    ; A client should consider two routers to be in the same
    ; family if they have at least one FamilyId in common.
    ; Authorities will canonicalize these lists.
    FamilyId = bstr

    ; A country.  These should ordinarily be 2-character strings,
    ; but I don't want to enforce that.
    Country = tstr

    ; SoftwareDescription replaces our old "version".
    SoftwareDescription = [
        software: tstr,
        version: tstr,
        extra: tstr
    ]

    ; Protocol versions: after a bit of experimentation, I think
    ; the most reasonable representation to use is a map from protocol
    ; ID to a bitmask of the supported versions.
    ProtoVersions = { ProtoId => ProtoBitmask }

    ; Integer protocols are reserved for future versions of Tor. tstr ids
    ; are reserved for experimental and non-tor extensions.
    ProtoId = ProtoIdEnum / int / tstr

    ProtoIdEnum = &(
        Link     : 0,
        LinkAuth : 1,
        Relay    : 2,
        DirCache : 3,
        HSDir    : 4,
        HSIntro  : 5,
        HSRend   : 6,
        Desc     : 7,
        MicroDesc: 8,
        Cons     : 9,
        Padding  : 10,
        FlowCtrl : 11,
    )

    ; This type is limited to 64 bits, and that's fine.  If we ever
    ; need a protocol version higher than 63, we should allocate a
    ; new protoid.
    ProtoBitmask = uint

    ; An exit policy may exist in up to two variants.  When port classes
    ; have not changed in a while, only one policy is needed.  If port
    ; classes have changed recently, however, then SNIPs need to include
    ; each relay's position according to both the older and the newer policy
    ; until older network parameter documents become invalid.
    ExitPolicy = SinglePolicy / [ SinglePolicy, SinglePolicy ]

    ; Each single exit policy is a tagged bit array, whose bits
    ; correspond to the members of the list of port classes in the
    ; network parameter document with a corresponding tag.
    SinglePolicy = [
        ; Identifies which group of port classes we're talking about
        tag: unsigned,
        ; Bit-array of which port classes this relay supports.
        policy: bstr
    ]
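To illustrate the ProtoBitmask representation, here is a minimal sketch of converting between a list of supported versions and a bitmask. The bit ordering is my assumption: I take bit v, counting from the least significant bit, to stand for version v, which is consistent with the limit of 63 mentioned above but is not spelled out in this section. The helper names are invented for the example.

```python
from typing import Iterable, List

MAX_VERSION = 63  # a ProtoBitmask is limited to 64 bits

def versions_to_bitmask(versions: Iterable[int]) -> int:
    """Pack a collection of version numbers into one unsigned integer."""
    mask = 0
    for v in versions:
        if not 0 <= v <= MAX_VERSION:
            raise ValueError(f"version {v} does not fit in a ProtoBitmask")
        mask |= 1 << v   # assumed layout: bit v <=> version v
    return mask

def bitmask_to_versions(mask: int) -> List[int]:
    """Unpack a bitmask into a sorted list of version numbers."""
    return [v for v in range(MAX_VERSION + 1) if (mask >> v) & 1]

def supports(mask: int, version: int) -> bool:
    """Report whether a single version is marked as supported."""
    return 0 <= version <= MAX_VERSION and bool((mask >> version) & 1)

# Key 0 is Link in ProtoIdEnum; versions 1 through 5 are an example only.
proto_versions = {0: versions_to_bitmask([1, 2, 3, 4, 5])}
print(proto_versions[0])                       # 62
print(bitmask_to_versions(proto_versions[0]))  # [1, 2, 3, 4, 5]
```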

SNIPLocation: Locating a SNIP within a routing index.

The SNIPLocation type can encode where a SNIP is located with respect to one or more routing indices. Note that a SNIPLocation does not need to be exhaustive: If a given IndexId is not listed for a given relay in one SNIP, it might exist in another SNIP. Clients should not infer that the absence of an IndexId in one SNIPLocation for a relay means that no SNIPLocation with that IndexId exists for the relay.

    ; SNIPLocation: we're using a map here because it's natural
    ; to look up indices in maps.
    SNIPLocation = {
        ; The keys of this mapping represent the routing indices in
        ; which a SNIP appears.  The values represent the index ranges
        ; that it occupies in those indices.
        * IndexId => IndexRange / ExtensionIndex,
    }

    ; We'll define the different index ranges as we go on with
    ; these specifications.
    ;
    ; IndexId values over 65535 are reserved for extensions and
    ; experimentation.
    IndexId = uint32

    ; An index range extends from a minimum to a maximum value.
    ; These ranges are _inclusive_ on both sides.  If 'hi' is less
    ; than 'lo', then this index "wraps around" the end of the ring.
    ; A "nil" value indicates an empty range, which would not
    ; ordinarily be included.
    IndexRange = [ lo: IndexPos,
                   hi: IndexPos ] / nil

    ; An ExtensionIndex is reserved for future use; current clients
    ; will not understand it and current ENDIVEs will not contain it.
    ExtensionIndex = any

    ; For most routing indices, the ranges are encoded as 4-byte integers.
    ; But for hsdir rings, they are binary strings.  (Clients and
    ; relays SHOULD NOT require this.)
    IndexPos = uint / bstr

A bit more on IndexRanges: Every IndexRange actually describes a set of prefixes for possible index positions. For example, the IndexRange [ h'AB12', h'AB24' ] includes all the binary strings that start with (hex) AB12, AB13, and so on, up through all strings that start with AB24. Alternatively, you can think of a bstr-based IndexRange (lo, hi) as covering lo00000... through hiff....

IndexRanges based on the uint type work the same, except that they always specify the first 32 bits of a prefix.
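Here is a sketch of how a membership test on an IndexRange could look, covering prefix semantics, wrap-around, and the nil range. It is my reading of the rules above rather than a specified algorithm: in particular, treating a position shorter than the bounds as zero-extended is an assumption, and the function names are made up.

```python
from typing import Optional, Tuple, Union

IndexPos = Union[int, bytes]

def _as_bytes(pos: IndexPos) -> bytes:
    # uint positions specify the first 32 bits of a prefix.
    return pos.to_bytes(4, "big") if isinstance(pos, int) else pos

def in_index_range(pos: IndexPos,
                   index_range: Optional[Tuple[IndexPos, IndexPos]]) -> bool:
    """Return True if `pos` falls within the inclusive range (lo, hi)."""
    if index_range is None:          # nil: an empty range
        return False
    lo, hi = (_as_bytes(x) for x in index_range)
    p = _as_bytes(pos)
    width = max(len(lo), len(hi), len(p))
    low = lo.ljust(width, b"\x00")   # lo covers from lo000...
    high = hi.ljust(width, b"\xff")  # ...and hi covers through hiff...
    p = p.ljust(width, b"\x00")      # assumption: zero-extend short positions
    if low <= high:
        return low <= p <= high
    # 'hi' is less than 'lo': the range wraps around the end of the ring.
    return p >= low or p <= high

rng = (bytes.fromhex("AB12"), bytes.fromhex("AB24"))
print(in_index_range(bytes.fromhex("AB24FF"), rng))  # True
print(in_index_range(bytes.fromhex("AB2500"), rng))  # False
```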

SNIPSignature: How to prove a SNIP is in the ENDIVE.

Here we describe the types for implementing SNIP signatures, to be validated as described in "Design overview: SNIP Authentication" above.

    ; Most elements in a SNIPSignature are positional and fixed
    SNIPSignature = [
        ; The actual signature or signatures.  If this is a single signature,
        ; it's probably a threshold signature.  Otherwise, it's probably
        ; a list containing one signature from each directory authority.
        SingleSig / MultiSig,

        ; algorithm to use for the path through the merkle tree.
        d_alg: DigestAlgorithm,

        ; Path through merkle tree, possibly empty.
        merkle_path: MerklePath,

        ; Lifespan information.  This is included as part of the input
        ; to the hash algorithm for the signature.
        LifespanInfo,

        ; optional nonce for hash algorithm.
        ? nonce: bstr,

        ; extensions for later use.  These are not signed.
        ? extensions: { * any => any },
    ]

    ; We use this group to indicate when an object originated, and when
    ; it should be accepted.
    ;
    ; When we are using it as an input to a hash algorithm for computing
    ; signatures, we encode it as an 8-byte number for "published",
    ; followed by two 4-byte numbers for pre-valid and post-valid.
    LifespanInfo = (
        ; Official publication time in seconds since the epoch.  These
        ; MUST be monotonically increasing over time for a given set of
        ; authorities on all SNIPs or ENDIVEs that they generate: a
        ; document with a greater `published` time is always more recent
        ; than one with an earlier `published` time.
        ;
        ; Seeing a publication time "in the future" on a correctly
        ; authenticated document is a reliable sign that your
        ; clock is set too far in the past.
        published: uint,

        ; Value to subtract from "published" in order to find the first second
        ; at which this object should be accepted.
        pre-valid: uint32,

        ; Value to add to "published" in order to find the last
        ; second at which this object should be accepted.  The
        ; lifetime of an object is therefore equal to "(post-valid +
        ; pre-valid)".
        post-valid: uint32,
    )

    ; A Lifespan is just the fields of LifespanInfo, encoded as a list.
    Lifespan = [ LifespanInfo ]

    ; One signature on a SNIP or ENDIVE.  If the signature is a threshold
    ; signature, or a reference to a signature in another
    ; document, there will probably be just one of these per SNIP.  But if
    ; we're sticking a full multisignature in the document, this
    ; is just one of the signatures on it.
    SingleSig = [
        s_alg: SigningAlgorithm,

        ; One of signature and sig_reference must be present.
        ?signature: bstr,

        ; sig_reference is an identifier for a signature that appears
        ; elsewhere, and can be fetched on request.  It should only be
        ; used with signature types too large to attach to SNIPs on their
        ; own.
        ?sig_reference: bstr,

        ; A prefix of the key or the key's digest, depending on the
        ; algorithm.
        ?keyid: bstr,
    ]

    MultiSig = [ + SingleSig ]

    ; A Merkle path is represented as a sequence of bits to
    ; indicate whether we're going left or right, and a list of
    ; hashes for the parts of the tree that we aren't including.
    ;
    ; (It's safe to use a uint for the number of bits, since it will
    ; never overflow 64 bits -- that would mean a Merkle tree with
    ; too many leaves to actually calculate on.)
    MerklePath = [ uint, *bstr ]

ENDIVEs: sending a bunch of SNIPs efficiently.

ENDIVEs are delivered by the authorities in a compressed format, optimized for diffs.

Note that if we are using Merkle trees for SNIP authentication, ENDIVEs do not include the trees at all, since those can be inferred from the leaves of the tree. Similarly, the ENDIVEs do not include raw routing indices, but instead include a set of bandwidths that can be combined into the routing indices -- these bandwidths change less frequently, and therefore are more diff-friendly.

Note also that this format has more "wasted bytes" than SNIPs do. Unlike SNIPs, ENDIVEs are large enough to benefit from compression with gzip, lzma2, and so on.

This section does not fully specify how to construct SNIPs from an ENDIVE; for the full algorithm, see section 04.

    ; ENDIVEs are also sent as CBOR.
    ENDIVE = [
        ; Signature for the ENDIVE, using a simpler format than for
        ; a SNIP.  Since ENDIVEs are more like a consensus, we don't need
        ; to use threshold signatures or Merkle paths here.
        sig: ENDIVESignature,

        ; Contents, as a binary string.
        body: encoded-cbor .cbor ENDIVEContent,
    ]

    ; The set of signatures across an ENDIVE.
    ;
    ; This type doubles as the "detached signature" document used when
    ; collecting signatures for a consensus.
    ENDIVESignature = {
        ; The actual signatures on the endive. A multisignature is the
        ; likeliest format here.
        endive_sig: [ + SingleSig ],

        ; Lifespan information.  As with SNIPs, this is included as part
        ; of the input to the hash algorithm for the signature.
        ; Note that the lifespan of an ENDIVE is likely to be a subset
        ; of the lifespan of its SNIPs.
        endive_lifespan: Lifespan,

        ; Signatures across SNIPs, at some level of the Merkle tree.  Note
        ; that these signatures are not themselves signed -- having them
        ; signed would take another step in the voting algorithm.
        snip_sigs: DetachedSNIPSignatures,

        ; Signatures across the ParamDoc pieces.  Note that as with the
        ; DetachedSNIPSignatures, these signatures are not themselves signed.
        param_doc: ParamDocSignature,

        ; extensions for later use.  These are not signed.
        * tstr => any,
    }

    ; A list of single signatures or a list of multisignatures. This
    ; list must have 2^signature-depth elements.
    DetachedSNIPSignatures =
          [ *SingleSig ] / [ *MultiSig ]

    ENDIVEContent = {
        ; Describes how to interpret the signatures over the SNIPs in this
        ; ENDIVE. See section 04 for the full algorithm.
        sig_params: {
            ; When should we say that the signatures are valid?
            lifespan: Lifespan,
            ; Nonce to be used with the signing algorithm for the signatures.
            ? signature-nonce: bstr,

            ; At what depth of a Merkle tree do the signatures apply?
            ; (If this value is 0, then only the root of the tree is signed.
            ; If this value is >= ceil(log2(n_leaves)), then every leaf is
            ; signed.)
            signature-depth: uint,

            ; What digest algorithm is used for calculating the signatures?
            signature-digest-alg: DigestAlgorithm,

            ; reserved for future extensions.
            * tstr => any,
        },

        ; Documents for clients/relays to learn about current network
        ; parameters.
        client-param-doc: encoded-cbor .cbor ClientParamDoc,
        relay-param-doc: encoded-cbor .cbor RelayParamDoc,

        ; Definitions for index group.  Each "index group" is all
        ; applied to the same SNIPs.  (If there is one index group,
        ; then every relay is in at most one SNIP, and likely has several
        ; indices.  If there are multiple index groups, then relays
        ; can appear in more than one SNIP.)
        indexgroups: [ *IndexGroup ],

        ; Information on particular relays.
        ;
        ; (The total number of SNIPs identified by an ENDIVE is at most
        ; len(indexgroups) * len(relays).)
        relays: [ * ENDIVERouterData ],

        ; for future extensions
        * tstr => any,
    }

    ; An "index group" lists a bunch of routing indices that apply to the same
    ; SNIPs.  There may be multiple index groups when a relay needs to appear
    ; in different SNIPs with routing indices for some reason.
    IndexGroup = {
        ; A list of all the indices that are built for this index group.
        ; An IndexId may appear in at most one group per ENDIVE.
        indices: [ + IndexId ],

        ; A list of keys to delete from SNIPs to build this index group.
        omit_from_snips: [ *(int/tstr) ],

        ; A list of keys to forward from SNIPs to the next relay in an EXTEND
        ; cell.  This can help the next relay know which keys to use in its
        ; handshake.
        forward_with_extend: [ *(int/tstr) ],

        ; A number of "gaps" to place in the Merkle tree after the SNIPs
        ; in this group.  This can be used together with signature-depth
        ; to give different index-groups independent signatures.
        ? n_padding_entries: uint,

        ; A detailed description of how to build the index.
        + IndexId => IndexSpec,

        ; For experimental and extension use.
        * tstr => any,
    }

    ; Enumeration to identify how to generate an index.
    Indextype_Raw = 0
    Indextype_Weighted = 1
    Indextype_RSAId = 2
    Indextype_Ed25519Id = 3
    Indextype_RawNumeric = 4

    ; An indexspec may be given as a raw set of index ranges.  This is a
    ; fallback for cases where we simply can't construct an index any other
    ; way.
    IndexSpec_Raw = {
        type: Indextype_Raw,
        ; This index is constructed by taking relays by their position in the
        ; list from the list of ENDIVERouterData, and placing them at a given
        ; location in the routing index.  Each index range extends up to
        ; right before the next index position.
        index_ranges: [ * [ uint, IndexPos ] ],
    }

    ; An indexspec given as a list of numeric spans on the index.
    IndexSpec_RawNumeric = {
        type: Indextype_RawNumeric,
        first_index_pos: uint,
        ; This index is constructed by taking relays by index from the list
        ; of ENDIVERouterData, and giving them a certain amount of "weight"
        ; in the index.
        index_ranges: [ * [ idx: uint, span: uint ] ],
    }

    ; This index is computed from the weighted bandwidths of all the SNIPs.
    ;
    ; Note that when a single bandwidth changes, it can change _all_ of
    ; the indices in a bandwidth-weighted index, even if no other
    ; bandwidth changes.  That's why we only pack the bandwidths
    ; here, and scale them as part of the reconstruction algorithm.
    IndexSpec_Weighted = {
        type: Indextype_Weighted,
        ; This index is constructed by assigning a weight to each relay,
        ; and then normalizing those weights. See algorithm below in section
        ; 04.
        ; Limiting bandwidth weights to uint32 makes reconstruction algorithms
        ; much easier.
        index_weights: [ * uint32 ],
    }

    ; This index is computed from the RSA identity key digests of all of the
    ; SNIPs. It is used in the HSv2 directory ring.
    IndexSpec_RSAId = {
        type: Indextype_RSAId,
        ; How many bytes of RSA identity data go into each indexpos entry?
        n_bytes: uint,
        ; Bitmap of which routers should be included.
        members: bstr,
    }

    ; This index is computed from the Ed25519 identity keys of all of the
    ; SNIPs.  It is used in the HSv3 directory ring.
    IndexSpec_Ed25519Id = {
        type: Indextype_Ed25519Id,
        ; How many bytes of digest go into each indexpos entry?
        n_bytes: uint,
        ; What digest do we use for building this ring?
        d_alg: DigestAlgorithm,
        ; What bytes do we give to the hash before the ed25519?
        prefix: bstr,
        ; What bytes do we give to the hash after the ed25519?
        suffix: bstr,
        ; Bitmap of which routers should be included.
        members: bstr,
    }

    IndexSpec = IndexSpec_Raw /
                IndexSpec_RawNumeric /
                IndexSpec_Weighted /
                IndexSpec_RSAId /
                IndexSpec_Ed25519Id

    ; Information about a single router in an ENDIVE.
    ENDIVERouterData = {
        ; The authority-generated SNIPRouterData for this router.
        1 => encoded-cbor .cbor SNIPRouterData,
        ; The RSA identity, or a prefix of it, to use for HSv2 indices.
        ? 2 => RSAIdentityFingerprint,

        * int => any,
        * tstr => any,
    }

    ; encoded-cbor is defined in the CDDL postlude as a bstr that is
    ; tagged as holding verbatim CBOR:
    ;
    ;    encoded-cbor = #6.24(bstr)
    ;
    ; Using a tag like this helps tools that validate the string as
    ; valid CBOR; using a bstr helps indicate that the signed data
    ; should not be interpreted until after the signature is checked.
    ; It also helps diff tools know that they should look inside these
    ; objects.

Network parameter documents

Network parameter documents ("ParamDocs" for short) take the place of the current consensus and certificates as a small document that clients and relays need to download periodically and keep up-to-date. They are generated as part of the voting process, and contain fields like network parameters, recommended versions, authority certificates, and so on.

    ; A "parameter document" is like a tiny consensus that relays and clients
    ; can use to get network parameters.
    ParamDoc = [
        sig: ParamDocSignature,
        ; Client-relevant portion of the parameter document. Everybody fetches
        ; this.
        cbody: encoded-cbor .cbor ClientParamDoc,
        ; Relay-relevant portion of the parameter document. Only relays need to
        ; fetch this; the document can be validated without it.
        ? sbody: encoded-cbor .cbor RelayParamDoc,
    ]

    ParamDocSignature = [
        ; Multisignature or threshold signature of the concatenation
        ; of the two digests below.
        SingleSig / MultiSig,

        ; Lifespan information.  As with SNIPs, this is included as part
        ; of the input to the hash algorithm for the signature.
        ; Note that the lifespan of a parameter document is likely to be
        ; very long.
        LifespanInfo,

        ; how are c_digest and s_digest computed?
        d_alg: DigestAlgorithm,
        ; Digest over the cbody field
        c_digest: bstr,
        ; Digest over the sbody field
        s_digest: bstr,
    ]

    ClientParamDoc = {
        params: NetParams,
        ; List of certificates for all the voters.  These
        ; authenticate the keys used to sign SNIPs and ENDIVEs and votes,
        ; using the authorities' longest-term identity keys.
        voters: [ + bstr .cbor VoterCert ],

        ; A division of exit ports into "classes" of ports.
        port-classes: PortClasses,

        ; As in client-versions from dir-spec.txt
        ? recommend-versions: [ * tstr ],
        ; As in recommended-client-protocols in dir-spec.txt
        ? recommend-protos: ProtoVersions,
        ; As in required-client-protocols in dir-spec.txt
        ? require-protos: ProtoVersions,

        ; For future extensions.
        * tstr => any,
    }

    RelayParamDoc = {
        params: NetParams,

        ; As in server-versions from dir-spec.txt
        ? recommend-versions: [ * tstr ],
        ; As in recommended-relay-protocols in dir-spec.txt
        ? recommend-protos: ProtoVersions,
        ; As in required-relay-protocols in dir-spec.txt
        ? require-versions: ProtoVersions,

        * tstr => any,
    }

    ; A NetParams encodes information about the Tor network that
    ; clients and relays need in order to participate in it.  The
    ; current list of parameters is described in the "params" field
    ; as specified in dir-spec.txt.
    ;
    ; Note that there are separate client and relay NetParams now.
    ; Relays are expected to first check for a definition in the
    ; RelayParamDoc, and then in the ClientParamDoc.
    NetParams = { *tstr => int }

    PortClasses = {
        ; identifies which port class grouping this is. Used to migrate
        ; from one group of port classes to another.
        tag: uint,
        ; list of the port classes.
        classes: { * IndexId => PortList },
    }
    PortList = [ *PortOrRange ]
    ; Either a single port or a low-high pair
    PortOrRange = Port / [ Port, Port ]
    Port = 1..65535
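As an illustration of the PortClasses structure, the sketch below finds which port classes contain a given port. The class identifiers and port lists are invented for the example, I am assuming that a low-high pair is inclusive at both ends, and the function names are hypothetical.

```python
from typing import Dict, List, Union

PortOrRange = Union[int, List[int]]

def port_matches(port: int, entry: PortOrRange) -> bool:
    """Match a port against a single port or a [low, high] pair."""
    if isinstance(entry, int):
        return port == entry
    low, high = entry
    return low <= port <= high   # assumption: the pair is inclusive

def classes_for_port(port: int, port_classes: Dict) -> List[int]:
    """Return the sorted IndexIds of every class that lists `port`."""
    if not 1 <= port <= 65535:
        raise ValueError("not a valid port")
    return sorted(
        class_id
        for class_id, port_list in port_classes["classes"].items()
        if any(port_matches(port, entry) for entry in port_list)
    )

# A made-up PortClasses value with two classes.
example = {
    "tag": 1,
    "classes": {
        100: [80, 443],                 # hypothetical "web" class
        101: [[6660, 6669], 6697],      # hypothetical "irc" class
    },
}
print(classes_for_port(443, example))   # [100]
print(classes_for_port(6665, example))  # [101]
print(classes_for_port(25, example))    # []
```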

Certificates

Voting certificates are used to bind authorities' long-term identities to shorter-term signing keys. These have a similar purpose to the authority certs made for the existing voting algorithm, but support more key types.

    ; A 'voter certificate' is a statement by an authority binding keys to
    ; each other.
    VoterCert = [

        ; One or more signatures over `content` using the provided lifetime.
        ; Each signature should be treated independently.
        [ + SingleSig ],
        ; A lifetime value, used (as usual) as an input to the
        ; signature algorithm.
        LifespanInfo,
        ; The keys and other data to be certified.
        content: encoded-cbor .cbor CertContent,
    ]

    ; The contents of the certificate that get signed.
    CertContent = {
        ; What kind of a certificate is this?
        type: CertType,
        ; A list of keys that are being certified in this document
        keys: [ + CertifiedKey ],
        ; A list of other keys that you might need to know about, which
        ; are NOT certified in this document.
        ? extra: [ + CertifiedKey ],
        * tstr => any,
    }

    CertifiedKey = {
        ; What is the intended usage of this key?
        usage: KeyUsage,
        ; What cryptographic algorithm is this key used for?
        alg: PKAlgorithm,
        ; The actual key being certified.
        data: bstr,
        ; A human readable string.
        ? remarks: tstr,
        * tstr => any,
    }

ENDIVE diffs

Here is a binary format to be used with ENDIVEs, ParamDocs, and any other similar binary formats. Authorities and directory caches need to be able to generate it; clients and non-cache relays only need to be able to parse and apply it.

    ; Binary diff specification.
    BinaryDiff = {
        ; This is version 1.
        v: 1,
        ; Optionally, a diff can say what different digests
        ; of the document should be before and after it is applied.
        ; If there is more than one entry, parties MAY check one or
        ; all of them.
        ? digest: { * DigestAlgorithm =>
                         [ pre: Digest,
                           post: Digest ]},

        ; Optionally, a diff can give some information to identify
        ; which document it applies to, and what document you get
        ; from applying it.  These might be a tuple of a document type
        ; and a publication type.
        ? ident: [ pre: any, post: any ],

        ; list of commands to apply in order to the original document in
        ; order to get the transformed document
        cmds: [ *DiffCommand ],

        ; for future extension.
        * tstr => any,
    }

    ; There are currently only two diff commands.
    ; One is to copy some bytes from the original.
    CopyDiffCommand = [
        OrigBytesCmdId,
        ; Range of bytes to copy from the original document.
        ; Ranges include their starting byte.  The "offset" is relative to
        ; the end of the _last_ range that was copied.
        offset: int,
        length: uint,
    ]

    ; The other diff command is to insert some bytes from the diff.
    InsertDiffCommand = [
        InsertBytesCmdId,
        data: bstr,
    ]

    DiffCommand = CopyDiffCommand / InsertDiffCommand

    OrigBytesCmdId = 0
    InsertBytesCmdId = 1

Applying a binary diff is simple:

Algorithm: applying a binary diff.

    (Given an input bytestring INP and a diff D, produces an output OUT.)

    Initialize OUT to an empty bytestring.

    Set OFFSET to 0.

    For each command C in D.cmds, in order:

        If C begins with OrigBytesCmdId:
            Increase "OFFSET" by C.offset
            If OFFSET..OFFSET+C.length is not a valid range in
               INP, abort.
            Append INP[OFFSET .. OFFSET+C.length] to OUT.
            Increase "OFFSET" by C.length

        else: # C begins with InsertBytesCmdId:
            Append C.data to OUT.
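In Python, the application step might look like the sketch below. It assumes the diff has already been decoded from CBOR into a dict with the "v" and "cmds" members of BinaryDiff, treats each copy range as half-open (it includes its starting byte and has the stated length), and skips the optional digest and ident checks. The function name is invented.

```python
ORIG_BYTES_CMD_ID = 0
INSERT_BYTES_CMD_ID = 1

def apply_binary_diff(inp: bytes, diff: dict) -> bytes:
    """Apply an already-decoded BinaryDiff to `inp`; return the new document."""
    if diff.get("v") != 1:
        raise ValueError("unsupported diff version")
    out = bytearray()
    offset = 0
    for cmd in diff["cmds"]:
        if cmd[0] == ORIG_BYTES_CMD_ID:
            _, rel_offset, length = cmd
            # The offset is relative to the end of the last copied range,
            # and may be negative.
            offset += rel_offset
            if offset < 0 or length < 0 or offset + length > len(inp):
                raise ValueError("copy range falls outside the input")
            out += inp[offset:offset + length]
            offset += length
        elif cmd[0] == INSERT_BYTES_CMD_ID:
            out += cmd[1]
        else:
            raise ValueError("unrecognized diff command")
    return bytes(out)

original = b"hello world"
diff = {
    "v": 1,
    "cmds": [
        [ORIG_BYTES_CMD_ID, 0, 6],          # copy b"hello "
        [INSERT_BYTES_CMD_ID, b"there, "],  # insert new bytes
        [ORIG_BYTES_CMD_ID, 0, 5],          # copy b"world"
    ],
}
print(apply_binary_diff(original, diff))  # b'hello there, world'
```

Because offsets are relative and signed, a later command can step backwards to copy a range of the original a second time.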

Generating a binary diff can be trickier, and is not specified here. There are several generic algorithms out there for making binary diffs between arbitrary byte sequences. Since these are complex, I recommend a chunk-based CBOR-aware algorithm, using each CBOR item in a similar way to the way in which our current line-oriented code uses lines. When encountering a bstr tagged with "encoded-cbor", the diff algorithm should look inside it to find more CBOR chunks. (See example-code/cbor_diff.py for an example of doing this with Python's difflib.)

The diff format above should work equally well no matter what diff algorithm is used, so we have room to move to other algorithms in the future if needed.

To indicate support for the above diff format in directory requests, implementations should use an X-Support-Diff-Formats header. The above format is designated "cbor-bindiff"; our existing format is called "ed".

Digests and parameters

Here we give definitions for H_leaf() and H_node(), based on an underlying digest function H() with a preferred input block size of B. (B should be chosen as the natural input size of the hash function, to aid in precomputation.)

We also define H_sign(), to be used outside of SNIP authentication where we aren't using a Merkle tree at all.

PATH must be no more than 64 bits long. NONCE must be no more than B-33 bytes long.

    H_sign(LIFESPAN, NONCE, ITEM) =
       H( PREFIX(OTHER_C, LIFESPAN, NONCE) || ITEM)

    H_leaf(PATH, LIFESPAN, NONCE, ITEM) =
       H( PREFIX(LEAF_C, LIFESPAN, NONCE) ||
          U64(PATH) ||
          U64(bits(PATH)) ||
          ITEM )

    H_node(PATH, LIFESPAN, NONCE, ITEM) =
       H( PREFIX(NODE_C, LIFESPAN, NONCE) ||
          U64(PATH) ||
          U64(bits(PATH)) ||
          ITEM )

    PREFIX(leafcode, lifespan, nonce) =
       U64(leafcode) ||
       U64(lifespan.published) ||
       U32(lifespan.pre-valid) ||
       U32(lifespan.post-valid) ||
       U8(len(nonce)) ||
       nonce ||
       Z(B - 33 - len(nonce))

    LEAF_C = 0x8BFF0F687F4DC6A1 ^ NETCONST
    NODE_C = 0xA6F7933D3E6B60DB ^ NETCONST
    OTHER_C = 0x7365706172617465 ^ NETCONST

    # For the live Tor network only.
    NETCONST = 0x0746f72202020202
    # For testing networks, by default.
    NETCONST = 0x74657374696e6720

    U64(n) -- N encoded as a big-endian 64-bit number.
    Z(n) -- N bytes with value zero.
    len(b) -- the number of bytes in a byte-string b.
    bits(b) -- the number of bits in a bit-string b.
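
To make the byte layout concrete, here is a small sketch of PREFIX, H_sign, and the shared shape of H_leaf and H_node. Several things here are assumptions rather than part of the definitions above: SHA-256 (with B = 64) stands in for H(); U32 and U8 are taken to be big-endian like U64; PATH is passed as an integer value plus a bit count; and a lifespan is a plain (published, pre-valid, post-valid) tuple. The function names are hypothetical.

```python
import hashlib
import struct

B = 64  # assumed: input block size of SHA-256, standing in for H()
NETCONST = 0x74657374696e6720  # the testing-network default given above
LEAF_C = 0x8BFF0F687F4DC6A1 ^ NETCONST
NODE_C = 0xA6F7933D3E6B60DB ^ NETCONST
OTHER_C = 0x7365706172617465 ^ NETCONST


def prefix(code, published, pre_valid, post_valid, nonce):
    """PREFIX(): fixed header, then the nonce, zero-padded to B - 8 bytes."""
    if len(nonce) > B - 33:
        raise ValueError("NONCE must be no more than B-33 bytes long")
    head = struct.pack(">QQIIB", code, published, pre_valid, post_valid,
                       len(nonce))
    return head + nonce + bytes(B - 33 - len(nonce))


def h_sign(lifespan, nonce, item):
    """H_sign(): used outside of SNIP authentication."""
    return hashlib.sha256(prefix(OTHER_C, *lifespan, nonce) + item).digest()


def h_tree(code, path_value, path_bits, lifespan, nonce, item):
    """H_leaf() when code is LEAF_C; H_node() when code is NODE_C."""
    if not 0 <= path_bits <= 64:
        raise ValueError("PATH must be no more than 64 bits long")
    path = struct.pack(">QQ", path_value, path_bits)
    return hashlib.sha256(prefix(code, *lifespan, nonce) + path + item).digest()
```

Note that PREFIX always comes to B - 8 bytes, so PREFIX followed by U64(PATH) fills exactly one input block of the hash.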

Directory authority operations

For Walking Onions to work, authorities must begin to generate ENDIVEs as a new kind of "consensus document". Since this format is incompatible with the previous consensus document formats, and is CBOR-based, a text-based voting protocol is no longer appropriate for generating it.

We cannot immediately abandon the text-based consensus and microdescriptor formats, but instead will need to keep generating them for legacy relays and clients. Ideally, the process that produces the ENDIVE should also produce a legacy consensus, to limit the amount of divergence in their contents.

Further, it would be good for the purposes of this proposal if we can "inherit" as much as possible of our existing voting mechanism for legacy purposes.

This section of the proposal will try to meet these goals by defining a new binary-based voting format, a new set of voting rules for it, and a series of migration steps.

Overview

Except as described below, we retain from Tor's existing voting mechanism all notions of how votes are transferred and processed. Other changes are likely desirable, but they are out of scope for this proposal.

Notably, we are not changing how the voting schedule works. Nor are we changing the property that all authorities must agree on the list of authorities; the property that a consensus is computed as a deterministic function of a set of votes; or the property that if authorities believe in different sets of votes, they will not reach the same consensus.

The principal changes in the voting that are relevant for legacy consensus computation are:

  • The uploading process for votes now supports negotiation, so that the receiving authority can tell the uploading authority what kind of formats, diffs, and compression it supports.

  • We specify a CBOR-based binary format for votes, with a simple embedding method for the legacy text format. This embedding is meant for transitional use only; once all authorities support the binary format, the transitional format and its support structures can be abandoned.

  • To reduce complexity, the new vote format also includes verbatim microdescriptors, whereas previously microdescriptors would have been referenced by hash. (The use of diffs and compression should make the bandwidth impact of this addition negligible.)

For computing ENDIVEs, the principal changes in voting are:

  • The consensus outputs for most voteable objects are specified in a way that does not require the authorities to understand their semantics when computing a consensus. This should make it easier to change fields without requiring new consensus methods.

Negotiating vote uploads

Authorities supporting Walking Onions are required to support a new resource "/tor/auth-vote-opts". This resource is a text document containing a list of HTTP-style headers. Recognized headers are described below; unrecognized headers MUST be ignored.

The Accept-Encoding header follows the same format as the HTTP header of the same name; it indicates a list of Content-Encodings that the authority will accept for uploads. All authorities MUST support the identity encoding, and SHOULD support the gzip encoding. (Default: "identity")

The Accept-Vote-Diffs-From header is a list of digests of previous votes held by this authority; any new uploaded votes that are given as diffs from one of these old votes SHOULD be accepted. The format is a space-separated list of "digestname:Hexdigest". (Default: "".)

The Accept-Vote-Formats header is a space-separated list of the vote formats that this authority accepts. The recognized vote formats are "legacy-3" (Tor's current vote format) and "endive-1" (the vote format described here). Unrecognized vote formats MUST be ignored. (Default: "legacy-3".)

If requesting "/tor/auth-vote-opts" gives an error, or if one or more headers are missing, the default values SHOULD be used. These documents (or their absence) MAY be cached for up to 2 voting periods.
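
As an illustration of how an uploading authority might interpret this document, here is a minimal sketch. It assumes header names are matched case-insensitively (as in HTTP) and that a failed request is represented by `None`; the function names and the preference order between known formats are hypothetical.

```python
# Defaults taken from the header descriptions above.
DEFAULTS = {
    "accept-encoding": "identity",
    "accept-vote-diffs-from": "",
    "accept-vote-formats": "legacy-3",
}
# Formats this sketch knows about, most preferred first (assumed order).
KNOWN_VOTE_FORMATS = ("endive-1", "legacy-3")


def parse_vote_opts(text):
    """Parse /tor/auth-vote-opts; text is None if the request failed."""
    opts = dict(DEFAULTS)
    for line in (text or "").splitlines():
        name, sep, value = line.partition(":")
        name = name.strip().lower()
        if sep and name in opts:  # unrecognized headers are ignored
            opts[name] = value.strip()
    return opts


def pick_vote_format(opts):
    """Return the preferred format the receiver accepts, or None."""
    offered = opts["accept-vote-formats"].split()
    for fmt in KNOWN_VOTE_FORMATS:
        if fmt in offered:  # unrecognized formats are ignored
            return fmt
    return None


def parse_diff_bases(opts):
    """Split Accept-Vote-Diffs-From into (digestname, hexdigest) pairs."""
    bases = []
    for entry in opts["accept-vote-diffs-from"].split():
        alg, sep, hexdigest = entry.partition(":")
        if sep:
            bases.append((alg, hexdigest))
    return bases
```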

Authorities supporting Walking Onions SHOULD also support the "Connection: keep-alive" and "Keep-Alive" HTTP headers, to avoid needless reconnections in response to these requests. Implementors SHOULD be aware of potential denial-of-service attacks based on open HTTP connections, and mitigate them as appropriate.

Note: I thought about using OPTIONS here, but OPTIONS isn't quite right for this, since Accept-Vote-Diffs-From does not fit with its semantics.

Note: It might be desirable to support this negotiation for legacy votes as well, even before walking onions is implemented. Doing so would allow us to reduce authority bandwidth a little, and possibly include microdescriptors in votes for more convenient processing.

A generalized algorithm for voting

Unlike with previous versions of our voting specification, here I'm going to try to describe pieces of the voting algorithm in terms of simpler voting operations. Each voting operation will be named and possibly parameterized, and data will frequently self-describe what voting operation is to be used on it.

Voting operations may operate over different CBOR types, and are themselves specified as CBOR objects.

A voting operation takes place over a given "voteable field". Each authority that specifies a value for a voteable field MUST specify which voting operation to use for that field. Specifying a voteable field without a voting operation MUST be taken as specifying the voting operation "None" -- that is, voting against a consensus.

On the other hand, an authority MAY specify a voting operation for a field without casting any vote for it. This means that the authority has an opinion on how to reach a consensus about the field, without having any preferred value for the field itself.

Constants used with voting operations

Many voting operations may be parameterized by an unsigned integer. In some cases the integers are constant, but in others, they depend on the number of authorities, the number of votes cast, or the number of votes cast for a particular field.

When we encode these values, we encode them as short strings rather than as integers.

I had thought of using negative integers here to encode these special constants, but that seems too error-prone.

The following constants are defined:

N_AUTH -- the total number of authorities, including those whose votes are absent.

N_PRESENT -- the total number of authorities whose votes are present for this vote.

N_FIELD -- the total number of authorities whose votes for a given field are present.

Necessarily, N_FIELD <= N_PRESENT <= N_AUTH -- you can't vote on a field unless you've cast a vote, and you can't cast a vote unless you're an authority.

In the definitions below, // denotes the truncating integer division operation, as implemented with / in C.

QUORUM_AUTH -- The lowest integer that is greater than half of N_AUTH. Equivalent to N_AUTH // 2 + 1.

QUORUM_PRESENT -- The lowest integer that is greater than half of N_PRESENT. Equivalent to N_PRESENT // 2 + 1.

QUORUM_FIELD -- The lowest integer that is greater than half of N_FIELD. Equivalent to N_FIELD // 2 + 1.

We define SUPERQUORUM_... variants of these fields as well, based on the lowest integer that is greater than 2/3 of the underlying field. SUPERQUORUM_x is thus equivalent to (N_x * 2) // 3 + 1.

    ; We need to encode these arguments; we do so as short strings.
    IntOpArgument = uint / "auth" / "present" / "field" /
         "qauth" / "qpresent" / "qfield" /
         "sqauth" / "sqpresent" / "sqfield"

No IntOpArgument may be greater than N_AUTH. If an IntOpArgument is given as an integer, and that integer is greater than N_AUTH, then it is treated as if it were N_AUTH.

This rule lets us say things like "at least 3 authorities must vote on x...if there are 3 authorities."
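
The constants and the clamping rule above could be evaluated as in this sketch. The function name is hypothetical, and raising an error on a malformed argument is an assumption; the text above does not say how such arguments are handled at this level.

```python
def resolve_int_op_argument(arg, n_auth, n_present, n_field):
    """Turn an IntOpArgument into a concrete integer, clamped to N_AUTH."""
    counts = {"auth": n_auth, "present": n_present, "field": n_field}
    if isinstance(arg, int) and not isinstance(arg, bool) and arg >= 0:
        value = arg
    elif not isinstance(arg, str):
        raise ValueError("not an IntOpArgument")
    elif arg in counts:
        value = counts[arg]
    elif arg.startswith("sq") and arg[2:] in counts:
        value = counts[arg[2:]] * 2 // 3 + 1   # SUPERQUORUM_x
    elif arg.startswith("q") and arg[1:] in counts:
        value = counts[arg[1:]] // 2 + 1       # QUORUM_x
    else:
        raise ValueError("not an IntOpArgument")
    # No IntOpArgument may be greater than N_AUTH.
    return min(value, n_auth)
```

With 9 authorities, 7 votes present, and 5 votes on a field, "qauth" resolves to 5, "sqpresent" to 5, and the integer 12 is clamped to 9.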

Producing consensus on a field

Each voting operation will either produce a CBOR output, or produce no consensus. Unless otherwise stated, all CBOR outputs are to be given in canonical form.

Below we specify a number of operations, and the parameters that they take. We begin with operations that apply to "simple" values (integers and binary strings), then show how to compose them to larger values.

All of the descriptions below show how to apply a single voting operation to a set of votes. We will later describe how to behave when the authorities do not agree on which voting operation to use, in our discussion of the StructJoinOp operation.

Note that while some voting operations take other operations as parameters, we are not supporting full recursion here: there is a strict hierarchy of operations, and more complex operations can only have simpler operations in their parameters.

All voting operations follow this metaformat:

    ; All a generic voting operation has to do is say what kind it is.
    GenericVotingOp = {
        op: tstr,
        * tstr => any,
    }

Note that some voting operations require a sort or comparison operation over CBOR values. This operation is defined later in appendix E; it works only on homogeneous inputs.

Generic voting operations

None

This voting operation takes no parameters, and always produces "no consensus". It is encoded as:

    ; "Don't produce a consensus".
    NoneOp = { op: "None" }

When encountering an unrecognized or nonconforming voting operation, or one which is not recognized by the consensus-method in use, the authorities proceed as if the operation had been "None".

Voting operations for simple values

We define a "simple value" according to these CDDL rules:

    ; Simple values are primitive types, and tuples of primitive types.
    SimpleVal = BasicVal / SimpleTupleVal
    BasicVal = bool / int / bstr / tstr
    SimpleTupleVal = [ *BasicVal ]

We also need the ability to encode the types for these values:

    ; Encoding a simple type.
    SimpleType = BasicType / SimpleTupleType
    BasicType = "bool" / "uint" / "sint" / "bstr" / "tstr"
    SimpleTupleType = [ "tuple", *BasicType ]

In other words, a SimpleVal is either a non-compound base value, or is a tuple of such values.

    ; We encode these operations as:
    SimpleOp = MedianOp / ModeOp / ThresholdOp /
        BitThresholdOp / CborSimpleOp / NoneOp

We define each of these operations in the sections below.

Median

Parameters: MIN_VOTES (an integer), BREAK_EVEN_LOW (a boolean), TYPE (a SimpleType)

    ; Encoding:
    MedianOp = { op: "Median",
                 ? min_vote: IntOpArgument,  ; Default is 1.
                 ? even_low: bool,           ; Default is true.
                 type: SimpleType  }

Discard all votes that are not of the specified TYPE. If there are fewer than MIN_VOTES votes remaining, return "no consensus".

Put the votes in ascending sorted order. If the number of votes N is odd, take the center vote (the one at position (N+1)/2). If N is even, take the lower of the two center votes (the one at position N/2) if BREAK_EVEN_LOW is true. Otherwise, take the higher of the two center votes (the one at position N/2 + 1).

For example, the Median(…, even_low: True, type: "uint") of the votes ["String", 2, 111, 6] is 6. The Median(…, even_low: True, type: "uint") of the votes ["String", 77, 9, 22, "String", 3] is 9.
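
A sketch of the Median operation that reproduces the two examples above. It handles only the basic (non-tuple) types, treats "sint" as any integer, and uses Python's native ordering in place of the CBOR comparison from appendix E; all of these are simplifying assumptions, and the names are hypothetical.

```python
NO_CONSENSUS = object()  # sentinel for "no consensus"


def matches_basic_type(value, type_name):
    """Check a decoded value against a BasicType name (tuples not handled)."""
    if type_name == "bool":
        return isinstance(value, bool)
    if type_name in ("uint", "sint"):
        if isinstance(value, bool) or not isinstance(value, int):
            return False
        return value >= 0 if type_name == "uint" else True
    if type_name == "bstr":
        return isinstance(value, bytes)
    if type_name == "tstr":
        return isinstance(value, str)
    return False


def median_op(votes, type_name, min_votes=1, even_low=True):
    """Median: discard wrong-typed votes, sort, and take the center vote."""
    kept = sorted(v for v in votes if matches_basic_type(v, type_name))
    n = len(kept)
    if n == 0 or n < min_votes:
        return NO_CONSENSUS
    if n % 2 == 1:
        return kept[n // 2]          # position (N+1)/2, counting from 1
    # Even N: position N/2 if breaking low, otherwise position N/2 + 1.
    return kept[n // 2 - 1] if even_low else kept[n // 2]
```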

Mode

Parameters: MIN_COUNT (an integer), BREAK_TIES_LOW (a boolean), TYPE (a SimpleType)

    ; Encoding:
    ModeOp = { op: "Mode",
               ? min_count: IntOpArgument,   ; Default 1.
               ? tie_low: bool,              ; Default true.
               type: SimpleType
    }

Discard all votes that are not of the specified TYPE. Of the remaining votes, look for the value that has received the most votes. If no value has received at least MIN_COUNT votes, then return "no consensus".

If there is a single value that has received the most votes, return it. Break ties in favor of lower values if BREAK_TIES_LOW is true, and in favor of higher values if BREAK_TIES_LOW is false. (Perform comparisons in canonical cbor order.)

Threshold

Parameters: MIN_COUNT (an integer), BREAK_MULTI_LOW (a boolean), TYPE (a SimpleType)

    ; Encoding
    ThresholdOp = { op: "Threshold",
                    min_count: IntOpArgument,  ; No default.
                    ? multi_low: bool,          ; Default true.
                    type: SimpleType
    }

Discard all votes that are not of the specified TYPE. Sort in canonical cbor order. If BREAK_MULTI_LOW is false, reverse the order of the list.

Return the first element that received at least MIN_COUNT votes. If no value has received at least MIN_COUNT votes, then return "no consensus".
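
The Mode and Threshold operations described above differ only in how they choose among candidate values, which this sketch tries to show side by side. Type checking is delegated to a caller-supplied predicate, and Python's native ordering stands in for canonical CBOR order; both are simplifications, and all names are hypothetical.

```python
from collections import Counter

NO_CONSENSUS = object()  # sentinel for "no consensus"


def is_uint(value):
    """Example type predicate for the "uint" SimpleType."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def mode_op(votes, is_type, min_count=1, tie_low=True):
    """Mode: the most-voted value, if it has at least min_count votes."""
    counts = Counter(v for v in votes if is_type(v))
    if not counts:
        return NO_CONSENSUS
    best = max(counts.values())
    if best < min_count:
        return NO_CONSENSUS
    winners = [v for v, c in counts.items() if c == best]
    return min(winners) if tie_low else max(winners)


def threshold_op(votes, is_type, min_count, multi_low=True):
    """Threshold: the first value in sorted order with >= min_count votes."""
    counts = Counter(v for v in votes if is_type(v))
    for value in sorted(counts, reverse=not multi_low):
        if counts[value] >= min_count:
            return value
    return NO_CONSENSUS
```

For the votes [1, 1, 2, 2, 2, 9], Mode yields 2 (the most popular value), while Threshold with a MIN_COUNT of 2 yields 1 (the lowest value that reached the threshold).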

BitThreshold

Parameters: MIN_COUNT (an integer >= 1)

    ; Encoding
    BitThresholdOp = { op: "BitThreshold",
                       min_count: IntOpArgument, ; No default.
    }

This operation is usually not needed, but is quite useful for building some ProtoVer operations.

Discard all votes that are not of type uint or bstr; construe bstr inputs as having type "biguint".

The output is a uint or biguint in which the b'th bit is set iff the b'th bit is set in at least MIN_COUNT of the votes.
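
A sketch of BitThreshold follows. It assumes that bstr inputs are big-endian unsigned integers (as CBOR bignums are) and returns a plain Python integer, leaving the choice between uint and biguint encodings to the serializer; the function name is hypothetical.

```python
def bit_threshold_op(votes, min_count):
    """Set bit b in the result iff bit b is set in >= min_count votes."""
    numbers = []
    for vote in votes:
        if isinstance(vote, bool):
            continue  # booleans are not uints
        if isinstance(vote, int) and vote >= 0:
            numbers.append(vote)
        elif isinstance(vote, bytes):
            # Assumption: a bstr is construed as a big-endian biguint.
            numbers.append(int.from_bytes(vote, "big"))
    width = max((n.bit_length() for n in numbers), default=0)
    result = 0
    for b in range(width):
        if sum((n >> b) & 1 for n in numbers) >= min_count:
            result |= 1 << b
    return result
```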

Voting operations for lists

These operations work on lists of SimpleVal:

    ; List type definitions
    ListVal = [ * SimpleVal ]

    ListType = [ "list",
                 [ *SimpleType ] / nil ]

They are encoded as:

    ; Only one list operation exists right now.
    ListOp = SetJoinOp

SetJoin

Parameters: MIN_COUNT (an integer >= 1). Optional parameters: TYPE (a SimpleType.)

    ; Encoding:
    SetJoinOp = {
       op: "SetJoin",
       min_count: IntOpArgument,
       ? type: SimpleType
    }

Discard all votes that are not lists. From each vote, discard all members that are not of type 'TYPE'.

For the consensus, construct a new list containing exactly those elements that appear in at least MIN_COUNT votes.

(Note that the input votes may contain duplicate elements. These must be treated as if there were no duplicates: the vote [1, 1, 1, 1] is the same as the vote [1]. Implementations may want to preprocess votes by discarding all but one instance of each member.)
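
A sketch of SetJoin, including the duplicate handling described in the note above. The order of the output list (first appearance across the votes) is an assumption made here for determinism, as is representing tuple values as hashable Python tuples; the function name is hypothetical.

```python
from collections import Counter


def set_join_op(votes, min_count, is_type=None):
    """SetJoin: keep the elements that appear in at least min_count votes."""
    counts = Counter()
    for vote in votes:
        if not isinstance(vote, list):
            continue  # discard votes that are not lists
        # Drop wrong-typed members, then count each member once per vote.
        unique = dict.fromkeys(
            m for m in vote if is_type is None or is_type(m))
        counts.update(list(unique))
    return [m for m, c in counts.items() if c >= min_count]
```

Here the vote [1, 1, 1, 1] contributes a single count for 1, so `set_join_op([[1, 1, 1, 1], [2, 3]], 2)` is empty.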

Voting operations for maps

Map voting operations work over maps from key types to other non-map types.

    ; Map type definitions.
    MapVal = { * SimpleVal => ItemVal }
    ItemVal = ListVal / SimpleVal

    MapType = [ "map", [ *SimpleType ] / nil, [ *ItemType ] / nil ]
    ItemType = ListType / SimpleType

They are encoded as:

    ; MapOp encodings
    MapOp = MapJoinOp / StructJoinOp

MapJoin

The MapJoin operation combines homogeneous maps (that is, maps from a single key type to a single value type.)

Parameters: KEY_MIN_COUNT (an integer >= 1), KEY_TYPE (a SimpleType), ITEM_OP (a non-MapJoin voting operation)

Encoding:

    ; MapJoin operation encoding
    MapJoinOp = {
       op: "MapJoin",
       ? key_min_count: IntOpArgument, ; Default 1.
       key_type: SimpleType,
       item_op: ListOp / SimpleOp
    }

First, discard all votes that are not maps. Then consider the set of keys from each vote as if they were a list, and apply SetJoin[KEY_MIN_COUNT,KEY_TYPE] to those lists. The resulting list is a set of keys to consider including in the output map.

We have a separate key_min_count field, even if item_op has its own min_count field, because some min_count values (like qfield) depend on the overall number of votes for the field. Having key_min_count lets us specify rules like "the FOO of all votes on this field, if there are at least 2 such votes."

For each key in the output list, run the sub-voting operation ITEM_OP on the values it received in the votes. Discard all keys for which the outcome was "no consensus".

The final vote result is a map from the remaining keys to the values produced by the voting operation.
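
The two stages of MapJoin (first choose keys, then vote on each key's values) can be sketched as below. Here ITEM_OP is modeled as a Python callable that takes a list of values and returns either a result or the `NO_CONSENSUS` sentinel; that modeling, and all the names, are assumptions of this sketch.

```python
from collections import Counter

NO_CONSENSUS = object()  # sentinel for "no consensus"


def map_join_op(votes, item_op, key_min_count=1, is_key_type=lambda k: True):
    """MapJoin: SetJoin over the keys, then item_op over each key's values."""
    maps = [v for v in votes if isinstance(v, dict)]
    # Stage 1: which keys are listed by at least key_min_count votes?
    key_counts = Counter()
    for m in maps:
        key_counts.update(k for k in m if is_key_type(k))
    # Stage 2: vote on the values for each surviving key.
    result = {}
    for key, count in key_counts.items():
        if count < key_min_count:
            continue
        outcome = item_op([m[key] for m in maps if key in m])
        if outcome is not NO_CONSENSUS:
            result[key] = outcome
    return result
```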

StructJoin

A StructJoinOp operation describes a way to vote on maps that encode a structure-like object.

Parameters: KEY_RULES (a map from int or string to StructItemOp) UNKNOWN_RULE (An operation to apply to unrecognized keys.)

    ; Encoding
    StructItemOp = ListOp / SimpleOp / MapJoinOp / DerivedItemOp /
        CborDerivedItemOp

    VoteableStructKey = int / tstr

    StructJoinOp = {
        op: "StructJoin",
        key_rules: {
            * VoteableStructKey => StructItemOp,
        },
        ? unknown_rule: StructItemOp
    }

To apply a StructJoinOp to a set of votes, first discard every vote that is not a map. Then consider the set of keys from all the votes as a single list, with duplicates removed. Also remove all entries that are not integers or strings from the list of keys.

For each key, look for that key in the KEY_RULES map. If there is an entry, apply the StructItemOp for that entry to the values for that key in every vote. Otherwise, if an UNKNOWN_RULE is given, apply the UNKNOWN_RULE operation to the values for that key in every vote. Otherwise, there is no consensus for the values of this key. If there is a consensus for the values, then the key should map to that consensus in the result.

This operation always reaches a consensus, even if it is an empty map.
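
A sketch of the per-key rule lookup in StructJoin follows. As an assumption of this sketch, each StructItemOp is modeled as a Python callable from a list of values to a result or `NO_CONSENSUS`, and DerivedFrom rules (which need access to the whole consensus) are not handled. The names are hypothetical.

```python
NO_CONSENSUS = object()  # sentinel for "no consensus"


def struct_join_op(votes, key_rules, unknown_rule=None):
    """StructJoin: apply a per-key rule, falling back to unknown_rule."""
    maps = [v for v in votes if isinstance(v, dict)]
    # Collect keys from all votes, without duplicates; only integer
    # and string keys are considered.
    keys = []
    for m in maps:
        for k in m:
            if isinstance(k, bool) or not isinstance(k, (int, str)):
                continue
            if k not in keys:
                keys.append(k)
    result = {}
    for key in keys:
        rule = key_rules.get(key, unknown_rule)
        if rule is None:
            continue  # no rule at all: no consensus for this key
        outcome = rule([m[key] for m in maps if key in m])
        if outcome is not NO_CONSENSUS:
            result[key] = outcome
    return result  # always a map, possibly empty
```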

CborData

A CborData operation wraps another operation, and tells the authorities that after the operation is completed, its result should be decoded as a CBOR bytestring and interpolated directly into the document.

Parameters: ITEM_OP (Any SimpleOp that can take a bstr input.)

    ; Encoding
    CborSimpleOp = {
        op: "CborSimple",
        item-op: MedianOp / ModeOp / ThresholdOp / NoneOp
    }
    CborDerivedItemOp = {
        op: "CborDerived",
        item-op: DerivedItemOp,
    }

To apply either of these operations to a set of votes, first apply ITEM_OP to those votes. After that's done, check whether the consensus from that operation is a bstr that encodes a single item of "well-formed" "valid" cbor. If it is not, this operation gives no consensus. Otherwise, the consensus value for this operation is the decoding of that bstr value.

DerivedFromField

This operation can only occur within a StructJoinOp operation (or a semantically similar SectionRules). It indicates that one field should have been derived from another. It can be used, for example, to say that a relay's version is "derived from" a relay's descriptor digest.

Unlike other operations, this one depends on the entire consensus (as computed so far), and on the entirety of the set of votes.

This operation might be a mistake, but we need it to continue lots of our current behavior.

Parameters: FIELDS (one or more other locations in the vote) RULE (the rule used to combine values)

Encoding:

    ; This item is "derived from" some other field.
    DerivedItemOp = {
        op: "DerivedFrom",
        fields: [ +SourceField ],
        rule: SimpleOp
    }

    ; A field in the vote.
    SourceField = [ FieldSource, VoteableStructKey ]

    ; A location in the vote.  Each location here can only
    ; be referenced from later locations, or from itself.
    FieldSource = "M" ; Meta.
               / "CP" ; ClientParam.
               / "SP" ; ServerParam.
               / "RM" ; Relay-meta
               / "RS" ; Relay-SNIP
               / "RL" ; Relay-legacy

To compute a consensus with this operation, first locate each field described in the SourceField entry in each VoteDocument (if present), and in the consensus computed so far. If there is no such field in the consensus or if it has not been computed yet, then this operation produces "no consensus". Otherwise, discard the VoteDocuments that do not have the same value for the field as the consensus, and their corresponding votes for this field. Do this for every listed field.

At this point, we have a set of votes for this field's value that all come from VoteDocuments that describe the same value for the source field(s). Apply the RULE operation to those votes in order to give the result for this voting operation.

The DerivedFromField members in a SectionRules or a StructJoinOp should be computed after the other members, so that they can refer to those members themselves.

Voting on document sections

Voting on a section of the document is similar to the StructJoin operation, with some exceptions. When we vote on a section of the document, we do not apply a single voting rule immediately. Instead, we first "merge" a set of SectionRules together, and then apply the merged rule to the votes. This is the only place where we merge rules like this.

A SectionRules is not a voting operation, so its format is not tagged with an "op":

    ; Format for section rules.
    SectionRules = {
      * VoteableStructKey => SectionItemOp,
      ? nil => SectionItemOp
    }
    SectionItemOp = StructJoinOp / StructItemOp

To merge a set of SectionRules together, proceed as follows. For each key, consider whether at least QUORUM_AUTH authorities have voted the same StructItemOp for that key. If so, that StructItemOp is the resulting operation for this key. Otherwise, there is no entry for this key.

Do the same for the "nil" StructItemOp; use the result as the UNKNOWN_RULE.

Note that this merging operation is not recursive.
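
The merge step can be sketched as follows, under the assumption that each authority's SectionRules has been decoded into a Python dict whose values are decoded operation objects, with the CBOR "nil" key represented as `None`. Two operations count as "the same" here if their decoded forms compare equal. The function name is hypothetical.

```python
def merge_section_rules(rule_sets, n_auth):
    """Keep, for each key, the operation voted by >= QUORUM_AUTH authorities."""
    quorum_auth = n_auth // 2 + 1
    keys = []
    for rules in rule_sets:
        for key in rules:
            if key not in keys:
                keys.append(key)
    merged = {}
    for key in keys:
        tallies = []  # [operation, number of authorities voting for it]
        for rules in rule_sets:
            if key not in rules:
                continue
            for tally in tallies:
                if tally[0] == rules[key]:
                    tally[1] += 1
                    break
            else:
                tallies.append([rules[key], 1])
        for op, count in tallies:
            if count >= quorum_auth:  # at most one operation can qualify
                merged[key] = op
    # merged.get(None), if present, is the UNKNOWN_RULE.
    return merged
```

The comparison is shallow on purpose: the operations are compared as whole objects and are never merged field by field, matching the note that merging is not recursive.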

A CBOR-based metaformat for votes.

A vote is a signed document containing a number of sections; each section corresponds roughly to a section of another document, a description of how the vote is to be conducted, or both.

    ; VoteDocument is a top-level signed vote.
    VoteDocument = [
        ; Each signature may be produced by a different key, if they
        ; are all held by the same authority.
        sig: [ + SingleSig ],
        lifetime: Lifespan,
        digest-alg: DigestAlgorithm,
        body: bstr .cbor VoteContent
    ]

    VoteContent = {
        ; List of supported consensus methods.
        consensus-methods: [ + uint ],

        ; Text-based legacy vote to be used if the negotiated
        ; consensus method is too old.  It should itself be signed.
        ; It's encoded as a series of text chunks, to help with
        ; cbor-based binary diffs.
        ? legacy-vote: [ * tstr ],

        ; How should the votes within the individual sections be
        ; computed?
        voting-rules: VotingRules,

        ; Information that the authority wants to share about this
        ; vote, which is not itself voted upon.
        notes: NoteSection,

        ; Meta-information that the authorities vote on, which does
        ; not actually appear in the ENDIVE or consensus directory.
        meta: MetaSection .within VoteableSection,

        ; Fields that appear in the client network parameter document.
        client-params: ParamSection .within VoteableSection,
        ; Fields that appear in the server network parameter document.
        server-params: ParamSection .within VoteableSection,

        ; Information about each relay.
        relays: RelaySection,

        ; Information about indices.
        indices: IndexSection,

        * tstr => any
    }

    ; Self-description of a voter.
    VoterSection = {
        ; human-memorable name
        name: tstr,

        ; List of link specifiers to use when uploading to this
        ; authority. (See proposal for dirport link specifier)
        ? ul: [ *LinkSpecifier ],

        ; List of link specifiers to use when downloading from this authority.
        ? dl: [ *LinkSpecifier ],

        ; contact information for this authority.
        ? contact: tstr,

        ; legacy certificate in format given by dir-spec.txt.
        ? legacy-cert: tstr,

        ; for extensions
        * tstr => any,
    }

    ; An IndexSection says how we think routing indices should be built.
    IndexSection = {
        * IndexId => bstr .cbor [ IndexGroupId, GenericIndexRule ],
    }
    IndexGroupId = uint

    ; A mechanism for building a single routing index.  Actual values need to
    ; be within RecognizedIndexRule or the authority can't complete the
    ; consensus.
    GenericIndexRule = {
        type: tstr,
        * tstr => any
    }
    RecognizedIndexRule = EdIndex / RSAIndex / BWIndex / WeightedIndex

    ; The values in an EdIndex are derived from digests of Ed25519 keys.
    EdIndex = {
        type: "ed-id",
        alg: DigestAlgorithm,
        prefix: bstr,
        suffix: bstr
    }

    ; The values in an RSAIndex are derived from RSA keys.
    RSAIndex = {
        type: "rsa-id"
    }

    ; A BWIndex is built by taking some uint-valued field referred to by
    ; SourceField from all the relays that have all of require_flags set.
    BWIndex = {
        type: "bw",
        bwfield: SourceField,
        require_flags: FlagSet,
    }

    ; A flag can be prefixed with "!" to indicate negation.  A flag
    ; with a name like P@X indicates support for port class 'X' in its
    ; exit policy.
    ;
    ; FUTURE WORK: perhaps we should add more structure here and it
    ; should be a matching pattern?
    FlagSet = [ *tstr ]

    ; A WeightedIndex applies a set of weights to a BWIndex based on which
    ; flags the various routers have.  Relays that match a set of flags have
    ; their weights multiplied by the corresponding WeightVal.
    WeightedIndex = {
        type: "weighted",
        source: BWIndex,
        weight: { * FlagSet => WeightVal }
    }

    ; A WeightVal is either an integer to multiply bandwidths by, or a
    ; string from the Wgg, Weg, Wbm, ... set as documented in dir-spec.txt,
    ; or a reference to an earlier field.
    WeightVal = uint / tstr / SourceField

    VoteableValue = MapVal / ListVal / SimpleVal

    ; A "VoteableSection" is something that we apply part of the
    ; voting rules to.  When we apply voting rules to these sections,
    ; we do so without regard to their semantics.  When we are done,
    ; we use these consensus values to make the final consensus.
    VoteableSection = {
        VoteableStructKey => VoteableValue,
    }

    ; A NoteSection is used to convey information about the voter and
    ; its vote that is not actually voted on.
    NoteSection = {
        ; Information about the voter itself
        voter: VoterSection,

        ; Information that the voter used when assigning flags.
        ? flag-thresholds: { tstr => any },

        ; Headers from the bandwidth file to be reported as part of
        ; the vote.
        ? bw-file-headers: {tstr => any },

        ? shared-rand-commit: SRCommit,

        * VoteableStructKey => VoteableValue,
    }

    ; Shared random commitment; fields are as for the current
    ; shared-random-commit fields.
    SRCommit = {
        ver: uint,
        alg: DigestAlgorithm,
        ident: bstr,
        commit: bstr,
        ? reveal: bstr
    }

    ; the meta-section is voted on, but does not appear in the ENDIVE.
    MetaSection = {
        ; Seconds to allocate for voting and distributing signatures
        ; Analogous to the "voting-delay" field in the legacy algorithm.
        voting-delay: [ vote_seconds: uint, dist_seconds: uint ],

        ; Proposed time till next vote.
        voting-interval: uint,

        ; proposed lifetime for the SNIPs and ENDIVEs
        snip-lifetime: Lifespan,

        ; proposed lifetime for client params document
        c-param-lifetime: Lifespan,

        ; proposed lifetime for server params document
        s-param-lifetime: Lifespan,

        ; signature depth for ENDIVE
        signature-depth: uint,

        ; digest algorithm to use with ENDIVE.
        signature-digest-alg: DigestAlgorithm,

        ; Current and previous shared-random values
        ? cur-shared-rand: [ reveals: uint, rand: bstr ],
        ? prev-shared-rand: [ reveals: uint, rand: bstr ],

        ; extensions.
        * VoteableStructKey => VoteableValue,
    }

    ; A ParamSection will be made into a ParamDoc after voting;
    ; the fields are analogous.
    ParamSection = {
        ? certs: [ 1*2 bstr .cbor VoterCert ],
        ? recommend-versions: [ * tstr ],
        ? require-protos: ProtoVersions,
        ? recommend-protos: ProtoVersions,
        ? params: NetParams,
        * VoteableStructKey => VoteableValue,
    }

    RelaySection = {
        ; Mapping from relay identity key (or digest) to relay information.
        * bstr => RelayInfo,
    }

    ; A RelayInfo is a vote about a single relay.
    RelayInfo = {
        meta: RelayMetaInfo .within VoteableSection,
        snip: RelaySNIPInfo .within VoteableSection,
        legacy: RelayLegacyInfo .within VoteableSection,
    }

    ; Information about a relay that doesn't go into a SNIP.
    RelayMetaInfo = {
        ; Tuple of published-time and descriptor digest.
        ? desc: [ uint , bstr ],

        ; What flags are assigned to this relay?  We use a
        ; string->value encoding here so that only the authorities
        ; who have an opinion on the status of a flag for a relay need
        ; to vote yes or no on it.
        ? flags: { *tstr=>bool },

        ; The relay's self-declared bandwidth.
        ? bw: uint,

        ; The relay's measured bandwidth.
        ? mbw: uint,

        ; The fingerprint of the relay's RSA identity key.
        ? rsa-id: RSAIdentityFingerprint
    }

    ; SNIP information can just be voted on directly; the formats
    ; are the same.
    RelaySNIPInfo = SNIPRouterData

    ; Legacy information is used to build legacy consensuses, but
    ; not actually required by walking onions clients.
    RelayLegacyInfo = {
        ; Mapping from consensus version to microdescriptor digests
        ; and microdescriptors.
        ? mds: [ *Microdesc ],
    }

    ; Microdescriptor votes now include the digest AND the
    ; microdescriptor-- see note.
    Microdesc = [
        low: uint,
        high: uint,
        digest: bstr .size 32,
        ; This is encoded in this way so that cbor-based diff tools
        ; can see inside it.  Because of compression and diffs,
        ; including microdesc text verbatim should be comparatively cheap.
        content: encoded-cbor .cbor [ *tstr ],
    ]

    ; ==========

    ; The VotingRules field explains how to vote on the members of
    ; each section
    VotingRules = {
        meta: SectionRules,
        params: SectionRules,
        relay: RelayRules,
        indices: SectionRules,
    }

    ; The RelayRules object explains the rules that apply to each
    ; part of a RelayInfo.  A key will appear in the consensus if it
    ; has been listed by at least key_min_count authorities.
    RelayRules = {
        key_min_count: IntOpArgument,
        meta: SectionRules,
        snip: SectionRules,
        legacy: SectionRules,
    }

Computing a consensus.

To compute a consensus, the authorities first verify that all the votes are timely and correctly signed by real authorities. This includes validating all invariants stated here, and all internal documents.

If the authorities have two votes from a single authority, they SHOULD issue a warning, and they should take the one that is published more recently.

TODO: Teor suggests that maybe we shouldn't warn about two votes from an authority for the same period, and we could instead have a more resilient process here, where authorities can update their votes at various times over the voting period, up to some point.

I'm not sure whether this helps reliability more or less than it risks it, but it is worth investigating.

Next, the authorities determine the consensus method as they do today, using the field "consensus-method". This can also be expressed as the voting operation Threshold[SUPERQUORUM_PRESENT, false, uint].

If there is no consensus for the consensus-method, then voting stops without having produced a consensus.
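
Read as Threshold[SUPERQUORUM_PRESENT, false, uint], the negotiation amounts to picking the highest method that more than two thirds of the present voters support. The sketch below assumes that each vote's list of supported methods counts as one vote for each method it lists; that reading, the function name, and the use of `None` for "no consensus" are assumptions of this sketch.

```python
from collections import Counter


def choose_consensus_method(method_lists):
    """Pick the highest method listed by at least SUPERQUORUM_PRESENT votes."""
    n_present = len(method_lists)
    superquorum_present = n_present * 2 // 3 + 1
    counts = Counter()
    for methods in method_lists:
        counts.update(set(methods))  # one count per vote per method
    supported = [m for m, c in counts.items() if c >= superquorum_present]
    # multi_low is false, so ties in eligibility go to the highest value.
    return max(supported) if supported else None
```

With four votes listing [28, 29, 30], [28, 29], [29, 30], and [28, 29, 30], the threshold is 3 votes and the result is method 30.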

Note that in contrast with its behavior in the current voting algorithm, the consensus method does not determine the way to vote on every individual field: that aspect of voting is controlled by the voting-rules. Instead, the consensus-method changes other aspects of this voting, such as:

  • Adding, removing, or changing the semantics of voting operations.

  • Changing the set of documents to which voting operations apply.

  • Otherwise changing the rules that are set out in this document.

Once a consensus-method is decided, the next step is to compute the consensus for other sections in this order: meta, client-params, server-params, and indices. The consensus for each is calculated according to the operations given in the corresponding section of VotingRules.

Next the authorities compute a consensus on the relays section, which is done slightly differently, according to the rules of the RelayRules element of VotingRules.

Finally, the authorities transform the resulting sections into an ENDIVE and a legacy consensus, as in "Computing an ENDIVE" and "Computing a legacy consensus" below.

To vote on a single VoteableSection, find the corresponding SectionRules objects in the VotingRules of the votes, and apply them as described above in "Voting on document sections".

If an older consensus method is negotiated (Transitional)

The legacy-vote field in the vote document contains an older (v3, text-style) consensus vote, and is used when an older consensus method is negotiated. The legacy-vote is encoded by splitting it into pieces, to help with CBOR diff calculation. Authorities MAY split at line boundaries, space boundaries, or anywhere that will help with diffs. To reconstruct the legacy vote, concatenate the members of legacy-vote in order. The resulting string MUST validate according to the rules of the legacy voting algorithm.

If a legacy vote is present, then authorities SHOULD include the same view of the network in the legacy vote as they included in their real vote.

If a legacy vote is present, then authorities MUST give the same list of consensus-methods and the same voting schedule in both votes. Authorities MUST reject noncompliant votes.

Computing an ENDIVE.

If a consensus-method is negotiated that is high enough to support ENDIVEs, then the authorities proceed as follows to transform the consensus sections above into an ENDIVE.

The ParamSections from the consensus are used verbatim as the bodies of the client-params and relay-params fields.

The fields that appear in each RelaySNIPInfo determine what goes into the SNIPRouterData for each relay. To build the relay section, first decide which relays appear according to the key_min_count field in the RelayRules. Then collate relays across all the votes by their keys, and see which ones are listed. For each key that appears in at least key_min_count votes, apply the RelayRules to each section of the RelayInfos for that key.

The sig_params section is derived from fields in the meta section. Fields with identical names are simply copied; Lifespan values are copied to the corresponding documents (snip-lifetime as the lifespan for SNIPs and ENDIVEs, and c-param-lifetime and s-param-lifetime as the lifespans for ParamDocs).

To compute the signature nonce, use the signature digest algorithm to compute the digest of each input vote body, sort those digests lexicographically, and concatenate and hash those digests again.
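The nonce computation above can be sketched as follows. This is a minimal illustration, not a normative definition: the function name `signature_nonce` is hypothetical, and SHA3-256 is only an assumed stand-in for "the signature digest algorithm", which is selected elsewhere in the proposal.

```python
import hashlib

def signature_nonce(vote_bodies, digest=hashlib.sha3_256):
    """Digest each vote body, sort the digests, then hash their concatenation.

    ASSUMPTION: `digest` stands in for the signature digest algorithm;
    SHA3-256 is used here only as an example.
    """
    # Python compares bytes lexicographically, matching the sort in the text.
    digests = sorted(digest(body).digest() for body in vote_bodies)
    return digest(b"".join(digests)).digest()
```

Because the digests are sorted before the second hash, every authority derives the same nonce regardless of the order in which it received the votes.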

Routing indices are built according to named IndexRules, and grouped according to fields in the meta section. See "Constructing Indices" below.

(At this point extra fields may be copied from the Meta section of each RelayInfo into the ENDIVERouterData depending on the meta document; we do not, however, currently specify any case where this is done.)

Constructing indices

After having built the list of relays, the authorities construct and encode the indices that appear in the ENDIVEs. The voted-upon GenericIndexRule values in the IndexSection of the consensus say how to build the indices in the ENDIVE, as follows.

An EdIndex is built using the IndexType_Ed25519Id value, with the provided prefix and suffix values. Authorities don't need to expand this index in the ENDIVE, since the relays can compute it deterministically.

An RSAIndex is built using the IndexType_RSAId type. Authorities don't need to expand this index in the ENDIVE, since the relays can compute it deterministically.

A BwIndex is built using the IndexType_Weighted type. Each relay has a weight equal to some specified bandwidth field in its consensus RelayInfo. If a relay is missing any of the required_flags in its meta section, or if it does not have the specified bandwidth field, that relay's weight becomes 0.

A WeightedIndex is built by computing a BwIndex, and then transforming each relay in the list according to the flags that it has set. Relays that match any set of flags in the WeightedIndex rule get their bandwidths multiplied by all WeightVals that apply. Some WeightVals are computed according to special rules, such as "Wgg", "Weg", and so on. These are taken from the current dir-spec.txt.

For both BwIndex and WeightedIndex values, authorities MUST scale the computed outputs so that no value is greater than UINT32_MAX; they MUST do so by shifting all values right by the lowest number of bits that achieves this.

We could specify a more precise algorithm, but this is simpler.
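As a small sketch of the shift-based scaling rule, assuming the weights are held as non-negative Python integers (the function name `scale_weights` is hypothetical):

```python
UINT32_MAX = 0xFFFFFFFF

def scale_weights(values):
    """Shift every value right by the smallest bit count that makes all fit in 32 bits."""
    largest = max(values, default=0)
    # A value fits in 32 bits exactly when its bit length is at most 32.
    shift = max(0, largest.bit_length() - 32)
    return [v >> shift for v in values]
```

For example, `scale_weights([1 << 32, 6])` shifts both entries right by one bit, while a list whose largest entry is already at most UINT32_MAX is returned unchanged.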

Indices with the same IndexGroupId are placed in the same index group; index groups are ordered numerically.

Computing a legacy consensus.

When using a consensus method that supports Walking Onions, the legacy consensus is computed from the same data as the ENDIVE. Because the legacy consensus format will be frozen once Walking Onions is finalized, we specify this transformation directly, rather than in a more extensible way.

The published time and descriptor digest are used directly. Microdescriptor negotiation proceeds as before. Bandwidths, measured bandwidths, descriptor digests, published times, flags, and rsa-id values are taken from the RelayMetaInfo section. Addresses, protovers, versions, and so on are taken from the RelaySNIPInfo. Header fields are all taken from the corresponding header fields in the MetaSection or the ClientParamsSection. All parameters are copied into the net-params field.

Managing indices over time.

The present voting mechanism does not do a great job of handling the authorities

The semantic meaning of most IndexId values, as understood by clients, should remain unchanging; if a client uses index 6 for middle nodes, 6 should always mean "middle nodes".

If an IndexId is going to change its meaning over time, it should not be hardcoded by clients; it should instead be listed in the NetParams document, as the exit indices are in the port-classes field. (See also section 6 and appendix AH.) If such a field needs to change, it also needs a migration method that allows clients with older and newer parameters documents to exist at the same time.

Relay operations: Receiving and expanding ENDIVEs

Previously, we introduced a format for ENDIVEs to be transmitted from authorities to relays. To save on bandwidth, the relays download diffs rather than entire ENDIVEs. The ENDIVE format makes several choices in order to make these diffs small: the Merkle tree is omitted, and routing indices are not included directly.

To address those issues, this document describes the steps that a relay needs to perform, upon receiving an ENDIVE document, to derive all the SNIPs for that ENDIVE.

Here are the steps to be followed. We'll describe them in order, though in practice they could be pipelined somewhat. We'll expand further on each step later on.

  1. Compute routing index positions.

  2. Compute truncated SNIPRouterData variations.

  3. Build signed SNIP data.

  4. Compute Merkle tree.

  5. Build authenticated SNIPs.

Below we'll give specific algorithms for these steps. Note that relays do not need to follow the steps of these algorithms exactly, but they MUST produce the same outputs as if they had followed them.

Computing index positions.

For every IndexId in every Index Group, the relay will compute the full routing index. Every routing index is a mapping from index position ranges (represented as 2-tuples) to relays, where the relays are represented as ENDIVERouterData members of the ENDIVE. The routing index must map every possible value of the index to exactly one relay.

An IndexSpec field describes how the index is to be constructed. There are five types of IndexSpec: Raw, Raw Spans, Weighted, RSAId, and Ed25519Id. We'll describe how to build the indices for each.

Every index may either have an integer key, or a binary-string key. We define the "successor" of an integer index as the succeeding integer. We define the "successor" of a binary string as the next binary string of the same length in lexicographical (memcmp) order. We define "predecessor" as the inverse of "successor". Both these operations "wrap around" the index.
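The wrap-around "successor" and "predecessor" operations can be sketched as below. This is an illustration under stated assumptions: integer keys are taken to be 32 bits wide (the `width_bits` parameter is hypothetical), and binary-string keys are treated as big-endian integers, which yields the same ordering as memcmp for strings of equal length.

```python
def successor(key, width_bits=32):
    """Next key in order, wrapping around at the end of the index."""
    if isinstance(key, int):
        return (key + 1) % (1 << width_bits)
    n = len(key)
    value = (int.from_bytes(key, "big") + 1) % (1 << (8 * n))
    return value.to_bytes(n, "big")

def predecessor(key, width_bits=32):
    """Inverse of successor(), with the same wrap-around behavior."""
    if isinstance(key, int):
        return (key - 1) % (1 << width_bits)
    n = len(key)
    value = (int.from_bytes(key, "big") - 1) % (1 << (8 * n))
    return value.to_bytes(n, "big")
```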

The algorithms here describe a set of invariants that are "verified". Relays SHOULD check each of these invariants; authorities MUST NOT generate any ENDIVEs that violate them. If a relay encounters an ENDIVE that cannot be verified, then the ENDIVE cannot be expanded.

NOTE: Conceivably, should there be some way to define an index as a subset of another index, with elements weighted in different ways? In other words, "Index a is index b, except multiply these relays by 0 and these relays by 1.2". We can keep this idea sitting around in case there turns out to be a use for it.

Raw indices

When the IndexType is Indextype_Raw, then its members are listed directly in the IndexSpec.

Algorithm: Expanding a "Raw" indexspec.

    Let result_idx = {} (an empty mapping).
    Let previous_pos = indexspec.first_index.
    For each element [i, pos2] of indexspec.index_ranges:
        Verify that i is a valid index into the list of ENDIVERouterData.
        Set pos1 = the successor of previous_pos.
        Verify that pos1 and pos2 have the same type.
        Append the mapping (pos1, pos2) => i to result_idx.
        Set previous_pos to pos2.
    Verify that previous_pos = the predecessor of indexspec.first_index.
    Return result_idx.

Raw numeric indices

If the IndexType is Indextype_RawNumeric, it is described by a set of spans on a 32-bit index range.

Algorithm: Expanding a RawNumeric index.

    Let result_idx = {} (an empty mapping).
    Let prev_pos = 0.
    For each element [i, span] of indexspec.index_ranges:
        Verify that i is a valid index into the list of ENDIVERouterData.
        Verify that prev_pos <= UINT32_MAX - span.
        Let pos2 = prev_pos + span.
        Append the mapping (prev_pos, pos2) => i to result_idx.
        Let prev_pos = successor(pos2).
    Verify that prev_pos = UINT32_MAX.
    Return result_idx.

Weighted indices

If the IndexSpec type is Indextype_Weighted, then the index is described by assigning a probability weight to each of a number of relays. From these, we compute a series of 32-bit index positions.

This algorithm uses 64-bit math, and 64-by-32-bit integer division.

It requires that the sum of weights is no more than UINT32_MAX.

Algorithm: Expanding a "Weighted" indexspec.

    Let total_weight = SUM(indexspec.index_weights).
    Verify total_weight <= UINT32_MAX.
    Let total_so_far = 0.
    Let result_idx = {} (an empty mapping).
    Define POS(b) = FLOOR( (b << 32) / total_weight ).
    For 0 <= i < LEN(indexspec.index_weights):
        Let w = indexspec.index_weights[i].
        Let lo = POS(total_so_far).
        Let total_so_far = total_so_far + w.
        Let hi = POS(total_so_far) - 1.
        Append (lo, hi) => i to result_idx.
    Verify that total_so_far = total_weight.
    Verify that the last value of "hi" was UINT32_MAX.
    Return result_idx.

This algorithm is a bit finicky in its use of division, but it results in a mapping onto 32 bit integers that completely covers the space of available indices.
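The weighted expansion can be sketched in Python as follows. The function name and the list-of-pairs return shape are illustrative choices, and rejecting a zero total weight is an added assumption (the division would otherwise be undefined). Python integers are arbitrary precision, so the 64-bit intermediate values need no special handling here.

```python
UINT32_MAX = 0xFFFFFFFF

def expand_weighted(index_weights):
    """Map each relay position i to a contiguous (lo, hi) range of 32-bit positions."""
    total_weight = sum(index_weights)
    if not 0 < total_weight <= UINT32_MAX:
        raise ValueError("total weight must be in 1..UINT32_MAX")

    def pos(b):
        # FLOOR((b << 32) / total_weight)
        return (b << 32) // total_weight

    result = []
    total_so_far = 0
    hi = None
    for i, w in enumerate(index_weights):
        lo = pos(total_so_far)
        total_so_far += w
        hi = pos(total_so_far) - 1
        # Following the pseudocode literally: a zero weight yields hi < lo.
        result.append(((lo, hi), i))
    if hi != UINT32_MAX:
        raise ValueError("index does not cover the full 32-bit space")
    return result
```

With two equally weighted relays, `expand_weighted([1, 1])` splits the space exactly in half: positions 0 through 0x7FFFFFFF go to relay 0, and the rest to relay 1. Each range begins one position after the previous one ends, which is why the final "hi" always lands on UINT32_MAX.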

RSAId indices

If the IndexSpec type is Indextype_RSAId then the index is a set of binary strings describing the routers' legacy RSA identities, for use in the HSv2 hash ring.

These identities are truncated to a fixed length. Though the SNIP format allows variable-length binary prefixes, we do not use this feature.

Algorithm: Expanding an "RSAId" indexspec.

    Let R = [ ] (an empty list).
    Take the value n_bytes from the IndexSpec.
    For 0 <= b_idx < MIN( LEN(indexspec.members) * 8, LEN(list of ENDIVERouterData) ):
        Let b = the b_idx'th bit of indexspec.members.
        If b is 1:
            Let m = the b_idx'th member of the ENDIVERouterData list.
            Verify that m has its RSAIdentityFingerprint set.
            Let pos = m.RSAIdentityFingerprint, truncated to n_bytes.
            Add (pos, b_idx) to the list R.
    Return INDEX_FROM_RING_KEYS(R).

Sub-Algorithm: INDEX_FROM_RING_KEYS(R)

    First, sort R according to its 'pos' field.
    For each member (pos, idx) of the list R:
        If this is the first member of the list R:
            Let key_low = pos for the last member of R.
        else:
            Let key_low = pos for the previous member of R.
        Let key_high = predecessor(pos).
        Add (key_low, key_high) => idx to result_idx.
    Return result_idx.
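A minimal sketch of the INDEX_FROM_RING_KEYS step, following the pseudocode literally. It assumes R is a list of (pos, idx) pairs whose pos values are byte strings of equal length; how to handle two relays whose truncated keys collide is not addressed here.

```python
def predecessor(key: bytes) -> bytes:
    """Previous byte string of the same length, wrapping around at zero."""
    n = len(key)
    return ((int.from_bytes(key, "big") - 1) % (1 << (8 * n))).to_bytes(n, "big")

def index_from_ring_keys(ring):
    """Turn sorted ring positions into a {(key_low, key_high): idx} mapping."""
    ring = sorted(ring)  # orders by pos first
    result_idx = {}
    for n, (pos, idx) in enumerate(ring):
        # For the first member, ring[n - 1] is ring[-1]: the last member of R.
        key_low = ring[n - 1][0]
        key_high = predecessor(pos)
        result_idx[(key_low, key_high)] = idx
    return result_idx
```

For a ring with one-byte positions 0x10 (relay 0) and 0x80 (relay 1), this produces the range (0x80, 0x0f), which wraps around the end of the ring, for relay 0, and the range (0x10, 0x7f) for relay 1.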

Ed25519 indices

If the IndexSpec type is Indextype_Ed25519, then the index is a set of binary strings describing the routers' positions in a hash ring, derived from their Ed25519 identity keys.

This algorithm is a generalization of the one used for hsv3 rings, to be used to compute the hsv3 ring and other possible future derivatives.

Algorithm: Expanding an "Ed25519Id" indexspec.

    Let R = [ ] (an empty list).
    Take the values prefix, suffix, and n_bytes from the IndexSpec.
    Let H() be the digest algorithm specified by d_alg from the IndexSpec.
    For 0 <= b_idx < MIN( LEN(indexspec.members) * 8, LEN(list of ENDIVERouterData) ):
        Let b = the b_idx'th bit of indexspec.members.
        If b is 1:
            Let m = the b_idx'th member of the ENDIVERouterData list.
            Let key = m's ed25519 identity key, as a 32-byte value.
            Compute pos = H(prefix || key || suffix)
            Truncate pos to n_bytes.
            Add (pos, b_idx) to the list R.
    Return INDEX_FROM_RING_KEYS(R).
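The loop that collects ring positions for an Ed25519Id index might look like the sketch below. Two points are assumptions rather than anything stated above: SHA3-256 stands in for whatever d_alg names, and bit 0 of the members bitfield is taken to be the most significant bit of its first byte. The function name is hypothetical.

```python
import hashlib

def ed25519_ring_keys(members, relay_keys, prefix, suffix, n_bytes,
                      digest=hashlib.sha3_256):
    """Return the list R of (pos, b_idx) pairs for the relays selected by `members`."""
    ring = []
    limit = min(len(members) * 8, len(relay_keys))
    for b_idx in range(limit):
        # ASSUMPTION: bits are numbered MSB-first within each byte.
        bit = (members[b_idx // 8] >> (7 - b_idx % 8)) & 1
        if bit:
            key = relay_keys[b_idx]  # 32-byte ed25519 identity key
            pos = digest(prefix + key + suffix).digest()[:n_bytes]
            ring.append((pos, b_idx))
    return ring
```

The resulting list is what would be handed to INDEX_FROM_RING_KEYS to produce the final index.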

Building a SNIPLocation

After computing all the indices in an IndexGroup, relays combine them into a series of SNIPLocation objects. Each SNIPLocation MUST contain all the IndexId => IndexRange entries that point to a given ENDIVERouterData, for the IndexIds listed in an IndexGroup.

Algorithm: Build a list of SNIPLocation objects from a set of routing indices.

    Initialize R as [ { } ] * LEN(relays) (A list of empty maps)
    For each IndexId "ID" in the IndexGroup:
        Let router_idx be the index map calculated for ID. (This is what we computed previously.)
        For each entry ( (LO, HI) => idx) in router_idx:
            Let R[idx][ID] = (LO, HI).

SNIPLocation objects are thus organized in the order in which they will appear in the Merkle tree: that is, sorted by the position of their corresponding ENDIVERouterData.

Because SNIPLocation objects are signed, they must be encoded as "canonical" CBOR, according to section 3.9 of RFC 7049.

If R[idx] is {} (the empty map) for any given idx, then no SNIP will be generated for the SNIPRouterData at that routing index for this index group.
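As a sketch of the SNIPLocation construction, assume each expanded routing index is available as a dict from (LO, HI) tuples to relay positions, keyed by IndexId in an outer dict. Both shapes, and the function name, are illustrative only.

```python
def build_snip_locations(n_relays, indices_by_id):
    """Return one {IndexId: (LO, HI)} map per relay, in ENDIVERouterData order."""
    # One independent dict per relay; writing [{}] * n_relays in Python would
    # alias a single dict across every slot.
    locations = [{} for _ in range(n_relays)]
    for index_id, router_idx in indices_by_id.items():
        for (lo, hi), idx in router_idx.items():
            locations[idx][index_id] = (lo, hi)
    return locations
```

A relay whose entry is still the empty map after this step appears in none of the group's indices, so no SNIP is generated for it in this index group.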

Computing truncated SNIPRouterData.

An index group can include an omit_from_snips field to indicate that certain fields from a SNIPRouterData should not be included in the SNIPs for that index group.

Since a SNIPRouterData needs to be signed, this process has to be deterministic. Thus, the truncated SNIPRouterData should be computed by removing the keys and values for EXACTLY the keys listed and no more. The remaining keys MUST be left in the same order that they appeared in the original SNIPRouterData, and they MUST NOT be re-encoded.

(Two keys are "the same" if and only if they are integers encoding the same value, or text strings with the same UTF-8 content.)

There is no need to compute a SNIPRouterData when no SNIP is going to be generated for a given router.

Building the Merkle tree.

After computing a list of (SNIPLocation, SNIPRouterData) for every entry in an index group, the relay needs to expand a Merkle tree to authenticate every SNIP.

There are two steps here: First the relay generates the leaves, and then it generates the intermediate hashes.

To generate the list of leaves for an index group, the relay first removes all entries from the (SNIPLocation, SNIPRouterData) list that have an empty index map. The relay then puts n_padding_entries "nil" entries at the end of the list.

To generate the list of leaves for the whole Merkle tree, the relay concatenates these index group lists in the order in which they appear in the ENDIVE, and pads the resulting list with "nil" entries until the length of the list is a power of two: 2^tree-depth for some integer tree-depth. Let LEAF(IDX) denote the entry at position IDX in this list, where IDX is a D-bit bitstring. LEAF(IDX) is either a byte string or nil.

The relay then recursively computes the hashes in the Merkle tree as follows. (Recall that H_node() and H_leaf() are hashes taking a bit-string PATH, a LIFESPAN and NONCE from the signature information, and a variable-length string ITEM.)

Recursive definition: HM(PATH)

    Given PATH a bitstring of length no more than tree-depth.

    Define S:
        S(nil) = an all-0 string of the same length as the hash output.
        S(x) = x, for all other x.

    If LEN(PATH) = tree-depth: (Leaf case.)
        If LEAF(PATH) = nil:
            HM(PATH) = nil.
        Else:
            HM(PATH) = H_leaf(PATH, LIFESPAN, NONCE, LEAF(PATH)).
    Else:
        Let LEFT = HM(PATH || 0)
        Let RIGHT = HM(PATH || 1)
        If LEFT = nil and RIGHT = nil:
            HM(PATH) = nil
        else:
            HM(PATH) = H_node(PATH, LIFESPAN, NONCE, S(LEFT) || S(RIGHT))

Note that entries aren't computed for "nil" leaves, or any node all of whose children are "nil". The "nil" entries only exist to place all leaves at a constant depth, and to enable spacing out different sections of the tree.

If signature-depth for the ENDIVE is N, the relay does not need to compute any Merkle tree entries for PATHs of length shorter than N bits.
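The recursion over the Merkle tree can be sketched as follows. The `h_leaf` and `h_node` functions here are placeholders invented for illustration; the real H_leaf() and H_node() are defined with the signature information elsewhere in the proposal. The sketch hashes leaves with the leaf function, which is an assumption about the intended role of H_leaf(). Paths are represented as strings of "0" and "1" characters, and nil as None.

```python
import hashlib

HASH_LEN = 32  # output length of the placeholder hash below

def h_leaf(path, lifespan, nonce, item):
    # PLACEHOLDER: not the proposal's actual H_leaf construction.
    return hashlib.sha3_256(b"\x00" + path.encode() + b"|" + lifespan + nonce + item).digest()

def h_node(path, lifespan, nonce, item):
    # PLACEHOLDER: not the proposal's actual H_node construction.
    return hashlib.sha3_256(b"\x01" + path.encode() + b"|" + lifespan + nonce + item).digest()

def hm(path, leaves, tree_depth, lifespan, nonce):
    """Return the hash for PATH, or None if every leaf below it is nil."""
    if len(path) == tree_depth:  # leaf case
        leaf = leaves[int(path, 2) if path else 0]
        return None if leaf is None else h_leaf(path, lifespan, nonce, leaf)
    left = hm(path + "0", leaves, tree_depth, lifespan, nonce)
    right = hm(path + "1", leaves, tree_depth, lifespan, nonce)
    if left is None and right is None:
        return None

    def s(x):
        # S(nil) is an all-zero string as long as the hash output.
        return bytes(HASH_LEN) if x is None else x

    return h_node(path, lifespan, nonce, s(left) + s(right))
```

This version recomputes subtrees on every call; a real relay would presumably compute each level once and cache it, and could skip paths shorter than signature-depth bits.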

Assembling the SNIPs

Finally, the relay has computed a list of encoded (SNIPLocation, RouterData) values, and a Merkle tree to authenticate them. At this point, the relay builds them into SNIPs, using the sig_params and signatures from the ENDIVE.

Algorithm: Building a SNIPSignature for a SNIP.

    Given a non-nil (SNIPLocation, RouterData) at leaf position PATH.

    Let SIG_IDX = PATH, truncated to signature-depth bits.
    Consider SIG_IDX as an integer.

    Let Sig = signatures[SIG_IDX] -- either the SingleSig or the MultiSig for this snip.

    Let HashPath = [] (an empty list).
    For bitlen = signature-depth+1 ... tree-depth-1:
        Let X = PATH, truncated to bitlen bits.
        Invert the final bit of X.
        Append HM(X) to HashPath.

    The SnipSignature's signature value is Sig, and its merkle_path is HashPath.

Implementation considerations

A relay only needs to hold one set of SNIPs at a time: once one ENDIVE's SNIPs have been extracted, then the SNIPs from the previous ENDIVE can be discarded.

To save memory, a relay MAY store SNIPs to disk, and mmap them as needed.

Extending circuits with Walking Onions

When a client wants to extend a circuit, there are several possibilities. It might need to extend to an unknown relay with specific properties. It might need to extend to a particular relay from which it has received a SNIP before. In both cases, there are changes to be made in the circuit extension process.

Further, there are changes we need to make for the handshake between the extending relay and the target relay. The target relay is no longer told by the client which of its onion keys it should use... so the extending relay needs to tell the target relay which keys are in the SNIP that the client is using.

Modifying the EXTEND/CREATE handshake

First, we will require that proposal 249 (or some similar proposal for wide CREATE and EXTEND cells) is in place, so that we can have EXTEND cells larger than can fit in a single cell. (See 319-wide-everything.md for an example proposal to supersede 249.)

We add new fields to the CREATE2 cell so that relays can send each other more information without interfering with the client's part of the handshake.

The CREATE2, CREATED2, and EXTENDED2 cells change as follows:

struct create2_body {
   // old fields
   u16 htype; // client handshake type
   u16 hlen; // client handshake length
   u8 hdata[hlen]; // client handshake data.

   // new fields
   u8 n_extensions;
   struct extension extension[n_extensions];
}

struct created2_body {
   // old fields
   u16 hlen;
   u8 hdata[hlen];

   // new fields
   u8 n_extensions;
   struct extension extension[n_extensions];
}

struct truncated_body {
   // old fields
   u8 errcode;

   // new fields
   u8 n_extensions;
   struct extension extension[n_extensions];
}

// EXTENDED2 cells can now use the same new fields as in the
// created2 cell.

struct extension {
   u16 type;
   u16 len;
   u8 body[len];
}
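To illustrate the extension layout, here is a hedged sketch of encoding and decoding the trailing extension list (a one-byte count followed by type/len/body records). It assumes network byte order for the u16 fields, as elsewhere in Tor's cell formats; the function names are hypothetical.

```python
import struct

def encode_extensions(exts):
    """Encode [(type, body), ...] as n_extensions followed by each extension."""
    if len(exts) > 255:
        raise ValueError("n_extensions is a u8")
    out = bytearray([len(exts)])
    for ext_type, body in exts:
        out += struct.pack("!HH", ext_type, len(body)) + body
    return bytes(out)

def decode_extensions(data):
    """Parse an extension list; return (extensions, remaining bytes)."""
    if not data:
        raise ValueError("missing n_extensions")
    n_extensions, offset = data[0], 1
    exts = []
    for _ in range(n_extensions):
        if len(data) - offset < 4:
            raise ValueError("truncated extension header")
        ext_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        body = data[offset:offset + length]
        if len(body) != length:
            raise ValueError("truncated extension body")
        exts.append((ext_type, body))
        offset += length
    return exts, data[offset:]
```

Returning the remaining bytes lets a caller treat anything after the last extension as padding to be ignored.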

These extensions are defined by this proposal:

[01] -- Partial_SNIPRouterData -- Sent from an extending relay to a target relay. This extension holds one or more fields from the SNIPRouterData that the extending relay is using, so that the target relay knows (for example) what keys to use. (These fields are determined by the "forward_with_extend" field in the ENDIVE.)

[02] -- Full_SNIP -- an entire SNIP that was used in an attempt to extend the circuit. This must match the client's provided index position.

[03] -- Extra_SNIP -- an entire SNIP that was not used to extend the circuit, but which the client requested anyway. This can be sent back from the extending relay when the client specifies multiple index positions, or uses a nonzero "nth" value in their snip_index_pos link specifier.

[04] -- SNIP_Request -- a 32-bit index position, or a single zero byte, sent away from the client. If the byte is 0, the originator does not want a SNIP. Otherwise, the originator does want a SNIP containing the router and the specified index. Other values are unspecified.

By default, EXTENDED2 cells are sent with a SNIP iff the corresponding EXTEND2 cell used a snip_index_pos link specifier, and CREATED2 cells are not sent with a SNIP.

We add a new link specifier type for a router index, using the following coding for its contents:

/* Using trunnel syntax here. */
struct snip_index_pos {
    u32 index_id; // which index is it?
    u8 nth; // how many SNIPs should be skipped/included?
    u8 index_pos[]; // extends to the end of the link specifier.
}

The index_pos field can be longer or shorter than the actual width of the router index. If it is too long, it is truncated. If it is too short, it is extended with zero-valued bytes.
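A small sketch of the link specifier body and of the length adjustment just described. It assumes that truncation and zero-extension both apply at the trailing end of index_pos, and that multi-byte fields use network byte order; both function names are hypothetical.

```python
import struct

def normalize_index_pos(index_pos: bytes, width: int) -> bytes:
    """Truncate or zero-extend index_pos to the index's actual width in bytes.

    ASSUMPTION: extra bytes are dropped from, and zero bytes added to, the end.
    """
    return index_pos[:width].ljust(width, b"\x00")

def encode_snip_index_pos(index_id: int, nth: int, index_pos: bytes) -> bytes:
    """Encode the body of a snip_index_pos link specifier."""
    return struct.pack("!IB", index_id, nth) + index_pos
```

Under these assumptions, a client could send a short prefix of a position and let the relay extend it to the index's full width.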

Any number of these link specifiers may appear in an EXTEND cell. If there is more than one, then they should appear in order of client preference; the extending relay may extend to any of the listed routers.

This link specifier SHOULD NOT be used along with IPv4, IPv6, RSA ID, or Ed25519 ID link specifiers. Relays receiving such a link specifier along with a snip_index_pos link specifier SHOULD reject the entire EXTEND request.

If nth is nonzero, then the link specifier means "the n'th SNIP after the one defined by the SNIP index position." A relay MAY reject this request if nth is greater than 4. If the relay does not reject this request, then it MUST include all SNIPs between index_pos and the one that was actually used in an Extra_SNIP extension. (Otherwise, the client would not be able to verify that it had gotten the correct SNIP.)

I've avoided use of CBOR for these types, under the assumption that we'd like to use CBOR for directory stuff, but no more. We already have trunnel-like objects for this purpose.

Modified ntor handshake

We adapt the ntor handshake from tor-spec.txt for this use, with the following main changes.

  • The NODEID and KEYID fields are omitted from the input. Instead, these fields may appear in a Partial_SNIPRouterData extension.

  • The NODEID and KEYID fields appear in the reply.

  • The NODEID field is extended to 32 bytes, and now holds the relay's ed25519 identity.

So the client's message is now:

CLIENT_PK [32 bytes]

And the relay's reply is now:

NODEID [32 bytes]
KEYID [32 bytes]
SERVER_PK [32 bytes]
AUTH [32 bytes]

Otherwise, all fields are computed as described in tor-spec.

When this handshake is in use, the hash function is SHA3-256 and keys are derived using SHAKE-256, as in rend-spec-v3.txt.

Future work: We may wish to update this choice of functions between now and the implementation date, since SHA3 is a bit pricey. Perhaps one of the BLAKEs would be a better choice. If so, we should use it more generally. On the other hand, the presence of public-key operations in the handshake probably outweighs the use of SHA3.

We will have to give this version of the handshake a new handshake type.

New relay behavior on EXTEND and CREATE failure.

If an EXTEND2 cell based on a routing index fails, the relay should not close the circuit, but should instead send back a TRUNCATED cell containing the SNIP in an extension.

If a CREATE2 cell fails and a SNIP was requested, then instead of sending a DESTROY cell, the relay SHOULD respond with a CREATED2 cell containing 0 bytes of handshake data, and the SNIP in an extension. Clients MAY re-extend or close the circuit, but should not leave it dangling.

NIL handshake type

We introduce a new handshake type, "NIL". The NIL handshake always fails. A client's part of the NIL handshake is an empty bytestring; there is no server response that indicates success.

The NIL handshake can be used by the client when it wants to fetch a SNIP without creating a circuit.

Upon receiving a request to extend with the NIL handshake type, a relay SHOULD NOT actually open any connection or send any data to the target relay. Instead, it should respond with a TRUNCATED cell with the SNIP(s) that the client requested in one or more Extra_SNIP extensions.

Padding handshake cells to a uniform size

To avoid leaking information, all CREATE/CREATED/EXTEND/EXTENDED cells SHOULD be padded to the same sizes. In all cases, the amount of padding is controlled by a set of network parameters: "create-pad-len", "created-pad-len", "extend-pad-len" and "extended-pad-len". These parameters determine the minimum length that the cell body or relay cell bodies should be.

If a cell would be sent whose body is shorter than the corresponding parameter value, then the sender SHOULD pad the body by adding zero-valued bytes to the cell body. As usual, receivers MUST ignore extra bytes at the end of cells.
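The zero-padding rule amounts to the following sketch, where `min_len` is the value of whichever network parameter applies to the cell type (for example "extend-pad-len"); the function name is hypothetical.

```python
def pad_cell_body(body: bytes, min_len: int) -> bytes:
    """Append zero bytes until the body is at least min_len bytes long."""
    # Bodies already at or above the minimum length are returned unchanged.
    return body.ljust(min_len, b"\x00")
```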

ALTERNATIVE: We could specify a more complicated padding mechanism, e.g., 32 bytes of zeros then random bytes.

Client behavior with walking onions

Today's Tor clients have several behaviors that become somewhat more difficult to implement with Walking Onions. Some of these behaviors are essential and achievable. Others can be achieved with some effort, and still others appear to be incompatible with the Walking Onions design.

Bootstrapping and guard selection

When a client first starts running, it has no guards on the Tor network, and therefore can't start building circuits immediately. To produce a list of possible guards, the client begins connecting to one or more fallback directories on their ORPorts, and building circuits through them. These are 3-hop circuits. The first hop of each circuit is the fallback directory; the second and third hops are chosen from the Middle routing index. At the third hop, the client then sends an informational request for a guard's SNIP. This informational request is an EXTEND2 cell with handshake type NIL, using a random spot on the Guard routing index.

Each such request yields a single SNIP that the client will store. These SNIPs, in the order in which they were requested, will form the client's list of "Sampled" guards as described in guard-spec.txt.

Clients SHOULD ensure that their sampled guards are not linkable to one another. In particular, clients SHOULD NOT add more than one guard retrieved from the same third hop on the same circuit. (If they did, that third hop would realize that some client using guard A was also using guard B.)

Future work: Is this threat real? It seems to me that knowing one or two guards at a time in this way is not a big deal, though knowing the whole set would sure be bad. However, we shouldn't optimize this kind of defense away until we know that it's actually needless.

If a client's network connection or choice of entry nodes is heavily restricted, the client MAY request more than one guard at a time, but if it does so, it SHOULD discard all but one guard retrieved from each set.

After choosing guards, clients will continue to use them even after their SNIPs expire. On the first circuit through each guard after opening a channel, clients should ask that guard for a fresh SNIP for itself, to ensure that the guard is still listed in the consensus, and to keep the client's information up-to-date.

Using bridges

As now, clients are configured to use a bridge by using an address and a public key for the bridge. Bridges behave like guards, except that they are not listed in any directory or ENDIVE, and so cannot prove membership when the client connects to them.

On the first circuit through each channel to a bridge, the client asks that bridge for a SNIP listing itself in the Self routing index. The bridge responds with a self-created unsigned SNIP:

; This is only valid when received on an authenticated connection
; to a bridge.
UnsignedSNIP = [
    ; There is no signature on this SNIP.
    auth : nil,

    ; Next comes the location of the SNIP within the ENDIVE. This
    ; SNIPLocation will list only the Self index.
    index : bstr .cbor SNIPLocation,

    ; Finally comes the information about the router.
    router : bstr .cbor SNIPRouterData,
]

Security note: Clients MUST take care to keep UnsignedSNIPs separated from signed ones. These are not part of any ENDIVE, and so should not be used for any purpose other than connecting through the bridge that the client has received them from. They should be kept associated with that bridge, and not used for any other, even if they contain other link specifiers or keys. The client MAY use link specifiers from the UnsignedSNIP on future attempts to connect to the bridge.

Finding relays by exit policy

To find a relay by exit policy, clients might choose the exit routing index corresponding to the exit port they want to use. This has negative privacy implications, however, since the middle node discovers what kind of exit traffic the client wants to use. Instead, we support two other options.

First, clients may build anonymous three-hop circuits and then use those circuits to request the SNIPs that they will use for their exits. This may, however, be inefficient.

Second, clients may build anonymous three-hop circuits and then use a BEGIN cell to try to open the connection when they want. When they do so, they may include a new flag in the begin cell, "DVS" to enable Delegated Verifiable Selection. As described in the Walking Onions paper, DVS allows a relay that doesn't support the requested port to instead send the client the SNIP of a relay that does. (In the paper, the relay uses a digest of previous messages to decide which routing index to use. Instead, we have the client send an index field.)

This requires changes to the BEGIN and END cell formats. After the "flags" field in BEGIN cells, we add an extension mechanism:

struct begin_cell {
    nulterm addr_port;
    u32 flags;
    u8 n_extensions;
    struct extension exts[n_extensions];
}

We allow the snip_index_pos link specifier type to appear as a begin extension.

END cells will need to have a new format that supports including policy and SNIP information. This format is enabled whenever a new EXTENDED_END_CELL flag appears in the begin cell.

struct end_cell {
    u8 tag IN [ 0xff ]; // indicate that this isn't an old-style end cell.
    u8 reason;
    u8 n_extensions;
    struct extension exts[n_extensions];
}

We define three END cell extensions. Two types are for addresses; they indicate what address was resolved and the associated TTL:

struct end_ext_ipv4 {
    u32 addr;
    u32 ttl;
}

struct end_ext_ipv6 {
    u8 addr[16];
    u32 ttl;
}

One new END cell extension is used for delegated verifiable selection:

struct end_ext_alt_snip {
    u16 index_id;
    u8 snip[..];
}

This design may require END cells to become wider; see 319-wide-everything.md for an example proposal to supersede proposal 249 and allow more wide cell types.

Universal path restrictions

There are some restrictions on Tor paths that all clients should obey, unless they are configured not to do so. Some of these restrictions (like "start paths with a Guard node" or "don't use an Exit as a middle when Exit bandwidth is scarce") are captured by the index system. Some other restrictions are not. Here we describe how to implement those.

The general approach taken here is "build and discard". Since most possible paths will not violate these universal restrictions, we accept that a fraction of the paths built will not be usable. Clients tear them down a short time after they are built.

Clients SHOULD discard a circuit if, after it has been built, they find that it contains the same relay twice, or it contains more than one relay from the same family or from the same subnet.
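A "build and discard" check along these lines might look like the sketch below. Everything about the data model is an assumption made for illustration: the `Hop` record, representing families as sets of opaque labels, and treating "same subnet" as sharing an IPv4 /16. None of these shapes is taken from SNIPRouterData.

```python
import ipaddress
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Hop:
    relay_id: str                      # hypothetical identity string
    ipv4: str                          # dotted-quad address
    families: frozenset = frozenset()  # hypothetical family labels

def same_subnet(a: str, b: str, prefix_len: int = 16) -> bool:
    """True if both IPv4 addresses fall in the same /prefix_len network."""
    net = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(b) in net

def should_discard(path) -> bool:
    """Check every pair of hops for a repeated relay, shared family, or shared subnet."""
    for a, b in combinations(path, 2):
        if a.relay_id == b.relay_id:
            return True
        if a.families & b.families:
            return True
        if same_subnet(a.ipv4, b.ipv4):
            return True
    return False
```

A client would run a check like this once the circuit is built and the SNIPs for all hops are known, and tear the circuit down if it returns true.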

Clients MAY remember the SNIPs they have received, and use those SNIPs to avoid index ranges that they would automatically reject. Clients SHOULD NOT store any SNIP for longer than it is maximally recent.

NOTE: We should continue to monitor the fraction of paths that are rejected in this way. If it grows too high, we either need to amend the path selection rules, or change authorities to e.g. forbid more than a certain fraction of relay weight in the same family or subnet.

FUTURE WORK: It might be a good idea, if these restrictions truly are 'universal', for relays to have a way to say "You wouldn't want that SNIP; I am giving you the next one in sequence" and send back both SNIPs. This would need some signaling in the EXTEND/EXTENDED cells.

Client-configured path restrictions

Sometimes users configure their clients with path restrictions beyond those that are in ordinary use. For example, a user might want to enter only from US relays, but never exit from US. Or they might be configured with a short list of vanguards to use in their second position.

Handling "light" restrictions

If a restriction only excludes a small number of relays, then clients can continue to use the "build and discard" methodology described above.

Handling some "heavy" restrictions

Some restrictions exclude most relays but are still reasonably easy to implement, because they permit only a small, explicitly listed set of relays. For example, if the user has an EntryNodes restriction that contains only a small group of relays by exact IP address, the client can connect or extend to one of those addresses specifically.

If we decide IP ranges are important, that IP addresses without ports are important, or that key specifications are important, we can add routing indices that list relays by IP, by RSAId, or by Ed25519 Id. Clients could then use those indices to remotely retrieve SNIPs, and then use those SNIPs to connect to their selected relays.

Future work: we need to decide how many of the above functions to actually support.

Recognizing too-heavy restrictions

The above approaches do not handle all possible sets of restrictions. In particular, they do a bad job for restrictions that ban a large fraction of paths in a way that is not encodable in the routing index system.

If there is substantial demand for such a path restriction, implementors and authority operators should figure out how to implement it in the index system if possible.

Implementations SHOULD track what fraction of otherwise valid circuits they are closing because of the user's configuration. If this fraction is above a certain threshold, they SHOULD issue a warning; if it is above some other threshold, they SHOULD refuse to build circuits entirely.
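One way to picture this bookkeeping is sketched below. The class name, the two threshold values, and the minimum sample count are placeholders chosen for illustration only; the proposal leaves the real thresholds as future work.

```python
class RestrictionMonitor:
    """Tracks how many otherwise valid circuits were closed due to user config."""

    def __init__(self, warn_fraction=0.25, refuse_fraction=0.75, min_samples=20):
        # All three defaults are illustrative assumptions, not specified values.
        self.warn_fraction = warn_fraction
        self.refuse_fraction = refuse_fraction
        self.min_samples = min_samples
        self.total = 0
        self.closed = 0

    def record(self, closed_by_config: bool) -> None:
        """Record one otherwise valid circuit and whether config forced it closed."""
        self.total += 1
        if closed_by_config:
            self.closed += 1

    def fraction(self) -> float:
        return self.closed / self.total if self.total else 0.0

    def status(self) -> str:
        """Return "ok", "warn", or "refuse" for the current closed fraction."""
        if self.total < self.min_samples:
            return "ok"  # too few circuits observed to judge
        f = self.fraction()
        if f > self.refuse_fraction:
            return "refuse"
        if f > self.warn_fraction:
            return "warn"
        return "ok"
```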

Future work: determine which fraction appears in practice, and use that to set the appropriate thresholds above.

Using and providing onion services with Walking Onions

Both live versions of the onion service design rely on a ring of hidden service directories for use in uploading and downloading hidden service descriptors. With Walking Onions, we can use routing indices based on Ed25519 or RSA identity keys to retrieve this data.

(The RSA identity ring is unchanging, whereas the Ed25519 ring changes daily based on the shared random value: for this reason, we have to compute two simultaneous indices for Ed25519 rings: one for the earlier date that is potentially valid, and one for the later date that is potentially valid. We call these hsv3-early and hsv3-late.)

Beyond the use of these indices, however, there are other steps that clients and services need to take in order to maintain their privacy.

Finding HSDirs

When a client or service wants to contact an HSDir, it SHOULD do so anonymously, by building a three-hop anonymous circuit, and then extending it a further hop using the snip_span link specifier to upload to any of the first 3 replicas on the ring. Clients SHOULD choose an 'nth' at random; services SHOULD upload to each replica.

Using a full 80-bit or 256-bit index position in the link specifier would leak the chosen service to somebody other than the directory. Instead, the client or service SHOULD truncate the identifier to a number of bytes equal to the network parameter hsv2-index-bytes or hsv3-index-bytes respectively. (See Appendix D.)
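A minimal sketch of the truncation step follows. The function name is hypothetical, and the sketch assumes that truncation keeps the leading bytes of the position, which the text does not state explicitly.

```python
def truncated_index_position(full_position: bytes, index_bytes: int) -> bytes:
    """Keep only the first index_bytes bytes of an HSDir ring position.

    index_bytes stands for the hsv2-index-bytes or hsv3-index-bytes
    network parameter. Keeping the leading bytes is an assumption.
    """
    if index_bytes < 1:
        raise ValueError("index_bytes must be at least 1")
    return full_position[:index_bytes]
```

With the default of 4 bytes, the relay that performs the extension learns only a 32-bit prefix of the position, which many services' positions would share.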

SNIPs for introduction points

When services select an introduction point, they should include the SNIP for the introduction point in their hidden service directory entry, along with the introduction-point fields. The format for this entry is:

"snip" NL snip NL [at most once per introduction point]

Clients SHOULD begin treating the link specifier and onion-key fields of each introduction point as optional when the "snip" field is present, and when the hsv3-tolerate-no-legacy network parameter is set to 1. If either of these fields is present, and the SNIP is too, then these fields MUST match those listed in the SNIP. Clients SHOULD reject descriptors with mismatched fields, and alert the user that the service may be trying a partitioning attack. The "legacy-key" and "legacy-key-cert" fields, if present, should be checked similarly.

Using the SNIPs in these ways allows services to prove that their introduction points have actually been listed in the consensus recently. It also lets clients use introduction point features that the relay might not understand.

Services should include these fields based on a set of network parameters: hsv3-intro-snip and hsv3-intro-legacy-fields. (See Appendix D.)

Clients should use these fields only when Walking Onions support is enabled; see section 09.

SNIPs for rendezvous points

When a client chooses a rendezvous point for a v3 onion service, it similarly has the opportunity to include the SNIP of its rendezvous point in the encrypted part of its INTRODUCE cell. (This may cause INTRODUCE cells to become fragmented; see proposal about fragmenting relay cells.)

Using the SNIPs in these ways allows clients to prove that their rendezvous points have actually been listed in the consensus recently. It also lets services use rendezvous point features that the relay might not understand.

To include the SNIP, the client places it in an extension in the INTRODUCE cell. The onion key can now be omitted[*], along with the link specifiers.

[*] Technically, we use a zero-length onion key, with a new type "implicit in SNIP".

To know whether the service can recognize this kind of cell, the client should look for the presence of a "snips-allowed 1" field in the encrypted part of the hidden service descriptor.

In order to prevent partitioning, services SHOULD NOT advertise "snips-allowed 1" unless the network parameter "hsv3-rend-service-snip" is set to 1. Clients SHOULD NOT use this field unless "hsv3-rend-client-snip" is set to 1.

TAP keys and where to find them

If v2 hidden services are still supported when Walking Onions arrives on the network, we have two choices: We could migrate them to use ntor keys instead of TAP, or we could provide a way for TAP keys to be advertised with Walking Onions.

The first option would appear to be far simpler. See proposal draft 320-tap-out-again.md.

The latter option would require us to put RSA-1024 keys in SNIPs, or put a digest of them in SNIPs and give some way to retrieve them independently.

(Of course, it's possible that we will have v2 onion services deprecated by the time Walking Onions is implemented. If so, that will simplify matters a great deal too.)

Tracking Relay honesty

Our design introduces an opportunity for dishonest relay behavior: since multiple ENDIVEs are valid at the same time, a malicious relay might choose any of several possible SNIPs in response to a client's routing index value.

Here we discuss several ways to mitigate this kind of attack.

Defense: index stability

First, the voting process should be designed such that relays do not needlessly move around the routing index. For example, it would not be appropriate to add an index type whose value is computed by first putting the relays into a pseudorandom order. Instead, index voting should be deterministic and tend to give similar outputs for similar inputs.

This proposal tries to achieve this property in its index voting algorithms. We should measure the degree to which we succeed over time, by looking at all of the ENDIVEs that are valid at any particular time, and sampling several points for each index to see how many distinct relays are listed at each point, across all valid ENDIVEs.

We do not need this stability property for routing indices whose purpose is nonrandomized relay selection, such as those indices used for onion service directories.

Defense: enforced monotonicity

Once an honest relay has received an ENDIVE, it has no reason to keep any previous ENDIVEs or serve SNIPs from them. Because of this, relay implementations SHOULD ensure that no data is served from a new ENDIVE until all the data from an old ENDIVE is thoroughly discarded.

Clients and relays can use this monotonicity property to keep relays honest: once a relay has served a SNIP with some timestamp T, that relay should never serve any other SNIP with a timestamp earlier than T. Clients SHOULD track the most recent SNIP timestamp that they have received from each of their guards, and MAY track the most recent SNIP timestamps that they have received from other relays as well.
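The client-side tracking described above could look roughly like the sketch below. The class and method names are hypothetical, and timestamps are treated as plain integers.

```python
class SnipTimestampTracker:
    """Remembers the newest SNIP timestamp seen from each relay."""

    def __init__(self):
        self._latest = {}  # relay identity -> newest timestamp seen

    def observe(self, relay_id: str, timestamp: int) -> bool:
        """Return False if relay_id served a SNIP older than one it served before."""
        latest = self._latest.get(relay_id)
        if latest is not None and timestamp < latest:
            return False  # monotonicity violated
        self._latest[relay_id] = timestamp
        return True
```

What a client does after a violation (for example, closing the circuit or distrusting the relay) is outside the scope of this sketch.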

Defense: limiting ENDIVE variance within the network.

The primary motivation for allowing long (de facto) lifespans on today's consensus documents is to keep the network from grinding to a halt if the authorities fail to reach consensus for a few hours. But in practice, if there is a consensus, then relays should have it within an hour or two, so they should not be falling a full day out of date.

Therefore we can potentially add a client behavior that, within N seconds after the client has seen any SNIP with timestamp T, the client should not accept any SNIP with timestamp earlier than T - Delta.

Values for N and Delta are controlled by network parameters (enforce-endive-dl-delay-after and allow-endive-dl-delay respectively in Appendix D). N should be about as long as we expect it to take for a single ENDIVE to propagate to all the relays on the network; Delta should be about as long as we would like relays to go between updating ENDIVEs under ideal circumstances.
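The sketch below shows one possible reading of this rule, following the parameter description for enforce-endive-dl-delay-after: the client waits N seconds after first seeing timestamp T, and from then on rejects SNIPs older than T - Delta. The function name and argument names are hypothetical, and all times are in seconds.

```python
def snip_acceptable(snip_ts: int, newest_ts: int, newest_seen_at: int,
                    now: int, n: int, delta: int) -> bool:
    """Decide whether a SNIP with timestamp snip_ts is recent enough.

    newest_ts      -- newest SNIP timestamp T the client has seen
    newest_seen_at -- local time at which T was first seen
    n              -- enforce-endive-dl-delay-after
    delta          -- allow-endive-dl-delay
    """
    if now - newest_seen_at < n:
        # The newer ENDIVE may not have reached every relay yet.
        return True
    return snip_ts >= newest_ts - delta
```

With the listed defaults (N = 3600, Delta = 10800), a client that saw timestamp T more than an hour ago would stop accepting SNIPs more than three hours older than T.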

Migrating to Walking Onions

This proposal is a major change in the Tor network that will eventually require the participation of all relays [*], and will make clients who support it distinguishable from clients that don't.

[*] Technically, the last relay in the path doesn't need support.

To keep the compatibility issues under control, here is the order in which it should be deployed on the network.

  1. First, authorities should add support for voting on ENDIVEs.

  2. Relays may immediately begin trying to download and reconstruct ENDIVEs. (Relay versions are public, so they leak nothing by doing this.)

  3. Once a sufficient number of authorities are voting on ENDIVEs and unlikely to downgrade, relays should begin serving parameter documents and responding to walking-onion EXTEND and CREATE cells. (Again, relay versions are public, so this doesn't leak.)

  4. In parallel with relay support, Tor should also add client support for Walking Onions. This should be disabled by default, however, since it will only be usable with the subset of relays that support Walking Onions, and since it would make clients distinguishable.

  5. Once enough of the relays (possibly, all) support Walking Onions, the client support can be turned on. They will not be able to use old relays that do not support Walking Onions.

  6. Eventually, relays that do not support Walking Onions should not be listed in the consensus.

Client support for Walking Onions should be enabled or disabled, at first, with a configuration option. Once it seems stable, the option should have an "auto" setting that looks at a network parameter. This parameter should NOT be a simple "on" or "off", however: it should be the minimum client version whose support for Walking Onions is believed to be correct.
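A sketch of how the "auto" setting might be evaluated is given below. The setting values ("0", "1", "auto"), the function names, and the representation of the network parameter as a dotted version string are all assumptions made for illustration; status suffixes such as "-alpha" are simply ignored here.

```python
def parse_version(version: str):
    """Turn "0.4.5.6-alpha" into (0, 4, 5, 6), dropping any status suffix."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def walking_onions_enabled(setting: str, my_version: str, min_version_param) -> bool:
    """Decide whether the client should use Walking Onions.

    setting           -- hypothetical config option: "0", "1", or "auto"
    min_version_param -- minimum client version believed correct, or None
                         if the network has not published one
    """
    if setting == "1":
        return True
    if setting == "0":
        return False
    if min_version_param is None:
        return False
    return parse_version(my_version) >= parse_version(min_version_param)
```

Using a minimum version rather than an on/off flag means that clients with a known-buggy implementation stay disabled even after the network turns the feature on.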

Future work: migrating away from sedentary onions

Once all clients are using Walking Onions, we can take a pass through the Tor specifications and source code to remove no-longer-needed code.

Clients should be the first to lose support for old directories, since nobody but the clients depends on the clients having them. Only after obsolete clients represent a very small fraction of the network should relay or authority support be disabled.

Some fields in router descriptors become obsolete with Walking Onions, and possibly router descriptors themselves should be replaced with cbor objects of some kind. This can only happen, however, after no descriptor users remain.

Appendices

Appendix A: Glossary

I'm going to put a glossary here so I can try to use these terms consistently.

SNIP -- A "Separable Network Index Proof". Each SNIP contains the information necessary to use a single Tor relay, and associates the relay with one or more index ranges. SNIPs are authenticated by the directory authorities.

ENDIVE -- An "Efficient Network Directory with Individually Verifiable Entries". An ENDIVE is a collection of SNIPs downloaded by relays, authenticated by the directory authorities.

Routing index -- A routing index is a map from binary strings to relays, with some given property. Each relay that is in the routing index is associated with a single index range.

Index range -- A range of positions within a routing index. Each range contains many positions.

Index position -- A single value within a routing index. Every position in a routing index corresponds to a single relay.

ParamDoc -- A network parameters document, describing settings for the whole network. Clients download this infrequently.

Index group -- A collection of routing indices that are encoded in the same SNIPs.

Appendix B: More cddl definitions

    ; These definitions are used throughout the rest of the
    ; proposal

    ; Ed25519 keys are 32 bytes, and that isn't changing.
    Ed25519PublicKey = bstr .size 32

    ; Curve25519 keys are 32 bytes, and that isn't changing.
    Curve25519PublicKey = bstr .size 32

    ; 20 bytes or fewer: legacy RSA SHA1 identity fingerprint.
    RSAIdentityFingerprint = bstr

    ; A 4-byte integer -- or to be cddl-pedantic, one that is
    ; between 0 and UINT32_MAX.
    uint32 = uint .size 4

    ; Enumeration to define integer equivalents for all the digest algorithms
    ; that Tor uses anywhere. Note that some of these are not used in
    ; this spec, but are included so that we can use this production
    ; whenever we need to refer to a hash function.
    DigestAlgorithm = &(
        NoDigest: 0,
        SHA1    : 1, ; deprecated.
        SHA2-256: 2,
        SHA2-512: 3,
        SHA3-256: 4,
        SHA3-512: 5,
        Kangaroo12-256: 6,
        Kangaroo12-512: 7,
    )

    ; A digest is represented as a binary blob.
    Digest = bstr

    ; Enumeration for different signing algorithms.
    SigningAlgorithm = &(
        RSA-OAEP-SHA1  : 1, ; deprecated.
        RSA-OAEP-SHA256: 2, ; deprecated.
        Ed25519        : 3,
        Ed448          : 4,
        BLS            : 5, ; Not yet standardized.
    )

    PKAlgorithm = &(
        SigningAlgorithm,
        Curve25519: 100,
        Curve448  : 101
    )

    KeyUsage = &(
        ; A master unchangeable identity key for this authority. May be
        ; any signing key type. Distinct from the authority's identity as a
        ; relay.
        AuthorityIdentity: 0x10,
        ; A medium-term key used for signing SNIPs, votes, and ENDIVEs.
        SNIPSigning: 0x11,
        ; These are designed not to collide with the "list of certificate
        ; types" or "list of key types" in cert-spec.txt
    )

    CertType = &(
        VotingCert: 0x12,
        ; These are designed not to collide with the "list of certificate
        ; types" in cert-spec.txt.
    )

    LinkSpecifier = bstr

Appendix C: new numbers to assign.

Relay commands:

  • We need a new relay command for "FRAGMENT" per proposal 319.

CREATE handshake types:

  • We need a type for the NIL handshake.

  • We need a handshake type for the new ntor handshake variant.

Link specifiers:

  • We need a link specifier for extend-by-index.

  • We need a link specifier for dirport URL.

Certificate Types and Key Types:

  • We need to add the new entries from CertType and KeyUsage to cert-spec.txt, and possibly merge the two lists.

Begin cells:

  • We need a flag for Delegated Verifiable Selection.

  • We need an extension type for extra data, and a value for indices.

End cells:

  • We need an extension type for extra data, a value for indices, a value for IPv4 addresses, and a value for IPv6 addresses.

Extensions for decrypted INTRODUCE2 cells:

  • A SNIP for the rendezvous point.

Onion key types for decrypted INTRODUCE2 cells:

  • An "onion key" to indicate that the onion key for the rendezvous point is implicit in the SNIP.

New URLs:

  • A URL for fetching ENDIVEs.

  • A URL for fetching client / relay parameter documents.

  • A URL for fetching detached SNIP signatures.

Protocol versions:

(In theory we could omit many new protovers here, since being listed in an ENDIVE implies support for the new protocol variants. We're going to use new protovers anyway, however, since doing so keeps our numbering consistent.)

We need new versions for these subprotocols:

  • Relay to denote support for new handshake elements.

  • DirCache to denote support for ENDIVEs, paramdocs, binary diffs, etc.

  • Cons to denote support for ENDIVEs.

Appendix D: New network parameters.

We introduce these network parameters:

From section 5:

  • create-pad-len -- Clients SHOULD pad their CREATE cell bodies to this size.

  • created-pad-len -- Relays SHOULD pad their CREATED cell bodies to this size.

  • extend-pad-len -- Clients SHOULD pad their EXTEND cell bodies to this size.

  • extended-pad-len -- Relays SHOULD pad their EXTENDED cell bodies to this size.

From section 7:

  • hsv2-index-bytes -- how many bytes to use when sending an hsv2 index position to look up a hidden service directory. Min: 1, Max: 40. Default: 4.

  • hsv3-index-bytes -- how many bytes to use when sending an hsv3 index position to look up a hidden service directory. Min: 1, Max: 128. Default: 4.

  • hsv3-intro-legacy-fields -- include legacy fields in service descriptors. Min: 0. Max: 1. Default: 1.

  • hsv3-intro-snip -- include intro point SNIPs in service descriptors. Min: 0. Max: 1. Default: 0.

  • hsv3-rend-service-snip -- Should services advertise and accept rendezvous point SNIPs in INTRODUCE2 cells? Min: 0. Max: 1. Default: 0.

  • hsv3-rend-client-snip -- Should clients place rendezvous point SNIPs in INTRODUCE2 cells when the service supports it? Min: 0. Max: 1. Default: 0.

  • hsv3-tolerate-no-legacy -- Should clients tolerate v3 service descriptors that don't have legacy fields? Min: 0. Max: 1. Default: 0.

From section 8:

  • enforce-endive-dl-delay-after -- How many seconds after receiving a SNIP with some timestamp T does a client wait before rejecting older SNIPs? Equivalent to "N" in "limiting ENDIVE variance within the network." Min: 0. Max: INT32_MAX. Default: 3600 (1 hour).

  • allow-endive-dl-delay -- Once a client has received a SNIP with timestamp T, it will not accept any SNIP with timestamp earlier than "allow-endive-dl-delay" seconds before T. Equivalent to "Delta" in "limiting ENDIVE variance within the network." Min: 0. Max: 2592000 (30 days). Default: 10800 (3 hours).

Appendix E: Semantic sorting for CBOR values.

Some voting operations assume a partial ordering on CBOR values. We define such an ordering as follows:

  • bstr and tstr items are sorted lexicographically, as if they were compared with a version of strcmp() that accepts internal NULs.
  • uint and int items are sorted by integer values.
  • arrays are sorted lexicographically by elements.
  • Tagged items are sorted as if they were not tagged.
  • Maps do not have any sorting order.
  • False precedes true.
  • Otherwise, the ordering between two items is not defined.

More specifically:

    Algorithm: compare two cbor items A and B.

    Returns LT, EQ, GT, or NIL.

    While A is tagged, remove the tag from A.
    While B is tagged, remove the tag from B.

    If A is any integer type, and B is any integer type:
        return A cmp B

    If the type of A is not the same as the type of B:
        return NIL.

    If A and B are both booleans:
        return int(A) cmp int(B), where int(false)=0 and int(true)=1.

    If A and B are both tstr or both bstr:
        while len(A)>0 and len(B)>0:
            if A[0] != B[0]:
                return A[0] cmp B[0]
            Discard A[0] and B[0]
        If len(A) == len(B) == 0:
            return EQ.
        else if len(A) == 0:
            return LT. (B is longer)
        else:
            return GT. (A is longer)

    If A and B are both arrays:
        while len(A)>0 and len(B)>0:
            Run this algorithm recursively on A[0] and B[0].
            If the result is not EQ:
                Return that result.
            Discard A[0] and B[0]
        If len(A) == len(B) == 0:
            return EQ.
        else if len(A) == 0:
            return LT. (B is longer)
        else:
            return GT. (A is longer)

    Otherwise, A and B are a type for which we do not define an ordering,
    so return NIL.
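For readers who prefer running code, here is a sketch of the same comparison over already-decoded values. It assumes a hypothetical decoding in which CBOR integers map to Python `int`, booleans to `bool`, bstr to `bytes`, tstr to `str`, arrays to `list`, and tagged items to the `Tagged` wrapper defined here; NIL is represented as `None`. Text strings are compared by their UTF-8 bytes.

```python
from dataclasses import dataclass
from typing import Any, Optional

LT, EQ, GT = -1, 0, 1


@dataclass
class Tagged:
    """Hypothetical wrapper for a tagged CBOR item."""
    tag: int
    value: Any


def _cmp(a, b) -> int:
    return (a > b) - (a < b)


def cbor_compare(a: Any, b: Any) -> Optional[int]:
    """Return LT, EQ, GT, or None (NIL) following the algorithm above."""
    while isinstance(a, Tagged):
        a = a.value
    while isinstance(b, Tagged):
        b = b.value
    # bool is a subclass of int in Python, so handle booleans first.
    a_bool, b_bool = isinstance(a, bool), isinstance(b, bool)
    if a_bool and b_bool:
        return _cmp(int(a), int(b))
    if a_bool or b_bool:
        return None
    if isinstance(a, int) and isinstance(b, int):
        return _cmp(a, b)
    if isinstance(a, str) and isinstance(b, str):
        return _cmp(a.encode("utf-8"), b.encode("utf-8"))
    if isinstance(a, bytes) and isinstance(b, bytes):
        return _cmp(a, b)  # bytewise; a strict prefix sorts first
    if isinstance(a, list) and isinstance(b, list):
        for x, y in zip(a, b):
            result = cbor_compare(x, y)
            if result != EQ:
                return result  # includes None when elements are unordered
        return _cmp(len(a), len(b))
    return None  # maps, mixed types, and anything else are unordered
```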

Appendix F: Example voting rules

Here we give a set of voting rules for the fields described in our initial VoteDocuments.

    { meta: {
         voting-delay: { op: "Mode", tie_low:false,
                         type:["tuple","uint","uint"] },
         voting-interval: { op: "Median", type:"uint" },
         snip-lifespan: {op: "Mode", type:["tuple","uint","uint","uint"] },
         c-param-lifetime: {op: "Mode", type:["tuple","uint","uint","uint"] },
         s-param-lifetime: {op: "Mode", type:["tuple","uint","uint","uint"] },
         cur-shared-rand: {op: "Mode", min_count: "qfield",
                           type:["tuple","uint","bstr"]},
         prev-shared-rand: {op: "Mode", min_count: "qfield",
                            type:["tuple","uint","bstr"]},
      client-params: {
         recommend-versions: {op:"SetJoin", min_count:"qfield",type:"tstr"},
         require-protos: {op:"BitThreshold", min_count:"sqauth"},
         recommend-protos: {op:"BitThreshold", min_count:"qauth"},
         params: {op:"MapJoin",key_min_count:"qauth",
                  keytype:"tstr",
                  item_op:{op:"Median",min_vote:"qauth",type:"uint"},
                  },
         certs: {op:"SetJoin",min_count:1, type: 'bstr'},
      },
      ; Use same value for server-params.
      relay: {
         meta: {
            desc: {op:"Mode", min_count:"qauth",tie_low:false,
                   type:["uint","bstr"] },
            flags: {op:"MapJoin", key_type:"tstr",
                    item_op:{op:"Mode",type:"bool"}},
            bw: {op:"Median", type:"uint" },
            mbw :{op:"Median", type:"uint" },
            rsa-id: {op:"Mode", type:"bstr"},
         },
         snip: {
            ; ed25519 key is handled as any other value.
            0: { op:"DerivedFrom", fields:[["RM","desc"]],
                 rule:{op:"Mode",type:"bstr"} },
            ; ntor onion key.
            1: { op:"DerivedFrom", fields:[["RM","desc"]],
                 rule:{op:"Mode",type:"bstr"} },
            ; link specifiers.
            2: { op: "CborDerived",
                 item-op: { op:"DerivedFrom", fields:[["RM","desc"]],
                            rule:{op:"Mode",type:"bstr" } } },
            ; software description.
            3: { op:"DerivedFrom", fields:[["RM","desc"]],
                 rule:{op:"Mode",type:["tuple", "tstr", "tstr"] } },
            ; protovers.
            4: { op: "CborDerived",
                 item-op: { op:"DerivedFrom", fields:[["RM","desc"]],
                            rule:{op:"Mode",type:"bstr" } } },
            ; families.
            5: { op:"SetJoin", min_count:"qfield", type:"bstr" },
            ; countrycode
            6: { op:"Mode", type:"tstr" } ,
            ; 7: exitpolicy.
            7: { op: "CborDerived",
                 item-op: { op: "DerivedFrom",
                            fields:[["RM","desc"],["CP","port-classes"]],
                            rule:{op:"Mode",type:"bstr" } } },
         },
         legacy: {
            "sha1-desc": { op:"DerivedFrom", fields:[["RM","desc"]],
                           rule:{op:"Mode",type:"bstr"} },
            "mds": { op:"DerivedFrom", fields:[["RM","desc"]],
                     rule: { op:"ThresholdOp", min_count: "qauth",
                             multi_low:false,
                             type:["tuple", "uint", "uint",
                                   "bstr", "bstr" ] }},
         }
      }
      indices: {
         ; See appendix G.
      }
    }

Appendix G: A list of routing indices

Middle -- general purpose index for use when picking middle hops in circuits. Bandwidth-weighted for use as middle relays. May exclude guards and/or exits depending on overall balance of resources on the network.

Formula:

    type: 'weighted',
    source: {
        type:'bw', require_flags: ['Valid'], 'bwfield' : ["RM", "mbw"]
    },
    weight: {
        [ "!Exit", "!Guard" ] => "Wmm",
        [ "Exit", "Guard" ] => "Wbm",
        [ "Exit", "!Guard" ] => "Wem",
        [ "!Exit", "Guard" ] => "Wgm",
    }

Guard -- index for choosing guard relays. This index is not used directly when extending, but instead only for picking guard relays that the client will later connect to directly. Bandwidth-weighted for use as guard relays. May exclude guard+exit relays depending on resource balance.

Formula:

    type: 'weighted',
    source: {
        type:'bw',
        require_flags: ['Valid', "Guard"],
        bwfield : ["RM", "mbw"]
    },
    weight: {
        [ "Exit", ] => "Weg",
    }

HSDirV2 -- index for finding spots on the hsv2 directory ring.

Formula: type: 'rsa-id',

HSDirV3-early -- index for finding spots on the hsv3 directory ring for the earlier of the two "active" days. (The active days are today, and whichever other day is closest to the time at which the ENDIVE becomes active.)

Formula:

    type: 'ed-id'
    alg: SHA3-256,
    prefix: b"node-idx",
    suffix: (depends on shared-random and time period)

HSDirV3-late -- index for finding spots on the hsv3 directory ring for the later of the two "active" days.

Formula: as HSDirV3-early, but with a different suffix.

Self -- A virtual index that never appears in an ENDIVE. SNIPs with this index are unsigned, and occupy the entire index range. This index is used with bridges to represent each bridge's uniqueness.

Formula: none.

Exit0..ExitNNN -- Exits that can connect to all ports within a given PortClass 0 through NNN.

Formula:

    type: 'weighted',
    source: {
        type:'bw',
        ; The second flag here depends on which portclass this is.
        require_flags: [ 'Valid', "P@3" ],
        bwfield : ["RM", "mbw"]
    },
    weight: {
        [ "Guard", ] => "Wge",
    }

Appendix H: Choosing good clusters of exit policies

With Walking Onions, we cannot easily support all the port combinations [*] that we currently allow in the "policy summaries" that we support in microdescriptors.

[*] How many "short policy summaries" are there? The number would be 2^65535, except for the fact that today's Tor doesn't permit exit policies to get maximally long.

In the Walking Onions whitepaper (https://crysp.uwaterloo.ca/software/walkingonions/) we noted in section 6 that we can group exit policies by class, and get down to around 220 "classes" of port, such that each class was either completely supported or completely unsupported by every relay. But that number is still impractically large: if we need ~11 bytes to represent a SNIP index range, we would need roughly an extra 2420 bytes (220 × 11) per SNIP, which seems like more overhead than we really want.

We can reduce the number of port classes further, at the cost of some fidelity. For example, suppose that the set {https,http} is supported by relays {A,B,C,D}, and that the set {ssh,irc} is supported by relays {B,C,D,E}. We could combine them into a new port class {https,http,ssh,irc}, supported by relays {B,C,D} -- at the expense of no longer being able to say that relay A supported {https,http}, or that relay E supported {ssh,irc}.
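The merge in this example amounts to a union of the port sets and an intersection of the supporting relay sets. A small sketch, using a hypothetical `(ports, relays)` pair to represent a port class:

```python
def merge_port_classes(class_a, class_b):
    """Combine two port classes, each given as (ports, supporting_relays).

    Returns (merged_ports, relays_supporting_all, relays_lost), where
    relays_lost are relays that supported only one of the two inputs.
    """
    ports_a, relays_a = class_a
    ports_b, relays_b = class_b
    merged_ports = ports_a | ports_b
    still_supported = relays_a & relays_b
    lost = relays_a ^ relays_b
    return merged_ports, still_supported, lost


web = ({"https", "http"}, {"A", "B", "C", "D"})
shell_chat = ({"ssh", "irc"}, {"B", "C", "D", "E"})
ports, relays, lost = merge_port_classes(web, shell_chat)
# ports  == {"https", "http", "ssh", "irc"}
# relays == {"B", "C", "D"}
# lost   == {"A", "E"}
```

The `lost` set makes the fidelity cost explicit: those relays no longer count as exits for the merged class unless their operators widen their policies.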

This loss would not necessarily be permanent: the operator of relay A might be willing to add support for {ssh,irc}, and the operator of relay E might be willing to add support for {https,http}, in order to become useful as an exit again.

(We might also choose to add a configuration option for relays to take their exit policies directly from the port classes in the consensus.)

How might we select our port classes? Three general categories of approach seem possible: top-down, bottom-up, and hybrid.

In a top-down approach, we would collaborate with authority and exit operators to identify a priori reasonable classes of ports, such as "Web", "Chat", "Miscellaneous internet", "SMTP", and "Everything else". Authorities would then base exit indices on these classes.

In a bottom-up approach, we would find an algorithm to run on the current exit policies in order to find the "best" set of port classes to capture the policies as they stand with minimal loss. (Quantifying this loss is nontrivial: do we weight by bandwidth? Do we weight every port equally, or do we call some more "important" than others?)

See exit-analysis for an example tool that runs a greedy algorithm to find a "good" partition using an unweighted, all-ports-are-equal cost function. See the files "greedy-set-cov-{4,8,16}" for examples of port classes produced by this algorithm.

In a hybrid approach, we'd use top-down and bottom-up techniques together. For example, we could start with an automated bottom-up approach, and then evaluate it based on feedback from operators. Or we could start with a handcrafted top-down approach, and then use bottom-up cost metrics to look for ways to split or combine those port classes in order to represent existing policies with better fidelity.

Appendix I: Non-clique topologies with Walking Onions

For future work, we can expand the Walking Onions design to accommodate network topologies where relays are divided into groups, and not every group connects to every other. To do so requires additional design work, but here I'll provide what I hope will be a workable sketch.

First, each SNIP needs to contain an ID saying which relay group it belongs to, and an ID saying which relay group(s) may serve it.

When downloading an ENDIVE, each relay should report its own identity, and receive an ENDIVE for that identity's group. It should contain both the identities of relays in the group, and the SNIPs that should be served for different indices by members of that group.

The easy part would be to add an optional group identity field to SNIPs, defaulting to 0, indicating that the relay belongs to that group, and an optional served-by field to each SNIP, indicating groups that may serve the SNIP. You'd only accept SNIPs if they were served by a relay in a group that was allowed to serve them.

Would guards work? Sure: we'd need to have guard SNIPs served by middle relays.

For hsdirs, we'd need to have either multiple shards of the hsdir ring (which seems like a bad idea?) or have all middle nodes able to reach the hsdir ring.

Things would get tricky with making onion services work: if you need to use an introduction point or a rendezvous point in group X, then you need to get there from a relay that allows connections to group X. Does this imply indices meaning "Can reach group X" or "two-degrees of group X"?

The question becomes: "how much work on alternative topologies does it make sense to deploy in advance?" It seems like there are unknowns affecting both client and relay operations here, which suggests that advance deployment for either case is premature: we can't necessarily make either clients or relays "do the right thing" in advance given what we now know of the right thing.

Appendix Z: acknowledgments

Thanks to Peter Palfrader for his original design in proposal 141, and to the designers of PIR-Tor, both of which inspired aspects of this Walking Onions design.

Thanks to Chelsea Komlo, Sajin Sasy, and Ian Goldberg for feedback on an earlier version of this design.

Thanks to David Goulet, Teor, and George Kadianakis for commentary on earlier versions of proposal 300.

Thanks to Chelsea Komlo and Ian Goldberg for their help fleshing out so many ideas related to Walking Onions in their work on the design paper.

Thanks to Teor for improvements to diff format, ideas about grouping exit ports, and numerous ideas about getting topology and distribution right.

These specifications were supported by a grant from the Zcash Foundation.

Filename: 324-rtt-congestion-control.txt
Title: RTT-based Congestion Control for Tor
Author: Mike Perry
Created: 02 July 2020
Status: Finished

0. Motivation [MOTIVATION]

This proposal specifies how to incrementally deploy RTT-based congestion control and improved queue management in Tor. It is written to allow us to first deploy the system only at Exit relays, and then incrementally improve the system by upgrading intermediate relays.

Lack of congestion control is the reason why Tor has an inherent speed limit of about 500KB/sec for downloads and uploads via Exits, and even slower for onion services. Because our stream SENDME windows are fixed at 500 cells per stream, and only ~500 bytes can be sent in one cell, the max speed of a single Tor stream is 500*500/circuit_latency. This works out to about 500KB/sec max sustained throughput for a single download, even if circuit latency is as low as 500ms.

Because onion service paths are more than twice the length of Exit paths (and thus have more than twice the circuit latency), onion service throughput will always be less than half of Exit throughput, until we deploy proper congestion control with dynamic windows.

Proper congestion control will remove this speed limit for both Exits and onion services, as well as reduce memory requirements for fast Tor relays, by reducing queue lengths.

The high-level plan is to use Round Trip Time (RTT) as a primary congestion signal, and compare the performance of two different congestion window update algorithms that both use RTT as a congestion signal.

The combination of RTT-based congestion signaling, a congestion window update algorithm, and Circuit-EWMA will get us most if not all of the benefits we seek, and only requires clients and Exits to upgrade to use it.
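The speed limit described in this section follows directly from the fixed window: at most one window of data can be in flight per circuit round trip. The sketch below reproduces that arithmetic with the round figures used in the text; the function name and the choice of bytes per second as the unit are illustrative assumptions.

```python
def max_stream_throughput(window_cells: int = 500,
                          payload_bytes: int = 500,
                          circuit_latency_s: float = 0.5) -> float:
    """Upper bound on one stream's throughput in bytes/sec with a fixed window."""
    return window_cells * payload_bytes / circuit_latency_s


exit_limit = max_stream_throughput()                        # 500000.0 (about 500KB/sec)
onion_limit = max_stream_throughput(circuit_latency_s=1.0)  # 250000.0 at twice the latency
```

Doubling the latency halves the bound, which is why longer onion service paths see less than half of Exit throughput under fixed windows.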
Once this is deployed, circuit bandwidth will no longer be capped at ~500KB/sec by the fixed window sizes of SENDME; queue latency will fall significantly; memory requirements at relays should plummet; and transient bottlenecks in the network should dissipate.

Extended background information on the choices made in this proposal can be found at:
  https://lists.torproject.org/pipermail/tor-dev/2020-June/014343.html
  https://lists.torproject.org/pipermail/tor-dev/2020-January/014140.html

An exhaustive list of citations for further reading is in Section [CITATIONS]. A glossary of common congestion control acronyms and terminology is in Section [GLOSSARY].

1. Overview [OVERVIEW]

This proposal has seven main sections, after this overview. These sections are referenced [IN_ALL_CAPS] rather than by number, for easy searching.

Section [CONGESTION_SIGNALS] specifies how to use Tor's SENDME flow control cells to measure circuit RTT, for use as an implicit congestion signal. It also mentions an explicit congestion signal, which can be used as a future optimization once all relays upgrade.

Section [CONTROL_ALGORITHMS] specifies two candidate congestion window update mechanisms, which will be compared for performance in simulation in Shadow, as well as evaluated on the live network, and tuned via consensus parameters listed in [CONSENSUS_PARAMETERS].

Section [FLOW_CONTROL] specifies how to handle back-pressure when one of the endpoints stops reading data, but data is still arriving. In particular, it specifies what to do with streams that are not being read by an application, but still have data arriving on them.

Section [SYSTEM_INTERACTIONS] describes how congestion control will interact with onion services, circuit padding, and conflux-style traffic splitting.

Section [EVALUATION] describes how we will evaluate and tune our options for control algorithms and their parameters.
Section [PROTOCOL_SPEC] describes the specific cell formats and descriptor changes needed by this proposal.

Section [SECURITY_ANALYSIS] provides information about the DoS and traffic analysis properties of congestion control.

2. Congestion Signals [CONGESTION_SIGNALS]

In order to detect congestion at relays on a circuit, Tor will use circuit Round Trip Time (RTT) measurement. This signal will be used in slightly different ways in our various [CONTROL_ALGORITHMS], which will be compared against each other for optimum performance in Shadow and on the live network.

To facilitate this, we will also change SENDME accounting logic slightly. These changes only require clients, exits, and dirauths to update.

As a future optimization, it is possible to send a direct ECN congestion signal. This signal *will* require all relays on a circuit to upgrade to support it, but it will reduce congestion by making the first congestion event on a circuit much faster to detect.

To reduce confusion and complexity of this proposal, this signal has been moved to the ideas repository, under xxx-backward-ecn.txt [BACKWARD_ECN].

2.1. RTT measurement

Recall that Tor clients, exits, and onion services send RELAY_COMMAND_SENDME relay cells every CIRCWINDOW_INCREMENT (100) cells of received RELAY_COMMAND_DATA.

This allows those endpoints to measure the current circuit RTT, by measuring the amount of time between sending a RELAY_COMMAND_DATA cell that would trigger a SENDME from the other endpoint, and the arrival of that SENDME cell. This means that RTT is measured every 'cc_sendme_inc' data cells.

Circuits will record the minimum and maximum RTT measurement, as well as a smoothed value representing the current RTT. The smoothing for the current RTT is performed as specified in [N_EWMA_SMOOTHING].

Algorithms that make use of this RTT measurement for congestion window update are specified in [CONTROL_ALGORITHMS].

2.1.1. Clock Jump Heuristics [CLOCK_HEURISTICS]

The timestamps for RTT (and BDP) are measured using Tor's monotime_absolute_usec() API. This API is designed to provide a monotonic clock that only moves forward. However, depending on the underlying system clock, this may result in the same timestamp value being returned for long periods of time, which would result in RTT 0-values. Alternatively, the clock may jump forward, resulting in abnormally large RTT values.

To guard against this, we perform a series of heuristic checks on the time delta measured by the RTT estimator, and if these heuristics detect a stall or a jump, we do not use that value to update RTT or BDP, nor do we update any congestion control algorithm information that round.

If the time delta is 0, that is always treated as a clock stall, the RTT is not used, congestion control is not updated, and this fact is cached globally.

If the circuit does not yet have an EWMA RTT or it is still in Slow Start, then no further checks are performed, and the RTT is used.

If the circuit has stored an EWMA RTT and has exited Slow Start, then on every SENDME ack, the new candidate RTT is compared to the stored EWMA RTT. If the new RTT is 5000 times larger than the EWMA RTT, then the circuit does not record that estimate, and does not update BDP or the congestion control algorithms for that SENDME ack. If the new RTT is 5000 times smaller than the EWMA RTT, then the circuit uses the globally cached value from above (i.e., it assumes the clock is stalled *only* if there was previously *also* a 0-delta RTT).

If both ratio checks pass, the globally cached clock stall state is set to false (no stall), and the RTT value is used.

2.1.2. N_EWMA Smoothing [N_EWMA_SMOOTHING]

RTT estimation requires smoothing, to reduce the effects of packet jitter.

This smoothing is performed using N_EWMA[27], which is an Exponential Moving Average with alpha = 2/(N+1):

   N_EWMA = RTT*2/(N+1) + N_EWMA_prev*(N-1)/(N+1)
          = (RTT*2 + N_EWMA_prev*(N-1))/(N+1).
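As an illustration only, the clock heuristics and the rearranged N_EWMA form can be sketched together as a small RTT estimator. This is a minimal sketch and not the C tor implementation: the class and method names are hypothetical, integer floor division is assumed for the rounding behavior, and "5000 times larger/smaller" is read as a strict comparison.

```python
RATIO_MAX = 5000  # discrepancy ratio taken from the text above


def n_ewma(value, prev, n):
    # Rearranged single-division form; floor division is an assumption.
    return (2 * value + (n - 1) * prev) // (n + 1)


class RttEstimator:
    """Hypothetical per-circuit RTT state (not a C tor structure)."""

    clock_stalled = False  # stands in for the globally cached stall flag

    def __init__(self):
        self.ewma_rtt = 0  # microseconds; 0 means no estimate yet

    def delta_is_unusable(self, delta_usec, in_slow_start):
        if delta_usec == 0:
            # A zero delta is always a stall, and is remembered globally.
            RttEstimator.clock_stalled = True
            return True
        if self.ewma_rtt == 0 or in_slow_start:
            return False  # no further checks are performed
        if delta_usec > self.ewma_rtt * RATIO_MAX:
            return True  # clock jumped forward
        if delta_usec * RATIO_MAX < self.ewma_rtt:
            # Suspiciously small: a stall only if one was seen before.
            return RttEstimator.clock_stalled
        RttEstimator.clock_stalled = False
        return False

    def update(self, delta_usec, n, in_slow_start=False):
        """Feed one SENDME time delta; return True if it was used."""
        if self.delta_is_unusable(delta_usec, in_slow_start):
            return False
        if self.ewma_rtt == 0:
            self.ewma_rtt = delta_usec
        else:
            self.ewma_rtt = n_ewma(delta_usec, self.ewma_rtt, n)
        return True
```

A caller would skip all BDP and congestion window updates for any SENDME ack where update() returns False.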
Note that the second, rearranged form of the N_EWMA equation MUST be used in order to ensure that rounding errors are handled in the same manner as other implementations.

Flow control rate limiting uses this function.

During Slow Start, N is set to `cc_ewma_ss`, for RTT estimation.

After Slow Start, N is the number of SENDME acks between congestion window updates, scaled by the percentage given in consensus parameter 'cc_ewma_cwnd_pct', and then capped at a max of 'cc_ewma_max', but always at least 2:

   N = MAX(MIN(CWND_UPDATE_RATE(cc)*cc_ewma_cwnd_pct/100, cc_ewma_max), 2);

CWND_UPDATE_RATE is normally just round(CWND/cc_sendme_inc), but after slow start, it is round(CWND/(cc_cwnd_inc_rate*cc_sendme_inc)).

2.2. SENDME behavior changes

We will make four major changes to SENDME behavior to aid in computing and using RTT as a congestion signal.

First, we will need to establish a ProtoVer of "FlowCtrl=2" to signal support by Exits for the new SENDME format and congestion control algorithm mechanisms. We will need a similar announcement in the onion service descriptors of services that support congestion control.

Second, we will turn CIRCWINDOW_INCREMENT into a consensus parameter cc_sendme_inc, instead of using a hardcoded value of 100 cells. It is likely that more frequent SENDME cells will provide quicker reaction to congestion, since the RTT will be measured more often. If experimentation in Shadow shows that more frequent SENDMEs reduce congestion and improve performance but add significant overhead, we can reduce SENDME overhead by allowing SENDME cells to carry stream data, as well, using Proposal 325. The method for negotiating a common value of cc_sendme_inc on a circuit is covered in [ONION_NEGOTIATION] and [EXIT_NEGOTIATION].

Third, authenticated SENDMEs can remain as-is in terms of protocol behavior, but will require some implementation updates to account for variable window sizes and variable SENDME pacing.
In particular, the sendme_last_digests list for auth sendmes needs updated checks for larger windows and CIRCWINDOW_INCREMENT changes. Other functions to examine include:
  - circuit_sendme_cell_is_next()
  - sendme_record_cell_digest_on_circ()
  - sendme_record_received_cell_digest()
  - sendme_record_sending_cell_digest()
  - send_randomness_after_n_cells

Fourth, stream level SENDMEs will be eliminated. Details on handling streams and backpressure are covered in [FLOW_CONTROL].

3. Congestion Window Update Algorithms [CONTROL_ALGORITHMS]

In general, the goal of congestion control is to ensure full and fair utilization of the capacity of a network path -- in the case of Tor, the spare capacity of the circuit. This is accomplished by setting the congestion window to target the Bandwidth-Delay Product[28] (BDP) of the circuit in one way or another, so that the total data outstanding is roughly equal to the actual transit capacity of the circuit.

There are several ways to update a congestion window to target the BDP. Some use direct BDP estimation, whereas others use backoff properties to achieve this. We specify three BDP estimation algorithms in the [BDP_ESTIMATION] sub-section, and three congestion window update algorithms in [TOR_WESTWOOD], [TOR_VEGAS], and [TOR_NOLA].

Note that the congestion window update algorithms differ slightly from the background tor-dev mails[1,2], due to corrections and improvements. Hence they have been given different names than in those two mails. The third algorithm, [TOR_NOLA], simply uses the latest BDP estimate directly as its congestion window.

These algorithms were evaluated by running Shadow simulations, to help determine parameter ranges, and with experimentation on the live network. After this testing, we have converged on using [TOR_VEGAS], and RTT-based BDP estimation using the congestion window. We leave the algorithms in place for historical reference.
All of these algorithms have rules to update 'cwnd' - the current congestion window, which starts out at a value controlled by consensus parameter 'cc_cwnd_init'. The algorithms also keep track of 'inflight', which is a count of the number of cells currently not yet acked by a SENDME. The algorithm MUST ensure that cells cease being sent if 'cwnd - inflight <= 0'. Note that this value CAN become negative in the case where the cwnd is reduced while packets are inflight.

While these algorithms are in use, updates and checks of the current 'package_window' field are disabled. Where a 'package_window' value is still needed, for example by cell packaging schedulers, 'cwnd - inflight' is used (with checks to return 0 in the event of negative values).

The 'deliver_window' field is still used to decide when to send a SENDME. In C tor, the deliver window is initially set at 1000, but it never gets below 900, because authenticated sendmes (Proposal 289) require that we must send only one SENDME at a time, and send it immediately after 100 cells are received.

Implementation of different algorithms should be very simple - each algorithm should have a different update function depending on the selected algorithm, as specified by consensus parameter 'cc_alg'.

For C Tor's current flow control, these functions are defined in sendme.c, and are called by relay.c:
  - sendme_note_circuit_data_packaged()
  - sendme_circuit_data_received()
  - sendme_circuit_consider_sending()
  - sendme_process_circuit_level()

Despite the complexity of the following algorithms in their TCP implementations, their Tor equivalents are extremely simple, each being just a handful of lines of C. This simplicity is possible because Tor does not have to deal with out-of-order delivery, packet drops, duplicate packets, and other network issues at the circuit layer, due to the fact that Tor circuits already have reliability and in-order delivery at that layer.
We are also removing the aspects of TCP that cause the congestion algorithm to reset into slow start after being idle for too long, or after too many congestion signals. These are deliberate choices that simplify the algorithms and also should provide better performance for Tor workloads.

In all cases, variables in these sections are either consensus parameters specified in [CONSENSUS_PARAMETERS], or scoped to the circuit. Consensus parameters for congestion control are all prefixed by cc_. Everything else is circuit-scoped.

3.1. Estimating Bandwidth-Delay Product [BDP_ESTIMATION]

At a high level, there are three main ways to estimate the Bandwidth-Delay Product: by using the current congestion window and RTT, by using the inflight cells and RTT, and by measuring SENDME arrival rate. After extensive Shadow simulation and live testing, we have arrived at using the congestion window RTT based estimator, but we will describe all three for background.

All three estimators are updated every SENDME ack arrival.

The SENDME arrival rate is the most direct way to estimate BDP, but it requires averaging over multiple SENDME acks to do so. Unfortunately, this approach suffers from what is called "ACK compression", where returning SENDMEs build up in queues, causing over-estimation of the BDP.

The congestion window and inflight estimates rely on the congestion algorithm more or less correctly tracking an approximation of the BDP, and then use current and minimum RTT to compensate for overshoot. These estimators tend to under-estimate BDP, especially when the congestion window is below the BDP. This under-estimation is corrected for by the increase of the congestion window in congestion control algorithm rules.

3.1.1. SENDME arrival BDP estimation

It is possible to directly measure BDP via the amount of time between SENDME acks. In this period of time, we know that the endpoint successfully received 'cc_sendme_inc' cells.
This means that the bandwidth of the circuit is then calculated as:

   BWE = cc_sendme_inc/sendme_ack_timestamp_delta

The bandwidth delay product of the circuit is calculated by multiplying this bandwidth estimate by the *minimum* RTT time of the circuit (to avoid counting queue time):

   BDP = BWE * RTT_min

In order to minimize the effects of ack compression (aka SENDME responses becoming close to one another due to queue delay on the return), we maintain a history of a full congestion window's worth of previous SENDME timestamps.

With this, the calculation becomes:

   BWE = (num_sendmes-1) * cc_sendme_inc / num_sendme_timestamp_delta
   BDP = BWE * RTT_min

Note that because we are counting the number of cells *between* the first and last sendme of the congestion window, we must subtract 1 from the number of sendmes actually received. Over the time period between the first and last sendme of the congestion window, the other endpoint successfully read (num_sendmes-1) * cc_sendme_inc cells.

Furthermore, because the timestamps are microseconds, to avoid integer truncation, we compute the BDP using multiplication first:

   BDP = (num_sendmes-1) * cc_sendme_inc * RTT_min / num_sendme_timestamp_delta

After all of this, the BDP is smoothed using [N_EWMA_SMOOTHING].

This smoothing means that the SENDME BDP estimation will only work after two (2) SENDME acks have been received. Additionally, it tends not to be stable unless at least 'cc_bwe_min' SENDMEs are used, where 'cc_bwe_min' is a consensus parameter. Finally, if [CLOCK_HEURISTICS] have detected a clock jump or stall, this estimator is not updated.

If all edge connections no longer have data available to send on a circuit and all circuit queues have drained without blocking the local orconn, we stop updating this BDP estimate and discard old timestamps. However, we retain the actual estimator value.
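The SENDME arrival estimator described above can be sketched as follows. This is a hedged illustration rather than the C tor code: the class name, method names, and the default parameter values are assumptions chosen for the example (100 matches the historical CIRCWINDOW_INCREMENT), and floor division stands in for the integer arithmetic.

```python
from collections import deque


def n_ewma(value, prev, n):
    # Rearranged N_EWMA form; floor division is an assumption.
    return (2 * value + (n - 1) * prev) // (n + 1)


class SendmeBdpEstimator:
    """Hypothetical SENDME-arrival BDP estimator (illustrative names)."""

    def __init__(self, cc_sendme_inc=100, cc_bwe_min=2):
        self.cc_sendme_inc = cc_sendme_inc
        self.cc_bwe_min = cc_bwe_min  # illustrative value, not a default
        self.timestamps = deque()     # SENDME ack arrival times, in usec
        self.bdp = 0                  # smoothed BDP in cells; 0 = none yet

    def on_sendme_ack(self, now_usec, rtt_min_usec, cwnd, n):
        # Keep at most one congestion window's worth of timestamps.
        max_history = max(cwnd // self.cc_sendme_inc, 2)
        self.timestamps.append(now_usec)
        while len(self.timestamps) > max_history:
            self.timestamps.popleft()

        num_sendmes = len(self.timestamps)
        if num_sendmes < max(self.cc_bwe_min, 2):
            return self.bdp  # not enough acks for a stable estimate
        delta = self.timestamps[-1] - self.timestamps[0]
        if delta <= 0:
            return self.bdp  # clock stall: leave the estimate untouched

        # Multiply first, divide last, to avoid integer truncation.
        raw = (num_sendmes - 1) * self.cc_sendme_inc * rtt_min_usec // delta
        self.bdp = raw if self.bdp == 0 else n_ewma(raw, self.bdp, n)
        return self.bdp

    def on_circuit_idle(self):
        # Discard old timestamps but retain the estimator value.
        self.timestamps.clear()
```

In this sketch, an ack that arrives unusually soon after its predecessor inflates the raw estimate, which is the ack compression effect the text describes.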
Unfortunately, even after all of this, SENDME BDP estimation proved unreliable in Shadow simulation, due to ack compression.

3.1.2. Congestion Window BDP Estimation

This is the BDP estimator we use.

Assuming that the current congestion window is at or above the current BDP, the bandwidth estimate is the current congestion window size divided by the RTT estimate:

   BWE = cwnd / RTT_current_ewma

The BDP estimate is computed by multiplying the bandwidth estimate by the *minimum* circuit latency:

   BDP = BWE * RTT_min

Simplifying:

   BDP = cwnd * RTT_min / RTT_current_ewma

The RTT_min for this calculation comes from the minimum RTT_current_ewma seen in the lifetime of this circuit. If the congestion window falls to `cc_cwnd_min` after slow start, implementations MAY choose to reset RTT_min for use in this calculation to either the RTT_current_ewma, or a percentile-weighted average between RTT_min and RTT_current_ewma, specified by `cc_rtt_reset_pct`. This helps with escaping starvation conditions.

The net effect of this estimation is to correct for any overshoot of the cwnd over the actual BDP. It will obviously underestimate BDP if cwnd is below BDP.

3.1.3. Inflight BDP Estimation

Similar to the congestion window based estimation, the inflight estimation uses the current inflight packet count to derive BDP. It also subtracts local circuit queue use from the inflight packet count. This means it will be strictly less than or equal to the cwnd version:

   BDP = (inflight - circ.chan_cells.n) * RTT_min / RTT_current_ewma

If all edge connections no longer have data available to send on a circuit and all circuit queues have drained without blocking the local orconn, we stop updating this BDP estimate, because there are not sufficient inflight cells to properly estimate BDP.

While the research literature for Vegas says that inflight estimators performed better due to the ability to avoid overshoot, we had better performance results using other methods to control overshoot.
Hence, we do not use the inflight BDP estimator.

3.1.4. Piecewise BDP estimation

A piecewise BDP estimation could be used to help respond more quickly in the event the local OR connection is blocked, which indicates congestion somewhere along the path from the client to the guard (or between Exit and Middle). In this case, it takes the minimum of the inflight and SENDME estimators.

When the local OR connection is not blocked, this estimator uses the max of the SENDME and cwnd estimator values.

When the SENDME estimator has not gathered enough data, or has cleared its estimates based on lack of edge connection use, this estimator uses the Congestion Window BDP estimator value.

3.2. Tor Westwood: TCP Westwood using RTT signaling [TOR_WESTWOOD]
   http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-westwood
   http://nrlweb.cs.ucla.edu/nrlweb/publication/download/99/2001-mobicom-0.pdf
   http://cpham.perso.univ-pau.fr/TCP/ccr_v31.pdf
   https://c3lab.poliba.it/images/d/d7/Westwood_linux.pdf

Recall that TCP Westwood is basically TCP Reno, but it uses BDP estimates for "Fast recovery" after a congestion signal arrives.

We will also be using the RTT congestion signal as per BOOTLEG_RTT_TOR here, from the Options mail[1] and Defenestrator paper[3].

This system must keep track of RTT measurements per circuit: RTT_min, RTT_max, and RTT_current. These are measured using the time delta between every 'cc_sendme_inc' relay cells and the SENDME response. The first RTT_min can be measured arbitrarily, so long as it is larger than what we would get from SENDME.

RTT_current is N-EWMA smoothed over 'cc_ewma_cwnd_pct' percent of congestion windows worth of SENDME acks, up to a max of 'cc_ewma_max' acks, as described in [N_EWMA_SMOOTHING].

Recall that BOOTLEG_RTT_TOR emits a congestion signal when the current RTT reaches or exceeds a threshold that lies a fraction ('cc_westwood_rtt_thresh') of the way between RTT_min and RTT_max. The circuit is treated as uncongested while this check holds:

   RTT_current < (1-cc_westwood_rtt_thresh)*RTT_min + cc_westwood_rtt_thresh*RTT_max

Additionally, if the local OR connection is blocked at the time of SENDME ack arrival, this is treated as an immediate congestion signal.

(We can also optionally use the ECN signal described in ideas/xxx-backward-ecn.txt, to exit Slow Start.)

Congestion signals from RTT, blocked OR connections, or ECN are processed only once per congestion window. This is achieved through the next_cc_event counter, which is initialized to a cwnd worth of SENDME acks, and is decremented each ack. Congestion signals are only evaluated when it reaches 0.

Note that because the congestion signal threshold of TOR_WESTWOOD is a function of RTT_max, and excessive queuing can cause an increase in RTT_max, TOR_WESTWOOD may have runaway conditions. Additionally, if stream activity is constant, but of a lower bandwidth than the circuit, this will not drive the RTT upwards, and this can result in a congestion window that continues to increase in the absence of any other concurrent activity.

Here is the complete congestion window algorithm for Tor Westwood.
This will run each time we get a SENDME (aka sendme_process_circuit_level()):

   # Update acked cells
   inflight -= cc_sendme_inc

   if next_cc_event:
     next_cc_event--

   # Do not update anything if we detected a clock stall or jump,
   # as per [CLOCK_HEURISTICS]
   if clock_stalled_or_jumped:
     return

   if next_cc_event == 0:
     # BOOTLEG_RTT_TOR threshold; can also be BACKWARD_ECN check.
     # A blocked orconn counts as a congestion signal, so the window
     # only grows when the RTT is under the threshold *and* the orconn
     # is not blocked:
     if (RTT_current <
         (100-cc_westwood_rtt_thresh)*RTT_min/100 +
         cc_westwood_rtt_thresh*RTT_max/100) and not orconn_blocked:
       if in_slow_start:
         cwnd += cwnd * cc_cwnd_inc_pct_ss             # Exponential growth
       else:
         cwnd = cwnd + cc_cwnd_inc                     # Linear growth
     else:
       if cc_westwood_backoff_min:
         cwnd = min(cwnd * cc_westwood_cwnd_m, BDP)    # Window shrink
       else:
         cwnd = max(cwnd * cc_westwood_cwnd_m, BDP)    # Window shrink
       in_slow_start = 0

       # Back off RTT_max (in case of runaway RTT_max)
       RTT_max = RTT_min + cc_westwood_rtt_m * (RTT_max - RTT_min)

     cwnd = MAX(cwnd, cc_circwindow_min)
     next_cc_event = cwnd / (cc_cwnd_inc_rate * cc_sendme_inc)

3.3. Tor Vegas: TCP Vegas with Aggressive Slow Start [TOR_VEGAS]
   http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-vegas
   http://pages.cs.wisc.edu/~akella/CS740/F08/740-Papers/BOP94.pdf
   http://www.mathcs.richmond.edu/~lbarnett/cs332/assignments/brakmo_peterson_vegas.pdf
   ftp://ftp.cs.princeton.edu/techreports/2000/628.pdf

The TCP Vegas control algorithm estimates the queue lengths at relays by subtracting the current BDP estimate from the current congestion window.

After extensive Shadow simulation and live testing, we have settled on this congestion control algorithm for use in Tor.

Assuming the BDP estimate is accurate, any amount by which the congestion window exceeds the BDP will cause data to queue.

Thus, Vegas estimates the queue use caused by congestion as:

   queue_use = cwnd - BDP

Original TCP Vegas used a cwnd BDP estimator only. We added the ability to switch this BDP estimator in the implementation, and experimented with various options.
We also parameterized this queue_use calculation as a tunable weighted average between the cwnd-based BDP estimate and the piecewise estimate (consensus parameter 'cc_vegas_bdp_mix'). After much testing of various ways to compute BDP, we were still unable to do much better than the original cwnd estimator. So while this capability to change the BDP estimator remains in the C implementation, we do not expect it to be used.

However, it was useful to use a local OR connection block at the time of SENDME ack arrival, as an immediate congestion signal. Note that in C-Tor, this orconn_block state is not derived from any socket info, but instead is a heuristic that declares an orconn as blocked if any circuit cell queue exceeds the 'cellq_high' consensus parameter.

(As an additional optimization, we could also use the ECN signal described in ideas/xxx-backward-ecn.txt, but this is not implemented. It is likely only of any benefit during Slow Start, and even that benefit is likely small.)

During Slow Start, we use RFC3742 Limited Slow Start[32], which checks the congestion signals from RTT, blocked OR connections, or ECN every single SENDME ack. It also provides a `cc_sscap_*` parameter for each path length, which reduces the congestion window increment rate after it is crossed, as per the rules in RFC3742:

   rfc3742_ss_inc(cwnd):
     if cwnd <= cc_ss_cap_pathtype:
       # Below the cap, we increment as per cc_cwnd_inc_pct_ss percent:
       return round(cc_cwnd_inc_pct_ss*cc_sendme_inc/100)
     else:
       # This returns an increment equivalent to RFC3742, rounded,
       # with a minimum of inc=1.
       # From RFC3742:
       #   K = int(cwnd/(0.5 max_ssthresh));
       #   inc = int(MSS/K);
       return MAX(round((cc_sendme_inc*cc_ss_cap_pathtype)/(2*cwnd)), 1);

During both Slow Start, and Steady State, if the congestion window is not full, we never increase the congestion window. We can still decrease it, or exit slow start, in this case. This is done to avoid causing overshoot.
The original TCP Vegas addressed this problem by computing BDP and queue_use from inflight, instead of cwnd, but we found that approach to have significantly worse performance.

Because C-Tor is single-threaded, multiple SENDME acks may arrive during one processing loop, before edge connections resume reading. For this reason, we provide two heuristics to provide some slack in determining the full condition. The first is to allow a gap between inflight and cwnd, parameterized as 'cc_cwnd_full_gap' multiples of 'cc_sendme_inc':

   cwnd_is_full(cwnd, inflight):
     if inflight + 'cc_cwnd_full_gap'*'cc_sendme_inc' >= cwnd:
       return true
     else:
       return false

The second heuristic immediately resets the full state if it falls below 'cc_cwnd_full_minpct' full:

   cwnd_is_nonfull(cwnd, inflight):
     if 100*inflight < 'cc_cwnd_full_minpct'*cwnd:
       return true
     else:
       return false

This full status is cached once per cwnd if 'cc_cwnd_full_per_cwnd=1'; otherwise it is cached once per cwnd update. These two helper functions determine the number of acks in each case:

   SENDME_PER_CWND(cwnd):
     return ((cwnd + 'cc_sendme_inc'/2)/'cc_sendme_inc')

   CWND_UPDATE_RATE(cwnd, in_slow_start):
     # In Slow Start, update every SENDME
     if in_slow_start:
       return 1
     else:
       # Otherwise, update as per the 'cc_cwnd_inc_rate' (31)
       return ((cwnd + 'cc_cwnd_inc_rate'*'cc_sendme_inc'/2)
               / ('cc_cwnd_inc_rate'*'cc_sendme_inc'));

Shadow experimentation indicates that 'cc_cwnd_full_gap=2' and 'cc_cwnd_full_per_cwnd=0' minimizes queue overshoot, whereas 'cc_cwnd_full_per_cwnd=1' and 'cc_cwnd_full_gap=1' is slightly better for performance. Since there may be a difference between Shadow and live, we leave this parameterization in place.

Here is the complete pseudocode for TOR_VEGAS with RFC3742, which is run every time an endpoint receives a SENDME ack.
All variables are scoped to the circuit, unless prefixed by an underscore (local), or in single quotes (consensus parameters):

   # Decrement counters that signal either an update or cwnd event
   if next_cc_event:
     next_cc_event--
   if next_cwnd_event:
     next_cwnd_event--

   # Do not update anything if we detected a clock stall or jump,
   # as per [CLOCK_HEURISTICS]
   if clock_stalled_or_jumped:
     inflight -= 'cc_sendme_inc'
     return

   if BDP > cwnd:
     _queue_use = 0
   else:
     _queue_use = cwnd - BDP

   if cwnd_is_full(cwnd, inflight):
     cwnd_full = 1
   elif cwnd_is_nonfull(cwnd, inflight):
     cwnd_full = 0

   if in_slow_start:
     if _queue_use < 'cc_vegas_gamma' and not orconn_blocked:
       # Only increase cwnd if the cwnd is full
       if cwnd_full:
         _inc = rfc3742_ss_inc(cwnd);
         cwnd += _inc

         # If the RFC3742 increment drops below steady-state increment
         # over a full cwnd worth of acks, exit slow start.
         if _inc*SENDME_PER_CWND(cwnd) <= 'cc_cwnd_inc'*'cc_cwnd_inc_rate':
           in_slow_start = 0
     else:
       # Limit hit. Exit Slow start (even if cwnd not full)
       in_slow_start = 0
       cwnd = BDP + 'cc_vegas_gamma'

     # Provide an emergency hard-max on slow start:
     if cwnd >= 'cc_ss_max':
       cwnd = 'cc_ss_max'
       in_slow_start = 0
   elif next_cc_event == 0:
     if _queue_use > 'cc_vegas_delta':
       cwnd = BDP + 'cc_vegas_delta' - 'cc_cwnd_inc'
     elif _queue_use > 'cc_vegas_beta' or orconn_blocked:
       cwnd -= 'cc_cwnd_inc'
     elif cwnd_full and _queue_use < 'cc_vegas_alpha':
       # Only increment if queue is low, *and* the cwnd is full
       cwnd += 'cc_cwnd_inc'

     cwnd = MAX(cwnd, 'cc_circwindow_min')

   # Specify next cwnd and cc update
   if next_cc_event == 0:
     next_cc_event = CWND_UPDATE_RATE(cwnd, in_slow_start)
   if next_cwnd_event == 0:
     next_cwnd_event = SENDME_PER_CWND(cwnd)

   # Determine if we need to reset the cwnd_full state
   # (Parameterized)
   if 'cc_cwnd_full_per_cwnd' == 1:
     if next_cwnd_event == SENDME_PER_CWND(cwnd):
       cwnd_full = 0
   else:
     if next_cc_event == CWND_UPDATE_RATE(cwnd, in_slow_start):
       cwnd_full = 0

   # Update acked cells
   inflight -= 'cc_sendme_inc'

3.4. Tor NOLA: Direct BDP tracker [TOR_NOLA]

Based on the theory that congestion control should track the BDP, the simplest possible congestion control algorithm could just set the congestion window directly to its current BDP estimate, every SENDME ack.

Such an algorithm would need to overshoot the BDP slightly, especially in the presence of competing algorithms. But other than that, it can be exceedingly simple. Like Vegas, but without putting on airs. Just enough strung together.

After meditating on this for a while, it also occurred to me that no one has named a congestion control algorithm after New Orleans. We have Reno, Vegas, and scores of others. What's up with that?

Here's the pseudocode for TOR_NOLA that runs on every SENDME ack:

   # Do not update anything if we detected a clock stall or jump,
   # as per [CLOCK_HEURISTICS]
   if clock_stalled_or_jumped:
     return

   # If the orconn is blocked, do not overshoot BDP
   if orconn_blocked:
     cwnd = BDP
   else:
     cwnd = BDP + cc_nola_overshoot

   cwnd = MAX(cwnd, cc_circwindow_min)

4. Flow Control [FLOW_CONTROL]

Flow control provides what is known as "pushback" -- the property that if one endpoint stops reading data, the other endpoint stops sending data. This prevents data from accumulating at points in the network, if it is not being read fast enough by an application.

Because Tor must multiplex many streams onto one circuit, and each stream is mapped to another TCP socket, Tor's current pushback is rather complicated and under-specified. In C Tor, it is implemented in the following functions:
  - circuit_consider_stop_edge_reading()
  - connection_edge_package_raw_inbuf()
  - circuit_resume_edge_reading()

The decision on when a stream is blocked is performed in:
  - sendme_note_stream_data_packaged()
  - sendme_stream_data_received()
  - sendme_connection_edge_consider_sending()
  - sendme_process_stream_level()

Tor currently maintains separate windows for each stream on a circuit, to provide individual stream flow control.
Circuit windows are SENDME acked as soon as a relay data cell is decrypted and recognized. Stream windows are only SENDME acked if the data can be delivered to an active edge connection. This allows the circuit to continue to operate if an endpoint refuses to read data off of one of the streams on the circuit.

Because Tor streams can connect to many different applications and endpoints per circuit, it is important to preserve the property that if only one endpoint edge connection is inactive, it does not stall the whole circuit, in case one of those endpoints is malfunctioning or malicious.

However, window-based stream flow control also imposes a speed limit on individual streams. If the stream window size is below the circuit congestion window size, then it becomes the speed limit of a download, as we saw in the [MOTIVATION] section of this proposal.

So for performance, it is optimal that each stream window is the same size as the circuit's congestion window. However, large stream windows are a vector for OOM attacks, because malicious clients can force Exits to buffer a full stream window for each stream while connecting to a malicious site and uploading data that the site does not read from its socket. This attack is significantly easier to perform at the stream level than on the circuit level, because of the multiplier effects of only needing to establish a single fast circuit to perform the attack on a very large number of streams.

This catch-22 means that if we use windows for stream flow control, we either have to commit to allocating a full congestion window worth of memory for each stream, or impose a speed limit on our streams.

Hence, we will discard stream windows entirely, and instead use a simpler buffer-based design that uses XON/XOFF to signal when this buffer is too large. Additionally, the XON cell will contain advisory rate information based on the rate at which that edge connection can write data while it has data to write.
The other endpoint can rate limit sending data for that stream to the rate advertised in the XON, to avoid excessive XON/XOFF chatter and sub-optimal behavior.

This will allow us to make full use of the circuit congestion window for every stream in combination, while still avoiding buffer buildup inside the network.

4.1. Stream Flow Control Without Windows [WINDOWLESS_FLOW]

Each endpoint (client, Exit, or onion service) sends circuit-level SENDME acks for all circuit cells as soon as they are decrypted and recognized, but *before* delivery to their edge connections.

This means that if the edge connection is blocked because an application's SOCKS connection or a destination site's TCP connection is not reading, data will build up in a queue at that endpoint, specifically in the edge connection's outbuf.

Consensus parameters will govern the length of this queue that determines when XON and XOFF cells are sent, as well as when advisory XON cells that contain rate information can be sent. These parameters are separate for the queue lengths of exits, and of clients/services.

(Because clients and services will typically have localhost connections for their edges, they will need similar buffering limits. Exits may have different properties, since their edges will be remote.)

The trunnel relay cell payload definitions for XON and XOFF are:

  struct xoff_cell {
    u8 version IN [0x00];
  }

  struct xon_cell {
    u8 version IN [0x00];

    u32 kbps_ewma;
  }

Parties SHOULD treat XON or XOFF cells with unrecognized versions as a protocol violation.

In `xon_cell`, a zero value for `kbps_ewma` means that the stream's rate is unlimited. Parties should therefore not send "0" to mean "do not send data".

4.1.1. XON/XOFF behavior

If the length of an edge outbuf queue exceeds the size provided in the appropriate client or exit XOFF consensus parameter, a RELAY_COMMAND_STREAM_XOFF will be sent, which instructs the other endpoint to stop sending from that edge connection.
Once the queue is expected to empty, a RELAY_COMMAND_STREAM_XON will be sent, which allows the other end to resume reading on that edge connection. This XON also indicates the average rate of queue drain since the XOFF.

Advisory XON cells are also sent whenever the edge connection's drain rate changes by more than 'cc_xon_change_pct' percent compared to the previously sent XON cell's value.

4.1.2. Edge bandwidth rate advertisement [XON_ADVISORY]

As noted above, the XON cell provides a field to indicate the N_EWMA rate at which edge connections drain their outgoing buffers.

To compute the drain rate, we maintain a timestamp and a byte count of how many bytes were written onto the socket from the connection outbuf.

In order to measure the drain rate of a connection, we need to measure the time it took between flushing N bytes on the socket and when the socket is available for writing again. In other words, we are measuring the time it took for the kernel to send N bytes between the first flush on the socket and the next poll() write event.

For example, let's say we just wrote 100 bytes on the socket at time t = 0sec and at time t = 2sec the socket becomes writeable again. We then estimate that the rate of the socket is 100 / 2sec, and thus 50B/sec.

To make such a measurement, we start the timer by recording a timestamp as soon as data begins to accumulate in an edge connection's outbuf, currently 16KB (32 cells). We use this value for now because Tor writes up to 32 cells at once on a connection outbuf, and so we use this burst of data as an indicator that bytes are starting to accumulate.

After 'cc_xon_rate' cells worth of stream data, we use N_EWMA to average this rate into a running EWMA average, with N specified by consensus parameter 'cc_xon_ewma_cnt'. Every EWMA update, the byte count is set to 0 and a new timestamp is recorded. In this way, the EWMA counter is averaging N counts of 'cc_xon_rate' cells worth of bytes each.
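The measurement loop described above can be sketched roughly as follows. This is an illustration only, not the C Tor implementation: the class and method names are hypothetical, the 498-byte cell payload size and the choice to seed the average with the first sample are assumptions, and the N_EWMA update is assumed to take the form sample*2/(N+1) + prev*(N-1)/(N+1). Rates are kept in 1000 byte/sec units, and timestamps in microseconds.

```python
from typing import Optional

CELL_PAYLOAD_BYTES = 498  # assumption: usable bytes per relay data cell


def drain_rate_kbps(nbytes: int, elapsed_usec: int) -> float:
    """Rate in 1000 byte/sec units from a byte count and elapsed microseconds."""
    # bytes/usec * 1e6 usec/sec / 1000 == bytes * 1000 / usec
    return nbytes * 1000 / elapsed_usec


def n_ewma(prev: float, sample: float, n: int) -> float:
    """N-sample EWMA (assumed form; see the lead-in)."""
    return sample * 2 / (n + 1) + prev * (n - 1) / (n + 1)


class DrainRateTracker:
    """Hypothetical per-edge-connection drain rate estimator."""

    def __init__(self, cc_xon_rate: int, cc_xon_ewma_cnt: int,
                 cc_xon_change_pct: int) -> None:
        self.threshold_bytes = cc_xon_rate * CELL_PAYLOAD_BYTES
        self.ewma_cnt = cc_xon_ewma_cnt
        self.change_pct = cc_xon_change_pct
        self.start_usec: Optional[int] = None  # measurement timestamp
        self.bytes_flushed = 0                 # bytes written since timestamp
        self.ewma_kbps = 0.0                   # running drain rate
        self.last_xon_kbps: Optional[float] = None

    def start(self, now_usec: int) -> None:
        """Call when data begins to accumulate in the outbuf."""
        if self.start_usec is None:
            self.start_usec = now_usec

    def note_flushed(self, nbytes: int, now_usec: int) -> None:
        """Call when the socket becomes writeable again after a flush."""
        if self.start_usec is None:
            return
        self.bytes_flushed += nbytes
        if self.bytes_flushed < self.threshold_bytes:
            return  # not yet 'cc_xon_rate' cells worth of data
        elapsed = now_usec - self.start_usec
        if elapsed <= 0:
            return  # clock did not advance; keep accumulating
        sample = drain_rate_kbps(self.bytes_flushed, elapsed)
        if self.ewma_kbps == 0:
            self.ewma_kbps = sample  # assumption: seed with first sample
        else:
            self.ewma_kbps = n_ewma(self.ewma_kbps, sample, self.ewma_cnt)
        # Every EWMA update resets the byte count and records a new timestamp.
        self.bytes_flushed = 0
        self.start_usec = now_usec

    def mark_xon_sent(self) -> None:
        """Remember the rate carried by the most recently sent XON."""
        self.last_xon_kbps = self.ewma_kbps

    def advisory_xon_due(self) -> bool:
        """True if the rate moved by more than cc_xon_change_pct percent."""
        if not self.last_xon_kbps:
            return False
        change = abs(self.ewma_kbps - self.last_xon_kbps) / self.last_xon_kbps
        return change * 100 > self.change_pct
```

With the figures from the example in the text, `drain_rate_kbps(100, 2_000_000)` evaluates to 0.05, that is, 50 bytes/sec expressed in 1000 byte/sec units.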
If the buffers are non-zero, and we have sent an XON before, and the N_EWMA rate has changed more than 'cc_xon_change_pct' since the last XON, we send an updated rate. Because the EWMA rate is only updated every 'cc_xon_rate' cells worth of bytes, such advisory XON updates cannot be sent more frequently than this, and should be sent much less often in practice.

When the outbuf completely drains to 0, and has been 0 for 'cc_xon_rate' cells worth of data, we double the EWMA rate. We continue to double it while the outbuf is 0, every 'cc_xon_rate' cells. The measurement timestamp is also set back to 0.

When an XOFF is sent, the EWMA rate is reset to 0, to allow fresh calculation upon drain.

If a clock stall or jump is detected by [CLOCK_HEURISTICS], we also clear the fields, but do not record them in ewma.

NOTE: Because our timestamps are microseconds, we chose to compute and transmit both of these rates as 1000 byte/sec units, as this reduces the number of multiplications and divisions and avoids precision loss.

4.1.3. Oomkiller behavior

A malicious client can attempt to exhaust memory in an Exit's outbufs, by ignoring XOFF and advisory XONs. Implementations MAY choose to close specific streams with outbufs that grow too large, but since the exit does not know with certainty the client's congestion window, it is non-trivial to determine the exact upper limit a well-behaved client might send on a blocked stream.

Implementations MUST close the streams with the oldest chunks present in their outbufs, while under global memory pressure, until memory pressure is relieved.

4.1.4. Sidechannel mitigation

In order to mitigate DropMark attacks[28], both XOFF and advisory XON transmission must be restricted. Because DropMark attacks are most severe before data is sent, clients MUST ensure that an XOFF does not arrive before the client has sent the appropriate XOFF limit of bytes on a stream ('cc_xoff_exit' for exits, 'cc_xoff_client' for onions).
Clients also SHOULD ensure that advisory XONs do not arrive before the minimum of the XOFF limit and 'cc_xon_rate' full cells worth of bytes have been transmitted.

Clients SHOULD ensure that advisory XONs do not arrive more frequently than every 'cc_xon_rate' cells worth of sent data. Clients also SHOULD ensure that XOFFs do not arrive more frequently than every XOFF limit worth of sent data.

Implementations SHOULD close the circuit if these limits are violated on the client-side, to detect and resist dropmark attacks[28].

Additionally, because edges no longer use stream SENDME windows, we alter the half-closed connection handling to be time based instead of data quantity based. Half-closed connections are allowed to receive data up to the larger value of the congestion control max_rtt field or the circuit build timeout (for onion service circuits, we use twice the circuit build timeout). Any data or relay cells after this point are considered invalid data on the circuit.

Recall that all of the dropped cell enforcement in C-Tor is performed by accounting data provided through the control port CIRC_BW fields, currently enforced only by using the vanguards addon[29].

The C-Tor implementation exposes all of these properties to CIRC_BW for vanguards to enforce, but does not enforce them itself. So violations of any of these limits do not cause circuit closure unless that addon is used (as with the rest of the dropped cell side channel handling in C-Tor).

5. System Interactions [SYSTEM_INTERACTIONS]

Tor's circuit-level SENDME system currently has special cases in the following situations: Intropoints, HSDirs, onion services, and circuit padding. Additionally, proper congestion control will allow us to very easily implement conflux (circuit traffic splitting).

This section details those special cases and interactions of congestion control with other components of Tor.

5.1.
HSDirs Because HSDirs use the tunneled dirconn mechanism and thus also use RELAY_COMMAND_DATA, they are already subject to Tor's flow control. We may want to make sure our initial circuit window for HSDir circuits is set custom for those circuit types, so a SENDME is not required to fetch long descriptors. This will ensure HSDir descriptors can be fetched in one RTT. 5.2. Introduction Points Introduction Points are not currently subject to any flow control. Because Intropoints accept INTRODUCE1 cells from many client circuits and then relay them down a single circuit to the service as INTRODUCE2 cells, we cannot provide end-to-end congestion control all the way from client to service for these cells. We can run congestion control from the service to the Intropoint, and probably should, since this is already subject to congestion control. As an optimization, if that congestion window reaches zero (because the service is overwhelmed), then we start sending NACKS back to the clients (or begin requiring proof-of-work), rather than just let clients wait for timeout. 5.3. Rendezvous Points Rendezvous points are already subject to end-to-end SENDME control, because all relay cells are sent end-to-end via the rendezvous circuit splice in circuit_receive_relay_cell(). This means that rendezvous circuits will use end-to-end congestion control, as soon as individual onion clients and onion services upgrade to support it. There is no need for intermediate relays to upgrade at all. 5.4. Circuit Padding Recall that circuit padding is negotiated between a client and a middle relay, with one or more state machines running on circuits at the middle relay that decide when to add padding. https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md This means that the middle relay can send padding traffic towards the client that contributes to congestion, and the client may also send padding towards the middle relay, that also creates congestion. 
For low-traffic padding machines, such as the currently deployed circuit setup obfuscation, this padding is inconsequential. However, higher traffic circuit padding machines that are designed to defend against website traffic fingerprinting will need additional care to avoid inducing additional congestion, especially after the client or the exit experiences a congestion signal. The current overhead percentage rate limiting features of the circuit padding system should handle this in some cases, but in other cases, an XON/XOFF circuit padding flow control command may be required, so that clients may signal to the machine that congestion is occurring. 5.5. Conflux Conflux (aka multi-circuit traffic splitting) becomes significantly easier to implement once we have congestion control. However, much like congestion control, it will require experimentation to tune properly. Recall that Conflux uses a 256-bit UUID to bind two circuits together at the Exit or onion service. The original Conflux paper specified an equation based on RTT to choose which circuit to send cells on. https://www.cypherpunks.ca/~iang/pubs/conflux-pets.pdf However, with congestion control, we will already know which circuit has the larger congestion window, and thus has the most available cells in its current congestion window. This will also be the faster circuit. Thus, the decision of which circuit to send a cell on only requires comparing congestion windows (and choosing the circuit with more packets remaining in its window). Conflux will require sequence numbers on data cells, to ensure that the two circuits' data is properly re-assembled. The resulting out-of-order buffer can potentially be as large as an entire congestion window, if the circuits are very desynced (or one of them closes). It will be very expensive for Exits to maintain this much memory, and exposes them to OOM attacks. 
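The two Conflux mechanisms described above, choosing the circuit with the most room left in its congestion window and re-assembling cells by sequence number, can be sketched as follows. This is a minimal illustration under stated assumptions: `CircuitState`, `pick_circuit`, and `ReorderBuffer` are hypothetical names, sequence numbers are assumed to start at 0, and "packets remaining" is taken to mean cwnd minus inflight.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class CircuitState:
    """Hypothetical per-circuit congestion control state."""
    name: str
    cwnd: int      # congestion window, in cells
    inflight: int  # cells sent but not yet SENDME acked

    @property
    def remaining(self) -> int:
        """Cells that may still be sent in the current window."""
        return max(self.cwnd - self.inflight, 0)


def pick_circuit(circuits: Sequence[CircuitState]) -> Optional[CircuitState]:
    """Return the circuit with the most room in its window, if any has room."""
    candidates = [c for c in circuits if c.remaining > 0]
    if not candidates:
        return None  # every window is full; the sender must wait
    return max(candidates, key=lambda c: c.remaining)


class ReorderBuffer:
    """Deliver cells in sequence order, holding back early arrivals."""

    def __init__(self) -> None:
        self.next_seq = 0
        self.pending: Dict[int, bytes] = {}

    def receive(self, seq: int, data: bytes) -> List[bytes]:
        """Store one cell and return every cell that is now deliverable."""
        self.pending[seq] = data
        ready: List[bytes] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

The `pending` map here plays the role of the out-of-order buffer: if one circuit lags or closes, it can grow to a full congestion window of cells, which is the memory cost at issue for Exits.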
This is not as much of a concern in the client download direction, since clients will typically only have a small number of these out-of-order buffers to keep around. But for the upload direction, Exits will need to send some form of early XOFF on the faster circuit if this out-of-order buffer begins to grow too large, since simply halting the delivery of SENDMEs will still allow a full congestion window full of data to arrive. This will also require tuning and experimentation, and optimum results will vary between simulator and live network. 6. Performance Evaluation [EVALUATION] Congestion control for Tor will be easy to implement, but difficult to tune to ensure optimal behavior. 6.1. Congestion Signal Experiments Our first experiments were to conduct client-side experiments to determine how stable the RTT measurements of circuits are across the live Tor network, to determine if we need more frequent SENDMEs, and/or need to use any RTT smoothing or averaging. These experiments were performed using onion service clients and services on the live Tor network. From these experiments, we tuned the RTT and BDP estimators, and arrived at reasonable values for EWMA smoothing and the minimum number of SENDME acks required to estimate BDP. Additionally, we specified that the algorithms maintain previous congestion window estimates in the event that a circuit goes idle, rather than revert to slow start. We experimented with intermittent idle/active live onion clients to make sure that this behavior is acceptable, and it appeared to be. In Shadow experimentation, the primary thing to test will be if the OR conn on Exit relays blocks too frequently when under load, thus causing excessive congestion signals, and overuse of the Inflight BDP estimator as opposed to SENDME or CWND BDP. It may also be the case that this behavior is optimal, even if it does happen. 
Finally, we should check small variations in the EWMA smoothing and minimum BDP ack counts in Shadow experimentation, to check for high variability in these estimates, and other surprises.

6.2. Congestion Algorithm Experiments

In order to evaluate performance of congestion control algorithms, we will need to implement [TOR_WESTWOOD], [TOR_VEGAS], and [TOR_NOLA]. We will need to simulate their use in the Shadow Tor network simulator.

Simulation runs will need to evaluate performance on networks that use only one algorithm, as well as on networks that run a combination of algorithms - particularly each type of congestion control in combination with Tor's current flow control. Depending upon the number of current flow control clients, more aggressive parameters of these algorithms may need to be set, but this will result in additional queueing as well as sub-optimal behavior once all clients upgrade.

In particular, during live onion service testing, we noticed that these algorithms required particularly aggressive default values to compete against a network full of current clients. As more clients upgrade, we may be able to lower these defaults. We should get a good idea of what values we can choose at what upgrade point, from mixed Shadow simulation.

If Tor's current flow control is so aggressive that it causes problems with any amount of remaining old clients, we can experiment with kneecapping these legacy flow control Tor clients by setting a low 'circwindow' consensus parameter for them. This will allow us to set more reasonable parameter values, without waiting for all clients to upgrade.

Because custom congestion control can be deployed by any Exit or onion service that desires better service, we will need to be particularly careful about how congestion control algorithms interact with rogue implementations that more aggressively increase their window sizes.
During these adversarial-style experiments, we must verify that cheaters do not get better service, and that Tor's circuit OOM killer properly closes circuits that seriously abuse the congestion control algorithm, as per [SECURITY_ANALYSIS]. This may require tuning 'circ_max_cell_queue_size', and 'CircuitPriorityHalflifeMsec'.

Additionally, we will need to experiment with reducing the cell queue limits on OR conns before they are blocked (OR_CONN_HIGHWATER), and study the interaction of that with treating the OR conn block as a congestion signal.

Finally, we will need to monitor our Shadow experiments for evidence of ack compression, which can cause the BDP estimator to over-estimate the congestion window. We will instrument our Shadow simulations to alert if they discover excessive congestion window values, and tweak 'cc_bwe_min' and 'cc_sendme_inc' appropriately. We can set the 'cc_cwnd_max' parameter value to low values (eg: ~2000 or so) to watch for evidence of this in Shadow, and log. Similarly, we should watch for evidence that the 'cc_cwnd_min' parameter value is rarely hit in Shadow, as this indicates that the cwnd may be too small to measure BDP (for cwnd less than 'cc_sendme_inc'*'cc_bwe_min').

6.3. Flow Control Algorithm Experiments

Flow control only applies when the edges outside of Tor (SOCKS application, onion service application, or TCP destination site) are *slower* than Tor's congestion window. This typically means that the application is either suspended or reading too slowly off its SOCKS connection, or the TCP destination site itself is bandwidth throttled on its downstream.

To examine these properties, we will perform live onion service testing, where curl is used to download a large file. We will test no rate limit, and verify that XON/XOFF was never sent. We then suspend this download, verify that an XOFF is sent, and transmission stops. Upon resuming this download, the download rate should return to normal.
We will also use curl's --limit-rate option, to exercise that the flow control properly measures the drain rate and limits the buffering in the outbuf, modulo kernel socket and localhost TCP buffering.

However, flow control can also get triggered at Exits in a situation where either TCP fairness issues or Tor's mainloop does not properly allocate enough capacity to edge uploads, causing them to be rate limited below the circuit's congestion window, even though the TCP destination actually has sufficient downstream capacity.

Exits are also most vulnerable to the buffer bloat caused by such uploads, since there may be many uploads active at once.

To study this, we will run Shadow simulations. Because Shadow does *not* rate limit its tgen TCP endpoints, and only rate limits the relays themselves, if *any* XON/XOFF activity happens in Shadow *at all*, it is evidence that such fairness issues can occur.

Just in case Shadow does not have sufficient edge activity to trigger such emergent behavior, when congestion control is enabled on the live network, we will also need to instrument a live exit, to verify that XON/XOFF is not happening frequently on it. Relays may also report these statistics in their extra-info descriptors, to help with monitoring the live network conditions, but this might also require aggregation or minimization.

If excessive XOFF/XON activity happens at Exits, we will need to investigate tuning the libevent mainloop to prioritize edge writes over orconn writes. Additionally, we can lower 'cc_xoff_exit'. Linux Exits can also lower the 'net.ipv[46].tcp_wmem' sysctl value, to reduce the amount of kernel socket buffering they do on such streams, which will improve XON/XOFF responsiveness and reduce memory usage.

6.4. Performance Metrics [EVALUATION_METRICS]

The primary metrics that we will be using to measure the effectiveness of congestion control in simulation are TTFB/RTT, throughput, and utilization.
We will calibrate the Shadow simulator so that it has similar CDFs for all of these metrics as the live network, without using congestion control.

Then, we will want to inspect CDFs of these three metrics for various congestion control algorithms and parameters.

The live network testing will also spot-check performance characteristics of a couple of algorithm and parameter sets, to ensure we see similar results as Shadow.

On the live network, because congestion control will affect so many aspects of performance, from throughput to RTT, to load balancing, queue length, overload, and other failure conditions, the full set of performance metrics will be required, to check for any emergent behaviors:
  https://gitlab.torproject.org/legacy/trac/-/wikis/org/roadmaps/CoreTor/PerformanceMetrics

We will also need to monitor network health for relay queue lengths, relay overload, and other signs of network stress (and particularly the alleviation of network stress).

6.5. Consensus Parameter Tuning [CONSENSUS_PARAMETERS]

During Shadow simulation, we will determine reasonable default parameters for our consensus parameters for each algorithm. We will then re-run these tuning experiments on the live Tor network, as described in:
  https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/Sponsor61/PerformanceExperiments

6.5.1. Parameters common to all algorithms

These are sorted in order of importance to tune, most important first.

cc_alg:
  - Description: Specifies which congestion control algorithm clients should use, as an integer.
  - Range: 0 or 2 (0=fixed windows, 2=Vegas)
  - Default: 2
  - Tuning Values: [2,3]
  - Tuning Notes: These algorithms need to be tested against percentages of current fixed alg client competition, in Shadow. Their optimal parameter values, and even the optimal algorithm itself, will likely depend upon how much fixed sendme traffic is in competition. See the algorithm-specific parameters for additional tuning notes.
    As of Tor 0.4.8, Vegas is the default algorithm, and support for algorithms 1 (Westwood) and 3 (NOLA) has been removed.
  - Shadow Tuning Results: Westwood exhibited responsiveness problems, drift, and overshoot. NOLA exhibited ack compression resulting in over-estimating the BDP. Vegas, when tuned properly, kept queues low and throughput high, but even.

cc_bwe_min:
  - Description: The minimum number of SENDME acks to average over in order to estimate bandwidth (and thus BDP).
  - Range: [2, 20]
  - Default: 5
  - Tuning Values: 4-10
  - Tuning Notes: The lower this value is, the sooner we can get an estimate of the true BDP of a circuit. Low values may lead to massive over-estimation, due to ack compression. However, if this value is above the number of acks that fit in cc_cwnd_init, then we won't get a BDP estimate within the first use of the circuit. Additionally, if this value is above the number of acks that fit in cc_cwnd_min, we may not be able to estimate BDP when the congestion window is small. If we need small congestion windows, we should also lower cc_sendme_inc, which will get us more frequent acks with less data.
  - Shadow Tuning Results: Regardless of how high this was set, there were still cases where queues built up, causing BDP over-estimation. As a result, we disable use of the BDP estimator, and only use the Vegas CWND estimator.

cc_sendme_inc:
  - Description: Specifies how many cells a SENDME acks
  - Range: [1, 254]
  - Default: 31
  - Tuning Values: 25,33,50
  - Tuning Notes: Increasing this increases overhead, but also increases BDP estimation accuracy. Since we only use circuit-level sendmes, and the old code sent sendmes at both every 50 cells, and every 100, we can set this as low as 33 to have the same amount of overhead.
  - Shadow Tuning Results: This was optimal at 31-32 cells, which is also the number of cells that fit in a TLS frame. Much of the rest of Tor has processing values at 32 cells, as well.
- Consensus Update Notes: This value MUST only be changed by +/- 1, every 4 hours. If greater changes are needed, they MUST be spread out over multiple consensus updates. cc_cwnd_init: - Description: Initial congestion window for new congestion control Tor clients. This can be set much higher than TCP, since actual TCP to the guard will prevent buffer bloat issues at local routers. - Range: [31, 10000] - Default: 4*31 - Tuning Values: 150,200,250,500 - Tuning Notes: Higher initial congestion windows allow the algorithms to measure initial BDP more accurately, but will lead to queue bursts and latency. Ultimately, the ICW should be set to approximately 'cc_bwe_min'*'cc_sendme_inc', but the presence of competing fixed window clients may require higher values. - Shadow Tuning Results: Setting this too high caused excessive cell queues at relays. 4*31 ended up being a sweet spot. - Consensus Update Notes: This value must never be set below cc_sendme_inc. cc_cwnd_min: - Description: The minimum allowed cwnd. - Range: [31, 1000] - Default: 31 - Tuning Values: [100, 150, 200] - Tuning Notes: If the cwnd falls below cc_sendme_inc, connections can't send enough data to get any acks, and will stall. If it falls below cc_bwe_min*cc_sendme_inc, connections can't use SENDME BDP estimates. Likely we want to set this around cc_bwe_min*cc_sendme_inc, but no lower than cc_sendme_inc. - Shadow Tuning Results: We set this at 31 cells, the cc_sendme_inc. - Consensus Update Notes: This value must never be set below cc_sendme_inc. cc_cwnd_max: - Description: The maximum allowed cwnd. - Range: [500, INT32_MAX] - Default: INT32_MAX - Tuning Values: [5000, 10000, 20000] - Tuning Notes: If cc_bwe_min is set too low, the BDP estimator may over-estimate the congestion window in the presence of large queues, due to SENDME ack compression. Once all clients have upgraded to congestion control, queues large enough to cause ack compression should become rare. 
    This parameter exists primarily to verify this in Shadow, but we preserve it as a consensus parameter for emergency use in the live network, as well.
  - Shadow Tuning Results: We kept this at INT32_MAX.

circwindow:
  - Description: Initial congestion window for legacy Tor clients
  - Range: [100, 1000]
  - Default: 1000
  - Tuning Values: 100,200,500,1000
  - Tuning Notes: If the above congestion algorithms are not optimal until an unreasonably high percentage of clients upgrade, we can reduce the performance of ossified legacy clients by reducing their circuit windows. This will allow old clients to continue to operate without impacting optimal network behavior.

cc_cwnd_inc_rate:
  - Description: How often we update our congestion window, per cwnd worth of packets
  - Range: [1, 250]
  - Default: 1
  - Tuning Values: [1,2,5,10]
  - Tuning Notes: Congestion control theory says that the congestion window should only be updated once every cwnd worth of packets. We may find it better to update more frequently, but this is probably unlikely to help a great deal.
  - Shadow Tuning Results: Increasing this during slow start caused overshoot and excessive queues. Increasing this after slow start was suboptimal for performance. We keep this at 1.

cc_ewma_cwnd_pct:
  - Description: This specifies the N in N-EWMA smoothing of RTT and BDP estimation, as a percent of the number of SENDME acks in a congestion window. It allows us to average these RTT values over a percentage of the congestion window, capped by 'cc_ewma_max' below, and specified in [N_EWMA_SMOOTHING].
  - Range: [1, 255]
  - Default: 50,100
  - Tuning Values: [25,50,100]
  - Tuning Notes: Smoothing our estimates reduces the effects of ack compression and other ephemeral network hiccups; changing this much is unlikely to have a huge impact on performance.
  - Shadow Tuning Results: Setting this to 50 seemed to reduce cell queues, but this may also have impacted performance.
cc_ewma_max: - Description: This specifies the max N in N_EWMA smoothing of RTT and BDP estimation. It allows us to place a cap on the N of EWMA smoothing, as specified in [N_EWMA_SMOOTHING]. - Range: [2, INT32_MAX] - Default: 10 - Tuning Values: [10,20] - Shadow Tuning Results: We ended up needing this to make Vegas more responsive to congestion, to avoid overloading slow relays. Values of 10 or 20 were best. cc_ewma_ss: - Description: This specifies the N in N_EWMA smoothing of RTT during Slow Start. - Range: [2, INT32_MAX] - Default: 2 - Tuning Values: [2,4] - Shadow Tuning Results: Setting this to 2 helped reduce overshoot during Slow Start. cc_rtt_reset_pct: - Description: Describes a percentile average between RTT_min and RTT_current_ewma, for use to reset RTT_min, when the congestion window hits cwnd_min. - Range: [0, 100] - Default: 100 - Shadow Tuning Results: cwnd_min is not hit in Shadow simulations, but it can be hit on the live network while under DoS conditions, and with cheaters. cc_cwnd_inc: - Description: How much to increment the congestion window by during steady state, every cwnd. - Range: [1, 1000] - Default: 31 - Tuning Values: 25,50,100 - Tuning Notes: We are unlikely to need to tune this much, but it might be worth trying a couple values. - Shadow Tuning Results: Increasing this negatively impacted performance. Keeping it at cc_sendme_inc is best. cc_cwnd_inc_pct_ss: - Description: Percentage of the current congestion window to increment by during slow start, every cwnd. - Range: [1, 500] - Default: 50 - Tuning Values: 50,100,200 - Tuning Notes: On the current live network, the algorithms tended to exit slow start early, so we did not exercise this much. This may not be the case in Shadow, or once clients upgrade to the new algorithms. - Shadow Tuning Results: Setting this above 50 caused excessive queues to build up in Shadow. This may have been due to imbalances in Shadow client allocation, though. 
    Values of 50-100 will be explored after examining Shadow Guard Relay Utilization.

6.5.2. Westwood parameters

Westwood has runaway conditions. Because the congestion signal threshold of TOR_WESTWOOD is a function of RTT_max, excessive queuing can cause an increase in RTT_max. Additionally, if stream activity is constant, but of a lower bandwidth than the circuit, this will not drive the RTT upwards, and this can result in a congestion window that continues to increase in the absence of any other concurrent activity.

For these reasons, we are unlikely to spend much time deeply investigating Westwood in Shadow, beyond a simulation or two to check these behaviors.

cc_westwood_rtt_thresh:
  - Description: Specifies the cutoff for BOOTLEG_RTT_TOR to deliver congestion signal, as fixed point representation divided by 1000.
  - Range: [1, 1000]
  - Default: 33
  - Tuning Values: [20, 33, 40, 50]
  - Tuning Notes: The Defenestrator paper set this at 23, but did not justify it. We may need to raise it to compete with current fixed window SENDME.

cc_westwood_cwnd_m:
  - Description: Specifies how much to reduce the congestion window after a congestion signal, as a fraction of 100.
  - Range: [0, 100]
  - Default: 75
  - Tuning Values: [50, 66, 75]
  - Tuning Notes: Congestion control theory started out using 50 here, and then decided 70-75 was better.

cc_westwood_min_backoff:
  - Description: If 1, take the min of BDP estimate and westwood backoff. If 0, take the max of BDP estimate and westwood backoff.
  - Range: [0, 1]
  - Default: 0
  - Tuning Notes: This parameter can make the westwood backoff less aggressive, if need be. We're unlikely to need it, though.

cc_westwood_rtt_m:
  - Description: Specifies a backoff percent of RTT_max, upon receipt of a congestion signal.
  - Range: [50, 100]
  - Default: 100
  - Tuning Notes: Westwood technically has a runaway condition where congestion can cause RTT_max to grow, which increases the congestion threshold.
    This has not yet been observed, but because it is possible, we include this parameter.

6.5.3. Vegas Parameters

cc_vegas_alpha_{exit,onion,sbws}:
cc_vegas_beta_{exit,onion,sbws}:
cc_vegas_gamma_{exit,onion,sbws}:
cc_vegas_delta_{exit,onion,sbws}:
  - Description: These parameters govern the number of cells that [TOR_VEGAS] can detect in queue before reacting.
  - Range: [0, 1000] (except delta, which has max of INT32_MAX)
  - Defaults:
      # OUTBUF_CELLS=62
      cc_vegas_alpha_exit (3*OUTBUF_CELLS)
      cc_vegas_beta_exit (4*OUTBUF_CELLS)
      cc_vegas_gamma_exit (3*OUTBUF_CELLS)
      cc_vegas_delta_exit (5*OUTBUF_CELLS)
      cc_vegas_alpha_onion (3*OUTBUF_CELLS)
      cc_vegas_beta_onion (6*OUTBUF_CELLS)
      cc_vegas_gamma_onion (4*OUTBUF_CELLS)
      cc_vegas_delta_onion (7*OUTBUF_CELLS)
  - Tuning Notes: The amount of queued cells that Vegas should tolerate is heavily dependent upon competing congestion control algorithms. The specified defaults are necessary to compete against current fixed SENDME traffic, but are much larger than necessary otherwise. These values also need a large-ish range between alpha and beta, to allow some degree of variance in traffic, as per [33]. The tuning of these parameters happened in two tickets[34,35]. The onion service parameters were set on the basis that they should increase the queue until as much queue delay as Exits, but tolerate up to 6 hops of outbuf delay. Lack of visibility into onion service congestion window on the live network prevented confirming this.
  - Shadow Tuning Results: We found that the best values for 3-hop Exit circuits were to set alpha and gamma to the size of the outbufs times the number of hops. Beta is set to one TLS record/sendme_inc above this value.

cc_sscap_{exit,onion,sbws}:
  - Description: These parameters describe the RFC3742 'cap', after which congestion window increments are reduced. INT32_MAX disables RFC3742.
  - Range: [100, INT32_MAX]
  - Defaults:
      sbws: 400
      exit: 600
      onion: 475
  - Shadow Tuning Results: We picked these defaults based on the average congestion window seen in Shadow sims for exits and onion service circuits.

cc_ss_max:
  - Description: This parameter provides a hard-max on the congestion window in slow start.
  - Range: [500, INT32_MAX]
  - Default: 5000
  - Shadow Tuning Results: The largest congestion window seen in Shadow is ~3000, so this was set as a safety valve above that.

cc_cwnd_full_gap:
  - Description: This parameter defines the integer number of 'cc_sendme_inc' multiples of gap allowed between inflight and cwnd, to still declare the cwnd full.
  - Range: [0, INT16_MAX]
  - Default: 4
  - Shadow Tuning Results: Low values resulted in a slight loss of performance, and increased variance in throughput. Setting this at 4 seemed to achieve a good balance between throughput and queue overshoot.

cc_cwnd_full_minpct:
  - Description: This parameter defines a low watermark in percent. If inflight falls below this percent of cwnd, the congestion window is immediately declared non-full.
  - Range: [0, 100]
  - Default: 25

cc_cwnd_full_per_cwnd:
  - Description: This parameter governs how often a cwnd must be full, in order to allow congestion window increase. If it is 1, then the cwnd only needs to be full once per cwnd worth of acks. If it is 0, then it must be full once every cwnd update (i.e., every SENDME).
  - Range: [0, 1]
  - Default: 1
  - Shadow Tuning Results: A value of 0 resulted in a slight loss of performance, and increased variance in throughput. The optimal number here likely depends on edgeconn inbuf size, edgeconn kernel buffer size, and eventloop behavior.

6.5.4. NOLA Parameters

cc_nola_overshoot:
  - Description: The number of cells to add to the BDP estimator to obtain the NOLA cwnd.
  - Range: [0, 1000]
  - Default: 100
  - Tuning Values: [0, 50, 100, 150, 200]
  - Tuning Notes: In order to compete against current fixed sendme, and to ensure that the congestion window has an opportunity to grow, we must set the cwnd above the current BDP estimate. How much above will be a function of competing traffic. It may also turn out that absent any more aggressive competition, we do not need to overshoot the BDP estimate.

6.5.5. Flow Control Parameters

As with previous sections, the parameters in this section are sorted with the parameters that are most important to tune, first.

These parameters have been tuned using onion services. The defaults are believed to be good.

cc_xoff_client
cc_xoff_exit
  - Description: Specifies the outbuf length, in relay cell multiples, before we send an XOFF.
  - Range: [1, 10000]
  - Default: 500
  - Tuning Values: [500, 1000]
  - Tuning Notes: This threshold plus the sender's cwnd must be greater than the cc_xon_rate value, or a rate cannot be computed. Unfortunately, unless it is sent, the receiver does not know the cwnd. Therefore, this value should always be higher than cc_xon_rate minus 'cc_cwnd_min' (100) minus the xon threshold value (0).

cc_xon_rate
  - Description: Specifies how many full packed cells of bytes must arrive before we can compute a rate, as well as how often we can send XONs.
  - Range: [1, 5000]
  - Default: 500
  - Tuning Values: [500, 1000]
  - Tuning Notes: Setting this high will prevent excessive XONs, as well as reduce side channel potential, but it will delay response to queuing, and will hinder our ability to detect rate changes. However, low values will also reduce our ability to accurately measure drain rate. This value should always be lower than 'cc_xoff_*' + 'cc_cwnd_min', so that a rate can be computed solely from the outbuf plus inflight data.

cc_xon_change_pct
  - Description: Specifies how much the edge drain rate can change before we send another advisory cell.
- Range: [1, 99] - Default: 25 - Tuning values: [25, 50, 75] - Tuning Notes: Sending advisory updates due to a rate change may help us avoid hitting the XOFF limit, but it may also not help much unless we are already above the advise limit. cc_xon_ewma_cnt - Description: Specifies the N in the N_EWMA of rates. - Range: [2, 100] - Default: 2 - Tuning values: [2, 3, 5] - Tuning Notes: Setting this higher will smooth over changes in the rate field, and thus avoid XONs, but will reduce our reactivity to rate changes. 6.5.6. External Performance Parameters to Tune The following parameters are from other areas of Tor, but tuning them will improve congestion control performance. They are again sorted by most important to tune, first. cbtquantile - Description: Specifies the percentage cutoff for the circuit build timeout mechanism. - Range: [60, 80] - Default: 80 - Tuning Values: [70, 75, 80] - Tuning Notes: The circuit build timeout code causes Tor to use only the fastest 'cbtquantile' percentage of paths to build through the network. Lowering this value will help avoid congested relays, and improve latency. CircuitPriorityHalflifeMsec - Description: The CircEWMA half-life specifies the time period after which the cell count on a circuit is halved. This allows circuits to regain their priority if they stop being bursty. - Range: [1, INT32_MAX] - Default: 30000 - Tuning Values: [5000, 15000, 30000, 60000] - Tuning Notes: When we last tuned this, it was before KIST[31], so previous values may have little relevance to today. According to the CircEWMA paper[30], values that are too small will fail to differentiate bulk circuits from interactive ones, and values that are too large will allow new bulk circuits to keep priority over interactive circuits for too long. The paper does say that the system was not overly sensitive to specific values, though. CircuitPriorityTickSecs - Description: This specifies how often in seconds we readjust circuit priority based on their EWMA. 
- Range: [1, 600] - Default: 10 - Tuning Values: [1, 5, 10] - Tuning Notes: Even less is known about the optimal value for this parameter. At a guess, it should be more often than the half-life. Changing it also influences the half-life decay, though, at least according to the CircEWMA paper[30]. KISTSchedRunInterval - If 0, KIST is disabled. (We should also test KIST disabled) 6.5.7. External Memory Reduction Parameters to Tune The following parameters are from other areas of Tor, but tuning them will reduce memory utilization in relays. They are again sorted by most important to tune, first. circ_max_cell_queue_size - Description: Specifies the minimum number of cells that are allowed to accumulate in a relay queue before closing the circuit. - Range: [1000, INT32_MAX] - Default: 50000 - Tuning Values: [1000, 2500, 5000] - Tuning Notes: Once all clients have upgraded to congestion control, relay circuit queues should be minimized. We should minimize this value, as any high amounts of queueing is a likely violator of the algorithm. cellq_low cellq_high - Description: Specifies the number of cells that can build up in a circuit's queue for delivery onto a channel (from edges) before we either block or unblock reading from streams attached to that circuit. - Range: [1, 1000] - Default: low=10, high=256 - Tuning Values: low=[0, 2, 4, 8]; high=[16, 32, 64] - Tuning Notes: When data arrives from edges into Tor, it gets packaged up into cells and then delivered to the cell queue, and from there is dequeued and sent on a channel. If the channel has blocked (see below params), then this queue grows until the high watermark, at which point Tor stops reading on all edges associated with a circuit, and a congestion signal is delivered to that circuit. At 256 cells, this is ~130k of data for *every* circuit, which is far more than Tor can write in a channel outbuf. Lowering this will reduce latency, reduce memory usage, and improve responsiveness to congestion. 
However, if it is too low, we may incur additional mainloop invocations, which are expensive. We will need to trace or monitor epoll() invocations in Shadow or on a Tor exit to verify that low values do not lead to more mainloop invocations. - Shadow Tuning Results: After extensive tuning, it turned out that the defaults were optimal in terms of throughput. orconn_high orconn_low - Description: Specifies the number of bytes that can be held in an orconn's outbuf before we block or unblock the orconn. - Range: [509, INT32_MAX] - Default: low=16k, high=32k - Tuning Notes: When the orconn's outbuf is above the high watermark, cells begin to accumulate in the cell queue as opposed to being added to the outbuf. It may make sense to lower this to be more in-line with the cellq values above. Also note that the low watermark is only used by the vanilla scheduler, so tuning it may be relevant when we test with KIST disabled. Just like the cell queue, if this is set lower, congestion signals will arrive sooner to congestion control when orconns become blocked, and less memory will occupy queues. It will also reduce latency. Note that if this is too low, we may not fill TLS records, and we may incur excessive epoll()/mainloop invocations. Tuning this is likely less beneficial than tuning the above cell_queue, unless KIST is disabled. MaxMemInQueues - Should be possible to set much lower, similarly to help with OOM conditions due to protocol violation. Unfortunately, this is just a torrc, and a bad one at that. 7. Protocol format specifications [PROTOCOL_SPEC] TODO: This section needs details once we close out other TODOs above. 7.1. Circuit window handshake format TODO: We need to specify a way to communicate the currently seen cc_sendme_inc consensus parameter to the other endpoint, due to consensus sync delay. Probably during the CREATE onionskin (and RELAY_COMMAND_EXTEND). 
TODO: We probably want stricter rules on the range of values for the per-circuit negotiation - something like it has to be between [cc_sendme_inc/2, 2*cc_sendme_inc]. That way, we can limit weird per-circuit values, but still allow us to change the consensus value in increments. 7.2. XON/XOFF relay cell formats TODO: We need to specify XON/XOFF for flow control. This should be simple. TODO: We should also allow it to carry stream data, as in Prop 325. 7.3. Onion Service formats TODO: We need to specify how to signal support for congestion control in an onion service, to both the intropoint and to clients. 7.4. Protocol Version format TODO: We need to pick a protover to signal Exit and Intropoint congestion control support. 7.5. SENDME relay cell format TODO: We need to specify how to add stream data to a SENDME as an optimization. 7.6. Extrainfo descriptor formats TODO: We will want to gather information on circuitmux and other relay queues, as well as XON/XOFF rates, and edge connection queue lengths at exits. 8. Security Analysis [SECURITY_ANALYSIS] The security risks of congestion control come in three forms: DoS attacks, fairness abuse, and side channel risk. 8.1. DoS Attacks (aka Adversarial Buffer Bloat) The most serious risk of eliminating our current window cap is that endpoints can abuse this situation to create huge queues and thus DoS Tor relays. This form of attack was already studied against the Tor network in the Sniper attack: https://www.freehaven.net/anonbib/cache/sniper14.pdf We had two fixes for this. First, we implemented a circuit-level OOM killer that closed circuits whose queues became too big, before the relay OOMed and crashed. 
Second, we implemented authenticated SENDMEs, so clients could not artificially increase their window sizes with honest exits: https://gitweb.torproject.org/torspec.git/tree/proposals/289-authenticated-sendmes.txt We can continue this kind of enforcement by having Exit relays ensure that clients are not transmitting SENDMEs too often, and do not appear to be inflating their send windows beyond what the Exit expects by calculating a similar estimated receive window. Note that such an estimate may have error and may become negative if the estimate is jittery. Unfortunately, authenticated SENDMEs do *not* prevent the same attack from being done by rogue exits, or rogue onion services. For that, we rely solely on the circuit OOM killer. During our experimentation, we must ensure that the circuit OOM killer works properly to close circuits in these scenarios. But in any case, it is important to note that we are not any worse off with congestion control than we were before, with respect to these kinds of DoS attacks. In fact, the deployment of congestion control by honest clients should reduce queue use and overall memory use in relays, allowing them to be more resilient to OOM attacks than before. 8.2. Congestion Control Fairness Abuse (aka Cheating) On the Internet, significant research and engineering effort has been devoted to ensuring that congestion control algorithms are "fair" in that each connection receives equal throughput. This fairness is provided both via the congestion control algorithm, as well as via queue management algorithms at Internet routers. One of the most unfortunate early results was that TCP Vegas, despite being near-optimal at minimizing queue lengths at routers, was easily out-performed by more aggressive algorithms that tolerated larger queue delay (such as TCP Reno). 
Note that because the most common direction of traffic for Tor is from Exit to client, unless Exits are malicious, we do not need to worry about rogue algorithms as much, but we should still examine them in our experiments because of the possibility of malicious Exits, as well as malicious onion services. Queue management can help further mitigate this risk, too. When RTT is used as a congestion signal, our current Circuit-EWMA queue management algorithm is likely sufficient for this. Because Circuit-EWMA will add additional delay to loud circuits, "cheaters" who use alternate congestion control algorithms to inflate their congestion windows should end up with more RTT congestion signals than those who do not, and the Circuit-EWMA scheduler will also relay fewer of their cells per time interval. In this sense, we do not need to worry about fairness and cheating as a security property, but a lack of fairness in the congestion control algorithm *will* increase memory use in relays to queue these unfair/loud circuits, perhaps enough to trigger the OOM killer. So we should still be mindful of these properties in selecting our congestion control algorithm, to minimize relay memory use, if nothing else. These two properties (honest Exits and Circuit-EWMA) may even be enough to make it possible to use [TOR_VEGAS] even in the presence of other algorithms, which would be a huge win in terms of memory savings as well as vastly reduced queue delay. We must verify this experimentally, though. 8.3. Side Channel Risks Vastly reduced queue delay and predictable amounts of congestion on the Tor network may make certain forms of traffic analysis easier. 
Additionally, the ability to measure RTT and have it be stable due to minimal network congestion may make geographical inference attacks easier: https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf https://www.robgjansen.com/publications/howlow-pets2013.pdf It is an open question as to if these risks are serious enough to warrant eliminating the ability to measure RTT at the protocol level and abandoning it as a congestion signal, in favor of other approaches (which have their own side channel risks). It will be difficult to comprehensively eliminate RTT measurements, too. On the plus side, Conflux traffic splitting (which is made easy once congestion control is implemented) does show promise as providing defense against traffic analysis: https://www.comsys.rwth-aachen.de/fileadmin/papers/2019/2019-delacadena-splitting-defense.pdf There is also literature on shaping circuit bandwidth to create a side channel. This can be done regardless of the use of congestion control, and is not an argument against using congestion control. In fact, the Backlit defense may be an argument in favor of endpoints monitoring circuit bandwidth and latency more closely, as a defense: https://www.freehaven.net/anonbib/cache/ndss09-rainbow.pdf https://www.freehaven.net/anonbib/cache/ndss11-swirl.pdf https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf Finally, recall that we are considering ideas/xxx-backward-ecn.txt [BACKWARD_ECN] to use a circuit-level cell_t.command to signal congestion. This allows all relays in the path to signal congestion in under RTT/2 in either direction, and it can be flipped on existing relay cells already in transit, without introducing any overhead. However, because cell_t.command is visible and malleable to all relays, it can also be used as a side channel. So we must limit its use to a couple of cells per circuit, at most. https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack 9. 
Onion Service Negotiation [ONION_NEGOTIATION] Onion service requires us to advertise the protocol version and congestion control parameters in a different way since the end points do not know each other like a client knows all the relays and what they support. Additionally, we cannot use ntorv3 for onion service negotiation, because it is not supported at all rendezvous and introduction points. To address this, this is done in two parts. First, the service needs to advertise to the world that it supports congestion control, and its view of the current cc_sendme_inc consensus parameter. This is done through a new line in the onion service descriptor, see section 9.1 below. Second, the client needs to inform the service that it wants to use congestion control on the rendezvous circuit. This is done through the INTRODUCE cell as an extension, see section 9.2 below. 9.1. Onion Service Descriptor We propose to add a new line to advertise the flow control protocol version, in the encrypted section of the onion service descriptor: "flow-control" SP version-range SP sendme-inc NL The "version-range" value is the same as the protocol version FlowCtrl that relay advertises which is defined earlier in this proposal. The current value is "1-2". The "sendme-inc" value comes from the service's current cc_sendme_inc consensus parameter. Clients MUST ignore additional unknown versions in "version-range", and MUST ignore any additional values on this line. Clients SHOULD use the highest value in "version-range" to govern their protocol choice for "FlowCtrl" and INTRODUCE cell format, as per Section 9.2 below. If clients do not support any of the versions in "version-range", they SHOULD reject the descriptor. (They MAY choose to ignore this line instead, but doing so means using the old fixed-window SENDME flow control, which will likely be bad for the network). 
Clients that are able to parse this line and know the protocol version MUST validate that the "sendme-inc" value is within a multiple of 2 of the "cc_sendme_inc" in the consensus that they see. If "sendme-inc" is not within range, they MUST reject the descriptor.

If their consensus also lists a non-zero "cc_alg", they MAY then send, in the INTRODUCE1 cell, the congestion control request extension field, which is detailed in the next section.

A service should only advertise its flow control version if congestion control is enabled. It MUST remove this line if congestion control is disabled.

If the service observes a change in the 'cc_sendme_inc' consensus parameter since it last published its descriptor, it MUST immediately close its introduction points, and publish a new descriptor with the new "sendme-inc" value. The additional step of closing the introduction points ensures that no clients arrive using a cached descriptor, with the old "sendme-inc" value.

9.2 INTRODUCE cell extension

We propose a new extension to the INTRODUCE cell which can be used to send congestion control parameters down to the service. It is important to mention that this extension is to be used in the encrypted section of the cell, and not in the section readable by the introduction point.

If used, it needs to be encoded within the ENCRYPTED section of the INTRODUCE1 cell defined in rend-spec-v3.txt section 3.3. The content is defined as follows:

  EXT_FIELD_TYPE:

    [01] -- Congestion Control Request.

This field has zero payload length. Its presence signifies that the client wants to use congestion control. The client MUST NOT set this field, or use ntorv3, if the service did not list "2" in the "FlowCtrl" line in the descriptor. The client SHOULD NOT provide this field if the consensus parameter 'cc_alg' is 0.

The service MUST ignore any unknown fields.
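The client-side descriptor checks from Section 9.1 can be sketched as follows. This is a minimal illustration and not the tor implementation: the names `check_flow_control_line`, `parse_version_range`, and `DescriptorRejected` are hypothetical, the comma-separated range syntax and the `MAX_VERSION` cap are assumptions, and "within a multiple of 2" is read here as "no less than half and no more than double the consensus value".

```python
from typing import Set

MAX_VERSION = 63  # ASSUMPTION: defensive cap so a hostile range cannot build a huge set


class DescriptorRejected(Exception):
    """Hypothetical error type: the descriptor must not be used."""


def parse_version_range(text: str) -> Set[int]:
    """Parse e.g. "1-2" or "1,3-4" (comma syntax is an assumption) into a set."""
    versions: Set[int] = set()
    for part in text.split(","):
        lo, sep, hi = part.partition("-")
        if sep and not hi:
            raise ValueError("dangling '-' in version range")
        low = int(lo)
        high = int(hi) if sep else low
        if low > high or high > MAX_VERSION:
            raise ValueError("bad version range")
        versions.update(range(low, high + 1))
    return versions


def check_flow_control_line(line: str, supported: Set[int],
                            consensus_sendme_inc: int) -> int:
    """Validate a "flow-control" line; return the FlowCtrl version to use."""
    fields = line.rstrip("\n").split(" ")
    # Values after the third field are ignored, as Section 9.1 requires.
    if len(fields) < 3 or fields[0] != "flow-control":
        raise DescriptorRejected("malformed flow-control line")
    try:
        offered = parse_version_range(fields[1])
        sendme_inc = int(fields[2])
    except ValueError as exc:
        raise DescriptorRejected(str(exc)) from exc

    # Unknown versions are ignored; only the intersection matters.
    common = offered & supported
    if not common:
        raise DescriptorRejected("no supported flow-control version")

    # ASSUMED reading of "within a multiple of 2" of the consensus value.
    in_range = (consensus_sendme_inc <= 2 * sendme_inc
                and sendme_inc <= 2 * consensus_sendme_inc)
    if not in_range:
        raise DescriptorRejected("sendme-inc out of range")

    return max(common)  # clients SHOULD use the highest version
```

For example, a client that supports versions 1 and 2 and sees `cc_sendme_inc=31` in its consensus would accept the line `flow-control 1-2 31` and select version 2, and would reject the same line if it carried a "sendme-inc" of 100.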
9.3 Protocol Flow

First, the client reads the "flow-control" line in the descriptor and picks the highest version that appears both in that line's "version-range" and in the set of versions the client itself supports. As an example, if the client supports 2-3-4 and the service supports 2-3, then 3 is chosen.

It then sends that value along with its desired cc_sendme_inc value in the INTRODUCE1 cell in the extension.

The service will then validate that it does support version 3 and that the parameter cc_sendme_inc is within range of the protocol.

Congestion control is then applied to the rendezvous circuit.

9.4 Circuit Behavior

If the extension is not found in the cell, the service MUST NOT use congestion control on the rendezvous circuit.

Any invalid values received in the extension should result in closing the introduction circuit and thus not continuing the rendezvous process. A value is invalid if it is either unsupported or outside the defined range.

9.5 Security Considerations

Advertising a new line in a descriptor does leak that a service is running at least a certain tor version. We believe that this is an acceptable risk in order for services to be able to take advantage of congestion control. Once a new tor stable is released, we hope that most services upgrade and thus everyone looks the same again.

The new extension is located in the encrypted part of the INTRODUCE1 cell and thus the introduction point can't learn its content.

10. Exit negotiation [EXIT_NEGOTIATION]

Similar to onion services, clients and exits will need to negotiate the decision to use congestion control, as well as a common value for 'cc_sendme_inc', for a given circuit.

10.1. When to negotiate

Clients decide to initiate a negotiation attempt for a circuit if the consensus lists a non-zero 'cc_alg' parameter value, and the protover line for their chosen exit includes a value of 2 in the "FlowCtrl" field.

If the FlowCtrl=2 subprotocol is absent, a client MUST NOT attempt negotiation.
If 'cc_alg' is absent or zero, a client SHOULD NOT attempt negotiation, or use ntorv3.

If the protover and consensus conditions are met, clients SHOULD negotiate with the Exit if the circuit is to be used for exit stream activity. Clients SHOULD NOT negotiate congestion control for one-hop circuits, or internal circuits.

10.2. What to negotiate

Clients and exits need not agree on a specific congestion control algorithm, or any aspects of its behavior. Each endpoint's management of its congestion window is independent. However, because the new algorithms no longer use stream SENDMEs or fixed window sizes, they cannot be used with an endpoint expecting the old behavior.

Additionally, each endpoint must agree on the SENDME increment rate, in order to synchronize SENDME authentication and pacing.

For this reason, negotiation needs to establish a boolean: "use congestion control", and an integer value for SENDME increment.

No other parameters need to be negotiated.

10.3. How to negotiate

Negotiation is performed by sending an ntorv3 onionskin, as specified in Proposal 332, to the Exit node. The encrypted payload contents from the clients are encoded as an extension field, as in the onion service INTRO1 cell:

  EXT_FIELD_TYPE:

    [01] -- Congestion Control Request.

As in the INTRO1 extension field, this field has zero payload length. Its presence signifies that the client wants to use congestion control.

Again, the client MUST NOT set this field, or use ntorv3, if this exit did not list "2" in the "FlowCtrl" version line. The client SHOULD NOT set this to 1 if the consensus parameter 'cc_alg' is 0.

The Exit MUST ignore any additional unknown extension fields.

The server's encrypted ntorv3 reply payload is encoded as:

  EXT_FIELD_TYPE:

    [02] -- Congestion Control Response.

    If this flag is set, the extension should be used by the service to learn the congestion control parameters to use on the rendezvous circuit.
  EXT_FIELD content payload is a single byte:

    sendme_inc [1 byte]

The Exit MUST provide its current view of 'cc_sendme_inc' in this payload if it observes a non-zero 'cc_alg' consensus parameter. Exits SHOULD only include this field once.

The client MUST use the FIRST such field value, and ignore any duplicate field specifiers. The client MUST ignore any unknown additional fields.

10.5. Client checks

The client MUST reject any ntorv3 replies for non-ntorv3 onionskins.

The client MUST reject an ntorv3 reply with field EXT_FIELD_TYPE=02, if the client did not include EXT_FIELD_TYPE=01 in its handshake.

The client SHOULD reject a sendme_inc field value that differs from the current 'cc_sendme_inc' consensus parameter by more than +/- 1, in either direction.

If a client rejects a handshake, it MUST close the circuit.

10.6. Managing consensus updates

The pedantic reader will note that a rogue consensus can cause all clients to decide to close circuits by changing 'cc_sendme_inc' by a large margin.

As a matter of policy, the directory authorities MUST NOT change 'cc_sendme_inc' by more than +/- 1.

In Shadow simulation, the optimal 'cc_sendme_inc' value was found to be ~31 cells, or one (1) TLS record worth of cells. We do not expect to change this value significantly.

11. Acknowledgements

Immense thanks to Toke Høiland-Jørgensen for considerable input into all aspects of the TCP congestion control background material for this proposal, as well as review of our versions of the algorithms.

12. Glossary [GLOSSARY]

ACK - Acknowledgment. In congestion control, this is a type of packet that signals that the endpoint received a packet or packet set. In Tor, ACKs are called SENDMEs.

BDP - Bandwidth Delay Product. This is the quantity of bytes that are actively in transit on a path at any given time. Typically, this does not count packets waiting in queues. It is essentially RTT*BWE - queue_delay.

BWE - BandWidth Estimate. This is the estimated throughput on a path.

CWND - Congestion WiNDow.
This is the total number of packets that are allowed to be "outstanding" (aka not ACKed) on a path at any given time. An ideal congestion control algorithm sets CWND=BDP.

EWMA - Exponential Weighted Moving Average. This is a mechanism for smoothing out high-frequency changes in a value, due to temporary effects.

ICW - Initial Congestion Window. This is the initial value of the congestion window at the start of a connection.

RTT - Round Trip Time. This is the time it takes for one endpoint to send a packet to the other endpoint, and get a response.

SS - Slow Start. This is the initial phase of most congestion control algorithms. Despite the name, it is an exponential growth phase, to quickly increase the congestion window from the ICW value up to the path BDP. After Slow Start, changes to the congestion window are linear.

XOFF - Transmitter Off. In flow control, XOFF means that the receiver is receiving data too fast and is beginning to queue. It is sent to tell the sender to stop sending.

XON - Transmitter On. In flow control, XON means that the receiver is ready to receive more data. It is sent to tell the sender to resume sending.

13. [CITATIONS]

1. Options for Congestion Control in Tor-Like Networks.
   https://lists.torproject.org/pipermail/tor-dev/2020-January/014140.html

2. Towards Congestion Control Deployment in Tor-like Networks.
   https://lists.torproject.org/pipermail/tor-dev/2020-June/014343.html

3. DefenestraTor: Throwing out Windows in Tor.
   https://www.cypherpunks.ca/~iang/pubs/defenestrator.pdf

4. TCP Westwood: Bandwidth Estimation for Enhanced Transport over Wireless Links
   http://nrlweb.cs.ucla.edu/nrlweb/publication/download/99/2001-mobicom-0.pdf

5. Performance Evaluation and Comparison of Westwood+, New Reno, and Vegas TCP Congestion Control
   http://cpham.perso.univ-pau.fr/TCP/ccr_v31.pdf

6. Linux 2.4 Implementation of Westwood+ TCP with rate-halving
   https://c3lab.poliba.it/images/d/d7/Westwood_linux.pdf

7. TCP Westwood
   http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-westwood

8. TCP Vegas: New Techniques for Congestion Detection and Avoidance
   http://pages.cs.wisc.edu/~akella/CS740/F08/740-Papers/BOP94.pdf

9. Understanding TCP Vegas: A Duality Model
   ftp://ftp.cs.princeton.edu/techreports/2000/628.pdf

10. TCP Vegas
    http://intronetworks.cs.luc.edu/1/html/newtcps.html#tcp-vegas

11. Controlling Queue Delay
    https://queue.acm.org/detail.cfm?id=2209336

12. Controlled Delay Active Queue Management
    https://tools.ietf.org/html/rfc8289

13. How Much Anonymity does Network Latency Leak?
    https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf

14. How Low Can You Go: Balancing Performance with Anonymity in Tor
    https://www.robgjansen.com/publications/howlow-pets2013.pdf

15. POSTER: Traffic Splitting to Counter Website Fingerprinting
    https://www.comsys.rwth-aachen.de/fileadmin/papers/2019/2019-delacadena-splitting-defense.pdf

16. RAINBOW: A Robust And Invisible Non-Blind Watermark for Network Flows
    https://www.freehaven.net/anonbib/cache/ndss09-rainbow.pdf

17. SWIRL: A Scalable Watermark to Detect Correlated Network Flows
    https://www.freehaven.net/anonbib/cache/ndss11-swirl.pdf

18. Exposing Invisible Timing-based Traffic Watermarks with BACKLIT
    https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

19. The Sniper Attack: Anonymously Deanonymizing and Disabling the Tor Network
    https://www.freehaven.net/anonbib/cache/sniper14.pdf

20. Authenticating sendme cells to mitigate bandwidth attacks
    https://gitweb.torproject.org/torspec.git/tree/proposals/289-authenticated-sendmes.txt

21. Tor security advisory: "relay early" traffic confirmation attack
    https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack

22. The Path Less Travelled: Overcoming Tor’s Bottlenecks with Traffic Splitting
    https://www.cypherpunks.ca/~iang/pubs/conflux-pets.pdf

23. Circuit Padding Developer Documentation
    https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md

24. Plans for Tor Live Network Performance Experiments
    https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/Sponsor61/PerformanceExperiments

25. Tor Performance Metrics for Live Network Tuning
    https://gitlab.torproject.org/legacy/trac/-/wikis/org/roadmaps/CoreTor/PerformanceMetrics

26. Bandwidth-Delay Product
    https://en.wikipedia.org/wiki/Bandwidth-delay_product

27. Exponentially Weighted Moving Average
    https://corporatefinanceinstitute.com/resources/knowledge/trading-investing/exponentially-weighted-moving-average-ewma/

28. Dropping on the Edge
    https://www.petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf

29. https://github.com/mikeperry-tor/vanguards/blob/master/README_TECHNICAL.md#the-bandguards-subsystem

30. An Improved Algorithm for Tor Circuit Scheduling.
    https://www.cypherpunks.ca/~iang/pubs/ewma-ccs.pdf

31. KIST: Kernel-Informed Socket Transport for Tor
    https://matt.traudt.xyz/static/papers/kist-tops2018.pdf

32. RFC3742 Limited Slow Start
    https://datatracker.ietf.org/doc/html/rfc3742#section-2

33. https://people.csail.mit.edu/venkatar/cc-starvation.pdf

34. https://gitlab.torproject.org/tpo/core/tor/-/issues/40642

35. https://gitlab.torproject.org/tpo/network-health/analysis/-/issues/49
Filename: 325-packed-relay-cells.md
Title: Packed relay cells: saving space on small commands
Author: Nick Mathewson
Created: 10 July 2020
Status: Obsolete

(Proposal superseded by proposal 340)

Introduction

In proposal 319 I suggested a way to fragment long commands across multiple RELAY cells. In this proposal, I suggest a new format for RELAY cells that can be used to pack multiple relay commands into a single cell.

Why would we want to do this? As we move towards improved congestion-control and flow-control algorithms, we might not want to use an entire 498-byte relay payload just to send a one-byte flow-control message.

We already have some cases where we'd benefit from this feature. For example, when we send SENDME messages, END cells, or BEGIN_DIR cells, most of the cell body is wasted with padding.

As a side benefit, packing cells in this way may make the job of the traffic analyst a little more tricky, as cell contents become less predictable.

The basic design

Let's use the term "Relay Message" to mean the kind of thing that a relay cell used to hold. Thus, this proposal is about packing multiple "Relay Messages" into a cell.

I'll use "Packed relay cell" to mean a relay cell in this new format, that supports multiple messages.

I'll use "client" to mean the initiator of a circuit, and "relay" to refer to the parties through whom a circuit is created. Note that each "relay" (as used here) may be the "client" on circuits of its own.

When a relay supports relay message packing, it advertises the fact using a new Relay protocol version. Clients must opt in to using this protocol version (see the "Negotiation and migration" section below) before they can send any packed relay cells, and before the relay will send them any packed relay cells.

When packed cells are in use, multiple relay messages can be concatenated in a single relay cell.

Packed Cell Format

To carry multiple commands in a single relay cell, the messages are concatenated one after another in the format below. The first message uses the same header format as a normal relay cell, as detailed in section 6.1 of tor-spec.txt:

    Relay Command   [1 byte]
    'Recognized'    [2 bytes]
    StreamID        [2 bytes]
    Digest          [4 bytes]
    Length          [2 bytes]
    Data            [Length bytes]
    RELAY_MESSAGE
    Padding         [up to end of cell]

The RELAY_MESSAGE can be empty (zero bytes), indicating that no other messages follow, or it can be set to the following:

    Relay Command   [1 byte]
    StreamID        [2 bytes]
    Length          [2 bytes]
    Data            [Length bytes]
    RELAY_MESSAGE

Note that the Recognized and Digest fields are not added to the second and later relay messages; they apply to the relay cell as a whole. Thus, how we encrypt, decrypt, and recognize a cell is unchanged: only the payload changes, so that it can contain multiple messages.

The "Relay Command" byte "0" is now used to explicitly indicate "end of commands". If the byte "0" appears after a RELAY_MESSAGE, the rest of the cell MUST be ignored. (Note that this "end of commands" indicator may be absent if there are no bytes remaining after the last message in the cell.)

Only some "Relay Command" values are supported for relay cell packing:

  • BEGIN_DIR
  • BEGIN
  • CONNECTED
  • DATA
  • DROP
  • END
  • PADDING_NEGOTIATED
  • PADDING_NEGOTIATE
  • SENDME

If any relay message with a relay command not listed above appears in a packed relay cell with another relay message, then the receiving party MUST tear down the circuit.
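The parsing rules above can be sketched as follows. This is a minimal illustration and not code from any Tor implementation: the names `parse_packed_cell` and `CircuitError` are made up here, the command numbers are the values assigned in tor-spec, and treating a truncated message as a circuit-level error is an assumption.

```python
import struct

# Relay command numbers (from tor-spec) that may share a packed cell.
PACKABLE = {
    1: "BEGIN", 2: "DATA", 3: "END", 4: "CONNECTED", 5: "SENDME",
    10: "DROP", 13: "BEGIN_DIR", 41: "PADDING_NEGOTIATE",
    42: "PADDING_NEGOTIATED",
}
END_OF_COMMANDS = 0


class CircuitError(Exception):
    """Hypothetical signal that the receiver must tear down the circuit."""


def parse_packed_cell(body: bytes):
    """Split a decrypted, recognized relay cell body into messages.

    Returns a list of (command, stream_id, data) tuples.
    """
    # First message: full 11-byte header (command, recognized, stream ID,
    # digest, length), exactly as in an ordinary relay cell.
    cmd, _recognized, stream_id, _digest, length = struct.unpack_from(
        ">BHH4sH", body, 0)
    pos = 11
    if pos + length > len(body):
        raise CircuitError("first message overruns the cell")
    messages = [(cmd, stream_id, body[pos:pos + length])]
    pos += length

    # Later messages: 5-byte header (command, stream ID, length). Stop at
    # the end of the cell or at an explicit "end of commands" byte.
    while pos < len(body) and body[pos] != END_OF_COMMANDS:
        if pos + 5 > len(body):
            raise CircuitError("truncated message header")  # assumption
        cmd, stream_id, length = struct.unpack_from(">BHH", body, pos)
        pos += 5
        if pos + length > len(body):
            raise CircuitError("message overruns the cell")  # assumption
        messages.append((cmd, stream_id, body[pos:pos + length]))
        pos += length

    # A command outside the list may only travel alone in its cell.
    if len(messages) > 1:
        for cmd, _sid, _data in messages:
            if cmd not in PACKABLE:
                raise CircuitError("command %d cannot be packed" % cmd)
    return messages
```

Because ordinary padding begins with zero-valued bytes, a cell holding a single message parses the same way: the loop sees a 0 command byte and stops.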

(Note that relay cell fragments (proposal 319) are not supported for packing.)

When generating RELAY cells, implementations SHOULD (as they do today) fill in the Padding field with four 0-valued bytes, followed by a sequence of random bytes up to the end of the cell. If there are fewer than 4 unused bytes at the end of the cell, those unused bytes should all be filled with 0-valued bytes.

Negotiation and migration

After receiving a packed relay cell, the relay knows that the client supports this proposal: Relays SHOULD send packed relay cells on any circuit on which they have received a packed relay cell. Relays MUST NOT send packed relay cells otherwise.

Clients, in turn, MAY send packed relay cells to any relay whose "Relay" subprotocol version indicates that it supports this protocol. To avoid fingerprinting, this client behavior should be controlled with a tristate (1/0/auto) torrc configuration value, with the default set to use a consensus parameter.

The parameter is:

"relay-cell-packing" Boolean: if 1, clients should send packed relay cells. (Min: 0, Max: 1, Default: 0)

To handle migration, first the parameter should be set to 0 and the configuration setting should be "auto". To test the feature, individual clients can set the tristate to "1".

Once enough clients have support for the parameter, the parameter can be set to 1.
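As a rough sketch of how the tristate option and the consensus parameter could combine on the client side (the function name and argument names are hypothetical; only the 1/0/auto values and the "relay-cell-packing" parameter come from the text above):

```python
def client_sends_packed_cells(torrc_setting: str,
                              consensus_param: int,
                              relay_supports_packing: bool) -> bool:
    """Decide whether a client may send packed relay cells to a relay.

    torrc_setting is the tristate option: "1", "0", or "auto".
    consensus_param is the "relay-cell-packing" consensus value (0 or 1).
    """
    # Never send packed cells to a relay that does not advertise support.
    if not relay_supports_packing:
        return False
    if torrc_setting == "1":
        return True
    if torrc_setting == "0":
        return False
    if torrc_setting == "auto":
        # Default behavior: follow the consensus so clients look alike.
        return consensus_param == 1
    raise ValueError("setting must be '1', '0', or 'auto'")
```

During migration the consensus value stays at 0, so only clients that explicitly set "1" send packed cells.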

A new relay message format

(This section is optional and should be considered separately; we may decide it is too complex.)

Currently, every relay message uses 5 bytes of header to hold a relay command, a length field, and a stream ID. This is wasteful: the stream ID is often redundant, and the top 7 bits of the length field are always zero.

I propose a new relay message format, described here (with ux denoting an x-bit bitfield). This format is 2 bytes or 4 bytes, depending on its first bit.

struct relay_header {
    u1 stream_id_included;    // Is the stream_id included?
    u6 relay_command;         // as before
    u9 relay_data_len;        // as before
    u8 optional_stream_id[];  // 0 bytes or two bytes.
}

Alternatively, you can view the first three fields as a 16-bit value, computed as:

(stream_id_included<<15) | (relay_command << 9) | (relay_data_len).

If the optional_stream_id field is not present, then the default value for the stream_id is computed as follows. We use stream_id 0 for any command that doesn't take a stream ID. For commands that do take a stream_id, we use whichever nonzero stream_id appeared most recently in the same cell.

This format limits the space of possible relay commands. That's probably okay: after 20 years of Tor development, we have defined 25 relay command values. But in case 2^6==64 commands will not be enough, we reserve command values 48 through 63 for future formats that need more command bits.
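A small sketch of the 2-or-4-byte header may help make the bit layout concrete. The function names are illustrative, and big-endian (network) byte order is an assumption, since the text above does not state one.

```python
def encode_header(command: int, data_len: int, stream_id=None) -> bytes:
    """Pack a relay message header into 2 bytes, or 4 with a stream ID."""
    if not 0 <= command < 64:        # u6 relay_command
        raise ValueError("command does not fit in 6 bits")
    if not 0 <= data_len < 512:      # u9 relay_data_len
        raise ValueError("length does not fit in 9 bits")
    included = stream_id is not None
    word = (int(included) << 15) | (command << 9) | data_len
    out = word.to_bytes(2, "big")    # byte order is an assumption
    if included:
        out += stream_id.to_bytes(2, "big")
    return out


def decode_header(buf: bytes, offset: int = 0):
    """Return (command, data_len, stream_id or None, header_size)."""
    word = int.from_bytes(buf[offset:offset + 2], "big")
    included = word >> 15
    command = (word >> 9) & 0x3F
    data_len = word & 0x1FF
    if included:
        stream_id = int.from_bytes(buf[offset + 2:offset + 4], "big")
        return command, data_len, stream_id, 4
    # Caller applies the default rule: 0 for commands without a stream
    # ID, otherwise the most recent nonzero stream_id in the same cell.
    return command, data_len, None, 2
```

For example, a SENDME (command 5) with no data and no stream ID encodes to the two bytes 0x0A 0x00, compared with 5 bytes in the current format.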

Filename: 326-tor-relay-well-known-uri-rfc8615.md
Title: The "tor-relay" Well-Known Resource Identifier
Author: nusenu
Created: 14 August 2020
Status: Open

The "tor-relay" Well-Known Resource Identifier

This is a specification for a well-known registry entry according to RFC8615.

This resource identifier can be used for serving and finding proofs related to Tor relay and bridge contact information. It can also be used for autodiscovery of Tor relays run by a given entity, if the entity's domain is known. It solves the issue that Tor relay/bridge contact information is a unidirectional and unverified claim by nature. This well-known URI aims to allow the verification of the unidirectional claim. It aims to reduce the risk of impersonation attacks, where a Tor relay/bridge claims to be operated by a certain entity, but actually isn't. The automated verification will also support the visualization of relay/bridge groups.

  • A Tor relay or bridge contact information field, which is initially unverified, might claim to be related to an organization by pointing to its website: Tor relay/bridge contact information field -> website

  • The "tor-relay" URI allows for the verification of that claim by fetching the files containing Tor relay ID(s) or hashed bridge fingerprints under the specified URI, because attackers cannot easily place these files at the given location.

  • By publishing Tor relay IDs or hashed bridge IDs under this URI the website operator claims to be the responsible entity for these Tor relays/bridges. The verification of listed Tor relay/bridge IDs only succeeds if the claim can be verified bidirectionally (website -> relay/bridge and relay/bridge -> website).

  • This URI is not related to Tor onion services.

  • The URL MUST be HTTPS and use a valid TLS certificate from a generally trusted root CA. Plain HTTP MUST NOT be used.

  • The URL MUST be accessible by robots (no CAPTCHAs).

/.well-known/tor-relay/rsa-fingerprint.txt

  • The file contains one or more RSA SHA1 fingerprints of Tor relays operated by the entity in control of this website.
  • Each line contains one relay fingerprint.
  • The file MUST NOT contain fingerprints of Tor bridges (or hashes of bridge fingerprints). For bridges see the file hashed-bridge-rsa-fingerprint.txt.
  • The file may contain comments (starting with #).
  • Non-comment lines must be exactly 40 characters long and consist of the following characters [a-fA-F0-9].
  • Fingerprints are not case-sensitive.
  • Each fingerprint MUST appear at most once.
  • The file MUST NOT be larger than one MByte.
  • The content MUST be a media type of "text/plain".

Example file content:

# we operate these Tor relays
A234567890123456789012345678901234567ABC
B234567890123456789012345678901234567890

The RSA SHA1 relay fingerprint can be found in the file named "fingerprint" located in the Tor data directory on the relay.
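A verifier fetching this file could check the rules above roughly as follows. This is a sketch only: the function name is invented, "one MByte" is read as 10**6 bytes, and rejecting blank lines is a strict reading of the "exactly 40 characters" rule.

```python
import re

MAX_BYTES = 1_000_000  # assumption: "one MByte" interpreted as 10**6 bytes
HEX40 = re.compile(r"[0-9a-fA-F]{40}")


def parse_rsa_fingerprint_file(content: bytes) -> set:
    """Validate rsa-fingerprint.txt content and return the fingerprints.

    Fingerprints are returned uppercased, since they are not
    case-sensitive. Raises ValueError on any rule violation.
    """
    if len(content) > MAX_BYTES:
        raise ValueError("file is larger than one MByte")
    seen = set()
    for line in content.decode("ascii").splitlines():
        if line.startswith("#"):
            continue  # comment line
        if not HEX40.fullmatch(line):
            raise ValueError("non-comment line is not 40 hex characters")
        fingerprint = line.upper()
        if fingerprint in seen:
            raise ValueError("fingerprint listed more than once")
        seen.add(fingerprint)
    if not seen:
        raise ValueError("file lists no fingerprints")
    return seen
```

The same checks apply to hashed-bridge-rsa-fingerprint.txt, which uses the same line format.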

/.well-known/tor-relay/ed25519-master-pubkey.txt

  • The file contains one or more ed25519 Tor relay public master keys of relays operated by the entity in control of this website.
  • This file is not relevant for bridges.
  • Each line contains one public ed25519 master key in its base64 encoded form.
  • The file may contain comments (starting with #).
  • Non-comment lines must be exactly 43 characters long and consist of the following characters [a-zA-Z0-9/+].
  • Each key MUST appear at most once.
  • The file MUST NOT be larger than one MByte.
  • The content MUST be a media type of "text/plain".

Example file content:

# we operate these Tor relays
yp0fwtp4aa/VMyZJGz8vN7Km3zYet1YBZwqZEk1CwHI
kXdA5dmIhXblAquMx0M0ApWJJ4JGQGLsjUSn86cbIaU
bHzOT41w56KHh+w6TYwUhN4KrGwPWQWJX04/+tw/+RU

The base64 encoded ed25519 public master key can be found in the file named "fingerprint-ed25519" located in the Tor data directory on the relay.
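The 43-character length follows from encoding a 32-byte key in base64 without the trailing "=" padding. A short sketch of decoding one line (the function name is hypothetical):

```python
import base64
import re

B64_43 = re.compile(r"[A-Za-z0-9/+]{43}")


def decode_master_key(line: str) -> bytes:
    """Decode one non-comment line into a 32-byte ed25519 public key."""
    if not B64_43.fullmatch(line):
        raise ValueError("line is not 43 base64 characters")
    # Restore the single padding character that the file format omits.
    key = base64.b64decode(line + "=", validate=True)
    if len(key) != 32:
        raise ValueError("decoded key is not 32 bytes")
    return key
```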

/.well-known/tor-relay/hashed-bridge-rsa-fingerprint.txt

  • The file contains one or more SHA1-hashed SHA1 fingerprints of Tor bridges operated by the entity in control of this website.
  • Each line contains one hashed bridge fingerprint.
  • The file may contain comments (starting with #).
  • Non-comment lines must be exactly 40 characters long and consist of the following characters [a-fA-F0-9].
  • Hashed fingerprints are not case-sensitive.
  • Each hashed fingerprint MUST appear at most once.
  • The file MUST NOT be larger than one MByte.
  • The file MUST NOT contain fingerprints of Tor relays.
  • The content MUST be a media type of "text/plain".

Example file content:

# we operate these Tor bridges
1234567890123456789012345678901234567ABC
4234567890123456789012345678901234567890

The hashed Tor bridge fingerprint can be found in the file named "hashed-fingerprint" located in the Tor data directory on the bridge.
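For operators who only have the plain bridge fingerprint at hand, the hashed form can be derived as sketched below. This assumes the hash is SHA1 computed over the 20 raw fingerprint bytes rather than over the hex text; the function name is illustrative.

```python
import hashlib


def hashed_bridge_fingerprint(fingerprint_hex: str) -> str:
    """Return the hashed form of a bridge's 40-hex-character fingerprint."""
    raw = bytes.fromhex(fingerprint_hex)
    if len(raw) != 20:
        raise ValueError("fingerprint must be 20 bytes (40 hex characters)")
    # Assumption: SHA1 over the binary fingerprint, not its hex encoding.
    return hashlib.sha1(raw).hexdigest().upper()
```

The result should match the value in the bridge's "hashed-fingerprint" file; comparing the two is a simple way to check the assumption.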

Change Controller

Tor Project Development Mailing List tor-dev@lists.torproject.org

Related Information

Filename: 327-pow-over-intro.txt
Title: A First Take at PoW Over Introduction Circuits
Author: George Kadianakis, Mike Perry, David Goulet, tevador
Created: 2 April 2020
Status: Finished

0. Abstract

This proposal aims to thwart introduction flooding DoS attacks by introducing a dynamic Proof-Of-Work protocol that occurs over introduction circuits.

1. Motivation

So far our attempts at limiting the impact of introduction flooding DoS attacks on onion services have been focused on horizontal scaling with Onionbalance, optimizing the CPU usage of Tor and applying rate limiting. While these measures move the goalpost forward, a core problem with onion service DoS is that building rendezvous circuits is a costly procedure both for the service and for the network. For more information on the limitations of rate-limiting when defending against DDoS, see [REF_TLS_1].

If we ever hope to have truly reachable global onion services, we need to make it harder for attackers to overload the service with introduction requests. This proposal achieves this by allowing onion services to specify an optional dynamic proof-of-work scheme that its clients need to participate in if they want to get served.

With the right parameters, this proof-of-work scheme acts as a gatekeeper to block amplification attacks by attackers while letting legitimate clients through.

1.1. Related work

For a similar concept, see the three internet drafts that have been proposed for defending against TLS-based DDoS attacks using client puzzles [REF_TLS].

1.2. Threat model [THREAT_MODEL]

1.2.1. Attacker profiles [ATTACKER_MODEL]

This proposal is written to thwart specific attackers. A simple PoW proposal cannot defend against all and every DoS attack on the Internet, but there are adversary models we can defend against.

Let's start with some adversary profiles:

  "The script-kiddie"

  The script-kiddie has a single computer and pushes it to its limits. Perhaps it also has a VPS and a pwned server. We are talking about an attacker with total access to 10 GHz of CPU and 10 GB of RAM. We consider the total cost for this attacker to be zero $.

  "The small botnet"

  The small botnet is a bunch of computers lined up to do an introduction flooding attack. Assuming 500 medium-range computers, we are talking about an attacker with total access to 10 THz of CPU and 10 TB of RAM. We consider the upfront cost for this attacker to be about $400.

  "The large botnet"

  The large botnet is a serious operation with many thousands of computers organized to do this attack. Assuming 100k medium-range computers, we are talking about an attacker with total access to 200 THz of CPU and 200 TB of RAM. The upfront cost for this attacker is about $36k.

We hope that this proposal can help us defend against the script-kiddie attacker and small botnets. To defend against a large botnet we would need more tools at our disposal (see [FUTURE_DESIGNS]).

1.2.2. User profiles [USER_MODEL]

We have attackers and we have users. Here are a few user profiles:

  "The standard web user"

  This is a standard laptop/desktop user who is trying to browse the web. They don't know how these defences work and they don't care to configure or tweak them. If the site doesn't load, they are gonna close their browser and be sad at Tor. They run a 2GHz computer with 4GB of RAM.

  "The motivated user"

  This is a user that really wants to reach their destination. They don't care about the journey; they just want to get there. They know what's going on; they are willing to make their computer do expensive multi-minute PoW computations to get where they want to be.

  "The mobile user"

  This is a motivated user on a mobile phone. Even though they want to read the news article, they don't have much leeway on stressing their machine to do more computation.

We hope that this proposal will allow the motivated user to always connect where they want to connect to, and also give more chances to the other user groups to reach the destination.
1.2.3. The DoS Catch-22 [CATCH22]

This proposal is not perfect and it does not cover all the use cases. Still, we think that by covering some use cases and giving reachability to the people who really need it, we will severely demotivate the attackers from continuing the DoS attacks and hence stop the DoS threat altogether. Furthermore, by increasing the cost to launch a DoS attack, a big class of DoS attackers will disappear from the map, since the expected ROI will decrease.

2. System Overview

2.1. Tor protocol overview

                                          +----------------------------------+
                                          |          Onion Service           |
   +-------+ INTRO1  +-----------+ INTRO2 +--------+                         |
   |Client |-------->|Intro Point|------->|  PoW   |-----------+             |
   +-------+         +-----------+        |Verifier|           |             |
                                          +--------+           |             |
                                          |                    |             |
                                          |                    |             |
                                          |         +----------v---------+   |
                                          |         |Intro Priority Queue|   |
                                          +---------+--------------------+---+
                                                           |  |  |
                                                Rendezvous |  |  |
                                                  circuits |  |  |
                                                           v  v  v

The proof-of-work scheme specified in this proposal takes place during the introduction phase of the onion service protocol.

The system described in this proposal is not meant to be on all the time, and it can be entirely disabled for services that do not experience DoS attacks.

When the subsystem is enabled, suggested effort is continuously adjusted and the computational puzzle can be bypassed entirely when the effort reaches zero. In these cases, the proof-of-work subsystem can be dormant but still provide the necessary parameters for clients to voluntarily provide effort in order to get better placement in the priority queue.

The protocol involves the following major steps:

  1) Service encodes PoW parameters in descriptor [DESC_POW]
  2) Client fetches descriptor and computes PoW [CLIENT_POW]
  3) Client completes PoW and sends results in INTRO1 cell [INTRO1_POW]
  4) Service verifies PoW and queues introduction based on PoW effort [SERVICE_VERIFY]
  5) Requests are continuously drained from the queue, highest effort first, subject to multiple constraints on speed [HANDLE_QUEUE]

2.2. Proof-of-work overview

2.2.1. Algorithm overview

For our proof-of-work function we will use the Equi-X scheme by tevador [REF_EQUIX]. Equi-X is an asymmetric PoW function based on Equihash<60,3>, using HashX as the underlying layer. It features lightning fast verification speed, and also aims to minimize the asymmetry between CPU and GPU. Furthermore, it's designed for this particular use-case and hence cryptocurrency miners are not incentivized to make optimized ASICs for it.

The overall scheme consists of several layers that provide different pieces of this functionality:

  1) At the lowest layers, blake2b and siphash are used as hashing and PRNG algorithms that are well suited to common 64-bit CPUs.
  2) A custom hash function family, HashX, randomizes its implementation for each new seed value. These functions are tuned to utilize the pipelined integer performance on a modern 64-bit CPU. This layer provides the strongest ASIC resistance, since a hardware reimplementation would need to include a CPU-like pipelined execution unit to keep up.
  3) The Equi-X layer itself builds on HashX and adds an algorithmic puzzle that's designed to be strongly asymmetric and to require RAM to solve efficiently.
  4) The PoW protocol itself builds on this Equi-X function with a particular construction of the challenge input and particular constraints on the allowed blake2b hash of the solution. This layer provides a linearly adjustable effort that we can verify.
  5) Above the level of individual PoW handshakes, the client and service form a closed-loop system that adjusts the effort of future handshakes.

The Equi-X scheme provides two functions that will be used in this proposal:

  - equix_solve(challenge) which solves a puzzle instance, returning a variable number of solutions per invocation depending on the specific challenge value.
  - equix_verify(challenge, solution) which verifies a puzzle solution quickly. Verification still depends on executing the HashX function, but far fewer times than when searching for a solution.

For the purposes of this proposal, all cryptographic algorithms are assumed to produce and consume byte strings, even if internally they operate on some other data type like 64-bit words. This is conventionally little endian order for blake2b, which contrasts with Tor's typical use of big endian. HashX itself is configured with an 8-byte output but its input is a single 64-bit word of undefined byte order, of which only the low 16 bits are used by Equi-X in its solution output. We treat Equi-X solution arrays as byte arrays using their packed little endian 16-bit representation.

We tune Equi-X in section [EQUIX_TUNING].

2.2.2. Dynamic PoW

DoS is a dynamic problem where the attacker's capabilities constantly change, and hence we want our proof-of-work system to be dynamic and not stuck with a static difficulty setting. Hence, instead of forcing clients to go below a static target like in Bitcoin to be successful, we ask clients to "bid" using their PoW effort. Effectively, a client gets higher priority the higher effort they put into their proof-of-work. This is similar to how proof-of-stake works but instead of staking coins, you stake work.

The benefit here is that legitimate clients who really care about getting access can spend a big amount of effort into their PoW computation, which should guarantee access to the service given reasonable adversary models. See [PARAM_TUNING] for more details about these guarantees and tradeoffs.

As a way to improve reachability and UX, the service tries to estimate the effort needed for clients to get access at any given time and places it in the descriptor. See [EFFORT_ESTIMATION] for more details.

2.2.3. PoW effort

It's common for proof-of-work systems to define an exponential effort function based on a particular number of leading zero bits or equivalent. For the benefit of our effort estimation system, it's quite useful if we instead have a linear scale. We use the first 32 bits of a hashed version of the Equi-X solution as compared to the full 32-bit range.

Conceptually we could define a function:

    unsigned effort(uint8_t *token)

which takes as its argument a hashed solution, interprets it as a bitstring, and returns the quotient of dividing a bitstring of 1s by it.

So for example:

    effort(00000001100010101101) = 11111111111111111111 / 00000001100010101101

or the same in decimal:

    effort(6317) = 1048575 / 6317 = 165.

In practice we can avoid even having to perform this division, performing just one multiply instead to see if a request's claimed effort is supported by the smallness of the resulting 32-bit hash prefix. This assumes we send the desired effort explicitly as part of each PoW solution. We do want to force clients to pick a specific effort before looking for a solution, otherwise a client could opportunistically claim a very large effort any time a lucky hash prefix comes up. Thus the effort is communicated explicitly in our protocol, and it forms part of the concatenated Equi-X challenge.

3. Protocol specification

3.1. Service encodes PoW parameters in descriptor [DESC_POW]

This whole protocol starts with the service encoding the PoW parameters in the 'encrypted' (inner) part of the v3 descriptor. As follows:

    "pow-params" SP type SP seed-b64 SP suggested-effort SP expiration-time NL

    [At most once]

    type: The type of PoW system used. We call the one specified here "v1"

    seed-b64: A random seed that should be used as the input to the PoW hash function. Should be 32 random bytes encoded in base64 without trailing padding.

    suggested-effort: An unsigned integer specifying an effort value that clients should aim for when contacting the service. Can be zero to mean that PoW is available but not currently suggested for a first connection attempt. See [EFFORT_ESTIMATION] for more details here.
    expiration-time: A timestamp in "YYYY-MM-DDTHH:MM:SS" format (iso time with no space) after which the above seed expires and is no longer valid as the input for PoW. It's needed so that our replay cache does not grow infinitely. It should be set to RAND_TIME(now+7200, 900) seconds.

The service should refresh its seed when expiration-time passes. The service SHOULD keep its previous seed in memory and accept PoWs using it to avoid race-conditions with clients that have an old seed. The service SHOULD avoid generating two consecutive seeds that have a common 4-byte prefix. See [INTRO1_POW] for more info.

By RAND_TIME(ts, interval) we mean a time between ts-interval and ts, chosen uniformly at random.

3.2. Client fetches descriptor and computes PoW [CLIENT_POW]

If a client receives a descriptor with "pow-params", it should assume that the service is prepared to receive PoW solutions as part of the introduction protocol.

The client parses the descriptor and extracts the PoW parameters. It makes sure that the <expiration-time> has not expired and if it has, it needs to fetch a new descriptor.

The client should then extract the <suggested-effort> field to configure its PoW 'target' (see [REF_TARGET]). The client SHOULD NOT accept 'target' values that will cause unacceptably long PoW computation.

The client uses a "personalization string" P equal to the following nul-terminated ASCII string: "Tor hs intro v1\0".

The client looks up `ID`, the current 32-byte blinded public ID (KP_hs_blind_id) for the onion service.

To complete the PoW the client follows the following logic:

    a) Client selects a target effort E, based on <suggested-effort> and past connection attempt history.
    b) Client generates a secure random 16-byte nonce N, as the starting point for the solution search.
    c) Client derives seed C by decoding 'seed-b64'.
    d) Client calculates S = equix_solve(P || ID || C || N || E)
    e) Client calculates R = ntohl(blake2b_32(P || ID || C || N || E || S))
    f) Client checks if R * E <= UINT32_MAX.
      f1) If yes, success! The client can submit N, E, the first 4 bytes of C, and S.
      f2) If no, fail! The client interprets N as a 16-byte little-endian integer, increments it by 1 and goes back to step d).

Note that the blake2b hash includes the output length parameter in its initial state vector, so a blake2b_32 is not equivalent to the prefix of a blake2b_512. We calculate the 32-bit blake2b specifically, and interpret it in network byte order as an unsigned integer.

At the end of the above procedure, the client should have S as the solution of the Equi-X puzzle with N as the nonce, C as the seed. How quickly this happens depends solely on the target effort E parameter.

The algorithm as described is suitable for single-threaded computation. Optionally, a client may choose multiple nonces and attempt several solutions in parallel on separate CPU cores. The specific choice of nonce is entirely up to the client, so parallelization choices like this do not impact the network protocol's interoperability at all.

3.3. Client sends PoW in INTRO1 cell [INTRO1_POW]

Now that the client has an answer to the puzzle it's time to encode it into an INTRODUCE1 cell. To do so the client adds an extension to the encrypted portion of the INTRODUCE1 cell by using the EXTENSIONS field (see [PROCESS_INTRO2] section in rend-spec-v3.txt). The encrypted portion of the INTRODUCE1 cell only gets read by the onion service and is ignored by the introduction point.
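The client search loop from steps a) through f) in [CLIENT_POW] can be sketched as follows. This is an illustration under stated assumptions, not the Tor implementation: no Equi-X binding is assumed to exist in Python, so `equix_solve` is passed in as a callable that returns a list of 16-byte solutions; the function names are invented; and E is assumed to enter the challenge as a 4-byte network-order integer, matching POW_EFFORT.

```python
import hashlib
import os
import struct

P = b"Tor hs intro v1\0"   # personalization string
UINT32_MAX = 0xFFFFFFFF


def effort_meets_target(challenge: bytes, solution: bytes, effort: int) -> bool:
    """Steps e) and f): check R * E <= UINT32_MAX with one multiply."""
    # blake2b_32: a blake2b configured for a 4-byte output, not a prefix
    # of a longer hash.
    digest = hashlib.blake2b(challenge + solution, digest_size=4).digest()
    r = int.from_bytes(digest, "big")      # ntohl: network byte order
    return r * effort <= UINT32_MAX


def next_nonce(nonce: bytes) -> bytes:
    """Step f2): increment N as a 16-byte little-endian integer."""
    value = (int.from_bytes(nonce, "little") + 1) % (1 << 128)
    return value.to_bytes(16, "little")


def solve_pow(blinded_id: bytes, seed: bytes, effort: int,
              equix_solve, nonce: bytes = None):
    """Return (N, E, first 4 bytes of C, S) for the chosen effort E."""
    if nonce is None:
        nonce = os.urandom(16)             # step b)
    e_bytes = struct.pack(">I", effort)    # assumption: network order
    while True:
        challenge = P + blinded_id + seed + nonce + e_bytes
        for solution in equix_solve(challenge):        # step d)
            if effort_meets_target(challenge, solution, effort):
                return nonce, effort, seed[:4], solution
        nonce = next_nonce(nonce)


# The 20-bit worked example from section 2.2.3, in integer arithmetic.
assert 0b11111111111111111111 // 0b00000001100010101101 == 165
```

Since R is roughly uniform over 32 bits, a solution passes with probability close to 1/E, so the expected number of candidate solutions grows linearly with the chosen effort. That is the linear scale the effort estimation relies on.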
We propose a new EXT_FIELD_TYPE value:

    [02] -- PROOF_OF_WORK

The EXT_FIELD content format is:

    POW_VERSION    [1 byte]
    POW_NONCE      [16 bytes]
    POW_EFFORT     [4 bytes]
    POW_SEED       [4 bytes]
    POW_SOLUTION   [16 bytes]

where:

    POW_VERSION is 1 for the protocol specified in this proposal
    POW_NONCE is the nonce 'N' from the section above
    POW_EFFORT is the 32-bit integer effort value, in network byte order
    POW_SEED is the first 4 bytes of the seed used
    POW_SOLUTION is the Equi-X solution 'S' from the section above

This will increase the INTRODUCE1 payload size by 43 bytes since the extension type and length is 2 extra bytes, the N_EXTENSIONS field is always present and currently set to 0 and the EXT_FIELD is 41 bytes. According to ticket #33650, INTRODUCE1 cells currently have more than 200 bytes available.

3.4. Service verifies PoW and handles the introduction [SERVICE_VERIFY]

When a service receives an INTRODUCE1 with the PROOF_OF_WORK extension, it should check its configuration on whether proof-of-work is enabled on the service. If it's not enabled, the extension SHOULD be ignored. If enabled, even if the suggested effort is currently zero, the service follows the procedure detailed in this section.

If the service requires the PROOF_OF_WORK extension but received an INTRODUCE1 cell without any embedded proof-of-work, the service SHOULD consider this cell as a zero-effort introduction for the purposes of the priority queue (see section [INTRO_QUEUE]).

3.4.1. PoW verification [POW_VERIFY]

To verify the client's proof-of-work the service MUST do the following steps:

    a) Find a valid seed C that starts with POW_SEED. Fail if no such seed exists.
    b) Fail if N = POW_NONCE is present in the replay cache (see [REPLAY_PROTECTION])
    c) Calculate R = ntohl(blake2b_32(P || ID || C || N || E || S))
    d) Fail if R * E > UINT32_MAX
    e) Fail if equix_verify(P || ID || C || N || E, S) != EQUIX_OK
    f) Put the request in the queue with a priority of E

If any of these steps fail the service MUST ignore this introduction request and abort the protocol.
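To make the 41-byte layout and the "top half" checks concrete, here is a sketch of both sides. The names `encode_pow_extension` and `verify_pow` are hypothetical, `equix_verify` is an injected callable standing in for the real Equi-X verifier, the replay cache is a plain set keyed on (seed, nonce), and recording the tuple only after all checks pass is an assumption.

```python
import hashlib
import struct

P = b"Tor hs intro v1\0"
UINT32_MAX = 0xFFFFFFFF
# POW_VERSION, POW_NONCE, POW_EFFORT, POW_SEED, POW_SOLUTION = 41 bytes.
EXT_FORMAT = ">B16sI4s16s"


def encode_pow_extension(nonce: bytes, effort: int, seed: bytes,
                         solution: bytes) -> bytes:
    """Build the EXT_FIELD body for the PROOF_OF_WORK extension."""
    return struct.pack(EXT_FORMAT, 1, nonce, effort, seed[:4], solution)


def verify_pow(ext: bytes, blinded_id: bytes, valid_seeds, replay_cache: set,
               equix_verify):
    """Return the effort (queue priority) on success, or None on failure."""
    if len(ext) != struct.calcsize(EXT_FORMAT):
        return None
    version, nonce, effort, seed_head, solution = struct.unpack(EXT_FORMAT, ext)
    if version != 1:
        return None
    # a) Find a valid seed C that starts with POW_SEED.
    seed = next((c for c in valid_seeds if c[:4] == seed_head), None)
    if seed is None:
        return None
    # b) Reject replays of the same (seed, nonce) tuple.
    if (seed, nonce) in replay_cache:
        return None
    # c) and d) Cheap hash check before the Equi-X verification.
    challenge = P + blinded_id + seed + nonce + struct.pack(">I", effort)
    digest = hashlib.blake2b(challenge + solution, digest_size=4).digest()
    if int.from_bytes(digest, "big") * effort > UINT32_MAX:
        return None
    # e) Full puzzle verification.
    if not equix_verify(challenge, solution):
        return None
    replay_cache.add((seed, nonce))   # assumption: record after success
    return effort                     # f) priority for the queue
```

Ordering the checks from cheapest to most expensive keeps the cost of rejecting junk requests low, which matters because this path runs for every introduction an attacker sends.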
In this proposal we call the above steps the "top half" of introduction handling. If all the steps of the "top half" have passed, then the circuit is added to the introduction queue as detailed in section [INTRO_QUEUE].

3.4.1.1. Replay protection [REPLAY_PROTECTION]

The service MUST NOT accept introduction requests with the same (seed, nonce) tuple. For this reason a replay protection mechanism must be employed.

The simplest way is to use a simple hash table to check whether a (seed, nonce) tuple has been used before for the active duration of a seed. Depending on how long a seed stays active this might be a viable solution with reasonable memory/time overhead.

If there is a worry that we might get too many introductions during the lifetime of a seed, we can use a Bloom filter as our replay cache mechanism. The probabilistic nature of Bloom filters means that sometimes we will flag some connections as replays even if they are not; with this false positive probability increasing as the number of entries increases. However, with the right parameter tuning this probability should be negligible and well handled by clients.

{TODO: Design and specify a suitable bloom filter for this purpose.}

3.4.2. The Introduction Queue [INTRO_QUEUE]

3.4.2.1. Adding introductions to the introduction queue [ADD_QUEUE]

When PoW is enabled and a verified introduction comes through, the service, instead of jumping straight into rendezvous, queues it and prioritizes it based on how much effort was devoted by the client to PoW. This means that introduction requests with high effort should be prioritized over those with low effort.

To do so, the service maintains an "introduction priority queue" data structure. Each element in that priority queue is an introduction request, and its priority is the effort put into its PoW.

When a verified introduction comes through, the service uses its included effort commitment value to place each request into the right position of the priority_queue: the bigger the effort, the more priority it gets in the queue. If two elements have the same effort, the older one has priority over the newer one.

3.4.2.2. Handling introductions from the introduction queue [HANDLE_QUEUE]

The service should handle introductions by pulling from the introduction queue. We call this part of introduction handling the "bottom half" because most of the computation happens in this stage. For a description of how we expect such a system to work in Tor, see [TOR_SCHEDULER] section.

3.4.3. PoW effort estimation [EFFORT_ESTIMATION]

3.4.3.1. High-level description of the effort estimation process

The service starts with a default suggested-effort value of 0, which keeps the PoW defenses dormant until we notice signs of overload.

The overall process of determining effort can be thought of as a set of multiple coupled feedback loops. Clients perform their own effort adjustments via [CLIENT_TIMEOUT] atop a base effort suggested by the service. That suggestion incorporates the service's control adjustments atop a base effort calculated using a sum of currently-queued client effort.

Each feedback loop has an opportunity to cover different time scales. Clients can make adjustments at every single circuit creation request, whereas services are limited by the extra load that frequent updates would place on HSDir nodes.

In the combined client/service system these client-side increases are expected to provide the most effective quick response to an emerging DoS attack. After early clients increase the effort using [CLIENT_TIMEOUT], later clients will benefit from the service detecting this increased queued effort and offering a larger suggested_effort.
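The ordering rule from [ADD_QUEUE] (highest effort first, older request first on ties) maps naturally onto a binary heap. A minimal sketch with an invented class name; the queue trimming and dequeue rate limits described elsewhere in the proposal are left out:

```python
import heapq
import itertools


class IntroPriorityQueue:
    """Introduction requests ordered by effort, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def push(self, effort: int, request) -> None:
        # heapq is a min-heap, so negate the effort to pop the largest
        # first. The arrival counter breaks ties in favor of older
        # requests and keeps the request objects from being compared.
        heapq.heappush(self._heap, (-effort, next(self._arrival), request))

    def pop(self):
        _neg_effort, _order, request = heapq.heappop(self._heap)
        return request

    def __len__(self) -> int:
        return len(self._heap)
```

Requests that arrive without any proof-of-work would be pushed with effort 0, so they are served only after every request that committed to some effort.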
Effort increases and decreases both have an intrinsic cost. Increasing effort will make the service more expensive to contact, and decreasing effort makes new requests likely to become backlogged behind older requests. The steady state condition is preferable to either of these side-effects, but ultimately it's expected that the control loop always oscillates to some degree.

3.4.3.2. Service-side effort estimation

Services keep an internal effort estimation which updates on a regular periodic timer in response to measurements made on the queueing behavior in the previous period. These internal effort changes can optionally trigger client-visible suggested_effort changes when the difference is great enough to warrant republishing to the HSDir.

This evaluation and update period is referred to as HS_UPDATE_PERIOD. The service side effort estimation takes inspiration from TCP congestion control's additive increase / multiplicative decrease approach, but unlike a typical AIMD this algorithm is fixed-rate and doesn't update immediately in response to events.

{TODO: HS_UPDATE_PERIOD is hardcoded to 300 (5 minutes) currently, but it should be configurable in some way. Is it more appropriate to use the service's torrc here or a consensus parameter?}

3.4.3.3. Per-period service state

During each update period, the service maintains some state:

  1. TOTAL_EFFORT, a sum of all effort values for rendezvous requests that were successfully validated and enqueued.

  2. REND_HANDLED, a count of rendezvous requests that were actually launched. Requests that made it to dequeueing but were too old to launch by then are not included.

  3. HAD_QUEUE, a flag which is set if at any time in the update period we saw the priority queue filled with more than a minimum amount of work, greater than we would expect to process in approximately 1/4 second using the configured dequeue rate.

  4. MAX_TRIMMED_EFFORT, the largest observed single request effort that we discarded during the period. Requests are discarded either due to age (timeout) or during culling events that discard the bottom half of the entire queue when it's too full.

3.4.3.4. Service AIMD conditions

At the end of each period, the service may decide to increase effort, decrease effort, or make no changes, based on these accumulated state values:

  1. If MAX_TRIMMED_EFFORT > our previous internal suggested_effort, always INCREASE. Requests that follow our latest advice are being dropped.

  2. If the HAD_QUEUE flag was set and the queue still contains at least one item with effort >= our previous internal suggested_effort, INCREASE. Even if we haven't yet reached the point of dropping requests, this signal indicates that our latest suggestion isn't high enough and requests will build up in the queue.

  3. If neither condition (1) nor (2) applies and the queue is below a level we would expect to process in approximately 1/4 second, choose to DECREASE.

  4. If none of these conditions match, the suggested effort is unchanged.

When we INCREASE, the internal suggested_effort is increased to either its previous value + 1, or (TOTAL_EFFORT / REND_HANDLED), whichever is larger.

When we DECREASE, the internal suggested_effort is scaled by 2/3rds.

Over time, this will continue to decrease our effort suggestion any time the service is fully processing its request queue. If the queue stays empty, the effort suggestion decreases to zero and clients should no longer submit a proof-of-work solution with their first connection attempt.

It's worth noting that the suggested-effort is not a hard limit to the efforts that are accepted by the service, and it's only meant to serve as a guideline for clients to reduce the number of unsuccessful requests that get to the service. The service still adds requests with lower effort than suggested-effort to the priority queue in [ADD_QUEUE].

3.4.3.5. Updating descriptor with new suggested effort

The service descriptors may be updated for multiple reasons including introduction point rotation common to all v3 onion services, the scheduled seed rotations described in [DESC_POW], and updates to the effort suggestion. Even though the internal effort estimate updates on a regular timer, we avoid propagating those changes into the descriptor and the HSDir hosts unless there is a significant change.

If the PoW params otherwise match but the suggested effort has changed by less than 15 percent, services SHOULD NOT upload a new descriptor.

4. Client behavior [CLIENT_BEHAVIOR]

This proposal introduces several new ways in which a legitimate client can fail to reach the onion service.

Furthermore, there is currently no end-to-end way for the onion service to inform the client that the introduction failed. The INTRO_ACK cell is not end-to-end (it's from the introduction point to the client) and hence it does not allow the service to inform the client that the rendezvous is never going to occur.

From the client's perspective there's no way to attribute this failure to the service itself rather than the introduction point, so error accounting is performed separately for each introduction-point. Existing mechanisms will discard an introduction point that has required too many retries.

4.1. Clients handling timeouts [CLIENT_TIMEOUT]

Alice can fail to reach the onion service if her introduction request gets trimmed off the priority queue in [HANDLE_QUEUE], or if the service does not get through its priority queue in time and the connection times out.

This section presents a heuristic method for the client getting service even in such scenarios.

If the rendezvous request times out, the client SHOULD fetch a new descriptor for the service to make sure that it's using the right suggested-effort for the PoW and the right PoW seed.
If the fetched descriptor includes a new suggested effort or seed, it should first retry the request with these parameters.

{TODO: This is not actually implemented yet, but we should do it. How often should clients at most try to fetch new descriptors? Determined by a consensus parameter? This change will also allow clients to retry effectively in cases where the service has just been reconfigured to enable PoW defenses.}

Every time the client retries the connection, it will count these failures per-introduction-point. These counts of previous retries are combined with the service's suggested_effort when calculating the actual effort to spend on any individual request to a service that advertises PoW support, even when the currently advertised suggested_effort is zero.

On each retry, the client modifies its solver effort:

  1. If the effort is below (CLIENT_POW_EFFORT_DOUBLE_UNTIL = 1000) it will be doubled.

  2. Otherwise, multiply the effort by (CLIENT_POW_RETRY_MULTIPLIER = 1.5).

  3. Constrain the new effort to be at least (CLIENT_MIN_RETRY_POW_EFFORT = 8) and no greater than (CLIENT_MAX_POW_EFFORT = 10000).

{TODO: These hardcoded limits should be replaced by timed limits and/or an unlimited solver with robust cancellation. This is issue tor#40787}

5. Attacker strategies [ATTACK_META]

Now that we have defined our protocol we need to start tweaking the various knobs. But before we can do that, we first need to understand a few high-level attacker strategies to see what we are fighting against.

5.1.1. Overwhelm PoW verification (aka "Overwhelm top half") [ATTACK_TOP_HALF]

A basic attack here is the adversary spamming with bogus INTRO cells so that the service does not have computing capacity to even verify the proof-of-work. This adversary tries to overwhelm the procedure in the [POW_VERIFY] section.

That's why we need the PoW algorithm to have a cheap verification time so that this attack is not possible: we tune this PoW parameter in section [POW_TUNING_VERIFICATION].
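
The two effort feedback loops from [EFFORT_ESTIMATION] and [CLIENT_TIMEOUT] are what the remaining attacks in this section try to overwhelm or game, so a compact sketch of both may be useful at this point. This is an illustration under stated assumptions, not the tor code: `PeriodState`, `service_update`, `next_retry_effort` and `quarter_second_capacity` are hypothetical names, the "level we would expect to process in approximately 1/4 second" is modelled as a plain request count, and the rounding and the handling of REND_HANDLED = 0 are guesses that the text does not specify.

```python
from dataclasses import dataclass
from typing import Sequence

# Client constants quoted from [CLIENT_TIMEOUT].
CLIENT_POW_EFFORT_DOUBLE_UNTIL = 1000
CLIENT_POW_RETRY_MULTIPLIER = 1.5
CLIENT_MIN_RETRY_POW_EFFORT = 8
CLIENT_MAX_POW_EFFORT = 10000


@dataclass
class PeriodState:
    """Per-period service counters (hypothetical container)."""
    total_effort: int = 0        # TOTAL_EFFORT
    rend_handled: int = 0        # REND_HANDLED
    had_queue: bool = False      # HAD_QUEUE
    max_trimmed_effort: int = 0  # MAX_TRIMMED_EFFORT


def service_update(prev: int, state: PeriodState,
                   queued_efforts: Sequence[int],
                   quarter_second_capacity: int) -> int:
    """Return the new internal suggested_effort at the end of a period."""
    increase = state.max_trimmed_effort > prev or (
        state.had_queue and any(e >= prev for e in queued_efforts))
    if increase:
        # Assumption: integer division, and a period with no handled
        # requests contributes an average of 0.
        if state.rend_handled:
            average = state.total_effort // state.rend_handled
        else:
            average = 0
        return max(prev + 1, average)
    if len(queued_efforts) < quarter_second_capacity:
        # DECREASE. Assumption: scaling by 2/3 rounds down.
        return prev * 2 // 3
    return prev


def next_retry_effort(effort: int) -> int:
    """Apply one client retry step to the current solver effort."""
    if effort < CLIENT_POW_EFFORT_DOUBLE_UNTIL:
        effort *= 2
    else:
        # Assumption: the fractional part is truncated.
        effort = int(effort * CLIENT_POW_RETRY_MULTIPLIER)
    return max(CLIENT_MIN_RETRY_POW_EFFORT,
               min(effort, CLIENT_MAX_POW_EFFORT))


# A client that starts from a suggested effort of zero and keeps retrying.
efforts = [0]
while efforts[-1] < CLIENT_MAX_POW_EFFORT:
    efforts.append(next_retry_effort(efforts[-1]))
```

Under these assumptions a client starting from zero climbs through 8, 16, 32 and so on, switches to 1.5x steps once it reaches 1000 or more, and is capped at 10000, while an idle service shrinks its suggestion by a third each period until it reaches zero.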
5.1.2. Overwhelm rendezvous capacity (aka "Overwhelm bottom half") [ATTACK_BOTTOM_HALF]

Given the way the introduction queue works (see [HANDLE_QUEUE]), a very effective strategy for the attacker is to totally overwhelm the queue processing by sending more high-effort introductions than the onion service can handle at any given tick. This adversary tries to overwhelm the procedure in the [HANDLE_QUEUE] section.

To do so, the attacker would have to send at least 20 high-effort introduction cells every 100ms, where high-effort is a PoW which is above the estimated level of "the motivated user" (see [USER_MODEL]).

An easier attack for the adversary is the same strategy but with introduction cells that are all above the comfortable level of "the standard user" (see [USER_MODEL]). This would block out all standard users and only allow motivated users to pass.

5.1.3. Hybrid overwhelm strategy [ATTACK_HYBRID]

If both the top- and bottom- halves are processed by the same thread, this opens up the possibility for a "hybrid" attack. Given the performance figures for the top half (0.31 ms/req.) and the bottom half (5.5 ms/req.), the attacker can optimally deny service by submitting 91 high-effort requests and 1520 invalid requests per second. This will completely saturate the main loop because every request passes through the top half while only the 91 valid ones reach the bottom half:

    0.31 ms * (1520 + 91) ~ 0.5 sec.
    5.5 ms * 91           ~ 0.5 sec.

This attack only has half the bandwidth requirement of [ATTACK_TOP_HALF] and half the compute requirement of [ATTACK_BOTTOM_HALF].

Alternatively, the attacker can adjust the ratio between invalid and high-effort requests depending on their bandwidth and compute capabilities.

5.1.4. Gaming the effort estimation logic [ATTACK_EFFORT]

Another way to beat this system is for the attacker to game the effort estimation logic (see [EFFORT_ESTIMATION]).
Essentially, there are two attacks that we are trying to avoid:

  - Attacker sets descriptor suggested-effort to a very high value effectively making it impossible for most clients to produce a PoW token in a reasonable timeframe.

  - Attacker sets descriptor suggested-effort to a very small value so that most clients aim for a small value while the attacker comfortably launches an [ATTACK_BOTTOM_HALF] using medium effort PoW (see [REF_TEVADOR_1]).

5.1.5. Precomputed PoW attack

The attacker may precompute many valid PoW nonces and submit them all at once before the current seed expires, overwhelming the service temporarily even using a single computer. The current scheme gives the attacker 4 hours to launch this attack since each seed lasts 2 hours and the service caches two seeds.

An attacker using this attack might be aiming to DoS the service for a limited amount of time, or to cause an [ATTACK_EFFORT] attack.

6. Parameter tuning [POW_TUNING]

There are various parameters in this PoW system that need to be tuned:

We first start by tuning the time it takes to verify a PoW token. We do this first because it's fundamental to the performance of onion services and can turn into a DoS vector of its own. We will do this tuning in a way that's agnostic to the chosen PoW function.

We will then move towards analyzing the client starting difficulty setting for our PoW system. That defines the expected time for clients to succeed in our system, and the expected time for attackers to overwhelm our system. Same as above we will do this in a way that's agnostic to the chosen PoW function.

Currently, we have hardcoded the initial client starting difficulty at 8, but this may be too low to ramp up quickly to various on and off attack patterns. A higher initial difficulty may be needed for these, depending on their severity. This section gives us an idea of how large such attacks can be.
Finally, using those two pieces we will tune our PoW function and pick the right client starting difficulty setting. At the end of this section we will know the resources that an attacker needs to overwhelm the onion service, the resources that the service needs to verify introduction requests, and the resources that legitimate clients need to get to the onion service.

6.1. PoW verification [POW_TUNING_VERIFICATION]

Verifying a PoW token is the first thing that a service does when it receives an INTRODUCE2 cell and it's detailed in section [POW_VERIFY]. This verification happens during the "top half" part of the process. Every millisecond spent verifying PoW adds overhead to the already existing "top half" part of handling an introduction cell. Hence we should be careful to add minimal overhead here so that we don't enable attacks like [ATTACK_TOP_HALF].

During our performance measurements in [TOR_MEASUREMENTS] we learned that the "top half" takes about 0.26 msecs on average, without doing any sort of PoW verification.
Using that value we compute the following table, which describes the number of cells we can queue per second (aka times we can perform the "top half" process) for different values of PoW verification time:

    +---------------------+-----------------------+--------------+
    |PoW Verification Time| Total "top half" time | Cells Queued |
    |                     |                       |  per second  |
    |---------------------|-----------------------|--------------|
    |     0    msec       |       0.26    msec    |    3846      |
    |     1    msec       |       1.26    msec    |    793       |
    |     2    msec       |       2.26    msec    |    442       |
    |     3    msec       |       3.26    msec    |    306       |
    |     4    msec       |       4.26    msec    |    234       |
    |     5    msec       |       5.26    msec    |    190       |
    |     6    msec       |       6.26    msec    |    159       |
    |     7    msec       |       7.26    msec    |    137       |
    |     8    msec       |       8.26    msec    |    121       |
    |     9    msec       |       9.26    msec    |    107       |
    |     10   msec       |       10.26   msec    |    97        |
    +---------------------+-----------------------+--------------+

Here is how you can read the table above:

  - For a PoW function with a 1ms verification time, an attacker needs to send 793 dummy introduction cells per second to succeed in an [ATTACK_TOP_HALF] attack.

  - For a PoW function with a 2ms verification time, an attacker needs to send 442 dummy introduction cells per second to succeed in an [ATTACK_TOP_HALF] attack.

  - For a PoW function with a 10ms verification time, an attacker needs to send 97 dummy introduction cells per second to succeed in an [ATTACK_TOP_HALF] attack.

Whether an attacker can succeed at that depends on the attacker's resources, but also on the network's capacity.

Our purpose here is to have the smallest PoW verification overhead possible that also allows us to achieve all our other goals.

[Note that the table above is simply the result of a naive multiplication and does not take into account all the auxiliary overheads that happen every second like the time to invoke the mainloop, the bottom-half processes, or pretty much anything other than the "top-half" processing.

During our measurements the time to handle INTRODUCE2 cells dominates any other action time: There might be events that require a long processing time, but these are pretty infrequent (like uploading a new HS descriptor) and hence over a long time they smooth out. Hence extrapolating the total cells queued per second based on a single "top half" time seems good enough to get some initial intuition. That said, the real values of "Cells queued per second" are likely much smaller than those displayed in the table above because of all the auxiliary overheads.]

6.2. PoW difficulty analysis [POW_DIFFICULTY_ANALYSIS]

The difficulty setting of our PoW basically dictates how difficult it should be to get a success in our PoW system. An attacker who can get many successes per second can pull a successful [ATTACK_BOTTOM_HALF] attack against our system.

In classic PoW systems, "success" is defined as getting a hash output below the "target". However, since our system is dynamic, we define "success" as an abstract high-effort computation.

Our system is dynamic but we still need a starting difficulty setting that will be used for bootstrapping the system. The client and attacker can still aim higher or lower but for UX purposes and for analysis purposes we do need to define a starting difficulty, to minimize retries by clients.

6.2.1. Analysis based on adversary power

In this section we will try to do an analysis of PoW difficulty without using any sort of Tor-related or PoW-related benchmark numbers.

We created the table (see [REF_TABLE]) below which shows how much time a legitimate client with a single machine should expect to burn before they get a single success.

The x-axis is how many successes we want the attacker to be able to do per second: the more successes we allow the adversary, the more they can overwhelm our introduction queue. The y-axis is how many machines the adversary has at her disposal, ranging from just 5 to 1000.
           ===============================================================
           |    Expected Time (in seconds) Per Success For One Machine   |
     ===========================================================================
     |                                                                          |
     |   Attacker Successes       1       5       10      20      30      50    |
     |       per second                                                         |
     |                                                                          |
     |            5               5       1       0       0       0       0     |
     |            50              50      10      5       2       1       1     |
     |            100             100     20      10      5       3       2     |
     | Attacker   200             200     40      20      10      6       4     |
     |  Boxes     300             300     60      30      15      10      6     |
     |            400             400     80      40      20      13      8     |
     |            500             500     100     50      25      16      10    |
     |            1000            1000    200     100     50      33      20    |
     |                                                                          |
     ============================================================================

Here is how you can read the table above:

  - If an adversary has a botnet with 1000 boxes, and we want to limit her to 1 success per second, then a legitimate client with a single box should be expected to spend 1000 seconds getting a single success.

  - If an adversary has a botnet with 1000 boxes, and we want to limit her to 5 successes per second, then a legitimate client with a single box should be expected to spend 200 seconds getting a single success.

  - If an adversary has a botnet with 500 boxes, and we want to limit her to 5 successes per second, then a legitimate client with a single box should be expected to spend 100 seconds getting a single success.

  - If an adversary has access to 50 boxes, and we want to limit her to 5 successes per second, then a legitimate client with a single box should be expected to spend 10 seconds getting a single success.

  - If an adversary has access to 5 boxes, and we want to limit her to 5 successes per second, then a legitimate client with a single box should be expected to spend 1 second getting a single success.

With the above table we can create some profiles for starting values of our PoW difficulty.

6.2.2. Analysis based on Tor's performance [POW_DIFFICULTY_TOR]

To go deeper here, we can use the performance measurements from [TOR_MEASUREMENTS] to get a more specific intuition on the starting difficulty.
In particular, we learned that completely handling an introduction cell takes 5.55 msecs on average. Using that value, we can compute the following table, which describes the number of introduction cells we can handle per second for different values of PoW verification:

    +---------------------+-----------------------+--------------+
    |PoW Verification Time| Total time to handle  | Cells handled|
    |                     |   introduction cell   |  per second  |
    |---------------------|-----------------------|--------------|
    |     0    msec       |       5.55    msec    |    180.18    |
    |     1    msec       |       6.55    msec    |    152.67    |
    |     2    msec       |       7.55    msec    |    132.45    |
    |     3    msec       |       8.55    msec    |    116.96    |
    |     4    msec       |       9.55    msec    |    104.71    |
    |     5    msec       |       10.55   msec    |    94.79     |
    |     6    msec       |       11.55   msec    |    86.58     |
    |     7    msec       |       12.55   msec    |    79.68     |
    |     8    msec       |       13.55   msec    |    73.80     |
    |     9    msec       |       14.55   msec    |    68.73     |
    |     10   msec       |       15.55   msec    |    64.31     |
    +---------------------+-----------------------+--------------+

Here is how you can read the table above:

  - For a PoW function with a 1ms verification time, an attacker needs to send 152 high-effort introduction cells per second to succeed in an [ATTACK_BOTTOM_HALF] attack.

  - For a PoW function with a 10ms verification time, an attacker needs to send 64 high-effort introduction cells per second to succeed in an [ATTACK_BOTTOM_HALF] attack.

We can use this table to specify a starting difficulty that won't allow our target adversary to succeed in an [ATTACK_BOTTOM_HALF] attack.

Of course, when it comes to this table, the same disclaimer as in section [POW_TUNING_VERIFICATION] is valid. That is, the above table is just a theoretical extrapolation and we expect the real values to be much lower since they depend on auxiliary processing overheads, and on the network's capacity.

7. Discussion

7.1. UX

This proposal has user facing UX consequences.

When the client first attempts a PoW, it can note how long iterations of the hash function take, and then use this to determine an estimation of the duration of the PoW.
This estimation could be communicated via the control port or other mechanism, such that the browser could display how long the PoW is expected to take on their device. If the device is a mobile platform, and this time estimation is large, it could recommend that the user try from a desktop machine.

7.2. Future work [FUTURE_WORK]

7.2.1. Incremental improvements to this proposal

There are various improvements that can be done in this proposal, and while we are trying to keep this v1 version simple, we need to keep the design extensible so that we can build more features into it. In particular:

  - End-to-end introduction ACKs

    This proposal suffers from various UX issues because there is no end-to-end mechanism for an onion service to inform the client about its introduction request. If we had end-to-end introduction ACKs many of the problems from [CLIENT_BEHAVIOR] would be alleviated. The problem here is that end-to-end ACKs require modifications on the introduction point code and a network update which is a lengthy process.

  - Multithreading scheduler

    Our scheduler is pretty limited by the fact that Tor has a single-threaded design. If we improve our multithreading support we could handle a much greater amount of introduction requests per second.

7.2.2. Future designs [FUTURE_DESIGNS]

This is just the beginning in DoS defences for Tor and there are various future designs and schemes that we can investigate. Here is a brief summary of these:

"More advanced PoW schemes" -- We could use more advanced memory-hard PoW schemes like MTP-argon2 or Itsuku to make it even harder for adversaries to create successful PoWs. Unfortunately these schemes have much bigger proof sizes, and they won't fit in INTRODUCE1 cells. See #31223 for more details.

"Third-party anonymous credentials" -- We can use anonymous credentials and a third-party token issuance server on the clearnet to issue tokens based on PoW or CAPTCHA and then use those tokens to get access to the service.
See [REF_CREDS] for more details.

"PoW + Anonymous Credentials" -- We can make a hybrid of the above ideas where we present a hard puzzle to the user when connecting to the onion service, and if they solve it we then give the user a bunch of anonymous tokens that can be used in the future. This can all happen between the client and the service without a need for a third party.

All of the above approaches are much more complicated than this proposal, and hence we want to start easy before we get into more serious projects.

7.3. Environment

We love the environment! We are concerned about how PoW schemes can waste energy by doing useless hash iterations. Here are a few reasons we still decided to pursue a PoW approach here:

"We are not making things worse" -- DoS attacks are already happening and attackers are already burning energy to carry them out on the attacker side, on the service side and on the network side. We think that asking legitimate clients to carry out PoW computations is not going to affect the equation too much, since an attacker right now can very quickly cause the same damage that hundreds of legitimate clients do in a whole day.

"We hope to make things better" -- The hope is that proposals like this will make the DoS actors go away and hence the PoW system will not be used. As long as DoS is happening there will be a waste of energy, but if we manage to demotivate them with technical means, the network as a whole will be less wasteful. Also see [CATCH22] for a similar argument.

8. Acknowledgements

Thanks a lot to tevador for the various improvements to the proposal and for helping us understand and tweak the RandomX scheme.

Thanks to Solar Designer for the help in understanding the current PoW landscape, the various approaches we could take, and teaching us a few neat tricks.

Appendix A. Little-t tor introduction scheduler

This section describes how we will implement this proposal in the "tor" software (little-t tor).
The following should be read as if tor is an onion service and thus the end point of all inbound data.

A.1. The Main Loop [MAIN_LOOP]

Tor uses libevent for its mainloop. For network I/O operations, a mainloop event is used to inform tor if it can read on a certain socket, or a connection object in tor.

From there, this event will empty the connection input buffer (inbuf) by extracting and processing a cell at a time. The mainloop is single threaded and thus each cell is handled sequentially.

Processing an INTRODUCE2 cell at the onion service means a series of operations (in order):

  1) Unpack cell from inbuf to local buffer.

  2) Decrypt cell (AES operations).

  3) Parse cell header and process it depending on its RELAY_COMMAND.

  4) INTRODUCE2 cell handling which means building a rendezvous circuit:
       i)  Path selection
       ii) Launch circuit to first hop.

  5) Return to mainloop event which essentially means back to step (1).

Tor will read at most 32 cells out of the inbuf per mainloop round.

A.2. Requirements for PoW

With this proposal, in order to prioritize cells by the amount of PoW work they have done, cells can _not_ be processed sequentially as described above.

Thus, we need a way to queue a certain number of cells, prioritize them and then process some cell(s) from the top of the queue (that is, the cells that have done the most PoW effort).

We thus require a new cell processing flow that is _not_ compatible with current tor design. The elements are:

  - Validate PoW and place cells in a priority queue of INTRODUCE2 cells (as described in section [INTRO_QUEUE]).

  - Defer "bottom half" INTRO2 cell processing for after cells have been queued into the priority queue.

A.3. Proposed scheduler [TOR_SCHEDULER]

The intuitive way to address the A.2 requirements would be to do this simple and naive approach:

  1) Mainloop: Empty inbuf INTRODUCE2 cells into priority queue

  2) Process all cells in pqueue

  3) Goto (1)

However, we are worried that handling all those cells before returning to the mainloop opens possibilities of attack by an adversary since the priority queue is not going to be kept up to date while we process all those cells. This means that we might spend lots of time dealing with introductions that don't deserve it. See [BOTTOM_HALF_SCHEDULER] for more details.

We thus propose to split the INTRODUCE2 handling into two different steps: "top half" and "bottom half" process, as also mentioned in the [POW_VERIFY] section above.

A.3.1. Top half and bottom half scheduler

The top half process is responsible for queuing introductions into the priority queue as follows:

  a) Unpack cell from inbuf to local buffer.

  b) Decrypt cell (AES operations).

  c) Parse INTRODUCE2 cell header and validate PoW.

  d) Return to mainloop event which essentially means step (1).

The top-half basically does all operations of section [MAIN_LOOP] except for (4).

And then, the bottom-half process is responsible for handling introductions and doing rendezvous. To achieve this we introduce a new mainloop event to process the priority queue _after_ the top-half event has completed. This new event would do these operations sequentially:

  a) Pop INTRODUCE2 cell from priority queue.

  b) Parse and process INTRODUCE2 cell.

  c) End event and yield back to mainloop.

A.3.2. Scheduling the bottom half process [BOTTOM_HALF_SCHEDULER]

The question now becomes: when should the "bottom half" event get triggered from the mainloop?

We propose that this event is scheduled when the network I/O event queues at least 1 cell into the priority queue.
Then, as long as it has a cell in the queue, it would re-schedule itself for immediate execution, meaning that at the next mainloop round it would execute again.

The idea is to try to empty the queue as fast as it can in order to provide a fast response time to an introduction request but always leave a chance for more cells to appear between cell processing by yielding back to the mainloop. With this we are aiming to always have the most up-to-date version of the priority queue when we are completing introductions: this way we are prioritizing clients that spent a lot of time and effort completing their PoW.

If the size of the queue drops to 0, it stops scheduling itself in order to not create a busy loop. The network I/O event will re-schedule it in time.

Notice that the proposed solution will make the service handle 1 single introduction request at every main loop event. However, when we do performance measurements we might learn that it's preferable to bump the number of cells in the future from 1 to N where N <= 32.

A.4 Performance measurements

This section will detail the performance measurements we've done on tor.git for handling an INTRODUCE2 cell and then a discussion on how much more CPU time we can add (for PoW validation) before it badly degrades our performance.

A.4.1 Tor measurements [TOR_MEASUREMENTS]

In this section we will derive measurement numbers for the "top half" and "bottom half" parts of handling an introduction cell.

These measurements have been done on tor.git at commit 80031db32abebaf4d0a91c01db258fcdbd54a471.

We've measured several sets of actions of the INTRODUCE2 cell handling process on Intel(R) Xeon(R) CPU E5-2650 v4. Our service was accessed by an array of clients that sent introduction requests for a period of 60 seconds.

  1. Full Mainloop Event

     We start by measuring the full time it takes for a mainloop event to process an inbuf containing INTRODUCE2 cells.
     The mainloop event processed 2.42 cells per invocation on average during our measurements.

     Total measurements: 3279

       Min: 0.30 msec - 1st Q.: 5.47 msec - Median: 5.91 msec
       Mean: 13.43 msec - 3rd Q.: 16.20 msec - Max: 257.95 msec

  2. INTRODUCE2 cell processing (bottom-half)

     We also measured how much time the "bottom half" part of the process takes. That's the heavy part of processing an introduction request as seen in step (4) of the [MAIN_LOOP] section:

     Total measurements: 7931

       Min: 0.28 msec - 1st Q.: 5.06 msec - Median: 5.33 msec
       Mean: 5.29 msec - 3rd Q.: 5.57 msec - Max: 14.64 msec

  3. Connection data read (top half)

     Now that we have the above pieces, we can use them to measure just the "top half" part of the procedure. That's when bytes are taken from the connection inbound buffer and parsed into an INTRODUCE2 cell where basic validation is done.

     There is an average of 2.42 INTRODUCE2 cells per mainloop event and so we divide the full mainloop event mean time by that to get the time for one cell. From that we subtract the "bottom half" mean time to get how much the "top half" takes:

       => 13.43 / (7931 / 3279) = 5.55
       => 5.55 - 5.29 = 0.26

       Mean: 0.26 msec

To summarize, during our measurements the average number of INTRODUCE2 cells a mainloop event processed is ~2.42 cells (7931 cells for 3279 mainloop invocations).

This means that, taking the mean of mainloop event times, it takes ~5.55msec (13.43/2.42) to completely process an INTRODUCE2 cell. Then if we look deeper we see that the "top half" of INTRODUCE2 cell processing takes 0.26 msec on average, whereas the "bottom half" takes around 5.29 msec.

The heaviness of the "bottom half" is to be expected since that's where 95% of the total work takes place: in particular the rendezvous path selection and circuit launch.

A.5. References

    [REF_EQUIX]: https://github.com/tevador/equix
                 https://github.com/tevador/equix/blob/master/devlog.md
    [REF_TABLE]: The table is based on the script below plus some manual editing for readability:
                 https://gist.github.com/asn-d6/99a936b0467b0cef88a677baaf0bbd04
    [REF_BOTNET]: https://media.kasperskycontenthub.com/wp-content/uploads/sites/43/2009/07/01121538/ynam_botnets_0907_en.pdf
    [REF_CREDS]: https://lists.torproject.org/pipermail/tor-dev/2020-March/014198.html
    [REF_TARGET]: https://en.bitcoin.it/wiki/Target
    [REF_TLS]: https://www.ietf.org/archive/id/draft-nygren-tls-client-puzzles-02.txt
               https://tools.ietf.org/id/draft-nir-tls-puzzles-00.html
               https://tools.ietf.org/html/draft-ietf-ipsecme-ddos-protection-10
    [REF_TLS_1]: https://www.ietf.org/archive/id/draft-nygren-tls-client-puzzles-02.txt
    [REF_TEVADOR_1]: https://lists.torproject.org/pipermail/tor-dev/2020-May/014268.html
    [REF_TEVADOR_2]: https://lists.torproject.org/pipermail/tor-dev/2020-June/014358.html
    [REF_TEVADOR_SIM]: https://github.com/mikeperry-tor/scratchpad/blob/master/tor-pow/effort_sim.py#L57
Filename: 328-relay-overload-report.md
Title: Make Relays Report When They Are Overloaded
Author: David Goulet, Mike Perry
Created: November 3rd 2020
Status: Closed

0. Introduction

Many relays are likely to be under heavy load at times in terms of memory, CPU or network resources, which in turn diminishes their ability to efficiently relay data through the network.

Having the capability of learning if a relay is overloaded would allow us to make better informed load balancing decisions. For instance, we can make our bandwidth scanners more intelligent about how they allocate bandwidth based on such metrics from relays.

We could furthermore improve our network health monitoring and pinpoint relays possibly misbehaving or under DDoS attack.

1. Metrics to Report

We propose that relays start collecting several metrics (see section 2) reflecting their loads from different components of tor.

Then, we propose that 1 new line be added to the server descriptor document (see dir-spec.txt, section 2.1.1) for the general overload case.

And 2 new lines to the extra-info document (see dir-spec.txt, section 2.1.2) for more specific overload cases.

The following describes a series of metrics to collect but more might come in the future and thus this is not an exhaustive list.

1.1. General Overload

The general overload line indicates that a relay has reached an "overloaded state" which can be one or many of the following load metrics:

  • Any OOM invocation due to memory pressure
  • Any ntor onionskins are dropped [Removed in tor-0.4.6.11 and 0.4.7.5-alpha]
  • A certain ratio of ntor onionskins dropped. [Added in tor-0.4.6.11 and 0.4.7.5-alpha]
  • TCP port exhaustion
  • DNS timeout reached (X% of timeouts over Y seconds). [Removed in tor-0.4.7.3-alpha]
  • CPU utilization of Tor's mainloop CPU core above 90% for 60 sec [Never implemented]
  • Control port overload (too many messages queued) [Never implemented]

For DNS timeouts, the X and Y are consensus parameters (overload_dns_timeout_scale_percent and overload_dns_timeout_period_secs) defined in param-spec.txt.

The format of the overloaded line added in the server descriptor document is as follows:

"overload-general" SP version SP YYYY-MM-DD HH:MM:SS NL [At most once.]

The timestamp is when at least one metric was detected. It should always be at the hour and thus, as an example, "2020-01-10 13:00:00" is an expected timestamp. Because this is a binary state, if the line is present, we consider that it was hit at the very least once somewhere between the provided timestamp and the "published" timestamp of the document which is when the document was generated.

The overload field should remain in place for 72 hours since last triggered. If the limits are reached again in this period, the timestamp is updated, and this 72 hour period restarts.

The 'version' field is set to '1' for the initial implementation of this proposal, which includes all the above overload metrics except for the CPU and control port overload.
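As an illustration of the line format and the 72 hour retention rule above, here is a minimal Python sketch. The function names are hypothetical and not part of tor; treating the timestamp as UTC and truncating it to the hour are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# From the text: the line stays in place for 72 hours after the last trigger.
OVERLOAD_TTL = timedelta(hours=72)


def parse_overload_general(line):
    """Parse '"overload-general" SP version SP YYYY-MM-DD HH:MM:SS'.

    Returns (version, timestamp). UTC is an assumption of this sketch.
    """
    keyword, version, date, time = line.strip().split(" ")
    if keyword != "overload-general":
        raise ValueError("not an overload-general line")
    ts = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return int(version), ts.replace(tzinfo=timezone.utc)


def truncate_to_hour(ts):
    """The reported timestamp should always be at the hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def should_publish(last_triggered, now):
    """Relay side: keep the line until 72 hours have passed since the last trigger."""
    return now - last_triggered < OVERLOAD_TTL
```

A consumer such as a bandwidth scanner would only need the parser; the other two helpers mirror what a relay would do before publishing its descriptor.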

1.2. Token bucket size

Relays should report the 'BandwidthBurst' and 'BandwidthRate' limits in their descriptor, as well as the number of times these limits were reached, for read and write, in the past 24 hours starting at the provided timestamp rounded down to the hour.

The format of this overload line added in the extra-info document is as follows:

"overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS SP rate-limit SP burst-limit SP read-overload-count SP write-overload-count NL [At most once.]

The "rate-limit" and "burst-limit" are the raw values from the BandwidthRate and BandwidthBurst found in the torrc configuration file.

The "{read|write}-overload-count" are the counts of how many times the reported limits of burst/rate were exhausted and thus the maximum between the read and write count occurrences. To make the counter more meaningful and to avoid multiple connections saturating the counter when a relay is overloaded, we only increment it once a minute.

The 'version' field is set to '1' for the initial implementation of this proposal.
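The following Python sketch shows one way to read the "overload-ratelimits" line and to apply the once-a-minute increment rule described above. The class and function names are hypothetical, and the counter takes the current time as a plain number of seconds for simplicity.

```python
from dataclasses import dataclass


@dataclass
class RateLimitOverload:
    version: int
    timestamp: str  # "YYYY-MM-DD HH:MM:SS", left unparsed in this sketch
    rate_limit: int
    burst_limit: int
    read_overload_count: int
    write_overload_count: int


def parse_overload_ratelimits(line):
    """Split the eight space-separated fields of an overload-ratelimits line."""
    fields = line.strip().split(" ")
    if len(fields) != 8 or fields[0] != "overload-ratelimits":
        raise ValueError("not an overload-ratelimits line")
    _, version, date, time, rate, burst, reads, writes = fields
    return RateLimitOverload(int(version), f"{date} {time}", int(rate),
                             int(burst), int(reads), int(writes))


class MinuteLimitedCounter:
    """Counts at most one exhaustion event per 60 seconds (hypothetical helper)."""

    def __init__(self):
        self.count = 0
        self._last_increment = None

    def note_exhausted(self, now_secs):
        # Only increment if a full minute has passed since the last increment.
        if self._last_increment is None or now_secs - self._last_increment >= 60:
            self.count += 1
            self._last_increment = now_secs
```

A relay would keep one such counter for the read side and one for the write side.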

1.3. File Descriptor Exhaustion

Not having enough file descriptors in this day and age is really a misconfiguration or the sign of an outdated operating system. By reporting it, we can very quickly notice which relays have a limit that is too small and notify their operators.

The format of this overload line added in the extra-info document is as follows:

"overload-fd-exhausted" SP version SP YYYY-MM-DD HH:MM:SS NL [At most once.]

As with the overload-general line, the timestamp indicates that the maximum was reached between this timestamp and the "published" timestamp of the document.

This overload field should remain in place for 72 hours since last triggered. If the limits are reached again in this period, the timestamp is updated, and this 72 hour period restarts.

The 'version' field is set to '1' for the initial implementation of this proposal which detects fd exhaustion only when a socket open fails.

2. Load Metrics

This section proposes a series of metrics that should be collected and reported to the MetricsPort. The Prometheus format (the only one supported for now) is described for each metric.

2.1 Out-Of-Memory (OOM) Invocation

Tor's OOM manages caches and queues of all sorts. Relays have many of them and so any invocation of the OOM should be reported.

    # HELP Total number of bytes the OOM has cleaned up
    # TYPE counter
    tor_relay_load_oom_bytes_total{<LABEL>} <VALUE>

Running counter of how many bytes were cleaned up by the OOM for a tor component identified by a label (see list below). To make sense, this should be visualized with the rate() function.

Possible LABELs for which the OOM was triggered:

  • subsys=cell: Circuit cell queue
  • subsys=dns: DNS resolution cache
  • subsys=geoip: GeoIP cache
  • subsys=hsdir: Onion service descriptors

2.2 Onionskin Queues

Onionskin handling is one of the few items that tor processes in parallel, but onionskins can be dropped for various reasons when under load. For this metric to make sense, we also need to gather how many onionskins we are processing, so that one can compute a total processed versus dropped ratio:

    # HELP Total number of onionskins
    # TYPE counter
    tor_relay_load_onionskins_total{<LABEL>} <NUM>

Possible LABELs are:

  • type=<handshake_type>: Handshake type of the onionskins.
    • Possible values: ntor, tap, fast
  • action=processed: Indicating how many were processed.
  • action=dropped: Indicating how many were dropped due to load.
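To show how the processed and dropped counters combine into a ratio, here is a small Python sketch that reads MetricsPort-style sample lines. It assumes labels are written with quoted values as in the Prometheus text exposition format, and it defines the ratio as dropped / (processed + dropped); both are assumptions of this example, not statements about tor's implementation.

```python
import re

# metric_name{label="value",...} number
SAMPLE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Split one labelled sample line into (name, labels, value)."""
    m = SAMPLE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unparseable sample: {line!r}")
    name, raw_labels, value = m.groups()
    return name, dict(LABEL_RE.findall(raw_labels)), float(value)


def dropped_ratio(lines, handshake_type="ntor"):
    """Fraction of onionskins of the given type that were dropped."""
    counts = {"processed": 0.0, "dropped": 0.0}
    for line in lines:
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        name, labels, value = parse_sample(line)
        if name != "tor_relay_load_onionskins_total":
            continue
        if labels.get("type") != handshake_type:
            continue
        if labels.get("action") in counts:
            counts[labels["action"]] += value
    total = counts["processed"] + counts["dropped"]
    return counts["dropped"] / total if total else 0.0
```

Since these are running counters, a monitoring system would normally compute this ratio over the increase within a time window rather than over the lifetime totals.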

2.3 File Descriptor Exhaustion

Relays can reach a "ulimit" (on Linux) cap, which is the number of allowed open file descriptors. In Tor's use case, these are mostly sockets. File descriptors should be reported as follows:

    # HELP Total number of sockets
    # TYPE gauge
    tor_relay_load_socket_total{<LABEL>} <NUM>

Possible LABELs are:

  • (no label): How many available sockets.
  • state=opened: How many sockets are opened.

Note: since tor already tracks that value in order to reserve a block for critical ports such as the Control Port, that value can easily be exported.

2.4 TCP Port Exhaustion

The TCP protocol is capped at 65535 ports and thus, if the relay is ever unable to open more outbound sockets, that is an overloaded state. It should be reported:

    # HELP Total number of times we ran out of TCP ports
    # TYPE gauge
    tor_relay_load_tcp_exhaustion_total <NUM>

2.5 Connection Bucket Limit

Rate limited connections track bandwidth using a bucket system. Once the bucket is filled and tor wants to send more, it pauses until it is refilled a second later. Once that is hit, it should be reported:

    # HELP Total number of global connection bucket limit reached
    # TYPE counter
    tor_relay_load_global_rate_limit_reached_total{<LABEL>} <NUM>

Possible LABELs are:

  • side=read: Read side of the global rate limit bucket.
  • side=write: Write side of the global rate limit bucket.
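The connection bucket metric from section 2.5 can be pictured with a toy token bucket that counts how often a request could not be fully served. This is a simplified Python sketch with hypothetical names; tor's real bucket code differs, and whether a partial send counts as "limit reached" is an assumption here.

```python
class CountingBucket:
    """Toy token bucket that tracks how often it ran dry (one side only)."""

    def __init__(self, rate, burst):
        self.rate = rate          # tokens added per second
        self.burst = burst        # maximum number of tokens held
        self.tokens = burst
        self.limit_reached_total = 0

    def refill(self, elapsed_secs):
        # Add tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + self.rate * elapsed_secs)

    def consume(self, nbytes):
        """Return how many bytes may be sent now; count a hit when short."""
        allowed = min(nbytes, self.tokens)
        self.tokens -= allowed
        if allowed < nbytes:
            self.limit_reached_total += 1
        return allowed
```

One instance per side would map onto the side=read and side=write labels.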
Filename: 329-traffic-splitting.txt
Title: Overcoming Tor's Bottlenecks with Traffic Splitting
Author: David Goulet, Mike Perry
Created: 2020-11-25
Status: Finished

0. Status

This proposal describes the Conflux [CONFLUX] system developed by Mashael AlSabah, Kevin Bauer, Tariq Elahi, and Ian Goldberg. It aims at improving Tor client network performance by dynamically splitting traffic between two circuits. We have made several additional improvements to the original Conflux design, by making use of congestion control information, as well as updates from Multipath TCP literature.

1. Overview

1.1. Multipath TCP Design Space

In order to understand our improvements to Conflux, it is important to properly conceptualize what is involved in the design of multipath algorithms in general.

The design space is broken into two orthogonal parts: congestion control algorithms that apply to each path, and traffic scheduling algorithms that decide which packets to send on each path.

MPTCP specifies 'coupled' congestion control (see [COUPLED]). Coupled congestion control updates single-path congestion control algorithms to account for shared bottlenecks between the paths, so that the combined congestion control algorithms do not overwhelm any bottlenecks that happen to be shared between the multiple paths. Various ways of accomplishing this have been proposed and implemented in the Linux kernel.

Because Tor's congestion control only concerns itself with bottlenecks in Tor relay queues, and not with any other bottlenecks (such as intermediate Internet routers), we can avoid this complexity merely by specifying that any paths that are constructed SHOULD NOT share any relays (except for the exit). This assumption is valid, because non-relay bottlenecks are managed by TCP of client-to-relay and relay-to-relay OR connections, and not Tor's circuit-level congestion control. In this way, we can proceed to use the exact same congestion control as specified in [PROP324], for each path.
For this reason, this proposal will focus on protocol specification, and the traffic scheduling algorithms, rather than coupling. Note that the scheduling algorithms are currently in flux, and will be subject to change as we tune them in Shadow, on the live network, and for future UDP implementation (see [PROP339]). This proposal will be kept up to date with the current implementation.

1.2. Divergence from the initial Conflux design

The initial [CONFLUX] paper doesn't provide any indications on how to handle the size of the out-of-order cell queue, which we consider a potentially dangerous memory DoS vector (see [MEMORY_DOS]). It also used RTT as the sole heuristic for selecting which circuit to send on (which may vary depending on the geographical locations of the participant relays), without considering their actual available circuit capacity (which will be available to us via Proposal 324). Additionally, since the publication of [CONFLUX], more modern packet scheduling algorithms have been developed, which aim to reduce out-of-order queue size.

We propose mitigations for these issues using modern scheduling algorithms, as well as implementation options for avoiding the out-of-order queue at Exit relays.

Additionally, we consider resumption, side channel, and traffic analysis risks and benefits in [RESUMPTION], [SIDE_CHANNELS] and [TRAFFIC_ANALYSIS].

1.3. Design Overview

The following section describes the Conflux design.

The circuit construction is as follows:

             Primary Circuit (lower RTT)
                +-------+      +--------+
                |Guard 1|----->|Middle 1|----------+
                +---^---+      +--------+          |
       +-----+      |                           +--v---+
       | OP  +------+                           | Exit |--> ...
       +-----+      |                           +--^---+
                +---v---+      +--------+          |
                |Guard 2|----->|Middle 2|----------+
                +-------+      +--------+
             Secondary Circuit (higher RTT)

Both circuits are built using current Tor path selection, however they SHOULD NOT share the same Guard relay, or middle relay.
By avoiding using the same relays in these positions in the path, we ensure additional path capacity, and eliminate the need to use more complicated 'coupled' congestion control algorithms from the MPTCP literature[COUPLED]. This both simplifies design, and improves performance.

Then, the OP needs to link the two circuits together, as described in [CONFLUX_HANDSHAKE].

For ease of explanation, the primary circuit is the circuit that is more desirable to use, as per the scheduling algorithm, and the secondary circuit is used after the primary is blocked by congestion control. Note that for some algorithms, this selection becomes fuzzy, but all of them favor the circuit with lower RTT, at the beginning of transmission.

Note also that this notion of primary vs secondary is a local property of the current sender: each endpoint may have different notions of primary, secondary, and current sending circuit. They also may use different scheduling algorithms to determine this.

Initial RTT is measured during circuit linking, as described in [CONFLUX_HANDSHAKE]. After the initial link, RTT is continually measured using SENDME timing, as in Proposal 324. This means that during use, the primary circuit and secondary circuit may switch roles, depending on unrelated network congestion caused by other Tor clients.

We also support linking onion service circuits together. In this case, only two rendezvous circuits are linked. Each of these RP circuits will be constructed separately, and then linked. However, the same path constraints apply to each half of the circuits (no shared relays between the legs). If, by chance, the service and the client sides end up sharing some relays, this is not catastrophic. Multipath TCP researchers we have consulted (see [ACKNOWLEDGMENTS]), believe Tor's congestion control from Proposal 324 to be sufficient in this rare case.

In the algorithms we recommend here, only two circuits will be linked together at a time.
However, implementations SHOULD support more than two paths, as this has been shown to assist in traffic analysis resistance[WTF_SPLIT], and will also be useful for maintaining a desired target RTT, for UDP VoIP applications.

If the number of circuits exceeds the current number of guard relays, guard relays MAY be re-used, but implementations SHOULD use the same number of Guards as paths.

Linked circuits MUST NOT be extended further once linked (ie: 'cannibalization' is not supported).

2. Protocol Mechanics

2.1. Advertising support for conflux

2.1.1 Relay

We propose a new protocol version in order to advertise support for circuit linking on the relay side:

    "Conflux=1" -- Relay supports Conflux as in linking circuits together using the new LINK, LINKED and SWITCH relay commands.

2.1.2 Onion Service

We propose to add a new line in order to advertise conflux support in the encrypted section of the onion service descriptor:

    "conflux" SP max-num-circ SP desired-ux NL

The "max-num-circ" value indicates the maximum number of rendezvous circuits that are allowed to be linked together.

We let the service specify the conflux algorithm to use, when sending data to the service. Some services may prefer latency, whereas some may prefer throughput. However, clients also have the ability to request their own UX for data that the service sends, in the LINK handshake below, in part because the high-throughput algorithms will require more out-of-order queue memory, which may be infeasible on mobile.

The next section describes how the circuits are linked together.

2.2. Conflux Handshake [CONFLUX_HANDSHAKE]

To link circuits, we propose new relay commands that are sent on both circuits, as well as a response to confirm the join, and an ack of this response. These commands create a 3-way handshake, which allows each endpoint to measure the initial RTT of each leg upon link, without needing to wait for any data.
All three stages of this handshake are sent on *each* circuit leg to be linked.

When packed cells are a reality (proposal 340), these cells SHOULD be combined with the initial RELAY_BEGIN cell on the faster circuit leg. This combination also allows better enforcement against side channels. (See [SIDE_CHANNELS]).

There are other ways to do this linking that we have considered, but they seem not to be significantly better than this method, especially since we can use Proposal 340 to eliminate the RTT cost of this setup before sending data. For those other ideas, see [ALTERNATIVE_LINKING] and [ALTERNATIVE_RTT], in the appendix.

The first two parts of the handshake establish the link, and enable resumption:

    19 -- RELAY_CONFLUX_LINK

          Sent from the OP to the exit/service in order to link circuits together at the end point.

    20 -- RELAY_CONFLUX_LINKED

          Sent from the exit/service to the OP, to confirm the circuits were linked.

The contents of these two cells are exactly the same. They have the following contents:

    VERSION   [1 byte]
    PAYLOAD   [variable, up to end of relay payload]

The VERSION tells us which circuit linking mechanism to use. At this point in time, only 0x01 is recognized and is the one described by the Conflux design.

For version 0x01, the PAYLOAD contains:

    NONCE              [32 bytes]
    LAST_SEQNO_SENT    [8 bytes]
    LAST_SEQNO_RECV    [8 bytes]
    DESIRED_UX         [1 byte]

The NONCE contains a random 256-bit secret, used to associate the two circuits together. The nonce MUST NOT be shared outside of the circuit transmission, or data may be injected into TCP streams. This means it MUST NOT be logged to disk.

The two sequence number fields are 0 upon initial link, but non-zero in the case of a reattach or resumption attempt (See [CONFLUX_SET_MANAGEMENT] and [RESUMPTION]).

The DESIRED_UX field allows the endpoint to request the UX properties it wants. The other endpoint SHOULD select the best known scheduling algorithm, for these properties.
The endpoints do not need to agree on which UX style they prefer.

The UX properties are:

    0 - NO_OPINION
    1 - MIN_LATENCY
    2 - LOW_MEM_LATENCY
    3 - HIGH_THROUGHPUT
    4 - LOW_MEM_THROUGHPUT

The algorithm choice is performed by the *sender* of data, (ie: the receiver of the PAYLOAD). The receiver of data (sender of the PAYLOAD) does not need to be aware of the exact algorithm in use, but MAY enforce expected properties (particularly low queue usage, in the case of requesting either LOW_MEM_LATENCY or LOW_MEM_THROUGHPUT). The receiver MAY close the entire conflux set if these properties are violated.

If either circuit does not receive a RELAY_CONFLUX_LINKED response, both circuits MUST be closed.

The third stage of the handshake exists to help the exit/service measure initial RTT, for use in [SCHEDULING]:

    21 -- RELAY_CONFLUX_LINKED_ACK

          Sent from the OP to the exit/service, to provide initial RTT measurement for the exit/service.

These three relay commands are sent on *each* leg, to allow each endpoint to measure the initial RTT of each leg.

The client SHOULD abandon and close the circuit if the LINKED message takes too long to arrive. This timeout MUST be no larger than the normal SOCKS/stream timeout in use for RELAY_BEGIN, but MAY be the Circuit Build Timeout value, instead. (The C-Tor implementation currently uses Circuit Build Timeout).

See [SIDE_CHANNELS] for rules for when to reject unexpected handshake cells.

2.2. Linking Circuits from OP to Exit [LINKING_EXIT]

To link exit circuits, two circuits to the same exit are built, with additional restrictions such that they do not share Guard or Middle relays. This restriction applies via specific relay identity keys, and not IP addresses, families, or networks. (This is because the purpose of it is to avoid sharing a bottleneck *inside* relay circuit queues; bottlenecks caused by a shared network are handled by TCP's congestion control on the OR conns).
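As an aside on the handshake in [CONFLUX_HANDSHAKE], the version 0x01 LINK/LINKED cell body can be sketched in Python as below. The field order and sizes come from the text; the big-endian (network) byte order and all function names are assumptions of this example.

```python
import secrets
import struct
from enum import IntEnum


class DesiredUX(IntEnum):
    NO_OPINION = 0
    MIN_LATENCY = 1
    LOW_MEM_LATENCY = 2
    HIGH_THROUGHPUT = 3
    LOW_MEM_THROUGHPUT = 4


# VERSION(1) | NONCE(32) | LAST_SEQNO_SENT(8) | LAST_SEQNO_RECV(8) | DESIRED_UX(1)
# Big-endian is an assumption; the text above does not state the byte order.
LINK_V1 = struct.Struct(">B32sQQB")


def new_nonce():
    """Random 256-bit secret; per the text it must never be logged."""
    return secrets.token_bytes(32)


def pack_link(nonce, last_sent=0, last_recv=0, ux=DesiredUX.NO_OPINION):
    if len(nonce) != 32:
        raise ValueError("NONCE must be 32 bytes")
    return LINK_V1.pack(0x01, nonce, last_sent, last_recv, int(ux))


def unpack_link(body):
    if len(body) < LINK_V1.size or body[0] != 0x01:
        raise ValueError("unsupported or truncated LINK body")
    _, nonce, last_sent, last_recv, ux = LINK_V1.unpack(body[:LINK_V1.size])
    return nonce, last_sent, last_recv, DesiredUX(ux)
```

On an initial link both sequence number fields are zero, which is what the defaults of pack_link reflect.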
Bridges also are subject to the same constraint as Guard relays; the C-Tor codebase emits a warning if only one bridge is configured, unless that bridge has transport "snowflake". Snowflake is exempt from this Guard restriction because it is actually backed by many bridges. In the bridge case, we only warn, and do not refuse to build conflux circuits, because it is not catastrophic that Bridges are shared, it is just sub-optimal for performance and congestion.

When each circuit is opened, we ensure that congestion control has been negotiated. If congestion control negotiation has failed, the circuit MUST be closed. After this, the linking handshake begins.

The RTT times between RELAY_CONFLUX_LINK and RELAY_CONFLUX_LINKED are measured by the client, to determine primary vs secondary circuit use, and for packet scheduling. Similarly, the exit measures the RTT times between RELAY_CONFLUX_LINKED and RELAY_CONFLUX_LINKED_ACK, for the same purpose.

Because of the race between initial data and the RELAY_CONFLUX_LINKED_ACK cell, conditions can arise where an Exit needs to send data before the slowest circuit delivers this ACK. In these cases, it should prefer sending data on the circuit that has delivered the ACK (which will arrive immediately prior to any data from the client). This circuit will be the lower RTT circuit anyway, but the code needs to handle the fact that in this case, there won't yet be an RTT for the second circuit.

2.3. Linking circuits to an onion service [LINKING_SERVICE]

For onion services, we will only concern ourselves with linking rendezvous circuits.

To join rendezvous circuits, clients make two introduce requests to a service's intropoint, causing it to create two rendezvous circuits, to meet the client at two separate rendezvous points. These introduce requests MUST be sent to the same intropoint (due to potential use of onionbalance), and SHOULD be sent back-to-back on the same intro circuit. They MAY be combined with Proposal 340.
(Note that if we do not use Prop340, we will have to raise the limit on number of intros per client circuit to 2, here, at intropoints).

When rendezvous circuits are built, they should use the same Guard, Bridge, and Middle restrictions as specified in 2.2, for Exits. These restrictions SHOULD also take into consideration all Middles in the path, including the rendezvous point. All relay identities should be unique (again, except for when the Snowflake transport is in use). The one special case here is if the rendezvous points chosen by a client are the same as the service's guards. In this case, the service SHOULD NOT use different guards, but instead stick with the guards it has. The reason for this is that we do not want to create the ability for a client to force a service to use different guards.

The first rendezvous circuit to get joined SHOULD use Proposal 340 to append the RELAY_BEGIN command, and the service MUST answer on this circuit, until RTT can be measured.

Once both circuits are linked and RTT is measured, packet scheduling MUST be used, as per [SCHEDULING].

2.4. Conflux Set Management [CONFLUX_SET_MANAGEMENT]

When managing legs, it is useful to separate sets that have completed the link handshake from legs that are still performing the handshake. Linked sets MAY have additional unlinked legs on the way, but these should not be used for sending data until the handshake is complete.

It is also useful to enforce various additional conditions on the handshake, depending on if [RESUMPTION] is supported, and if a leg has been launched because of an early failure, or due to a desire for replacement.

2.4.1. Pre-Building Sets

In C-Tor, conflux is only used via circuit prebuilding. Pre-built conflux sets are preferred over other pre-built circuits, but if the pre-built pool ends up empty, normal pre-built circuits are used. If those run out, regular non-conflux circuits are built.
In other words, in C-Tor, conflux sets are never built on-demand, but this is strictly an implementation decision, to simplify dealing with the C-Tor codebase.

The consensus parameter 'cfx_max_prebuilt_set' specifies the number of sets to pre-build.

During upgrade, the consensus parameter 'cfx_low_exit_threshold' will be used, so that if there is a low amount of conflux-supporting exits, only one conflux set will be built.

2.4.2. Set construction

When a set is launched, legs begin the handshake in the unlinked state. As handshakes complete, finalization is attempted, to create a linked set. On the client, this finalization happens upon receipt of the LINKED cell. On the exit/service, this finalization happens upon *sending* the LINKED_ACK.

The initiator of this handshake considers the set fully linked once the RELAY_CONFLUX_LINKED_ACK is sent (roughly upon receipt of the LINKED cell).

Because of the potential race between LINKED_ACK, and initial data sent by the client, the receiver of the handshake must consider a leg linked at the time of *sending* a LINKED_ACK cell.

This means that exit legs may not have an RTT measurement, if data on the faster leg beats the LINKED_ACK on the slower leg. The implementation MUST account for this, by treating unmeasured legs as having infinite RTT.

When attempting to finalize a set, this finalization should not complete if any unlinked legs are still pending.

2.4.3. Closing circuits

For circuits that are unlinked, the origin SHOULD immediately relaunch a new leg when it is closed, subject to the limits in [SIDE_CHANNELS].

In C-Tor, we do not support arbitrary resumption.
Therefore, we perform some additional checks upon closing circuits, to decide if we should immediately tear down the entire set:

  - If the closed leg was the current sending leg, close the set
  - If the closed leg had the highest non-zero last_seq_recv/sent, close the set
  - If data was in progress on a closed leg (inflight > cc_sendme_inc), then all legs must be closed

2.4.4. Reattaching Legs

While C-Tor does not support arbitrary resumption, new legs *can* be attached, so long as there is no risk of data loss from a closed leg. This enables latency probing, which will be important for UDP VoIP.

Currently, the C-Tor codebase checks for data loss by verifying that the LINK/LINKED cell has a lower last_seq_sent than all current legs' maximum last_seq_recv, and a lower last_seq_recv than all current legs' last_seq_sent.

This check is performed on finalization, not the receipt of first handshake cell. This gives the data additional time to arrive.

2.5. Congestion Control Application [CONGESTION_CONTROL]

The SENDMEs for congestion control are performed per-leg. As soon as data arrives, regardless of its ordering, it is counted towards SENDME delivery. In this way, 'cwnd - inflight' of each leg always reflects the available data to send on each leg. This is important for [SCHEDULING].

The Congestion control Stream XON/XOFF can be sent on either leg, and applies to the stream's transmission on both legs.

In C-Tor, streams used to become blocked as soon as the OR conn of their circuit was blocked. Because conflux can send on the other circuit, which uses a different OR conn, this form of stream blocking has been decoupled from the OR conn status, and only happens when congestion control has decided that all circuits are blocked (congestion control becomes blocked when either 'cwnd - inflight <= 0', *or* when the local OR conn is blocked, so if all local OR conns of a set are blocked, then the stream will block that way).
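The stream blocking rule above can be restated as a short Python sketch. The Leg class and its field names are hypothetical stand-ins for per-leg congestion control state.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    cwnd: int                     # congestion window, in cells
    inflight: int                 # cells sent but not yet acked by SENDME
    orconn_blocked: bool = False  # local OR connection is blocked

    def blocked(self):
        # Congestion control is blocked when the window is full
        # or when the local OR conn is blocked.
        return self.cwnd - self.inflight <= 0 or self.orconn_blocked


def streams_should_block(legs):
    """Streams of a conflux set only block once every leg is blocked."""
    return all(leg.blocked() for leg in legs)
```

With a single leg this reduces to the pre-conflux behaviour, where a blocked circuit blocks its streams.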
Note also that because congestion control only covers RELAY_COMMAND_DATA cells, for all algorithms, a special case must be made such that if no circuit is available to send on due to congestion control blocking, commands other than RELAY_COMMAND_DATA MUST be sent on the current circuit, even if the cell scheduler believes that no circuit is available. Depending on the code structure of Arti, this special case may or may not be necessary. It arises in C-Tor because nothing can block the sending of arbitrary non-DATA relay command cells.

2.6. Sequencing [SEQUENCING]

With multiple paths for data, the problem of data re-ordering appears. In other words, cells can arrive out of order from the two circuits where cell N + 1 arrives before the cell N.

Handling this reordering operates after congestion control for each circuit leg, but before relay cell command processing or stream data delivery.

For the receiver to be able to reorder the received cells, a sequencing scheme needs to be implemented. However, because Tor does not drop or reorder packets inside of a circuit, this sequence number can be very small. It only has to signal that a cell comes after those arriving on another circuit.

To achieve this, we propose a new relay command used to indicate a switch to another leg:

    22 -- RELAY_CONFLUX_SWITCH

          Sent from a sending endpoint when switching legs in an already linked circuit construction. This message is sent on the leg that will be used for new traffic, and tells the receiver the size of the gap since the last data (if any) sent on that leg.

          The cell payload format is:

            SeqNum [4 bytes]

The "SeqNum" value is a relative sequence number, which is the difference between the last absolute sequence number sent on the new leg and the last absolute sequence number sent on all other legs prior to the switch. In this way, the endpoint knows what to increment its local absolute sequence number by, before cells start to arrive.
To achieve this, the sender must maintain the last absolute sequence sent for each leg, and the receiver must maintain the last absolute sequence number received for each leg.

As an example, let's say we send 10 cells on the first leg, so our absolute sequence number is 10. If we then switch to the second leg, it is trivial to see that we should send a SWITCH with 10 as the relative sequence number, to indicate that regardless of the order in which the first cells are received, subsequent cells on the second leg should start counting at 10.

However, if we then send 21 cells on this leg, our local absolute sequence number as the sender is 31. So when we switch back to the first leg, where the last absolute sequence sent was 10, we must send a SWITCH cell with 21, so that when the first leg receives subsequent cells, it assigns those cells an absolute sequence number starting at 31.

In the rare event that we send more than 2^31 cells (~1TB) on a single leg, the leg should be switched in order to reset that relative sequence number to fit within 4 bytes.

For a discussion of rules to rate limit the usage of SWITCH as a side channel, see [SIDE_CHANNELS].

2.7. Resumption [RESUMPTION]

In the event that a circuit leg is destroyed, it MAY be resumed. Full resumption is not supported in C-Tor, but is possible to implement, at the expense of always storing roughly a congestion window of already-transmitted data on each endpoint, in the worst case. Simpler forms of resumption, where there is no data loss, are supported. This is important to support latency probing, for ensuring UDP VoIP minimum RTT requirements are met (roughly 300-500ms, depending on VoIP implementation).

Resumption is achieved by re-using the NONCE to the same endpoint (either [LINKING_EXIT] or [LINKING_SERVICE]). The resumed path need not use the same middle and guard relays as the destroyed leg(s), but SHOULD NOT share any relays with any existing leg(s).
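Returning briefly to the SWITCH example in [SEQUENCING] (10 cells on the first leg, then 21 on the second), the bookkeeping can be sketched in Python as follows. The class names are hypothetical, and the sketch assumes the SWITCH cell itself does not consume a sequence number.

```python
class ConfluxSender:
    """Tracks absolute sequence numbers and computes SWITCH SeqNum values."""

    def __init__(self, n_legs=2):
        self.abs_seq = 0                    # last absolute seq sent on any leg
        self.leg_last_seq = [0] * n_legs    # last absolute seq sent per leg
        self.current = 0                    # index of the current sending leg

    def send_cells(self, n):
        self.abs_seq += n
        self.leg_last_seq[self.current] = self.abs_seq

    def switch_to(self, leg):
        """Return the relative SeqNum to put in the SWITCH sent on `leg`."""
        rel = self.abs_seq - self.leg_last_seq[leg]
        self.leg_last_seq[leg] = self.abs_seq
        self.current = leg
        return rel


class ConfluxReceiver:
    """Tracks the last absolute sequence number seen on each leg."""

    def __init__(self, n_legs=2):
        self.leg_last_seq = [0] * n_legs

    def on_switch(self, leg, rel_seq):
        self.leg_last_seq[leg] += rel_seq

    def on_cell(self, leg):
        self.leg_last_seq[leg] += 1
        return self.leg_last_seq[leg]
```

Running the example from the text through ConfluxSender yields SWITCH values of 10 and then 21, leaving the sender's absolute sequence number at 31.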
If data loss has been detected upon a link handshake, resumption can be achieved by sending a switch cell, which is immediately followed by the missing data. Roughly, each endpoint must check:

  - if cell.last_seq_recv <
         min(max(legs.last_seq_sent),max(closed_legs.last_seq_sent)):
    - send a switch cell immediately with missing data:
        (last_seq_sent - cell.last_seq_recv)

If an endpoint does not have this missing data due to memory pressure, that endpoint MUST destroy *both* legs, as this represents unrecoverable data loss.

Re-transmitters MUST NOT re-increment their absolute sent fields while re-transmitting.

It is even possible to resume conflux circuits where both legs have been collapsed using this scheme, if endpoints continue to buffer their unacked package_window data for some time after this close. However, see [TRAFFIC_ANALYSIS] for more details on the full scope of this issue.

If endpoints are buffering package_window data, such data should be given priority to be freed in any oomkiller invocation. See [MEMORY_DOS] for more oomkiller information.

2.8. Data transmission

Most cells in Tor are circuit-specific, and should only be sent on a circuit, even if that circuit is part of a conflux set. Cells that are not multiplexed do not count towards the conflux sequence numbers.

However, in addition to the obvious RELAY_COMMAND_DATA, a subset of cells MUST ALSO be multiplexed, so that their ordering is preserved when they arrive at the other end. These cells do count towards conflux sequence numbers, and are handled in the out-of-order queue, to preserve ordered delivery:

    RELAY_COMMAND_BEGIN
    RELAY_COMMAND_DATA
    RELAY_COMMAND_END
    RELAY_COMMAND_CONNECTED
    RELAY_COMMAND_RESOLVE
    RELAY_COMMAND_RESOLVED
    RELAY_COMMAND_XOFF
    RELAY_COMMAND_XON

Currently, this set is the same as the set of cells that have stream ID, but the property that leads to this requirement is not stream usage by itself, it is that these cells must be ordered with respect to all data on the circuit.
It is not impossible that future relay commands could be invented that don't have stream IDs, but yet must still arrive in order with respect to circuit data cells. Prop#253 is one possible example of such a thing (though we won't be implementing that proposal).

3. Traffic Scheduling [SCHEDULING]

In order to load balance the traffic between the two circuits, the original conflux paper used only RTT. However, with Proposal 324, we will have accurate information on the instantaneous available bandwidth of each circuit leg, as 'cwnd - inflight' (see Section 3 of Proposal 324). We also have the TCP block state of the local OR connection.

We specify two traffic schedulers from the multipath literature and adapt them to Tor: [MINRTT_TOR], and [LOWRTT_TOR]. Additionally, we create low-memory variants of these that aim to minimize the out-of-order queue size at the receiving endpoint.

Additionally, see the [TRAFFIC_ANALYSIS] sections of this proposal for important details on how this selection can be changed, to reduce website traffic fingerprinting.

3.1. MinRTT scheduling [MINRTT_TOR]

This scheduling algorithm is used for the MIN_LATENCY user experience.

It works by always and only sending on the circuit with the current minimum RTT. With this algorithm, conflux should effectively stay on the circuit with the lowest initial RTT, unless that circuit's RTT rises above the RTT of the other circuit (due to relay load or congestion). When the circuit's congestion window is full (ie: cwnd - inflight <= 0), or if the local OR conn blocks, the conflux set stops transmitting and stops reading on edge connections, rather than switch.

This should result in low out-of-order queues in most situations, unless the initial RTTs of the two circuits are very close (basically within the Vegas RTT bounds of queue variance, 'alpha' and 'beta').

3.2. LowRTT Scheduling [LOWRTT_TOR]

This scheduling algorithm is based on [MPTCP]'s LowRTT scheduler.
This algorithm is used for the UX choice of HIGH_THROUGHPUT.

In this algorithm, endpoints send cells on the circuit with lowest RTT that has an unblocked local OR connection, and room in its congestion window (i.e., cwnd - inflight > 0). We stop reading on edge connections only when both congestion windows become full, or when both local OR connections are blocked.

In this way, unlike original conflux, we switch to the secondary circuit without causing congestion either locally, or on either circuit. This improves both load times, and overall throughput. Given a large enough transmission, both circuits are used to their full capacity, simultaneously.

3.3. MinRTT Low-Memory Scheduling [MINRTT_LOWMEM_TOR]

The low memory version of the MinRTT scheduler ensures that we do not perform a switch more often than once per congestion window worth of data.

XXX: Other rate limiting, such as not switching unless the RTT changes by more than X%, may be useful here.

3.4. BLEST Scheduling [BLEST_TOR]

XXX: Something like this might be useful for minimizing OOQ for the UX choice of LOW_MEM_THROUGHPUT, but we might just be able to reduce switching frequency instead.

XXX: We want an algorithm that only uses cwnd instead. This algorithm has issues if the primary cwnd grows while the secondary does not. Expect this section to change.

[BLEST] attempts to predict the availability of the primary circuit, and use this information to reorder transmitted data, to minimize head-of-line blocking in the recipient (and thus minimize out-of-order queues there).

BLEST_TOR uses the primary circuit until the congestion window is full. Then, it uses the relative RTT times of the two circuits to calculate how much data can be sent on the secondary circuit faster than if we just waited for the primary circuit to become available.
This is achieved by computing two variables at the sender:

    rtts = secondary.currRTT / primary.currRTT
    primary_limit = (primary.cwnd + (rtts-1)/2)*rtts

Note: This (rtts-1)/2 factor represents anticipated congestion window growth over this period; it may be different for Tor, depending on the CC algorithm.

If primary_limit < secondary.cwnd - (secondary.package_window + 1), then there is enough space on the secondary circuit to send data faster than we could by waiting for the primary circuit.

XXX: Note that BLEST uses total_send_window where we use secondary.cwnd in this check. total_send_window is min(recv_win, CWND). But since Tor does not use receive windows and instead uses stream XON/XOFF, we only use CWND. There is some concern this may alter BLEST's buffer minimization properties, but since receive windows only matter if the application is slower than Tor, and XON/XOFF will cover that case, hopefully this is fine. If we need to, we could turn [REORDER_SIGNALING] into a receive window indication of some kind, to indicate remaining buffer size.

Otherwise, if the primary_limit condition is not hit, cease reading on source edge connections until SENDME acks come back.

Here is the pseudocode for this:

    while source.has_data_to_send():
      if primary.cwnd > primary.package_window:
        primary.send(source.get_packet())
        continue

      rtts = secondary.currRTT / primary.currRTT
      primary_limit = (primary.cwnd + (rtts-1)/2)*rtts
      if primary_limit < secondary.cwnd - (secondary.package_window+1):
        secondary.send(source.get_packet())
      else:
        break # done for now, wait for SENDME to free up CWND and restart

Note that BLEST also has a parameter lambda that is updated whenever HoL blocking occurs. Because it is expensive and takes significant time to signal this over Tor, we omit this.

4. Security Considerations

4.1. Memory Denial of Service [MEMORY_DOS]

Both reorder queues and retransmit buffers inherently represent a memory denial of service condition.
For [RESUMPTION] retransmit buffers, endpoints that support this feature SHOULD free retransmit information as soon as they get close to memory pressure. This prevents resumption while data is in flight, but will not otherwise harm operation.

In terms of adversarial issues, clients can lie about sequence numbers, sending cells with sequence numbers such that the next expected sequence number is never sent. They can do this repeatedly on many circuits, to exhaust memory at exits. Intermediate relays may also block a leg, allowing cells to traverse only one leg, thus still accumulating at the reorder queue.

In C-Tor we will mitigate this in three ways: via the OOM killer, by the ability for exits to request that clients use the LOW_MEM_LATENCY UX behavior, and by rate limiting the frequency of switching under the LOW_MEM_LATENCY UX style.

When a relay is under memory pressure, the circuit OOM killer SHOULD free and close circuits with the oldest reorder queue data, first. This heuristic was shown to be best during the [SNIPER] attack OOM killer iteration cycle.

The rate limiting under LOW_MEM_LATENCY will be heuristic driven, based on data from Shadow simulations, and live network testing. It is possible that other algorithms may be able to be similarly rate limited.

4.2. Protocol Side Channels [SIDE_CHANNELS]

To understand the decisions we make below with respect to handling potential side channels, it is important to understand a bit of the history of the Tor threat model.

Tor's original threat model completely disregarded all traffic analysis, including protocol side channels, assuming that they were all equally effective, and that diversity of relays was what provided protection. Numerous attack papers have proven this to be an over-generalization.

Protocol side channels are most severe when a circuit is known to be silent, because stateful protocol behavior prevents other normal cells from ever being sent.
In these cases, it is trivial to inject a packet count pattern that has zero false positives. These kinds of side channels are made use of in the Guard discovery literature, such as [ONION_FOUND], and [DROPMARK]. It is even more trivial to manipulate the AES-CTR cipherstream, as per [RACCOON23], until we implement [PROP308].

However, because we do not want to make this problem worse, it is extremely important to be mindful of ways that an adversary can inject new cell commands, as well as ways that the adversary can spawn new circuits arbitrarily.

It is also important, though slightly less so, to be mindful of the uniqueness of new handshakes, as handshakes can be used to classify usage (such as via Onion Service Circuit Fingerprinting). Handshake side channels are only weakly defended, via padding machines for onion services. These padding machines will need to be improved, and this is also scheduled for arti.

Finally, usage-based traffic analysis needs to be considered. This includes things like website traffic fingerprinting, and is covered in [TRAFFIC_ANALYSIS].

4.2.1. Cell Injection Side Channel Mitigations

To avoid [DROPMARK] attacks, several checks must be performed, depending on the cell type. The circuit MUST be closed if any of these checks fail.
RELAY_CONFLUX_LINK:

  • Ensure conflux is enabled
  • Ensure the circuit is an Exit (or Service Rend) circuit
  • Ensure that no previous LINK cell has arrived on this circuit

RELAY_CONFLUX_LINKED:

  • Ensure conflux is enabled
  • Ensure the circuit is client-side
  • Ensure this is an unlinked circuit that sent a LINK command
  • Ensure that the nonce matches the nonce used in the LINK command
  • Ensure that the cell came from the expected hop

RELAY_CONFLUX_LINKED_ACK:

  • Ensure conflux is enabled
  • Ensure that this circuit is not client-side
  • Ensure that the circuit has successfully received its LINK cell
  • Ensure that this circuit has not received a LINKED_ACK yet

RELAY_CONFLUX_SWITCH:

  • If Prop#340 is in use, this cell MUST be packed with a valid multiplexed RELAY_COMMAND cell.
  • XXX: Additional rate limiting per algorithm, after tuning.

4.2.2. Guard Discovery Side Channel Mitigations

In order to mitigate potential guard discovery by malicious exits, clients MUST NOT retry failed unlinked circuit legs for a set more than 'cfx_max_unlinked_leg_retry' times.

4.2.3. Usage-Based Side Channel Discussion

After we have solved all of the zero false positive protocol side channels in Tor, our attention can turn to more subtle, usage-based side channels.

Two potential usage side channels may be introduced by the use of Conflux:

  1. Delay-based side channels, by manipulating switching
  2. Location info leaks through the use of both legs' latencies

To perform delay-based side channels, Exits can simply disregard the RTT or cwnd when deciding to switch legs, thus introducing a pattern of gaps that the Guard node can detect. Guard relays can also delay legs to introduce a pattern into the delivery of cells at the exit relay, by varying the latency of SENDME cells (every 31st cell) to change the distribution of traffic to send information. This attack could be performed in either direction of traffic, to bias traffic load off of a particular Guard.
If an adversary controls both Guards, it could in theory send a binary signal, by alternating delays on each.

However, Tor currently provides no defenses against already existing single-circuit delay-based (or stop-and-start) side channels. It is already the case that on a single circuit, either the Guard or the Exit can simply withhold sending traffic, as per a recognizable pattern. This class of attacks, and a possible defense for them, is discussed in [BACKLIT].

However, circuit padding can also help to obscure these side channels, even if tuned for website fingerprinting. See [TRAFFIC_ANALYSIS] for more details there.

The second class of side channel is where the Exit relay may be able to use the two legs to further infer more information about client location. See [LATENCY_LEAK] for more details. It is unclear at this time how much more severe this is for two paths than just one.

We preserve the ability to disable conflux to and from Exit relays using consensus parameters, if these side channels prove more severe, or if it proves possible to mitigate single-circuit side channels, but not conflux side channels.

4.3. Traffic analysis [TRAFFIC_ANALYSIS]

Even though conflux shows benefits against traffic analysis in [WTF_SPLIT], these gains may be moot if the adversary is able to perform packet counting and timing analysis at guards to guess which specific circuits are linked. In particular, the 3 way handshake in [LINKING_CIRCUITS] may be quite noticeable.

Additionally, the conflux handshake may make onion services stand out more, regardless of the number of stages in the handshake. For this reason, it may be wise to simply address these issues with circuit padding machines during circuit setup (see padding-spec.txt).

Additional traffic analysis considerations arise when combining conflux with padding, for purposes of mitigating traffic fingerprinting.
For this, it seems wise to treat the packet schedulers as another piece of a combined optimization problem in tandem with optimizing padding machines, perhaps introducing randomness or fudge factors into their scheduling, as a parameterized distribution. For details, see https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md

Finally, conflux may exacerbate forms of confirmation-based traffic analysis that close circuits to determine concretely if they were in use, since closing either leg might cause resumption to fail. TCP RST injection can perform this attack on the side, without surveillance capability. [RESUMPTION] with buffering of the inflight unacked package_window data, for retransmit, is a partial mitigation, if endpoints buffer this data for retransmission for a brief time even if both legs close. This buffering seems more feasible for onion services, which are more vulnerable to this attack. However, if the adversary controls the client and is attacking the service in this way, they will notice the resumption re-link at their client, and still obtain confirmation that way.

It seems the only way to fully mitigate these kinds of attacks is with the Snowflake pluggable transport, which provides its own resumption and retransmit behavior. Additionally, Snowflake's use of UDP DTLS also protects against TCP RST injection, which we suspect to be the main vector for such attacks.

In the future, a DTLS or QUIC transport for Tor such as masque could provide similar RST injection resistance, and resumption at Guard/Bridge nodes, as well.

5. Consensus Parameters [CONSENSUS]

  • cfx_enabled
      - Values: 0=off, 1=on
      - Description: Emergency off switch, in case major issues are discovered.

  • cfx_low_exit_threshold
      - Range: 0-10000
      - Description: Fraction out of 10000 that represents the fractional rate of exits that must support protover 5. If the fraction is below this amount, the number of pre-built sets is restricted to 1.
  • cfx_max_linked_set
      - Range: 0-255
      - Description: The total number of linked sets that can be created. 255 means "unlimited".

  • cfx_max_prebuilt_set
      - Range: 0-255
      - Description: The maximum number of pre-built conflux sets to make. This value is overridden by the 'cfx_low_exit_threshold' criteria.

  • cfx_max_unlinked_leg_retry
      - Range: 0-255
      - Description: The maximum number of times to retry an unlinked leg that fails during build or link, to mitigate guard discovery attacks.

  • cfx_num_legs_set
      - Range: 0-255
      - Description: The number of legs to link in a set.

  • cfx_send_pct
      - XXX: Experimental tuning parameter. Subject to change/removal.

  • cfx_drain_pct
      - XXX: Experimental tuning parameter. Subject to change/removal.

7. Tuning Experiments [EXPERIMENTS]

  • conflux_sched & conflux_exits
  • Exit reorder queue size
  • Responsiveness vs throughput tradeoff?
  • Congestion control
  • EWMA and KIST
  • num guards & conflux_circs

Appendix A [ALTERNATIVES]

A.1. Alternative Link Handshake [ALTERNATIVE_LINKING]

The circuit linking in [LINKING_CIRCUITS] could be done as encrypted ntor onionskin extension fields, similar to those used by v3 onions.

This approach has at least four problems:

  i). For onion services, since onionskins traverse the intro circuit and return on the rend circuit, this handshake cannot measure RTT there.
  ii). Since these onionskins are larger, and have no PFS, an adversary at the middle relay knows that the onionskin is for linking, and can potentially try to obtain the onionskin key for attacks on the link.
  iii). It makes linking circuits more fragile, since they could time out due to CBT, or other issues during construction.
  iv). The overhead in processing this onionskin in onionskin queues adds additional time for linking, even in the Exit case, making that RTT potentially noisy.
Additionally, it is not clear that this approach actually saves us anything in terms of setup time, because we can optimize away the linking phase using Proposal 340, to combine initial RELAY_BEGIN cells with RELAY_CIRCUIT_LINK.

A.2. Alternative RTT measurement [ALTERNATIVE_RTT]

Instead of measuring RTTs during [LINKING_CIRCUITS], we could create PING/PONG cells, whose sole purpose is to allow endpoints to measure RTT.

This was rejected for several reasons. First, during circuit use, we already have SENDMEs to measure RTT. Every 100 cells (or 'circwindow_inc' from Proposal 324), we are able to re-measure RTT based on the time between that Nth cell and the SENDME ack. So we only need PING/PONG to measure initial circuit RTT.

If we were able to use onionskins, as per [ALTERNATIVE_LINKING] above, we might be able to specify a PING/PONG/PING handshake solely for measuring initial RTT, especially for onion service circuits.

The reason for not making a dedicated PING/PONG for this purpose is that it is context-free. Even if we were able to use onionskins for linking and resumption, to avoid additional data in the handshake that just measures RTT, we would have to enforce that this PING/PONG/PING only follows the exact form needed by this proposal, at the expected time, and at no other points.

If we do not enforce this specific use of PING/PONG/PING, it becomes another potential side channel, for use in attacks such as [DROPMARK].

In general, Tor is planning to remove current forms of context-free and semantic-free cells from its protocol: https://gitlab.torproject.org/tpo/core/torspec/-/issues/39

We should not add more.

Appendix B: Acknowledgments [ACKNOWLEDGMENTS]

Thanks to Per Hurtig for helping us with the framing of the MPTCP problem space.

Thanks to Simone Ferlin for clarifications on the [BLEST] paper, and for pointing us at the Linux kernel implementation.
Extreme thanks go again to Toke Høiland-Jørgensen, who helped immensely towards our understanding of how the BLEST condition relates to edge connection pushback, and for clearing up many other misconceptions we had.

Finally, thanks to Mashael AlSabah, Kevin Bauer, Tariq Elahi, and Ian Goldberg, for the original [CONFLUX] paper!

References:

[CONFLUX]
    https://freehaven.net/anonbib/papers/pets2013/paper_65.pdf

[BLEST]
    https://olivier.mehani.name/publications/2016ferlin_blest_blocking_estimation_mptcp_scheduler.pdf
    https://opus.lib.uts.edu.au/bitstream/10453/140571/2/08636963.pdf
    https://github.com/multipath-tcp/mptcp/blob/mptcp_v0.95/net/mptcp/mptcp_blest.c

[WTF_SPLIT]
    https://www.comsys.rwth-aachen.de/fileadmin/papers/2020/2020-delacadena-trafficsliver.pdf

[COUPLED]
    https://datatracker.ietf.org/doc/html/rfc6356
    https://www.researchgate.net/profile/Xiaoming_Fu2/publication/230888515_Delay-based_Congestion_Control_for_Multipath_TCP/links/54abb13f0cf2ce2df668ee4e.pdf?disableCoverPage=true
    http://staff.ustc.edu.cn/~kpxue/paper/ToN-wwj-2020.04.pdf
    https://www.thinkmind.org/articles/icn_2019_2_10_30024.pdf
    https://arxiv.org/pdf/1308.3119.pdf

[BACKLIT]
    https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

[LATENCY_LEAK]
    https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf
    https://www.robgjansen.com/publications/howlow-pets2013.pdf

[SNIPER]
    https://www.freehaven.net/anonbib/cache/sniper14.pdf

[DROPMARK]
    https://www.petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf

[RACCOON23]
    https://archives.seul.org/or/dev/Mar-2012/msg00019.html

[ONION_FOUND]
    https://www.researchgate.net/publication/356421302_From_Onion_Not_Found_to_Guard_Discovery/fulltext/619be24907be5f31b7ac194a/From-Onion-Not-Found-to-Guard-Discovery.pdf

[VANGUARDS_ADDON]
    https://github.com/mikeperry-tor/vanguards/blob/master/README_TECHNICAL.md

[PROP324]
    https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/324-rtt-congestion-control.txt

[PROP339]
    https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/339-udp-over-tor.md

[PROP308]
    https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/308-counter-galois-onion.txt
Filename: 330-authority-contact.md
Title: Modernizing authority contact entries
Author: Nick Mathewson
Created: 10 Feb 2021
Status: Open

This proposal suggests changes to interfaces used to describe a directory authority, to better support load balancing and denial-of-service resistance.

(In an appendix, it also suggests an improvement to the description of authority identity keys, to avoid a deprecated cryptographic algorithm.)

Background

There are, broadly, three good reasons to make a directory request to a Tor directory authority:

  • As a relay, to publish a new descriptor.
  • As another authority, to perform part of the voting and consensus protocol.
  • As a relay, to fetch a consensus or a set of (micro)descriptors.

There are some more reasons that are OK-ish:

  • As a bandwidth authority or similar related tool running under the auspices of an authority.
  • As a metrics tool, to fetch directory information.
  • As a liveness checking tool, to make sure the authorities are running.

There are also a number of bad reasons to make a directory request to a Tor directory authority.

  • As a client, to download directory information. (Clients should instead use a directory guard, or a fallback directory if they don't know any directory information at all.)
  • As a tor-related application, to download directory information. (Such applications should instead run a tor client, which can maintain an up-to-date directory much more efficiently.)

Currently, Tor provides two mechanisms for downloading and uploading directory information: the DirPort, and the BeginDir command. A DirPort is an HTTP port on which directory information is served. The BeginDir command is a relay command that is used to send an HTTP stream directly over a Tor circuit.

Historically, we used DirPort for all directory requests. Later, when we needed encrypted or anonymous directory requests, we moved to a "Begin-over-tor" approach, and then to BeginDir. We still use the DirPort directly, however, when relays are connecting to authorities to publish descriptors or download fresh directories. We also use it for voting.

This proposal suggests that instead of having only a single DirPort, authorities should be able to expose a separate contact point for each supported interaction above. By separating these contact points, we can impose separate access controls and rate limits on each, to improve the robustness of the consensus voting process.

Eventually, separate contact points will allow us to do even more: we'll be able to have separate implementations for the upload and download components of the authorities, and keep the voting component mostly offline.

Adding contact points to authorities

Currently, for each directory authority, we ship an authority entry. For example, the entry describing tor26 is:

    "tor26 orport=443 "
      "v3ident=14C131DFC5C6F93646BE72FA1401C02A8DF2E8B4 "
      "ipv6=[2001:858:2:2:aabb:0:563b:1526]:443 "
      "86.59.21.38:80 847B 1F85 0344 D787 6491 A548 92F9 0493 4E4E B85D",

We extend these lines with optional contact point elements as follows:

  • upload=http://IP:port/ A location to publish router descriptors.
  • download=http://IP:port/ A location to use for caches when fetching router descriptors.
  • vote=http://IP:port/ A location to use for authorities when voting.

Each of these contact point elements can appear more than once. If it does, then it describes multiple valid contact points for a given purpose; implementations MAY use any of the contact point elements that they recognize for a given authority.

Implementations SHOULD ignore URL schemes that they do not recognize, and SHOULD ignore hostnames that appear in the place of the IP elements above. (This will make it easier for us to extend these lists in the future.)

If there is no contact point element for a given type, then implementations should fall back to using the main IPv4 addr:port, and/or the IPv6 addr:port if available.

As an extra rule: If more than one authority lists the same upload point, then uploading a descriptor to that upload point counts as having uploaded it to all of those authorities. (This rule will allow multiple authorities to share an upload point in the future, if they decide to do so. We do not need a corresponding rule for voting or downloading, since every authority participates in voting directly, and since there is no notion of "downloading from each authority.")
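The parsing rules above can be sketched roughly as follows. This is only an illustration, not the tor implementation: the function names, the whitespace-split representation of an entry, and the `fallback` argument are assumptions made for the example, and the addresses in the usage note are documentation addresses.

```python
import ipaddress
from urllib.parse import urlsplit

KNOWN_PURPOSES = ("upload", "download", "vote")


def is_ip_literal(host):
    """Return True if host is an IPv4 or IPv6 address rather than a hostname."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def contact_points(entry_fields, fallback):
    """Map each purpose to the contact points an implementation may use.

    entry_fields: whitespace-split tokens of one authority entry (assumed form).
    fallback: list of main "addr:port" strings (IPv4 and/or IPv6).
    """
    points = {purpose: [] for purpose in KNOWN_PURPOSES}
    for field in entry_fields:
        key, sep, value = field.partition("=")
        if not sep or key not in points:
            continue  # not a contact point element (e.g. orport=, v3ident=)
        try:
            url = urlsplit(value)
            port = url.port
        except ValueError:
            continue  # malformed URL or port
        # Ignore unrecognized schemes, and hostnames in place of IPs.
        if url.scheme != "http" or port is None:
            continue
        if not url.hostname or not is_ip_literal(url.hostname):
            continue
        points[key].append(value)
    # No usable element for a purpose: fall back to the main addr:port.
    return {p: (v if v else list(fallback)) for p, v in points.items()}
```

For instance, an entry carrying `upload=http://192.0.2.1:9031/` and `vote=gopher://192.0.2.1:70/` would yield the listed upload point, while voting and downloading would fall back to the main address because the `gopher` scheme is not recognized.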

Authority-side configuration

We add a few flags to DirPort configuration, indicating what kind of requests are acceptable.

  • no-voting
  • no-download
  • no-upload

These flags remove a given set of possible operations from a given DirPort. So for example, an authority might say:

    DirPort 9030 no-download no-upload
    DirPort 9040 no-voting no-upload
    DirPort 9050 no-voting no-download

We can also allow "upload-only" as an alias for "no-voting no-download", and so on.

Note that authorities would need to keep a legacy dirport around until all relays have upgraded.
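As a rough sketch of how these flags might be interpreted, the following treats each flag as removing one operation from a DirPort. Only "upload-only" is named in the text; the "download-only" and "voting-only" aliases are assumptions by analogy with "and so on", and the parser handles only a bare numeric port rather than the full torrc DirPort syntax.

```python
ALL_OPERATIONS = frozenset({"voting", "download", "upload"})

# "upload-only" comes from the text; the other two aliases are assumed.
ALIASES = {
    "upload-only": ("no-voting", "no-download"),
    "download-only": ("no-voting", "no-upload"),
    "voting-only": ("no-download", "no-upload"),
}


def allowed_operations(flags):
    """Return the set of operations a DirPort accepts given its flags."""
    allowed = set(ALL_OPERATIONS)
    for flag in flags:
        # Expand an alias into basic "no-X" flags, or use the flag itself.
        for basic in ALIASES.get(flag, (flag,)):
            if not basic.startswith("no-") or basic[3:] not in ALL_OPERATIONS:
                raise ValueError(f"unknown DirPort flag: {flag}")
            allowed.discard(basic[3:])
    return allowed


def parse_dirport_line(line):
    """Parse 'DirPort PORT [flag ...]' into (port, allowed operations)."""
    keyword, port, *flags = line.split()
    if keyword != "DirPort":
        raise ValueError("not a DirPort line")
    return int(port), allowed_operations(flags)
```

Under this reading, `DirPort 9030 no-download no-upload` describes a voting-only port, and a DirPort with no flags accepts all three kinds of request, which matches the legacy behavior.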

Bridge authorities

This proposal does not yet apply to bridge authorities, since neither clients nor bridges connect to bridge authorities over HTTP. A later proposal may add a schema that can be used to describe contacting a bridge authority via BEGINDIR.

Example uses

Example setup: Simple access control and balancing.

Right now the essential functionality of authorities is sometimes blocked by getting too much load from directory downloads by non-relays. To address this we can proceed as follows. We can have each relay authority open four separate dirports: One for publishing, one for voting, one for downloading, and one legacy port. These can be rate-limited separately, and requests sent to the wrong port can be rejected. We could additionally prioritize voting, then uploads, then downloads. This could be done either within Tor, or with other IP shaping tools.

Example setup: Full authority refactoring

In the future, this system lets us get fancier with our authorities and how they are factored. For example, as in proposal 257, an authority could run upload services, voting, and download services all at separate locations.

The authorities themselves would be the only ones that needed to use their voting protocol. The upload services (run on behalf of authorities or groups of authorities) could receive descriptors and do initial testing on them before passing them on to the authorities. The authorities could then vote with one another, and push the resulting consensus and descriptors to the download services. This would make the download services primarily responsible for serving directory information, and have them take all the load.

Appendix: Cryptographic extensions to authority configuration

The 'v3ident' element, and the relay identity fingerprint in authority configuration, are currently both given as SHA1 digests of RSA keys. SHA1 is currently deprecated: even though we're only relying on second-preimage resistance, we should migrate away.

With that in mind, we're adding two more fields to the authority entries:

  • ed25519-id=BASE64 The ed25519 identity of the authority when it acts as a relay.
  • v3ident-sha3-256=HEX The SHA3-256 digest of the authority's v3 signing key.

(We use base64 here for the ed25519 key since that's what we use elsewhere.)
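A minimal sketch of how the two field values could be produced is shown below. The text does not say which encoding of the key is hashed, nor whether base64 padding is kept, so both choices here (hashing the raw bytes passed in, and stripping '=' padding) are assumptions, as are the function names.

```python
import base64
import hashlib


def v3ident_sha3_256_field(key_bytes):
    """Build a 'v3ident-sha3-256=HEX' element from an encoded key.

    Assumption: key_bytes is whatever encoding of the key the final
    specification chooses; this sketch simply hashes the bytes it is given.
    """
    return "v3ident-sha3-256=" + hashlib.sha3_256(key_bytes).hexdigest().upper()


def ed25519_id_field(public_key_bytes):
    """Build an 'ed25519-id=BASE64' element from a 32-byte public key.

    Assumption: padding is stripped; the proposal only says "base64".
    """
    if len(public_key_bytes) != 32:
        raise ValueError("ed25519 public keys are 32 bytes")
    encoded = base64.b64encode(public_key_bytes).decode("ascii").rstrip("=")
    return "ed25519-id=" + encoded
```

A SHA3-256 digest is 32 bytes, so the hex form is 64 characters; a 32-byte ed25519 key is 43 base64 characters once padding is removed.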

Filename: 331-res-tokens-for-anti-dos.md
Title: Res tokens: Anonymous Credentials for Onion Service DoS Resilience
Author: George Kadianakis, Mike Perry
Created: 11-02-2021
Status: Draft

    +--------------+           +------------------+
    | Token Issuer |           | Onion Service    |
    +--------------+           +------------------+
           ^                            ^
           |        +----------+        |
  Issuance |  1.    |          |   2.   | Redemption
           +------->|  Alice   |<-------+
                    |          |
                    +----------+

0. Introduction

This proposal specifies a simple anonymous credential scheme based on Blind RSA signatures designed to fight DoS abuse against onion services. We call the scheme "Res tokens".

Res tokens are issued by third-party issuance services, and are verified by onion services during the introduction protocol (through the INTRODUCE1 cell).

While Res tokens are used for denial of service protection in this proposal, we demonstrate how they can have application in other Tor areas as well, like improving the IP reputation of Tor exit nodes.

1. Motivation

Denial of service attacks against onion services have been explored in the past and various defenses have been proposed:

  • Tor proposal #305 specifies network-level rate-limiting mechanisms.
  • Onionbalance allows operators to scale their onions horizontally.
  • Tor proposal #327 increases the attacker's computational requirements (not implemented yet).

While the above proposals in tandem should provide reasonable protection against many DoS attackers, they fundamentally work by reducing the asymmetry between the onion service and the attacker. This won't work if the attacker is extremely powerful because the asymmetry is already huge and cutting it down does not help.

We believe that a proposal based on cryptographic guarantees -- like Res tokens -- can offer protection against even extremely strong attackers.

2. Overview

In this proposal we introduce an anonymous credential scheme -- Res tokens -- that is well fitted for protecting onion services against DoS attacks. We also introduce a system where clients can acquire such anonymous credentials from various types of Token Issuers and then redeem them at the onion service to gain access even when under DoS conditions.

In section [TOKEN_DESIGN], we list our requirements from an anonymous credential scheme and provide a high-level overview of how the Res token scheme works.

In section [PROTOCOL_SPEC], we specify the token issuance and redemption protocols, as well as the mathematical operations that need to be conducted for these to work.

In section [TOKEN_ISSUERS], we provide a few examples and guidelines for various token issuer services that could exist.

In section [DISCUSSION], we provide more use cases for Res tokens as well as future improvements we can conduct to the scheme.

3. Design [TOKEN_DESIGN]

In this section we will go over the high-level design of the system, and in the next section we will delve into the lower-level details of the protocol.

3.1. Anonymous credentials

Anonymous credentials or tokens are cryptographic identifiers that allow their bearer to maintain an identity while also preserving anonymity.

Clients can acquire a token in a variety of ways (e.g. registering on a third-party service, solving a CAPTCHA, completing a PoW puzzle) and then redeem it at the onion service, proving in this way that work was done, but without linking the act of token acquisition with the act of token redemption.

3.2. Anonymous credential properties

The anonymous credential literature is vast and there are dozens of credential schemes with different properties REF_TOKEN_ZOO. In this section we detail the properties we care about for this use case:

  • Public Verifiability: Because of the distributed trust properties of the Tor network, we need anonymous credentials that can be issued by one party (the token issuer) and verified by a different party (in this case the onion service).

  • Perfect unlinkability: Unlinkability between token issuance and token redemption is vital in private settings like Tor. For this reason we want our scheme to preserve its unlinkability even if its fundamental security assumption is broken. We want unlinkability to be protected by information theoretic security or random oracle, and not just computational security.

  • Small token size: The tokens will be transferred to the service through the INTRODUCE1 cell which is not flexible and has only a limited amount of space (about 200 bytes) REF_INTRO_SPACE. We need tokens to be small.

  • Quick Verification: Onions are already experiencing resource starvation because of the DoS attacks so it's important that the process of verifying a token should be as quick as possible. In section [TOKEN_PERF] we will go deeper into this requirement.

After careful consideration of the above requirements, we have leaned towards using Blind RSA as the primitive for our tokens, since it's the fastest scheme by far that also allows public verifiability. See also Appendix B [BLIND_RSA_PROOF] for a security proof sketch of Blind RSA perfect unlinkability.

3.3. Other security considerations

Apart from the above properties we also want:

  • Double spending protection: We don't want Mallory to be able to double spend her tokens in various onion services thereby amplifying her attack. For this reason our tokens are not global, and can only be redeemed at a specific destination onion service.

  • Metadata: We want to encode metadata/attributes in the tokens. In particular, we want to encode the destination onion service and an expiration date. For more information see section [DEST_DIGEST]. For blind RSA tokens this is usually done using "partially blind signatures" but to keep it simple we instead encode the destination directly in the message to be blind-signed and the expiration date using a set of rotating signing keys.

  • One-show: There are anonymous credential schemes with multi-show support where one token can be used multiple times in an unlinkable fashion. However, that might allow an adversary to use a single token to launch a DoS attack, since revocation solutions are complex and inefficient in anonymous credentials. For this reason, in this work we use one-show tokens that can only be redeemed once. That takes care of the revocation problem but it means that a client will have to get more tokens periodically.
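The double-spending and one-show properties above amount to two checks at redemption time, sketched below. The class and parameter names are hypothetical, signature verification is assumed to happen elsewhere and is passed in as a boolean, and how the seen-set is bounded or cleared is left open.

```python
class RedemptionLog:
    """Tracks tokens already redeemed at one onion service (illustrative)."""

    def __init__(self, service_id):
        self.service_id = service_id
        self.seen = set()

    def redeem(self, token_id, destination, signature_ok):
        """Accept a token at most once, and only if it names this service."""
        if not signature_ok:
            return False  # not issued by a trusted issuer
        if destination != self.service_id:
            return False  # token was bound to a different onion service
        if token_id in self.seen:
            return False  # one-show: this token was already spent
        self.seen.add(token_id)
        return True
```

Because each token is bound to a single destination, the service needs only its own local record of spent tokens; no coordination between onion services is required to prevent reuse.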

3.4. Res tokens overview

Throughout this proposal we will be using our own token scheme, named "Res", which is based on blind RSA signatures. In this modern cryptographic world, not only do we have the audacity to use Chaum's oldest blind signature scheme of all time, but we are also using RSA with a modulus of 1024 bits...

Res uses only 1024-bit RSA because we care more about small token size and quick verification than about the unforgeability of the token. This means that if an attacker breaks the issuer's RSA signing key and issues tokens for herself, she will be able to launch DoS attacks against onion services, but she won't be able to link users (because of the "perfect unlinkability" property).

Furthermore, Res tokens get a short implicit expiration date by having the issuer rapidly rotate issuance keys every few hours. This means that even if an adversary breaks an issuance key, she will be able to forge tokens for just a few hours before that key expires.

For more ideas on future schemes and improvements see section [FUTURE_RES].

3.5. Token performance requirements [TOKEN_PERF]

As discussed above, verification performance is extremely important in the anti-DoS use case. In this section we provide some concrete numbers on what we are looking for.

In proposal #327 REF_POW_PERF we measured that the total time spent by the onion service on processing a single INTRODUCE2 cell ranges from 5 msec to 15 msecs with a mean time around 5.29 msec. This time also includes the launch of a rendezvous circuit, but does not include the additional blocking and time it takes to process future cells from the rendezvous point.

We also measured that the parsing and validation of an INTRODUCE2 cell ("top half") takes around 0.26 msec; that's the lightweight part before the onion service decides to open a rendezvous circuit and do all the path selection and networking.

This means that any defenses introduced by this proposal should add minimal overhead to the above "top half" procedure, so as to apply access control in the lightest way possible.

For this reason we implemented a basic version of the Res token scheme in Rust and benchmarked the verification and issuance procedure REF_RES_BENCH.

We measured that the verification procedure from section [RES_VERIFY] takes about 0.104 ms, which we believe is a reasonable verification overhead for the purposes of this proposal.

We also measured that the issuance procedure from [RES_ISSUANCE] takes about 0.614 ms.

4. Specification [PROTOCOL_SPEC]

    +--------------+           +------------------+
    | Token Issuer |           |  Onion Service   |
    +--------------+           +------------------+
           ^                            ^
           |        +----------+        |
  Issuance | 1.     |          |   2.   | Redemption
           +------->|  Alice   |<-------+
                    |          |
                    +----------+

4.0. Notation

Let a || b be the concatenation of a with b.

Let a^b denote the exponentiation of a to the bth power.

Let a == b denote a check for equality between a and b.

Let FDH_N(msg) be a Full Domain Hash (FDH) of 'msg' using SHA256 and stretching the digest to be equal to the size of an RSA modulus N.
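The proposal does not pin down how the SHA256 digest is stretched. As a minimal sketch, one common approach is to hash in counter mode and reduce the result modulo N. The counter encoding, the 16 extra bytes, and the toy modulus below are all assumptions made for illustration and are not part of this proposal.

```python
import hashlib

def fdh(msg: bytes, n: int) -> int:
    """Hypothetical FDH_N: SHA-256 in counter mode, reduced modulo n."""
    n_len = (n.bit_length() + 7) // 8
    stream, counter = b"", 0
    # Produce 16 bytes more than the modulus so the reduction is close to uniform.
    while len(stream) < n_len + 16:
        stream += hashlib.sha256(counter.to_bytes(4, "big") + msg).digest()
        counter += 1
    return int.from_bytes(stream[: n_len + 16], "big") % n

# Toy modulus (product of two Mersenne primes), for illustration only.
N = (2**521 - 1) * (2**607 - 1)
digest = fdh(b"destination" + b"salt", N)
```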

4.1. Token issuer setup

The Issuer creates a set of ephemeral RSA-1024 "issuance keys" that will be used during the issuance protocol. Issuers will be rotating these ephemeral keys every 6 hours.

The Issuer exposes the set of active issuance public keys through a REST HTTP API that can be accessed by visiting /issuers.keys.

Tor directory authorities periodically fetch the issuer's public keys and vote for those keys in the consensus so that they are readily available to clients. The keys in the current consensus are considered active, whereas the ones that have fallen off have expired.

XXX how many issuance public keys are active each time? how does overlapping keys work? clients and onions need to know precise expiration date for each key. this needs to be specified and tested for robustness.

XXX how often does the fetch happen? how does the voting work? which issuers are considered official? specify consensus method.

XXX An alternative approach: Issuer has a long-term ed25519 certification key that creates expiring certificates for the ephemeral issuance keys. Alice shows the certificate to the service to prove that the token comes from an issuer. The consensus includes the long-term certification key of the issuers to establish ground truth. This way we avoid the synchronization between dirauths and issuers, and the multiple overlapping active issuance keys. However, certificates might not fit in the INTRODUCE1 cell (prop220 certs take 104 bytes on their own). Also certificate metadata might create a vector for linkability attacks between the issuer and the verifier.

4.2. Onion service signals ongoing DoS attack

When an onion service is under DoS attack it adds the following line to the "encrypted" (inner) part of the v3 descriptor as a way to signal to its clients that tokens are required for gaining access:

    "token-required" SP token-type SP issuer-list NL

    [At most once]

    token-type: The type of token supported ("res" for this proposal)
    issuer-list: A comma-separated list of issuers which are supported by this onion service

4.3. Token issuance

When Alice visits an onion service with an active "token-required" line in its descriptor it checks whether there are any tokens available for this onion service in its token store. If not, it needs to acquire some and hence the token issuance protocol commences.
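As a rough sketch of the client-side check, the "token-required" line from section 4.2 could be parsed as follows. How issuers are named inside issuer-list is not specified here, so the issuer names in the example are hypothetical.

```python
def parse_token_required(line: str):
    """Parse '"token-required" SP token-type SP issuer-list NL'."""
    if not line.endswith("\n"):
        raise ValueError("line must end with NL")
    parts = line[:-1].split(" ")
    if len(parts) != 3 or parts[0] != "token-required":
        raise ValueError("not a well-formed token-required line")
    token_type, issuer_list = parts[1], parts[2]
    issuers = issuer_list.split(",")
    if not token_type or any(not name for name in issuers):
        raise ValueError("empty token type or issuer name")
    return token_type, issuers

# Hypothetical issuer names, for illustration only.
token_type, issuers = parse_token_required("token-required res issuer-a,issuer-b\n")
```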

4.3.1. Client preparation [DEST_DIGEST]

Alice first chooses an issuer supported by the onion service depending on her preferences by looking at the consensus and her Tor configuration file for the current list of active issuers.

After picking a supported issuer, she performs the following preparation before contacting the issuer:

  1. Alice extracts the issuer's public key (N,e) from the consensus

  2. Alice computes a destination digest as follows:

       dest_digest = FDH_N(destination || salt)

       where:
       - 'destination' is the 32-byte ed25519 public identity key of the destination onion
       - 'salt' is a random 32-byte value

  3. Alice samples a blinding factor 'r' uniformly at random from [1, N)

  4. Alice computes: blinded_message = dest_digest * r^e (mod N)

After this phase is completed, Alice has a blinded message that is tailored specifically for the destination onion service. Alice will send the blinded message to the Token Issuer, but because of the blinding the Issuer does not get to learn the dest_digest value.

XXX Is the salt needed? Reevaluate.

4.3.3. Token Issuance [RES_ISSUANCE]

Alice now initiates contact with the Token Issuer and spends the resources required to get issued a token (e.g. solve a CAPTCHA or a PoW, create an account, etc.). After that step is complete, Alice sends the blinded_message to the issuer through a JSON-RPC API.

After the Issuer receives the blinded_message it signs it as follows:

    blinded_signature = blinded_message ^ d (mod N)

    where:
    - 'd' is the private RSA exponent.

and returns the blinded_signature to Alice.

XXX specify API (JSON-RPC? Needs SSL + pubkey pinning.)

4.3.4. Unblinding step

Alice verifies the received blinded signature, and unblinds it to get the final token as follows:

    token = blinded_signature * r^{-1} (mod N)
          = blinded_message ^ d * r^{-1} (mod N)
          = (dest_digest * r^e) ^ d * r^{-1} (mod N)
          = dest_digest ^ d * r * r^{-1} (mod N)
          = dest_digest ^ d (mod N)

    where:
    - r^{-1} is the multiplicative inverse of the blinding factor 'r'

Alice will now use the 'token' to get access to the onion service.

By verifying the received signature using the issuer keys in the consensus, Alice ensures that a legitimate token was received and that it has not expired (since the issuer keys are still in the consensus).
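The issuance round trip of sections 4.3.1 to 4.3.4 can be sketched end to end with plain modular arithmetic. This is only an illustration under stated assumptions: the issuer key is a toy key built from two Mersenne primes (insecure), fdh() is a hypothetical counter-mode construction for FDH_N, and the all-zero destination key is a placeholder.

```python
import hashlib
import secrets
from math import gcd

# Toy issuer key: insecure, for illustration only.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def fdh(msg: bytes, n: int) -> int:
    """Hypothetical FDH_N: SHA-256 in counter mode, reduced modulo n."""
    n_len = (n.bit_length() + 7) // 8
    stream, counter = b"", 0
    while len(stream) < n_len + 16:
        stream += hashlib.sha256(counter.to_bytes(4, "big") + msg).digest()
        counter += 1
    return int.from_bytes(stream[: n_len + 16], "big") % n

def blind(destination: bytes, salt: bytes):
    """Client preparation: returns (dest_digest, r, blinded_message)."""
    dest_digest = fdh(destination + salt, N)
    while True:
        r = secrets.randbelow(N - 1) + 1   # uniform in [1, N)
        if gcd(r, N) == 1:                 # r must be invertible mod N
            break
    return dest_digest, r, (dest_digest * pow(r, E, N)) % N

def issuer_sign(blinded_message: int) -> int:
    """Issuer side: blinded_message ^ d (mod N)."""
    return pow(blinded_message, D, N)

def unblind(blinded_signature: int, r: int) -> int:
    """Client side: blinded_signature * r^{-1} (mod N)."""
    return (blinded_signature * pow(r, -1, N)) % N

destination = bytes(32)            # placeholder for the onion's ed25519 key
salt = secrets.token_bytes(32)
dest_digest, r, blinded_message = blind(destination, salt)
token = unblind(issuer_sign(blinded_message), r)
token_ok = pow(token, E, N) == dest_digest   # Alice's check against (N,e)
```

The issuer only ever sees blinded_message, yet the unblinded token equals dest_digest ^ d (mod N), which is exactly what the derivation above predicts.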

4.4. Token redemption

4.4.1. Alice sends token to onion service

Now that Alice has a valid 'token' it can request access to the onion service. It does so by embedding the token into the INTRODUCE1 cell to the onion service.

To do so, Alice adds an extension to the encrypted portion of the INTRODUCE1 cell by using the EXTENSIONS field (see [PROCESS_INTRO2] section in rend-spec-v3.txt). The encrypted portion of the INTRODUCE1 cell only gets read by the onion service and is ignored by the introduction point.

We propose a new EXT_FIELD_TYPE value:

[02] -- ANON_TOKEN

The EXT_FIELD content format is:

    TOKEN_VERSION    [1 byte]
    ISSUER_KEY       [4 bytes]
    DEST_DIGEST      [32 bytes]
    TOKEN            [128 bytes]
    SALT             [32 bytes]

where:

  • TOKEN_VERSION is the version of the token ([0x01] for Res tokens)
  • ISSUER_KEY is the public key of the chosen issuer (truncated to 4 bytes)
  • DEST_DIGEST is the 'dest_digest' from above
  • TOKEN is the 'token' from above
  • SALT is the 32-byte 'salt' added during blinding

This will increase the INTRODUCE1 payload size by 199 bytes, since the data above is 197 bytes (1 + 4 + 32 + 128 + 32), the extension type and length are 2 extra bytes, and the N_EXTENSIONS field is always present. According to ticket #33650, INTRODUCE1 cells currently have more than 200 bytes available so we should be able to fit the above fields in the cell.
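To make the size accounting concrete, here is a small sketch that packs the extension with the field sizes listed above and a one-byte type and one-byte length header. The function name is hypothetical and the all-zero field values are placeholders.

```python
import struct

EXT_TYPE_ANON_TOKEN = 0x02   # proposed EXT_FIELD_TYPE
TOKEN_VERSION_RES = 0x01     # TOKEN_VERSION for Res tokens

def encode_anon_token_ext(issuer_key: bytes, dest_digest: bytes,
                          token: bytes, salt: bytes) -> bytes:
    """Encode EXT_FIELD_TYPE | EXT_FIELD_LEN | EXT_FIELD for an ANON_TOKEN."""
    lengths = (len(issuer_key), len(dest_digest), len(token), len(salt))
    if lengths != (4, 32, 128, 32):
        raise ValueError("unexpected field length")
    body = struct.pack("!B4s32s128s32s", TOKEN_VERSION_RES,
                       issuer_key, dest_digest, token, salt)
    # Extension type and length take one byte each.
    return struct.pack("!BB", EXT_TYPE_ANON_TOKEN, len(body)) + body

ext = encode_anon_token_ext(bytes(4), bytes(32), bytes(128), bytes(32))
print(len(ext))
```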

XXX maybe we don't need to pass DEST_DIGEST and we can just derive it

XXX maybe with a bit of tweaking we can even use a 1536-bit RSA signature here...

4.4.2. Onion service verifies token [RES_VERIFY]

Upon receiving an INTRODUCE1 cell with the above extension the service verifies the token. It does so as follows:

  1. The service checks its double spend protection cache for an element that matches DEST_DIGEST. If one is found, verification fails.
  2. The service checks: DEST_DIGEST == FDH_N(service_pubkey || SALT), where 'service_pubkey' is its own long-term public identity key.
  3. The service finds the corresponding issuer public key (N,e) based on ISSUER_KEY from the consensus or its configuration file
  4. The service checks: TOKEN ^ e (mod N) == DEST_DIGEST

Finally the onion service adds the DEST_DIGEST to its double spend protection cache to avoid the same token getting redeemed twice. Onion services keep a double spend protection cache by maintaining a sorted array of truncated DEST_DIGEST elements.

If any of the above steps fails, the verification process aborts and the introduction request gets discarded.

If all the above verification steps have been completed successfully, the service knows that this is a valid token issued by the token issuer, and that the token has been created for this onion service specifically. The service considers the token valid and the rest of the onion service protocol carries on as normal.
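The verification steps and the double spend protection cache could look roughly like the sketch below. It treats DEST_DIGEST and TOKEN as integers modulo N and does not model their wire encoding or the issuer key lookup. The toy issuer key, the fdh() construction, the ResVerifier name, and the 8-byte truncation length are all assumptions made for illustration.

```python
import bisect
import hashlib

# Toy issuer key: insecure, for illustration only.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))
N_LEN = (N.bit_length() + 7) // 8

def fdh(msg: bytes, n: int) -> int:
    """Hypothetical FDH_N: SHA-256 in counter mode, reduced modulo n."""
    n_len = (n.bit_length() + 7) // 8
    stream, counter = b"", 0
    while len(stream) < n_len + 16:
        stream += hashlib.sha256(counter.to_bytes(4, "big") + msg).digest()
        counter += 1
    return int.from_bytes(stream[: n_len + 16], "big") % n

class ResVerifier:
    TRUNC = 8  # hypothetical truncation length for cache entries

    def __init__(self, service_pubkey: bytes):
        self.service_pubkey = service_pubkey
        self.seen = []  # sorted list of truncated DEST_DIGEST values

    def verify(self, dest_digest: int, token: int, salt: bytes) -> bool:
        if not 0 <= dest_digest < N:
            return False
        key = dest_digest.to_bytes(N_LEN, "big")[: self.TRUNC]
        i = bisect.bisect_left(self.seen, key)
        if i < len(self.seen) and self.seen[i] == key:
            return False                      # step 1: already redeemed
        if dest_digest != fdh(self.service_pubkey + salt, N):
            return False                      # step 2: not for this service
        if pow(token, E, N) != dest_digest:
            return False                      # step 4: bad signature
        self.seen.insert(i, key)              # remember the redeemed token
        return True

service_pubkey, salt = bytes(32), bytes(32)   # placeholder values
dest_digest = fdh(service_pubkey + salt, N)
token = pow(dest_digest, D, N)                # what unblinding would yield
verifier = ResVerifier(service_pubkey)
first = verifier.verify(dest_digest, token, salt)
second = verifier.verify(dest_digest, token, salt)
```

Here the first redemption succeeds and the replay is rejected by the cache before any RSA operation is performed.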

5. Token issuers [TOKEN_ISSUERS]

In this section we go over some example token issuers. While we can have official token issuers that are supported by the Tor directory authorities, it is also possible to have unofficial token issuers between communities that can be embedded directly into the configuration file of the onion service and the client.

In general, we consider the design of token issuers to be independent from this proposal so we will touch the topic but not go too deep into it.

5.1. CAPTCHA token issuer

A use case resembling the setup of Cloudflare's PrivacyPass would be to have a CAPTCHA service that issues tokens after a successful CAPTCHA solution.

Tor Project, Inc runs https://ctokens.torproject.org which serves hCaptcha CAPTCHAs. When the user solves a CAPTCHA the server gives back a list of tokens. The amount of tokens rewarded for each solution can be tuned based on abuse level.

Clients reach this service via a regular Tor Exit connection, possibly via a dedicated exit enclave-like relay that can only connect to https://ctokens.torproject.org.

Upon receiving tokens, Tor Browser delivers them to the Tor client via the control port, which then stores the tokens into a token cache to be used when connecting to onion services.

In terms of UX, most of the above procedure can be hidden from the user by having Tor Browser do most of the work behind the scenes and only present the CAPTCHA to the user if/when needed (if the user doesn't have tokens available for that destination).

XXX specify control port API between browser and tor

5.2. PoW token issuer

An idea that mixes the CAPTCHA issuer with proposal#327, would be to have a token issuer that accepts PoW solutions and provides tokens as a reward.

This solution tends to be less optimal than applying proposal#327 directly because it doesn't allow us to fine-tune the PoW difficulty based on the attack severity; which is something we are able to do with proposal#327.

However, we can use the fact that token issuance happens over HTTP to introduce more advanced PoW-based concepts. For example, we can design token issuers that accept blockchain shares in exchange for tokens: a system like Monero's Primo could provide DoS protection and also incentivize the token issuer, who can use those shares for pool mining REF_PRIMO.

5.3. Onion service self-issuing

The onion service itself can also issue tokens to its users and then use itself as an issuer for verification. This way it can reward trusted users by giving them tokens for the future. The tokens can be rewarded from within the website of the onion service and passed to the Tor client through the control port, or they can be provided out-of-band for future use (e.g. from a journalist to a future source using a QR code).

Unfortunately, the anonymous credential scheme specified in this proposal is one-show, so the onion service cannot provide a single token that will work for multiple "logins". In the future we can design multi-show credential systems that also have revocation to further facilitate this use case (see [FUTURE_RES] for more info).

6. User Experience

This proposal has user facing UX consequences.

Ideally we want this process to be invisible to the user and things to "just work". This can be achieved with token issuers that don't require manual work by the user (e.g. the PoW issuer, or the onion service itself), since both the token issuance and the token redemption protocols don't require any manual work.

In the cases where manual work is needed by the user (e.g. solving a CAPTCHA) it's ideal if the work is presented to the user right before visiting the destination and only if it's absolutely required. An explanation about the service being under attack should be given to the user when the CAPTCHA is provided.

7. Security

In this section we analyze potential security threats of the above system:

  • An evil client can hoard tokens for hours and unleash them all at once to cause a denial of service attack. We might want to make the key rotation even more frequent if we think that's a possible threat.

  • A trusted token issuer can always DoS an onion service by forging tokens.

  • Overwhelming attacks like "top half attacks" and "hybrid attacks" from proposal#327 are valid for this proposal as well.

  • A bad RNG can completely wreck the unlinkability properties of this proposal.

XXX Actually analyze the above if we think there is merit to listing them

8. Discussion [DISCUSSION]

8.1. Using Res tokens on Exit relays

There are more scenarios within Tor that could benefit from Res tokens; however, we didn't expand on those use cases, to keep the proposal short. In the future, we might want to split this document into two proposals: one proposal that specifies the token scheme, and another that specifies how to use it in the context of onion services, so that we can then write more proposals that use the token scheme as a primitive.

An extremely relevant use case would be to use Res tokens as a way to protect and improve the IP reputation of Exit relays. We can introduce an exit pool that requires tokens in exchange for circuit streams. The idea is that exits that require tokens will see less abuse, and will not have low scores in the various IP address reputation systems that now govern who gets access to websites and web services on the public Internet. We hope that this way we will see fewer websites blocking Tor.

8.2. Future improvements to this proposal [FUTURE_RES]

The Res token scheme is a pragmatic scheme that works for the space/time constraints of this use case but it's far from ideal for the greater future (RSA? RSA-1024?).

After Tor proposal#319 gets implemented we will be able to pack more data in RELAY cells and that opens the door to token schemes with bigger token sizes. For example, we could design schemes based on BBS+ that can provide more advanced features like multi-show and complex attributes but currently have bigger token sizes (300+ bytes). That would greatly improve UX since the client won't have to solve multiple CAPTCHAs to gain access. Unfortunately, another problem here is that right now pairing-based schemes have significantly worse verification performance than RSA (e.g. in the order of 4-5 ms compared to <0.5 ms). We expect pairing-based cryptography performance to only improve in the future and we are looking forward to these advances.

When we switch to a multi-show scheme, we will also need revocation support otherwise a single client can abuse the service with a single multi-show token. To achieve this we would need to use blacklisting schemes based on accumulators (or other primitives) that can provide more flexible revocation and blacklisting; however these come at the cost of additional verification time which is not something we can spare at this time. We warmly welcome research on revocation schemes that are lightweight on the verification side but can be heavy on the proving side.

8.3. Other uses for tokens in Tor

There are more use cases for tokens in Tor but we think that other token schemes with different properties would be better suited for those.

In particular we could use tokens as authentication mechanisms for logging into services (e.g. acquiring bridges, or logging into Wikipedia). However for those use cases we would ideally need multi-show tokens with revocation support. We can also introduce token schemes that help us build a secure name system for onion services.

We hope that more research will be done on how to combine various token schemes together, and how we can maintain agility while using schemes with different primitives and properties.

9. Acknowledgements

Thanks to Jeff Burdges for all the information about Blind RSA and anonymous credentials.

Thanks to Michele Orrù for the help with the unlinkability proof and for the discussions about anonymous credentials.

Thanks to Chelsea Komlo for pointing towards anonymous credentials in the context of DoS defenses for onion services.


Appendix A: RSA Blinding Security Proof [BLIND_RSA_PROOF]

This proof sketch was provided by Michele Orrù:

RSA Blind Sigs: https://en.wikipedia.org/wiki/Blind_signature#Blind_RSA_signatures

As you say, blind RSA should be perfectly blind.

I tried to look at Boneh-Shoup, Katz-Lindell, and Bellare-Goldwasser for a proof, but didn't find any :(

The basic idea is proving that: for any message "m0" that is blinded with "r0^e" to obtain "b" (that is sent to the server), it is possible to freely choose another message "m1" that blinded with another opening "r1^e" to obtain the same "b".

As long as r1, r0 are chosen uniformly at random, you have no way of telling if what message was picked and therefore it is *perfectly* blind.

To do so: Assume the messages ("m0" and "m1") are invertible mod N=pq (this happens at most with overwhelming probability phi(N)/N if m is uniformly distributed as a result of a hash, or you can enforce it at signing time).

Blinding happens by computing:

    b = m0 * (r0^e).

However, I can also write:

    b = m0 * r0^e = (m1/m1) * m0 * r0^e = m1 * (m0/m1*r0^e).

This means that r1 = (m0/m1)^d * r0 is another valid blinding factor for b, and it's distributed exactly as r0 in the group of invertibles (it's unif at random, because r0 is so).

https://www.monerooutreach.org/stories/RPC-Pay.html

Filename: 332-ntor-v3-with-extra-data.md
Title: Ntor protocol with extra data, version 3.
Author: Nick Mathewson
Created: 12 July 2021
Status: Closed

Overview

The ntor handshake is our current protocol for circuit establishment.

So far we have two variants of the ntor handshake in use: the "ntor v1" that we use for everyday circuit extension (see tor-spec.txt) and the "hs-ntor" that we use for v3 onion service handshake (see rend-spec-v3.txt). This document defines a third version of ntor, adapting the improvements from hs-ntor for use in regular circuit establishment.

These improvements include:

  • Support for sending additional encrypted and authenticated protocol-setup handshake data as part of the ntor handshake. (The information sent from the client to the relay does not receive forward secrecy.)

  • Support for using an external shared secret that both parties must know in order to complete the handshake. (In the HS handshake, this is the subcredential. We don't use it for circuit extension, but in theory we could.)

  • Providing a single specification that can, in the future, be used both for circuit extension and HS introduction.

The improved protocol: an abstract view

Given a client "C" that wants to construct a circuit to a relay "S":

The client knows:

  • B: a public "onion key" for S
  • ID: an identity for S, represented as a fixed-length byte string.
  • CM: a message that it wants to send to S as part of the handshake.
  • An optional "verification" string.

The relay knows:

  • A set of [(b,B)...] "onion key" keypairs. One of them is "current", the others are outdated, but still valid.
  • ID: Its own identity.
  • A function for computing a server message SM, based on a given client message.
  • An optional "verification" string. This must match the "verification" string from the client.

Both parties have a strong source of randomness.

Given this information, the client computes a "client handshake" and sends it to the relay.

The relay then uses its information plus the client handshake to see if the incoming message is valid; if it is, then it computes a "server handshake" to send in reply.

The client processes the server handshake, and either succeeds or fails.

At this point, the client and the relay both have access to:

  • CM (the message the client sent)
  • SM (the message the relay sent)
  • KS (a shared byte stream of arbitrary length, used to compute keys to be used elsewhere in the protocol).

Additionally, the client knows that CM was sent only to the relay whose public onion key is B, and that KS is shared only with that relay.

The relay does not know which client participated in the handshake, but it does know that CM came from the same client that generated the key X, and that SM and KS were shared only with that client.

Both parties know that CM, SM, and KS were shared correctly, or not at all.

Both parties know that they used the same verification string; if they did not, they do not learn what the verification string was. (This feature is required for HS handshakes.)

The handshake in detail

Notation

We use the following notation:

  • | -- concatenation
  • "..." -- a byte string, with no terminating NUL.
  • ENCAP(s) -- an encapsulation function. We define this as htonll(len(s)) | s. (Note that len(ENCAP(s)) = len(s) + 8).
  • PARTITION(s, n1, n2, n3, ...) -- a function that partitions a bytestring s into chunks of length n1, n2, n3, and so on. Extra data is put into a final chunk. If s is not long enough, the function fails.
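As an illustration, the two helper functions above might be written as follows. The function names are ours. This partition() always returns the leftover data as a final (possibly empty) chunk, which is one possible reading of "Extra data is put into a final chunk".

```python
import struct

def encap(s: bytes) -> bytes:
    """ENCAP(s) = htonll(len(s)) | s, i.e. an 8-byte big-endian length prefix."""
    return struct.pack("!Q", len(s)) + s

def partition(s: bytes, *lengths: int) -> list:
    """Split s into chunks of the given lengths, plus a final chunk of extra data."""
    if len(s) < sum(lengths):
        raise ValueError("input too short")
    chunks, pos = [], 0
    for n in lengths:
        chunks.append(s[pos:pos + n])
        pos += n
    chunks.append(s[pos:])   # whatever is left over
    return chunks

print(encap(b"abc"))
print(partition(b"abcdef", 2, 3))
```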

We require the following crypto operations:

  • KDF(s,t) -- a tweakable key derivation function, returning a keystream of arbitrary length.
  • H(s,t) -- a tweakable hash function of output length DIGEST_LEN.
  • MAC(k, msg, t) -- a tweakable message-authentication-code function, with key length MAC_KEY_LEN and output length MAC_LEN.
  • EXP(pk,sk) -- our Diffie Hellman group operation, taking a public key of length PUB_KEY_LEN.
  • KEYGEN() -- our Diffie-Hellman keypair generation algorithm, returning a (secret-key,public-key) pair.
  • ENC(k, m) -- a stream cipher with key of length ENC_KEY_LEN. DEC(k, m) is its inverse.

Parameters:

  • PROTOID -- a short protocol identifier
  • t_* -- a set of "tweak" strings, used to derive distinct hashes from a single hash function.
  • ID_LEN -- the length of an identity key that uniquely identifies a relay.

Given our cryptographic operations and a set of tweak strings, we define:

    H_foo(s) = H(s, t_foo)
    MAC_foo(k, msg) = MAC(k, msg, t_foo)
    KDF_foo(s) = KDF(s, t_foo)

See Appendix A.1 below for a set of instantiations for these operations and constants.

Client operation, phase 1

The client knows:

  • B, ID -- the onion key and ID of the relay it wants to use.
  • CM -- the message that it wants to send as part of its handshake.
  • VER -- a verification string.

First, the client generates a single-use keypair:

x,X = KEYGEN()

and computes:

    Bx = EXP(B,x)
    secret_input_phase1 = Bx | ID | X | B | PROTOID | ENCAP(VER)
    phase1_keys = KDF_msgkdf(secret_input_phase1)
    (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN)
    encrypted_msg = ENC(ENC_K1, CM)
    msg_mac = MAC_msgmac(MAC_K1, ID | B | X | encrypted_msg)

and sends:

    NODEID      ID             [ID_LEN bytes]
    KEYID       B              [PUB_KEY_LEN bytes]
    CLIENT_PK   X              [PUB_KEY_LEN bytes]
    MSG         encrypted_msg  [len(CM) bytes]
    MAC         msg_mac        [last MAC_LEN bytes of message]

The client remembers x, X, B, ID, Bx, and msg_mac.

Server operation

The relay checks whether NODEID is as expected, and looks up the (b,B) keypair corresponding to KEYID. If the keypair is missing or the NODEID is wrong, the handshake fails.

Now the relay uses X=CLIENT_PK to compute:

    Xb = EXP(X,b)
    secret_input_phase1 = Xb | ID | X | B | PROTOID | ENCAP(VER)
    phase1_keys = KDF_msgkdf(secret_input_phase1)
    (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN)
    expected_mac = MAC_msgmac(MAC_K1, ID | B | X | MSG)

If expected_mac is not MAC, the handshake fails. Otherwise the relay computes CM as:

    CM = DEC(ENC_K1, MSG)

The relay then checks whether CM is well-formed, and in response composes SM, the reply that it wants to send as part of the handshake. It then generates a new ephemeral keypair:

y,Y = KEYGEN()

and computes the rest of the handshake:

    Xy = EXP(X,y)
    secret_input = Xy | Xb | ID | B | X | Y | PROTOID | ENCAP(VER)
    ntor_key_seed = H_key_seed(secret_input)
    verify = H_verify(secret_input)
    RAW_KEYSTREAM = KDF_final(ntor_key_seed)
    (ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...)
    encrypted_msg = ENC(ENC_KEY, SM)
    auth_input = verify | ID | B | Y | X | MAC | ENCAP(encrypted_msg) | PROTOID | "Server"
    AUTH = H_auth(auth_input)

The relay then sends:

    Y      Y              [PUB_KEY_LEN bytes]
    AUTH   AUTH           [DIGEST_LEN bytes]
    MSG    encrypted_msg  [len(SM) bytes, up to end of the message]

The relay uses KEYSTREAM to generate the shared secrets for the newly created circuit.

Client operation, phase 2

The client computes:

    Yx = EXP(Y, x)
    secret_input = Yx | Bx | ID | B | X | Y | PROTOID | ENCAP(VER)
    ntor_key_seed = H_key_seed(secret_input)
    verify = H_verify(secret_input)
    auth_input = verify | ID | B | Y | X | MAC | ENCAP(MSG) | PROTOID | "Server"
    AUTH_expected = H_auth(auth_input)

If AUTH_expected is equal to AUTH, then the handshake has succeeded. The client can then calculate:

    RAW_KEYSTREAM = KDF_final(ntor_key_seed)
    (ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...)
    SM = DEC(ENC_KEY, MSG)

SM is the message from the relay, and the client uses KEYSTREAM to generate the shared secrets for the newly created circuit.

Security notes

Whenever comparing bytestrings, implementations SHOULD use a constant-time comparison function to avoid side-channel attacks.

To avoid small-subgroup attacks against the Diffie-Hellman function, implementations SHOULD either:

  • Make sure that all incoming group members are in fact in the DH group.
  • Validate all outputs from the EXP function to make sure that they are not degenerate.
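As a minimal sketch of both recommendations, Python's hmac.compare_digest provides a constant-time comparison, and for curve25519 a degenerate EXP output is the all-zero string (the check described in RFC 7748). The helper names below are ours and not part of this specification.

```python
import hmac

def macs_equal(expected: bytes, received: bytes) -> bool:
    """Compare two MACs (or AUTH values) without leaking where they differ."""
    return hmac.compare_digest(expected, received)

def dh_output_is_degenerate(shared: bytes) -> bool:
    """True if an EXP() output is all zeros, as produced by small-order inputs."""
    return hmac.compare_digest(shared, bytes(len(shared)))

good_mac = bytes(range(32))
tampered = bytes(range(1, 33))
print(macs_equal(good_mac, good_mac), macs_equal(good_mac, tampered))
```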

Notes on usage

We don't specify what should actually be done with the resulting keystreams; that depends on the usage for which this handshake is employed. Typically, they'll be divided up into a series of tags and symmetric keys.

The keystreams generated here are (conceptually) unlimited. In practice, the usage will determine the amount of key material actually needed: that's the amount that clients and relays will actually generate.

The PROTOID parameter should be changed not only if the cryptographic operations change here, but also if the usage changes at all, or if the meaning of any parameters changes. (For example, if the encoding of CM and SM changed, or if ID were a different length or represented a different type of key, then we should start using a new PROTOID.)

A.1 Instantiation

Here is a set of functions based on SHA3, SHAKE-256, Curve25519, and AES256:

    H(s, t) = SHA3_256(ENCAP(t) | s)
    MAC(k, msg, t) = SHA3_256(ENCAP(t) | ENCAP(k) | msg)
    KDF(s, t) = SHAKE_256(ENCAP(t) | s)
    ENC(k, m) = AES_256_CTR(k, m)

    EXP(pk,sk), KEYGEN: defined as in curve25519

    DIGEST_LEN = MAC_LEN = MAC_KEY_LEN = ENC_KEY_LEN = PUB_KEY_LEN = 32

    ID_LEN = 32  (representing an ed25519 identity key)

Notes on selected operations: SHA3 can be pretty slow, and AES256 is likely overkill. I'm choosing them anyway because they are what we use in hs-ntor, and in my preliminary experiments they don't account for even 1% of the time spent on this handshake.

    t_msgkdf = PROTOID | ":kdf_phase1"
    t_msgmac = PROTOID | ":msg_mac"
    t_key_seed = PROTOID | ":key_seed"
    t_verify = PROTOID | ":verify"
    t_final = PROTOID | ":kdf_final"
    t_auth = PROTOID | ":auth_final"
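The hash-based parts of this instantiation map directly onto Python's hashlib. The sketch below derives the phase-1 keys and msg_mac for a Diffie-Hellman output Bx that is assumed to have been computed already; curve25519 and AES-256-CTR are not in the Python standard library, so EXP and ENC are left out. It uses the PROTOID value given in A.2, and all input values are placeholders rather than the test vectors.

```python
import hashlib
import struct

PROTOID = b"ntor3-curve25519-sha3_256-1"   # value from A.2
T_MSGKDF = PROTOID + b":kdf_phase1"
T_MSGMAC = PROTOID + b":msg_mac"

def encap(s: bytes) -> bytes:
    return struct.pack("!Q", len(s)) + s

def h(s: bytes, t: bytes) -> bytes:
    return hashlib.sha3_256(encap(t) + s).digest()

def mac(k: bytes, msg: bytes, t: bytes) -> bytes:
    return hashlib.sha3_256(encap(t) + encap(k) + msg).digest()

def kdf(s: bytes, t: bytes, length: int) -> bytes:
    # SHAKE-256 is an extendable-output function; we take the first `length` bytes.
    return hashlib.shake_256(encap(t) + s).digest(length)

def phase1_keys(bx: bytes, node_id: bytes, x_pub: bytes, b_pub: bytes, ver: bytes):
    """Return (ENC_K1, MAC_K1) from secret_input_phase1."""
    secret_input = bx + node_id + x_pub + b_pub + PROTOID + encap(ver)
    keys = kdf(secret_input, T_MSGKDF, 64)
    return keys[:32], keys[32:]

# Placeholder inputs: these are NOT the test vectors from this proposal.
bx, node_id = b"\x01" * 32, b"\x02" * 32
x_pub, b_pub = b"\x03" * 32, b"\x04" * 32
enc_k1, mac_k1 = phase1_keys(bx, node_id, x_pub, b_pub, b"xyzzy")
encrypted_msg = b"ciphertext placeholder"
msg_mac = mac(mac_k1, node_id + b_pub + x_pub + encrypted_msg, T_MSGMAC)
```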

A.2 Encoding for use with Tor circuit extension

Here we give a concrete instantiation of ntor-v3 for use with circuit extension in Tor, and the parameters in A.1 above.

If in use, this is a new CREATE2 type. Clients should not use it unless the relay advertises support by including an appropriate version of the Relay=X subprotocol in its protocols list.

When the encoding and methods of this section, along with the instantiations from the previous section, are in use, we specify:

PROTOID = "ntor3-curve25519-sha3_256-1"

The key material is extracted as follows, unless modified by the handshake (see below). See tor-spec.txt for more info on the specific values:

    Df    Digest authentication, forwards   [20 bytes]
    Db    Digest authentication, backwards  [20 bytes]
    Kf    Encryption key, forwards          [16 bytes]
    Kb    Encryption key, backwards         [16 bytes]
    KH    Onion service nonce               [20 bytes]

We use the following meta-encoding for the contents of client and server messages.

    [Any number of times]:
    EXTENSION
        EXT_FIELD_TYPE  [one byte]
        EXT_FIELD_LEN   [one byte]
        EXT_FIELD       [EXT_FIELD_LEN bytes]

(EXT_FIELD_LEN may be zero, in which case EXT_FIELD is absent.)

All parties MUST reject messages that are not well-formed per the rules above.

We do not specify specific TYPE semantics here; we leave those for other proposals and specifications.

Parties MUST ignore extensions with EXT_FIELD_TYPE bodies they do not recognize.

Unless otherwise specified in the documentation for an extension type:

  • Each extension type SHOULD be sent only once in a message.
  • Parties MUST ignore any occurrences of an extension with a given type after the first such occurrence.
  • Extensions SHOULD be sent in numerically ascending order by type.

(The above extension sorting and multiplicity rules are only defaults; they may be overridden in the description of individual extensions.)
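A parser that follows the default rules above could be sketched as follows. The function name is ours. It rejects truncated input, keeps only the first occurrence of each type, and leaves it to the caller to skip types it does not recognize.

```python
def parse_extensions(msg: bytes) -> dict:
    """Parse a sequence of EXT_FIELD_TYPE | EXT_FIELD_LEN | EXT_FIELD entries."""
    exts, pos = {}, 0
    while pos < len(msg):
        if len(msg) - pos < 2:
            raise ValueError("truncated extension header")
        ext_type, ext_len = msg[pos], msg[pos + 1]
        pos += 2
        if len(msg) - pos < ext_len:
            raise ValueError("truncated extension body")
        body = msg[pos:pos + ext_len]
        pos += ext_len
        exts.setdefault(ext_type, body)   # ignore repeats of the same type
    return exts

# Type 1 with a 2-byte body, type 2 with an empty body, then a repeat of type 1.
sample = bytes([1, 2, 0xAA, 0xBB, 2, 0, 1, 1, 0xCC])
parsed = parse_extensions(sample)
```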

A.3 How much space is available?

We start with a 498-byte payload in each relay cell.

The header of the EXTEND2 cell, including link specifiers and other headers, comes to 89 bytes.

The client handshake requires 128 bytes (excluding CM).

That leaves 281 bytes, "which should be plenty".
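The arithmetic behind these numbers can be checked directly. The 128-byte client handshake is the sum of the NODEID, KEYID, CLIENT_PK, and MAC fields under the A.1 instantiation; the variable names below are ours.

```python
RELAY_PAYLOAD_LEN = 498          # payload bytes in each relay cell
EXTEND2_HEADER_LEN = 89          # EXTEND2 header, including link specifiers
ID_LEN = PUB_KEY_LEN = MAC_LEN = 32

# NODEID + KEYID + CLIENT_PK + MAC, excluding the encrypted CM itself.
client_handshake_len = ID_LEN + 2 * PUB_KEY_LEN + MAC_LEN
available_for_cm = RELAY_PAYLOAD_LEN - EXTEND2_HEADER_LEN - client_handshake_len
print(client_handshake_len, available_for_cm)
```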

X.1 Negotiating proposal-324 circuit windows

(We should move this section into prop324 when this proposal is finished.)

We define a type value, CIRCWINDOW_INC.

We define a triplet of consensus parameters: circwindow_inc_min, circwindow_inc_max, and circwindow_inc_dflt. These all have range (1,65535).

When the authority operators want to experiment with different values for circwindow_inc_dflt, they set circwindow_inc_min and circwindow_inc_max to the range in which they want to experiment, making sure that the existing circwindow_inc_dflt is within that range.

When a client sees that a relay supports the ntor3 handshake type (subprotocol Relay=X), and also supports the flow control algorithms of proposal 324 (subprotocol FlowCtrl=X), then the client sends a message, with type CIRCWINDOW_INC, containing a two-byte integer equal to circwindow_inc_dflt.

The relay rejects the message if the value given is outside of the [circwindow_inc_min, circwindow_inc_max] range. Otherwise, it accepts it, and replies with the same message that the client sent.
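The relay-side check could look roughly like the sketch below. The function name is hypothetical, and two details are assumptions rather than anything this section states: that the two-byte integer is big-endian (network order), and that a body of the wrong length is rejected.

```python
from typing import Optional


def relay_check_circwindow_inc(body: bytes, inc_min: int,
                               inc_max: int) -> Optional[bytes]:
    """Return the reply body, or None if the message is rejected.

    `body` is the EXT_FIELD of a CIRCWINDOW_INC message; `inc_min` and
    `inc_max` are the circwindow_inc_min and circwindow_inc_max
    consensus parameters.
    """
    if len(body) != 2:
        return None  # Assumption: a malformed length is a rejection.
    value = int.from_bytes(body, "big")  # Assumption: network byte order.
    if not (inc_min <= value <= inc_max):
        return None
    # On acceptance, the relay echoes the client's message.
    return body
```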

X.2: Test vectors

The following test values, in hex, were generated by a Python reference implementation.

Inputs:

    b = "4051daa5921cfa2a1c27b08451324919538e79e788a81b38cbed097a5dff454a"
    B = "f8307a2bc1870b00b828bb74dbb8fd88e632a6375ab3bcd1ae706aaa8b6cdd1d"
    ID = "9fad2af287ef942632833d21f946c6260c33fae6172b60006e86e4a6911753a2"
    x = "b825a3719147bcbe5fb1d0b0fcb9c09e51948048e2e3283d2ab7b45b5ef38b49"
    X = "252fe9ae91264c91d4ecb8501f79d0387e34ad8ca0f7c995184f7d11d5da4f46"
    CM = "68656c6c6f20776f726c64"
    VER = "78797a7a79"
    y = "4865a5b7689dafd978f529291c7171bc159be076b92186405d13220b80e2a053"
    Y = "4bf4814326fdab45ad5184f5518bd7fae25dc59374062698201a50a22954246d"
    SM = "486f6c61204d756e646f"

Intermediate values:

    ENC_K1 = "4cd166e93f1c60a29f8fb9ec40ea0fc878930c27800594593e1c4d0f3b5fbd02"
    MAC_K1 = "f5b69e85fdd26e1b0bdbbc8128e32d8123040255f11f744af3cc98fc13613cda"
    msg_mac = "9e044d53565f04d82bbb3bebed3d06cea65db8be9c72b68cd461942088502f67"
    key_seed = "b9a092741098e1f5b8ab37ce74399dd57522c974d7ae4626283a1077b9273255"
    verify = "1dc09fb249738a79f1bc3a545eee8c415f27213894a760bb4df58862e414799a"
    ENC_KEY (server) = "cab8a93eef62246a83536c4384f331ec26061b66098c61421b6cae81f4f57c56"
    AUTH = "2fc5f8773ca824542bc6cf6f57c7c29bbf4e5476461ab130c5b18ab0a9127665"

Messages:

client_handshake = "9fad2af287ef942632833d21f946c6260c33fae6172b60006e86e4a6911753a2f8307a2bc1870b00b828bb74dbb8fd88e632a6375ab3bcd1ae706aaa8b6cdd1d252fe9ae91264c91d4ecb8501f79d0387e34ad8ca0f7c995184f7d11d5da4f463bebd9151fd3b47c180abc9e044d53565f04d82bbb3bebed3d06cea65db8be9c72b68cd461942088502f67"

server_handshake = "4bf4814326fdab45ad5184f5518bd7fae25dc59374062698201a50a22954246d2fc5f8773ca824542bc6cf6f57c7c29bbf4e5476461ab130c5b18ab0a91276651202c3e1e87c0d32054c"

First 256 bytes of keystream:

KEYSTREAM = "9c19b631fd94ed86a817e01f6c80b0743a43f5faebd39cfaa8b00fa8bcc65c3bfeaa403d91acbd68a821bf6ee8504602b094a254392a07737d5662768c7a9fb1b2814bb34780eaee6e867c773e28c212ead563e98a1cd5d5b4576f5ee61c59bde025ff2851bb19b721421694f263818e3531e43a9e4e3e2c661e2ad547d8984caa28ebecd3e4525452299be26b9185a20a90ce1eac20a91f2832d731b54502b09749b5a2a2949292f8cfcbeffb790c7790ed935a9d251e7e336148ea83b063a5618fcff674a44581585fd22077ca0e52c59a24347a38d1a1ceebddbf238541f226b8f88d0fb9c07a1bcd2ea764bbbb5dacdaf5312a14c0b9e4f06309b0333b4a"

    Filename: 333-vanguards-lite.md
    Title: Vanguards lite
    Author: George Kadianakis, Mike Perry
    Created: 2021-05-20
    Status: Finished
    Implemented-In: 0.4.7.1-alpha

0. Introduction & Motivation

This proposal specifies a simplified version of Proposal 292 "Mesh-based vanguards" for the purposes of implementing it directly into the C Tor codebase.

For more details on guard discovery attacks and how vanguards defend against them, we refer to Proposal 292 [PROP292_REF].

1. Overview

We propose an identical system to the Mesh-based Vanguards from proposal 292, but with the following differences:

  • No third layer of guards is used.
  • The Layer2 lifetime uses the max(x,x) distribution with a minimum of one day and maximum of 12 days. This makes the average lifetime approximately a week.
  • We let NUM_LAYER2_GUARDS=4. We also introduce a consensus parameter guard-hs-l2-number that controls the number of layer2 guards (with a maximum of 19 layer2 guards).
  • We don't write guards on disk. This means that the guard topology resets when tor restarts.
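The Layer2 lifetime rule can be sketched as below, to show where "approximately a week" comes from. The sketch assumes that "max(x,x)" means the larger of two independent uniform draws over the allowed range; the constant and function names are illustrative, not taken from any Tor implementation.

```python
import random

DAY = 24 * 60 * 60            # seconds
L2_LIFETIME_MIN = 1 * DAY     # minimum of one day
L2_LIFETIME_MAX = 12 * DAY    # maximum of 12 days


def sample_l2_lifetime(rng: random.Random) -> float:
    """Draw a lifetime in seconds from the max(x,x) distribution.

    Assumption: x is uniform on [L2_LIFETIME_MIN, L2_LIFETIME_MAX],
    and the two draws are independent.
    """
    a = rng.uniform(L2_LIFETIME_MIN, L2_LIFETIME_MAX)
    b = rng.uniform(L2_LIFETIME_MIN, L2_LIFETIME_MAX)
    return max(a, b)


def expected_l2_lifetime() -> float:
    """Mean of the max of two uniform draws: lo + (2/3) * (hi - lo)."""
    return L2_LIFETIME_MIN + (2.0 / 3.0) * (L2_LIFETIME_MAX - L2_LIFETIME_MIN)
```

Under these assumptions the mean works out to 1 + (2/3) * 11 days, a little over eight days, which is consistent with the "approximately a week" figure. Taking the maximum of two draws skews lifetimes toward the long end of the range.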

By avoiding a third-layer of guards we avoid most of the linkability issues of Proposal 292. This means that we don't add an extra hop on top of most of our onion service paths, which increases performance. However, we do add an extra middle hop at the end of service-side introduction circuits to avoid linkability of L2s by the intro points.

This is what onion service circuits look like with this proposal:

    Client rend:   C -> G -> L2 -> Rend
    Client intro:  C -> G -> L2 -> M -> Intro
    Client hsdir:  C -> G -> L2 -> M -> HSDir
    Service rend:  C -> G -> L2 -> M -> Rend
    Service intro: C -> G -> L2 -> M -> Intro
    Service hsdir: C -> G -> L2 -> M -> HSDir

2. Rotation Period Analysis

From the table in Section 3.1 of Proposal 292, with NUM_LAYER2_GUARDS=4, the Sybil attack on Layer2 will complete with 50% chance in 18*7 days (126 days) for the 1% adversary, 4*7 days (one month) for the 5% adversary, and 2*7 days (two weeks) for the 10% adversary.

3. Tradeoffs from Proposal 292

This proposal has several advantages over Proposal 292:

By avoiding a third-layer of guards we reduce the linkability issues of Proposal 292, which means that we don't have to add an extra hop on top of our paths. This simplifies engineering and makes paths shorter by default: this means less latency and quicker page load times.

This proposal also comes with disadvantages:

The lack of third-layer guards makes it easier to launch guard discovery attacks against clients and onion services. Long-lived services are not well protected, and this proposal might provide those services with a false sense of security. Such services should still use the vanguards addon [VANGUARDS_REF].

4. Implementation nuances

Tor replaces an L2 vanguard whenever it is no longer listed in the most recent consensus, with the goal that we will always have the right number of vanguards ready to be used.

For implementation reasons, we also replace a vanguard if it loses the Fast or Stable flag, because the path selection logic wants middle nodes to have those flags when it's building preemptive vanguard-using circuits.

The design doesn't have to be this way: we might instead have chosen to keep vanguards in our list as long as possible, and continue to use them even if they have lost some flags. This tradeoff is similar to the one in https://bugs.torproject.org/17773 about whether to continue using Entry Guards if they lose the Guard flag -- and Tor's current choice is "no, rotate" for that case too.

5. References

    Filename: 334-middle-only-flag.txt
    Title: A Directory Authority Flag To Mark Relays As Middle-only
    Author: Neel Chauhan
    Created: 2021-09-07
    Status: Superseded
    Superseded-by: 335-middle-only-redux.md

1. Introduction

The Health Team often deals with a large number of relays with an incorrect configuration (e.g. not all relays in MyFamily), or needs validation that requires contacting the relay operator. It is desirable to put the said relays in a less powerful position, such as a middle only flag that prevents a relay from being used in more powerful positions like an entry guard or an exit relay. [1]

1.1. Motivation

The proposed middle-only flag is needed by the Health Team to prevent misconfigured relays from being used in positions capable of deanonymizing users while the team evaluates the relay's risk to the network. An example of this scenario is when a guard and exit relay run by the same operator has an incomplete MyFamily, and the same operator's guard and exit are used in a circuit.

The reason why we won't play with the Guard and Exit flags or weights to achieve the same goal is because even if we were to reduce the guard and exit weights of a misconfigured relay, it could keep some users at risk of deanonymization. Even a small fraction of users at risk of deanonymization isn't something we should aim for.

One case we could look out for is if all relays are exit relays (unlikely), or if walking onions are working on the current Tor network. This proposal should not affect those scenarios, but we should watch out for these cases.

2. The MiddleOnly Flag

We propose a consensus flag MiddleOnly. As mentioned earlier, relays will be assigned this flag from the directory authorities.

What this flag does is that a relay must not be used as an entry guard or exit relay. This is to prevent issues with a misconfigured relay as described in Section 1 (Introduction) while the Health Team assesses the risk with the relay.

3. Implementation details

The MiddleOnly flag can be assigned to relays whose IP addresses and/or fingerprints are configured at the directory authority level, similar to how the BadExit flag currently works. In short, if a relay's IP is designated as middle-only, it must assign the MiddleOnly flag, otherwise we must not assign it.

Relays which haven't gotten the Guard or Exit flags yet but have IP addresses that aren't designated as middle-only in the dirauths must not get the MiddleOnly flag. This is to allow new entry guards and exit relays to enter the Tor network, while giving relay administrators flexibility to increase and reduce bandwidth, or change their exit policy.

3.1. Client Implementation

Clients should interpret the MiddleOnly flag while parsing relay descriptors to determine whether a relay is to be avoided for non-middle purposes. If a client parses the MiddleOnly flag, it must not use MiddleOnly-designated relays as entry guards or exit relays.

3.2. MiddleOnly Relay Purposes

If a relay has the MiddleOnly flag, we do not allow it to be used for the following purposes:

  * Entry Guard
  * Directory Guard
  * Exit Relay

The reason for this is to prevent a misconfigured relay from being used in places where they may know about clients or destination traffic. This is in case certain misconfigured relays are used to deanonymize clients.

We could also bar a MiddleOnly relay from other purposes such as rendezvous and fallback directory purposes. However, while more secure in theory, this adds unnecessary complexity to the Tor design and has the possibility of breaking clients that aren't MiddleOnly-aware [2].

4. Consensus Considerations

4.1. Consensus Methods

We propose a new consensus method 32, which is to only use this flag if and when all authorities understand the flag and agree on it. This is because the MiddleOnly flag impacts path selection for clients.

4.2. Consensus Requirements

The MiddleOnly flag would work like most other consensus flags where a majority of dirauths have to assign a relay the flag in order for a relay to have the MiddleOnly flag.

Another approach is to make it that only one dirauth is needed to give relays this flag, however it would put too much power in the hands of a single directory authority server [3].

5. Acknowledgements

Thank you so much to nusenu, s7r, David Goulet, and Roger Dingledine for your suggestions to Prop334. My proposal wouldn't be what it is without you.

6. Citations

[1] - https://gitlab.torproject.org/tpo/core/tor/-/issues/40448

[2] - https://lists.torproject.org/pipermail/tor-dev/2021-September/014627.html

[3] - https://lists.torproject.org/pipermail/tor-dev/2021-September/014630.html

    Filename: 335-middle-only-redux.md
    Title: An authority-only design for MiddleOnly
    Author: Nick Mathewson
    Created: 2021-10-08
    Status: Closed
    Implemented-In: 0.4.7.2-alpha

Introduction

This proposal describes an alternative design for a MiddleOnly flag. Instead of making changes at the client level, it adds a little complexity to the directory authorities' voting process. In return for that complexity, this design will work without additional changes required from Tor clients.

For additional motivation and discussion see proposal 334 by Neel Chauhan, and the related discussions on tor-dev.

Protocol changes

Generating votes

When voting for a relay with the MiddleOnly flag, an authority should vote for all flags indicating that a relay is unusable for a particular purpose, and against all flags indicating that the relay is usable for a particular position.

Specifically, these flags SHOULD be set in a vote whenever MiddleOnly is present, and only when the authority is configured to vote on the BadExit flag.

  • BadExit

And these flags SHOULD be cleared in a vote whenever MiddleOnly is present.

  • Exit
  • Guard
  • HSDir
  • V2Dir

Computing a consensus

This proposal will introduce a new consensus method (probably 32). Whenever computing a consensus using that consensus method or later, authorities post-process the set of flags that appear in the consensus after flag voting takes place, by applying the same rule as above.

That is, with this consensus method, the authorities first compute the presence or absence of each flag on each relay as usual. Then, if the MiddleOnly flag is present, the authorities set BadExit, and clear Exit, Guard, HSDir, and V2Dir.
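The post-processing step can be sketched as a small function over a relay's set of consensus flags. The function name is hypothetical, and representing flags as a set of strings is an assumption made for illustration only.

```python
from typing import Set

# Flags cleared whenever MiddleOnly is present, as listed above.
CLEARED_BY_MIDDLE_ONLY = frozenset({"Exit", "Guard", "HSDir", "V2Dir"})


def apply_middle_only(flags: Set[str]) -> Set[str]:
    """Return the relay's flags after MiddleOnly post-processing.

    `flags` is the set computed by ordinary flag voting. If MiddleOnly
    is absent, the flags are returned unchanged.
    """
    result = set(flags)
    if "MiddleOnly" in result:
        result.add("BadExit")
        result -= CLEARED_BY_MIDDLE_ONLY
    return result
```

For example, a relay voted `{"Running", "Guard", "Exit", "MiddleOnly"}` would end up with `{"Running", "MiddleOnly", "BadExit"}`.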

Configuring authorities

We'll need a means for configuring which relays will receive this flag. For now, we'll just reuse the same mechanism as AuthDirReject and AuthDirBadExit: a set of torrc configuration lines listing relays by address. We'll call this AuthDirMiddleOnly.

We'll also add an AuthDirListsMiddleOnly option to turn on or off voting on this option at all.

Notes on safety and migration

Under this design, the MiddleOnly option becomes useful immediately, since authorities that use it will stop voting for certain additional options for MiddleOnly relays without waiting for the other authorities.

We don't need to worry about a single authority setting MiddleOnly unilaterally for all relays, since the MiddleOnly flag will have no special effect until most authorities have upgraded to the new consensus method.

    Filename: 336-randomize-guard-retries.md
    Title: Randomized schedule for guard retries
    Author: Nick Mathewson
    Created: 2021-10-22
    Status: Closed

Implementation Status

This proposal is implemented in Arti, and recommended for future guard implementations. We have no current plans to implement it in C Tor.

Introduction

When we notice that a guard isn't working, we don't mark it as retriable until a certain interval has passed. Currently, these intervals are fixed, as described in the documentation for GUARDS_RETRY_SCHED in guard-spec appendix A.1. Here we propose using a randomized retry interval instead, based on the same decorrelated-jitter algorithm we use for directory retries.

The upside of this approach is that it makes our behavior in the presence of an unreliable network a bit harder for an attacker to predict. It also means that if a guard goes down for a while, its clients will notice that it is up at staggered times, rather than probing it in lock-step.

The downside of this approach is that we can, if we get unlucky enough, completely fail to notice that a preferred guard is online when we would otherwise have noticed sooner.

Note that when a guard is marked retriable, it isn't necessarily retried immediately. Instead, its status is changed from "Unreachable" to "Unknown", which will cause it to get retried.

For reference, our previous schedule was:

    {param:PRIMARY_GUARDS_RETRY_SCHED}
      -- every 10 minutes for the first six hours,
      -- every 90 minutes for the next 90 hours,
      -- every 4 hours for the next 3 days,
      -- every 9 hours thereafter.

    {param:GUARDS_RETRY_SCHED}
      -- every hour for the first six hours,
      -- every 4 hours for the next 90 hours,
      -- every 18 hours for the next 3 days,
      -- every 36 hours thereafter.

The new algorithm

We re-use the decorrelated-jitter algorithm from dir-spec section 5.5. The specific formula used to compute the 'i+1'th delay is:

    Delay_{i+1} = MIN(cap, random_between(lower_bound, upper_bound))
    where upper_bound = MAX(lower_bound+1, Delay_i * 3)
          lower_bound = MAX(1, base_delay).

For primary guards, we set base_delay to 30 seconds and cap to 6 hours.

For non-primary guards, we set base_delay to 10 minutes and cap to 36 hours.

(These parameters were selected by simulating the results of using them until they looked "a bit more aggressive" than the current algorithm, but not too much.)
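A minimal sketch of the delay computation follows, using whole seconds. Two points are assumptions not settled by the formula above: `random_between` is taken to return an integer in the half-open range [lower_bound, upper_bound), and the delay before the first failure (Delay_0) is taken to be zero. The names are illustrative.

```python
import random

# (base_delay, cap) in seconds, from the parameters above.
PRIMARY_PARAMS = (30, 6 * 60 * 60)
NONPRIMARY_PARAMS = (10 * 60, 36 * 60 * 60)


def next_delay(prev_delay: int, base_delay: int, cap: int,
               rng: random.Random) -> int:
    """Compute Delay_{i+1} from Delay_i with decorrelated jitter."""
    lower_bound = max(1, base_delay)
    upper_bound = max(lower_bound + 1, prev_delay * 3)
    # Assumption: random_between(lo, hi) is an integer in [lo, hi).
    return min(cap, rng.randrange(lower_bound, upper_bound))


def retry_schedule(base_delay: int, cap: int, count: int,
                   rng: random.Random) -> list:
    """Return the first `count` delays, starting from Delay_0 = 0."""
    delays = []
    delay = 0  # Assumption: no delay has been used before the first one.
    for _ in range(count):
        delay = next_delay(delay, base_delay, cap, rng)
        delays.append(delay)
    return delays
```

With these assumptions, the first delay always equals base_delay, and each later delay is drawn from a window whose upper edge is three times the previous delay, so the schedule widens quickly until it reaches the cap.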

The average behavior for the new primary schedule is:

    First 1.0 hours: 10.14283 attempts. (Avg delay 4m 47.41s)
    First 6.0 hours: 19.02377 attempts. (Avg delay 15m 36.95s)
    First 96.0 hours: 56.11173 attempts. (Avg delay 1h 40m 3.13s)
    First 168.0 hours: 83.67091 attempts. (Avg delay 1h 58m 43.16s)
    Steady state: 2h 36m 44.63s between attempts.

The average behavior for the new non-primary schedule is:

    First 1.0 hours: 3.08069 attempts. (Avg delay 14m 26.08s)
    First 6.0 hours: 8.1473 attempts. (Avg delay 35m 25.27s)
    First 96.0 hours: 22.57442 attempts. (Avg delay 3h 49m 32.16s)
    First 168.0 hours: 29.02873 attempts. (Avg delay 5h 27m 2.36s)
    Steady state: 11h 15m 28.47s between attempts.

    Filename: 337-simpler-guard-usability.md
    Title: A simpler way to decide, "Is this guard usable?"
    Author: Nick Mathewson
    Created: 2021-10-22
    Status: Closed

Introduction

The current guard-spec describes a mechanism for how to behave when our primary guards are unreachable, and we don't know which other guards are reachable. This proposal describes a simpler method, currently implemented in Arti.

(Note that this method might not actually give different results: its only advantage is that it is much simpler to implement.)

The task at hand

For illustration, we'll assume that our primary guards are P1, P2, and P3, and our subsequent guards (in preference order) are G1, G2, G3, and so on. The status of each guard is Reachable (we think we can connect to it), Unreachable (we think it's down), or Unknown (we haven't tried it recently).

The question becomes, "What should we do when P1, P2, and P3 are Unreachable, and G1, G2, ... are all Unknown"?

In this circumstance, we could say that we only build circuits to G1, wait for them to succeed or fail, and only try G2 if we see that the circuits to G1 have failed completely. But that approach introduces delays in the case that G1 is down.

Instead, the first time we get a circuit request, we try to build one circuit to G1. On the next circuit request, if the circuit to G1 isn't done yet, we launch a circuit to G2 instead. The next request (if the G1 and G2 circuits are still pending) goes to G3, and so on. But (here's the critical part!) we don't actually use the circuit to G2 unless the circuit to G1 fails, and we don't actually use the circuit to G3 unless the circuits to G1 and G2 both fail.

This approach causes Tor clients to check the status of multiple possible guards in parallel, while not actually using any guard until we're sure that all the guards we'd rather use are down.

The current algorithm and its drawbacks

For the current algorithm, see guard-spec section 4.9: circuits are exploratory if they are not using a primary guard. If such an exploratory circuit is waiting_for_better_guard, then we advance it (or not) depending on the status of all other circuits using guards that we'd rather be using.

In other words, the current algorithm is described in terms of actions to take with given circuits.

For Arti (and for other modular Tor implementations), however, this algorithm is a bit of a pain: it introduces dependencies between the guard code and the circuit handling code, requiring each one to mess with the other.

Proposal

I suggest that we describe an alternative algorithm for handling circuits to non-primary guards, to be used in preference to the current algorithm. Unlike the existing approach, it isolates the guard logic a bit better from the circuit logic.

Handling exploratory circuits

When all primary guards are Unreachable, we need to try non-primary guards. We select the first such guard (in preference order) that is neither Unreachable nor Pending. Whenever we give out such a guard, if the guard's status is Unknown, then we call that guard "Pending" until the attempt to use it succeeds or fails. We remember when the guard became Pending.

Aside: None of the above is a change from our existing specification.

After completing a circuit, the implementation must check whether its guard is usable. A guard is usable according to these rules:

Primary guards are always usable.

Non-primary guards are usable for a given circuit if every guard earlier in the preference list is either unsuitable for that circuit (e.g. because of family restrictions), or marked as Unreachable, or has been pending for at least {NONPRIMARY_GUARD_CONNECT_TIMEOUT}.

Non-primary guards are unusable for a given circuit if some guard earlier in the preference list is suitable for the circuit and Reachable.

Non-primary guards are unusable if they have not become usable after {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.

If a circuit's guard is neither usable nor unusable immediately, the circuit is not discarded; instead, it is kept (but not used) until it becomes usable or unusable.
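The usability rules above can be sketched as a three-valued decision: usable (`True`), unusable (`False`), or not yet decided (`None`). This is an illustration under stated assumptions, not Arti's code: the `Guard` fields and function name are hypothetical, per-circuit suitability is simplified to a boolean on the guard, and the idle timeout is measured from the moment the circuit completed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Status(Enum):
    REACHABLE = "Reachable"
    UNREACHABLE = "Unreachable"
    UNKNOWN = "Unknown"


@dataclass
class Guard:
    name: str
    status: Status
    primary: bool = False
    suitable: bool = True                  # Suitable for this circuit?
    pending_since: Optional[float] = None  # When it became Pending, if it is.


def guard_usability(guards: Sequence[Guard], index: int, now: float,
                    circuit_done_at: float, connect_timeout: float,
                    idle_timeout: float) -> Optional[bool]:
    """Decide whether guards[index] is usable for a completed circuit.

    `guards` is in preference order. `connect_timeout` and `idle_timeout`
    stand for NONPRIMARY_GUARD_CONNECT_TIMEOUT and
    NONPRIMARY_GUARD_IDLE_TIMEOUT, in seconds.
    """
    guard = guards[index]
    if guard.primary:
        return True
    earlier = guards[:index]
    # Unusable: some better guard is suitable and Reachable.
    if any(g.suitable and g.status is Status.REACHABLE for g in earlier):
        return False

    def out_of_the_way(g: Guard) -> bool:
        if not g.suitable or g.status is Status.UNREACHABLE:
            return True
        return (g.pending_since is not None
                and now - g.pending_since >= connect_timeout)

    # Usable: every better guard is unsuitable, down, or pending too long.
    if all(out_of_the_way(g) for g in earlier):
        return True
    # Unusable: waited too long without becoming usable.
    if now - circuit_done_at >= idle_timeout:
        return False
    return None  # Keep the circuit, but do not use it yet.
```

A caller would re-evaluate this whenever a better guard's status changes or a timeout elapses, which is what lets the guard code make the decision without inspecting the list of circuits.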

I am not 100% sure whether this description produces the same behavior as the current guard-spec, but it is simpler to describe, and has proven to be simpler to implement.

Implications for program design.

(This entire section is implementation detail to explain why this is a simplification from the previous algorithm. It is for explanatory purposes only and is not part of the spec.)

With this algorithm, we cut down the interaction between the guard code and the circuit code considerably, but we do not remove it entirely. Instead, there remains (in Arti terms) a pair of communication channels between the circuit manager and the guard manager:

  • Whenever a guard is given to the circuit manager, the circuit manager receives the write end of a single-use channel to report whether the guard has succeeded or failed.

  • Whenever a non-primary guard is given to the circuit manager, the circuit manager receives the read end of a single-use channel that will tell it whether the guard is usable or unusable. This channel doesn't report anything until the guard has one status or the other.

With this design, the circuit manager never needs to look at the list of guards, and the guard manager never needs to look at the list of circuits.

Subtleties concerning "guard success"

Note that the above definitions of a Reachable guard depend on reporting when the guard is successful or failed. This is not necessarily the same as reporting whether the circuit is successful or failed. For example, a circuit that fails after the first hop does not necessarily indicate that there's anything wrong with the guard. Similarly, we can reasonably conclude that the guard is working (at least somewhat) as long as we have an open channel to it.

    Filename: 338-netinfo-y2038.md
    Title: Use an 8-byte timestamp in NETINFO cells
    Author: Nick Mathewson
    Created: 2022-03-14
    Status: Accepted

Introduction

Currently Tor relays use a 4-byte timestamp (in seconds since the Unix epoch) in their NETINFO cells. Notoriously, such a timestamp will overflow on 19 January 2038.

Let's get ahead of the problem and squash this issue now, by expanding the timestamp to 8 bytes. (8 bytes worth of seconds will be long enough to outlast the Earth's sun.)

Proposed change

I propose adding a new link protocol version. (The next one in sequence, as of this writing, is version 6.)

I propose that we change the text of tor-spec section 4.5 from:

TIME (Timestamp) [4 bytes]

to

TIME (Timestamp) [4 or 8 bytes *]

and specify that this field is 4 bytes wide on link protocols 1-5, but 8 bytes wide on link protocols 6 and beyond.
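As a sketch of what the proposed rule means for an encoder and decoder, the field width can be chosen from the negotiated link protocol version. The function names are hypothetical, and treating the field as an unsigned big-endian integer is an assumption of this example.

```python
import struct
from typing import Tuple


def encode_netinfo_time(timestamp: int, link_protocol: int) -> bytes:
    """Encode the NETINFO TIME field for the given link protocol.

    Uses 8 bytes on link protocol 6 and later, 4 bytes otherwise.
    Raises struct.error if the value does not fit in the field.
    """
    fmt = ">Q" if link_protocol >= 6 else ">I"
    return struct.pack(fmt, timestamp)


def decode_netinfo_time(data: bytes, link_protocol: int) -> Tuple[int, bytes]:
    """Decode the TIME field; return (timestamp, remaining bytes)."""
    width = 8 if link_protocol >= 6 else 4
    if len(data) < width:
        raise ValueError("truncated TIME field")
    fmt = ">Q" if width == 8 else ">I"
    (timestamp,) = struct.unpack(fmt, data[:width])
    return timestamp, data[width:]
```

Because the width depends only on the link protocol version, both sides know how to parse the rest of the cell without any length prefix on the timestamp itself.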

Rejected alternatives

Our protocol specifies that parties MUST ignore extra data at the end of cells. Therefore we could add additional data at the end of the NETINFO cell, and use that to store the high 4 bytes of the timestamp without having to increase the link protocol version number. I propose that we don't do that: it's ugly.

As another alternative, we could declare that parties must interpret the timestamp such that its high 4 bytes place it as close as possible to their current time. I'm rejecting this kludge because it would give confusing results in the too-common case where clients have their clocks mis-set to Jan 1, 1970.

Impacts on our implementations

Arti won't be able to implement this change until it supports connection padding (as required by link protocol 5), which is currently planned for the next Arti milestone (1.0.0, scheduled for this fall).

If we think that that's a problem, or if we want to have support for implementations without connection padding in the future, we should reconsider this plan so that connection padding support is independent from 8-byte timestamps.

Other timestamps in Tor

I've done a cursory search of our protocols to see if we have any other instances of the Y2038 problem.

There is a 4-byte timestamp in cert-spec, but that one is an unsigned integer counting hours since the Unix epoch, which will keep it from wrapping around till 478756 C.E. (The rollover date of "10136 CE" reported in cert-spec is wrong, and seems to be based on the misapprehension that the counter is in minutes.)

The v2 onion service protocol has 4-byte timestamps, but it is thoroughly deprecated and unsupported.

I couldn't find any other 4-byte timestamps, but that is no guarantee: others should look for them too.

    Filename: 339-udp-over-tor.md
    Title: UDP traffic over Tor
    Author: Nick Mathewson
    Created: 11 May 2020
    Status: Accepted

Introduction

Tor currently only supports delivering two kinds of traffic to the internet: TCP data streams, and a certain limited subset of DNS requests. This proposal describes a plan to extend the Tor protocol so that exit relays can also relay UDP traffic to the network.

Why would we want to do this? There are important protocols that use UDP, and in order to support users that rely on these protocols, we'll need to support them over Tor.

This proposal is a minimal version of UDP-over-Tor. Notably, it does not add an unreliable out-of-order transport to Tor's semantics. Instead, UDP messages are just tunneled over Tor's existing reliable in-order circuits. (Adding a datagram transport to Tor is attractive for some reasons, but it presents a number of problems; see this whitepaper for more information.)

In some parts of this proposal I'll assume that we have accepted and implemented some version of proposal 319 (relay fragment cells) so that we can transmit relay messages larger than 498 bytes.

Overview

UDP is a datagram protocol; its 16-bit length field allows datagrams of up to 65535 bytes (header included), though in practice most protocols will use smaller messages in order to avoid having to deal with fragmentation.

UDP messages can be dropped or re-ordered. There is no authentication or encryption baked into UDP, though it can be added by higher-level protocols like DTLS or QUIC.

When an application opens a UDP socket, the OS assigns it a 16-bit port on some IP address of a local interface. The application may send datagrams from that address:port combination, and will receive datagrams sent to that address:port.

With most (all?) IP stacks, a UDP socket can either be connected to a remote address:port (in which case all messages will be sent to that address:port, and only messages from that address will be passed to the application), or unconnected (in which case outgoing messages can be sent to any address:port, and incoming messages from any address:port will be accepted).

In this version of the protocol, we support only connected UDP sockets, though we provide extension points for someday adding unconnected socket support.

Tor protocol specification

Overview

We reserve three new relay commands: CONNECT_UDP, CONNECTED_UDP and DATAGRAM.

The CONNECT_UDP command is sent by a client to an exit relay to tell it to open a new UDP stream "connected" to a targeted address and UDP port. The same restrictions apply as for CONNECT cells: the target must be permitted by the relay's exit policy, the target must not be private, localhost, or ANY, the circuit must appear to be multi-hop, there must not be a stream with the same ID on the same circuit, and so on.

On success, the relay replies with a CONNECTED_UDP cell telling the client the IP address it is connected to, and which IP address and port (on the relay) it has bound to. On failure, the relay replies immediately with an END cell.

(Note that we do not allow the client to choose an arbitrary port to bind to. It doesn't work when two clients want the same port, and makes it too easy to probe which ports are in use.)

When the UDP stream is open, the client can send and receive DATAGRAM messages from the exit relay. Each such message corresponds to a single UDP datagram. If a datagram is larger than 498 bytes, it is transmitted as a fragmented message.

When a client no longer wishes to use a UDP stream, but it wants to keep the circuit open, it sends an END cell over the circuit. Upon receiving this message, the exit closes the stream, and stops sending any more cells on it.

Exits MAY send an END cell on a UDP stream; when a client receives it, it must treat the UDP stream as closed. Exits MAY send END cells in response to resource exhaustion, time-out signals, or (TODO what else?).

(TODO: Should there be an END ACK? We've wanted one in DATA streams for a while, to know when we can treat a stream as definitively gone-away.)

Optimistic traffic is permitted as with TCP streams: a client MAY send DATAGRAM messages immediately after its CONNECT_UDP message, without waiting for a CONNECTED_UDP. These are dropped if the CONNECT_UDP fails.

Clients and exits MAY drop incoming datagrams if their stream or circuit buffers are too full. (Once a DATAGRAM message has been sent on a circuit, however, it cannot be dropped until it reaches its intended recipient.)

Circuits carrying UDP traffic obey the same SENDME congestion control protocol as other circuits. Rather than using XON/XOFF to control transmission, excess packets may simply be dropped. UDP and TCP traffic can be mixed on the same circuit, but not on the same stream.

Discussion on "too full"

(To be determined! We need an algorithm here before we implement, though our choice of algorithm doesn't need to be the same on all exits or for all clients, IIUC.)

Discussion from the pad:

- "Too full" should be a pair of watermark consensus parameter in implementation, imo. At the low watermark, random early dropping MAY be performed, a-la RED, etc. At the high watermark, all packets SHOULD be dropped. - mike

  - +1. I left "too full" as deliberately underspecified here, since I figured you would have a better idea than me about what it should really be. Maybe we should say "for one suggested algorithm, see section X below" and describe the algorithm you propose above in a bit more detail? -nickm

  - I have not dug deeply into drop strategies, but I believe that BLUE is what is in use now: https://en.wikipedia.org/wiki/Blue_(queue_management_algorithm)

- Additionally, an important implementation detail is that it is likely best to actually continue to read even if our buffer is full, so we can perform the drop ourselves and ensure the kernel/socket buffers don't also bloat on us. Though this may have tradeoffs with the eventloop bottleneck on C-Tor. Because of that bottleneck, it might be best to stop reading. arti likely will have different optimal properties here. -mike

Message formats

Here we describe the format for the bodies of the new relay messages, along with extensions to some older relay message types. We note in passing how we could extend these messages to support unconnected UDP sockets in the future.

Common Format

We define here a common format for an "address" that is used both in a CONNECT_UDP and CONNECTED_UDP cell.

Address

Defines an IP or Hostname address along with its port. This can be seen as the ADDRPORT of a BEGIN cell defined in tor-spec.txt but with a different format.

    /* Address types.
       Note that these are the same as in RESOLVED cells.
    */
    const T_HOSTNAME = 0x00;
    const T_IPV4     = 0x04;
    const T_IPV6     = 0x06;

    struct address {
       u8 type IN [T_IPV4, T_IPV6, T_HOSTNAME];
       u8 len;
       union addr[type] with length len {
          T_IPV4: u32 ipv4;
          T_IPV6: u8 ipv6[16];
          T_HOSTNAME: u8 hostname[];
       };
       u16 port;
    }

The hostname follows RFC 1035 for its accepted length: each label is 63 characters or less, and len is between 0 and 255 (bytes). It must be a sequence of nonzero octets: any NUL byte results in a malformed cell.
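To make the address layout concrete, here is a minimal Python sketch of an encoder and decoder. The function names are hypothetical, and the big-endian (network order) encoding of port is an assumption carried over from the rest of the Tor protocol; this is an illustration, not a reference implementation.

```python
import ipaddress
import struct

T_HOSTNAME, T_IPV4, T_IPV6 = 0x00, 0x04, 0x06

def encode_address(host, port):
    """Encode (host, port) as: type | len | addr | port."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None
    if ip is None:
        addr_type, addr = T_HOSTNAME, host.encode("ascii")
        if len(addr) > 255 or b"\x00" in addr:
            raise ValueError("malformed hostname")
    else:
        addr_type = T_IPV4 if ip.version == 4 else T_IPV6
        addr = ip.packed
    return struct.pack("!BB", addr_type, len(addr)) + addr + struct.pack("!H", port)

def decode_address(buf):
    """Decode one address; return (host, port, bytes_consumed)."""
    if len(buf) < 2:
        raise ValueError("truncated address")
    addr_type, addr_len = buf[0], buf[1]
    end = 2 + addr_len
    if len(buf) < end + 2:
        raise ValueError("truncated address")
    addr = buf[2:end]
    if addr_type == T_IPV4 and addr_len == 4:
        host = str(ipaddress.IPv4Address(addr))
    elif addr_type == T_IPV6 and addr_len == 16:
        host = str(ipaddress.IPv6Address(addr))
    elif addr_type == T_HOSTNAME and b"\x00" not in addr:
        host = addr.decode("ascii")
    else:
        raise ValueError("malformed address")
    (port,) = struct.unpack("!H", buf[end:end + 2])
    return host, port, end + 2
```

Returning the number of bytes consumed lets a caller decode the two consecutive addresses of a CONNECTED_UDP body with two calls.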

CONNECT_UDP

    /* Tells an exit to connect a UDP port for connecting to a new target
       address. The stream ID is chosen by the client, and is part of
       the relay header.
    */
    struct connect_udp_body {
       /* As in BEGIN cells. */
       u32 flags;
       /* Address to connect to. */
       struct address addr;
       // The rest is ignored.
       // TODO: Is "the rest is ignored" still a good idea? Look at Rochet's
       // research.
    }

    /* As in BEGIN cells: these control how hostnames are interpreted.
       Clients MUST NOT send unrecognized flags; relays MUST ignore them.
       See tor-spec for semantics.
     */
    const FLAG_IPV6_OKAY      = 0x01;
    const FLAG_IPV4_NOT_OKAY  = 0x02;
    const FLAG_IPV6_PREFERRED = 0x04;

A "hostname" is a DNS hostname that can only contain ASCII characters. It does NOT follow the large and broad DNS syntax. Hostnames here behave exactly as they do in BEGIN cells.

CONNECTED_UDP

A CONNECTED_UDP cell sent in response to a CONNECT_UDP cell has the following format.

    struct udp_connected_body {
       /* The address that the relay has bound locally. This might not
        * be an address that is advertised in the relay's descriptor. */
       struct address our_address;
       /* The address that the stream is connected to. */
       struct address their_address;
       // The rest is ignored. There is no resolved-address TTL.
       // TODO: Is "the rest is ignored" still a good idea? Look at Rochet's
       // research.
    }

Neither our_address nor their_address may be of type T_HOSTNAME; otherwise the cell is considered malformed.

DATAGRAM

    struct datagram_body {
       /* The datagram body is the entire body of the message.
        * This length is in the relay message header. */
       u8 body[..];
    }

END

We explicitly allow all END reasons from the existing Tor protocol.

We may wish to add more as we gain experience with this protocol.

Extensions for unconnected sockets

Because of security concerns I don't suggest that we support unconnected sockets in the first version of this protocol. But if we did, here's how I'd suggest we do it.

  1. We would add a new "FLAG_UNCONNECTED" flag for CONNECT_UDP messages.

  2. We would designate the ANY addresses 0.0.0.0:0 and [::]:0 as permitted in CONNECT_UDP messages, and as indicating unconnected sockets. These would be only permitted along with the FLAG_UNCONNECTED flag, and not permitted otherwise.

  3. We would designate the ANY addresses above as permitted for the their_address field in the CONNECTED_UDP message, in the case when FLAG_UNCONNECTED was used.

  4. We would define a new DATAGRAM message format for unconnected streams, where the first 6 or 18 bytes were reserved for an IPv4 or IPv6 address:port respectively.

Specifying exit policies and compatibility

We add the following fields to relay descriptors and microdescriptors:

    // In relay descriptors
    ipv4-udp-policy accept PortList
    ipv6-udp-policy accept PortList

    // In microdescriptors
    p4u accept PortList
    p6u accept PortList

(We need to include the policies in relay descriptors so that the authorities can include them in the microdescriptors when voting.)

As in the p and p6 fields, the PortList fields are comma-separated lists of port ranges. Only "accept" policies are parsed or generated in this case; the alternative is not appreciably shorter. When no policy is listed, the default is "reject 1-65535".
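Here is a minimal Python sketch of how such a policy line could be parsed and checked. The function names are hypothetical, and the error handling is an assumption rather than anything this proposal specifies.

```python
def parse_udp_policy(line):
    """Parse e.g. 'p4u accept 53,123,5000-5100'.

    Returns (keyword, ranges), where ranges is a list of inclusive
    (low, high) port pairs.
    """
    keyword, action, portlist = line.split()
    if action != "accept":
        raise ValueError("only 'accept' policies are defined here")
    ranges = []
    for item in portlist.split(","):
        low, _, high = item.partition("-")
        low, high = int(low), int(high or low)
        if not 1 <= low <= high <= 65535:
            raise ValueError("bad port range: " + item)
        ranges.append((low, high))
    return keyword, ranges

def port_allowed(ranges, port):
    """Check a port; ranges=None means no policy line was listed."""
    if ranges is None:
        return False  # the default is "reject 1-65535"
    return any(low <= port <= high for low, high in ranges)
```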

This proposal would also add a new subprotocol, "Datagram". Only relays that implement this proposal would advertise "Datagram=1". Doing so would not necessarily mean that they permitted datagram streams, if their exit policies did not say so.

MTU notes and issues

Internet time. I might have this wrong.

The "maximum safe IPv4 UDP payload" is "well known" to be only 508 bytes long: that's defined by the 576-byte minimum-maximum IP datagram size in RFC 791 p.12, minus 60 bytes for a very big IPv4 header, minus 8 bytes for the UDP header.

Unfortunately, our RELAY body size is only 498 bytes. It would be lovely if we could easily move to larger relay cells, or tell applications not to send datagrams whose bodies are larger than 498 bytes, but there is probably a pretty large body of tools out there that assume that they will never have to restrict their datagram size to fit into a transport this small.

(That means that if we implement this proposal without fragmentation, we'll probably be breaking a bunch of stuff, and creating a great deal of overhead.)

Integration issues

I do not know how applications should tell Tor that they want to use this feature. Any ideas? We should probably integrate with their MTU discovery systems too if we can. (TODO: write about some alternatives)

Resource management issues

TODO: Talk about sharing server-side relay sockets, and whether it's safe to do so, and how to avoid information leakage when doing so.

TODO: Talk about limiting UDP sockets per circuit, and whether that's a good idea?

Security issues

  • Are there any major DoS or amplification attack vectors that this enables? I think no, because we don't allow spoofing the IP header. But maybe some wacky protocol out there lets you specify a reply address in the payload even if the source IP is different. -mike

  • Are there port-reuse issues with source port on exits, such that destinations could become confused over the start and end of a UDP stream, if a source port is reused "too fast"? This also likely varies by protocol. We should parameterize time-before-reuse on source port, in case we notice issues with some broken/braindead UDP protocol later. -mike

Future work

Extend this for onion services, possibly based on Matt's prototypes.

    Filename: 340-packed-and-fragmented.md
    Title: Packed and fragmented relay messages
    Author: Nick Mathewson
    Created: 31 May 2022
    Status: Open

Introduction

Tor sends long-distance messages on circuits via relay cells. The current relay cell format allows one relay message (e.g., "BEGIN" or "DATA" or "END") per relay cell. We want to relax this 1:1 requirement, between messages and cells, for two reasons:

  • To support relay messages that are longer than the current 498-byte limit. Applications would include wider handshake messages for postquantum crypto, UDP messages, and SNIP transfer in walking onions.

  • To transmit small messages more efficiently. Several message types (notably SENDME, XON, XOFF, and several types from proposal 329) are much smaller than the relay cell size, and could be sent comparatively often.

In this proposal, we describe a way to decouple relay cells from relay messages. Multiple relay messages can now be packed into a single cell, and a single message can be split across multiple cells.

This proposal combines ideas from proposal 319 (fragmentation) and proposal 325 (packed cells). It requires ntor v3 and prepares for next-generation relay cryptography.

Additionally, this proposal has been revised to incorporate another protocol change, and move StreamId from the relay cell header into a new, optional header.

A preliminary change: Relay encryption, version 1.5

We are fairly sure that, whatever we do for our next batch of relay cryptography, we will want to increase the size of the data used to authenticate relay cells to 128 bits. (Currently it uses a 4-byte tag plus 2 bytes of zeros.)

To avoid proliferating formats, I'm going to suggest that we make the other changes in this proposal concurrently with a change in our relay cryptography, so that we do not have too many incompatible cell formats going on at the same time.

The new format for a decrypted relay cell will be:

    recognized [2 bytes]
    digest     [14 bytes]
    body       [509 - 16 = 493 bytes]

The recognized and digest fields are computed as before; the only difference is that they occur before the rest of the cell, and that digest is truncated to 14 bytes instead of 4.

If we are lucky, we won't have to build this encryption at all, and we can just move to some version of GCM-UIV or other RPRP that reserves 16 bytes for an authentication tag or similar cryptographic object.

The body MUST contain exactly 493 bytes as cells have a fixed size.
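As a small illustration of this layout, the following Python sketch splits a decrypted 509-byte relay cell into its three fields. The names are hypothetical, and the sketch does not compute or check the digest.

```python
CELL_BODY_LEN = 509
RECOGNIZED_LEN = 2
DIGEST_LEN = 14
RELAY_BODY_LEN = CELL_BODY_LEN - RECOGNIZED_LEN - DIGEST_LEN  # 493

def split_relay_cell(decrypted):
    """Split a decrypted relay cell into (recognized, digest, body)."""
    if len(decrypted) != CELL_BODY_LEN:
        raise ValueError("relay cells have a fixed size")
    recognized = decrypted[:RECOGNIZED_LEN]
    digest = decrypted[RECOGNIZED_LEN:RECOGNIZED_LEN + DIGEST_LEN]
    body = decrypted[RECOGNIZED_LEN + DIGEST_LEN:]
    return recognized, digest, body
```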

New relay message packing

We define this new format for a relay message. The message header has to fit within one relay cell; the body, however, can be split across many relay cells:

    Message Header
      command         u8
      length          u16
    Message Routing Header (optional)
      stream_id       u16
    Message Body
      data            u8[length]

One big change from the current Tor protocol is that stream_id has become optional: we have moved it into a separate inner header, named the Message Routing Header, that only appears sometimes. The command value tells us whether the header is to be expected.

The following message types take required stream IDs: BEGIN, DATA, END, CONNECTED, RESOLVE, RESOLVED, BEGIN_DIR, XON, and XOFF.

The following message types from proposal 339 (UDP) take required stream IDs: CONNECT_UDP, CONNECTED_UDP and DATAGRAM.

No other message types take stream IDs. The stream_id field, when present, MUST NOT be zero.

Messages can be split across relay cells; multiple messages can occur in a single relay cell. We enforce the following rules:

  • Headers may not be split across cells.
  • If a 0 byte follows a message body, there are no more messages.
  • A relay cell may not be "empty": it must have at least some part of some message.

Unless specified elsewhere, all message types may be packed, and all message types may be fragmented.

Every command has an associated maximum length for its messages. If not specified elsewhere, the maximum length for every message is 498 bytes (for legacy reasons).

Receivers MUST validate that the cell header and the message header are well-formed and have valid lengths while handling the cell in which the header is encoded. If any of them is invalid, the circuit MUST be destroyed.

An unrecognized command is considered invalid and thus MUST result in the circuit being immediately destroyed.
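The rules above can be sketched as a small stateful parser in Python. This is only an illustration: the class and exception names are hypothetical, the command numbers and the set of known commands are a partial, assumed table, and big-endian field encoding is assumed. Raising ProtocolError stands in for destroying the circuit.

```python
import struct

# Illustrative command numbers (an assumption for this sketch).
BEGIN, DATA, END, CONNECTED, SENDME, EXTEND2 = 1, 2, 3, 4, 5, 14
HAS_STREAM_ID = {BEGIN, DATA, END, CONNECTED}
KNOWN = HAS_STREAM_ID | {SENDME, EXTEND2}

class ProtocolError(Exception):
    """Raised where the circuit would have to be destroyed."""

class Reassembler:
    def __init__(self, max_len=498):
        self.max_len = max_len
        self.pending = None  # (command, stream_id, length, bytearray)

    def feed(self, body):
        """Consume one relay cell body; return messages completed in it."""
        if self.pending is None and (not body or body[0] == 0):
            raise ProtocolError("empty relay cell")
        done, pos = [], 0
        while pos < len(body):
            if self.pending is None:
                if body[pos] == 0:  # end-of-messages marker
                    break
                if pos + 3 > len(body):
                    raise ProtocolError("header split across cells")
                command, length = struct.unpack_from("!BH", body, pos)
                pos += 3
                if command not in KNOWN or length > self.max_len:
                    raise ProtocolError("bad command or length")
                stream_id = None
                if command in HAS_STREAM_ID:
                    if pos + 2 > len(body):
                        raise ProtocolError("header split across cells")
                    (stream_id,) = struct.unpack_from("!H", body, pos)
                    pos += 2
                    if stream_id == 0:
                        raise ProtocolError("zero stream_id")
                self.pending = (command, stream_id, length, bytearray())
            command, stream_id, length, data = self.pending
            take = min(length - len(data), len(body) - pos)
            data.extend(body[pos:pos + take])
            pos += take
            if len(data) == length:
                done.append((command, stream_id, bytes(data)))
                self.pending = None
        return done
```

A fragmented message is returned only by the `feed()` call that supplies its final bytes; earlier calls return an empty list. The sketch uses a single `max_len` for all commands, whereas the proposal allows a per-command maximum.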

Negotiation

This message format requires a new Relay subprotocol version to indicate support. If a client wants to indicate support for this format, it sends the following extension as part of its ntor3 handshake:

EXT_FIELD_TYPE:

[03] -- Packed and Fragmented Cell Request

This field has a zero-length payload. Its presence signifies that the client wants to use packed and fragmented cells on the circuit.

The Exit side ntorv3 response payload is encoded as:

EXT_FIELD_TYPE:

[04] -- Packed and Fragmented Cell Response

Again, the presence of the extension indicates to the client that the Exit has acknowledged the feature and is ready to use it. If the extension is not present, the client MUST NOT use the packed and fragmented feature even though the Exit has advertised the correct protover.

The client MUST reject the handshake and thus close the circuit if:

  • The response extension is seen for a non-ntorv3 handshake.
  • The response extension is seen but no request was made initially.
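The client-side checks can be sketched as follows in Python. The function name and its arguments are hypothetical; only the two extension type values come from the text above.

```python
EXT_REQUEST = 0x03   # Packed and Fragmented Cell Request
EXT_RESPONSE = 0x04  # Packed and Fragmented Cell Response

def packed_cells_enabled(handshake_is_ntor3, request_sent, response_ext_types):
    """Return True if packed and fragmented cells may be used on the circuit.

    Raises ValueError where the client MUST reject the handshake.
    """
    response_seen = EXT_RESPONSE in response_ext_types
    if response_seen and not handshake_is_ntor3:
        raise ValueError("response extension on a non-ntorv3 handshake")
    if response_seen and not request_sent:
        raise ValueError("response extension without a request")
    return response_seen
```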

Migration

We add a consensus parameter, "streamed-relay-messages", with default value 0, minimum value 0, and maximum value 1.

If this value is 0, then clients will not (by default) negotiate this relay protocol. If it is 1, then clients will negotiate it when relays support it.

For testing, clients can override this setting. Once enough relays support this proposal, we'll change the consensus parameter to 1. Later, we'll change the default to 1 as well.

Packing decisions

We specify the following greedy algorithm for making decisions about fragmentation and packing. Other algorithms are possible, but this one is fairly simple, and using it will help avoid distinguishability issues:

Whenever a client or relay is about to send a cell that would leave at least 32 bytes unused in a relay cell, it checks to see whether there is any pending data to be sent in the same circuit (in a data cell). If there is, then it adds a DATA message to the end of the current cell, with as much data as possible. Otherwise, the client sends the cell with no packed data.
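A minimal Python sketch of that decision follows. It assumes that "as much data as possible" means filling the unused space after a 3-byte message header and a 2-byte stream_id; the function and constant names are hypothetical.

```python
MIN_UNUSED = 32      # threshold from the rule above
HEADER_LEN = 3       # command (1) + length (2)
STREAM_ID_LEN = 2    # DATA messages carry a stream_id

def data_bytes_to_pack(unused, pending):
    """Return how many pending DATA bytes to append to the current cell.

    unused:  bytes that would be left unused in the relay cell
    pending: bytes of stream data waiting to be sent on this circuit
    """
    if unused < MIN_UNUSED or pending == 0:
        return 0
    room = unused - HEADER_LEN - STREAM_ID_LEN
    return min(room, pending)
```

For example, a cell with exactly 32 unused bytes has room for 27 bytes of packed data.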

Onion services

Negotiating this for onion services will happen in a separate proposal; it is not a current priority, since there is nothing sent over rendezvous circuits that we currently need to fragment or pack.

Miscellany

Handling RELAY_EARLY

The RELAY_EARLY status for a command is determined based on the relay cell in which the command's header appeared.

Handling SENDMEs

SENDME messages may not be fragmented; the body and the command must appear in the same cell. (This is necessary so authenticated sendmes can have a reasonable implementation.)

An exception for DATA.

Data messages may not be fragmented. (There is never a reason to do this.)

Extending message-length maxima

For now, the maximum length for every message body is 493 bytes, except as follows:

  • DATAGRAM messages (see proposal 339) have a maximum body length of 1967 bytes. (This works out to four relay cells, and accommodates most reasonable MTU choices)

Any increase in maximum length for any other message type requires a new Relay subprotocol version. (For example, if we later want to allow EXTEND2 messages to be 2000 bytes long, we need to add a new proposal saying so, and reserving a new subprotocol version.)
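The 1967-byte figure is consistent with four 493-byte relay cell bodies minus one message header and one routing header. The Python sketch below checks that arithmetic; the helper names are hypothetical, and the sketch assumes the message starts at the beginning of a cell.

```python
RELAY_BODY_LEN = 493
MESSAGE_HEADER_LEN = 3   # command (1) + length (2)
ROUTING_HEADER_LEN = 2   # stream_id

def max_body_len(cells, has_stream_id=True):
    """Largest message body that fits in the given number of relay cells."""
    overhead = MESSAGE_HEADER_LEN + (ROUTING_HEADER_LEN if has_stream_id else 0)
    return cells * RELAY_BODY_LEN - overhead

def cells_needed(body_len, has_stream_id=True):
    """Relay cells needed for a message that starts at the top of a cell."""
    overhead = MESSAGE_HEADER_LEN + (ROUTING_HEADER_LEN if has_stream_id else 0)
    return -(-(body_len + overhead) // RELAY_BODY_LEN)  # ceiling division
```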

Appendix: Example cells

Here is an example of the simplest case: one message, sent in one relay cell:

    Cell 1:
      header:
         circid         ..                [4 bytes]
         command        RELAY             [1 byte]
      relay cell header:
         recognized     0                 [2 bytes]
         digest         (...)             [14 bytes]
      message header:
         command        BEGIN             [1 byte]
         length         23                [2 bytes]
      message routing header:
         stream_id      42                [2 bytes]
      message body:
        "www.torproject.org:443\0"        [23 bytes]
      end-of-messages marker:
        0                                 [1 byte]
      padding up to end of cell:
        random                            [464 bytes]

Total of 514 bytes which is the absolute maximum cell size.

Here's an example with fragmentation only: a large EXTEND2 message split across two relay cells.

    Cell 1:
      header:
         circid         ..               [4 bytes]
         command        RELAY_EARLY      [1 byte]
      relay cell header:
         recognized     0                [2 bytes]
         digest         (...)            [14 bytes]
      message header:
         command        EXTEND           [1 byte]
         length         800              [2 bytes]
      message body:
         (extend body, part 1)           [490 bytes]

    Cell 2:
      header:
         circid         ..               [4 bytes]
         command        RELAY            [1 byte]
      relay cell header:
         recognized     0                [2 bytes]
         digest         (...)            [14 bytes]
      message body, continued:
         (extend body, part 2)           [310 bytes] (310+490=800)
      end-of-messages marker:
         0                               [1 byte]
      padding up to end of cell:
         random                          [182 bytes]

Each cell is 514 bytes; together they carry a message body totalling 800 bytes.

Here is an example with packing only: A BEGIN_DIR message and a data message in the same cell.

    Cell 1:
      header:
         circid         ..                [4 bytes]
         command        RELAY             [1 byte]
      relay cell header:
         recognized     0                 [2 bytes]
         digest         (...)             [14 bytes]

      # First relay message
      message header:
         command        BEGIN_DIR         [1 byte]
         length         0                 [2 bytes]
      message routing header:
         stream_id      32                [2 bytes]

      # Second relay message
      message header:
         command        DATA              [1 byte]
         length         25                [2 bytes]
      message routing header:
         stream_id      32                [2 bytes]
      message body:
         "HTTP/1.0 GET /tor/foo\r\n\r\n"  [25 bytes]

      end-of-messages marker:
         0                                [1 byte]
      padding up to end of cell:
         random                           [457 bytes]

Here is an example with packing and fragmentation: a large DATAGRAM message, a SENDME message, and an XON message.

(Note that this sequence of cells would not actually be generated by the algorithm described in "Packing decisions" above; this is only an example of what parties need to accept.)

    Cell 1:
      header:
         circid         ..               [4 bytes]
         command        RELAY            [1 byte]
      relay cell header:
         recognized     0                [2 bytes]
         digest         (...)            [14 bytes]

      # First message
      message header:
         command        DATAGRAM         [1 byte]
         length         1200             [2 bytes]
      message routing header:
         stream_id      99               [2 bytes]
      message body:
         (datagram body, part 1)         [488 bytes]

    Cell 2:
      header:
         circid         ..               [4 bytes]
         command        RELAY            [1 byte]
      relay cell header:
         recognized     0                [2 bytes]
         digest         (...)            [14 bytes]
      message body, continued:
         (datagram body, part 2)         [493 bytes]

    Cell 3:
      header:
         circid         ..               [4 bytes]
         command        RELAY            [1 byte]
      relay cell header:
         recognized     0                [2 bytes]
         digest         (...)            [14 bytes]
      message body, continued:
         (datagram body, part 3)         [219 bytes] (488+493+219=1200)

      # Second message
      message header:
         command        SENDME           [1 byte]
         length         23               [2 bytes]
      message body:
         version        1                [1 byte]
         datalen        20               [2 bytes]
         data           (digest to ack)  [20 bytes]

      # Third message
      message header:
         command        XON              [1 byte]
         length         1                [2 bytes]
      message routing header:
         stream_id      50               [2 bytes]
      message body:
         version        1                [1 byte]

      end-of-messages marker:
         0                               [1 byte]
      padding up to end of cell:
         random                          [241 bytes]

    Filename: 341-better-oos.md
    Title: A better algorithm for out-of-sockets eviction
    Author: Nick Mathewson
    Created: 25 July 2022
    Status: Open

Introduction

Our existing algorithm for handling an out-of-sockets condition needs improvement. It only handles sockets used for OR connections, and prioritizes those with more circuits. Because of these weaknesses, the algorithm is trivial to circumvent, and it's disabled by default with DisableOOSCheck.

Here we propose a new algorithm for choosing which connections to close when we're out of sockets. In summary, the new algorithm works by deciding which kinds of connections we have "too many" of, and then by closing excess connections of each kind. The algorithm for selecting connections of each kind is different.

Intuitions behind the algorithm below

We want to keep a healthy mix of connections running; favoring one kind of connection over another gives the attacker a fine way to starve the disfavored connections by making a bunch of the favored kind.

The correct mix of connections depends on the type of service we are providing. Everywhere except authorities, for example, inbound directory connections are perfectly fine to close, since nothing in our protocol actually generates them.

In general, we would prefer to close DirPort connections, then Exit connections, then OR connections.

The priority with which to close connections is different depending on the connection type. "Age of connection" or "number of circuits" may be a fine metric for how truly used an OR connection is, but for a DirPort connection, high age is suspicious.

The algorithm

Define a "candidate" connection as one that has a socket, and is either an exit stream, an inbound directory stream, or an OR connection.

(Note that OR connections can be from clients, relays, or bridges. Note that ordinary relays should not get directory streams that use sockets, since clients always use BEGIN_DIR to create tunneled directory streams.)

In all of the following, treat subtraction as saturating at zero. In other words, when you see "A - B" below, read it as "MAX(A-B, 0)".

Phase 1: Deciding how many connections to close

When we find that we are low on sockets, we pick a number of sockets that we want to close according to our existing algorithm. (That is, we try to close 1/4 of our maximum sockets if we have reached our upper limit, or 1/10 of our maximum sockets if we have encountered a failure from socket(2).) Call this N_CLOSE.

Then we decide which sockets to target based on this algorithm.

  1. Consider the total number of sockets used for exit streams (N_EXIT), the total number used for inbound directory streams (N_DIR), and the total number used for OR connections (N_OR). (In these calculations, we exclude connections that are already marked to be closed.) Call the total N_CONN = N_DIR + N_OR + N_EXIT. Define N_RETAIN = N_CONN - N_CLOSE.

  2. Compute how many connections of each type are "in excess". First, calculate our target proportions:

    • If we are an authority, let T_DIR = 1. Otherwise set T_DIR = 0.1.
    • If we are an exit or we are running an onion service, let T_EXIT = 2. Otherwise let T_EXIT = 0.1.
    • Let T_OR = 1.

    TODO: Should those numbers be consensus parameters?

    These numbers define the relative proportions of connections that we would be willing to retain in our final mix. Compute a number of excess connections of each type by calculating:

        T_TOTAL = T_OR + T_DIR + T_EXIT.
        EXCESS_DIR = N_DIR - N_RETAIN * (T_DIR / T_TOTAL)
        EXCESS_EXIT = N_EXIT - N_RETAIN * (T_EXIT / T_TOTAL)
        EXCESS_OR = N_OR - N_RETAIN * (T_OR / T_TOTAL)

  3. Finally, divide N_CLOSE among the different types of excess connections, assigning first to excess directory connections, then excess exit connections, and finally to excess OR connections.

        CLOSE_DIR = MIN(EXCESS_DIR, N_CLOSE)
        N_CLOSE := N_CLOSE - CLOSE_DIR
        CLOSE_EXIT = MIN(EXCESS_EXIT, N_CLOSE)
        N_CLOSE := N_CLOSE - CLOSE_EXIT
        CLOSE_OR = MIN(EXCESS_OR, N_CLOSE)

We will try to close CLOSE_DIR directory connections, CLOSE_EXIT exit connections, and CLOSE_OR OR connections.
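The Phase 1 computation can be sketched in Python as follows. The function and argument names are hypothetical. The proposal does not say how to round fractional results, so the sketch returns them unrounded; a real implementation would have to pick a rounding rule.

```python
def plan_closures(n_dir, n_exit, n_or, n_close,
                  is_authority=False, is_exit_or_onion=False):
    """Return (CLOSE_DIR, CLOSE_EXIT, CLOSE_OR) for Phase 1."""
    def sub(a, b):
        # Subtraction saturates at zero, as the proposal requires.
        return max(a - b, 0)

    # Target proportions.
    t_dir = 1.0 if is_authority else 0.1
    t_exit = 2.0 if is_exit_or_onion else 0.1
    t_or = 1.0
    t_total = t_or + t_dir + t_exit

    n_conn = n_dir + n_or + n_exit
    n_retain = sub(n_conn, n_close)

    excess_dir = sub(n_dir, n_retain * (t_dir / t_total))
    excess_exit = sub(n_exit, n_retain * (t_exit / t_total))
    excess_or = sub(n_or, n_retain * (t_or / t_total))

    # Assign N_CLOSE to directory, then exit, then OR connections.
    close_dir = min(excess_dir, n_close)
    n_close = sub(n_close, close_dir)
    close_exit = min(excess_exit, n_close)
    n_close = sub(n_close, close_exit)
    close_or = min(excess_or, n_close)
    return close_dir, close_exit, close_or
```

For an authority that is also an exit, with 100 directory, 500 exit, and 400 OR connections and N_CLOSE = 200, this gives N_RETAIN = 800, no excess directory connections, 100 excess exit connections, and 200 excess OR connections, so the plan is to close 0, 100, and 100 respectively.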

Phase 2: Closing directory connections

We want to close a certain number of directory connections. To select our targets, we sort first by the number of directory connections from a similar address (see "similar address" below) and then by their age, preferring to close the oldest ones first.

This approach defeats "many requests from the same address" and "open a connection and hold it open, and do so from many addresses". It is less effective against "open and close frequently, and do so from many addresses."

Note that fallback directories do not typically use sockets for handling directory connections: theirs are usually created with BEGIN_DIR.
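One reading of the Phase 2 ordering is sketched below in Python. It assumes that connections from the most crowded "similar address" groups are closed first, which the text implies but does not state outright; the function name and the tuple layout are hypothetical.

```python
from collections import Counter

def pick_dir_connections(conns, n):
    """Pick n directory connections to close.

    conns: list of (conn_id, similar_key, age_seconds) tuples, where
           similar_key identifies the connection's "similar address" group.
    """
    per_key = Counter(key for _, key, _ in conns)
    # Most crowded groups first; within a tie, oldest first.
    ranked = sorted(conns, key=lambda c: (-per_key[c[1]], -c[2]))
    return [conn_id for conn_id, _, _ in ranked[:n]]
```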

Phase 3: Closing exit connections.

We want to close a certain number of exit connections. To do this, we pick an exit connection at random, then close its circuit along with all the other exit connections on the same circuit. Then we repeat until we have closed at least our target number of exit connections.

This approach probabilistically favors closing circuits with a large number of sockets open, regardless of how long those sockets have been open. This defeats the easiest way of opening a large number of exit streams ("open them all on one circuit") without making the counter-approach ("open each exit stream on its own circuit") much more attractive.
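A minimal Python sketch of this selection loop follows. The data layout (a mapping from connection to circuit) and the function name are assumptions for illustration.

```python
import random

def pick_exit_circuits(exit_conns, target, rng=random):
    """Choose circuits to close until at least `target` exit connections go.

    exit_conns: dict mapping conn_id -> circuit_id
    Returns (circuits_to_close, conn_ids_closed).
    """
    remaining = dict(exit_conns)
    circuits, closed = [], []
    while remaining and len(closed) < target:
        victim = rng.choice(sorted(remaining))  # a random exit connection
        circuit = remaining[victim]
        # Closing the circuit closes every exit connection on it.
        same = [c for c, circ in remaining.items() if circ == circuit]
        for conn_id in same:
            del remaining[conn_id]
        circuits.append(circuit)
        closed.extend(same)
    return circuits, closed
```

Because the victim is drawn uniformly from connections rather than from circuits, a circuit holding many exit connections is proportionally more likely to be chosen.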

Phase 4: Closing OR connections.

We want to close a certain number of OR connections, to clients, bridges, or relays.

To do this, we first close OR connections with zero circuits. Then we close all OR connections but the most recent 2 from each "similar address". Then we close OR connections at random from among those not to a recognized relay in the latest directory. Finally, we close OR connections at random.

We used to unconditionally prefer to close connections with fewer circuits. That's trivial for an adversary to circumvent, though: they can just open a bunch of circuits on their bogus OR connections, and force us to preferentially close circuits from real clients, bridges, and relays.

Note that some connections that seem like client connections ("not from relays in the latest directory") are actually those created by bridges.
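The four steps for OR connections can be sketched in Python as shown below. The dictionary keys and the function name are hypothetical, and the order of closure within the first two steps is not specified by the text, so the sketch simply uses list order there.

```python
import random
from collections import defaultdict

def pick_or_connections(conns, target, rng=random):
    """Pick up to `target` OR connections to close.

    conns: list of dicts with keys "id", "circuits", "similar_key",
           "created" (larger means more recent), and "known_relay".
    """
    remaining = list(conns)
    chosen = []

    def take(candidates):
        for conn in candidates:
            if len(chosen) >= target:
                break
            chosen.append(conn["id"])
            remaining.remove(conn)

    # 1. Connections with zero circuits.
    take([c for c in remaining if c["circuits"] == 0])

    # 2. All but the most recent 2 from each "similar address".
    groups = defaultdict(list)
    for conn in remaining:
        groups[conn["similar_key"]].append(conn)
    extras = []
    for group in groups.values():
        group.sort(key=lambda c: c["created"], reverse=True)
        extras.extend(group[2:])
    take(extras)

    # 3. At random, among those not to a recognized relay.
    unknown = [c for c in remaining if not c["known_relay"]]
    rng.shuffle(unknown)
    take(unknown)

    # 4. At random, among everything that is left.
    rest = list(remaining)
    rng.shuffle(rest)
    take(rest)
    return chosen
```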

What is "A similar address"?

We define two connections as having a similar address if they are in the same IPv4 /30, or if they are in the same IPv6 /90.
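This definition can be expressed with Python's standard ipaddress module. The function names are hypothetical.

```python
import ipaddress

def similar_key(addr):
    """Return the /30 (IPv4) or /90 (IPv6) network containing addr."""
    ip = ipaddress.ip_address(addr)
    prefix = 30 if ip.version == 4 else 90
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def similar_address(a, b):
    """True if two addresses count as "similar" under the rule above."""
    return similar_key(a) == similar_key(b)
```

An IPv4 address and an IPv6 address are never similar under this sketch, since their networks are of different types.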

Acknowledgments

This proposal was inspired by a set of OOS improvements from starlight.

    Filename: 342-decouple-hs-interval.md
    Title: Decoupling hs_interval and SRV lifetime
    Author: Nick Mathewson
    Created: 9 January 2023
    Status: Draft

Motivation and introduction

Tor uses shared random values (SRVs) in the consensus to determine positions of relays within a hash ring. Which shared random value is to be used for a given time period depends upon the time at which that shared random value became valid.

But right now, the consensus voting period is closely tied to the shared random value voting cycle, and clients need to understand both of these in order to determine when a shared random value became current.

This creates tight coupling between:

  • The voting schedule
  • The SRV liveness schedule
  • The hsdir_interval parameter that determines the length of an HSDir index

To decouple these values, this proposal describes a forward compatible change to how Tor reports SRVs in consensuses, and how Tor decides which hash ring to use when.

Reporting SRV timestamps

In consensus documents, parties should begin to accept shared-rand-*-value lines with an additional argument, in the format of an IsoTimeNospace timestamp (like "1985-10-26T00:00:00"). When present, this timestamp indicates the time at which the given shared random value first became the "current" SRV.

Additionally, we define a new consensus method that adds these timestamps to the consensus.

We specify that, in the absence of such a timestamp, parties are to assume that the shared-rand-current-value SRV became "current" at the first 00:00 UTC on the UTC day of the consensus's valid-after timestamp, and that the shared-rand-previous-value SRV became "current" at 00:00 UTC on the previous UTC day.
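The fallback rule can be sketched in Python as follows. The function name is hypothetical, and the input is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

def default_srv_times(valid_after):
    """Assumed start times for SRVs that carry no timestamp.

    Returns (current_since, previous_since) for a consensus with the
    given valid-after time.
    """
    va = valid_after.astimezone(timezone.utc)
    current_since = va.replace(hour=0, minute=0, second=0, microsecond=0)
    previous_since = current_since - timedelta(days=1)
    return current_since, previous_since
```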

Generalizing HSDir index scheduling.

Under the current HSDir design, there is one SRV for each time period, and one time period for which each SRV is in use. Decoupling hsdir_interval from 24 hours will require that we change this notion slightly.

We therefore propose this set of generalized directory behavior rules, which should be equivalent to the current rules under current parameters.

The calculation of time periods remains the same (see rend-spec-v3.txt section [TIME PERIODS]).

A single SRV is associated with each time period: specifically, the SRV that was "current" at the start of the time period.

There is a separate hash ring associated with each time period and its SRV.

Whenever fetching an onion service descriptor, the client uses the hash ring for the time period that contains the start of the liveness interval of the current consensus. Call this the "consensus" time period.

Whenever uploading an onion service descriptor, the service uses two or three hash rings:

  • The "consensus" time period (see above).
  • The immediately preceding time period, if the SRV to calculate that hash ring is available in the consensus.
  • The immediately following time period, if the SRV to calculate that hash ring is available in the consensus.

(Under the current parameters, where hsdir_interval = SRV_interval, there will never be more than two possible time periods for which the service can qualify.)
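The rule that ties an SRV to a time period can be sketched in Python as below. The function name and the (timestamp, value) pair layout are assumptions. The sketch only looks at the SRVs it is given: deciding whether the SRV for a future time period is "available" also needs the SRV schedule, which is not modeled here.

```python
def srv_for_period(period_start, srvs):
    """Return the SRV that was "current" at the start of a time period.

    srvs: list of (became_current, value) pairs taken from the consensus,
          with became_current on the same time scale as period_start.
    Returns None if no listed SRV was current yet at period_start.
    """
    eligible = [(t, v) for t, v in srvs if t <= period_start]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[0])[1]
```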

Migration

We declare that, for at least the lifetime of the C tor client, we will not make any changes to the voting interval, the SRV interval, or the hsdir_interval. As such, we do not need to prioritize implementing these changes in the C client: we can make them in Arti only.

Issues left unsolved

There are likely other lingering issues that would come up if we try to change the voting interval. This proposal does not attempt to solve them.

This proposal does not attempt to add flexibility to the SRV voting algorithm itself.

Changing hsdir_interval would create a flag day where everybody using old and new values of hsdir_interval would get different hash rings. We do not try to solve that here.

Acknowledgments

Thanks to David Goulet for explaining all of this stuff to me!

Filename: 343-rend-caa.txt
Title: CAA Extensions for the Tor Rendezvous Specification
Author: Q Misell <q@as207960.net>
Created: 2023-04-25
Status: Open
Ticket: https://gitlab.torproject.org/tpo/core/tor/-/merge_requests/716

Overview:
   The document defines extensions to the Tor Rendezvous Specification Hidden Service descriptor format to allow the attachment of DNS style CAA records to Tor hidden services to allow the same security benefits as CAA provides in the DNS.

Motivation:
   As part of the work on draft-misell-acme-onion [I-D.misell-acme-onion] at the IETF it was felt necessary to define a method to incorporate CAA records [RFC8659] into Tor hidden services.

   CAA records in the DNS provide a mechanism to indicate which Certificate Authorities are permitted to issue certificates for a given domain name, and restrict which validation methods are permitted for certificate validation.

   As Tor hidden service domains are not in the DNS another way to provide the same security benefits as CAA does in the DNS needed to be devised.

   It is important to note that a hidden service is not required to publish a CAA record to obtain a certificate, as is the case in the DNS.

   More information about this project in general can be found at https://acmeforonions.org.

Specification:
   To enable maximal code re-use in CA codebases the same CAA record format is used in Tor hidden services as in the DNS. To this end a new field is added to the second layer hidden service descriptor [tor-rend-spec-v3] § 2.5.2.2. with the following format:

     "caa" SP flags SP tag SP value NL
     [Any number of times]

   The contents of "flags", "tag", and "value" are as per [RFC8659] § 4.1.1. Multiple CAA records may be present, as is the case in the DNS.
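   As an illustration of the "caa" line format, here is a minimal Python sketch that collects CAA records from the text of a decrypted second layer descriptor. The function name is hypothetical; the value is returned exactly as written (including any quotes), and interpreting it per [RFC8659] is left to the caller.

```python
def parse_caa_lines(descriptor_text):
    """Return a list of (flags, tag, value) from the "caa" lines."""
    records = []
    for line in descriptor_text.splitlines():
        parts = line.split(" ", 3)
        if len(parts) != 4 or parts[0] != "caa":
            continue  # some other descriptor line
        flags = int(parts[1])
        if not 0 <= flags <= 255:
            raise ValueError("CAA flags must fit in one octet")
        records.append((flags, parts[2], parts[3]))
    return records
```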
   A hidden service's second layer descriptor using CAA may look something like the following:

     create2-formats 2
     single-onion-service
     caa 0 issue "example.com"
     caa 0 iodef "mailto:security@example.com"
     caa 128 validationmethods "onion-csr-01"
     introduction-point AwAGsAk5nSMpAhRqhMHbTFCTSlfhP8f5PqUhe6DatgMgk7kSL3KHCZ...

   As the CAA records are in the second layer descriptor and in the case of a hidden service requiring client authentication it is impossible to read them without the hidden service trusting a CA's public key, a method is required to signal that there are CAA records present (but not reveal their contents, which may disclose unwanted information about the hidden service operator to third parties). This is to allow a CA to know that it must attempt to check CAA records before issuance, and fail if it is unable to do so.

   To this end a new field is added to the first layer hidden service descriptor [tor-rend-spec-v3] § 2.5.1.2. with the following format:

     "caa-critical" NL
     [At most once]

Security Considerations:
   The second layer descriptor is signed, encrypted and MACed in a way that only a party with access to the secret key of the hidden service could manipulate what is published there. Therefore, Tor CAA records have at least the same security as those in the DNS secured by DNSSEC.

   The "caa-critical" flag is visible to anyone with knowledge of the hidden service's public key, however it reveals no information that could be used to de-anonymize the hidden service operator.

   The CAA flags in the second layer descriptor may reveal information about the hidden service operator if they choose to publish an "iodef", "contactemail", or "contactphone" tag. These however are not required for the primary goal of CAA, that is to restrict which CAs may issue certificates for a given domain name.
   No more information is revealed by the "issue" nor "issuewild" tags than would be revealed by someone making a connection to the hidden service and noting which certificate is presented.

Compatibility:
   The hidden service spec [tor-rend-spec-v3] already requires that clients ignore unknown lines when decoding hidden service descriptors, so this change should not cause any compatibility issues. Additionally in testing no compatibility issues were found with existing Tor implementations.

   A hidden service with CAA records published in its descriptor is available at znkiu4wogurrktkqqid2efdg4nvztm7d2jydqenrzeclfgv3byevnbid.onion, to allow further compatibility testing.

References:
   [I-D.misell-acme-onion]
      Misell, Q., "Automated Certificate Management Environment (ACME) Extensions for ".onion" Domain Names", Internet-Draft draft-misell-acme-onion-02, April 2023, <https://datatracker.ietf.org/doc/html/draft-misell-acme-onion-02>.

   [RFC8659]
      Hallam-Baker, P., Stradling, R., and J. Hoffman-Andrews, "DNS Certification Authority Authorization (CAA) Resource Record", RFC 8659, DOI 10.17487/RFC8659, November 2019, <https://www.rfc-editor.org/info/rfc8659>.

   [tor-rend-spec-v3]
      The Tor Project, "Tor Rendezvous Specification - Version 3", <https://spec.torproject.org/rend-spec-v3>.

Filename: 344-protocol-info-leaks.txt
Title: Prioritizing Protocol Information Leaks in Tor
Author: Mike Perry
Created: 2023-07-17
Purpose: Normative
Status: Open

0. Introduction

Tor's protocol has numerous forms of information leaks, ranging from highly severe covert channels, to behavioral issues that have been useful in performing other attacks, to traffic analysis concerns.

Historically, we have had difficulty determining the severity of these information leaks when they are considered in isolation. At a high level, many information leaks look similar, and all seem to be forms of traffic analysis, which is regarded as a difficult attack to perform due to Tor's distributed trust properties.

However, some information leaks are indeed more severe than others: some can be used to remove Tor's distributed trust properties by providing a covert channel and using it to ensure that only colluding and communicating relays are present in a path, thus deanonymizing users. Some do not provide this capability, but can be combined with other info leak vectors to quickly yield Guard Discovery, and some only become dangerous once Guard Discovery or other anonymity set reduction is already achieved.

By prioritizing information leak vectors by their co-factors, impact, and resulting consequences, we can see that these attack vectors are not all equivalent. Each vector of information leak also has a common solution, and some categories even share the same solution as other categories.

This framework is essential for understanding the context in which we will be addressing information leaks, so that decisions and fixes can be understood properly. This framework is also essential for recognizing when new protocol changes might introduce information leaks or not, for gauging the severity of such information leaks, and for knowing what to do about them.
Hence, we are including it in tor-spec, as a living, normative document to be updated with experience, and as external research progresses. It is essential reading material for any developers working on new Tor implementations, be they Arti, Arti-relay, or a third party implementation. This document is likely also useful to developers of Tor-like anonymity systems, of which there are now several, such as I2P, MASQUE, and Oxen. They definitely share at least some, and possibly even many of these issues. Readers who are relatively new to anonymity literature may wish to first consult the Glossary in Section 3, especially if terms such as Covert Channel, Path Bias, Guard Discovery, and False Positive/False Negative are unfamiliar or hazy. There is also a catalog of historical real-world attacks that are known to have been performed against Tor in Section 2, to help illustrate how information leaks have been used adversarially, in practice. We are interested in hearing from journalists and legal organizations who learn about court proceedings involving Tor. We became aware of three instances of real-world attacks covered in Section 2 in this way. Parallel construction (hiding the true source of evidence by inventing an alternate story for the court -- also known as lying) is a possibility in the US and elsewhere, but (so far) we are not aware of any direct evidence of this occurring with respect to Tor cases. Still, keep your eyes peeled... 0.1. Table of Contents 1. Info Leak Vectors 1.1. Highly Severe Covert Channel Vectors 1.1.1. Cryptographic Tagging 1.1.2. End-to-end cell header manipulation 1.1.3. Dropped cells 1.2. Info Leaks that enable other attacks 1.2.1. Handshakes with unique traffic patterns 1.2.2. Adversary-Induced Circuit Creation 1.2.3. Relay Bandwidth Lying 1.2.4. Metrics Leakage 1.2.5. Protocol Oracles 1.3. Info Leaks of Research Concern 1.3.1. Netflow Activity 1.3.2. Active Traffic Manipulation Covert Channels 1.3.3. 
Passive Application-Layer Traffic Patterns 1.3.4. Protocol or Application Linkability 1.3.5. Latency Measurement 2. Attack Examples 2.1. CMU Tagging Attack 2.2. Guard Discovery Attacks with Netflow Deanonymization 2.3. Netflow Anonymity Set Reduction 2.4. Application Layer Confirmation 3. Glossary 1. Info Leak Vectors In this section, we enumerate the vectors of protocol-based information leak in Tor, in order of highest priority first. We separate these vectors into three categories: "Highly Severe Covert Channels", "Info Leaks that Enable other attacks", and "Info Leaks Of Research Concern". The first category yields deanonymization attacks on their own. The second category enables other attacks that can lead to deanonymization. The final category can be aided by the earlier vectors to become more severe, but overall severity is a combination of many factors, and requires further research to illuminate all of these factors. For each vector, we provide a brief "at-a-glance" summary, which includes a ballpark estimate of Accuracy in terms of False Positives (FP) and False Negatives (FN), as 0, near-zero, low, medium, or high. We then list what is required to make use of the info leak, the impact, the reason for the prioritization, and some details on where the signal is injected and observed. 1.1. Highly Severe Covert Channel Vectors This category of info leak consists entirely of covert channel vectors that have zero or near-zero false positive and false negative rates, because they can inject a covert channel in places where similar activity would not happen, and they are end-to-end. They also either provide or enable path bias attacks that can capture the route clients use, to ensure that only malicious exits are used, leading to full deanonymization when the requirements are met. If the adversary has censorship capability, and can ensure that users only connect to compromised Guards (or Bridges), they can fully deanonymize all users with these covert channels. 
1.1.1. Cryptographic Tagging At a glance: Accuracy: FP=0, FN=0 Requires: Malicious or compromised Guard, at least one exit Impact: Full deanonymization (path bias, identifier transmission) Path Bias: Automatic route capture (all non-deanonymized circuits fail) Reason for prioritization: Severity of Impact; similar attacks used in wild Signal is: Modified cell contents Signal is injected: by guard Signal is observed: by exit First reported at Black Hat in 2009 (see [ONECELL]), and elaborated further with the path bias amplification attack in 2012 by some Raccoons (see [RACCOON23]), this is the most severe vector of covert channel attack in Tor. Cryptographic tagging is where an adversary who controls a Guard (or Bridge) XORs an identifier, such as an IP address, directly into the circuit's cipher-stream, in an area of known-plaintext. This tag can be exactly recovered by a colluding exit relay, ensuring zero false positives and zero false negatives for this built-in identifier transmission, along with their collusion signal. Additionally, because every circuit that does not have a colluding relay will automatically fail because of the failed digest validation, the adversary gets a free path bias amplification attack, such that their relay only actually carries traffic that they know they have successfully deanonymized. Because clients will continually attempt to re-build such circuits through the guard until they hit a compromised exit and succeed, this violates Tor's distributed trust assumption, reducing it to the same security level as a one-hop proxy (ie: the security of fully trusting the Guard relay). Worse still, when the adversary has full censorship control over all connections into the Tor network, Tor provides zero anonymity or privacy against them, when they also use this vector. Because the Exit is able to close *all* circuits that are not deanonymized, for maximal efficiency, the adversary's Guard capacity should exactly match their Exit capacity. 
To make up for the loss of traffic caused by closing many circuits, relays can lie about their bandwidth (see Section 1.2.3). Large amounts of circuit failure (that might be evidence of such an attack) are tracked and reported by C-Tor in the logs, by the path bias detector, but when the Guard is under DDoS, or even heavy load, this can yield false alarms. These false alarms happened frequently during the network-wide DDoS of 2022-2023. They can also be induced at arbitrary Guards via DoS, to make users suspicious of their Guards for no reason. The path bias detector could have a second layer in Arti, that checks to see if any specific Exits are overused when the circuit failure rate is high. This would be more indicative of an attack, but could still go off if the user is actually trying to use rare exits (ie: country selection, bittorrent). This attack, and path bias attacks that are used in the next two sections, do have some minor engineering barriers when being performed against both onion and exit traffic, because the onion service traffic is restricted to particular hops in the case of HSDIR and intro point circuits. However, because pre-built circuits are used to access HSDIR and intro points, the adversary can use their covert channel such that only exits and pre-built onion service circuits are allowed to proceed. Onion services are harder to deanonymize in this way, because the HSDIR choice itself can't be controlled by them, but they can still be connected to using pre-built circuits until the adversary also ends up in the HSDIR position, for deanonymization. Solution: Path Bias Exit Usage Counter; Counter Galois Onion (CGO) (Forthcoming update to Prop#308). Status: Unfixed (Current PathBias detector is error-prone under DDoS) Funding: CGO explicitly funded via Sponsor 112 1.1.2. 
End-to-end cell header manipulation At a glance: Accuracy: FP=0, FN=0 Requires: Malicious or compromised Guard, at least one exit Impact: Full deanonymization (path bias, identifier transmission) Path Bias: Full route capture is trivial Reason for prioritization: Severity of Impact; used in the wild Signal is: Modified cell commands. Signal is injected: By either guard or exit/HSDIR Signal is observed: By either guard or exit/HSDIR The Tor protocol consists of both cell header commands, and relay header commands. Cell commands are not encrypted by circuit-level encryption, so they are visible and modifiable by every relay in the path. Relay header commands are encrypted, and not visible to every hop in the path. Not all cell commands are forwarded end-to-end. Currently, these are limited to RELAY, RELAY_EARLY, and DESTROY. Because of the attack described here, great care must be taken when adding new end-to-end cell commands, even if they are protected by a MAC. Previously, a group of researchers at CMU used this property to modify the cell command header of cells on circuits, to switch between RELAY_EARLY and RELAY at exits and HSDIRs (see [RELAY_EARLY]). This creates a visible bit in each cell, that can signal collusion, or with enough cells, can encode an identifier such as an IP address. They assisted the FBI, to use this attack in the wild to deanonymize clients. We addressed the CMU attack by closing the circuit upon receiving an "inbound" (towards the client) RELAY_EARLY command cell, and by limiting the number of "outbound" (towards the exit) RELAY_EARLY command cells at relays, and by requiring the use of RELAY_EARLY for EXTEND (onionskin) relay commands. This defense is not generalized, though. Guards may still use this specific covert channel to send around 3-5 bits of information after the extend handshake, without killing the circuit. 
It is possible to use the remaining outbound vector to assist in path bias attacks for dropped cells, as a collusion signal to reduce the amount of non-compromised traffic that malicious exits must carry (see the following Section 1.1.3).

If this covert channel is not addressed, it is trivial for Guard and Exit relays to close every circuit that does not display this covert channel, providing a path bias amplification attack and distributed trust reduction, similar to cryptographic tagging attacks. Because the inbound direction *is* addressed, we believe this kind of path bias is currently not possible with this vector by itself (thus also requiring the vector from Section 1.1.3), but it could easily become possible if this defense is forgotten, or if a new end-to-end cell type is introduced.

While more cumbersome than cryptographic tagging attacks, in practice this attack is just as successful if these cell command types are not restricted and limited. It is somewhat surprising that the FBI used this attack before cryptographic tagging, but perhaps that was just a lucky coincidence of opportunity.

Solution: CGO (Updated version of Prop#308) covers cell commands in the MAC;
          Any future end-to-end cell commands must still limit usage
Status: Fix specific to CMU attack; Outbound direction is unfixed
Funding: Arti and relay-side fixes are explicitly funded via Sponsor 112

1.1.3. Dropped cells

At a glance:
  Accuracy: FP=0, FN=0
  Requires: Malicious Guard or Netflow data (if high volume), one exit
  Impact: Full deanonymization (path bias amplification, collusion signal)
  Path Bias: Full route capture is trivial
  Reason for prioritization: Severity of Impact; similar attacks used in wild
  Signal is: Unusual patterns in number of cells received
  Signal is injected: By exit or HSDIR
  Signal is observed: At guard or client<->guard connection

Dropped cells are cells that a relay can inject that end up ignored and discarded by a Tor client.
These include:

  - Unparsable cells
  - Unrecognized cells (ie: wrong source hop, or decrypt failures)
  - Invalid relay commands
  - Unsupported (or consensus-disabled) relay commands or extensions
  - Out-of-context relay commands
  - Duplicate relay commands
  - Relay commands that hit any error codepaths
  - Relay commands for an invalid or already-closed stream ID
  - Semantically void relay cells (incl relay data len == 0, or PING)
  - Onion descriptor-appended junk

This attack works by injecting inbound RELAY cells at the exit or at a middle relay, and then observing anomalous traffic patterns at the guard or at the client->guard connection.

The severity of this covert channel is extreme (zero false positives; zero false negatives) when the cells are injected in cases where the circuit is otherwise known to be silent, because of the protocol state machine. These cases include:

  - Immediately following an onionskin response
  - During other protocol handshakes (onion services, conflux)
  - Following relay CONNECTED or RESOLVED (not as severe - no path bias)

Because of the stateful and deterministic nature of the Tor protocol, especially handshakes, it is easy to accurately recognize these specific cases even when observing only encrypted circuit traffic at the Guard relay (see [DROPMARK]).

Because this covert channel is most accurate before actual circuit use, when the circuit is expected to be otherwise silent, it is trivial for a Guard relay to close every circuit that does not display this covert channel, providing a path bias amplification attack and distributed trust reduction, similar to cryptographic tagging attacks and end-to-end cell header manipulation.

This ability to use the collusion signal to perform path bias before circuit use differentiates dropped cells within the Tor Protocol from deadweight traffic during application usage (such as javascript requests for 404 URLs, covered in Section 1.3.2).
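As a rough illustration of a few of the dropped-cell categories listed above, the following sketch counts inbound relay cells that a toy client would silently ignore. It is a simplified model and not C-Tor or Arti code: the `RelayCell` fields, the `KNOWN_COMMANDS` subset, and the rules in `is_dropped` are assumptions made for illustration, and they cover only some of the listed cases.

```python
from dataclasses import dataclass

# Hypothetical, simplified view of an inbound relay cell after decryption.
@dataclass
class RelayCell:
    recognized: bool   # False if the digest/decrypt check failed at this hop
    command: str       # relay command name, e.g. "DATA" or "CONNECTED"
    stream_id: int
    data_len: int

# Illustrative subset of relay commands this toy client understands.
KNOWN_COMMANDS = {"DATA", "CONNECTED", "END", "SENDME", "RESOLVED"}

def is_dropped(cell, open_streams):
    """Return True if this toy client would discard the cell without effect."""
    if not cell.recognized:
        return True                      # unrecognized cell
    if cell.command not in KNOWN_COMMANDS:
        return True                      # invalid or unsupported relay command
    if cell.command == "DATA":
        if cell.stream_id not in open_streams:
            return True                  # invalid or already-closed stream ID
        if cell.data_len == 0:
            return True                  # semantically void relay cell
    return False

def count_dropped(cells, open_streams):
    """Count the cells on one circuit that would be ignored."""
    return sum(1 for cell in cells if is_dropped(cell, open_streams))
```

In this model, a non-zero count on a circuit that the protocol state machine expects to be silent is the kind of event a defense could use as grounds for closing the circuit, rather than ignoring the cells.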
This category is not quite as severe as these previous two categories by itself, for two main reasons. However, it is also the case that due to other factors, these reasons may not matter in practice. First, the Exit can't use this covert channel to close circuits that are not deanonymized by a colluding Guard, since there is no covert channel from the Guard to the Exit with this vector alone. Thus, unlike cryptographic tagging, the adversary's Exits will still carry non-deanonymized traffic from non-adversary Guards, and thus the adversary needs more Exit capacity than Guard capacity. These kinds of more subtle trade-offs with respect to path bias are covered in [DOSSECURITY]. However, note that this issue can be fixed by using the previous RELAY_EARLY covert channel from the Guard to the Exit (since this direction is unfixed). This allows the adversary to confirm receipt of the dropped cell covert channel, allowing both the Guard and the Exit to close all non-confirmed circuits, and thus ensure that they only need to allocate equal amounts of compromised Guard and Exit traffic, to monitor all Tor traffic. Second, encoding a full unique identifier in this covert channel is non-trivial. A significant amount of injected traffic must be sent to exchange more than a simple collusion signal, to link circuits when attacking a large number of users. In practice, this likely means some amount of correlation, and a resulting (but very small) statistical error. Obviously, the actual practical consequences of these two limitations are questionable, so this covert channel is still regarded as "Highly Severe". It can still result in full deanonymization of all Tor traffic by an adversary with censorship capability, with very little error. Solution: Forthcoming dropped-cell proposal Status: Fixed with vanguards addon; Unfixed otherwise Funding: Arti and relay-side fixes are explicitly funded via Sponsor 112 1.2. 
Info Leaks that enable other attacks

These info leaks are less severe than the first group, as they do not yield full covert channels, but they do enable other attacks, including guard discovery and eventual netflow deanonymization, and website traffic fingerprinting.

1.2.1. Handshakes with unique traffic patterns

At a glance:
  Accuracy: FP=near-zero, FN=near-zero
  Requires: Compromised Guard
  Impact: Anonymity Set Reduction and Oracle; assists in Guard Discovery
  Path Bias: Full route capture is difficult (high failure rate)
  Reason for Prioritization: Increases severity of vectors 1.2.2 and 1.3.3
  Signal is: Caused by client's behavior.
  Signal is observed: At guard
  Signal is: Unique cell patterns

Certain aspects of Tor's handshakes are highly distinctive and easy to fingerprint, based only on observed traffic timing and volume patterns. In particular, the onion client and onion service handshake activity is fingerprintable with near-zero false negative and near-zero false positive rates, as per [ONIONPRINT]. The conflux link handshake is also unique (and thus accurately recognizable), because it is our only 3-way handshake.

This info leak is very accurate. However, the impact is much lower than that of covert channels, because by itself, it can only tell if a particular Tor protocol, behavior, or feature is in use.

Additionally, Tor's distributed trust properties remain intact, because there is no collusion signal built in to this info leak. When a path bias attack is mounted to close circuits during circuit handshake construction without a collusion signal to the Exit, it must proceed hop-by-hop. Guards must close circuits that do not extend to colluding middles, and those colluding middles must close circuits that don't extend to colluding exits. This means that the adversary must control some relays in each position, and has a substantially higher circuit failure rate while directing circuits to each of these relays in a path.
To put this into perspective, an adversary using a collusion signal with 10% of Exits expects to fail 9 circuits before detecting their signal at a colluding exit and allowing a circuit to succeed. However, an adversary without a collusion signal and 10% of all relays expects to fail 9 circuits before getting a circuit to their middle, but then expects 9 of *those* circuits to fail before reaching an Exit, for 81 circuit failures for every successful circuit. Published attacks have built upon this info leak, though. In particular, certain error conditions, such as returning a single "404"-containing relay cell for an unknown onion service descriptor, are uniquely recognizable. This fingerprint was used in the [ONIONFOUND] guard discovery attack, and they provide a measurement of its uniqueness. Additionally, onion client fingerprintability can be used to vastly reduce the set of website traffic traces that need to be considered for website traffic fingerprinting (see Section 1.3.3), making that attack realistic and practical. Effectively, it functions as a kind of oracle in this case (see Glossary, and [ORACLES]). Solution: Padding machines at middles for protocol handshakes (as per [PCP]); Pathbias-lite. Status: Padding machines deployed for onion clients, but have weaknesses against DF and stateful cross-circuit fingerprints Funding: Not explicitly funded 1.2.2. Adversary-Induced Circuit Creation At a glance: Accuracy: FP=high, FN=high Requires: Onion service activity, or malicious exit Impact: Guard discovery Path Bias: Repeated circuits eventually provide the desired path Reason for Prioritization: Enables Guard Discovery Signal is: Inducing a client to make a new Tor circuit Signal is injected: by application layer, client, or malicious relay Signal is observed: At middle By itself, the ability for an adversary to cause a client to create circuits is not a covert channel or arguably even an info leak. 
Circuit creation, even bursts of frequent circuit creation, is commonplace on the Tor network. However, when this activity is combined with a covert channel from Section 1.1, with a unique handshake from Section 1.2.1, or with active traffic manipulation (Section 1.3.2), then it leads to Guard Discovery, by allowing the adversary to recognize when they are chosen for the Middle position, and thus learn the Guard. Once Guard Discovery is achieved, netflow analysis of the Guard's connections can be used to perform intersection attacks and eventually determine the client IP address (see Section 1.3.1). Large quantities of circuit creation can be induced by: - Many connections to an Onion Service - Causing a client to make connections to many onion service addresses - Application connection to ports in rare exit policies, followed by circuit close at Exit - Repeated Conflux leg failures In Tor 0.4.7 and later, onion services are protected from this activity via Vanguards-Lite (Proposal #333). This system adds a second layer of vanguards to onion service circuits, with rotation times set such that it is sufficient to protect a user for use cases on the order of weeks, assuming the adversary does not get lucky and land in a set. Non-Onion service activity, such as Conflux leg failures, is protected by feature-specific rate limits. Longer lived onion services should use the Vanguards Addon, which implements Mesh Vanguards (Prop#292). It uses two layers of vanguards, and expected use cases of months. These attack times are probabilistic expectations, and are rough estimates. See the proposals for details. To derive these numbers, the proposals assume a 100% accurate covert channel for detecting that the middle is in the desired circuit. 
If we address the low hanging fruit for such covert channels above, these numbers change, and such attacks also become much more easily detectable, as they will rely on application layer covert channels (See Section 1.3.2), which will resemble an application layer DoS or flood. Solution: Mesh-vanguards (Prop#292); Vanguards-lite (Prop#333); rate limiting circuit creation attempts; rate limiting the total number of distinct paths used by circuits Status: Vanguards-lite deployed in Tor 0.4.7; Mesh-vanguards is vanguards addon; Conflux leg failures are limited per-exit; Exitpolicy scanner exists Funding: Not explicitly funded 1.2.3. Relay Bandwidth Lying At a glance: Accuracy: FP=high, FN=high Requires: Running relays in the network Impact: Additional traffic towards malicious relays Path Bias: Bandwidth lying can make up for circuit rejection Reason for prioritization: Assists Covert Channel Path Bias attacks Signal is injected: by manipulating reported descriptor bandwidths Signal is observed: by clients choosing lying relays more often Signal is: the effect of using lying relays more often Tor clients select relays for circuits in proportion to their fraction of consensus "bandwidth" weight. This consensus weight is calculated by multiplying the relay's self-reported "observed" descriptor bandwidth value by a ratio that is measured by the Tor load balancing system (formerly TorFlow; now sbws -- see [SBWS] for an overview). The load balancing system uses two-hop paths to measure the stream bandwidth through all relays on the network. The ratio is computed by determining a network-wide average stream bandwidth, 'avg_sbw', and a per-relay average stream bandwidth, 'relay_sbw'. Each relay's ratio value is 'relay_sbw/avg_sbw'. (There are also additional filtering steps to remove slow outlier streams). 
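The weight calculation described above can be sketched as follows. This is a minimal model under stated assumptions, not the sbws implementation: `avg_sbw` is taken here as the unweighted mean of the per-relay averages, the outlier filtering steps are omitted, and the function names and the relay labels in the usage note are hypothetical.

```python
def stream_ratios(relay_sbw):
    """Map each relay to relay_sbw / avg_sbw.

    relay_sbw: dict of relay name -> measured per-relay average stream bandwidth.
    Assumption: avg_sbw is the plain mean over relays; filtering is omitted.
    """
    avg_sbw = sum(relay_sbw.values()) / len(relay_sbw)
    return {relay: sbw / avg_sbw for relay, sbw in relay_sbw.items()}

def consensus_weights(observed_bw, relay_sbw):
    """Multiply each self-reported observed bandwidth by the measured ratio."""
    ratios = stream_ratios(relay_sbw)
    return {relay: observed_bw[relay] * ratios[relay] for relay in observed_bw}

def selection_fractions(weights):
    """Fraction of selections each relay receives, proportional to its weight."""
    total = sum(weights.values())
    return {relay: weight / total for relay, weight in weights.items()}
```

For example, with three relays `a`, `b`, and `c` that each report an observed bandwidth of 100 and are each measured at the same stream bandwidth, every relay receives one third of selections. If `c` instead reports 400 while its measured stream bandwidth is unchanged, its ratio stays at 1 but its share rises to two thirds, since the self-reported value enters the weight as a direct multiplier.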
Because the consensus weights for relays derive from manipulated descriptor values by multiplication with this ratio, relays can still influence their weight by egregiously lying in their descriptor value, thus attracting more client usage. They can also attempt to fingerprint load balancer activity and selectively give it better service, though this is more complicated than simply patching Tor to lie. This attack vector is especially useful when combined with a path bias attack from Section 1.1: if an adversary is using one of those covert channels to close a large portion of their circuits, they can make up for this loss of usage by inflating their corresponding bandwidth value by an equivalent amount, thus causing the load balancer to still measure a reasonable ratio for them, and thus still provide fast service for the fully deanonymized circuits that they do carry. There are many research papers written on alternate approaches to the measurement problem. These have not been deployed for three reasons: 1. The unwieldy complexity and fragility of the C-Tor codebase 2. The conflation of measurement with load balancing (we need both) 3. Difficulty performing measurement of the fastest relays with non-detectable/distributed mechanisms In the medium term, we will work on detecting bandwidth lying and manipulation via scanners. In the long term, Arti-relay will allow the implementation of distributed and/or dedicated measurement components, such as [FLASHFLOW]. (Note that FlashFlow still needs [SBWS] or another mechanism to handle load balancing, though, since FlashFlow only provides measurement). Solutions: Scan for lying relays; implement research measurement solutions Status: A sketch of the lying relay scanner design is in [LYING_SCANNER] Funding: Scanning for lying relays is funded via Sponsor 112 1.2.4. 
Metrics Leakage

At a glance:
  Accuracy: FP=low, FN=high
  Requires: Some mechanism to bias or inflate reported relay metrics
  Impact: Guard discovery
  Path Bias: Potentially relevant, depending on type of leak
  Reason for prioritization: Historically severe issues
  Signal is injected: by interacting with onion service
  Signal is observed: by reading router descriptors
  Signal is: information about volume of traffic and number of IP addresses

In the past, we have had issues with info leaks in our metrics reporting (see [METRICSLEAK]). We addressed them by lowering the resolution of read/write history, and ensuring certain error conditions could not be used to deliberately introduce noticeable asymmetries. However, certain characteristics, like reporting local onion or SOCKS activity in relay bandwidth counts, remain.

Additionally, during extremely large flooding or DDoS attempts, it may still be possible to see the corresponding increases in reported metrics for the Guards in use by an onion service, and thus discover them.

Solutions: Fix client traffic reporting; remove injectable asymmetries; reduce metrics resolution; add noise
Status: Metrics resolution reduced to 24hr; known asymmetries fixed
Funding: Not funded

1.2.5. Protocol Oracles

At a glance:
  Accuracy: FP=medium, FN=0 (for unpopular sites: FP=0, FN=0)
  Requires: Probing relay DNS cache
  Impact: Assists Website Traffic Fingerprinting; Domain Usage Analytics
  Path Bias: Not Possible
  Reason for prioritization: Historically accurate oracles
  Signal is injected: by client causing DNS caching at exit
  Signal is observed: by probing DNS response with respect to cell ordering via all exits
  Signal is: If cached, response is immediate; otherwise other cells come first

Protocol oracles, such as exit DNS cache timing to determine if a domain has been recently visited, increase the severity of Website Traffic Fingerprinting in Section 1.3.3, by reducing false positives, especially for unpopular websites.
There are additional forms of oracles for Website Traffic Fingerprinting, but the remainder are not protocol oracles in Tor. See [ORACLES] in the references. Tor deployed a defense for this oracle in the [DNSORACLE] tickets, to randomize expiry time. This helps reduce the precision of this oracle for popular and moderately popular domains/websites in the network, but does not fully eliminate it for unpopular domains/websites. The paper in [DNSORACLE] specifies a further defense, using a pre-load of popular names and circuit cache isolation defense in Section 6.2, with third party resolvers. The purpose of the pre-load list is to preserve the cache hits for shared domains across circuits (~11-17% of cache hits, according to the paper). The purpose of circuit isolation is to avoid Tor cache hits for unpopular domains across circuits. The purpose of third party resolvers is to ensure that the local resolver's cache does not become measurable, when isolating non-preloaded domains to be per-circuit. Unfortunately, third party resolvers are unlikely to be recommended for use by Tor, since cache misses of unpopular domains would hit them, and be subject to sale in DNS analytics data at high resolution (see [NETFLOW_TICKET]). Also note that the cache probe attack can only be used by one adversary at a time (or they begin to generate false positives for each other by actually *causing* caching, or need to monitor for each other to avoid each other). This is in stark contrast to third party resolvers, where this information is sold and available to multiple adversaries concurrently, for all uncached domains, with high resolution timing, without the need for careful coordination by adversaries. However, note that an arti-relay implementation would no longer be single threaded, and would be able to reprioritize asynchronous cache activity arbitrarily, especially for sensitive uncached activity to a local resolver. 
This might be useful for reducing the accuracy of the side channel, in this case. Unfortunately, we lack sufficient clarity to determine if it is meaningful to implement any further defense that does not involve third party resolvers under either current C-Tor, or future arti-relay circumstances. Solutions: Isolate cache per circuit; provide a shared pre-warmed cache of popular domains; smarter cache handling mechanisms? Status: Randomized expiry only - not fully eliminated Funding: Any further fixes are covered by Sponsor 112 1.3. Info Leaks of Research Concern In this section, we list info leaks that either need further research, or are undergoing active research. Some of these are still severe, but typically less so than the already covered ones, unless they are part of a combined attack, such as with an Oracle, or with Guard Discovery. Some of these may be more or less severe than currently suspected: If we knew for certain, they wouldn't need research. 1.3.1. Netflow Activity At a glance: Accuracy: FP=high; FN=0 (FN=medium with incomplete vantage point set) Requires: Access to netflow data market, or ISP coercion Impact: Anonymity Set Reduction; Deanonymization with Guard Discovery/Oracle Path Bias: Not possible Reason for Prioritization: Low impact without Guard Discovery/Oracle Signal is: created by using the network Signal is observed: at ISP of everything that is using the network. Signal is: Connection tuple times and byte counts Netflow is a feature of internet routers that records connection tuples, as well as time stamps and byte counts, for analysis. This data is bought and sold, by both governments and threat intelligence companies, as documented in [NETFLOW_TICKET]. Tor has a padding mechanism to reduce the resolution of this data (see Section 2 of [PADDING_SPEC]), but this hinges on clients' ability to keep connections open and padded for 45-60 minutes, even when idle. 
This padding reduces the resolution of intersection attacks, making them operate on 30 minute time windows, rather than 15 second time windows. This increases the false positive rate, and thus increases the duration of such intersection attacks. Large scale Netflow data can also be used to track Tor users as they migrate from location to location, without necessarily deanonymizing them. Because Tor uses three directory guards, and has ~4000 Guard relays, the choice Choose(4000,3) of directory Guards is ~10 billion different combinations, though probability weighting of Guard selection does reduce this considerably in practice. Lowering the total number of Guard relays (via arti-relay and using only the fastest Guards), and using just two directory guards as opposed to three can reduce this such that false positives become more common. More thorough solutions are discussed in [GUARDSETS]. Location tracking aside, by itself, this data (especially when padded) is not a threat to client anonymity. However, this data can also be used in combination with a number of Oracles or confirmation vectors, such as: - Guard Discovery - Flooding an onion service with huge amounts of traffic in a pattern - Advertising analytics or account activity log purchase - TCP RST injection - TLS conn rotation These oracles can be used to either confirm the connection of an onion service, or to deanonymize it after Guard Discovery. In the case of clients, the use of Oracle data can enable intersection attacks to deanonymize them. The oracle data necessary for client intersection attack is also being bought and sold, as documented in [NETFLOW_TICKET]. It is unknown how long such attacks take, but it is a function of the number of users under consideration, and their connection durations. 
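The directory guard combination count quoted earlier in this section can be checked directly. The sketch below is a worked calculation only, assuming exactly 4000 Guard relays chosen uniformly at random; as the text notes, real Guard selection is probability-weighted, so the effective number of distinguishable sets is smaller. The function name is hypothetical.

```python
import math

def guard_set_count(num_guards, num_directory_guards):
    """Number of unordered directory guard sets under uniform selection."""
    return math.comb(num_guards, num_directory_guards)

# Three directory guards out of 4000 Guards: roughly 10 billion sets.
three_guard_sets = guard_set_count(4000, 3)   # 10_658_668_000

# Two directory guards out of 4000 Guards: roughly 8 million sets.
two_guard_sets = guard_set_count(4000, 2)     # 7_998_000
```

Under these assumptions, moving from three directory guards to two shrinks the space of possible sets by a factor of about 1300, which is why that change makes false positives in location tracking more common.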
The research interest here is in determining what can be done to increase the amount of time these attacks take, in terms of increasing connection duration, increasing the number of users, reducing the total number of Guard relays, using a UDP transport, or changing user behavior. Solutions: Netflow padding; connection duration increase; QUIC transport; using bridges; decreasing total number of guards; using only two directory guards; guardsets; limiting sensitive account usage Status: Netflow padding deployed in C-Tor and arti Funding: Not explicitly funded 1.3.2. Active Traffic Manipulation Covert Channels At a Glance: Accuracy: FP=medium, FN=low Requires: Netflow data, or compromised/monitored Guard Impact: Anonymity Set Reduction; Netflow-assisted deanonymization Path Bias: Possible via exit policy or onion service reconnection Reason for Prioritization: Can assist other attacks; lower severity otherwise Signal is injected: by the target application, manipulated by the other end. Signal is observed: at Guard, or target->guard connection. Signal is: Unusual traffic volume or timing. This category of covert channel occurs after a client has begun using a circuit, by manipulating application data traffic. This manipulation can occur either at the application layer, or at the Tor protocol layer. Because it occurs after the circuit is in use, it does not permit the use of path bias or trust reduction properties by itself (unless combined with one of the above info leak attack vectors -- most often Adversary-Induced Circuit Creation). These covert channels also have a significantly higher false positive rate than those before circuit use, since application traffic is ad-hoc and arbitrary, and is also involved during the attempted manipulation of application traffic. For onion services, this covert channel is much more severe: Onion services may be flooded with application data in large-volume patterns over long periods of time, which can be seen in netflow logs. 
For clients, this covert channel typically is only effective after the adversary suspects an individual, for confirmation of their suspicion, or after Guard Discovery. Examples of this class of covert channel include: - Application-layer manipulation (AJAX) - Traffic delays (rainbow, swirl - see [BACKLIT]) - Onion Service flooding via HTTP POST - Flooding Tor relays to notice traffic changes in onion service throughput - Conflux leg switching patterns - Traffic inflation (1 byte data cells) Solution: Protocol checks; Padding machines at middles for specific kinds of traffic; limits on inbound onion service traffic; Backlit Status: Protocol checks performed for conflux; vanguards addon closes high-volume circuits Funding: Not explicitly funded 1.3.3. Passive Application-Layer Traffic Patterns At a Glance: Accuracy: FP=medium, FN=Medium Requires: Compromised Guard (external monitoring increases FP+FN rate) Impact: Links client and destination activity (ie: deanonymization with logs) Path Bias: Not Possible Reason for prioritization: Large FP rate without oracle, debated practicality Signal is: not injected; passively extracted Signal is observed: at Guard, or entire network Signal is: timing and volume patterns of traffic. This category of information leak occurs after a client has begun using a circuit, by analyzing application data traffic. Examples of this class of information leak include: - Website traffic fingerprinting - End-to-end correlation The canonical application of this information leak is in end-to-end correlation, where application traffic entering the Tor network is correlated to traffic exiting the Tor network (see [DEEPCOFFEA]). This attack vector requires a global view of all Tor traffic, or false negatives skyrocket. 
However, this information leak is also possible to exploit at a single observation point, using machine learning classifiers (see [ROBFINGERPRINT]), typically either the Guard or bridge relay, or the path between the Guard/bridge and the client. In both cases, this information leak has a significant false positive rate, since application traffic is ad-hoc, arbitrary, and self-similar. Because multiple circuits are multiplexed on one TLS connection, the false positive and false negative rates are higher still at this observation location, as opposed to on a specific circuit. In both cases, the majority of the information gained by classifiers is in the beginning of the trace (see [FRONT] and [DEEPCOFFEA]). This information leak gets more severe when it is combined with another oracle (as per [ORACLES]) that can confirm the statistically derived activity, or narrow the scope of material to analyze. Example oracles include: - DNS cache timing - Onion service handshake fingerprinting - Restricting targeting to either specific users, or specific websites - Advertising analytics or account activity log purchase (see [NETFLOW_TICKET]) Website traffic fingerprinting literature is divided into two classes of attack study: Open World and Closed World. Closed World is when the adversary uses an Oracle to restrict the set of possible websites to classify traffic against. Open World is when the adversary attempts to recognize a specific website or set of websites out of all possible other traffic. The nature of the protocol usage by the application can make this attack easier or harder, which has resulted in application layer defenses, such as [ALPACA]. Additionally, the original Google QUIC was easier to fingerprint than HTTP (See [QUICPRINT1]), but IETF HTTP3 reversed this (See [QUICPRINT2]). Javascript usage makes these attacks easier (see [INTERSPACE], Table 3), where as concurrent activity (in the case of TLS observation) makes them harder. 
Web3 protocols that exchange blocks of data instead of performing AJAX requests are likely to be much harder to fingerprint, so long as the web3 application is accessed via its native protocol, and not via a website front-end. The entire research literature for this vector is fraught with analysis problems, unfortunately. Because smaller web crawl sizes make the attacks more effective, and because attack papers are easier to produce than defenses generally, dismal results are commonplace. [WFNETSIM] and [WFLIVE] examine some of these effects. It is common for additional hidden gifts to adversaries to creep in, leading to contradictory results, even in otherwise comprehensive papers at top-tier venues. The entire vein of literature must be read with a skeptical eye, a fine-tooth comb, and a large dumpster nearby. As one recent example, in an otherwise comprehensive evaluation of modern defenses, [DEFCRITIC] found a contrary result with respect to the Javascript finding in the [INTERSPACE] paper, by training and testing their classifiers with knowledge of the Javascript state of the browser (thus giving them a free oracle). In truth, neither [DEFCRITIC] nor [INTERSPACE] properly examined the effects of Javascript -- a rigorous test would train and test on a mix of Javascript and non-Javascript traffic, and then compare the classification accuracy of each set separately, after joint classification. Instead, [DEFCRITIC] just reported that disabling Javascript (via the security level of Tor Browser) has "no beneficial effect", which they showed by actually letting the adversary know which traces had Javascript disabled. Such hidden gifts to adversaries are commonplace, especially in attack papers. 
While it may be useful to do this while comparing defenses against each other, when these assumptions are hidden, and when defenses are not re-tunable for more realistic conditions, this leads to focus on burdensome defenses with large amounts of delay or huge amounts of overhead, at the expense of ignoring lighter approaches that actually improve the situation in practice. This of course means that nothing gets done at all, because Tor is neither going to add arbitrary cell delay at relays (because of queue memory required for this and the impacts on congestion control), nor add 400% overhead to both directions of traffic. In terms of defense deployment, it makes the most sense to place these padding machines at the Guards to start, for many reasons. This is in contrast to other lighter padding machines for earlier vectors, where it makes more sense to place them at the middle relay. In this case, the heavier padding machines necessary for this vector can take advantage of higher multiplexing, which means less overhead. They can also use the congestion signal at the TLS connection, to more easily avoid unnecessary padding when the TLS connection is blocked, thus only using "slack" Guard capacity. Conflux also can be tuned to provide at least some benefit here: even if in lab conditions it provides low benefit, in the scenarios studied by [WFNETSIM] and [WFLIVE], this may actually be considerable, unless the adversary has both guards, which is more difficult for an internal adversary. Additionally, the distinction between external and internal adversaries is rarely, if ever, evaluated in the literature anyway, so there is little guidance on this distinction as a whole, right now. Solution: Application layer solutions ([ALPACA], disabling Javascript, web3 apps); Padding machines at guards for application traffic; conflux tuning Status: Unfixed Funding: Padding machine and simulator port to arti are funded via Sponsor 112 1.3.4. 
Protocol or Application Linkability At a Glance: Accuracy: FP=0, FN=0 Requires: Compromised Exit; Traffic Observation; Hostile Website Impact: Anonymity Set Reduction Path Bias: Not Possible Reason for prioritization: Low impact with faster releases Signal is: not injected; passively extracted Signal is observed: at Exit, or at application destination Signal is: Rare protocol usage or behavior Historically, due to Tor's slow upgrade cycles, we have had concerns about deploying new features that may fragment the anonymity set of early adopters. Since we have moved to a more rapid release cycle for both clients and relays by abandoning the Tor LTS series, these concerns are much less severe. However, they can still present concerns during the upgrade cycle. For Conflux, for example, during the alpha series, the fact that few exits supported conflux caused us to limit the number of pre-built conflux sets to just one, to avoid concentrating alpha users at just a few exits. It is not clear that this was actually a serious anonymity concern, but it was certainly a concern with respect to concentrating the full activity of all these users at just a few locations, for load balancing reasons alone. Similar concerns exist for users of alternate implementations, both of Tor, and of applications like the browser. We regard this as a potential research concern, but it is likely not a severe one. For example, assuming Tor Browser and Brave both address browser fingerprinting, how bad is it for anonymity that they address it differently? Even if they ensure that all their users have the same or similar browser fingerprints, it will still be possible for websites, analytics datasets, and possibly even Exit relays or Exit-side network observers, to differentiate the use of one browser versus the other. Does this actually harm their anonymity in a real way, or must other oracles be involved? Are these oracles easy to obtain? 
Similarly, letting users choose their exit country is in this category. In some circumstances, this choice has serious anonymity implications: if the choice is a permanent, global one, and the user chooses an unpopular country with few exits, all of their activity will be much more linkable. However, if the country is popular, and/or if the choice is isolated per-tab or per-app, is this still significant such that it actually enables any real attacks? It seems like not so much.

Solutions: Faster upgrade cycle; Avoiding concentrated use of new features
Status: Tor LTS series is no longer supported
Funding: Not explicitly funded

1.3.5. Latency Measurement

At a glance:
  Accuracy: FP=high, FN=high
  Requires: Onion service, or malicious Exit
  Impact: Anonymity Set Reduction/Rough geolocation of services
  Path Bias: Possible exacerbating factor
  Reason for Prioritization: Low impact; multiple observations required
  Signal is created naturally by anything that has a "reply" mechanic
  Signal is observed at either end.
  Signal is: delays between a message sent and a message received in reply.

The effects of latency on the anonymity set have been studied in the [LATENCY_LEAK] papers.

It may be possible to get a rough idea of the geolocation of an onion service by measuring the latency over many different circuits. This seems more realistic if the Guard or Guards are known, so that their contribution to latency statistics can be factored in, over many many connections to an onion service. For normal client activity, route selection and the fact that the Exit does not know specific accounts or cookies in use likely provides enough protection.

If this turns out to be severe, it seems the best option is to add a delay on the client side to attempt to mask the overall latency. This kind of approach is only likely to make sense for onion services. Other path selection alterations may help, though.
Solutions: Guards, vanguards, alternative path selection, client-side delay Status: Guards and vanguards-lite are used in Tor since 0.4.7 Funding: Not explicitly funded 2. Attack Examples To demonstrate how info leaks combine, here we provide some historical real-world attacks that have used these info leaks to deanonymize Tor users. 2.1. CMU Tagging Attack Perhaps the most famous historical attack was when a group at CMU assisted the FBI in performing dragnet deanonymization of Tor users, through their [RELAY_EARLY] attack on the live network. This attack could only work on users who happened to use their Guards, but those users could be fully deanonymized. The attack itself operated on connections to monitored HSDIRs: it encoded the address of the onion service in the cell command header, via the RELAY_EARLY bitflipping technique from Section 1.1.2. Their Guards then recorded this address, along with the IP address of the user, providing a log of onion services that each IP address visited. It is not clear if the CMU group even properly utilized the full path bias attack power here to deanonymize as many Tor users as possible, or if their logs were simply of interest to the FBI because of what they happened to capture. It seems like the latter is the case. A similar, motivated adversary could use any of the covert channels in Section 1.1, in combination with Path Bias to close non-deanonymized circuits, to fully deanonymize all exit traffic carried by their Guard relays. There are path bias detectors in Tor to detect large amounts of circuit failure, but when the network (or the Guard) is also under heavy circuit load, they can become unreliable, and have their own false positives. While this attack vector requires the Guard relay, it is of interest to any adversary that would like to perform dragnet deanonymization of a wide range of Tor users, or to compel a Guard to deanonymize certain Tor users. 
It is also of interest to adversaries with censorship capability, who would like to monitor all Tor usage of users, rather than block them. Such an adversary would use their censorship capability to direct Tor users to only their own malicious Guards or Bridges. 2.2. Guard Discovery Attacks with Netflow Deanonymization Prior to the introduction of Vanguards-lite in Tor 0.4.7, it was possible to combine "1.2.2. Adversary-Induced Circuit Creation", with a circuit-based covert channel (1.1.3, 1.2.1, or 1.3.2), to obtain a middle relay confirmed to be next to the user's Guard. Once the Guard is obtained, netflow connection times can be used to find the user of interest. There was at least one instance of this being used against a user of Ricochet, who was fully deanonymized. The user was using neither vanguards-lite, nor the vanguards addon, so this attack was trivial. It is unclear which covert channel type was used for Guard Discovery. The netflow attack proceeded quickly, because the attacker was able to determine when the user was on and offline via their onion service descriptor being available, and the number of users at the discovered Guard was relatively small. 2.3. Netflow Anonymity Set Reduction Netflow records have been used, to varying degrees of success, to attempt to identify users who have posted violent threats in an area. In most cases, this has simply ended up hassling unrelated Tor users, without finding the posting user. However, in at least one case, the user was found. Netflow records were also reportedly used to build suspicion of a datacenter in Germany which was emitting large amounts of Tor traffic, to eventually identify it as a Tor hosting service providing service to drug markets, after further investigation. It is not clear if a flooding attack was also used in this case. 2.4. 
Application Layer Confirmation

The first (and only) known case of fine-grained traffic analysis of Tor involved an application layer confirmation attack, using the vector from 1.3.2.

In this case, a particular person was suspected as being involved in a group under investigation, due to the presence of an informant in that group. The FBI then monitored the suspect's WiFi, and sent a series of XMPP ping messages to the account in question. Despite the use of Tor, enough pings were sent such that the timings on the monitored WiFi showed overlap with the XMPP timings of sent pings and responses. This was prior to Tor's introduction of netflow padding (which generates similar back-and-forth traffic every 4-9 seconds between the client and the Guard).

It should be noted that such attacks are still prone to error, especially for heavy Tor users whose other traffic would always cause such overlap, as opposed to those who use Tor for only one purpose, and very lightly or infrequently.

3. Glossary

Covert Channel: A kind of information leak that allows an adversary to send information to another point in the network.

Collusion Signal: A Covert Channel that only reliably conveys 1 bit: if an adversary is present. Such covert channels are weaker than those that enable full identifier transmission, and also typically require correlation.

Confirmation Signal: Similar to a collusion signal, a confirmation signal is sent over a weak or noisy channel, and can only confirm that an already suspected entity is the target of the signal.

False Negative: A false negative is when the adversary fails to spot the presence of an info leak vector, in instances where it is actually present.

False Positive: A false positive is when the adversary attempts to use an info leak vector, but some similar traffic pattern or behavior elsewhere matches the traffic pattern of their info leak vector.

Guard Discovery: The ability of an adversary to determine the Guard in use by a service or client.

Identifier Transmission: The ability of a covert channel to reliably encode a unique identifier, such as an IP address, without error.

Oracle: An additional mechanism used to confirm an observed info leak vector that has a high rate of False Positives. Can take the form of DNS cache, server logs, analytics data, and other factors. (See [ORACLES]).

Path Bias (aka Route Manipulation, or Route Capture): The ability of an adversary to direct circuits towards their other compromised relays, by destroying circuits and/or TLS connections whose paths are not sufficiently compromised.

Acknowledgments:

This document has benefited from review and suggestions by David Goulet, Nick Hopper, Rob Jansen, Nick Mathewson, Tobias Pulls, and Florentin Rochet.

References:

[ALPACA]
  https://petsymposium.org/2017/papers/issue2/paper54-2017-2-source.pdf

[BACKLIT]
  https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf

[DEEPCOFFEA]
  https://www-users.cse.umn.edu/~hoppernj/deepcoffea.pdf

[DEFCRITIC]
  https://www-users.cse.umn.edu/~hoppernj/sok_wf_def_sp23.pdf

[DNSORACLE]
  https://www.usenix.org/system/files/usenixsecurity23-dahlberg.pdf
  https://gitlab.torproject.org/rgdd/ttapd/-/tree/main/artifact/safety-board
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40674
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40539
  https://gitlab.torproject.org/tpo/core/tor/-/issues/32678

[DOSSECURITY]
  https://www.princeton.edu/~pmittal/publications/dos-ccs07.pdf

[DROPMARK]
  https://petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf

[FLASHFLOW]
  https://gitweb.torproject.org/torspec.git/tree/proposals/316-flashflow.md

[FRONT]
  https://www.usenix.org/system/files/sec20summer_gong_prepub.pdf

[GUARDSETS]
  https://www.freehaven.net/anonbib/cache/guardsets-pets2015.pdf
  https://www.freehaven.net/anonbib/cache/guardsets-pets2018.pdf

[INTERSPACE]
  https://arxiv.org/pdf/2011.13471.pdf (Table 3)

[LATENCY_LEAK]
  https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf
  https://www.robgjansen.com/publications/howlow-pets2013.pdf

[LYING_SCANNER]
  https://gitlab.torproject.org/tpo/network-health/team/-/issues/313

[METRICSLEAK]
  https://gitlab.torproject.org/tpo/core/tor/-/issues/23512

[NETFLOW_TICKET]
  https://gitlab.torproject.org/tpo/network-health/team/-/issues/42

[ONECELL]
  https://www.blackhat.com/presentations/bh-dc-09/Fu/BlackHat-DC-09-Fu-Break-Tors-Anonymity.pdf

[ONIONPRINT]
  https://www.freehaven.net/anonbib/cache/circuit-fingerprinting2015.pdf

[ONIONFOUND]
  https://www.researchgate.net/publication/356421302_From_Onion_Not_Found_to_Guard_Discovery/fulltext/619be24907be5f31b7ac194a/From-Onion-Not-Found-to-Guard-Discovery.pdf?origin=publication_detail

[ORACLES]
  https://petsymposium.org/popets/2020/popets-2020-0013.pdf

[PADDING_SPEC]
  https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/padding-spec.txt#L68

[PCP]
  https://arxiv.org/abs/2103.03831

[QUICPRINT1]
  https://arxiv.org/abs/2101.11871 (see also: https://news.ycombinator.com/item?id=25969886)

[QUICPRINT2]
  https://netsec.ethz.ch/publications/papers/smith2021website.pdf

[RACCOON23]
  https://archives.seul.org/or/dev/Mar-2012/msg00019.html

[RELAY_EARLY]
  https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack/

[ROBFINGERPRINT]
  https://www.usenix.org/conference/usenixsecurity23/presentation/shen-meng

[SBWS]
  https://tpo.pages.torproject.net/network-health/sbws/how_works.html

[WFLIVE]
  https://www.usenix.org/system/files/sec22-cherubin.pdf

[WFNETSIM]
  https://petsymposium.org/2023/files/papers/issue4/popets-2023-0125.pdf
Filename: 345-specs-in-mdbook.md
Title: Migrating the tor specifications to mdbook
Author: Nick Mathewson
Created: 2023-10-03
Status: Closed

Introduction

I'm going to propose that we migrate our specifications to a set of markdown files, specifically using the mdbook tool.

This proposal does not propose a bulk rewrite of our specs; it is meant to be a low-cost step forward that will produce better output, and make it easier to continue working on our specs going forward.

That said, I think that this change will enable rewrites in the future. I'll explain more below.

What is mdbook?

Mdbook is a tool developed by members of the Rust community to create books with Markdown. Each chapter is a single markdown file; the files are organized into a book using a SUMMARY.md file.
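To make the role of SUMMARY.md concrete, here is a minimal sketch that renders one from a list of chapters. The chapter titles and paths are made up for illustration and are not the actual layout of the converted specs; the output follows the nested bullet-list-of-links form that mdbook reads.

```python
from typing import List, Tuple

# (nesting depth, chapter title, path relative to the book's src/ directory).
# All entries used below are hypothetical examples.
Chapter = Tuple[int, str, str]


def render_summary(chapters: List[Chapter]) -> str:
    """Render an mdbook-style SUMMARY.md: a nested list of links to chapters."""
    lines = ["# Summary", ""]
    for depth, title, path in chapters:
        indent = "    " * depth  # sub-chapters are indented list items
        lines.append(f"{indent}- [{title}]({path})")
    return "\n".join(lines) + "\n"


example = render_summary([
    (0, "Introduction", "intro.md"),
    (0, "Tor Protocol", "tor-spec/index.md"),
    (1, "Cell format", "tor-spec/cells.md"),
])
print(example)
```

Because the book's structure lives in this one file, moving a chapter means editing a single line rather than renumbering sections.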

Have a look at the mdbook documentation; this is what the output looks like.

Have a look at this source tree: that's the input that produces the output above.

Mdbook is extensible: it can use numerous plugins to enhance the semantics of the markdown input, add diagrams, output in more formats, and so on.

What would using mdbook get us immediately?

There are a bunch of changes that we could get immediately via even the simplest migration to mdbook. These immediate benefits aren't colossal, but they are things we've wanted for quite a while.

  • We'll have a document that's easier to navigate (via the sidebars).

  • We'll finally have good HTML output.

  • We'll have all our specifications organized into a single "document", able to link to one another and cross reference one another.

  • We'll have perma-links to sections.

  • We'll have a built-in text search function. (Go to the mdbook documentation and hit "s" to try it out.)

How will mdbook help us later on as we reorganize?

Many of the benefits of mdbook will come later down the line as we improve our documentation.

  • Reorganizing will become much easier.

    • Our links will no longer be based on section number, so we won't have to worry about renumbering when we add new sections.
    • We'll be able to create redirects from old section filenames to new ones if we need to rename a file completely.
    • It will be far easier to break up our files into smaller files when we find that we need to reorganize material.
  • We will be able to make our documents even easier to navigate.

    • As we improve our documentation, we'll be able to use links to cross-reference our sections.
  • We'll be able to include real diagrams and tables.

  • We'll be able to integrate proposals more easily.

    • New proposals can become new chapters in our specification simply by copying them into a new 'md' file or files; we won't have to decide between integrating them into existing files or creating a new spec.

    • Implemented but unmerged proposals can become additional chapters in an appendix to the spec. We can refer to them with permalinks that will still work when they move to another place in the specs.

How should we do this?

Strategy

My priorities here are:

  • no loss of information,
  • decent-looking output,
  • a quick automated conversion process that won't cost us a bunch of time,
  • and a process that we can run experimentally until we are satisfied with the results.

With that in mind, I'm writing a simple set of torspec-converter scripts to convert our old torspec.git repository into its new format. We can tweak the scripts until we like the output that they produce.

After running a recent torspec-converter on a fairly recent torspec.git, here is how the branch looks:

https://gitlab.torproject.org/nickm/torspec/-/tree/spec_conversion?ref_type=heads

And here's the example output when running mdbook on that branch:

https://people.torproject.org/~nickm/volatile/mdbook-specs/index.html

Note: these are not permanent URLs; we won't keep the example output forever. When we actually merge the changes, they will move into whatever final location we provide.

The conversion script isn't perfect. It only recognizes three kinds of content: headings, text, and "other". Content marked "other" is wrapped in ``` to render it verbatim.
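The three-way classification can be illustrated with a short sketch. This is not the actual torspec-converter code: the heading pattern, the "indented means verbatim" rule, and the function names are all assumptions chosen to keep the example small.

```python
import re

# Assumption: headings look like "1. Title" or "1.2. Title" on a line by themselves.
HEADING = re.compile(r"^\d+(\.\d+)*\.\s+\S")
FENCE = "`" * 3  # the verbatim marker, built indirectly to keep this listing tidy


def classify(block: str) -> str:
    """Label a blank-line-separated block as 'heading', 'text', or 'other'."""
    lines = block.splitlines()
    if len(lines) == 1 and HEADING.match(lines[0]):
        return "heading"
    # Assumption: flush-left lines are flowing prose; anything indented
    # (diagrams, field layouts, grammars) is kept verbatim.
    if all(not line.startswith((" ", "\t")) for line in lines):
        return "text"
    return "other"


def convert(spec: str) -> str:
    """Convert a plain-text spec to markdown, fencing every 'other' block."""
    out = []
    for block in re.split(r"\n\s*\n", spec.strip("\n")):
        kind = classify(block)
        if kind == "heading":
            out.append("## " + block.strip())
        elif kind == "text":
            out.append(block)
        else:
            out.append(FENCE + "\n" + block + "\n" + FENCE)
    return "\n\n".join(out) + "\n"


sample = "1. Cells\n\nCells are fixed-length.\n\n   CircID [2 bytes]\n   Command [1 byte]\n"
print(convert(sample))
```

Defaulting to "other" when unsure serves the "no loss of information" priority: a misclassified block still appears verbatim, just less prettily.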

The choice of which sections to split up and which to keep as a single page is up to us; I made some initial decisions in the file above, but we can change it around as we please. See the configuration section at the end of the grinder.py script for details on how it's set up.

Additional work that will be needed

Assuming that we make this change, we'll want to build an automated CI process to build it as a website, and update the website whenever there is a commit to the specifications.

(This automated CI process might be as simple as git clone && mdbook build && rsync -avz book/ $TARGET.)
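The same three steps can be written as a small script. This is only a sketch: the repository URL, the rsync target, and the function names are placeholders, and it assumes that book.toml sits at the top of the checkout so that mdbook writes its output to book/.

```python
import shlex
import subprocess
from typing import List


def ci_commands(repo_url: str, target: str, workdir: str = "torspec") -> List[List[str]]:
    """Return the clone, build, and publish commands, in order."""
    return [
        ["git", "clone", "--depth", "1", repo_url, workdir],
        ["mdbook", "build", workdir],
        ["rsync", "-avz", f"{workdir}/book/", target],
    ]


def run_ci(repo_url: str, target: str, dry_run: bool = True) -> None:
    """Run each step, stopping at the first failure; only print them in a dry run."""
    for cmd in ci_commands(repo_url, target):
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Placeholder URL and target; a dry run only prints the commands.
run_ci("https://example.invalid/torspec.git", "user@host:/srv/spec/")
```

Using check=True means a failed build never reaches the rsync step, so a broken commit cannot overwrite the published site.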

We'll want to go through our other documentation and update links, especially the permalinks in spec.torproject.org.

It might be a good idea to use spec.torproject.org as the new location of this book, assuming weasel (who maintains spec.tpo) also thinks it's reasonable. If we do that, we need to decide on what we want the landing page to look like, and we need very much to get our permalink story correct. Right now I'm generating a .htaccess file as part of the conversion.

Stuff we shouldn't do.

I think we should continue to use the existing torspec.git repository for the new material, and just move the old text specs into a new archival location in torspec. (We could make a new repository entirely, but I don't think that's the best idea. In either case, we shouldn't change the text specifications after the initial conversion.)

We'll want to figure out our practices for keeping links working as we reorganize these documents. Mdbook has decent redirect support, but it's up to us to actually create the redirects as necessary.
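One way to keep that bookkeeping manageable is to generate the redirect table from a list of moved pages. The sketch below emits an `[output.html.redirect]` section for book.toml, which is where mdbook's HTML renderer reads redirects from; the page paths and the helper name are hypothetical.

```python
import json
from typing import Dict


def redirect_table(moves: Dict[str, str]) -> str:
    """Render moved pages as an [output.html.redirect] table for book.toml.

    Keys are old absolute paths within the built site; values are new locations.
    json.dumps is used for quoting, on the assumption that paths are plain ASCII.
    """
    lines = ["[output.html.redirect]"]
    for old, new in sorted(moves.items()):
        lines.append(f"{json.dumps(old)} = {json.dumps(new)}")
    return "\n".join(lines) + "\n"


# Hypothetical example: one page that was split out of a larger file.
table = redirect_table({
    "/tor-spec/cells.html": "/tor-spec/cell-packet-format.html",
})
print(table)
```

Keeping the list of moves in one place also gives us a record of every rename, which helps when checking that old links still resolve.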

The transition, in detail

  • Before the transition:

    • Work on the script until it produces output we like.
    • Finalize this proposal and determine where we are hosting everything.
    • Develop the CI process as needed to keep the site up to date.
    • Get approval and comment from necessary stakeholders.
    • Write documentation as needed to support the new way of doing things.
    • Decide on the new layout we want for torspec.git.
  • Staging the transition:

    • Make a branch to try out the transition; explicitly allow force-pushing that branch. (Possibly nickm/torspec.git in a branch called mdbook-demo, or torspec.git in a branch called mdbook-demo assuming it is not protected.)
    • Make a temporary URL to target with the transition (possibly spec-demo.tpo)
    • Once we want to do the transition, shift the scripts to tpo/torspec.git:main and spec.tpo, possibly?
  • The transition:

    • Move existing specs to a new subdirectory in torspec.git.
    • Run the script to produce an mdbook instance in torspec.git with the right layout.
    • Install the CI process to keep the site up to date.
  • Post-transition

    • Update links elsewhere.
    • Continue to improve the specs.

Integrating proposals

We could make all of our proposals into a separate book, like Rust does at https://rust-lang.github.io/rfcs/ . We could also leave them as they are for now.

(I don't currently think we should make all proposals part of the spec automatically.)

Timing

I think the right time to do this, if we decide to move ahead, is before November. That way we have this issue as something people can work on during the docs hackathon.

Alternatives

I've tried experimenting with Docusaurus here, which is even more full-featured and generates pretty React sites like this. (We're likely to use it for managing the Arti documentation and website.)

For the purposes we have here, it seems slightly overkill, but I do think a migration is feasible down the road if we decide we do want to move to Docusaurus. The important thing is the ability to keep our URLs working, and I'm confident we could do that.

The main differences for our purposes here seem to be:

  • The markdown implementation in Docusaurus is extremely picky about stuff that looks like HTML but isn't; it rejects it, rather than passing it on as text. Thus, using it would require a more painstaking conversion process before we could include text like "<state:on>" or "A <-> B" as our specs do in a few places.

  • Instead of organizing our documents in a SUMMARY.md with an MD outline format, we'd have to organize them in a sidebars.js with a JavaScript syntax.

  • Docusaurus seems to be far more flexible and have a lot more features, but also seems trickier to configure.
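To give a sense of the extra conversion step the first point implies, here is a minimal sketch that escapes "<" outside inline code spans, so that text such as "A <-> B" is not mistaken for markup. It assumes that an HTML entity is an acceptable escape for the Docusaurus markdown parser, and it deliberately ignores fenced code blocks and multi-backtick spans, which a real converter would need to handle.

```python
import re

# Single-backtick inline code spans; their contents must be left untouched.
CODE_SPAN = re.compile(r"(`[^`]*`)")


def escape_for_mdx(line: str) -> str:
    """Escape '<' in prose while leaving inline code spans as they are."""
    parts = CODE_SPAN.split(line)
    # With one capture group, even indices are prose and odd indices are code spans.
    for i in range(0, len(parts), 2):
        parts[i] = parts[i].replace("<", "&lt;")
    return "".join(parts)


print(escape_for_mdx("Send A <-> B, then set `<state:on>`."))
```

Every such rewrite is another place where the conversion could lose or alter information, which is part of why mdbook looks like the lower-cost option here.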
