.. onionperf documentation master file, created by
   sphinx-quickstart on Fri Mar  3 18:35:00 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.


OnionPerf
=========

* :ref:`search`

-  `Overview <#overview>`__

   -  `What does OnionPerf do? <#what-does-onionperf-do->`__
   -  `What does OnionPerf not do? <#what-does-onionperf--not--do->`__

-  `Installation <#installation>`__

   -  `Tor <#tor>`__
   -  `TGen <#tgen>`__
   -  `OnionPerf <#onionperf-1>`__

-  `Measurement <#measurement>`__

   -  `Starting and stopping
      measurements <#starting-and-stopping-measurements>`__
   -  `Output directories and files <#output-directories-and-files>`__
   -  `Changing Tor configurations <#changing-tor-configurations>`__
   -  `Changing the TGen traffic
      model <#changing-the-tgen-traffic-model>`__
   -  `Sharing measurement results <#sharing-measurement-results>`__
   -  `Troubleshooting <#troubleshooting>`__

-  `Analysis <#analysis>`__

   -  `Analyzing measurement results <#analyzing-measurement-results>`__
   -  `Filtering measurement results <#filtering-measurement-results>`__
   -  `Visualizing measurement
      results <#visualizing-measurement-results>`__
   -  `Interpreting the PDF output
      format <#interpreting-the-pdf-output-format>`__
   -  `Interpreting the CSV output
      format <#interpreting-the-csv-output-format>`__
   -  `Visualizations on Tor Metrics <#visualizations-on-tor-metrics>`__

-  `Contributing <#contributing>`__

Overview
--------

What does OnionPerf do?
~~~~~~~~~~~~~~~~~~~~~~~

OnionPerf measures performance of bulk file downloads over Tor. Together
with its predecessor, Torperf, OnionPerf has been used to measure
long-term performance trends in the Tor network since 2009. It is also
being used to perform short-term performance experiments to compare
different Tor configurations or implementations.

OnionPerf uses multiple processes and threads to download random data
through Tor while tracking the performance of those downloads. The data
is served and fetched on localhost using two TGen (traffic generator)
processes, and is transferred through Tor using Tor client processes and
an ephemeral Tor onion service. Tor control information and TGen
performance statistics are logged to disk and analyzed once per day to
produce a JSON analysis file that can later be used to visualize changes
in Tor client performance over time.

What does OnionPerf *not* do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OnionPerf does not attempt to simulate complex traffic patterns like a
web-browsing user or a voice-chatting user. It measures a very specific
user model: a bulk 5 MiB file download over Tor.

OnionPerf does not interfere with how Tor selects paths and builds
circuits, other than setting configuration values as specified by the
user. As a result it cannot be used to measure specific relays nor to
scan the entire Tor network.

Installation
------------

OnionPerf needs several dependencies to perform measurements or to
analyze and visualize measurement results. These dependencies include
Tor, TGen (traffic generator), and a few Python packages.

The following description was written with a Debian system in mind but
should be transferable to other Linux distributions and possibly even
other operating systems.

Tor
~~~

OnionPerf relies on the ``tor`` binary to start a Tor process on the
client side to make client requests and another Tor process on the
server side to host onion services.

The easiest way to satisfy this dependency is to install the ``tor``
package, which puts the ``tor`` binary into the ``PATH`` where OnionPerf
will find it. Optionally, systemd can be instructed to make sure that
``tor`` is never started as a service:

.. code:: shell

   sudo apt install tor
   sudo systemctl stop tor.service
   sudo systemctl mask tor.service

Alternatively, Tor can be built from source:

.. code:: shell

   sudo apt install automake build-essential libevent-dev libssl-dev zlib1g-dev
   cd ~/
   git clone https://git.torproject.org/tor.git
   cd tor/
   ./autogen.sh
   ./configure --disable-asciidoc
   make

In this case the resulting ``tor`` binary can be found in
``~/tor/src/app/tor`` and needs to be passed to OnionPerf’s ``--tor``
parameter when doing measurements.

TGen
~~~~

OnionPerf uses TGen to generate traffic on client and server side for
its measurements. Installing dependencies, cloning TGen to a
subdirectory in the user’s home directory, and building TGen is done as
follows:

.. code:: shell

   sudo apt install cmake libglib2.0-dev libigraph0-dev make
   cd ~/
   git clone https://github.com/shadow/tgen.git
   cd tgen/
   mkdir build
   cd build/
   cmake ..
   make

The TGen binary will be contained in ``~/tgen/build/src/tgen``, which is
also the path that needs to be passed to OnionPerf’s ``--tgen``
parameter when doing measurements.

.. _onionperf-1:

OnionPerf
~~~~~~~~~

OnionPerf is written in Python 3. The following instructions assume that
a Python virtual environment is being used, even though installation is
also possible without that.

The virtual environment is created, activated, and tested using:

.. code:: shell

   sudo apt install python3-venv
   cd ~/
   python3 -m venv venv
   source venv/bin/activate
   which python3

The last command should output something like ``~/venv/bin/python3`` as
the path to the ``python3`` binary used in the virtual environment.
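
The same check can be made from inside Python. The following is a
minimal sketch; the helper name ``in_virtualenv`` is hypothetical and
not part of OnionPerf:

.. code:: python

   import sys


   def in_virtualenv() -> bool:
       """Return True if the interpreter runs inside a venv-style environment."""
       # In a venv, sys.prefix points at the environment directory while
       # sys.base_prefix still points at the base Python installation.
       return sys.prefix != sys.base_prefix


   print("virtual environment active:", in_virtualenv())
   print("interpreter:", sys.executable)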

The next step is to clone the OnionPerf repository and install its
requirements:

.. code:: shell

   git clone https://git.torproject.org/onionperf.git
   pip3 install --no-cache -r onionperf/requirements.txt

The final step is to install OnionPerf and print out the usage
information to see if the installation was successful:

.. code:: shell

   cd onionperf/
   python3 setup.py install
   cd ~/
   onionperf --help

The virtual environment is deactivated with the following command:

.. code:: shell

   deactivate

However, in order to perform measurements or analyses, the virtual
environment needs to be activated first. This will ensure all the paths
are found.

If needed, unit tests are run with the following command:

.. code:: shell

   cd ~/onionperf/
   python3 -m nose --with-coverage --cover-package=onionperf

Measurement
-----------

Performing measurements with OnionPerf is done by starting an
``onionperf`` process that itself starts several other processes and
keeps running until it is interrupted by the user. During this time it
performs new measurements every 5 minutes and logs measurement results
to files.

Except for the simplest test runs, OnionPerf is ideally run detached
from the terminal session using tmux, systemd, or similar. The specifics
of using these tools are not covered in this document.

Starting and stopping measurements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most trivial configuration is to measure onion services only. In
that case, OnionPerf runs without needing any additional configuration.
For direct measurements via exit nodes, firewall rules or port
forwarding may be required to allow inbound connections to the TGen
server.

Starting these measurements is as simple as:

.. code:: shell

   cd ~/
   onionperf measure --onion-only --tgen ~/tgen/build/src/tgen --tor ~/tor/src/app/tor

OnionPerf logs its main output on the console and then waits
indefinitely until the user presses ``CTRL-C`` for graceful shutdown. It
does not, however, print out measurement results or progress on the
console, just a heartbeat message every hour.

OnionPerf’s ``measure`` mode has several command-line parameters for
customizing measurements. See the following command for usage
information:

.. code:: shell

   onionperf measure --help

Output directories and files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OnionPerf writes several files to two subdirectories in the current
working directory while doing measurements:

-  ``onionperf-data/`` is the main directory containing measurement
   results.

   -  ``htdocs/`` is created at the first UTC midnight after starting
      and contains measurement analysis result files that can be shared
      via a local web server.

      -  ``$date.onionperf.analysis.json.xz`` contains extracted metrics
         in OnionPerf’s analysis JSON format.
      -  ``index.xml`` contains a directory index with file names,
         sizes, last-modified times, and SHA-256 digests.

   -  ``tgen-client/`` is the working directory of the client-side
      ``tgen`` process.

      -  ``log_archive/`` is created at the first UTC midnight after
         starting and contains compressed log files from previous UTC
         days.
      -  ``onionperf.tgen.log`` is the current log file.
      -  ``tgen.graphml.xml`` is the traffic model file generated by
         OnionPerf and used by TGen.

   -  ``tgen-server/`` is the working directory of the server-side
      ``tgen`` process with the same structure as ``tgen-client/``.
   -  ``tor-client/`` is the working directory of the client-side
      ``tor`` process.

      -  ``log_archive/`` is created at the first UTC midnight after
         starting and contains compressed log files from previous UTC
         days.
      -  ``onionperf.tor.log`` is the current log file containing log
         messages by the client-side ``tor`` process.
      -  ``onionperf.torctl.log`` is the current log file containing
         controller events obtained by OnionPerf connecting to the
         control port of the client-side ``tor`` process.
      -  ``[...]`` (several other files written by the client-side
         ``tor`` process to its data directory)

   -  ``tor-server/`` is the working directory of the server-side
      ``tor`` process with the same structure as ``tor-client/``.

-  ``onionperf-private/`` contains private keys of the onion services
   used for measurements and potentially other files that are not meant
   to be published together with measurement results.
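
To compare shared files against the digests listed in ``index.xml``,
SHA-256 digests can be recomputed locally. The sketch below only
computes digests; it does not parse ``index.xml``, whose layout and
digest encoding are not described here, so the hexadecimal output is an
assumption to verify. The function name ``sha256_digests`` is
hypothetical:

.. code:: python

   import hashlib
   from pathlib import Path


   def sha256_digests(directory):
       """Map each regular file name in `directory` to its hex SHA-256 digest."""
       digests = {}
       for path in sorted(Path(directory).iterdir()):
           if not path.is_file():
               continue
           h = hashlib.sha256()
           with path.open("rb") as f:
               # Read in chunks so large files need not fit in memory.
               for chunk in iter(lambda: f.read(65536), b""):
                   h.update(chunk)
           digests[path.name] = h.hexdigest()
       return digests


   # Example call (the directory must exist):
   # sha256_digests("onionperf-data/htdocs")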

Changing Tor configurations
~~~~~~~~~~~~~~~~~~~~~~~~~~~

OnionPerf generates Tor configurations for both client-side and
server-side ``tor`` processes. There are a few ways to add Tor
configuration lines:

-  If the ``BASETORRC`` environment variable is set, OnionPerf appends
   its own configuration options to the contents of that variable.
   Example:

   .. code:: shell

      BASETORRC=$'Option1 Foo\nOption2 Bar\n' onionperf ...

-  If the ``--torclient-conf-file`` and/or ``--torserver-conf-file``
   command-line arguments are given, the contents of those files are
   appended to the configurations of client-side and/or server-side
   ``tor`` process.

-  If the ``--additional-client-conf`` command-line argument is given,
   its content is appended to the configuration of the client-side
   ``tor`` process.

These options can be used, for example, to change the default
measurement setup to use bridges (or pluggable transports) by passing
bridge addresses as additional client configuration lines as follows:

.. code:: shell

   onionperf measure --additional-client-conf="UseBridges 1\nBridge 72.14.177.231:9001 AC0AD4107545D4AF2A595BC586255DEA70AF119D\nBridge 195.91.239.8:9001 BA83F62551545655BBEBBFF353A45438D73FD45A\nBridge 148.63.111.136:35577 768C8F8313FF9FF8BBC915898343BC8B238F3770"
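
When many bridge lines are involved, the argument value can be
assembled programmatically. The sketch below assumes, as the example
above suggests, that configuration lines are separated by the
two-character sequence ``\n`` (a literal backslash followed by ``n``).
The helper name ``bridge_conf`` is hypothetical:

.. code:: python

   import shlex


   def bridge_conf(bridge_lines):
       """Join Tor bridge lines into one --additional-client-conf value."""
       lines = ["UseBridges 1"] + ["Bridge " + b.strip() for b in bridge_lines]
       # Assumption: lines are separated by a literal backslash followed
       # by "n", exactly as written in the shell example above.
       return "\\n".join(lines)


   bridges = [
       "72.14.177.231:9001 AC0AD4107545D4AF2A595BC586255DEA70AF119D",
       "195.91.239.8:9001 BA83F62551545655BBEBBFF353A45438D73FD45A",
   ]
   conf = bridge_conf(bridges)
   # shlex.quote makes the value safe to paste into a shell command line.
   print("onionperf measure --additional-client-conf=" + shlex.quote(conf))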

Changing the TGen traffic model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OnionPerf is a relatively simple tool that can be adapted to do more
complex measurements beyond what can be configured on the command line.

For example, the hard-coded traffic model generated by OnionPerf and
executed by the TGen processes is to send a small request from client to
server and receive a relatively large response of 5 MiB of random data
back. This model can be changed by editing
``~/onionperf/onionperf/model.py``, rebuilding, and restarting
measurements. For specifics, see the `TGen
documentation <https://github.com/shadow/tgen/blob/master/doc/TGen-Overview.md>`__
and `TGen traffic model
examples <https://github.com/shadow/tgen/blob/master/tools/scripts/generate_tgen_config.py>`__.

Sharing measurement results
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Measurement results can be further analyzed and visualized on the
measuring host. In many cases, however, it is more convenient to do
analysis and visualization on another host, for example to compare
measurements from different hosts with each other.

There are at least two common ways of sharing measurement results:

1. Creating a tarball of the ``onionperf-data/`` directory; and
2. Using a local web server to serve the contents of the
   ``onionperf-data/`` directory.

The details of doing either of these two methods are not covered in this
document.
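
Although the details are beyond this document, the first option can be
sketched in a few lines of Python. The function name
``archive_results`` and the output file name are hypothetical:

.. code:: python

   import tarfile
   from pathlib import Path


   def archive_results(data_dir, tarball):
       """Pack `data_dir` into a gzip-compressed tarball and return its path."""
       data_dir = Path(data_dir)
       with tarfile.open(str(tarball), "w:gz") as tar:
           # arcname keeps only the directory name, not the full path.
           tar.add(str(data_dir), arcname=data_dir.name)
       return Path(tarball)


   # Example call (the directory must exist):
   # archive_results("onionperf-data", "onionperf-data.tar.gz")

Because only ``onionperf-data/`` is archived, the sibling
``onionperf-private/`` directory with its private keys stays out of the
tarball.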

Troubleshooting
~~~~~~~~~~~~~~~

If anything goes wrong while doing measurements, OnionPerf typically
informs the user in its console output. This is also the first place to
look when investigating any issues.

The second place would be to check the log files in
``~/onionperf-data/tgen-client/`` or ``~/onionperf-data/tor-client/``.

The most common configuration problems are probably related to firewall
and port forwarding for doing direct (non onion-service) measurements.
The specifics for setting up the firewall are out of scope for this
document.

Another common class of issues with long-running measurements is that
one of the ``tgen`` or ``tor`` processes dies. The reasons, or at least
hints, can hopefully be found in the respective log files.

In order to avoid extended downtimes it is recommended to deploy
monitoring tools that check whether measurement results produced by
OnionPerf are fresh. The specifics are, again, out of scope for this
document.
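
As a starting point only, a freshness check might look like the
following sketch. It assumes that analysis files follow the
``$date.onionperf.analysis.json.xz`` naming described above and that a
36-hour limit (one daily run plus some slack) is acceptable. Both
function names and the limit are hypothetical:

.. code:: python

   import time
   from pathlib import Path


   def newest_analysis_age(htdocs, now=None):
       """Return the age in seconds of the newest analysis file, or None."""
       files = list(Path(htdocs).glob("*.onionperf.analysis.json.xz"))
       if not files:
           return None
       newest = max(f.stat().st_mtime for f in files)
       return (time.time() if now is None else now) - newest


   def is_fresh(htdocs, max_age_hours=36):
       """True if an analysis file was written within `max_age_hours`."""
       age = newest_analysis_age(htdocs)
       return age is not None and age <= max_age_hours * 3600


   # Example call from a cron job or monitoring probe:
   # is_fresh("onionperf-data/htdocs")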

Analysis
--------

The next steps after performing measurements are to analyze and
optionally visualize measurement results.

Analyzing measurement results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While performing measurements, OnionPerf writes quite verbose log files
to disk. The first step in the analysis is to parse these log files,
extract key metrics, and write smaller and more structured measurement
results to disk. This is done with OnionPerf’s ``analyze`` mode.

For example, the following command analyzes current log files of a
running (or stopped) OnionPerf instance (as opposed to log-rotated,
compressed files from previous days):

.. code:: shell

   onionperf analyze --tgen ~/onionperf-data/tgen-client/onionperf.tgen.log --torctl ~/onionperf-data/tor-client/onionperf.torctl.log

The output analysis file is written to ``onionperf.analysis.json.xz`` in
the current working directory. The file format is described in more
detail in ``schema/onionperf-3.0.json``.
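
Since the analysis file is xz-compressed JSON, it can be inspected with
the Python standard library alone. This is a minimal sketch; the
function name ``load_analysis`` is hypothetical, and no particular keys
are assumed because those are defined by the schema file:

.. code:: python

   import json
   import lzma


   def load_analysis(path):
       """Decompress and parse an OnionPerf analysis file."""
       with lzma.open(path, "rt", encoding="utf-8") as f:
           return json.load(f)


   # Example call, listing the top-level keys of the analysis file:
   # print(sorted(load_analysis("onionperf.analysis.json.xz")))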

The same analysis files are written automatically as part of ongoing
measurements once per day at UTC midnight and can be found in
``onionperf-data/htdocs/``.

OnionPerf’s ``analyze`` mode has several command-line parameters for
customizing the analysis step:

.. code:: shell

   onionperf analyze --help

Filtering measurement results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``filter`` subcommand can be used to filter out measurement results
based on given criteria. This subcommand is typically used in
combination with the ``visualize`` subcommand. The workflow is to apply
one or more filters and then visualize only those measurements with an
existing mapping between TGen transfers/streams and Tor
streams/circuits.

Currently, OnionPerf measurement results can be filtered based on Tor
relay fingerprints found in Tor circuits, although support for filtering
based on Tor streams and/or TGen transfers/streams may be added in the
future.

The ``filter`` mode takes a list of fingerprints and one or more
existing analysis files as inputs and outputs new analysis files with
the same contents as the input analysis files plus annotations on those
Tor circuits that have been filtered out. If a directory of analysis
files is given to ``-i``, the structure and filenames of that directory
are preserved under the path specified with ``-o``.

For example, the analysis file produced above can be filtered with the
following command, which retains only those Tor circuits with
fingerprints contained in the file ``fingerprints.txt``:

.. code:: shell

   onionperf filter -i onionperf.analysis.json.xz -o filtered.onionperf.analysis.json.xz --include-fingerprints fingerprints.txt

OnionPerf’s ``filter`` command usage can be inspected with:

.. code:: shell

   onionperf filter --help

Visualizing measurement results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Step two in the analysis is to process analysis files with OnionPerf’s
``visualize`` mode, which produces CSV and PDF files as output.

For example, the analysis file produced above can be visualized with the
following command, using “Test Measurements” as label for the data set:

.. code:: shell

   onionperf visualize --data onionperf.analysis.json.xz "Test Measurements"

As a result, three files are written to the current working directory:

-  ``onionperf.viz.$datetime.csv`` contains visualized data in a CSV
   file format;
-  ``onionperf.viz.$datetime.pdf`` contains visualizations in a PDF file
   format; and
-  ``onionperf.outliers.$datetime.pdf`` contains measurement outlier
   visualizations in a PDF file format.

By default, both the base PDF and the outliers PDF are produced, but
this can be controlled using the ``-c`` switch on the command line.

For analysis files containing Tor circuit filters, only measurements
with an existing mapping between TGen transfers/streams and Tor
streams/circuits that have not been marked as ``filtered_out`` are
visualized.

Similar to the other modes, OnionPerf’s ``visualize`` mode has
command-line parameters for customizing the visualization step:

.. code:: shell

   onionperf visualize --help

Interpreting the PDF output format
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The base PDF output file contains visualizations of the following
metrics:

-  Time to download first (last) byte, which is defined as elapsed time
   between starting a measurement and receiving the first (last) byte of
   the HTTP response.
-  Throughput, which is computed from the elapsed time between receiving
   0.5 and 1 MiB of the response for 1 MiB transfers, and from the
   elapsed time between receiving 4 and 5 MiB of the response for 5 MiB
   transfers.
-  Number of downloads.
-  Number and type of failures.
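
The throughput definition above amounts to dividing the size of the
measurement window by the time spent receiving it. The following sketch
illustrates the arithmetic; the function name ``throughput_mbps`` and
the choice of megabits per second as the unit are assumptions, not
necessarily what the plots use:

.. code:: python

   MIB = 1024 * 1024  # bytes in one mebibyte


   def throughput_mbps(window_bytes, t_start, t_end):
       """Throughput in megabits per second over one measurement window."""
       elapsed = t_end - t_start
       if elapsed <= 0:
           raise ValueError("window end must be later than window start")
       return window_bytes * 8 / elapsed / 1_000_000


   # 5 MiB transfer: the window covers the bytes between 4 MiB and 5 MiB.
   print(throughput_mbps(1 * MIB, t_start=10.0, t_end=12.0))
   # 1 MiB transfer: the window covers the bytes between 0.5 MiB and 1 MiB.
   print(throughput_mbps(MIB // 2, t_start=3.0, t_end=4.0))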

The measurement outliers PDF output file contains visualizations of the
following metrics:

-  Outlier relays in the TTFB (time to first byte) dataset, for public
   service measurements

-  Outlier relays in the TTFB and TTLB datasets, for onion service
   measurements

-  Common outliers in the TTFB and TTLB dataset across both public and
   onion service measurements

-  Relays most seen in circuits that failed with errors for both public
   and onion measurements

By default, we consider measurement results in the 75th percentile and
only display the top 15 fingerprints that appear the most by count. This
can be changed with command line arguments:

.. code:: shell

   onionperf visualize -d <file(s)> label --percentile 90 --threshold 50
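
One plausible reading of these two parameters is sketched below:
measurements slower than the given percentile are treated as outliers,
the relays in their circuits are counted, and only the most frequent
ones are kept. This is an illustration of the idea, not OnionPerf's
actual implementation; the function name ``top_outlier_relays``, the
input shape, and the nearest-rank percentile method are all
assumptions:

.. code:: python

   from collections import Counter


   def top_outlier_relays(measurements, percentile=75, threshold=15):
       """Count relays in slow measurements and return the most frequent.

       `measurements` is a list of (seconds, [fingerprint, ...]) pairs;
       this shape is hypothetical, not OnionPerf's analysis format.
       """
       if not measurements:
           return []
       times = sorted(t for t, _ in measurements)
       # Nearest-rank percentile: smallest value with at least
       # `percentile` percent of all values at or below it.
       rank = max(1, (len(times) * percentile + 99) // 100)
       cutoff = times[rank - 1]
       counts = Counter()
       for seconds, fingerprints in measurements:
           if seconds > cutoff:
               # Count each relay at most once per measurement.
               counts.update(set(fingerprints))
       return counts.most_common(threshold)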

Interpreting the CSV output format
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The CSV output file contains the same data that is visualized in the PDF
file. It contains the following columns:

-  ``id`` is the identifier used in the TGen client logs which may be
   useful to look up more details about a specific measurement.
-  ``error_code`` is an optional error code if a measurement did not
   succeed.
-  ``filesize_bytes`` is the requested file size in bytes.
-  ``label`` is the data set label as given in the ``--data/-d``
   parameter to the ``visualize`` mode.
-  ``server`` is set to either ``onion`` for onion service measurements
   or ``public`` for direct measurements.
-  ``start`` is the measurement start time.
-  ``time_to_first_byte`` is the time in seconds (with microsecond
   precision) to download the first byte.
-  ``time_to_last_byte`` is the time in seconds (with microsecond
   precision) to download the last byte.
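
The CSV file can be processed further with standard tools. The sketch
below computes the median ``time_to_last_byte`` per ``server`` value
using only the columns listed above. It assumes that ``error_code`` is
empty for successful measurements; the function name
``median_ttlb_by_server`` is hypothetical:

.. code:: python

   import csv
   import statistics
   from collections import defaultdict


   def median_ttlb_by_server(csv_path):
       """Median time_to_last_byte in seconds per server type."""
       samples = defaultdict(list)
       with open(csv_path, newline="") as f:
           for row in csv.DictReader(f):
               value = row.get("time_to_last_byte", "")
               # Assumption: failed measurements carry an error code
               # or leave the timing column empty.
               if row.get("error_code") or not value:
                   continue
               samples[row["server"]].append(float(value))
       return {s: statistics.median(v) for s, v in samples.items()}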

Visualizations on Tor Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The analysis and visualization steps above can all be done by using the
OnionPerf tool. In addition to that it’s possible to visualize OnionPerf
analysis files using other tools.

For example, the `Tor Metrics
website <https://metrics.torproject.org/torperf.html>`__ contains
various graphs based on OnionPerf data.

Contributing
------------

The OnionPerf code is developed at
https://gitlab.torproject.org/tpo/network-health/metrics/onionperf.

Contributions to OnionPerf are welcome and encouraged!