.. _data_retention:

Data retention
==============

We often need to know for how long measurements are kept on disk or read back,
in order to check whether some calculations are correct.
This is usually confusing because there are at least two defaults: 5 and 28 days.

The 28-day default is used:

* to keep the measurements from this period in the file system
* to read the measurements from this period to generate the BandwidthFile

  This means that all average values, like ``bw_mean``, are calculated from
  the measurements of the previous 28 days.

The value comes from :const:`~sbws.globals`::

    GENERATE_PERIOD = 28 * 24 * 60 * 60

Used in :term:`generator` :func:`~sbws.core.generate.main`::

    elif scaling_method == TORFLOW_SCALING:
        fresh_days = ceil(GENERATE_PERIOD / 24 / 60 / 60)  # 28

    results = load_recent_results_in_datadir(
        fresh_days,  # 28
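
The conversion from seconds to days in the excerpt above can be checked with a
minimal sketch. ``SECONDS_PER_DAY`` is a hypothetical name introduced here for
readability and is not part of sbws; only ``GENERATE_PERIOD`` and the ``ceil``
expression come from the excerpts.

.. code-block:: python

    from math import ceil

    # Value as shown in the sbws.globals excerpt (seconds).
    GENERATE_PERIOD = 28 * 24 * 60 * 60

    # Hypothetical helper constant, used only in this sketch.
    SECONDS_PER_DAY = 24 * 60 * 60

    # Same arithmetic as ``ceil(GENERATE_PERIOD / 24 / 60 / 60)``.
    fresh_days = ceil(GENERATE_PERIOD / SECONDS_PER_DAY)
    print(fresh_days)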

The 5-day default is used:

* to read the measurements from this period during scanning
* to keep the measurements from this period in memory during scanning

The value comes from ``config.default.ini``::

  data_period = 5

Used in :term:`scanner` :func:`~sbws.core.scanner.run_speedtest`::

    measurements_period = conf.getint("general", "data_period")
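
As an illustration of how such a value is read, the following sketch uses the
standard library ``configparser`` with an inline configuration string. The
inline string stands in for ``config.default.ini`` and is an assumption made
for this example; the real file contains many more options.

.. code-block:: python

    from configparser import ConfigParser

    # Stand-in for config.default.ini: only the option discussed here.
    conf = ConfigParser()
    conf.read_string("[general]\ndata_period = 5\n")

    # Same call as in the scanner excerpt; the value is a number of days.
    measurements_period = conf.getint("general", "data_period")
    print(measurements_period)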

:class:`~sbws.lib.resultdump.ResultDump` ``.__init__``::

    self.fresh_days = conf.getint("general", "data_period")

:meth:`~sbws.lib.resultdump.ResultDump.store_result`::

    self.data = trim_results(self.fresh_days, self.data)

:meth:`~sbws.lib.resultdump.ResultDump.enter`::

    self.data = load_recent_results_in_datadir(
        self.fresh_days, self.datadir
    )

It is also defined in :const:`~sbws.globals`::

    MEASUREMENTS_PERIOD = 5 * 24 * 60 * 60
    PERIOD_DAYS = int(MEASUREMENTS_PERIOD / (24 * 60 * 60))  # 5

Used in :class:`~sbws.lib.relaylist.RelayList` ``.__init__``::

    measurements_period=MEASUREMENTS_PERIOD
    self._measurements_period = measurements_period

:meth:`~sbws.lib.relaylist.RelayList._init_relays`::

    days = self._measurements_period

This default is overridden when the class is instantiated from ``scanner.py``::

    measurements_period = conf.getint("general", "data_period")  # 5
    rl = RelayList(args, conf, controller, measurements_period, state)
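
The effect of passing the configured value instead of relying on the default
argument can be sketched as follows. ``RelayListSketch`` is a hypothetical,
heavily reduced stand-in for ``RelayList`` and takes only the one parameter
discussed here. Note that, as written in the excerpts, the default constant is
expressed in seconds while the configuration value is expressed in days.

.. code-block:: python

    # Value as shown in the sbws.globals excerpt (seconds).
    MEASUREMENTS_PERIOD = 5 * 24 * 60 * 60


    class RelayListSketch:
        """Hypothetical stand-in for RelayList, for illustration only."""

        def __init__(self, measurements_period=MEASUREMENTS_PERIOD):
            self._measurements_period = measurements_period


    # Without an explicit argument, the module-level default is used.
    default_rl = RelayListSketch()

    # The scanner passes the configured ``data_period`` (5) instead.
    scanner_rl = RelayListSketch(measurements_period=5)

    print(default_rl._measurements_period, scanner_rl._measurements_period)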


These functions are called with either 28 or 5 days:

:func:`~sbws.lib.resultdump.trim_results`::

    data_period = fresh_days * 24 * 60 * 60

:func:`~sbws.lib.resultdump.load_recent_results_in_datadir`::

    data_period = fresh_days + 2
    oldest_day = today - timedelta(days=data_period)

    results = trim_results(fresh_days, results)
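
The two windows above can be illustrated with a simplified sketch. It is not
the sbws implementation: ``trim_sketch`` and ``oldest_day_to_read`` are
hypothetical names, results are modelled as plain ``(timestamp, label)``
tuples, and the inclusive comparison at the boundary is an assumption. Only
the ``fresh_days * 24 * 60 * 60`` and ``fresh_days + 2`` arithmetic is taken
from the excerpts.

.. code-block:: python

    from datetime import date, timedelta


    def trim_sketch(fresh_days, results, now):
        """Keep the (timestamp, label) pairs newer than ``fresh_days``."""
        data_period = fresh_days * 24 * 60 * 60  # seconds
        oldest = now - data_period
        return [(ts, label) for ts, label in results if ts >= oldest]


    def oldest_day_to_read(fresh_days, today):
        """Oldest file date considered: two extra days before trimming."""
        data_period = fresh_days + 2
        return today - timedelta(days=data_period)


    DAY = 24 * 60 * 60
    now = 100 * DAY  # arbitrary reference time, in seconds
    results = [(now - 1 * DAY, "a"), (now - 6 * DAY, "b"), (now - 30 * DAY, "c")]

    kept_5 = trim_sketch(5, results, now)    # scanner-style window
    kept_28 = trim_sketch(28, results, now)  # generator-style window
    print([label for _, label in kept_5], [label for _, label in kept_28])

With the 5-day window only the most recent sample survives, while the 28-day
window also keeps the 6-day-old one; the 30-day-old sample is dropped in both
cases.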