Command Line Interface

Perun can be run from the command line (if correctly installed) using the command interface inspired by git.

The Command Line Interface is implemented using the Click library, which allows both effective definition of new commands and finer parsing of the command line arguments. The interface can be broken into several groups:

1. Core commands: namely init, config, add, rm, status, log, the run commands (run job and run matrix) and the check commands (check all, check head and check profiles). These commands automate the creation of performance profiles and the detection of performance degradation, and are used to manage the Perun repository. Refer to Perun Commands for details about the commands.

2. Collect commands: group of collect COLLECTOR commands, where COLLECTOR stands for one of the collectors listed in Supported Collectors. Each COLLECTOR has its own API; refer to Collect units for a thorough description of the API of individual collectors.

3. Postprocessby commands: group of postprocessby POSTPROCESSOR commands, where POSTPROCESSOR stands for one of the postprocessors listed in Supported Postprocessors. Each POSTPROCESSOR has its own API; refer to Postprocess units for a thorough description of the API of individual postprocessors.

4. View commands: group of view VISUALIZATION commands, where VISUALIZATION stands for one of the visualizers listed in Supported Visualizations. Each VISUALIZATION has its own API; refer to Show units for a thorough description of the API of individual views.

5. Utility commands: group of commands used for developing Perun or for maintenance of the Perun instances. Currently, this group contains create command for faster creation of new modules.

Graphical User Interface is currently in development and hopefully will extend the flexibility of Perun’s usage.

perun

Perun is an open source light-weight Performance Versioning System.

In order to initialize Perun in current directory run the following:

perun init

This initializes basic structure in .perun directory, together with possible reinitialization of git repository in current directory. In order to set basic configuration and define jobs for your project run the following:

perun config --edit

This opens editor and allows you to specify configuration of your project and choose set of collectors for capturing resources. See Automating Runs and Perun Configuration files for more details.

In order to generate first set of profiles for your current HEAD run the following:

perun run matrix

perun [OPTIONS] COMMAND [ARGS]...

Options

-d, --dev-mode

Suppresses the catching of all exceptions from the CLI and generating of the dump.

--no-pager

Disables the paging of the long standard output (currently affects only status and log outputs). See paging to change the default paging strategy.

-nc, --no-color

Disables the colored output.

-l, --log

Logs output of commands to log directory.

-ld, --log-dir <log_dir>

Outputs logs to a different directory (default=`.perun/logs`).

-y, --say-yes

Says yes to every confirmation prompt.

-v, --verbose

Increases the verbosity of the standard output. Verbosity is incremental, and each level increases the extent of output.

--version

Prints the current version of Perun.

-m, --metrics <metrics>

Enables the collection of metrics into the given temp file (first argument) under the supplied ID (second argument).

Commands

add

Links profile to concrete minor version…

check

Applies for the points of version history…

collect

Generates performance profile using…

config

Manages the stored local and shared…

fuzz

Performs fuzzing for the specified command…

init

Initializes performance versioning system…

log

Shows history of versions and associated…

postprocessby

Postprocesses the given stored or pending…

rm

Unlinks the profile from the given minor…

run

Generates batch of profiles w.r.t.

show

Interprets the given profile using the…

showdiff

Interprets the difference of selected two…

status

Shows the status of vcs, associated…

utils

Contains set of developer commands,…

Perun Commands

perun init

Initializes performance versioning system at the destination path.

perun init command initializes the perun’s infrastructure with basic file and directory structure inside the .perun directory. Refer to Perun Internals for more details about storage of Perun. By default, following directories are created:

  1. .perun/jobs: storage of performance profiles not yet assigned to concrete minor versions.

  2. .perun/objects: storage of packed contents of performance profiles and additional information about minor version of wrapped vcs system.

  3. .perun/cache: fast access cache of selected latest unpacked profiles

  4. .perun/local.yml: local configuration, storing specification of wrapped repository, jobs configuration, etc. Refer to Perun Configuration files for more details.

The infrastructure is initialized at <path>. If no <path> is given, then current working directory is used instead. In case there already exists a performance versioning system, the infrastructure is only reinitialized.

By default, a control system is initialized as well. This can be changed by setting the --vcs-type parameter (currently we support git). Additional parameters can be passed to the wrapped control system initialization using --vcs-param.

perun init [OPTIONS] <path>

Options

--vcs-type <type>

In parallel to initialization of Perun, initialize the vcs of <type> as well (by default svs).

Options:

git | svs

--vcs-path <path>

Sets the destination of wrapped vcs initialization at <path>.

--vcs-param <param>

Passes additional (key, value) parameter to initialization of version control system, e.g. separate-git-dir dir.

--vcs-flag <flag>

Passes an additional flag to the initialization of the version control system, e.g. bare.

-c, --configure

After successful initialization of both systems, opens the local configuration using the editor set in shared config.

-t, --config-template <config_template>

States the configuration template that will be used for initialization of local configuration. See Predefined Configuration Templates for more details about predefined configurations.

Arguments

<path>

Optional argument

perun config

Manages the stored local and shared configuration.

Perun supports two external configurations:

  1. local.yml: the local configuration stored in .perun directory, containing the keys such as specification of wrapped repository or job matrix used for quick generation of profiles (run perun run matrix --help or refer to Automating Runs for information how to construct the job matrix).

  2. shared.yml: the global configuration shared by all perun instances, containing shared keys, such as text editor, formatting string, etc.

The syntax of the <key> in most operations consists of sections separated by dots, e.g. vcs.type specifies the type key in the vcs section. The lookup of the <key> can be performed in three modes, --local, --shared and --nearest, locating or setting the <key> in the local, shared or nearest configuration respectively (e.g. when one is trying to get some key, there may be nested perun instances that do not contain the given key). By default, perun operates in the nearest config mode.

Refer to Perun Configuration files for full description of configurations and Configuration types for full list of configuration options.

E.g. using the following one can retrieve the type of the nearest perun instance wrapper:

$ perun config get vcs.type
vcs.type: git

perun config [OPTIONS] COMMAND [ARGS]...

Options

-l, --local

Sets the local config, i.e. .perun/local.yml, as the source config.

-h, --shared

Sets the shared config, i.e. shared.yml, as the source config.

-n, --nearest

Sets the nearest suitable config as the source config. The lookup strategy can differ for set and get/edit.

Commands

edit

Edits the configuration file in the…

get

Looks up the given <key> within the…

reset

Resets the configuration file to a sane…

set

Sets the value of the <key> to the…

perun config get

Looks up the given <key> within the configuration hierarchy and returns the stored value.

The syntax of the <key> consists of sections separated by dots, e.g. vcs.type specifies the type key in the vcs section. The lookup of the <key> can be performed in three modes, --local, --shared and --nearest, locating the <key> in the local, shared or nearest configuration respectively (e.g. when one is trying to get some key, there may be nested perun instances that do not contain the given key). By default, perun operates in the nearest config mode.

Refer to Perun Configuration files for full description of configurations and Configuration types for full list of configuration options.

E.g. using the following one can retrieve the type of the nearest perun wrapper and the editor set in the shared configuration:

$ perun config get vcs.type
vcs.type: git

$ perun config --shared get general.editor
general.editor: vim

perun config get [OPTIONS] <key>

Arguments

<key>

Required argument

perun config set

Sets the value of the <key> to the given <value> in the target configuration file.

The syntax of the <key> consists of sections separated by dots, e.g. vcs.type specifies the type key in the vcs section. Perun sets the <key> in one of three modes, --local, --shared and --nearest, which set the <key> in the local, shared or nearest configuration respectively (e.g. when one is trying to get some key, there may be nested perun instances that do not contain the given key). By default, perun operates in the nearest config mode.

The <value> is arbitrary depending on the key.

Refer to Perun Configuration files for full description of configurations and Configuration types for full list of configuration options and their values.

E.g. using the following one can set the log format for the nearest perun instance wrapper:

$ perun config set format.shortlog "| %source% | %collector% |"
format.shortlog: | %source% | %collector% |

perun config set [OPTIONS] <key> <value>

Arguments

<key>

Required argument

<value>

Required argument

perun config edit

Edits the configuration file in the external editor.

The editor used is given by the general.editor option, specified in the nearest perun configuration.

Refer to Perun Configuration files for full description of configurations and Configuration types for full list of configuration options.

perun config edit [OPTIONS]

perun add

Links profile to concrete minor version storing its content in the .perun dir and registering the profile in internal minor version index.

In order to link <profile> to given minor version <hash> the following steps are executed:

  1. We check in <profile> that its origin key corresponds to <hash>. This serves as a check, that we do not assign profiles to different minor versions.

  2. The origin is removed and contents of <profile> are compressed using zlib compression method.

  3. Binary header for the profile is constructed.

  4. Compressed contents are appended to header, and this blob is stored in .perun/objects directory.

  5. New blob is registered in <hash> minor version’s index.

  6. Unless --keep-profile is set, the original profile is deleted.
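The steps above can be sketched roughly as follows. This is a simplified illustration, not Perun's storage code: the JSON serialization, the header layout, the SHA-1 object name and the in-memory index are all assumptions; only the origin check and the zlib compression come from the description.

```python
import hashlib
import json
import zlib


def pack_profile(profile: dict, minor_version: str) -> tuple:
    """Check origin, strip it, compress the contents and prepend a header."""
    # Step 1: the origin recorded in the profile must match the target version.
    if profile.get("origin") != minor_version:
        raise ValueError("profile origin does not match the minor version")
    # Step 2: remove the origin and compress the remaining contents with zlib.
    body = {k: v for k, v in profile.items() if k != "origin"}
    raw = json.dumps(body, sort_keys=True).encode("utf-8")
    compressed = zlib.compress(raw)
    # Step 3: build a header; this layout is illustrative, not Perun's format.
    header = f"profile {len(raw)}\0".encode("utf-8")
    # Step 4: the blob is the header followed by the compressed contents.
    blob = header + compressed
    # Naming the object by a content hash is an assumption of this sketch.
    return hashlib.sha1(blob).hexdigest(), blob


def unpack_profile(blob: bytes) -> dict:
    """Split off the header at the first NUL byte and decompress the rest."""
    _header, _, compressed = blob.partition(b"\0")
    return json.loads(zlib.decompress(compressed))


# Step 5: register the blob in the (hypothetical, in-memory) version index.
index = {}
name, blob = pack_profile({"origin": "abc123", "resources": [1, 2, 3]}, "abc123")
index.setdefault("abc123", []).append(name)
print(unpack_profile(blob))
```

Rejecting a mismatched origin before anything is written mirrors the purpose of the first step: a profile is never silently attached to a minor version it was not collected for.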

If no <hash> is specified, then current HEAD of the wrapped version control system is used instead. Massaging of <hash> is taken care of by underlying version control system (e.g. git uses git rev-parse).

<profile> can either be a pending tag, a pending tag range or a full path. Pending tags are in the form of i@p, where i stands for an index in the pending profile directory (i.e. .perun/jobs) and @p is a literal suffix. The pending tag range is in the form of i@p-j@p, where both i and j stand for indexes in the pending profiles. The pending tag range then represents all the profiles in the interval <i, j>. When i > j, no profiles will be added; when j is bigger than the number of pending profiles, all the non-existing pending profiles will be skipped. Run perun status to see the tag annotation of pending profiles. Tags consider the sorted order as specified by the option format.sort_profiles_by.
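The tag semantics can be made concrete with a short sketch. The regular expression and the function name are hypothetical, and the treatment of a single out-of-range tag (returning an empty list) is an assumption; the inclusive interval and the skipping of non-existing profiles follow the description.

```python
import re

# Matches "i@p" or "i@p-j@p"; the second index is optional.
TAG_RANGE = re.compile(r"^(\d+)@p(?:-(\d+)@p)?$")


def resolve_pending_tags(tag: str, pending: list) -> list:
    """Return the pending profiles selected by a pending tag or tag range."""
    match = TAG_RANGE.match(tag)
    if match is None:
        raise ValueError(f"not a pending tag or tag range: {tag}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) is not None else start
    # An empty range (i > j) selects nothing; indexes past the end are skipped.
    return [pending[i] for i in range(start, end + 1) if i < len(pending)]


# `pending` is assumed to be already sorted by format.sort_profiles_by.
pending = ["a.perf", "b.perf", "c.perf"]
print(resolve_pending_tags("0@p", pending))
print(resolve_pending_tags("1@p-5@p", pending))
print(resolve_pending_tags("2@p-1@p", pending))
```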

Example of adding profiles:

$ perun add mybin-memory-input.txt-2017-03-01-16-11-04.perf

This command adds the profile collected by memory collector during profiling mybin command with input.txt workload on 1st March at 16:11 to the current HEAD.

An error is raised if the command is executed outside of range of any perun, if <profile> points to incorrect profile (i.e. not w.r.t. Specification of Profile Format) or <hash> does not point to valid minor version ref.

See Perun Internals for information how perun handles profiles internally.

perun add [OPTIONS] <profile>

Options

-m, --minor <hash>

<profile> will be stored at this minor version (default is HEAD).

--keep-profile

Keeps the profile in filesystem after registering it in Perun storage. Otherwise it is deleted.

-f, --force

If set to true, then the profile will be registered in the <hash> minor version index, even if its origin <hash> is different. WARNING: This can corrupt the performance history of your project.

Arguments

<profile>

Required argument(s)

perun rm

Unlinks the profile from the given minor version, keeping the contents stored in .perun directory.

<profile> is unlinked in the following steps:

  1. <profile> is looked up in the <hash> minor version’s internal index.

  2. In case <profile> is not found, an error is raised.

  3. Otherwise, the record corresponding to <hash> is erased. However, the original blob is kept in .perun/objects.

If no <hash> is specified, then current HEAD of the wrapped version control system is used instead. Massaging of <hash> is taken care of by underlying version control system (e.g. git uses git rev-parse).

<profile> can either be an index tag, a pending tag or a path specifying the profile either in the index or in the pending jobs. Index tags are in the form of i@i, where i stands for an index in the minor version’s index and @i is a literal suffix. Run perun status to see the tags of the current HEAD’s index. The index tag range is in the form of i@i-j@i, where both i and j stand for indexes in the minor version’s index. The index tag range then represents all the profiles in the interval <i, j> registered in the index. When i > j, no profiles will be removed; when j is bigger than the number of profiles in the index, all the non-existing profiles will be skipped. The pending tags and pending tag ranges are defined analogously to index tags, except they use the p character, i.e. 0@p and 0@p-2@p are a valid pending tag and pending tag range. Otherwise, one can use the path to represent the removed profile. If the path points to an existing profile in pending jobs (i.e. the .perun/jobs directory), the profile is removed from the jobs; otherwise it is looked up in the index. Tags consider the sorted order as specified by the option format.sort_profiles_by.

Examples of removing profiles:

$ perun rm 2@i

This command removes the third (we index from zero) profile in the index of registered profiles of the current HEAD.

An error is raised if the command is executed outside of range of any Perun or if <profile> is not found inside the <hash> index.

See Perun Internals for information how perun handles profiles internally.

perun rm [OPTIONS] <profile>

Options

-m, --minor <hash>

<profile> will be removed from this minor version (default is HEAD).

Arguments

<profile>

Required argument(s)

perun status

Shows the status of vcs, associated profiles and perun.

Shows the status of both the nearest perun and wrapped version control system. For vcs this outputs e.g. the current minor version HEAD, current major version and description of the HEAD. Moreover, status prints the lists of tracked and pending (found in .perun/jobs) profiles lexicographically sorted along with additional information such as their types and creation times.

Unless perun --no-pager status is issued as command, or appropriate paging option is set, the outputs of status will be paged (by default using less).

An error is raised if the command is executed outside of range of any perun, or configuration misses certain configuration keys (namely format.status).

Profiles (both registered in index and stored in pending directory) are sorted according to the format.sort_profiles_by. The option --sort-by sets this key in the local configuration for further usage. This means that using the pending or index tags will consider this order.

Refer to Customizing Statuses for information how to customize the outputs of status or how to set format.status in nearest configuration.

perun status [OPTIONS]

Options

-s, --short

Shortens the output of status to include only most necessary information.

-sb, --sort-by <format__sort_profiles_by>

Sets the <key> in the local configuration for sorting profiles. Note that after setting the <key> it will be used for sorting which is considered in pending and index tags!

Options:

realpath | type | time | cmd | workload | collector | checksum | source

perun log

Shows history of versions and associated profiles.

Shows the history of the wrapped version control system and all the associated profiles starting from the <hash> point, outputting the information about the number of profiles, the descriptions of concrete minor versions, their parents, etc.

If perun log --short is issued, the shorter version of the log is outputted.

If no <hash> is given, then HEAD of the version control system is used as a starting point.

Unless perun --no-pager log is issued as command, or appropriate paging option is set, the outputs of log will be paged (by default using less).

Refer to Customizing Logs for information how to customize the outputs of log or how to set format.shortlog in nearest configuration.

perun log [OPTIONS] <hash>

Options

-s, --short

Shortens the output of log to include only most necessary information.

Arguments

<hash>

Optional argument

perun run

Generates batch of profiles w.r.t. specification of list of jobs.

Either runs the job matrix stored in local.yml configuration or lets the user construct the job run using the set of parameters.

perun run [OPTIONS] COMMAND [ARGS]...

Options

-ot, --output-filename-template <output_filename_template>

Specifies the template for automatic generation of the output filename. This way the file with collected data will have a resulting filename w.r.t. this parameter. Refer to format.output_profile_template for more details about the format of the template.

-m, --minor-version <minor_version_list>

Specifies the head minor version, for which the profiles will be collected.

-c, --crawl-parents

If set to true, then for each specified minor version, profiles for its parents will be collected as well.

-f, --force-dirty

If set to true, then even if the repository is dirty, the changes will not be stashed.

Commands

job

Run specified batch of perun jobs to…

matrix

Runs the jobs matrix specified in the…

perun run job

Run specified batch of perun jobs to generate profiles.

This command corresponds to running one isolated batch of profiling jobs, outside of regular profiling. Run perun run matrix, after specifying the job matrix in the local configuration, to automate regular profiling of your project. After the batch is generated, each profile is tagged with origin set to the current HEAD. This serves as a check to not assign such profiles to different minor versions.

By default, the profiles computed by this batch job are stored inside the .perun/jobs/ directory as files in the form of:

bin-collector-workload-timestamp.perf
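As a rough illustration of this naming scheme, the following sketch builds such a filename. The function is hypothetical and the timestamp layout is inferred from the example name mybin-memory-input.txt-2017-03-01-16-11-04.perf used elsewhere in this document; the actual template is governed by format.output_profile_template.

```python
from datetime import datetime
from pathlib import PurePath


def profile_filename(cmd: str, collector: str, workload: str, when: datetime) -> str:
    """Compose bin-collector-workload-timestamp.perf from its parts."""
    binary = PurePath(cmd).name            # "./mybin" -> "mybin"
    workload_name = PurePath(workload).name
    stamp = when.strftime("%Y-%m-%d-%H-%M-%S")
    return f"{binary}-{collector}-{workload_name}-{stamp}.perf"


print(profile_filename("./mybin", "memory", "input.txt",
                       datetime(2017, 3, 1, 16, 11, 4)))
```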

In order to store generated profiles run the following, with i@p corresponding to pending tag, which can be obtained by running perun status:

perun add i@p

perun run job -c time -b ./mybin -w file.in -w file2.in -p regression-analysis

This command profiles two commands ./mybin file.in and ./mybin file2.in and collects the profiling data using the Time Collector. The profiles are then modeled with the Regression Analysis.

perun run job -c complexity -b ./mybin -w sll.cpp -cp complexity targetdir=./src

This command runs one job ‘./mybin sll.cpp’ using the Trace Collector, which uses custom binaries targeted at the ./src directory.

perun run job -c memory -b ./mybin -b ./otherbin -w input.txt -p regressogram -p regression-analysis

This command runs two jobs, ./mybin input.txt and ./otherbin input.txt, and collects the profiles using the Memory Collector. The profiles are then postprocessed, first using the Regressogram and then with Regression Analysis.

Refer to Automating Runs and Perun’s Profile Format for more details about automation and lifetimes of profiles. For list of available collectors and postprocessors refer to Supported Collectors and Supported Postprocessors respectively.

perun run job [OPTIONS]

Options

-b, --cmd <cmd>

Required Command that is being profiled. Either corresponds to some script, binary or command, e.g. ./mybin or perun.

-a, --args <args>

Additional parameters for <cmd>. E.g. status or -al is command parameter.

-w, --workload <workload>

Inputs for <cmd>. E.g. ./subdir is a possible workload for the ls command.

-c, --collector <collector>

Required Profiler used for collection of profiling data for the given <cmd>

Options:

trace | memory | time | complexity | bounds | kperf

-cp, --collector-params <collector_params>

Additional parameters for the <collector> read from the file in YAML format

-p, --postprocessor <postprocessor>

After each collection of data, <postprocessor> will be run to postprocess the collected resources.

Options:

clusterizer | normalizer | regression-analysis | regressogram | moving-average | kernel-regression

-pp, --postprocessor-params <postprocessor_params>

Additional parameters for the <postprocessor> read from the file in YAML format

perun run matrix

Runs the jobs matrix specified in the local.yml configuration.

This command loads the jobs configuration from the local configuration, builds the job matrix and subsequently runs the jobs, collecting a list of profiles. Each profile is then stored in the .perun/jobs directory and is moreover annotated by setting the origin key to the current HEAD. This serves as a check to not assign such profiles to different minor versions.

The job matrix is defined in the yaml format and consists of specification of binaries with corresponding arguments, workloads, supported collectors of profiling data and postprocessors that alter the collected profiles.
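One way to picture the job matrix is as a Cartesian product of commands, workloads and collectors, with the postprocessors applied in sequence to every collected profile. The sketch below assumes exactly that; the function name and dictionary keys are hypothetical, and command arguments are left out for brevity.

```python
from itertools import product


def build_jobs(cmds: list, workloads: list, collectors: list,
               postprocessors: list) -> list:
    """Expand the matrix into one job per (cmd, workload, collector) triple."""
    return [
        {
            "cmd": cmd,
            "workload": workload,
            "collector": collector,
            # Every job runs the same ordered chain of postprocessors.
            "postprocessors": list(postprocessors),
        }
        for cmd, workload, collector in product(cmds, workloads, collectors)
    ]


jobs = build_jobs(
    cmds=["./mybin", "./otherbin"],
    workloads=["input.txt"],
    collectors=["memory"],
    postprocessors=["regressogram", "regression-analysis"],
)
for job in jobs:
    print(job["cmd"], job["workload"], job["collector"])
```

With two commands, one workload and one collector this yields two jobs, matching the perun run job example with two -b options and a single -w option.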

Refer to Automating Runs and Job Matrix Format for more details how to specify the job matrix inside local configuration and to Perun Configuration files how to work with Perun’s configuration files.

perun run matrix [OPTIONS]

Options

-q, --without-vcs-history

Will not print the VCS history tree during the collection of the data.

perun check

Applies checks for possible performance changes to the points of version history.

This command group either runs the checks for one point of history (perun check head) or for the whole history (perun check all). For each minor version (called the target) we iterate over the registered profiles and try to find a predecessor minor version (called the baseline) with profile of the same configuration (by configuration we mean the tuple of collector, postprocessors, command, arguments and workloads) and run the checks according to the rules set in the configurations.

The rules are specified as an ordered list in the configuration by degradation.strategies, where the keys correspond to the configuration (or the type) and the key method specifies the actual method used for checking for performance changes. The applied methods can be specified either by their full name or by a short string consisting of the first letters of the function name.

The example of configuration snippet that sets rules and strategies for one project can be as follows:

degradation:
  apply: first
  strategies:
    - type: mixed
      postprocessor: regression_analysis
      method: bmoe
    - cmd: mybin
      type: memory
      method: bmoe
    - method: aat
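To show how such an ordered rule list might be evaluated, here is a minimal sketch. It assumes that a rule matches when all of its keys other than method equal the corresponding keys of the profile configuration, that a rule without such keys matches everything, and that apply: first selects only the first matching rule. These semantics, the function name and the flat profile dictionary are assumptions, not a description of Perun's actual matching.

```python
def select_methods(strategies: list, profile_config: dict,
                   apply: str = "first") -> list:
    """Return the detection methods whose rules match the profile."""
    matched = []
    for rule in strategies:
        conditions = {k: v for k, v in rule.items() if k != "method"}
        # A rule with no conditions (only `method`) matches every profile.
        if all(profile_config.get(k) == v for k, v in conditions.items()):
            matched.append(rule["method"])
    return matched[:1] if apply == "first" else matched


# The strategies from the snippet above, written as Python data.
strategies = [
    {"type": "mixed", "postprocessor": "regression_analysis", "method": "bmoe"},
    {"cmd": "mybin", "type": "memory", "method": "bmoe"},
    {"method": "aat"},
]

print(select_methods(strategies, {"cmd": "mybin", "type": "memory"}))
print(select_methods(strategies, {"cmd": "other", "type": "time"}))
```

Under these assumptions the last rule acts as a catch-all default, which is why the order of the list matters.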

Currently, we support the following methods:

  1. Best Model Order Equality (BMOE)

  2. Average Amount Threshold (AAT)

  3. Polynomial Regression (PREG)

  4. Linear Regression (LREG)

  5. Fast Check (FAST)

  6. Integral Comparison (INT)

  7. Local Statistics (LOC)

  8. Exclusive Time Outliers (ETO)

perun check [OPTIONS] COMMAND [ARGS]...

Options

-f, --force

Force comparison of the selected profiles even if their configuration does not match. This may be necessary when, e.g., different project versions build binaries with version information in their name (python3.10 and python3.11), thus failing the consistency check.

-c, --compute-missing

Whenever there are missing profiles in the given point of history, the matrix will be rerun and the newly generated profiles assigned.

-m, --models-type <models_type>

The detection models strategies predict the way of executing the detection between two profiles, respectively between relevant kinds of its models. Available only in the following detection methods: Integral Comparison (IC) and Local Statistics (LS).

Options:

best-model | best-param | best-nonparam | all-param | all-nonparam | all-models | best-both

Commands

all

Checks for changes in performance for the…

head

Checks for changes in performance between…

profiles

Checks for changes in performance between…

perun check head

Checks for changes in performance between specified minor version (or current head) and its predecessor minor versions.

The command iterates over the registered profiles of the specified minor version (target; e.g. the head), and tries to find the nearest predecessor minor version (baseline), where the profile with the same configuration as the tested target profile exists. When it finds such a pair, it runs the check according to the strategies set in the configuration (see Configuring Degradation Detection or Perun Configuration files).
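The baseline search can be sketched as a walk over the predecessors of the target version. This sketch simplifies the history to a linear list ordered from newest to oldest, whereas a real version history is a graph; the function name and the data layout are hypothetical. The configuration tuple follows the one named for perun check (collector, postprocessors, command, arguments, workloads).

```python
from typing import Optional


def find_baseline(history: list, index_of: dict, target_hash: str,
                  target_config: tuple) -> Optional[str]:
    """Return the nearest predecessor holding a profile with the same config."""
    start = history.index(target_hash) + 1  # skip the target version itself
    for candidate in history[start:]:
        if target_config in index_of.get(candidate, []):
            return candidate
    return None  # no predecessor has a comparable profile


# (collector, postprocessors, command, arguments, workloads)
config = ("time", (), "./mybin", "", "file.in")
other = ("memory", (), "./mybin", "", "file.in")

history = ["c3", "c2", "c1"]  # newest first
index_of = {"c3": [config], "c2": [other], "c1": [config]}

print(find_baseline(history, index_of, "c3", config))
```

Here c2 is skipped because it only holds a profile with a different collector, so c1 becomes the baseline for the target c3.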

By default, the hash corresponds to the head of the current project.

perun check head [OPTIONS] <hash>

Arguments

<hash>

Optional argument

perun check all

Checks for changes in performance for the specified interval of version history.

The command crawls through the whole history of project versions starting from the specified <hash> and for registered profiles (corresponding to some target minor version) tries to find a suitable predecessor profile (corresponding to some baseline minor version) and runs the performance check according to the set of strategies set in the configuration (see Configuring Degradation Detection or Perun Configuration files).

perun check all [OPTIONS] <hash>

Arguments

<hash>

Optional argument

perun check profiles

Checks for changes in performance between two profiles.

The command checks for the changes between two isolated profiles, which can be stored in pending profiles, registered in the index, or simply stored in the filesystem. Then for the pair of profiles <baseline> and <target> the command runs the performance check according to the set of strategies set in the configuration (see Configuring Degradation Detection or Perun Configuration files).

<baseline> and <target> profiles will be looked up in the following steps:

  1. If the profile is in the form i@i (i.e., an index tag), then the ith record registered in the minor version <hash> index will be used.

  2. If the profile is in the form i@p (i.e., a pending tag), then the ith profile stored in .perun/jobs will be used.

  3. Profile is looked-up within the minor version <hash> index for a match. In case the <profile> is registered there, it will be used.

  4. Profile is looked-up within the .perun/jobs directory. In case there is a match, the found profile will be used.

  5. Otherwise, the directory is walked for any match. Each found match is asked for confirmation by user.

perun check profiles [OPTIONS] <baseline> <target>

Options

-m, --minor <hash>

Will check the index of different minor version <hash> during the profile lookup.

Arguments

<baseline>

Required argument

<target>

Required argument

perun fuzz

Performs fuzzing for the specified command according to the initial sample of workload.

perun fuzz [OPTIONS]

Options

-b, --cmd <cmd>

Required The command which will be fuzzed.

-w, --input-sample <input_sample>

Required Initial sample of workloads (the so-called corpus). These will serve as initial workloads to evaluate the baseline for performance testing. The parameter expects either paths to files (which will be directly added), or paths to directories (which will be recursively searched).

-c, --collector <collector>

Collector that will be used to collect performance data and used to infer baseline or target performance profiles. The profiles are further used for performance testing.

Options:

trace | memory | time | complexity | bounds | kperf

-cp, --collector-params <collector_params>

Additional parameters for the <collector>: can be specified as a file in YAML format or as YAML string

-p, --postprocessor <postprocessor>

After each collection of performance data, the fuzzer can run <postprocessor> to postprocess the collected resources (e.g. to create models of resources). This can be used for more thorough performance analysis.

Options:

clusterizer | normalizer | regression-analysis | regressogram | moving-average | kernel-regression

-pp, --postprocessor-params <postprocessor_params>

Additional parameters for the <postprocessor>: can be specified as a file in YAML format or as YAML string

-m, --minor-version <minor_version_list>

Specifies the head minor version in the wrapped repository. The fuzzing will be performed for this particular version of the project.

-wf, --workloads-filter <regexp>

Regular expression that will filter the input workloads/corpus, e.g. to restrict it to certain filetypes, filenames or subdirectories.

--skip-coverage-testing

If set to true, then the evaluation of mutations based on coverage testing will not be performed. The coverage testing is a fast heuristic to filter out mutations that will probably not lead to severe real degradation. The testing through perun is costly, though very precise.

-s, --source-path <path>

The path to the directory of the project source files.

-g, --gcno-path <path>

The path to the directory where .gcno files are stored.

-o, --output-dir <path>

Required The path to the directory where generated outputs will be stored.

-t, --timeout <float>

Time limit for fuzzing (in seconds). Default value is 1800s.

-h, --hang-timeout <float>

The time limit before the input is classified as a hang/timeout (in seconds). Default value is 10s.

-N, --max-size <int>

Absolute value of the maximum size of the generated mutation wrt parent corpus. The value will be adjusted wrt to the maximal size of the workloads in corpus. Using this option, the maximal size of the generated mutation will be set to max(size of the largest workload in corpus, <int>).

-mi, --max-size-increase <int>

Absolute value of the maximal increase in the size of the generated mutation wrt parent corpus. Using this option, the maximal size of the generated mutation will be set to (size of the largest workload in corpus + <int>). Default value is 1 000 000 B = 1 MB.

-mp, --max-size-ratio <float>

Relative value of the maximal increase in the size of the generated mutation wrt parent corpus. Using this option, the maximal size of the generated mutation will be set to (size of the largest workload in corpus * <float>). E.g. with 1.5, max size = largest workload size * 1.5.
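The three size options can be compared side by side in a short sketch. The formulas and the 1 000 000 B default come from the option descriptions; the function name and the precedence between the options when several are given are assumptions of this sketch.

```python
from typing import Optional


def max_mutation_size(workload_sizes: list,
                      max_size: Optional[int] = None,
                      max_size_increase: Optional[int] = None,
                      max_size_ratio: Optional[float] = None) -> int:
    """Compute the size limit (in bytes) for generated mutations."""
    largest = max(workload_sizes)
    if max_size is not None:
        # --max-size: never below the largest workload in the corpus.
        return max(largest, max_size)
    if max_size_ratio is not None:
        # --max-size-ratio: relative to the largest workload.
        return int(largest * max_size_ratio)
    # --max-size-increase: absolute increase, 1 000 000 B by default.
    increase = 1_000_000 if max_size_increase is None else max_size_increase
    return largest + increase


sizes = [100, 2000]  # sizes of the workloads in a hypothetical corpus
print(max_mutation_size(sizes))
print(max_mutation_size(sizes, max_size=500))
print(max_mutation_size(sizes, max_size_ratio=1.5))
```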

-e, --exec-limit <int>

The maximum number of fuzzing iterations while gathering interesting inputs. By interesting inputs we mean files that might potentially lead to timeouts, hangs or severe performance degradation.

-l, --interesting-files-limit <int>

The minimum number of gathered mutations that are so-called interesting, before perun testing is performed. By interesting inputs we mean files that might potentially lead to timeouts, hangs or severe performance degradation.

-cr, --coverage-increase-rate <int>

The threshold of coverage increase against base coverage, which is used to evaluate whether the generated mutation is interesting for further evaluation by performance testing. E.g. for 1.5 and base coverage = 100 000, the threshold is 150 000.
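The threshold arithmetic from the example above can be sketched as follows. Both function names are hypothetical, and the strict comparison is an assumption; the fuzzer itself may compare differently.

```python
def coverage_threshold(base_coverage, increase_rate):
    """Coverage that a mutation has to exceed to count as interesting."""
    return base_coverage * increase_rate


def is_interesting(mutation_coverage, base_coverage, increase_rate):
    # Assumption: strictly greater than the threshold.
    return mutation_coverage > coverage_threshold(base_coverage, increase_rate)
```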

-mpr, --mutations-per-rule <str>

Strategy which determines how many mutations will be generated by certain fuzzing rule in one iteration: unitary, proportional, probabilistic, mixed

Options:

unitary | proportional | probabilistic | mixed

-r, --regex-rules <file>

Option for adding custom fuzzing rules specified by regular expressions, written in YAML format file.

-np, --no-plotting

Will not plot the interpretation of the fuzzing in form of graphs.

Collect Commands

perun collect

Generates performance profile using selected collector.

Runs the single collector unit (registered in Perun) on given profiled command (optionally with given arguments and workloads) and generates performance profile. The generated profile is then stored in .perun/jobs/ directory as a file, by default with filename in form of:

bin-collector-workload-timestamp.perf

Generated profiles will not be postprocessed in any way. Consult perun postprocessby --help in order to postprocess the resulting profile.

The configuration of collector can be specified in external YAML file given by the -p/--params argument.

For a thorough list and description of supported collectors refer to Supported Collectors. For a more subtle running of profiling jobs and more complex configuration consult either perun run matrix --help or perun run job --help.

perun collect [OPTIONS] COMMAND [ARGS]...

Options

-o, --output-file <output_file>

Specifies the full path to where the profile will be stored.

-pn, --profile-name <profile_name>

Specifies the name of the profile, which will be collected, e.g. profile.perf. The profile will be stored in .perun/jobs.

-m, --minor-version <minor_version_list>

Specifies the head minor version, for which the profiles will be collected.

-cp, --crawl-parents

If set to true, then for each specified minor version, profiles for its parents will be collected as well.

-c, --cmd <cmd>

Command that is being profiled. Either corresponds to some script, binary or command, e.g. ./mybin or perun.

-a, --args <args>

Additional parameters for <cmd>. E.g. status or -al is command parameter.

-w, --workload <workload>

Inputs for <cmd>. E.g. ./subdir is possible workload for ls command.

-p, --params <params>

Additional parameters for called collector read from file in YAML format.

-ot, --output-filename-template <output_filename_template>

Specifies the template for automatic generation of the output filename. This way the file with collected data will have a resulting filename with respect to this parameter. Refer to format.output_profile_template for more details about the format of the template.

-op, --optimization-pipeline <optimization_pipeline>

Pre-configured combinations of collection optimization methods.

Options:

custom | basic | advanced | full

-on, --optimization-on <optimization_on>

Enable the specified collection optimization method.

Options:

baseline-static | baseline-dynamic | cg-shaping | dynamic-sampling | diff-tracing | dynamic-probing | timed-sampling

-off, --optimization-off <optimization_off>

Disable the specified collection optimization method.

Options:

baseline-static | baseline-dynamic | cg-shaping | dynamic-sampling | diff-tracing | dynamic-probing | timed-sampling

-oa, --optimization-args <optimization_args>

Set parameter values for various optimizations.

--optimization-cache-off

Ignore cached optimization data (e.g., cached call graph).

--optimization-reset-cache

Remove the cached optimization resources and data.

-cg, --use-cg-type <use_cg_type>

Options:

static | dynamic | mixed

Collect units

perun collect trace

Generates trace performance profile, capturing running times of functions depending on underlying structural sizes.

* Limitations: C/C++ binaries
* Metric: mixed (captures both time and size consumption)
* Dependencies: SystemTap (+ corresponding requirements e.g. kernel -dbgsym version)
* Default units: us for time, element number for size

Example of collected resources is as follows:

{
    "amount": 11,
    "subtype": "time delta",
    "type": "mixed",
    "uid": "SLList_init(SLList*)",
    "structure-unit-size": 0
}

Trace collector provides various collection strategies which are supposed to provide sensible default settings for collection. This allows the user to choose suitable collection method without the need of detailed rules / sampling specification. Currently supported strategies are:

* userspace: This strategy traces all userspace functions / code blocks without the use of sampling. Note that this strategy might be resource-intensive.
* all: This strategy traces all userspace + library + kernel functions / code blocks that are present in the traced binary without the use of sampling. Note that this strategy might be very resource-intensive.
* u_sampled: Sampled version of the userspace strategy. This method uses sampling to reduce the overhead and resource consumption.
* a_sampled: Sampled version of the all strategy. Its goal is to reduce the overhead and resource consumption of the all method.
* custom: User-specified strategy. Requires the user to specify rules and sampling manually.

Note that manually specified parameters have higher priority than strategy specification and it is thus possible to override concrete rules / sampling by the user.

The collector interface operates with two seemingly identical concepts: (external) command and binary. External command refers to the script, executable, makefile, etc. that will be called / invoked during the profiling, such as ‘make test’, ‘run_script.sh’, ‘./my_binary’. Binary, on the other hand, refers to the actual binary or executable file that will be profiled and contains specified functions / USDT probes etc. It is expected that the binary will be invoked / called as part of the external command script or that external command and binary are the same.

The interface for rules (functions, USDT probes) specification offers a way to specify profiled locations both with sampling or without it. Note that sampling can reduce the overhead imposed by the profiling. USDT rules can be further paired - paired rules act as a start and end point for time measurement. Without a pair, the rule measures time between each two probe hits. The pairing is done automatically for USDT locations with convention <name> and <name>_end or <name>_END - or other commonly found suffixes. Otherwise, it is possible to pair rules by the delimiter ‘#’, such as <name1>#<name2>.
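The pairing convention described above can be illustrated with a short sketch. It only models the naming rules mentioned in the text (the `_end` / `_END` suffixes and the `#` delimiter); the function `pair_usdt_probes` is hypothetical, ignores the "other commonly found suffixes", and does not reflect how the trace collector is actually implemented.

```python
def pair_usdt_probes(probes, end_suffixes=("_end", "_END")):
    """Group USDT probe names into (start, end) pairs.

    Hypothetical helper: unpaired probes are returned as (name, None).
    """
    explicit, plain = [], []
    for spec in probes:
        if "#" in spec:
            # Explicit pairing: <name1>#<name2>
            start, end = spec.split("#", 1)
            explicit.append((start, end))
        else:
            plain.append(spec)

    known = set(plain)
    automatic, used_as_end = [], set()
    for name in plain:
        for suffix in end_suffixes:
            # Automatic pairing: <name> with <name>_end or <name>_END
            if name + suffix in known:
                automatic.append((name, name + suffix))
                used_as_end.add(name + suffix)
                break

    paired_starts = {start for start, _ in automatic}
    single = [(name, None) for name in plain
              if name not in paired_starts and name not in used_as_end]
    return explicit + automatic + single
```

Given the probes `loop`, `loop_end`, `tick` and `a#b`, the sketch pairs `a` with `b`, pairs `loop` with `loop_end`, and leaves `tick` unpaired, so `tick` would measure the time between each two of its hits.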

Trace profiles are suitable for postprocessing by Regression Analysis since they capture the dependency of time consumption on the size of the structure. This allows one to estimate performance models of individual functions.

Scatter plots are a suitable visualization for profiles collected by the trace collector; they plot individual points along with regression models (if the profile was postprocessed by regression analysis). Run perun show scatter --help or refer to Scatter Plot for more information about scatter plots.

Refer to Trace Collector for more thorough description and examples of trace collector.

perun collect trace [OPTIONS]

Options

-e, --engine <engine>

Sets the data collection engine to be used:

- stap: the SystemTap framework
- ebpf: the eBPF framework

Options:

stap | ebpf

-s, --strategy <strategy>

Required Select strategy for probing the binary. See documentation for detailed explanation for each strategy.

Options:

userspace | all | u_sampled | a_sampled | custom

-f, --func <func>

Set the probe point for the given function as <lib>#<func>#<sampling>.

-u, --usdt <usdt>

Set the probe point for the given USDT location as <lib>#<usdt>#<sampling>.

-d, --dynamic <dynamic>

Set the probe point for the given dynamic location as <lib>#<cl>#<sampling>.

-g, --global-sampling <global_sampling>

Set the global sampling for all probes; a sampling parameter specified for an individual rule has higher priority.

--with-usdt, --no-usdt

The selected strategy will also extract and profile USDT probes.

-b, --binary <binary>

The profiled executable. If not set, then the command is considered to be the profiled executable and is used as a binary parameter.

-l, --libs <libs>

Additional libraries that should also be profiled.

-t, --timeout <timeout>

Set time limit (in seconds) for the profiled command, i.e. the command will be terminated after reaching the time limit. Useful for, e.g., endless commands.

-z, --zip-temps

Zip and compress the temporary files (SystemTap log, raw performance data, watchdog log, etc.) into the Perun log directory before deleting them.

-k, --keep-temps

Do not delete the temporary files in the file system.

-vt, --verbose-trace

Set the trace file output to be more verbose, useful for debugging.

-q, --quiet

Reduces the verbosity of the collector info messages.

-w, --watchdog

Enable detailed logging of the whole collection process.

-o, --output-handling <output_handling>

Sets the output handling of the profiled command:

- default: the output is displayed in the terminal
- capture: the output is being captured into a file as well as displayed in the terminal (note that buffering causes a delay in the terminal output)
- suppress: redirects the output to the DEVNULL

Options:

default | capture | suppress

-i, --diagnostics

Enable detailed surveillance mode of the collector. The collector turns on detailed logging (watchdog), verbose trace, capturing output etc. and stores the logs and files in an archive (zip-temps) in order to provide as much diagnostic data as possible for further inspection.

-sc, --stap-cache-off

Disables the SystemTap caching of compiled scripts.

-np, --no-profile

Tracer will not transform and save processed data into a perun profile.

-mcg, --extract-mixed-cg

DEBUG: Extract mixed CG.

-cg, --only-extract-cg

Tracer will only extract the CG of the current project version and terminate.

-mt, --max-simultaneous-threads <max_simultaneous_threads>

DEBUG: Maximum number of expected simultaneous threads when sampling is on.

-nds, --no-ds-update

DEBUG: Disables Dynamic Stats updates

perun collect memory

Generates memory performance profile, capturing memory allocations of different types along with target address and full call trace.

* Limitations: C/C++ binaries
* Metric: memory
* Dependencies: libunwind.so and custom libmalloc.so
* Default units: B for memory

The following snippet shows the example of resources collected by memory profiler. It captures allocations done by functions with more detailed description, such as the type of allocation, trace, etc.

{
    "type": "memory",
    "subtype": "malloc",
    "address": 19284560,
    "amount": 4,
    "trace": [
        {
            "source": "../memory_collect_test.c",
            "function": "main",
            "line": 22
        }
    ],
    "uid": {
        "source": "../memory_collect_test.c",
        "function": "main",
        "line": 22
    }
}

Refer to Memory Collector for more thorough description and examples of memory collector.
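Resources of this shape are easy to aggregate. The following sketch sums the allocated amounts per allocating function; it assumes the resources have already been loaded into a list of dictionaries with the keys shown above, and the helper name `allocated_bytes_per_function` is hypothetical.

```python
from collections import defaultdict


def allocated_bytes_per_function(resources):
    """Sum the "amount" of memory resources, grouped by uid function."""
    totals = defaultdict(int)
    for resource in resources:
        if resource.get("type") != "memory":
            continue  # skip resources of other collectors
        totals[resource["uid"]["function"]] += resource["amount"]
    return dict(totals)
```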

perun collect memory [OPTIONS]

Options

-s, --sampling <sampling>

Sets the sampling interval for profiling the allocations. I.e. memory snapshots will be collected each <sampling> seconds.

--no-source <no_source>

Will exclude allocations done from <no_source> file during the profiling.

--no-func <no_func>

Will exclude allocations done by <no_func> function during the profiling.

-a, --all

Will record the full trace for each allocation, i.e. it will include all allocators and even unreachable records.

perun collect time

Generates time performance profile, capturing overall running times of the profiled command.

* Limitations: none
* Metric: running time
* Dependencies: none
* Default units: s

This is a wrapper over the time linux utility and captures resources in the following form:

{
    "amount": 0.59,
    "type": "time",
    "subtype": "sys",
    "uid": "cmd",
    "order": 1
}

Refer to Time Collector for more thorough description and examples of time collector.

perun collect time [OPTIONS]

Options

-w, --warmup <int>

Before the actual timing, the collector will execute <int> warm-up executions.

-r, --repeat <int>

The timing of the given binaries will be repeated <int> times.

perun collect bounds

Generates memory performance profile, capturing memory allocations of different types along with target address and full call trace.

* Limitations: C/C++ binaries
* Metric: memory
* Dependencies: libunwind.so and custom libmalloc.so
* Default units: B for memory

The following snippet shows an example of a resource collected by the bounds collector. It records the source location (uid) together with the derived bound and its complexity class.

{
    "uid": {
        "source": "../test.c",
        "function": "main",
        "line": 22,
        "column": 40
    },
    "bound": "1 + max(0, (k + -1))",
    "class": "O(n^1)",
    "type": "bound"
}

Refer to Bounds Collector for more thorough description and examples of bounds collector.

perun collect bounds [OPTIONS]

Options

-s, --source, --src <path>

Source C file that will be analyzed.

-d, --source-dir <dir>

Directory where source C files are stored. All existing files with a valid extension (.c) will be analyzed.

Postprocess Commands

perun postprocessby

Postprocesses the given stored or pending profile using selected postprocessor.

Runs the single postprocessor unit on given looked-up profile. The postprocessed file will be then stored in .perun/jobs/ directory as a file, by default with filename in form of:

bin-collector-workload-timestamp.perf

The postprocessed <profile> will be looked up in the following steps:

  1. If <profile> is in form i@i (i.e., an index tag), then the ith record registered in the minor version <hash> index will be postprocessed.

  2. If <profile> is in form i@p (i.e., a pending tag), then the ith profile stored in .perun/jobs will be postprocessed.

  3. <profile> is looked-up within the minor version <hash> index for a match. In case the <profile> is registered there, it will be postprocessed.

  4. <profile> is looked-up within the .perun/jobs directory. In case there is a match, the found profile will be postprocessed.

  5. Otherwise, the directory is walked for any match. Each found match is asked for confirmation by user.
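The first two lookup steps depend only on the textual form of <profile>. A minimal sketch of recognizing the two tag forms follows; the names `TAG_PATTERN` and `parse_profile_tag` are hypothetical and do not reflect Perun's internal lookup code.

```python
import re

# i@i is an index tag, i@p is a pending tag (i is a non-negative integer)
TAG_PATTERN = re.compile(r"^(\d+)@([ip])$")


def parse_profile_tag(profile):
    """Return (position, kind) for a tag, or None for a plain name or path."""
    match = TAG_PATTERN.match(profile)
    if match is None:
        return None  # steps 3 to 5 would then treat it as a name or path
    position, kind = match.groups()
    return int(position), ("index" if kind == "i" else "pending")
```

A value such as `1@i` is recognized as an index tag, whereas a path such as `./echo-time-hello-2017-04-02-13-13-34-12.perf` falls through to the name-based lookup.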

Tags consider the sorted order as specified by the following option format.sort_profiles_by.

For checking the associated tags to profiles run perun status.

Example 1. The following command will postprocess the given profile stored at given path by regression analysis, i.e. for each snapshot, computes regression models:

perun postprocessby ./echo-time-hello-2017-04-02-13-13-34-12.perf regression-analysis

Example 2. The following command will postprocess the second profile stored in index of commit preceding the current head using interval regression analysis:

perun postprocessby -m HEAD~1 1@i regression-analysis --method=interval

For a thorough list and description of supported postprocessors refer to Supported Postprocessors. For a more subtle running of profiling jobs and more complex configuration consult either perun run matrix --help or perun run job --help.

perun postprocessby [OPTIONS] <profile> COMMAND [ARGS]...

Options

-ot, --output-filename-template <output_filename_template>

Specifies the template for automatic generation of the output filename. This way the postprocessed file will have a resulting filename with respect to this parameter. Refer to format.output_profile_template for more details about the format of the template.

-m, --minor <minor>

Will check the index of a different minor version <hash> during the profile lookup.

Arguments

<profile>

Required argument

Postprocess units

perun postprocessby regression_analysis

Finds fitting regression models to estimate models of profiled resources.

* Limitations: Currently limited to models of amount depending on structure-unit-size
* Dependencies: Trace Collector

Regression analyzer tries to find a fitting model to estimate the amount of resources depending on structure-unit-size.

The following strategies are currently available:

  1. Full Computation uses data points to obtain the best fitting model for each type of model from the database (unless --regression_models/-r restricts the set of models)

  2. Iterative Computation uses a percentage of data points to obtain some preliminary models together with their errors or fitness. The most fitting model is then expanded, until it is fully computed or some other model becomes more fitting.

  3. Full Computation with initial estimate first uses some percent of data to estimate which model would be best fitting. Given model is then fully computed.

  4. Interval Analysis uses finer set of intervals of data and estimates models for each interval providing more precise modeling of the profile.

  5. Bisection Analysis fully computes the models for full interval. Then it does a split of the interval and computes new models for them. If the best fitting models changed for sub intervals, then we continue with the splitting.

Currently, we support linear, quadratic, power, logarithmic, exponential and constant models and use the coefficient of determination (\(R^2\)) to measure the fitness of a model. The models are stored as follows:

{
    "uid": "SLList_insert(SLList*, int)",
    "r_square": 0.0017560012128507133,
    "coeffs": [
        {
            "value": 0.505375215875552,
            "name": "b0"
        },
        {
            "value": 9.935159839322705e-06,
            "name": "b1"
        }
    ],
    "x_start": 0,
    "x_end": 11892,
    "model": "linear",
    "method": "full"
}
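A stored model of this shape can be evaluated outside of Perun. The sketch below assumes (this is not stated above) that for the linear model the coefficients b0 and b1 denote the intercept and the slope; it also shows the standard definition of the coefficient of determination. Both function names are hypothetical.

```python
def evaluate_linear_model(model, x):
    """Evaluate a stored linear model at x, assuming y = b0 + b1 * x."""
    coeffs = {coeff["name"]: coeff["value"] for coeff in model["coeffs"]}
    return coeffs["b0"] + coeffs["b1"] * x


def coefficient_of_determination(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot (undefined when all observations are equal)."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_res / ss_tot
```

A value of \(R^2\) close to 1 means the model explains most of the variance in the data; the value 0.0018 in the example above would indicate a poor fit.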

For more details about regression analysis refer to Regression Analysis. For more details how to collect suitable resources refer to Trace Collector.

perun postprocessby regression_analysis [OPTIONS]

Options

-m, --method <method>

Will use the <method> to find the best fitting models for the given profile. By default, ‘full’ computation will be performed.

Options:

full | iterative | interval | initial_guess | bisection

-r, --regression_models <regression_models>

Restricts the list of regression models used by the specified <method> to fit the data. If omitted, all regression models will be used in the computation.

Options:

all | constant | exponential | linear | logarithmic | power | quadratic

-s, --steps <steps>

Restricts the number of steps / data parts used by the iterative, interval and initial guess methods.

-dp, --depending-on <depending_on>

Sets the key that will be used as a source of independent variable.

-o, --of <of_resource_key>

Sets key for which we are finding the model.

perun postprocessby regressogram

Execution of the interleaving of profiled resources by regressogram models.

* Limitations: none
* Dependencies: none

Regressogram belongs to the simplest non-parametric methods and its properties are the following:

Regressogram: can be described as a step function (i.e. a piecewise constant function). Regressogram uses the same basic idea as a histogram for density estimation. The idea is to divide the set of values of the x-coordinates (<per_key>) into intervals; the estimate at a point in a given interval is then the mean or median of the y-coordinates (<of_resource_key>) on this sub-interval. We currently use the coefficient of determination (\(R^2\)) to measure the fitness of the regressogram. The fitness of the regressogram model depends primarily on the number of buckets into which the interval is divided. The user can choose the number of buckets manually (<bucket_number>) or use one of the following methods to estimate the optimal number of buckets (<bucket_method>):

- sqrt: square root (of data size) estimator, used for its speed and simplicity
- rice: does not take variability into account, only data size and commonly overestimates
- scott: takes into account data variability and data size, less robust estimator
- stone: based on leave-one-out cross validation estimate of the integrated squared error
- fd: robust, takes into account data variability and data size, resilient to outliers
- sturges: only accounts for data size, underestimates for large non-gaussian data
- doane: generalization of Sturges’ formula, works better with non-gaussian data
- auto: max of the Sturges’ and ‘fd’ estimators, provides good all around performance

For more details about these methods to estimate the optimal number of buckets or to view the code of these methods, you can visit SciPy.

For more details about this approach of non-parametric analysis refer to postprocessors-regressogram.
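The bucketing idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the postprocessor's implementation: the function name `regressogram` is hypothetical, buckets have equal width, and when no bucket count is given the square-root rule (number of buckets = ceil(sqrt(n))) is used.

```python
import math
from statistics import mean, median


def regressogram(xs, ys, bucket_number=None, statistic=mean):
    """Return one value per bucket: the mean/median of ys falling into it."""
    if bucket_number is None:
        # Assumed 'sqrt' estimator: square root of the data size
        bucket_number = math.ceil(math.sqrt(len(xs)))
    x_min, x_max = min(xs), max(xs)
    # Guard against zero width when all x-coordinates are equal
    width = (x_max - x_min) / bucket_number or 1.0
    buckets = [[] for _ in range(bucket_number)]
    for x, y in zip(xs, ys):
        # The right-most point belongs to the last bucket
        index = min(int((x - x_min) / width), bucket_number - 1)
        buckets[index].append(y)
    # Empty buckets have no estimate
    return [statistic(bucket) if bucket else None for bucket in buckets]
```

Passing `statistic=median` corresponds to choosing the median instead of the mean as the statistic function.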

perun postprocessby regressogram [OPTIONS]

Options

-bn, --bucket_number <bucket_number>

Restricts the number of buckets into which the values of the selected statistics will be placed.

-bm, --bucket_method <bucket_method>

Specifies the method to estimate the optimal number of buckets.

Options:

auto | doane | fd | rice | scott | sqrt | sturges

-sf, --statistic_function <statistic_function>

Will use the <statistic_function> to compute the values for points within each bucket of regressogram.

Options:

mean | median

-of, --of-key <of_resource_key>

Sets key for which we are finding the model (y-coordinates).

-per, --per-key <per_resource_key>

Sets the key that will be used as a source variable (x-coordinates).

perun postprocessby moving_average

Execution of the interleaving of profiled resources by moving average models.

* Limitations: none
* Dependencies: none

Moving average methods are a natural generalization of the regressogram method. This method uses the local averages/medians of y-coordinates (<of_resource_key>), but the estimate in the x-point (<per_key>) is based on the centered surroundings of these points, more precisely:

Moving Average: is a widely used estimator in technical analysis that helps smooth the dataset by filtering out the ‘noise’. Among the basic properties of these methods are the ability to reduce the effect of temporary variations in data, a better fit of the data to a line (so-called smoothing), a clearer view of the data’s trend, and the highlighting of any value below or above the trend. The most important task with this type of non-parametric approach is the choice of the <window-width>. If the user does not choose it, we try to approximate this value using the coefficient of determination (\(R^2\)). At the beginning of the analysis, an initial value of the window width is set; the current dataset is then interleaved repeatedly until the coefficient of determination reaches the required level. This guarantees the desired smoothness of the resulting models. The two basic and commonly used <moving-methods> are the simple moving average (sma) and the exponential moving average (ema).

For more details about this approach of non-parametric analysis refer to Moving Average Methods.

perun postprocessby moving_average [OPTIONS] COMMAND [ARGS]...

Options

-mp, --min_periods <min_periods>

Provides the minimum number of observations in window required to have a value. If the number of available observations is smaller, the result is NaN.

-of, --of-key <of_resource_key>

Sets key for which we are finding the model (y-coordinates).

-per, --per-key <per_resource_key>

Sets the key that will be used as a source variable (x-coordinates).

Commands

ema

Exponential Moving Average

sma

Simple Moving Average

smm

Simple Moving Median

perun postprocessby moving_average sma

Simple Moving Average

In most cases, it is an unweighted Moving Average, which means that each x-coordinate in the data set (profiled resources) has equal importance and is weighted equally. The mean is then computed from the previous n data points (<no-center>), where n denotes the <window-width>. However, in science and engineering the mean is normally taken from an equal number of data points on either side of a central value (<center>). This ensures that variations in the mean are aligned with variations in the data rather than being shifted in the x-axis direction. Since the window at the boundaries of the interval usually does not contain enough points, it is necessary to specify the value of <min-periods> to avoid a NaN result. The role of the weighting function in this approach belongs to <window-type>, which represents the following suite of window functions for filtering:

- boxcar: known as rectangular or Dirichlet window, is equivalent to no window at all
- triang: standard triangular window
- blackman: formed by using three terms of a summation of cosines, minimal leakage, close to optimal
- hamming: formed by using a raised cosine with non-zero endpoints, minimize the nearest side lobe
- bartlett: similar to triangular, endpoints are at zero, processing of tapering data sets
- parzen: can be regarded as a generalization of k-nearest neighbor techniques
- bohman: convolution of two half-duration cosine lobes
- blackmanharris: minimum in the sense that its maximum side lobes are minimized (symmetric 4-term)
- nuttall: minimum 4-term Blackman-Harris window according to Nuttall (so called ‘Nuttall4c’)
- barthann: has a main lobe at the origin and asymptotically decaying side lobes on both sides
- kaiser: formed by using a Bessel function, needs beta value (set to 14 - good starting point)

For more details about these window functions or for their visualizations, see SciPyWindow.
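The interplay of <window-width>, <center> and <min-periods> can be illustrated by an unweighted moving average in plain Python. This is only a sketch: the function name `simple_moving_average` is hypothetical, window functions other than the evenly weighted one are not covered, and for the centered variant an odd window width is assumed.

```python
import math


def simple_moving_average(values, window_width, center=False, min_periods=None):
    """Unweighted moving average; windows with too few points yield NaN."""
    if min_periods is None:
        min_periods = window_width
    smoothed = []
    for i in range(len(values)):
        # A trailing window ends at i; a centered window has i in its middle
        start = i - window_width // 2 if center else i - window_width + 1
        end = start + window_width  # exclusive upper bound
        window = values[max(start, 0):min(end, len(values))]
        if len(window) >= min_periods:
            smoothed.append(sum(window) / len(window))
        else:
            smoothed.append(math.nan)
    return smoothed
```

With a window width of 3 and the default <min-periods>, the first two values of the trailing variant are NaN because their windows are incomplete; lowering <min-periods> to 1 fills them with averages over the available points.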

perun postprocessby moving_average sma [OPTIONS]

Options

-wt, --window_type <window_type>

Provides the window type, if not set then all points are evenly weighted. For further information about window types see the notes in the documentation.

Options:

boxcar | triang | blackman | hamming | bartlett | parzen | bohman | blackmanharris | nuttall | barthann

--center, --no-center

If set to False, the result is set to the right edge of the window; otherwise the result is set to the center of the window.

-ww, --window_width <window_width>

Size of the moving window. This is a number of observations used for calculating the statistic. Each window will be a fixed size.

perun postprocessby moving_average smm

Simple Moving Median

The second representative of Simple Moving Average methods is the Simple Moving Median. The same rules apply to this method as to the first described method, except for the option for choosing the window type, which does not make sense in this approach. The only difference between these two methods is the way the values in the individual sub-intervals are computed. Simple Moving Median is not based on the computation of the average but, as the name suggests, on the median.

perun postprocessby moving_average smm [OPTIONS]

Options

--center, --no-center

If set to False, the result is set to the right edge of the window; otherwise the result is set to the center of the window.

-ww, --window_width <window_width>

Size of the moving window. This is a number of observations used for calculating the statistic. Each window will be a fixed size.

perun postprocessby moving_average ema

Exponential Moving Average

This method is a type of moving average, also known as the Exponentially Weighted Moving Average, that places a greater weight and significance on the most recent data points. The weighting for each more distant x-coordinate decreases exponentially and never reaches zero. This approach reacts more significantly to recent changes than a Simple Moving Average, which applies an equal weight to all observations in the period. To calculate an EMA, the Simple Moving Average (SMA) over a particular sub-interval must be computed first. In the next step, the multiplier for smoothing (weighting) the EMA must be calculated; it depends on the selected formula, and the following options are supported (<decay>):

- com: specify decay in terms of center of mass: \({\alpha}\) = 1 / (1 + com), for com >= 0
- span: specify decay in terms of span: \({\alpha}\) = 2 / (span + 1), for span >= 1
- halflife: specify decay in terms of half-life, \({\alpha}\) = 1 - exp(log(0.5) / halflife), for halflife > 0
- alpha: specify smoothing factor \({\alpha}\) directly: 0 < \({\alpha}\) <= 1

The computed coefficient \({\alpha}\) represents the degree of weighting decrease, a constant smoothing factor. A higher value of \({\alpha}\) discounts older observations faster; a smaller value does the opposite. Finally, the current value of the EMA is calculated using the relevant formula. It is important not to confuse the Exponential Moving Average with the Simple Moving Average. An Exponential Moving Average behaves quite differently from the latter, because it is a function of the weighting factor or the length of the average.
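The four decay formulas listed above translate directly into code. The sketch below converts a <decay> specification into the smoothing factor and applies a basic recursive EMA. The function names are hypothetical, and the recursion is seeded with the first observation (an assumption made for brevity; it is not the SMA seeding mentioned above).

```python
import math


def decay_to_alpha(kind, value):
    """Convert a decay specification into the smoothing factor alpha."""
    if kind == "com" and value >= 0:
        return 1 / (1 + value)
    if kind == "span" and value >= 1:
        return 2 / (value + 1)
    if kind == "halflife" and value > 0:
        return 1 - math.exp(math.log(0.5) / value)
    if kind == "alpha" and 0 < value <= 1:
        return value
    raise ValueError(f"unsupported decay: {kind}={value}")


def exponential_moving_average(values, alpha):
    """Recursive EMA: y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    if not values:
        return []
    smoothed = [values[0]]  # assumption: seeded with the first observation
    for value in values[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed
```

For instance, a center of mass of 3 corresponds to \({\alpha}\) = 1 / (1 + 3) = 0.25, and a span of 3 corresponds to \({\alpha}\) = 2 / (3 + 1) = 0.5.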

perun postprocessby moving_average ema [OPTIONS]

Options

-d, --decay <decay>

Exactly one of “com”, “span”, “halflife”, “alpha” can be provided. Allowed values and relationship between the parameters are specified in the documentation (e.g. --decay=com 3).

perun postprocessby kernel-regression

Execution of the interleaving of profiled resources by kernel models.

* Limitations: none
* Dependencies: none

In statistics, kernel regression is a non-parametric approach to estimate the conditional expectation of a random variable. Generally, the main goal of this approach is to find a non-parametric relation between a pair of random variables X <per-key> and Y <of-key>. Unlike parametric techniques (e.g. linear regression), kernel regression does not assume any underlying distribution (e.g. linear, exponential, etc.) to estimate the regression function. The main idea of kernel regression is to place a kernel, which plays the role of a weighting function, at each observation point in the dataset. The kernel then assigns a weight to each point depending on its distance from the current data point. The kernel basis formula depends only on the bandwidth from the current (‘local’) data point X to a set of neighboring data points X.

Kernel Selection is not important from an asymptotic point of view. It is appropriate to choose an optimal kernel, since this group of kernels is continuous on the whole domain and the estimated regression function then inherits the smoothness of the kernel. For example, suitable kernels are the epanechnikov or normal kernel. This postprocessor offers kernel selection in the kernel-smoothing mode, where five different types of kernels are available. For more information about these kernels or this kernel regression mode, see perun postprocessby kernel-regression kernel-smoothing.

Bandwidth Selection is the most important factor in each approach to kernel regression, since this value significantly affects the smoothness of the resulting estimate. When an inappropriate value is chosen, one of the following two situations can be expected in most cases. A small bandwidth value reproduces the estimated data, whereas a large value leads to over-smoothing, i.e. to averaging of the estimated data. Therefore, methods to determine the bandwidth value are used. One of the most widespread and commonly used methods is cross-validation. This method is based on an estimate of the regression function in which the i-th observation is omitted. In this postprocessor, this method is available in the estimator-settings mode. Other methods to determine the bandwidth, which are available in the remaining modes of this postprocessor, are the scott and silverman methods. For more information about these methods and their definitions, see perun postprocessby kernel-regression method-selection.

In summary, this postprocessor offers five different modes, which do not differ in the kind of resulting estimate but in the way the estimate is computed. In other words, the result of each mode is a kernel estimate with the relevant parameters selected according to the particular mode. The individual modes are briefly listed below; for more information, refer to the relevant parts of the documentation:

* Estimator-Settings: Nadaraya-Watson kernel regression with specific settings for estimate
* User-Selection: Nadaraya-Watson kernel regression with user bandwidth
* Method-Selection: Nadaraya-Watson kernel regression with supporting bandwidth selection method
* Kernel-Smoothing: Kernel regression with different types of kernel and regression methods
* Kernel-Ridge: Nadaraya-Watson kernel regression with automatic bandwidth selection

For more details about this approach of non-parametric analysis refer to Kernel Regression Methods.

perun postprocessby kernel-regression [OPTIONS] COMMAND [ARGS]...

Options

-of, --of-key <of_resource_key>

Sets key for which we are finding the model (y-coordinates).

-per, --per-key <per_resource_key>

Sets the key that will be used as a source variable (x-coordinates).

Commands

estimator-settings

Nadaraya-Watson kernel regression with…

kernel-ridge

Nadaraya-Watson kernel regression with…

kernel-smoothing

Kernel regression with different types of…

method-selection

Nadaraya-Watson kernel regression with…

user-selection

Nadaraya-Watson kernel regression with…

perun postprocessby kernel-regression estimator-settings

Nadaraya-Watson kernel regression with specific settings for estimate.

As mentioned above, kernel regression aims to estimate the functional relation between the explanatory variable X and the response variable y. This mode of the kernel regression postprocessor calculates the conditional mean E[y|X] = m(X), where y = m(X) + \(\epsilon\). Variable X is represented in the postprocessor by the <per-key> option and variable y by the <of-key> option.

Regression Estimator <reg-type>:

This mode offers two types of regression estimator <reg-type>. The Local Constant (`lc`) type of regression provided by this mode is also known as Nadaraya-Watson kernel regression:

Nadaraya-Watson: assumes the conditional expectation E[y|X] = m(X), where the function m(*) represents the regression function to estimate. Equivalently, we can write y = m(X) + \(\epsilon\), E(\(\epsilon\)) = 0. Given a set of independent observations {(\({x_1}\), \({y_1}\)), …, (\({x_n}\), \({y_n}\))}, the Nadaraya-Watson estimator is defined as:

\[m_{h}(x) = \sum_{i=1}^{n}K_h(x - x_i)y_i / \sum_{j=1}^{n}K_h(x - x_j)\]

where \({K_h}\) is a kernel with bandwidth h. The denominator normalizes the kernel weights so that they sum to 1. It is easy to see that this kernel regression estimator is just a weighted sum of the observed responses \({y_i}\). There are many other kernel estimators that differ from the one presented here. However, since they are all asymptotically equivalent, we will not deal with them further. The Kernel Regression postprocessor works in all modes only with the Nadaraya-Watson estimator.
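The estimator above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the Gaussian kernel, the function names and the sample data are chosen for the example and are not taken from Perun's implementation.

```python
import math


def gaussian_kernel(u):
    """Standard normal density used as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x, xs, ys, h):
    """Return m_h(x): the kernel-weighted average of the responses ys.

    K_h(u) = K(u / h) / h; the common factor 1 / h cancels in the ratio.
    """
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    return sum(w * yi for w, yi in zip(weights, ys)) / sum(weights)


# Illustrative data only (not taken from any Perun profile).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
estimate = nadaraya_watson(3.0, xs, ys, h=1.0)
print(f"m_h(3.0) = {estimate:.3f}")
```

Because the result is a weighted average, the estimate always lies between the smallest and the largest observed response.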

The second regression estimator supported in this mode of the postprocessor is Local Linear (`ll`). This type is an extension of the local-constant estimator which suffers less from bias issues at the edge of the support.

Local Linear: an estimator that offers various advantages compared with other kernel-type estimators, such as the Nadaraya-Watson estimator. More precisely, it adapts to both random and fixed designs, and to various design densities such as highly clustered designs and nearly uniform designs. It turns out that the local linear smoother repairs the drawbacks of other kernel regression estimators. A regression estimator \(\hat{m}\) of m is a linear smoother if, for each x, there is a vector \(l(x) = (l_1(x), ..., l_n(x))^T\) such that:

\[\hat{m}(x) = \sum_{i=1}^{n}l_i(x)Y_i = l(x)^TY\]

where \(Y = (Y_1, ..., Y_n)^T\). For kernel estimators:

\[l_i(x) = K(||x - X_i|| / h) / \sum_{j=1}^{n}K(||x - X_j|| / h)\]

where K represents kernel and h its bandwidth.
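To illustrate the linear-smoother view for the local-linear case, the following sketch computes a weight vector l(x) from the standard closed form of a kernel-weighted least-squares line fit. Note that these weights differ from the plain kernel weights shown above, which correspond to the local-constant estimator. The Gaussian kernel and all names are assumptions made for the example; this is not the StatsModels or Perun code.

```python
import math


def local_linear_weights(x, xs, h):
    """Return the vector l(x) of a local-linear smoother (Gaussian kernel)."""
    d = [xi - x for xi in xs]                          # offsets x_i - x
    w = [math.exp(-0.5 * (di / h) ** 2) for di in d]   # kernel weights
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    denom = s0 * s2 - s1 * s1
    return [wi * (s2 - s1 * di) / denom for wi, di in zip(w, d)]


def smooth(x, xs, ys, h):
    """Evaluate the linear smoother: the dot product l(x)^T Y."""
    return sum(li * yi for li, yi in zip(local_linear_weights(x, xs, h), ys))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * xi + 1.0 for xi in xs]   # exactly linear responses
edge_estimate = smooth(0.0, xs, ys, h=1.0)
print(edge_estimate)
```

The weights sum to one and reproduce a linear trend exactly, even at the edge of the data, which is the reduced boundary bias mentioned above.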

As an aside, the following estimators are linear smoothers as well: Gaussian process regression and splines.

Bandwidth Method <bandwidth-method>:

As stated in the general description of kernel regression, one of the most important factors of the resulting estimate is the kernel bandwidth. When an inappropriate value is selected, the resulting kernel estimate may be under-smoothed or over-smoothed. Since the bandwidth of the kernel is a free parameter with a strong influence on the resulting estimate, the postprocessor offers methods for its selection. The two most popular data-driven methods of bandwidth selection that have desirable properties are least-squares cross-validation (cv_ls) and the AIC-based method of Hurvich et al. (1998), which is based on minimizing a modified Akaike Information Criterion (aic):

Cross-Validation Least-Squares: determination of the optimal kernel bandwidth for kernel regression is based on minimizing

\[CV(h) = n^{-1} \sum_{i=1}^{n}(Y_i - g_{-i}(X_i))^2,\]

where \(g_{-i}(X_i)\) is the estimator of \(g(X_i)\) formed by leaving out the i-th observation when generating the prediction for observation i.
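The criterion CV(h) can be sketched as follows. The sketch assumes a Nadaraya-Watson estimator with a Gaussian kernel and picks the bandwidth from a small hypothetical grid; the optimizer actually used by EstimatorSettings may search differently.

```python
import math


def nw_estimate(x, xs, ys, h):
    """Nadaraya-Watson estimate at x with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def cv_least_squares(xs, ys, h):
    """CV(h): mean squared error of the leave-one-out predictions."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # g_{-i}: the estimator built without the i-th observation.
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (ys[i] - nw_estimate(xs[i], rest_x, rest_y, h)) ** 2
    return total / n


# Illustrative data: a sine trend with a small alternating disturbance.
xs = [0.5 * i for i in range(12)]
ys = [math.sin(x) + 0.1 * (-1) ** i for i, x in enumerate(xs)]

candidates = [0.2, 0.5, 1.0, 2.0, 5.0]   # hypothetical search grid
best_h = min(candidates, key=lambda h: cv_least_squares(xs, ys, h))
print(best_h)
```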

Hurvich et al.’s (1998) approach is based on the minimization of

\[AIC_c = ln(\sigma^2) + (1 + tr(H) / n) / (1 - (tr(H) + 2) / n),\]

where

\[\sigma^2 = 1 / n \sum_{i=1}^{n}(Y_i - g(X_i))^2 = Y'(I - H)'(I - H)Y / n\]

with \(g(X_i)\) being a non-parametric regression estimator and H being an n x n matrix of kernel weights with its (i, j)-th element given by \(H_{ij} = K_h(X_i, X_j) / \sum_{l=1}^{n} K_h(X_i, X_l)\), where \(K_h(*)\) is a generalized product kernel.
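As a rough illustration of the AIC-based criterion, the sketch below builds the matrix H for a one-dimensional Gaussian kernel, computes \(\sigma^2\) and tr(H), and evaluates \(AIC_c\). The kernel choice, the candidate grid and the guard for very small bandwidths are assumptions of the example.

```python
import math


def hat_matrix(xs, h):
    """H[i][j] = K_h(X_i, X_j) / sum_l K_h(X_i, X_l) for a Gaussian kernel."""
    rows = []
    for xi in xs:
        k = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in xs]
        total = sum(k)
        rows.append([kij / total for kij in k])
    return rows


def aic_c(xs, ys, h):
    """Corrected AIC of Hurvich et al. (1998) for bandwidth h."""
    n = len(xs)
    H = hat_matrix(xs, h)
    fitted = [sum(hij * yj for hij, yj in zip(row, ys)) for row in H]
    sigma2 = sum((y - g) ** 2 for y, g in zip(ys, fitted)) / n
    trace = sum(H[i][i] for i in range(n))
    if trace + 2 >= n:
        # The criterion is undefined when its denominator is not positive.
        raise ValueError("bandwidth too small: tr(H) + 2 must be below n")
    return math.log(sigma2) + (1 + trace / n) / (1 - (trace + 2) / n)


xs = [0.5 * i for i in range(12)]
ys = [math.sin(x) + 0.1 * (-1) ** i for i, x in enumerate(xs)]
candidates = [0.5, 1.0, 2.0, 5.0]   # hypothetical search grid
best_h = min(candidates, key=lambda h: aic_c(xs, ys, h))
print(best_h)
```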

Both methods of kernel bandwidth selection, least-squares cross-validation and the AIC, have been shown to be asymptotically equivalent.

The remaining options of this mode of the kernel regression postprocessor are described in the CLI usage listed below. All these options are parameters of EstimatorSettings (see EstimatorSettings), which optimizes the kernel bandwidth based on the specified settings.

If anything about this approach to kernel regression is unclear, refer to StatsModels.

perun postprocessby kernel-regression estimator-settings [OPTIONS]

Options

-rt, --reg-type <reg_type>

Provides the type for regression estimator. Supported types are: “lc”: local-constant (Nadaraya-Watson) and “ll”: local-linear estimator. Default is “ll”. For more information about these types you can visit Perun Documentation.

Options:

ll | lc

-bw, --bandwidth-method <bandwidth_method>

Provides the method for bandwidth selection. Supported values are: “cv_ls”: least-squares cross validation and “aic”: AIC Hurvich bandwidth estimation. Default is “cv_ls”. For more information about these methods you can visit Perun Documentation.

Options:

cv_ls | aic

--efficient, --uniformly

If True, the efficient bandwidth estimation is executed, i.e. smaller sub-samples are taken and the scaling factor of each sub-sample is estimated. This is useful for large samples and/or multiple variables. If False (default), all data is used at the same time.

--randomize, --no-randomize

If True, the bandwidth estimation is performed by taking <n-re-samples> random re-samples of size <n-sub-samples> from the full sample. If False (default), it is performed by slicing the full sample into sub-samples of size <n-sub-samples>, so that all samples are used once.

-nsub, --n-sub-samples <n_sub_samples>

Size of the sub-samples (default is 50).

-nres, --n-re-samples <n_re_samples>

The number of random re-samples used for bandwidth estimation. It has an effect only if <randomize> is set to True. Default value is 25.

--return-median, --return-mean

If True, the estimator uses the median of all scaling factors for each sub-sample to estimate bandwidth of the full sample. If False (default), the estimator uses the mean.
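The interplay of the sub-sampling options can be pictured with the following sketch. It only shows how sub-samples might be formed and how per-sub-sample scaling factors might be reduced to one value; the helper names are hypothetical, and details such as the handling of a shorter last slice are assumptions rather than the documented behaviour of EstimatorSettings.

```python
import random
import statistics


def sliced_sub_samples(data, n_sub):
    """Cut the full sample into consecutive slices of at most n_sub items."""
    return [data[i:i + n_sub] for i in range(0, len(data), n_sub)]


def random_sub_samples(data, n_sub, n_res, seed=0):
    """Draw n_res random sub-samples (without replacement) of size n_sub."""
    rng = random.Random(seed)
    return [rng.sample(data, n_sub) for _ in range(n_res)]


def combine(factors, return_median=False):
    """Reduce the per-sub-sample scaling factors to a single value."""
    if return_median:
        return statistics.median(factors)
    return statistics.mean(factors)


data = list(range(120))                                   # stand-in for the resources
slices = sliced_sub_samples(data, n_sub=50)               # --no-randomize
resamples = random_sub_samples(data, n_sub=50, n_res=25)  # --randomize
print(len(slices), len(resamples))
```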

perun postprocessby kernel-regression user-selection

Nadaraya-Watson kernel regression with user bandwidth.

This mode of the kernel regression postprocessor is very similar to the estimator-settings mode. It also offers two types of regression estimator <reg-type>: the Nadaraya-Watson estimator, also known as local-constant (lc), and the local-linear estimator (ll). Details about these estimators are available in perun postprocessby kernel-regression estimator-settings. In contrast to the estimator-settings mode, which selects the kernel bandwidth using EstimatorSettings and the chosen parameters, in this mode the user selects the kernel bandwidth <bandwidth-value> directly. This value is then used to execute the kernel regression. The bandwidth used in the resulting estimate may occasionally differ, specifically when the given value is too low to execute the kernel regression. In that case the bandwidth is approximated to the closest appropriate value, so that the accuracy of the resulting estimate does not decrease.

perun postprocessby kernel-regression user-selection [OPTIONS]

Options

-rt, --reg-type <reg_type>

Provides the type for regression estimator. Supported types are: “lc”: local-constant (Nadaraya-Watson) and “ll”: local-linear estimator. Default is “ll”. For more information about these types you can visit Perun Documentation.

Options:

ll | lc

-bv, --bandwidth-value <bandwidth_value>

Required The float value of the bandwidth defined by the user, which will be used in the kernel regression.

perun postprocessby kernel-regression method-selection

Nadaraya-Watson kernel regression with supporting bandwidth selection method.

The last of the group of three modes based on a similar principle. The method-selection mode offers the same types of regression estimator <reg-type> as the first two described modes. The first supported option is ll, which represents the local-linear estimator. The second option for the <reg-type> parameter is the Nadaraya-Watson, or local-constant, estimator. A more detailed description of these estimators is located in perun postprocessby kernel-regression estimator-settings. The difference between this mode and the first two modes lies in the way the kernel bandwidth is determined. This mode offers two methods to determine the bandwidth. These methods try to calculate an optimal bandwidth from predefined formulas:

Scott’s Rule of thumb to determine the smoothing bandwidth for a kernel estimation. It is very fast to compute. This rule was designed for density estimation but is usable for kernel regression too. It typically produces a larger bandwidth and is therefore useful for estimating a gradual trend:

\[bw = 1.059 * A * n^{-1/5},\]

where n marks the length of X variable <per-key> and

\[A = min(\sigma(x), IQR(x) / 1.349),\]

where \(\sigma\) marks the StandardDeviation and IQR marks the InterquartileRange.

Silverman’s Rule of thumb to determine the smoothing bandwidth for a kernel estimation. It belongs to the most popular rule-of-thumb methods. The rule was originally designed for density estimation and therefore uses the normal density as a prior for the approximation. For the necessary estimation of the \(\sigma\) of X <per-key>, Silverman proposes a robust version that makes use of the InterquartileRange. If the true density is uni-modal, fairly symmetric and does not have fat tails, it works fine:

\[bw = 0.9 * A * n^{-1/5},\]

where n marks the length of X variable <per-key> and

\[A = min(\sigma(x), IQR(x) / 1.349),\]

where \(\sigma\) marks the StandardDeviation and IQR marks the InterquartileRange.
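Both rules of thumb share the same structure and differ only in the leading constant, as the following sketch shows. The use of the sample standard deviation and of the default quartile method of the standard library are assumptions; other implementations may compute \(\sigma\) and the IQR slightly differently.

```python
import statistics


def rule_of_thumb_bandwidth(xs, method="scott"):
    """bw = c * A * n^(-1/5) with A = min(sigma, IQR / 1.349)."""
    constants = {"scott": 1.059, "silverman": 0.9}
    n = len(xs)
    sigma = statistics.stdev(xs)                 # sample standard deviation
    q1, _, q3 = statistics.quantiles(xs, n=4)    # quartiles of the x values
    a = min(sigma, (q3 - q1) / 1.349)
    return constants[method] * a * n ** (-1 / 5)


# Illustrative x values (e.g. structure sizes); not taken from a real profile.
xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
scott_bw = rule_of_thumb_bandwidth(xs, "scott")
silverman_bw = rule_of_thumb_bandwidth(xs, "silverman")
print(scott_bw, silverman_bw)
```

For the same data, Scott's rule always yields the larger of the two bandwidths, since 1.059 > 0.9.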

perun postprocessby kernel-regression method-selection [OPTIONS]

Options

-rt, --reg-type <reg_type>

Provides the type for regression estimator. Supported types are: “lc”: local-constant (Nadaraya-Watson) and “ll”: local-linear estimator. Default is “ll”. For more information about these types you can visit Perun Documentation.

Options:

ll | lc

-bm, --bandwidth-method <bandwidth_method>

Provides the helper method to determine the kernel bandwidth. The <bandwidth_method> will be used to compute the bandwidth, which will be used in the kernel regression.

Options:

scott | silverman

perun postprocessby kernel-regression kernel-smoothing

Kernel regression with different types of kernel and regression methods.

This mode of the kernel regression postprocessor implements non-parametric regression using different kernel methods and different kernel types. The calculation in this mode can be split into three parts: the kernel type, the bandwidth computation, and the regression method that will be used to fit the given resources. The supported options of each part of the computation are described in turn below.

Kernel Type <kernel-type>:

In non-parametric statistics, a kernel is a weighting function used in estimation techniques. In kernel regression, it is used to estimate the conditional expectation of a random variable. As stated above, the kernel width must be specified when running a non-parametric estimation. Mathematically, a kernel is a non-negative real-valued integrable function K. For most applications, it is desirable to define the function to satisfy two additional requirements:

Normalization:

\[\int_{-\infty}^{+\infty}K(u)du = 1,\]

Symmetry:

\[K(-u) = K(u),\]

for all values of u. The second requirement ensures that the average of the corresponding distribution is equal to that of the sample used. If K is a kernel, then so is the function \(K^*\) defined by \(K^*(u) = \lambda K (\lambda u)\), where \(\lambda > 0\). This can be used to select a scale that is appropriate for the data. This mode offers several types of kernel functions:

Kernel Name | Kernel Function, K(u) | Efficiency
Gaussian (normal) | \(K(u)=(1/\sqrt{2\pi})e^{-(1/2)u^2}\) | 95.1%
Epanechnikov | \(K(u)=3/4(1-u^2)\) | 100%
Tricube | \(K(u)=70/81(1-|u^3|)^3\) | 99.8%
Gaussian order4 | \(\phi_4(u)=1/2(3-u^2)\phi(u)\), where \(\phi\) is the normal kernel | not applicable
Epanechnikov order4 | \(K_4(u)=-(15/8)u^2+(9/8)\), where K is the non-normalized Epanechnikov kernel | not applicable

Efficiency is defined as \(\sqrt{\int u^2K(u)du}\int K(u)^2du\), and it is measured relative to the Epanechnikov kernel.
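The two requirements and the efficiency figures can be checked numerically. The sketch below defines three of the listed kernels, integrates them with a simple midpoint rule and compares their efficiency constants. Restricting the Epanechnikov and tricube kernels to |u| <= 1, truncating the Gaussian to [-8, 8], and taking the ratio of the Epanechnikov constant to the kernel's own constant are assumptions of this example.

```python
import math


def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def tricube(u):
    return 70.0 / 81.0 * (1.0 - abs(u) ** 3) ** 3 if abs(u) <= 1.0 else 0.0


def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


def efficiency_constant(kernel, lo, hi):
    """sqrt(int u^2 K(u) du) * int K(u)^2 du."""
    second_moment = integrate(lambda u: u * u * kernel(u), lo, hi)
    roughness = integrate(lambda u: kernel(u) ** 2, lo, hi)
    return math.sqrt(second_moment) * roughness


supports = {gaussian: (-8.0, 8.0), epanechnikov: (-1.0, 1.0), tricube: (-1.0, 1.0)}
reference = efficiency_constant(epanechnikov, -1.0, 1.0)
for kernel, (lo, hi) in supports.items():
    mass = integrate(kernel, lo, hi)                       # normalization check
    relative = reference / efficiency_constant(kernel, lo, hi)
    print(f"{kernel.__name__}: integral={mass:.4f}, efficiency={relative:.1%}")
```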

Smoothing Method <smoothing-method>:

Kernel-Smoothing mode of this postprocessor offers three different non-parametric regression methods to execute kernel regression. The first of them is called spatial-average and performs a Nadaraya-Watson regression (also called local-constant regression) on the data using a given kernel:

\[m_{h}(x) = \sum_{i=1}^{n}K_h((x-x_i)/h)y_i/\sum_{j=1}^{n}K_h((x-x_j) / h),\]

where K(x) is the kernel, which must be such that E(K(x)) = 0, and h is the bandwidth of the method. Local-constant regression was also described in perun postprocessby kernel-regression estimator-settings. The second regression method supported by this mode is called local-linear. Compared with the previous method, which supports computation with different types of kernel, this method is restricted and performs local-linear regression using only the Gaussian (normal) kernel. Local-linear regression was described in perun postprocessby kernel-regression estimator-settings and therefore is not given further attention here. Local-polynomial regression is the last method in this mode and performs regression in N-D using a user-provided kernel. The local-polynomial regression is the function that minimizes, for each position:

\[m_{h}(x) = \sum_{i=1}^{n}K((x - x_i) / h)(y_i - a_0 - P_q(x_i -x))^2,\]

where K(x) is the kernel such that E(K(x)) = 0, q is the order of the fitted polynomial <polynomial-order>, \(P_q(x)\) is a polynomial of order q in x, and h is the bandwidth of the method. The polynomial \(P_q(x)\) is of the form:

\[F_d(k) = \{ n \in N^d | \sum_{i=1}^{d}n_i = k \}\]
\[P_q(x_1, ..., x_d) = \sum_{k=1}^{q}\sum_{n \in F_d(k)} a_{k,n}\prod_{i=1}^{d}x_{i}^{n_i}\]

For example we can have:

\[P_2(x, y) = a_{110}x + a_{101}y + a_{220}x^2 + a_{211}xy + a_{202}y^2\]
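The index sets \(F_d(k)\) can be enumerated directly, which makes it easy to see which coefficients \(a_{k,n}\) a polynomial of order q in d variables has. The function names below are made up for this illustration.

```python
from itertools import product


def exponent_sets(d, k):
    """F_d(k): all exponent tuples n in N^d whose entries sum to k."""
    return [n for n in product(range(k + 1), repeat=d) if sum(n) == k]


def polynomial_terms(d, q):
    """Index pairs (k, n) of the coefficients a_{k,n} of P_q in d variables."""
    return [(k, n) for k in range(1, q + 1) for n in exponent_sets(d, k)]


# P_2(x, y): two linear terms and three quadratic terms.
terms = polynomial_terms(d=2, q=2)
for k, n in terms:
    print(k, n)
```

For d = 2 and q = 2 this yields five index pairs, matching the five terms of \(P_2(x, y)\) above; the constant term is carried separately by \(a_0\).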

The last part of the calculation is the bandwidth computation. This mode lets the user enter the value directly using the parameter <bandwidth-value>. Alternatively, the parameter <bandwidth-method> lets the user select one of two methods to determine the optimal bandwidth value. The supported methods are Scott’s Rule and Silverman’s Rule, which are described in perun postprocessby kernel-regression method-selection. This parameter should not be combined with <bandwidth-value>; if both are given, <bandwidth-method> is ignored and the value from <bandwidth-value> is used.

perun postprocessby kernel-regression kernel-smoothing [OPTIONS]

Options

-kt, --kernel-type <kernel_type>

Provides the set of kernels for executing the kernel-smoothing with the kernel selected by the user. For exact definitions of these kernels and more information about them, you can visit the Perun Documentation.

Options:

epanechnikov | tricube | normal | epanechnikov4 | normal4

-sm, --smoothing-method <smoothing_method>

Provides kernel smoothing methods for executing non-parametric regressions: local-polynomial performs a local-polynomial regression in N-D using a user-provided kernel; local-linear performs a local-linear regression using a gaussian (normal) kernel; and spatial-average performs a Nadaraya-Watson regression on the data (so-called local-constant regression) using a user-provided kernel.

Options:

spatial-average | local-linear | local-polynomial

-bm, --bandwidth-method <bandwidth_method>

Provides the helper method to determine the kernel bandwidth. The <bandwidth_method> will be used to compute the bandwidth, which will be used in the kernel-smoothing regression. It should not be combined with <bandwidth-value>; if both are given, the method is ignored and the value from <bandwidth-value> is used.

Options:

scott | silverman

-bv, --bandwidth-value <bandwidth_value>

The float value of the bandwidth defined by the user, which will be used in the kernel regression. If it is entered in combination with <bandwidth-method>, the method will be ignored.

-q, --polynomial-order <polynomial_order>

Provides the order of the polynomial to fit. The default order is 3. It is accepted only by the local-polynomial <smoothing-method>; other methods ignore it.

perun postprocessby kernel-regression kernel-ridge

Nadaraya-Watson kernel regression with automatic bandwidth selection.

This mode implements Nadaraya-Watson kernel regression, which was described above in perun postprocessby kernel-regression estimator-settings. While the previous modes determine the optimal bandwidth in various ways, this mode takes a slightly different approach: from a given range of potential bandwidths <gamma-range>, it tries to select the optimal kernel bandwidth using leave-one-out cross-validation. This approach is a modification of the least-squares cross-validation introduced in perun postprocessby kernel-regression estimator-settings. Leave-one-out cross-validation is K-fold cross-validation taken to its logical extreme, with K equal to N, the number of data points in the set. The original gamma-range is divided according to the size of the given step <gamma-step>. A specific value from this range is selected by minimizing the mean squared error in leave-one-out cross-validation. The selected bandwidth value is then used for the Gaussian kernel in the resulting estimate: \(K(x, y) = exp(-\gamma * ||x-y||^2)\).
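The selection over <gamma-range> can be sketched as follows. The way the range is split into candidates (inclusive upper bound when the step reaches it) and all helper names are assumptions for the sake of the example, not the actual implementation of this mode.

```python
import math


def gamma_candidates(lower, upper, step):
    """Split the range into candidates lower, lower + step, ... up to upper."""
    count = int((upper - lower) / step + 1e-9)   # tolerate float rounding
    return [lower + i * step for i in range(count + 1)]


def loo_mse(xs, ys, gamma):
    """Mean squared leave-one-out error of a Nadaraya-Watson fit."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            if j != i:
                k = math.exp(-gamma * (xi - xj) ** 2)   # K(x, y) = exp(-gamma ||x - y||^2)
                num += k * yj
                den += k
        total += (yi - num / den) ** 2
    return total / len(xs)


# Illustrative data only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.1, 0.9, 4.2, 8.8, 16.1, 25.2, 35.8, 49.1]
gammas = gamma_candidates(0.1, 1.0, 0.1)
best_gamma = min(gammas, key=lambda g: loo_mse(xs, ys, g))
print(best_gamma)
```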

perun postprocessby kernel-regression kernel-ridge [OPTIONS]

Options

-gr, --gamma-range <gamma_range>

Provides the range for automatic bandwidth selection of the kernel via leave-one-out cross-validation. One value from this range will be selected by minimizing the mean-squared error of leave-one-out cross-validation. The first value is taken as the lower bound of the range and cannot be greater than the second value.

-gs, --gamma-step <gamma_step>

Provides the size of the step used to iterate over the given <gamma-range>. It cannot be greater than the length of <gamma-range>; otherwise it is set to the value of the lower bound of <gamma-range>.

Show Commands

perun show

Interprets the given profile using the selected visualization technique.

Looks up the given profile and interprets it using the selected visualization technique. Some of the techniques output to the terminal (using ncurses), while others generate HTML files, which can be browsed in a web browser (using the bokeh library). Refer to concrete techniques for concrete options and limitations.

The shown <profile> will be looked up in the following steps:

  1. If <profile> is in form i@i (i.e., an index tag), then the ith record registered in the minor version <hash> index will be shown.

  2. If <profile> is in form i@p (i.e., a pending tag), then the ith profile stored in .perun/jobs will be shown.

  3. <profile> is looked-up within the minor version <hash> index for a match. In case the <profile> is registered there, it will be shown.

  4. <profile> is looked-up within the .perun/jobs directory. In case there is a match, the found profile will be shown.

  5. Otherwise, the directory is walked for any match. The user is asked to confirm each found match.

Tags consider the sorted order as specified by the option format.sort_profiles_by.
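The first two lookup steps rely on recognizing the tag forms i@i and i@p. A minimal sketch of such a recognition step is shown below; the function name, the regular expression and the returned labels are hypothetical and only illustrate the idea, they are not Perun's internal API.

```python
import re

# A tag is a non-negative index, '@', and either 'i' (index) or 'p' (pending).
TAG_PATTERN = re.compile(r"^(\d+)@([ip])$")


def parse_tag(profile):
    """Return (position, source) for a tag, or None for any other string."""
    match = TAG_PATTERN.match(profile)
    if match is None:
        return None   # not a tag: fall through to the name-based lookup steps
    source = "index" if match.group(2) == "i" else "pending"
    return int(match.group(1)), source


print(parse_tag("0@i"))
print(parse_tag("3@p"))
print(parse_tag("echo-time-hello.perf"))
```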

Example 1. The following command will show the first profile registered at index of HEAD~1 commit. The resulting graph will contain bars representing sum of amounts per each subtype of resources and will be shown in the browser:

perun show -m HEAD~1 0@i bars sum --of 'amount' --per 'subtype' -v

Example 2. The following command will show the profile at the given path in raw JSON format:

perun show ./echo-time-hello-2017-04-02-13-13-34-12.perf raw

For a thorough list and description of supported visualization techniques refer to Supported Visualizations.

perun show [OPTIONS] <profile> COMMAND [ARGS]...

Options

-m, --minor <minor>

Will check the index of different minor version <hash> during the profile lookup

Arguments

<profile>

Required argument

Show units

perun show bars

Customizable interpretation of resources using the bar format.

* Limitations: none.
* Interpretation style: graphical
* Visualization backend: Bokeh

Bars graph shows the aggregation (e.g. sum, count, etc.) of resources of given types (or keys). Each bar shows <func> of resources from <of> key (e.g. sum of amounts, average of amounts, count of types, etc.) per each <per> key (e.g. per each snapshot, or per each type). Moreover, the graphs can either be (i) stacked, where the different values of <by> key are shown above each other, or (ii) grouped, where the different values of <by> key are shown next to each other. Refer to resources for examples of keys that can be used as <of>, <key>, <per> or <by>.

Bokeh library is the current interpretation backend, which generates HTML files, that can be opened directly in the browser. Resulting graphs can be further customized by adding custom labels for axes, custom graph title or different graph width.

Example 1. The following will display the sum of amounts of all resources for each subtype, stacked by uid (e.g. the locations in the program):

perun show 0@i bars sum --of 'amount' --per 'subtype' --stacked --by 'uid'

The example output of the bars is as follows:

                                <graph_title>
                        `
                        -         .::.                ````````
                        `         :&&:                ` # \  `
                        -   .::.  ::::        .::.    ` @  }->  <by>
                        `   :##:  :##:        :&&:    ` & /  `
        <func>(<of>)    -   :##:  :##:  .::.  :&&:    ````````
                        `   ::::  :##:  :&&:  ::::
                        -   :@@:  ::::  ::::  :##:
                        `   :@@:  :@@:  :##:  :##:
                        +````||````||````||````||````

                                    <per>

Refer to Bars Plot for more thorough description and example of bars interpretation possibilities.
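The aggregation behind a bars graph can be pictured as a group-by over resources. The following sketch groups a handful of made-up resources by the <per> and <by> keys and applies <func> to the <of> values; the resource dictionaries and the helper are illustrative assumptions, not Perun's internal data model.

```python
from collections import defaultdict

AGGREGATIONS = {
    "sum": sum,
    "count": len,
    "min": min,
    "max": max,
    "mean": lambda values: sum(values) / len(values),
}


def aggregate(resources, func, of, per, by):
    """Apply <func> to the <of> values of every (<per>, <by>) group."""
    groups = defaultdict(list)
    for resource in resources:
        groups[(resource[per], resource[by])].append(resource[of])
    return {key: AGGREGATIONS[func](values) for key, values in groups.items()}


# Made-up resources; real profiles contain more keys.
resources = [
    {"amount": 4, "subtype": "malloc", "uid": "main"},
    {"amount": 6, "subtype": "malloc", "uid": "main"},
    {"amount": 3, "subtype": "malloc", "uid": "helper"},
    {"amount": 8, "subtype": "calloc", "uid": "main"},
]
bars = aggregate(resources, "sum", of="amount", per="subtype", by="uid")
print(bars)
```

Each (<per>, <by>) pair then corresponds to one segment of a stacked bar, or to one bar within a group.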

perun show bars [OPTIONS] <aggregation_function>

Options

-o, --of <of_resource_key>

Required Sets key that is source of the data for the bars, i.e. what will be displayed on Y axis.

-p, --per <per_resource_key>

Sets key that is source of values displayed on X axis of the bar graph.

-b, --by <by_resource_key>

Sets the key that will be used either for stacking or grouping of values

-s, --stacked

Will stack the values by <resource_key> specified by option --by.

-g, --grouped

Will group the values by <resource_key> specified by option --by.

-f, --filename <html>

Sets the outputs for the graph to the file.

-xl, --x-axis-label <text>

Sets the custom label on the X axis of the bar graph.

-yl, --y-axis-label <text>

Sets the custom label on the Y axis of the bar graph.

-gt, --graph-title <text>

Sets the custom title of the bars graph.

-v, --view-in-browser

The generated graph will be immediately opened in the browser (firefox will be used).

Arguments

<aggregation_function>

Optional argument

perun show flamegraph

Flame graph interprets the relative and inclusive presence of the resources according to the stack depth of the origin of resources.

* Limitations: memory profiles generated by
* Interpretation style: graphical
* Visualization backend: HTML

Flame graph intends to quickly identify hotspots that are the source of the resource consumption complexity. On the X axis, a relative consumption of the data is depicted, while on the Y axis a stack depth is displayed. The wider the bars are on the X axis, the more resources the function consumed relative to others.

Acknowledgements: Big thanks to Brendan Gregg for creating the original perl script for creating flame graphs w.r.t simple format. If you like this visualization technique, please check out this guy’s site (https://brendangregg.com) for more information about performance, profiling and useful talks and visualization techniques!

The example output of the flamegraph is more or less as follows:

                    `
                    -                         .
                    `                         |
                    -              ..         |     .
                    `              ||         |     |
                    -              ||        ||    ||
                    `            |%%|       |--|  |!|
                    -     |## g() ##|     |#g()#|***|
                    ` |&&&& f() &&&&|===== h() =====|
                    +````||````||````||````||````||````

Refer to Flame Graph for more thorough description and examples of the interpretation technique. Refer to perun.profile.convert.to_flame_graph_format() for more details how the profiles are converted to the flame graph format.

perun show flamegraph [OPTIONS]

Options

-f, --filename <filename>

Sets the output file of the resulting flame graph.

perun show flow

Customizable interpretation of resources using the flow format.

* Limitations: none.
* Interpretation style: graphical, textual
* Visualization backend: Bokeh, ncurses

Flow graph shows the values of resources depending on the independent variable as a basic graph. For each group of resources identified by a unique value of the <by> key, one graph shows the dependency of <of> values aggregated by <func> depending on the <through> key. Moreover, the values can either be accumulated (this way, when displaying the value of ‘n’ on the x-axis, we accumulate the sum of all values for all m < n) or stacked, where the graphs are drawn on top of each other, so one can see the overall trend through all the groups and the proportions between the groups.

Bokeh library is the current interpretation backend, which generates HTML files, that can be opened directly in the browser. Resulting graphs can be further customized by adding custom labels for axes, custom graph title or different graph width.

Example 1. The following will show the average amount (in this case the function running time) of each function depending on the size of the structure over which the given function operated:

perun show 0@i flow mean --of 'amount' --through 'structure-unit-size'
    --accumulate --by 'uid'

The example output of the flow is as follows:

                                <graph_title>
                        `
                        -                      ______    ````````
                        `                _____/          ` # \  `
                        -               /          __    ` @  }->  <by>
                        `          ____/      ____/      ` & /  `
        <func>(<of>)    -      ___/       ___/           ````````
                        `  ___/    ______/       ____
                        -/  ______/        _____/
                        `__/______________/
                        +````||````||````||````||````

                                  <through>

Refer to Flow Plot for more thorough description and example of flow interpretation possibilities.
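The effect of accumulation on a flow series can be sketched as a running sum over the sorted <through> values. In this illustration the aggregation is fixed to the mean, the running sum includes the current point (an assumption about how accumulation is meant), and the resources and helper are made up for the example.

```python
from collections import defaultdict
from itertools import accumulate


def flow_series(resources, of, through, by, accumulated=False):
    """Per <by> group: mean of <of> for each <through> value, in sorted order."""
    groups = defaultdict(lambda: defaultdict(list))
    for resource in resources:
        groups[resource[by]][resource[through]].append(resource[of])
    series = {}
    for group, points in groups.items():
        xs = sorted(points)
        ys = [sum(points[x]) / len(points[x]) for x in xs]
        if accumulated:
            ys = list(accumulate(ys))   # running sum along the X axis
        series[group] = list(zip(xs, ys))
    return series


# Made-up resources: running times of function 'f' for three structure sizes.
resources = [
    {"amount": 2, "structure-unit-size": 1, "uid": "f"},
    {"amount": 4, "structure-unit-size": 1, "uid": "f"},
    {"amount": 5, "structure-unit-size": 2, "uid": "f"},
    {"amount": 7, "structure-unit-size": 3, "uid": "f"},
]
plain = flow_series(resources, "amount", "structure-unit-size", "uid")
running = flow_series(resources, "amount", "structure-unit-size", "uid", accumulated=True)
print(plain["f"], running["f"])
```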

perun show flow [OPTIONS] <aggregation_function>

Options

-o, --of <of_resource_key>

Required Sets key that is source of the data for the flow, i.e. what will be displayed on Y axis, e.g. the amount of resources.

-t, --through <through_key>

Sets key that is source of the data value, i.e. the independent variable, like e.g. snapshots or size of the structure.

-b, --by <by_resource_key>

Required For each <by_resource_key> one graph will be output, e.g. for each subtype or for each location of resource.

-s, --stacked

Will stack the y axis values for different <by> keys on top of each other. Additionally shows the sum of the values.

--accumulate, --no-accumulate

Will accumulate the values for all previous values of X axis.

-f, --filename <html>

Sets the outputs for the graph to the file.

-xl, --x-axis-label <text>

Sets the custom label on the X axis of the flow graph.

-yl, --y-axis-label <text>

Sets the custom label on the Y axis of the flow graph.

-gt, --graph-title <text>

Sets the custom title of the flow graph.

-v, --view-in-browser

The generated graph will be immediately opened in the browser (firefox will be used).

Arguments

<aggregation_function>

Optional argument

perun show scatter

Interactive visualization of resources and models in scatter plot format.

Scatter plot shows resources as points according to the given parameters. The plot interprets <per> and <of> as x, y coordinates for the points. The scatter plot also displays models located in the profile as curves/lines.

* Limitations: none.
* Interpretation style: graphical
* Visualization backend: Bokeh

Features in progress:

  • uid filters

  • models filters

  • multiple graphs interpretation

Graphs are displayed using the Bokeh library and can be further customized by adding custom labels for axis, custom graph title and different graph width.

The example output of the scatter is as follows:

                          <graph_title>
                  `                         o
                  -                        /
                  `                       /o       ```````````````````
                  -                     _/         `  o o = <points> `
                  `                   _- o         `    _             `
    <of>          -               __--o            `  _-  = <models> `
                  `    _______--o- o               `                 `
                  -    o  o  o                     ```````````````````
                  `
                  +````||````||````||````||````

                              <per>

Refer to Scatter Plot for more thorough description and example of scatter interpretation possibilities. For more thorough explanation of regression analysis and models refer to Regression Analysis.

perun show scatter [OPTIONS]

Options

-o, --of <of_key>

Data source for the scatter plot, i.e. what will be displayed on Y axis.

Default:

'amount'

-p, --per <per_key>

Keys that will be displayed on X axis of the scatter plot.

Default:

'structure-unit-size'

-f, --filename <html>

Outputs the graph to the file specified by filename.

-xl, --x-axis-label <text>

Label on the X axis of the scatter plot.

-yl, --y-axis-label <text>

Label on the Y axis of the scatter plot.

-gt, --graph-title <text>

Title of the scatter plot.

-v, --view-in-browser

Opens the generated graph in the browser.
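As an illustration of how the options above combine, the following Python sketch assembles a perun show scatter command line. The helper name scatter_command and the file name scatter.html are hypothetical; only the option names and defaults come from the listing above. Any arguments required by the parent perun show command are outside this section and are therefore omitted.

```python
import shlex


def scatter_command(of="amount", per="structure-unit-size", filename=None,
                    x_label=None, y_label=None, title=None, view=False):
    """Build the argument list for `perun show scatter` (hypothetical helper)."""
    argv = ["perun", "show", "scatter", "--of", of, "--per", per]
    # Optional values are appended only when the caller supplies them.
    for flag, value in (("--filename", filename),
                        ("--x-axis-label", x_label),
                        ("--y-axis-label", y_label),
                        ("--graph-title", title)):
        if value is not None:
            argv += [flag, value]
    if view:
        argv.append("--view-in-browser")
    return argv


# "scatter.html" and the title are made-up values used only for illustration.
cmd = scatter_command(filename="scatter.html", title="Amount per size")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as is; shlex.join is used here only to print a copy-pasteable shell line.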

Utility Commands

perun utils

Contains a set of developer commands, wrappers over helper scripts and other functions that are not part of the main perun suite.

perun utils [OPTIONS] COMMAND [ARGS]...

Commands

create

According to the given <template>…

stats

Provides a set of operations for…

temp

Provides a set of operations for…

perun utils create

Constructs new modules in Perun for <unit> according to the given <template>.

Currently, this supports creating new modules for the tool suite (namely collect, postprocess, view) or new algorithms for checking degradation (check). The command uses templates stored in the ../perun/templates directory and uses jinja as a template handler. The templates can be parametrized by the following options (if not specified, ‘none’ is used).

Unless --no-edit is set, the newly created files are opened, after their successful creation, in an external editor specified by the general.editor configuration key.

perun utils create [OPTIONS] <template> <unit>

Options

-nb, --no-before-phase

If set, the unit will not have the before() function defined.

-na, --no-after-phase

If set, the unit will not have the after() function defined.

-ne, --no-edit

If set, the newly created files will not be opened in the editor specified by the general.editor configuration key.

-st, --supported-type <supported_types>

Sets the supported types of the unit (i.e. profile types).

Arguments

<template>

Required argument

<unit>

Required argument
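A minimal sketch of assembling a perun utils create invocation from Python follows. The helper create_command and the unit name mycollector are hypothetical; the template names are the ones mentioned in the description above, and whether the CLI itself rejects other template names is an assumption of this sketch.

```python
import shlex

# Template names mentioned in the description above.
TEMPLATES = ("collect", "postprocess", "view", "check")


def create_command(template, unit, no_before=False, no_after=False,
                   no_edit=False):
    """Build the argument list for `perun utils create` (hypothetical helper)."""
    if template not in TEMPLATES:
        raise ValueError(f"unknown template: {template!r}")
    argv = ["perun", "utils", "create"]
    if no_before:
        argv.append("--no-before-phase")
    if no_after:
        argv.append("--no-after-phase")
    if no_edit:
        argv.append("--no-edit")
    # Positional arguments follow the options, as in the usage line.
    return argv + [template, unit]


# "mycollector" is a made-up unit name used only for illustration.
cmd = create_command("collect", "mycollector", no_after=True, no_edit=True)
print(shlex.join(cmd))
```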

perun temp

Provides a set of operations for maintaining the temporary directory (.perun/tmp/) of perun.

perun temp [OPTIONS] COMMAND [ARGS]...

Commands

delete

Deletes the temporary file or directory.

list

Lists the temporary files of the…

sync

Synchronizes the ‘.perun/tmp/’ directory…

perun temp list

Lists the temporary files of the ‘.perun/tmp/’ directory. It is possible to list only files in a specific subdirectory by supplying the ROOT path.

The path can be either absolute or relative; relative paths are resolved against the tmp/ directory.

perun temp list [OPTIONS] [ROOT]

Options

-t, --no-total-size

Do not show the total size of all the temporary files combined.

-f, --no-file-size

Do not show the size of each temporary file.

-p, --no-protection-level

Do not show the protection level of the temporary files.

-s, --sort-by <sort_by>

Sorts the temporary files on the output.

Options:

name | protection | size

-fp, --filter-protection <filter_protection>

List only temporary files with the given protection level.

Options:

all | unprotected | protected

Arguments

ROOT

Optional argument
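The option values listed above can be checked before the command is run. The following sketch does that for perun temp list; the helper temp_list_command and the subdirectory name degradation are hypothetical, while the option names and allowed values are taken from the listing above.

```python
import shlex

# Allowed values as listed for --sort-by and --filter-protection.
SORT_KEYS = ("name", "protection", "size")
PROTECTION_LEVELS = ("all", "unprotected", "protected")


def temp_list_command(root=None, sort_by=None, filter_protection=None,
                      total_size=True):
    """Build the argument list for `perun temp list` (hypothetical helper)."""
    argv = ["perun", "temp", "list"]
    if sort_by is not None:
        if sort_by not in SORT_KEYS:
            raise ValueError(f"unsupported sort key: {sort_by!r}")
        argv += ["--sort-by", sort_by]
    if filter_protection is not None:
        if filter_protection not in PROTECTION_LEVELS:
            raise ValueError(f"unsupported protection: {filter_protection!r}")
        argv += ["--filter-protection", filter_protection]
    if not total_size:
        argv.append("--no-total-size")
    # ROOT is optional and, if relative, is resolved against tmp/.
    if root is not None:
        argv.append(root)
    return argv


# "degradation" is a made-up subdirectory of .perun/tmp/ used for illustration.
cmd = temp_list_command(root="degradation", sort_by="size",
                        filter_protection="protected")
print(shlex.join(cmd))
```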

perun temp sync

Synchronizes the ‘.perun/tmp/’ directory contents with the internal tracking file. This is useful when some files or directories were deleted manually and the resulting inconsistency is causing trouble; however, this should be a very rare condition.

Invoking the ‘temp list’ command should also synchronize the internal state automatically.

perun temp sync [OPTIONS]

perun stats

Provides a set of operations for manipulating the stats directory (.perun/stats/) of perun.

perun stats [OPTIONS] COMMAND [ARGS]...

Commands

clean

Cleans the stats directory by…

delete

Allows the deletion of stat files, minor…

list-files

Show stat files stored in the stats…

list-versions

Show minor versions stored as directories…

sync

Synchronizes the actual contents of the…

perun stats list-files

Show stat files stored in the stats directory (.perun/stats/). This command shows only a limited number of the most recent files by default. This limit can be changed with the --top and --from-minor options.

The default output format is ‘file size | minor version | file name’.

perun stats list-files [OPTIONS]

Options

-N, --top <top>

Show only stat files from top N minor versions. Show all results if set to 0. The minor version to start at can be changed using --from-minor.

Default:

20

-m, --from-minor <hash>

Show stat files starting from a certain minor version (default is HEAD).

-i, --no-minor

Do not show the minor version headers in the output.

-f, --no-file-size

Do not show the size of each stat file.

-t, --no-total-size

Do not show the total size of all the stat files combined.

-s, --sort-by-size

Sort the files by size instead of the minor versions order.
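The interplay of --top and --from-minor described above can be modelled in a few lines of Python. This is only a sketch of the documented semantics, not Perun's implementation: the function select_minor_versions is hypothetical, the version identifiers are made up, and the list is assumed to be ordered from the most recent minor version (HEAD) to the oldest.

```python
def select_minor_versions(versions, top=20, from_minor=None):
    """Pick the minor versions whose stat files would be listed.

    versions: identifiers ordered from HEAD (index 0) to the oldest one;
              this ordering is an assumption of the sketch.
    """
    # Without --from-minor the listing starts at HEAD.
    start = 0 if from_minor is None else versions.index(from_minor)
    selected = versions[start:]
    # A value of 0 for --top means "show everything".
    return selected if top == 0 else selected[:top]


history = ["c3", "b2", "a1"]  # made-up identifiers, newest first
print(select_minor_versions(history, top=2))
print(select_minor_versions(history, top=0, from_minor="b2"))
```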

perun stats list-versions

Show minor versions stored as directories in the stats directory (.perun/stats/). This command shows only a limited number of the most recent versions by default. This limit can be changed with the --top and --from-minor options.

The default output format is ‘directory size | minor version | file count’.

perun stats list-versions [OPTIONS]

Options

-N, --top <top>

Show only top N minor versions. Show all versions if set to 0. The minor version to start at can be changed using --from-minor.

Default:

20

-m, --from-minor <hash>

Show minor versions starting from a certain minor version (default is HEAD).

-d, --no-dir-size

Do not show the size of the version directory.

-f, --no-file-count

Do not show the number of files in each version directory.

-t, --no-total-size

Do not show the total size of all the versions combined.

-s, --sort-by-size

Sort the versions by size instead of their VCS order.

perun stats delete

Allows the deletion of stat files, minor versions or the whole stats directory.

perun stats delete [OPTIONS] COMMAND [ARGS]...

Commands

.

Deletes the whole content of the stats directory.

file

Deletes a stat file in either specific…

minor

Deletes the specified minor version…

perun stats delete file

Deletes a stat file in either specific minor version or across all the minor versions in the stats directory.

perun stats delete file [OPTIONS] NAME

Options

-m, --in-minor <hash>

Delete the stat file in the specified minor version (HEAD if not specified) or across all the minor versions if set to “.”.

-k, --keep-directory

The minor version directory will be kept in the file system even if it becomes empty.

Arguments

NAME

Required argument
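The effect of --in-minor and --keep-directory can be pictured with a small in-memory model. The sketch below is not Perun's implementation: the function delete_stat_file, the dictionary layout and all identifiers are hypothetical, and the removal of emptied directories by default is inferred from the description of --keep-directory.

```python
def delete_stat_file(stats, name, head, in_minor=None, keep_directory=False):
    """Delete a stat file from an in-memory model of the stats directory.

    stats: dict mapping a minor version to the set of its stat file names.
    head:  identifier of the HEAD minor version (used when in_minor is None).
    """
    # "." selects every minor version; otherwise a single one is targeted.
    targets = list(stats) if in_minor == "." else [in_minor or head]
    for minor in targets:
        files = stats.get(minor)
        if files is None:
            continue
        files.discard(name)
        # Emptied version directories are dropped unless they should be kept.
        if not files and not keep_directory:
            del stats[minor]
    return stats


model = {"c3": {"a.json"}, "b2": {"a.json", "b.json"}}  # made-up contents
print(delete_stat_file(model, "a.json", head="c3", in_minor="."))
```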

perun stats delete minor

Deletes the specified minor version directory in stats with all its content.

perun stats delete minor [OPTIONS] VERSION

Options

-k, --keep-directory

The resulting empty minor version directory will be kept in the file system.

Arguments

VERSION

Required argument

perun stats delete .

Deletes the whole content of the stats directory.

perun stats delete . [OPTIONS]

Options

-k, --keep-directory

The resulting empty minor version directories will be kept in the file system.

perun stats clean

Cleans the stats directory by synchronizing the internal state, deleting distinguishable custom files and directories, and removing the empty minor version directories. Not all custom-made or manually created files and directories can be identified as custom, e.g. when they comply with the correct format.

perun stats clean [OPTIONS]

Options

-c, --keep-custom

The custom stats directories will not be removed.

-e, --keep-empty

The empty version directories will not be removed.

perun stats sync

Synchronizes the actual contents of the stats directory with the internal ‘index’ file. The synchronization should be needed only rarely - mainly in cases when the stats directory has been manually tampered with and some files or directories were created or deleted by a user.

perun stats sync [OPTIONS]
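Conceptually, such a synchronization compares what is tracked with what is actually on disk. The following sketch illustrates the idea only; the format of Perun's internal index file is not described in this section, so the index is modelled here as a plain set of relative paths, and the function sync_index as well as all file names are hypothetical.

```python
import os
import tempfile


def sync_index(stats_dir, index):
    """Return (synced index, newly found paths, stale paths).

    index: set of paths relative to stats_dir; a stand-in for the real
           index file, whose format is not covered here.
    """
    on_disk = set()
    for root, _dirs, files in os.walk(stats_dir):
        for name in files:
            full = os.path.join(root, name)
            on_disk.add(os.path.relpath(full, stats_dir))
    added = on_disk - index    # created manually, not tracked yet
    removed = index - on_disk  # tracked, but deleted manually
    return on_disk, added, removed


with tempfile.TemporaryDirectory() as stats_dir:
    # Simulate manual tampering: one untracked file, one missing file.
    os.makedirs(os.path.join(stats_dir, "c3"))
    open(os.path.join(stats_dir, "c3", "new.json"), "w").close()
    index = {os.path.join("c3", "old.json")}
    synced, added, removed = sync_index(stats_dir, index)
    print("added:", sorted(added), "removed:", sorted(removed))
```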