BiweightStats.jl


Robust statistics based on the biweight transform.

Installation

BiweightStats.jl is a registered package and can be installed using the Julia package manager. From the Julia REPL, enter Pkg mode by pressing ]:

julia>]

pkg> add BiweightStats

To exit Pkg mode, press backspace. Once the package is installed, it can be imported with:

julia> using BiweightStats

For more information, see the Pkg documentation.

Usage

The following examples compute the biweight location and scale for a few distributions and compare them to the mean and standard deviation.

To run these examples, make sure the following packages are installed:

using Pkg; Pkg.add(["Distributions", "StatsPlots"])
using BiweightStats
using Distributions: Normal, Logistic, Cauchy, Laplace
using Statistics
using StatsPlots
using Random
rng = Random.seed!(1001)
Random.TaskLocalRNG()

For all our examples, we'll sample 10,000 points.

N = 10000
10000
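function sample_measure_and_plot_dists(rng, dist, N; kwargs...)
    # draw N samples from the given distribution
    samples = rand(rng, dist, N)

    # classical estimates
    mu = mean(samples)
    sig = std(samples; mean=mu)

    # robust (biweight) estimates; the location is reused as the center for the scale
    loc = location(samples; c=9)
    sca = scale(samples; c=9, M=loc)

    # histogram of the samples with the true density drawn on top
    p = histogram(samples; normalize=true, fill=0.3, c=1, lab="", leg=:topleft, kwargs...)
    plot!(dist; c=1, lab="", lw=3)
    # solid lines mark the centers, dashed lines mark center ± spread
    vline!([mu loc]; c=[2 3], lab=["mean ± std" "biweight location ± scale"], lw=2)
    vline!([mu loc] .+ [-sig -sca; sig sca]; c=[2 3], ls=:dash, lab="", lw=2)
    return p
end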
sample_measure_and_plot_dists (generic function with 1 method)

Gaussian

dist = Normal()
sample_measure_and_plot_dists(rng, dist, N; title="Gaussian")

Logistic

dist = Logistic()
sample_measure_and_plot_dists(rng, dist, N; title="Logistic")

Cauchy

dist = Cauchy()
sample_measure_and_plot_dists(rng, dist, N; title="Cauchy", xlim=(-40, 40))

Laplace

dist = Laplace()
sample_measure_and_plot_dists(rng, dist, N; title="Laplace")
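
For intuition about what the location and scale in the plots above measure, here is a minimal sketch of the textbook one-step biweight estimators, written with only the Statistics standard library. The function names `biweight_location_sketch` and `biweight_scale_sketch` are hypothetical and are not part of this package; the package's own implementation may differ in details (for example, in how the initial center is chosen). The sketch assumes the median as the starting center and the median absolute deviation (MAD) as the initial spread.

```julia
using Statistics

# One-step biweight location around the median (textbook form, illustrative only).
function biweight_location_sketch(x; c=9.0)
    M = median(x)
    mad = median(abs.(x .- M))
    mad == 0 && return M                       # no spread around the median
    u = (x .- M) ./ (c * mad)                  # scaled distance from the center
    w = @. ifelse(abs(u) < 1, (1 - u^2)^2, 0.0) # points beyond the cutoff get zero weight
    return M + sum(w .* (x .- M)) / sum(w)
end

# Biweight scale: square root of the biweight midvariance (textbook form).
function biweight_scale_sketch(x; c=9.0)
    M = median(x)
    mad = median(abs.(x .- M))
    mad == 0 && return 0.0
    u = (x .- M) ./ (c * mad)
    mask = abs.(u) .< 1                        # keep only points inside the cutoff
    d = x[mask] .- M
    um = u[mask]
    num = sum(@. d^2 * (1 - um^2)^4)
    den = sum(@. (1 - um^2) * (1 - 5 * um^2))
    return sqrt(length(x) * num) / abs(den)
end

x = [collect(-1.0:0.1:1.0); 1000.0]            # symmetric grid plus one gross outlier
println("mean = ", mean(x), ", std = ", std(x))
println("biweight location = ", biweight_location_sketch(x),
        ", biweight scale = ", biweight_scale_sketch(x))
```

The single outlier drags the mean and standard deviation far from the bulk of the data, while it receives zero weight in the biweight estimates because it lies beyond the cutoff `c` times the MAD.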

Benchmarks

This package has been benchmarked against astropy.stats. The benchmarking code can be found in bench/.

System Information

Benchmarks were run on a 2021 M1 MacBook Pro:

Julia Version 1.8.0-beta2
Commit b655b2c008 (2022-03-21 12:50 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.3.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
  Threads: 8 on 8 virtual cores

The benchmarks generated two sets of $N$ normally distributed samples, with no outliers or non-finite values added. One of these sets was used to test both this package's and astropy's implementations of the univariate statistics. For the midcovariance statistic, the two sets were combined into an $(N, 2)$ matrix to match the astropy function signature.
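
The data setup described above can be sketched roughly as follows. This is not the code from bench/; the helper name `make_benchmark_data`, the seed, and the choice of a standard normal distribution are assumptions made for illustration.

```julia
using Random

# Hypothetical helper (not taken from bench/): build inputs shaped like those
# described in the text.
function make_benchmark_data(rng, N)
    x = randn(rng, N)   # first set of N samples (standard normal assumed)
    y = randn(rng, N)   # second set of N samples
    X = hcat(x, y)      # (N, 2) matrix for the midcovariance benchmark
    return x, X
end

rng = MersenneTwister(1001)   # arbitrary seed for reproducibility of this sketch
x, X = make_benchmark_data(rng, 10_000)
println(size(X))
```

Here `x` would feed the univariate statistics and `X` the midcovariance benchmark.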

Related Packages

  • StatsBase.jl

    Contains a couple of robust statistics, but has no overlapping functionality with this package.

  • RobustStats.jl

    Contains many more robust statistics, primarily based on the WRS R package. Appears to be unmaintained and has not been updated for Julia v1. The bivar function is the same as this package's midvar, although bivar does not have definitions for the statistics across axes of an array.

  • astropy.stats

    Python implementations of all the statistics presented here. There are some slight differences in the function signatures, including the default cutoff value c (for some statistics).
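
Because the default cutoff can differ between libraries, one way to keep a comparison with astropy.stats like-for-like is to pass `c` explicitly on both sides. The sketch below uses only the `location` and `scale` keywords shown in the Usage section; the seed and sample size are arbitrary, and the values `c=6` and `c=9` are chosen to mirror the defaults of astropy's `biweight_location` and `biweight_scale`, respectively.

```julia
using BiweightStats
using Random

rng = MersenneTwister(1001)   # arbitrary seed for this sketch
x = randn(rng, 1_000)

# Set the cutoff explicitly instead of relying on either library's default.
loc_c6 = location(x; c=6)            # mirrors astropy's biweight_location default c
sca_c9 = scale(x; c=9, M=loc_c6)     # mirrors astropy's biweight_scale default c

println("location (c=6) = ", loc_c6)
println("scale (c=9)    = ", sca_c9)
```

On the Python side, the same `c` and `M` values would need to be passed to the corresponding astropy functions for the results to be comparable.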

Contributing and Support

If you would like to contribute, feel free to open a pull request. If you want to discuss something before contributing, head over to discussions and join or open a new topic. If you're having problems with something, please open an issue.