Feature-Based Time-Series Analysis in R using the Theft Ecosystem.
Time series are measured and analyzed across the sciences. One method of quantifying the structure of time series is by calculating a set of summary statistics or ‘features’, and then representing a time series in terms of its properties as a feature vector. The resulting feature space is interpretable and informative, and enables conventional statistical learning approaches, including clustering, regression, and classification, to be applied to time-series datasets. Many open-source software packages for computing sets of time-series features exist across multiple programming languages, including ‘catch22’ (22 features: MATLAB, R, Python, Julia), ‘feasts’ (43 features: R), ‘tsfeatures’ (62 features: R), ‘Kats’ (40 features: Python), ‘tsfresh’ (783 features: Python), and ‘TSFEL’ (156 features: Python). However, there are several issues: (i) a single access point to these packages is not currently available; (ii) to access all feature sets, users must be fluent in multiple languages; and (iii) these feature-extraction packages lack extensive accompanying methodological pipelines for performing feature-based time-series analysis, such as applications to time-series classification. Here we introduce a solution to these issues in the form of two complementary statistical software packages for R called ‘theft’: Tools for Handling Extraction of Features from Time series and ‘theftdlc’: theft ‘downloadable content’. ‘theft’ is a unified and extendable framework for computing features from the six open-source time-series feature sets listed above as well as custom user-specified features. ‘theftdlc’ is an extension package to ‘theft’ which includes a suite of functions for processing and interpreting the performance of extracted features, including extensive data-visualization templates, low-dimensional projections, and time-series classification.
With an increasing volume and complexity of large time-series datasets in the sciences and industry, ‘theft’ and ‘theftdlc’ provide a standardized framework for comprehensively quantifying and interpreting informative structure in time series. [ABSTRACT FROM AUTHOR]
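The feature-based representation described in the abstract can be illustrated with a minimal, language-agnostic sketch. The Python below does not use ‘theft’, ‘theftdlc’, or any of the six feature sets named above. The three features, the names `FEATURES`, `feature_vector`, and `lag1_autocorr`, and the toy series are all assumptions chosen purely for illustration.

```python
from statistics import fmean, pstdev


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence (0.0 for a constant series)."""
    m = fmean(x)
    denom = sum((v - m) ** 2 for v in x)
    if denom == 0:
        return 0.0
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    return num / denom


# Hypothetical toy feature set: three simple summary statistics.
# Real feature sets such as catch22 or tsfresh compute many more.
FEATURES = {
    "mean": fmean,
    "sd": pstdev,  # population standard deviation
    "acf_lag1": lag1_autocorr,
}


def feature_vector(x):
    """Represent one time series as a list of feature values."""
    return [f(x) for f in FEATURES.values()]


# Toy data (illustrative only): one trending and one alternating series.
series = {
    "trend": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "alternating": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}

# Each series becomes one row of a (series x features) table.
feature_matrix = {name: feature_vector(x) for name, x in series.items()}

for name, row in feature_matrix.items():
    print(name, dict(zip(FEATURES, row)))
```

Each row of `feature_matrix` is a fixed-length vector regardless of the original series length, which is what allows conventional clustering, regression, or classification methods to be applied to the rows.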
Copyright of R Journal is the property of R Foundation and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)