@vincehass

Hi all,

Thanks for open-sourcing this library. I am looking for Algorithm 1 (TSMixup: Time Series Mixup) from the paper. Could you point out where in the code repo I can find the TSMixup implementation?

Thank you!

@abdulfatir

@vincehass Please use discussions for questions, since this is not a bug.

The TSMixup script is not in this repository because we have already released both the raw and the TSMixup-augmented datasets on Hugging Face.

If you are still looking for a reference implementation, this should be close enough:

from copy import deepcopy
from itertools import islice
from pathlib import Path

import numpy as np
import pandas as pd
from gluonts.dataset.arrow import ArrowWriter
from gluonts.itertools import Cyclic
from tqdm.auto import tqdm


def ts_sampler(datasets):
    # Cycle through each dataset forever and sample uniformly at random across datasets
    infinite_iterators = list(map(lambda d: iter(Cyclic(d)), datasets))
    while True:
        idx = np.random.randint(len(datasets))
        yield next(infinite_iterators[idx])


def mean_scaler(ts_entry):
    # Scale the series by its mean; fall back to 1.0 if the mean is non-positive or NaN
    ts_entry = deepcopy(ts_entry)
    mean = np.nanmean(ts_entry["target"])
    mean = mean if mean > 0.0 and mean == mean else 1.0  # mean == mean is False for NaN
    ts_entry["target"] = ts_entry["target"] / mean
    return ts_entry


def slice_ts(ts_entry, length):
    # Take a random contiguous window of the requested length
    ts_length = len(ts_entry["target"])
    start_idx = np.random.randint(ts_length - length + 1)
    return ts_entry["target"][start_idx : start_idx + length]


def validate_entry(entry):
    # Reject augmented series that are mostly NaN or mostly zeros
    if np.isnan(entry["target"]).mean() >= 0.9:
        return False
    if (entry["target"] == 0.0).mean() >= 0.9:
        return False
    return True


def make_augmentation(
    ts_generator,
    K=3,
    alpha=1.5,
    min_length=128,
    max_length=1024,
):
    while True:
        # Sample how many series to mix and the length of the augmented series
        k = np.random.randint(1, K + 1)
        length = np.random.randint(min_length, max_length + 1)
        # Draw k series that are long enough, scale each by its mean, and take random windows
        tss = list(islice(filter(lambda x: len(x["target"]) >= length, ts_generator), k))
        tss = np.array([slice_ts(mean_scaler(ts), length) for ts in tss])
        # Combine the windows as a convex combination with Dirichlet-sampled weights
        weights = np.random.dirichlet([alpha] * k)
        ts = np.einsum("k, kl -> l", weights, tss)

        # The start timestamp is a dummy value; only the target values matter here
        entry = {"start": np.datetime64("2000-01-01 00:00", "s"), "target": ts}
        if validate_entry(entry):
            return entry


def make_augmentations(
    datasets,
    num_series=100,
    K=3,
    alpha=1.5,
    min_length=128,
    max_length=2048,
):
    ts_generator = ts_sampler(datasets)
    for _ in tqdm(range(num_series)):
        yield make_augmentation(ts_generator, K=K, alpha=alpha, min_length=min_length, max_length=max_length)


def generate_dataset(num_series):
    """Generates a random GluonTS-style dataset"""
    dataset = []
    for _ in range(num_series):
        series_length = np.random.randint(10_000)
        dataset.append({"start": pd.Period("2024", freq="h"), "target": np.random.randn(series_length)})
    return dataset


# Usage Example

# Let's generate 10 random GluonTS-style datasets (each with at least one series)
datasets = [generate_dataset(num_series=np.random.randint(1, 10_000)) for _ in range(10)]
# Generate TSMixup augmentations (lazily, as a generator)
ts_mixed_up_dataset = make_augmentations(datasets, num_series=100_000)
# Save in Arrow format
ArrowWriter(compression="lz4").write_to_file(ts_mixed_up_dataset, Path("tsmixup-example.arrow"))
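
Not part of the snippet above, but as a quick sanity check the resulting Arrow file should be readable back as a GluonTS dataset. A minimal sketch, assuming a recent GluonTS version whose FileDataset handles Arrow files (the freq value is just a placeholder):

from pathlib import Path

from gluonts.dataset.common import FileDataset

# freq is a dummy value here; only "start" and "target" are used downstream
ts_mixed_up = FileDataset(path=Path("tsmixup-example.arrow"), freq="h")
first_entry = next(iter(ts_mixed_up))
print(len(first_entry["target"]))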
@kanghui-learning

Hi, I noticed there are 200 separate files of TSMixup data. Is each file generated from the entire dataset, or only from a subset of it?

@gvsmothish

Hello @abdulfatir,

Could you please specify the location of the raw datasets on Hugging Face? We were unable to find them; only the mixed (TSMixup) dataset is visible.

Thanks in advance.

@abdulfatir

https://huggingface.co/datasets/autogluon/chronos_datasets
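
For what it's worth, individual configs from that collection can usually be loaded directly with the Hugging Face datasets library. A minimal sketch (the "m4_hourly" config name is only an illustrative guess; see the dataset card for the full list, including the TSMixup training corpus):

import datasets

# Load one config from the collection; the config name below is illustrative
ds = datasets.load_dataset("autogluon/chronos_datasets", "m4_hourly", split="train")
ds.set_format("numpy")  # sequence columns are returned as numpy arrays
print(ds[0]["target"][:10])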

Awesome, thank you very much!
