Viewed 54k times
5

I have an 8000-element 1D array.

I want to obtain the following two arrays:

  1. test contains the elements at indices [1995:1999], [3995:3999], [5995:5999], [7995:7999] (both ends inclusive).

  2. train should contain everything else.

How should I do that?


idx = [1995,1996,1997,1998, 1999, 3995, 3996, 3997,3998, 3999, 5995, 5996, 5997, 5998, 5999, 7995, 7996, 7997, 7998, 7999]
test = [X[i] for i in idx]

train = [X[i] for i not in idx]
4
  • Look into vstack from numpy. What have you tried so far?
    –  Arya McCarthy
    Commented May 22, 2017 at 3:03
  • Is there a "not in" command in Python?
    –  wrek
    Commented May 22, 2017 at 3:08
  • What do you mean?
    –  Arya McCarthy
    Commented May 22, 2017 at 3:09
  • Look at my update.
    –  wrek
    Commented May 22, 2017 at 3:09

5 Answers

5

Based on your example, a simple workaround would be this:

train = [x for i, x in enumerate(X) if i not in idx]

2

It looks like you are looking for numpy.where. Here is a simple example to get you started:

In [18]: import numpy as np

In [19]: a = np.array([[0,3],[1,2],[2,3],[3,2],[4,5],[5,1]])

In [20]: a[np.where((a[:, 0] > 1) & (a[:, 0] < 5))[0]]
Out[20]: 
array([[2, 3],
       [3, 2],
       [4, 5]])

In [21]: a[np.where(~((a[:, 0] > 1) & (a[:, 0] < 5)))[0]]
Out[21]: 
array([[0, 3],
       [1, 2],
       [5, 1]])

The first element in each row can be your index, and the second your value. The condition itself evaluates to a boolean array. numpy.where, called with only a condition, returns a tuple of integer index arrays marking where that condition is true. Taking element [0] of that tuple gives the matching row indices, which we then use to index the original array.
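To make the difference between the boolean condition and the output of numpy.where concrete, here is a small sketch reusing the array from the example above. The names cond, rows, selected_by_index and selected_by_mask are only illustrative.

```python
import numpy as np

a = np.array([[0, 3], [1, 2], [2, 3], [3, 2], [4, 5], [5, 1]])

# The comparison produces a boolean array with one entry per row.
cond = (a[:, 0] > 1) & (a[:, 0] < 5)

# np.where(cond) returns a tuple of index arrays; [0] holds the row indices.
rows = np.where(cond)[0]

# Indexing with the integer indices or with the boolean array selects the same rows.
selected_by_index = a[rows]
selected_by_mask = a[cond]
```

For plain selection like this, indexing with the boolean array directly is usually enough; np.where is mainly useful when you need the integer positions themselves.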


2

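If you want, you can use a boolean mask (this assumes X is a NumPy array and idx is the index list from the question):

import numpy as np

mask = np.ones(len(X), dtype=bool)  # start with every element selected
mask[idx] = False                   # deselect the test indices
train = X[mask]
test = X[idx]

# you can also use this for test
test = X[np.logical_not(mask)]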
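As a rough end-to-end sketch of the mask approach on the question's data, the snippet below uses np.arange(8000) as a stand-in for X (an assumption, since the real data is not shown) and builds idx from the four block starts. It also shows np.delete as an alternative way to get train.

```python
import numpy as np

X = np.arange(8000)  # stand-in for the real 8000-element array

# Five consecutive indices starting at each block start: 1995..1999, 3995..3999, ...
idx = np.concatenate([np.arange(s, s + 5) for s in (1995, 3995, 5995, 7995)])

mask = np.ones(len(X), dtype=bool)
mask[idx] = False      # False marks the test positions

train = X[mask]        # everything not in idx
test = X[~mask]        # same elements as X[idx], in index order

# Alternative: np.delete returns a copy of X without the given indices.
train_alt = np.delete(X, idx)
```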

1 Comment

It should be mask. You negate the mask which is bool. idx is integer type.
2

When building train, you need to iterate through all of your source data.

Using enumerate should make things easy:

>>> data = list(range(8000))
>>> train, test = [], []
>>> for i, value in enumerate(data):
...     if 1995 <= i <= 1999 or 3995 <= i <= 3999 or 5995 <= i <= 5999 or 7995 <= i <= 7999:
...         test.append(value)
...     else:
...         train.append(value)
...
>>> test
[1995, 1996, 1997, 1998, 1999, 3995, 3996, 3997, 3998, 3999, 5995, 5996, 5997, 5998, 5999, 7995, 7996, 7997, 7998, 7999]
>>> len(train)
7980
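If the chained comparisons get unwieldy, one possible variation is to describe the test blocks as range objects and test membership against them. This is only a sketch; test_ranges and in_test are made-up names, and the data is the same stand-in list as above.

```python
data = list(range(8000))

# Each range covers one block of test indices (the end value is exclusive).
test_ranges = [range(1995, 2000), range(3995, 4000),
               range(5995, 6000), range(7995, 8000)]

def in_test(i):
    """Return True if index i falls inside any of the test ranges."""
    return any(i in r for r in test_ranges)

train, test = [], []
for i, value in enumerate(data):
    if in_test(i):
        test.append(value)
    else:
        train.append(value)
```

Membership tests on a range of integers do not scan the range, so this stays cheap even with wide blocks.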


1

This is one possibility, assuming array is the name of the list containing 8000 elements:

idx = {1995, 1996, 1997, 1998, 1999, 3995, 3996, 3997, 3998, 3999, 5995, 5996, 5997, 5998, 5999, 7995, 7996, 7997, 7998, 7999}

test = [array[x] for x in sorted(idx)]  # sets are unordered, so sort to keep index order

train = [x for i, x in enumerate(array) if i not in idx]
