ENH Allow for appropriate dtype use in preprocessing.PolynomialFeatures for sparse matrices #23731

Merged on May 4, 2023 (125 commits)
7eef7ad
[WIP] FIX index overflow error in sparse matrix polynomial expansion …
niuk-a Jul 8, 2021
4adbf38
Merge branch 'main' into csr_polynomial
Micky774 Jun 16, 2022
baa98a2
Reconciled with main
Micky774 Jun 16, 2022
2b9187d
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
Micky774 Jun 20, 2022
55424a0
Merge branch 'main' into csr_polynomial
Micky774 Jun 22, 2022
69438dc
Removed extra `total_nnz` assignment
Micky774 Jun 22, 2022
9ecbf8a
Added fused type
Micky774 Jun 22, 2022
345e043
Added clarifying comment
Micky774 Jun 22, 2022
1d23b1d
Merge branch 'main' into csr_polynomial
Micky774 Jun 23, 2022
cc6a548
Added changelog entry
Micky774 Jun 23, 2022
8b189bb
Merge branch 'main' into csr_polynomial
Micky774 Jun 23, 2022
15b00fd
Fixed PR tag in changelog entry
Micky774 Jun 23, 2022
ee8a3ba
Apply suggestions from code review
Micky774 Jun 29, 2022
cd346f1
Merge branch 'main' into csr_polynomial
Micky774 Jun 29, 2022
0a17dee
Streamlined logic and improved tests
Micky774 Jun 29, 2022
b118a3c
Added test depending on scipy version
Micky774 Jun 30, 2022
fa1ecf2
Clarified breaking and renamed types
Micky774 Jun 30, 2022
d735c8f
Merge branch 'main' into csr_polynomial
Micky774 Jun 30, 2022
f3bb5cd
Merge branch 'main' into csr_polynomial
Micky774 Jun 30, 2022
0c9a563
Merge branch 'main' into csr_polynomial
Micky774 Jun 30, 2022
a006bf0
Merge branch 'main' into csr_polynomial
Micky774 Jul 4, 2022
96259d7
Apply suggestions from code review
Micky774 Jul 4, 2022
2e44f39
Merge branch 'main' into csr_polynomial
Micky774 Jul 7, 2022
377f6a9
Apply suggestions from code review
Micky774 Jul 7, 2022
8a77b66
Improved tests
Micky774 Jul 8, 2022
d2e6339
Merge branch 'csr_polynomial' of https://github.com/Micky774/scikit-l…
Micky774 Jul 8, 2022
c70c216
Merge branch 'main' into csr_polynomial
Micky774 Jul 8, 2022
a9d39a7
Initial addition -- fails with segfault
Micky774 Aug 11, 2022
e1262f9
Improved documentation
Micky774 Aug 11, 2022
5ee0d96
Added license information
Micky774 Aug 11, 2022
ceca8ed
Updated fused-type name to hopefully clarify purpose
Micky774 Aug 11, 2022
9bd99ca
Used vectors and updated implementation
Micky774 Aug 17, 2022
8943412
Removed accidentally-added file
Micky774 Aug 17, 2022
5be9a13
Simplified and cleaned up implementation
Micky774 Aug 18, 2022
27974ba
Slightly better formatting and variable name
Micky774 Aug 18, 2022
cec3005
Fixed dtype bug and added testing
Micky774 Aug 18, 2022
5116a1d
Merge branch 'main' into csr_polynomial
Micky774 Aug 18, 2022
764d8bd
Updated test to verify nnz count and indices
Micky774 Aug 18, 2022
102e2fa
Improved dtype resolution and clarified with comments
Micky774 Aug 18, 2022
8430c3f
Fixed inexact index error
Micky774 Aug 18, 2022
db78c7e
Updated formatting
Micky774 Aug 18, 2022
057a4f5
Cleaner diff and blame history
Micky774 Aug 18, 2022
23e9acf
Merge branch 'main' into csr_polynomial
Micky774 Aug 28, 2022
46745a8
Fixed overflow bug in expanded index calculation
Micky774 Aug 28, 2022
9ff8413
Fix intermediate calculation overflow and refactor tests
Micky774 Aug 30, 2022
5a221f2
Merge branch 'main' into csr_polynomial
Micky774 Aug 30, 2022
baea39e
Fixed duplicated changelog entries
Micky774 Aug 30, 2022
ff3d050
Merge branch 'main' into csr_polynomial
Micky774 Oct 6, 2022
54c7d2e
Apply suggestions from code review
Micky774 Oct 6, 2022
1c8a98b
Update comment for scipy min version (new backport)
Micky774 Oct 6, 2022
016ae5b
Removed vendored csr_hstack and instead error where appropriate
Micky774 Oct 19, 2022
0e14c8d
Merge branch 'main' into csr_polynomial
Micky774 Oct 19, 2022
a5c17dc
Updated error message
Micky774 Oct 19, 2022
34e7d2a
CLN Add authorship and delete cosmetic changes
Micky774 Oct 19, 2022
e82a9f9
Update sklearn/preprocessing/tests/test_polynomial.py
Micky774 Oct 23, 2022
0ef1b95
Revert
Micky774 Oct 23, 2022
0d7be70
Merge branch 'main' into csr_polynomial
Micky774 Oct 23, 2022
56e8c34
Moved calculation of number of non-zero elements to Cython
Micky774 Oct 25, 2022
ac07342
Merge branch 'main' into csr_polynomial
Micky774 Oct 25, 2022
c973d64
Addressed misc review feedback
Micky774 Oct 25, 2022
86da1a0
Merge branch 'main' into csr_polynomial
Micky774 Nov 5, 2022
4ac5deb
Added format specification
Micky774 Nov 7, 2022
ce2308b
Added explicit equation used to generate constants
Micky774 Nov 8, 2022
0543ccd
Merge branch 'main' into csr_polynomial
Micky774 Dec 14, 2022
a97c526
Improved documentation and introduced error
Micky774 Dec 15, 2022
391d049
Improved wording
Micky774 Dec 15, 2022
68615c8
Apply suggestions from code review
Micky774 Dec 15, 2022
66738fa
Merge branch 'main' into csr_polynomial
Micky774 Dec 15, 2022
e03f689
Overhauled tests
Micky774 Dec 15, 2022
c35cf2b
Improved cython routines adressed feedback
Micky774 Dec 15, 2022
447296b
Improved code organization
Micky774 Feb 4, 2023
9c93e46
Merge branch 'main' into csr_polynomial
Micky774 Feb 4, 2023
8809a94
Opted for explicit `cnp.*` typing for `DATA_t`
Micky774 Feb 4, 2023
b0c7bf5
Reverted extraneous change
Micky774 Feb 4, 2023
b13a13c
Adjusted tests for un-indexable values on 32bit systems
Micky774 Feb 6, 2023
4215b0b
Apply suggestions from code review
Micky774 Feb 13, 2023
e767c14
Merge branch 'main' into csr_polynomial
Micky774 Feb 13, 2023
1793791
Addressed Cython bug
Micky774 Feb 13, 2023
9144db7
Added documentation for secondary checks
Micky774 Feb 13, 2023
e14f3e4
Update sklearn/preprocessing/_csr_polynomial_expansion.pyx
Micky774 Feb 13, 2023
ec3e9ed
Formatting
Micky774 Feb 13, 2023
29a1cb8
Merge branch 'csr_polynomial' of https://github.com/Micky774/scikit-l…
Micky774 Feb 13, 2023
ddcc960
Merge branch 'main' into csr_polynomial
Micky774 Mar 10, 2023
56301a8
Update sklearn/preprocessing/_csr_polynomial_expansion.pyx
Micky774 Mar 10, 2023
209e511
Merge branch 'main' into csr_polynomial
Micky774 Mar 12, 2023
25b2f60
Factored equations to mitigate overflow risks
Micky774 Mar 12, 2023
b77f071
Merge branch 'main' into csr_polynomial
Micky774 Mar 14, 2023
9757640
Added `__int128` support when available
Micky774 Mar 14, 2023
0f96d48
Added back python computation for overflow protection
Micky774 Mar 14, 2023
6d1a9f1
Corrected for linux
Micky774 Mar 14, 2023
fa91e4e
Added support for CLANG and improved documentation
Micky774 Mar 14, 2023
3e9238a
Merge branch 'main' into csr_polynomial
Micky774 Mar 14, 2023
62d9979
Removed unreachable code
Micky774 Mar 15, 2023
39dab07
Slight change in equation form
Micky774 Mar 18, 2023
d8bdec3
Merge branch 'main' into csr_polynomial
Micky774 Mar 21, 2023
4eb98ca
Updated to include test to confirm expected integer width
Micky774 Mar 21, 2023
4195a4d
Updated fused types
Micky774 Mar 21, 2023
40cd06d
Updated typing
Micky774 Mar 21, 2023
f1dc6dc
Included feedback, and caught/handled old scipy bug
Micky774 Mar 21, 2023
51ea18c
Merge branch 'main' into csr_polynomial
Micky774 Mar 22, 2023
4189339
Apply suggestions from code review
Micky774 Mar 28, 2023
6d9f698
Fixed typo
Micky774 Mar 28, 2023
c06e26f
Merge branch 'main' into csr_polynomial
ogrisel Apr 2, 2023
e1a0725
Merge branch 'main' into csr_polynomial
ogrisel Apr 2, 2023
ca5c4d5
Update sklearn/preprocessing/_polynomial.py
Micky774 Apr 6, 2023
8e218d1
Merge branch 'main' into csr_polynomial
Micky774 Apr 6, 2023
2d2124d
Improved tests
Micky774 Apr 6, 2023
b00d149
Improved tests
Micky774 Apr 6, 2023
c16e33c
Clean paranthesis
Micky774 Apr 6, 2023
af934a8
Updated test to account for 32 bit systems
Micky774 Apr 6, 2023
c4eeb7c
Updated ValueError match string
Micky774 Apr 7, 2023
6576bdd
Fixed overflow bug in tests for Windows
Micky774 Apr 7, 2023
99eabab
Adopted review feedback
Micky774 Apr 11, 2023
9a90a59
Improved constant documentation
Micky774 Apr 11, 2023
9d9d21b
Improved variable names
Micky774 Apr 11, 2023
7ac1a41
Update sklearn/preprocessing/_polynomial.py
Micky774 Apr 12, 2023
1ebfbbe
Merge branch 'main' into csr_polynomial
Micky774 Apr 21, 2023
7756f62
Incorporated typedef changes
Micky774 Apr 21, 2023
6112cf8
Merge branch 'main' into csr_polynomial
Micky774 Apr 25, 2023
cf3c00c
Added check for 32bit-ness for clang
Micky774 Apr 26, 2023
810fb3b
Improved documentation
Micky774 Apr 27, 2023
ff36da1
Merge branch 'main' into csr_polynomial
Micky774 Apr 27, 2023
3033e4b
Merge branch 'main' into csr_polynomial
Micky774 May 1, 2023
ce6b72a
Updated for emscripten edge-case
Micky774 May 1, 2023
bcdee5d
Removed extraneous assertion
Micky774 May 1, 2023
doc/whats_new/v1.3.rst (9 changes: 9 additions & 0 deletions)
@@ -499,6 +499,15 @@ Changelog
during `transform` with no prior call to `fit` or `fit_transform`.
:pr:`25190` by :user:`Vincent Maladière <Vincent-Maladiere>`.

- |Enhancement| :class:`preprocessing.PolynomialFeatures` now calculates the
number of expanded terms a-priori when dealing with sparse `csr` matrices
in order to optimize the choice of `dtype` for `indices` and `indptr`. It
can now output `csr` matrices with `np.int32` `indices/indptr` components
when there are few enough elements, and will automatically use `np.int64`
for sufficiently large matrices.
:pr:`20524` by :user:`niuk-a <niuk-a>` and
:pr:`23731` by :user:`Meekail Zain <micky774>`

- |API| A `FutureWarning` is now raised when instantiating a class which inherits from
a deprecated base class (i.e. decorated by :class:`utils.deprecated`) and which
overrides the `__init__` method.
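
The `PolynomialFeatures` changelog entry in this hunk describes picking `np.int32` or `np.int64` for `indices` and `indptr` once the expanded size is known. The following is a minimal plain-Python sketch of such a decision. The helper name `choose_index_dtype` and the exact rule (compare the largest column index and the total number of non-zeros against the `int32` maximum) are assumptions made for illustration, not the code merged in this PR.

```python
INT32_MAX = 2**31 - 1  # largest value representable by np.int32


def choose_index_dtype(n_output_columns, total_nnz):
    """Hypothetical helper: pick the narrowest index dtype that fits.

    `indices` must hold column ids up to n_output_columns - 1 and
    `indptr` must hold offsets up to total_nnz.
    """
    largest_value = max(n_output_columns - 1, total_nnz)
    return "int64" if largest_value > INT32_MAX else "int32"


# A degree-2 expansion of 1_000 features yields 1000 * 1001 // 2 columns.
print(choose_index_dtype(1000 * 1001 // 2, total_nnz=10_000))  # int32
# 100_000 features yield about 5e9 columns, which no longer fit in int32.
print(choose_index_dtype(100_000 * 100_001 // 2, total_nnz=10_000))  # int64
```

Deciding before allocation is what lets small outputs keep compact 32-bit index arrays while large ones switch to 64-bit indices instead of overflowing.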
setup.py (2 changes: 1 addition & 1 deletion)
@@ -293,7 +293,7 @@ def check_package_status(package, min_version):
},
],
"preprocessing": [
{"sources": ["_csr_polynomial_expansion.pyx"], "include_np": True},
{"sources": ["_csr_polynomial_expansion.pyx"]},
{
"sources": ["_target_encoder_fast.pyx"],
"include_np": True,
sklearn/preprocessing/_csr_polynomial_expansion.pyx (283 changes: 186 additions & 97 deletions)
@@ -1,73 +1,178 @@
# Author: Andrew nystrom <awnystrom@gmail.com>
# Authors: Andrew nystrom <awnystrom@gmail.com>
# Meekail Zain <zainmeekail@gmail.com>
from ..utils._typedefs cimport uint8_t, int64_t, intp_t

from scipy.sparse import csr_matrix
cimport numpy as cnp
import numpy as np
ctypedef uint8_t FLAG_t

# We use the following verbatim block to determine whether the current
# platform's compiler supports 128-bit integer values intrinsically.
# This should work for GCC and CLANG on 64-bit architectures, but doesn't for
# MSVC on any architecture. We prefer to use 128-bit integers when possible
# because the intermediate calculations have a non-trivial risk of overflow. It
# is, however, very unlikely to come up on an average use case, hence 64-bit
# integers (i.e. `long long`) are "good enough" for most common cases. There is
# not much we can do to efficiently mitigate the overflow risk on the Windows
# platform at this time. Consider this a "best effort" design decision that
# could be revisited later in case someone comes up with a safer option that
# does not hurt the performance of the common cases.
# See `test_sizeof_LARGEST_INT_t()` for more information on exact type expectations.
cdef extern from *:
"""
#ifdef __SIZEOF_INT128__
typedef __int128 LARGEST_INT_t;
#elif (__clang__ || __EMSCRIPTEN__) && !__i386__
typedef _BitInt(128) LARGEST_INT_t;
#else
typedef long long LARGEST_INT_t;
#endif
"""
ctypedef long long LARGEST_INT_t


# Determine the size of `LARGEST_INT_t` at runtime.
# Used in `test_sizeof_LARGEST_INT_t`.
def _get_sizeof_LARGEST_INT_t():
return sizeof(LARGEST_INT_t)

cnp.import_array()

# TODO: use `cnp.{int,float}{32,64}` when cython#5230 is resolved:
# TODO: use `{int,float}{32,64}_t` when cython#5230 is resolved:
# https://github.com/cython/cython/issues/5230
ctypedef fused DATA_T:
ctypedef fused DATA_t:
float
double
int
long
long long
# INDEX_{A,B}_t are defined to generate a proper Cartesian product
# of types through Cython fused-type expansion.
ctypedef fused INDEX_A_t:
signed int
signed long long
ctypedef fused INDEX_B_t:
signed int
signed long long


cdef inline cnp.int32_t _deg2_column(
cnp.int32_t d,
cnp.int32_t i,
cnp.int32_t j,
cnp.int32_t interaction_only,
) noexcept nogil:
cdef inline int64_t _deg2_column(
LARGEST_INT_t n_features,
LARGEST_INT_t i,
LARGEST_INT_t j,
FLAG_t interaction_only
) nogil:
"""Compute the index of the column for a degree 2 expansion

d is the dimensionality of the input data, i and j are the indices
n_features is the dimensionality of the input data, i and j are the indices
for the columns involved in the expansion.
"""
if interaction_only:
return d * i - (i**2 + 3 * i) / 2 - 1 + j
return n_features * i - i * (i + 3) / 2 - 1 + j
else:
return d * i - (i**2 + i) / 2 + j
return n_features * i - i* (i + 1) / 2 + j
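
The degree-2 column formula can be sanity-checked outside Cython. The sketch below mirrors the refactored expression in plain Python (the names `deg2_column` and `check_deg2` are illustrative, not part of the PR) and checks that feature pairs, visited in lexicographic order, land on consecutive output columns starting at 0. It assumes `i <= j`, or `i < j` for interaction-only expansions.

```python
from itertools import combinations, combinations_with_replacement


def deg2_column(n_features, i, j, interaction_only):
    """Pure-Python mirror of the degree-2 column formula (assumes i <= j)."""
    if interaction_only:
        return n_features * i - i * (i + 3) // 2 - 1 + j
    return n_features * i - i * (i + 1) // 2 + j


def check_deg2(n_features, interaction_only):
    """Return True if the pairs map onto 0, 1, 2, ... in lexicographic order."""
    pairs = (
        combinations(range(n_features), 2)
        if interaction_only
        else combinations_with_replacement(range(n_features), 2)
    )
    columns = [deg2_column(n_features, i, j, interaction_only) for i, j in pairs]
    return columns == list(range(len(columns)))


print(check_deg2(5, interaction_only=False))  # True
print(check_deg2(5, interaction_only=True))  # True
```

Floor division is exact here because `i * (i + 1)` and `i * (i + 3)` are always even.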


cdef inline cnp.int32_t _deg3_column(
cnp.int32_t d,
cnp.int32_t i,
cnp.int32_t j,
cnp.int32_t k,
cnp.int32_t interaction_only
) noexcept nogil:
cdef inline int64_t _deg3_column(
LARGEST_INT_t n_features,
LARGEST_INT_t i,
LARGEST_INT_t j,
LARGEST_INT_t k,
FLAG_t interaction_only
) nogil:
"""Compute the index of the column for a degree 3 expansion

d is the dimensionality of the input data, i, j and k are the indices
n_features is the dimensionality of the input data, i, j and k are the indices
for the columns involved in the expansion.
"""
if interaction_only:
return ((3 * d**2 * i - 3 * d * i**2 + i**3
+ 11 * i - 3 * j**2 - 9 * j) / 6
+ i**2 - 2 * d * i + d * j - d + k)
return (
(
(3 * n_features) * (n_features * i - i**2)
+ i * (i**2 + 11) - (3 * j) * (j + 3)
) / 6 + i**2 + n_features * (j - 1 - 2 * i) + k
)
else:
return (
(
(3 * n_features) * (n_features * i - i**2)
+ i ** 3 - i - (3 * j) * (j + 1)
) / 6 + n_features * j + k
)


def py_calc_expanded_nnz_deg2(n, interaction_only):
return n * (n + 1) // 2 - interaction_only * n


def py_calc_expanded_nnz_deg3(n, interaction_only):
return n * (n**2 + 3 * n + 2) // 6 - interaction_only * n**2


cpdef int64_t _calc_expanded_nnz(
LARGEST_INT_t n,
FLAG_t interaction_only,
LARGEST_INT_t degree
):
"""
Calculates the number of non-zero interaction terms generated by the
non-zero elements of a single row.
"""
# This is the maximum value before the intermediate computation
# d**2 + d overflows
# Solution to d**2 + d = maxint64
# SymPy: solve(x**2 + x - int64_max, x)
cdef int64_t MAX_SAFE_INDEX_CALC_DEG2 = 3037000499

# This is the maximum value before the intermediate computation
# d**3 + 3 * d**2 + 2*d overflows
# Solution to d**3 + 3 * d**2 + 2*d = maxint64
# SymPy: solve(x * (x**2 + 3 * x + 2) - int64_max, x)
cdef int64_t MAX_SAFE_INDEX_CALC_DEG3 = 2097151

if degree == 2:
# Only need to check when not using 128-bit integers
if sizeof(LARGEST_INT_t) < 16 and n <= MAX_SAFE_INDEX_CALC_DEG2:
return n * (n + 1) / 2 - interaction_only * n
return <int64_t> py_calc_expanded_nnz_deg2(n, interaction_only)
else:
return ((3 * d**2 * i - 3 * d * i**2 + i ** 3 - i
- 3 * j**2 - 3 * j) / 6
+ d * j + k)


def _csr_polynomial_expansion(
const DATA_T[:] data,
const cnp.int32_t[:] indices,
const cnp.int32_t[:] indptr,
cnp.int32_t d,
cnp.int32_t interaction_only,
cnp.int32_t degree
# Only need to check when not using 128-bit integers
if sizeof(LARGEST_INT_t) < 16 and n <= MAX_SAFE_INDEX_CALC_DEG3:
return n * (n**2 + 3 * n + 2) / 6 - interaction_only * n**2
return <int64_t> py_calc_expanded_nnz_deg3(n, interaction_only)
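
The two `MAX_SAFE_INDEX_CALC_*` constants can be reproduced with Python's arbitrary-precision integers. This is a small sketch, with hypothetical function names, that searches for the largest `n` whose intermediate product still fits in a signed 64-bit integer.

```python
from math import isqrt

INT64_MAX = 2**63 - 1


def max_safe_deg2():
    """Largest n with n**2 + n <= INT64_MAX."""
    n = isqrt(INT64_MAX)
    while n * n + n > INT64_MAX:
        n -= 1
    return n


def max_safe_deg3():
    """Largest n with n**3 + 3*n**2 + 2*n <= INT64_MAX, found by bisection."""
    lo, hi = 0, 2**22  # the cubic already exceeds INT64_MAX at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * (mid**2 + 3 * mid + 2) <= INT64_MAX:
            lo = mid
        else:
            hi = mid
    return lo


print(max_safe_deg2())  # 3037000499
print(max_safe_deg3())  # 2097151
```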

cpdef int64_t _calc_total_nnz(
INDEX_A_t[:] indptr,
FLAG_t interaction_only,
int64_t degree,
):
"""
Perform a second-degree polynomial or interaction expansion on a scipy
Calculates the number of non-zero interaction terms generated by the
non-zero elements across all rows for a single degree.
"""
cdef int64_t total_nnz=0
cdef intp_t row_idx
for row_idx in range(len(indptr) - 1):
total_nnz += _calc_expanded_nnz(
indptr[row_idx + 1] - indptr[row_idx],
interaction_only,
degree
)
return total_nnz
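
For reference, the counting logic can be written in a few lines of plain Python. This sketch uses illustrative names and Python integers, so it has none of the overflow concerns handled in the Cython version. It applies the closed-form per-row counts from `py_calc_expanded_nnz_deg2` and `py_calc_expanded_nnz_deg3` and sums them over the rows described by `indptr`.

```python
def expanded_nnz(n, interaction_only, degree):
    """Number of degree-`degree` terms produced by a row with n non-zeros."""
    if degree == 2:
        return n * (n + 1) // 2 - interaction_only * n
    return n * (n**2 + 3 * n + 2) // 6 - interaction_only * n**2


def total_nnz(indptr, interaction_only, degree):
    """Sum the per-row counts; row r holds indptr[r + 1] - indptr[r] non-zeros."""
    return sum(
        expanded_nnz(indptr[r + 1] - indptr[r], interaction_only, degree)
        for r in range(len(indptr) - 1)
    )


# Three rows with 2, 0 and 3 stored values.
indptr = [0, 2, 2, 5]
print(total_nnz(indptr, interaction_only=0, degree=2))  # 3 + 0 + 6 = 9
print(total_nnz(indptr, interaction_only=1, degree=3))  # 0 + 0 + 1 = 1
```

The closed forms are the binomial counts of combinations with replacement (full polynomial) or without replacement (interaction only) of the row's non-zero features.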


cpdef void _csr_polynomial_expansion(
const DATA_t[:] data, # IN READ-ONLY
const INDEX_A_t[:] indices, # IN READ-ONLY
const INDEX_A_t[:] indptr, # IN READ-ONLY
INDEX_A_t n_features,
DATA_t[:] result_data, # OUT
INDEX_B_t[:] result_indices, # OUT
INDEX_B_t[:] result_indptr, # OUT
FLAG_t interaction_only,
FLAG_t degree
) nogil:
"""
Perform a second or third degree polynomial or interaction expansion on a
compressed sparse row (CSR) matrix. The method used only takes products of
non-zero features. For a matrix with density d, this results in a speedup
on the order of d^k where k is the degree of the expansion, assuming all
rows are of similar density.
non-zero features. For a matrix with density :math:`d`, this results in a
speedup on the order of :math:`(1/d)^k` where :math:`k` is the degree of
the expansion, assuming all rows are of similar density.

Parameters
----------
@@ -80,9 +185,21 @@ def _csr_polynomial_expansion(
indptr : memory view on nd-array
The "indptr" attribute of the input CSR matrix.

d : int
n_features : int
The dimensionality of the input CSR matrix.

result_data : nd-array
The output CSR matrix's "data" attribute.
It is modified by this routine.

result_indices : nd-array
The output CSR matrix's "indices" attribute.
It is modified by this routine.

result_indptr : nd-array
The output CSR matrix's "indptr" attribute.
It is modified by this routine.

interaction_only : int
0 for a polynomial expansion, 1 for an interaction expansion.

@@ -95,47 +212,11 @@ def _csr_polynomial_expansion(
Matrices Using K-Simplex Numbers" by Andrew Nystrom and John Hughes.
"""

assert degree in (2, 3)

if degree == 2:
expanded_dimensionality = int((d**2 + d) / 2 - interaction_only*d)
else:
expanded_dimensionality = int((d**3 + 3*d**2 + 2*d) / 6
- interaction_only*d**2)
if expanded_dimensionality == 0:
return None
assert expanded_dimensionality > 0

cdef cnp.int32_t total_nnz = 0, row_i, nnz

# Count how many nonzero elements the expanded matrix will contain.
for row_i in range(indptr.shape[0]-1):
# nnz is the number of nonzero elements in this row.
nnz = indptr[row_i + 1] - indptr[row_i]
if degree == 2:
total_nnz += (nnz ** 2 + nnz) / 2 - interaction_only * nnz
else:
total_nnz += ((nnz ** 3 + 3 * nnz ** 2 + 2 * nnz) / 6
- interaction_only * nnz ** 2)

# Make the arrays that will form the CSR matrix of the expansion.
cdef:
DATA_T[:] expanded_data = np.empty(
shape=total_nnz, dtype=data.base.dtype
)
cnp.int32_t[:] expanded_indices = np.empty(
shape=total_nnz, dtype=np.int32
)
cnp.int32_t num_rows = indptr.shape[0] - 1
cnp.int32_t[:] expanded_indptr = np.empty(
shape=num_rows + 1, dtype=np.int32
)

cnp.int32_t expanded_index = 0, row_starts, row_ends
cnp.int32_t i, j, k, i_ptr, j_ptr, k_ptr, num_cols_in_row

cdef INDEX_A_t row_i, row_starts, row_ends, i, j, k, i_ptr, j_ptr, k_ptr
cdef INDEX_B_t expanded_index=0, num_cols_in_row, col
with nogil:
expanded_indptr[0] = indptr[0]
result_indptr[0] = indptr[0]
for row_i in range(indptr.shape[0]-1):
row_starts = indptr[row_i]
row_ends = indptr[row_i + 1]
@@ -145,24 +226,32 @@ def _csr_polynomial_expansion(
for j_ptr in range(i_ptr + interaction_only, row_ends):
j = indices[j_ptr]
if degree == 2:
col = _deg2_column(d, i, j, interaction_only)
expanded_indices[expanded_index] = col
expanded_data[expanded_index] = (
data[i_ptr] * data[j_ptr])
col = <INDEX_B_t> _deg2_column(
n_features,
i, j,
interaction_only
)
result_indices[expanded_index] = col
result_data[expanded_index] = (
data[i_ptr] * data[j_ptr]
)
expanded_index += 1
num_cols_in_row += 1
else:
# degree == 3
for k_ptr in range(j_ptr + interaction_only, row_ends):
k = indices[k_ptr]
col = _deg3_column(d, i, j, k, interaction_only)
expanded_indices[expanded_index] = col
expanded_data[expanded_index] = (
data[i_ptr] * data[j_ptr] * data[k_ptr])
col = <INDEX_B_t> _deg3_column(
n_features,
i, j, k,
interaction_only
)
result_indices[expanded_index] = col
result_data[expanded_index] = (
data[i_ptr] * data[j_ptr] * data[k_ptr]
)
expanded_index += 1
num_cols_in_row += 1

expanded_indptr[row_i+1] = expanded_indptr[row_i] + num_cols_in_row

return csr_matrix((expanded_data, expanded_indices, expanded_indptr),
shape=(num_rows, expanded_dimensionality))
result_indptr[row_i+1] = result_indptr[row_i] + num_cols_in_row
return
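
To make the loop structure of the expansion easier to follow, here is a slow reference version of the degree-2 case in plain Python. It is only a sketch under stated assumptions: the function name `csr_expand_deg2` is hypothetical, the CSR arrays are plain lists, column indices within each row are assumed sorted, and outputs are returned instead of written into preallocated buffers.

```python
def csr_expand_deg2(data, indices, indptr, n_features, interaction_only):
    """Reference (slow) degree-2 expansion of a CSR matrix given as lists."""
    out_data, out_indices, out_indptr = [], [], [indptr[0]]
    for row in range(len(indptr) - 1):
        start, end = indptr[row], indptr[row + 1]
        n_new = 0  # number of expanded entries produced by this row
        for i_ptr in range(start, end):
            i = indices[i_ptr]
            # Starting at i_ptr + interaction_only skips squares when requested.
            for j_ptr in range(i_ptr + interaction_only, end):
                j = indices[j_ptr]
                if interaction_only:
                    col = n_features * i - i * (i + 3) // 2 - 1 + j
                else:
                    col = n_features * i - i * (i + 1) // 2 + j
                out_indices.append(col)
                out_data.append(data[i_ptr] * data[j_ptr])
                n_new += 1
        out_indptr.append(out_indptr[-1] + n_new)
    return out_data, out_indices, out_indptr


# One row [2, 0, 3] with three features, stored in CSR form.
print(csr_expand_deg2([2.0, 3.0], [0, 2], [0, 2], 3, 0))
# ([4.0, 6.0, 9.0], [0, 2, 5], [0, 3])
print(csr_expand_deg2([2.0, 3.0], [0, 2], [0, 2], 3, 1))
# ([6.0], [1], [0, 1])
```

Only products of stored (non-zero) entries are visited, so sparser rows do proportionally less work than a dense expansion would.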