Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

MAINT Remove all Cython, C and C++ compilations warnings #24875

Copy link
Copy link
Closed
@jjerphan

Description

@jjerphan
Issue body actions

Context

scikit-learn builds with Cython, C and C++ warnings (when building wheels or for when installing locally (for development)). There are several kinds of warnings, each kind having its own cause, solutions and mitigations.

🏷 Use of the deprecated NumPy API (via Cython) (-Wcpp)

Some Cython implementations might use old Cython idiomatic constructs or old symbols of NumPy directly via NumPy's Cython API (generally used via cnp from cimport numpy as cnp) which are deprecated.

Examples of such symbols non-exhaustively involves:

  • the cnp.ndarray
  • some constants (see this enum for instance)

Such warnings are not raised compiling Cython code, but are when compiling C and C++ code, generated by Cython (or even in standalone implementations using the NumPy C API directly).

🧰 Resolutions involves:

Example of such warning in context
  building 'sklearn.cluster._dbscan_inner' extension
  creating /tmp/tmpuptw2_72.build-temp/sklearn/cluster
  gcc -pthread -B /home/jjerphan/.local/share/miniconda3/envs/sk/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/jjerphan/.local/share/miniconda3/envs/sk/include -fPIC -O2 -isystem /home/jjerphan/.local/share/miniconda3/envs/sk/include -fPIC -I/home/jjerphan/.local/share/miniconda3/envs/sk/lib/python3.11/site-packages/numpy/core/include -I/home/jjerphan/.local/share/miniconda3/envs/sk/include/python3.11 -c sklearn/cluster/_dbscan_inner.cpp -o /tmp/tmpuptw2_72.build-temp/sklearn/cluster/_dbscan_inner.o -O3 -fopenmp
  In file included from /home/jjerphan/.local/share/miniconda3/envs/sk/lib/python3.11/site-packages/numpy/core/include/numpy/ndarraytypes.h:1948,
                   from /home/jjerphan/.local/share/miniconda3/envs/sk/lib/python3.11/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                   from /home/jjerphan/.local/share/miniconda3/envs/sk/lib/python3.11/site-packages/numpy/core/include/numpy/arrayobject.h:5,
                   from sklearn/cluster/_dbscan_inner.cpp:794:
  /home/jjerphan/.local/share/miniconda3/envs/sk/lib/python3.11/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
     17 | #warning "Using deprecated NumPy API, disable it with " \
        |  ^~~~~~~

🏷 Discarded const-qualification (-Wdiscarded-qualifiers)

In some implementation, some variables are being defined using const-qualified variables.

For instance in C:

// a is const-qualified and can't be modified then
const int a = 5;
// The const-qualification of a is discarded for the definition of b.
// b is not const-qualified and thus can be modified.
int b = a;

// Note that modifying b does not modify a.
// This would if a and b were pointers to objects
// (hence the reason for this kind of warning). 

While most of such definitions aren't harmful, some might be and compilers generally raise warning when the const-qualifications are discarded.

🧰 Resolutions involves:

  • const qualifying variables (like b) which currently aren't
Example of such warning in context
  sklearn/metrics/_dist_metrics.c: In function ‘__pyx_pf_7sklearn_7metrics_13_dist_metrics_14DistanceMetric_22_pairwise_sparse_dense’:
  sklearn/metrics/_dist_metrics.c:9143:29: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
   9143 |             __pyx_v_x2_data = ((&(*((__pyx_t_7sklearn_5utils_9_typedefs_DTYPE_t const  *) ( /* dim=1 */ ((char *) (((__pyx_t_7sklearn_5utils_9_typedefs_DTYPE_t const  *) ( /* dim=0 */ (__pyx_v_Y_data.data + __pyx_t_14 * __pyx_v_Y_data.strides[0]) )) + __pyx_t_18)) )))) + (__pyx_v_i2 * __pyx_v_n_features));
        |                             ^

🏷 Comparison on different integral-typed values (Wsign-compare) and integer overflow (-Woverflow)

Various integral dtypes (namely (unsigned) {short,int,long}, an NumPy aliases i.e. cnp.int*, size, Py_ssize_t, other custom aliases in scikit-learn, etc.) are used in scikit-learn.

In Cython and C and C++ implementation, it is possible that binary operations (comparison, assignation, etc.) are applied on variable having different dtypes. Some of

While such instructions aren't harmful, some might be and compilers generally raise warning in their presence.

🧰 Resolutions involves:

  • adapting type so that both values have the same one
  • explicitly casting some variables' value to other temporary variables of the same type
  • reworking the use of fused types which generate specialised versions of functions for various combinations of types
Example of such warning in context
  sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors.cpp: In function ‘void __pyx_f_7sklearn_7metrics_29_pairwise_distances_reduction_17_radius_neighbors_17RadiusNeighbors64_compute_exact_distances(__pyx_obj_7sklearn_7metrics_29_pairwise_distances_reduction_17_radius_neighbors_RadiusNeighbors64*)’:
  sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors.cpp:6285:59: warning: comparison of integer expressions of different signedness: ‘__pyx_t_7sklearn_5utils_9_typedefs_ITYPE_t’ {aka ‘long int’} and ‘std::vector<long int>::size_type’ {aka ‘long unsigned int’} [-Wsign-compare]
   6285 |                             for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {
        |                                                 ~~~~~~~~~~^~~~~~~~~~~

🏷 Incompatible pointers types casts (-Wincompatible-pointer-types) (Difficulty level: 🧙)

On some Cython class, polymorphism implementations apparently cast to a incompatible pointers type.

This runs and works fine but this is a code smell which yet is to be inspected.

This problems mainly is present for the DistanceMetric and BinaryTree class hierarchies.

A similar problem has been reported in cython/cython#2747.

🧰 Resolutions involves:

  • experimenting with the @final Cython decorator for DistanceMetrics' methods'
  • understanding how polymorphism is implemented
  • changing DistanceMetrics'' methods' signature
Example of such warning in context
sklearn/metrics/_dist_metrics.c: In function ‘__pyx_f_7sklearn_7metrics_13_dist_metrics_17EuclideanDistance_dist_csr’:
  sklearn/metrics/_dist_metrics.c:11320:88: warning: passing argument 1 of ‘__pyx_f_7sklearn_7metrics_13_dist_metrics_17EuclideanDistance_rdist_csr’ from incompatible pointer type [-Wincompatible-pointer-types]
  11320 |   __pyx_t_1 = __pyx_f_7sklearn_7metrics_13_dist_metrics_17EuclideanDistance_rdist_csr(((struct __pyx_obj_7sklearn_7metrics_13_dist_metrics_DistanceMetric *)__pyx_v_self), __pyx_v_x1_data, __pyx_v_x1_indices, __pyx_v_x2_data, __pyx_v_x2_indices, __pyx_v_x1_start, __pyx_v_x1_end, __pyx_v_x2_start, __pyx_v_x2_end, __pyx_v_size); if (unlikely(__pyx_t_1 == ((__pyx_t_7sklearn_5utils_9_typedefs_DTYPE_t)-1.0))) __PYX_ERR(1, 927, __pyx_L1_error)
        |                                                                                       ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        |                                                                                        |
        |                                                                                        struct __pyx_obj_7sklearn_7metrics_13_dist_metrics_DistanceMetric *
  

Proposed solution: treat one kind of warning for one file/extension at a time

  • Only perform changes necessary to remove one kind of warning (🏷) for one file or one Cython extension at a time using information given in 🧰.

    • ⚠ It might not be possible to remove all warnings at once
    • ⚠ Changes might not be trivial or might necessitate adaption. In this case, please do explain the logic.
  • Create PRs for each warning with the following title:

    MAINT Remove `-Wkind` warnings when compiling `source_file.ext`
    

    where:

    • kind is the kind of warning (🏷), here represented as the formatted string in the subtitles, i.e. an element of (cpp, incompatible-pointer-types, sign-compare,overflow)
    • source_file.ext is the file being treated
  • To be discussed among maintainers: once a kind of warnings as been entirely removed, turn on strict compilations compilation flags to raise error for this kind of warning.

Miscellaneous

Example of compilation trace

Examples of warnings in their context stem from a verbose trace when building scikit-learn locally.

To see this full development installation trace (as of 2e481f1), see this gist.

Filtering compilation trace

You can compile the whole project and filter the compilation trace for some module or source, using grep for instance. Note that stderr better be redirected to stdout to have all the information.

For instance, to perform this for sklearn.linear_model._cd_fast, you can running in the scikit-learn's worktree (using -C to have 6 surroundings lines in the context of matchings):

pip install --verbose --editable . 2>&1 | grep -C 3 "sklearn.linear_model._cd_fast"

Metadata

Metadata

Assignees

No one assigned

    Labels

    Build / CIC/C++EasyWell-defined and straightforward way to resolveWell-defined and straightforward way to resolveMeta-issueGeneral issue associated to an identified list of tasksGeneral issue associated to an identified list of taskscython

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      Morty Proxy This is a proxified and sanitized view of the page, visit original site.