Description
When running a large update on a plot, the time is sometimes dominated by the lookup for missing names, even when this lookup serves no purpose. One example is plotly.subplots.make_subplots, where new axes are added by the plotly library itself: finding "nearby" names is not helpful when all that is needed is a check for whether the name already exists.
In plotly.basedatatypes.BaseFigure._perform_update, each key to be updated is passed to _check_path_in_prop_tree. If the plotly_obj is a BaseLayoutType and the key matches one of the _subplot_re_match keywords, the error that comes back is ignored and the key is simply added to the plotly_obj instead.
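The control flow described above can be modelled with a toy example. This is only a sketch of the pattern, not the library's code: check_path, perform_update and SUBPLOT_KEY are hypothetical stand-ins, and the regular expression is a simplified guess at what a subplot key looks like.

```python
import re

# Hypothetical stand-in for the layout's subplot-key pattern (e.g. "xaxis2").
SUBPLOT_KEY = re.compile(r"^(xaxis|yaxis)([0-9]*)$")


def check_path(props, key):
    """Toy validity check: return None if key exists, else an error object."""
    if key in props:
        return None
    # In the situation described above, building this error is where the
    # expensive closest-name search would happen.
    return ValueError(f"invalid property {key!r}")


def perform_update(props, updates):
    """Toy update loop: validate each key, then apply the update."""
    for key, value in updates.items():
        err = check_path(props, key)
        if err is not None:
            if SUBPLOT_KEY.match(key):
                # The error is thrown away and the key is simply created.
                props[key] = {}
            else:
                raise err
        props[key].update(value)


layout = {"xaxis": {}, "yaxis": {}}
perform_update(layout, {"xaxis2": {"domain": [0.5, 1.0]}})
```

In this toy version the error for "xaxis2" is built and then discarded, which is the wasted work the issue is about.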
This call to _check_path_in_prop_tree is expensive when both of the following hold:
- there are a large number of defined keys in the layout (such as when a large number of subplots are being initialized), and
- the path is not yet in the prop tree.
When those conditions are met, the prop lookup in _check_path_in_prop_tree calls BasePlotlyType.__getitem__, which calls BasePlotlyType._raise_on_invalid_property_error, which in turn calls _plotly_utils.utils.find_closest_string for each new property being added. find_closest_string computes a Levenshtein distance against every candidate name, so each failed lookup gets slower as keys accumulate, and the total time grows much faster than linearly with the number of keys. This is the root of the problem.
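To illustrate why the cost climbs so quickly, here is a small self-contained sketch. It does not use plotly: levenshtein and closest are plain textbook versions written for this example, and distance_calls_for is a hypothetical helper that counts how many distance computations are needed when keys are added one at a time and each new key is compared against all existing ones.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def closest(name, candidates):
    """Return the candidate with the smallest edit distance to name."""
    return min(candidates, key=lambda c: levenshtein(name, c))


def distance_calls_for(n_keys):
    """Count distance computations when adding n_keys keys one by one."""
    keys, calls = [], 0
    for i in range(1, n_keys + 1):
        new_key = f"xaxis{i}"
        if keys:
            closest(new_key, keys)  # one distance call per existing key
        calls += len(keys)
        keys.append(new_key)
    return calls


for n in (10, 20, 40):
    print(n, distance_calls_for(n))
```

In this simplified model, adding n keys costs n(n-1)/2 distance computations, so doubling the number of keys roughly quadruples the work. The real layout holds other properties as well, so the exact counts will differ.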
Being able to disable the lookup in _raise_on_invalid_property_error would help, but there is a complication: _check_path_in_prop_tree reaches it through an implicit __getitem__ on the plotly_obj, so there is no direct call through which such an option could be passed.
The issue can be worked around by disabling the lookup completely, as shown in the second example notebook below, where _plotly_utils.utils is monkey-patched so that find_closest_string always raises an error. This doesn't seem like the best way to handle it, though.
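If a monkey-patch is used anyway, it could at least be scoped so that the original function is restored afterwards. The following is a minimal sketch of that idea using unittest.mock.patch.object. The lookup_disabled helper is hypothetical, and a SimpleNamespace stands in for the module so the example runs without plotly. It has not been tested against plotly itself, and the object that needs patching depends on where the function name is actually bound.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from unittest import mock


def _no_suggestion(string, strings):
    """Replacement that skips the search and signals 'no suggestion'."""
    raise ValueError()


@contextmanager
def lookup_disabled(module, name="find_closest_string"):
    """Temporarily replace module.<name>; restore the original on exit."""
    with mock.patch.object(module, name, _no_suggestion):
        yield


# Stand-in for a module that exposes find_closest_string (assumption).
fake_utils = SimpleNamespace(find_closest_string=lambda s, strings: strings[0])

with lookup_disabled(fake_utils):
    try:
        fake_utils.find_closest_string("xaxis2", ["xaxis"])
    except ValueError:
        print("lookup disabled inside the block")

print(fake_utils.find_closest_string("xaxis2", ["xaxis"]))
```

Outside the with block the original function is back in place, so the helpful error messages still appear everywhere else.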
Example one: without disabling lookup
import plotly.subplots
import cProfile
import pstats
from pstats import SortKey
cProfile.run('fig = plotly.subplots.make_subplots(rows=20, cols=20)', 'restats')
p = pstats.Stats('restats')
p.sort_stats(SortKey.CUMULATIVE).print_stats(20)
Mon Mar 13 15:24:56 2023 restats
55179619 function calls (54968062 primitive calls) in 18.368 seconds
Ordered by: cumulative time
List reduced from 868 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
150/1 0.000 0.000 18.368 18.368 {built-in method builtins.exec}
1 0.000 0.000 18.368 18.368 <string>:1(<module>)
1 0.000 0.000 18.368 18.368 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/subplots.py:7(make_subplots)
1 0.002 0.002 18.368 18.368 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/_subplots.py:45(make_subplots)
1 0.000 0.000 18.043 18.043 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/graph_objs/_figure.py:736(update_layout)
1 0.000 0.000 18.043 18.043 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:1378(update_layout)
1 0.000 0.000 18.043 18.043 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5096(update)
802/2 0.010 0.000 17.536 8.768 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:3841(_perform_update)
19302/14494 0.065 0.000 17.219 0.001 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:4659(__getitem__)
4893/4891 0.018 0.000 17.209 0.004 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:159(_check_path_in_prop_tree)
14498/14494 0.015 0.000 17.096 0.001 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5828(__getitem__)
798 0.005 0.000 16.890 0.021 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5047(_ret)
798 0.001 0.000 16.879 0.021 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/_plotly_utils/utils.py:449(find_closest_string)
1600 0.176 0.000 16.878 0.011 {built-in method builtins.sorted}
390621 0.099 0.000 16.702 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/_plotly_utils/utils.py:450(_key)
470776/390621 11.539 0.000 16.603 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/_plotly_utils/utils.py:430(levenshtein)
24961225 3.540 0.000 3.540 0.000 {built-in method builtins.min}
24971843 1.419 0.000 1.419 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.507 0.507 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/contextlib.py:139(__exit__)
2 0.000 0.000 0.507 0.253 {built-in method builtins.next}
<pstats.Stats at 0x7faae67616f0>
Example two: disabling lookup
## Patch the library to disable the Levenshtein lookup for missing strings.
## The patch is applied before plotly.subplots is imported.
def disable_find_closest_string(string, strings):
    raise ValueError()

import _plotly_utils.utils
_plotly_utils.utils.find_closest_string = disable_find_closest_string
import plotly.subplots
import cProfile
import pstats
from pstats import SortKey
cProfile.run('fig = plotly.subplots.make_subplots(rows=20, cols=20)', 'restats')
p = pstats.Stats('restats')
p.sort_stats(SortKey.CUMULATIVE).print_stats(20)
Mon Mar 13 15:25:05 2023 restats
2674042 function calls (2542640 primitive calls) in 1.498 seconds
Ordered by: cumulative time
List reduced from 866 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
150/1 0.000 0.000 1.498 1.498 {built-in method builtins.exec}
1 0.000 0.000 1.498 1.498 <string>:1(<module>)
1 0.000 0.000 1.498 1.498 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/subplots.py:7(make_subplots)
1 0.002 0.002 1.497 1.497 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/_subplots.py:45(make_subplots)
1 0.000 0.000 1.170 1.170 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/graph_objs/_figure.py:736(update_layout)
1 0.000 0.000 1.170 1.170 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:1378(update_layout)
1 0.000 0.000 1.170 1.170 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5096(update)
802/2 0.009 0.000 0.626 0.313 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:3841(_perform_update)
1 0.000 0.000 0.544 0.544 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/contextlib.py:139(__exit__)
2 0.000 0.000 0.544 0.272 {built-in method builtins.next}
2 0.000 0.000 0.544 0.272 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2995(batch_update)
1 0.000 0.000 0.544 0.544 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2860(plotly_update)
9697 0.072 0.000 0.443 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:51(_str_to_dict_path_full)
1 0.000 0.000 0.417 0.417 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2934(_perform_plotly_update)
1 0.004 0.004 0.417 0.417 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2611(_perform_plotly_relayout)
1600 0.003 0.000 0.371 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5842(__setitem__)
1596 0.004 0.000 0.355 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5726(_set_subplotid_prop)
1598 0.010 0.000 0.353 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5237(_set_compound_prop)
103626 0.042 0.000 0.347 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:1811(_str_to_dict_path)
19302/14494 0.066 0.000 0.343 0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:4659(__getitem__)
<pstats.Stats at 0x7ff048631750>