Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Implement line labels feature #17035

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
Loading
from

Conversation

omarchehab98
Copy link
Contributor

@omarchehab98 omarchehab98 commented Apr 5, 2020

Co-authored-by: Robert Augustynowicz robert.augustynowicz@mail.utoronto.ca
Co-authored-by: Jinyang Hu jinyang.hu@mail.utoronto.ca
Co-authored-by: Dennis Tismenko dtismenkodeveloper@gmail.com
Co-authored-by: Jinming Zhang jinming.zhang@mail.utoronto.ca
Closes: #12939

Firstly, thank you for taking the time out of your day to review our pull request. We appreciate your time and feedback.

PR Summary

In this pull request, we implement a feature on Axes called label_lines. We chose to make it's API similar to legend.

In our implementation we took inspiration from the Bachelor's degrees example

We got the other three example data sets from ourworldindata.org/food-prices

# Temporary manual test suite for active development
cd test_label_lines
python3 test_label_lines.py

Inspiration

image

line_labels

how-much-food-can-you-buy-for-working-one-hour-in-the-manufacturing-sector.csv

Inspiration

image

line_labels

domestic-food-price-volatility-index.csv

Inspiration

image

line_labels

commodity-price-index-of-cereal-crops-and-petroleum-1850-2015.csv

Inspiration

image

line_labels

percent_bachelors_degrees_women_usa.csv

line_labels

bucket_fill.csv

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@omarchehab98 omarchehab98 marked this pull request as ready for review April 5, 2020 01:58
@RobertAugustynowicz RobertAugustynowicz force-pushed the Feature-12939 branch 2 times, most recently from 1e1ea77 to 952a9db Compare April 5, 2020 21:23
@omarchehab98
Copy link
Contributor Author

omarchehab98 commented Apr 6, 2020

I am trying to write an image comparison test for label_lines, here is my progress:

@image_comparison(['label_lines'])
def test_label_lines():
    plt.rcParams["figure.figsize"] = (5, 3)
    fig, ax = plt.subplots()
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.get_xaxis().tick_bottom()
    ax.get_yaxis().tick_left()
    plt.plot([0, 1, 2], [0, 1, 2], label="y=x")
    plt.plot([0, 1, 2], [1, 1.5, 2], label="y=x/2+1")
    plt.plot([0, 1, 2], [-1, 1, 3], label="y=2x-1")
    ax.label_lines()

I am unsure why the annotations are not being displayed in the image comparison test. What's confusing is that when I run the same code in the python REPL with plt.show() at the end, I see the annotations.

Image Comparison Test

label_lines

REPL with plt.show()

image

@jklymak
Copy link
Member

jklymak commented Apr 6, 2020

Because they are off the page?

@omarchehab98
Copy link
Contributor Author

Because they are off the page?

@jklymak thanks but your comment is not helpful because there is space for them to partially render like plt.show() renders them.

label_lines

@jklymak
Copy link
Member

jklymak commented Apr 6, 2020

Clipping? Note the tests are typically run with a different style than the default for historical reasons.

test_label_lines/test_label_lines.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
zip(handles, labels, xys)),
key=by_y):
data_x, data_y = xy

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this? Why is this more complicated than putting the text at the end of each line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our main goal with this implementation is to ensure readability for the user at minimal cost for them. The given examples such as Bachelor's degrees by gender were created by individually placing text objects and using offsets so that they do not overlap. Therefore to try and automatically format the texts in such a way that there is no overlap, we decided to create this bitmap system for spacing rather than simply placing a text at the end of each line. This need for automatic spacing is exemplified by the the same graph from the feature request, where it lacks the offsets.
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jklymak basically, if you place the text at the end of each line. In some cases the text will overlap, consider the bachelor's degree example posted by my colleague above.

Copy link
Contributor Author

@omarchehab98 omarchehab98 Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jklymak for the explanation you requested:

We are placing annotations in "buckets". To do so, we sort the last data points by Y ascending and use a boolean array to keep track of which buckets have been used. If a bucket has been used, we try to place the text in the next empty bucket.

We also consider the data point density of each bucket, if there are many data points in a bucket the annotations should be centered around the bucket rather than be placed bottom-up starting from the center (due to our iteration order because of the sort)

@omarchehab98
Copy link
Contributor Author

@jklymak we've made changes based on your feedback, I've marked your inline comments as resolved. We also added an example file and a new entry in doc/users/next_whats_new/.

We are working on unit tests and will push them as soon as they are done.

@RobertAugustynowicz RobertAugustynowicz force-pushed the Feature-12939 branch 2 times, most recently from 5319e90 to c19284f Compare April 6, 2020 17:51
@WeatherGod
Copy link
Member

I wonder if the constrained layout engine could get repurposed to help solve the optimal text label position problem? It is already in matplotlib for other purposes. I am not familiar enough with it though to comment on whether or not it could be used.

@ImportanceOfBeingErnest
Copy link
Member

ImportanceOfBeingErnest commented Apr 6, 2020

There is a nice tool called adjustText which could be used to repell the texts from each other. With it, such graph could look like

import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt

def lines_legend(ax):
    from adjustText import adjust_text
    texts=[]
    for line in ax.lines:
        l = ax.annotate(line.get_label(), xy=(1, line.get_ydata()[-1]), xytext=(5,0),
                        xycoords=ax.get_yaxis_transform(), textcoords="offset points",
                        va="center", color=line.get_color())
        texts.append(l)
    
    adjust_text(texts, only_move={'points':'y', 'text':'y'}, ha='left',
                va="center", autoalign=False, ax=ax)
    
    
x = np.linspace(0,1,40)
y = np.cumsum(np.random.randn(40,10), axis=0)
labels = ["".join(np.random.choice(list("ABCDEMNOP"), size=8)) for i in range(10)]

fig, ax = plt.subplots()
fig.subplots_adjust(right=0.8)
for i, label in enumerate(labels):
    ax.plot(x, y[:,i], label=label)
        
lines_legend(ax)

plt.show()

image


The reason I'm mentioning it, is that I'm not sure if we really want such a highly specialized code for only one single purpose.
Or in other words, I would be in favour of either adding a general purpose "text-repeller" that could then be used for more than this special task, or not to add such thing at all and ask people to use the existing external tools for it.

@omarchehab98
Copy link
Contributor Author

I wonder if the constrained layout engine could get repurposed to help solve the optimal text label position problem? It is already in matplotlib for other purposes. I am not familiar enough with it though to comment on whether or not it could be used.

@WeatherGod thank you for your thought, we would need more guidance to make it actionable.

I am not too familiar with the internals of constrained layout, however, it seems to me that the problem constrained layout is trying to solve is a different layout problem than we are trying to solve.

  1. Constrained layout adjusts the size of subplots to accommodate the size of axes and axes decorations. When laying out text near their data-points, nothing needs to be resized.

  2. Constrained layout deals with two-dimensions. Laying out text near the end of lines is only concerned with one dimension.

@jklymak
Copy link
Member

jklymak commented Apr 6, 2020

We discussed this during the weekly call. This feature is cool, but as is, it is likely too brittle for main-library use. i.e. if we zoom, what happens? What happens if there are 300 lines? etc.

We strongly suggest anyone who wants to make this a feature to first write a third-party library, get it fully functional, and if the changes are not too invasive, then feel free to promote it for the main branch.

As for @WeatherGod's suggestion, you could indeed use something like kiwisolver to solve for the text positions, but it would probably need a different implementation (though you may be able to use _layout_boxes.py)

@tacaswell
Copy link
Member

This may also be a good candidate to write up on the blog (https://github.com/matplotlib/matplotblog).

@RobertAugustynowicz
Copy link
Contributor

Thank you for the response, therefore would it be preferable to continue working on and improving our vertical text aligner, and implement it as an independent method which takes in a list of texts and aligns them without overlap using vertical spacing which is then used by label_lines. This would take the original position into account and try and place it as near as it can be to that position. Or would such a method still be too specific.

@tacaswell
Copy link
Member

The biggest problem is that the text become de-coupled from the lines so if the user pans / zooms the labels no longer are in the right place. The aligning / de-conflicting needs to happen every time the view is changed. We have one example of text that does this (the code in clabel for labeling contours).

If you want to keep working on this the right place to start is to create a new artist class that knows about the Line2D objects it is labeling, has all of the text as children, and put the adjustment logic in it's draw method (and then call the draw method on all of the text objects).

If you use Annotation you can probably also get the gray connecting lines without too much extra work.

@omarchehab98
Copy link
Contributor Author

it is likely too brittle for main-library use. i.e. if we zoom, what happens? What happens if there are 300 lines? etc.

@jklymak with our current implementation annotations are deleted and re-added on zoom/pan. Since the line labeling routine is fast, the user experience has been smooth. We've tested it with up to 175 lines.

The biggest problem is that the text become de-coupled from the lines so if the user pans / zooms the labels no longer are in the right place.

@tacaswell

In the graph below, there are 175 lines. All 175 annotations don't fit on the screen, however, the annotations that show are in the correct order starting from the minimum.

When zooming in, the annotations get redrawn since the line ends are more sparsely populated.

Our algorithm runs in linear time with respect to the number of lines, it positions them precisely based on available space. @ImportanceOfBeingErnest demonstrated how adjustText could be used to solve the same problem. We would like to note that adjustText is an iterative estimator, which means it is slow and imprecise. Our solution is fast and precise (we don't try to estimate, we calculate exactly where things should go).

175 lines

image

image

@henryhu123 henryhu123 force-pushed the Feature-12939 branch 2 times, most recently from cb80237 to d8ed7f4 Compare April 7, 2020 20:05
@jklymak
Copy link
Member

jklymak commented Apr 10, 2020

So, I finally got around to downloading this. Suggest the following example for testing based on @ImportanceOfBeingErnest above.

import numpy as np
import matplotlib.pyplot as plt

# Create plots with pre-defined labels.
fig, ax = plt.subplots(constrained_layout=False)
x = np.linspace(0,1,40)
Nlines = 400
y = np.cumsum(np.random.randn(40,Nlines), axis=0)
labels = ["".join(np.random.choice(list("ABCDEMNOP"), size=8)) for i in range(Nlines)]

for i, label in enumerate(labels):
    ax.plot(x, y[:,i], label=label)

ax.label_lines()

plt.show()

It actually all works pretty well. Sure if you have 400 lines the labels all overlap, but that seems fine. If you zoom the redraw, and if you pan around they refresh.

The only issue in my mind is that they are drawn in data space, so if you zoom they can go out of view. Not sure if it would't be better to just have them at x = 1 in axes co-ordinates, and just change their y value based on what lines are visible?

Otherwise this looks like a useful, relatively unobtrusive, addition.

@jklymak jklymak added this to the v3.3.0 milestone Apr 10, 2020
@omarchehab98
Copy link
Contributor Author

@jklymak I think it's a good idea to move them out of the data space into the axes coordinates, it makes more sense for a label/legend anyways

@QuLogic
Copy link
Member

QuLogic commented Apr 10, 2020

I guess you should probably update the Bachelor's degree example to use this new function now, too.

@omarchehab98
Copy link
Contributor Author

@QuLogic will do!

@jklymak jklymak marked this pull request as draft April 14, 2020 15:02
Co-authored-by: Robert Augustynowicz <robert.augustynowicz@mail.utoronto.ca>
Co-authored-by: Omar Chehab <omarchehab98@gmail.com>
Co-authored-by: Jinyang Hu <jinyang.hu@mail.utoronto.ca>
Co-authored-by: Dennis Tismenko <dtismenkodeveloper@gmail.com>
Co-authored-by: Jinming Zhang <jinming.zhang@mail.utoronto.ca>

Closes: matplotlib#12939
@QuLogic QuLogic modified the milestones: v3.3.0, v3.4.0 May 5, 2020
@anntzer
Copy link
Contributor

anntzer commented Jun 8, 2020

Maybe worth comparing with https://github.com/nschloe/dufte?

@QuLogic QuLogic modified the milestones: v3.4.0, v3.5.0 Jan 21, 2021
@tacaswell tacaswell modified the milestones: v3.5.0, v3.6.0 Aug 5, 2021
@timhoffm timhoffm modified the milestones: v3.6.0, unassigned Apr 30, 2022
@story645 story645 modified the milestones: unassigned, needs sorting Oct 6, 2022
@github-actions
Copy link

Since this Pull Request has not been updated in 60 days, it has been marked "inactive." This does not mean that it will be closed, though it may be moved to a "Draft" state. This helps maintainers prioritize their reviewing efforts. You can pick the PR back up anytime - please ping us if you need a review or guidance to move the PR forward! If you do not plan on continuing the work, please let us know so that we can either find someone to take the PR over, or close it.

@github-actions github-actions bot added the status: inactive Marked by the “Stale” Github Action label Jul 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option to place legend labels near to the data
10 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.