Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Fail gracefully if can't decode font names #11263

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

mdboom
Copy link
Member

@mdboom mdboom commented May 17, 2018

PR Summary

Not all Pythons ship with legacy encodings. A couple of fields in TTF files are encoded in the obsolete encoding "MacRoman", and reading these currently fails on those platforms. This falls back to decoding as ascii, replacing special characters with ? if MacRoman isn't available. This should be ok to do in the vast majority of cases: this will only be different if a font happens to have a diacritical character in the latin alphabet (which none of the fonts that matplotlib ships do).

A more forward-looking patch (but perhaps riskier from a lack-of-testing perspective) would be to look for a unicode name first, and failing that, fall back to this.

PR Checklist

  • Has Pytest style unit tests
  • Code is PEP 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@jklymak jklymak added this to the v3.0 milestone May 17, 2018
@anntzer
Copy link
Contributor

anntzer commented May 17, 2018

I am fairly strongly -1 on guessing the wrong encoding, and +1 on trying to load a more common encoding first. Will need to do some survey of what entries are "usually" (on "common" fonts present in the sfnt tables.

@mdboom
Copy link
Member Author

mdboom commented May 18, 2018

I think for historical reasons, MacRoman is the only one that's guaranteed, though if the fonts are modern enough, Unicode can be reasonably expected to be present. I think a good compromise is:

  • try Unicode
  • try MacRoman
  • failing that, do a best approximation of the MacRoman entry

I've noticed also that there are more instances of this scattered throughout the code. I'll update this PR to include those as well when I get the chance.

@anntzer
Copy link
Contributor

anntzer commented May 19, 2018

A quick check suggests that fonts on my linux install tend to have exactly two platform/encoding pairs available in the SFNT name table: MACINTOSH/MAC_ID_ROMAN (MacRoman) and MICROSOFT/MS_IS_UNICODE_CS (UTF-16BE, per https://docs.microsoft.com/en-us/typography/opentype/spec/name#windows-platform-specific-encoding-and-language-ids-platform-id-3).
I am fine first trying the latter, but would still prefer just let things fail (e.g. failure to load the font at all, so can be more graceful than a hard crash) if Unicode is present and MacRoman is not available on the nonstandard Python platform (we are finally moving away from the Py2/Py3 split, I'd rather not get again into a mess of Python dialects with slightly different available features).

@mdboom
Copy link
Member Author

mdboom commented May 19, 2018

From a pragmatic perspective, I think what you suggest is fine. The worst outcome is "please upgrade your legacy font from the 20th century."

However, I have to strongly reject your premise that matplotlib shouldn't support alternative platforms. Imagine if we had said the same thing about sparc, ppc, Google appengine, or maintaining python 2 as long as we did... These things have been critical to the breadth of the project over the years, even if some are no longer relevant. Hard to say what will be important going forward, but it seems short sighted to reject on the basis of non-standardness. One of cpython's great strengths is its modularilty and portability. It's explicit in the build that not all parts of the stdlib are always there.

That said, I realise it's hard to catch this stuff. I'd be willing to submit a pr to add a Travis testing configuration with the non-guaranteed parts of python removed if such a thing would be welcome.

@tacaswell
Copy link
Member

I'd rather not get again into a mess of Python dialects with slightly different available features

We definitely have to support all 'weird' builds of the Python versions we support or be very clear what build conditions we do not support.

@anntzer
Copy link
Contributor

anntzer commented May 19, 2018

Well, right now AFAIK we support CPython{python.org,majorlinuxdistros,conda,homebrew} and PyPy (insert version number here). Note that GAE support has fallen in disrepair (i.e. broken) at some point after 1.2.0 and been "officially" removed in 2.1 (#8939). To avoid that from happening again, it would definitely help to have testing for nonstandard platforms.

The real question is what "the non-guaranteed parts of python" means (there may well be a spec, I may well just not know it!). (Assuming that such a spec doesn't exist) I have all the respect in the world for you work on Python-in-the-browser, but can I just come up with my own implementation missing feature X, Y, and Z, and make a PR that works around these in Matplotlib? That doesn't seem like a sustainable model. Again, the point is not to dismiss the quality of pyodide, only to reach agreement on a sustainable dev model.

@anntzer anntzer mentioned this pull request May 19, 2018
6 tasks
@mdboom
Copy link
Member Author

mdboom commented May 21, 2018

This file defines which modules in CPython are optional -- all the ones from these comment and below:

https://github.com/python/cpython/blob/master/Modules/Setup.dist#L141

I would propose that we add a Travis-CI config that removes those modules from Python (this could even be done programatically based on this file) and makes sure all of the matplotlib tests run. It gives a pretty clear target of "what matplotlib officially supports" that's defined by core Python itself. That doesn't mean matplotlib can't use these optional things (bzip2 springs to mind), just that they would be optional features.

I think the most sustainable development module is to upstream as much of the workarounds as is feasible, which is probably not everything, but most of them will be pretty reasonable. I think it's usually a worse situation to have a bunch of non-merged patches floating around downstream.

- Try Unicode (utf16-be) first
- Failing that, try MacRoman
- If MacRoman isn't available or that key isn't found:
  - raise a ValueError so the font won't be included in the cache at all

Update the handful of locations that were doing sfnt lookups to use this
function.
@mdboom
Copy link
Member Author

mdboom commented May 21, 2018

I've update the PR to include the approach discussed (trying Unicode first, then falling back to MacRoman, then skipping font altogether).

@mdboom
Copy link
Member Author

mdboom commented May 21, 2018

I updated this PR before noticing your #11277. Either approach is fine to me. I'd consider #11277 to also close this bug.

@anntzer
Copy link
Contributor

anntzer commented May 21, 2018

I'm going to close this in favor of #11277 then.

@anntzer anntzer closed this May 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.