-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fail gracefully if can't decode font names #11263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I am fairly strongly -1 on guessing the wrong encoding, and +1 on trying to load a more common encoding first. Will need to do some survey of what entries are "usually" (on "common" fonts present in the sfnt tables. |
I think for historical reasons, MacRoman is the only one that's guaranteed, though if the fonts are modern enough, Unicode can be reasonably expected to be present. I think a good compromise is:
I've noticed also that there are more instances of this scattered throughout the code. I'll update this PR to include those as well when I get the chance. |
A quick check suggests that fonts on my linux install tend to have exactly two platform/encoding pairs available in the SFNT name table: MACINTOSH/MAC_ID_ROMAN (MacRoman) and MICROSOFT/MS_IS_UNICODE_CS (UTF-16BE, per https://docs.microsoft.com/en-us/typography/opentype/spec/name#windows-platform-specific-encoding-and-language-ids-platform-id-3). |
From a pragmatic perspective, I think what you suggest is fine. The worst outcome is "please upgrade your legacy font from the 20th century." However, I have to strongly reject your premise that matplotlib shouldn't support alternative platforms. Imagine if we had said the same thing about sparc, ppc, Google appengine, or maintaining python 2 as long as we did... These things have been critical to the breadth of the project over the years, even if some are no longer relevant. Hard to say what will be important going forward, but it seems short sighted to reject on the basis of non-standardness. One of cpython's great strengths is its modularilty and portability. It's explicit in the build that not all parts of the stdlib are always there. That said, I realise it's hard to catch this stuff. I'd be willing to submit a pr to add a Travis testing configuration with the non-guaranteed parts of python removed if such a thing would be welcome. |
We definitely have to support all 'weird' builds of the Python versions we support or be very clear what build conditions we do not support. |
Well, right now AFAIK we support CPython{python.org,majorlinuxdistros,conda,homebrew} and PyPy (insert version number here). Note that GAE support has fallen in disrepair (i.e. broken) at some point after 1.2.0 and been "officially" removed in 2.1 (#8939). To avoid that from happening again, it would definitely help to have testing for nonstandard platforms. The real question is what "the non-guaranteed parts of python" means (there may well be a spec, I may well just not know it!). (Assuming that such a spec doesn't exist) I have all the respect in the world for you work on Python-in-the-browser, but can I just come up with my own implementation missing feature X, Y, and Z, and make a PR that works around these in Matplotlib? That doesn't seem like a sustainable model. Again, the point is not to dismiss the quality of pyodide, only to reach agreement on a sustainable dev model. |
This file defines which modules in CPython are optional -- all the ones from these comment and below: https://github.com/python/cpython/blob/master/Modules/Setup.dist#L141 I would propose that we add a Travis-CI config that removes those modules from Python (this could even be done programatically based on this file) and makes sure all of the matplotlib tests run. It gives a pretty clear target of "what matplotlib officially supports" that's defined by core Python itself. That doesn't mean matplotlib can't use these optional things (bzip2 springs to mind), just that they would be optional features. I think the most sustainable development module is to upstream as much of the workarounds as is feasible, which is probably not everything, but most of them will be pretty reasonable. I think it's usually a worse situation to have a bunch of non-merged patches floating around downstream. |
- Try Unicode (utf16-be) first - Failing that, try MacRoman - If MacRoman isn't available or that key isn't found: - raise a ValueError so the font won't be included in the cache at all Update the handful of locations that were doing sfnt lookups to use this function.
I've update the PR to include the approach discussed (trying Unicode first, then falling back to MacRoman, then skipping font altogether). |
I'm going to close this in favor of #11277 then. |
PR Summary
Not all Pythons ship with legacy encodings. A couple of fields in TTF files are encoded in the obsolete encoding "MacRoman", and reading these currently fails on those platforms. This falls back to decoding as ascii, replacing special characters with
?
if MacRoman isn't available. This should be ok to do in the vast majority of cases: this will only be different if a font happens to have a diacritical character in the latin alphabet (which none of the fonts that matplotlib ships do).A more forward-looking patch (but perhaps riskier from a lack-of-testing perspective) would be to look for a unicode name first, and failing that, fall back to this.
PR Checklist