Commit 61bbeb8

markusicu authored and echeran committed
ICU-22723 download 76rc
1 parent 73626da commit 61bbeb8
1 file changed: 177 additions, 43 deletions
docs/download/76.md
@@ -14,15 +14,29 @@ License & terms of use: http://www.unicode.org/copyright.html
1414

1515
# ICU 76
1616

17-
ICU is the [premier library for software internationalization](https://icu.unicode.org/#h.i33fakvpjb7o), used by a [wide array of companies and organizations](https://icu.unicode.org/#h.f9qwubthqabj).
17+
ICU is the [premier library for software internationalization](https://icu.unicode.org/#h.i33fakvpjb7o),
18+
used by a [wide array of companies and organizations](https://icu.unicode.org/#h.f9qwubthqabj).
1819

1920
## Release Overview
2021

21-
ICU 76 updates to [Unicode 16](https://www.unicode.org/versions/Unicode16.0.0/) (TODO: link to blog),
22+
ICU 76 updates to
23+
[Unicode 16](https://www.unicode.org/versions/Unicode16.0.0/)
24+
([blog](https://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html)),
2225
including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations.
23-
It also updates to [CLDR 46](https://github.com/unicode-org/cldr/blob/main/docs/site/downloads/cldr-46.md) (TODO: link to blog) locale data with new locales and various additions and corrections.
26+
27+
It also updates to
28+
[CLDR 46](https://cldr.unicode.org/downloads/cldr-46)
29+
([beta blog](https://blog.unicode.org/2024/09/unicode-cldr-46-beta-available-for.html))
30+
locale data with new locales, significant updates to existing locales,
31+
and various additions and corrections.
32+
For example, the CLDR and Unicode default sort orders are now very nearly the same.
33+
34+
Most of the java.time (Temporal) types can now be formatted directly
35+
using the existing ICU4J date/time formatting classes.
2436

2537
There are some new APIs to make ICU easier to use with modern C++ and Java patterns.
38+
Most of the C/C++ APIs added for this purpose are implemented as C++ header-only APIs,
39+
and usable on top of binary stable C APIs, which is a first for ICU.
2640

2741
The Java and C++ technology preview implementations of the (also in [tech preview](https://github.com/unicode-org/message-format-wg?tab=readme-ov-file#messageformat-2-technical-preview)) CLDR MessageFormat 2.0 specification have been updated to match recent changes.
2842

@@ -34,7 +48,7 @@ Please use the [icu-support mailing list](https://icu.unicode.org/contacts) and/
3448

3549
The initial release has library version number 76.1.
3650

37-
* Release date: 2024-10-TODO
51+
* Release date: _planned for_ 2024-10-24
3852
* [List of tickets fixed in ICU 76](https://unicode-org.atlassian.net/issues/?jql=project%20%3D%20ICU%20AND%20status%20%3D%20Done%20AND%20resolution%20in%20%28Fixed%2C%20%22Fixed%20by%20Other%20Ticket%22%29%20AND%20fixVersion%20%3D%2076.1%20ORDER%20BY%20component%20ASC%2C%20created%20DESC)
3953

4054
If there are maintenance releases, they will be 76.2, 76.3, etc. (During ICU 76 development, the library version number was 76.0.x.)
@@ -43,51 +57,168 @@ Note: There may be additional commits on the [maint/maint-76](https://github.com
4357

4458
## Common Changes
4559

46-
* [Unicode 16](https://www.unicode.org/versions/Unicode16.0.0/) (TODO: link to blog):
47-
* TODO
48-
* [CLDR 46](https://github.com/unicode-org/cldr/blob/main/docs/site/downloads/cldr-46.md) (TODO: link to blog):
49-
* TODO: new stuff
50-
* TODO: below is from 45
51-
* MessageFormat 2.0 tech preview being included into LDML.
52-
* Structural “under the hood” work and limited data bug fixes, but no new data collection.
53-
* Some time zones deprecated following IANA TZ database changes.
54-
* TODO: new stuff
55-
* TODO: below is from 75
56-
* New Unicode properties APIs for Identifier_Status and Identifier_Type, defined by UTS \#39 Unicode Security Mechanisms, [General Security Profile for Identifiers](https://www.unicode.org/reports/tr39/#General_Security_Profile). ([ICU-11396](https://unicode-org.atlassian.net/browse/ICU-11396))
57-
* Time zone data (tzdata) version 2024a (2024-jan). Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream [tzdata](https://www.iana.org/time-zones) release since 2021b.
60+
* [Unicode 16](https://www.unicode.org/versions/Unicode16.0.0/)
61+
([blog](https://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html)):
62+
* Adds five modern-use scripts: Garay, Gurung Khema, Kirat Rai, Ol Onal, Sunuwar
63+
* Adds two historic scripts & almost 4000 additional Egyptian Hieroglyphs
64+
* Seven new emoji characters
65+
* Over 700 symbols from legacy computing environments
66+
* ICU line breaking improvements have been upstreamed into
67+
[UAX #14](https://www.unicode.org/reports/tr14/tr14-53.html#Modifications)
68+
* ICU 76 adds support for the new UCD property Modifier_Combining_Mark for
69+
[UAX #53](https://www.unicode.org/reports/tr53/) Arabic Mark Rendering
70+
* ICU 76 also adds support for the UCD property Indic_Conjunct_Break
71+
which was new in Unicode 15.1. ([ICU-22503](https://unicode-org.atlassian.net/browse/ICU-22503))
72+
* [IDNA](https://www.unicode.org/reports/tr46/tr46-33.html#Modifications):
73+
The handling of UseSTD3ASCIIRules was simplified.
74+
Some existing characters changed from disallowed (when that was only for compatibility with
75+
long-obsolete IDNA2003) to valid.
76+
* [CLDR 46](https://github.com/unicode-org/cldr/blob/main/docs/site/downloads/cldr-46.md)
77+
([beta blog](https://blog.unicode.org/2024/09/unicode-cldr-46-beta-available-for.html)):
78+
* Significant data updates across all locales
79+
* Locales which are now at modern coverage level: Nigerian Pidgin, Tigrinya
80+
* Locales which are now at moderate coverage level:
81+
Akan, Baluchi (Latin), Kangri, Tajik, Tatar, Wolof
82+
* New measurement units "night" and "light-speed"
83+
* Note: ICU 76 does not yet support `portion-per-1e9` (aka per-billion). (See [ICU-22781](https://unicode-org.atlassian.net/browse/ICU-22781))
84+
* [MessageFormat 2.0 tech preview updates](https://cldr.unicode.org/downloads/cldr-46#message-format-specification)
85+
* Language matching: Dropped the fallback mapping
86+
desired="uk" → supported="ru"
87+
(so that Ukrainian (uk) doesn’t fall back to Russian (ru))
88+
* [Collation](https://cldr.unicode.org/downloads/cldr-46#collation-data-changes):
89+
Significant changes to the CLDR root collation (CLDR default sort order)
90+
* Realigned With DUCET:
91+
The order of groups of characters which sort below letters is now the same.
92+
In both sort orders, non-decimal-digit numeric characters now sort after decimal digits,
93+
and the CLDR root collation no longer tailors any currency symbols
94+
(making some of them sort like letter sequences, as in the DUCET).
95+
_These changes eliminate sort order differences among almost all
96+
regular characters between the CLDR root collation and the DUCET._
97+
* Improved Han Radical-Stroke Order:
98+
The CLDR radical-stroke order now matches that of the Unicode Radical-Stroke Index;
99+
traditional vs. simplified forms of radicals are now distinguished on a lower level than the number of residual strokes.
100+
In alphabetic indexes for radical-stroke sort orders,
101+
only the traditional forms of radicals are now available as index characters.
102+
* Time zone data (tzdata) version 2024b (2024-sep). Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream [tzdata](https://www.iana.org/time-zones) release since 2021b.
103+
* The Asia/Almaty time zone has become an alias following IANA TZ database changes.
104+
* CLDR added support for deprecated timezone codes by remapping:
105+
CST6CDT → America/Chicago, EST → America/Panama, EST5EDT → America/New_York,
106+
MST7MDT → America/Denver, PST8PDT → America/Los_Angeles
107+
(These IANA TZ changes were motivated by CLDR, see
108+
[CLDR-17111](https://unicode-org.atlassian.net/browse/CLDR-17111))
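
The remapping of deprecated time zone codes listed above can be pictured as a small lookup table. The following is a minimal sketch using only the C++ standard library; `canonicalLegacyZone` and `kLegacyZones` are hypothetical names, not ICU or CLDR APIs, and the table holds only the five mappings named in this list.

```c++
#include <array>
#include <string_view>
#include <utility>

// Hypothetical table (not an ICU API): the deprecated time zone codes named
// in these release notes, paired with the IDs that CLDR 46 remaps them to.
constexpr std::array<std::pair<std::string_view, std::string_view>, 5> kLegacyZones{{
    {"CST6CDT", "America/Chicago"},
    {"EST", "America/Panama"},
    {"EST5EDT", "America/New_York"},
    {"MST7MDT", "America/Denver"},
    {"PST8PDT", "America/Los_Angeles"},
}};

// Returns the remapped ID for one of the listed deprecated codes;
// any other ID is returned unchanged.
inline std::string_view canonicalLegacyZone(std::string_view id) {
    for (const auto &[legacy, canonical] : kLegacyZones) {
        if (id == legacy) {
            return canonical;
        }
    }
    return id;
}
```

In real code, the canonical ID would come from the ICU time zone APIs rather than from a hand-written table like this one.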
58109

59110
## ICU4C Specific Changes
60111

61-
* [API changes since ICU4C 75 (Markdown)](https://github.com/unicode-org/icu/blob/maint/maint-76/icu4c/APIChangeReport.md) / [(HTML)](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-76/icu4c/APIChangeReport.html)
62-
* TODO: new stuff
63-
* TODO: below is from 75
64-
* MessageFormat 2.0 tech preview new API ([ICU-22261](https://unicode-org.atlassian.net/browse/ICU-22261))
65-
* C: Require C11 (up from C99)
66-
* C++: Require C++17 (up from C++11)
67-
* Many changes for more robust string and buffer handling.
112+
* [API changes since ICU4C 75 (Markdown)](https://github.com/unicode-org/icu/blob/maint/maint-76/icu4c/APIChangeReport.md) / [(HTML)](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-76/icu4c/APIChangeReport.html)
113+
* A UnicodeString can now be converted to & from UTF-16 standard string_view types
114+
(std::u16string_view, and on Windows to/from std::wstring_view)
115+
and other UTF-16 types (string literals, standard string classes).
116+
Several other member functions have been widened to accept standard UTF-16 types as well.
117+
([ICU-22843](https://unicode-org.atlassian.net/browse/ICU-22843))
118+
* New APIs for colloquial iteration over the elements of a C++ UnicodeSet or a C USet. ([ICU-22876](https://unicode-org.atlassian.net/browse/ICU-22876))
119+
* For details and an example see the “C++ Header-Only APIs” section of the [Migration Issues](#migration-issues) below.
120+
* New APIs for colloquial use of C++ Collator / C UCollator with
121+
standard C++ algorithms (e.g., sort) & data structures (e.g., map).
122+
([ICU-22879](https://unicode-org.atlassian.net/browse/ICU-22879))
123+
(The UCollator wrappers are also C++ header-only APIs.)
124+
* Note: Some APIs were changed to accept a wider range of input types than before,
125+
but in the API change report they look like the old, stable signatures are removed,
126+
and like the wider signatures are added as “born stable”.
127+
For example, several UnicodeString constructors that take a raw pointer
128+
have been replaced with a signature that accepts such raw pointers but also additional input types.
129+
* Note: Similarly, the API change report appears to show removal+addition of
130+
certain UnicodeString::remove() and UnicodeString::removeBetween() overloads,
131+
but only the _expression_ of one of their default parameter values has changed.
132+
* Many changes for more robust string and memory handling.
68133

69134
## ICU4J Specific Changes
70135

71-
* [API Changes since ICU4J 75](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-76/icu4j/APIChangeReport.html)
72-
* TODO: new stuff
73-
* TODO: below is from 75
74-
* MessageFormat 2.0 tech preview update ([ICU-22690](https://unicode-org.atlassian.net/browse/ICU-22690))
75-
* Performance (multi-threading / lock contention) improvement for BreakIterator.clone() and ULocale.getDefault(). ([ICU-22582](https://unicode-org.atlassian.net/browse/ICU-22582))
136+
* [API Changes since ICU4J 75](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-76/icu4j/APIChangeReport.html)
137+
* Most of the java.time (Temporal) types can now be formatted directly
138+
using the existing ICU4J date/time formatting classes. ([ICU-22853](https://unicode-org.atlassian.net/browse/ICU-22853))
139+
* New APIs for colloquial iteration over the elements of a UnicodeSet.
140+
In addition to the existing ranges(), strings(), and UnicodeSet-is-an-Iterable,
141+
there is a new codePoints() (returns an Iterable),
142+
and new methods that return Streams (e.g., codePointStream() & rangeStream()).
143+
([ICU-22845](https://unicode-org.atlassian.net/browse/ICU-22845))
76144

77145
## Known Issues
78146

79-
* TODO: new stuff
80-
* TODO: below is from 75
81-
* [ICU-22729](https://unicode-org.atlassian.net/browse/ICU-22729) udatpg_getBestPattern requires exact skeleton match in ICU 76
82-
* Due to a combination of an ICU bug fix and issues with CLDR availableFormats data, some skeletons in some languages yield inconsistent data/time formatting patterns.
147+
* None yet
83148

84149
## Migration Issues
85150

86-
* See [CLDR 46 migration issues](https://github.com/unicode-org/cldr/blob/main/docs/site/downloads/cldr-46.md#migration)
87-
* TODO: new stuff
88-
* TODO: below is from 75
89-
* ICU4C behavior for ill-formed locale IDs/language tags: uloc_getName(), uloc_getLanguage() and similar functions (and functions that rely on them) may fail with a U_ILLEGAL_ARGUMENT_ERROR when they used to fail only with a U_BUFFER_OVERFLOW_ERROR. (due to changes for [ICU-22520](https://unicode-org.atlassian.net/browse/ICU-22520))
90-
* On Linux, the configure script now defaults to "cc" rather than preferring "clang". If you want to choose clang, then configure for "Linux/clang". ([ICU-22556](https://unicode-org.atlassian.net/browse/ICU-22556))
151+
### IDNA Default Option Changed to Nontransitional Processing
152+
After all major browsers have switched to nontransitional processing,
153+
Unicode 15.1 (a year ago) changed the [UTS #46 spec](https://www.unicode.org/reports/tr46/#Processing)
154+
to declare transitional processing deprecated.
155+
156+
ICU 76 changes the "DEFAULT" API constants from 0 to UIDNA_NONTRANSITIONAL_TO_ASCII | UIDNA_NONTRANSITIONAL_TO_UNICODE.
157+
158+
ICU 76 does not change the behavior of using options value 0.
159+
(That would change the behavior of existing binaries linking with new ICU libraries.)
160+
However, when code that uses the DEFAULT constant is recompiled against a new version of ICU,
161+
it will pass these option flags into the factory method.
162+
163+
* In C/C++: unicode/uidna.h [UIDNA_DEFAULT](https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uidna_8h.html#a726ca809ffd3d67ab4b8476646f26635aa1eb63014cdaf41c7ea6cf3abecf1169)
164+
* In Java: IDNA.java [DEFAULT](https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/text/IDNA.html#DEFAULT)
165+
166+
See [ICU-22294](https://unicode-org.atlassian.net/browse/ICU-22294)
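
The effect of the changed DEFAULT constant can be illustrated without calling ICU at all. The sketch below uses local stand-in constants: the `kDemo...` names and the helper function are hypothetical, and the bit values are assumed to mirror the corresponding `UIDNA_...` options in unicode/uidna.h. It shows why an existing binary that passes 0 keeps its behavior, while recompiled code that names the DEFAULT constant picks up nontransitional processing.

```c++
#include <cstdint>

// Local stand-ins for illustration only. Real code would include
// unicode/uidna.h and use the UIDNA_... constants instead.
// The bit values are assumed to match that header.
constexpr uint32_t kDemoNontransitionalToAscii = 0x10;
constexpr uint32_t kDemoNontransitionalToUnicode = 0x20;

// Value of the DEFAULT constant as it gets compiled into calling code.
constexpr uint32_t kDemoDefaultBeforeIcu76 = 0;
constexpr uint32_t kDemoDefaultSinceIcu76 =
    kDemoNontransitionalToAscii | kDemoNontransitionalToUnicode;

// Models the library side: the meaning of an options value does not depend
// on which ICU version the caller was compiled against.
constexpr bool usesTransitionalToAscii(uint32_t options) {
    return (options & kDemoNontransitionalToAscii) == 0;
}
```

Under these assumptions, `usesTransitionalToAscii(kDemoDefaultBeforeIcu76)` is true and `usesTransitionalToAscii(kDemoDefaultSinceIcu76)` is false. Code that should keep the old behavior after recompiling would pass the options value 0 explicitly instead of the DEFAULT constant.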
167+
168+
### SimpleNumber::truncateStart() Removed
169+
ICU 75 renamed the still-draft SimpleNumber::truncateStart() to setMaximumIntegerDigits().
170+
ICU 76 removes the never-stable, original function.
171+
Same for the C API usnum_truncateStart().
172+
([ICU-22900](https://unicode-org.atlassian.net/browse/ICU-22900))
173+
174+
### C++ Header-Only APIs
175+
ICU 76 is the first version where we add what we call C++ header-only APIs.
176+
These are especially intended for users who rely on only binary stable DLL/library exports of C APIs
177+
(C++ APIs cannot be binary stable).
178+
179+
_Please test these new APIs and let us know if you find problems —
180+
especially if you find a platform/compiler/options combination
181+
where the call site does end up calling into ICU DLL/library exports._
182+
183+
Remember that regular C++ APIs can be hidden by callers defining `U_SHOW_CPLUSPLUS_API=0`.
184+
The new header-only APIs can be separately enabled via `U_SHOW_CPLUSPLUS_HEADER_API=1`.
185+
186+
([GitHub query for `U_SHOW_CPLUSPLUS_HEADER_API` in public header files](https://github.com/search?q=repo%3Aunicode-org%2Ficu+U_SHOW_CPLUSPLUS_HEADER_API+path%3Aunicode%2F*.h&type=code))
187+
188+
These are C++ definitions that are not exported by the ICU DLLs/libraries,
189+
are thus inlined into the calling code,
190+
and which may call ICU C APIs but not into ICU non-header-only C++ APIs.
191+
192+
The header-only APIs are defined in a nested `header` namespace.
193+
If entry point renaming is turned off (the main namespace is `icu` rather than `icu_76` etc.),
194+
then the new `U_HEADER_ONLY_NAMESPACE` is `icu::header`.
195+
196+
([Link to the API proposal which introduced this concept](https://docs.google.com/document/d/1xERVccTYsptzjfbjcj6HDtoKVF_mEKmslPsOiQzzaFg/view#heading=h.cf4bmhjgozry))
197+
198+
For example, for iterating over the code point ranges in a `USet` (excluding the strings):
199+
200+
```c++
201+
U_NAMESPACE_USE
202+
using U_HEADER_NESTED_NAMESPACE::USetRanges;
203+
LocalUSetPointer uset(uset_openPattern(u"[abcçカ🚴]", -1, &errorCode));
204+
for (auto [start, end] : USetRanges(uset.getAlias())) {
205+
printf("uset.range U+%04lx..U+%04lx\n", (long)start, (long)end);
206+
}
207+
for (auto range : USetRanges(uset.getAlias())) {
208+
for (UChar32 c : range) {
209+
printf("uset.range.c U+%04lx\n", (long)c);
210+
}
211+
}
212+
```
213+
214+
(Implementation note: On most platforms, when compiling ICU itself,
215+
the `U_HEADER_ONLY_NAMESPACE` is `icu::internal`,
216+
so that any such symbols that get exported differ from the ones that calling code sees.
217+
On Windows, where DLL exports are explicit,
218+
the namespace is always the same, but these header-only APIs are not marked for export.)
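
The namespace arrangement described in this section can be sketched in plain C++17. The macro, namespace, and function names below (`DEMO_BUILDING_LIBRARY`, `DEMO_HEADER_ONLY_NAMESPACE`, `demo`, `codePointCount`, `demoRangeLength`) are hypothetical stand-ins, not ICU's actual definitions. The sketch only shows the general technique of giving inline, header-only code a different namespace inside the library build than in calling code.

```c++
// When the library itself is compiled, header-only code lands in
// demo::internal; calling code sees it in demo::header. Any symbols that the
// library happens to export therefore differ from the callers' inlined copies.
#ifdef DEMO_BUILDING_LIBRARY
#   define DEMO_HEADER_ONLY_NAMESPACE demo::internal
#else
#   define DEMO_HEADER_ONLY_NAMESPACE demo::header
#endif

namespace DEMO_HEADER_ONLY_NAMESPACE {

// Header-only helper: defined inline, so it is compiled into the caller
// instead of being imported from a DLL/library export.
inline int codePointCount(int start, int end) {
    return end - start + 1;
}

}  // namespace DEMO_HEADER_ONLY_NAMESPACE

// Calling code names the helper through the macro, so the same source works
// in both build modes.
inline int demoRangeLength() {
    using DEMO_HEADER_ONLY_NAMESPACE::codePointCount;
    return codePointCount(0x61, 0x63);
}
```

With `DEMO_BUILDING_LIBRARY` undefined, the helper is reachable as `demo::header::codePointCount`, analogous to how callers reach ICU's header-only APIs through the nested `header` namespace.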
219+
220+
### Migration Issues Related to CLDR
221+
* See [CLDR 46 migration issues](https://cldr.unicode.org/downloads/cldr-46#migration)
91222
92223
## ICU4C Platform Support
93224
@@ -97,27 +228,30 @@ We routinely test on recent versions of Linux, macOS, and Windows.
97228
98229
We accept patches for other platforms.
99230
231+
For ICU 76, we have received a contribution to make ICU4C work again on z/OS,
232+
using a newer (clang-based) compiler. ([ICU-22714](https://unicode-org.atlassian.net/browse/ICU-22714) [icu/pull/3008](https://github.com/unicode-org/icu/pull/3008) + [ICU-22916](https://unicode-org.atlassian.net/browse/ICU-22916) [icu/pull/3208](https://github.com/unicode-org/icu/pull/3208))
233+
100234
Windows: The minimum supported version is Windows 7. (See [How To Build And Install On Windows](../userguide/icu4c/build.html#how-to-build-and-install-on-windows) for more details.)
101235
102236
## ICU4J Platform Support
103237
104-
ICU4J works on Java 8..17 (at least).
238+
ICU4J works on Java 8..21 (at least).
105239
106240
ICU4J should work on Android API level 21 and later but may require “[library desugaring](https://developer.android.com/studio/write/java8-support#library-desugaring)”.
107241
108242
## Download
109243
110-
Source and binary downloads are available on the git/GitHub tag page: TODO: https://github.com/unicode-org/icu/releases/tag/release-76-1
244+
Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-76-rc
111245
112246
See the [Source Code Setup](../devsetup/source/) page for how to download the ICU file tree directly from GitHub.
113247
114248
ICU locale data was generated from CLDR data equivalent to:
115249
116-
* TODO: fix/update
117-
* https://github.com/unicode-org/cldr/releases/tag/release-46-beta4
118-
* https://github.com/unicode-org/cldr-staging/releases/tag/release-46-beta4
250+
* https://github.com/unicode-org/cldr/releases/tag/release-46-beta3
251+
* https://github.com/unicode-org/cldr-staging/releases/tag/release-46-beta3
119252
120-
TODO: Maven dependency:
253+
[Maven dependency](https://central.sonatype.com/artifact/com.ibm.icu/icu4j):
254+
TODO
121255
```
122256
<dependency>
123257
<groupId>com.ibm.icu</groupId>
