Make console windows fully UTF-8 by default on Windows, in line with the behavior on Unix-like platforms - character encoding, code page #7233

@mklement0

Description


PowerShell Core now commendably defaults to UTF-8 encoding, including when sending strings to external programs, as reflected in $OutputEncoding's default value.

However, a console window opened from the shortcut file / taskbar entry still defaults to the OEM code page implied by the legacy system locale (e.g. 437 on US-English systems). As a result, PowerShell misinterprets UTF-8 output from external programs; e.g., with Node.js installed:

PSCoreOnWin> $captured = '€' | node -pe "require('fs').readFileSync(0).toString().trim()"; $captured
Γé¼    # !! node's UTF-8 output was misinterpreted.
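
The garbled output above is what one would expect if the three UTF-8 bytes of the euro sign (0xE2 0x82 0xAC) are decoded as code page 437. A minimal sketch that reproduces the effect without Node.js, assuming code page 437 is available to [System.Text.Encoding]::GetEncoding() in the session (the variable names are illustrative only):

```powershell
# Build the euro sign from its code point so the result does not depend
# on how this snippet's own file is encoded.
$euro = [string][char]0x20AC

# Encode as UTF-8: yields the bytes 0xE2 0x82 0xAC.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($euro)

# Decode the same bytes as OEM code page 437, which is what happens when
# [Console]::OutputEncoding still reflects the legacy OEM code page.
$misread = [System.Text.Encoding]::GetEncoding(437).GetString($utf8Bytes)

# Show the raw bytes and the misdecoded string.
($utf8Bytes | ForEach-Object { '0x{0:X2}' -f $_ }) -join ' '
$misread
```

In code page 437, 0xE2 maps to Γ, 0x82 to é, and 0xAC to ¼, which matches the three characters shown in the output above.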

This currently requires the following workaround, which additionally requires the console window to use a TrueType font (the default on Windows 10):

[console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

In Windows PowerShell, prepend $OutputEncoding = to the statement above to make the console fully UTF-8-aware.

The above implicitly switches to the UTF-8 code page (65001), as then reflected in chcp.

This obscure workaround shouldn't be necessary, and I think it would make sense for PowerShell to automatically set [console]::InputEncoding and [console]::OutputEncoding to (BOM-less) UTF-8 on startup.
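
Until PowerShell does something like this itself, the same effect can be approximated per user from a profile script. The following is only a sketch of that idea, not built-in behavior: it assumes the lines are placed in the file that $PROFILE points to, that the session is attached to a console window, and $utf8NoBom is just an illustrative variable name.

```powershell
# Sketch of a profile addition; this is a user-level workaround,
# not something PowerShell sets up on its own.

# Passing $false asks for a UTF8Encoding that does not emit a BOM.
$utf8NoBom = [System.Text.UTF8Encoding]::new($false)

# Make the console decode output from, and encode input to,
# external programs as UTF-8.
[Console]::InputEncoding = [Console]::OutputEncoding = $utf8NoBom

# Only needed in Windows PowerShell, where $OutputEncoding does not
# default to UTF-8; harmless in PowerShell Core.
$OutputEncoding = $utf8NoBom

# Inspect the result; per the description above, the console should now
# be using the UTF-8 code page, which chcp would also reflect.
[Console]::OutputEncoding.CodePage
```

Because a profile runs in every session for that user, the settings apply on startup without changing the system locale, but they do not affect sessions started with -NoProfile.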

Update: When this issue was originally created, there was no mechanism for presetting code page 65001 (UTF-8) system-wide, which necessitated the awkward workaround. In recent versions of Windows 10 it is now possible to switch to code page 65001 as the system locale and therefore system-wide, although as of Windows 10 version 1909 that feature is still in beta - see this SO answer.

  • Caveat: In addition to defaulting the OEM code page to 65001 in all console windows (including cmd.exe windows), this invariably also makes Windows PowerShell's ANSI-encoding-default cmdlets default to UTF-8, notably Get-Content and Set-Content, which can be problematic from a backward-compatibility perspective.
    Additionally, there is a bug - see below.

The change, which can also be made programmatically (see below), requires administrative privileges and a reboot.

Environment data

PowerShell Core 7.1.0-preview.3 on Windows 10

Metadata

    Labels

    PowerShell-Docs needed: The PR was reviewed and a PowerShell Docs update is needed
    Resolution-Won't Fix: The issue won't be fixed, possibly due to compatibility reason.
    WG-Interactive-Console: the console experience
    WG-Reviewed: A Working Group has reviewed this and made a recommendation
