Set-Content/Add-Content/Get-Content use an 8-bit character encoding by default, but the help topics state ASCII; problematic Core default file encoding #3248

@mklement0

Description


This issue has two distinct aspects:

  • discussion of an existing documentation bug
  • discussion of the problematic fixed default file encoding currently (alpha16) chosen for Core.

Steps to reproduce

'ö' | Set-Content -NoNewline -Encoding ASCII tmp.txt 
'ö' | Add-Content -Encoding ASCII -NoNewline tmp.txt 
Get-Content -Encoding ASCII tmp.txt
(Get-Content -Encoding Byte -TotalCount 2 tmp.txt) | % { '0x{0:x}' -f $_ }
'--'
'ö' | Set-Content -NoNewline tmp.txt   # use default encoding
'ö' | Add-Content -NoNewline tmp.txt   # use default encoding
Get-Content tmp.txt                    # use default encoding
(Get-Content -Encoding Byte -TotalCount 2 tmp.txt) | % { '0x{0:x}' -f $_ }

Expected behavior

??
0x3f
0x3f
--
??
0x3f
0x3f

Actual behavior

??
0x3f
0x3f
--
öö
0xf6
0xf6

That is, ASCII encoding turns a non-ASCII character into a literal ? (0x3f).

The fact that Set-Content without an -Encoding argument resulted in ö on reading implies that ASCII encoding wasn't used, and the specific byte value of 0xf6 further implies that a single-byte, extended-ASCII encoding was used.
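
The byte-level reasoning above can be checked independently of the cmdlets by calling the .NET encoding classes directly. The following is only a minimal sketch: ISO-8859-1 (code page 28591) is an assumption, chosen as one example of a single-byte, extended-ASCII encoding; this issue does not establish which encoding the cmdlets actually use. UTF-8 is included purely for comparison.

```powershell
# Build the test character from its code point so the result does not
# depend on how this script file itself is encoded.
$char = [string][char]0xF6   # 'ö' (U+00F6)

# ASCII only covers 0x00-0x7F; .NET substitutes '?' (0x3f) for anything else.
$asciiBytes = [System.Text.Encoding]::ASCII.GetBytes($char)

# ASSUMPTION: ISO-8859-1 stands in for "a single-byte, extended-ASCII encoding".
$latin1Bytes = [System.Text.Encoding]::GetEncoding(28591).GetBytes($char)

# UTF-8, for comparison only: a multi-byte sequence for the same character.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($char)

# Format each byte array the same way the repro steps do.
$asciiHex  = ($asciiBytes  | ForEach-Object { '0x{0:x}' -f $_ }) -join ' '
$latin1Hex = ($latin1Bytes | ForEach-Object { '0x{0:x}' -f $_ }) -join ' '
$utf8Hex   = ($utf8Bytes   | ForEach-Object { '0x{0:x}' -f $_ }) -join ' '

"ASCII:      $asciiHex"    # 0x3f
"ISO-8859-1: $latin1Hex"   # 0xf6
"UTF-8:      $utf8Hex"     # 0xc3 0xb6
```

Only the single-byte encoding yields one 0xf6 byte per ö, which matches the bytes observed in the actual behavior above.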

In contrast, Get-Help Set-Content, Get-Help Add-Content, and Get-Help Get-Content state for parameter -Encoding:

Specifies the file encoding. The default is ASCII.

The help-topic sources (branch live) for the relevant cmdlets can be found here.

Additionally:

  • While these cmdlets accept an encoding identifier Default, as used in other cmdlets, the help only mentions String.

  • Given that the two appear to result in the same encoding - what is their relationship?

  • The description for encoding String in the online help is inadequate:

Uses the encoding type for a string.

Environment data

PowerShell Core v6.0.0-alpha (v6.0.0-alpha.16) on Microsoft Windows 10 Pro (64-bit; v10.0.14393)

Labels

  • Issue-Bug: Issue has been identified as a bug in the product
  • Issue-Discussion: the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif
  • Resolution-Fixed: The issue is fixed.
