Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integers passed to a method expected an array implicitly allocate an array of that length #21563

Closed
5 tasks done
colejohnson66 opened this issue May 1, 2024 · 13 comments
Closed
5 tasks done
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-Won't Fix The issue won't be fixed, possibly due to compatibility reason. WG-Engine core PowerShell engine, interpreter, and runtime

Comments

@colejohnson66
Copy link

colejohnson66 commented May 1, 2024

Prerequisites

Steps to reproduce

If one invokes a method that takes an array, passing a numeric type should fail. Instead, an array of that length is implicitly allocated and passed in.

For this issue, Convert.ToHexString has three overloads: byte[], byte[] plus offset+length, and ReadOnlySpan<byte>. When called with a numeric type (such as a character), PowerShell implicitly casts the numeric value to an array instead of failing.

See: dotnet/runtime#101757

Expected behavior

PS> [Convert]::ToHexString([char]0x200F)
InvalidArgument: Cannot convert value ... to type "System.Byte[]".

Actual behavior

PS> [Convert]::ToHexString([char]0x200F)
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000...

Specifically, 16414 zeros (or 0x200F*2) are printed.

Environment data

PS> $PSVersionTable

Name                           Value
----                           -----
PSVersion                      7.4.2
PSEdition                      Core
GitCommitId                    7.4.2
OS                             Microsoft Windows 10.0.22631
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
@colejohnson66 colejohnson66 added the Needs-Triage The issue is new and needs to be triaged by a work group. label May 1, 2024
@237dmitry
Copy link

The same result:

-join ([byte[]][char]0x200f)

@mklement0
Copy link
Contributor

PowerShell allows passing [char] to [byte]-typed parameters, and therefore also to [byte[]]-typed parameters.

In the case of [byte[]], the unexpected array-allocation behavior occurs, namely if the code point of the character is > 0xFF (255), i.e. doesn't fit into a single [byte].

The bug also surfaces with PowerShell-native commands - even with casts, as @237dmitry as discovered - so the simplest repro with a parameter is:

# OK -> 1, 255
# That is: One byte with value 255
& { param([byte[]] $bytes) $bytes.Length; Write-Host $bytes  } ([char] 255)

# !! BROKEN, due to the character's code point being > 255 (0xFF) 
# !! -> 256(!) bytes with value 0
# !! That is, values >= 256 are mistakenly interpreted as the size of an array of 0 bytes to allocated.
& { param([byte[]] $bytes) $bytes.Length; Write-Host $bytes  } ([char] 256)

By contrast, note that the behavior is correct in the following cases, resulting in a statement-terminating error with message "Value was either too large or too small for an unsigned byte" if [char] code point or number > 255 is passed:

  • with [char] for [byte]-typed parameters (non-arrays), and if you pass an array of [char] values to a [byte[]]-typed parameter

  • with any instance of a numeric type, both in the [byte] and [byte[]] cases

@jhoneill
Copy link

jhoneill commented May 1, 2024

The same result:

-join ([byte[]][char]0x200f)

Well that's plain odd. Granted most assumptions about chars converting to 7 or 8 bits have been broken since unicode arrived, but even so it's madly inconsistent.

> ([char]8364)
€

>$x = ([char]8364)

> [int[]]$x
8364

> ([byte[]]$x).count   #  it's that number of zeros
8364

> ([byte[]]'€').count  # String literal fails 
InvalidArgument: Cannot convert value "€" to type "System.Byte[]".

([byte[]][char]'€').count  # String cast to a char (same as $x) gives that number of zeros
8364

> ( [byte[]]::new(8364) ).count   # which is a bit like new() for a an array of byte
8364

( [byte[]]::new([char]'€') ).Count  # and works with a char 
8364

> ( [byte[]]::new('€') ).Count   # but not with a string literal 
MethodException: Cannot convert argument "0", with value: "€", for ".ctor" to type "System.Int32":


 ([byte[]][char[]]'€')    # Single char gives that number of zeros, but array fails 
InvalidArgument: Cannot convert value "€" to type "System.Byte".

> ([byte[]]8364)    # and single int fails 
InvalidArgument: Cannot convert value "8364" to type "System.Byte[]".
```

@SeeminglyScience
Copy link
Collaborator

One of the conversion paths PowerShell utilizes is trying to pass the conversion subject as the first argument to a constructor of the target type.

So while it's definitely surprising, and one of the more wacky examples, it does fit within PowerShell's conversion logic. It ends up generating something sorta close to this:

[Convert]::HexString([byte[]]::new([int][char]0x200F))

@mklement0
Copy link
Contributor

Interesting, so[char] is allowed to bind to any numeric parameter, not just [byte].

I suspect the ship has sailed, but:

In general:

While supporting implicit [char]-to-[byte] conversion is understandable, I wonder what the benefit is of allowing something like the following, especially given that use of [char] requires a deliberate act in PowerShell:

# -> 8364
& { param([double] $foo) $foo } ([char] 0x20ac)

As for the specific case at hand:

You could argue that an array-typed parameter should never attempt a constructor call like that.
In fact, it is sensibly not attempted when you pass a number directly:

# -> "Cannot convert value "8364" to type "System.Byte".
#      Error: "Value was either too large or too small for an unsigned byte.
& { param([byte[]] $foo) } 0x20ac

In other words: an array-typed parameter that isn't given an argument of the exact type should only ever:

  • Try to convert a scalar value to the target type and then create a single-element array from it.
  • Try to convert a differently typed array element-by-element to create an array of the target type.

An inability to convert the argument / at least one argument element should result in a (statement-terminating) error.

@SeeminglyScience
Copy link
Collaborator

Interesting, so[char] is allowed to bind to any numeric parameter, not just [byte].

It's more just that char > int is a valid conversion path. Just like how string can also convert to int, so this does the same:

[byte[]]'10000'

But direct answer, the conversion path for char > int is a very common way of getting the numeric value for a character. Might actually be my most used conversion interactively. Double is less clear of a case though for sure.

@mklement0
Copy link
Contributor

mklement0 commented May 1, 2024

char > int is a very common way of getting the numeric value for a character.

Good point.

However, I think the point about the inappropriate array-constructor call in the case at hand is still worth addressing.
There's no benefit to the current behavior, and it is only likely to cause confusion.

@SeeminglyScience
Copy link
Collaborator

However, I think the point about the inappropriate array-constructor call in the case at hand is still worth addressing. There's no benefit to the current behavior, and it is only likely to cause confusion.

So to be clear it's a generic conversion path, not just for array types. For instance [IO.FileInfo]'some path' only works because of the "try passing it as an arg to a constructor" conversion path.

I'll definitely grant you that in a vacuum there's no benefit to the specifics of how that conversion plays out, but that's an inevitability with how permissive the system is.

@mklement0
Copy link
Contributor

Understood about the generic conversion path, @SeeminglyScience, but I'm still baffled by the following discrepancy:

# Sensibly FAILS - 0x20ac is only interpreted as [byte] - no fallback to constructor.
[byte[]] 0x20ac

# !! Unexpectedly equivalent to [byte[]]::new(0x20ac)
[byte[]] [char] 0x20ac

Note that the latter seemingly works only with [char], whereas any numeric cast - e.g. [byte[]] [long] 0x20ac - sensibly fails too.

@SeeminglyScience
Copy link
Collaborator

Ahh okay so there's an existing exclusion in that rule for numeric types that gets circumvented when a non-numeric type could be cast as one. That's sort of the issue with hard coding exclusions (or adding to existing exclusions) the rules get really hard to reason about.

I'll open it up to engine to discuss, but either way I think the implementation would make it challenge without just hard coding a list of every type that is convertible to int.

@SeeminglyScience SeeminglyScience added Issue-Enhancement the issue is more of a feature request than a bug WG-Engine core PowerShell engine, interpreter, and runtime labels May 1, 2024
@SeeminglyScience SeeminglyScience added Resolution-Won't Fix The issue won't be fixed, possibly due to compatibility reason. and removed Needs-Triage The issue is new and needs to be triaged by a work group. labels May 13, 2024
@SeeminglyScience
Copy link
Collaborator

The Engine WG discussed this and agree that the behavior is confusing, but decided that adding additional hard coded exclusions here is not worth the extra potential confusion and/or breaking changes.

@SeeminglyScience SeeminglyScience removed their assignment May 13, 2024
Copy link
Contributor

This issue has been marked as won't fix and has not had any activity for 1 day. It has been closed for housekeeping purposes.

Copy link
Contributor

microsoft-github-policy-service bot commented May 14, 2024

📣 Hey @colejohnson66, how did we do? We would love to hear your feedback with the link below! 🗣️

🔗 https://aka.ms/PSRepoFeedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-Won't Fix The issue won't be fixed, possibly due to compatibility reason. WG-Engine core PowerShell engine, interpreter, and runtime
Projects
None yet
Development

No branches or pull requests

5 participants