Speculative: --only-binary by default? #9140

Open
pfmoore opened this issue Nov 16, 2020 · 83 comments
Labels
state: awaiting PR Feature discussed, PR is needed state: needs discussion This needs some more discussion type: deprecation Related to deprecation / removal.

Comments

@pfmoore
Member

pfmoore commented Nov 16, 2020

What's the problem this feature will solve?
A lot of users are reporting issues when there's no Python 3.9 binary for projects they need, and pip tries to build from source and fails with an obscure error (because the user doesn't have a compiler, or isn't set up to build the relevant packages).

Describe the solution you'd like
Pip shouldn't try to build from source if the user isn't prepared to deal with build errors. As it's not possible to know the user's level of expertise, we should err on the side of caution, and by default only allow wheels to be installed. Users who know they need to install from source and have checked that they can do so, can explicitly say so using a new --allow-source flag, which acts as an "opt-in" to source builds.

Alternative Solutions
Improve the error messages when a source build fails. This is hard, because the details of what went wrong are entirely the responsibility of the build backend.

Additional context
I don't realistically think this can be added without a lot of disruption, but given that significant numbers of projects ship wheels these days, maybe it isn't as unthinkable as it once was. I do think it's worth discussing the implications, if only as a thought experiment, and I don't know where else we could do that apart from here.

One big problem area is that we can't distinguish between "pure Python" projects that are shipped only as sdists, but which only need Python to build, and complex projects that need a compiler. So restricting to wheels only would require an explicit opt-in for some projects which currently install with no issue.

@dstufft
Member

dstufft commented Nov 16, 2020 via email

@uranusjr
Member

But we can’t know what “all projects” means before deciding whether to set the flag, since dependency information is inside the sdist/wheel 🙃

@pfmoore
Member Author

pfmoore commented Nov 17, 2020

@uranusjr I'm suggesting making --only-binary :all: the default, which doesn't need to know dependency information...

@uranusjr
Member

Oops, my previous response was toward @dstufft’s “intelligent” suggestion. Sorry for the confusion.

To express my thoughts in more words, I think the “only wheel unless some project needs to compile from source” would be very difficult to implement since the two parts in the logic depend on each other. I would much prefer @pfmoore’s original suggestion of having --only-binary :all: unless the user explicitly allows source distributions.

@dstufft
Member

dstufft commented Nov 17, 2020 via email

@pfmoore
Member Author

pfmoore commented Nov 17, 2020

Maybe we simply make --prefer-binary the default (rather than --only-binary)? I didn't suggest that originally because it means that we trigger "why don't I get the latest version?" questions. But maybe that's a less serious breakage?
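
To make the difference between the three behaviours concrete, here is a minimal sketch of how a candidate might be picked under each mode. The `Candidate` type, the `pick` function and the sample data are hypothetical, versions are simplified to integer tuples, and this is not pip's actual finder logic.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    version: tuple   # simplified version, e.g. (1, 0)
    is_wheel: bool   # True for a wheel, False for an sdist


def pick(candidates, mode="default") -> Optional[Candidate]:
    """Pick one candidate under a simplified model of each mode."""
    if mode == "only-binary":
        # Sdists are never considered.
        candidates = [c for c in candidates if c.is_wheel]
        key = lambda c: c.version
    elif mode == "prefer-binary":
        # Any wheel beats any sdist, even a newer sdist.
        key = lambda c: (c.is_wheel, c.version)
    else:
        # Default: the newest version wins; a wheel only breaks ties.
        key = lambda c: (c.version, c.is_wheel)
    return max(candidates, key=key, default=None)


# Hypothetical project: a wheel for 1.0, only an sdist for 2.0.
files = [Candidate((1, 0), True), Candidate((2, 0), False)]
```

With these files the default mode picks the 2.0 sdist, while both other modes pick the 1.0 wheel, which is the "why don't I get the latest version?" case. The two binary modes differ only when no wheel exists at all: prefer-binary falls back to the sdist, only-binary finds nothing.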

@pradyunsg
Member

This makes it so that as soon as you upload a wheel for a given version, you’re effectively signaling that not only should a wheel version be preferable, but that the sdist should only be used if explicitly configured to by the user.

I like it? I think something like 98% of packages on PyPI have wheels in the latest release, so I don't think this is catastrophically bad.

Improve the error messages when a source build fails. This is hard, because the details of what went wrong are entirely the responsibility of the build backend.

IMO one of the improvements we should make here is adding a sentence like: "This failure occurred while trying to generate [a wheel / metadata] for packageName. This is not an error in pip."

This also applies to the proposed approach here too -- clearer error messaging would be good. :)

@uranusjr
Member

uranusjr commented Nov 17, 2020

I like it? I think something like 98% of packages on PyPI have wheels in the latest release, so I don't think this is catastrophically bad.

I suspect the number would be significantly lower if you count percentage of downloads instead. There are a bunch of popular pure-Python projects that don’t bother with wheels because the effect is minimal. django-grappelli is one of my favourite examples: it’s popular, well-maintained, regularly released, and has very spotty wheel support. --prefer-binary by default would break a lot of Django setups out there.

@pfmoore
Member Author

pfmoore commented Nov 17, 2020

I think something like 98% of packages on PyPI have wheels in the latest release

I'm pretty sure that's a figure I gave you, and I found the bug in my calculation a bit later 🙁 I need to re-do the sums, but I think it's a lot lower than that, unfortunately.

I suspect the number would be significantly lower if you count percentage of downloads instead.

The number's a lot lower without doing the sums incorrectly 🙂 Sorry about that. I don't have download information, but I'm re-doing the numbers right now, and I'll see what things look like if you factor in "uploaded a file in the last 12 months" as well.

I might try getting download numbers from the BigQuery data for offline analysis. Downloads per project, per year (month?) might be sufficiently interesting, if I can work out how to get that relatively easily in a CSV format or similar.

To confirm, my query has just completed. Comparing "number of projects that distribute sdists but no wheels for their latest version", vs "number of projects that distribute wheels for their latest version", the numbers are almost identical (124508 vs 124782). Looking at projects which have released at least one file in the last year, the values are 32890 and 66635.

So half of all projects, 2/3 of projects active in the last year, have wheels.
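
For reference, the proportions behind "half" and "2/3" can be worked out from the four counts above. This is just the arithmetic, assuming the two counts in each pair are disjoint and that projects with neither an sdist nor a wheel are ignored; `wheel_share` is a name made up for this sketch.

```python
def wheel_share(sdist_only: int, with_wheels: int) -> float:
    """Fraction of projects whose latest version has at least one wheel."""
    return with_wheels / (sdist_only + with_wheels)


# Counts quoted in the comment above.
all_projects = wheel_share(124508, 124782)
active_last_year = wheel_share(32890, 66635)

print(f"all projects: {all_projects:.1%}")
print(f"active in the last year: {active_last_year:.1%}")
```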

As I say, I think that however we did this, it would result in a lot of breakage.

@pradyunsg pradyunsg added state: needs discussion This needs some more discussion type: deprecation Related to deprecation / removal. labels Nov 17, 2020
@dstufft
Member

dstufft commented Nov 17, 2020

It's a backwards incompatible change, so regardless it's going to break someone. The goal behind my proposal is to limit the blast radius, so that we limit the breakage, either to specific projects, or to specific versions within a project.

I think there are two questions here too:

  • What do we want the long-term position to be? Are we happy saying that eventually a project that has never shipped, and will never ship, wheels requires an opt-in on the CLI to install?
  • Given the answer to the first, what stepping stones can we use to get there? Are there any, or do we need a big bang migration?

I'm not sure about the long term "right" answer. I can see an argument that we want to encourage wheels where possible, but I also think that there are some projects that simply cannot be shipped as wheels, and maybe will never be able to be shipped as wheels. We need to figure out if going wheel only by default will end up being worth it, or if we will push too many projects out of viability.

For the second one, I think having the default be to filter out sdists, for any project version that has any wheels uploaded, solves the main driver of this proposal, without breaking projects that are not shipping wheels (or used to ship wheels, but found out that was problematic). That could be useful as a stepping stone for getting to a wheel-only default (for instance, we could then provide a warning when installing from sdist), or it could be a reasonable end state that solves the surprising accidental sdist install, without dropping support for sdist-only projects by default.
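
The per-version filter described above can be sketched as follows. The `(version, kind)` pairs and the `filter_files` name are hypothetical and do not reflect pip's internal data model; note that the rule deliberately ignores whether any of the wheels is compatible with the user's platform.

```python
def filter_files(files):
    """Drop sdists for any version that also has at least one wheel.

    `files` is a list of (version, kind) pairs, kind being "wheel" or "sdist".
    """
    versions_with_wheels = {version for version, kind in files if kind == "wheel"}
    return [
        (version, kind)
        for version, kind in files
        if kind == "wheel" or version not in versions_with_wheels
    ]


# Hypothetical project: 1.0 ships both formats, 2.0 ships only an sdist.
files = [("1.0", "sdist"), ("1.0", "wheel"), ("2.0", "sdist")]
remaining = filter_files(files)
```

Here the 1.0 sdist is filtered out because 1.0 has a wheel, while the 2.0 sdist stays available, so sdist-only releases keep working.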

@pfmoore
Member Author

pfmoore commented Nov 17, 2020

I think we could do a lot better if we could somehow identify which projects are "hard" to build from source. I feel like blocking sdists that build into universal wheels is going a bit far. In the most general sense, that's basically impossible, but maybe we could add metadata somewhere (in the simple index?) to mark "pure Python" projects?

I agree it's not clear what the best long term answer is. We're seeing a lot more people using Python nowadays who honestly don't want to, or know how to, deal with building stuff from source. For those people, pip downloading a sdist that needs a compiler to build is almost certainly just a source of problems. But they are also precisely the sorts of user who won't know enough to add --prefer-binary. However, optimising for such users is going to impact a big chunk of our "traditional" user base negatively.

@dstufft
Member

dstufft commented Nov 17, 2020

I wonder if we can leverage PyPI in some way to encourage wheels, or to at least surface better information to highlight which projects don't ship wheels? This might be a better question for discourse? I dunno.

@pfmoore
Member Author

pfmoore commented Nov 17, 2020

I've got a big chunk of downloaded data from PyPI that I am querying to get a better feel for this sort of stuff. The biggest problem is the vast amount of (to be polite) "limited value projects" on there - without some form of insight, it's hard to know for sure whether it's OK to ignore a project called "0html" or "django-3-jet-zupit" - especially when it comes up in the same query as "090807040506030201testpip"...

@uranusjr
Member

uranusjr commented Nov 18, 2020

What if PyPI automatically builds the simplest pure Python wheels? There’s recent interest in detecting malicious source distributions on PyPI, and the wheels produced as a side effect of that could be reused.

@mattip
Contributor

mattip commented Jun 9, 2021

Any more thoughts here? I especially like the idea

... having the default be to filter out sdists, for any project version that has any wheels uploaded, solves the main driver to this proposal, without breaking projects that are not shipping wheels

The metadata option also seems reasonable, then the scientific python community could mark NumPy, Scipy, tensorflow, pytorch as "prefer binary by default" and save a lot of CI and cloud resources.

@uranusjr
Member

I like the idea as well, maybe with a twist: Versions with only sdist are excluded, unless there are no wheels available at all prior to that version.

Using django-grappelli as an example, this means that

  1. Wheels are selected for 2.15.1, 2.14.4, 2.14.3, and 2.14.2.
  2. Sdists between 2.14.1 and 2.11.2 are all ignored since there are older wheels.
  3. Wheels from 2.11.1, 2.10.2, 2.10.1, 2.9.1, and 2.8.3 can be selected. Sdists between 2.11.1 and 2.8.3 are all ignored.
  4. Sdists from 2.8.2 downwards are allowed, since there are no wheels available past that version.
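
One way to read the rule in the list above is sketched below: walking versions from oldest to newest, an sdist is only allowed while no wheel has been seen yet. The `allowed_files` name, the version tuples and the sample data are hypothetical; this is not the real django-grappelli file list.

```python
def allowed_files(files):
    """Apply the rule to `files`, a dict mapping version tuple -> set of kinds."""
    allowed = []
    wheel_seen = False  # has this or any older version shipped a wheel?
    for version in sorted(files):
        if "wheel" in files[version]:
            wheel_seen = True
            allowed.append((version, "wheel"))
        elif not wheel_seen:
            # Sdist-only release, and no older release ever shipped a wheel.
            allowed.append((version, "sdist"))
        # Otherwise: sdist-only release newer than some wheel, so it is ignored.
    return allowed


# Hypothetical release history, oldest first.
history = {
    (1, 0): {"sdist"},
    (1, 1): {"sdist", "wheel"},
    (1, 2): {"sdist"},
    (1, 3): {"sdist", "wheel"},
}
selected = allowed_files(history)
```

In this example 1.0 stays installable from its sdist, 1.1 and 1.3 are installable from wheels, and the sdist-only 1.2 is skipped.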

@mattip
Contributor

mattip commented Jul 12, 2021

For another data point here is an issue filed by a python3.5 user of cffi where they cannot build with the sdist, and changing the default would have helped them.

@mattip
Contributor

mattip commented Jul 15, 2021

Please edit the title binary-only -> only-binary. I always have to check pip --help to figure out the correct spelling.

@pradyunsg
Member

FWIW, that tells me that we should add an alias for that option.

@uranusjr uranusjr changed the title Speculative: --binary-only by default? Speculative: --only-binary by default? Jul 15, 2021
@rgommers

rgommers commented Jul 16, 2021

+1 for a solution via either package metadata or via a simple rule like "--only-binary :all: is applied if a package has any wheels".

Otherwise it has the risk of becoming a pip-only solution which is hard to understand. Today the problems mostly surface via pip because it's by far the most popular installer, but this is really a PyPI-ecosystem problem where the dual model of offering both source and binary packages and allowing freely mixing those is the root cause.

Sdists from 2.8.2 downwards are allowed, since there are no wheels available past that version.

This does not seem like a good idea. Not only is it harder to understand, it also partially defeats the purpose here. If a package has a very old source-only release (e.g., from the pre-wheels era) then that will be found the moment there's no suitable wheel for a user.

In your particular example, django-grappelli 2.8.2 is from 2016; a user who types pip install django-grappelli almost certainly does not want a version that old.

@uranusjr
Member

uranusjr commented Jul 23, 2021

Makes sense. I think it's quite difficult to gauge the actual impact here, since people here all care a lot about Python packaging (for obvious reasons) and likely push for wheels in the projects we are involved in. So I feel the only way forward is to actually try to implement this (maybe as a --use-feature first) and see if we can make it work in real-life usage.

There are probably still some implementation details we need to sort out. Should we go with --prefer-binary or --only-binary by default? How does a user disable this and prefer an sdist with newer version? etc. But I'm going to mark this as "awaiting PR" so anyone can try to come up with something. It's easier to put things into perspective when there is an implementation and test cases to object to 😛.

@tacaswell

I would propose an alternate path forward. Rather than changing the default behavior of pip to prefer wheels, add a second CLI entry point of pipw (pipb?) which is an alias with the default of --prefer-binary / --only-binary (and maybe rejects any attempt to change source-only installs from pypi and local source installs?). I think adding a 'w' is a much easier mnemonic to remember than the right flag(s).

As has been mentioned above, pip currently mixes two different things (building and installing from source, and installing from pre-built binaries) and I think it is a mistake to tilt pip even more in favor of being a binary-only package manager. By adding a new CLI entry point it is possible to make whatever changes are needed to make pip behave like a binary package manager without having to worry about breaking existing users.

I think another issue here is a disagreement as to what exactly wheels are for. I have always considered (and I may be the only one to hold this position) the sdist the canonical source of truth for what the released version of the package is on pypi, with the wheels provided for the convenience of the user (the linux wheel spec is "manylinux" which suggests it is a best-effort rather than authoritative artifact!). I think making pip more like a binary package manager by default will only reinforce the expectation that projects will (promptly) provide a wheel for your platform / Python version / Python implementation and that one not existing is a "bug".

There was a discussion on the numpy mailing list about the ever-expanding number of platforms that projects are expected to provide wheels for becoming unsustainable (the latest beta release of Matplotlib has 21 wheels and we are not yet covering the full Python version/Python implementation/arch/OS matrix https://pypi.org/manage/project/matplotlib/release/3.5.0b1/). If pip is going to keep going down the path of binary packaging, I think there needs to be more discussion about how filling out the build matrix can be lifted from the projects to some centralized build service, like Homebrew, conda-forge, and the Linux distributions do already. Separating the wheels into their own channel/management chain would also make it much easier to manage things like updating version pinning on the wheels post-facto (e.g. putting an upper bound on something or banning known-bad version combinations), re-building with updated versions of non-Python dependencies (xref h5py/h5py#1942), or dealing with CVEs.

@mattip
Contributor

mattip commented Nov 24, 2021

How can we make the abstract discussion here more concrete? I see a couple of subjects being mixed together

| topic | possible mitigation |
|---|---|
| aliasing only-binary and binary-only | PR to implement, should be the least controversial change suggested here |
| providing a path for naive users to prefer wheels over sdists by making only-binary the default, making prefer-binary the default, or providing a different cli entry point | competing PRs to do these would provide a forum for discussion over the name and/or need for this |
| preferences when using --prefer-binary when sdists are available for newer versions and wheels available for older ones | ??? |
| wider ranging changes in the way wheels are built and distributed for the growing Nd matrix of python-versions/implementations/os-versions/machine-architectures/available-hardware | ??? - mailing list/discourse? |

I apologize if I missed some of the topics here, please feel free to add to the table. The next question is who will do the work ...

@uranusjr
Member

uranusjr commented Nov 24, 2021

I’m dropping a link to the RFC proposing to disable install scripts by default for NPM, which would have roughly the same effect as making --only-binary the default (not --prefer-binary). npm/rfcs#488

@pfmoore
Member Author

pfmoore commented Aug 25, 2022

Previously @rgommers was looking into getting funding for this. Is that still in progress, or did it end up not getting anywhere? Regardless, I see no problem with having two attempts to get funding under way 🙂

@pradyunsg
Member

pradyunsg commented Dec 16, 2022

FWIW, I just realised that we have a clear migratory mechanism, for allowing people to build wheels intentionally: pip wheel.

With that, the migration in broad strokes would look something like:

  • "I want to install project and its dependencies, building wheels for packages that don't have them and install them"
    • now:
      pip install project
      
    • later (idk what we call the option, or reuse an existing one):
      pip install project --allow-builds-from-source :all:
      
    • both:
      pip wheel project -w wheelhouse/
      pip install project --no-index --find-links wheelhouse/
      
  • "I want to install project and its dependencies, using only available wheels"
    • now:
      pip install project --only-binary :all:
      
    • later:
      pip install project
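
For scripted use, the "both" workflow above (build wheels into a wheelhouse, then install only from it) could be assembled as below. This is a minimal sketch: the `two_step_install` helper is hypothetical, it only builds the argument lists and does not run anything, and it uses only the flags already shown in the list.

```python
import sys


def two_step_install(project, wheelhouse="wheelhouse/"):
    """Return the `pip wheel` and `pip install` invocations as argument lists."""
    pip = [sys.executable, "-m", "pip"]
    # Step 1: build (or download) wheels for the project and its dependencies.
    build = pip + ["wheel", project, "-w", wheelhouse]
    # Step 2: install using only the local wheelhouse, never the index.
    install = pip + ["install", project, "--no-index", "--find-links", wheelhouse]
    return build, install


build_cmd, install_cmd = two_step_install("project")
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order; keeping the two steps separate makes the source build an explicit, intentional action.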
      

@rgommers

Previously @rgommers was looking into getting funding for this. Is that still in progress, or did it end up not getting anywhere?

I see that I failed to reply to this in August, apologies. Thanks for asking @pfmoore. This topic is still on my radar and of high interest. Regarding funding: I did not manage to get it externally funded, however I did/do plan on self-funding it from my team's budget (assuming the plan I outlined seemed reasonable, and there's good confidence we can execute). A response on that effort and budget estimate would still be great (I'll ping everyone).

This year we did invest a significant amount of effort compared to the year before on packaging topics. This particular one took a back seat to some other ones that were higher-effort than expected, in particular:

  • Building and maturing meson-python (building a build backend really should not be that painful!)
  • Dealing with the fallout of the removal of distutils in the PyData Stack

I'm really looking forward to those two things being sorted out completely (I think by Q2 2023).

Regardless, I see no problem with having two attempts to get funding under way

Agreed.

@rgommers

With that, the migration in broad strokes would look something like:

That seems very reasonable to me.

pip install project --only-binary :all:

This is very unintuitive UX by the way, I've been misled by it multiple times. The :all: should not be needed, --only-binary should mean "give me only binaries for all projects".

@pradyunsg
Member

pradyunsg commented Dec 19, 2022

It's mostly there to let you do --only-binary numpy or equivalent. I reckon it's reasonable to add alternatively named flags that have a different UX tho. only-binary is a bad name anyway. :)
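
As a rough illustration of why the option takes a value: it accepts `:all:`, `:none:`, or comma-separated project names, and repeated uses accumulate. The sketch below is a simplified model of that behaviour with hypothetical helper names; it ignores the interaction with --no-binary and does only naive name normalisation, so it is not pip's real implementation.

```python
def only_binary_set(values):
    """Fold repeated --only-binary values into a set of names (simplified)."""
    selected = set()
    for value in values:
        for name in value.split(","):
            if name == ":all:":
                selected = {":all:"}   # every project is binary-only
            elif name == ":none:":
                selected = set()       # reset whatever was selected so far
            else:
                selected.add(name.lower())
    return selected


def sdist_allowed(project, selected):
    """True if `project` may still be installed from an sdist."""
    return ":all:" not in selected and project.lower() not in selected


numpy_only = only_binary_set(["numpy"])
everything = only_binary_set([":all:"])
```

So `--only-binary numpy` restricts only numpy to wheels, while `--only-binary :all:` restricts every project.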

@pradyunsg
Member

pradyunsg commented Dec 19, 2022

FWIW, I think we got a decent amount forward with #10795 on the error messaging front (that was mentioned+discussed as an alternative to doing this).

I do want to eventually set up something akin to https://sphinx-theme-builder.readthedocs.io/en/latest/errors/ within pip's documentation; but we're talking longer-term goals for #10421. :)

@jeanas
Contributor

jeanas commented Jun 11, 2023

As an outsider lurker, I think making --only-binary :all: the default is going to cause a lot of pain while the goal could be achieved with much less pain.

When installing from sdists, pip could print a big fat warning like

WARNING: The project authors of 'xyz' do not provide pre-built distributions (wheels). pip will attempt to build the package from source. This may just be because the authors forgot wheels, but it could also involve a complex compilation process and require further setup on your machine.

or

WARNING: The project authors of 'xyz' provide pre-built distributions (wheels), but not for your platform. You are running macOS aarch64 (Apple Silicon), but only these platforms are supported: Linux x86_64, macOS x86_64 (Intel), Windows x86_64. pip will attempt to build the package from source. This will likely involve a complex compilation process and may require further setup on your machine.

with the wording subject to reflection of course, but you get the idea.
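
A minimal sketch of how the two warnings above could be chosen from the wheel filenames available for a release. It relies on the wheel filename convention, where the last dash-separated field is the platform tag; the function names are hypothetical, and the exact-match platform check is a deliberate simplification of real compatibility rules (manylinux and macOS versions are more involved).

```python
def platform_tags(wheel_filename):
    """Return the platform tags encoded in a wheel filename."""
    stem = wheel_filename[: -len(".whl")]
    # The last dash-separated field is the platform tag; it may hold
    # several dot-separated values.
    return set(stem.split("-")[-1].split("."))


def sdist_warning(project, wheel_filenames, current_platform):
    """Return a warning string, or None if some wheel could cover the platform."""
    if not wheel_filenames:
        return (
            f"WARNING: The authors of '{project}' do not provide pre-built "
            "distributions (wheels). pip will attempt to build the package "
            "from source."
        )
    available = set()
    for name in wheel_filenames:
        available |= platform_tags(name)
    if "any" in available or current_platform in available:
        return None
    return (
        f"WARNING: The authors of '{project}' provide wheels, but not for "
        f"your platform ({current_platform}). Available: "
        f"{', '.join(sorted(available))}. pip will attempt to build the "
        "package from source."
    )


# Hypothetical release with a single Windows wheel, installed on Apple Silicon.
message = sdist_warning("xyz", ["xyz-1.0-cp311-cp311-win_amd64.whl"], "macosx_11_0_arm64")
```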

@jeanas
Contributor

jeanas commented Jun 11, 2023

Some more thoughts.

I think pip should try to be much more informative in its error messages than it is currently. Granted, building the package is under the sole responsibility of the build backend. However, 95% of problematic cases should be covered by a few simple heuristics.

  1. Package has wheels, but not for the current Python version.

Current message:

$ python3.12 -m venv venv
$ source venv/bin/activate
(venv) ~/tmp $ pip install pyqt5
Collecting pyqt5
  Using cached PyQt5-5.15.9-cp37-abi3-manylinux_2_17_x86_64.whl (8.4 MB)
Collecting PyQt5-sip<13,>=12.11 (from pyqt5)
  Downloading PyQt5_sip-12.12.1.tar.gz (122 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 122.9/122.9 kB 687.2 kB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting PyQt5-Qt5>=5.15.2 (from pyqt5)
  Using cached PyQt5_Qt5-5.15.2-py3-none-manylinux2014_x86_64.whl (59.9 MB)
Building wheels for collected packages: PyQt5-sip
  Building wheel for PyQt5-sip (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for PyQt5-sip (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      running bdist_wheel
      running build
      running build_ext
      building 'PyQt5.sip' extension
      creating build
      creating build/temp.linux-x86_64-cpython-312
      gcc -fno-strict-overflow -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/home/jean/tmp/venv/include -I/usr/include/python3.12 -c apiversions.c -o build/temp.linux-x86_64-cpython-312/apiversions.o
      apiversions.c:21:10: fatal error: Python.h: No such file or directory
         21 | #include <Python.h>
            |          ^~~~~~~~~~
      compilation terminated.
      error: command '/usr/lib64/ccache/gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for PyQt5-sip
Failed to build PyQt5-sip
ERROR: Could not build wheels for PyQt5-sip, which is required to install pyproject.toml-based projects

Wished warning:

WARNING: The xyz-foobar package (version x.y.z) does not provide pre-built distributions for Python 3.12 (only for Python 3.7, 3.8, 3.9, 3.10 and 3.11). pip will attempt to compile the package from source. This might be an involved process and require special setup on your machine. Consider using a compatible Python version instead.

Because if there are wheels specific to some Python versions but no cross-Python (any or abi3) wheels, it almost certainly means the package has some C/C++/Rust extensions.
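
The heuristic in the previous paragraph could be sketched like this. It treats a wheel as cross-version when its ABI tag is `none` (which covers `py3-none-any`) or `abi3`; that reading of "any or abi3", and the `likely_has_extensions` name, are assumptions made for this sketch.

```python
def likely_has_extensions(wheel_filenames):
    """True if wheels exist but every one is tied to a specific Python ABI."""
    if not wheel_filenames:
        return False  # no wheels at all: nothing can be inferred
    for name in wheel_filenames:
        # Wheel filenames end in -{python tag}-{abi tag}-{platform tag}.whl
        python_tag, abi_tag, platform_tag = name[: -len(".whl")].split("-")[-3:]
        if set(abi_tag.split(".")) & {"none", "abi3"}:
            return False  # a cross-version wheel exists
    return True


# Hypothetical project with one wheel per CPython version.
per_version = [
    "xyz_foobar-1.0-cp310-cp310-manylinux_2_17_x86_64.whl",
    "xyz_foobar-1.0-cp311-cp311-manylinux_2_17_x86_64.whl",
]
suspect = likely_has_extensions(per_version)
```

When this returns True and no wheel matches the running interpreter, pip could show the warning above before starting the build.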

  2. Package has wheels, but not for the current platform.

Current message:

~/tmp $ rm -rf venv/
~/tmp $ python -m venv venv
~/tmp $ source venv/bin/activate
(venv) ~/tmp $ pip install windows-fonts
Collecting windows-fonts
  Using cached windows_fonts-1.0.0.tar.gz (22 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: windows-fonts
  Building wheel for windows-fonts (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for windows-fonts (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [55 lines of output]
      Running `maturin pep517 build-wheel -i /home/jean/tmp/venv/bin/python --compatibility off`
         Compiling target-lexicon v0.12.5
         Compiling proc-macro2 v1.0.50
         Compiling quote v1.0.23
         Compiling unicode-ident v1.0.6
         Compiling syn v1.0.107
         Compiling autocfg v1.1.0
         Compiling once_cell v1.17.0
         Compiling libc v0.2.139
         Compiling siphasher v0.3.10
         Compiling parking_lot_core v0.9.6
         Compiling rand_core v0.6.4
         Compiling lock_api v0.4.9
         Compiling rand v0.8.5
         Compiling memoffset v0.6.5
         Compiling phf_shared v0.11.1
         Compiling anyhow v1.0.68
         Compiling scopeguard v1.1.0
         Compiling cfg-if v1.0.0
         Compiling smallvec v1.10.0
         Compiling pyo3-build-config v0.17.3
         Compiling thiserror v1.0.38
         Compiling phf_generator v0.11.1
         Compiling parking_lot v0.12.1
         Compiling indoc v1.0.8
         Compiling unindent v0.1.11
         Compiling windows v0.42.0
         Compiling pyo3-ffi v0.17.3
         Compiling pyo3 v0.17.3
         Compiling pyo3-macros-backend v0.17.3
         Compiling phf_macros v0.11.1
         Compiling thiserror-impl v1.0.38
         Compiling phf v0.11.1
         Compiling pyo3-macros v0.17.3
         Compiling windows-fonts v1.0.0 (/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e)
      error: could not compile `windows-fonts` due to 2 previous errors
      💥 maturin failed
        Caused by: Failed to build a native library through cargo
        Caused by: Cargo build finished with "exit status: 101": `"cargo" "rustc" "--release" "--message-format" "json" "--lib" "--crate-type" "cdylib"`
      📦 Including license file "LICENSE"
      🍹 Building a mixed python/rust project
      🔗 Found pyo3 bindings
      🐍 Found CPython 3.11 at /home/jean/tmp/venv/bin/python
      error: linking with `cc` failed: exit status: 1
        |
        = note: LC_ALL="C" PATH="/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/tmp/pip-build-env-8mjdw3q8/overlay/bin:/tmp/pip-build-env-8mjdw3q8/normal/bin:/home/jean/tmp/venv/bin:/home/jean/perl5/bin:/home/jean/.opam/default/bin:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/jean/py-venvs/miniconda/condabin:/home/jean/perl5/bin:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/jean/.local/bin:/home/jean/bin:/usr/lib64/ccache:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/home/jean/repos/proost/target/debug/:/home/jean/.cargo/bin:/home/jean/repos/proost/target/debug/:/home/jean/.cargo/bin" VSLANG="1033" "cc" "-Wl,--version-script=/tmp/rustc4P5er5/list" "-m64" "/tmp/rustc4P5er5/symbols.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.0.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.1.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.10.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.11.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.12.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.13.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.14.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.15.rcgu.o" 
"/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.2.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.3.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.4.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.5.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.6.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.7.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.8.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts._windows_fonts.7b335fdd-cgu.9.rcgu.o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/_windows_fonts.2ifxub03uuevdkou.rcgu.o" "-Wl,--as-needed" "-L" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps" "-L" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libthiserror-8c071bdd02eb0cfb.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libwindows-2f658c1d990c6f11.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libpyo3-3590b1bc6bf7a4fd.rlib" 
"/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libmemoffset-88aa671ee19aa9ff.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libparking_lot-db0e5065b6df6ad0.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libparking_lot_core-82ba8ff295734f59.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libcfg_if-1cc625933ee68d00.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libsmallvec-7f0cf18c1d8c92e6.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/liblock_api-c3a218c1b1d3e820.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libscopeguard-e96c7d1ad7d765e2.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libpyo3_ffi-2ecf7c405f46bbea.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/liblibc-0fced85994d17419.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libunindent-653dbee696199cb4.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libphf-f14164909cc655fd.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libphf_shared-58f47f9557714ca7.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libsiphasher-952ef5db61634e71.rlib" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/libanyhow-383708ca089e1a38.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-89bc084783fdc439.rlib" 
"/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-8bee4b287d4367c1.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-d61707aed80694c0.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-d85366256f22345b.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-96069b86b8a8cae9.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-d19d53abf68dfa6c.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-787cbccd19d64ac6.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-b3837a36b830e0d0.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-e3deb0e7e3f04966.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-dabbb79c9815def4.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-305b01f34c9409f2.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-f833521df6074e73.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-9ac333113350d171.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-1c126114322d0eee.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-bad9164fdeeecf92.rlib" 
"/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-f9374b1e480fa681.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-207f06e41d9603af.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-7e2768e66e984e85.rlib" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-957b4aff41f8cd46.rlib" "-Wl,-Bdynamic" "-ld2d1" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/home/jean/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/tmp/pip-install-zo9pshjp/windows-fonts_0362efbc1147467eb7b2605b04a6980e/target/release/deps/lib_windows_fonts.so" "-Wl,--gc-sections" "-shared" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-nodefaultlibs"
        = note: /usr/bin/ld: cannot find -ld2d1: No such file or directory
                collect2: error: ld returned 1 exit status
      
      
      
      error: aborting due to previous error
      
      
      Error: command ['maturin', 'pep517', 'build-wheel', '-i', '/home/jean/tmp/venv/bin/python', '--compatibility', 'off'] returned non-zero exit status 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for windows-fonts
Failed to build windows-fonts
ERROR: Could not build wheels for windows-fonts, which is required to install pyproject.toml-based projects

[notice] A new release of pip available: 22.3.1 -> 23.1.2
[notice] To update, run: pip install --upgrade pip

Desired warning:

WARNING: The xyz-foobar package (version x.y.z) does not provide pre-built packages for your platform. You are running Linux x86_64, but it only has pre-built packages for Windows x86_64. pip will attempt to compile the package from source. This might be an involved process and require special setup on your machine.
  1. Package has no wheels at all, and build backend is in a short list of build backends typically used to compile C/C++/Rust extensions: meson-python, scikit-build, maturin, sip, etc., or build backend is setuptools and sdist contains at least one .c or .cpp or .cc file.
WARNING: The xyz-foobar package (version x.y.z) seems to contain non-Python code that must be compiled. No pre-built binaries were provided by the package authors, so pip will try to build it from source. This might be an involved process and require special setup on your machine. Reach out to the xyz-foobar authors in case of problems.
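A rough sketch of the detection rule described in this second case, in Python. The backend module names and the suffix list are illustrative assumptions, not something pip implements today, and the sketch says nothing about where an installer would get these inputs from.

```python
from typing import Iterable

# Assumption: backend module names as they would appear in the
# `build-backend` key of pyproject.toml. The list is illustrative only.
COMPILING_BACKENDS = {
    "mesonpy",                  # meson-python
    "scikit_build_core.build",  # scikit-build-core
    "maturin",
    "sipbuild.api",             # sip
}

# File suffixes named in the comment above as a sign of C/C++ sources.
C_SUFFIXES = (".c", ".cc", ".cpp")


def likely_needs_compilation(build_backend: str, sdist_files: Iterable[str]) -> bool:
    """Guess whether building this sdist needs a compiler toolchain."""
    # Backends that are typically used for compiled extensions.
    if build_backend in COMPILING_BACKENDS:
        return True
    # setuptools is used for both pure-Python and compiled projects,
    # so fall back to looking for C/C++ sources in the sdist.
    if build_backend.startswith("setuptools"):
        return any(name.lower().endswith(C_SUFFIXES) for name in sdist_files)
    return False
```

With these assumptions, `likely_needs_compilation("maturin", [])` is true, while a setuptools project with only `.py` files is treated as pure Python.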

Overall, I have to say that pip's current error messages are sometimes quite uninformative; I would recommend improving them first and seeing if the current problem of zillions of people reporting that pip doesn't work persists. Yes, it's true that many people won't read them — but many will, too.

@pfmoore
Member Author

pfmoore commented Jun 11, 2023

Overall, I have to say that pip's current error messages are sometimes quite uninformative; I would recommend improving them first and seeing if the current problem of zillions of people reporting that pip doesn't work persists.

I don't disagree with you, and it would be nice if this could be done. But I hope you don't think that we haven't been trying to improve things here. If anyone has any good ideas on how to improve things, we'd love to work with them on this. But the important point is thinking about how improved messages can be implemented - it's often easy to think "it would be nice if pip could tell me XYZ" but when you look at the code, it becomes impossible to even see how pip can ever know that XYZ is the case.

To give a specific example, I have no idea how we could usefully deliver the warnings you suggest. Information on what wheels are available is only available in the finder, and when the finder is called, we have no assurance that we'll ever do a build of that package. Furthermore, if we do select a source-only candidate, there's still no certainty that we'll do a build - we'll call the backend hook to prepare the metadata as part of the resolution process, but the backend might very well not do a build at that point, it might be able to calculate metadata without needing to do a build (for example, setuptools has the egg_info subcommand for this). And the metadata might cause pip to discard the candidate without ever building. Conversely, the backend is completely within its rights to do a full build just to provide metadata. So it's possible that a build will be triggered for a project that we never actually install in the end.

All of which is to say that your suggestions are really useful feedback, and match a lot of other suggestions we've seen (including a number we received from the user interface work that was funded a few years back). But unless someone comes along to explain how we can implement these ideas in practice, they will never become anything more than "nice to have" suggestions, I'm afraid.

@brabster

brabster commented Mar 27, 2024

Just found this discussion whilst looking for options that help protect me and the inexperienced users I regularly have contact with against the apparently growing trend of malicious packages that act through arbitrary code execution on package install (I wrote about that here referencing the 2022 W4SP stealer amongst other resources).

Am I right in thinking the --only-binary :all: (edited after @takluyver's comment) flag would prevent this kind of attack? If so, adopting it as the default behaviour would protect users by default and disincentivise the bad actors by removing the easy marks?

A carefully worded opt-in flag that can be set as an environment variable would seem to give users who decide it's worth the risk a minimal-fuss option to continue as before (maybe --yolo? 🤣)

@pfmoore
Member Author

pfmoore commented Mar 27, 2024

Using --no-binary :all: is certainly a mitigation that you can apply now. I'm not going to make promises, but certainly "not running arbitrary code when installing" is one of the points of this option.

Making it a default is generally considered to be a good idea, as well, but we need to plan the transition, and that's what this is stalled on. Breaking every package that hasn't uploaded wheels isn't really a good move...

@takluyver
Member

To be clear, --only-binary would be the option to avoid running arbitrary code at install time. --no-binary is the opposite, which ensures that packages' build instructions are run.

IDK if this was just a typo, or a misunderstanding. It may be surprising if you're used to thinking of 'binary' meaning executable, as in the bin/ folder. In this context, 'binary' means pre-built. If you install a source distribution ('sdist'), you have to run something chosen by the package author to build it. If you have a pre-built 'wheel' package, you don't - but it may include compiled libraries, so you have to trust that they were really built from the source code you expect.
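For anyone wanting to apply this today from a script, here is a minimal sketch that builds the pip command line with `--only-binary :all:`. The helper name and its `allow_source` parameter are made up for this example (there is no such pip flag); only `--only-binary :all:` is an existing pip option.

```python
import sys
from typing import List


def pip_install_argv(requirement: str, allow_source: bool = False) -> List[str]:
    """Build a `pip install` command that refuses source builds by default."""
    argv = [sys.executable, "-m", "pip", "install"]
    if not allow_source:
        # Only pre-built wheels are considered; sdists are never built.
        argv += ["--only-binary", ":all:"]
    argv.append(requirement)
    return argv
```

The returned list can be passed to `subprocess.run`. Setting the environment variable `PIP_ONLY_BINARY=:all:` should have the same effect without changing the command line.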

@notatallshaw
Contributor

notatallshaw commented Mar 29, 2024

I think the confusion comes from thinking about install time versus run time. Both can run arbitrary code at run time (when the package is imported).

But at install time with binary you only need to download it, you don't need to execute it, because it's already been pre-built (which could include compiled executables). Whereas non-binary is source code and you need to build it, which can involve running arbitrary code. So no, it's not a typo. (Edit: I misread the original comment)

@notatallshaw
Contributor

Btw, as real world evidence, in the now defunct rip project they started off with only supporting wheels (binary) and it made lots of resolutions impossible, and I wasn't able to test it against any real world work project I had until it started to support sdists.

@brabster

@takluyver you're right, it was a typo on my part - I did mean --only-binary. Edited.

@geofft

geofft commented May 24, 2024

I think there's another approach here that's more straightforward to implement:

  • Add a new type of sdist, called e.g. a "discouraged sdist."
  • Extend PyPI to accept files named things like foo-1.0.tar.gz.discouraged.
  • Ask the maintainers of projects that have complicated build processes and that intend to comprehensively provide wheels to rename their sdists before uploading.
  • Extend pip (and other installers) to treat these as normal sdists when a command-line flag is provided (maybe pip --build-discouraged :all: numpy), and to ignore them if not. The behavior of --no-binary/--only-binary is unchanged.
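As a rough illustration of the installer side of this list, here is a Python sketch of the file filtering. The `.discouraged` suffix comes from the example file name above; the function name and the `build_discouraged` parameter are hypothetical stand-ins for whatever flag would be chosen.

```python
from typing import Iterable, List

# Hypothetical suffix, taken from the example name foo-1.0.tar.gz.discouraged.
DISCOURAGED_SUFFIX = ".discouraged"
SDIST_SUFFIXES = (".tar.gz", ".zip")


def usable_files(filenames: Iterable[str], build_discouraged: bool = False) -> List[str]:
    """Return the files an installer would consider for a project."""
    usable = []
    for name in filenames:
        if name.endswith(".whl") or name.endswith(SDIST_SUFFIXES):
            # Wheels and normal sdists are handled exactly as today.
            usable.append(name)
        elif build_discouraged and name.endswith(DISCOURAGED_SUFFIX):
            # Discouraged sdists count only when the user opted in.
            base = name[: -len(DISCOURAGED_SUFFIX)]
            if base.endswith(SDIST_SUFFIXES):
                usable.append(name)
        # Anything else is an unrecognized file type and is skipped.
    return usable
```

An installer that knows nothing about the new suffix behaves like `build_discouraged=False`: the renamed file is simply an unrecognized file type and is ignored.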

The inspiration here is there are a few projects that already do not provide sdists at all because the build process is so difficult that they don't want users to try. That's the current workaround for this issue, and while it's deeply unfortunate that sdists don't exist, the nice thing about that workaround is that it's opt-in on a per-project basis. This proposal merely formalizes that workaround and makes the sdists available if you really want them.

Existing projects that provide only sdists, or that do not reliably provide wheels, would continue to upload normal sdists, and they would be resolved like normal.

No changes to build tools are strictly required (as with the proposals to add metadata); all you need to do is rename the file. We certainly could make this nicer but it's not necessary.

This is backwards-compatible with existing versions of pip (and other installers): for projects that provide discouraged sdists, they will just not recognize the discouraged sdists as a file type they can use and they'll gracefully degrade to the same behavior as if sdists are not uploaded at all. And for projects that don't, the current behavior (including attempting to build normal sdists) is correct.

(We can probably come up with a better name than "discouraged" ... "manual"? "intentional"?)

Does this seem like a reasonable approach? If so I can suggest it on the forums, since it's mostly not a pip change.

(Thanks @zooba for mentioning this issue to me in another forum thread.)

@pfmoore
Member Author

pfmoore commented May 24, 2024

Add a new type of sdist, called e.g. a "discouraged sdist."

Doing that would require a new standard, and hence a PEP. It's not something pip (or PyPI) would adopt unless it was standardised.

Personally, I don't think it's something we should try to standardise, but if you want to, then feel free to develop a PEP for it.

I'd rather we simply made --only-binary the default, and didn't over-complicate things any more than that.

@zooba
Contributor

zooba commented May 24, 2024

I'd rather we simply made --only-binary the default, and didn't over-complicate things any more than that.

I think the only unavoidable complication is how to handle packages that have an sdist but no wheels at all. In that case, I'd prefer the default to be to use the sdist, but that's a complication (and if I specify --only-binary then I don't want to use the sdist, so it's really a new middle-ground option).

@dstufft
Member

dstufft commented May 24, 2024

Maybe it needs a --prefer-binary?

@zooba
Contributor

zooba commented May 24, 2024

Maybe it needs a --prefer-binary?

Which is what already happens, so it's slightly stronger than that in that the presence of any binary options at all prevents choosing the source option.

I don't have a problem with the naming, just that the obvious interpretation isn't what we need here. --avoid-sdist might have better implications, but I don't really like it as a name.

@geofft

geofft commented May 24, 2024

I'd rather we simply made --only-binary the default, and didn't over-complicate things any more than that.

Well, from the discussion on this thread, that is quite complicated itself, right? I think the least complicated proposal is --avoid-sdist (i.e., --only-binary :packages-that-have-at-least-one-wheel:), but there's an example above of a popular package that will have worse behavior in practice with that change.
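To make the `--avoid-sdist` semantics concrete, here is a minimal Python sketch. It assumes the rule is applied to the full list of files a project publishes, and it deliberately ignores versions and platform compatibility; the function name is made up for this example.

```python
from typing import Iterable, List


def avoid_sdist(filenames: Iterable[str]) -> List[str]:
    """Drop sdists for a project as soon as it publishes at least one wheel."""
    files = list(filenames)
    wheels = [f for f in files if f.endswith(".whl")]
    # Any wheel at all rules out source builds; a project with no wheels
    # keeps its sdists so it stays installable.
    return wheels if wheels else files
```

For a project that ships only Windows wheels plus an sdist, this leaves a Linux user with no usable candidate instead of a source build, which is the trade-off being discussed here.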

Did I miss something in the discussion above that makes this change easy to implement? I'm happy to write the code if so, I've got a bit of free time and I care about this problem, but I didn't see any designs that looked uncomplicated - but I did see comments about fundraising and a careful rollout process with community outreach. :)

My proposal does not change any existing behavior, and is therefore safe to roll out immediately without any coordination.

Doing that would require a new standard, and hence a PEP. It's not something pip (or PyPI) would adopt unless it was standardised.

Yes, agreed this is a PEP and not just a pip issue. I started a discussion here: https://discuss.python.org/t/preventing-unwanted-attempts-to-build-sdists/54169

@pfmoore
Member Author

pfmoore commented May 24, 2024

Well, from the discussion on this thread, that is quite complicated itself, right?

Yes. I just think it's less complicated than adding a new type of sdist into the mix.

@pradyunsg
Member

pradyunsg commented May 24, 2024

I will say: I think using a rollout coupled with a specific unreleased-at-start-time Python version will be a better mechanism to do this.
