Ticket #136 (closed enhancement: fixed)

Opened 6 years ago

Last modified 8 months ago

[PATCH] Extract interpreter dependencies from #!/usr/bin/env scripts.


Reported by: scop
Assigned to: pmatilai
Priority: minor
Milestone:
Component: rpm
Version: RPM Development
Keywords:
Cc:


Description

Here's one possible approach to extract interpreter dependencies from "#!/usr/bin/env foo" scripts, to be applied over the patches in #134 and #135.

The first added sed line finds /usr/bin/foo from "#!/usr/bin/env /usr/bin/foo" (however useless using /usr/bin/env that way might be) and should be uncontroversial.

The second sed line and the path loop might be a bit controversial, but they are the first straightforward implementation I thought of: for relative names like "foo" in "#!/usr/bin/env foo", the patch checks whether a "foo" executable exists in one of the "rpm managed" dirs that are assumed to be in all users' $PATH, and emits a dependency on the first one found.
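For readers without the attachment at hand, the logic described above can be sketched roughly as follows. This is not the patch itself (which is a pair of sed lines and a shell loop); it is a minimal Python illustration. The directory list `MANAGED_DIRS`, the function name `env_interpreter_dep`, and the `is_executable` hook are assumptions made for the example, and the ticket does not say which directories the patch actually searches.

```python
import os
import re

# Assumed list of "rpm managed" directories; the ticket does not name them.
MANAGED_DIRS = ("/usr/bin", "/bin", "/usr/sbin", "/sbin")

# "#!" followed by the interpreter and, optionally, its first argument.
_SHEBANG = re.compile(r"^#!\s*(\S+)(?:\s+(\S+))?")


def env_interpreter_dep(first_line, dirs=MANAGED_DIRS, is_executable=None):
    """Return the path to depend on for a '#!/usr/bin/env X' line, or None."""
    if is_executable is None:
        # Default check looks at the build host's filesystem.
        is_executable = lambda p: os.path.isfile(p) and os.access(p, os.X_OK)

    m = _SHEBANG.match(first_line)
    if not m or m.group(1) != "/usr/bin/env" or not m.group(2):
        return None

    target = m.group(2)
    if target.startswith("/"):
        # Absolute case: "#!/usr/bin/env /usr/bin/foo" -> /usr/bin/foo
        return target
    if target.startswith("-") or "=" in target:
        # env options and VAR=value assignments are not handled in this sketch.
        return None

    # Relative case: depend on the first matching executable found.
    for d in dirs:
        candidate = os.path.join(d, target)
        if is_executable(candidate):
            return candidate
    return None
```

The relative case depends on what is installed on the build host, which is the point the discussion below picks up.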

Attachments

0008-Extract-interpreter-dependencies-from-usr-bin-env-sc.patch (1.3 kB) - added by scop on 02/10/10 21:19:54.

Change History

02/10/10 21:19:54 changed by scop

  • attachment 0008-Extract-interpreter-dependencies-from-usr-bin-env-sc.patch added.

(follow-up: ↓ 2) 02/11/10 15:16:16 changed by pmatilai

  • owner changed from RpmTickets to pmatilai.
  • status changed from new to assigned.

The normal case of non-absolute path is tricky as there's no guarantee that the interpreter exists on the build host. One possibility could be adding a special interpreter() dependency for it, i.e. "#!/usr/bin/env foo" would turn into

/usr/bin/env interpreter(foo)

The downside is of course that I don't see a sane way to automatically add the matching interpreter() provides :-/
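As an illustration of the interpreter() idea, the mapping from a shebang line to requires could look something like the sketch below. The function name `env_requires` is hypothetical, and keeping /usr/bin/env as a requirement in the absolute-path case is an assumption; the comment above only spells out the relative case.

```python
def env_requires(first_line):
    """Map a script's first line to a list of requires (illustrative only)."""
    if not first_line.startswith("#!"):
        return []
    words = first_line[2:].split()
    if not words:
        return []
    if words[0] != "/usr/bin/env" or len(words) < 2:
        # Ordinary shebang: depend on the interpreter path as written.
        return [words[0]]

    target = words[1]
    if target.startswith("/"):
        # Assumption: env itself stays a requirement next to the absolute path.
        return ["/usr/bin/env", target]
    # Relative name: emit a path-less interpreter() dependency instead of
    # guessing a location on the build host.
    return ["/usr/bin/env", "interpreter(%s)" % target]
```

Nothing here touches the build host's filesystem, which is the attraction of the approach; the open problem of who provides interpreter(foo) remains.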

Didn't apply yet, I suspect this needs further head-scratching.

(in reply to: ↑ 1; follow-ups: ↓ 3, ↓ 4) 02/15/10 20:33:09 changed by scop

Replying to pmatilai:

The normal case of non-absolute path is tricky as there's no guarantee that the interpreter exists on the build host.

Right, but packagers could have the package as a BuildRequires... but yep, that's somewhat cumbersome. I'd like to point out, though, that I think this is not that different from, say, automatic python byte-compilation. That is a bit of an apples to oranges comparison but somewhat related anyway: one needs to have python installed for that to happen, and rpm-build does not have a dependency on python.

BTW, for the things in /usr/bin that use /usr/bin/env on my F-12 box (a2x, asciidoc, asciidoc.py, bshdoc, gpsprof, hg-ssh, mercurial-convert-repo, rsvg), all the ones using relative paths would be handled fine without an additional BuildRequires just for the interpreter package; it's BuildRequired already anyway for other reasons, based on a quick lookup (caveat: *quick*; could be I missed something). bshdoc uses an absolute path ("#!/usr/bin/env /usr/bin/bsh") and handling of that is also implemented by this patch (and I think that part of the patch could be safely applied in any case).

Didn't apply yet, I suspect this needs further head-scratching.

Sure, take your time. And no problem if you end up scratching it.

You probably already realized it but if the path based approach in the patch is taken, in case an interpreter moves from /usr/bin to /bin or vice versa without a compatibility symlink or something like that the dependency will break. So maybe just do it for /usr/bin and forget about the rest...

(in reply to: ↑ 2) 02/15/10 20:35:10 changed by scop

Replying to scop:

You probably already realized it but if the path based approach in the patch is taken, in case an interpreter moves from /usr/bin to /bin or vice versa without a compatibility symlink or something like that the dependency will break.

(...even though the actual script using it with /usr/bin/env and a non-absolute path would not break.)

(in reply to: ↑ 2) 02/17/10 09:04:19 changed by pmatilai

Replying to scop:

BTW, for the things in /usr/bin that use /usr/bin/env on my F-12 box (a2x, asciidoc, asciidoc.py, bshdoc, gpsprof, hg-ssh, mercurial-convert-repo, rsvg), all the ones using relative paths would be handled fine without an additional BuildRequires just for the interpreter package; it's BuildRequired already anyway for other reasons, based on a quick lookup (caveat: *quick*; could be I missed something).

Nod, in practice mapping non-absolute "env interpreters" to /usr/bin/ would work for the vast majority of cases.

bshdoc uses an absolute path ("#!/usr/bin/env /usr/bin/bsh") and handling of that is also implemented by this patch (and I think that part of the patch could be safely applied in any case).

Yes, that part is obviously correct and will apply.

Didn't apply yet, I suspect this needs further head-scratching.

Sure, take your time. And no problem if you end up scratching it. You probably already realized it but if the path based approach in the patch is taken, in case an interpreter moves from /usr/bin to /bin or vice versa without a compatibility symlink or something like that the dependency will break. So maybe just do it for /usr/bin and forget about the rest...

Yup. The path based approach would likely work fairly well in practice, but "correct" it is not.

One (somewhat crazy) idea to determine interpreters automatically, with pathless interpreter(foo) style provides: whenever interpreter dependencies are present in the package being built (possibly including scriptlet interpreters), see if the interpreter path is in the current package. If found, then we know the path is used as an interpreter and we can add an interpreter(basename) provide for it. As an added bonus, add provides for symlinks and hardlinks to the file too. It shouldn't find any false positives, but considering how easily it would break (some script is no longer shipped with a package and poof, there goes the interpreter provide), it's probably not worth the trouble.
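To make the idea concrete, here is a small sketch of how such provides could be derived. It is purely illustrative: the function name `interpreter_provides` and the shape of its inputs (a set of shipped paths plus a link-to-target mapping) are assumptions, and hardlinks are left out because detecting them needs inode information that this sketch does not model.

```python
import posixpath


def interpreter_provides(interpreter_paths, package_files, symlinks=None):
    """Derive interpreter(basename) provides for a package being built.

    interpreter_paths: paths that scripts (or scriptlets) in the package
                       name as their interpreter.
    package_files:     set of paths shipped by the package.
    symlinks:          mapping of shipped symlink path -> link target.
    """
    symlinks = symlinks or {}
    provides = set()
    for path in interpreter_paths:
        if path not in package_files:
            # Interpreter lives in some other package; nothing to provide.
            continue
        provides.add("interpreter(%s)" % posixpath.basename(path))
        # Bonus: symlinks in the package that resolve to the same file.
        for link, target in symlinks.items():
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(link), target))
            if resolved == path:
                provides.add("interpreter(%s)" % posixpath.basename(link))
    return sorted(provides)
```

The fragility mentioned above is visible in the signature: drop the last script that names the path from `interpreter_paths` and the provide silently disappears.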

I'm in the middle of reworking the dependency generator internals atm; one of the things that's going to be added is additional path-based "coloring" using regexes (eventually to be read from configuration instead of being hardcoded in rpmfc.c), e.g.

    { "^/usr/lib(64)?/python[[:digit:]]\\.[[:digit:]]/.*\\.(py|pyc|pyo)$", RPMFC_PYTHON },
    /* We want to grab the version we're building now, %{__python} won't do */
    { "^%{_bindir}/python[[:digit:]]\\.[[:digit:]]$", RPMFC_PYTHON },
    { NULL, 0 },

...which would also allow marking various things as interpreters, something like:

    { "^/bin/(da|ba|k|tc)?sh$", RPMFC_INTERPRETER },

...but the obvious downside is that somebody somewhere needs to add such an entry for any new interpreter, so the advantage over telling packagers to just add "Provides: interpreter(foo)" to their packages is questionable.
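A rough Python analogue of the regex table above may help show how path-based coloring would classify files. This is a sketch and not rpmfc.c: the POSIX class `[[:digit:]]` is translated to `\d` because Python's `re` module does not support POSIX bracket classes, `%{_bindir}` is assumed to expand to /usr/bin, and the string values standing in for `RPMFC_PYTHON` and `RPMFC_INTERPRETER` are placeholders.

```python
import re

# Placeholder values; in rpm these would be flag constants.
RPMFC_PYTHON = "python"
RPMFC_INTERPRETER = "interpreter"

# Assumed expansion of the %{_bindir} macro.
BINDIR = "/usr/bin"

# (pattern, color) pairs mirroring the entries quoted above.
PATH_RULES = [
    (re.compile(r"^/usr/lib(64)?/python\d\.\d/.*\.(py|pyc|pyo)$"), RPMFC_PYTHON),
    (re.compile(r"^" + re.escape(BINDIR) + r"/python\d\.\d$"), RPMFC_PYTHON),
    (re.compile(r"^/bin/(da|ba|k|tc)?sh$"), RPMFC_INTERPRETER),
]


def path_colors(path):
    """Return the set of colors whose pattern matches the given path."""
    return {color for rx, color in PATH_RULES if rx.search(path)}
```

For example, `path_colors("/bin/bash")` yields the interpreter color, while a shell not listed in the pattern (say /bin/zsh) gets none, which is exactly the maintenance burden described above.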

04/24/15 08:03:57 changed by FlorianFesti

  • status changed from assigned to closed.
  • resolution set to fixed.

RPM has for quite a while now used a simplified version of this that does not try to resolve the interpreter to its full path. Guess we stay with this. Closing.