Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Add tree-sitter PHP grammar #303

Open
wants to merge 20 commits into
base: master
Choose a base branch
from
Open

Conversation

maxbrunsfeld
Copy link

@maxbrunsfeld maxbrunsfeld commented Dec 14, 2017

⚠️ Work in Progress ⚠️

This adds an alternative PHP grammar that uses tree-sitter-php for parsing instead of first-mate.

Depends on atom/atom#16299

/cc @joshvera

@Ismail-elkorchi
Copy link

@maxbrunsfeld Will this PR be merged ?

@maxbrunsfeld
Copy link
Author

Will this PR be merged?

It will, but there is still a bit of work to do and it's not on my immediate roadmap for the next month. If anyone is interested in picking this up, I would probably have time for assistance and code review. I added a task list to the body.

@PHLAK
Copy link

PHLAK commented Sep 27, 2018

Seems like tree-sitter is going to be enabled by default with the next release (after 1.31) of Atom. Maybe it's time to get this PR rolling. I may be able to contribute a little but how would one be able to see the effect their changes are making locally?

Also remove scope for variable and for syntax-error
Fixes scopes of anonymous nodes
Scopes constant.language.php for language constants
Scope the class of catch expressions as a class name
Change object_creation_expression to scope as entity.name.type.class
@Kaspik
Copy link

Kaspik commented Mar 8, 2020

Tree sitter has version 0.16.1 and it has not been implemented into this language for almost two years. Any plans on finishing this PR?

@claytonrcarter
Copy link

Is anyone available to offer guidance on this? I don't have any previous XP w/ grammars and such, but I have been tinkering w/ this a little bit and I would love to help move it along. Right now I have a branch that:

  1. updates this to master (and resolves the indicated conflicts)
  2. updates tree-sitter-php to 0.16.2
  3. adds some basic tests for the tree sitter grammar
  4. adds highlighting for php start tags

This leaves me w/ a few questions:

  1. php start tags were easy, but it looks like end tags aren't currently recognized by tree-sitter-php. Is that correct, or am I missing something?
  2. I tried using tree-sitter-php@0.19 but was receiving errors: "Incompatible language version. Compatible range: 9 - 12. Got: 13". The error appears to be generated by the copy of tree-sitter that is included in the Atom app.asar. Downgrading to 0.16.2 resolves this.
  3. Is @maxbrunsfeld task list in PR description still accurate, or did @Ben3eeE updates move the needle a little bit?
  4. Is the goal of an Atom tree-sitter grammar to exactly duplicate the scopes of the TextMate grammar, or are some differences expected?

I know that these are just minor, pretty easy updates, but I'd love to be able to bump this PR forward a little bit. Having someone to bounce questions off of as I work on it would be great, if anyone is available.

/cc @sadick254

@KapitanOczywisty
Copy link
Contributor

KapitanOczywisty commented Jul 29, 2021

@claytonrcarter

  1. Try "?>"
  2. PR is old so version update was needed.
  3. Most likely yes, but you will probably find out with testing.
  4. It's best to keep at least similar scopes, to not invalidate user-themes. Remember that this repository (thus scopes) is used by many other editors like vscode - that one will not make use of tree sitter, but some themes may still use the same scope styles for different editors.

I'm maintaining textmate grammar, but I can help with some questions.

@claytonrcarter
Copy link

Yup, "?>" totally worked. Thanks! And knowing that we want to keep the scopes the same is also helpful. Now I know that I can write tests against the textmate grammar and then make the tree-sitter version pass.

Thank you! I'll be tinkering w/ this on and off for a while. Thanks for your quick reply!

@claytonrcarter
Copy link

Most of the scopes appear to be pretty straightforward and easy to set up w/i the tree-sitter-php output. The two that appear to be tricky ATM are source.php and meta.embedded.block.php. Consider this snippet, and the scopes applied by line (there are other scopes, but these are just the relevant ones)

  // foo.php                  TM Scope
  <b>this is html</b>         text.html.php
  <?php                       text.html.php meta.embedded.block.php
  $foo = 1;                   text.html.php meta.embedded.block.php source.php
  ?>                          text.html.php meta.embedded.block.php
  <b>this is more html</b>    text.html.php

Trees sitter is parsing this as

(program [0, 0] - [5, 0]
  (text [0, 0] - [1, 0])
  (php_tag [1, 0] - [1, 5])
  (expression_statement [2, 0] - [2, 9]
    (assignment_expression [2, 0] - [2, 8]
      left: (variable_name [2, 0] - [2, 4]
        (name [2, 1] - [2, 4]))
      right: (integer [2, 7] - [2, 8])))
  (text_interpolation [3, 0] - [5, 0]
    (text [4, 0] - [5, 0])))

(Note that the ?> end tag is included in the text_interpolation node)

My read here is that I would need to select everything from php_tag up to text_interpolation > "?>" to apply meta.embedded.block.php and then everything between php_tag and text_interpolation > "?>" to apply source.php , but I don't see a straightforward way of doing this w/ selectors in the tree sitter grammar docs. Are these cases that would best be added to an injected grammer from the tree-sitter-html package? (That package has built-in suppport for injected erb, but I haven't looked into it in depth.)

As currently implemented, this TS grammar just applies source.php to the entire program node, which appears to be a significant break from the TM grammar. (Though, again, I'm a n00b on grammars.) I looked around at other grammars for examples of meta.embedded.block.<lang> and my read is that the language-ruby package may not support it (it appears to be included in the TM ruby/erb grammar, but not the TS grammar).

Aside from these, getting the more specific syntax scopes in place seems to be pretty straightforward; just time consuming. I'll keep chipping away at it, but having some guidance on these two scopes would be helpful. Thank you!

@KapitanOczywisty
Copy link
Contributor

KapitanOczywisty commented Jul 29, 2021

With limited confidence text.html.php and meta.embedded.block.php you can probably drop. source.php it should be supported in my opinion. You will not get exactly the same scopes, the last one is most important, so slight changes could be okish.

With rest maybe @maxbrunsfeld could help?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants