Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Major tidy-up #137

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Major tidy-up #137

wants to merge 5 commits into from

Conversation

Kroc
Copy link

@Kroc Kroc commented May 20, 2016

Let's make this readable!

  • Key names have had wrapping quotes removed, except for non-schema keys
    (i.e. part of the PHP language definition rather than the language
    schema)
  • Comment keys moved to multi-line strings for readability
  • Whitespace (?x) regexes have been cleaned up and formatted -- they
    had both actual whitespace and escaped whitespace!
  • Capture indicies collapsed for better readability

Note: This file is functionally the same as it was before, this has just
been a reformat and tidy

Kroc added 2 commits May 20, 2016 21:15
Let's make this readable!
* Key names have had wrapping quotes removed, except for non-schema keys
(i.e. part of the PHP language definition rather than the language
schema)
* Comment keys moved to multi-line strings for readability
* Whitespace `(?x)` regexes have been cleaned up and formatted -- they
had both actual whitespace and escaped whitespace!
* Capture indicies collapsed for better readability

Note: This file is functionally the same as it was before, this has just
been a reformat and tidy
@Kroc
Copy link
Author

Kroc commented May 22, 2016

NB: Should this PR be merged, I'll be happy to redo the Heredoc fix #77 on top of the new cleaned up CSON. This PR also obsoletes #123 as I've cleaned up the whitespace-regexes too.

@50Wliu
Copy link
Contributor

50Wliu commented May 23, 2016

Key names have had wrapping quotes removed, except for non-schema keys (i.e. part of the PHP language definition rather than the language schema)

Can you please revert this change? All other grammars have every key wrapped in quotes.

@Kroc
Copy link
Author

Kroc commented May 23, 2016

Well, it's not a requirement of CSON, and it actually makes it clearer what keys are part of PHP and what are part of the grammar schema. Reverting this would make it much harder, visually, to find the strings that one needs to edit. It also makes it easier for one to understand what the schema is just by looking, and how to use it.

It shouldn't matter what the other grammars are doing -- they're doing it wrong -- this way is much friendlier for anybody wanting to examine or modify the grammar. I can update the other grammars if it's consistency you want :)

@50Wliu
Copy link
Contributor

50Wliu commented May 23, 2016

This is delving into personal preference - but I personally like everything quoted, and you obviously like them unquoted when possible :).

This has been suggested before on both language-javascript and language-go but a final decision was never reached. atom/language-go#68 (comment)

@Kroc
Copy link
Author

Kroc commented May 23, 2016

We all have closely held preferences (tabs for sure), but just look at this screen-shot -- especially the minimap; this is just quantitatively better when it comes to readability. It's just a wall of green if it's all quoted.

image

You basically have domain over the language grammars, so it's up to you, but there's no point to this PR if we stick to the wall of green. I simply can't work with that, it's hell to comprehend the structure. :(

@50Wliu 50Wliu self-assigned this May 26, 2016
@50Wliu
Copy link
Contributor

50Wliu commented May 26, 2016

After debating over this for a bit, I've come to the conclusion that unquoting the keys does make the CSON file easier to read. All the default language packages should be updated though to match. @atom/feedback any concerns, etc.?

@Kroc
Copy link
Author

Kroc commented May 26, 2016

That's so amazing, thank you! This is going to make such a difference to people approaching the language files, they're a freighting prospect to edit. Would you like to me handle the other languages as I have done here?

@50Wliu
Copy link
Contributor

50Wliu commented May 26, 2016

Would you like to me handle the other languages as I have done here?

If no one raises any concerns in a few days, feel free to do so.

This seems to be a hold-over from the XML/PLIST format from TM.
@BinaryMuse
Copy link

I'm 👍 for the unquoted keys.

Thanks for looking at this!!

@Ingramz
Copy link
Contributor

Ingramz commented May 27, 2016

As mentioned above, some keys have been intentionally left quoted - if this is going to be a thing, then it would be better if the rules for quoted/unquoted keys were documented somewhere.

@50Wliu
Copy link
Contributor

50Wliu commented May 27, 2016

Hey @Kroc, any chance you can move all the unquoting stuff into a separate commit so I can review the other changes separately? Thanks :).

@Kroc
Copy link
Author

Kroc commented May 27, 2016

I honestly have no idea how to do that! I'm not a strong git user. You could just search/replace the common keys I suppose?

@Ingramz : The idea of that is to distinguish between the schema (allowed keys), and the user-defined-keys that acts as 'IDs' in the repository. This way when a reader sees import: '#something' they know to find the quoted 'something' key in the repository.

-- edit: I can add comments to the top to describe the use of quoted/unquoted keys. Since all language grammars will be updated to match, there'll always be a guide in the right place.

@Ingramz
Copy link
Contributor

Ingramz commented May 27, 2016

@Kroc I fully understood what you meant by it when you first mentioned it a couple days ago.

Why I am asking this is because then it would be settled across the project once and for all in which way the grammar files should be formatted without having to rely on a person knowing it, making the new style easy for anyone to verify or follow. That solves the problem we are having here and the problem that others had in other repositories in a way that cannot be countered with "I like it that way".

Also it would make it possible to extend on the current idea, if needed - add new rules or modify existing ones.

If someone made a change to a grammar file, which wasn't styled correctly, it would be possible to easily point to the piece of text which defines the style instead of having to educate each person individiually.

For motivation, see the following image posted in this stackexchange question.

@50Wliu
Copy link
Contributor

50Wliu commented May 27, 2016

@Kroc Fair enough, it would be helpful in future PRs though :).

@Kroc
Copy link
Author

Kroc commented May 27, 2016

@50Wliu OK, so for the other grammars, you want to submit the PR with the quotes intact for one commit, and then without afterwards -- or do you mean to submit the PR with the quotes and wait until you've merged it first?

@Ingramz I had not thought on such a broad scale -- but remember too that there are developers like me who are going straight to these particular files without looking at any other part of Atom. Comments in the language grammars would be a starting point until some broader-scale decision could be made -- perhaps you could raise that issue in the right place?

@50Wliu
Copy link
Contributor

50Wliu commented May 27, 2016

@Kroc The former.

@50Wliu
Copy link
Contributor

50Wliu commented Jun 2, 2016

Ok, finally had some time to review this. Initial comments:

  • I'm not sure if the capture squashing (1: name: 'scope') helps with readability.
  • The comments added in f6cd811 are superfluous and can be removed. We should create better documentation for language files in a central location (such as the Flight Manual) rather than commenting each one. Ditto with 9054d56 and the other header comments.
  • The suuuuuuper long lines that are just keyword after keyword can be beautified by making them multiline (example here).
  • Multiline comments haven't been converted to their CSON counterparts
  • A lot of characters are being escaped when they don't need to be, such as {, }, #, and @. That can be addressed in a different PR though.

Regarding the first bullet point, the new formatting is not in line with our CoffeeScript formatting guidelines, and the new CSON file is currently producing 236 errors locally. Only one space can follow a colon. We should really set up Travis to lint CSON files as well...

That's all I have for now. This is obviously all personal observations, so /cc @atom/feedback again.

@Kroc
Copy link
Author

Kroc commented Jun 2, 2016

@50Wliu Thanks for taking a look; this is the first time I'm contributing to a large project, so please excuse my lack of understanding. On your points:

I'm not sure if the capture squashing (1: name: 'scope') helps with readability.

Ehhh, it's hard to say. Here's a comparison shot. I thought the tabs aligned nicer. It's more compact which makes it much easier to reference with the regex above.

cson-cpatures

The comments [...] are superfluous and can be removed.

For developers, like me, who go straight for the language grammar to make improvements these comments will reduce greatly the amount of reviewing and question answering you may have to field in the future -- everything the developer needs to know is right in front of them. Note that documentation on these grammars is already slim-to-none. The complete lack of direction on how to understand the file and what to do with it is an absurdly high barrier to entry. Only because I am very persistent did I not just give up and walk away, which is what I expect hundreds of developers have already done having faced a file with no comments and no direction on format / schema, not to mention 'the wall of green'.

Central documentation is good, but that's a separate issue and I would urge that until such documentation exists and can be referenced in a comment, the grammar files should contain enough comments to get developers started. They do not have to be mutually exclusive, and updating the comments in the future is a very, very simple task.

The suuuuuuper long lines that are just keyword after keyword can be beautified by making them multiline

Actually, that's what I've intended to do from the beginning, but I didn't want to push too many changes at once -- I'm just interested in making the leap from quoted to unquoted keys first and improvements like this one are an easy enough following step that I'm happy to tackle.

Multiline comments haven't been converted to their CSON counterparts

I didn't particularly see the need, but I have no objection to this at all.

A lot of characters are being escaped when they don't need to be, such as {, }, #, and @. That can be addressed in a different PR though.

I absolutely agree. All the regexes need a fine look, which I'm prepared to do should we be able to move to unquoted keys as this makes my life easier editing the file.

the new formatting is not in line with our CoffeeScript formatting guidelines,

I was not aware of this -- which is exactly why a link to them should be in the comments so that you have less issues with PRs in the future. I'm happy to conform to the linter and can update the PR accordingly.

Thanks for the continued direction and support! 👍

@50Wliu
Copy link
Contributor

50Wliu commented Jun 10, 2016

I'm still going to argue that the comments should be related to content and not how the file is parsed. In addition, having the same comments in each file is redundant when there could be a central location for it. A simple link in the README should suffice, and developers should definitely read the README before starting to work on a project 😉.

@hediyi
Copy link

hediyi commented Jun 26, 2016

Does anyone else think the name of each pattern should be on the top instead of being buried in the middle?

To me, the name is just like the title of an article, it describes the pattern. If placed on the top, you can instantly tell what the pattern is about, instead of having to search for the name first then going back to read from the top. Putting the title in the middle everything really doesn't make sense.

@Alhadis
Copy link
Contributor

Alhadis commented Sep 28, 2016

Does anyone else think the name of each pattern should be on the top instead of being buried in the middle?

Yep, seconded. That's how I structure all my grammars:

  1. name
  2. match
  3. captures

Or,

  1. name
  2. begin
  3. end
  4. beginCaptures
  5. endCaptures

I'm also 1000% for the unquoted keys, because pointlessly quoting stuff is pretty pointless stuff.

It's also possible the quoted keys were simply vestigial of the .tmbundle -> .cson conversion back in 2014.

@50Wliu
Copy link
Contributor

50Wliu commented Sep 28, 2016

I think you mean .tmbundle -> .json -> .cson 😉

And I would argue for begin -> beginCaptures, then end -> endCaptures so that the context is closer.

@Alhadis
Copy link
Contributor

Alhadis commented Sep 29, 2016

And I would argue for begin -> beginCaptures, then end -> endCaptures so that the context is closer.

Eh, I'm talking about my grammars, which're indented in a manner best described as, uh, idiosyncratic.

Or calligraphic. Or "hideous", depending on how much you hate my indentation:

screen shot 2016-09-29 at 12 12 17 pm

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants