Make filename-base filetype detection independent of working directory#4112
Conversation
…d at time of file open
|
But basically all we need is to exchange |
imma be honest i wanted to make the minimum change given it's a codebase i looked at for the first time yesterday, written in a language i used for the first time yesterday, and changing one parameter and the contract of a function called in two places seemed pretty low risk. in any case, the change specifically to the lemme see what u mean about that function tho ty for getting back so fast homie |
- Since the regex is compared against the absolute path, start of string (`^`) replaced with `(^|/|\\)` - Replaced exclusively unix style paths in a couple places with unix or dos - Updated filename detection based purely on the directory to extend into the filename (`.+`) so that our new matcher succeeds - Fixed a random issue with apache confs allowing matching any file whose path included apache.d (with or without infinite backslashes)
|
Ok sorry I thought you were requesting a change and not asking for a clarification(?) and then i got distracted. Went to triple double check that I wouldn't be breaking any regexes, and yep, i was. Caught a few other minor issues, too. |
|
|
||
| detect: | ||
| filename: "httpd\\.conf|mime\\.types|vhosts\\.d\\\\*|\\.htaccess" | ||
| filename: "httpd\\.conf|mime\\.types|vhosts\\.d(/|\\\\).*|\\.htaccess" |
There was a problem hiding this comment.
The vhosts\\.d\\\\* previously would've matched:
vhosts.d(as in a file namedvhosts.d)vhosts.d\vhosts.deezvhosts.d\eezvhosts.d\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\eez
With the addition of .* onto the end of the path, the matched substring of this regex extends into the name of the file, so it passes the new function definition.
|
|
||
| detect: | ||
| filename: "\\.m[ems]$|\\.rof|\\.tmac$|^tmac." | ||
| filename: "\\.m[ems]$|\\.rof|\\.tmac$|(^|/|\\\\)tmac." |
There was a problem hiding this comment.
This situation covers most of the issues addressed in this commit: start of string. Previously this regex also would've failed if you opened dir/.tmac. I hadn't considered start-of-string based matching, so, there's a few more previously-failing-now-fixed regexes with this patch.
|
|
||
| detect: | ||
| filename: "\\.(ini|desktop|lfl|override|tscn|tres)$|(mimeapps\\.list|pinforc|setup\\.cfg|project\\.godot)$|weechat/.+\\.conf$" | ||
| filename: "\\.(ini|desktop|lfl|override|tscn|tres)$|(mimeapps\\.list|pinforc|setup\\.cfg|project\\.godot)$|weechat(/|\\\\).+\\.conf$" |
There was a problem hiding this comment.
Unrelated: fixing for DOS users.
|
|
||
| detect: | ||
| filename: "(nftables\\.(conf|rules)$|nftables(\\.rules)?\\.d/)" | ||
| filename: "(nftables\\.(conf|rules)$|nftables(\\.rules)?\\.d(/|\\\\).+)" |
There was a problem hiding this comment.
Again, note the need for the .+ (.* also fine) so that it picks up part of the filename.
|
|
||
| detect: | ||
| filename: "(^|/|\\.)(ex|vim)rc$|\\.vim" | ||
| filename: "(^|/|\\\\)\\.?(ex|vim)rc$|\\.vim" |
There was a problem hiding this comment.
I am not sure what the intention here was, but with this change this no longer matches dir/something.vimrc etc. I don't know if people name their vimrcs that way. I use micro.
|
But also: I itch to rename |
| // MatchFilePath will check the given file's absolute path with the stored regex; | ||
| // Only matches which include at least part of the filename are considered. | ||
| func (header *Header) MatchFilePath(absoluteFilePath string) bool { |
There was a problem hiding this comment.
I'm not really convinced, that all of this is really needed.
Using header.MatchFileName(b.AbsPath) to check against header.FileNameRegex seems feasible combined with your syntax files changes, unless proven wrong.
Did I oversee something?
At the moment, filetype detection based on the filename is rather brittle: for regexes that rely on the path, they rely on the relative path. If the filename regex is, for instance,
/privoxy\/config$/, syntax highlighting only works if your relative path includes the parent directory right before the file. So,micro ../privoxy/config,micro privoxy/config, etc. work, butmicro config(where config lives in a folder named privoxy),micro privoxy/./config(convoluted but u get my drift), etc. do not.Currently this issue affects two filetypes in particular, whose regex relies completely on the path:
privoxy-config, and, specifically, weechat config files (typeini). A number of filetypes make use of path-based heuristics, however, and those heuristics break down depending upon where you happen to be when you open the file.I've updated
Header.MatchFileName(nowMatchFilePath) to take the absolute path to the file (updating callers), to run the check against that, and only return true if a match exists that extends into the filename.Note that this will both add and remove categories of matches:
/some-file/matches on/this/is/not/some-file/but/itstillmatches. This seems like unintended behavior to me, but who knows. This wacky behavior can still be achieved by using/some-file.*/, of course.privoxy/configfrom within theprivoxyfolder (micro config) would not match, and now it does. IMO this seems like the originally intended behavior, and allows for more particular regexes.I wound up finding this behavior while writing a syntax highlighter for Kitty config files, since I was using a regex of
/kitty\/.*\.conf/to not be overly broad, while still including every.conffile within that folder. I'm about to open a PR with that addition pending this change if that's chill : )In any case, please let me know what you think! Very happy user of micro for many years : )
(p.s. I've never written Go before 👀 it's probably fine)