dykstrom / basic-mode
Emacs major mode for editing BASIC code
License: GNU General Public License v3.0
When basic-line-number-cols is 0, basic-renumber refuses to run, saying
No room for numbers. Please adjust ‘basic-line-number-cols’.
Looking up constants should work like looking up dimmed variables.
Renumber lines fails in some situations that should work after implementing #19.
10 GO TO 20
10 GO SUB 20
10 IF X=5THEN20ELSE30
Goto line number (M-.) also fails in some situations.
10 GOTO20
See basic-xref-identifier-at-point.
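The failing cases above all depend on recognizing a line-number reference whether or not spaces surround the keyword. The following is a minimal sketch of that idea, not basic-mode's implementation: `my-basic-jump-regexp` and `my-basic-collect-targets` are hypothetical names, and the regexp deliberately ignores comma-separated lists (as in ON X GOTO 10,20), string literals, and REM text.

```elisp
;; Hypothetical helper names; this is not basic-mode's implementation.
(defconst my-basic-jump-regexp
  "\\(?:go[ \t]*to\\|go[ \t]*sub\\|then\\|else\\)[ \t]*\\([0-9]+\\)"
  "Match a jump keyword followed, with optional spaces, by a line number.")

(defun my-basic-collect-targets (line)
  "Return the list of line numbers referenced in the string LINE."
  (let ((case-fold-search t)            ; BASIC keywords are case-insensitive
        (start 0)
        (targets '()))
    (while (string-match my-basic-jump-regexp line start)
      (push (string-to-number (match-string 1 line)) targets)
      (setq start (match-end 0)))
    (nreverse targets)))
```

For example, `(my-basic-collect-targets "10 IF X=5THEN20ELSE30")` evaluates to `(20 30)`, and both `"10 GO TO 20"` and `"10 GOTO20"` yield `(20)`.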
Issue #26 lists multiple ways in which line numbers can be referred to in a BASIC program. It appears basic-renumber does not yet take some of those into account. In particular, it is missing:
For completeness, it may make sense to handle
Note that one reason to not bother being complete is that ranges do not necessarily refer to existing lines. For example, Microsoft's BASIC for the TRS-80 Model 100 allows 10 EDIT 10 - 99, which loads up all the lines between 10 and 99, inclusive, in a text editor.
I don't think any version of BASIC allows RENUM in a program, but theoretically it is possible.
Starting with emacs -Q and then using load-file to load the 20180919.1752 version of basic-mode from MELPA, indentation of nested if statements does not seem to work correctly.
The first if statement will indent fine, but the second will not. I'll give an example - I typed this into an empty buffer without using any movement keys or backspacing:
if x < 3 then
if y > 2 then
x = 2
y = 3
endif
endif
This is how it autoindented. Running indent-region with the text marked does not change it. Removing all indentation and then running indent-region on the marked text gives me the same thing, but without the y = 3 line indented.
I'll examine the code later (I'm going to have to modify it quite a bit for the BASIC dialect I'm working with anyway), but I figured I'd give you a heads up that something was up with it.
It seems highlighting is always performed using the default values of basic-keywords etc.
Here are some ideas for additional BASIC statements to be supported:
call
cls
def
dim
mat
on (... goto/gosub)
peek
poke
repeat
restore
stop
sub
tron
troff
usr
One feature Telnet23's highlighting has that basic-mode.el should acquire is highlighting line numbers, in the line number color, when they are referenced by statements like GOTO 10.
Note that Telnet23's regexps only handle numbers directly after THEN, ELSE, GO\s*TO, and GOSUB. There are other ways that BASIC can refer to line numbers.
These should all have the numbers highlighted as line numbers:
Exceptions:
Originally posted by @hackerb9 in #21 (comment)
If possible, it would be nice if syntax highlighting worked in the way that most BASIC interpreters do, allowing spaces to be omitted. This cramped style is most commonly seen on small, 8-bit microcomputers. Leaving out spaces not only saves precious bytes, it can actually make the program run faster. Of course, such a situation is both hard to read and exactly when syntax highlighting would be the most helpful.
In the following three examples, I believe the first should be easy to implement and will handle 90% of the cases that bother me. The second may be possible but the third is, I believe, unlikely.
10 IF X=5THEN X=6ELSE20
20 GO TO 30
30 FORT=1TO1000:?MID$(BUFFER$,T,1):NEXTT
In BASIC, some of the characters Emacs treats as being valid within a "symbol"† are actually operators and should delimit keywords. For example, only IF and THEN are syntax highlighted in the following:
IF A$<>CHR$(65) THEN Z$=CHR$(X+SGN(Y)*RND(Z))
In particular, I've noticed the problem in the following operators:
Symbol | Name | ASCII value |
---|---|---|
* | Asterisk | 42 |
+ | Plus | 43 |
, | Comma‡ | 44 |
- | Minus | 45 |
. | Period‡ | 46 |
/ | Slash | 47 |
< | Less than | 60 |
= | Equals | 61 |
> | Greater than | 62 |
I believe the following change, which sets them all to "Punctuation", will fix the problem and not cause more bugs.
(defvar basic-mode-syntax-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry (cons ?* ?/) ". " table) ; Operators * + , - . /
    (modify-syntax-entry (cons ?< ?>) ". " table) ; Operators < = >
    (modify-syntax-entry ?_ "w " table)           ; Underscore is valid in variable names in some BASIC dialects
    (modify-syntax-entry ?. "w " table)           ; xxx Is period ever allowed in variable names? xxx
    (modify-syntax-entry ?' "< " table)           ; Comment starts with '
    (modify-syntax-entry ?\n "> " table)          ; Comment ends with newline
    (modify-syntax-entry ?\^m "> " table)         ; or carriage return
    table)
  "Syntax table used while in `basic-mode'.")
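To see why the syntax class matters, here is a small sketch that searches for CHR$ as a whole symbol under two syntax tables. The names `my-basic-keyword-found-p`, `my-basic-default-table`, and `my-basic-operator-table` are hypothetical, and the search regexp is only an assumption about how a symbol-delimited keyword match might look, not a copy of basic-mode's font-lock rules.

```elisp
;; Hypothetical demo; none of these names exist in basic-mode.
(defun my-basic-keyword-found-p (table)
  "Return t if CHR$ is matched as a whole symbol under syntax TABLE."
  (with-temp-buffer
    (set-syntax-table table)
    (insert "IF A$<>CHR$(65) THEN Z$=CHR$(X+SGN(Y)*RND(Z))")
    (goto-char (point-min))
    (let ((case-fold-search t))
      ;; \\_< and \\_> only match at symbol boundaries, which are
      ;; decided by the syntax table in effect.
      (and (re-search-forward "\\_<chr\\$\\_>" nil t) t))))

(defvar my-basic-default-table (make-syntax-table)
  "A table that inherits the standard classes, where < = > are symbol chars.")

(defvar my-basic-operator-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry '(?* . ?/) "." table) ; * + , - . / as punctuation
    (modify-syntax-entry '(?< . ?>) "." table) ; < = > as punctuation
    table)
  "A table with the operators reclassified as punctuation.")
```

With the default table the search fails, because `>` and `=` glue CHR$ onto the preceding text as one symbol; with the operator table it succeeds.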
Just for reference, my understanding from the Emacs manual is that in a programming language the syntax table characters have the following meanings:
Class Name | Character | Description | Examples |
---|---|---|---|
Whitespace | Space | Characters that separate “symbols” (variable names and keywords) | Tab, Space |
Word constituents | w | The characters allowed in symbols | A to Z, digits |
Symbol constituents | _ | Extra characters allowed in symbols, but that aren't parts of a word | C allows underscore |
Punctuation characters | . | Operators that separate symbols | +, -, *, / |
(There are many more classes, but I think those are the ones relevant here.)
† “Symbol” is an old term for a variable name or keyword. In modern parlance, it'd be called an “identifier.”
‡ Comma and period are already marked as punctuation in the default syntax table; I included them only so that I can set the entire range from 42 to 47 in the solution. Also note that in BASIC mode, the syntax for period was already being changed to "word". I don't know why that is, as I am not familiar with any BASIC dialect in which a period can be used in the name of a variable or command. It looks like period is used in Visual Basic as a dot operator to access structures. As in C, that should be classed with punctuation, not word chars. Or was this change made so that Emacs would treat a long floating-point number as a single word?
Auto-number does not work unless line-number-cols is also set, which is not always desirable.
It is not clear to me why it is required. If I understand correctly, line-number-cols reserves a margin so that line numbers will be right aligned, like so:
  10 GOTO 100
 100 GOSUB 1000
1000 RUN 10
Line-number-cols can cause problems when collaborating with people who have not yet accepted Emacs as the one true editor 😉 and who still left-align their line numbers.
It would be nice if auto-number was a separate feature and could be turned on independently without having to even know about line-number-cols. I suggest that if a person has line-number-cols set to the default (0), line numbers should simply be left aligned when auto-number is set. Like so:
10 RETURN
100 NEXT
1000 ERROR 42
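The two layouts can be sketched with a single formatting helper. This is only an illustration: `my-basic-format-line` is a hypothetical name, and it assumes COLS counts just the digit field, which may not match how basic-line-number-cols is actually defined.

```elisp
;; Hypothetical helper; COLS here is the width of the number field only,
;; which is an assumption and may differ from `basic-line-number-cols'.
(defun my-basic-format-line (number code cols)
  "Return NUMBER and CODE joined as one BASIC line.
If COLS is positive, right-align NUMBER in a field COLS wide;
otherwise left-align it with no padding."
  (if (> cols 0)
      ;; Build a format string such as \"%4d %s\" from COLS.
      (format (format "%%%dd %%s" cols) number code)
    (format "%d %s" number code)))
```

`(my-basic-format-line 10 "GOTO 100" 4)` returns `"  10 GOTO 100"`, while `(my-basic-format-line 10 "RETURN" 0)` returns `"10 RETURN"`.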
Thanks!
As basic-mode continues to expand to support more dialects of BASIC, I believe it will be helpful if the user can specify within the source file which dialect the file is targeting.
This may need a combination of approaches involving
Also for consideration
The output of this issue will be a recommended solution for consideration by the maintainer and other basic-mode contributors.
I noticed a valid BASIC program was not highlighted correctly because the string in line 10 is missing the double-quote at the end. It looked something like this:
10 PRINT "Hello, World!
20 FOR T=32768 TO 65534
30 IF PEEK(T)=ASC("H") AND PEEK(T+1)=ASC("e") THEN PRINT T
40 NEXT T
50 END
I do not know how many forms of BASIC allow literal strings to be terminated by the end of the line, but the TRS-80 Model 100 BASIC does and Microsoft's GW-BASIC does as well. (Or, at least I believe GW-BASIC does, I'm testing using an emulator called PC-BASIC, which is supposed to be very accurate.)
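One way to think about this is a string pattern whose closing quote is optional. The sketch below is an assumption about how such a pattern could look, not how basic-mode highlights strings; `my-basic-string-regexp` and `my-basic-string-literals` are hypothetical names.

```elisp
;; Hypothetical names; a sketch of a string pattern with an optional
;; closing quote, so a literal may also end at the end of the line.
(defconst my-basic-string-regexp "\"[^\"\n]*\"?"
  "Match an opening quote, any non-quote text, and an optional closing quote.")

(defun my-basic-string-literals (line)
  "Return the string literals found in LINE, quotes included."
  (let ((start 0)
        (found '()))
    (while (string-match my-basic-string-regexp line start)
      (push (match-string 0 line) found)
      (setq start (match-end 0)))
    (nreverse found)))
```

Applied to line 10 of the program above, this returns a single literal that runs to the end of the line; applied to line 30, it returns the two one-character literals.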
While it is being highlighted like a comment, a rem statement is not actually treated as a comment syntactically. This causes problems when the text after a REM contains an apostrophe ('), which Emacs does recognize as a comment start from the syntax table. The solution is to give REM comment-start syntax, like so:
(setq-local syntax-propertize-function
            (syntax-propertize-rules ("\\(\\_<REM\\_>\\)" (1 "<"))))
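The effect of that rule can be checked in a scratch buffer. The sketch below is a hypothetical test helper (`my-basic-in-comment-p` is not part of basic-mode) that builds a small syntax table with only the apostrophe and newline entries, optionally installs the REM rule, and asks the parser whether a given word lies inside a comment.

```elisp
;; Hypothetical demo; `my-basic-in-comment-p' is not part of basic-mode.
(defun my-basic-in-comment-p (text needle use-rem-rule)
  "Return t if the first NEEDLE in TEXT starts inside a comment.
When USE-REM-RULE is non-nil, install the REM rule shown above."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?' "<" table)   ; apostrophe starts a comment
      (modify-syntax-entry ?\n ">" table)  ; newline ends it
      (set-syntax-table table))
    (when use-rem-rule
      (setq-local syntax-propertize-function
                  (syntax-propertize-rules ("\\(\\_<REM\\_>\\)" (1 "<")))))
    (insert text)
    ;; Apply the syntax properties before asking the parser.
    (syntax-propertize (point-max))
    (goto-char (point-min))
    (search-forward needle)
    ;; Element 4 of the parser state is non-nil inside a comment.
    (and (nth 4 (syntax-ppss (match-beginning 0))) t)))
```

For the text `"10 REM don't panic\n"`, the word `don` is reported as inside a comment only when the REM rule is installed; without it, the comment starts at the apostrophe.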
What are the conventions for syntax highlighting of BASIC code? While working on issue #20 (derived modes), I have run into a problem: I do not know the meaning of groupings like basic-builtin-regexp and basic-keyword-regexp.
If there is a "typical" convention, it would be helpful to have it in the comments in basic-mode.el. If there is not, and I suspect there isn't yet, it may be good to see what others have done and look at what makes sense.
Note that not all of the categories are confusing. Comments, constants, strings, and so on are self-explanatory. The ones that I'd like to get nailed down are:
SIN()) which return a value. The definition of it makes sense, but there is some question about whether things like PEEK() and TIME$ belong. And is this distinction even helpful to a programmer?
AS and RANDOMIZE. And wouldn't it make more sense for data type declarations like DEFINT to be highlighted as a type?
PRINT, PEEK, POKE, etc. But there are several counter examples to that idea. DATA and LET seem to be more structural. AND, MOD, NOT, OR, and XOR are operators and I would have expected them to be highlighted differently.
Here are some ideas for additional BASIC functions to be supported:
asc
atn
chr$
cos
exp
int
len
left$
log
log10
mid$
pi
right$
rnd
sgn
sin
sqr
str$
tab
tan
val
Would you be willing to accept contributions that add support for the QBASIC dialect of BASIC? There is a list of additional keywords that need highlighting. Also, there would need to be modifications for indentation of FUNCTION and SELECT CASE.
The QBASIC IDE automatically uppercases keywords as that was the convention for the dialect. There are probably a couple of other minor things from the IDE that could be brought into this mode as well.
If you are receptive to the idea, I can break this request down into a number of smaller issues and also commit to making contributions where possible.
As discussed in #18, it would be useful if basic-mode let users pick from a set of predefined BASIC flavors and made it easier for them to create customized modes of their own.
One solution would be to use define-derived-mode to create new submodes. Here is an example based on the reference card for the TRS-80 Model III's BASIC. It is similar to most other TRS-80 BASICs (like Color Basic and Model 100 BASIC), but has some notable differences. For example, one of the statements allowed is CLOAD?, so in this example I've modified the syntax table so that question mark is part of an identifier instead of punctuation. [Side note: An unintended benefit of this is that programs that use ? as shorthand for PRINT are properly syntax highlighted, as long as the question mark is followed by a space.]
(defun basic-mode-initialize ()
  "Initializations for sub-modes of basic-mode.
This is called by basic-mode on startup and by its derived modes
after making customizations to font-lock keywords and syntax tables."
  (setq-local basic-font-lock-keywords
              (list (list basic-comment-regexp 0 'font-lock-comment-face)
                    (list basic-linenum-regexp 0 'font-lock-constant-face)
                    (list basic-label-regexp 0 'font-lock-constant-face)
                    (list basic-constant-regexp 0 'font-lock-constant-face)
                    (list basic-keyword-regexp 0 'font-lock-keyword-face)
                    (list basic-type-regexp 0 'font-lock-type-face)
                    (list basic-function-regexp 0 'font-lock-function-name-face)
                    (list basic-builtin-regexp 0 'font-lock-builtin-face)))
  (if basic-syntax-highlighting-require-separator
      (setq-local font-lock-defaults (list basic-font-lock-keywords nil t))
    (setq-local font-lock-defaults (list basic-font-lock-keywords nil t basic-font-lock-syntax)))
  (unless font-lock-mode
    (font-lock-mode 1)))

(define-derived-mode basic-trs80-mode basic-mode "Basic[TRS-80]"
  "Programming mode for BASIC for TRS-80 machines.
This is just a demo and so only handles TRS-80 Model III. At a
minimum, this is missing keywords used in the TRS-80 Model 100
BASIC and TRS-80 Extended Color Computer BASIC.
For more details see `basic-mode'."
  (setq-local basic-function-regexp
              (regexp-opt '("abs" "asc" "atn" "cdbl" "cint" "chr$" "cos" "csng" "erl" "err"
                            "exp" "fix" "fre" "inkey$" "inp" "int" "left$" "len" "log"
                            "mem" "mid$" "point" "pos" "reset" "right$" "set" "sgn" "sin"
                            "sqr" "str$" "string$" "tan" "time$" "usr" "val" "varptr")
                          'symbols))
  (setq-local basic-builtin-regexp
              (regexp-opt '("?" "auto" "clear" "cload" "cload?" "cls" "data" "delete"
                            "edit" "input" "input #" "let" "list" "llist"
                            "lprint" "lprint tab" "lprint using" "new"
                            "mod" "not" "or" "out" "peek" "poke"
                            "print" "print tab" "print using"
                            "read" "restore" "system" "troff" "tron")
                          'symbols))
  (setq-local basic-keyword-regexp
              (regexp-opt '("as" "defdbl" "defint" "defsng" "defstr"
                            "dim" "do" "else" "end" "error" "for"
                            "gosub" "go sub" "goto" "go to" "if" "next" "on"
                            "step" "random" "resume" "return" "then" "to")
                          'symbols))
  (modify-syntax-entry ?? "w " basic-mode-syntax-table) ; Treat ? as part of identifier ("cload?")
  (modify-syntax-entry ?# "w " basic-mode-syntax-table) ; Treat # as part of identifier ("input #")
  (basic-mode-initialize))
As you can see, I'm setting some of the variables, like basic-keyword-regexp, that were declared constant via defconst, so I changed those to defvar.
In addition, I removed some of the font-lock initialization from the end of the definition of basic-mode and moved it into basic-mode-initialize. I may have moved more than necessary, but it seems to work.
(define-derived-mode basic-mode prog-mode "Basic"
  "Major mode for editing BASIC code."
  :group 'basic
  (add-hook 'xref-backend-functions #'basic-xref-backend nil t)
  (setq-local indent-line-function 'basic-indent-line)
  (setq-local comment-start "'")
  (setq-local syntax-propertize-function
              (syntax-propertize-rules ("\\(\\_<REM\\_>\\)" (1 "<"))))
  (basic-mode-initialize))
Further questions:
What is the difference between basic-builtin-regexp and basic-keyword-regexp? It seemed like keywords were mostly control-flow (IF, GOTO, CALL) and data declarations (DIM, DEFINT), while builtins were everything else (POKE, PRINT), but I couldn't quite tell. For example, why are "AS" and "RANDOMIZE" keywords? Why isn't "PEEK" a function?
It would be nice to be able to just add-to-list/delete a few keywords instead of having to redefine everything when deriving a submode. Doing that would require creating variables to hold the basic-builtins, basic-keywords, and basic-functions, and not running regexp-opt until basic-mode-initialize. That shouldn't be too hard, but I haven't done it in my example code.
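A rough sketch of that idea: keep plain word lists, let a submode add or drop a few words, and only then build the regexp. All names here (`my-basic-base-keywords`, `my-basic-derive-words`, `my-basic-words-regexp`) are hypothetical, and the base list is an illustrative subset rather than basic-mode's real keyword list.

```elisp
(require 'seq)

;; Hypothetical names; the base list is illustrative only.
(defvar my-basic-base-keywords
  '("if" "then" "else" "goto" "gosub" "for" "next")
  "Illustrative base keyword list shared by all dialects.")

(defun my-basic-derive-words (base add drop)
  "Return BASE without the words in DROP, followed by the words in ADD."
  (append (seq-remove (lambda (word) (member word drop)) base)
          add))

(defun my-basic-words-regexp (words)
  "Build a symbol-delimited regexp from WORDS, as late as possible."
  (regexp-opt words 'symbols))
```

A submode could then compute `(my-basic-words-regexp (my-basic-derive-words my-basic-base-keywords '("cls") '("gosub")))` in its initialization step instead of restating the whole list.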
Conf-mode is often able to autodetect the proper submode. Are there ways to detect a BASIC dialect?
Are submodes the best way to provide this functionality? They have very nice hierarchical inheritance properties that would be hard to beat. (For example, if there were different submodes for QBasic, QuickBasic, GW-BASIC, and BASICA, they might all derive from a common Microsoft-BASIC ancestor). Are there any other possibilities besides define-derived-mode?
Trivia: According to the Small BASIC FAQ, there are over 230 different BASIC dialects.
Which is the best BASIC compiler for Emacs?
I am on Debian...
sudo apt install ???
I installed your package on my Emacs 24.5.1 installed from the package manager of Linux Mint 18.2 (based on Ubuntu LTS 16.04), and couldn't find a way to enable basic-mode. I tried to see if M-x basic-mode was available, but the only 2 shortcuts smex completion showed me were irrelevant. I tried to see if I could find the customization group and couldn't find it either (but that was expected, as many packages require their mode to be enabled first and then they show the relevant options at the Customize interface). After I perused the code and saw something about .bas listed there, I tried to create a .bas file with a simple Hello World program, to see if I could get that recognised by basic-mode, but it was opened as Fundamental mode instead. Hope you can fix this, as it basically makes the package unusable.
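As a possible workaround, the mode can be loaded and associated with .bas files by hand in the init file. This is a sketch under two assumptions the report does not confirm: that basic-mode.el is on `load-path`, and that the problem is a missing autoload or file association rather than something else.

```elisp
;; Assumes basic-mode.el is installed somewhere on `load-path'.
(require 'basic-mode)

;; Open files ending in .bas in basic-mode.
(add-to-list 'auto-mode-alist '("\\.bas\\'" . basic-mode))
```

After evaluating these forms, M-x basic-mode should be available and newly visited .bas files should no longer open in Fundamental mode.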
Thanks for the understanding.
With auto-number turned on, pressing Enter when there are digits at and after point merges the newly created line number with the number after point.
For example, given this file:
10 REM Copylefted (ↄ) 2022.
                      ^
20 REM All wrongs reversed.
Hitting enter while the cursor is flashing on the first digit of 2022 will result in this:
10 REM Copylefted (ↄ)
152022 .
       ^
20 REM All wrongs reversed.
It should have put a space before 2022, not after. And the point should be left on the first digit of 2022, not on the period.
As C64 BASIC was (and is?) very popular, basic-mode should also include a C64 submode.
Try renumbering the following code with basic-syntax-highlighting-require-separator set to t.
10 if a > 0 then10else40
20 if a > 0 then 10 else 40
30 if a > 0 then goto10 else gosub40
40 goto10 : goto 10 : go to10 : go to 10
50 gosub40 : gosub 40 : go sub40 : go sub 40
60 on a goto10,20,40
With the flag set to t it should not even touch code like goto10, but instead it scrambles it. With the flag set to nil it works as intended.
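The intended behavior can be sketched as a reference rewriter that takes the separator flag as a parameter. This is not basic-renumber's code: `my-basic-reference-regexp` and `my-basic-renumber-references` are hypothetical names, MAPPING is an assumed alist from old to new line numbers, and only the first number after a keyword is handled, so lists such as ON A GOTO 10,20,40 are left incomplete.

```elisp
;; Hypothetical sketch; not the implementation of `basic-renumber'.
(defun my-basic-reference-regexp (require-separator)
  "Return a regexp matching a jump keyword and the line number after it.
If REQUIRE-SEPARATOR is non-nil, whitespace must precede the number."
  (concat "\\(\\(?:go[ \t]*to\\|go[ \t]*sub\\|then\\|else\\)"
          (if require-separator "[ \t]+" "[ \t]*")
          "\\)\\([0-9]+\\)"))

(defun my-basic-renumber-references (line mapping require-separator)
  "Return LINE with referenced line numbers replaced according to MAPPING.
MAPPING is an alist of (OLD . NEW) integers; unknown numbers are kept."
  (let ((case-fold-search t))
    (replace-regexp-in-string
     (my-basic-reference-regexp require-separator)
     (lambda (match)
       (let* ((old (string-to-number (match-string 2 match)))
              (new (alist-get old mapping)))
         ;; Keep the keyword and spacing, swap only the number.
         (concat (match-string 1 match)
                 (number-to-string (or new old)))))
     line t t)))
```

With the flag set, `(my-basic-renumber-references "40 goto10 : goto 10" '((10 . 100)) t)` returns `"40 goto10 : goto 100"`, leaving the cramped form untouched; with the flag nil, both references become 100.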