dykstrom / basic-mode

Emacs major mode for editing BASIC code

License: GNU General Public License v3.0

Emacs Lisp 81.43% BASIC 6.01% Stata 12.56%
basic emacs-lisp emacs-mode programming

basic-mode's People

Contributors

dykstrom, hackerb9, jcs090218, pederkl, spauldo

basic-mode's Issues

basic-find-jumps doesn't find all target lines for renumbering

Issue #26 lists multiple ways in which line numbers can be referred to in a BASIC program. It appears basic-renumber does not yet take some of those into account. In particular, it is missing:

  • ELSE xxxxxx
  • RESTORE xxxxxx
  • RESUME xxxxxx
  • IF ERL = xxxxxx
  • RUN xxxxxx

For completeness, it may make sense to handle

  • LIST xxxxxx - yyyyyy
  • LLIST xxxxxx - yyyyyy
  • EDIT xxxxxx - yyyyyy

Note that one reason to not bother being complete is that ranges do not necessarily refer to existing lines. For example, Microsoft's BASIC for the TRS-80 Model 100 allows 10 EDIT 10 - 99 which loads up all the lines between 10 and 99, inclusive, in a text editor.

I don't think any version of BASIC allows RENUM in a program, but theoretically it is possible.
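
As a rough illustration of the missing cases listed above, here is a minimal sketch that collects the line numbers following those keywords. The names my-basic-extra-jump-regexp and my-basic-extra-jump-targets are hypothetical and not part of basic-mode. The sketch assumes keywords are separated from what precedes them, and it ignores string literals and comments.

```elisp
;; Hypothetical helper, not part of basic-mode.
(defconst my-basic-extra-jump-regexp
  (concat "\\_<\\(?:else\\|restore\\|resume\\|run\\|erl\\s-*=\\)"
          "\\s-*\\([0-9]+\\)")
  "Match one of the extra keywords, then a line number in group 1.")

(defun my-basic-extra-jump-targets (line)
  "Return the line numbers that LINE refers to via the extra keywords."
  (let ((case-fold-search t)            ; BASIC keywords are case-insensitive
        (start 0)
        (targets '()))
    (while (string-match my-basic-extra-jump-regexp line start)
      (push (string-to-number (match-string 1 line)) targets)
      (setq start (match-end 0)))
    (nreverse targets)))
```

For example, calling it on "10 IF ERL = 100 THEN RESUME 200 ELSE 300" should collect 100, 200 and 300. Ranges such as LIST 10 - 99 are deliberately left out, for the reason given above.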

Nested if statements do not indent correctly

Starting with emacs -Q and then using load-file to load the 20180919.1752 version of basic-mode from Melpa, indentation of nested if statements does not seem to work correctly.

The first if statement will indent fine, but the second will not. I'll give an example - I typed this into an empty buffer without using any movement keys or backspacing:

if x < 3 then
    if y > 2 then
	x = 2
 y = 3
endif
endif

This is how it autoindented. Running indent-region with the text marked does not change it. Removing all indentation and then running indent-region on the marked text gives me the same thing, but without the y = 3 line indented.

I'll examine the code later (I'm going to have to modify it quite a bit for the BASIC dialect I'm working with anyway), but I figured I'd give you a heads up that something was up with it.

Additional BASIC statements

Here are some ideas for additional BASIC statements to be supported:

call
cls
def
dim
mat
on (... goto/gosub)
peek
poke
repeat
restore
stop
sub
tron
troff
usr

Highlight references to line numbers using the line number color

One feature of Telnet23's highlighting that basic-mode.el should acquire is highlighting referenced line numbers, as in GOTO 10, in the line number color.

Note that Telnet23's regexps only handle numbers directly after THEN, ELSE, GO\s*TO, and GOSUB. There are other ways that BASIC can refer to line numbers.

These should all have the numbers highlighted as line numbers:

  • ON var GOSUB 10, 20, 30
  • ON var GO TO 10, 20, 30
  • RESTORE 10
  • RESUME 10
  • RETURN 10
  • RUN 10
  • LIST -500
  • LLIST 10
  • RENUM 1000,10
  • DELETE 10-40
  • IF ERL=10 THEN...

Exceptions:

  • RENUM [newnum] [,[oldnum] [,increment]]
    Increment should not be highlighted.

Originally posted by @hackerb9 in #21 (comment)
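
One possible way to cover the comma-separated lists is a font-lock anchored matcher. The following is only a sketch: my-basic-line-ref-keywords is a hypothetical name, it handles just GOTO/GOSUB (with the optional inner space), and it assumes case-insensitive matching is enabled through font-lock-defaults.

```elisp
;; Hypothetical font-lock keywords, not part of basic-mode.
(defconst my-basic-line-ref-keywords
  '(("\\_<go\\s-*\\(?:to\\|sub\\)\\_>"
     ;; Anchored matcher: applied repeatedly after the keyword.  The
     ;; leading \\= forces each match to start exactly where the
     ;; previous one ended, so only a contiguous list such as
     ;; "10, 20, 30" is highlighted.
     ("\\=\\s-*,?\\s-*\\([0-9]+\\)" nil nil
      (1 'font-lock-constant-face))))
  "Highlight every line number in a list that follows GOTO or GOSUB.")
```

To try it out, a buffer could set font-lock-defaults to (list 'my-basic-line-ref-keywords nil t), where the final t requests case-insensitive matching. The other statements in the list above (RESTORE, RESUME, RUN and so on) would need similar entries.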

Optional spaces and syntax highlighting

If possible, it would be nice if syntax highlighting worked in the way that most BASIC interpreters do, allowing spaces to be omitted. This cramped style is most commonly seen on small, 8-bit microcomputers. Leaving out spaces not only saves precious bytes, it can actually make the program run faster. Of course, such code is hard to read, which is exactly when syntax highlighting would be most helpful.

In the following three examples, I believe the first should be easy to implement and will handle 90% of the cases that bother me. The second may be possible but the third is, I believe, unlikely.

Example 1: No space between keywords and numbers

10 IF X=5THEN X=6ELSE20

Example 2: Optional space within GOTO

20 GO TO 30

Example 3: No space between keywords

30 FORT=1TO1000:?MID$(BUFFER$,T,1):NEXTT
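
To make the trade-off concrete, here is a minimal sketch of matching keywords with no word or symbol boundaries at all, which is what Example 1 needs. The function my-basic-find-keywords is hypothetical and not part of basic-mode.

```elisp
;; Hypothetical helper, not part of basic-mode.
(defun my-basic-find-keywords (line keywords)
  "Return (KEYWORD . POSITION) pairs for KEYWORDS found anywhere in LINE.
No word or symbol boundaries are required around a keyword."
  (let ((case-fold-search t)
        (regexp (regexp-opt keywords))  ; plain alternation, no boundaries
        (start 0)
        (found '()))
    (while (string-match regexp line start)
      (push (cons (upcase (match-string 0 line)) (match-beginning 0)) found)
      (setq start (match-end 0)))
    (nreverse found)))
```

With the keywords "if", "then" and "else", this finds all three in the line from Example 1. The cost is that it would also find IF inside a variable name such as GIFT, which is presumably why Example 3 is much harder than Example 1.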

Syntax highlighting fails when next to certain operators

In BASIC, some of the characters Emacs treats as being valid within a "symbol" are actually operators and should delimit keywords. For example, only IF and THEN are syntax highlighted in the following:

IF A$<>CHR$(65) THEN Z$=CHR$(X+SGN(Y)*RND(Z))

In particular, I've noticed the problem in the following operators:

Symbol  Name          ASCII value
*       Asterisk      42
+       Plus          43
,       Comma         44
-       Minus         45
.       Period        46
/       Slash         47
<       Less than     60
=       Equals        61
>       Greater than  62

I believe the following change, which sets them all to "Punctuation", will fix the problem and not cause more bugs.

(defvar basic-mode-syntax-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry (cons ?* ?/) ".   " table) ; Operators * + , - . /
    (modify-syntax-entry (cons ?< ?>) ".   " table) ; Operators < = >
    (modify-syntax-entry ?_   "w   " table)         ; Underscore is valid in variable names in some BASIC dialects
    (modify-syntax-entry ?.   "w   " table)         ; xxx Is period ever allowed in variable names?  xxx
    (modify-syntax-entry ?'   "<   " table)         ; Comment starts with '
    (modify-syntax-entry ?\n  ">   " table)         ; Comment ends with newline
    (modify-syntax-entry ?\^m ">   " table)         ;                or carriage return
    table)
  "Syntax table used while in `basic-mode'.")

Just for reference, my understanding from the Emacs manual is that in a programming language the syntax table characters have the following meanings:

Class name              Character  Description                                                           Examples
Whitespace              Space      Characters that separate “symbols” (variable names and keywords)      Tab, Space
Word constituents       w          The characters allowed in symbols                                     A to Z, digits
Symbol constituents     _          Extra characters allowed in symbols, but that aren't parts of a word  C allows underscore
Punctuation characters  .          Operators that separate symbols                                       +, -, *, /

(There are many more classes, but I think those are the ones relevant here.)


“Symbol” is an old term for a variable name or keyword. In modern parlance, it'd be called an “identifier.”

Comma and period are already marked in the default syntax table as punctuation; I included them only so that I could set the entire range from 42 to 47 in the solution. Also note that in BASIC mode, the syntax for period was already being changed to "word". I don't know why that is, as I am not familiar with any BASIC dialect in which a period can be used in the name of a variable or command. It looks like "period" is used in Visual BASIC as a dot operator to access structures. As in C, that should be classed with punctuation, not word chars. Or was this change made so that Emacs would treat a long floating-point number as a single word?
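
The effect on symbol-bounded regexps can be checked outside of basic-mode. This is a small sketch with hypothetical names (my-default-table, my-punct-table, my-basic-match-under-table); it assumes the keyword regexps begin with the symbol-start anchor \_<, as regexp-opt produces when called with 'symbols.

```elisp
;; Hypothetical names, not part of basic-mode.
(defvar my-default-table (make-syntax-table)
  "Standard syntax: < = > * + - / are symbol constituents.")

(defvar my-punct-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry '(?* . ?/) "." table) ; * + , - . / as punctuation
    (modify-syntax-entry '(?< . ?>) "." table) ; < = > as punctuation
    table)
  "Same as the default, but with the operators reclassified.")

(defun my-basic-match-under-table (regexp text table)
  "Return the index where REGEXP matches TEXT under syntax TABLE, or nil."
  (with-temp-buffer
    (with-syntax-table table
      (let ((case-fold-search t))
        (string-match regexp text)))))
```

Matching "\\_<chr\\$" against "A$<>CHR$(65)" should fail under my-default-table, because > counts as part of the same symbol as CHR, and succeed under my-punct-table.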

Decouple auto-number from line-number-cols

Auto-number does not work unless line-number-cols is also set, which is not always desirable.

It is not clear to me why it is required. If I understand correctly, line-number-cols reserves a margin so that line numbers will be right aligned, like so:

    10 GOTO 100
   100 GOSUB 1000
  1000 RUN 10

Line-number-cols can cause problems when collaborating with people who have not yet accepted Emacs as the one true editor 😉 and who still left-align their line numbers.

It would be nice if auto-number was a separate feature and could be turned on independently without having to even know about line-number-cols. I suggest that if a person has line-number-cols set to the default (0), line numbers should simply be left aligned when auto-number is set. Like so:

10 RETURN
100 NEXT
1000 ERROR 42

Thanks!
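
A minimal sketch of the suggested behavior, assuming a hypothetical helper my-basic-format-line-number whose COLS argument is the width of the number field, not counting the trailing space (basic-mode's own setting may count columns differently):

```elisp
;; Hypothetical helper, not part of basic-mode.
(defun my-basic-format-line-number (number cols)
  "Return NUMBER as a string followed by one space.
When COLS is positive, NUMBER is right-aligned in COLS columns.
When COLS is 0, NUMBER is simply left-aligned."
  (let ((digits (number-to-string number)))
    (concat (make-string (max 0 (- cols (length digits))) ?\s)
            digits
            " ")))
```

With COLS set to 6 this pads 10 with four leading spaces, as in the right-aligned listing above; with COLS set to 0 it yields plain "10 ".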

Investigate ways to indicate source file specific dialect

As basic-mode continues to expand to support more dialects of basic, I believe it will be helpful if the user can specify within the source file which dialect the file is targeting.

This may need a combination of approaches involving

  1. File variables (see "Specifying File Variables" in the Emacs manual)
  2. Interactive functions (i.e. M-x basic-set-dialect)
  3. Customizable arguments (basic-mode-default-dialect)

Also for consideration

  1. projectile-esque .basic-mode file in a "project folder" that specifies the dialect and other customizable settings for source files in the current directory and below
  2. history cache in ~/.emacs/basic-mode-source-history that remembers which dialect was specified for each source file the user has opened

The output of this issue will be a recommended solution for consideration by the maintainer and other basic-mode contributors.
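
As a starting point for the file-variable approach, here is a minimal sketch. The variable my-basic-dialect and the dialect name trs80 are hypothetical; nothing like them exists in basic-mode yet.

```elisp
;; Hypothetical variable, not part of basic-mode.
(defvar-local my-basic-dialect 'generic
  "Symbol naming the BASIC dialect of the current buffer.")

;; Accept any symbol from a file-local variable without prompting.
(put 'my-basic-dialect 'safe-local-variable #'symbolp)

;; A source file could then declare its dialect on its first line:
;;
;;   10 REM -*- my-basic-dialect: trs80 -*-
```

The mode would still need to read the variable after file-local variables have been processed, for example from hack-local-variables-hook, and switch keyword sets accordingly.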

Double-quoted strings can be terminated by CR/NL

I noticed a valid BASIC program was not highlighted correctly because the string in line 10 is missing the double-quote at the end. It looked something like this:

10 PRINT "Hello, World!
20 FOR T=32768 TO 65534
30 IF PEEK(T)=ASC("H") AND PEEK(T+1)=ASC("e") THEN PRINT T
40 NEXT T
50 END

I do not know how many forms of BASIC allow literal strings to be terminated by the end of the line, but the TRS-80 Model 100 BASIC does and Microsoft's GW-BASIC does as well. (Or, at least I believe GW-BASIC does; I'm testing using an emulator called PC-BASIC, which is supposed to be very accurate.)

REM is not syntactically a comment

While it is highlighted like a comment, a REM statement is not actually a comment as far as the syntax table is concerned. This causes problems when the text after a REM contains an apostrophe ('), which Emacs does recognize as a comment start from the syntax table. The solution is to give REM comment-start syntax, like so:

(setq-local syntax-propertize-function
            (syntax-propertize-rules ("\\(\\_<REM\\_>\\)" (1 "<"))))

Syntax highlighting categories

What are the conventions for syntax highlighting of BASIC code? While working on issue #20 (derived modes), I have run into the problem that I do not know what groupings like basic-builtin-regexp and basic-keyword-regexp mean.

If there is a "typical" convention, it would be helpful to have it in the comments in basic-mode.el. If there is not, and I suspect there isn't yet, it may be good to see what others have done and look at what makes sense.


Note that not all of the categories are confusing. Comments, constants, strings, and so on are self-explanatory. The ones that I'd like to get nailed down are:

  • functions: this one seems to be for the keywords (like SIN()) which return a value. The definition of it makes sense, but there is some question about whether things like PEEK() and TIME$ belong. And is this distinction even helpful to a programmer?
  • keywords: currently appears to be the bones which lay the structure of a BASIC program. Includes control-flow and function/subroutine definitions. There are a few confusing entries, though, like AS and RANDOMIZE. And wouldn't it make more sense for data type declarations like DEFINT to be highlighted as a type?
  • builtins: appears to be the meat: all the other statements which do something. PRINT, PEEK, POKE, etc. But there are several counterexamples to that idea. DATA and LET seem to be more structural. AND, MOD, NOT, OR, and XOR are operators and I would have expected them to be highlighted differently.

Additional BASIC functions

Here are some ideas for additional BASIC functions to be supported:

asc
atn
chr$
cos
exp
int
len
left$
log
log10
mid$
pi
right$
rnd
sgn
sin
sqr
str$
tab
tan
val

Add support for QBASIC dialect/IDE Features

Would you be willing to accept contributions that add support for the QBASIC dialect of BASIC? There is a list of additional keywords that need highlighting. Also, there would need to be modifications for indentation of FUNCTION and SELECT CASE.

The QBASIC IDE automatically uppercases keywords as that was the convention for the dialect. There are probably a couple of other minor things from the IDE that could be brought into this mode as well.

If you are receptive to the idea, I can break this request down into a number of smaller issues and also commit to making contributions where possible.

Support dialects of BASIC

As discussed in #18, it would be useful if basic-mode let users pick from a set of predefined BASIC flavors and made it easier for them to create customized modes of their own.

One solution would be to use define-derived-mode to create new submodes. Here is an example based on the reference card for the TRS-80 Model III's BASIC. It is similar to most other TRS-80 BASICs (like Color Basic and Model 100 BASIC), but has some notable differences. For example, one of the statements allowed is CLOAD?, so in this example I've modified the syntax table so that question mark is part of an identifier instead of punctuation. [Side note: An unintended benefit of this is that programs that use ? as shorthand for PRINT are properly syntax highlighted, as long as the question mark is followed by a space.]

basic-trs80-mode
(defun basic-mode-initialize ()
  "Initializations for sub-modes of basic-mode.
This is called by basic-mode on startup and by its derived modes
after making customizations to font-lock keywords and syntax tables."
  (setq-local basic-font-lock-keywords
              (list (list basic-comment-regexp 0 'font-lock-comment-face)
                    (list basic-linenum-regexp 0 'font-lock-constant-face)
                    (list basic-label-regexp 0 'font-lock-constant-face)
                    (list basic-constant-regexp 0 'font-lock-constant-face)
                    (list basic-keyword-regexp 0 'font-lock-keyword-face)
                    (list basic-type-regexp 0 'font-lock-type-face)
                    (list basic-function-regexp 0 'font-lock-function-name-face)
                    (list basic-builtin-regexp 0 'font-lock-builtin-face)))
  (if basic-syntax-highlighting-require-separator
      (setq-local font-lock-defaults (list basic-font-lock-keywords nil t))
    (setq-local font-lock-defaults (list basic-font-lock-keywords nil t basic-font-lock-syntax)))
  (unless font-lock-mode
    (font-lock-mode 1)))

(define-derived-mode basic-trs80-mode basic-mode "Basic[TRS-80]"
  "Programming mode for BASIC for TRS-80 machines.
This is just a demo and so only handles TRS-80 Model III. At a
minimum, this is missing keywords used in the TRS-80 Model 100
BASIC and TRS-80 Extended Color Computer BASIC.
For more details see `basic-mode'."

  (setq-local basic-function-regexp
              (regexp-opt '("abs" "asc" "atn" "cdbl" "cint" "chr$" "cos" "csng" "erl" "err"
                            "exp" "fix" "fre" "inkey$" "inp" "int" "left$" "len" "log"
                            "mem" "mid$" "point" "pos" "reset" "right$" "set" "sgn" "sin"
                            "sqr" "str$" "string$" "tan" "time$" "usr" "val" "varptr")
                          'symbols))

  (setq-local basic-builtin-regexp
              (regexp-opt '("?" "auto" "clear" "cload" "cload?" "cls" "data" "delete"
                            "edit" "input" "input #" "let" "list" "llist"
                            "lprint" "lprint tab" "lprint using" "new"
                            "mod" "not" "or" "out" "peek" "poke"
                            "print" "print tab" "print using"
                            "read" "restore" "system" "troff" "tron")
                          'symbols))

  (setq-local basic-keyword-regexp
              (regexp-opt '("as" "defdbl" "defint" "defsng" "defstr"
                            "dim" "do" "else" "end" "error" "for"
                            "gosub" "go sub" "goto" "go to" "if" "next" "on"
                            "step" "random" "resume" "return" "then" "to")
                          'symbols))

  ;; With no table argument, `modify-syntax-entry' changes the current
  ;; buffer's table, which here is the derived mode's own table, so
  ;; plain basic-mode buffers are not affected.
  (modify-syntax-entry ?? "w   ") ; Treat ? as part of identifier ("cload?")
  (modify-syntax-entry ?# "w   ") ; Treat # as part of identifier ("input #")

  (basic-mode-initialize))

As you can see, I'm setting some of the variables, like basic-keyword-regexp, that were declared constant via defconst, so I changed those to defvar.

In addition, I removed some of the font-lock initialization from the end of the definition of basic-mode and moved it into basic-mode-initialize. I may have moved more than necessary, but it seems to work.

basic-mode
(define-derived-mode basic-mode prog-mode "Basic"
  "Major mode for editing BASIC code."
  :group 'basic
  (add-hook 'xref-backend-functions #'basic-xref-backend nil t)
  (setq-local indent-line-function 'basic-indent-line)
  (setq-local comment-start "'")
  (setq-local syntax-propertize-function
              (syntax-propertize-rules ("\\(\\_<REM\\_>\\)" (1 "<"))))
  (basic-mode-initialize))

Further questions:

  • What is the difference between basic-builtin-regexp and basic-keyword-regexp? It seemed like keywords were mostly control-flow (IF, GOTO, CALL) and data declarations (DIM, DEFINT), while builtins were everything else (POKE, PRINT), but I couldn't quite tell. For example, why are "AS" and "RANDOMIZE" keywords? Why isn't "PEEK" a function?

  • It would be nice to be able to just add-to-list/delete a few keywords instead of having to redefine everything when deriving a submode. To do that would require creating variables to hold the basic-builtins, basic-keywords, and basic-functions, and not run regexp-opt until in basic-mode-initialize. That shouldn't be too hard, but I haven't done it in my example code.

  • Conf-mode is often able to autodetect the proper submode. Are there ways to detect a BASIC dialect?

  • Are submodes the best way to provide this functionality? They have very nice hierarchical inheritance properties that would be hard to beat. (For example, if there were different submodes for QBasic, QuickBasic, GW-BASIC, and BASICA, they might all derive from a common Microsoft-BASIC ancestor). Are there any other possibilities besides define-derived-mode?
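
To illustrate the second question above, here is a minimal sketch of keeping keywords in plain lists and building the regexp late. All names are hypothetical (my-basic-keywords, my-basic-trs80-keywords, my-basic-keyword-regexp), and the lists are deliberately tiny.

```elisp
;; Hypothetical names, not part of basic-mode.
(defvar my-basic-keywords '("else" "for" "goto" "if" "next" "then")
  "Keywords of the base mode, stored as a plain list of strings.")

(defvar my-basic-trs80-keywords
  (append '("defint" "resume") my-basic-keywords)
  "A derived dialect only lists what it adds to the base keywords.")

(defun my-basic-keyword-regexp (keywords)
  "Build the font-lock regexp for KEYWORDS at initialization time."
  (regexp-opt keywords 'symbols))
```

A function like basic-mode-initialize could then call my-basic-keyword-regexp on whichever list the current mode uses, so a submode would not have to repeat the whole keyword set.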


Trivia: According to the Small BASIC FAQ, there are over 230 different BASIC dialects.

Can't seem to enable basic-mode

I installed your package on my Emacs 24.5.1 installed from the package manager of Linux Mint 18.2 (based on Ubuntu LTS 16.04), and couldn't find a way to enable basic-mode. I tried to see if M-x basic-mode was available, but the only 2 shortcuts smex completion showed me were irrelevant. I tried to see if I could find the customization group and couldn't find it either (but that was expected, as many packages require their mode to be enabled first and then they show the relevant options at the Customize interface). After I perused the code and saw something about .bas listed there, I tried to create a .bas file with a simple Hello World program, to see if I could get that recognised by basic-mode, but it was opened as Fundamental mode instead. Hope you can fix this, as it basically makes the package unusable.
Thanks for the understanding.

Auto-number gets confused by digits after point

With auto-number turned on, pressing Enter when there are digits at and after point merges the newly created line number with those digits.

For example, given this file:

10 REM Copylefted (ↄ) 2022.
                      ^
20 REM All wrongs reversed.

Hitting enter while the cursor is flashing on the first digit of 2022 will result in this:

10 REM Copylefted (ↄ)
152022 .
       ^
20 REM All wrongs reversed.

It should have put a space before 2022, not after. And the point should be left on the first digit of 2022, not on the period.
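
The expected behavior can be sketched in a few lines. The command my-basic-newline-and-number is hypothetical, takes the new line number as an argument instead of computing it, and leaves any whitespace at the end of the old line untouched.

```elisp
;; Hypothetical helper, not part of basic-mode.
(defun my-basic-newline-and-number (number)
  "Break the line at point and start the new line with NUMBER and a space.
Text that followed point ends up after the inserted number, and
point is left at the start of that text."
  (insert "\n" (number-to-string number) " "))
```

Because everything is inserted before point, the digits of 2022 stay to the right of the new line number and its separating space, and point remains on the first digit.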

Add C64 sub mode

As C64 BASIC was (and is?) very popular, basic-mode should also include a C64 sub mode.

Renumber sometimes scrambles code

Try renumbering the following code with basic-syntax-highlighting-require-separator set to t.

   10 if a > 0 then10else40
   20 if a > 0 then 10 else 40
   30 if a > 0 then goto10 else gosub40
   40 goto10 : goto 10 : go to10 : go to 10
   50 gosub40 : gosub 40 : go sub40 : go sub 40
   60 on a goto10,20,40

With the flag set to t it should not even touch code like goto10, but instead it scrambles it. With the flag set to nil it works as intended.
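
For comparison, here is a minimal sketch of the behavior I would expect, written as a stand-alone function on a single line of text. The name my-basic-renumber-jumps and its arguments are hypothetical. It only handles a single target directly after GOTO/GOSUB, not THEN/ELSE targets or ON ... GOTO lists.

```elisp
;; Hypothetical helper, not part of basic-mode.
(defun my-basic-renumber-jumps (line mapping require-separator)
  "Rewrite GOTO/GOSUB targets in LINE using MAPPING, an alist of (OLD . NEW).
With REQUIRE-SEPARATOR non-nil, a target is only rewritten when
whitespace separates it from the keyword, so text like goto10 is
left alone."
  (let ((case-fold-search t)
        (regexp (concat (if require-separator "\\_<" "")
                        "\\(go\\s-*\\(?:to\\|sub\\)"
                        (if require-separator "\\s-+" "\\s-*")
                        "\\)\\([0-9]+\\)")))
    (replace-regexp-in-string
     regexp
     (lambda (match)
       ;; Group 1 is the keyword plus spacing, group 2 the old number.
       (let* ((old (string-to-number (match-string 2 match)))
              (new (alist-get old mapping old)))
         (concat (match-string 1 match) (number-to-string new))))
     line t t)))
```

With a mapping of ((10 . 100)), the text "40 goto10 : goto 10" would keep goto10 untouched when REQUIRE-SEPARATOR is t and rewrite both jumps when it is nil.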
