tsoding / porth
It's like Forth but in Python
The code below seems to generate the expected output in simulation mode but doesn't output anything in compilation mode.
include "std.porth"
"abcdefghijklmnopqrstuvxyz\0"
mem swap .64
mem + 1 swap .
mem ,64 print // string address in memory
mem + 1 , print // 26 string length
mem ,64 , print // 97 (a)
Simulation output: ./porth.py sim
31
26
97
Compilation output is empty. ./porth.py com -r
I was interested in contributing to this, so I figured I should ask how you're thinking this should be implemented.
Mainly, should the size of the value to be read be an input (like 8, 16, 32, 64 bit) OR should it just have intrinsic support for 64 bit values and then have standard library macros provide support for other sizes?
If it's only 64-bit, then it can probably just be done with
pop rax
xor rbx, rbx
mov rbx, [rax] ;; Instead of mov bl, [rax]
push rbx
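For comparison, here is a minimal Python sketch of what the matching 64-bit load and store could look like on the simulator side. It assumes memory is a flat `bytearray` and that values are stored little-endian, as on x86-64. The names `load64` and `store64` are hypothetical and are not taken from porth.py.

```python
def store64(mem: bytearray, addr: int, value: int) -> None:
    """Write the low 64 bits of value at addr, little-endian."""
    masked = value & 0xFFFFFFFFFFFFFFFF
    mem[addr:addr + 8] = masked.to_bytes(8, byteorder="little")


def load64(mem: bytearray, addr: int) -> int:
    """Read an unsigned 64-bit little-endian value starting at addr."""
    return int.from_bytes(mem[addr:addr + 8], byteorder="little")


mem = bytearray(64)
store64(mem, 8, 8589934592)
print(load64(mem, 8))
```

Narrower loads (8, 16, 32 bit) could then be expressed as a 64-bit load followed by a mask, which is one way to read the "standard library macros" option from the question.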
I understand that the language is currently changing rapidly, but @rexim, can you please consider PRs that add Porth code, at least ones that do not affect your main activity of writing the compiler, i.e. adding examples or maybe additions/fixes to std?
It would also be very nice to reflect this in the CONTRIBUTING.md
Unfortunately I can't watch the code sessions live on Twitch, but I'm following them on YouTube :). I was curious about how the performance would be if the simulator were written in Go or C. As I'm not a C developer, I tried to translate it to Go. You can see the source code in Go-Porth.
As you may notice, I tried to keep the code as close to the original as possible. And before you guys tell me: I know that this is not the most idiomatic Go code, and I also know that the idea of the project is to be self-hosted.
I would like to thank you for the code sessions you're posting on YouTube!
It would be nice to only produce jump points in the assembly when that jump point is actually used.
This is useful because it makes it clear where an instance like:
addr_79:
;; -- plus --
pop rax
pop rbx
add rax, rbx
push rax
addr_80:
;; -- load --
pop rax
xor rbx, rbx
mov bl, [rax]
push rbx
can be safely reduced to:
addr_79:
;; -- plus --
pop rax
pop rbx
add rax, rbx
;; push rax
addr_80:
;; -- load --
;; pop rax
xor rbx, rbx
mov bl, [rax]
push rbx
while this one can't:
addr_11:
;; -- push int 0 --
mov rax, 0
push rax
addr_12:
;; -- while --
addr_13:
;; -- dup --
pop rax
push rax
push rax
because a jump might occur to addr_12.
Doing this would make it possible to do optimizations with a separate tool instead of making the compiler more complex.
A quick and dirty test removing some of the redundant pushes and pops manually from the generated assembly takes the runtime of rule110 on a board of size 1000 from 0.069s to 0.065s (consistently over a batch of 10 runs), and I definitely missed at least half the pops.
So, probably not worth much but it's still something.
EDIT: I removed a few more and it now goes from 0.069s to 0.046s, by also replacing instances of:
push rax
pop rbx
by:
mov rbx, rax
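The separate tool mentioned above could start out as a small peephole pass over the generated assembly. The sketch below is only an illustration under stated assumptions: jumps are written as `j<cc> label` on their own line, comments start with `;;`, and the function names `strip_noise` and `peephole` are made up for this example.

```python
import re

PUSH = re.compile(r"^\s*push\s+(\w+)\s*$")
POP = re.compile(r"^\s*pop\s+(\w+)\s*$")
LABEL = re.compile(r"^\s*(\w+):\s*$")


def strip_noise(lines):
    """Drop `;;` comment lines and labels that no jump instruction names."""
    used = set()
    for line in lines:
        parts = line.split()
        # Assumption: jumps look like `jmp addr_12` or `jz addr_12`.
        if len(parts) == 2 and parts[0].startswith("j"):
            used.add(parts[1])
    kept = []
    for line in lines:
        label = LABEL.match(line)
        if line.strip().startswith(";;"):
            continue
        if label and label.group(1) not in used:
            continue
        kept.append(line)
    return kept


def peephole(lines):
    """Fuse a directly adjacent `push X` / `pop Y` pair into `mov Y, X`."""
    out = []
    i = 0
    while i < len(lines):
        push = PUSH.match(lines[i])
        pop = POP.match(lines[i + 1]) if i + 1 < len(lines) else None
        if push and pop:
            src, dst = push.group(1), pop.group(1)
            if src != dst:
                out.append(f"    mov {dst}, {src}")
            # `push X` followed by `pop X` leaves X unchanged, so drop both.
            i += 2
        else:
            out.append(lines[i])
            i += 1
    return out


asm = ["    push rax", "addr_80:", "    ;; -- load --", "    pop rbx"]
print(peephole(strip_noise(asm)))
```

Because a label that is still referenced stays in place between the two instructions, the pair is only fused when no jump can land between them, which is the safety condition described above.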
Right now, macros in porth.porth refer to the addresses of the corresponding objects, and they are dereferenced as needed.
Template for names of addresses:
&{name}
So, macros that get the value of a single-cell object can be implemented like this:
macro {name}
&{name} @64
end
And for writing single cell values, like this (not sure about syntax):
macro ={name}
&{name} !64
end
With #129 this code
macro push-op // type operand --
ops-count @64 sizeof(Op) * ops +
dup Op.operand rot swap !64
Op.type !64
ops-count inc64
end
transforms into this
macro push-op // type operand --
ops-count &ops[]
dup Op.operand rot swap !64
Op.type !64
&ops-count inc64
end
A lot more readable
Because I want to contribute to Porth, I might be able to work on packaging.
In compile mode, we are using len on the token value directly, which means we are not accounting for the size difference introduced by UTF-8 characters. This is handled in simulation mode but was missed in compilation mode.
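The difference is easy to reproduce in plain Python. The snippet below is a minimal illustration and does not use any porth.py code; `byte_len` is a hypothetical helper name.

```python
def byte_len(text: str) -> int:
    """Length of text in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))


word = "h\u00e9llo"    # "héllo", with a precomposed e-acute
print(len(word))       # 5 code points
print(byte_len(word))  # 6 bytes, because U+00E9 takes two bytes in UTF-8
```

A string length pushed at compile time would need to be the byte count for it to match what ends up in the data section.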
It's impossible to use an if or while block inside of a macro, as it results in a compiler error. It doesn't matter whether you simulate or compile; it's broken in both.
macro test
2 1 < if
12 print
end
end
$ python porth.py sim borked.porth
Traceback (most recent call last):
File "porth.py", line 924, in <module>
program = compile_file_to_program(program_path, include_paths);
File "porth.py", line 862, in compile_file_to_program
return compile_tokens_to_program(lex_file(file_path), include_paths)
File "porth.py", line 709, in compile_tokens_to_program
block_ip = stack.pop()
IndexError: pop from empty list
Version d7ecaa2
I just realized that there's no need to reverse the tokens, since you can just use tokens.pop(0) and tokens = macros[token.value].tokens + tokens when you need to prepend a list of tokens to tokens (when expanding a macro, for example).
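As a rough sketch of the same idea: `list.pop(0)` is O(n) in Python, so the version below swaps in `collections.deque`, which pops from the left in constant time. Tokens are plain strings here and `expand` is a hypothetical name; the real token and macro types in porth.py are richer than this.

```python
from collections import deque


def expand(tokens, macros):
    """Return tokens with every macro name replaced by its body, recursively."""
    pending = deque(tokens)
    out = []
    while pending:
        token = pending.popleft()
        if token in macros:
            # Put the body back at the front so nested macros expand too.
            # extendleft reverses its input, hence the reversed().
            pending.extendleft(reversed(macros[token]))
        else:
            out.append(token)
    return out


macros = {"inc": ["1", "+"], "inc2": ["inc", "inc"]}
print(expand(["5", "inc2", "print"], macros))
```

Note that a macro that refers to itself would loop forever in this sketch; it has no recursion guard.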
I don't know if this can bring any performance improvements but I think it simplifies the code a little bit.
If you want I can change the code and send a PR.
To reproduce this bug imagine the following scenario:
$ cat test.porth
34 35 + print
$ ./porth.py com -r test.porth
$
As you can see, the expected output is 69, but nothing is printed. This happens because basedir is empty and basepath is ["test"], so exit(cmd_call_echoed([basepath] + argv, silent)) runs the command "test", which is a valid command in $PATH.
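One possible fix, sketched here under the assumption that the path is built from a directory part and a base name as described, is to fall back to "." when the directory part is empty. The function name `executable_path` is hypothetical.

```python
import os


def executable_path(program_path: str, porth_ext: str = ".porth") -> str:
    """Build the path of the compiled binary so it never resolves via $PATH."""
    basename = os.path.basename(program_path)
    if basename.endswith(porth_ext):
        basename = basename[:-len(porth_ext)]
    # An empty dirname means "current directory"; make that explicit.
    basedir = os.path.dirname(program_path) or "."
    return os.path.join(basedir, basename)


print(executable_path("test.porth"))
```

With the explicit "." prefix, the shell or `subprocess` runs the freshly built file instead of looking up `test` in `$PATH`.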
The push instruction supports only 8-, 16-, and 32-bit immediate values. To push a 64-bit value, it must first be stored in a register, e.g.
mov rax, 8589934592
push rax
Relevant code
Lines 286 to 287 in 0b4afff
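A code generator could pick between the two forms based on the value. The sketch below assumes that a 32-bit push immediate is sign-extended to 64 bits in 64-bit mode, so only values outside the signed 32-bit range need the register detour. `emit_push_int` is a hypothetical helper, not the function at the referenced lines.

```python
def emit_push_int(value: int) -> list:
    """Return assembly lines that push value onto the stack."""
    if -(2 ** 31) <= value < 2 ** 31:
        # Fits a sign-extended 32-bit immediate.
        return [f"    push {value}"]
    # Too wide for an immediate push: go through a register.
    return [f"    mov rax, {value}", "    push rax"]


for line in emit_push_int(8589934592):
    print(line)
```

Always emitting the `mov rax` / `push rax` pair would be simpler and equally correct; the branch only saves an instruction for small values.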
Python uses bigints by default, which means you can have arbitrarily large numbers. I noticed this when writing an abs macro. The following code produces two different outputs on my machine:
include "std.porth"
macro abs
if dup 0 < do
0 over - swap drop
end
end
-18446744073709551615 abs
// becomes 1 on bare metal, 18446744073709551615 in Python
1 -
if 0 = do
"We're running on bare metal!\n" puts
else
"We're in a simulation!\n" puts
end
Not sure if it's worth fixing in Python as when Porth becomes self-hosted this issue will go away.
At least it's amusing :)
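If it were worth fixing, the simulator could wrap every arithmetic result to 64 bits. This is a minimal sketch of that masking, with hypothetical helper names; it is not how porth.py currently behaves.

```python
MASK64 = (1 << 64) - 1


def wrap_u64(value: int) -> int:
    """Reduce an arbitrary Python int to an unsigned 64-bit value."""
    return value & MASK64


def wrap_i64(value: int) -> int:
    """Reduce an arbitrary Python int to a signed 64-bit value."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


print(wrap_u64(-18446744073709551615))  # 1, matching the bare-metal result
```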
typo on this line:
ment -> meant
Line 93 in 80c3be9
I'd like to not need to include std.porth in every project I write, so I have an idea: if a file (std.porth) is not found in the directory specified in include, make it look in /usr/lib/include/porth or /usr/lib/include/std.porth
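The lookup could be sketched as a search over the user-supplied include paths followed by a system-wide fallback. The function name `resolve_include` and the `fallback_dirs` parameter are assumptions for this example; only the `/usr/lib/include/porth` directory comes from the proposal above.

```python
import os


def resolve_include(name, include_paths, fallback_dirs=("/usr/lib/include/porth",)):
    """Return the first existing path for name, or None if it is not found."""
    for directory in list(include_paths) + list(fallback_dirs):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


print(resolve_include("std.porth", ["."]))
```

Searching the explicit paths first keeps a project-local std.porth in control when both copies exist.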
macro ifSmaller
< if
end
'Half' macros like this sadly don't work currently. As far as I can see, the lexer goes through the macro content and actually evaluates the if end sequence instead of ignoring it until all macros are expanded. Correct me if I'm wrong.
Currently, the type checker handles the swap operation by just popping the (type, location) tuples from the type stack and then pushing them back in the new order. This means the source location in each tuple doesn't change, i.e. the arguably real source of the element (the swap operation) is not recorded. Is this really the intended behavior?
This is not limited to the SWAP operation, of course.
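To make the alternative concrete, here is a small sketch of a swap that stamps both elements with the location of the swap itself. The tuple layout `(type, location)` follows the description above; the function name, the string type names, and the location format are assumptions for illustration only.

```python
def typecheck_swap(stack, swap_loc):
    """Swap the top two (type, location) entries, recording swap_loc as their source."""
    if len(stack) < 2:
        raise ValueError(f"{swap_loc}: not enough arguments for swap")
    a_type, _ = stack.pop()
    b_type, _ = stack.pop()
    stack.append((a_type, swap_loc))
    stack.append((b_type, swap_loc))


stack = [("int", ("test.porth", 1, 1)), ("ptr", ("test.porth", 1, 4))]
typecheck_swap(stack, ("test.porth", 1, 8))
print(stack)
```

Whether this is preferable is exactly the open question: pointing at the swap helps explain the stack order, while keeping the original location points at where the value was created.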
computer info:
OS:raspios
Model:model 4 b
Recently I was playing with a new keyword, defined, so that we can check up front whether a word is already defined. This lets us, for example, avoid including a file twice or change the library behavior if some macro is already defined.
But for this to happen, I should have the ability to redefine a macro, something like this:
defined std.porth 0 = if
macro std.porth 1 end
...
end // defined std.porth
or
defined SOMETHING 0 = if // if SOMETHING is not defined
macro SOMETHING 69 end // define a default value in case SOMETHING is undefined
end
I don't know if this new keyword will be useful, but I think the ability to redefine a macro should be possible. What do you guys think about it? Should macros behave like a constant expression?
I'd recommend enabling the repository's Discussions area. Having a public place where people can discuss Porth-related projects would be quite useful.
When downloading the code and then doing the Quickstart, instead of working, it prints out:
File "porth.py", line 18
MEM_CAPACITY = 640_000 # should be enough for everyone
^
SyntaxError: invalid syntax
This also happens for:
SIM_STR_CAPACITY = 640_000
SIM_ARGV_CAPACITY = 640_000
Note: Because I'm using Windows, I am using an Ubuntu emulator found in the Windows Store.
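Underscores in numeric literals such as `640_000` were added in Python 3.6, so an older interpreter rejects the file at parse time. A quick way to check the interpreter is sketched below; `MIN_VERSION` and `check_python` are hypothetical names, and porth.py itself may need a newer version than 3.6 for other reasons.

```python
import sys

MIN_VERSION = (3, 6)  # first release that accepts literals like 640_000


def check_python(version=sys.version_info) -> bool:
    """Return True if the given interpreter version accepts underscore literals."""
    return tuple(version[:2]) >= MIN_VERSION


if not check_python():
    print("Python %d.%d or newer is required" % MIN_VERSION)
```

Such a check has to live in a separate file that parses on old versions, because a script containing `640_000` fails with the SyntaxError before any of its code runs.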