alexfru / smallerc
Simple C compiler
License: BSD 2-Clause "Simplified" License
Typically, when you run an EXE file, you can pass /?, -h, or --help as the only command-line parameter to get a list of all the command-line parameters it accepts, usually with a short description of how to use them. Unfortunately, I can't seem to do that with any of the EXEs that make up this compiler (the compiler, compiler driver, linker, etc.), and I can't find a list of them on this GitHub page either. Could someone here publish a list of the available command-line parameters for this software?
I have been looking for a small C compiler to hack a backend into for my bytecode virtual machine, and so far most are not approachable at all.
Is it possible to create an MZ EXE with both 16-bit and 32-bit code segments (for example, a stub that enters protected mode, calls into your main application to do some work, then returns to DOS), without DPMI?
I am running into functional code issues when using char and short (or any type smaller than a word) with MIPS code generation.
Consider the following example:
extern char ReturnAChar();
extern void SomethingWithChar(char);
void main() {
    char x = ReturnAChar();
    if (x == 0) {
        SomethingWithChar(x);
    }
}
When I inspect the output, I see an issue with the handling of x:
subu $29, $29, 16
jal ReturnAChar
subu $29, $29, -16
sw $2, -4($30) # issue!
lb $2, -4($30)
bne $2, $0, $L3
x gets stored as a word, but is then retrieved as a byte. At least on a big-endian machine, this is going to result in an incorrect read.
I tried looking into the callers of GenWriteLocal in cgmips.c, but I am having a hard time tracking down why they might be passing the wrong value for opSz, especially since the loads are being generated fine.
The default output filename for DOS or Windows targets is "aout.exe". That name is not detected by autoconf, and I think SmallerC is far too small a project for autoconf to change that.
The easiest fix would be to use a default name that autoconf already detects:
ac_files="a.out conftest.exe conftest a.exe a_out.exe b.out conftest.*"
Since the source file is named "conftest.c", I suggest creating "conftest.exe" or falling back to "a.exe" ("a_out.exe" is also detected, but not removed before or after the test).
Hey,
I'm working on a MikeOS C library and I'm having trouble with the lack of the long type, which is needed for some functions. It would also be nice to support floating point in printf and other C functions. Do you have any plans to add the long or float types for 16-bit binaries?
If not, perhaps you could explain the problems and I'll try to implement it.
By the way, I've noticed that using a floating-point operation in 16-bit mode causes the compiler to emit a symbol such as __addsf3.
Could I add floating-point support at the library level by providing my own functions for these symbols?
Thanks
Hi,
Recently I've been developing a 16-bit library/archive. The library is written in assembly (NASM) and makes calls to immediate values. It assembles into an ELF object just fine, but when I link it with smlrl, I get an "Unhandled exception!" error. I've been able to trace the problem back to the call instructions. I'm not sure if I'm making a silly mistake or if it's a genuine problem, but it seems that the linker can't handle my 16-bit calls!
Below is the readout of the build process:
smlrc.exe -seg16 -seg16 -SI .\../include test.c test.asm
nasm.exe -f elf test.asm -o test.o
smlrl.exe -flat16 -tiny -origin 0x3200 -entry _main test.o mlib.a -o test.bin
Unhandled exception!
Executed command failed
I can give more information if needed. Any help would be appreciated!
Hi,
While compiling an ELF loader on Windows, I ran into the following errors:
Error in "elf-loader/elf-module.c" (38:41)
Unexpected declaration or expression of type void
Failed command 'smlrc.exe -seg32 -winstack -nopp elf-loader/elf-module.i elf-loader/elf-module.asm'
The file (elf-loader/elf-module.c) is as follows:
/*
 * ELF loadable modules
 *
 * 12 Feb 2011, Yury Ossadchy
 */
#include <string.h>
#include "elf-module-private.h"

static char *elf_module_sym_name(elf_module_t *elf, int offs)
{
    return elf->names + offs;
}

static void elf_module_layout(elf_module_t *elf)
{
    int i;
    Elf_Shdr *shdr;
    for (i = 1; i < elf->header->e_shnum; i++) {
        shdr = &elf->sections[i];
        if (!(shdr->sh_flags & SHF_ALLOC))
            continue;
        shdr->sh_addr = shdr->sh_addralign
            ? (elf->size + shdr->sh_addralign - 1) & ~(shdr->sh_addralign - 1)
            : elf->size;
        elf->size = shdr->sh_addr + shdr->sh_size;
    }
}

static void *elf_module_get_ptr(elf_module_t *elf, Elf_Addr addr)
{
    return elf->start + addr;
}

static void *elf_module_sec_ptr(elf_module_t *elf, Elf_Shdr *shdr)
{
    return (void *) elf->header + shdr->sh_offset;
}

static int elf_module_reloc(elf_module_t *elf)
{
    int i;
    Elf_Shdr *shdr = &elf->sections[0];
    for (i = 0; i < elf->header->e_shnum; i++) {
        switch (shdr->sh_type) {
        case SHT_REL:
            elf_module_reloc_section(elf, shdr);
            break;
        case SHT_RELA:
            elf_module_reloca_section(elf, shdr);
            break;
        }
        shdr++;
    }
    return 0;
}

static int elf_module_link(elf_module_t *elf, elf_module_link_cbs_t *cbs)
{
    int err;
    int n = elf->symtab->sh_size / sizeof(Elf_Sym);
    Elf_Sym *sym;
    Elf_Sym *symtab = elf_module_sec_ptr(elf, elf->symtab);
    Elf_Sym *end = &symtab[n];
    for (sym = &symtab[1]; sym < end; sym++) {
        switch (sym->st_shndx) {
        case SHN_COMMON:
            return -EME_NOEXEC;
        case SHN_ABS:
            break;
        case SHN_UNDEF:
            /* resolve external symbol */
            sym->st_value = (Elf_Addr)
                cbs->resolve(elf, elf_module_sym_name(elf, sym->st_name));
            if (!sym->st_value)
                return -EME_UNDEFINED_REFERENCE;
            break;
        default:
            /* bind to physical section location and define as accessible symbol */
            sym->st_value += (Elf_Addr) elf_module_get_ptr(elf,
                elf->sections[sym->st_shndx].sh_addr);
            if (ELF_SYM_TYPE(sym->st_info) != STT_SECTION) {
                err = cbs->define(elf, elf_module_sym_name(elf, sym->st_name),
                    (void *) sym->st_value);
                if (err < 0)
                    return err;
            }
        }
    }
    return 0;
}

int elf_module_init(elf_module_t *elf, void *data, size_t size)
{
    int i;
    elf->header = data;
    if (memcmp(elf->header->e_ident, ELF_MAGIC, sizeof(ELF_MAGIC) - 1)
        || !elf_module_check_machine(elf)) {
        return -EME_NOEXEC;
    }
    elf->sections = data + elf->header->e_shoff;
    elf->strings = data + elf->sections[elf->header->e_shstrndx].sh_offset;
    elf->size = 0;
    /* section 0 is reserved */
    for (i = 1; i < elf->header->e_shnum; i++) {
        Elf_Shdr *shdr = &elf->sections[i];
        if (shdr->sh_type == SHT_SYMTAB) {
            elf->symtab = &elf->sections[i];
            elf->strtab = &elf->sections[elf->sections[i].sh_link];
            elf->names = data + elf->strtab->sh_offset;
        }
    }
    elf_module_layout(elf);
    return 0;
}

size_t elf_module_get_size(elf_module_t *elf)
{
    return elf->size;
}

void elf_module_set_data(elf_module_t *elf, void *data)
{
    elf->data = data;
}

void *elf_module_get_data(elf_module_t *elf)
{
    return elf->data;
}

int elf_module_load(elf_module_t *elf, void *dest, elf_module_link_cbs_t *cbs)
{
    int i;
    int res;
    Elf_Shdr *shdr;
    elf->start = dest;
    for (i = 1; i < elf->header->e_shnum; i++) {
        shdr = &elf->sections[i];
        if (!(shdr->sh_flags & SHF_ALLOC))
            continue;
        memcpy(elf_module_get_ptr(elf, shdr->sh_addr),
            (void *) elf->header + shdr->sh_offset, shdr->sh_size);
    }
    res = elf_module_link(elf, cbs);
    if (res < 0)
        goto out;
    res = elf_module_reloc(elf);
out:
    return res;
}

void *elf_module_lookup_symbol(elf_module_t *elf, char *name)
{
    int i;
    int n = elf->symtab->sh_size / sizeof(Elf_Sym);
    Elf_Sym *sym = (void *) elf->header + elf->symtab->sh_offset + sizeof(Elf_Sym);
    for (i = 1; i < n; i++, sym++) {
        switch (sym->st_shndx) {
        case SHN_ABS:
            break;
        case SHN_UNDEF:
            break;
        default:
            if (!strcmp(elf_module_sym_name(elf, sym->st_name), name))
                return (void *) sym->st_value;
        }
    }
    return NULL;
}
Line 38 is the function elf_module_get_ptr.
I use clang (with -fsanitize=address) to compile this project.
When I run make, I receive:
awk -v l=/opt/SmallerC/v0100/srclib/ '/.$/{$0=l$0}{print}' /opt/SmallerC/v0100/srclib/lcds.txt > lcds.op
./smlrcc -SI /opt/SmallerC/v0100/include -I /opt/SmallerC/v0100/srclib @lcds.op
==15851==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1400 byte(s) in 1 object(s) allocated from:
#0 0x4da2e0 in realloc (/opt/SmallerC/smlrcc+0x4da2e0)
#1 0x51445b in fatargs /opt/SmallerC/v0100/smlrcc.c:550:23
SUMMARY: AddressSanitizer: 1400 byte(s) leaked in 1 allocation(s).
/opt/SmallerC/v0100/../common.mk:42: recipe for target 'lcds.a' failed
make: *** [lcds.a] Error 1
rm lcds.op
I created a project:
https://c-testsuite.github.io/
https://github.com/c-testsuite/c-testsuite
and wondered if you would like to join in, things are sparse and need organizing, but perhaps you understand what I am aiming for.
For DOS, please add support for the __seg (segment type) modifier, just like in BC++.
There is a description in Russian at http://citforum.ru/programming/bcpp/r77_1.shtml
Still trying to port mksh to SmallerC. The compiler does not find the following header files:
- <sys/ioctl.h> (I might work around that by disabling core functionality, if absolutely needed; see below)
- <sys/wait.h> (mandatory for process handling)
- <dirent.h> (mandatory for globbing)
- <pwd.h> (adding -DMKSH_NOPWNAM to CPPFLAGS makes this optional)
- <termios.h> (preferred) or <termio.h> (fallback)
Incidentally, these headers exist in the system C library, but if you roll your own you'll need to implement them.
As for ioctl:
- if <termio.h> is used, ioctl is mandatory (TCGETA and TCSETAW); <termios.h> uses tc{g,s}etattr instead
- TIOCSCTTY is nice but not strictly needed
- TIOCGWINSZ is… really needed if we don't want users to complain about broken display
When I use make -j4 I get a variety of errors indicating that v0100/srclib/c0.asm is used before it is built or is only partially built. This is usually caused by a missing dependency, but I can't figure out where c0 is in the Makefile.
make -j12 shows even more outlandish errors. make -j1 always works.
5.15.133.1-microsoft-standard-WSL2
commit b120a9c
$ valgrind --leak-check=full smlrcc 1.c
==81279== Memcheck, a memory error detector
==81279== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==81279== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==81279== Command: smlrcc 1.c
==81279==
==81279==
==81279== HEAP SUMMARY:
==81279== in use at exit: 381 bytes in 7 blocks
==81279== total heap usage: 57 allocs, 50 frees, 5,221 bytes allocated
==81279==
==81279== 35 bytes in 1 blocks are definitely lost in loss record 4 of 7
==81279== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==81279== by 0x10C28D: Malloc (smlrcc.c:355)
==81279== by 0x10C28D: SystemFileExists (smlrcc.c:1220)
==81279== by 0x10C4F6: AddSystemPaths (smlrcc.c:1299)
==81279== by 0x109807: main (smlrcc.c:1909)
==81279==
==81279== LEAK SUMMARY:
==81279== definitely lost: 35 bytes in 1 blocks
==81279== indirectly lost: 0 bytes in 0 blocks
==81279== possibly lost: 0 bytes in 0 blocks
==81279== still reachable: 346 bytes in 6 blocks
==81279== suppressed: 0 bytes in 0 blocks
==81279== Reachable blocks (those to which a pointer was found) are not shown.
==81279== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==81279==
==81279== For lists of detected and suppressed errors, rerun with: -s
==81279== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
int main()
{ return 0;}
char* SystemFileExists(const char* path, int slash, const char* pathsuffix, const char* name)
{
    size_t plen = strlen(path);
    char* p = Malloc(plen + 1/*slash*/ + (pathsuffix ? strlen(pathsuffix) : 0) + strlen(name) + 1/*NUL*/); // leak here
    ....
}
Maybe the solution is to add free(pinclude); pinclude = NULL; after line 1301?
I'm writing a little operating system using your awesome compiler, but I'm a bit confused as to why the assembly output (using the -flat16 option) has function names and calls beginning with an '_'. This is quite annoying since I'm often calling functions defined in assembly, so I have to either use:
asm("call function");
Or I have to write a wrapper function:
void function(int arg){
    asm("push "+arg);
    asm("call function");
}
I know this is a really small thing but I find it a bit annoying. Is there any particular reason that there is an '_' character? Also, I hope I'm posting this in the correct place; I couldn't find an email address or anything. Great compiler by the way. Thanks.
#include <stdio.h>
void main()
{
    putchar(' ');
}
smlrcc says:
./smlrcc -map mapfile ../putchar_test.c -I ./v0100/include/
Symbol '__start' not found
Failed command 'smlrl -map mapfile -elf ../putchar_test.o -o a.out'
Hello,
First off, great work!
This is possibly the only compiler that satisfies my needs (less bloat, simple and loose syntax, NASM compatible, etc.).
I have recently been playing with this compiler for DOS development. Sadly, it does not (so far) support parameterized macros, which are needed for va_start, va_arg and va_end, which are in turn needed for functions like printf. Is there a workaround?
-ALLDESP
I would like to know whether Win16 support is doable: whether SmallerC knows how to generate ASM code that can then be assembled (by NASM, etc) to Win16 NE executable, and be possibly linked to other Win16 dlls.
debbie@debian:~/SmallerC$ ./smlrcc -I./v0100/include/ ~/t/t.c
Symbol '__start' not found
Failed command 'smlrl -elf /home/debbie/t/t.o -o a.out'
hello, what did I do wrong?
Hello, I was looking for a C compiler for an 80C188 processor and this one turned out to be a really nice one. However, after looking through the wiki, I found something that might thwart my plans.
This board has 128k of RAM and 512k of flash. If the flat image produced by the compiler ends up in the read-only part of the processor's memory map (the flash), some of it (i.e. global read/write variables) will become read-only.
I looked through smlrl.c but couldn't tell how to force that section to be placed elsewhere (for my purposes, at the bottom of the memory map, 0x0). I know that I'll need an initializer stub that runs prior to my code.
So my question is: is it possible to achieve this without a lot of fiddling in the source of smlrl? Or maybe with some kind of hack using another tool alongside SmallerC?
Sorry if this is not a proper "issue", but I wanted to ask the author directly.
Hi!
I started using this compiler again to prototype some code on the same board I mentioned in my other issue (which is 3 years old now!), and this time I hit a snag when trying to use "heavier" code (like Petit FS from ElmChan). 32-bit data storage is used extensively in that codebase, mostly via a typedef for DWORD. I understand that the CPU can't do any 32-bit operations and that stubs for these operations need to be created. That said, would it be feasible to add these stubs (even unoptimized!) to existing assembly code (for instance the CRT library) and then make the compiler use them when it encounters a 32-bit operation?
Thanks.
I use nasm to compile "demo.asm" to "demo.o"
then I try "smlrl -tiny demo.o -o demo.com"
and I get the Error "Symbol '__start' not found"
Currently I have to use this tool:
https://codeberg.org/tkchia/lfanew
on smallerc-generated binaries. It adds the e_lfanew field at offset 0x3c. As a bonus, the overlay information block is enlarged and spans from 0x1c to 0x3c. I use that space because I have a lot of overlay data.
Would it be possible to support the lfanew format natively? This would give people more overlay space, and the e_lfanew field could be used to point to the next section.
Context: developing a 16-bit DOS program using nasm + smlrl.
When using nasm -fobj I can do this:
section .data
...
section .code
mov ax, data
mov ds, ax
but when using nasm -felf (needed for smlrl) that code doesn't assemble.
The error is: error: symbol 'data' undefined
My question is: how can I access the .data segment when using -felf?
I know this is not strictly a smlrl issue, but smlrl is forcing me to use -felf.
Thanks!
Hello alexfru,
I am having trouble compiling a program with the Smaller C compiler hosted on this GitHub page. When trying to compile a program with a pointer to a constant I get a warning about exceeding the limit of a signed type.
I compiled the Smaller C compiler using gcc 4.8.2 on Linux x64. Using Smaller C with no additional options, I compiled the following code.
#define foo 60000
void start()
{
    void *bar = (void*)foo;
}
This yielded a compiler error: "Constant too big for 16-bit signed type".
However the compilation is successful if the constant is in hexadecimal.
I would have expected a pointer assignment with an explicit cast to be unsigned.
Is this intended behaviour?
Thanks for your assistance,
zerokelvinkeyboard
Would you like to add a new architecture to your compiler?
I have a virtual machine with a rather exotic CPU, but no development tools for it except an assembler.
Of course, there is no particular reason for you to spend your time on this, but I had to ask.
I’m trying to port mksh to SmallerC, and it fails at even detecting the compiler because the compiler driver lacks the option -E (preprocess only).
$ cat >conftest.c <<\EOF
const char *
#if defined(__ICC) || defined(__INTEL_COMPILER)
ct="icc"
#elif defined(__xlC__) || defined(__IBMC__)
ct="xlc"
#elif defined(__SUNPRO_C)
ct="sunpro"
#elif defined(__SMALLER_C__)
ct="smlrc"
#elif defined(__neatcc__)
ct="neatcc"
#elif defined(__lacc__)
ct="lacc"
#elif defined(__ACK__)
ct="ack"
#elif defined(__BORLANDC__)
ct="bcc"
#elif defined(__WATCOMC__)
ct="watcom"
#elif defined(__MWERKS__)
ct="metrowerks"
#elif defined(__HP_cc)
ct="hpcc"
#elif defined(__DECC) || (defined(__osf__) && !defined(__GNUC__))
ct="dec"
#elif defined(__PGI)
ct="pgi"
#elif defined(__DMC__)
ct="dmc"
#elif defined(_MSC_VER)
ct="msc"
#elif defined(__ADSPBLACKFIN__) || defined(__ADSPTS__) || defined(__ADSP21000__)
ct="adsp"
#elif defined(__IAR_SYSTEMS_ICC__)
ct="iar"
#elif defined(SDCC)
ct="sdcc"
#elif defined(__PCC__)
ct="pcc"
#elif defined(__TenDRA__)
ct="tendra"
#elif defined(__TINYC__)
ct="tcc"
#elif defined(__llvm__) && defined(__clang__)
ct="clang"
#elif defined(__NWCC__)
ct="nwcc"
#elif defined(__GNUC__)
ct="gcc"
#elif defined(_COMPILER_VERSION)
ct="mipspro"
#elif defined(__sgi)
ct="mipspro"
#elif defined(__hpux) || defined(__hpua)
ct="hpcc"
#elif defined(__ultrix)
ct="ucode"
#elif defined(__USLC__)
ct="uslc"
#elif defined(__LCC__)
ct="lcc"
#elif defined(MKSH_MAYBE_KENCC)
/* and none of the above matches */
ct="kencc"
#else
ct="unknown"
#endif
;
const char *
#if defined(__KLIBC__) && !defined(__OS2__)
et="klibc"
#else
et="unknown"
#endif
;
EOF
$ smlrcc -linux -E -I. -I'/root/mksh' -DMKSH_BUILDSH -D_GNU_SOURCE -DSETUID_CAN_FAIL_WITH_EAGAIN conftest.c
Invalid or unsupported command line option '-E'
Hi!
The docs explicitly state that far pointers are unsupported. That's a pity, as it prevents many existing DOS projects from being built with this compiler.
So I'd like to know whether support is planned and, if not, why, and how difficult it would be to add. Given that huge pointers are already supported, what are the remaining problems that prevent support for far pointers?
The little documentation you provide is all over the place. There should be some kind of documentation that explains how to use it as a compiler once it has been built, preferably separated into folders by architecture. All the resource sites at the bottom of the README are far less important than a good documentation page.
There really should be a few documents for each architecture here: one explaining how to build it properly, and one explaining how to use the compiler, the assembler, and the linker. You listed no form of contact; otherwise I would have gone that way. This project makes Small C seem more disorganized than it is.
Compiling produces this error
./smlrcc -SI /tmp/makepkg-chris/smallerc/src/SmallerC-1.0.1-dos.win.1ab15c7/v0100/include -I /tmp/makepkg-chris/smallerc/src/SmallerC-1.0.1-dos.win.1ab15c7/v0100/srclib @lcw.op
Error in "/tmp/makepkg-chris/smallerc/src/SmallerC-1.0.1-dos.win.1ab15c7/v0100/srclib/kernel32/closehan.c" (9:109)
Invalid or too long file name or path name
Failed command 'smlrc -seg32 -winstack -Wall -nopp /tmp/makepkg-chris/smallerc/src/SmallerC-1.0.1-dos.win.1ab15c7/v0100/srclib/kernel32/closehan.i /tmp/makepkg-chris/smallerc/src/SmallerC-1.0.1-dos.win.1ab15c7/v0100/srclib/kernel32/closehan.asm'
make: *** [/tmp/makepkg-chris/smallerc/src/SmallerC-1.0.1-dos.win.1ab15c7/v0100/../common.mk:41: lcw.a] Error 255
This patch fixes the problem.
diff -pNaru5 a/v0100/smlrc.c b/v0100/smlrc.c
--- a/v0100/smlrc.c  2021-09-11 17:34:07.000000000 -0400
+++ b/v0100/smlrc.c  2024-03-25 17:25:33.162558811 -0400
@@ -194,11 +194,11 @@ int fsetpos(FILE*, fpos_t*);
 #ifndef SYNTAX_STACK_MAX
 #define SYNTAX_STACK_MAX (2048+1024)
 #endif
 
 #ifndef MAX_FILE_NAME_LEN
-#define MAX_FILE_NAME_LEN 95
+#define MAX_FILE_NAME_LEN 254
 #endif
 
 #ifndef NO_PREPROCESSOR
 #define MAX_INCLUDES 8
 #define PREP_STACK_SIZE 8
Greetings,
I have a problem where I need to change the Enter key.
You are one of the rare people who publish the source code of a standard library that works under DOS.
But I still need help with it.
What do the values in __chartype__[257] correspond to?
Also, I can't find the code responsible for waiting for input in __doscan(), nor the code responsible for ending the input on a \n.
Sorry for contacting you in this way...
Hello Alexey.
I have developed a low level VM with intent to use C as a scripting language.
Can I commission you to have SmallerC be able to output assembly code for my VM?
I have already created a custom assembler for the VM as well so there's no need to output direct bytecode unless you prefer to do that.
Sincerely, Assyrianic.