earlephilhower / newlib-xtensa Goto Github PK

This project forked from igrr/newlib-xtensa

newlib-xtensa fork intended for esp8266

License: GNU General Public License v2.0

Makefile 12.00% Shell 0.74% M4 0.89% Emacs Lisp 0.04% C 71.22% Assembly 4.53% C++ 9.09% Mathematica 0.01% GDB 0.01% TeX 0.80% Perl 0.28% Roff 0.13% Yacc 0.02% Awk 0.01% DIGITAL Command Language 0.01% Scala 0.02% RPC 0.03% Python 0.05% Raku 0.15% XSLT 0.01%

newlib-xtensa's Introduction

		   README for GNU development tools

This directory contains various GNU compilers, assemblers, linkers, 
debuggers, etc., plus their support routines, definitions, and documentation.

If you are receiving this as part of a GDB release, see the file gdb/README.
If with a binutils release, see binutils/README;  if with a libg++ release,
see libg++/README, etc.  That'll give you info about this
package -- supported targets, how to use it, how to report bugs, etc.

It is now possible to automatically configure and build a variety of
tools with one command.  To build all of the tools contained herein,
run the ``configure'' script here, e.g.:

	./configure 
	make

To install them (by default in /usr/local/bin, /usr/local/lib, etc),
then do:
	make install

(If the configure script can't determine your type of computer, give it
the name as an argument, for instance ``./configure sun4''.  You can
use the script ``config.sub'' to test whether a name is recognized; if
it is, config.sub translates it to a triplet specifying CPU, vendor,
and OS.)

If you have more than one compiler on your system, it is often best to
explicitly set CC in the environment before running configure, and to
also set CC when running make.  For example (assuming sh/bash/ksh):

	CC=gcc ./configure
	make

A similar example using csh:

	setenv CC gcc
	./configure
	make

Much of the code and documentation enclosed is copyright by
the Free Software Foundation, Inc.  See the file COPYING or
COPYING.LIB in the various directories, for a description of the
GNU General Public License terms under which you can copy the files.

REPORTING BUGS: Again, see gdb/README, binutils/README, etc., for info
on where and how to report problems.

newlib-xtensa's People

s-hadinger mhightower83 d-a-v someburner jjsuwa-sys3175 mcspr junqiang-yang

newlib-xtensa's Issues

TZ: Issue parsing glibc timezones

ref: esp8266/Arduino#7699

POSIX timezone strings as defined in current common Linux distributions running glibc are sometimes parsed incorrectly by newlib.

Their format starts with ABRVnn[ABRV[nn]][,...].
For example: GMT0BST,... is London TZ descriptor with two abbreviations GMT and BST.
ABRV is an abbreviation meaning something for humans. BST means "British Summer Time".

Such abbreviations are not defined for every timezone around the world. https://data.iana.org/time-zones/theory.html (source) states:

If there is no common English abbreviation, use numeric offsets like -05 and +0530 that are generated by zic's %z notation.

These numeric offsets are enclosed between <...>. For example, abbreviation for Sao Paulo TZ is <-03>3 (instead of for example valid SAOPAULO3).

The full path from the official definitions starts at the above repository: zic, the zoneinfo compiler, uses files defining timezones on all continents to build the /usr/share/zoneinfo/* files of most Linux distributions, which are parsed by this tool to produce a CSV file used by esp8266/arduino. Quite a large number of the abbreviations are numeric.

The issue is that numeric abbreviations like <-03>3 are incorrectly parsed by newlib.

On the other hand, glibc's TZ parser seems able to handle them, despite the fact that numeric abbreviations do not seem to follow the POSIX TZ definition.

Abbreviation values are unused in the esp8266/arduino time library anyway. To circumvent the parsing issue, numeric abbreviations are converted to a POSIX-compliant random string by a script (in the PR referenced at the top of this message).
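As a rough illustration of the workaround described above, the following C sketch rewrites a TZ string so that every <...> quoted abbreviation becomes a plain alphabetic one. It is not the script from the PR: the function name tz_replace_quoted_abbrevs and the fixed placeholder "UNK" are assumptions made for this example (the original uses a random string), and everything outside the angle brackets is copied unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of newlib or the esp8266 core):
 * copy the TZ string `tz` into `out`, replacing every <...> quoted
 * abbreviation with the alphabetic placeholder "UNK".
 * Returns 0 on success, -1 if `out` is too small or a '<' is unclosed. */
int tz_replace_quoted_abbrevs(const char *tz, char *out, size_t out_size)
{
    static const char placeholder[] = "UNK"; /* assumed placeholder */
    const size_t plen = sizeof(placeholder) - 1;
    size_t o = 0;

    assert(tz != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    while (*tz != '\0') {
        if (*tz == '<') {
            const char *close = strchr(tz, '>');
            if (close == NULL)
                return -1;              /* unterminated quoted name */
            if (o + plen >= out_size)
                return -1;              /* keep room for the NUL */
            memcpy(out + o, placeholder, plen);
            o += plen;
            tz = close + 1;             /* skip past the '>' */
        } else {
            if (o + 1 >= out_size)
                return -1;
            out[o++] = *tz++;
        }
    }
    out[o] = '\0';
    return 0;
}
```

With this sketch, "<-03>3" becomes "UNK3", while a string such as "GMT0BST,M3.5.0/1,M10.5.0" passes through untouched. Since the abbreviation text is not used by the time library, reusing one placeholder for both the standard and daylight names should be harmless here, though that is an assumption of the example.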

Q: slightly more compact memmove_P

Looking at the current memmove_P implementation

newlib-xtensa/newlib/libc/sys/xtensa/string_pgmspace.c

Lines 184 to 190 in ebc9675

void *memmove_P(void *dest, const void *src, size_t n)
{
    if ( ((const char *)src >= (const char *)0x40000000) && ((const char *)dest < (const char *)0x40000000) )
        return memcpy_P(dest, src, n);
    else
        return memmove(dest, src, n);
}

Since the comparison is against a constant with only one bit set, I wondered whether testing just that bit changes the generated code, assuming we can rule out it ever being used on any 'higher' addresses (0x80000000 and above).
Not sure how to benchmark it, though, so I am not really sure whether this does anything useful at all
(besides making it 5 bytes smaller :)

// > cat memmove.c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <sys/pgmspace.h>

inline static bool inFlash(const void* ptr) {
    // the common comparison would use >= 0x40000000;
    // instead, slightly reduce the footprint by testing
    // *only* bit 30 (equivalent for addresses below 0x80000000)
    static const uintptr_t Mask = 1 << 30;
    return ((uintptr_t)(ptr) & Mask) > 0;
}

void* memmove_P2 (void* dest, const void* src, size_t n) {
    if (inFlash(src) && !inFlash(dest)) {
        return memcpy_P(dest, src, n);
    } else {
        return memmove(dest, src, n);
    }
}

void* memmove_P1 (void* dest, const void* src, size_t n)
{
    if ( ((const char *)src >= (const char *)0x40000000) && ((const char *)dest < (const char *)0x40000000) )
        return memcpy_P(dest, src, n);
    else
        return memmove(dest, src, n);
}

> xtensa-lx106-elf-gcc -c -Os memmove.c
> xtensa-lx106-elf-nm --radix=d -S memmove.o | grep memmove
         U memmove
00000020 00000023 T memmove_P1
00000000 00000018 T memmove_P2
> xtensa-lx106-elf-gcc -S -Os memmove.c

    .file   "memmove.c"
    .text
    .literal_position
    .align  4
    .global memmove_P2
    .type   memmove_P2, @function
memmove_P2:
    bbci    a3, 30, .L2    ; branch on bit set / unset
    bbsi    a2, 30, .L2
    j.l memcpy_P, a9
.L2:
    j.l memmove, a9
    .size   memmove_P2, .-memmove_P2
    .literal_position
    .align  4
    .global memmove_P1
    .type   memmove_P1, @function
memmove_P1:
    movi.n  a5, -1        ; btw this only happens on Os, O2 and O3 use l32r const of 0x40000000
    srli    a5, a5, 2
    bgeu    a5, a3, .L7
    bltu    a5, a2, .L7
    j.l memcpy_P, a9
.L7:
    j.l memmove, a9
    .size   memmove_P1, .-memmove_P1
    .ident  "GCC: (GNU) 10.3.0"
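The two checks are only interchangeable while no address has bit 31 set. A small host-side C sketch can make that boundary explicit. The function names are made up for this example, and the sample addresses are illustrative values below and above 0x40000000 rather than an authoritative memory map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original test: address is at or above 0x40000000. */
static bool in_flash_compare(uint32_t addr)
{
    return addr >= UINT32_C(0x40000000);
}

/* Compact test: bit 30 of the address is set. */
static bool in_flash_bit30(uint32_t addr)
{
    return (addr & (UINT32_C(1) << 30)) != 0;
}

/* True when both tests give the same answer for `addr`. */
bool in_flash_tests_agree(uint32_t addr)
{
    return in_flash_compare(addr) == in_flash_bit30(addr);
}

/* Spot checks on illustrative addresses (assumed, not a memory map). */
void check_in_flash_tests(void)
{
    assert(in_flash_tests_agree(UINT32_C(0x3FFE8000))); /* below the limit */
    assert(in_flash_tests_agree(UINT32_C(0x40000000))); /* exactly at it   */
    assert(in_flash_tests_agree(UINT32_C(0x4026FA70))); /* above the limit */
    assert(in_flash_tests_agree(UINT32_C(0x7FFFFFFF))); /* last agreeing   */
    /* From 0x80000000 up to 0xBFFFFFFF bit 30 is clear again,
     * so the bit test says "not flash" while the compare says "flash". */
    assert(!in_flash_tests_agree(UINT32_C(0x80000000)));
}
```

So the smaller variant behaves the same as the comparison as long as pointers at or above 0x80000000 never reach memmove_P, which is the assumption the question already states.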

Port in PSTR macro changes from arduino core

esp8266/Arduino#6565

Symbol stripping

I note that memcpy, memset, etc. are still present in the generated libc.a. Is the intention to remove these or handle symbol stripping separately?

Q: both S and M in .section flags

Recalling a not easily reproducible issue from back in April:
https://gitter.im/esp8266/Arduino?at=606c9a5c0147fb05c5de1d59

I noticed these flags after the section name string in the PSTR(...) macro:

newlib-xtensa/newlib/libc/sys/xtensa/sys/pgmspace.h

Line 44 in 85c33ba

 #define PSTRN(s,n) (__extension__({static const char __pstr__[] __attribute__((__aligned__(n))) __attribute__((section( "\".irom0.pstr." __FILE__ "." __STRINGIZE(__LINE__) "." __STRINGIZE(__COUNTER__) "\", \"aSM\", @progbits, 1 #"))) = (s); &__pstr__[0];})) 

Per https://sourceware.org/binutils/docs/as/Section.html#index-Section-Stack-3

For sections with both M and S, a string which is a suffix of a larger string is considered a duplicate. Thus "def" will be merged with "abcdef"; A reference to the first "def" will be changed to a reference to "abcdef"+3.

If such an offset were applied to a __pstr__ string, what would that look like for the const char[] arrays used across the app? Is that actually the desired behaviour, given that the resulting pointer would (?) not be aligned?
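To make the alignment question concrete, here is a small C sketch that computes the offset a suffix-merged string would get inside the longer string, and whether a pointer at that offset would still be a multiple of a given alignment. It only models the arithmetic from the binutils text quoted above; it does not claim to reproduce what the linker actually decides, and both function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `shorter` is a suffix of `longer`, return the
 * byte offset at which it starts inside `longer`; otherwise return -1. */
long suffix_merge_offset(const char *longer, const char *shorter)
{
    assert(longer != NULL && shorter != NULL);
    size_t ll = strlen(longer);
    size_t ls = strlen(shorter);

    if (ls > ll)
        return -1;
    if (strcmp(longer + (ll - ls), shorter) != 0)
        return -1;
    return (long)(ll - ls);
}

/* True when a string placed at `offset` inside an `align`-aligned
 * string would itself start on an `align`-byte boundary. */
bool offset_keeps_alignment(long offset, size_t align)
{
    return offset >= 0 && align != 0 && ((size_t)offset % align) == 0;
}
```

For the documentation's own example, "def" inside "abcdef" lands at offset 3, which is not a multiple of 4. For "RESET" inside "FACTORY.RESET" the offset is 8, which is.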

Some extra info, sorry for blowing the text size :)

I still have the offending .bin & .elf generated back in April, built with gcc 10.2 (timestamped 201223), but not the rest of the build directory, including the suggested .map file, nor the (possibly broken?) sources that generated it. What I meant looked like this:

4026fa70 <_mqttInitCommands()::__pstr__>:
4026fa70:   514d 5454 492e 464e 004f 0000               MQTT.INFO...

However, dumping the code with current toolchain & code:
(with only change being the namespace { ... } surrounding the function, and looking at objdump it inlined the function into the one it is called from)

40282678 <(anonymous namespace)::_mqttInitCommands()::__pstr__>:
40282678:   514d 5454 492e 464e 004f 0000               MQTT.INFO...

40282684 <(anonymous namespace)::_mqttInitCommands()::__pstr__>:
40282684:   514d 5454 522e 5345 5445 0000               MQTT.RESET..

And, there are some instances where the 'merging' seems to happen on strings like F("FACTORY.RESET") e.g. settingsInitCommands() here uses it, and terminal function contains F("RESET"):

4028187c <(anonymous namespace)::_settingsInitCommands()::__pstr__>:
4028187c:   4146 5443 524f 2e59                         FACTORY.

40281884 <(anonymous namespace)::_terminalInitCommands()::__pstr__>:
40281884:   4552 4553 0054 0000                         RESET...

The asm code for the April build of _mqttInitCommands() is slightly different as well, but I am not sure what to look for there.

Since then, I have not really been able to reproduce this issue (reliably, or even at all, and I do not have any small example that could showcase it), but I wonder if the 'merge' flag somehow messed with the pointers. Or it is something related to gcc / binutils / a different order of .o files when linking / etc., and it was silently resolved by updates.

PSTR within templated code

GCC silently disregards attributes in templated code (esp8266/Arduino#6294 (comment))

Test case:

template <typename T>
void templateTest(T x)
{
	Serial.print("x = ");
	Serial.println(x);
	Serial.println(PSTR("This text ends up in RAM (.rodata) because GCC silently disregards attributes in templated code."));
}

This can be fixed by adding the following to the .irom0.text output section of the linker script:

	/* Inline flash strings PSTR() within templated code */
	*(.rodata._ZZ*__c)
	*(.rodata._ZZ*__c_*)

With this in mind, a safer/more descriptive name than __c would be preferable, such as __pstr__.

memmove_P does not support IRAM

Copied over from esp8266/Arduino#7060 (comment)

memmove_P does not support IRAM

#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <umm_malloc/umm_malloc.h>
#include <umm_malloc/umm_heap_select.h>

//#define memmove_P ets_memmove

char str[] = "Move over";

void setup() {
  WiFi.persistent(false);
  WiFi.mode(WIFI_OFF);
  Serial.begin(115200);
  delay(10);
  Serial.printf_P(PSTR("\r\nBegin ...\r\n"));

  {
    HeapSelectIram ephemeral;
    size_t sz = strlen(str) + 1;
    char *newstr = (char *)malloc(sz + 8);
    memcpy(newstr, str, sz);
    Serial.println(newstr);
    memmove_P(&newstr[1], newstr, sz);
    newstr[0] = ' ';
    Serial.println(newstr);
    free(newstr);
  }
}

void loop() {}

Prints:

Begin ...
Move over
 MMMMMMMMMM

Works when not using IRAM, or when using ets_memmove with a non-32-bit access handler.
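The printed " MMMMMMMMMM" is the pattern a forward, byte-by-byte copy produces when the destination overlaps the source one byte higher. The host-side C sketch below reproduces that pattern next to the standard memmove for comparison. It is not a diagnosis of what happens inside memmove_P on IRAM; forward_copy and shift_right_by_one are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive forward byte copy: unsafe when dest overlaps src from above,
 * because each byte written is read again on the next iteration. */
static void forward_copy(char *dest, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = src[i];
}

/* Mirrors the shift-by-one from the test case on a zeroed host buffer.
 * use_memmove != 0 selects the overlap-safe standard memmove. */
void shift_right_by_one(char *buf, size_t buf_size, const char *text,
                        int use_memmove)
{
    size_t sz = strlen(text) + 1;      /* include the terminating NUL */

    assert(buf_size >= sz + 2);        /* room for the shift plus a NUL */
    memset(buf, 0, buf_size);          /* keeps the result terminated */
    memcpy(buf, text, sz);

    if (use_memmove)
        memmove(&buf[1], buf, sz);
    else
        forward_copy(&buf[1], buf, sz);

    buf[0] = ' ';
}
```

Calling shift_right_by_one with "Move over" and use_memmove set to 0 leaves " MMMMMMMMMM" in the buffer, matching the output above, while use_memmove set to 1 leaves " Move over".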

Your comment about the issue: esp8266/Arduino#7060 (comment)

