Here’s a hypothesis:
The bug is in MULTIPLE-VALUE-CALL: it isn’t popping the stack properly when it receives more than a single value.
If this is true, then it is a total accident that I tripped over this:
I used SEE on a UTF8-encoded file to test my conversion interface. SEE ends up calling PFCOPYBYTES to interpret font changes in case it is a Lisp source file (which my file is not). PFCOPYBYTES takes start and end byte positions, because it is typically called from PF not for the whole file (as for SEE) but to pick out the definition of functions at particular byte positions as provided by a filemap.
PFCOPYBYTES is set up to deal with the fact that in the NS world (and other external format worlds) there is not a one-to-one correspondence between byte positions and character positions. That’s not an issue for most applications, but it is crucial for the PF invocation.
The low-level character readers hang off the open macro \NSIN, which takes an argument that says whether the number of bytes read should also be returned with the code. In the NS case, the second value is set to a local variable in the inline compilation. But in the case where there is a call out to a different encoding, the encoding-function is given a flag that says it should return the number of bytes it read, as a second value. That’s what’s happening in this case. Since the call is repeated many times in a loop, if the stack isn’t cleaned up properly on each invocation, we would eventually get the overflow.
Putting aside this possible bug, this is an unfortunate implementation of PFCOPYBYTES—and according to the edit dates, it is probably something that I did for NS about 25 years ago, screwing up a simpler implementation that Larry had originally done about 10 years before that.
The problem is that it exports the details of a very low-level macroed interface to a very high-level function. The more direct implementation, given that bytes and characters are not 1-to-1, would have been to test the actual file ptr at each character, instead of the over-eager optimization that tries to second guess the counting with simple arithmetic. Or better still, to recognize this as a first-class conceptual issue by adding a flag argument to READC and READCCODE indicating whether the bytes-read should also be returned as a second value.
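To make the "test the actual file ptr at each character" idea concrete, here is a minimal sketch in Python rather than Lisp, purely for illustration. It is not the PFCOPYBYTES code: the names read_utf8_char and copy_bytes are hypothetical, and it assumes a UTF-8 stream. The point is only that the loop asks the stream where it is after each character, instead of adding up per-character byte counts.

```python
import io

def read_utf8_char(stream):
    """Read one UTF-8 character from a binary stream; return None at end of file.
    (Hypothetical helper, standing in for a low-level character reader.)"""
    lead = stream.read(1)
    if not lead:
        return None
    b = lead[0]
    # The lead byte says how many continuation bytes follow.
    if b < 0x80:
        extra = 0
    elif b >> 5 == 0b110:
        extra = 1
    elif b >> 4 == 0b1110:
        extra = 2
    elif b >> 3 == 0b11110:
        extra = 3
    else:
        raise ValueError("invalid UTF-8 lead byte")
    return (lead + stream.read(extra)).decode("utf-8")

def copy_bytes(stream, start, end):
    """Return the characters in byte range [start, end).
    The loop tests the stream's actual byte position at each character,
    so no byte counting is exported to this level."""
    stream.seek(start)
    out = []
    while stream.tell() < end:
        ch = read_utf8_char(stream)
        if ch is None:
            break
        out.append(ch)
    return "".join(out)

# "a" is 1 byte, "é" is 2, "€" is 3, "b" is 1: 4 characters, 7 bytes.
sample = io.BytesIO("aé€b".encode("utf-8"))
middle = copy_bytes(sample, 1, 6)
```

Here the byte range 1 to 6 picks out the two multi-byte characters in the middle, and the caller never needs to know how many bytes each character took.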
On Aug 2, 2020, at 7:11 PM, Ron Kaplan wrote:
Part of the mystery is that the stack overflow doesn’t happen until it has read about 20K characters. So somehow this must be growing over time, even though it doesn’t look like anything should affect that frame from inside the loop in PFCOPYBYTES. The loop calls \NSIN and \OUTCHAR, and in this case \NSIN does an apply* to \JISIN. If you put a break on \JISIN, I wonder if the stack frame is big from the get-go, or whether it grows over repeated calls.
On Aug 2, 2020, at 6:41 PM, Ron Kaplan wrote:
The last one is from the RESETLST inside PFCOPYBYTES? The other ones from the WITH-OPEN-FILEs in the little test routine?
How do you read the length of that frame? Is it the difference between 4c22 and f092?
On Aug 2, 2020, at 5:53 PM, Nick Briggs wrote:
So this is what I see when I use the debug tools to dump the stack -- seems to reflect what URAID says, but, near the end of the stack,
4c22: SI:UNWIND-PROTECT alink: 0x4c10, clink: 0x0000, next: 0xf092
that SI:UNWIND-PROTECT appears to have allocated a huge amount of stack space, for reasons I don't understand.