Comments (33)
Thanks, this stack trace was very helpful.
I'm wondering if it would be possible for you to determine which input created this? I have some guesses as to what might cause this, but without being able to replicate it will be difficult to fix.
from fastnumbers.
I'm working on that now. It's a file that's given me trouble before, probably something to do with control characters or something like that. I'll try and narrow it down to a specific line of code.
I have a GUI that reads a delimited file into pandas, then runs various calculations on each column like min/max, frequency count, etc. I use natsort after I've determined that a column contains both numbers and characters to sort it naturally.
from fastnumbers.
I've been working on this for a while now and it's very frustrating. I've gotten the file that causes the crash down to 122Kb, but I can't get it any smaller. Here's a link:
I've never used pastebin before, so hopefully that works, I don't see a way to add an attachment here.
I also can't reproduce the crash on a smaller program than my full one, which is 500 lines of python, wx, etc. Hopefully looking at the file that causes the crash will help you. Otherwise, I'm stuck.
Thanks again.
P. S. The problem happens 100% of the time on a huge file (63MB), but happens intermittently on the pastebin file.
from fastnumbers.
Thanks, I'll take a look at this tonight. For reference, what system are you on?
from fastnumbers.
Linux lepore-desktop 3.19.0-16-generic #16-Ubuntu SMP Thu Apr 30 16:13:00 UTC 2015 i686 athlon i686 GNU/Linux
Running KDE.
I spun up a Windows 7 virtual machine and did not get the error.
from fastnumbers.
I realize that this isn't the question that you asked, but I am finding that the sorting is not working properly because there is an NaN in your data. This confuses Python's sort because 5 < NaN is False and 5 > NaN is False. This created a jump discontinuity in your sorted data (see below). I will update natsort to better handle this case after I solve this issue, but I don't think this is related to the seg fault (which I haven't been able to replicate yet, but I'm on a Mac, so it may be machine dependent). I will dig more.
SERIAL_NUMBER NAME
1927 6 APLIN -OR-& -
3253 33053 06 BALDASANO BENJAMIN M
1412 2919302 ANDERSON ARVINE L
1323 6135134 AMORE ERNEST S
898 6145219 ALLARD LEO L
3873 6149528 BARNEY WILLIAM A
740 6149858 ALDRICH HENRY W
4813 6248805 BECK JOHN C
4889 6865158 BECKLUND EDWARD
4680 6909807 BEARDSLEY HAROLD F
4683 6953423 BEARLEY HARRY L
4686 11110897 BEARSE SELWYN F
4715 13046508 BEATTIE JOHN H JR
4689 15044122 BEASLEY CHARLES P
4708 16006589 BEASTER RICHARD H
4702 17068735 BEASLEY JOSEPH C
4681 20310601 BEARE GEORGE D
4703 20407637 BEASLEY MARVIN J
4682 31309985 BEARISTO WILLIAM E
4711 33393550 BEATTIE CHARLES D
4714 33404711 BEATTIE HERMAN H
4696 33646001 BEASLEY JAMES B
4695 34174220 BEASLEY HENRY L
4698 34426838 BEASLEY JAMES T
4705 34517074 BEASLEY PEARMAN
4699 34538587 BEASLEY JAMES W
4697 34801955 BEASLEY JAMES L
4701 35790825 BEASLEY JOSEPH B
4709 36531700 BEATON ROBBIE R
4693 36737603 BEASLEY FRANK JR
4687 37197229 BEARY MARTIN C
4691 37563286 BEASLEY DONALD L
4700 37611309 BEASLEY JESSE E
4688 37627746 BEASLEY CHARLES A
4690 38107155 BEASLEY CHESTER J
4685 38466544 BEARPAW TOM
4706 38564225 BEASLEY STEWART R SR
4718 39203811 BEATTIE ROBERT J
4717 39342618 BEATTIE KENNETH M
4710 42054165 BEATTIE C W
4712 NaN BEATTIE EDWARD # <=== COUNT RESETS STARTING HERE
4757 6262518 BEAULIEU LEO E
2105 6264303 ARMON THEODORE
4492 6269549 BAUMGARTEN OTIS K
674 6271743 ALBIN HENRY D
3766 6277281 BARNES CLARENCE B
4139 6285548 BARTLEY JESSIE B
250 6294035 ADAMS CLAUDE E
3087 6296739 BAKER CLARENCE F
3685 6379336 BARKER ERNEST P
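The NaN behavior described above can be reproduced in a few lines. This is only a minimal sketch: the `nan_last` key is a hypothetical helper written for illustration (it is not how natsort handles NaN), and the serial numbers are a small hand-picked subset of the listing.

```python
import math

nan = float("nan")

# Every ordering comparison against NaN is False, which is what confuses sort.
print(5 < nan, 5 > nan)

def nan_last(value):
    """Hypothetical sort key: order real numbers normally, place NaN after them."""
    if math.isnan(value):
        return (1, 0.0)
    return (0, value)

serials = [6149528.0, nan, 6262518.0, 2919302.0]
ordered = sorted(serials, key=nan_last)
print(ordered)
```

Because the key never hands a raw NaN to the comparison, the numeric values end up in one continuous run instead of restarting after the NaN.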
from fastnumbers.
Can you try testing with the development version that I have just pushed? My suspicion is that there was some problem when converting one of your inputs to a char*, and I have switched to the Python C function that does a bit more error checking when doing the string conversion.
from fastnumbers.
Off on vacation for a week, will test next Thursday. Thanks!
On 05/22/2015 12:22 AM, Seth Morton wrote:
> Can you try testing with the development version that I have just
> pushed? My suspicion is that there was some problem when converting
> one of your inputs to a char*, and I have switched to the Python C
> function that does a bit more error checking when doing the string
> conversion.
from fastnumbers.
No luck with the development version, here's the error:
home/lepore/.local/lib/python2.7/site-packages/pkg_resources/__init__.py:1250: UserWarning: /home/lepore/.python-eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).
warnings.warn(msg, UserWarning)
Skipping line 15027: expected 26 fields, saw 27
Skipping line 18505: expected 26 fields, saw 27
Skipping line 21991: expected 26 fields, saw 31
Skipping line 44022: expected 26 fields, saw 31
[New Thread 0xac0ffb40 (LWP 5978)]
[New Thread 0xb5351b40 (LWP 5958)]
[New Thread 0xb3b50b40 (LWP 5957)]
Program received signal SIGSEGV, Segmentation fault.
fast_atoi (p=0xac940034 <error: Cannot access memory at address 0xac940034>, error=0xbfffcc26, overflow=0xbfffcc27) at src/fast_atoi.c:24
24 while (white_space(*p)) { p += 1; }
from fastnumbers.
Can you try using the following function as a key to natsorted? This will print out every input individually to natsorted before fast_int is run on it. The last one printed before the segfault should be the input causing the problem.
import sys
from natsort import natsorted

def printer(x):
    # Print each input and flush so the output survives a crash
    print(x)
    sys.stdout.flush()
    return x

b = natsorted(your_data, key=printer)
from fastnumbers.
Since you have the source code, you can also add the following before line 24 in fast_atoi.c, preferably in conjunction with the printer function suggested.
fprintf(stdout, "fast_atoi string: %d\n", p);
while (white_space(*p)) { p += 1; }
This should print out the string right before the problem occurs.
from fastnumbers.
I think you're making progress. The file that crashes fastnumbers that I posted above no longer crashes it. However, the larger file that the excerpt came from still crashes it. Here are the last values before the segfault:
O&795577
fast_atoi string: -1423753252
fast_atoi string: -1423641420
10305793
fast_atoi string: -1423641548
6132688
fast_atoi string: -1423642156
O&401818
fast_atoi string: -1423753228
fast_atoi string: -1423641292
10300351
fast_atoi string: -1423641420
O&366604
fast_atoi string: -1424162764
Segmentation fault (core dumped)
Thanks for working on this!
from fastnumbers.
Great, this helps narrow down the possible problem. I wish that I had given you the right code to add, though. In the C function, can you change it to the following?
fprintf(stdout, "fast_atoi string: ");
fprintf(stdout, "%s\n", p);
while (white_space(*p)) { p += 1; }
I had accidentally had you use the %d format, which will print out an integer, but really I need %s, which prints the string in the character array. I also think it will be helpful to know if it is the printing that causes the crash now, or if it is still searching for a space, so I separated the first part of the string from the second.
In the python printer function, can you change print(x) to print(x, repr(x))? This should show any control characters in the string that we aren't thinking about.
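To illustrate why repr helps here, below is a minimal sketch. The `hidden` string with an embedded NUL character is purely hypothetical; nothing in this thread shows that the real data contained one.

```python
clean = u"O&699536"
hidden = u"O&\x00699536"  # hypothetical: same text with an embedded NUL character

# print() alone can make both look alike in a terminal; repr() spells out the escape.
print(clean, repr(clean))
print(hidden, repr(hidden))
```

If the two repr outputs differ while the plain text looks the same, the input carries characters that are not visible in normal printing.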
Last, if you do this multiple times, does it always crash on the same input, or does it change from run to run?
Sorry to ask you to modify the tests again. I think we are making headway.
from fastnumbers.
Happy to help! Here's the latest output. It always crashes on this file, but on the smaller version it only crashed most of the time.
(u'16062279', "u'16062279'")
fast_atoi string: 16062279
(u'31129792', "u'31129792'")
fast_atoi string: 31129792
(u'39093001', "u'39093001'")
fast_atoi string: 39093001
(u'37693447', "u'37693447'")
fast_atoi string: 37693447
(u'O&699536', "u'O&699536'")
fast_atoi string: O&
Segmentation fault (core dumped)
Do you need the GDB output?
from fastnumbers.
I imagine the GDB output won't tell anything we haven't seen before.
One thing I notice right away from the two runs is that it is not failing on the same input, but they both begin with O&. I wonder what would happen if you didn't let those strings go to fast_int...
Could you let me know if you get a crash doing either of the following?
First, try modifying the printer function to look like this:
def printer(x):
    print(x)
    sys.stdout.flush()
    return '' if x.startswith('O&') else x
This will remove any string beginning with the "bad" characters from the pool. If you don't get any crashes with that, try the following:
def printer(x):
    print(x)
    sys.stdout.flush()
    return x.replace('O&')
To see if we can stop the problem just by removing the leading bad characters.
from fastnumbers.
Unfortunately removing the bad characters isn't acceptable for my purposes (ditto for the nans). The data that I'm reading and sorting must remain exactly as it's written in the source file. Otherwise the output will not match the inputs. It's a government thing!
Trying either new printer function I get:
Traceback (most recent call last):
File "daeric2.py", line 375, in readCSV
result_list = natsorted(result_list, key=self.printer)# if the results are mixed text and numbers, use natural sort
File "/usr/local/lib/python2.7/dist-packages/natsort-4.0.0-py2.7.egg/natsort/natsort.py", line 234, in natsorted
return sorted(seq, reverse=reverse, key=natsort_keygen(key, alg=alg))
File "/usr/local/lib/python2.7/dist-packages/natsort-4.0.0-py2.7.egg/natsort/utils.py", line 294, in _natsort_key
val = key(val)
TypeError: printer() takes exactly 1 argument (2 given)
from fastnumbers.
If you made printer part of a class, you will need to add self as part of the function definition, as in def printer(self, x):, or you should make it a @staticmethod to not need self. I think this is the origin of the new error you are seeing.
I wasn't suggesting removing the bad stuff for real, just in our debugging.
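The two options for a key function that lives in a class can be sketched as follows. The `Example` class is a hypothetical stand-in for the GUI class in the traceback, and the built-in sorted is used in place of natsorted so the snippet runs without natsort installed.

```python
import sys

class Example(object):
    """Hypothetical stand-in for the class that owns readCSV."""

    def printer(self, x):
        # Regular method: calling through self.printer supplies self automatically.
        print(x)
        sys.stdout.flush()
        return x

    @staticmethod
    def printer_static(x):
        # Static method: no self parameter is needed or passed.
        print(x)
        sys.stdout.flush()
        return x

ex = Example()
data = ["b", "a", "c"]
by_method = sorted(data, key=ex.printer)
by_static = sorted(data, key=Example.printer_static)
```

Either form accepts exactly one argument from the sort machinery, which avoids the "takes exactly 1 argument (2 given)" error.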
from fastnumbers.
Ahh! I see. Would that also apply to the replace code? I think so (was getting TypeError: replace() takes at least 2 arguments (1 given)). I added it there as well and got:
12138003
Traceback (most recent call last):
File "daeric2.py", line 375, in readCSV
result_list = natsorted(result_list, key=self.printer)# if the results are mixed text and numbers, use natural sort
File "/usr/local/lib/python2.7/dist-packages/natsort-4.0.0-py2.7.egg/natsort/natsort.py", line 234, in natsorted
return sorted(seq, reverse=reverse, key=natsort_keygen(key, alg=alg))
File "/usr/local/lib/python2.7/dist-packages/natsort-4.0.0-py2.7.egg/natsort/utils.py", line 294, in _natsort_key
val = key(val)
File "daeric2.py", line 539, in printer
return x.replace(self, 'O&')
TypeError: coercing to Unicode: need string or buffer, Example found
from fastnumbers.
Sorry, it should be x.replace('O&', ''), since we need to replace the string with something.
from fastnumbers.
I should have seen that, sorry. I fixed that line and the file processed successfully! So it's something about the O& that's causing the problem?
from fastnumbers.
That's what it looks like.
As a temporary workaround, can you try the following?
a = natsorted(your_data, key=lambda x: x.replace("&", "$"))
This will replace all ampersands with dollar signs. These are near each other on the ASCII table ('$' is 36 and '&' is 38), so it shouldn't mess up the sort order, but it might prevent this seg fault. This might get you by while I figure out the seg fault.
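Since the data must stay exactly as written in the source file, it may help to see that a key only affects ordering and never the returned values. This is a minimal sketch that uses the built-in sorted as a stand-in for natsorted (the traceback earlier in this thread shows natsorted passing its key through to sorted); the sample strings are taken from the debug output above.

```python
data = ["O&795577", "10305793", "O&401818"]

# The key is used only for comparisons; the original strings are what come back.
result = sorted(data, key=lambda x: x.replace("&", "$"))
print(result)

# '$' and '&' sit two code points apart in ASCII.
print(ord("$"), ord("&"))
```

The ampersands are still present in the output, so the workaround should not change what gets written back out.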
from fastnumbers.
Hmm....
fast_atoi string: O$
fast_atoi string: 795367
fast_atoi string: O$
fast_atoi string: 718174
fast_atoi string: 37490261
fast_atoi string: 37529450
fast_atoi string: 35570246
fast_atoi string: O
fast_atoi string: 1062485
fast_atoi string: 35241067
fast_atoi string: O$
Segmentation fault (core dumped)
from fastnumbers.
Ran the code in gdb again and got a different segfault:
fast_atoi string: 11082136
fast_atoi string: 37593519
fast_atoi string: 12005032
fast_atoi string: T!
[New Thread 0xa97fab40 (LWP 26397)]
[New Thread 0xb4351b40 (LWP 26386)]
[New Thread 0xb3b50b40 (LWP 26385)]
Program received signal SIGSEGV, Segmentation fault.
0xb7e102f4 in _IO_vfprintf_internal (s=0xb7f85e80 <_IO_2_1_stdout_>, format=, ap=0xbfffcbbc "4\300۬\360\064") at vfprintf.c:2039
2039 vfprintf.c: No such file or directory.
from fastnumbers.
Ok, so it is related to having to split the string before sending to fast_int. I will try to get a VM to replicate this. Thanks for your help.
In the meantime, you can uninstall fastnumbers to avoid the segfault.
from fastnumbers.
No worries, I still have several weeks before initial deployment. Thanks for working so hard on this.
from fastnumbers.
I installed Kubuntu in a virtualbox (and wasn't that fun) but was unable to reproduce the problem, using the same code and data file as on my machine. The versions of Kubuntu were both 15.04.
from fastnumbers.
Huh... that doesn't give me much hope that I will be able to reproduce.
It's not clear to me if the problem is originating from my C code, or if it is originating from something else. Internally, natsort is using re.findall to split your input into numbers and non-numbers, and sending this split list to fast_int from fastnumbers to do the conversion. So, it's not clear to me if the reason for the failure is because I am not handling this input correctly, or if re.findall is giving poorly formed strings to parse. It is also entirely possible that there is some third problem causing this. Without being able to reproduce I am not sure how I will solve the problem.
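The split-then-convert step described above can be sketched roughly as follows. The regular expression and the `split_and_convert` helper are assumptions made for illustration; the actual pattern inside natsort is not shown in this thread, and the built-in int stands in for fast_int.

```python
import re

def split_and_convert(text):
    """Hypothetical sketch: split into digit and non-digit runs, convert the digits."""
    # Each match is a (digits, other) tuple where exactly one element is non-empty.
    pieces = re.findall(r"(\d+)|(\D+)", text)
    return [int(digits) if digits else other for digits, other in pieces]

print(split_and_convert(u"O&699536"))
```

For an input like O&699536 this yields the non-numeric prefix and the number as separate items, which matches the debug output where "O&" was handed to fast_atoi on its own.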
from fastnumbers.
Understood. I'll try to re-install everything and see if I can get my system like the virtualbox I set up. I'll let you know what happens. Thanks for working on this.
from fastnumbers.
You didn't happen to be using any special arguments to natsort like LOCALE, did you?
from fastnumbers.
Nothing but:
result_list = natsorted(result_list)
I'll fiddle around some more with this when I get a chance.
from fastnumbers.
I re-created the crash on a Kubuntu 15.04 virtualbox image. I've saved the box as a .ova file, which you should be able to download and open in virtualbox. Please email me at [email protected] and I will give you the download address of the .ova file and some brief instructions on reproducing the error. Thanks!
from fastnumbers.
I would like to award @glepore70 the "Best Bug Reporter" imaginary internet award for taking the time to create a virtual machine image of the system on which the segfault occurs and sending it to me to debug. I don't imagine many users would go through the hassle to fix the problem... they would just uninstall and move on. Thanks so much!
from fastnumbers.
The segfault was related to making a bad assumption when dealing with character arrays.
The Python C-API to get a char* from a string/bytes object is varied, but the simplest version looks a bit like the following:
if (PyBytes_Check(input)) {
    str = PyBytes_AS_STRING(input);
}
Note this is just a straight pointer assignment, no strcpy call is done. As long as the input object is not deleted and str is being used as read-only, this is a fairly safe strategy. The problem arises when the input is not string/bytes, but unicode:
if (PyUnicode_Check(input)) {
    temp_bytes = PyUnicode_AsEncodedString(input, "ascii", "strict");
    if (temp_bytes != NULL) {
        str = PyBytes_AS_STRING(temp_bytes);
        Py_DECREF(temp_bytes);  // <-- Uh-Oh!
    }
}
To extract the char* the unicode object must be first converted to bytes. This bytes object is only temporary, which means that as soon as the object is garbage collected* (i.e. deallocated) the str pointer will not point to anything meaningful. When one tries to access the dangling str, a segfault happens.
The interesting thing is that Python only periodically performs garbage collection, so most of the time the temporary bytes object remains in memory for the duration of the fastnumbers function call even though its reference count is zero. In fact, a segfault would only occur if Python initiates garbage collection on a Py_DECREF call inside the fastnumbers code. Apparently, this is a rare event since I was unable to reproduce the segfault on my machine, and none of my Travis-CI runs had a segfault either.
The solution to this problem is to force fastnumbers to take ownership of the contents of str (i.e. make a strcpy call), and not rely on Python keeping it alive for the duration of the function call:
if (PyBytes_Check(input)) {
    PyBytes_AsStringAndSize(input, &s, &s_len);
    str = malloc((size_t)s_len + 1);
    strcpy(str, s);
} else if (PyUnicode_Check(input)) {
    temp_bytes = PyUnicode_AsEncodedString(input, "ascii", "strict");
    if (temp_bytes != NULL) {
        PyBytes_AsStringAndSize(temp_bytes, &s, &s_len);
        str = malloc((size_t)s_len + 1);
        strcpy(str, s);         // <-- Now I own the contents of str
        Py_DECREF(temp_bytes);  // <-- Now not a problem
    }
}
The only caveat now is that str must be freed at some point, so I had to do a bit of rework of my other code to ensure a free(str) call was made before returning to Python.
I will merge this with master tonight and make an official release to PyPI.
*Calling Py_DECREF reduces the reference count of the object, and when the garbage collector detects that an object has a 0 reference count it will be destroyed (i.e. deallocated).
from fastnumbers.