eyurtsev / fcsparser

A python parser for reading fcs files supporting FCS 2.0, 3.0, 3.1

License: MIT License

Python 100.00%

fcsparser's Issues

$PnE based rescaling is not heeded

According to the FCS 3.1 specification (and probably older ones, too), logarithmic data channels can be saved in linearized form with $PnE specifying the logarithmic range and minimal value.

Currently the library does not transform these values into logarithmic form for later transformations.
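
For reference, the FCS 3.1 specification describes $PnE/f1,f2/ as a number of log decades (f1) and an offset (f2), and gives the conversion from a stored channel value xc to a scale value as f2 * 10^(f1 * xc / R), where R is the $PnR value of that parameter. The following is only a minimal sketch of that formula, not fcsparser's code: the function name is hypothetical, and treating f2 = 0 as 1.0 is an assumed convention for files that write $PnE/4,0/.

```python
def channel_to_scale(xc, f1, f2, r):
    """Convert a stored channel value to a scale value using $PnE/f1,f2/.

    xc: stored channel value
    f1: number of log decades (0 means the parameter is linear)
    f2: offset, i.e. the scale value that channel 0 maps to
    r:  resolution of the parameter ($PnR)
    """
    if f1 == 0:
        # $PnE/0,0/ marks a linear parameter: nothing to rescale.
        return float(xc)
    if f2 == 0:
        # Assumption: treat an offset of 0 as 1.0 (f2 = 0 is not valid with f1 != 0).
        f2 = 1.0
    return f2 * 10 ** (f1 * xc / r)


# Example with $PnE/4,0.1/ and $PnR/1024/, as in a 4-decade log channel.
low = channel_to_scale(0, 4.0, 0.1, 1024)
mid = channel_to_scale(512, 4.0, 0.1, 1024)
high = channel_to_scale(1024, 4.0, 0.1, 1024)
```

With these inputs the channel range 0 to 1024 spans four decades starting at 0.1.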

non-parsing FCS file (and a fix)

Hi, flow cytometry files from Apogee don't parse with fcsparser.

They (inexplicably) have a bunch of spaces before $P1N in the header; this can be remedied without too much trouble, though.

An example of an FCS file with the issue:
https://bitbucket.org/mwfcomp/incubation_experiment/raw/a6ef3685428f6e373442ab0e04b553b36bbcdd48/sample_data/2015-07-08/THP-1%20-%20235%20nm%20Capsule%2016%20hr.fcs

And a (slightly hacky) script to fix this per file:
https://bitbucket.org/mwfcomp/incubation_experiment/raw/a6ef3685428f6e373442ab0e04b553b36bbcdd48/fcs_file_fixer.py

This could be dealt with more elegantly in the parser. I would be happy to put together a solution for it, if that is acceptable and this project is still being maintained.

Handling for invalid UTF8 characters in headers

I'm working with FCS2 files from a CellQuest Pro 6 cytometer, and I've found that the machine puts the byte \xaa, which is not valid UTF-8, in the header next to the machine name. I get around this by removing the offending byte before reading the file with fcsparser, but would you be open to changing line 166 in api.py from

 raw_text = raw_text.decode('utf-8')

to

 raw_text = raw_text.decode('utf-8', errors='ignore')

Let me know if you want me to submit a pull request for the change.
-Nathanael
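
The effect of the proposed change can be seen on a small byte string. This is only an illustrative sketch: the header bytes below are made up for the example and are not taken from a real CellQuest Pro file.

```python
# Hypothetical header fragment with a stray 0xAA byte after the machine name.
raw_text = b"$CYT/FACSCalibur\xaa/$P1N/FSC-H/"

# Strict decoding (the default) fails, because 0xAA on its own is not valid UTF-8.
try:
    raw_text.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors='ignore' silently drops the undecodable byte and keeps the rest.
decoded = raw_text.decode("utf-8", errors="ignore")
```

A possible alternative is errors='replace', which substitutes U+FFFD for the bad byte instead of dropping it, so the damage stays visible in the parsed metadata.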

Unusual multi-FCS files

Hi there,

I've come across FCS files (from the Luminex Muse) which implement multi-FCS by simply concatenating single FCS files together. This was my solution to split them:

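import glob

files = glob.glob("ADM_*.VIA.FCS")
for f in files:
    with open(f, "rb") as handle:
        data = handle.read()
    # Some FCS files are just literal concatenations of single FCS files; this splits them.
    split_data = data.split(b"FCS3.0")
    # split_data[0] is whatever precedes the first "FCS3.0" marker, so start at 1.
    for s in range(1, len(split_data)):
        # Put back the marker that split() removed and write each piece to its own file.
        with open(f + "_" + str(s) + ".FCS", "wb") as out:
            out.write(b"FCS3.0" + split_data[s])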

Once these multi-FCS files are split, fcsparser works perfectly, as far as I can tell. But it might be nice for the library to be able to detect these files by default! See attached for an example FCS:
ADM_09SEP2020_181310.VIA.FCS.zip

Parameter $PnR not being used?

Hi, I'm new to FCS, so this might be some misunderstanding from my end. But the 3.1 standard indicates that, for list mode and integer data type, the $PnR parameter is the range (i.e., max value) a parameter can have. From page 24:

$PnR/n1/ $P2R/1024/ $P2R/262144/ [REQUIRED] For $DATATYPE/I/ this keyword specifies the maximum range, n1, of parameter n. For $MODE/L/ (list mode data), this corresponds to the ADC range, e.g., 1024. In that case, the data values can range from 0 to 1023. ... For $DATATYPE/I/, the value of $PnR also indirectly specifies the bit mask that should be used when reading values.

When running fcsparser on a 3.1 FCS file, I get the following:

                    $PnE     $PnN  $PnB       $PnS  $PnR $PnV
Channel Number                                               
1                 [0, 0]       FS    16         FS  1024  393
2                 [0, 0]       SS    16         SS  1024  314
3               [4, 0.1]  FL1 LOG    16  CD41 FITC  1024  683
4               [4, 0.1]  FL2 LOG    16    CD7 RD1  1024  617
5               [4, 0.1]  FL3 LOG    16   CD45 ECD  1024  636
6               [4, 0.1]  FL4 LOG    16   CD33 PC5  1024  826

            FS       SS  CD41 FITC  CD7 RD1  CD45 ECD  CD33 PC5
0      34015.0  36163.0    38184.0  40097.0   42218.0   44256.0
1       1647.0   3346.0     5390.0   7342.0    9484.0   11757.0
2      34416.0  36018.0    38337.0  40155.0   42129.0   44487.0
...

So, the question is, why are all these values greater than 1024? Aren't the masks being applied to the integers? Thanks!
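
To make the question concrete, here is a minimal sketch of what applying the bit mask implied by $PnR would look like for integer data. The helper names are hypothetical and this is not fcsparser's code; it also does not establish why the values in this particular file exceed 1024, only what masking would do to them.

```python
def pnr_bit_mask(pnr):
    """Bit mask implied by a $PnR range (pnr >= 1), e.g. 1024 -> 0b1111111111 (1023).

    For ranges that are not a power of two, this rounds up to the next
    power of two before subtracting one.
    """
    return (1 << (pnr - 1).bit_length()) - 1


def apply_pnr_mask(values, pnr):
    """Keep only the low bits of each raw integer value that $PnR allows."""
    mask = pnr_bit_mask(pnr)
    return [v & mask for v in values]


# Raw values from the FS column above, stored in 16 bits with $PnR/1024/.
raw = [34015, 1647, 34416]
masked = apply_pnr_mask(raw, 1024)
```

After masking, every value falls in the range 0 to 1023 that the quoted passage describes.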

ValueError: total size of new array must be unchanged

I get the error above when parsing a .fcs file. I have parsed a dozen files before and never got it. It comes from this line in the function read_data: data = data.reshape((num_events, num_pars)). Any idea where this might be coming from? I can send you the .fcs file if needed.

Save to fcs?

Hello,

I've been using your module and it is great. I was wondering if you plan to support saving back to fcs format, or if you have any recommendations for how to do this.

Loading data from CyFlow Cube 8

Hi,

I've been trying to use this module with some data generated using the CyFlow Cube 8. When I attempt to read the data I get an error as follows:
ValueError: item #0 of names is of type bytes and not string

This traces to line 375 in api.py:
data.dtype.names = tuple([name.encode('ascii', errors='replace') for name in names])

If I remove the encoding, everything seems to work as expected. Is this a real issue or am I loading my files incorrectly?
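
The error message is consistent with how Python 3 handles strings: str.encode() returns bytes, while the field names are expected to be str. A minimal sketch of the difference, using made-up channel names, and of one possible way to keep the ASCII clean-up while still ending up with str:

```python
# Hypothetical channel names; the last one contains a non-ASCII character.
names = ["FSC", "SSC", "FL1-\u00c5"]

# What the quoted line produces on Python 3: a tuple of bytes objects.
as_bytes = tuple(name.encode("ascii", errors="replace") for name in names)

# Decoding straight back gives str again, with non-ASCII characters replaced by '?'.
as_str = tuple(
    name.encode("ascii", errors="replace").decode("ascii") for name in names
)
```

Removing the encoding altogether, as described above, also yields str names; the round trip only matters if replacing non-ASCII characters is still wanted.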
