
Comments (17)

aaron-fishman-achillestx commented on June 12, 2024 2

Hello @lukfor / @LukeGoodsell

Using Groovy, we came up with an alternative strategy for asserting maps, JSON, tuples, etc. that doesn't require sorting or traversal. Either:

  • Assert the contents of the channel in an order-agnostic manner: assertInAnyOrder(channel.out.outputCh, expected)
  • Assert specific examples exist in the channel: assert channel.out.outputCh.contains(expected)

To deal with files, we found you can either filter them out or parse them using list.collect { }. This only adds one additional line for the user, and is rather flexible.

See the PR for examples. Let us know what you think!

from nf-test.

LukeGoodsell commented on June 12, 2024 1

Hi Lucas!

A recursive per-item comparison should do the trick, along with conversion to SortedMaps. I'll code this up when I get a chance some time in the next couple of weeks.
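A rough sketch of what that recursive conversion could look like is below. It is only an illustration, not code from nf-test: the class `ChannelItemSorter` and its methods are hypothetical names, and ordering items by the string form of their canonical representation is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of nf-test: normalises nested channel items
// so that maps with the same entries compare and sort identically.
final class ChannelItemSorter {

    // Recursively copy maps into TreeMaps (keys ordered by their string form)
    // and lists into new lists whose elements are canonicalised in turn.
    static Object canonicalize(Object value) {
        if (value instanceof Map) {
            Map<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                sorted.put(String.valueOf(e.getKey()), canonicalize(e.getValue()));
            }
            return sorted;
        }
        if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) value) {
                copy.add(canonicalize(item));
            }
            return copy;
        }
        return value; // simple types are kept as they are
    }

    // Assumption: items are ordered by the string form of their canonical
    // representation, which is deterministic but not a "natural" ordering.
    static List<Object> sortItems(List<?> items) {
        List<Object> result = new ArrayList<>();
        for (Object item : items) {
            result.add(canonicalize(item));
        }
        result.sort((a, b) -> String.valueOf(a).compareTo(String.valueOf(b)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("patientID", "patientB");
        first.put("lane", 1);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("lane", 1);
        second.put("patientID", "patientA");
        System.out.println(sortItems(List.of(first, second)));
    }
}
```

Because every map is rebuilt as a sorted map before comparison, the insertion order of keys no longer influences the result.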


lukfor commented on June 12, 2024 1

Hi, yes, I agree with you. This is a core feature and should be integrated directly into the nf-test codebase.


lukfor commented on June 12, 2024 1

Really cool idea! The examples are looking great and I think this is the perfect alternative to sorted channels. You found a good solution without breaking existing testcases that rely on sorted channels 👍

I will review the PR tomorrow 👍 Thanks guys!


lukfor commented on June 12, 2024

Hi Luke!

I see, it absolutely makes sense to support maps.

We sort the items of a channel to get a deterministic order that we can use when we write asserts. For paths we sort only by filename to get a logical order independent from the process subfolder in the work directory. What do you think would be the best comparison method/strategy for maps? Sorting nested maps could be challenging, therefore we started with simple types and tuples/lists.
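To illustrate the filename-only ordering for paths, here is a minimal sketch. It is not the actual nf-test implementation; `FilenameSort` and the example work-directory paths are made up for the illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: orders paths by file name only, so the hash-named
// process subfolder in the work directory does not influence the order.
final class FilenameSort {

    static List<Path> sortByFilename(List<Path> paths) {
        List<Path> sorted = new ArrayList<>(paths);
        // Assumes every path has a file name (i.e. none is a bare root).
        sorted.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Path> paths = List.of(
                Paths.get("work/01/bb34/sample2.txt"),
                Paths.get("work/9f/aa12/sample1.txt"));
        // Sorting by the full path would put sample2.txt first;
        // sorting by file name puts sample1.txt first.
        System.out.println(sortByFilename(paths));
    }
}
```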

Thanks for all you feedback and help!


aaron-fishman-achillestx commented on June 12, 2024

This is great! Need to make sure the recursion depth doesn't get out of hand as stack overflow errors can be awkward to recover from. An iterative approach is also possible here, and might be more efficient if the savings on function calls are significant.
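For what it's worth, an iterative traversal could look roughly like the sketch below. It uses an explicit stack rather than recursion and flattens a nested item into sorted "path=value" strings, which can then be compared regardless of key order. The class `IterativeFlatten` and the path notation are assumptions made for this illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical helper: walks nested maps/lists with an explicit stack, so
// deep nesting cannot overflow the call stack.
final class IterativeFlatten {

    // One pending node: the path that leads to it and the value found there.
    private static final class Node {
        final String path;
        final Object value;

        Node(String path, Object value) {
            this.path = path;
            this.value = value;
        }
    }

    // Returns sorted "path=value" strings for every leaf. Empty maps and
    // lists are treated as leaves so they are not silently dropped.
    static List<String> flatten(Object root) {
        List<String> leaves = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(new Node("", root));
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (node.value instanceof Map && !((Map<?, ?>) node.value).isEmpty()) {
                for (Map.Entry<?, ?> e : ((Map<?, ?>) node.value).entrySet()) {
                    stack.push(new Node(node.path + "/" + e.getKey(), e.getValue()));
                }
            } else if (node.value instanceof List && !((List<?>) node.value).isEmpty()) {
                List<?> list = (List<?>) node.value;
                for (int i = 0; i < list.size(); i++) {
                    stack.push(new Node(node.path + "[" + i + "]", list.get(i)));
                }
            } else {
                leaves.add(node.path + "=" + node.value);
            }
        }
        Collections.sort(leaves);
        return leaves;
    }
}
```

Two items are then considered equal when `flatten` returns the same list for both. Keys that themselves contain "/" would make the paths ambiguous, so this is a sketch rather than a robust implementation.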

@LukeGoodsell - do you think this MapDifference tool could be utilised for this purpose?
https://www.baeldung.com/java-compare-hashmaps#map-difference-using-guava


aaron-fishman-achillestx commented on June 12, 2024

Hey @lukfor really awesome to see 0.7.0 released!

Do you envision this feature living in this repo, or as a plugin that resides in a separate repo? My instinct is that this would be useful for the wider community and should live here in nf-test.


aaron-fishman-achillestx commented on June 12, 2024

Hey @lukfor we are working on this today. Have you got time for a quick chat? Trying to work out a solution


lukfor commented on June 12, 2024

Cool! Today I'm in meetings. If it helps and you can outline the idea, I can probably give you feedback later today. 👍


byb121 commented on June 12, 2024

Hi @lukfor,

We found that as soon as there's a Map or Array (a tuple in Nextflow terms) in the output channel of a workflow/process and there's more than one element in the channel, nf-test raises the warning: Warning: Unsupported classes: class org.apache.groovy.json.internal.LazyMap vs. class org.apache.groovy.json.internal.LazyMap.

We can replicate this with the following minimal example:

$ cat example/nested-input.nf
workflow CreatePerFastqChWf_simpleReturn {
    take:
        inputCh

    main:
        inputCh
            | map { metadata, fastqFwd ->
                return [
                    metadata, file(fastqFwd)
                ]
            }
            | set { outputCh }
    
    emit:
        outputCh
}

$ cat example/test_map_issue_3.nf.test

nextflow_workflow {

    name "Test workflow CreatePerFastqChWf_flatMap"
    script "example/nested-input.nf"
    workflow "CreatePerFastqChWf_simpleReturn"

    test("CreatePerFastqChWf test with all positive input") {

        when {
            workflow {
                """
                input[0] = Channel.from([
                  [
                    ["patientID": "patientA"],
                    'test_file_1.txt'
                  ],
                  [
                    ["patientID": "patientA"],
                    'test_file_2.txt'
                  ]
                ])
                """
            }
        }

        then {
            assert workflow.success
        }

    }

}

$ nf-test test example/test_map_issue_3.nf.test

🚀 nf-test 0.7.1
https://code.askimed.com/nf-test
(c) 2021 - 2022 Lukas Forer and Sebastian Schoenherr

Warning: This pipeline has no nf-test config file.

Test workflow CreatePerFastqChWf_flatMap

  Test [f387aa6c] 'CreatePerFastqChWf test with all positive input' Warning: Unsupported classes: class org.apache.groovy.json.internal.LazyMap vs. class org.apache.groovy.json.internal.LazyMap
PASSED (4.047s)


SUCCESS: Executed 1 tests in 4.055s

This small example shows a few issues:

  • nf-test cannot sort Map objects in output channels, as @LukeGoodsell already mentioned; and
  • Even when it is not asked to check the output channel, nf-test still tries to sort the elements in a channel, hence we see the warning even when only asserting workflow.success.

We (me, @ivopieniak and @aaron-fishman-achillestx) think that we could make the following changes:

  • Remove the sorting of the output channels
  • Add a new assertion keyword (assertUnsorted, for example; open for discussion) for order-agnostic comparison when asserting the contents of output channels. The syntax could look like this: assertUnsorted workflow.out.outputCh == [ ['a': 'b'], ['c': 'd'] ]. The comparison can be done by leveraging the JSONassert package.
  • The JSONassert package also has a strict mode which does not forgive order differences. This could be used for normal assertions of output channels in the current syntax, e.g.: assert workflow.out.outputCh == [ ['a': 'b'], ['c': 'd'] ].

We can start implementing these changes later if they look OK.

Best,


byb121 commented on June 12, 2024

Hi @lukfor, would you have time today for a quick chat?


lukfor commented on June 12, 2024

Sorry, my days are much busier than expected.

Just some quick thoughts: I am currently not sure that removing the sort function is the best solution.

(a) All existing testcases would break
(b) Implementing an assertUnsorted would not be trivial
(c) It would be hard to identify the files in the channel when the items are not sorted and have no deterministic order. For example, say we have an output channel with 3 files (one file per sample). How can we find the file for sample2 to check its content? By sorting the items we know that sample2 is always item 2.

Thus, I lean towards implementing sorting for lazy maps. Maybe they have a compareTo method that compares them in an order-agnostic way?

What's your take here?

Thanks for all your efforts!


aaron-fishman-achillestx commented on June 12, 2024

Awesome, thanks for offering to give this a go, that is much appreciated. You make a good point! It doesn't look like Maps, including LazyMaps, have a compareTo method, only an equals method for use in == operations.

So it seems some generalised compareTo method for Map<Object, Object> would be required.

Regarding (b), we found an existing solution for this using Guava's Multiset. While it does work (see the assertListUnsorted here), we're not sure how usable it is with regard to (c). Our thinking now is that sorted channels would probably give a more streamlined UX.
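To give an idea of the multiset approach without pulling in Guava, here is a minimal sketch that uses a plain HashMap of occurrence counts instead of Guava's Multiset. It is not the assertListUnsorted from the linked code; `UnorderedAssert` and its method names are hypothetical, and it assumes channel items implement equals/hashCode sensibly (as Maps and Lists do).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares two lists as multisets, i.e. the same items
// with the same multiplicities, in any order.
final class UnorderedAssert {

    // Count how often each item occurs; items are matched via equals/hashCode.
    static Map<Object, Integer> counts(List<?> items) {
        Map<Object, Integer> tally = new HashMap<>();
        for (Object item : items) {
            tally.merge(item, 1, Integer::sum);
        }
        return tally;
    }

    static boolean sameItemsInAnyOrder(List<?> actual, List<?> expected) {
        return counts(actual).equals(counts(expected));
    }

    public static void main(String[] args) {
        List<Map<String, String>> actual = List.of(Map.of("c", "d"), Map.of("a", "b"));
        List<Map<String, String>> expected = List.of(Map.of("a", "b"), Map.of("c", "d"));
        System.out.println(sameItemsInAnyOrder(actual, expected));
    }
}
```

This confirms that the contents match, but it does not help with (c): there is still no stable index to pick out a particular item afterwards.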


byb121 commented on June 12, 2024

Thanks for the reply. I agree that backward compatibility can be an issue.

I can see it's super convenient to have a sorted output channel that gives users lists in a deterministic order, but we know that a real channel has no deterministic order, so perhaps nf-test does not have to guarantee the order either. In most of our processes/workflows, output files are coupled with an ID value, a sample name for example, so there will be Map objects in the output channels. In these cases, I think support for order-agnostic LazyMap comparison in nf-test would be more welcome.


aaron-fishman-achillestx commented on June 12, 2024

Fair point. We can provide both options this way.


lukfor commented on June 12, 2024

Thanks for your input! So what do you think about the following:

(1) We extend the sort method to work with maps

(2) We extend the possibility to disable sorting. Currently, you can disable sorting only for a specific test:


test("test xy"){

  autoSort: false
  
  when {
    ...
  }
  then {
  ...
  }
}

We could also add this property to testsuites and to the config file to change the default behaviour.

I hope to find some free time on Friday to experiment with lazy maps.


aaron-fishman-achillestx commented on June 12, 2024

I like that idea!

