Comments (4)
The issue appears to stem from the change that batches the fetching of cherrypicked samples for the positives report. The cherrypicked samples are fetched (through many JOINs) using the root sample id. However, the root sample id is not unique to a sample, so fetching on it alone can return duplicates or the wrong cherrypicked samples.
Before batching, there were no duplicate fetches on root sample id, but the wrong cherrypicked samples could still be fetched for a positive, because samples were not uniquely identified. That is still the case now, and with batching we can also fetch for the same root sample id twice, which produces duplicates.
A fix for both issues would be to fetch cherrypicked samples uniquely, using root sample id + other fields.
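To illustrate the failure mode described above, here is a minimal in-memory sketch, not the actual lighthouse query. The field names `plate_barcode` and `coordinate` stand in for the unspecified "other fields" and are assumptions, as are the helper names and the sample data.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Dict[str, str]

# Hypothetical positives: two distinct samples share root sample id "R1".
positives: List[Row] = [
    {"root_sample_id": "R1", "plate_barcode": "P1", "coordinate": "A1"},
    {"root_sample_id": "R2", "plate_barcode": "P1", "coordinate": "A2"},
    {"root_sample_id": "R1", "plate_barcode": "P2", "coordinate": "B1"},
]

# Hypothetical cherrypicked records: only one of the "R1" samples was picked.
cherrypicked: List[Row] = [
    {"root_sample_id": "R1", "plate_barcode": "P2", "coordinate": "B1"},
]


def chunks(rows: List[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive batches of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def composite_key(row: Row) -> Tuple[str, str, str]:
    """Assumed unique key: root sample id plus two illustrative fields."""
    return (row["root_sample_id"], row["plate_barcode"], row["coordinate"])


def fetch_by_root_id(batch: List[Row]) -> List[Row]:
    """Match on root sample id only (the non-unique lookup)."""
    ids = {p["root_sample_id"] for p in batch}
    return [c for c in cherrypicked if c["root_sample_id"] in ids]


def fetch_by_composite_key(batch: List[Row]) -> List[Row]:
    """Match on the composite key, so each record is fetched at most once."""
    keys = {composite_key(p) for p in batch}
    return [c for c in cherrypicked if composite_key(c) in keys]


def collect(fetch: Callable[[List[Row]], List[Row]], batch_size: int) -> List[Row]:
    """Run `fetch` over every batch of positives and concatenate the results."""
    found: List[Row] = []
    for batch in chunks(positives, batch_size):
        found.extend(fetch(batch))
    return found


# "R1" lands in two different batches, so the same record is fetched twice.
duplicated = collect(fetch_by_root_id, batch_size=2)
# The composite key returns the picked sample exactly once.
unique = collect(fetch_by_composite_key, batch_size=2)
```

With a single batch (`batch_size=3`), the root-id lookup returns the record only once, but it still cannot tell which of the two "R1" positives it belongs to. That is the pre-batching situation: no duplicates, but possibly the wrong match.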
from lighthouse.
Deployment and testing:
- Alter connection string variable names in psd-deployment project
- Run report in UAT / training - check for duplicates
- Deploy to UAT / training - inc. deploying new docker config
- Re-run report - check duplicates are gone
Linked psd-deployment PR: #160
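The "check for duplicates" steps above could be done with something like the following sketch. It assumes the report has already been loaded as a list of row tuples. The column layout and the `find_duplicate_rows` helper are hypothetical and not part of lighthouse.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def find_duplicate_rows(rows: Iterable[Tuple[str, ...]]) -> Dict[Tuple[str, ...], int]:
    """Return each row that appears more than once, mapped to its count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical report rows: (root sample id, plate barcode, coordinate).
report_rows = [
    ("R1", "P1", "A1"),
    ("R1", "P2", "B1"),
    ("R1", "P2", "B1"),  # exact repeat of the previous row
    ("R2", "P1", "A2"),
]

duplicates = find_duplicate_rows(report_rows)
```

Running the same check before and after the deploy makes it possible to compare the two sets of duplicated rows directly.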
from lighthouse.
Prod report from today 10:30:
Num positives with locations: 574,059
Num positives total: 782,647
Prod report from today 14:11 (after deploy):
Num positives with locations: 574,664 (presume some must have been scanned into LabWhere)
Num positives total: 782,635 (12 fewer, as expected from removing the duplicates)
Searching for the samples that were known duplicates in the previous report, I can see that they now appear only twice rather than four times.
However, all the samples now say 'No' in the LIMS submission column, so I will revert the release.
from lighthouse.
Deployed the fix today and the results look good, as in the comment above. Known 'duplicate' samples now have one row with 'No' and one with 'Yes', as expected if one of them has been cherrypicked and the other has not.
Pre-deploy: 117,146 'Yes' values
Post-deploy: 117,128 'Yes' values
This implies that a few (18) were wrong but not duplicated: the same situation, I guess, but the samples happened to be processed in the same 'chunk'.
from lighthouse.