Comments (3)
Hi Alberto,
Seems like a straightforward improvement.
On Friday, July 22, 2016, Alberto Accomazzi wrote:
The search engine behind the ADS API is SOLR, which has performance problems with queries that require "deep pagination," i.e. the retrieval of records way down the list of results. The issues related to the degraded performance are explained here:
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
To improve the situation, SOLR implements the notion of cursors (also explained in the page above), which mitigate the pagination problem for queries generating a large set of results that need retrieving. I believe an implementation of multi-page retrieval based on cursors would be more robust for a few reasons:
- better performance: avoids recomputing a long result list just to fetch items way down in the list (which could generate a timeout)
- consistent results: if the index is modified during the follow-up queries, some results might be duplicated or skipped using the current approach, whereas this should not happen with the use of cursors
I tested this approach against our search engine and it seems to work, but it does require sort to include the unique key id as a tie-breaker when creating a list of results (e.g. sort=date asc,id asc).
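For reference, the cursor loop described above could look roughly like the sketch below. It relies only on the documented SOLR behaviour (start with cursorMark=*, pass back nextCursorMark, stop when the returned cursor equals the one sent). The names `iter_cursor_docs`, `fetch` and `make_fake_fetch` are hypothetical and not part of the ads package; `fetch` stands in for whatever performs the HTTP request, and the fake backend exists only so the loop can be run offline.

```python
from typing import Callable, Dict, Iterator, List


def iter_cursor_docs(fetch: Callable[[Dict[str, str]], dict],
                     q: str,
                     sort: str = "date asc,id asc",
                     rows: int = 100) -> Iterator[dict]:
    """Yield every document for `q` by following SOLR cursor marks.

    `fetch` is a hypothetical callable: it takes the query parameters and
    returns the decoded JSON response as a dict.
    """
    cursor = "*"  # SOLR's documented starting value for cursorMark
    while True:
        params = {"q": q, "sort": sort, "rows": str(rows), "cursorMark": cursor}
        payload = fetch(params)
        for doc in payload["response"]["docs"]:
            yield doc
        next_cursor = payload.get("nextCursorMark")
        # SOLR signals the end of the results by echoing the cursor it was sent.
        if next_cursor is None or next_cursor == cursor:
            break
        cursor = next_cursor


def make_fake_fetch(docs: List[dict]) -> Callable[[Dict[str, str]], dict]:
    """Offline stand-in for the search endpoint (assumption, for testing only).

    The fake cursor is simply the offset of the next document, as text.
    """
    def fetch(params: Dict[str, str]) -> dict:
        mark = params["cursorMark"]
        start = 0 if mark == "*" else int(mark)
        page = docs[start:start + int(params["rows"])]
        # Mimic SOLR: an empty page returns the same cursor that was sent.
        next_cursor = str(start + len(page)) if page else mark
        return {"response": {"docs": page}, "nextCursorMark": next_cursor}
    return fetch


if __name__ == "__main__":
    sample = [{"id": str(i)} for i in range(5)]
    for doc in iter_cursor_docs(make_fake_fetch(sample), "star", rows=2):
        print(doc["id"])
```

Because each request carries the cursor rather than a start offset, the server does not need to rebuild the earlier part of the result list for every page.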
from ads.
Currently `sort` defaults to `None`, which according to https://github.com/adsabs/adsabs-dev-api/blob/master/search.md returns results sorted by relevancy.
Using cursorMark requires defining `sort`, which would change this default behavior. Is there a way to explicitly specify "relevancy" as a `sort` strategy?
I'm going to continue implementing support for cursorMark, but it seems like it only makes sense to use it if `sort` is not `None`.
Thoughts?
from ads.
The default SOLR sort order is `score desc`, so you could explicitly set this as primary if `sort` is `None`.
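A minimal sketch of that fallback, assuming the unique key field is called id as in the example earlier in the thread. The helper name `cursor_safe_sort` is hypothetical and not something the ads package provides. It falls back to score desc when no sort is given and appends the unique key as a tie-breaker only if it is not already present.

```python
from typing import Optional


def cursor_safe_sort(sort: Optional[str], unique_key: str = "id") -> str:
    """Return a sort string that is usable with cursorMark.

    Falls back to relevancy ("score desc") when `sort` is None or empty,
    then appends `unique_key` as a tie-breaker if it is missing.
    """
    primary = sort if sort else "score desc"
    # Split "date asc, id desc" into clean clauses: ["date asc", "id desc"].
    clauses = [clause.strip() for clause in primary.split(",") if clause.strip()]
    fields = [clause.split()[0] for clause in clauses]
    if unique_key not in fields:
        clauses.append(f"{unique_key} asc")
    return ",".join(clauses)


if __name__ == "__main__":
    print(cursor_safe_sort(None))        # relevancy plus tie-breaker
    print(cursor_safe_sort("date asc"))  # user sort plus tie-breaker
```

This keeps the relevancy ordering as the default while still satisfying the requirement that the sort include the unique key.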
from ads.