Comments (2)
Adding a specific use case: we use Elasticsearch for search instead of attempting that in our database layer. The indexing workload ends up being about 90% updates and 10% index operations. All operations go through the `/_bulk` endpoint.
A typical index vs update flow would look like:
- Index to create the initial document with 50-100 fields
- Update 3-4 fields or nested documents on a daily to weekly basis
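As a rough illustration of that flow, here is a minimal sketch of the two kinds of `/_bulk` payloads involved. The index name `products`, the document id, and the field names are made up for the example; only the action-and-meta-data line followed by a source line (and the `doc` wrapper for partial updates) comes from the Elasticsearch bulk format.

```python
import json


def index_action(index_name, doc_id, full_doc):
    """Two NDJSON lines that create (or replace) a whole document."""
    return [
        json.dumps({"index": {"_index": index_name, "_id": doc_id}}),
        json.dumps(full_doc),
    ]


def update_action(index_name, doc_id, changed_fields):
    """Two NDJSON lines that partially update an existing document."""
    return [
        json.dumps({"update": {"_index": index_name, "_id": doc_id}}),
        json.dumps({"doc": changed_fields}),
    ]


def bulk_body(lines):
    """Join NDJSON lines; a bulk body must end with a newline."""
    return "\n".join(lines) + "\n"


# Hypothetical data: an initial document with 50 fields...
full_doc = {f"field_{i}": i for i in range(50)}
initial_request = bulk_body(index_action("products", "1", full_doc))

# ...and a later update that touches only three of them.
changed = {"field_1": 101, "field_2": 102, "field_3": 103}
later_request = bulk_body(update_action("products", "1", changed))
```

Each body would be sent to `/_bulk` in its own request. The update payload carries only the changed fields, which is the part a benchmark has to craft per document.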
from rally-tracks.
Thanks for the use case. To me, this sounds like it should be solved with a dedicated corpus that includes the documents and the corresponding action-and-meta-data lines, because the documents to be updated need to be crafted specifically. The approach we use here instead is to simulate id conflicts randomly (based on a configurable probability) and then emit either an `index` or `update` action in the corresponding meta-data line, but we always use the full document. Hope that makes sense now with my additional explanation.
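To make the simulated-conflict idea concrete, here is a minimal sketch of how such a generator could work. It is not Rally's actual implementation: the function name `action_lines`, the `on_conflict` argument, and the use of a 0 to 1 fraction for the probability are assumptions made for illustration.

```python
import json
import random


def action_lines(docs, conflict_probability, on_conflict="update", seed=None):
    """Yield NDJSON bulk lines, reusing an already emitted id with the given
    probability (a fraction between 0 and 1, an assumption of this sketch)."""
    rng = random.Random(seed)
    seen_ids = []
    next_id = 0
    for doc in docs:
        if seen_ids and rng.random() < conflict_probability:
            # Simulated conflict: pick an id that was emitted earlier.
            doc_id = rng.choice(seen_ids)
            if on_conflict == "update":
                # The full document is still sent, wrapped in "doc".
                yield json.dumps({"update": {"_id": doc_id}})
                yield json.dumps({"doc": doc})
            else:
                yield json.dumps({"index": {"_id": doc_id}})
                yield json.dumps(doc)
        else:
            # No conflict: emit a fresh id with a plain index action.
            doc_id = str(next_id)
            next_id += 1
            seen_ids.append(doc_id)
            yield json.dumps({"index": {"_id": doc_id}})
            yield json.dumps(doc)


# Hypothetical usage: roughly 90% of the actions target existing ids.
sample_docs = [{"n": i} for i in range(10)]
body = "\n".join(action_lines(sample_docs, 0.9, seed=42)) + "\n"
```

Because the update always carries the whole document, this approximates the conflict rate of an update-heavy workload but not the small partial updates described in the use case, which is why a dedicated corpus is suggested for that.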