Comments (13)
I'm going to admit that I'm still wrapping my head around how IAsyncEnumerable and AsyncSeq work, given we don't have the await foreach in F# that C# would tend to unpack.
Because of that, I'm not completely sure that AsyncSeq understands pagination the same way.
Do you have an example of what you're trying to do?
from fsharp.cosmosdb.
What I'm doing right now (with the v3 SDK) is this:
let public fetchAllItems (feedIterator: FeedIterator<'a>) =
    asyncSeq {
        while feedIterator.HasMoreResults do
            let! response = feedIterator.ReadNextAsync() |> Async.AwaitTask
            yield! response |> AsyncSeq.ofSeq
    }
This returns an AsyncSeq<'a> and seems to work, but I have my doubts about whether it is the "right" way. I believe someone helped me with the final yield! line.
from fsharp.cosmosdb.
For a bit more explanation, response is a FeedResponse<'a>, which implements IEnumerable<'a>. response |> AsyncSeq.ofSeq then gives an AsyncSeq<'a>, which is then merged into the parent sequence via yield!.
See: https://theburningmonk.com/2011/09/fsharp-yield-vs-yield/
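To make the yield! flattening concrete without a Cosmos account, here is a minimal sketch. FakeIterator is a hypothetical stand-in for FeedIterator<'a> (it is not part of any SDK) that hands out pre-built pages one at a time; the only library assumed is FSharp.Control.AsyncSeq, referenced script-style.

```fsharp
#r "nuget: FSharp.Control.AsyncSeq"

open System.Threading.Tasks
open FSharp.Control

/// Hypothetical stand-in for FeedIterator<'a>: serves pre-built pages one at a time.
type FakeIterator<'a>(pages: 'a list list) =
    let mutable remaining = pages
    member _.HasMoreResults = not (List.isEmpty remaining)
    member _.ReadNextAsync() : Task<seq<'a>> =
        let page = List.head remaining
        remaining <- List.tail remaining
        Task.FromResult(List.toSeq page)

/// Same shape as fetchAllItems above: each page is flattened into the outer sequence.
let fetchAllItems (iterator: FakeIterator<'a>) =
    asyncSeq {
        while iterator.HasMoreResults do
            let! page = iterator.ReadNextAsync() |> Async.AwaitTask
            // yield! merges every element of the page; a plain yield would emit the page itself
            yield! AsyncSeq.ofSeq page
    }

let items =
    FakeIterator([ [ 1; 2 ]; [ 3 ]; [ 4; 5 ] ])
    |> fetchAllItems
    |> AsyncSeq.toListAsync
    |> Async.RunSynchronously

printfn "%A" items // [1; 2; 3; 4; 5]
```

The caller only ever sees individual items; the page boundaries disappear inside the asyncSeq block.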
from fsharp.cosmosdb.
Looking into the v3 SDK source and comparing it to v4, it looks like it works a bit differently. In v4 there isn't the FeedIterator<'T> that v3 used; instead it's AsyncPageable<'T>.
Now, digging through this a bit further, it turns out that this is really a wrapper over FeedIterator that hides away the paging unless you call AsyncPageable.AsPages(), to which you provide the size of the pages you need.
So, I would probably need an execPagedAsync where you can provide the right info, and that could then return the results as an AsyncSeq, but I'll have to play with it (I'm trying to work out how to handle the Insert API at present).
from fsharp.cosmosdb.
Could you not just use OFFSET and LIMIT in your query "SQL" and remember what page index you're on? Or have I missed something here?
Edit: I believe that's the official CosmosDB approach for paging.
from fsharp.cosmosdb.
@seankearon I don't think these two types of paging are the same thing, though I may be wrong. The type I'm referring to is more akin to batching. The Cosmos SDK client won't return everything all at once. You have to continually call it to get the next batch of data. I'll take a look at what's in V4 when I get a chance.
from fsharp.cosmosdb.
Yeah, I'm wondering where the difference is between those two.
If an async stream is like using a drinking straw to drink from a lake, then paging/batching is like using a bucket to drink from a lake.
If I'm passing you the bucket to drink from, do you care whether I fill it all at once or in little steps using my drinking straw? Probably not - you just want the next bucket of water.
I'm thinking that the way to fill up bucket n using the straw would be something like this:
let usersByPage (page: int) =
    host
    |> Cosmos.host
    |> Cosmos.connect key
    |> Cosmos.database "UserDb"
    |> Cosmos.container "UserContainer"
    |> Cosmos.query "SELECT u.FirstName, u.LastName FROM u WHERE u.LastName = @name OFFSET xyz LIMIT pqr"
    // etc. etc. etc. (parameters, with xyz and pqr worked out from the page index)
    |> Cosmos.execAsync<User>
But then, it's been a loooong day! :)
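As a rough illustration of filling in the xyz and pqr placeholders, here is a small sketch that derives the OFFSET and LIMIT values from a zero-based page index. The helper names pagingClause and usersQuery, and the choice of a zero-based index, are assumptions made for the example; nothing here is part of FSharp.CosmosDb.

```fsharp
/// Builds the paging clause for a zero-based page index (hypothetical helper).
let pagingClause (pageIndex: int) (pageSize: int) =
    sprintf "OFFSET %d LIMIT %d" (pageIndex * pageSize) pageSize

/// Appends the paging clause to the query from the comment above.
let usersQuery (pageIndex: int) (pageSize: int) =
    "SELECT u.FirstName, u.LastName FROM u WHERE u.LastName = @name "
    + pagingClause pageIndex pageSize

printfn "%s" (usersQuery 2 25)
// SELECT u.FirstName, u.LastName FROM u WHERE u.LastName = @name OFFSET 50 LIMIT 25
```

The resulting string would then be passed to Cosmos.query in place of the hard-coded one, with the caller remembering which page index it is on.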
from fsharp.cosmosdb.
I've finally had some time to come back and revisit this issue and work out if it's possible/viable to do pagination support.
TL;DR: Use OFFSET and LIMIT as @seankearon has suggested; I don't think I can put anything into the API to do it for you. The best I can do is have a way to get batched results per iteration of the AsyncSeq.
We're going to dive through a rabbit hole now, so choose if you want to read on as I'm partially writing this down for my own sake. I'm going to trace through a bunch of the Azure.Cosmos code as it currently stands.
When you execute a GET query you call the method GetItemQueryResultsAsync (Note: the Async suffix is added after -preview4, so my code doesn't use it, but it will eventually) and this creates a FeedIteratorCore, which is what handles the ReadNextAsync operation to fetch records from CosmosDB.
This type is then wrapped in FuncAsyncPageable to return the AsyncPageable<T> response that is consumed by AsyncSeq in F# to make our nice API.
AsyncPageable, the base class of FuncAsyncPageable, has the AsPages and MoveNext methods defined on it, MoveNext being what is ultimately called by the state machine to go over the iterator (it bubbles through a few other types, but it's ultimately where we land). What's interesting about the implementation is that it actually calls AsPages anyway, so the AsPages method is our important one.
Our AsPages calls the func passed in here, which is a call to GetPagesAsync on PageIteratorCore.
Now, if we trace through here, AsPages takes a continuationToken and a pageSizeHint, but GetPagesAsync on our iterator only takes the continuationToken; the pageSizeHint is dropped along the way. My guess is that the pageSizeHint doesn't map to anything that is available on the CosmosDB REST API, so it can't be used and is discarded in turn.
So, what's the difference between iterating over the AsyncPageable<T> and the IAsyncEnumerable<Page<T>> (how it currently works vs calling .AsPages)? Whether you get a single result or a batch of results. Page<T> has a Values property which will contain up to 100 T items that you need to unpack. This means it comes down to "do you want to work with a single result each iteration, or with a batch of items?" (nowhere can I find a "HasMore" property exposed; that's just determined by whether you keep iterating).
I tested this against a large Cosmos store I have with the following code:
async Task Main()
{
    var client = new CosmosClient("...");
    var container = client.GetDatabase("...").GetContainer("...");
    var qd = new QueryDefinition("SELECT c.id FROM c");

    var nonCount = 0;
    "Non-paged query".Dump();
    await foreach (var response in container.GetItemQueryIterator<Dictionary<string, string>>(qd))
    {
        nonCount++;
    }

    "Paged query".Dump();
    var pageCount = 0;
    await foreach (var response in container.GetItemQueryIterator<Dictionary<string, string>>(qd).AsPages())
    {
        pageCount++;
    }

    $"Non-Paged ran {nonCount} times to Paged {pageCount}".Dump();
}
And here's the response:
Non-paged query
Paged query
Non-Paged ran 3811 times to Paged 39
The iteration count dropped, and when I ran a network trace on it I saw the same number of network requests happening in both cases.
I might look at putting in a queryBatch or something like that which returns AsyncSeq<Page<T>> to give feature parity with the underlying API.
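For anyone wondering what the two shapes look like from the F# side, here is a minimal sketch using only FSharp.Control.AsyncSeq. The pages value is a hypothetical stand-in for a batched result (plain arrays instead of Page<T>, and made-up sizes), so it only illustrates the relationship between the two iteration counts, not the real API.

```fsharp
#r "nuget: FSharp.Control.AsyncSeq"

open FSharp.Control

// Hypothetical stand-in for a batched result: 250 items in pages of at most 100.
let pages : AsyncSeq<int[]> =
    [| 1 .. 250 |] |> Array.chunkBySize 100 |> AsyncSeq.ofSeq

/// Counts how many times a consumer would iterate over the sequence.
let count (source: AsyncSeq<'a>) =
    source |> AsyncSeq.fold (fun n _ -> n + 1) 0 |> Async.RunSynchronously

// Iterating the batches directly: one iteration per page.
let pageCount = count pages

// Flattening each page first: one iteration per item.
let itemCount =
    pages
    |> AsyncSeq.collect (fun page -> AsyncSeq.ofSeq page)
    |> count

printfn "Items ran %d times to pages %d" itemCount pageCount
// Items ran 250 times to pages 3
```

Either shape can be recovered from the batched one, which is why exposing the batches gives parity without adding real paging.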
from fsharp.cosmosdb.
Thanks for the detailed write-up. I'm waiting to use this on my project until the v4 Cosmos API SDK is fully released, but this looks great.
from fsharp.cosmosdb.
I've added a "pagination" option in a new branch: https://github.com/aaronpowell/FSharp.CosmosDb/tree/pagination
Basically, all it does is add a new method, Cosmos.execBatchAsync, which returns AsyncSeq<Page<T>>, so you can get some paged results, but it's not really paging, for the reasons I mentioned above.
from fsharp.cosmosdb.
This will be coming in the next release.
from fsharp.cosmosdb.
If anyone wants to test, grab the 0.3.0 pre-release packages from https://github.com/aaronpowell?tab=packages&repo_name=FSharp.CosmosDb
from fsharp.cosmosdb.
Available on NuGet now.
from fsharp.cosmosdb.