Comments (26)
Re SMTP MIME format, see https://github.com/davidmoten/odata-client#download-an-email-in-smtp-mime-format
from odata-client.
Here's the documentation https://github.com/davidmoten/odata-client#delta-collections.
CollectionPage<Message> delta = OdataFactory
    .request()
    .users("[email protected]")
    .mailFolders("inbox")
    .messages()
    .delta()
    .get();
delta.stream()...
// a while later
delta = delta.nextDelta();
...
from odata-client.
BTW, I think the first call to delta() lists all messages, so you might want to use .deltaTokenLatest() as per the documentation.
from odata-client.
Is this question about serialization? If you serialize the CollectionPage<Message> to JSON, it should have the @odata.deltaLink field in it.
from odata-client.
Thanks for the response.
We are expecting millions of messages in a mail folder, and the service will not be able to process them all locally, so we stream the messages received from the Exchange server. Initially the service pulls k records (where k is the page size) and sends them to the agent, then fetches the next k records and sends those, and so on until it gets a 'deltaLink' in the response. Once the service receives the 'deltaLink', it sends the records from that last page to the agent and stops pulling data.
Now we need to send the 'deltaLink' to the agent along with the last set of records (from the page containing the deltaLink). Can you please suggest how to fulfill this requirement?
from odata-client.
You can use the CollectionPage.deltaLink() method. Note also that when you serialize a CollectionPage to JSON it will include the deltaLink field if present.
from odata-client.
BTW in terms of page size you should also read https://github.com/davidmoten/odata-client#your-own-page-size.
from odata-client.
Thanks, David, for looking at this.
We are using a stream (as shown below), and it seems that the library serializes only the mail message records, as with collectionPage.currentPage(), since the data on the agent doesn't show the nextLink or deltaLink. Any suggestions?
public Flux<Stream<Message>> GetMails()
{
    Stream<Message> collectionPage = OdataFactory
        .request()
        .users("[email protected]")
        .mailFolders("inbox")
        .messages()
        .delta()
        .maxPageSize(20)
        .stream();
    return Flux.just(collectionPage);
}
from odata-client.
Ah well, this boils down to how a Stream serializes, doesn't it? A Stream is not a List, but it is like one for serialization purposes (the library you are using presumably does something predictable for Streams). Just imagine you take all the Message objects out of a CollectionPage object and add them to an ArrayList. What makes you think that the metadata will go across with it? It won't!
What's the maximum number of messages you want to go across in one call to getMails? It certainly won't be 20 with your current code, because the stream keeps fetching more pages. You could just return Flux<CollectionPage<Message>> if you were happy with returning one page per call. Then the serialization would include nextLink, and deltaLink (if on the last page). It's ugly to couple an API to an internal library's classes, so you could always create your own Page object and return Flux<Page<Message>>. Of course, if you only care about the JSON representation across the network then it doesn't matter.
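A minimal sketch of the "own Page object" idea, assuming nothing about odata-client itself: `Page`, `PageForwarder`, `fetch` and `agent` are hypothetical names made up for illustration, and `fetch` stands in for whatever call retrieves one page from the server.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical DTO, not an odata-client class: one page of items plus the
// paging metadata that is lost when only the items are serialized.
final class Page<T> {
    final List<T> items;
    final Optional<String> nextLink;   // present when more pages follow
    final Optional<String> deltaLink;  // expected only on the last page

    Page(List<T> items, Optional<String> nextLink, Optional<String> deltaLink) {
        this.items = items;
        this.nextLink = nextLink;
        this.deltaLink = deltaLink;
    }

    // True when this page ends the current round of paging.
    boolean isLast() {
        return !nextLink.isPresent();
    }
}

final class PageForwarder {
    // Fetches one page at a time, hands each page to the agent, and returns
    // the deltaLink carried by the last page (empty if there was none).
    static <T> Optional<String> forwardAll(String firstUrl,
                                           Function<String, Page<T>> fetch,
                                           Consumer<Page<T>> agent) {
        String url = firstUrl;
        while (true) {
            Page<T> page = fetch.apply(url);
            agent.accept(page);            // only one page is held at a time
            if (page.isLast()) {
                return page.deltaLink;
            }
            url = page.nextLink.get();
        }
    }
}
```

Because each `Page` carries its own links, the receiver sees the nextLink or deltaLink without the sender having to buffer every message first.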
BTW, you do realize that the Message object won't include email attachments, especially large ones, and there is weirdness with special attachments like Reference attachments? If you want to capture the whole SMTP message reliably you can get it in MIME format as a stream from the Graph API using odata-client-msgraph. I can show you how if it is of interest.
from odata-client.
Apologies, '.maxPageSize(20)' is not relevant in the function ('GetMails') shared above, as the stream will automatically pull the pages until the deltaLink is received.
Agreed, we can pull all the records using 'nextPage().get()' until the deltaLink is received and finally add these records into a CustomPage along with the deltaLink. With this approach the response on the agent will have the records as well as the deltaLink. However, we would need to store all the records in a list locally, and if a mail folder has millions of messages this will hurt performance, and the service may not have enough memory to hold all those messages.
This is why we are going for a stream, but the only challenge with a stream is how to send the 'deltaLink' in the response to the agent, so that next time the agent can request only incremental changes.
from odata-client.
Thanks for the detail, I'll give you an example of what to do shortly.
from odata-client.
I'm adding a method with the signature Stream<ObjectOrDeltaLink<T>> streamWithDeltaLink() to CollectionPage<T>.
The approach to solve your problem is to take a stream of n Message objects and convert that to a stream of n+1 wrapper objects where the first n objects contain a Message and the last object has an Optional deltaLink (after all, not every stream has a deltaLink at the end).
When Stream<ObjectOrDeltaLink<Message>> is serialized you should see:
[
{ "object": MESSAGE_JSON, "deltaLink": null},
...
{ "object": MESSAGE_JSON, "deltaLink": null},
{ "object": null, "deltaLink": "https://blahblah" }
]
I'll finish tests and then let you try it out.
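To illustrate the n to n+1 conversion described above, here is a rough, self-contained sketch. It is not the library's implementation: `ItemOrDeltaLink` and `DeltaLinkStreams` are hypothetical stand-ins, and the deltaLink is passed as a `Supplier` on the assumption that it only becomes known once the items have been consumed.

```java
import java.util.Optional;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Illustrative wrapper (deliberately not named ObjectOrDeltaLink): holds
// either an item or, for the trailing element, an optional deltaLink.
final class ItemOrDeltaLink<T> {
    final Optional<T> object;
    final Optional<String> deltaLink;

    private ItemOrDeltaLink(Optional<T> object, Optional<String> deltaLink) {
        this.object = object;
        this.deltaLink = deltaLink;
    }

    static <T> ItemOrDeltaLink<T> ofObject(T object) {
        return new ItemOrDeltaLink<>(Optional.of(object), Optional.empty());
    }

    static <T> ItemOrDeltaLink<T> ofDeltaLink(Optional<String> deltaLink) {
        return new ItemOrDeltaLink<>(Optional.empty(), deltaLink);
    }
}

final class DeltaLinkStreams {
    // Turns n items into n + 1 wrappers. The supplier is only called when the
    // trailing element is reached, i.e. after all items have been emitted.
    static <T> Stream<ItemOrDeltaLink<T>> withDeltaLink(
            Stream<T> items, Supplier<Optional<String>> deltaLink) {
        Stream<ItemOrDeltaLink<T>> wrapped =
                items.map(item -> ItemOrDeltaLink.<T>ofObject(item));
        Stream<ItemOrDeltaLink<T>> tail =
                Stream.of(deltaLink).map(s -> ItemOrDeltaLink.<T>ofDeltaLink(s.get()));
        return Stream.concat(wrapped, tail);
    }
}
```

The stream stays lazy, so nothing needs to be collected into a list before the deltaLink is appended.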
from odata-client.
I've merged the change into master. So now you can do this:
public Flux<ObjectOrDeltaLink<Message>> getMails()
{
    Stream<ObjectOrDeltaLink<Message>> stream =
        client
            .users("[email protected]")
            .mailFolders("inbox")
            .messages()
            .delta()
            .deltaTokenLatest()
            .streamWithDeltaLink();
    return Flux.fromStream(stream);
}
Note the use of deltaTokenLatest (are you familiar with the effects of that option?). Also, you should not need to rebuild a client from a factory on every call: a client built once can be used for the lifetime of an application and is thread-safe. Note also that you were creating the Flux using Flux.just when you should be using Flux.fromStream.
What library are you using to expose a Flux across the network (what library is doing the serialization of a Flux into JSON)? Is it WebFlux?
from odata-client.
Thank you David, yes we are using WebFlux.
I hope the above code changes will be part of release 0.1.35. When will it be released?
from odata-client.
I'll look at a release in the next day or so. In the meantime you can just do this to use the SNAPSHOT version locally:
git clone https://github.com/davidmoten/odata-client.git
cd odata-client
mvn clean install
from odata-client.
Hi David,
Thanks a lot. We have tested the new code changes and they produce the output we expected.
Just a small request: could the 'streamWithDeltaLink' method also be part of 'CollectionPageNonEntityRequest' and 'CollectionEntityRequestOptionsBuilder'?
Otherwise we are OK with the current code changes as well; below is the invocation:
public Flux<ObjectOrDeltaLink<Message>> getMails()
{
    Stream<ObjectOrDeltaLink<Message>> stream =
        client
            .users("[email protected]")
            .mailFolders("inbox")
            .messages()
            .delta()
            .get()
            .streamWithDeltaLink();
    return Flux.fromStream(stream);
}
from odata-client.
Glad to hear it works @madanbisht, thanks for testing the change. I've added the extra methods as requested and I'll build a release shortly.
from odata-client.
0.1.35 is on Maven Central now.
from odata-client.
Thank you, we are now using build 0.1.35 and it's working perfectly.
Just one small request: like the deltaLink, is it possible to add the nextLink to the response? It would be useful for handling the failure scenario below.
Consider a mailbox with, say, 2 million mails. Using the stream we will be able to pull all the mails, including the deltaLink (required for incremental messages).
If the connection somehow breaks partway through, say after pulling 1 million mails, the agent has no information about what data it has received, and I think in this situation the agent has to make the request again and pull all the data, including mails it has already received.
Adding the nextLink to the response would enable the agent to request only the remaining mails (those it hasn't received yet) using the nextLink.
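The resume idea could look roughly like the sketch below. It assumes a nextLink remains usable after a dropped connection, which nothing in this thread confirms; `LinkedPage`, `ResumableReader`, `fetch` and the checkpoint holder are all hypothetical names for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical page shape: items plus the link to the following page
// (null on the last page).
final class LinkedPage {
    final List<String> items;
    final String nextLink;

    LinkedPage(List<String> items, String nextLink) {
        this.items = items;
        this.nextLink = nextLink;
    }
}

final class ResumableReader {
    // Reads pages starting at the checkpointed URL. The checkpoint advances
    // only after a page has been handed to the sink, so after a failure a
    // retry starts at the page that was in flight, not at the beginning.
    static void read(AtomicReference<String> checkpoint,
                     Function<String, LinkedPage> fetch,
                     List<String> sink) {
        String url = checkpoint.get();
        while (url != null) {
            LinkedPage page = fetch.apply(url);  // may throw on a broken connection
            sink.addAll(page.items);
            url = page.nextLink;
            checkpoint.set(url);                 // null means everything was read
        }
    }
}
```

In a real service the checkpoint would need to be stored durably rather than in memory.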
from odata-client.
Ha, getting complicated for you! To solve that problem I would call nextDelta more often. If you get a failure, you have fewer messages to repeat processing for. The other reason that you should call nextDelta more often is that the delta tokens themselves have limited lifetimes; they expire!
Are you trying to guarantee processing of every email? If so, then I doubt using deltas is an appropriate method, especially as delta tokens expire. At my work we guarantee processing of emails by pulling down unread emails from the mailbox and only marking them as read once they have been persisted to a queue for processing locally. There appears to be an index on the read/unread status, because performance is still good for this request even though the mailbox has grown a lot (in our case 2000 emails a day). Good luck with getting O365 to scale to millions of messages per day in one mailbox. Have you tested this? What's the plan for removing messages from the mailbox? If you are doing that anyway, why don't you just stream all messages from the mailbox and delete them when they are processed successfully?
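The "mark as read only after persisting" pattern can be sketched as follows. `Mailbox` and `UnreadDrainer` are hypothetical abstractions invented for this example; the real calls would go through the Graph client.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical mailbox abstraction, not part of odata-client.
interface Mailbox {
    List<String> unreadIds();
    void markRead(String id);
}

final class UnreadDrainer {
    // Persists each unread message, then marks it as read. If persisting
    // throws, the message stays unread and is picked up again on the next run.
    static int drain(Mailbox mailbox, Consumer<String> persist) {
        int processed = 0;
        for (String id : mailbox.unreadIds()) {
            persist.accept(id);
            mailbox.markRead(id);
            processed++;
        }
        return processed;
    }
}
```

The ordering is the point: a crash between the two calls can lead to a message being persisted twice, but never to one being lost.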
from odata-client.
Hi David,
Actually we are not assuming millions of messages per day in one mailbox; in fact that's impossible. Below is the use case I was talking about.
The intent of our application is to retrieve mails from an account mailbox and store them on disk, so that the mails can be restored back into the mailbox whenever required. If an account is 5 to 10 years old, it can have millions of messages, and our application needs to pull all of them.
Once all mails for an account have been retrieved successfully, the application will use the deltaLink to pull incremental mails in future. If the next (incremental) pull happens 1 year after the very first pull, the account can again have millions of new mails.
from odata-client.
One more concern, please correct me if I'm wrong: as you said that the deltaToken has a limited lifetime, I think the deltaLink will no longer be usable after a week or a month.
from odata-client.
There's not much out there but here's something that talks about delta links expiring:
https://stackoverflow.com/questions/51933002/syncstatenotfound-error-how-to-fix-or-avoid
and this says that they expire within 7 days:
http://www.msfttoday.com/duration-of-change-tracking-tokens-for-identity-and-education-resources/
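One way to guard against an expired link is to store when it was obtained and fall back to a full sync once it is too old. This is only a sketch: `StoredDeltaLink` is a hypothetical class, and the maximum age is a parameter because the 7-day figure comes from the article above rather than from anything confirmed here for mail folders.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical record of a deltaLink and the time it was obtained.
final class StoredDeltaLink {
    final String url;
    final Instant obtainedAt;

    StoredDeltaLink(String url, Instant obtainedAt) {
        this.url = url;
        this.obtainedAt = obtainedAt;
    }

    // Returns the link if it is younger than maxAge, otherwise empty,
    // which the caller can treat as "do a full sync instead".
    Optional<String> ifFresh(Instant now, Duration maxAge) {
        boolean fresh = Duration.between(obtainedAt, now).compareTo(maxAge) < 0;
        return fresh ? Optional.of(url) : Optional.empty();
    }
}
```

Even with such a check, the server may still reject a link, so the fallback path is worth handling on the error response as well.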
from odata-client.
Thank you David for sharing it; it has an impact on our use case.
Your suggestions would also be appreciated on the above use case, i.e. error handling (for connection errors) in scenarios where the application uses a stream to process millions of messages for an account that is 5 to 10 years old.
from odata-client.
> The intent of our application is to retrieve mails from an account mailbox and store it somewhere in a disk, so that the mails can be restored back into the mailbox whenever required. Considering a scenario where an account is 5 to 10 years old, the account can have millions of messages and our application need pull all these messages.
Use raw SMTP format as I've already commented, not Message JSON. That way you retain everything about the email, including all attachments no matter how big, and the SMTP headers. You'll need to confirm the practicality of this for restoring an email to an account.
> Once all mails for an account retrieved successfully, application will use deltalink to pull incremental mails in future. If the next pull(incremental) happens after 1 year from the last pull(very first pull) then again the account can have Millions of new mails.
The solution for your error handling scenario is to pull more often, which reduces the stream size and accounts for expiry of the deltaLink, and to handle duplicates sensibly. Duplicates are inevitable, so make sure you account for them. Microsoft suggests using a webhook for notification of changes so you don't have to poll for them. Worth looking at too.
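As a sketch of handling duplicates sensibly, the receiver can remember which message ids it has already handled and skip repeats. `Deduplicator` is a hypothetical helper, and the in-memory set is an assumption made for brevity; a backup service would need to persist the ids.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper: makes processing idempotent per message id.
final class Deduplicator {
    private final Set<String> seen = new HashSet<>();

    // Runs the handler at most once per id. Returns true if the message was
    // handled now, false if it was skipped as a duplicate. The id is recorded
    // only after the handler succeeds, so a failed message can be retried.
    boolean handleOnce(String messageId, Consumer<String> handler) {
        if (seen.contains(messageId)) {
            return false;
        }
        handler.accept(messageId);
        seen.add(messageId);
        return true;
    }
}
```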
from odata-client.
Thanks David for your kind support and understanding.
Also, we are interested in streaming the SMTP message from the Graph API using odata-client-msgraph. If required, we will create a new thread for it.
from odata-client.