Comments (18)
If you're going to filter watched flags, I would love if it could be filtered by tag. I feel like prefixes is more of a workaround for what tags can do more effectively.
from ruby-eventsource.
This is my current idea for a minimal fix, but need to test it against a few prod systems over the next few days to see if that's correct. Gist of the idea is to introduce a cursor to prevent re-reading chunks. String#index
takes an offset to start scanning from, which is very useful for this.
from ruby-eventsource.
This was brought to our attention recently via a support request. I believe your description is accurate, and I understand it now better than I did the original report. I'm not sure I understand this part, though:
Current experimentation I've been working on has been to prevent full reads of a partial buffer, but I believe this reflects more on the entirety of flag data being sent in a single message. If it truly is all being sent in a single response this will be a severe issue for larger clients.
By "response" do you mean a single SSE event? The stream connection normally remains open, so all downloaded data from LaunchDarkly makes up a single HTTP response. But it is true that the first SSE event sent on the stream (regardless of whether it's the Ruby SDK or another SDK) contains the full flag data. You also mentioned "large clients" which was a bit confusing, but I'm guessing that what you meant there was "large LaunchDarkly data sets".
from ruby-eventsource.
Yeah, that was poorly worded, let me try that one again.
From observation I believe that the initial flag data provided to init
on data stores is given in a single SSE PUT
message in the form of a JSON blob, so: data: { entire_flag_payload_here }
. Because of our admittedly overkill usage of tokens our flags are approximately 150Mb of data going over the wire.
If that was a single message, non-streamed/chunked, it'd be a bit expensive to deserialize that much JSON. When the response is streamed and chunked there won't be a record break (/[\r\n]/
) until the end of the 150Mb of data. That means that the @buffer.index(/[\r\n]/)
check will be called for every chunk that's received until the end of that payload, starting from the beginning.
The experiment I'm working on right now introduces a cursor to not traverse already read chunks as a quicker fix.
In the initial message I'd used this example, where every -
represents a 1Mb theoretical chunk of 150Mb of data:
buffer = ''
buffer.index(/[\r\n]/)
buffer = '-'
buffer.index(/[\r\n]/)
buffer = '--'
buffer.index(/[\r\n]/)
# ...
buffer = '-' * 150
buffer.index(/[\r\n]/)
The fix tries to do this instead:
buffer = ''
buffer.index(/[\r\n]/)
previous_buffer_size = buffer.size
buffer = '-'
buffer.index(/[\r\n]/, previous_buffer_size)
previous_buffer_size = buffer.size
buffer = '--'
buffer.index(/[\r\n]/, previous_buffer_size)
# ...
previous_buffer_size = buffer.size
buffer = '-' * 150
buffer.index(/[\r\n]/, previous_buffer_size)
...so it'll start scanning where the last @buffer.index
left off instead of trying to scan the entire thing again.
To be fair this had me scratching my head for a good week or so trying to figure what was up, and it was hard to think of a good way to approach or fix it, so it took a bit to write this ticket. Sorry about the delay there.
from ruby-eventsource.
I'd run some isolated tests to see if I could see just how much impact that index
scan has versus using a positional cursor in conjunction with it:
chunk = '-' * 1000
chunks = [*(([chunk] * 999) * 150), chunk + "\n"]
chunks.join.bytesize.fdiv 10**6 # 1 mil
# => 149.851001 (Mb)
# Full scan
Benchmark.measure do
start_time = Time.now
previous_time = start_time
buffer = ''
chunks.each_with_index do |c, i|
if i % 1000 == 0
current_time = Time.now - start_time.to_f
puts "i: #{i}, delta_time: #{(current_time - previous_time).round(4)}, time: #{current_time.to_f.round(4)}"
previous_time = current_time
end
buffer << c
break true if buffer.index("\n")
end
end
# i: 1000, delta_time: 0.0144, time: 0.0144
# i: 2000, delta_time: 0.0388, time: 0.0533
# i: 3000, delta_time: 0.051, time: 0.1043
# ...
# i: 43000, delta_time: 2.4448, time: 46.5457
# i: 44000, delta_time: 2.4821, time: 49.0279
# i: 45000, delta_time: 2.624, time: 51.6519
# ...
# i: 147000, delta_time: 9.4535, time: 648.7023
# i: 148000, delta_time: 9.5025, time: 658.2048
# i: 149000, delta_time: 9.0586, time: 667.2634
# Positional scan
Benchmark.measure do
start_time = Time.now
previous_time = start_time
buffer = ''
chunks.each_with_index do |c, i|
if i % 1000 == 0
current_time = Time.now - start_time.to_f
puts "i: #{i}, delta_time: #{(current_time - previous_time).round(4)}, time: #{current_time.to_f.round(4)}"
previous_time = current_time
end
pos = buffer.size
buffer << c
break true if buffer.index("\n", pos)
end
end
# i: 142000, delta_time: 0.0007, time: 0.1223
# i: 143000, delta_time: 0.0007, time: 0.123
# i: 144000, delta_time: 0.0008, time: 0.1238
# i: 145000, delta_time: 0.0007, time: 0.1245
# i: 146000, delta_time: 0.0007, time: 0.1252
# i: 147000, delta_time: 0.0007, time: 0.1259
# i: 148000, delta_time: 0.0007, time: 0.1266
# i: 149000, delta_time: 0.0007, time: 0.1274
667.2634 / 0.1274
# => 5218.621638954628 (times faster)
This looks to be a viable fix to try, and I'll test it with our production servers on Monday. Any thoughts?
from ruby-eventsource.
It looks plausible at first glance, we'll just need to review it in more detail.
from ruby-eventsource.
Looking at the branch with your proposed fix, there's one thing I'm not sure is right. It's here, where you are moving @cursor_position
to the current end-of-buffer immediately before appending each new chunk of data to the buffer.
It seems to me that a more logical place to do this would be within read_line
, immediately after you have scanned to the end of the buffer and determined that there isn't a line break in it. Currently I think the result is the same, since read_chunk_into_buffer
is only ever called from that same place, but logically it is more related to what read_line
is doing.
Either way, I think there's an edge case where it could end up being wrong— but it's an edge case that I think the existing code is already vulnerable to. SSE allows the line break to be \r
, \n
, or \r\n
. So, if we read some data and it ends in \r
, we could consider that to be the end of a line but we do not want a subsequent \n
at the beginning of the next chunk to be treated as another line break; it should be treated as just an optional suffix to the previous \r
. Handling that properly requires keeping track of a little extra state. So I'm going to try to address that in the process of integrating this fix.
(Edge cases like that are a common pitfall in writing an SSE implementation; I'm pretty sure we ran into this one before on another platform, but missed it in Ruby. For that reason, we're looking into implementing a standardized set of acceptance/contract tests to be run against all of our SSE implementations, separate from the higher-level SDK testing.)
from ruby-eventsource.
I'm also not sure that our use of @lock
was actually necessary; if there is a way that read_line
or read_chunk_into_buffer
would ever get called from more than one thread, I'm not seeing it.
from ruby-eventsource.
Yes, and to be clear these are mostly theoretical changes that look decent on paper. I would have to vet out edge cases as well and you know far more than me on how SSEs work. Still intending on giving this a test, but other projects have been demanding attention lately.
I believe the larger change, which has a much larger footprint, would be around sending the entire payload as one SSE versus potential sharding patterns for it, or filtering on the server side before they're even sent down. My fear here is that we won't be the only clients with that large of a payload in the future, so it may be a discussion worth having now.
from ruby-eventsource.
Sending the payload as multiple events would be a much, much larger change, incompatible with all our existing SDKs and affecting multiple layers of the service infrastructure. It's unlikely that that will happen in the foreseeable future. Adding some kind of filtering mechanism is more feasible and it's been requested as a feature before.
from ruby-eventsource.
Agreed that sharding payloads is a major breaking change. Server-side filtering would be far more likely and immediately useful, and if a change like this works it'll buy some more time to address it as an option.
I know we've written filtering at a client-side level for data stores, but ideally the filtering is done on the server side before payloads are sent.
from ruby-eventsource.
We've actually added filtering (at the flag data store interface level) in our Go and Java implementations, so that apps suffering from memory ballooning due to LD can explicitly specify the list of flags (or flag name prefixes) they care about. Everything is still downloaded initially, and streamed, but only the specified flags are kept.
Our ideal client SDK would probably look something like this:
- allow clients to optionally specify a list of flags/prefixes on initial connection
- send only those flags/prefixes in the initial dump
- send only those flags/prefixes in subsequent updates
- if the client requests a flag that is not currently present:
- log it (once)
- fetch it explicitly
- add it to the list of flags so that it is streamed for the rest of the connection's lifetime
- (optionally) add it to the list of flags for all connections to apps with the same name/identity
Segments would presumable be handled the same way as flags, with the possible optimization of including any segments mentioned in a wanted flag.
The big question would be whether keeping the flag list per connection in Relay would be expensive. If it's a problem, some sort of copy-on-write/coalesce-on-collision system could allow all 50 instances of the same app to (eventually) share the same list of flags.
from ruby-eventsource.
We might spin those discussions off into a separate ticket, don't want to overload this too much. Also I added a Ruby implementation for local filtering, so that exists too @zellyn.
from ruby-eventsource.
Filtering would be a general feature request in the entire product, not specific to the Ruby SDK. And it's a request we've received before, so there's no need to file it again; I'll just make a note that there's more interest in it. Let's table that here, so that this GitHub issue on ruby-eventsource
can remain about the ruby-eventsource
problem.
(Although - @zellyn, I do want to address one thing you mentioned above, as something that would probably not be in any future filtering implementation: the idea that if the client needed a flag it had not previously received, it would fetch the flag from LD or from Relay at that time. We've deliberately never implemented anything in the SDKs that would involve making HTTP requests to the service endpoints on demand during a flag evaluation— that is just too much latency and unpredictability. The SDKs do all of their HTTP dialogue in the background, updating their internal state accordingly, so that flag evaluations can simply reference that internal state. The only exception is that in server-side SDKs which have database integrations, some of that state might need to be requeried from a database if it's not currently cached.)
from ruby-eventsource.
@baweaver We're reviewing a PR now that has a number of improvements to stream reading, including the one you suggested. We should be able to release the fix this week.
from ruby-eventsource.
We've deliberately never implemented anything in the SDKs that would involve making HTTP requests to the service endpoints on demand during a flag evaluation— that is just too much latency and unpredictability. The SDKs do all of their HTTP dialogue in the background, updating their internal state accordingly, so that flag evaluations can simply reference that internal state.
Absolutely. Our internal home-built feature flag system works exactly the same way: all evaluations are local, based on asynchronously updated rules that can be evaluated. I don't think one synchronous evaluation for a flag would hurt, although it would also be acceptable to return the default, log a warning (so that clients can add the flag to the list of initially-requested flags), and fetch the value asynchronously.
Filtering would be a general feature request in the entire product, not specific to the Ruby SDK. And it's a request we've received before, so there's no need to file it again; I'll just make a note that there's more interest in it.
Indeed, it is seeming more and more likely that it is going to become a requirement for Square to be able to successfully use LaunchDarkly.
from ruby-eventsource.
@baweaver The changes in that PR have been released and the improved SSE client is now in the 6.2.5 release of the Ruby SDK. Please let us know if you see improvements in performance after updating to this version.
from ruby-eventsource.
Closing this because implementation changes in the last several releases have made it obsolete.
from ruby-eventsource.
Related Issues (6)
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ruby-eventsource.