
dataflowex's People

Contributors

aevitas, hekaiduo, karldodd, svick

dataflowex's Issues

CancellationToken with DataFlow

Hi. With raw TPL you can set a CancellationToken in the ExecutionDataflowBlockOptions of each dataflow block for easy cancellation of an entire network. However, I am not seeing an equivalent feature in your DataFlowEx library. How would you suggest handling/processing cancellation requests using your API? Thanks!
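For reference, the raw TPL Dataflow pattern mentioned in the question can be sketched as follows. This is a minimal illustration that uses only System.Threading.Tasks.Dataflow and says nothing about how DataflowEx itself handles cancellation; the block names (parse, print) are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// One token source shared by every block in the network.
using var cts = new CancellationTokenSource();
var options = new ExecutionDataflowBlockOptions { CancellationToken = cts.Token };

// Hypothetical two-stage network; the block names are illustrative only.
var parse = new TransformBlock<string, int>(s => int.Parse(s), options);
var print = new ActionBlock<int>(n => Console.WriteLine(n), options);
parse.LinkTo(print, new DataflowLinkOptions { PropagateCompletion = true });

parse.Post("1");
cts.Cancel(); // requests cancellation on both blocks at once

try
{
    // Completion tasks of canceled blocks end in the Canceled state.
    await Task.WhenAll(parse.Completion, print.Completion);
}
catch (OperationCanceledException)
{
    Console.WriteLine("network canceled");
}
```

Whether the posted item is printed before the cancellation takes effect depends on timing, so the sketch makes no claim about it.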

Where can I ask questions?

Hi Dodd!

Thanks for the great work on DataflowEx. I have started using it, and so far it works great for the problem I'm solving. I have a couple of questions, but I don't know where to ask them.

Can you please point me to where I can ask my questions?

Thanks once more
Erlis

DataBroadcaster sends data only to the first linked target if it is used as output property

Hi. Recently I was trying to move one of my pipelines from low-level TPL Dataflow to DataflowEx, and I ran into behavior that I did not expect.

Say we have a class A that inherits from Dataflow<TIn, TOut>, and its OutputBlock property returns the output block of a DataBroadcaster field:

internal sealed class A : Dataflow<string, IReadOnlyList<string>>
{
    private readonly DataBroadcaster<string> _inputBroadcaster;

    private readonly DataBroadcaster<IReadOnlyList<string>> _resultBroadcaster;

    public override ITargetBlock<string> InputBlock =>
        _inputBroadcaster.InputBlock;

    public override ISourceBlock<IReadOnlyList<string>> OutputBlock =>
        _resultBroadcaster.OutputBlock;

    public A()
        : base(DataflowOptions.Default)
    {
        // Init code goes here.
    }
}

However, if I link this dataflow with several external targets, it sends data only to the first linked target:

var a = new A();
var b1 = new TransformBlock<IReadOnlyList<string>, IReadOnlyList<string>>(d => d);
var b2 = new TransformBlock<IReadOnlyList<string>, IReadOnlyList<string>>(d => d);

a.LinkTo(b1); // b1 gets data from a.
a.LinkTo(b2); // b2 does not get data from a.

Is this behavior by design or not?

By the way, using LinkToMultiple solves the problem (at the cost of an additional, implicitly created DataBroadcaster):

a.LinkToMultiple(b1, b2); // b1 and b2 get data from a.

Changing DataBroadcaster to BroadcastBlock with .ToDataflow() call also fixes the problem.

If this example is poor, I can attach the full code to reproduce the issue.

Filtering OutputBlock

Hi

Let's say I handle an exception inside a TransformBlock that is assigned as the OutputBlock, and return default(T). I do not want the consumer of the OutputBlock to do the filtering, so is there a way to do it inside my Dataflow?
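One way to picture the filtering asked about here, sketched with raw TPL Dataflow only: link the transform to an internal buffer through a predicate, and send everything else to a null target. Exposing the buffer as the Dataflow's OutputBlock is an assumption about how this would be wired into DataflowEx, and int? stands in for the default(T) marker.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

// The transform swallows failures and returns null as its "default" marker.
var transform = new TransformBlock<string, int?>(s =>
    int.TryParse(s, out var n) ? n : (int?)null);

// Hypothetical internal buffer that would be exposed as the OutputBlock.
var output = new BufferBlock<int?>();

// Link order matters: the filtered link is tried first, and the
// catch-all null target discards whatever the predicate rejected.
transform.LinkTo(output, item => item.HasValue);
transform.LinkTo(DataflowBlock.NullTarget<int?>());

foreach (var s in new[] { "1", "x", "3" })
    transform.Post(s);

transform.Complete();
await transform.Completion; // completes once every item has been handed off

output.TryReceiveAll(out IList<int?> kept);
Console.WriteLine(string.Join(", ", kept));
```

Without the catch-all link, a rejected item would stay in the transform's output queue and block the items behind it, which is why the null target is part of the sketch.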

Unregistering a child from a Dataflow

Is there currently any way to un-register a dataflow block from a dataflow? RegisterChild() is able to be called any time to dynamically add a child to a flow, is the reverse possible?

Elegant Way to Stop Processing a Dataflow

Hello. Based on a condition in one of our subflows or "blocks", we would like to stop processing the current dataflow. However, we want to reuse the dataflow for many posts, so I don't think we want to propagate completion to all the children afterwards. Can you think of an elegant way to do this?

thanks

.Net Core 2.0 and logging?

I have cloned the master repository currently at release v2.0.0, changed just the program.cs file under Gridsum.DataflowEx.Demo so that slowflow demo is enabled (SlowFlowAsync on line 59) and the database demos are not enabled (commented lines 57, 65, and 71). When I build and run the demo, I expect to find the FlowMonitor and PerformanceMonitor log files. They don't seem to be getting created.
Is this a known issue, or a problem with my configuration? I'm running Visual Studio 2017 V15.2.5, Common.Logging V3.4.1, and App.Config in Gridsum.DataflowEx.Demo is specifying the NLog.NLogLoggerFactoryAdapter as Common.Logging.NLog20.
NLog has come out with some much newer versions, including a V4.5-rc1 just a few days ago.
Does the Demo src need some tweaks to get the logging working again?

No way to gracefully cancel a task when it fails.

Sorry, I know this isn't really an issue; it's just that there isn't even a tag for this library on SO.

Is there a way to cancel a task (a work unit) if it fails (throws an exception)?

I have a few flows that do a lot of file IO over a network, and exceptions can happen. So far I have seen no mechanism that would allow me to just log the failure and continue with the other tasks as they come in.

For example, I might have a flow that copies a file, then one that renames it, then one that zips it.

If copying the file fails, I want the copy flow to log the failure, and wait for more work.

Can this be done with DataFlowEx?

Thanks.
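In raw TPL Dataflow, an exception that escapes a block's delegate faults the block, while one caught inside the delegate leaves the block ready for the next item. The sketch below shows the second case with a simulated copy step; the failure condition, the log queue and the block name are hypothetical, and no DataflowEx-specific mechanism is implied.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks.Dataflow;

// Stand-in for a real logger.
var log = new ConcurrentQueue<string>();

var copy = new ActionBlock<string>(path =>
{
    try
    {
        // Simulated failure: an empty path stands in for a network IO error.
        if (path.Length == 0)
            throw new IOException("simulated copy failure");

        log.Enqueue($"copied {path}");
    }
    catch (IOException ex)
    {
        // Log and return normally, so the block keeps accepting work.
        log.Enqueue($"failed: {ex.Message}");
    }
});

copy.Post("a.txt");
copy.Post("");      // fails, but does not fault the block
copy.Post("b.txt"); // still processed

copy.Complete();
await copy.Completion;

foreach (var line in log)
    Console.WriteLine(line);
```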

Build on CI

I don't see anything in your repository which suggests that the NuGet packages are built on a CI server. Building them locally is a security risk for any project that depends on this package, as source files might be changed locally before building, or the build machine might be infected.

Are there any plans to build the NuGet package on a CI server, for instance AppVeyor?
I can help with this by creating a Cake build script and an AppVeyor file, but in the end it is the maintainer who is responsible.

Keep up the good work!!

Best wishes,
Robin

Exposing DataDispatcher child dataflows and their dispatch function?

I would like to use the DataDispatcher to create child TransientBuffers which, once created, would accept messages having a specific signature and buffer them. The buffering happens because, at creation time, the children are not linked to a downstream flow. Later, when an async HttpClient fetch completes, messages having these specific signatures can be processed, so at that point I'd like to link the child TransientBuffers to a _terminator flow. Is it possible (and what syntax would I use) to get an IEnumerable of all child dataflows whose ISourceBlock is NOT linked at all (or not linked to a specific dataflow)? And, having produced that list of child dataflows, to determine what the dispatch function is for each?

Port to Net Standard 2

Hi,

I think these extensions are still pretty useful and I would like to use them in .NET Core apps.

The following references are troublesome:

Gridsum.DataflowEx.Utils ->
Generic types from mscorlib
System.Data.DataColumnCollection (used by Databases)
System.Diagnostics.StackFrame with Common.Logging.LogManager

Gridsum.DataflowEx.Databases ->
System.Data.SqlClient

Gridsum.DataflowEx.AutoCompletion ->
System.Timers (should be easy)

(C5 -> recently updated for .NET Standard, but the NuGet package is out of date)

@karldodd any updates from you on #3?

Dependencies to .NET Framework required

Hello,

Given that the project is supposed to be .NET Standard 2.0 compatible, it shouldn't require dependencies on .NET Framework. These are pulled in by System.Data.SqlClient, which is a direct dependency of DataflowEx.

This is not suitable for three reasons:

  • DataflowEx becomes inherently coupled with SQL Server, so it is no longer database-agnostic
  • What if I don't even need to use databases?
  • It forces consumers to download and depend on a whole bunch of .NET Framework assemblies:
    [image not preserved]

I suggest putting the SQL Server related code in a separate NuGet package, making it optional.

Dataflow waiting for todo items on Exception.

Hi,

I'm trying to make a Dataflow end in an exception state when an exception occurs in the middle of processing, but from what I can see the Dataflow keeps waiting for its todo items to complete.
The log repeats this message in a loop: [Gridsum.DataflowEx.PerfMon].[Debug] [MyFlow1] has 19 todo items (in:0, out:19) at this moment.
How can I stop the whole pipeline when an unhandled exception occurs at some point in my flow?

Edit: the this.Complete() call inside SignalAndWaitForCompletionAsync causes this waiting behavior.
If I replace SignalAndWaitForCompletionAsync with flow.CompletionTask.Wait(), the exception is returned to my application.
In my model, the application needs to handle the exception.

Thanks.

Using JoinBlock or equivalent

Hello, I'm trying to use Broadcast and Join blocks to fork/join, but that doesn't really seem possible with DataflowEx. Is there no way to LinkTo joinBlock.Target1 or Target2?
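For context, this is roughly what the fork/join described above looks like in raw TPL Dataflow, where JoinBlock exposes Target1 and Target2 as ordinary link targets. The sketch does not show how (or whether) the same wiring can be expressed through DataflowEx's own LinkTo; the block names are hypothetical.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

// Fork: the broadcast block offers each message to every linked target.
var broadcast = new BroadcastBlock<int>(x => x);
var doubler = new TransformBlock<int, int>(x => x * 2);
var squarer = new TransformBlock<int, int>(x => x * x);

// Join: pairs one item from each branch into a Tuple<int, int>.
var join = new JoinBlock<int, int>();

broadcast.LinkTo(doubler);
broadcast.LinkTo(squarer);
doubler.LinkTo(join.Target1);
squarer.LinkTo(join.Target2);

broadcast.Post(3);

Tuple<int, int> pair = await join.ReceiveAsync();
Console.WriteLine($"doubled={pair.Item1}, squared={pair.Item2}");
```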

Error with newer version of Common.Logging

I got an error when using a newer version of Common.Logging and Common.Logging.Core:

System.TypeInitializationException : The type initializer for 'Gridsum.DataflowEx.LogHelper' threw an exception.
----> System.TypeLoadException : Could not load type 'Common.Logging.LogManager' from assembly 'Common.Logging.Core, Version=3.1.0.0, Culture=neutral, PublicKeyToken=af08829b84f0328e'.
at Gridsum.DataflowEx.LogHelper.get_Logger()
at Gridsum.DataflowEx.Dataflow.d__18.MoveNext()

I got the same issue using 3.1 and 3.2 of the frameworks.

I was able to get past that issue by reverting Common.Logging.Core to 2.2 and updating my app.config.
It looks like they had similar problems with an earlier version: net-commons/common-logging#23

According to that thread, though, it should have been fixed in 2.2. I'm not sure what the root problem is here, or whether a new version of DataflowEx should be published with updated references to the logging frameworks.

Error occurred in my performance monitor loop

hi

Any ideas on what might cause such an exception?

18/01/05 00:05:44 [HttpClientFactory1] Error occurred in my performance monitor loop. Monitoring stopped. System.FormatException: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.
at System.Text.StringBuilder.AppendFormatHelper(IFormatProvider provider, String format, ParamsArray args)
at System.String.FormatHelper(IFormatProvider provider, String format, ParamsArray args)
at System.String.Format(IFormatProvider provider, String format, Object[] args)
at Common.Logging.Factory.AbstractLogger.StringFormatFormattedMessage.ToString() in C:_oss\common-logging\src\Common.Logging.Portable\Logging\Factory\AbstractLogger.cs:line 164
at System.Text.StringBuilder.AppendFormatHelper(IFormatProvider provider, String format, ParamsArray args)
at System.String.FormatHelper(IFormatProvider provider, String format, ParamsArray args)
at System.String.Format(IFormatProvider provider, String format, Object[] args)
at NLog.LogEventInfo.CalcFormattedMessage()
at NLog.LogEventInfo..ctor(LogLevel level, String loggerName, IFormatProvider formatProvider, String message, Object[] parameters, Exception exception)
at Common.Logging.NLog.NLogLogger.WriteInternal(LogLevel logLevel, Object message, Exception exception) in C:_oss\common-logging\src\Common.Logging.NLog10\Logging\NLog\NLogLogger.cs:line 949
at Common.Logging.NLog.NLogLogger.DebugFormat(String format, Object[] args) in C:_oss\common-logging\src\Common.Logging.NLog10\Logging\NLog\NLogLogger.cs:line 321
at Gridsum.DataflowEx.Dataflow.d__34.MoveNext() in D:\eXpandFramework\Trading\DataflowEx-master\DataflowEx-master\Gridsum.DataflowEx\Dataflow.cs:line 328

Help request for simple project

Hi,
I'm very interested in DataFlowEx and wondered if you could give a little advice on how to solve the following simple problem as a starter project (I hope this is the right place to ask, as I could not find anywhere else).

I have tens of thousands of XML files sitting in a folder. For each file, in parallel (assuming they are read off disk), I want to be able to:

  1. Run each file through a component that reshapes the data and does validation on the data
  2. If any validation rule fails, the component throws an exception with broken validation details. I want to be able to handle that exception and write it to a failures XML file (Failures.xml)
  3. If the component works or fails, I want to add some details about the file to an XML file (Index.xml).
  4. If the component works, I want to write the result to disk AND upload to a server using a REST API

Step 4 is the bit that takes much longer than step 1, so rather than process the files one by one, I'd like to build a pipeline that processes files in parallel, logs, and uploads.

My sticking point is the logging. How do I do it without locking (if that is possible at all)? And do I make the logging part of the flow, or use a separate buffer-type block that handles the logging? In the latter case I'm not sure how to wait until reading, processing and uploading the files and all the logging have completed.

I'm also not sure how to feed steps 1 - 4 without swamping the flow, i.e. I want it to be processing at max capacity all the time without submitting so many files that it kills the performance gains of doing it in parallel.

Also, how do I deal with shared variables e.g. If I want to keep a track of how many worked and how many didn't, how do I do that?

Any outline help gratefully received

Thanks

Mike
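Two of the points raised above, feeding the flow without swamping it and tracking shared counters, can be sketched with raw TPL Dataflow. The sketch assumes a bounded ActionBlock for throttling and Interlocked for lock-free counting; the file names, the validation rule and the capacity numbers are all hypothetical, and DataflowEx may offer its own helpers that are not shown here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

// Shared counters, updated without locks via Interlocked.
int succeeded = 0, failed = 0;

var process = new ActionBlock<string>(file =>
{
    try
    {
        // Hypothetical validation rule standing in for the real component.
        if (file.EndsWith(".bad.xml"))
            throw new FormatException("validation failed");

        Interlocked.Increment(ref succeeded);
    }
    catch (FormatException)
    {
        Interlocked.Increment(ref failed);
    }
},
new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 4, // up to four files processed at once
    BoundedCapacity = 8         // at most eight files queued or in flight
});

string[] files = { "a.xml", "b.bad.xml", "c.xml", "d.xml" };
foreach (var file in files)
{
    // SendAsync waits while the block is full, which throttles the producer.
    await process.SendAsync(file);
}

process.Complete();
await process.Completion; // all processing has finished at this point

Console.WriteLine($"ok={succeeded}, failed={failed}");
```

Awaiting Completion after Complete() is also the usual way to know that everything, including any logging done inside the block, has finished.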
