Comments (2)

qxzzxq avatar qxzzxq commented on June 17, 2024

Hey, sorry, I didn't see this issue. Did you manage to solve it in the end?

JorisTruong avatar JorisTruong commented on June 17, 2024

Hi @LiuxyEric, I could not reproduce your issue. Did you add your second Stage before adding your first Stage? Here's what I tried.

Declaring three Factories for the first Stage:

class Factory1 extends Factory[DataFrame] with HasSparkSession {
  import spark.implicits._

  override def read(): Factory1.this.type = this

  override def process(): Factory1.this.type = this

  override def write(): Factory1.this.type = this

  override def get(): DataFrame = List("1").toDF("id")
}
class Factory2 extends Factory[DataFrame] with HasSparkSession {
  import spark.implicits._

  override def read(): Factory2.this.type = this

  override def process(): Factory2.this.type = this

  override def write(): Factory2.this.type = this

  override def get(): DataFrame = List("2").toDF("id")
}
class Factory3 extends Factory[DataFrame] with HasSparkSession {
  import spark.implicits._

  override def read(): Factory3.this.type = this

  override def process(): Factory3.this.type = this

  override def write(): Factory3.this.type = this

  override def get(): DataFrame = List("3").toDF("id")
}

Declaring a Factory that ingests the results of the three previous Factories for the second Stage:

class FinalFactory extends Factory[Unit] with HasSparkSession {

  @Delivery(producer = classOf[Factory1])
  val resultOne: DataFrame = spark.emptyDataFrame
  @Delivery(producer = classOf[Factory2])
  val resultTwo: DataFrame = spark.emptyDataFrame
  @Delivery(producer = classOf[Factory3])
  val resultThree: DataFrame = spark.emptyDataFrame

  override def read(): FinalFactory.this.type = {
    resultOne.show(false)
    resultTwo.show(false)
    resultThree.show(false)

    this
  }

  override def process(): FinalFactory.this.type = this

  override def write(): FinalFactory.this.type = this

  override def get(): Unit = {}
}

My main function:

val setl: Setl = Setl.builder()
    .withDefaultConfigLoader()
    .getOrCreate()

val stageOne = new Stage()
stageOne.addFactory[Factory1]()
stageOne.addFactory[Factory2]()
stageOne.addFactory[Factory3]()

setl
  .newPipeline()
  .addStage(stageOne) // note that stageOne is added before FinalFactory
  .addStage[FinalFactory]()
  .run()

Output:

+---+
|id |
+---+
|1  |
+---+

+---+
|id |
+---+
|2  |
+---+

+---+
|id |
+---+
|3  |
+---+

You can see that the second Stage correctly ingested the results of the first Stage's Factories.

For anyone else who runs into the same issue and is looking for an answer: the "requirement failed" error is raised when the pipeline expects some deliverables but cannot find them. This check was added in v0.4.3. In the current v1.0.0-SNAPSHOT, we added more explicit exception messages that detail the missing delivery, which should help in fixing the error.
So if you are using SETL v0.4.3 or later, make sure that every Delivery your Factories expect is actually available in your Pipeline, or an exception will be thrown.
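
To make the idea of a "missing delivery" more concrete, here is a minimal sketch in plain Scala of the kind of check described above. It is not SETL's actual implementation and does not use the SETL API: FactoryNode, StageNode, DeliveryCheck and Example are hypothetical names invented for this illustration, and it assumes that a Factory can only consume deliveries produced by an earlier Stage.

```scala
// Simplified model of a pipeline: NOT SETL code, all names are hypothetical.
// A factory is identified by its name and lists the producers it depends on.
final case class FactoryNode(name: String, requires: Set[String] = Set.empty)
final case class StageNode(factories: List[FactoryNode])

object DeliveryCheck {
  /** Walks the stages in order and returns every (consumer, missing producer)
    * pair, i.e. every required delivery that no earlier stage has produced. */
  def missingDeliveries(stages: List[StageNode]): List[(String, String)] = {
    val start = (Set.empty[String], List.empty[(String, String)])
    val (_, missing) = stages.foldLeft(start) {
      case ((available, acc), stage) =>
        val unmet = for {
          factory  <- stage.factories
          producer <- factory.requires.toList.sorted
          if !available.contains(producer)
        } yield (factory.name, producer)
        // Outputs of this stage become available to the stages that follow.
        (available ++ stage.factories.map(_.name), acc ++ unmet)
    }
    missing
  }
}

object Example {
  // Mirrors the two stages used in this thread.
  val stageOne: StageNode = StageNode(
    List(FactoryNode("Factory1"), FactoryNode("Factory2"), FactoryNode("Factory3"))
  )
  val stageTwo: StageNode = StageNode(
    List(FactoryNode("FinalFactory", Set("Factory1", "Factory2", "Factory3")))
  )

  def main(args: Array[String]): Unit = {
    // Correct order: nothing is missing.
    println(DeliveryCheck.missingDeliveries(List(stageOne, stageTwo)))
    // Reversed order: FinalFactory cannot find any of its three inputs.
    println(DeliveryCheck.missingDeliveries(List(stageTwo, stageOne)))
  }
}
```

In this model, adding the consuming stage before the producing stage is reported as three missing deliveries, which is why the order of the addStage calls is worth checking first when the error shows up.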
