papeeria-compiler-fleet's People

Contributors

dbarashev, shavkunov

papeeria-compiler-fleet's Issues

Process publisher failures

Should publishing the result fail, we want to take appropriate measures. That may be just logging, incrementing a monitoring counter, re-sending, or sending a failure message. In any case, whoever calls publish (e.g. TaskReceiver) should be able to process the failure.
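
One possible shape for this, as a minimal sketch only: the caller gets back an explicit outcome and can hook in logging or a counter per failed attempt. The `Publisher` interface, `PublishOutcome` and `publishWithRetry` below are hypothetical names for illustration; the project's real Publisher may look quite different.

```kotlin
// Hypothetical interface: the real Publisher in this project may differ.
fun interface Publisher {
    // Returns normally on success, throws on failure.
    fun publish(payload: ByteArray)
}

// Outcome reported back to the caller (e.g. TaskReceiver).
sealed class PublishOutcome {
    object Delivered : PublishOutcome()
    data class Failed(val attempts: Int, val cause: Exception) : PublishOutcome()
}

// Tries to publish up to maxAttempts times and tells the caller what happened.
fun publishWithRetry(
    publisher: Publisher,
    payload: ByteArray,
    maxAttempts: Int = 3,
    onFailure: (attempt: Int, cause: Exception) -> Unit = { _, _ -> }
): PublishOutcome {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var lastError: Exception? = null
    for (attempt in 1..maxAttempts) {
        try {
            publisher.publish(payload)
            return PublishOutcome.Delivered
        } catch (e: Exception) {
            lastError = e
            onFailure(attempt, e) // hook for logging or a monitoring counter
        }
    }
    return PublishOutcome.Failed(maxAttempts, lastError!!)
}
```

The caller can then decide what a `Failed` outcome means for it, for example sending a failure message instead of the result.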

Send out the result of task processing

When a compiler instance completes processing a task, it needs to send out the result.

The result is also a protocol buffer, which is sent to a PubSub topic. It needs to include:

  • task id -- the same one that was received in the task message
  • status code -- an integer, 0 in case of success and some other value in case of errors. The semantics of non-zero codes will be established later
  • the result itself. It should be a byte array, which currently (while we calculate md5) may be just a string. Later, when we start producing PDF, it will be a serialized PDF document, and finally in production it will most likely be a serialized protocol buffer with information about the produced artifacts.
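
To make the three fields concrete, here is a plain-Kotlin mirror of such a message. It is only an illustration: `TaskResult` and its helpers are hypothetical names, and in the fleet itself the class would be generated from a .proto definition rather than written by hand.

```kotlin
// Hypothetical plain-Kotlin mirror of the result message. In the fleet itself
// this would be a class generated from a .proto definition.
class TaskResult(
    val taskId: String,      // same id that arrived in the task message
    val statusCode: Int,     // 0 = success, anything else = error
    val payload: ByteArray   // opaque bytes: a string now, a PDF or a proto later
) {
    val isSuccess: Boolean
        get() = statusCode == 0

    // Convenience for the current stage, where the payload is just text.
    fun payloadAsText(): String = payload.toString(Charsets.UTF_8)

    companion object {
        fun success(taskId: String, text: String) =
            TaskResult(taskId, 0, text.toByteArray(Charsets.UTF_8))

        fun failure(taskId: String, statusCode: Int, message: String): TaskResult {
            require(statusCode != 0) { "0 is reserved for success" }
            return TaskResult(taskId, statusCode, message.toByteArray(Charsets.UTF_8))
        }
    }
}
```

Keeping the payload as raw bytes means the same field can carry an md5 string today and a serialized PDF or nested message later without changing the envelope.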

Really process project by rmarkdown compiler

Googling "docker rmarkdown" produces a handful of results, and it seems there are existing options for compiling RMarkdown documents in Docker. So we need to take one of the existing images and feed it the project extracted from the ZIP. For that purpose we'll need to bind-mount the directory where the project was extracted into the container and, well, run the compiler.
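
A rough sketch of what the bind-mount invocation could look like when assembled from Kotlin. The image name `example/rmarkdown` and the `/project` mount point are placeholders, not choices made in this issue; the sketch also assumes the image ships `Rscript` with the rmarkdown package installed.

```kotlin
import java.io.File

// Builds a `docker run` command line that bind-mounts the extracted project
// and renders the root file. Image name and /project mount point are
// placeholders, not choices made by this project.
fun rmarkdownCommand(
    projectDir: File,
    rootFile: String,                    // path in absolute notation, e.g. "/main.rmd"
    image: String = "example/rmarkdown"  // hypothetical image name
): List<String> {
    require(rootFile.startsWith("/")) { "root file must use absolute notation" }
    val mount = "${projectDir.absolutePath}:/project"
    // Note: a file name containing a single quote would need escaping here.
    val target = "/project$rootFile"
    return listOf(
        "docker", "run", "--rm",
        "-v", mount,                     // bind mount: host dir -> container dir
        "-w", "/project",
        image,
        "Rscript", "-e", "rmarkdown::render('$target')"
    )
}

// Runs the command and returns the container's exit code.
fun runCompiler(command: List<String>): Int =
    ProcessBuilder(command).inheritIO().start().waitFor()
```

Building the argument list separately from running it keeps the command easy to check in a unit test without Docker installed.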

Command line utility for publishing ZIP tasks

Let's convert our publisher into a command line utility which takes a directory name as an argument, creates a ZIP archive of that directory and sends it to the topic.
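
The archiving half of such a utility might be sketched as follows, using only `java.util.zip`. The function name `zipDirectory` is made up for illustration, and publishing the resulting bytes to the topic is left out.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.File
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream

// Packs every regular file under `dir` into an in-memory ZIP archive.
// Entry names are paths relative to `dir`, always with '/' separators.
fun zipDirectory(dir: File): ByteArray {
    require(dir.isDirectory) { "${dir.path} is not a directory" }
    val buffer = ByteArrayOutputStream()
    ZipOutputStream(buffer).use { zip ->
        dir.walkTopDown()
            .filter { it.isFile }
            .sortedBy { it.path }
            .forEach { file ->
                val name = file.relativeTo(dir).invariantSeparatorsPath
                zip.putNextEntry(ZipEntry(name))
                file.inputStream().use { it.copyTo(zip) }
                zip.closeEntry()
            }
    }
    return buffer.toByteArray()
}
```

A `main` function would then pass `File(args[0])` to `zipDirectory` and hand the bytes to the publisher. Holding the archive in memory is fine for small test projects but would not suit large ones.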

Simple compile task with zipped project

Since we are building a compile fleet, we need to pass the source files to fleet workers. In the real system they will contact a special service, ask to download the user project contents and write the files to disk, but for development and testing purposes we need a way to pass a ZIP file with the project contents and unpack it to some directory.

The ZIP file should be passed in a protocol buffer. Besides, we need additional information in the same message:

  • root file name -- the "absolutive" (short for "relative absolute") name of the file which is supposed to be compiled. In general the compilation process may involve many files, but normally it starts from one particular file, which is called the "root file" or "compilation target". This file may be identified by file id or by its path relative to the project root. Although the paths are relative, we use absolute notation. For instance, if the root file is called "main.rmd" and sits directly in the project root, its absolutive path is /main.rmd
  • task id -- a unique string which is built from the project, root file and user id, so that compile requests submitted by different users in the same project have different task ids. We use the task id as the name of the directory where we download/extract the user project.

The subscriber should be configured with the absolute path to the "tasks dir", which is the root of a directory tree holding the files that comprise the compiler input. The tasks dir is referred to hereafter as /mnt/tasks, but it should be configurable via the command line.

Once the subscriber receives such a message, it should create a subdirectory /mnt/tasks/$TASK_ID/files and unzip the contents of the ZIP archive into that directory.
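
The extraction step could look roughly like the sketch below. `extractTask` and `resolveRootFile` are hypothetical helper names; the check against entries that escape the target directory is an extra precaution not requested in the issue, and the sketch assumes the task id itself is a safe directory name.

```kotlin
import java.io.ByteArrayInputStream
import java.io.File
import java.util.zip.ZipInputStream

// Extracts `zipBytes` into <tasksDir>/<taskId>/files and returns that directory.
fun extractTask(tasksDir: File, taskId: String, zipBytes: ByteArray): File {
    val filesDir = File(File(tasksDir, taskId), "files").canonicalFile
    filesDir.mkdirs()
    ZipInputStream(ByteArrayInputStream(zipBytes)).use { zip ->
        while (true) {
            val entry = zip.nextEntry ?: break
            val target = File(filesDir, entry.name).canonicalFile
            // Refuse entries such as "../../etc/passwd" that escape filesDir.
            require(target.toPath().startsWith(filesDir.toPath())) {
                "ZIP entry escapes the task directory: ${entry.name}"
            }
            if (entry.isDirectory) {
                target.mkdirs()
            } else {
                target.parentFile.mkdirs()
                // copyTo stops at the end of the current entry.
                target.outputStream().use { zip.copyTo(it) }
            }
        }
    }
    return filesDir
}

// Maps an absolutive root file name such as "/main.rmd" to a file on disk.
fun resolveRootFile(filesDir: File, rootFileName: String): File {
    require(rootFileName.startsWith("/")) { "root file name must start with '/'" }
    return File(filesDir, rootFileName.removePrefix("/"))
}
```

With the tasks dir set to /mnt/tasks and task id `t1`, the root file /main.rmd would resolve to /mnt/tasks/t1/files/main.rmd.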

Compile Markdown in Docker

Currently Markdown is compiled by a standalone process; we want to do it in Docker.

The result must be in TeX. texbe will be called for further processing.

Adding unit tests

There is a need to add some unit tests to Publisher/Subscriber and DockerProcessor.

Run docker in the subscriber message processor

Let's do something more complex in the subscriber: start a Docker container which would e.g. just print the message contents or calculate their md5 sum. Something equivalent to
docker run --rm busybox echo ${MESSAGE_BODY} | md5sum

You will certainly need Docker and probably this java library to access Docker from the subscriber code.
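
As a rough illustration of the idea, the sketch below shells out to the `docker` CLI through `ProcessBuilder` and hashes the captured output in Kotlin. This is a swapped-in approach, not the Java Docker library mentioned above, and the function names are hypothetical.

```kotlin
import java.security.MessageDigest

// Command equivalent to `docker run --rm busybox echo <message>`.
fun echoInDocker(message: String): List<String> =
    listOf("docker", "run", "--rm", "busybox", "echo", message)

// Starts the command, waits for it and returns everything it wrote to stdout.
fun captureStdout(command: List<String>): ByteArray {
    val process = ProcessBuilder(command)
        .redirectError(ProcessBuilder.Redirect.INHERIT)
        .start()
    val output = process.inputStream.readBytes()
    val exitCode = process.waitFor()
    check(exitCode == 0) { "command exited with code $exitCode" }
    return output
}

// Hex-encoded md5, the same format `md5sum` prints.
fun md5Hex(data: ByteArray): String =
    MessageDigest.getInstance("MD5").digest(data)
        .joinToString("") { "%02x".format(it) }
```

A message processor could then call `md5Hex(captureStdout(echoInDocker(body)))`. Note that `echo` appends a newline, so the digest covers the message body plus that newline, just as in the shell pipeline.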

Converting Markdown to Latex

We want to convert md files on a separate machine and get the result via PubSub.

The goal is to send a CompileRequest to texbe and send its result back to the frontend.

Print something meaningful in the help message

Currently it prints a stack trace instead of a usage message:

build/install/papeeria-compiler-fleet/bin/papeeria-compiler-fleet -h
Exception in thread "main" com.xenomachina.argparser.ShowHelpException: Help was requested
	at com.xenomachina.argparser.ArgParser$1.invoke(ArgParser.kt:608)
	at com.xenomachina.argparser.ArgParser$1.invoke(ArgParser.kt:38)
	at com.xenomachina.argparser.OptionDelegate.parseOption(OptionDelegate.kt:61)
	at com.xenomachina.argparser.ArgParser.parseShortOpts(ArgParser.kt:594)
	at com.xenomachina.argparser.ArgParser.access$parseShortOpts(ArgParser.kt:38)
	at com.xenomachina.argparser.ArgParser$parseOptions$2.invoke(ArgParser.kt:495)
	at com.xenomachina.argparser.ArgParser$parseOptions$2.invoke(ArgParser.kt:38)
	at kotlin.SynchronizedLazyImpl.getValue(LazyJVM.kt:74)
	at com.xenomachina.argparser.ArgParser.getParseOptions(ArgParser.kt)
	at com.xenomachina.argparser.ArgParser.force(ArgParser.kt:448)
	at com.xenomachina.argparser.ArgParser.parseInto(ArgParser.kt:470)
	at com.bardsoftware.backend.fleet.rmarkdown.SubscriberKt.main(Subscriber.kt:68)

See also
bardsoftware/papeeria-edit-history@827648e

Add protocol buffer support

Protocol buffers are a convenient way to transfer data between servers. We want to submit protocol buffers as message payloads to our subscribers. Please add basic support for protocol buffers to the build system. We need a Gradle plugin and a primitive protocol buffer definition (at this moment it can be just any message).

You can borrow some ideas from this pull request in the neighbor project.

We do not have a service in the compiler fleet; we will just have a CompilerFleetTask message.

Fix local build

The build succeeds on Travis but fails locally.
The tests need Google credentials, which are used by Publisher, but there is no reason to create them just for tests. The idea is to parameterize TaskReceiver with a Publisher.
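
A minimal sketch of that parameterization, assuming much simpler shapes for `Publisher` and `TaskReceiver` than the project's real classes have; `RecordingPublisher` and `onTask` are hypothetical names introduced only for this example.

```kotlin
// Hypothetical shapes: the project's real Publisher and TaskReceiver differ.
fun interface Publisher {
    fun publish(payload: ByteArray)
}

// TaskReceiver gets its Publisher from the outside instead of constructing a
// PubSub-backed one itself, so tests never touch Google credentials.
class TaskReceiver(private val publisher: Publisher) {
    fun onTask(taskId: String, body: String) {
        // Stand-in for real processing: echo the body back under the task id.
        publisher.publish("$taskId:$body".toByteArray(Charsets.UTF_8))
    }
}

// In-memory Publisher for unit tests; it only records what was published.
class RecordingPublisher : Publisher {
    val published = mutableListOf<String>()
    override fun publish(payload: ByteArray) {
        published.add(payload.toString(Charsets.UTF_8))
    }
}
```

In production code the receiver would be constructed with the PubSub-backed publisher, while tests pass a `RecordingPublisher` and assert on its `published` list.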

Texbe markdown compiling

The previous step was about converting an .md file into a .tex file.

Now we are going to send a request to texbe to compile the .tex file and publish its response.

Simple PubSub publisher and subscriber

Our fleet nodes will run PubSub subscribers waiting for new tasks coming from PubSub topics. Let's take the Quick Start example and create a simple server which waits for simple messages from some hello-world topic and just prints them to the console.

Google client libraries and docs use Java, but please write the code in Kotlin. It is pretty compatible with Java libs.

Place the code into the com.bardsoftware.backend.fleet.rmarkdown package.

We'll also need a build.gradle file with the java, kotlin and application plugins, which can build, run and create distribution packages.

You can send messages using the gcloud command line tool.

You will need a project id and an application credentials file. The project ID is papeeria-interns. I will send you the JSON file separately. Please keep it secure and do not put it into the repository. It is not very dangerous, because it gives very limited permissions in a very limited project (it is not production Papeeria), but we still don't want it to be public.