
fenzo's Introduction

Fenzo - ARCHIVED

Overview

Fenzo is a Java scheduler library for Apache Mesos frameworks that supports plugins for scheduling optimizations and facilitates cluster autoscaling.

Apache Mesos frameworks match and assign resources to pending tasks. Fenzo provides a plugin-based Java library that facilitates assigning resources to tasks according to a variety of scheduling objectives, such as bin packing, balancing across resource abstractions (such as AWS availability zones or data center racks), resource affinity, and task locality.

Fenzo Features:

  • it is a generic task scheduler that works with any Apache Mesos framework
  • it is well suited both for long running service-style tasks and for batch or interactive use cases
  • it can autoscale the execution hosts cluster based on demand for resources
  • it supports plugins with which you can customize the constraints that regulate resource selection and task placement optimizations
    • you can compose multiple individual plugins together to achieve higher level complex objectives; for example, you can in this way achieve bin packing as well as task affinity
    • the constraints you choose can be soft (best effort) or hard (must satisfy)
  • you can assign CPU, memory, network bandwidth, disk, and port resources from resource offers
  • you can group a heterogeneous mix of execution hosts based on their attributes
    • you can have each group autoscaled independently; for example, you may choose to keep a minimum of five big-memory hosts idle, and a minimum of eight small-memory hosts idle
  • you can set resource allocation limits on a per job group basis
    • you can define limits on the use of resource amounts on a per job group basis
    • the scale up of the cluster stops when these limits are reached, even if tasks are pending
  • you can monitor resource allocation failures in order to assist in debugging why some tasks can't be launched
  • you have the ability to trade off scheduling optimizations with speed of assignments

Beyond a first-fit assignment of resources, Fenzo frameworks can optimize task placement by using the built-in plugins, or they can design and deploy their own custom task placement optimization plugins. Frameworks can balance scheduling speed with optimal task assignment quality based on their needs. The fitness calculator and constraint plugins that are built in to Fenzo include:

  • Bin packing fitness calculator
    • CPU, memory, network bandwidth, or a combination of them
    • packs tasks into as few hosts as possible
  • Host attribute value constraint
    • selects a host for a task only if it has a specific attribute value
  • Unique host attribute constraint
    • ensures co-tasks are placed on hosts that have unique values for a given attribute, or on unique hosts
    • for example: one co-task per AWS EC2 availability zone, or one co-task per unique host
  • Balanced host attribute constraint
    • ensures co-tasks are placed such that an equal number of tasks runs on hosts with each unique value of the attribute
    • for example: balance all co-tasks across AWS EC2 availability zones
  • Exclusive host constraint
    • ensures the host is used solely for the task being assigned, even if additional resources are available on the host

You can specify whether Fenzo applies a constraint in a hard (must satisfy) or soft (satisfy as much as possible) manner.
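
To make the pieces above concrete, here is a minimal sketch of wiring them together: building a scheduler with the built-in CPU/memory bin-packing fitness calculator and running one scheduling iteration. The method names follow the Fenzo Javadocs as I understand them, and the task/lease sources plus the offer-decline logic are placeholders a real framework would supply.

import java.util.List;

import com.netflix.fenzo.SchedulingResult;
import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskScheduler;
import com.netflix.fenzo.VMAssignmentResult;
import com.netflix.fenzo.VirtualMachineLease;
import com.netflix.fenzo.plugins.BinPackingFitnessCalculators;

public class FenzoUsageSketch {

    // Build a scheduler that bin-packs on CPU and memory and expires unused
    // offers after 10 seconds (numbers are illustrative, not recommendations).
    TaskScheduler buildScheduler() {
        return new TaskScheduler.Builder()
                .withFitnessCalculator(BinPackingFitnessCalculators.cpuMemBinPacker)
                .withLeaseOfferExpirySecs(10)
                .withLeaseRejectAction(lease -> {
                    // decline lease.getOffer() through the framework's Mesos driver
                })
                .build();
    }

    // Run one scheduling iteration; pendingTasks and newLeases come from the
    // framework's own task queue and its resourceOffers() callback.
    void scheduleOnce(TaskScheduler scheduler, List<TaskRequest> pendingTasks,
                      List<VirtualMachineLease> newLeases) {
        SchedulingResult result = scheduler.scheduleOnce(pendingTasks, newLeases);
        for (VMAssignmentResult vm : result.getResultMap().values()) {
            // launch vm.getTasksAssigned() on vm.getHostname() through the driver,
            // then record each launched task via scheduler.getTaskAssigner()
        }
    }
}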

Packages

  • fenzo-core The core scheduler library for Apache Mesos frameworks.
  • fenzo-triggers Utility library for setting up triggers based on cron style specification.

Binaries

Binaries and dependency information for Maven, Ivy, Gradle and others can be found at http://search.maven.org.

Javadocs

Programmer's Guide

The Fenzo Programmer's Guide is available as a wiki on the Fenzo GitHub site.

LICENSE

Copyright 2015 Netflix, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

fenzo's People

Contributors

amit-git, apophizzz, corindwyer, davidmgross, fabiokung, jkschneider, karnauskas, nadavc, rpalcolea, rspieldenner, solarkennedy, spodila, sthadeshwar, wyegelwel


fenzo's Issues

Support dynamic fitness threshold

I'd like the "good enough" calculation to vary with the urgency of the task. For example, imagine that a task should be scheduled within 30 seconds. At first, the fitness bar should be held high, then be gradually lowered as the task request gets older; at the 30 second mark, "anything that meets the hard constraints will do".

I suggest that the FitnessGoodEnoughFunction accept the TaskAssignmentResult to convey the TaskRequest along with the fitness measurement.
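
Until something like that exists, one workaround is to hold the threshold in mutable state that is recomputed before every scheduling run. The sketch below assumes the builder's withFitnessGoodEnoughFunction hook, which only receives the fitness score (which is exactly the limitation this issue describes).

import java.util.concurrent.atomic.AtomicReference;

import com.netflix.fenzo.TaskScheduler;

// Sketch of a workaround: the "good enough" threshold lives outside Fenzo in a
// mutable holder and is recomputed before every scheduleOnce() call.
public class DynamicThresholdSketch {
    private final AtomicReference<Double> goodEnough = new AtomicReference<>(1.0);

    TaskScheduler buildScheduler() {
        return new TaskScheduler.Builder()
                .withFitnessGoodEnoughFunction(fitness -> fitness >= goodEnough.get())
                .withLeaseRejectAction(lease -> { /* decline the offer via the driver */ })
                .build();
    }

    // Lower the bar as the oldest pending task ages, reaching 0.0
    // ("anything that meets the hard constraints") at the 30-second mark.
    void beforeSchedulingRun(long oldestPendingAgeMillis) {
        double fraction = Math.min(1.0, oldestPendingAgeMillis / 30_000.0);
        goodEnough.set(1.0 - fraction);
    }
}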

TaskRequest should support wait queue expiration

When a TaskRequest is submitted to a Fenzo scheduler (scheduler.scheduleOnce()), it stays in the scheduler's internal queue until a resource offer comes in that fits its requirements. However, it may be the case that no such offered resource will ever satisfy it, and so we would like for the task to be auto-removed from the internal queue after sitting there for a specified amount of time, instead of bad TaskRequests filling up the queue forever. To support this, the following things would be needed:

  1. the ability to specify the queue wait timeout on the scheduler builder, e.g. SchedulerBuilder.withQueueWaitTimeout(1000), where 1000 is in seconds
  2. the ability to add a callback, in either the Scheduler or the SchedulerBuilder, for TaskRequest removal-from-queue events, i.e. SchedulerBuilder.withRemovedFromQueueCallback(Action1<TaskRequest>) or scheduler.setRemovedFromQueueCallback(Action1<TaskRequest>)
  3. the TaskRequest interface should specify an "int waitTimeout()" method that the scheduler can use to figure out whether the task request has been waiting too long to be assigned
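
In the meantime, a framework can approximate this itself by sweeping its own pending-task bookkeeping before each scheduling run. A rough sketch, entirely outside Fenzo and not the proposed API:

import java.util.Iterator;
import java.util.Map;
import java.util.function.Consumer;

import com.netflix.fenzo.TaskRequest;

public class QueueWaitTimeoutSweep {

    // trackedSince maps each pending TaskRequest to the time (millis) it was
    // queued; call this before handing the remaining tasks to scheduleOnce().
    void expireStaleRequests(Map<TaskRequest, Long> trackedSince,
                             long waitTimeoutMillis,
                             Consumer<TaskRequest> removedCallback) {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<TaskRequest, Long>> it = trackedSince.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<TaskRequest, Long> entry = it.next();
            if (now - entry.getValue() > waitTimeoutMillis) {
                it.remove();
                removedCallback.accept(entry.getKey());
            }
        }
    }
}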

Lease expiration vs. scale-up trigger

Thanks to Fenzo's separation of concerns in abstracting the scheduling logic, I was able to create a simple framework (using Fenzo) to scale tasks (Docker) up and down and also scale the platform (SoftLayer) up and down as needed.

One issue I faced is balancing a shorter offer expiration (to increase platform sharing) against a less aggressive scale-up trigger (so that scale-up isn't triggered merely because an offer just expired).

My solution, besides setting the offer expiration and scale-up delay configuration, was to add another configuration, "wait seconds since last lease expiration before scale up", which ignores a scale-up if the last offer expiration happened within that duration.

Please let me know if there are other alternatives than introducing this new configuration. You can find out details of my framework here: https://github.com/yanglei99/Mesos_Auto_Scale

Thanks.

Unsafe concurrent access to unknown lease collection in AssignableVMs class

The unknownLeaseIdsToExpire collection is modified directly when the TaskScheduler#expireLease method is invoked. The collection is not thread-safe, and most likely the intent was to handle this mutation, like all the others, on the internal scheduling thread.

Here is an example stack trace:
fenzo.TaskScheduler:? - Error with scheduling run: null
java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
    at java.util.ArrayList$Itr.next(ArrayList.java:851)
    at com.netflix.fenzo.AssignableVMs.expireAnyUnknownLeaseIds(AssignableVMs.java:257)
    at com.netflix.fenzo.AssignableVMs.prepareAndGetOrderedVMs(AssignableVMs.java:270)
    at com.netflix.fenzo.TaskScheduler.doSchedule(TaskScheduler.java:754)
    at com.netflix.fenzo.TaskScheduler.doScheduling(TaskScheduler.java:736)
    at com.netflix.fenzo.TaskScheduler.scheduleOnce(TaskScheduler.java:711)
    at com.netflix.fenzo.TaskSchedulingService.scheduleOnce(TaskSchedulingService.java:275)
    at com.netflix.fenzo.TaskSchedulingService.access$700(TaskSchedulingService.java:73)
    at com.netflix.fenzo.TaskSchedulingService$1.run(TaskSchedulingService.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

fitness calculators | OR-ing properties

hi @spodila just testing out fenzo

Looking at BinPackingFitnessCalculators.*, the combined score is effectively

sum(fitness_1(request) + ... + fitness_n(request)) / total_fitness_fns   # i.e. divided by |fitness|

Wondering if it'd be worth making them OR-able objects, so that one can do

builder
    ...
    .withFitnessCalculator(fitness1 | fitness2 | fitness3)
    .build()

Interested in a patch?

I guess two things:

  1. You can use constraints to do what I want - effectively further filtering of the machines offered
  2. Since there is a fairly small set of resources on Mesos (cpu, mem, gpu, bandwidth), it's not really too bad to have 4 or even 10 (in the future) hard-coded combinations; for most workloads you'll need CPU + mem, so the permutations are not that many: with or without network, with or without GPU, etc.

Pluggable ENI fitness evaluator

Feature request

Fenzo supports a concept of preferential named consumable resources, which models a collection
of two-level resources. The top-level resource is tagged with a name during the task placement process,
which defines a kind of runtime profile. Multiple tasks matching the same profile can be
associated with the same consumable resource and be allocated a portion of its sub-resources.

For example, in AWS an ENI and its security group can be modeled as a two-level resource. The ENI
interface models the resource, the sub-resource is the number of IPs that can be associated with an ENI
interface, and the runtime profile is defined by the security group(s) associated with an ENI.
Tasks with identical security groups placed on the same agent may thus share a single ENI interface
until the pool of available IPs (sub-resources) is exhausted. When the last task associated with an ENI
interface is terminated, its runtime profile becomes undefined again.

As calling AWS API is expensive, it makes sense to reduce the amount of network stack configuration related calls by reusing already provisioned resources. This means Fenzo should promote task placement on an agent/ENI slot which already holds required resources. As Fenzo has limited insight into it (unless a task is already associated with an ENI), we need a pluggable API to externalize this evaluation process.

Implementation proposal

To achieve this goal, two new callback interfaces are proposed. PreferentialNamedConsumableResourceEvaluator computes a fitness score for each valid task/ENI assignment. SchedulingEventListener provides notifications from within the scheduling loop, so that newly placed tasks can be accounted for during fitness calculation.

/**
 * Evaluator for {@link PreferentialNamedConsumableResource} selection process. Given an agent with matching
 * ENI slot (either empty or with a matching name), this evaluator computes the fitness score.
 * A custom implementation can provide fitness calculators augmented with additional information not available to
 * Fenzo for making best placement decision.
 *
 * <h1>Example</h1>
 * {@link PreferentialNamedConsumableResource} can be used to model AWS ENI interfaces together with IP and security
 * group assignments. To minimize number of AWS API calls and to improve efficiency, it is beneficial to place a task
 * on an agent which has ENI profile with matching security group profile so the ENI can be reused. Or if a task
 * is terminated, but agent releases its resources lazily, they can be reused by another task with a matching profile.
 */
public interface PreferentialNamedConsumableResourceEvaluator {

    /**
     * Provide fitness score for an idle consumable resource.
     *
     * @param hostname hostname of an agent
     * @param resourceName name to be associated with a resource with the given index
     * @param index a consumable resource index
     * @param subResourcesNeeded an amount of sub-resources required by a scheduled task
     * @param subResourcesLimit a total amount of sub-resources available
     * @return fitness score
     */
    double evaluateIdle(String hostname, String resourceName, int index, double subResourcesNeeded, double subResourcesLimit);

    /**
     * Provide fitness score for a consumable resource that is already associated with some tasks. These tasks and
     * the current one have matching profiles, so they can share the resource.
     *
     * @param hostname hostname of an agent
     * @param resourceName name associated with a resource with the given index
     * @param index a consumable resource index
     * @param subResourcesNeeded an amount of sub-resources required by a scheduled task
     * @param subResourcesUsed an amount of sub-resources already used by other tasks
     * @param subResourcesLimit a total amount of sub-resources available
     * @return fitness score
     */
    double evaluate(String hostname, String resourceName, int index, double subResourcesNeeded, double subResourcesUsed, double subResourcesLimit);
}
/**
 * A callback API providing notification about Fenzo task placement decisions during the scheduling process.
 */
public interface SchedulingEventListener {

    /**
     * Called before a new scheduling iteration is started.
     */
    void onScheduleStart();

    /**
     * Called when a new task placement decision is made (a task gets resources allocated on a server).
     *
     * @param taskAssignmentResult task assignment result
     */
    void onAssignment(TaskAssignmentResult taskAssignmentResult);

    /**
     * Called when the scheduling iteration completes.
     */
    void onScheduleFinish();
}

Hard time limit on holding offers for Fenzo

Currently, Fenzo supports releasing offers at a fixed rate. In order to have a Fenzo-scheduled framework function in a multiframework environment, it's important for every offer to be declined in a timely manner if it isn't used. For example, another framework may have specific constraints or properties it's looking for in an agent--currently, it's possible for Fenzo to hold onto an offer for many minutes (if you're unlucky on a very large cluster). This can negatively impact other frameworks that are looking for very specific hosts.

The solution I'd like to see is the ability to configure a maximum time to hold an offer before declining it; this way, a Fenzo-based framework could choose to hold no offers for longer than 30 seconds; this would greatly benefit multi-framework Mesos clusters.
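
Until such a setting exists, a framework can approximate a hard cap itself. The sketch below assumes the framework records when each offer arrived and that TaskScheduler.expireLease(leaseId) behaves as documented; offers held longer than the cap are expired in Fenzo and declined through the driver.

import java.util.Iterator;
import java.util.Map;

import org.apache.mesos.SchedulerDriver;

import com.netflix.fenzo.TaskScheduler;
import com.netflix.fenzo.VirtualMachineLease;

// Sketch: heldSince maps lease IDs to (lease, receipt time); the framework
// records entries in resourceOffers() and removes them when offers are used.
public class OfferHoldCap {

    void declineOldOffers(TaskScheduler scheduler, SchedulerDriver driver,
                          Map<String, Held> heldSince, long maxHoldMillis) {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<String, Held>> it = heldSince.entrySet().iterator();
        while (it.hasNext()) {
            Held held = it.next().getValue();
            if (now - held.receivedAtMillis > maxHoldMillis) {
                scheduler.expireLease(held.lease.getId());          // Fenzo forgets the lease
                driver.declineOffer(held.lease.getOffer().getId()); // Mesos gets the offer back
                it.remove();
            }
        }
    }

    static class Held {
        final VirtualMachineLease lease;
        final long receivedAtMillis;

        Held(VirtualMachineLease lease, long receivedAtMillis) {
            this.lease = lease;
            this.receivedAtMillis = receivedAtMillis;
        }
    }
}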

Complete wiki documentation

The wiki docs should:

  1. Describe how Fenzo fits in to a Framework=>Mesos=>Cloud Cluster environment, from the point of view of the Mesos community.
  2. Describe how to use the various features of the Fenzo API, from the point of view of Java developers.

Some questions to answer:

  • How do I use Fenzo to control autoscaling, and how can I automate this with framework resource allocations?
  • How do I establish custom slave attributes that I can then use as constraints?
  • How do I get jobs that use similar resources allocated to the same instance so as to reduce resource use?
  • How do I adjust task urgency? How can I create & deploy fitness calculator plugins that do this in an automated way?
  • What are hard and soft constraints, and how do I distinguish them?

A possible outline:

  1. Intro
  2. Getting started
  3. Fenzo design and concepts
  4. How to use Fenzo
  5. Constraint evaluator plugins
  6. Fitness calculator plugins
  7. Autoscaling the cluster
  8. Appendix: JavaDocs

Support v1 Protos (or provide utility to convert)

I started porting our Fenzo-based scheduler over to the Mesos HTTP API using mesos-rxjava.
One issue I ran into is documented in the mesos-rxjava project (mesosphere/mesos-rxjava#74). To summarize, the HTTP API now hands me an org.apache.mesos.v1.Protos.Offer, but Fenzo only accepts org.apache.mesos.Protos.Offer.
I have a few ideas for how to convert to/from these Offers, but they feel risky. I was wondering if (a) Fenzo plans to support v1.Protos in the future? ... and (b) if you had any idea how to cleanly convert from v1.Protos to legacy Protos in the meantime?
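
One interim approach that is often suggested, assuming the v1 and non-v1 Offer messages remain wire-compatible for your Mesos version (worth verifying), is to round-trip through the serialized bytes:

import com.google.protobuf.InvalidProtocolBufferException;

public class OfferConverter {

    // Convert a v1 Offer to the non-v1 Offer type that Fenzo accepts by
    // re-parsing the serialized bytes; this relies on the two messages being
    // wire-compatible, which should be verified for your Mesos version.
    static org.apache.mesos.Protos.Offer toLegacy(org.apache.mesos.v1.Protos.Offer v1Offer)
            throws InvalidProtocolBufferException {
        return org.apache.mesos.Protos.Offer.parseFrom(v1Offer.toByteArray());
    }
}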

Update Mesos dependency

The library is currently linking against Mesos 0.24. Consider updating the Mesos system requirements to a more recent version, e.g. 1.0+. This would help with #83 (which requires a more recent protocol), and with reducing the test scope.

Make AssignableVMs key on slaveId not hostname

Hostname appears to be informational to Mesos: hostname does not appear in TaskStatus or TaskInfo messages. SlaveID does, and in Mesos tasks are assigned by SlaveID, not by hostname. This means that a framework that uses Fenzo must map SlaveIDs to hostnames just so that it can call Fenzo when a task's state changes.

Better yet, make the taskUnassigner take a TaskStatus and then just pull out whatever is required.

Add missing javadocs

Completely describe the Fenzo API in javadoc comments, including classes, interfaces, methods, attributes, and enums.

Use the active voice in order to make the documentation unambiguous -- see http://go/pv

Interaction between frameworks using Fenzo

We have multiple Mesos frameworks in a Mesos cluster with three hosts (agents). Some of the frameworks, developed by ourselves, use Fenzo, and some do not (e.g. Marathon). We have configured leaseOfferExpirySecs to 2 and have found that the frameworks that use Fenzo have been starving the frameworks that do not.
We would like to ask the following questions.

  1. Can we have more than one Fenzo TaskScheduler in a Mesos cluster?
  2. Can we have a Mesos framework that uses Fenzo and a Mesos framework that does not use Fenzo in the same Mesos cluster?
  3. Can we use Fenzo in a small cluster with three hosts (agents)?

Update Javadocs

The Javadocs don't match the latest code. It'd be awesome to update them!

Fenzo does not account for the resources of a custom executor

I am running tasks with a custom dockerized executor that needs 0.5 cpus.
If (let's say) all my tasks need 0.1 cpus, and Fenzo gets a lease for 1.0 cpu, what normally happens is that "scheduleOnce" tries to pair up ten tasks against that lease... so I schedule those ten.

  • Task 1 launches fine, but because it's the first time the executor runs on that agent, there are actually 0.6 CPUs used in the process (0.5 + 0.1)
  • Task 2 launches fine (now we're at 0.7 cpus)
  • Task 3 launches fine (now we're at 0.8 cpus)
  • Task 4 launches fine (now we're at 0.9 cpus)
  • Task 5 launches fine (now we're at 1.0 cpus)
  • Tasks 6-10 fail with TASK_ERROR - effectively saying the task resources are more than what is left in the offer.

I can work around it by checking if my custom executor is not part of the offer and then manually summing resources and not scheduling tasks that would cause a TASK_ERROR, but of course this is somewhat duplicating what I hope Fenzo would do for me, and I am bound to screw it up.

[question] set resources evaluation in Fenzo

Assume two task requests and a VM lease come in:

VMLease1: { fooset: {foo-a, foo-b} }

TaskRequest1: requires foo-a
TaskRequest2: requires foo-a

Both task requests are entered simultaneously:
taskScheduler.scheduleOnce( [TaskRequest1, TaskRequest2], [VMLease1])

For both tasks, I have overridden getHardConstraints() to include a ConstraintEvaluator that checks whether a VMLease has the set resource "foo-a"

If TaskRequest1 is evaluated first and succeeds, how does Fenzo tell TaskRequest2 that "foo-a" is not available anymore when it runs TaskRequest2's constraints evaluator? Or is it the case that each task request's foo-a ConstraintEvaluator sees a different VirtualMachineCurrentState when ConstraintEvaluator.evaluate() is called (so that only one of the task requests' foo-a evaluators will see foo-a)?

What I'm trying to ask is, if VMLease1 satisfies both TaskRequest1 and TaskRequest2, how does Fenzo know not to return from taskScheduler.scheduleOnce() with a success to both TaskRequests since they both ask for the same resource?
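
For reference, evaluate() receives the VirtualMachineCurrentState for the candidate host, and tasks assigned to that host earlier in the same scheduleOnce() run are visible via getTasksCurrentlyAssigned(). A hedged sketch of a "foo-a still free" hard constraint built on that follows; requiresFooA() and fooACapacityOf() are hypothetical stand-ins for framework-specific logic.

import com.netflix.fenzo.ConstraintEvaluator;
import com.netflix.fenzo.TaskAssignmentResult;
import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskTrackerState;
import com.netflix.fenzo.VirtualMachineCurrentState;

// Sketch: a hard constraint that succeeds only while fewer copies of "foo-a"
// have been claimed on this VM (by running tasks plus tasks assigned earlier in
// the same scheduling run) than the lease advertises.
public class FooASetConstraint implements ConstraintEvaluator {

    @Override
    public String getName() {
        return "FooASetConstraint";
    }

    @Override
    public Result evaluate(TaskRequest taskRequest, VirtualMachineCurrentState targetVM,
                           TaskTrackerState taskTrackerState) {
        int claimed = 0;
        for (TaskRequest running : targetVM.getRunningTasks()) {
            if (requiresFooA(running)) claimed++;
        }
        for (TaskAssignmentResult assigned : targetVM.getTasksCurrentlyAssigned()) {
            if (requiresFooA(assigned.getRequest())) claimed++;
        }
        int capacity = fooACapacityOf(targetVM);
        return claimed < capacity
                ? new Result(true, "")
                : new Result(false, "foo-a already taken on " + targetVM.getHostname());
    }

    private boolean requiresFooA(TaskRequest request) {
        return true; // hypothetical: inspect the framework's own TaskRequest subtype
    }

    private int fooACapacityOf(VirtualMachineCurrentState vm) {
        return 1; // hypothetical: derive from the lease's attributes/resources
    }
}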

Increase debug-level logging in Fenzo's scheduler

Currently, I am trying to debug an issue where I provide one task and one lease to schedule, and Fenzo says that it has zero successful or failed assignments. I'm trying to debug this, but since there's no debug logging available, it's tricky to trace what's going on.

Allow for custom shortfall evaluators

Hey there,

Great to see the new OptimizingShortfallEvaluator. Any chance this class/interface hierarchy could be made public so that we can extend it to implement our own shortfall evaluation strategies?

The use case I have is where we are scheduling only short-lived tasks on a dedicated auto-scaling group (or possibly groups in the future). If there are tasks that have a lifetime in the order of seconds, then the current shortfall evaluation ends up grossly overestimating resource needs. Ideally we want to pseudo-schedule some pseudo tasks that represent what we think our resource requirements will be for the next n minutes (where n is probably derived from the auto-scaling cooldown period) based on currently running tasks and pending tasks (and maybe some task history that we record as well).

We might even just start with something fairly naive that doesn't even use pseudo-scheduling, so it would be cool just to be able to implement ShortfallEvaluator ourselves.

Fenzo holds all resource offers of slaves that have tasks assigned

Sorry that I couldn't find a place to ask questions so had to open an issue here.

I found that after calling TaskScheduler.getTaskAssigner().call(...), Fenzo will hold all subsequent resource offers of the corresponding slave forever without even looking at leaseOfferExpirySecs. I'd like to know why Fenzo needs to do that. It makes other frameworks unable to use the remaining resources of that slave. Or is it expected that there should NOT be other frameworks in the cluster? In other words, is it expected that the framework that uses Fenzo should be the only framework in the cluster?

Thanks a lot!

Reject Leases on single hosts to avoid fragmentation

In com.netflix.fenzo.AssignableVMs#removeLimitedLeases there is this comment:
// randomize the list so we don't always reject leases of the same VM before hitting the reject limit
Don't we want to free up large chunks of contiguous space on a single VM, rather than fragmenting space across lots of machines?
See @spodila comment on #50

Incorrect handling of multi-disk offers

Fenzo misunderstands offers that contain numerous disk resources, as can occur when an agent is configured with multiple disks (as described here). Fenzo consequently miscalculates the available disk resources.

The VMLeaseObject constructor iterates over the offered resources to identify the cpus, mem, and disk. The last encountered disk resource wins, becoming the basis for the diskMB quantity. The logic should only consider the 'root' disk resource, i.e. the disk without a source component. This approach makes sense for frameworks that don't explicitly support non-root disks.

How to fully support multi-disk scheduling with Fenzo is considered a separate issue.
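
A sketch of the suggested fix, under the assumption that the 'root' disk is the disk resource carrying no DiskInfo.Source:

import java.util.List;

import org.apache.mesos.Protos;

public class RootDiskCalculator {

    // Compute diskMB for an offer by summing only "root" disk resources,
    // i.e. disk resources that carry no DiskInfo.Source.
    static double rootDiskMB(List<Protos.Resource> resources) {
        double diskMB = 0.0;
        for (Protos.Resource r : resources) {
            boolean isRootDisk = "disk".equals(r.getName())
                    && (!r.hasDisk() || !r.getDisk().hasSource());
            if (isRootDisk && r.getType() == Protos.Value.Type.SCALAR) {
                diskMB += r.getScalar().getValue();
            }
        }
        return diskMB;
    }
}

A custom VirtualMachineLease implementation could use something like this when computing diskMB from offer.getResourcesList().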

Execution failed for task ':fenzo-core:compileJava' > 无效的源发行版: 1.8 (invalid source release: 1.8)

11:30:45.916 [ERROR] [org.gradle.BuildExceptionReporter]
11:30:45.917 [ERROR] [org.gradle.BuildExceptionReporter] FAILURE: Build failed with an exception.
11:30:45.918 [ERROR] [org.gradle.BuildExceptionReporter]
11:30:45.918 [ERROR] [org.gradle.BuildExceptionReporter] * What went wrong:
11:30:45.918 [ERROR] [org.gradle.BuildExceptionReporter] Execution failed for task ':fenzo-core:compileJava'.
11:30:45.918 [ERROR] [org.gradle.BuildExceptionReporter] > 无效的源发行版: 1.8 (invalid source release: 1.8)
11:30:45.919 [ERROR] [org.gradle.BuildExceptionReporter]
11:30:45.919 [ERROR] [org.gradle.BuildExceptionReporter] * Exception is:
11:30:45.920 [ERROR] [org.gradle.BuildExceptionReporter] org.gradle.api.tasks.TaskExecutionException: Execution failed for task ':fenzo-core:compileJava'.
11:30:45.921 [ERROR] [org.gradle.BuildExceptionReporter]     at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:69)
11:30:45.921 [ERROR] [org.gradle.BuildExceptionReporter]     at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.execute(ExecuteActionsTaskExecuter.java:46)
11:30:45.921 [ERROR] [org.gradle.BuildExceptionReporter]     at org.gradle.api.internal.tasks.execution.PostExecutionAnalysisTaskExecuter.execute(PostExecutionAnalysisTaskExecuter.java:35)
11:30:45.921 [ERROR] [org.gradle.BuildExceptionReporter]     at org.gradle.api.internal.tasks.execution.SkipUpToDateTaskExecuter.execute(SkipUpToDateTaskExecuter.java:68)
11:30:45.921 [ERROR] [org.gradle.BuildExceptionReporter]     at org.gradle.api.internal.tasks.execution.ValidatingTaskExecuter.execute(ValidatingTaskExecuter.java:58)
11:30:45.921 [ERROR] [org.gradle.BuildExceptionReporter]     at org.gradle.api.internal.tasks.execution.SkipEmptySourceFilesTaskExecuter.execute(SkipEmptySourceFilesTaskExecuter.java:52)

High availability support

If one builds a framework on top of Fenzo, what are the guidelines for enabling high availability of the framework? Specifically, assuming ZooKeeper is used for leader election between framework instances, how can Fenzo's local state (for example, task queues, running state, etc.) be synchronized to the other instances of a framework built on Fenzo?

Thanks for your help.

[question] Recover the task assignment after system restart

Hi,
I am using the UniqueHostAttrConstraint to ensure tasks are assigned to different hosts.
However, after the system restarts, the constraint no longer works. I believe this is because the assignment history is lost on restart.

So what is the correct way to persist the assignment history and recover it after a system restart?
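
One approach, sketched below rather than offered as an official Fenzo API, is for the framework to persist its own record of running tasks and their hosts, and on startup replay that record into the scheduler through getTaskAssigner() before the first scheduleOnce() call; constraints such as UniqueHostAttrConstraint then see those tasks as already running.

import java.util.Map;

import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskScheduler;

// Sketch: runningTasks maps each previously launched TaskRequest to the
// hostname it was assigned to, reloaded from the framework's own store.
public class AssignmentRecovery {

    void replay(TaskScheduler scheduler, Map<TaskRequest, String> runningTasks) {
        for (Map.Entry<TaskRequest, String> entry : runningTasks.entrySet()) {
            // Tell Fenzo the task is already running on that host so constraint
            // evaluators and fitness calculators account for it.
            scheduler.getTaskAssigner().call(entry.getKey(), entry.getValue());
        }
    }
}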

[Question] Recommendation for long running service-style task

For a framework based on Fenzo, what are the guidelines for scheduling service-style tasks?
I am looking at a use case that schedules a mix of service and batch jobs. The queueable task input to Fenzo makes no distinction between service and batch jobs. This implies that the framework should restart a service job when the job finishes/fails. One way is to push the failed/finished service job back into the pending queue and wait for Fenzo to schedule it. However, this may lead to an interruption of service until the prior pending jobs in the queue get scheduled.

Is there any recommendation for handling restarts of service-style tasks while also minimizing interruption of the service?

Give use cases to understand the features better

After reading the features, I'm struggling to understand the actual use cases where Fenzo comes in. For example, in my case, I have a Spark framework (on Mesos) which runs interactive queries with a measured average response time. Now if I increase the cluster resources by a factor of 2 or 4, for example, and increase the number of Spark frameworks by that same factor, the variance increases greatly. So far, I can see that this stems from the fact that a Spark framework is very greedy; if a resource offer is made, the framework accepts it all, and in the meantime other frameworks stall. The average response time is pretty much the same only because the added resources make up for the longer waiting periods for the frameworks. But the higher variance makes this unfavorable for interactive Spark sessions.

  • Are there mechanisms/features in Fenzo which can help with this above problem? (Basically QoS problems, which Mesos doesn't seem to care about)
  • Is anybody working on Spark/Fenzo integration? If not, does anyone have an idea about caveats/hurdles to implement it?

Resource ranges and "greedy" scheduling

UPFRONT DISCLAIMER:
Nothing is broken - I am only asking if the Fenzo developer community would have any interest in pursuing or accepting a new feature.

Problem

I am working on a development effort where our team is adding new capabilities to a Mesos Framework that uses Fenzo. One of our requirements is "greedy" scheduling. Effectively, we work with a lot of scientific algorithms that can run sequentially or can run in parallel modes. The parallel modes run faster but need more resources, making them harder to schedule sometimes. Currently, these algorithms must be configured with high resource requirements, but sometimes they take forever to schedule. Alternatively, we can configure them with low resource requirements, and they run, but much more slowly.

Design space:

We are looking at introducing a data model on top of TaskRequest that takes "resource ranges" (min, max). We are then looking at two different design approaches:

Approach 1:

  1. Initialize task requests to their maximum requested resources (e.g. 16 cpus)
  2. After some configurable number of scheduling iterations have passed, if a task is still unassigned, back off or reduce the requested resources (e.g. 8 cpus).
  3. Continue until the task is scheduled or you reach minimum resource levels (e.g. 1.0 cpu)

Approach 2:

  1. Generate permutations on the TaskRequest ranges
  2. Let Fenzo generate SchedulingResults for the various permutations of TaskRequest resource ranges
  3. Run the SchedulingResults through some fitness evaluation that takes into account % of resources utilized, number of tasks scheduled, etc.
  4. Use the best SchedulingResult

We recognize that approach 2, although more optimal and faster to schedule, is not readily possible using TaskQueues and TaskSchedulingService. There would need to be a lot of changes to Fenzo to allow concurrent calls to scheduleOnce() on the same sets of resourceOffers and taskRequests in order to generate the permutations... (honestly, we're not even sure whether you'd use one taskQueue per permutation or try to handle it all in one).

Our proposed solution is to choose approach 2 ONLY IF the Fenzo team is interested in helping to support the feature, or if the Fenzo team would be willing to accept the additional complexity of a pull request with resource ranges and permutations.

If the Fenzo team is not interested in this "greedy" scheduling feature or doesn't believe it adds general-purpose value to the community, then our team is going to select approach 1.

Any insight you have on this matter is greatly appreciated! Whether it be a "yes" or "no" on contributing support...... a "yes" or "no" on accepting a pull request.... or even general guidance on something we've missed, whereby Fenzo is already capable of solving this problem elegantly.

Thank you!
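
For what it's worth, approach 1 can be prototyped entirely outside Fenzo. A sketch (the resource levels and iteration counts are illustrative):

import java.util.HashMap;
import java.util.Map;

// Sketch of approach 1: back off a task's requested CPUs after it has gone
// unassigned for a configurable number of scheduling iterations.
public class ResourceBackoff {
    private static final double[] CPU_LADDER = {16.0, 8.0, 4.0, 2.0, 1.0};
    private static final int ITERATIONS_PER_STEP = 10;

    private final Map<String, Integer> unassignedIterations = new HashMap<>();

    // Call once per scheduling iteration for each task that was not assigned.
    void recordUnassigned(String taskId) {
        unassignedIterations.merge(taskId, 1, Integer::sum);
    }

    // CPUs to request for this task in the next iteration; the framework would
    // rebuild its TaskRequest with this value before calling scheduleOnce().
    double cpusToRequest(String taskId) {
        int misses = unassignedIterations.getOrDefault(taskId, 0);
        int step = Math.min(misses / ITERATIONS_PER_STEP, CPU_LADDER.length - 1);
        return CPU_LADDER[step];
    }
}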

How does Fenzo scale?

Apologies if this isn't the place to ask questions - I couldn't find a mailing list.

My understanding is that a TaskScheduler is quite stateful and should be a singleton within a cluster of JVM instances. Is this view incorrect? If so then what are the considerations around scaling?

Thanks!

advise on how to loop/tick fenzo

Generally there are a couple of options:

  1. new Thread().run()
  2. with every Mesos offer, enqueue and dequeue
  3. with timers - i.e. actors, like the Flink scheduler, or a hierarchical timer wheel for frequent timers

When developing a large framework - i.e. relatively involved, where Mesos is a small part - based on what you guys built at Netflix, is there anything that works particularly well?

The downside of (1) is using Thread.sleep(), which is blocking.

The downside of (2) is that, given the async nature of fulfilling offers, it will probably not work for tasks with short expirations.

The downside of (3) is that you commit to their threading model - i.e. actors, or callbacks for the timer wheel approach - which seems OK.

Also, this is more of a mailing-list discussion, but I couldn't find one, so I thought of issues as a way to record some Fenzo wisdom. :) Maybe it can turn into a docs PR later!
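
One simple pattern, sketched below and not an official recommendation, is to buffer offers from the driver callbacks into a queue and drive scheduleOnce() from a single-threaded ScheduledExecutorService, which keeps all Fenzo calls on one thread without a raw Thread.sleep() loop:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import com.netflix.fenzo.SchedulingResult;
import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskScheduler;
import com.netflix.fenzo.VirtualMachineLease;

// Sketch: offers arriving on the Mesos driver callback thread are queued and
// drained on a single scheduling thread that ticks at a fixed interval.
public class SchedulingLoop {
    private final BlockingQueue<VirtualMachineLease> leaseQueue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    // Called from resourceOffers(): wrap each offer in a lease and enqueue it.
    void onLease(VirtualMachineLease lease) {
        leaseQueue.offer(lease);
    }

    void start(TaskScheduler scheduler, List<TaskRequest> pendingTasks) {
        executor.scheduleWithFixedDelay(() -> {
            List<VirtualMachineLease> newLeases = new ArrayList<>();
            leaseQueue.drainTo(newLeases);
            SchedulingResult result = scheduler.scheduleOnce(pendingTasks, newLeases);
            // act on result: launch assigned tasks, record them via
            // scheduler.getTaskAssigner(), and remove them from pendingTasks
        }, 0, 1, TimeUnit.SECONDS);
    }
}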

NamedResourceSetRequest

Didn't see any docs around this:

// from TaskRequest.java (constructor of the nested class NamedResourceSetRequest)
public NamedResourceSetRequest(String resName, String resValue, int numSets,
                               int numSubResources) {
    this.resName = resName;
    this.resValue = resValue;
    this.numSets = numSets;
    this.numSubResources = numSubResources;
}
....

Is it for the Mesos agents' --attributes?

BinPacking with weights

From a quick glance at this line

return (cpuFitness + memFitness + networkFitness)/3.0;

every type of resource is considered equal. For example, (0.9 + 0.1 + 0.1)/3 ≈ 0.36, with the fitness being 0.64. For me such a machine with high CPU utilization may be useless, even if I only need one more CPU for subsequent tasks, and this leads to wasted resources. This can be true especially for machines with many GBs of RAM but few CPUs. Instead of having that formula, I think we should try to normalize the different types of items in the bins or use some kind of weights. Another question: bin packing optimizes the number of bins, but what about load balancing? I might be missing something here.
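
A weighted combination can already be expressed as a custom plugin. The sketch below implements VMTaskFitnessCalculator and delegates to the built-in CPU and memory bin packers with unequal weights; the 0.7/0.3 split is illustrative, the field names are as I recall them from BinPackingFitnessCalculators, and whether this answers the load-balancing question is a separate matter.

import com.netflix.fenzo.TaskRequest;
import com.netflix.fenzo.TaskTrackerState;
import com.netflix.fenzo.VMTaskFitnessCalculator;
import com.netflix.fenzo.VirtualMachineCurrentState;
import com.netflix.fenzo.plugins.BinPackingFitnessCalculators;

// Sketch: weight CPU packing more heavily than memory packing. Weights must
// sum to 1.0 so the combined score stays in [0, 1].
public class WeightedCpuMemBinPacker implements VMTaskFitnessCalculator {
    private static final double CPU_WEIGHT = 0.7;
    private static final double MEM_WEIGHT = 0.3;

    @Override
    public String getName() {
        return "WeightedCpuMemBinPacker";
    }

    @Override
    public double calculateFitness(TaskRequest taskRequest, VirtualMachineCurrentState targetVM,
                                   TaskTrackerState taskTrackerState) {
        double cpuFitness = BinPackingFitnessCalculators.cpuBinPacker
                .calculateFitness(taskRequest, targetVM, taskTrackerState);
        double memFitness = BinPackingFitnessCalculators.memoryBinPacker
                .calculateFitness(taskRequest, targetVM, taskTrackerState);
        return CPU_WEIGHT * cpuFitness + MEM_WEIGHT * memFitness;
    }
}

Such a calculator plugs into TaskScheduler.Builder.withFitnessCalculator() the same way the built-in ones do.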

Incorrect handling of reserved resources

Problem Description
Fenzo misinterprets offers containing a mix of reserved and unreserved resources, causing it to fail to consider all offered resources. For example, given an offer of 2 reserved CPUs and 3 unreserved CPUs, Fenzo behaves as though the offer contains 2 (or 3) CPUs, not 5 CPUs as it should.

This situation arises when the operator (or another framework in the same role) reserves a subset of a host for the framework's role. This is an increasingly common phenomenon due to:

  1. the dynamic reservation feature, which makes it easy for an operator to make fine-grained reservations.
  2. the growing popularity of the dcos-commons library, which makes extensive use of dynamic reservations. A framework based on that library may use the same role as a Fenzo-based framework, leading to unintended side-effects.

Here's an example depicting the resources within such an offer (2 cpus for myrole, 3 unreserved):

cpus(myrole):2.0; mem(myrole):4096.0; ports(myrole):[1025-2180];
disk(*):28829.0; cpus(*):3.0; mem(*):10766.0; ports(*):[2182-3887,8082-8180,8182-32000]

Problem Location
The root cause is within com.netflix.fenzo.plugins.VMLeaseObject. The VMLeaseObject assumes that a given resource name (e.g. cpus) will appear at most once in the offer.

Suggested fix
VMLeaseObject should aggregate all resources with the same name (subject to a set of roles to filter on).

A suggested workaround is for the framework to use an alternate implementation of com.netflix.fenzo.VirtualMachineLease. See example here.
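
A sketch of the suggested aggregation, summing a named scalar resource across all entries whose role passes a filter; the accepted-roles set is an assumption about how a framework would configure this:

import java.util.Set;

import org.apache.mesos.Protos;

public class OfferResourceAggregator {

    // Sum every scalar resource with the given name whose role is either
    // unreserved ("*") or in the set of roles the framework may consume.
    static double scalarSum(Protos.Offer offer, String name, Set<String> acceptedRoles) {
        double total = 0.0;
        for (Protos.Resource r : offer.getResourcesList()) {
            if (!name.equals(r.getName()) || r.getType() != Protos.Value.Type.SCALAR) {
                continue;
            }
            String role = r.getRole();
            if ("*".equals(role) || acceptedRoles.contains(role)) {
                total += r.getScalar().getValue();
            }
        }
        return total;
    }
}

For the offer shown above, summing "cpus" with a role filter containing "myrole" would yield 5.0.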

Auto scaling

I read about Fenzo on the Netflix blog. The auto scaling concept sounds interesting; I've yet to try my hand at Fenzo. Can you please explain the context of auto scaling here? Is it that Fenzo will shut down or bring up VMs based on demand?

Thanks in advance,
Mani
