pineapplemachine / higher
Lazy and async higher-order functions in JavaScript.
License: Other
Documenting this here so it isn't forgotten. According to https://github.com/pineapplemachine/higher/pull/66/files#r127577462, the Stable Sorting term needs to be added to the glossary.
Companions to the existing dropHead, dropTail, and dropSlice.
Refer to flatten.js for a working pattern
The function should return the opposite of homogeneous for all inputs.
i.e. use the same error pattern as introduced in #25
Repro steps:
> a = {b:3}
Object {b: 3}
> a.a = a
Object {b: 3, a: Object}
> hi.isEqual(a, a)
isEqual.js:43 Uncaught RangeError: Maximum call stack size exceeded
at isEqual (isEqual.js:43)
at objectsEqual (isEqual.js:218)
at isEqual (isEqual.js:60)
at objectsEqual (isEqual.js:218)
at isEqual (isEqual.js:60)
at objectsEqual (isEqual.js:218)
at isEqual (isEqual.js:60)
at objectsEqual (isEqual.js:218)
at isEqual (isEqual.js:60)
at objectsEqual (isEqual.js:218)
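One common way to make a recursive comparison terminate on self-referencing objects is to remember which pairs are already being compared. The following is only a minimal sketch of that idea for plain objects; `deepEqualCyclic` is a hypothetical helper and is not how higher's isEqual is implemented.

```javascript
// Hypothetical helper, not part of higher: deep equality for plain objects
// that tolerates circular references.
function deepEqualCyclic(a, b, seen = new Map()) {
    if (a === b) return true;
    if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
        return false;
    }
    // If this pair is already being compared further up the stack,
    // assume equality here and let the outer comparison decide.
    let partners = seen.get(a);
    if (partners && partners.has(b)) return true;
    if (!partners) {
        partners = new Set();
        seen.set(a, partners);
    }
    partners.add(b);
    const keysA = Object.keys(a);
    const keysB = Object.keys(b);
    if (keysA.length !== keysB.length) return false;
    return keysA.every(
        (key) => Object.prototype.hasOwnProperty.call(b, key) &&
            deepEqualCyclic(a[key], b[key], seen)
    );
}

// The object from the repro steps, plus a structurally identical twin.
const a = {b: 3};
a.a = a;
const c = {b: 3};
c.a = c;
const sameObject = deepEqualCyclic(a, a);
const sameShape = deepEqualCyclic(a, c);
```

Both comparisons return instead of overflowing the stack, because the second visit to the pair (a, c) is short-circuited.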
The reject function in reject.js is an opposite to filter. Where filter enumerates elements satisfying a predicate, reject enumerates those elements that do not. reject should be documented and tested; docs and tests should be simple negations of the descriptions and tests given already in filter.js.
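To illustrate the relationship, here is a small sketch using plain generators. These `filter` and `reject` functions are stand-ins written for this example and do not reflect the signatures used in filter.js or reject.js.

```javascript
// Hypothetical stand-ins for lazy filter and reject, written as generators.
function* filter(predicate, source) {
    for (const element of source) {
        if (predicate(element)) yield element;
    }
}

// reject enumerates exactly the elements that filter would skip.
function* reject(predicate, source) {
    yield* filter((element) => !predicate(element), source);
}

const isEven = (n) => n % 2 === 0;
const kept = Array.from(filter(isEven, [1, 2, 3, 4, 5]));    // [2, 4]
const dropped = Array.from(reject(isEven, [1, 2, 3, 4, 5])); // [1, 3, 5]
```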
Related to and depends on #96
A consumer shall be a finite state machine. Various functions may call a push method which accepts a single arbitrary value as input, typically an element from a sequence. The internal state of any consumer is implementation-dependent; a consumer is defined only by its interface. Consumers will likely have at least these methods:
push(value) to consume a new value
done() to report when the consumer has reached any termination state, if ever
value() to represent the termination state: truthy values indicate success, falsey values failure (or pending)
copy() to create a copy of this consumer
reset() to reset the state of this consumer
Consumers are comparable to regular expressions in purpose and concept. In fact, most regular expressions could also be implemented as consumers.
Once implemented, consumers should be used, as separate tasks, by:
findFirst, findLast, and findAll to find substrings
replace to find substrings
split to find delimiters
match (name? there is currently another function called "match") to determine if a sequence matches a consumer, i.e. pushing every element to a consumer produces a success state
As part of this task: There shall be default consumer types for comparing to a substring sequence with an optionally specified comparison function, and for matching a series of elements satisfying a predicate function. Eventually, as a separate task, there should also be a default consumer type for acquiring a consumer object from a regular expression.
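A rough sketch of what the interface described above could look like for the "compare to a substring sequence" case. The class name `SubsequenceConsumer` and the helper `matches` are made up for illustration; only the method names push, done, value, copy, and reset come from the proposal.

```javascript
// Hypothetical consumer that succeeds once the pushed values spell out `pattern`.
class SubsequenceConsumer {
    constructor(pattern, compare = (a, b) => a === b) {
        this.pattern = Array.from(pattern);
        this.compare = compare;
        this.reset();
    }
    // Consume one value, advancing or failing the state machine.
    push(value) {
        if (this.done()) return;
        if (this.compare(value, this.pattern[this.index])) {
            this.index++;
        } else {
            this.failed = true;
        }
    }
    // True once a termination state (success or failure) has been reached.
    done() {
        return this.failed || this.index >= this.pattern.length;
    }
    // Truthy on success; falsey on failure or while still pending.
    value() {
        return !this.failed && this.index >= this.pattern.length;
    }
    copy() {
        const other = new SubsequenceConsumer(this.pattern, this.compare);
        other.index = this.index;
        other.failed = this.failed;
        return other;
    }
    reset() {
        this.index = 0;
        this.failed = false;
    }
}

// Sketch of the proposed "match": push every element, then check for success.
function matches(consumer, source) {
    const fresh = consumer.copy();
    fresh.reset();
    for (const element of source) {
        if (fresh.done()) return false; // extra elements after termination
        fresh.push(element);
    }
    return fresh.value();
}
```

Because the consumer is defined only by its interface, a predicate-based consumer could be swapped in without changing `matches`.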
The mapIndex function currently returns a CounterSequence.SingularMapSequence, but it could return a more performant sequence type specialized for this use case rather than requiring a two-part sequence chain.
The priority of this task is trivial until such time as users are actually using this functionality for performance-sensitive tasks.
I'm doing some testing over here and noticed that webpack is throwing errors in asSequence.js on this line:
export const ArraySequence = Sequence.extend({
I dug around and this is because Sequence and asSequence both depend on each other. Commenting out import {ArraySequence} from "./asSequence"; in sequence.js clears the error, but surely causes other problems.
Higher should have a sequence for enumerating the ordered pairs produced by the cartesian product of any number of input sequences.
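As a point of reference, a lazy cartesian product can be sketched with a recursive generator. `cartesianProduct` is a hypothetical name, and this version assumes the inputs are arrays (or otherwise re-traversable), which a real sequence type would need to handle more carefully.

```javascript
// Hypothetical generator: lazily yields each tuple of the cartesian product.
// Inputs are assumed to be re-traversable (e.g. arrays).
function* cartesianProduct(...sources) {
    if (sources.length === 0) {
        yield [];
        return;
    }
    const [first, ...rest] = sources;
    for (const element of first) {
        for (const tail of cartesianProduct(...rest)) {
            yield [element, ...tail];
        }
    }
}

const pairs = Array.from(cartesianProduct([1, 2], ["a", "b"]));
// [[1, "a"], [1, "b"], [2, "a"], [2, "b"]]
```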
Currently lines such as
tests: process.env.NODE_ENV !== "development" ? undefined : { ... }
are present all over the code base. If at all possible, a more concise and approachable pattern should be used. tests: isProd ? undefined : { ... } would be an example of such a better pattern.
As part of this task: Once a better pattern is found, all places in the codebase that currently compare against process.env.NODE_ENV should be replaced with that newer pattern.
It is a requirement of the pattern that minified production builds (currently created using nwb build) must totally exclude development-only branches and must totally exclude docs and tests objects from things like wrapped functions where the old pattern is currently used.
Related to #53
Higher should have a sequence for enumerating all permutations of a bounded input sequence.
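One straightforward way to enumerate permutations lazily is a recursive generator that fixes each element in turn. This is only a sketch under that approach; `permutations` is a hypothetical name, and a real implementation might prefer an iterative algorithm such as Heap's.

```javascript
// Hypothetical generator: yields every permutation of a bounded input.
function* permutations(source) {
    const items = Array.from(source);
    if (items.length <= 1) {
        yield items;
        return;
    }
    for (let i = 0; i < items.length; i++) {
        // Fix items[i] as the first element and permute the rest.
        const rest = [...items.slice(0, i), ...items.slice(i + 1)];
        for (const tail of permutations(rest)) {
            yield [items[i], ...tail];
        }
    }
}

const allOrders = Array.from(permutations([1, 2, 3]));
// [[1,2,3], [1,3,2], [2,1,3], [2,3,1], [3,1,2], [3,2,1]]
```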
Currently, some sequences support a slice property and some do not. Users should always be able to slice a sequence even if it implies traversing the sequence to acquire the slice. It must still be possible, however, for internal higher functions to determine whether slicing without traversal is supported for a sequence.
One solution to this may be to rename the slice method attached to sequences and create a new slice function that invokes a sequence's slicing method if it has one, and otherwise generates a sequence functionally equivalent to source.dropHead(low).head(high - low).
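The proposed fallback can be sketched with generators. The method name `nativeSlice` is an invented placeholder for whatever the renamed slicing method ends up being called, and `dropHead` and `head` here are simplified stand-ins for the real functions.

```javascript
// Simplified stand-ins for dropHead and head, written as generators.
function* dropHead(count, source) {
    let index = 0;
    for (const element of source) {
        if (index++ >= count) yield element;
    }
}
function* head(count, source) {
    if (count <= 0) return;
    let taken = 0;
    for (const element of source) {
        yield element;
        if (++taken >= count) return;
    }
}

// Hypothetical free function: use the sequence's own slicing method when it
// has one ("nativeSlice" is a placeholder name), otherwise traverse.
function slice(low, high, source) {
    if (typeof source.nativeSlice === "function") {
        return source.nativeSlice(low, high);
    }
    return head(high - low, dropHead(low, source));
}

const viaTraversal = Array.from(slice(2, 4, new Set([1, 2, 3, 4, 5]))); // [3, 4]
```

Internal code can still check for the presence of the renamed method to learn whether slicing without traversal is supported.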
There should be stripLeft, stripRight, and strip(Both) functions that are equivalent to e.g. sequence.from(!pred), sequence.until(!pred); the default predicate should strip falsey values like null, undefined, 0, and also whitespace.
edit 8/11/17: Name them trimLeft, trimRight, and trim instead.
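A small eager sketch of the intended behavior, working on arrays rather than lazy sequences. The `isBlank` default predicate is an assumption about what "falsey values and also whitespace" would mean in practice.

```javascript
// Hypothetical default predicate: falsey values and whitespace-only strings.
const isBlank = (value) => !value || (typeof value === "string" && value.trim() === "");

function trimLeft(source, predicate = isBlank) {
    const items = Array.from(source);
    let start = 0;
    while (start < items.length && predicate(items[start])) start++;
    return items.slice(start);
}
function trimRight(source, predicate = isBlank) {
    const items = Array.from(source);
    let end = items.length;
    while (end > 0 && predicate(items[end - 1])) end--;
    return items.slice(0, end);
}
function trim(source, predicate = isBlank) {
    return trimRight(trimLeft(source, predicate), predicate);
}

const trimmed = trim([null, 0, " ", "a", 0, "b", undefined, "\t"]);
// ["a", 0, "b"]: interior falsey values are kept
```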
Validators should not themselves throw errors; they should return an object containing the success state and the validated object. They should not be raw functions but should carry metadata that informs the arguments validator's error messages.
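As a sketch of that shape: the property names `name`, `success`, and `value`, and the `validateArgument` helper, are all assumptions made for this example rather than an agreed design.

```javascript
// Hypothetical validator shape: metadata plus a validate method that never throws.
const positiveInteger = {
    name: "positive integer",
    validate(value) {
        const success = Number.isInteger(value) && value > 0;
        return {success, value};
    },
};

// The caller (standing in for the arguments validator) owns the error message
// and builds it from the validator's metadata.
function validateArgument(validator, value, argumentName) {
    const result = validator.validate(value);
    if (!result.success) {
        throw new TypeError(`Expected ${argumentName} to be a ${validator.name}.`);
    }
    return result.value;
}
```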
Currently there is no clean or established way to write tests specifically for a sequence type's overloaded methods.
One possible way to do this would be to add a tests attribute to sequences for miscellaneous tests, such as those applicable to specific methods, overloaded or not. In this case tests currently placed in docs.methods[methodName].tests should be moved into this new tests object.
The functions in this module do already have some tests, but they are not adequately exhaustive for such a complicated and commonly needed function. Additional test cases, especially for isEqual, and especially for odd inputs that I wouldn't have anticipated when I wrote the implementation, would be very valuable.
Ideally this task won't uncover any undesirable behavior but, if it does, please let me know so we can discuss the best way to fix it.
(Desirable behavior is defined as the function not failing with errors, and always returning the least astonishing answer.)
Rather than expressing priority as a number, each converter should be able to express which other converters it must precede and which it must follow.
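Resolving "must precede" and "must follow" constraints into a single order is a topological sort. The sketch below uses Kahn's algorithm; the `name`, `before`, and `after` fields are hypothetical and not part of any existing converter definition.

```javascript
// Hypothetical converter records: each names the converters it must precede
// (before) or follow (after). Returns the converters in a valid order.
function orderConverters(converters) {
    const byName = new Map(converters.map((c) => [c.name, c]));
    const successors = new Map(converters.map((c) => [c.name, new Set()]));
    const addEdge = (from, to) => {
        if (byName.has(from) && byName.has(to)) successors.get(from).add(to);
    };
    for (const c of converters) {
        for (const other of c.before || []) addEdge(c.name, other);
        for (const other of c.after || []) addEdge(other, c.name);
    }
    // Count unmet predecessors for every converter.
    const pending = new Map(converters.map((c) => [c.name, 0]));
    for (const targets of successors.values()) {
        for (const target of targets) pending.set(target, pending.get(target) + 1);
    }
    // Kahn's algorithm: repeatedly emit a converter with no unmet predecessors.
    const ready = converters.filter((c) => pending.get(c.name) === 0).map((c) => c.name);
    const ordered = [];
    while (ready.length > 0) {
        const name = ready.shift();
        ordered.push(byName.get(name));
        for (const target of successors.get(name)) {
            pending.set(target, pending.get(target) - 1);
            if (pending.get(target) === 0) ready.push(target);
        }
    }
    if (ordered.length !== converters.length) {
        throw new Error("Converter ordering constraints contain a cycle.");
    }
    return ordered;
}
```

Unlike numeric priorities, contradictory constraints are detected up front as a cycle instead of silently producing an arbitrary order.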
Object sequences currently have the property where keys are always enumerated in a consistent and well-defined order. This could be a more interesting problem to solve with Map keys which are not required to be strings.
https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Map
Higher should provide an interface for stable/unstable, in-place/out-of-place, and lazy/eager sorting.
For example implementations of most of the algorithms to be used here, refer to https://github.com/pineapplemachine/mach.d/tree/master/mach/sort
The link does not include implementations of timsort or quicksort, which higher should include.
Ideas for what this should look like:
The sort method of sequences returns a new sequence object. If it is consumed normally as a sequence, then it is a lazy out-of-place selection sort. Its array and collapse/collapseBreak methods allow for distinction between in-place and out-of-place eager sorting. Lazy sorting is always out-of-place.
sort accepts an optional relational function. The default is (a, b) => (a < b).
It has ascending and descending convenience functions for immediately setting the relational function to a < b or a > b, respectively. It has an order function
It overrides reverse to return a new sorting sequence with a reversed relational function
It has stable and unstable methods to choose between timsort and quicksort, respectively. (TODO: evaluate whether the default stable sort should in fact be timsort, or merge sort instead. Note that merge sort is often considered best from a security standpoint because its complexity is the same regardless of the input, i.e. it isn't vulnerable to attacks.)
A radixSort method should accept a transformation function and produce a radix sort instead. In-place, out-of-place, stable, and unstable radix sorts should all be feasible. It probably is not possible to implement a lazy radix sort.
A heapSort method should allow for a partially lazy implementation (eagerly heapify and lazily pop elements) including for in-place sorting.
It would be nice to provide insertionSort and selectionSort and explicit timSort, quickSort, mergeSort methods for those who know what's best for their use case
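To make the "lazy out-of-place selection sort" idea concrete, here is a generator-based sketch. `lazySelectionSort` is a hypothetical name; it uses the default relational function mentioned above and, because of the swap-remove step, is not a stable sort.

```javascript
// Hypothetical lazy, out-of-place selection sort written as a generator.
// Each element is found only when the consumer asks for it.
function* lazySelectionSort(source, relate = (a, b) => a < b) {
    const remaining = Array.from(source); // copy: the input is left untouched
    while (remaining.length > 0) {
        let best = 0;
        for (let i = 1; i < remaining.length; i++) {
            if (relate(remaining[i], remaining[best])) best = i;
        }
        yield remaining[best];
        // Swap-remove keeps each step O(n) without shifting the array.
        remaining[best] = remaining[remaining.length - 1];
        remaining.pop();
    }
}

const input = [5, 4, 1, 9];
const smallest = lazySelectionSort(input).next().value; // 1, after a single pass
const descending = Array.from(lazySelectionSort(input, (a, b) => a > b)); // [9, 5, 4, 1]
```

Taking only the first k elements costs roughly k passes over the data, which is where a lazy sort can beat an eager one.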
In some places in the code base, knowing for certain that an input sequence would be unbounded is important
For example
Line 6 in c5ba0d6
Currently sequence.collapse will behave badly with some sequences where just reassigning the source attribute won't alter its behavior. (This was always going to be an unsafe behavior, but it at least worked in most cases and was a reasonable stopgap solution... now it needs to be fixed)
One solution to this issue might be adding a method to sequences e.g. retarget which resets and reassigns the source sequence when applicable; this would be called rather than reassigning source.
For an example see Python's itertools.tee
A homogeneous overload could simply return this.compare(this.element, this.element) and heterogeneous the inverse. This is of particular importance for infinitely repeated element sequences, since otherwise the operation would fail with a NotBoundedError.
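A sketch of what such overloads might look like on an infinitely repeating sequence. `RepeatElementSequence` is a made-up class for this example, and passing the comparison function as a parameter is an assumption; the proposal above reads it from this.compare.

```javascript
// Hypothetical infinitely repeating sequence; not higher's actual sequence type.
class RepeatElementSequence {
    constructor(element) {
        this.element = element;
    }
    *[Symbol.iterator]() {
        while (true) yield this.element;
    }
    // Overloads: answer from the single element instead of traversing forever.
    homogeneous(compare = (a, b) => a === b) {
        return compare(this.element, this.element);
    }
    heterogeneous(compare = (a, b) => a === b) {
        return !this.homogeneous(compare);
    }
}

const sevens = new RepeatElementSequence(7);
const allSame = sevens.homogeneous();        // true, without iterating
const nans = new RepeatElementSequence(NaN);
const nanSame = nans.homogeneous();          // false, since NaN !== NaN
```

The NaN case shows why the overload calls the comparison function rather than returning true unconditionally.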
Related to #77 - no heterogeneous function actually exists yet!
For an example of properly documented function overloads ctrl+f the codebase for overloads: and refer to the sequence type methods having the names listed.
Note that there is not currently a well established pattern for testing overloads. This task is not required to involve automated tests for the overloaded methods; all such methods added so far will be given tests in a future task.
https://facebook.github.io/immutable-js/
It ought not be difficult to write a separate extension for higher which makes it possible to produce sequences from ImmutableJS objects and vice-versa. This should become a priority once the core library is stable, and before a first release.
Note that doing this task is likely to involve making the asSequence implementation more flexible.
Here is what the function looks like now:
export const asSequence = (source) => {
    if(isSequence(source)){
        return source;
    }else if(isArray(source)){
        return new ArraySequence(source);
    }else if(isString(source)){
        return new StringSequence(source);
    }else if(isIterable(source)){
        return new IterableSequence(source);
    }else if(isObject(source)){
        return new ObjectSequence(source);
    }else{
        throw (
            "Value is not valid as a sequence. Only arrays, strings, " +
            "iterables, and objects can be made into sequences."
        );
    }
};
Here is what the function is likely to look like in the future:
export const asSequence = (source) => {
    if(isSequence(source)){
        return source;
    }else if(isArray(source)){
        return new ArraySequence(source);
    }else if(...){
        ...
    }else{
        // for...of iterates the converter objects; for...in would yield indexes
        for(const converter of asSequence.converters){
            if(converter.canConvert(source)) return converter.convert(source);
        }
        throw (
            "Value is not valid as a sequence. Only arrays, strings, " +
            "iterables, and objects can be made into sequences."
        );
    }
};
asSequence.converters = [];
asSequence.addConverter = (converter) => {...};
Once this is the case, it should be possible to specify with function wrapping that the function should act as a converter for objects satisfying a predicate.
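A self-contained sketch of the registry idea, independent of higher's actual types. `asIterable` stands in for asSequence, and the converter object shape (canConvert plus convert) follows the proposal above; everything else is assumed for illustration.

```javascript
// Hypothetical converter registry mirroring the proposal above.
const converters = [];
function addConverter(converter) {
    converters.push(converter);
}

// Stand-in for asSequence: built-in cases first, then registered converters.
function asIterable(source) {
    if (source != null && typeof source[Symbol.iterator] === "function") {
        return source;
    }
    for (const converter of converters) {
        if (converter.canConvert(source)) return converter.convert(source);
    }
    throw new TypeError("Value is not valid as a sequence.");
}

// Example extension: treat a non-negative integer n as the range 0..n-1.
addConverter({
    canConvert: (value) => Number.isInteger(value) && value >= 0,
    convert: (value) => Array.from({length: value}, (_, i) => i),
});

const range = asIterable(3); // [0, 1, 2]
```

An extension package (for ImmutableJS, say) would only need to call the registration function rather than patch the core conversion logic.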
Currently there is not any complete documentation (or really almost any documentation) regarding how to define a sequence type using Sequence.extend, and this must be fixed.
A separate package should be provided that adds functions to higher for converting to and from lazy sequences as defined by Lazy.js
N-ary cartesian product of a sequence with itself, but with an option to produce only the unordered sequences as opposed to the ordered sequences. e.g. this should be possible:
hi.cartPower([1,2,3], 2).ordered() => [[1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3], [3, 1], [3, 2], [3, 3]]
hi.cartPower([1,2,3], 2).unordered() => [[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 3]]
Like https://github.com/pineapplemachine/mach.d/blob/master/mach/range/cartpower.d
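A sketch that reproduces the two outputs shown above. The object returned here, with `ordered` and `unordered` methods producing generators, is just one guess at the API shape.

```javascript
// Hypothetical cartPower sketch over a bounded input; not higher's API.
function cartPower(source, power) {
    const items = Array.from(source);
    function* generate(prefix, start, ordered) {
        if (prefix.length === power) {
            yield prefix;
            return;
        }
        // Unordered output only picks elements at or after the previous pick,
        // so each multiset of elements appears exactly once.
        for (let i = ordered ? 0 : start; i < items.length; i++) {
            yield* generate([...prefix, items[i]], i, ordered);
        }
    }
    return {
        ordered: () => generate([], 0, true),
        unordered: () => generate([], 0, false),
    };
}

const orderedPairs = Array.from(cartPower([1, 2, 3], 2).ordered());
const unorderedPairs = Array.from(cartPower([1, 2, 3], 2).unordered());
```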
Throughout higher, throw statements throw strings. They should throw Error objects or objects inheriting from Error, not strings.
This task would also involve fixing the slightly broken methods currently in ./core/error.js
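For reference, the usual pattern for a custom error type looks like the following. The class name echoes the NotBoundedError mentioned in another issue, but this definition and the `requireBounded` helper are written for this example and are not taken from ./core/error.js.

```javascript
// Illustrative error type; not the definition in ./core/error.js.
class NotBoundedError extends Error {
    constructor(message) {
        super(message);
        this.name = "NotBoundedError";
    }
}

// Hypothetical helper showing the throw site.
function requireBounded(isBounded) {
    if (!isBounded) {
        throw new NotBoundedError("Operation requires a known-bounded sequence.");
    }
}

let caught = null;
try {
    requireBounded(false);
} catch (error) {
    caught = error; // has a stack trace and works with instanceof, unlike a string
}
```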
Because pretending lists are really sets is such good fun
Should follow #58
I have some ideas about how to maybe generalize this kind of find behavior with FSMs
You could pass a finder a consumer
A consumer is an object with consume and state and copy methods. consume accepts a sequence element as an argument and advances the consumer by one element. state returns one of active, completed, terminated. Active is the in-progress state, completed is the successfully-finished state, and terminated is the failed-finished state. copy forks and makes a copy of the consumer. A consumer could be paired with a finder to find substrings matching some other sequence or to find substrings of consecutive elements matching a predicate, or could be more complicated and do things like detect floating point numbers. A long term goal might be to build these lazy consumer objects from regular expressions.
The split function would change to accept a consumer, or a sequence or a predicate to construct a simple consumer from. Same with replace when added.
Currently most fall back to a default implementation attached to the Sequence prototype. All sequences should define their own unbounded method like they currently do a bounded method.
Like the existing groupBy except that it accumulates counts rather than arrays of values.
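A minimal sketch of the idea; the name `countBy`, the argument order, and the choice of a Map as the result type are all assumptions.

```javascript
// Hypothetical countBy: like a groupBy that keeps only the size of each group.
function countBy(source, transform) {
    const counts = new Map();
    for (const element of source) {
        const key = transform(element);
        counts.set(key, (counts.get(key) || 0) + 1);
    }
    return counts;
}

const byLength = countBy(["a", "bb", "cc", "d", "eee"], (word) => word.length);
// Map { 1 => 2, 2 => 2, 3 => 1 }
```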
The split function currently only supports splitting on a delimiter. It should also be possible to, for example, split on any continuous whitespace substring. (In fact, hi.split(x) should behave this way by default.)
The best way to achieve this is probably to provide different findDelimiters sequences to the existing split implementation.
It should be a priority to have at least basic documentation for every user-exposed function and every sequence and error type. A script should be written and made part of a release process that generates a GitHub Pages hosted site from that documentation.
Should act as a complement to zip, satisfying zip(unzip(a, b)) === [a, b] and unzip(zip(a, b)) === [a, b].
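An eager, array-based sketch of the round trip. Both functions are simplified stand-ins written for this example, and equality is checked structurally since two distinct arrays are never === in JavaScript.

```javascript
// Simplified eager zip over arrays: pairs up elements by index.
function zip(...sources) {
    if (sources.length === 0) return [];
    const length = Math.min(...sources.map((s) => s.length));
    return Array.from({length}, (_, i) => sources.map((s) => s[i]));
}

// Hypothetical unzip: turns an array of tuples back into one array per position.
function unzip(tuples) {
    const width = tuples.length === 0 ? 0 : tuples[0].length;
    return Array.from({length: width}, (_, i) => tuples.map((tuple) => tuple[i]));
}

const numbers = [1, 2, 3];
const letters = ["a", "b", "c"];
const roundTrip = unzip(zip(numbers, letters)); // [[1, 2, 3], ["a", "b", "c"]]
```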
Instead, the supportsAlways and supportsWith attributes of sequences should be inspected in order to accomplish the same thing (that is, omission of methods that are unsupported due to a source not supporting them).