This was originally filed here
When pants moves to its own repository, I'd like to take the opportunity to do a code reorg in order to make it easier for other projects / repositories / organizations to use pants.
current layout
Backends (java, scala, python, etc) consist of tasks and targets, and basic wiring into phases. Right now the code org is something like this:
- pants.bin
- pants.base
- pants.commands
- pants.doc
- pants.goal
- pants.java
- pants.python
- pants.pants_doc
- pants.targets
- pants.tasks
Bolded is what I consider to be "pants core" – as in pants simply can't operate without the code in there. Specifically this is the execution engine and ancillary stuff like the base classes for everything else (Task, Goal, Target). Currently the Task/TaskError base classes are in pants.tasks.__init__ but should be moved into pants.base.
While pants/__init__.py contains some code necessary for pants to run (get_buildroot), most of it should be moved out to the leaves (the is_* helpers, e.g. is_java, is_scala), and 'from twitter.pants.targets import *' should go away entirely.
new backend layout
I propose that everything in pants.java, pants.python, pants.targets and pants.tasks be reorganized (temporarily) into a new toplevel directory:
With the following layout:
- pants.backend.{language}.__init__
- pants.backend.{language}.base
- pants.backend.{language}.targets
- pants.backend.{language}.tasks
Each backend is treated as a namespace package so that they can more easily be developed by third parties. This model works naturally with the python.new backend, and for the most part with the Java and Scala backends.
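As a rough illustration of the namespace-package idea, the sketch below builds two separate "distributions" in temporary directories, each shipping a single backend, and shows that both import under one shared package. The package name pants_demo, the LANGUAGE attribute and the helper functions are hypothetical, and the pkgutil idiom is only one of several ways to declare a namespace package; the proposal does not say which one pants would use.

```python
import importlib
import os
import sys
import tempfile

# Body of each shared __init__.py: the pkgutil-style namespace declaration.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_distribution(root, language):
    """Create a fake distribution shipping only pants_demo.backend.<language>."""
    package_dir = root
    for name in ("pants_demo", "backend"):
        package_dir = os.path.join(package_dir, name)
        os.makedirs(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w") as f:
            f.write(NAMESPACE_INIT)
    leaf = os.path.join(package_dir, language)
    os.makedirs(leaf)
    with open(os.path.join(leaf, "__init__.py"), "w") as f:
        f.write("LANGUAGE = %r\n" % language)
    return root


def load_backends(languages):
    """Put each backend in its own directory on sys.path, then import them all."""
    for language in languages:
        sys.path.append(make_distribution(tempfile.mkdtemp(), language))
    importlib.invalidate_caches()
    loaded = {}
    for language in languages:
        module = importlib.import_module("pants_demo.backend.%s" % language)
        loaded[language] = module.LANGUAGE
    return loaded


# Two independently "installed" backends resolve under one package.
BACKENDS = load_backends(["java", "python"])
```

The point of the design is that a third party can publish pants.backend.{language} on its own without touching the core distribution.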
One case where this is more challenging is IDL backends. My proposal is to make the CodeGen(Task) base class in pants.base the canonical base class for code generation, and to add specific pants.backend.thrift and pants.backend.protobuf IDL backends that other backends can then depend upon. For example, pants.backend.python would depend upon pants.backend.thrift and subclass the ThriftGen base from pants.backend.thrift.base to create a Python-specific codegen target. (I've already done the refactor in the python.new branch to make this more straightforward through language-specific createtarget abstract base classes.) So rather than the 'gen' phase with 'gen:thrift' and 'gen:protobuf' goals, you'd have the 'gen' phase with 'gen:thrift-python', 'gen:thrift-java', 'gen:protobuf-go', etc.
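The class layering described above might look roughly like the following. Only the names Task, CodeGen and ThriftGen come from the proposal; PythonThriftGen, the idl and language attributes, goal_name, create_target and register are assumptions made up for this sketch, and the real base classes certainly carry more than this.

```python
class Task(object):
    """Stand-in for the pants Task base class."""

    def execute(self, targets):
        raise NotImplementedError


class CodeGen(Task):
    """Canonical codegen base; in the proposal this would live in pants.base."""

    idl = None       # set by the IDL backend, e.g. 'thrift' (hypothetical attribute)
    language = None  # set by the language backend, e.g. 'python' (hypothetical attribute)

    @classmethod
    def goal_name(cls):
        # One goal per (IDL, language) pair instead of one goal per IDL.
        return "gen:%s-%s" % (cls.idl, cls.language)

    def create_target(self, source):
        """Language-specific hook: describe the target generated from `source`."""
        raise NotImplementedError

    def execute(self, targets):
        return [self.create_target(source) for source in targets]


class ThriftGen(CodeGen):
    """Would live in pants.backend.thrift.base; knows the IDL, not the language."""

    idl = "thrift"


class PythonThriftGen(ThriftGen):
    """Would live in pants.backend.python, which depends on pants.backend.thrift."""

    language = "python"

    def create_target(self, source):
        return ("python_library", source.replace(".thrift", ".py"))


def register(phases, task_class):
    """Wire a codegen task into the 'gen' phase (hypothetical helper)."""
    phases.setdefault("gen", []).append(task_class.goal_name())
    return phases


PHASES = register({}, PythonThriftGen)
```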
bootstrapping
To bootstrap a repository to use pants, you would do the following:
- copy the bootstrap_pants.py script into your repository, name it 'pants', and chmod +x it
- add a root BUILD file which sets up your source root and defines which plugins you'd like
The pants bootstrap script currently bootstraps itself from PyPI. Some of the dependent source dists have not yet been published (specifically twitter.pants, twitter.common-core, twitter.common.python), but we can do that when we're ready, especially since twitter.pants could become 'pb' or 'pantsbuild' or something else. If you don't have external network access, you'd add a field in pants.ini that tells the bootstrapper where to look for artifacts, and you'd cache them locally. Similarly, the bootstrap script bootstraps the latest version of pants by default, but you could pin specific versions in pants.ini using the existing Requirement format (e.g. 'pb==0.3.2' or 'pb>0.3,<1').
The bootstrapper script would write pants.pex. On subsequent runs it would 1) check that the pants.pex version is compatible with the one defined in pants.ini and, if so, 2) os.execv into it.
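A minimal sketch of that version check and hand-off, under several assumptions: versions are purely numeric and dotted, the requirement string is parsed by a small hand-rolled matcher rather than the real Requirement class, and an incompatible or missing pex triggers a re-bootstrap (the proposal does not say what happens in that case). The function names are hypothetical.

```python
import operator
import os
import re
import sys

# Matches one clause of a requirement such as 'pb>0.3,<1' (simplified grammar).
_CLAUSE = re.compile(r"(==|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*)")
_OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '0.3.2' into (0, 3, 2); numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, requirement):
    """True if `version` meets every clause of e.g. 'pb==0.3.2' or 'pb>0.3,<1'."""
    have = parse_version(version)
    for op, wanted in _CLAUSE.findall(requirement):
        want = parse_version(wanted)
        width = max(len(have), len(want))
        # Pad with zeros so that '1' and '1.0' compare as equal.
        left = have + (0,) * (width - len(have))
        right = want + (0,) * (width - len(want))
        if not _OPS[op](left, right):
            return False
    return True


def next_step(pex_version, requirement):
    """Decide whether the wrapper can reuse the cached pants.pex."""
    if pex_version is not None and satisfies(pex_version, requirement):
        return "exec"
    return "bootstrap"  # assumption: anything else means fetching a new pex


def run_pex(pex_path, args):
    """Replace the current process with the cached pants.pex; does not return."""
    os.execv(sys.executable, [sys.executable, pex_path] + list(args))
```

Using os.execv rather than a subprocess means the wrapper adds no extra process once the pex is known to be good.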
repository BUILD
The root BUILD would look like:
require('pants.backend.java==0.2.3')
require('pants.backend.python')
require_path('plugins/foursquare')
# [any extra project specific methods or variables defined here]
Each backend would be a PythonLibrary with a corresponding SetupPy artifact, which can have install_requires in order to dictate dependencies (e.g. pystache, boto, pants.backend.thrift, whatever). The upside is that 1) this is all already implemented and 2) third parties could write a backend as a standard setup.py project outside of pants.
require_path would simply look for an __init__.py in that directory and install everything there into the root namespace.
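One way require_path could behave, sketched with the standard runpy module. The explicit namespace argument, the skipping of underscore-prefixed names, and the scalding_job example are assumptions of this sketch, not part of the proposal.

```python
import os
import runpy
import tempfile


def require_path(path, namespace):
    """Run <path>/__init__.py and copy the names it defines into `namespace`."""
    init_py = os.path.join(path, "__init__.py")
    if not os.path.isfile(init_py):
        raise IOError("no __init__.py found in %s" % path)
    exported = runpy.run_path(init_py)
    for name, value in exported.items():
        # Assumption: private and dunder names are not installed.
        if not name.startswith("_"):
            namespace[name] = value
    return namespace


# Demo: a throwaway plugin directory defining one hypothetical helper.
plugin_dir = tempfile.mkdtemp()
with open(os.path.join(plugin_dir, "__init__.py"), "w") as f:
    f.write("def scalding_job(name):\n    return 'scalding:' + name\n")

ROOT_NAMESPACE = require_path(plugin_dir, {})
```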
Currently, for BUILD files to be usable, we autoinject 'from pants.targets import *' and 'from twitter.common.quantity import Amount, Time' into every single one. Instead, we'd deposit the union of the namespaces created by the buildroot BUILD and the __init__.py of each registered backend into all down-tree BUILDs. The backend __init__.py files would also be responsible for wiring tasks into phases.
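The namespace injection could be sketched as below: merge what the backends export with what the root BUILD defines, then evaluate each down-tree BUILD against a copy of that merged namespace. The function names, the rule that the root BUILD wins on a name clash, and the toy python_library and java_library callables are all assumptions for illustration.

```python
def build_file_namespace(root_build_names, backend_namespaces):
    """Union the root BUILD's names with each registered backend's exports."""
    namespace = {}
    for exported in backend_namespaces:
        namespace.update(exported)
    # Assumption: names defined in the root BUILD override backend names.
    namespace.update(root_build_names)
    return namespace


def parse_build_file(source, namespace):
    """Evaluate one down-tree BUILD file against the shared namespace."""
    scope = dict(namespace)  # copy, so BUILD files cannot leak names to each other
    exec(compile(source, "BUILD", "exec"), scope)
    return scope


# Toy backends: each exports one target constructor that records its call.
TARGETS = []


def python_library(name, sources=()):
    TARGETS.append(("python_library", name, tuple(sources)))


def java_library(name, sources=()):
    TARGETS.append(("java_library", name, tuple(sources)))


NAMESPACE = build_file_namespace(
    {"REPO_NAME": "demo"},
    [{"python_library": python_library}, {"java_library": java_library}],
)

# A down-tree BUILD needs no imports; the names are already in scope.
parse_build_file("python_library(name='util', sources=['util.py'])\n", NAMESPACE)
```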