bareflank / pal

The Bareflank Processor Abstraction Layer

License: MIT License

Python 42.34% CMake 6.15% C++ 2.33% C 45.41% Assembly 1.96% Rust 1.79% Makefile 0.01%

pal's Introduction

PAL

The Bareflank PAL (Processor/Peripheral Abstraction Layer) project provides developers and researchers of systems (operating systems, hypervisors, platform firmware) with a software API for manipulating low-level system state (instructions and registers) within a CPU. The project also provides APIs for register-level interfaces (memory-mapped, I/O mapped) for peripheral devices and system data structures.

The PAL project consists of:

  • A database (yaml files) that describes facts (normally documented by .pdf manuals) about how a CPU (e.g. Intel, AMD, ARMv8), peripheral (e.g. PCIe device) or data structure (e.g. ACPI table) is structured and accessed.
  • Code generators (C, C++, Rust) that create a library of accessor functions (a software API) for manipulating the information described by the database.
  • Support libraries and shims (e.g. drivers) that enable generated accessor functions to integrate and run in numerous execution contexts (e.g. inline assembly code, or forwarded from an application to a driver via an IOCTL)
  • Build system interfaces that provide integration with a variety of compilers, languages, and toolchains
  • Examples that demonstrate how to use PAL's generated APIs in each supported programming language

The PAL project enables a variety of different use cases:

  • Control (read/write/execute) and view (print) the contents of hardware devices in a bare-metal software project
  • Research, audit, test, or control privileged system state from an unprivileged application (much like Chipsec or RWEverything)
  • Build test harnesses, mock devices, or hardware emulators

A Brief Example

To demonstrate what PAL is all about, let's implement a small program that uses PAL's generated APIs for the Intel platform in 3 different languages (C, C++, Rust). The program defines a single function that performs the following tasks:

  • Read the value of the IA32_FEATURE_CONTROL MSR by name
  • Read the value of the IA32_TSC MSR using its address, and the x86 RDMSR instruction
  • Enable paging by setting the PG bit (bit 31) in control register CR0
  • Print the value of CPUID leaf 0x1, output register EAX

C

#include "pal/msr/ia32_feature_control.h"
#include "pal/msr/ia32_tsc.h"
#include "pal/control_register/cr0.h"
#include "pal/cpuid/leaf_01_eax.h"

void pal_example(void)
{
    uint64_t msr1 = pal_get_ia32_feature_control();
    uint64_t msr2 = pal_execute_rdmsr(PAL_IA32_TSC_ADDRESS);
    pal_enable_cr0_pg();
    pal_print_leaf_01_eax();
}

C++

#include "pal/msr/ia32_feature_control.h"
#include "pal/msr/ia32_tsc.h"
#include "pal/control_register/cr0.h"
#include "pal/cpuid/leaf_01_eax.h"

void pal_example(void)
{
    auto msr1 = pal::ia32_feature_control::get();
    auto msr2 = pal::execute_rdmsr(pal::ia32_tsc::ADDRESS);
    pal::cr0::pg::enable();
    pal::leaf_01_eax::print();
}

Rust

use pal;

pub fn pal_example() {
    let msr1 = pal::msr::ia32_feature_control::get();
    let msr2 = pal::instruction::execute_rdmsr(pal::msr::ia32_tsc::ADDRESS);
    pal::control_register::cr0::pg::enable();
    pal::cpuid::leaf_01_eax::print();
}

For more examples that show how to use PAL with different CPU architectures and peripheral devices, check out the project's example and test directories.

Dependencies

Build-time dependencies will vary depending on your host system, target language, build system, and compiler toolchain. The following provides a good starting point for using most of PAL's features:

Ubuntu

For running the code generator:

sudo apt-get install python3 python3-pip
pip3 install lxml dataclasses colorama pyyaml

For building support libraries and shims:

sudo apt-get install cmake cmake-curses-gui build-essential linux-headers-$(uname -r)

Windows

TODO

Building and Integrating

There are numerous build interfaces to configure, generate, and build PAL APIs for use with your project. Depending on the needs of your target language, compiler toolchain and execution environment, you may need to use one or more of the following build interfaces.

CMake (C/C++)

CMake provides the easiest way to integrate all of PAL's features with C and C++ projects. The CMake build interface works by specifying a configuration input that describes what PAL should generate code for (formatted as -DPAL_<ARCHITECTURE>_<EXECUTION_STATE>_<SOURCE_ENVIRONMENT>_<TARGET_ENVIRONMENT>=ON). CMake then runs the code generator and builds any necessary support libraries and shims for that configuration. Example (run from the project's top directory):

mkdir build && cd build
cmake .. -DPAL_INTEL_64BIT_SYSTEMV_GNUINLINE=ON
make

To explore all available configuration options for the CMake build interface, run the following from your CMake build directory:

ccmake .

Cargo (Rust, experimental)

Cargo provides the easiest way to integrate all of PAL's features with Rust projects. The Cargo build interface works by specifying a crate feature that describes what PAL should generate code for (formatted as <architecture>_<execution_state>_<source_environment>_<target_environment>). Cargo then runs the PAL code generator and builds any necessary support libraries and shims for that configuration. Example (place this into your project's Cargo.toml):

[dependencies.pal]
git = "ssh://git@github.com/bareflank/pal.git"
features = ["intel_64bit_linux_ioctl"]

To explore all available configuration options for the Cargo build interface, see the [features] section of PAL's Cargo.toml file.

Code Generator

If you want to generate code from the project's database that does not require any support libraries (e.g. C/C++ code with inline assembly), you can run the code generator directly. Example:

./pal/pal.py --language=c --execution_state=intel_64bit --access_mechanism=gnu_inline

To explore all available configuration options for the code generator:

./pal/pal.py --help

Project status and scope

TODO: Need to make a grid/table that shows support for many different arch/language/runtime scenarios:

Is the project missing something you need?

Contributions, enhancements and feature requests to the project are welcome. In addition to the backlog in the project's issue tracker, the project will always be seeking support for:

  • Additional database entries. Is the project missing a definition for something you need? Help us add it!
  • Database auditing. Did you find a mistake in a database entry? Help us fix it!
  • Database maintenance. Do you work for an organization that designs or publishes the information described in our database? Get in touch to help us maintain the data over time!

pal's People

Contributors

arkivm, connojd, davgab7, jaredwright, jschultz-dk, rianquinn, zhaofengli

pal's Issues

[RFC] Shoulder Gadgets

After creating both a C and C++ header generator, it seems like there are quite a few parts shared between the two generators. Additionally, some of the generator functions (like _generate_bitfield_accessors()) are large and complex, and could benefit from being broken down into smaller, bite-size pieces that are more easily understandable and testable individually.

This RFC proposes the addition of a new sub-package within Shoulder called gadget. The gadget package will contain single-purpose generator "gadgets" that can be reused across all built-in shoulder generators, and also be used by external Python modules that want to use Shoulder.

The gadget package will follow a similar pattern to the generator package. An interface for each gadget will be defined using an AbstractGadget abstract-base class:

import abc

class AbstractGadget(abc.ABC):
    @property
    @abc.abstractmethod
    def description(self):
        """ Description of what this gadget does """

    @abc.abstractmethod
    def generate(self, objects, outfile):
        """ Generate a specific output using the given Shoulder objects
        to the given output file """
        return

    def __str__(self):
        return self.description

An example of a concrete gadget would be the following, a simple gadget to insert a license into a generated output:

from shoulder.gadget.abstract_gadget import AbstractGadget
from shoulder.logger import logger
from shoulder.config import config
from shoulder.exception import *

class LicenseGadget(AbstractGadget):
    @property
    def description(self):
        return "Generate the Shoulder license for a C/C++ file"

    def generate(self, objects, outfile):
        try:
            with open(config.license_template_path, "r") as license:
                for line in license:
                    outfile.write("// " + line)
                outfile.write("\n")

            msg = "{gadget}: license generated".format(
                gadget = str(type(self).__name__)
            )
            logger.debug(msg)

        except Exception as e:
            msg = "{gadget} failed to generate license: {exception}".format(
                gadget = str(type(self).__name__),
                exception = e
            )
            raise ShoulderGeneratorException(msg)

def generate(objects, outfile):
    g = LicenseGadget()
    g.generate(objects, outfile)

YAML Data Generator

Add support for generating the YAML models contained in the data directory. Ideally, the PAL project should be completely capable of generating a perfect copy of the data directory using that directory as input.

Vector/SIMD Control Register Access Mechanism Code Generation

Add support for generating C/C++ register access functions that can read and write vector/simd control registers from AArch32 state.

  • Create a SHOULDER_VMRS_BANKED_IMPL accessor macro that implements a vector/simd control register read using the VMRS instruction
  • Create a SHOULDER_MSR_BANKED_IMPL accessor macro that implements a vector/simd control register write using the VMSR instruction
  • Update the C header generator to use the above new macros
  • Update the C++ header generator to use the above new macros

Component header error

This code does not seem to work properly. Instead of generating one header per component, it generates the same header every time (for the last component from the previous "for" loop) and overwrites it.

            for name, regs in peripherals.items():
                include_guard = "PAL_" + reg.component.upper() + "_H"
                self.gadgets["pal.include_guard"].name = include_guard
                self.gadgets["pal.cxx.namespace"].name = namespace_name
                outfile_path = os.path.join(outpath, reg.component.lower() + ".h")
                outfile_path = os.path.abspath(outfile_path)

I have made a quick fix according to my understanding. Is it correct, or am I wrong?

            for name, regs in peripherals.items():
                include_guard = "PAL_" + regs[0].component.upper() + "_H"
                self.gadgets["pal.include_guard"].name = include_guard
                self.gadgets["pal.cxx.namespace"].name = "pal::" + regs[0].component.lower() # namespace_name
                outfile_path = os.path.join(outpath, regs[0].component.lower() + ".h")
                outfile_path = os.path.abspath(outfile_path)
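
The failure mode described above can be reproduced in isolation. The sketch below is not PAL code: Register and the contents of peripherals are hypothetical stand-ins, and it assumes only that reg was left bound by an earlier loop, as the report suggests.

```python
from collections import namedtuple

# Hypothetical stand-in for PAL's register model; only `component` matters here.
Register = namedtuple("Register", ["name", "component"])

peripherals = {
    "uart": [Register("uart_dr", "uart"), Register("uart_fr", "uart")],
    "gic": [Register("gicd_ctlr", "gic")],
}

# An earlier loop leaves `reg` bound to the last register it visited.
for regs in peripherals.values():
    for reg in regs:
        pass

# Buggy: `reg` is never rebound in this loop, so every header gets the same name.
buggy_paths = []
for name, regs in peripherals.items():
    buggy_paths.append(reg.component.lower() + ".h")

# Fixed: derive the component from the registers of the current peripheral.
fixed_paths = []
for name, regs in peripherals.items():
    fixed_paths.append(regs[0].component.lower() + ".h")

print(buggy_paths)
print(fixed_paths)
```

The fixed loop assumes every peripheral has at least one register and that all of its registers share the same component.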

Update C Header Generator

Update the existing C header generator (pal/generator/c_header_generator.py) for compatibility with the new writer package, using the existing C++ header generator (pal/generator/cxx_header_generator.py) as an example.

  • Refactor all outfile.write() calls into a C-specific LanguageWriter
  • Update the generator logic so that all contents to the output file are written indirectly by either a gadget or a writer object.

Support reverse lookup of bit offset

It would be nice to have a per-register function that returns the corresponding field name given a bit offset. This will be useful for generating debugging/error messages.

For example, for each VMCS control field there is a [1] corresponding MSR that contains the constraints (i.e., the features supported by the processor) on the value that can be set in said control. In our hypervisor we perform the checks on the values ourselves, and a message such as Bit 28 (Load CET state) of "VM-exit controls" must be 0 would look much nicer than Bit 28 of "VM-exit controls" must be 0.

This requires a few more changes to the code generation infrastructure, so I'm opening an issue first to get some feedback on how this should be done.

[1] Sometimes two
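
To make the request concrete, here is a minimal sketch of what such a reverse lookup could look like. The table layout and the function names are assumptions made for illustration, not PAL's data model or generated API, and the field list is only a small subset of the "VM-exit controls" fields.

```python
# Illustrative subset of "VM-exit controls" fields as (name, lsb, msb).
# The table layout is an assumption, not PAL's data model.
VM_EXIT_CONTROLS_FIELDS = [
    ("Host address-space size", 9, 9),
    ("Acknowledge interrupt on exit", 15, 15),
    ("Load CET state", 28, 28),
]

def field_name_at(fields, bit):
    """Return the name of the field covering `bit`, or None if the bit is unnamed."""
    for name, lsb, msb in fields:
        if lsb <= bit <= msb:
            return name
    return None

def describe_bit(register_name, fields, bit):
    """Build the label used in an error message for a bit offset."""
    name = field_name_at(fields, bit)
    suffix = " ({})".format(name) if name else ""
    return 'Bit {}{} of "{}"'.format(bit, suffix, register_name)

print(describe_bit("VM-exit controls", VM_EXIT_CONTROLS_FIELDS, 28) + " must be 0")
```

Storing an (lsb, msb) range per field means the same lookup also covers multi-bit fields, and unnamed bits fall back to the plain "Bit N" form.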

RFC: PAL Instruction APIs

Overview

Currently, PAL can generate this:

uint64_t efer = pal_ia32_efer_get();    // <-- Read the EFER MSR by name

But PAL cannot generate this:

uint64_t efer = pal_execute_rdmsr(0xc0000080);    // <-- Read the EFER MSR using the RDMSR instruction

This RFC aims to introduce the concept of CPU instructions to the PAL project.

Why?

The Bareflank hypervisor and MicroV projects are aiming to provide support for lifting the implementation of a VMM up into a de-privileged execution state (e.g. Ring-3 of VMX-root). In this scenario, a VMM implementation will need a way to request execution of privileged instructions from a more privileged monitor (e.g. the Bareflank microkernel). PAL is well suited for making this happen!

Proposed Changes

To address the above topic, this RFC proposes the addition of three new components to the PAL project:

Component 1: Instruction Data Definitions

Add .yml files to the pal/data directory that represent a logical model of a system instruction. This logical model will contain information that describes how to use the instruction from a software perspective (how many logical arguments does it take? Does it return any data?), as well as necessary execution context for the instruction's operation (usable in 64-bit mode? compatible with 32-bit protected mode?).

Instruction data definitions will be split into two different categories: architectural instructions and logical instructions.

Architectural Instructions

These will represent instructions that are explicitly defined by a reference manual. These must also have a well defined set of inputs and outputs. For example, the RDMSR instruction is an architectural instruction because it is defined in the Intel/AMD 64-bit software developer manuals, has well defined inputs (register ecx), and has well defined outputs (registers eax and edx).

Logical Instructions

Logical instructions represent either a collection of instructions that must be executed as a series to achieve a desired outcome, or a logical use case of an instruction that isn't explicitly defined in a reference manual. For example, "read_cr0" could be a logical instruction because there is no explicit instruction defined for reading control register 0 (it's a variant of the MOV instruction).
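
As a rough illustration of the two categories, the sketch below models an instruction definition in Python and derives a C prototype from it. Every attribute name, the execution context labels, and the c_prototype helper are hypothetical; the RFC does not fix a schema, and the argument types are assumed.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InstructionModel:
    # All attribute names are hypothetical; the RFC does not fix a schema.
    name: str
    category: str  # "architectural" or "logical"
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    execution_contexts: List[str] = field(default_factory=list)

# RDMSR: input register ecx, output registers eax and edx (as described above).
rdmsr = InstructionModel(
    name="rdmsr",
    category="architectural",
    inputs=["ecx"],
    outputs=["eax", "edx"],
    execution_contexts=["64bit"],
)

# read_cr0: a logical use case of MOV with no explicit inputs.
read_cr0 = InstructionModel(
    name="read_cr0",
    category="logical",
    outputs=["value"],
    execution_contexts=["64bit"],
)

def c_prototype(insn):
    """Sketch a C prototype in the style of pal_execute_rdmsr from the overview."""
    args = ", ".join("uint64_t " + reg for reg in insn.inputs) or "void"
    ret = "uint64_t" if insn.outputs else "void"
    return "{} pal_execute_{}({});".format(ret, insn.name, args)

print(c_prototype(rdmsr))
print(c_prototype(read_cr0))
```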

Component 2: Add more implementations to libpal

Currently, libpal hosts assembly stubs used to implement register access instructions when targeting Visual Studio and the MS64 ABI. Libpal will receive additional assembly routines that can provide a low-level implementation for many other execution contexts such as the SystemV ABI, and placeholders for a pending future Bareflank microkernel ABI.

Component 3: Add instruction API code generation

Finally, the Python package of the PAL project will be updated to provide support for generating C/C++ code that can link the models from Component 1 to the backends provided by Component 2. This will effectively allow PAL to take care of language-dependent code generation tasks independently of the backend used in a libpal implementation.

RFC: PAL Project Ownership/Maintenance

Summary

This RFC seeks to gather feedback from the Bareflank community and maintainers about plans for maintenance and ownership of the PAL project, including an idea to transfer ownership of the PAL project to a new GitHub organization.

What are your thoughts @rianquinn @connojd @brendank310?

Background Context

In 2018, the PAL project was started as a side project to the Bareflank Hypervisor, intended to provide a support library for manipulating system registers of Intel and ARMv8 CPUs. The goal was to reduce maintenance burden coming from Bareflank's "bfintrinsics" layer by developing a code generator, as opposed to hand-written accessor functions. Since then, the project has been used for numerous research efforts, but (to my knowledge) was never fully integrated into the Bareflank Hypervisor ecosystem (Hypervisor, MicroV, etc).

As of December 2020, the PAL project's primary contributor and maintainer (myself, Jared Wright) moved to a new employer (Amazon Web Services), where I would like to continue efforts to develop and maintain the project. In the short term, there are a couple of hoops I need to jump through before I can do this, so I am looking to identify the best path for myself to contribute to PAL in open source. In the long term, I would also like to ensure the scope of the project is able to expand over time to include new features such as:

  • Support for more programming languages (e.g. Rust)
  • APIs for system instructions
  • APIs for peripheral devices and data structures (e.g. ACPI tables, page tables, etc)
  • Execution shims to enable forwarding of APIs across system boundaries (e.g. userspace->kernel via an IOCTL, much like how Chipsec works)
  • Adoption and integration with a variety of open-source projects (e.g. AWS's Kernel Test Framework)
  • Engagement with silicon vendors (e.g. Intel, AMD, etc) to seek contributions and maintenance of the project's register+instruction database.

Proposed Changes

I'd like the current Bareflank maintainers to weigh in on which of the following strategies is most appropriate to support future development and maintenance of the PAL project. My personal recommendation is to pursue "Option 1".

I would also like to get a better understanding of the Bareflank community's current dependencies on (or future plans for) the PAL project to ensure everyone's needs are supported. At the moment, I am not aware of any integrations between PAL and another project in the Bareflank organization, but I'm wondering if there are other dependencies internally at AIS?

Option 1: Move PAL to the AWS Labs GitHub Organization

Pros:

  • This is the easiest path for me to continue contributing to and maintaining the project.
  • This is likely the easiest path for me to pursue more support for the project.
  • The project could likely maintain its current license (MIT) or a similar one (Apache 2.0, or a more permissive MIT-0)

Cons:

  • The project will likely need to go through a name change (there is already another PAL project)
  • The copyright for the project would likely need to be transferred from Assured Information Security, Inc to Amazon.
  • Existing Bareflank maintainers would need to jump through hoops to become maintainers (I don't think this is a real problem, but want to make it explicit)

Option 2: Keep PAL in the Bareflank GitHub organization (where it is now)

Pros:

  • The project's maintainers and the governance model stays the same.
  • Assured Information Security, Inc maintains the copyright to the project.

Cons:

  • I'm not 100% certain if it is possible for me to contribute this way. The project's current CLA may cause friction for me trying to contribute and maintain on behalf of my employer.
  • This option will take some more research on my part.

Option 3: Find another open source community that wants to adopt the project

I'm open to suggestions if anyone has another idea for a good "home" for PAL.

Option 4: Maintain two project forks

I really hate this option, but I would be open to discussing it if for some reason there is no other choice.

README

After rebranding the project to Pal, the README was completely inaccurate and out of date. Start a new project README that communicates the following:

  • What is this project? Why would you use it?

  • How do you build/use/generate something from this project?

  • What are some examples of how to use the generated outputs from this project?

  • Describe the project's scope and current status

  • Describe (briefly) how you would go about adding new definitions to the project's data directory.

Banked System Register Access Mechanism Code Generation

Add support for generating C/C++ register access functions that can read and write banked system registers from AArch32 state.

  • Create a SHOULDER_MRS_BANKED_IMPL accessor macro that implements a banked system register read using the MRS instruction
  • Create a SHOULDER_MSR_BANKED_IMPL accessor macro that implements a banked system register write using the MSR instruction
  • Update the C header generator to use the above new macros
  • Update the C++ header generator to use the above new macros

Documentation writers for C and C++

PAL's Rust code generator supports writing file-level and function-level documentation strings that can be used to create friendly documentation pages (see pull request #97). Equivalent functionality should be developed for both C and C++, generating something along the lines of doxygen-style comments.

This task consists of:

Decorator Based Gadgets

RFC #12 introduced the concept of "gadgets" to shoulder with the intention of providing a way to break down complex Generator objects into smaller, more reusable bits and pieces. This gadget concept followed the same design pattern behind generators: An AbstractGadget class defined an interface that specific concrete gadgets would implement to provide their own functionality. This choice to use an abstract base class to define the gadget interface seems to be imposing some limitations:

  • There isn't an easy way to parametrize each gadget differently (which seems to be necessary)
  • It is difficult to write a gadget that "wraps" content (like an include guard, or C++ namespace)
  • It is a little awkward to call the gadgets from within a generator (the whole point of the gadget in the first place)

To overcome the above limitations, this RFC proposes to move the gadget concept into an implementation that utilizes python decorators, instead of an abstract base class. This way, a shoulder generator could use a gadget by "decorating" its private, internal functions. Each decorator would optionally provide parameters for tweaking generated content. For example:

import os
import shoulder

class MyGenerator(shoulder.generator.abstract_generator.AbstractGenerator):
    def generate(self, objects, outpath):
        outfile_path = os.path.abspath(os.path.join(outpath, "my_generated_output.h"))

        with open(outfile_path, "w") as outfile:
            self._generate(objects, outfile)

    @shoulder.gadget.license
    @shoulder.gadget.include_guard(name="MY_CUSTOM_INCLUDE_GUARD_H")
    def _generate(self, objects, outfile):
        outfile.write("This decorated content will contain a license and be wrapped in include guards")

The example above applies two decorator-based gadgets named shoulder.gadget.license and shoulder.gadget.include_guard. The implementation of the include_guard gadget would be the following:

def include_guard(_decorated=None, *, name="SHOULDER_AARCH64_H"):
    def _include_guard(decorated):
        def include_guard_decorator(generator, objects, outfile):
            outfile.write("#ifndef " + str(name) + "\n")
            outfile.write("#define " + str(name) + "\n\n")
            decorated(generator, objects, outfile)
            outfile.write("#endif\n")
        return include_guard_decorator

    if _decorated is None:
        return _include_guard
    else:
        return _include_guard(_decorated)
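
To see what the decorated function actually writes, the include_guard gadget can be exercised against an in-memory buffer. In this sketch the gadget is repeated so the example is self-contained, while DemoGenerator and the DEMO_H guard name are hypothetical and skip the AbstractGenerator base class.

```python
import io

def include_guard(_decorated=None, *, name="SHOULDER_AARCH64_H"):
    # Same shape as the gadget above: works both bare and with parameters.
    def _include_guard(decorated):
        def include_guard_decorator(generator, objects, outfile):
            outfile.write("#ifndef " + str(name) + "\n")
            outfile.write("#define " + str(name) + "\n\n")
            decorated(generator, objects, outfile)
            outfile.write("#endif\n")
        return include_guard_decorator

    if _decorated is None:
        return _include_guard
    return _include_guard(_decorated)

class DemoGenerator:
    # Hypothetical generator used only to drive the gadget.
    @include_guard(name="DEMO_H")
    def _generate(self, objects, outfile):
        outfile.write("// body\n\n")

buffer = io.StringIO()
DemoGenerator()._generate([], buffer)
print(buffer.getvalue())
```

The check on _decorated is what lets the same gadget be applied either as @include_guard (using the default name) or as @include_guard(name=...).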

[BUG] VMCS access mechanisms fail for fields that aren't 64-bits

PAL's VMCS access mechanism writers create getters and setters in the following format for 64-bit VMCS fields:

uint64_t value = 0;
__asm__ __volatile__(
    "mov $0x201a, %%rdi;"
    "vmread %%rdi, %[v];"
    : [v] "=r"(value)
    :
    : "rdi"
);
return value;

When a VMCS field that is 32-bits gets generated, the following occurs:

uint32_t value = 0;              // <-- This 32-bit variable declaration...
__asm__ __volatile__(
    "mov $0x4402, %%rdi;"
    "vmread %%rdi, %[v];"
    : [v] "=r"(value)            // <-- ... causes this to be a 32-bit register operand
    :
    : "rdi"
);
return value;

This cannot be assembled, because vmread/vmwrite instructions require a 64-bit operand.
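
One possible fix, sketched here under assumptions, is for the writer to always declare a 64-bit temporary for the asm operand and narrow the result only in the return statement. The helper below is hypothetical and is not PAL's writer API; it just emits the text of such a getter body.

```python
def vmread_getter_body(encoding, field_bits):
    """Emit the body of a C getter for a VMCS field.

    Hypothetical helper, not PAL's writer API. The asm operand is always a
    64-bit variable; the result is narrowed only in the return statement.
    """
    return_type = "uint{}_t".format(field_bits)
    lines = [
        "uint64_t value = 0;",
        "__asm__ __volatile__(",
        '    "mov ${:#x}, %%rdi;"'.format(encoding),
        '    "vmread %%rdi, %[v];"',
        '    : [v] "=r"(value)',
        "    :",
        '    : "rdi"',
        ");",
        "return ({})value;".format(return_type),
    ]
    return "\n".join(lines)

print(vmread_getter_body(0x4402, 32))
```

A setter would need the mirror-image treatment: widen the narrow argument into a 64-bit variable before passing it to vmwrite.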

Encoding-based access mechanism writer

The existing CHeaderGenerator supports the ability to access a register by directly emitting an instruction opcode into a generated accessor function.

See:

def _generate_aarch64_encoded_set(self, outfile, reg, am):

and:

def _generate_aarch64_encoded_get(self, outfile, reg, am):

Refactor this functionality into a new BinaryEncodedAccessMechanismWriter class that will let any generator leverage this functionality.
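
For context, the core of such a writer is computing the instruction word from a register's encoding fields. The sketch below shows the A64 MRS/MSR (register) bit layout in Python; the function names are hypothetical, and emitting the word with a GNU assembler .inst directive is an assumption about how a generator might use it, not a description of the existing implementation.

```python
def encode_aarch64_sysreg_access(op0, op1, crn, crm, op2, rt=0, read=True):
    """Return the 32-bit A64 encoding of MRS (read) or MSR (register write).

    Hypothetical helper for illustration. op0 must be 2 or 3 because only its
    low bit is stored in the instruction word.
    """
    if op0 not in (2, 3):
        raise ValueError("op0 must be 2 or 3 for MRS/MSR (register)")
    base = 0xD5300000 if read else 0xD5100000
    return (base | ((op0 & 1) << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | rt)

def inline_asm_line(encoding):
    """Format the encoding as a .inst directive inside a C string literal."""
    return '".inst {:#010x};"'.format(encoding)

# MIDR_EL1 is op0=3, op1=0, CRn=0, CRm=0, op2=0; read into x0.
print(inline_asm_line(encode_aarch64_sysreg_access(3, 0, 0, 0, 0)))
```

Because the general-purpose register number is baked into the encoding, a generated accessor would also have to pin its C variable to that register, which this sketch does not show.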

Generator Modules

This RFC proposes a design for generating outputs based on the results of parsed ASL/XML inputs (RFC #2).

Overview

The overall goal of the Shoulder project is to generate output files that contain functions to manipulate system registers and execute system instructions on the aarch64 architecture. Generated files might include language specific header files, libraries, or tests that you could include into a larger software project. This RFC proposes a mechanism to add file generation support to the Shoulder project.

Proposed Design

Generators for the Shoulder project will follow the same basic design pattern used for Shoulder parsers: an abstract base class will be created that represents the required interface that a generator must provide:

import abc

class AbstractGenerator(abc.ABC):
    @abc.abstractmethod
    def generate(self, objects, outpath):
        """ Generate target output using the given register and/or
        instruction objects to a file at the given output path """
        return

Concrete generators will then implement this interface. For this RFC, one generator will be implemented to create a C++ header file with intrinsic-like register access functions:

from shoulder.generator.abstract_generator import AbstractGenerator
from shoulder.logger import logger
from shoulder.config import config
from shoulder.exception import GeneratorException

class CxxHeaderGenerator(AbstractGenerator):
    def __init__(self):
        self._current_indent_level = 0

    def generate(self, objects, outpath):
        logger.info("Generating C++ header: " + str(outpath))
        with open(outpath, "w") as outfile:
            self._generate_license(outfile)
            self._generate_cxx_includes(outfile)
            self._generate_include_guard_open(outfile)
            self._generate_namespace_open(outfile)

            self._generate_objects(objects, outfile)

            self._generate_namespace_close(outfile)
            self._generate_include_guard_close(outfile)

All generators will be grouped into a submodule (subdirectory) of the Shoulder package at shoulder/generator/.

Out-of-scope:

Generator modules will not be responsible for validating the contents of the objects (registers and instructions) that they will generate. Generators will attempt to generate output files without any modifications to the given inputs. As a result, the generated output will be prone to any mistakes made by the parsers that created the list of objects to be generated. To address invalid registers and allow for the ability to exclude registers, a separate filter module RFC will be created as a next step.

Coprocessor Register Access Mechanism Code Generation

Add support for generating C/C++ register access functions that can read and write coprocessor registers from AArch32 state.

  • Create a SHOULDER_MCR_IMPL accessor macro that implements a coprocessor register write using the MCR instruction
  • Create a SHOULDER_MCRR_IMPL accessor macro that implements a coprocessor register write using the MCRR instruction
  • Create a SHOULDER_MRC_IMPL accessor macro that implements a coprocessor register read using the MRC instruction
  • Create a SHOULDER_MRRC_IMPL accessor macro that implements a coprocessor register read using the MRRC instruction
  • Update the C header generator to use the above new macros
  • Update the C++ header generator to use the above new macros

Initial VMCS Support

Prove out the concept of generating accessor functions for VMCS fields in the Intel x86_64 architecture.

  • Add vmread and vmwrite access mechanism models
  • Add vmread and vmwrite access mechanism writers
  • Begin adding vmcs field definitions to the data directory. Doesn't have to be complete, just enough to prove the concept.

Missing info target

The banner printed by CMake states that make info is a target, but there is no info target.

System Register Immediate Access Mechanism Code Generation

Add support for generating C/C++ register access functions that can write system registers using immediate values from AArch64 state.

  • Create a SHOULDER_MSR_IMMEDIATE_IMPL accessor macro that implements a system register write using the MSR instruction and an immediate value
  • Update the C header generator to use the above new macros
  • Update the C++ header generator to use the above new macros

AbstractParser properties and test coverage

  • Remove AbstractParser unit tests
  • Exclude AbstractParser (or better, all AbstractBaseClasses) from test coverage
  • Try to move the current AbstractParser properties to the class-level

External Access Mechanism Code Generation

Add support for generating C/C++ register access functions that can read and write external (memory mapped) registers.

  • Create a SHOULDER_LDR_IMPL accessor macro that implements an external register read using the LDR instruction
  • Create a SHOULDER_STR_IMPL accessor macro that implements an external register write using the STR instruction
  • Update the C header generator to use the above new macros
  • Update the C++ header generator to use the above new macros

[RFC] Shoulder Parser Modules

This RFC proposes a design for parsing ARM specification documents into a format that Shoulder can use for further processing.

Overview

ARM provides specification documents that Shoulder will ingest as its primary input. There seem to be many different formats that Shoulder could potentially use to parse information about registers and instructions (XML files, ASL files, the files that Alastair's MRA tools generate, etc). We will initially focus on providing a parser for one such format (XML files), but will design Shoulder's parsing module keeping in mind the addition of new formats in the future.

Proposed Design

This RFC proposes the addition of a few new components to Shoulder: basic data model classes and parser modules.

Data model classes

The two basic data types that Shoulder will primarily deal with are registers and instructions. These will each be modeled as Python classes at the top level of the Shoulder package (shoulder.register and shoulder.instruction). Parser modules will be responsible for constructing these objects from some sort of input file (XML, etc), and generator modules will use these to generate functions (there will be a separate RFC for generator modules later).

Parser Modules

Since there are potentially many formats Shoulder could use as input, a common interface will be defined for Shoulder parsers to implement, allowing us to have parsers for many different formats. This will be implemented using a Python abstract base class, something like the following:

import abc

class AbstractParser(abc.ABC):
    @property
    @abc.abstractmethod
    def aarch_version_major(self):
        """ Major version of the ARM architecture specification
        supported by this parser """
        pass

    @property
    @abc.abstractmethod
    def aarch_version_minor(self):
        """ Minor version of the ARM architecture specification
        supported by this parser """
        pass

    @abc.abstractmethod
    def parse_registers(self, path):
        """ Parse a file at the given path to a register object(s) """
        return

    @abc.abstractmethod
    def parse_instructions(self, path):
        """ Parse a file at the given path to an instruction object(s) """
        return

Concrete parsers for different formats (XML, ASL, other) will then implement this interface:

class ArmV8XmlParser(AbstractParser):
    def __init__(self):
        pass

    @property
    def aarch_version_major(self):
        return 8

    @property
    def aarch_version_minor(self):
        return 0

    def parse_registers(self, path):
        pass

    def parse_instructions(self, path):
        pass

All parsers will be grouped into a submodule (subdirectory) of the Shoulder package named parser. The final directory structure will look like this:

shoulder/
    include/
    scripts/
    shoulder/
        parser/
            abstract_parser.py
            armv8_xml_parser.py
        register.py
        instruction.py
    test/

For this RFC, one parser implementation will be provided to parse the latest version of XML files provided on ARM's website (ARMv8.3 00bet6).

Testing

In addition to unit test coverage for each developed component, we will need to create an XML file that mocks the format of ARM's XML files. I don't believe that we can use ARM's files directly due to licensing differences, so one will have to be developed and checked into shoulder/test/support/.
