moov-io / watchman

AML/CTF/KYC/OFAC Search of global watchlist and sanctions

Home Page: https://moov-io.github.io/watchman/

License: Apache License 2.0

Go 90.82% Dockerfile 0.21% Makefile 0.85% Shell 0.73% HTML 0.34% JavaScript 6.99% CSS 0.06%
ofac kyc cip bis-denied-persons sanctions aml ctf bis csl sectoral-sanctions-identifications

watchman's Introduction

Project Documentation · API Endpoints (Admin Endpoints) · API Guide · Community · Blog

moov-io/watchman

Moov's mission is to give developers an easy way to create and integrate bank processing into their own software products. Our open source projects are each focused on solving a single responsibility in financial services and designed around performance, scalability, and ease of use.

Moov Watchman offers download, parse, and search functions over numerous trade sanction lists from the United States, agencies, and nonprofits for complying with regional laws. Also included is a web UI and an async webhook notification service to initiate processes on remote systems.

Lists included in search are:

All United States, UK and European Union companies are required to comply with various regulations and sanction lists (such as the US Patriot Act requiring compliance with the BIS Denied Persons List).

Project status

Moov Watchman is actively used in multiple production environments. Please star the project if you are interested in its progress. If you have built layers above Watchman to simplify tasks or perform business operations, or if you have found bugs, we would appreciate an issue or pull request. Thanks!

Usage

The Watchman project implements an HTTP server and Go library for searching, parsing, and downloading lists. Below, you can find a detailed list of features offered by Watchman:

  • Download OFAC, US/UK/EU CSL, BIS Denied Persons List (DPL), and various other data sources on startup
  • Index data for searches
  • Library for OFAC and BIS DPL data to download and parse their custom files

Docker

We publish a public Docker image, moov/watchman, on Docker Hub, or you can use this repository. No configuration is required to serve on :8084 and metrics at :9094/metrics in Prometheus format. We also publish Docker images for OpenShift as quay.io/moov/watchman. Lastly, we offer a moov/watchman:static Docker image with files from 2019. This image can be useful for faster local testing or consistent results.

Pull & start the Docker image:

docker pull moov/watchman:latest
docker run -p 8084:8084 -p 9094:9094 moov/watchman:latest

Get information about a company using their entity ID:

curl "localhost:8084/ofac/companies/13374"
{
   "id":"13374",
   "sdn":{
      "entityID":"13374",
      "sdnName":"SYRONICS",
      "sdnType":"",
      "program":[
         "NPWMD"
      ],
      "title":"",
      "callSign":"",
      "vesselType":"",
      "tonnage":"",
      "grossRegisteredTonnage":"",
      "vesselFlag":"",
      "vesselOwner":"",
      "remarks":""
   },
   "addresses":[
      {
         "entityID":"13374",
         "addressID":"21360",
         "address":"Kaboon Street, PO Box 5966",
         "cityStateProvincePostalCode":"Damascus",
         "country":"Syria",
         "addressRemarks":""
      }
   ],
   "alts":[
      {
         "entityID":"13374",
         "alternateID":"15056",
         "alternateType":"aka",
         "alternateName":"SYRIAN ARAB CO. FOR ELECTRONIC INDUSTRIES",
         "alternateRemarks":""
      }
   ],
   "comments":null,
   "status":null
}

Google Cloud Run

To get started in a hosted environment you can deploy this project to the Google Cloud Platform.

From your Google Cloud dashboard create a new project and call it:

moov-watchman-demo

Enable the Container Registry API for your project and associate a billing account if needed. Then, open the Cloud Shell terminal and run the following Docker commands, substituting your unique project ID:

docker pull moov/watchman
docker tag moov/watchman gcr.io/<PROJECT-ID>/watchman
docker push gcr.io/<PROJECT-ID>/watchman

Deploy the container to Cloud Run:

gcloud run deploy --image gcr.io/<PROJECT-ID>/watchman --port 8084

Set your target platform to 1, the service name to watchman, and the region to the one closest to you (enable the Google API service if a prompt appears). Upon a successful build you will be given a URL where the API has been deployed:

https://YOUR-WATCHMAN-APP-URL.a.run.app

Now you can ping the server:

curl https://YOUR-WATCHMAN-APP-URL.a.run.app/ping

You should get this response:

PONG

Configuration settings

Environmental Variable | Description | Default
DATA_REFRESH_INTERVAL | Interval for data redownload and reparse. off disables this refreshing. | 12h
INITIAL_DATA_DIRECTORY | Directory filepath with initial files to use instead of downloading. Periodic downloads will replace the initial files. | Empty
SEARCH_MAX_WORKERS | Maximum number of goroutines used for search. | 1024
ADJACENT_SIMILARITY_POSITIONS | How many nearby words to search for the highest similarity score. | 3
EXACT_MATCH_FAVORITISM | Extra weighting assigned to exact matches. | 0.0
LENGTH_DIFFERENCE_CUTOFF_FACTOR | Minimum ratio for the length of two matching tokens, before their score is penalised. | 0.9
LENGTH_DIFFERENCE_PENALTY_WEIGHT | Weight of penalty applied to scores when two matching tokens have different lengths. | 0.3
DIFFERENT_LETTER_PENALTY_WEIGHT | Weight of penalty applied to scores when two matching tokens begin with different letters. | 0.9
UNMATCHED_INDEX_TOKEN_WEIGHT | Weight of penalty applied to scores when part of the indexed name isn't matched. | 0.15
JARO_WINKLER_BOOST_THRESHOLD | Jaro-Winkler boost threshold. | 0.7
JARO_WINKLER_PREFIX_SIZE | Jaro-Winkler prefix size. | 4
LOG_FORMAT | Format for logging lines to be written as. Options: json, plain | plain
LOG_LEVEL | Level of logging to emit. Options: trace, info | info
BASE_PATH | HTTP path to serve API and web UI from. | /
HTTP_BIND_ADDRESS | Address to bind HTTP server on. This overrides the command-line flag -http.addr. | :8084
HTTP_ADMIN_BIND_ADDRESS | Address to bind admin HTTP server on. This overrides the command-line flag -admin.addr. | :9094
HTTPS_CERT_FILE | Filepath containing a certificate (or intermediate chain) to be served by the HTTP server. Requires all traffic be over secure HTTP. | Empty
HTTPS_KEY_FILE | Filepath of a private key matching the leaf certificate from HTTPS_CERT_FILE. | Empty
WEB_ROOT | Directory to serve web UI from. | webui/
WEBHOOK_MAX_WORKERS | Maximum number of workers processing webhooks. | 10
DOWNLOAD_WEBHOOK_URL | Optional webhook URL called when data downloads / refreshes occur. | Empty
DOWNLOAD_WEBHOOK_AUTH_TOKEN | Optional Authorization header included on download webhooks. | Empty

List configurations

Environmental Variable | Description | Default
OFAC_DOWNLOAD_TEMPLATE | HTTP address for downloading raw OFAC files. | https://www.treasury.gov/ofac/downloads/%s
DPL_DOWNLOAD_TEMPLATE | HTTP address for downloading the DPL. | https://www.bis.doc.gov/dpl/%s
EU_CSL_DOWNLOAD_URL | Use an alternate URL for downloading EU Consolidated Screening List | Subresource of webgate.ec.europa.eu
UK_CSL_DOWNLOAD_URL | Use an alternate URL for downloading UK Consolidated Screening List | Subresource of www.gov.uk
UK_SANCTIONS_LIST_URL | Use an alternate URL for downloading UK Sanctions List | Subresource of www.gov.uk
WITH_UK_SANCTIONS_LIST | Download and parse the UK Sanctions List on startup. | false
US_CSL_DOWNLOAD_URL | Use an alternate URL for downloading US Consolidated Screening List | Subresource of api.trade.gov
CSL_DOWNLOAD_TEMPLATE | Same as US_CSL_DOWNLOAD_URL |
KEEP_STOPWORDS | Boolean to keep stopwords in names. | false
DEBUG_NAME_PIPELINE | Boolean to print debug messages for each name (SDN, SSI) processing step. | false

Downloads

Moov Watchman supports sending a webhook (POST HTTP Request) when the underlying data is refreshed. The body will be the count of entities indexed for each list. The body will be in JSON format and the same schema as the manual data refresh endpoint.

Watching a specific customer or company by ID

Moov Watchman supports sending a webhook periodically when a specific Company or Customer is to be watched. This is designed to update another system about an OFAC entry's sanction status.

Watching a customer or company name

Moov Watchman supports sending a webhook periodically with a free-form name of a Company or Customer. This allows external applications to be notified when an entity matching that name is added to the OFAC list. The match percentage will be included in the JSON payload.

Prometheus metrics

  • http_response_duration_seconds: A histogram of HTTP response timings.
  • last_data_refresh_success: Unix timestamp of when data was last refreshed successfully.
  • last_data_refresh_count: Count of records for a given sanction or entity list.
  • match_percentages: A histogram which holds the match percentages with a label (type) of searches.
    • type: Can be address, q, remarksID, name, altName

Data persistence

By design, Watchman does not persist (save) any data about the search queries or actions created. The only storage occurs in memory of the process and upon restart Watchman will have no files or data saved. Also, no in-memory encryption of the data is performed.

Go library

This project uses Go Modules and Go v1.18 or newer. See Golang's install instructions for help setting up Go. You can download the source code and we offer tagged and released versions as well. We highly recommend you use a tagged release for production.

$ git clone git@github.com:moov-io/watchman.git

# Pull down into the Go Module cache
$ go get -u github.com/moov-io/watchman

$ go doc github.com/moov-io/watchman/client Search

In-browser Watchman search

Using our in-browser utility, you can instantly perform advanced Watchman searches. Simply fill in the search fields and generate a detailed report that includes match percentage, alternative names, effective/expiration dates, IDs, addresses, and other useful information. This tool is particularly useful for completing quick searches with the aid of an intuitive interface.

Reporting blocks to OFAC

OFAC requires annual reports of blocked entities and offers guidance for this report. Section 31 C.F.R. § 501.603(b)(2) requires this annual report.

Useful resources

Getting help

We maintain a runbook for common issues and configuration options. Also, if you've encountered a security issue please contact us at [email protected].

Channel | Info
Project Documentation | Our project documentation is available online.
Twitter @moov | You can follow Moov.io's Twitter feed to get updates on our project(s). You can also tweet us questions or just share blogs or stories.
GitHub Issue | If you are able to reproduce a problem please open a GitHub Issue under the specific project that caused the error.
moov-io slack | Join our slack channel to have an interactive discussion about the development of the project.

Supported and tested platforms

  • 64-bit Linux (Ubuntu, Debian), macOS, and Windows

Note: 32-bit platforms have known issues and are not supported.

Contributing

Yes please! Please review our Contributing guide and Code of Conduct to get started! Check out our issues for first time contributors for something to help out with.

Releasing

To make a release of watchman, simply open a pull request with CHANGELOG.md and version.go updated with the next version number and details. You'll also need to push the tag (e.g. git push origin v1.0.0) to origin in order for CI to make the release.

Testing

We maintain a comprehensive suite of unit tests and recommend table-driven testing when a particular function warrants several very similar test cases. To run all test files in the current directory, use go test. Current overall coverage can be found on Codecov.

Related projects

As part of Moov's initiative to offer open source fintech infrastructure, we have a large collection of active projects you may find useful:

  • Moov Fed implements utility services for searching the United States Federal Reserve System such as ABA routing numbers, financial institution name lookup, and FedACH and Fedwire routing information.

  • Moov Image Cash Letter implements Image Cash Letter (ICL) files used for Check21, X.9 or check truncation files for exchange and remote deposit in the U.S.

  • Moov Wire implements an interface to write files for the Fedwire Funds Service, a real-time gross settlement funds transfer system operated by the United States Federal Reserve Banks.

  • Moov ACH provides ACH file generation and parsing, supporting all Standard Entry Codes for the primary method of money movement throughout the United States.

  • Moov Metro 2 provides a way to easily read, create, and validate Metro 2 format, which is used for consumer credit history reporting by the United States credit bureaus.

License

Apache License 2.0 - See LICENSE for details.

watchman's People

Contributors

adamdecaf, atonks2, bkmoovio, ckelly-digicert, dependabot[bot], diginicho, docadam, fhwrdh, garrypolley, hampgoodwin, loghen41, m29h, nlakritz, oschwald, rayjlinden, renovate-bot, renovate[bot], snyk-bot, titmuscody, tombemogga, tomdaffurn, vxio, wadearnold

watchman's Issues

OFAC service can fail on start-up

If the OFAC service has a failure getting the sanctions list it will exit with a failure. I'm not sure this is the most ideal behavior.

Maybe it should keep trying? Maybe it can try from some back up locations? Not sure the best behavior but crashing does not seem ideal.

(Note: we are not currently persisting the DB. So maybe this is a problem only on initial run when db is created?)

docs: deploy new web interface on moov.io

We should replace the existing demo at moov.io/ofac/ with our new web UI. This involves shifting how that site is deployed and removes the auth requirement, but we should be able to do so in a straightforward way.

Support other lists

There are some other lists we need to monitor. Would be great if this software handled them as well:

  1. The UN Consolidated Sanctions List:
    https://www.un.org/securitycouncil/content/un-sc-consolidated-list

  2. The UK sanctions list:
    https://www.gov.uk/government/publications/financial-sanctions-consolidated-list-of-targets/consolidated-list-of-targets

  3. The EU sanctions list:
    https://webgate.ec.europa.eu/europeaid/fsd/fsf#!/files

The third one is slightly annoying because you need to create credentials to access the list.

OFAC dies when it can not load the OFAC info

The treasury site where this service downloads info is not the greatest about being always available.

Recently, a retry mechanism was added. This may have helped in some cases. However, we still see a lot of failures.

The big problem we have is that when it can not load the data the container dies. In our docker-compose environment we have nginx running in front of it and the container then kills our nginx container which means we have to restart the whole damn thing.

(This will be a little less of a problem in production because we could just leave old containers up until new containers successfully launch and load the data.)

At least in our sandbox environment it would be nice if we could load these files initially from disk. Then, worst case, the data is not fresh, but at least the container is up and running.

search: handle all address query parameters

We need to handle all search params (right now it's just req.Address). This needs to be done so we search across the metadata provided in each OFAC entry.

This includes the following fields/params: .City, .State, .Providence, .Zip, .Country

I assume searching the SDN entry for whatever is in City, State, Providence and Zip (as separate searches?) would be a good start.

I'm not exactly sure how to correlate Country though. Often records are part of the "IRAN" or "CUBA" sanction program, or the country shows up in the remarks column.

I realized this when reading over an OFAC settlement for shipping to Iran.

Possible match Exception or UnSafe

When a possible match occurs (above 85%) an endpoint should show the watch-list data that matched and the resulting score.
The endpoint should allow the possible match result to be marked as unsafe or add an exception to avoid repeat matches of the same entry.

unsafe: the customer/company can no longer make payments.

exception: the customer/company had a false positive.

An exception should include the time, an optional note, and ideally the user that made that decision.

spike: persistence/indexing into elasticsearch (and maybe sqlite)

A standalone instance of this app needs to download the OFAC files on its own and should refresh that copy after N hours. (N is configurable and likely defaults to 24h) This allows someone to start our app without the need for external dependencies and keeps the information up to date.

We can keep the files in temp storage close to the app. When the app restarts it can check the modification time of the files and if the files are too old download them again. This would help to prevent repeated downloads if the app is in a crash loop.

After reading the flat files we might want to persist the structured data in a database to allow for better queries, full text, etc. I think a SQL solution would be the best and we can start with sqlite since our other apps use that.

The spec for CSV files isn't too bad and can probably be directly mapped to a few tables. ent_num is used to join the tables together.

FORMAT SDN CSV

Main table, text file name SDN.CSV

Column
sequence Column name  Type     Size  Description
-------- ------------ -------  ----  ---------------------
1        ent_num      number         unique record
                                     identifier/unique
                                     listing identifier
2        SDN_Name     text     350   name of SDN
3        SDN_Type     text     12    type of SDN
4        Program      text     50    sanctions program name
5        Title        text     200   title of an individual
6        Call_Sign    text     8     vessel call sign
7        Vess_type    text     25    vessel type
8        Tonnage      text     14    vessel tonnage
9        GRT          text     8     gross registered tonnage
10       Vess_flag    text     40    vessel flag
11       Vess_owner   text     150   vessel owner
12       Remarks      text     1000  remarks on SDN*

Address table, text file name ADD.CSV

Column
sequence Column name  Type     Size  Description
-------- ------------ -------  ----  ---------------------
1        Ent_num      number         link to unique listing
2        Add_num      number         unique record identifier
3        Address      text     750   street address of SDN
4        City/        text     116   city, state/province, zip/postal code
         State/Province/
         Postal Code
5        Country      text     250   country of address
6        Add_remarks  text     200   remarks on address

Alternate identity table, text file name ALT.CSV

Column
sequence Column name  Type     Size  Description
-------- ------------ -------  ----  ---------------------
1        ent_num      number         link to unique listing
2        alt_num      number         unique record identifier
3        alt_type     text     8     type of alternate identity
                                     (aka, fka, nka)
4        alt_name     text     350   alternate identity name
5        alt_remarks  text     200   remarks on alternate identity

admin: generalize admin data refresh endpoint path

#95 adds BIS Denied Persons List (DPL) data into OFAC search endpoints. The admin endpoint is currently /ofac/refresh, but we should rename it as /data/refresh since the underlying code updates DPL data as well.

For the time being we should support both endpoints so we don't break backward compatibility.

search: ignore stop words in matching

Stop words are typically useless words in sentences and are also the most frequent. They're excluded because the verb, noun, adjectives, etc contain information which the speaker weights more heavily over these grammatical nuances.

There's a Go library which looks to be set up for removing and ranking strings according to a language's common stop words. We'd need to add support for Jaro-Winkler.

https://github.com/bbalet/stopwords / LevenshteinDistance
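
To make the idea concrete, here is a minimal sketch of stop word removal applied to a name before it is compared. It uses a tiny hard-coded English word list chosen for this example rather than the library linked above, and the function name `removeStopwords` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// stopwords is a deliberately small illustrative set, not a complete list.
var stopwords = map[string]bool{
	"a": true, "an": true, "and": true, "for": true,
	"in": true, "of": true, "the": true, "to": true,
}

// removeStopwords lowercases name, splits it on whitespace, and drops stop
// words. If every word is a stop word the normalized name is kept unchanged,
// so a name never becomes empty.
func removeStopwords(name string) string {
	fields := strings.Fields(strings.ToLower(name))
	kept := make([]string, 0, len(fields))
	for _, f := range fields {
		if !stopwords[f] {
			kept = append(kept, f)
		}
	}
	if len(kept) == 0 {
		return strings.Join(fields, " ")
	}
	return strings.Join(kept, " ")
}

func main() {
	fmt.Println(removeStopwords("SYRIAN ARAB CO. FOR ELECTRONIC INDUSTRIES"))
	fmt.Println(removeStopwords("The Bank of Example"))
}
```

Both the indexed names and the query would need the same normalization, otherwise the similarity score compares strings of different shapes.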

search: filter/handle weak vs strong results

From their docs:

Am I required to screen for weak aliases (AKAs)?

OFAC's regulations do not explicitly require any specific screening regime. Financial institutions and others must make screening choices based on their circumstances and compliance approach. As a general matter, though, OFAC does not expect that persons will screen for weak AKAs, but expects that such AKAs may be used to help determine whether a hit arising from other information is accurate.

Will I be penalized for processing an unauthorized transaction involving a weak alias (AKA)?

A person who processes an unauthorized transaction involving an SDN has violated U.S. law and may be subject to an enforcement action. Generally speaking, however, if (i) the only sanctions reference in the transaction is a weak AKA, (ii) the person involved in the processing had no other reason to know that the transaction involved an SDN or was otherwise in violation of U.S. law, and (iii) the person maintains a rigorous risk-based compliance program, OFAC will not issue a civil penalty against an individual or entity for processing such a transaction.

Add BIS - Denied Persons List to search

The Bureau of Industry and Security (BIS) of the U.S. Department of Commerce is responsible for the regulation of exports for national security, foreign policy, and nonproliferation reasons, as well as the enforcement of those regulations. The BIS Denied Persons List (formerly known as the Denial Orders List or the DOL List) lists individuals who have been denied export privileges in whole or in part. BIS Denied Person list is a key list to be scanned against in order to comply with the USA PATRIOT Act.

In addition to enforcing the Export Administration Regulations (EAR), BIS export compliance administers several restricted party lists that apply to export and reexport transactions. BIS maintains three restricted party lists: the Denied Persons List, the Entity List and the Unverified List.

BIS lists, which change frequently, include:

  • BIS Denied Persons List download
    The BIS Denied Persons List includes the names of individuals and companies that have been denied export privileges by BIS, usually due to a violation of U.S. export control laws. U.S. persons and companies are generally prohibited from engaging in export transactions with parties named on the Denied Persons List.

  • BIS Entity List
    The BIS Entity List identifies the names of companies, individuals, government agencies and research institutions that trigger export and reexport license requirements. U.S. companies need to ensure that the appropriate export licenses are in place before proceeding with transactions with parties on the BIS Entity List.

  • BIS Unverified List
    This list includes the names of foreign parties for which BIS has been unable to conduct a pre-license check or post-shipment verification. Potential transactions with parties on the Unverified List are a “red flag” that must be addressed and resolved before proceeding with the export.


Adding the BIS Denied Persons List expands the scope of this project from OFAC to identity verification. Competitive products in the market that are sold as OFAC checks usually include additional public lists that augment the OFAC check by starting with OFAC’s SDN list and then searching:

  • BIS Denied Persons
  • FBI’s Most Wanted List
  • Debarred Parties list
  • Death Master File

All lists are used to help verify the identity of the sending or receiving company.

periodic refresh of OFAC data

We need to periodically refresh the OFAC data downloaded. I'm unsure if there's a specific maximum set of OFAC standards, but offering an override via environmental variable is also needed. (Example: OFAC_REFRESH_INTERVAL=240m)

This would use the same underlying download/index logic as #9

Add DB persistence to data models

Currently, all data is stored in SQLite in memory. Customers, Companies, and the event log should be configurable to be stored in a restart-persistent data store such as MySQL or Postgres.

cmd/server: Prometheus metric for last refresh timestamp

OFAC's server should expose a Prometheus gauge for the last unix timestamp that data was successfully refreshed. This can be used with an alert pretty easily. (example below)

groups:
 - name: ./ofac.rules
   rules:
    - alert: OFACStaleData
      expr: (time() - ofac_data_last_refresh) > 60*60*12 # last refresh more than 12 hours ago
      for: 1h
      labels:
        severity: warning
      annotations:
        description: "OFAC stale data, last refresh {{ humanizeDuration $value }} ago"

Results inconsistent with Treasury.gov

Using this service (or the example available at https://moov.io/ofac/) does not match results from the treasury's official SDN search at https://sanctionssearch.ofac.treas.gov/.

This service returns way too many unrelated results with a high match value. For example, searching "George Bush" returns an 89% match to "George Habbash", 88% match to "George Haswani", 84% match to "George Chiweshe":

image

However, searching the same name with an 80% fuzzy match threshold on the Treasury's own OFAC search website yields no results:

image

I love the open source fintech toolbox. Unfortunately these tools are not high enough quality to use in production at a financial institution today. I look forward to helping to improve the library.

Thanks!

cmd/server: support TLS in HTTP server

We need to support TLS in our HTTP server. This protects requests from inspection along the various network paths and routing. It's also required as part of several guidelines and audit requirements.

spike: API and service interactions

This issue is to discuss the API workings and how other Moov services talk with ofac.

I'm going to propose the endpoints are written like auth and paygate, which is along these lines with their routing logic defined nearby (file-wise). The go-kit endpoints are okay, but casting to and from interface{} has drawbacks.

The endpoint specifics are over in #1, so I'm not going to discuss that here.

Authentication works the following way in paygate:

An HTTP call like GET /v1/depositories/:id with a cookie or OAuth token will hit our LB (traefik) and a "forward auth" call gets made from traefik to our auth service. The cookie or OAuth token is checked, and if valid '200 OK' is returned to traefik. Only on that '200 OK' is the actual request proxied to paygate (or in this case ofac).

I'd like to set up the ofac API in a similar way. It removes the need for the app to care about auth and can be enforced pretty easily. (We can move that logic into a shared library.)

Finding an HTTP address for paygate -> ach interactions is easy enough for local dev and kubernetes so let's re-use that logic.

The custom client in paygate's pkg/achclient was needed because our generated go client assumes too much about hitting api.moov.io and LB routing. We needed specific service-to-service interaction ignoring that. We might need the same for ofac, but I'd like to try generating and importing an OpenAPI spec from ofac's source tree in our main api.moov.io OpenAPI spec.

There are some common Prometheus metrics exported by their library and I've added a few to each service. I assume ofac will have metrics to export, e.g. ofac_last_refreshed{type="sdn"} 1213212515

endpoint for comparing strings with a specific algorithm

I thought of having an endpoint like GET /compare?s1=..&s2=...[&algorithm=soundex] which returns the similarity percentage between s1 and s2. This would allow dynamic debugging of two strings.

Would this be useful? Please comment below or vote 👍 / 👎 thanks!

Retry mechanism could be better

I think when the ofac site is having trouble it has bigger troubles than a retry will solve! :(

Also, I see this error when it fails:
{"caller":"main.go:135","main":"ERROR: failed to download/parse initial OFAC data: ERROR: downloading OFAC and DPL data: OFAC: problem downloading (matched=add.csv, alt.csv, sdn.csv, sdn_comments.csv missing=dpl.txt): err=\u003cnil\u003e","ts":"2019-08-07T04:40:48.491283209Z"}

Something weird is happening with the error field.

An option to load from disk or something would be nice for dev...

latest ofac version requires an invalid go4.org pseudo-version

In commit 9844e58, the go4.org module requirement was updated to an invalid pseudo-version. In commit 6e70f24, a replace directive was added with the correct pseudo-version.

A problem is that the replace directive only applies when this is the main module, so it doesn't help other projects that try to require and use this module.

You can reproduce by doing:

$ go version
go version go1.13.4 darwin/amd64
$ cd $(mktemp -d)
$ go mod init example.com/m
go: creating new go.mod: module example.com/m
$ go get github.com/moov-io/ofac@latest
go: finding github.com/moov-io/ofac v0.11.1
go: downloading github.com/moov-io/ofac v0.11.1
go: extracting github.com/moov-io/ofac v0.11.1
go get: github.com/moov-io/ofac@v0.11.1 requires
	[email protected]: invalid pseudo-version: does not match version-control timestamp (2019-03-13T08:23:47Z)

The right fix would be to remove the replace directive and change the required version to be valid.

Add Type filter for the search

One of the important filters we will need is the Type filter that lets us distinguish individuals from other types of entities. I'd be happy to add it to the UI once the search API supported this.

image

Audit Log of list updates and delta of changes

Keep a log of when the different sanction lists were updated and the changes each new list introduced. We should be able to retrieve a last updated date from a service along with a history of changes.

Example:

Source Description Updated On # of records
OFAC (SDN) Specially Designated Nationals List 01/27/2019 32,173
OFAC (OFCL) Consolidated List 11/20/17 2,546

Log Entries

Date                  Action
2019-01-31 07:25:30   Updated OFAC database, including 7,414 entries from the SDN list and 366 entries from the Consolidated Non-SDN list. Total of 7,780.
2019-01-28 07:25:25   Updated OFAC database, including 7,413 entries from the SDN list and 366 entries from the Consolidated Non-SDN list. Total of 7,779.
2019-01-23 07:25:43   Updated OFAC database, including 7,410 entries from the SDN list and 366 entries from the Consolidated Non-SDN list. Total of 7,776.
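
As a rough illustration of the delta such a log could record, the sketch below compares two snapshots of a list keyed by entity ID. The `Delta` type and the `diff` function are hypothetical names made up for this example, not part of the project.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Delta is a hypothetical audit-log record describing how a list changed
// between two downloads. It is not a type from the project.
type Delta struct {
	List      string
	UpdatedAt time.Time
	Added     []string // entity IDs present only in the new snapshot
	Removed   []string // entity IDs present only in the old snapshot
	Total     int      // number of records in the new snapshot
}

// diff compares two snapshots, each represented as a set of entity IDs.
// The caller is expected to fill in UpdatedAt.
func diff(list string, prev, next map[string]bool) Delta {
	d := Delta{List: list, Total: len(next)}
	for id := range next {
		if !prev[id] {
			d.Added = append(d.Added, id)
		}
	}
	for id := range prev {
		if !next[id] {
			d.Removed = append(d.Removed, id)
		}
	}
	// Map iteration order is random, so sort for stable log output.
	sort.Strings(d.Added)
	sort.Strings(d.Removed)
	return d
}

func main() {
	prev := map[string]bool{"100": true, "101": true}
	next := map[string]bool{"101": true, "102": true, "103": true}

	d := diff("SDN", prev, next)
	d.UpdatedAt = time.Now()
	fmt.Printf("%s %s: +%d -%d total=%d\n",
		d.UpdatedAt.Format("2006-01-02 15:04:05"), d.List, len(d.Added), len(d.Removed), d.Total)
}
```

Storing one such record per download would give both the last-updated date and the history of changes.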

Add: EU Consolidated list of sanctions

The consolidated list of persons, groups and entities subject to EU financial sanctions can be downloaded from Financial Sanctions Database - FSF platform accessible via the following address: https://webgate.ec.europa.eu/europeaid/fsd/fsf#!/files

In order to access the FSF platform, you need to have an "EU Login" account.
Please follow the instructions provided on the EU Login page displayed when you click on the above link.

ofactest: failure in Kubernetes environment

ofactest works fine locally, so I'm not sure what the problem is in Kubernetes.

$ kubectl logs -n apps ofactest-1550773800-c4jhf
2019/02/21 18:32:59.625924 main.go:57: Starting moov/ofactest v0.5.2
2019/02/21 18:32:59.625949 main.go:83: [INFO] using http://ofac.apps.svc.cluster.local:8080/ for address
2019/02/21 18:32:59.731940 main.go:101: [SUCCESS] ping
2019/02/21 18:32:59.733422 main.go:108: [SUCCESS] last download was: 6m41s ago
2019/02/21 18:32:59.751252 main.go:121: [SUCCESS] name search passed, query="alh"
2019/02/21 18:32:59.752313 main.go:133: [FAILURE] problem adding company watch: addCompanyWatch: 405 Method Not Allowed

Support watches on OFAC data

A Watch represents an automated notification when OFAC data is matched, either as part of an online or async re-search. Watches let consumers of OFAC get a callback on an HTTPS endpoint.

spike: string matching algorithms

OFAC searches are inherently messy and complicated because they deal with people's real names and/or aliases. This means there isn't a one-size-fits-all algorithm we could apply; instead, we need to offer our customers lots of options.

In short, our search endpoint(s) should be able to reflect multiple string comparison algorithms:

POST /v1/ofac/search/name?algos=levenshtein,exact
[
 {
   // ...
 }
]

We should also support no algos parameter (or a value of all) to run all string comparison algorithms. All searches run across aliases and real names from the OFAC list.

The initial list of algorithms could include:

With many other possible algorithms,

I think the result body for an algorithm should be:

{
  "algorithm": "hamming",
  "score": 0.95
}
  • algorithm: the lowercase name of the algorithm, taken from an enumeration of all supported algorithms.
  • score: a normalized string-match score between 0 and 1 (e.g. 0.95 -> 95% match).
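
To make the proposed result body concrete, here is a small sketch that scores two names with Levenshtein edit distance. The normalization used (1 minus the distance divided by the longer string's length) is an assumption, since the proposal does not say how the 0-1 score is derived. `AlgorithmResult`, `levenshtein`, and `score` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// AlgorithmResult mirrors the proposed result body.
type AlgorithmResult struct {
	Algorithm string  `json:"algorithm"`
	Score     float64 `json:"score"`
}

// levenshtein returns the edit distance between a and b, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			// deletion, insertion, substitution
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// score compares case-insensitively and normalizes the distance to [0, 1].
// Assumption: score = 1 - distance / length of the longer string.
func score(query, name string) AlgorithmResult {
	q, n := strings.ToLower(query), strings.ToLower(name)
	longest := len([]rune(q))
	if l := len([]rune(n)); l > longest {
		longest = l
	}
	if longest == 0 {
		return AlgorithmResult{Algorithm: "levenshtein", Score: 1}
	}
	s := 1 - float64(levenshtein(q, n))/float64(longest)
	return AlgorithmResult{Algorithm: "levenshtein", Score: s}
}

func main() {
	out, err := json.Marshal(score("maduro", "Madura"))
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Other algorithms could return the same shape with a different `algorithm` value, so a client can compare scores side by side.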

Endpoint Proposal

Here are my initial thoughts for OFAC endpoints.

We'd serve all endpoints at https://api.moov.io/v1/ofac/

POST /companies/:id/watch
- monitor company by id

POST /companies/watch?name=...
- monitor company by name, re-parsed on each search

GET /companies/:id
- get company information and matches

PUT /companies/:id
- mark company as blocked or unblocked

DELETE /companies/:id/watch
- stop watching company

DELETE /companies/watch/:watchId
- stop watching company by watchId

POST /customers/:id/watch
- monitor customer

POST /customers/watch?name=...
- monitor customer by name, re-parsed on each search

GET /customers/:id
- get customer information and matches

PUT /customers/:id
- mark customer as blocked or unblocked

DELETE /customers/:id/watch
- stop watching customer

DELETE /customers/watch/:watchId
- stop watching customer by watchId


POST /search/address
- Search for address records matching the given search criteria.

POST /search/name?k=v
- fuzzy name search
- See: https://github.com/moov-io/ofac/issues/6

POST /search/alt?k=v
- fuzzy alternate name search

POST /search/company?k=v
- fuzzy company name search


GET /sdn/:id/addresses
- get addresses for a given SDN

GET /sdn/:id/alternateNames
- get alternate names for a given SDN

GET /sdn/:id
- get SDN information

OFAC not returning matches

I've been trying to get ofac to work, but it doesn't seem to be returning the right results:

  1. Go to https://sanctionssearch.ofac.treas.gov/ and search name="Maduro" (the Venezuelan dictator) and you will get 4 results with 100% score.
  2. Go to https://moov.io/ofac/ and search "Maduro" and you won't get those 4 results but a bunch of unrelated results

Also if you search by the National ID Number you won't get any results:

  1. Go to https://sanctionssearch.ofac.treas.gov/ and search ID#="5892464" and you will get the right result of "Maduro"
  2. Go to https://moov.io/ofac/ and search "5892464" and just a bunch of nonsense is returned. (Also tried using a recent ofac docker image with search/q=589246)

bug: replace their null character

The OFAC spec (for CSV) specifies the following as their null character. We should replace this with an empty string after/during reading.

null:                          -0-
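
One way this could look while reading the CSV, as a sketch: each field equal to the `-0-` placeholder is swapped for an empty string as records are read. The `cleanRecord` and `readAll` names and the sample row are made up for illustration. Trimming surrounding whitespace before comparing is an assumption about how the placeholder appears in the files.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// nullValue is the null placeholder from the OFAC CSV spec quoted above.
const nullValue = "-0-"

// cleanRecord returns a copy of record with null placeholders replaced by "".
// Assumption: the placeholder may be padded with spaces, so trim first.
func cleanRecord(record []string) []string {
	out := make([]string, len(record))
	for i, field := range record {
		if strings.TrimSpace(field) == nullValue {
			out[i] = ""
		} else {
			out[i] = field
		}
	}
	return out
}

// readAll parses CSV input and cleans every record during reading.
func readAll(input string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	for i := range records {
		records[i] = cleanRecord(records[i])
	}
	return records, nil
}

func main() {
	// Made-up sample row, not real list data.
	records, err := readAll("1,\"EXAMPLE NAME\",-0- ,\"EXAMPLE PROGRAM\",-0-\n")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("%q\n", records[0])
}
```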
