
owasp / www-project-machine-learning-security-top-10

OWASP Machine Learning Security Top 10 Project

Home Page: http://owasp.org/www-project-machine-learning-security-top-10/

License: Other

Ruby 0.11% HTML 69.48% SCSS 0.58% Nix 29.83%
ml mlops mlsec owasp owasp-top-10 ai appsec mlai mlsecops mlsecurity

www-project-machine-learning-security-top-10's Introduction

OWASP Machine Learning Security Top 10

OWASP Incubator License: CC BY-SA 4.0

Welcome to the repository for the OWASP Machine Learning Security Top 10 project!

Overview

The primary aim of the OWASP Machine Learning Security Top 10 project is to deliver an overview of the top 10 security issues of machine learning systems. More information on the project scope and target audience is available in our project working group charter.

Contribution

The initial version of the Machine Learning Security Top 10 list was contributed by Sagar Bhure and Shain Singh. The project encourages community contribution and aims to produce a high quality deliverable reviewed by industry peers.

All contributors will need to adhere to the project's code of conduct. Please use the following form for any feedback, suggestions, issues or questions.

Getting Started

The project has a wiki which provides information to help you get started with contributing.

Contributors ✨

This project follows the all-contributors specification. Contributions of any kind welcome!

Thanks goes to these wonderful people (emoji key):

Sagar Bhure

💻 📖 👀 💬 🖋 🔬 📣
Shain Singh

💻 📖 👀 💬 🖋 📣 📆
Rob van der Veer

👀 💻 📖 💬 📣
M S Nishanth

💻 💬
Rick M

💻
Harold Blankenship

💻
RiccardoBiosas

💻
Aryan Kenchappagol

📖 💻 💬 📣
Mikołaj Kowalczyk

💻 📖 💬 📣
Adit Nugroho

💻 📖

License

This project is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License.

www-project-machine-learning-security-top-10's People

Contributors

adityoari, aryanxk02, hblankenship, kingthorin, mik0w, msnishanth9001, nextgensec-github, owaspfoundation, riccardobiosas, sagarbhure, shsingh, yusufmunircloud


www-project-machine-learning-security-top-10's Issues

[FEEDBACK]: Consider excessive agency

Type

General Feedback

What would you like to report?

The LLM Top 10 mentions excessive agency because it is important to limit privileges and autonomy, and to have oversight over LLMs. This is a general AI problem.
One could argue whether this is a security risk; I would argue that it is, because just as AI models are unpredictable, they may also have been manipulated.
I believe the ML Top 10 also needs excessive agency.
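
As a rough sketch of the kind of privilege limiting this argues for (all names here are hypothetical, invented for illustration, not from any OWASP deliverable): an agent's proposed actions are checked against an allow-list, and high-risk actions are queued for human review.

```python
# Hypothetical sketch of limiting an agent's autonomy. ALLOWED_ACTIONS and
# HUMAN_APPROVAL_ACTIONS are invented names for illustration only.
ALLOWED_ACTIONS = {"search_docs", "summarize_text"}       # low risk: autonomous
HUMAN_APPROVAL_ACTIONS = {"send_email", "delete_record"}  # high risk: gated

def dispatch(action: str, payload: dict) -> str:
    """Refuse anything off the allow-list; queue risky actions for review."""
    if action in ALLOWED_ACTIONS:
        return f"executing {action} with {payload}"
    if action in HUMAN_APPROVAL_ACTIONS:
        return f"queued {action} for human approval"
    return f"rejected unknown action {action!r}"

print(dispatch("send_email", {"to": "user@example.com"}))
# -> queued send_email for human approval
```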

Code of Conduct

  • I agree to follow this project's Code of Conduct

fix: merge review from @robvanderveer

the following is an initial review taken from Slack logs: https://owasp.slack.com/archives/C04PESBUWRZ/p1677192099712519

by @robvanderveer


Dear all,
I did a first scan through the list to mainly look at taxonomy. Here are my remarks.
1. ML01: In the literature the term 'adversarial' is often used for input manipulation attacks, but also for data poisoning, model extraction, etc. Therefore, to avoid confusion, it is probably better to rename the ML01 adversarial attack entry to input manipulation.
2. It is worth considering adding 'model evasion', aka black-box input manipulation, to your top 10. Or do you prefer to have one entry for input manipulation altogether?
3. ML03: It is not clear to me how scenarios 1 and 2 work; I must be missing something. Usually model inversion is explained by manipulating synthesized faces until the algorithm behaves like it recognizes the face.
4. ML04: It is not clear to me how scenario 1 works. Standard methods against overtraining are missing from the 'how to prevent' part (a sketch of such methods follows this list). Instead the advice is to reduce the training set size, which typically increases the overfitting problem.
5. ML05: Model stealing describes a scenario where an attacker steals model parameters, but generally this attack takes place by way of black-box access: gathering input-output pairs and training a new model on them.
6. ML07: I don't understand exactly how the presented scenario should work. I do know about the scenario where a pre-trained model was obtained that has been altered by an attacker. This matches the description.
7. ML08: Isn't model skewing the same as data poisoning? If there is a difference, it is not apparent to me from the scenario and description.
8. ML10 is called Neural Net Reprogramming, but I guess the attack of changing parameters will work on any type of algorithm, not just neural networks (the second sketch after this list illustrates this). The description also mentions changing the training data, but perhaps that is better left out to avoid confusion with data poisoning.
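
On remark 4, a minimal sketch of the standard overfitting countermeasures the review refers to: constrain the model with regularisation and select it with cross-validation, rather than shrinking the training set. This uses scikit-learn on synthetic data and is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Stronger L2 regularisation (smaller C) constrains the model instead of
# reducing how much data it sees.
for C in (100.0, 1.0, 0.01):
    clf = LogisticRegression(C=C, max_iter=1000)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"C={C:>6}: mean CV accuracy {score:.3f}")
```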
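
On remark 8, a small illustration that parameter tampering is not specific to neural networks: negating the learned weights of a plain scikit-learn logistic regression flips its decisions. Synthetic data; a sketch, not a real attack path.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
print("accuracy before tampering:", model.score(X, y))

# An attacker with write access to the stored model flips the decision
# boundary by negating the learned weights.
model.coef_ = -model.coef_
model.intercept_ = -model.intercept_
print("accuracy after tampering: ", model.score(X, y))
```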

feat(rendering): make PDF output from Markdown files more presentable

The Top 10 list is being rendered using Markdown at https://mltop10.info

The site is being rendered using Quarto and the files from https://github.com/OWASP/www-project-machine-learning-security-top-10/tree/master/docs are mirrored to https://github.com/mltop10-info/mltop10.info

Currently a manual process is run locally for https://github.com/mltop10-info/mltop10.info to render the HTML and PDF outputs, which are stored in https://github.com/mltop10-info/mltop10.info/tree/main/docs and used by GitHub Pages.

The PDF rendering currently uses the default LaTeX method; an example is at https://github.com/mltop10-info/mltop10.info/blob/main/docs/OWASP-Machine-Learning-Security-Top-10.pdf

Quarto has a lot of formatting options for generating PDF, and these need to be explored to make the PDF and ePub outputs more presentable.

[FEEDBACK]: sync master and dev branch.

Type

General Feedback

What would you like to report?

I would like to report the following issue/feedback

Code of Conduct

  • I agree to follow this project's Code of Conduct

add: Create a RELATED.md to list similar projects and SIGs

Type

Website Issue Report

What would you like to report?

This applies to both the website and documentation content.

  • create a page which lists similar:
    • OWASP projects
    • other committees and SIGs (LF, OpenSSF, CSA)

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Leaking pipeline is missing

Type

General Feedback

What would you like to report?

The risk of leaking training data, and other confidentiality issues of the AI pipeline (code, model parameters), are missing. One concrete form of this leakage is sketched below.
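
A minimal sketch, assuming a confidence-gap membership inference test: an overfit model is noticeably more confident on its training records than on unseen ones, which lets an attacker guess whether a given record was in the training set. Synthetic data, scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

def true_label_confidence(m, X, y):
    # probability the model assigns to the true label of each record
    return m.predict_proba(X)[np.arange(len(y)), y]

print("mean confidence, members:    ", true_label_confidence(model, X_tr, y_tr).mean())
print("mean confidence, non-members:", true_label_confidence(model, X_te, y_te).mean())
# A large gap between the two is the leakage signal.
```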

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Rename 'Corrupted packages' to 'AI supply chain attacks'

Type

Suggestions for Improvement

What would you like to report?

I believe 'packages' is too specific a term for the problem of supply chain attacks. Calling it 'supply chain attacks' will make the reader aware of the risk that any external component in the AI pipeline can be manipulated.
Also, add 'data' as a potential supply chain risk, referring to 'data poisoning' for it, and add 'model', referring to the transfer learning attack. A sketch of a basic integrity check follows.
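
A minimal sketch of one mitigation this implies: pin and verify a SHA-256 digest for any externally sourced component (package archive, dataset, or pre-trained model) before loading it. The file name and digest below are placeholders.

```python
import hashlib
from pathlib import Path

# Placeholder: replace with the known-good digest published out of band.
PINNED_SHA256 = "replace-with-known-good-digest"

def verify_artifact(path: str, expected: str) -> None:
    """Refuse to use an artifact whose digest does not match the pinned value."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    if digest != expected:
        raise RuntimeError(f"{path}: digest mismatch, refusing to load")

verify_artifact("model.onnx", PINNED_SHA256)  # hypothetical file name
```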

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Rename Neural Net Reprogramming to Model poisoning

Type

General Feedback

What would you like to report?

Corrupting/manipulating model parameters is a general threat, referred to as model poisoning, and is not restricted to neural networks.

Code of Conduct

  • I agree to follow this project's Code of Conduct

Model stealing through interaction is not mentioned

The current model stealing entry only describes the model being stolen through its parameters, but the model can also be stolen by presenting inputs, capturing the outputs, and using those pairs to train your own model. See the AI guide.
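
A minimal sketch of that interaction-based theft, with a local model standing in for the victim's prediction API (in a real attack only the API would be visible). Synthetic data, scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
victim = GradientBoostingClassifier(random_state=0).fit(X, y)

# The attacker samples queries and labels them with the victim's predictions...
queries = np.random.default_rng(0).normal(size=(5000, 10))
stolen_labels = victim.predict(queries)

# ...then trains a surrogate that mimics the victim without seeing its weights.
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)
print("agreement with victim:", (surrogate.predict(X) == victim.predict(X)).mean())
```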

[FEEDBACK]: Risk Ranking Reference

Type

Documentation Issue Report

What would you like to report?

Hi team,
I would like to focus on the missing information related to the risk ranking of the Top 10 at the start of the page, i.e. the table mentioned in every vulnerability.

It would be a great resource if added, so that people can relate to the risk associated with each vulnerability.

@sagarbhure

Code of Conduct

  • I agree to follow this project's Code of Conduct

[Fortnightly] Working Group Meeting - 2023-Jul-20

Date:

2023-Jul-20 06:00 UTC (11:30 Hyderabad, 16:00 Melbourne)

Attendees:

  • Alejandro Saucedo
  • John Sotiropoulos
  • Sagar Bhure
  • Shain Singh

Notes:

  • Discuss project reboot, historical information on the project
  • Agreement on setting up cadence and getting project information in order
    • project charter
    • defined goals
    • create sprints and roadmap to keep momentum and attract regular and new contributors
    • information on how to contribute

Action Items:

  • set up a regular meeting
  • create documentation/wiki to get the project charter and other documentation created

create wiki

  • home page as landing page for where to find information
  • charter (either page in wiki or link to markdown in repository)
  • contributing
  • meeting information

[Fortnightly] Working Group Meeting - 2023-Sep-14

Current agenda

  1. General project status: v0.3 in progress
  2. Notable PRs completed since last meeting
  3. Notable discussions
  4. Meetings:
     • WG meeting will move forward a few hours to accommodate EU morning time zones
  5. Contributions and current help wanted
  6. Introductions (for new contributors)

Discussions

Calendar Event

Download calendar event (ICS)

[FEEDBACK]: Integrate model skewing into data poisoning

Type

Suggestions for Improvement

What would you like to report?

Model skewing requires altering training data, making it a form of data poisoning. Therefore it is probably better to integrate the two threats. A small sketch of the overlap follows.
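
A minimal sketch of the overlap: "skewing" the model toward one class is achieved exactly by altering (poisoning) training labels. Synthetic data, scikit-learn; the poison rate is illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Poison 30% of training labels to skew the model toward class 1.
rng = np.random.default_rng(0)
poisoned = y_tr.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1
skewed = LogisticRegression(max_iter=1000).fit(X_tr, poisoned)

print("clean accuracy: ", clean.score(X_te, y_te))
print("skewed accuracy:", skewed.score(X_te, y_te))
```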

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Make ML06 more precise and with more Attack Scenarios

Type

Suggestions for Improvement

What would you like to report?

Re-thinking and re-writing ML06 - corrupted packages

The description of ML06 is quite limited given how complicated software supply chains are, especially for software that uses ML.

In the summary of the vulnerability it is written: "This type of attack can be particularly dangerous as it can go unnoticed for a long time, since the victim may not realize that the package they are using has been compromised. The attacker's malicious code could be used to steal sensitive information, modify results, or even cause the machine learning model to fail." Meanwhile, the Detectability section under Risk Factors says that this kind of vulnerability is easy to detect.

What is more, nothing is said about countermeasures such as SBOM/MLBOM in the description of this vulnerability. In my opinion that should be included; a toy illustration follows.
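
A toy illustration of the SBOM/MLBOM idea (real tooling such as CycloneDX emits a richer, standardised format): record the exact name and version of every installed package so the pipeline's dependencies can be audited later.

```python
import json
from importlib.metadata import distributions

# Enumerate installed distributions and record name + version for each.
inventory = sorted(
    ({"name": d.metadata["Name"], "version": d.version} for d in distributions()),
    key=lambda pkg: str(pkg["name"]).lower(),
)
print(json.dumps(inventory[:5], indent=2))  # first few entries only
```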

There are plenty of resources that should be analyzed and used for the description of this specific vulnerability:

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Rename adversarial attack to something less ambiguous

Type

Suggestions for Improvement

What would you like to report?

The term 'adversarial attack' usually has a broader definition than the intention of ML01; for example, it usually includes data poisoning.
The intention seems to be what is more often called an 'evasion attack'. The problem with that term is that it usually means small changes to the input. This is why in the AI guide we used the term 'input manipulation', which is clearer. A sketch of such an attack follows.
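
A minimal sketch of such an input manipulation on a linear scikit-learn model: a small step along the sign of the weights (the same idea FGSM applies with gradients) flips the prediction. Synthetic data; the step size is illustrative, not tuned.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0].copy()
print("original prediction:", model.predict([x])[0])

# Step the input along the weight-sign direction that raises the score of
# the opposite class; eps is an illustrative step size.
eps = 0.5
sign = 1 if model.predict([x])[0] == 0 else -1
x_adv = x + eps * sign * np.sign(model.coef_[0])
print("after manipulation: ", model.predict([x_adv])[0])
```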

Code of Conduct

  • I agree to follow this project's Code of Conduct
