
owasp / www-project-machine-learning-security-top-10

OWASP Machine Learning Security Top 10 Project

Home Page: http://owasp.org/www-project-machine-learning-security-top-10/

License: Other

Ruby 0.11% HTML 69.48% SCSS 0.58% Nix 29.83%
ml mlops mlsec owasp owasp-top-10 ai appsec mlai mlsecops mlsecurity

www-project-machine-learning-security-top-10's Introduction

OWASP Machine Learning Security Top 10

OWASP Incubator License: CC BY-SA 4.0

Welcome to the repository for the OWASP Machine Learning Security Top 10 project!

Overview

The primary aim of the OWASP Machine Learning Security Top 10 project is to deliver an overview of the top 10 security issues of machine learning systems. More information on the project scope and target audience is available in our project working group charter.

Contribution

The initial version of the Machine Learning Security Top 10 list was contributed by Sagar Bhure and Shain Singh. The project encourages community contribution and aims to produce a high quality deliverable reviewed by industry peers.

All contributors will need to adhere to the project's code of conduct. Please use the following form for any feedback, suggestions, issues or questions.

Getting Started

The project has a wiki which provides information to help you get started with contributing.

Contributors ✨

This project follows the all-contributors specification. Contributions of any kind welcome!

Thanks goes to these wonderful people (emoji key):

Sagar Bhure

💻 📖 👀 💬 🖋 🔬 📣
Shain Singh

💻 📖 👀 💬 🖋 📣 📆
Rob van der Veer

👀 💻 📖 💬 📣
M S Nishanth

💻 💬
Rick M

💻
Harold Blankenship

💻
RiccardoBiosas

💻
Aryan Kenchappagol

📖 💻 💬 📣
Mikołaj Kowalczyk

💻 📖 💬 📣
Adit Nugroho

💻 📖

License

This project is licensed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License.

www-project-machine-learning-security-top-10's People

Contributors

adityoari, aryanxk02, hblankenship, kingthorin, mik0w, msnishanth9001, nextgensec-github, owaspfoundation, riccardobiosas, sagarbhure, shsingh, yusufmunircloud


www-project-machine-learning-security-top-10's Issues

[FEEDBACK]: Consider excessive agency

Type

General Feedback

What would you like to report?

The LLM Top 10 mentions excessive agency because it is important to limit privileges and autonomy, and to have oversight over LLMs. This is a general AI problem.
One could argue whether this is a security risk; I would argue that it is, because just as AI models are unpredictable, they may also have been manipulated.
I believe the ML Top 10 also needs excessive agency.
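
As a rough sketch of the kind of privilege limiting this argues for (all names here are hypothetical, invented for illustration, not from any OWASP deliverable): an agent's proposed actions are checked against an allow-list, and high-risk actions are queued for human review.

```python
# Hypothetical sketch of limiting an agent's autonomy. ALLOWED_ACTIONS and
# HUMAN_APPROVAL_ACTIONS are invented names for illustration only.
ALLOWED_ACTIONS = {"search_docs", "summarize_text"}       # low risk: autonomous
HUMAN_APPROVAL_ACTIONS = {"send_email", "delete_record"}  # high risk: gated

def dispatch(action: str, payload: dict) -> str:
    """Refuse anything off the allow-list; queue risky actions for review."""
    if action in ALLOWED_ACTIONS:
        return f"executing {action} with {payload}"
    if action in HUMAN_APPROVAL_ACTIONS:
        return f"queued {action} for human approval"
    return f"rejected unknown action {action!r}"

print(dispatch("send_email", {"to": "user@example.com"}))
# -> queued send_email for human approval
```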

Code of Conduct

  • I agree to follow this project's Code of Conduct

fix: merge review from @robvanderveer

the following is an initial review taken from Slack logs: https://owasp.slack.com/archives/C04PESBUWRZ/p1677192099712519

by @robvanderveer


Dear all,
I did a first scan through the list to mainly look at taxonomy. Here are my remarks.
1. ML01: In the literature the term 'adversarial' is often used for input manipulation attacks, but also for data poisoning, model extraction, etc. Therefore, to avoid confusion, it is probably better to rename the ML01 adversarial attack entry to input manipulation.
2. It is worth considering adding 'model evasion', aka black-box input manipulation, to your top 10. Or do you prefer to have one entry for input manipulation altogether?
3. ML03: It is not clear to me how scenarios 1 and 2 work; I must be missing something. Usually model inversion is explained by manipulating synthesized faces until the algorithm behaves like it recognizes the face.
4. ML04: It is not clear to me how scenario 1 works. Standard methods against overtraining are missing from the 'how to prevent' part (a sketch of such methods follows this list). Instead the advice is to reduce the training set size, which typically increases the overfitting problem.
5. ML05: Model stealing describes a scenario where an attacker steals model parameters, but generally this attack takes place by way of black-box access: gathering input-output pairs and training a new model on them.
6. ML07: I don't understand exactly how the presented scenario should work. I do know about the scenario where a pre-trained model was obtained that has been altered by an attacker. This matches the description.
7. ML08: Isn't model skewing the same as data poisoning? If there is a difference, it is not apparent to me from the scenario and description.
8. ML10 is called Neural Net Reprogramming, but I guess the attack of changing parameters will work on any type of algorithm, not just neural networks (the second sketch after this list illustrates this). The description also mentions changing the training data, but perhaps that is better left out to avoid confusion with data poisoning.
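
On remark 4, a minimal sketch of the standard overfitting countermeasures the review refers to: constrain the model with regularisation and select it with cross-validation, rather than shrinking the training set. This uses scikit-learn on synthetic data and is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Stronger L2 regularisation (smaller C) constrains the model instead of
# reducing how much data it sees.
for C in (100.0, 1.0, 0.01):
    clf = LogisticRegression(C=C, max_iter=1000)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"C={C:>6}: mean CV accuracy {score:.3f}")
```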
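
On remark 8, a small illustration that parameter tampering is not specific to neural networks: negating the learned weights of a plain scikit-learn logistic regression flips its decisions. Synthetic data; a sketch, not a real attack path.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
print("accuracy before tampering:", model.score(X, y))

# An attacker with write access to the stored model flips the decision
# boundary by negating the learned weights.
model.coef_ = -model.coef_
model.intercept_ = -model.intercept_
print("accuracy after tampering: ", model.score(X, y))
```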

feat(rendering): make PDF output from Markdown files more presentable

The Top 10 list is being rendered using Markdown at https://mltop10.info

The site is being rendered using Quarto and the files from https://github.com/OWASP/www-project-machine-learning-security-top-10/tree/master/docs are mirrored to https://github.com/mltop10-info/mltop10.info

Currently a manual process is run locally for https://github.com/mltop10-info/mltop10.info to render the HTML and PDF outputs, which are stored in https://github.com/mltop10-info/mltop10.info/tree/main/docs and used by GitHub Pages.

The PDF rendering currently uses the default LaTeX method; an example is at https://github.com/mltop10-info/mltop10.info/blob/main/docs/OWASP-Machine-Learning-Security-Top-10.pdf

Quarto has a lot of formatting options for generating PDF, and these need to be explored to make the PDF and ePub outputs more presentable.

[FEEDBACK]: sync master and dev branch.

Type

General Feedback

What would you like to report?

I would like to report the following issue/feedback

Code of Conduct

  • I agree to follow this project's Code of Conduct

add: Create a RELATED.md to list similar projects and SIGs

Type

Website Issue Report

What would you like to report?

This applies to both the website and documentation content.

  • create a page which lists similar:
    • OWASP projects
    • other committees and SIGs (LF, OpenSSF, CSA)

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Leaking pipeline is missing

Type

General Feedback

What would you like to report?

The risk of leaking training data, and other confidentiality issues of the AI pipeline (code, model parameters), are missing. One concrete form of this leakage is sketched below.
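
A minimal sketch, assuming a confidence-gap membership inference test: an overfit model is noticeably more confident on its training records than on unseen ones, which lets an attacker guess whether a given record was in the training set. Synthetic data, scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

def true_label_confidence(m, X, y):
    # probability the model assigns to the true label of each record
    return m.predict_proba(X)[np.arange(len(y)), y]

print("mean confidence, members:    ", true_label_confidence(model, X_tr, y_tr).mean())
print("mean confidence, non-members:", true_label_confidence(model, X_te, y_te).mean())
# A large gap between the two is the leakage signal.
```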

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Rename 'Corrupted packages' to 'AI supply chain attacks'

Type

Suggestions for Improvement

What would you like to report?

I believe 'packages' is too specific a term for the problem of supply chain attacks. Calling it 'supply chain attacks' will make the reader aware of the risk that any external component in the AI pipeline can be manipulated.
Also, add 'data' as a potential supply chain risk, referring to 'data poisoning' for it, and add 'model', referring to the transfer learning attack. A sketch of a basic integrity check follows.
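
A minimal sketch of one mitigation this implies: pin and verify a SHA-256 digest for any externally sourced component (package archive, dataset, or pre-trained model) before loading it. The file name and digest below are placeholders.

```python
import hashlib
from pathlib import Path

# Placeholder: replace with the known-good digest published out of band.
PINNED_SHA256 = "replace-with-known-good-digest"

def verify_artifact(path: str, expected: str) -> None:
    """Refuse to use an artifact whose digest does not match the pinned value."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    if digest != expected:
        raise RuntimeError(f"{path}: digest mismatch, refusing to load")

verify_artifact("model.onnx", PINNED_SHA256)  # hypothetical file name
```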

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Rename Neural Net Reprogramming to Model poisoning

Type

General Feedback

What would you like to report?

Corrupting/manipulating model parameters is a general threat, referred to as model poisoning, and is not restricted to neural networks.

Code of Conduct

  • I agree to follow this project's Code of Conduct

Model stealing through interaction is not mentioned

The current model stealing entry only describes the model being stolen through its parameters, but the model can also be stolen by presenting inputs, capturing the outputs, and using those pairs to train your own model. See the AI guide.
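
A minimal sketch of that interaction-based theft, with a local model standing in for the victim's prediction API (in a real attack only the API would be visible). Synthetic data, scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
victim = GradientBoostingClassifier(random_state=0).fit(X, y)

# The attacker samples queries and labels them with the victim's predictions...
queries = np.random.default_rng(0).normal(size=(5000, 10))
stolen_labels = victim.predict(queries)

# ...then trains a surrogate that mimics the victim without seeing its weights.
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)
print("agreement with victim:", (surrogate.predict(X) == victim.predict(X)).mean())
```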

[FEEDBACK]: Risk Ranking Reference

Type

Documentation Issue Report

What would you like to report?

Hi team,
I would like to focus on the missing information related to the risk ranking of the Top 10 at the start of the page, i.e. the table mentioned in every vulnerability.

It would be a great resource if added, so that people can relate to the risk associated with each vulnerability.

@sagarbhure

Code of Conduct

  • I agree to follow this project's Code of Conduct

[Fortnightly] Working Group Meeting - 2023-Jul-20

Date:

2023-Jul-20 06:00 UTC (11:30 Hyderabad, 16:00 Melbourne)

Attendees:

  • Alejandro Saucedo
  • John Sotiropoulos
  • Sagar Bhure
  • Shain Singh

Notes:

  • Discuss project reboot, historical information on the project
  • Agreement on setting up cadence and getting project information in order
    • project charter
    • defined goals
    • create sprints and roadmap to keep momentum and attract regular and new contributors
    • information on how to contribute

Action Items:

  • set up a regular meeting
  • create documentation/wiki to get the project charter and other documentation created

create wiki

  • home page as landing page for where to find information
  • charter (either page in wiki or link to markdown in repository)
  • contributing
  • meeting information

[Fortnightly] Working Group Meeting - 2023-Sep-14

Current agenda

  1. General project status: v0.3 in progress
  2. Notable PRs completed since last meeting
  3. Notable discussions
  4. Meetings:
     • WG meeting will move forward a few hours to accommodate EU morning time zones
  5. Contributions and current help wanted
  6. Introductions (for new contributors)

Discussions

Calendar Event

Download calendar event (ICS)

[FEEDBACK]: Integrate model skewing into data poisoning

Type

Suggestions for Improvement

What would you like to report?

Model skewing requires altering training data, making it a form of data poisoning. Therefore it is probably better to integrate the two threats. A small sketch of the overlap follows.
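
A minimal sketch of the overlap: "skewing" the model toward one class is achieved exactly by altering (poisoning) training labels. Synthetic data, scikit-learn; the poison rate is illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Poison 30% of training labels to skew the model toward class 1.
rng = np.random.default_rng(0)
poisoned = y_tr.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1
skewed = LogisticRegression(max_iter=1000).fit(X_tr, poisoned)

print("clean accuracy: ", clean.score(X_te, y_te))
print("skewed accuracy:", skewed.score(X_te, y_te))
```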

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Make ML06 more precise and with more Attack Scenarios

Type

Suggestions for Improvement

What would you like to report?

Re-thinking and re-writing ML06 - corrupted packages

The description of ML06 is quite limited given how complicated software supply chains are, especially for software that uses ML.

In the summary of the vulnerability it is written: "This type of attack can be particularly dangerous as it can go unnoticed for a long time, since the victim may not realize that the package they are using has been compromised. The attacker's malicious code could be used to steal sensitive information, modify results, or even cause the machine learning model to fail." Meanwhile, the Detectability section under Risk Factors says that this kind of vulnerability is easy to detect.

What is more, nothing is said about countermeasures such as SBOM/MLBOM in the description of this vulnerability. In my opinion that should be included; a toy illustration follows.
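
A toy illustration of the SBOM/MLBOM idea (real tooling such as CycloneDX emits a richer, standardised format): record the exact name and version of every installed package so the pipeline's dependencies can be audited later.

```python
import json
from importlib.metadata import distributions

# Enumerate installed distributions and record name + version for each.
inventory = sorted(
    ({"name": d.metadata["Name"], "version": d.version} for d in distributions()),
    key=lambda pkg: str(pkg["name"]).lower(),
)
print(json.dumps(inventory[:5], indent=2))  # first few entries only
```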

There are plenty of resources that should be analyzed and used for the description of this specific vulnerability:

Code of Conduct

  • I agree to follow this project's Code of Conduct

[FEEDBACK]: Rename adversarial attack to something less ambiguous

Type

Suggestions for Improvement

What would you like to report?

The term 'adversarial attack' usually has a broader definition than the intention of ML01; for example, it usually includes data poisoning.
The intention seems to be what is more often called an 'evasion attack'. The problem with that term is that it usually means small changes to the input. This is why in the AI guide we used the term 'input manipulation', which is clearer. A sketch of such an attack follows.
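
A minimal sketch of such an input manipulation on a linear scikit-learn model: a small step along the sign of the weights (the same idea FGSM applies with gradients) flips the prediction. Synthetic data; the step size is illustrative, not tuned.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0].copy()
print("original prediction:", model.predict([x])[0])

# Step the input along the weight-sign direction that raises the score of
# the opposite class; eps is an illustrative step size.
eps = 0.5
sign = 1 if model.predict([x])[0] == 0 else -1
x_adv = x + eps * sign * np.sign(model.coef_[0])
print("after manipulation: ", model.predict([x_adv])[0])
```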

Code of Conduct

  • I agree to follow this project's Code of Conduct
