OWASP Machine Learning Security Top 10 Project
Home Page: http://owasp.org/www-project-machine-learning-security-top-10/
License: Other
Video will be uploaded to OWASP Youtube Channel
General Feedback
I would like to report the following issue/feedback
Originally posted by ankitloud October 6, 2023
Originally posted by giscus[bot] September 20, 2023
https://mltop10.info/ML01_2023-Input_Manipulation_Attack.html
Website Issue Report
I would like to report the following issue/feedback
Documentation Issue Report
There is a comprehensive existing body of work at: https://ethical.institute
The intent would be to review the current Top 10 list in this project and:
Website Issue Report
This applies to both the website and documentation content.
Suggestions for Improvement
Re-thinking and re-writing ML06 - corrupted packages
The description of ML05 is quite limited given how complicated the software supply chains are, especially those related to ML-using software.
In the summary of the vulnerability it is written: "This type of attack can be particularly dangerous as it can go unnoticed for a long time, since the victim may not realize that the package they are using has been compromised. The attacker's malicious code could be used to steal sensitive information, modify results, or even cause the machine learning model to fail." Meanwhile, the Detectability section under Risk Factors says that this kind of vulnerability is easy to detect.
What is more, nothing is said about countermeasures such as SBOM/MLBOM in the description of this vulnerability. In my opinion they should be included.
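As a purely illustrative sketch of one such countermeasure, a consumer of an ML package or model artifact could verify a pinned SHA-256 digest (recorded, for example, in an SBOM/MLBOM entry or a lock file) before loading it. The function name and workflow below are hypothetical:

```python
import hashlib

def verify_artifact(path, expected_sha256):
    """Compare a downloaded artifact's SHA-256 digest against a pinned value.

    Returns True only if the file on disk matches the digest recorded in,
    e.g., an SBOM/MLBOM entry or a lock file (hypothetical workflow).
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large model files do not need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

A mismatch would mean the artifact differs from what the SBOM/MLBOM declared, which is exactly the silent-compromise scenario the summary describes.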
There are plenty of resources that should be analyzed and used for the description of this specific vulnerability:
Create a set of guidelines for how to consume the information presented in the Top 10 based on roles
Example roles:
.. etc
as per feedback from #87
Glossary page to standardise on terminologies and definitions.
Suggestions for Improvement
The term adversarial attack usually has a broader definition than the intention of ML01. For example it usually includes data poisoning.
The intention seems to refer to what is more often called an 'evasion attack'. The problem with that term is that it usually means small changes to the input. This is why in the AI guide we used the term 'input manipulation', which is clearer.
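To make the "small changes to the input" point concrete, here is a minimal input-manipulation (evasion) sketch against a hypothetical logistic-regression model, using the FGSM idea of stepping each feature in the direction that increases the loss. All function names, weights, and inputs below are made up for illustration:

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-score))

def fgsm_perturb(w, b, x, y_true, eps=0.1):
    """FGSM-style evasion: the gradient of the cross-entropy loss with
    respect to the input is (p - y) * w, so step eps in its sign direction."""
    p = predict_proba(w, b, x)
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign((p - y_true) * wi) for xi, wi in zip(x, w)]
```

A small eps keeps the change to the input hard to notice while still lowering the model's confidence in the true class, which is exactly why 'input manipulation' describes the attack better than the broader 'adversarial'.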
General Feedback
The LLM Top 10 mentions excessive agency, because it is important to limit privileges/autonomy and to have oversight over LLMs. This is a general AI problem.
One could argue whether this is a security risk, and I would argue that it is, because just as AI models are unpredictable, they may also have been manipulated.
I believe the ML top 10 also needs Excessive agency.
Suggestions for Improvement
I believe 'Packages' to be a too specific term for the problem of supply chain attacks. Calling it 'supply chain attacks' will make the reader aware of the risk that any external component in the AI pipeline can be manipulated.
Also, add 'data' as a potential supply chain risk, and refer to 'data poisoning' for that, and also add 'model', referring to the transfer learning attack.
[Async] Meeting -> Jul 25 - Jul 30 2023
Join us for this async meeting running from 25th July to 30th July 2023, by participating in the Slack thread or by commenting on this issue in GitHub
The Top 10 list should have the ability to export the Markdown files to PDF. This could be done, for example, via a GitHub Action.
As an example, _includes contains information from the standard Top 10 directory, which shows error pages such as: https://owasp.org/www-project-machine-learning-security-top-10/2023/Acknowledgements.html
mentioned in: #44
Documentation Issue Report
Hi team,
I would like to focus on the missing information related to the Risk Ranking number of the Top 10 at the start of the page, i.e. the table mentioned in every vulnerability.
It would be a great resource if added, so that people can relate to the risk associated with the vulnerability.
All Contributors has a specification for recognising contributions that are not just code.
e.g.
The full list is shown here: https://allcontributors.org/docs/en/emoji-key
--
Implementation would involve either using a bot: https://allcontributors.org/docs/en/bot/overview or manually via cli: https://allcontributors.org/docs/en/cli/overview
2023-Jul-20 06:00 UTC (11:30 Hyderabad, 16:00 Melbourne)
Attendees:
The Top 10 list is being rendered using Markdown at https://mltop10.info
The site is being rendered using Quarto and the files from https://github.com/OWASP/www-project-machine-learning-security-top-10/tree/master/docs are mirrored to https://github.com/mltop10-info/mltop10.info
Currently a manual process is run locally for https://github.com/mltop10-info/mltop10.info to render the HTML and PDF outputs, which are stored in https://github.com/mltop10-info/mltop10.info/tree/main/docs and used by GitHub Pages.
The rendering for PDF is currently using the default method of LaTeX - example at: https://github.com/mltop10-info/mltop10.info/blob/main/docs/OWASP-Machine-Learning-Security-Top-10.pdf
Quarto has a lot of formatting options for generating PDF and this needs to be explored to make the PDF and ePUB formats look more presentable.
Video will be uploaded to OWASP Youtube Channel
Thinking of ML10:2023 Model Poisoning, we can create two scripts that, although carrying out the same operation (perhaps classification), provide different outcomes.
This way, we can showcase model poisoning in action along with the corresponding theory.
Please share your ideas with me on this!
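As one idea for what those two scripts could look like: a deliberately tiny nearest-centroid classifier whose "parameters" are just the class means, so an attacker tampering with the stored parameters is easy to demonstrate. Everything here (data, model choice) is a hypothetical toy for illustration:

```python
def train_centroids(X, y):
    """Nearest-centroid classifier: the 'model' is just each class's mean point."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}

def predict(model, x):
    sq_dist = lambda c: sum((a - b) ** 2 for a, b in zip(c, x))
    return min(model, key=lambda label: sq_dist(model[label]))

X = [[0.0], [0.2], [1.0], [1.2]]
clean_model = train_centroids(X, [0, 0, 1, 1])

# Model poisoning: the attacker tampers with the stored parameters directly,
# here by swapping the class centroids of an otherwise identical model.
poisoned_model = {0: clean_model[1], 1: clean_model[0]}
```

Running both "scripts" on the same input then shows the clean and poisoned models performing the same classification operation but returning opposite answers.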
Construct a workflow to clone the Top 10 attacks' MD files to the repo https://github.com/mltop10-info/mltop10.info,
so that all changes to attack scenarios are pushed by the workflow rather than by human interaction.
Suggestions for Improvement
[FEEDBACK]: Model skewing requires altering training data, making it a form of data poisoning. Therefore it is probably better to integrate the two threats.
the following is an initial review taken from Slack logs: https://owasp.slack.com/archives/C04PESBUWRZ/p1677192099712519
Dear all,
I did a first scan through the list to mainly look at taxonomy. Here are my remarks.
1.
ML01
In the literature the term 'adversarial' is often used for input manipulation attacks, but also for data poisoning, model extraction, etc. Therefore, in order to avoid confusion, it is probably better to rename the ML01 adversarial attack entry to input manipulation?
2.
It is worth considering adding 'model evasion', aka black-box input manipulation, to your top 10. Or do you prefer to have one entry for input manipulation altogether?
3.
ML03
It is not clear to me how scenarios 1 and 2 work. I must be missing something. Usually model inversion is explained by manipulating synthesized faces until the algorithm behaves like it recognizes the face.
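For reference, the face-synthesis explanation boils down to confidence maximization, which can be sketched against a hypothetical linear model: start from a blank input and follow the gradient of the target class's confidence until the model behaves like it recognizes the reconstructed input. The model and all values are illustrative assumptions:

```python
import math

def invert(w, b, steps=100, lr=0.5):
    """Model inversion by confidence maximization (gradient ascent).

    For p = sigmoid(w . x + b), the gradient d p / d x_i = p * (1 - p) * w_i,
    so the reconstructed input drifts toward the pattern the model
    associates with the target class."""
    x = [0.0] * len(w)
    for _ in range(steps):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-score))
        x = [xi + lr * p * (1 - p) * wi for xi, wi in zip(x, w)]
    return x
```

The recovered x is not a training sample, but it reveals what the model considers representative of the class, which is the privacy leak model inversion is about.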
4
ML04
It is not clear to me how scenario 1 works.
Standard methods against overtraining are missing from the 'how to prevent' part. Instead the advice is to reduce the training set size, which typically increases the overfitting problem.
5
ML05
Model stealing describes a scenario where an attacker steals model parameters, but generally this attack takes place in a black-box way: gathering input-output pairs and training a new model on them.
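The black-box variant can be sketched in a few lines: treat the victim as a query oracle, collect input-output pairs, and fit a surrogate on them. The one-dimensional threshold "victim" below is a deliberately trivial, hypothetical stand-in for a real model:

```python
def victim(x):
    """Black-box model we can only query (its internals are hidden;
    this hypothetical victim is a simple threshold classifier)."""
    return 1 if x >= 0.37 else 0

def steal(oracle, n_queries=101):
    """Model extraction: probe the oracle on a grid of inputs, then build
    a surrogate from the smallest input the oracle labels as class 1."""
    pairs = [(i / (n_queries - 1), oracle(i / (n_queries - 1)))
             for i in range(n_queries)]
    boundary = min(x for x, y in pairs if y == 1)
    return lambda x: 1 if x >= boundary else 0

surrogate = steal(victim)
```

No parameters are ever read directly; with enough queries the surrogate reproduces the victim's decision boundary, which is why query-based extraction belongs in the ML05 description.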
6
ML07
I don’t understand exactly how the presented scenario should work. I do know about the scenario where a pre-trained model was obtained that has been altered by an attacker. This matches the description.
7
ML08
Isn’t model skewing the same as data poisoning? If there’s a difference, to me they are not apparent from the scenario and description.
8
ML10 is called Neural net reprogramming but I guess the attack of changing parameters will work on any type of algorithm - not just neural networks. The description also mentions changing the training data, but perhaps that is better left out to avoid confusion with data poisoning?
Each of the Top 10 items are scored according to OWASP's Risk Rating Methodology. There should be a page defining how to use the ratings to provide a severity score. This will assist practitioners in knowing 'what to fix' and 'when'.
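As a sketch of what such a page could define, the snippet below bands the averaged 0-9 factor scores and looks up the overall severity matrix from OWASP's Risk Rating Methodology (the factor values passed in are illustrative):

```python
def level(score):
    """OWASP Risk Rating bands: 0 to <3 LOW, 3 to <6 MEDIUM, 6 to 9 HIGH."""
    return "LOW" if score < 3 else "MEDIUM" if score < 6 else "HIGH"

# Overall severity matrix keyed by (likelihood level, impact level),
# as given in the OWASP Risk Rating Methodology.
SEVERITY = {
    ("LOW", "LOW"): "Note",       ("LOW", "MEDIUM"): "Low",       ("LOW", "HIGH"): "Medium",
    ("MEDIUM", "LOW"): "Low",     ("MEDIUM", "MEDIUM"): "Medium", ("MEDIUM", "HIGH"): "High",
    ("HIGH", "LOW"): "Medium",    ("HIGH", "MEDIUM"): "High",     ("HIGH", "HIGH"): "Critical",
}

def overall_severity(likelihood_factors, impact_factors):
    """Average each group of 0-9 factor scores, band them, look up the matrix."""
    likelihood = sum(likelihood_factors) / len(likelihood_factors)
    impact = sum(impact_factors) / len(impact_factors)
    return SEVERITY[level(likelihood), level(impact)]
```

A practitioner could then sort findings by the resulting Note/Low/Medium/High/Critical label to decide 'what to fix' and 'when'.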
as per feedback in #85
Video will be uploaded to OWASP Youtube Channel
General Feedback
Corrupting/manipulating model parameters is a general threat, referred to as model poisoning, and is not restricted to neural networks.
Reference https://github.com/OWASP/www-project-machine-learning-security-top-10/blob/master/GUIDELINES.md#ciso
The current model stealing only describes the model being stolen through parameters, but the model can also be stolen by presenting inputs, capturing the output and using those combinations to train your own model. See AI guide
as per feedback in #84
General Feedback
The risk of leaking training data or other confidentiality issues of the AI pipeline (code, model parameters) are missing.