Comments (2)
Hi @limhasic,
From the references listed for the CategoricalCAP metric: my understanding is that the metric was created using the original definition of CAP, from reference [1], which looks for exact matches of the key fields. Meanwhile, in [2] it is generalized so that if no exact matches are found, approximate matches are deemed acceptable.
Whatever matches are found, the metric iterates through each row and reports the overall average across all rows. I have not read the full paper for TCAP, but if it is doing the same thing, it would appear to be similar to CAP.
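To make the row-by-row idea concrete, here is a minimal sketch of exact-match CAP in plain Python. It is not the SDMetrics implementation: the function names `cap_scores` and `average_cap`, the toy columns, and the choice to skip real rows that have no exact key match are all assumptions for illustration. Variants differ on that last point (for example, by counting unmatched rows as 0 or by falling back to approximate matches), and a library may report a complement of this average so that higher means safer, so check its documentation.

```python
from collections import Counter, defaultdict


def cap_scores(real_rows, synthetic_rows, key_fields, sensitive_fields):
    """Per-row attacker success probability under exact key matching.

    Hypothetical helper, not part of any library. Rows are plain dicts.
    Returns None for real rows whose key has no exact synthetic match.
    """
    # Tally the sensitive values seen in the synthetic data per key combination.
    votes = defaultdict(Counter)
    for row in synthetic_rows:
        key = tuple(row[f] for f in key_fields)
        sensitive = tuple(row[f] for f in sensitive_fields)
        votes[key][sensitive] += 1

    scores = []
    for row in real_rows:
        key = tuple(row[f] for f in key_fields)
        sensitive = tuple(row[f] for f in sensitive_fields)
        if key not in votes:
            scores.append(None)  # assumption: unmatched rows are skipped
            continue
        counts = votes[key]
        # Fraction of matching synthetic rows that reveal the true value.
        scores.append(counts[sensitive] / sum(counts.values()))
    return scores


def average_cap(scores):
    """Average over the rows that had at least one exact match."""
    matched = [s for s in scores if s is not None]
    return sum(matched) / len(matched) if matched else None


# Toy data: the attacker is assumed to know "age" and "zip" (key fields)
# and tries to infer "disease" (sensitive field).
real = [
    {"age": "30s", "zip": "A", "disease": "flu"},
    {"age": "30s", "zip": "A", "disease": "cold"},
    {"age": "40s", "zip": "B", "disease": "flu"},
]
synthetic = [
    {"age": "30s", "zip": "A", "disease": "flu"},
    {"age": "30s", "zip": "A", "disease": "flu"},
    {"age": "30s", "zip": "A", "disease": "cold"},
    {"age": "50s", "zip": "C", "disease": "flu"},
]

scores = cap_scores(real, synthetic, ["age", "zip"], ["disease"])
overall = average_cap(scores)
```

With this toy data the first two real rows match three synthetic rows each, and the third real row has no exact match, so it is left out of the average. Keeping the per-row list around also makes it easy to see which individual records are most exposed.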
Also, are there any criteria for which columns to select for the CAP score?
Which columns you select as key or sensitive fields depends on your threat model. Any kind of CAP metric requires you to assume that an attacker may have access to certain columns (the key fields) and is trying to infer others (the sensitive fields). Perhaps the key values appear in public datasets, or perhaps there was a data leak in the past, etc. It seems project dependent to me.
Hope that helps.
from sdmetrics.
Thank you for your kind reply