programming-dp's People

Contributors

bamrainboo, chikeabuah, coding-famer, curt-mitch, jnear, justinstigall, liuweiran900217, nrnrk, psilospore, sisaman, sriar, vmoeykens

programming-dp's Issues

Issue on page /ch1.html

There is a display of three records just prior to the text:

This time, there are three rows returned - and we don’t know which one is the real Karrie.

The three records have the same date of birth ("9/7/1967") but different ages. This is impossible.

Issue on page /ch10.html

At the bottom of the page, the code snippets that call sparse() and range_query_svt() are shown throwing an error.

Gaussian mechanism formula is not valid for every epsilon / delta

The formula $\sigma^2 = 2 s^2 \ln(1.25/\delta) / \epsilon^2$ is not valid for every epsilon / delta. Theorem 3.22 in Dwork and Roth only covers $\epsilon \in (0, 1)$. See Theorem 4 in Balle et al. (2018), which shows that this bound does not hold for $\epsilon > 1$.

It would probably be good to add a caveat that the bounds in the book are not valid for every $\epsilon$, in a footnote or something.
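As a rough illustration of the caveat being requested, here is a minimal sketch that computes the noise scale from the formula above and refuses values of epsilon outside (0, 1). The function name `gaussian_sigma` and its argument names are hypothetical and are not taken from the book.

```python
import math


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale from sigma^2 = 2 s^2 ln(1.25/delta) / epsilon^2.

    Hypothetical helper: it only accepts 0 < epsilon < 1, the range for
    which the issue says the classic bound is stated.
    """
    if not 0 < epsilon < 1:
        raise ValueError("classic Gaussian bound assumes 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
```

Calling `gaussian_sigma(1.0, 2.0, 1e-5)` raises a `ValueError` instead of silently returning a scale that the theorem does not cover.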

Add IPv6 support to programming-dp.com

Hey, thanks for this great page. I noticed that you host the static page on GitHub Pages, but the domain doesn't have the AAAA DNS records required to make it accessible over IPv6. The good news is that you just have to add those records to the domain to get full IPv6 support.
Just follow the official guide from GitHub: https://docs.github.com/en/pages/configuring-a-custom-domain-for-your-github-pages-site/managing-a-custom-domain-for-your-github-pages-site#configuring-an-apex-domain
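One way to check whether the records are in place is sketched below. The four addresses are the AAAA values that the GitHub guide lists for apex domains at the time of writing, so they should be verified against the guide before use. The helper names `resolve_aaaa` and `missing_records` are hypothetical.

```python
import ipaddress
import socket

# AAAA values listed in the GitHub Pages guide for apex domains
# (assumption: verify against the linked guide before relying on them).
GITHUB_PAGES_AAAA = {
    ipaddress.ip_address("2606:50c0:8000::153"),
    ipaddress.ip_address("2606:50c0:8001::153"),
    ipaddress.ip_address("2606:50c0:8002::153"),
    ipaddress.ip_address("2606:50c0:8003::153"),
}


def resolve_aaaa(host):
    """Return the set of IPv6 addresses the local resolver reports for host."""
    try:
        infos = socket.getaddrinfo(
            host, 443, family=socket.AF_INET6, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        # No IPv6 address could be resolved for this name.
        return set()
    return {ipaddress.ip_address(info[4][0]) for info in infos}


def missing_records(found):
    """Return the expected GitHub Pages addresses that are not in found."""
    return GITHUB_PAGES_AAAA - set(found)
```

With network access, `missing_records(resolve_aaaa("programming-dp.com"))` returns an empty set once all four records exist. The result depends on the local resolver, so an empty lookup is a hint rather than proof that the records are absent.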

k-Anonymity PDF discrepancy

I noticed a typo in the PDF downloaded from programming-dp.com/book.pdf that seems to be fixed already in the repo. Cell 75 in ch2.ipynb outputs the result False, but the PDF says True on page 18.

I appreciate the book so I wanted to point out this minor discrepancy.

Clipped Gradient Equation Incorrect

Hi. In equation 36 of the Machine Learning Section of the book and the subsequent paragraph, the clipped gradient is written as

\begin{align} || \text{L2_clip}(\nabla(\theta; X, y), b) - \text{clip}(\nabla(\theta; X \text{'}, y))|| \end{align}

when it really should be written as

\begin{align} || \text{L2_clip}(\nabla(\theta; X, y), b) - \text{clip}(\nabla(\theta; X \text{'}, y), 0)|| \end{align}

The second term is missing a zero.

(I wrote this latex in this github issue using this trick here.)
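For context, here is a minimal sketch of L2 clipping in plain Python. It assumes that `L2_clip(v, b)` rescales a vector so that its L2 norm is at most `b`. The helper names below are hypothetical and this is not the book's code. It only shows why the second argument matters for the norm bound, and it does not settle which form of the equation is intended.

```python
import math


def l2_norm(v):
    """Euclidean (L2) norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def l2_clip(v, b):
    """Rescale v so its L2 norm is at most b; vectors already within b are unchanged."""
    norm = l2_norm(v)
    if norm > b:
        return [b * x / norm for x in v]
    return list(v)


def l2_distance(u, v):
    """L2 norm of the difference between two equal-length vectors."""
    return l2_norm([a - c for a, c in zip(u, v)])
```

Under this assumption, two gradients clipped with the same bound `b` differ by at most `2 * b` in L2 norm (triangle inequality), while a gradient clipped to `b` differs from the zero vector by at most `b`.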

Typo in De-identification Chapter

I have found a typo in the De-identification chapter. In the section describing re-identifying Karrie Trusslove's data, the word "information" is spelled "informatino".

Full line is:

  • We can look at the differences between the rows to determine what additional auxiliary informatino would help us to distinguish them (e.g. sex, occupation, marital status)

Variants of Differential Privacy

In the "Variants of Differential Privacy" section, when calculating noises_seq for advanced composition, why is
noises_seq = [16*k*np.log(1.25/delta)*np.log(1/delta)/(epsilon**2) for k in ks]
used instead of
noises_seq = [16*k*np.log(2.5*k/delta)*np.log(2/delta)/(epsilon**2) for k in ks]
as derived earlier?
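One possible reading of the difference between the two expressions in the question is sketched below. It assumes the simplified advanced composition bound, total epsilon = 2 * eps_i * sqrt(2 k ln(1/delta')), with sensitivity 1 and the classic Gaussian variance 2 ln(1.25/delta_i) / eps_i^2. All function names are hypothetical, and `math.log` stands in for `np.log`.

```python
import math


def per_query_variance(k, epsilon, delta_i, delta_prime):
    """Gaussian variance per query so that k queries compose to total epsilon.

    Assumes the simplified advanced composition bound
    epsilon = 2 * eps_i * sqrt(2 * k * ln(1 / delta_prime)), sensitivity 1.
    """
    eps_i = epsilon / (2 * math.sqrt(2 * k * math.log(1 / delta_prime)))
    return 2 * math.log(1.25 / delta_i) / eps_i ** 2


def variance_single_delta(k, epsilon, delta):
    # First expression in the question: delta_i = delta and delta' = delta.
    return 16 * k * math.log(1.25 / delta) * math.log(1 / delta) / epsilon ** 2


def variance_split_delta(k, epsilon, delta):
    # Second expression: delta_i = delta / (2k) and delta' = delta / 2,
    # so that k * delta_i + delta' adds up to delta.
    return 16 * k * math.log(2.5 * k / delta) * math.log(2 / delta) / epsilon ** 2
```

Under these assumptions, the first expression corresponds to reusing delta for both the per-query mechanism and the composition slack, which gives a total delta of (k + 1) * delta. The second splits a fixed total delta across the k queries and the slack, and it always yields the larger variance.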

Improvements for various equations with large braces

Equation (2) in chapter 3 - Laplace Mechanism should use \left and \right for braces on the right side:

\begin{equation}
F(x) = f(x) + \textsf{Lap}\left(\frac{s}{\epsilon}\right)
\end{equation}

There are also other equations which would look a lot better if the \left and \right were used for braces:

  • Chapter 5 - equation 3
  • Chapter 7 - equation 18
  • Chapter 8 - eq 22, 23, 25
  • Chapter 9 - eq 30, 31, 32
  • Chapter 13 - eq 38

The \forall \epsilon' in advanced composition

The section on advanced composition has a line that says:

" for \epsilon', \delta \ge 0, the total privacy cost of the entire k-fold adaptive composition is equal to \epsilon', \delta, where <formula for \epsilon'>. "

The \epsilon' should be outside the \forall quantifier, right? My understanding was that the order of quantifiers goes "\forall \epsilon, \delta, \exists \epsilon' given by formula", but I might be misunderstanding something. Specifically, I think the dash should be removed in the first \epsilon'.

Not as important, but I think there should be brackets around the \epsilon', \delta at the end of the sentence?

Typos in Chapters 2 and 6

Chapter 2 --
Informally, we say that a dataset is “k-Anonyized” -> Informally, we say that a dataset is “k-Anonymized”

Chapter 6 --
this is the “Euclidian distance,” -> this is the “Euclidean distance”,

Chapter 6 --
which we will call the catastrophe mechansim -> which we will call the catastrophe mechanism
fails gracefully, rather than catistrophically -> fails gracefully, rather than catastrophically

Duplicated Word in Chapter 2

There is a duplicated word on the line:

Informally, we say that that a dataset is “k-Anonyized” for a particular k if each...

A typo in cp1

In the subsection "Is Karrie Special?", in the sentence "A good way to guage the effectiveness of this type of attack is to look at how “selective” certain pieces of data are.", "guage" should be "gauge".

Typos in Chapters 6,7,8

https://uvm-plaid.github.io/programming-dp/notebooks/ch6.html#advanced-composition
the total privacy cost under both sequential compsosition -> the total privacy cost under both sequential composition

https://uvm-plaid.github.io/programming-dp/notebooks/ch7.html#sample-and-aggregate
In this simple instantation -> In this simple instantiation

https://uvm-plaid.github.io/programming-dp/notebooks/ch8.html#variants-of-differential-privacy
eliminating the catastropic failure mode -> eliminating the catastrophic failure mode
