okasi / swe-pii

Swedish — Personal Identifiable Information (PII)

Home Page: https://swe-pii.vercel.app/

Languages: TypeScript 98.39%, JavaScript 1.28%, CSS 0.33%
Topics: personal-identifiable-information, pii, regex, sweden, sweden-data, swedish

CURRENT version:

  • RegEx
  • Set Lookups from JSON files
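As a rough illustration of the two approaches listed above, here is a minimal TypeScript sketch. It is not the project's actual code: the `Match` shape, the function names, the simplified email pattern and the inline county sample (standing in for a JSON file) are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not the project's API.
type Match = { label: string; text: string; start: number; end: number };

// Regex detector: a deliberately simplified email pattern (assumption).
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Set lookup: in practice the values would be loaded from a JSON file;
// a tiny inline sample stands in for it here. Entries are stored lowercased.
const COUNTIES: ReadonlySet<string> = new Set(
  ["Stockholms län", "Skåne län"].map((s) => s.toLowerCase()),
);

// Collect every match of a global regex together with its offsets.
function findByRegex(text: string, label: string, pattern: RegExp): Match[] {
  if (!pattern.global) throw new Error("pattern must use the g flag");
  pattern.lastIndex = 0; // start from the beginning on every call
  const out: Match[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    if (m[0].length === 0) { pattern.lastIndex++; continue; } // avoid looping on empty matches
    out.push({ label, text: m[0], start: m.index, end: m.index + m[0].length });
  }
  return out;
}

// Slide over the words of the text and test spans of up to maxWords words
// against the set, preferring the longest span that is present.
function findBySet(
  text: string, label: string, terms: ReadonlySet<string>, maxWords = 3,
): Match[] {
  const tokens: { start: number; end: number }[] = [];
  const wordRe = /[^\s.,;:!?()"]+/g;
  let m: RegExpExecArray | null;
  while ((m = wordRe.exec(text)) !== null) {
    tokens.push({ start: m.index, end: m.index + m[0].length });
  }
  const out: Match[] = [];
  for (let i = 0; i < tokens.length; i++) {
    for (let n = Math.min(maxWords, tokens.length - i); n >= 1; n--) {
      const start = tokens[i].start;
      const end = tokens[i + n - 1].end;
      const candidate = text.slice(start, end);
      if (terms.has(candidate.toLowerCase())) {
        out.push({ label, text: candidate, start, end });
        i += n - 1; // skip the words consumed by this match
        break;
      }
    }
  }
  return out;
}
```

Calling `findByRegex(text, "EMAIL", EMAIL_RE)` and `findBySet(text, "ADDR-COUNTY", COUNTIES)` on the same input yields two lists of labeled spans that can be merged. The set lookup is case-insensitive only because both the stored terms and the candidates are lowercased.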

FUTURE version:


Dataset sources:

WIP extraction from OpenStreetMap data:

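Filter the Sweden extract down to nodes and ways that carry address tags or an administrative level of 7 (municipality) or 4 (county):

osmium tags-filter sweden-latest.osm.pbf -o addresses.osm.pbf n/addr:street n/addr:postcode n/addr:city n/admin_level=7 n/admin_level=4 w/addr:street w/addr:postcode w/addr:city w/admin_level=7 w/admin_level=4

Export the filtered data to GeoJSON:

osmium export addresses.osm.pbf -f geojson -o addresses.geojson

Write one street,postcode,city line per feature that has all three address tags, rewrite five-digit postal codes as NNN NN, then deduplicate. Normalizing before deduplicating keeps "12345" and "123 45" from surviving as separate lines:

jq -r '.features[] | select(.properties["addr:street"] != null and .properties["addr:postcode"] != null and .properties["addr:city"] != null) | "\(.properties["addr:street"]),\(.properties["addr:postcode"]),\(.properties["addr:city"])"' addresses.geojson | sed -E 's/([0-9]{3})([0-9]{2})/\1 \2/' | sort -u > unique_addresses.txt

List the unique municipality names (admin_level 7):

jq -r '.features[] | select(.properties.admin_level == "7" and .properties.name != null) | "\(.properties.name)"' addresses.geojson | sort | uniq > municipalities.txt

List the unique county names (admin_level 4):

jq -r '.features[] | select(.properties.admin_level == "4" and .properties.name != null) | "\(.properties.name)"' addresses.geojson | sort | uniq > counties.txt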
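The street,postcode,city lines written to unique_addresses.txt could then be turned into lookup sets along the following lines. This is a sketch under assumptions: `AddressRecord`, `parseAddressLine` and `buildAddressSets` are hypothetical names, and the parser assumes that only the street field may itself contain a comma, so it reads the postal code and city from the right.

```typescript
// Hypothetical sketch: types and function names are illustrative only.
interface AddressRecord {
  street: string;
  postalCode: string;
  city: string;
}

type AddressLabel = "ADDR-STREET" | "ADDR-POSTAL" | "ADDR-CITY";

// Parse one "street,postcode,city" line. The last two fields are taken from
// the right so that a comma inside the street name does not shift them.
function parseAddressLine(line: string): AddressRecord | null {
  const parts = line.split(",");
  if (parts.length < 3) return null;
  const city = parts[parts.length - 1].trim();
  const postalCode = parts[parts.length - 2].trim();
  const street = parts.slice(0, -2).join(",").trim();
  if (street === "" || postalCode === "" || city === "") return null;
  return { street, postalCode, city };
}

// Build one set per address label; malformed or empty lines are skipped.
function buildAddressSets(lines: string[]): Record<AddressLabel, Set<string>> {
  const sets: Record<AddressLabel, Set<string>> = {
    "ADDR-STREET": new Set<string>(),
    "ADDR-POSTAL": new Set<string>(),
    "ADDR-CITY": new Set<string>(),
  };
  for (const line of lines) {
    const rec = parseAddressLine(line);
    if (rec === null) continue;
    sets["ADDR-STREET"].add(rec.street);
    sets["ADDR-POSTAL"].add(rec.postalCode);
    sets["ADDR-CITY"].add(rec.city);
  }
  return sets;
}
```

The sets could be serialized to JSON arrays for shipping with the library. Whether the project stores them this way is not stated in the source.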

Identifiers & labels

  • Person First Name (PER-FIRST)
  • Person Last Name (PER-LAST)
  • Personnummer (ID-PNR)
  • Samordningsnummer (ID-SNR)
  • Marital Status (MARITAL)
  • Biological Sex (SEX)
  • Nationality (NATION)
  • Education Program (EDU-PROGRAM)
  • Profession (PROF)
  • Disabilities (DISAB)
  • Ethnicity (ETHNIC)
  • Sexual Orientation (SEXOR)
  • Political Opinions (POL)
  • Religious Beliefs (REL)
  • Phone Number (PHONE)
  • Email (EMAIL)
  • Social Media Profiles (SOCM)
  • Street Address (ADDR-STREET)
  • Postal Code (ADDR-POSTAL)
  • Municipality (ADDR-MUNICIPALITY)
  • City (ADDR-CITY)
  • County (ADDR-COUNTY)
  • Bank Account Number (FIN-BANKNUM)
  • IBAN (FIN-IBAN)
  • BIC/SWIFT Code (FIN-BIC)
  • Credit Card Number (FIN-CC)
  • Organization Number (ORG-NUM)
  • Company Name (ORG-WORK)
  • Education Institute (ORG-EDU)
  • IP Address (IP)
  • MAC Address (MAC)
  • Date (DATE)
  • Time (TIME)
  • Vehicle Registration Number (VEH)
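Some of these labels need more than a pattern match. A personnummer and a samordningsnummer share the same shape and the same Luhn check digit, and differ only in that a samordningsnummer adds 60 to the day of birth. The sketch below shows one way to tell ID-PNR from ID-SNR. The function names and the regex are assumptions for illustration, not taken from the project, and the date check is intentionally loose (it does not verify days per month).

```typescript
// Hypothetical sketch: only the label strings come from the list above.
type IdLabel = "ID-PNR" | "ID-SNR";

// Accepts YYMMDD-XXXX, YYMMDD+XXXX, YYYYMMDD-XXXX, and the same without a separator.
const ID_RE = /^(\d{2})?(\d{2})(\d{2})(\d{2})[-+]?(\d{4})$/;

// Luhn checksum over the 10-digit form (YYMMDDXXXX): every second digit,
// starting with the first, is doubled and reduced to its digit sum.
function luhnValid(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = digits.charCodeAt(i) - 48;
    if (i % 2 === 0) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// Return the label for a well-formed identifier, or null if it fails the
// shape, checksum, or (loose) date checks.
function classifySwedishId(input: string): IdLabel | null {
  const m = ID_RE.exec(input.trim());
  if (m === null) return null;
  const [, , yy, mm, dd, serial] = m; // the optional century is not part of the checksum
  if (!luhnValid(yy + mm + dd + serial)) return null;
  const month = Number(mm);
  const day = Number(dd);
  if (month < 1 || month > 12) return null;
  if (day >= 1 && day <= 31) return "ID-PNR";
  if (day >= 61 && day <= 91) return "ID-SNR"; // day of birth + 60
  return null;
}
```

Running the checksum after the regex match filters out most random ten-digit strings, which a pattern alone would flag as identifiers.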

Contributors: osim-sbab