signal_sqlite_md's Introduction

signal_sqlite_md

Convert messages from a Signal SQLite database export to Markdown.

Unlike my signal_md which requires output from signald, this one requires nothing beyond this Python script, configuration, and a tool to export the DB.

Disclaimer

I probably should've called it signal_sqlite_csv_md because the script doesn't read directly from the SQLite DB; instead, it parses a CSV export from it. I tried directly accessing the DB and gave up 🤣

Context

A big shoutout to Florian Engel whose post [1] saved me hours 🤗.

The SQLite DB is encrypted but it's easy to decrypt because you have the key!

The attachments are not in the DB; they're stored in the file system in a series of folders with two-digit hex labels. The files have names like "000ec9a54abe93416284f83da2f9f8d124778f22191d9422ed9829de2b22c1b7" with no suffix, but don't worry: that info is in the DB and the script takes care of adding the suffix, e.g. ".jpg".
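To make the suffix idea concrete, here is a minimal sketch of how a suffix could be derived from a MIME content type. It is not the script's actual code: the add_suffix function and the KNOWN_SUFFIXES table are hypothetical names made up for illustration, and it assumes the DB stores a content type such as "image/jpeg" for each attachment.

```python
import mimetypes

# Explicit choices for a few common types; this table is an illustrative
# assumption, not something taken from signal_sqlite_md itself.
KNOWN_SUFFIXES = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "video/mp4": ".mp4",
}


def add_suffix(file_name: str, content_type: str) -> str:
    """Return file_name with a suffix derived from its MIME content type."""
    suffix = (
        KNOWN_SUFFIXES.get(content_type)
        or mimetypes.guess_extension(content_type)  # None when the type is unknown
        or ""
    )
    return file_name + suffix


print(add_suffix("000ec9a5", "image/jpeg"))
```

An unknown content type leaves the name unchanged, which seems safer than guessing a suffix.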

Another approach would be to query the SQLite DB directly on device but that's a future thing for me. A good reference is https://github.com/idanlevin/signal-query

Dependencies

The code in this repo relies heavily on my message_md classes which contain generic Message, Person, Group and other classes and the methods to convert messages to Markdown files. Be sure to read the README and the configuration guide for that repo first.

References

  1. Extracting Messages from Signal Desktop by Florian Engel guided my way
  2. DB Browser for SQLite to get your data
  3. message_md upon which this tool depends

High level process

  1. Get your Signal data
  2. Configure this tool
  3. Run this tool
  4. Be happy

It's your data, go get it!

The tool needs two files of data exported from Signal's SQLite database:

  1. messages.csv - the actual messages
  2. conversations.csv - the mapping of conversation-id to people and groups

The steps below describe how to get these two sets of data out of Signal.

Steps

  1. Install DB Browser for SQLite - [2]
  2. Find the key to your SQLite DB, see [1]
    • For me, on Windows, with user micro it was here: C:\Users\micro\AppData\Roaming\Signal\config.json
  3. Find the path to your Signal db.sqlite database file
    • For me, it was here: C:\Users\micro\AppData\Roaming\Signal\sql\
  4. Launch "DB Browser for SQLite (SQLCipher)" -- not the one without (SQLCipher)
  5. Click "Open Database"
  6. Choose Raw key from the menu to the right of the "Password" field
  7. In the "Password" field, type 0x and then paste the key you found in step 2
  8. Right click on "messages" and click "Export as CSV file"
  9. Right click on "conversations" and click "Export as CSV file"
  10. Find the attachments
    • Mine were under: C:\Users\micro\AppData\Roaming\Signal\attachments.noindex
  11. Copy the attachments to the same folder (no subfolders) as the CSV file
    • the cp_signal_attachments.sh shell script made it easier for me
    • I had to use dos2unix on that shell script file before it worked
    • NOTE: I can improve this later to get the files directly, being lazy!
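
If the shell script gives you trouble, the copy step can also be sketched in Python. This is an alternative to cp_signal_attachments.sh, not a port of it: flatten_attachments is a hypothetical helper, and the demo uses made-up file names in a throwaway folder instead of a real Signal install.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_attachments(attachments_dir: Path, target_dir: Path) -> int:
    """Copy every file below attachments_dir straight into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    # Build the full list before copying anything.
    files = [p for p in attachments_dir.rglob("*") if p.is_file()]
    for path in files:
        shutil.copy2(path, target_dir / path.name)
    return len(files)


# Small demo against a throwaway folder layout (made-up file names).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "attachments.noindex"
    for folder, name in [("0b", "0b82ab19"), ("1c", "1c44fe07")]:
        (source / folder).mkdir(parents=True)
        (source / folder / name).write_bytes(b"demo")
    target = Path(tmp) / "signal_sqlite"
    copied = flatten_attachments(source, target)
    flat_names = sorted(p.name for p in target.iterdir())

print(copied, flat_names)
```

For a real run, you would pass your own attachments.noindex path and the folder holding the CSV file. Keep the target folder outside the attachments folder so a second run doesn't pick up its own copies.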

Setting up the config files

The next step is to configure this tool.

You'll need to define each person that you communicate with in people.json and the groups in groups.json. This way the tool can associate each message with the person that sent it and who it was sent to.

Samples of these configuration files are in the message_md repo, upon which this tool depends.

This part is tedious the first time and needs to be updated when you add new contacts or Groups in Signal, i.e. a pain.

Someday I can automate this but for now, no pain, no gain 🙂.

The next sections describe how people are identified and where to find the identifiers for groups.

People

People are matched to rows in the conversations.csv file via their phone number in the e164 column. The conversations.csv file is assumed to be in the source folder.
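
As a rough illustration of that lookup, the sketch below maps each phone number to the conversation id on the same row. The id and e164 column names come from the conversations.csv header; the function name and the sample rows are made up, and the real script may do this differently.

```python
import csv
import io


def conversation_ids_by_phone(csv_file) -> dict[str, str]:
    """Map each e164 phone number to the conversation id on the same row."""
    ids = {}
    for row in csv.DictReader(csv_file):
        phone = (row.get("e164") or "").strip()
        if phone:  # group rows have no phone number, so they are skipped
            ids[phone] = row["id"]
    return ids


# Made-up sample with only a few of the real columns.
sample = io.StringIO(
    "id,name,e164,groupId\n"
    "c1,SpongeBob,+15550100,\n"
    "c2,Family,,FdibKUgQ=\n"
)
lookup = conversation_ids_by_phone(sample)
print(lookup)
```

With a real export, you would open conversations.csv with open(path, newline="", encoding="utf-8") and pass the file object in.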

Groups

  1. Open the conversations.csv file in your favorite editor
  2. Look at the first row
id,json,active_at,type,members,name,profileName,profileFamilyName,profileFullName,e164,serviceId,groupId,profileLastFetchedAt
  3. If there's a groupId field value on a given row, then that's a group
    • the name field will tell you the name of the group
""id"":""a1760c87-d3d0-40f6-9992-ac0426efcc14""
""groupId"":""FdibKUgQIZPilWQu3jbgEB+tajc3RUKuoyYNZp4bRhQ=""
""name"":""Family""
  4. Add the corresponding row to groups.json:
    • set group id to the id from conversations.csv
    • set the conversation-id to the groupId from Step 3
    • set slug to a one-word or hyphenated keyword (slug) for this group
    • set description to either the name from conversations.csv or something else, e.g. "Family"
  5. Repeat Steps 3 and 4 for every row
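
To avoid eyeballing every row, the group rows can be listed with a few lines of Python. This is only a sketch: find_groups is a hypothetical helper, the sample is trimmed to four of the real columns, and it prints the raw values rather than writing groups.json, since the exact layout of that file is documented in the message_md repo.

```python
import csv
import io


def find_groups(csv_file) -> list[dict[str, str]]:
    """Return id, groupId and name for every row that has a groupId."""
    groups = []
    for row in csv.DictReader(csv_file):
        group_id = (row.get("groupId") or "").strip()
        if group_id:
            groups.append(
                {"id": row["id"], "groupId": group_id, "name": row.get("name") or ""}
            )
    return groups


# Sample trimmed to the columns used here; the second row is a made-up person.
sample = io.StringIO(
    "id,name,e164,groupId\n"
    "a1760c87-d3d0-40f6-9992-ac0426efcc14,Family,,"
    "FdibKUgQIZPilWQu3jbgEB+tajc3RUKuoyYNZp4bRhQ=\n"
    "c1,SpongeBob,+15550100,\n"
)
groups = find_groups(sample)
for group in groups:
    print(group["id"], group["groupId"], group["name"])
```

Each printed line gives you the values to paste into one groups.json entry.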

Using signal_sqlite_md

Once you have the two CSV export files and your people.json and groups.json configured, you're finally ready to run this tool.

The command line options are described in the message_md repo.

Example:

# python3 signal_sqlite_md.py -c ../../dev-output/config -s ../../signal_sqlite/ -f messages.csv -d -o ../../dev-output -m spongebob -b 2023-12-20

where:

  • config settings are in ../../dev-output/config
  • source folder is ../../signal_sqlite
  • file of CSV messages is messages.csv in the source folder
  • output the Markdown files to ../../dev-output
  • my slug is spongebob
  • begin the export from 2023-12-20

Other info

In the messages.csv file, the attachments are referenced in this part of the message:

""path"":""0b\\0b82ab19cb4cab30f5041f7705aa890833cab2c32d662c2792814e0268c90e6c""

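For the curious, here is a hedged sketch of pulling that path out of a message's JSON. The path key is the one shown above; the attachments and contentType key names and the attachment_paths helper are my assumptions for illustration, so check them against your own export.

```python
import json
from pathlib import PureWindowsPath


def attachment_paths(message_json: str) -> list[str]:
    """Return the path of every attachment mentioned in one message's JSON."""
    message = json.loads(message_json)
    # "attachments" is an assumed key name; "path" is the one shown above.
    return [a["path"] for a in message.get("attachments", []) if a.get("path")]


# Raw string so the JSON-escaped backslash (\\) survives as written.
sample = (
    r'{"attachments": [{"contentType": "image/jpeg", "path": '
    r'"0b\\0b82ab19cb4cab30f5041f7705aa890833cab2c32d662c2792814e0268c90e6c"}]}'
)
paths = attachment_paths(sample)
# The export shown here uses Windows separators, so split with PureWindowsPath.
file_names = [PureWindowsPath(p).name for p in paths]
print(file_names)
```

The part before the backslash is the two-digit folder and the part after it is the file name, which is all you need once the attachments have been copied into one flat folder.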
signal_sqlite_md's People

Contributors

thephm

signal_sqlite_md's Issues

Use the conversations.csv to get the conversation IDs

The current implementation requires the definition of each conversation mapped to a person so the messages file can be parsed.

There are two issues:

  1. It's a pain to have to go into the conversations.csv file, copy the conversation IDs, and paste them into the config file
  2. The conversation IDs can change over time (I think)

Since the person's phone number is also in the conversations.csv file inside the e164 field, just parse this CSV file, find the corresponding conversation IDs, and keep those with each person.
