
message_md's Introduction

message_md

Code to hold Messages and convert them to Markdown.

Also includes all the supporting classes for Person, Group, Setting, String, Config, Attachments so the client code only needs to deal with the app-specific parsing of message files.

Configuration

Read the guide to learn how to configure the library.

Command line options

Any app that uses this library will inherit these command line options.

IMPORTANT: by default, the beginning date for parsing is today. This makes it easier (and faster) to get results and check that everything is working, because you only have to look at a day's worth of messages. Once you're ready to parse everything, use something like -b 1970-01-01 to get all the messages.

| Argument | Alternate | Description |
|----------|-----------|-------------|
| -c | --config | Folder where the configuration files are |
| -s | --sourceFolder | Folder where the message file is |
| -f | --file | The filename of the file containing all of the messages to be converted |
| -o | --outputFolder | Where the resulting Markdown files will go |
| -l | --language | UI language, defaults to English |
| -m | --mySlug | Which person in the config file is me e.g. bob |
| -d | --debug | Print debug messages |
| -b | --begin | The date from which to start converting |
| -i | --imap | IMAP server address |
| -r | --folders | IMAP folders to retrieve from |
| -e | --email | Email address to retrieve from |
| -p | --password | Email password |
| -x | --max | The maximum number of messages to process |
| -a | --add | Add people to the output even if not in people.json config file |
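
To illustrate how options like these could be declared, here is a minimal sketch using Python's standard argparse module. It is not the library's actual parser: the function name build_parser is hypothetical, only a few of the options are shown, and the "default to today" behaviour for -b is modelled on the note above.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the options in the table."""
    parser = argparse.ArgumentParser(description="Convert messages to Markdown")
    parser.add_argument("-c", "--config", help="Folder where the configuration files are")
    parser.add_argument("-o", "--outputFolder", help="Where the resulting Markdown files will go")
    parser.add_argument("-d", "--debug", action="store_true", help="Print debug messages")
    parser.add_argument(
        "-b", "--begin",
        type=date.fromisoformat,   # accepts YYYY-MM-DD
        default=date.today(),      # assumption: begin today unless told otherwise
        help="The date from which to start converting",
    )
    parser.add_argument("-x", "--max", type=int, help="The maximum number of messages to process")
    return parser


# Parse everything since 1970, but stop after 10 messages
args = build_parser().parse_args(["-b", "1970-01-01", "-x", "10"])
print(args.begin, args.max)
```

Passing date.fromisoformat as the type means a malformed date is rejected by argparse with a usage error instead of surfacing later during parsing.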

License

Apache License 2.0

message_md's People

Contributors

thephm


message_md's Issues

Add a config folder option

Need a way to point the script to where all of the config files are located (settings.json, strings.json, people.json, groups.json, MIMETypes.json)

Add a -c <folderName> (or --config) option

Create an audit trail of messages processed

Could be as simple as

date, time, source_slug, destination_slug, group_slug, message_id, message_date, message_time, result_code

Where:

  • date and time are when the message was processed
  • source_slug is the person who sent the message or blank if not found
  • destination_slug is who it was sent to, or blank if to a group or not found
  • group_slug is the group it was sent to, or blank if to a single person or not found
  • message_date and message_time are the actual date and time the message was sent
  • result_code set to 0 if no error or some non-zero value if an error e.g. failed to find the person

Could also include the source_filename which might be helpful to remember which file was processed.
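
A rough sketch of what such an audit trail might look like as a CSV file follows. The column names come from the list above; the function names append_audit_row and processed_ids, the file name audit.csv, and the time formats are assumptions made for illustration only.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

# Column order taken from the proposal above
AUDIT_FIELDS = [
    "date", "time", "source_slug", "destination_slug", "group_slug",
    "message_id", "message_date", "message_time", "result_code",
]


def append_audit_row(log_path, message_id, sent_at, source_slug="",
                     destination_slug="", group_slug="", result_code=0):
    """Append one processed-message record, writing the header on first use."""
    path = Path(log_path)
    is_new = not path.exists() or path.stat().st_size == 0
    now = datetime.now()  # when the message was processed
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=AUDIT_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": now.date().isoformat(),
            "time": now.strftime("%H:%M:%S"),
            "source_slug": source_slug,
            "destination_slug": destination_slug,
            "group_slug": group_slug,
            "message_id": message_id,
            "message_date": sent_at.date().isoformat(),
            "message_time": sent_at.strftime("%H:%M"),
            "result_code": result_code,
        })


def processed_ids(log_path):
    """Return the message IDs already logged, so a later run can skip them."""
    path = Path(log_path)
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as f:
        return {row["message_id"] for row in csv.DictReader(f)}


# Hypothetical usage with a throwaway log file
with tempfile.TemporaryDirectory() as folder:
    log = Path(folder) / "audit.csv"
    append_audit_row(log, "msg-1", datetime(2023, 11, 3, 20, 9),
                     source_slug="bob-loblaw", destination_slug="me")
    seen = processed_ids(log)
```

Reading the logged IDs back before each run is one way to avoid duplicates when processing in batches.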

This way I could have an interactive processing which I think will be needed for email. Over the years, I had manually copy/pasted many emails into my DB and so I don't want to have duplicates.

Having a log would allow me to process in batches when I have time and not have to remember where I left off.

Similarly, people wouldn't have to remember when they last ran one of the tools that use this library. Right now, they have to use -b YYYY-MM-DD to set the date from which to begin processing.

Add the ability to convert messages from a specific date forward

Add an option -b YYYY-MM-DD to convert messages from, and including, that date forward

With Signal and SMS I would delete the messages/conversations with people once they've been exported to Markdown but with LinkedIn, there's no easy way to delete messages. Plus, it's handy to keep messages in the native service anyway.
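
The "from, and including, that date" rule in this issue can be sketched as a simple filter. The message dictionaries and the sent_at key are hypothetical stand-ins, not the library's Message class.

```python
from datetime import date, datetime


def from_date_forward(messages, begin):
    """Keep messages sent on or after `begin` (the begin date is included)."""
    return [m for m in messages if m["sent_at"].date() >= begin]


messages = [
    {"id": 1, "sent_at": datetime(2023, 12, 9, 23, 59)},
    {"id": 2, "sent_at": datetime(2023, 12, 10, 0, 0)},
    {"id": 3, "sent_at": datetime(2023, 12, 11, 8, 30)},
]

# Equivalent to running with -b 2023-12-10
kept = from_date_forward(messages, date(2023, 12, 10))
```

Comparing on the date part only means a message sent at midnight on the begin date is kept.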

Refactor to move parameter parsing into config.py

Command line parameter parsing should've been in the config

Eventually the clients to the library should be adding their own parameters because right now there are client specific parameters in this generic library (e.g. "imap-server")

Add the group slug to `tags` in frontmatter for group messages

It would be helpful to include the slug for a group in a chat message so I can find all of the chats with that group more easily. Right now I'd have to parse the people field.

For example, I have a Signal group with my 4 sisters we call the "Bottom of the 9" since we are the bottom 5 of 9 kids :)

---
tags: [chat, bot9]
---
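
One way the tags line could be built is sketched below. The function names are hypothetical, and the sketch assumes every chat message carries the chat tag, with the group slug appended only for group messages.

```python
def frontmatter_tags(group_slug=None):
    """Return the tags for a message; group messages also get the group slug."""
    tags = ["chat"]
    if group_slug:
        tags.append(group_slug)
    return tags


def render_frontmatter(tags):
    """Render only the tags line, wrapped in frontmatter delimiters."""
    return "---\n" + f"tags: [{', '.join(tags)}]\n" + "---\n"


print(render_frontmatter(frontmatter_tags("bot9")))
```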

A converted message had the wrong people label

A converted message had the wrong people label. In my personal export it had bob-loblaw, but it should have been someone else who was in the config.json file and had a valid linkedin-id.

---
tags: [chat]
people: [bob-loblaw, me]
date: 2023-11-03
time: 20:09
service: linkedin
---

The bob-loblaw entry had a blank linkedin-id in the config.json file, so maybe that's why.

{"person-slug": "bob-loblaw", "first-name": "Bob", "last-name": "Loblaw", "number": "6135551212", "linkedin-id": ""},
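
If the blank linkedin-id is indeed the cause (this is only a guess, as the issue says), a lookup that ignores empty IDs would avoid the mismatch. The sketch below is hypothetical and does not reproduce the library's lookup code; it only uses the field names from the JSON entry above.

```python
def find_person_by_linkedin_id(people, linkedin_id):
    """Return the first person whose linkedin-id matches, ignoring blank IDs."""
    if not linkedin_id:
        # A blank ID in the export must never match a blank ID in the config
        return None
    for person in people:
        if person.get("linkedin-id") == linkedin_id:
            return person
    return None


people = [
    {"person-slug": "bob-loblaw", "linkedin-id": ""},
    {"person-slug": "someone-else", "linkedin-id": "theidhere"},
]

match = find_person_by_linkedin_id(people, "theidhere")
no_match = find_person_by_linkedin_id(people, "")
```

Without the guard, a message with a missing ID would compare equal to bob-loblaw's empty string and be labelled with his slug.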

Display people who are not found

It's important to know who is not in config\people.json but in the LinkedIn export file so I can add them. In the future this could be automated and/or just use the LinkedIn contacts file.
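
A minimal sketch of how the missing people could be collected for display, assuming the IDs have already been extracted from the export file. The function name and the sample data are hypothetical.

```python
def unknown_linkedin_ids(export_ids, people):
    """Return IDs seen in the export that no configured person has, sorted."""
    known = {p["linkedin-id"] for p in people if p.get("linkedin-id")}
    return sorted(set(export_ids) - known)


people = [{"person-slug": "bob-loblaw", "linkedin-id": "bob-id"}]
missing = unknown_linkedin_ids(["bob-id", "new-contact", "new-contact"], people)

for linkedin_id in missing:
    print(f"Not found in people.json: {linkedin_id}")
```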

Allow for multiple LinkedIn IDs

Maybe do this for all services since people could have multiple Twitter accounts.

Why? I notice that as people change their LinkedIn profiles the ID can change in the URL, e.g. theidhere

https://www.linkedin.com/in/theidhere/
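
One possible shape for this, sketched under the assumption that the existing linkedin-id field could hold either a single string or a list of strings. That schema change is hypothetical and not something the config files support today.

```python
def linkedin_ids(person):
    """Return the set of non-blank LinkedIn IDs for a person entry."""
    value = person.get("linkedin-id", "")
    ids = value if isinstance(value, list) else [value]
    return {i for i in ids if i}


def has_linkedin_id(person, linkedin_id):
    """True if the person is known under the given LinkedIn ID."""
    return linkedin_id in linkedin_ids(person)


old_style = {"person-slug": "bob-loblaw", "linkedin-id": "theidhere"}
new_style = {"person-slug": "bob-loblaw", "linkedin-id": ["theidhere", "thenewid"]}
```

Accepting both forms would keep existing config files working while allowing old and new IDs to resolve to the same person.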

Creating a bunch of "empty" message files

It may be specific to the LinkedIn export, but I'm seeing a large number of these files.

This may be because group conversations aren't implemented in the LinkedIn script.


Create an additional file if there's one existing for the date

Currently, when exporting messages, if there's an existing YYYY-MM-DD.md file, it moves on to the next date.

This was the case because there was no option to specify the start date, and I didn't want messages to be overwritten or appended to existing files, duplicating messages in a dated output file.

Instead, create an additional dated file with a number, incremented for each additional file

For example, if 2023-12-10.md exists, create 2023-12-10 - 1.md. If 2023-12-10 - 1.md exists, create 2023-12-10 - 2.md and so on.
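
The numbering scheme in the example above can be sketched with pathlib. The function name next_dated_path is hypothetical, and the sketch assumes the date is already formatted as a string.

```python
import tempfile
from pathlib import Path


def next_dated_path(folder, day):
    """Return '<day>.md', or '<day> - N.md' with the lowest unused N."""
    base = Path(folder) / f"{day}.md"
    if not base.exists():
        return base
    n = 1
    while True:
        candidate = Path(folder) / f"{day} - {n}.md"
        if not candidate.exists():
            return candidate
        n += 1


# Hypothetical usage in a throwaway folder
with tempfile.TemporaryDirectory() as folder:
    first = next_dated_path(folder, "2023-12-10")
    first.touch()
    second = next_dated_path(folder, "2023-12-10")
    second.touch()
    third = next_dated_path(folder, "2023-12-10")
    names = [first.name, second.name, third.name]
```

Here names ends up as 2023-12-10.md, 2023-12-10 - 1.md and 2023-12-10 - 2.md, matching the example.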

This way, the user will have the messages and be able to manually de-dup the files. If I was smarter, I'd check the contents of the existing file and modify it by injecting the messages but that's overkill for me 😂

Redact passwords

If there is password: blah or password blah or password is blah or pwd: blah or pwd blah or pwd is blah then change blah to *****. Also, Password or Pwd.
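
The patterns listed above could be handled with a single case-insensitive regular expression. This is a rough sketch, not the library's implementation: the function name redact_passwords is hypothetical, and a pattern this loose will produce false positives (for example, "the password was changed" would have "was" redacted).

```python
import re

# keyword, then ':' or ' is ' or plain whitespace, then the secret itself
PASSWORD_PATTERN = re.compile(
    r"\b(password|pwd)(\s*:\s*|\s+is\s+|\s+)(\S+)",
    re.IGNORECASE,
)


def redact_passwords(text):
    """Replace the word following password/pwd with five asterisks."""
    return PASSWORD_PATTERN.sub(lambda m: m.group(1) + m.group(2) + "*****", text)


print(redact_passwords("The wifi Password is hunter2 for now"))
```

Keeping the keyword and separator in captured groups preserves the original wording and capitalisation; only the secret is replaced.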
