n0th1ng-else / voice-to-text-bot

Telegram bot that converts Voice messages into text

Home Page: https://t.me/AudioMessBot

License: MIT License

Languages: JavaScript 5.77%, Dockerfile 0.21%, TypeScript 93.20%, HTML 0.80%, Shell 0.01%, Procfile 0.01%

Topics: bot, nodejs, speech-recognition, speech-to-text, telegram, telegram-api, telegram-bot, typescript

voice-to-text-bot's Issues

Handle duplicates in the db

For some reason, there are duplicate records in the development DB. We need to handle such cases and avoid creating duplicates as much as possible.
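
One possible way to collapse duplicates on the application side is sketched below. The row shape (`chatId`, `langId`, `updatedAt`) is hypothetical, since the actual schema is not shown in this issue; the idea is simply to keep the most recently updated row per chat.

```typescript
// Hypothetical row shape; the real table schema is not shown in this issue.
interface LangRow {
  chatId: number;
  langId: string;
  updatedAt: number; // epoch milliseconds
}

// Keep only the most recently updated row for each chatId.
function dedupeByChat(rows: LangRow[]): LangRow[] {
  const newest = new Map<number, LangRow>();
  for (const row of rows) {
    const seen = newest.get(row.chatId);
    if (!seen || row.updatedAt > seen.updatedAt) {
      newest.set(row.chatId, row);
    }
  }
  return Array.from(newest.values());
}
```

If the storage supports it, a unique constraint on the chat identifier would likely prevent duplicates at the source; this sketch only covers cleanup of rows that already exist.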

Fix storing chat name

Right now the stat record stores the name of the last user who sent a voice message. For groups, it should store the group name instead.
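
A minimal sketch of picking the name from the chat rather than from the sender. The `TgChat` interface mirrors a subset of the Telegram Bot API `Chat` object; the helper name `getChatDisplayName` is made up for illustration and is not taken from the bot's code.

```typescript
// Subset of the Telegram Bot API Chat object used here.
interface TgChat {
  type: "private" | "group" | "supergroup" | "channel";
  title?: string; // set for groups, supergroups and channels
  username?: string;
  first_name?: string; // set for private chats
  last_name?: string;
}

// Hypothetical helper: group title for groups, user name for private chats.
function getChatDisplayName(chat: TgChat): string {
  if (chat.type !== "private" && chat.title) {
    return chat.title;
  }
  const fullName = [chat.first_name, chat.last_name].filter(Boolean).join(" ");
  return fullName || chat.username || "";
}
```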

iPhone files

2020-07-22T10:13:06.966690+00:00 app[web.1]: /usr/src/app/node_modules/prism-media/src/opus/OggDemuxer.js:56
2020-07-22T10:13:06.966690+00:00 app[web.1]:       throw Error(`capture_pattern is not ${OGGS_HEADER}`);
2020-07-22T10:13:06.966691+00:00 app[web.1]:       ^
2020-07-22T10:13:06.966692+00:00 app[web.1]:
2020-07-22T10:13:06.966693+00:00 app[web.1]: Error: capture_pattern is not OggS
2020-07-22T10:13:06.966693+00:00 app[web.1]:     at OggDemuxer._readPage (/usr/src/app/node_modules/prism-media/src/opus/OggDemuxer.js:56:13)
2020-07-22T10:13:06.966694+00:00 app[web.1]:     at OggDemuxer._transform (/usr/src/app/node_modules/prism-media/src/opus/OggDemuxer.js:36:27)
2020-07-22T10:13:06.966694+00:00 app[web.1]:     at OggDemuxer.Transform._read (_stream_transform.js:191:10)
2020-07-22T10:13:06.966695+00:00 app[web.1]:     at OggDemuxer.Transform._write (_stream_transform.js:179:12)
2020-07-22T10:13:06.966695+00:00 app[web.1]:     at doWrite (_stream_writable.js:403:12)
2020-07-22T10:13:06.966695+00:00 app[web.1]:     at writeOrBuffer (_stream_writable.js:387:5)
2020-07-22T10:13:06.966696+00:00 app[web.1]:     at OggDemuxer.Writable.write (_stream_writable.js:318:11)
2020-07-22T10:13:06.966696+00:00 app[web.1]:     at IncomingMessage.ondata (_stream_readable.js:717:22)
2020-07-22T10:13:06.966696+00:00 app[web.1]:     at IncomingMessage.emit (events.js:315:20)
2020-07-22T10:13:06.966697+00:00 app[web.1]:     at IncomingMessage.EventEmitter.emit (domain.js:482:12)
2020-07-22T10:13:06.978649+00:00 app[web.1]: npm ERR! code ELIFECYCLE
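
The trace shows the demuxer throwing on the first page because the stream does not start with the `OggS` capture pattern, presumably because the uploaded file is not an Ogg container. One possible guard, sketched here under that assumption, is to inspect the first four bytes before handing the stream to the demuxer. The function name `looksLikeOgg` is hypothetical.

```typescript
// "OggS" in ASCII: the capture pattern that begins every Ogg page.
const OGG_CAPTURE_PATTERN = [0x4f, 0x67, 0x67, 0x53];

// Hypothetical pre-check: true only if the data starts with "OggS".
function looksLikeOgg(firstBytes: Uint8Array): boolean {
  if (firstBytes.length < OGG_CAPTURE_PATTERN.length) {
    return false;
  }
  return OGG_CAPTURE_PATTERN.every((byte, i) => firstBytes[i] === byte);
}
```

Files that fail the check could be rejected with a user-facing message or routed to a different decoder instead of crashing the process.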

Log termination triggers

Right now the logs do not show what happened to the replica when an external system tries to shut it down. We want to track such events.
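
A rough sketch of logging termination signals. The helper `logTerminationSignals` and the `SignalSource` interface are made up for illustration; the interface is kept minimal so that Node's `process` object, or a test double, can be passed in.

```typescript
// Minimal shape shared by Node's `process` and any EventEmitter-like object.
interface SignalSource {
  on(event: string, listener: () => void): unknown;
}

// Signals an external system typically sends to stop a replica.
const TERMINATION_SIGNALS = ["SIGTERM", "SIGINT"];

// Hypothetical helper: write a log line whenever a termination signal arrives.
function logTerminationSignals(
  source: SignalSource,
  log: (message: string) => void,
): void {
  for (const signal of TERMINATION_SIGNALS) {
    source.on(signal, () => log(`Received ${signal}, shutting down`));
  }
}
```

In the app this would be wired up once at startup, for example `logTerminationSignals(process, (m) => console.warn(m))`. Note that `SIGKILL` cannot be intercepted, so a forced kill would still leave no trace in the logs.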

analytics

  • add exceptions
  • add thread id for events

Collection of API errors to be addressed

tag:app.8cf913a16282f43bfdad831afe02f5adf638a0d0
logtype:json
http:
clientHost:54.74.88.42
contentType:application/json
json:
id:telegram-bot
level:error
message:Unable to recognize the file
metadata-0-code:13
metadata-0-details:Received RST_STREAM with code 2 (Internal server error)
metadata-0-note:Exception occurred in retry method that was not classified as transient
prefix:no
timestamp:2020-08-12T17:24:01.039Z
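
The logged code 13 is the gRPC `INTERNAL` status, and the note says the failure was not classified as transient. A minimal sketch of a classifier that treats this particular `RST_STREAM` case as retryable is shown below. Whether a retry is actually safe here is an assumption, and `shouldRetryRecognition` is a hypothetical name.

```typescript
// gRPC status codes (as defined by the gRPC specification).
const GRPC_INTERNAL = 13;
const GRPC_UNAVAILABLE = 14;

// Shape of the error fields seen in the log entry above.
interface GrpcLikeError {
  code?: number;
  details?: string;
}

// Hypothetical classifier: retry on UNAVAILABLE, and on INTERNAL only
// when the details mention a reset stream.
function shouldRetryRecognition(err: GrpcLikeError): boolean {
  if (err.code === GRPC_UNAVAILABLE) {
    return true;
  }
  return err.code === GRPC_INTERNAL && (err.details ?? "").includes("RST_STREAM");
}
```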

Add more mime types

  • audio/mpeg
  • audio/mp3
  • audio/m4a
  • audio/x-m4a
  • audio/mpeg3
  • audio/x-vorbis+ogg
  • audio/x-aac
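
The list above could be expressed as a lookup set, sketched here. The constant and function names are hypothetical, and the set contains only the types named in this issue, not the ones the bot already accepts.

```typescript
// MIME types requested in this issue (existing supported types not included).
const EXTRA_AUDIO_MIME_TYPES = new Set<string>([
  "audio/mpeg",
  "audio/mp3",
  "audio/m4a",
  "audio/x-m4a",
  "audio/mpeg3",
  "audio/x-vorbis+ogg",
  "audio/x-aac",
]);

// Hypothetical check: case-insensitive, ignores parameters such as "; codecs=...".
function isExtraAudioMimeType(mimeType?: string): boolean {
  if (!mimeType) {
    return false;
  }
  const base = mimeType.split(";")[0].trim().toLowerCase();
  return EXTRA_AUDIO_MIME_TYPES.has(base);
}
```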

API entrypoint from previous replica

server
level:error
message:Unknown route /bot/message/c4b247a3e37e43eb7e140a2576138926
timestamp:2020-07-01T19:49:03.595Z

It sounds like we want to handle every /bot/message/:id route and log an error or warning when the id is not our entrypoint.
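
A minimal sketch of that idea as a pure function that classifies an incoming path. The names `classifyBotRoute` and `RouteMatch` are made up for illustration; how the result is wired into the server is left open.

```typescript
type RouteMatch =
  | { kind: "current" } // id matches this replica's entrypoint
  | { kind: "stale"; id: string } // well-formed route, but a different id
  | { kind: "unknown" }; // not a /bot/message/:id route at all

// Hypothetical classifier for incoming webhook paths.
function classifyBotRoute(path: string, currentId: string): RouteMatch {
  const match = /^\/bot\/message\/([^/]+)\/?$/.exec(path);
  if (!match) {
    return { kind: "unknown" };
  }
  const id = match[1];
  return id === currentId ? { kind: "current" } : { kind: "stale", id };
}
```

A `stale` result could then be logged as a warning that mentions the previous replica, rather than as an unknown-route error.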

Set up whatsapp bot

Since the process is all set, it would be interesting to see whether we can implement the same feature in the WhatsApp messenger.

Render more charts

  • Groups vs Direct messages
  • Groups usage vs Direct messages (in total files recognized)
  • Installs vs Usages redesign - show percentage
  • Installs per day cumulative
  • Installs per day average (add on the existing graph)

Track exact user actions (in logs)

Right now, when analyzing a particular issue in the logs, it is unclear which action the user performed. We need to enrich the logs with more data: what kind of action was called, whether it was successful, and so on.

[DB] is unavailable: Unable to get the lang

id:telegram-bot
level:error
message:Unable to get the lang
timestamp:2020-06-16T09:28:33.200Z
0:
code:100
message:XMLHttpRequest failed: {"UNSENT":0,"OPENED":1,"HEADERS_RECEIVED":2,"LOADING":3,"DONE":4,"readyState":4,"responseText":"","responseXML":"","status":503,"statusText":null,"withCredentials":false}

Log collector throws an error

TypeError: Cannot convert undefined or null to object
at Function.getOwnPropertyNames (<anonymous>)
at convertDataItem (/usr/src/app/dist/src/logger/integration.js:39:19)
at /usr/src/app/dist/src/logger/integration.js:78:30
at Array.reduce (<anonymous>)
at sendLogs (/usr/src/app/dist/src/logger/integration.js:77:16)
at Logger.warn (/usr/src/app/dist/src/logger/index.js:52:36)
at /usr/src/app/dist/src/recognition/wit.ai.js:75:24
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async Promise.all (index 0)
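
`Object.getOwnPropertyNames` throws this `TypeError` when it is given `null` or `undefined`, so the trace suggests `convertDataItem` received an empty data item. A minimal guard might look like the sketch below; `safePropertyNames` is a hypothetical helper, not the function from `integration.js`.

```typescript
// Hypothetical guard: returns own property names for objects and an
// empty list for null, undefined and primitives instead of throwing.
function safePropertyNames(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return [];
  }
  return Object.getOwnPropertyNames(value);
}
```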

Bad Request: message is too long

{
  "@timestamp": "2023-01-02T15:27:39.766Z",
  "severity": "error",
  "message": "ETELEGRAM Request failed with status code 400",
  "level": "error",
  "title": "Unable to recognize the file",
  "id": "telegram-bot",
  "prefix": "thread-1",
  "appVersion": "app.0f1f995f3fb5622ec474a956c1261c5b111369ce",
  "code": 400,
  "response": {
    "ok": false,
    "error_code": 400,
    "description": "Bad Request: message is too long"
  },
  "url": "/bot/sendMessage",
  "stack": "Error: ETELEGRAM Request failed with status code 400\n    at /usr/src/app/dist/src/telegram/api/index.js:127:19\n    at runMicrotasks (<anonymous>)\n    at processTicksAndRejections (node:internal/process/task_queues:96:5)\nAxiosError: Request failed with status code 400\n    at settle (/usr/src/app/node_modules/axios/dist/node/axios.cjs:1855:12)\n    at IncomingMessage.handleStreamEnd (/usr/src/app/node_modules/axios/dist/node/axios.cjs:2704:11)\n    at IncomingMessage.emit (node:events:539:35)\n    at IncomingMessage.emit (node:domain:475:12)\n    at endReadableNT (node:internal/streams/readable:1345:12)\n    at processTicksAndRejections (node:internal/process/task_queues:83:21)",
  "@timestamp_received": "2023-01-02T15:27:41.101Z",
  "logsene_orig_type": "application-logs"
}
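
The Telegram Bot API limits message text to 4096 characters, so a long transcription has to be sent in several messages. A minimal sketch of splitting the text is shown below. The function name `splitMessage` is hypothetical, and the length is measured with JavaScript's `string.length`, which may not match exactly how Telegram counts characters.

```typescript
// Telegram Bot API limit for the text of a single message.
const TELEGRAM_MAX_MESSAGE_LENGTH = 4096;

// Hypothetical helper: split text into chunks no longer than `limit`,
// preferring to break on a space and falling back to a hard cut.
function splitMessage(
  text: string,
  limit: number = TELEGRAM_MAX_MESSAGE_LENGTH,
): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const breakAt = rest.lastIndexOf(" ", limit);
    const cut = breakAt > 0 ? breakAt : limit;
    chunks.push(rest.slice(0, cut));
    // Drop the whitespace that the split happened on.
    rest = rest.slice(cut).replace(/^\s+/, "");
  }
  if (rest.length > 0) {
    chunks.push(rest);
  }
  return chunks;
}
```

Each chunk would then be sent with its own `sendMessage` call, in order.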

Log more details for duplicated records

I can see that duplicates occur in the production DB and that the service resolves them. But I don't know how one duplicated record differs from another. We need to put more details into the logs.
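
One way to make the difference visible is to log only the fields that differ between two duplicated records. The sketch below assumes records are flat objects; `differingFields` is a hypothetical helper.

```typescript
// Hypothetical helper: list the keys whose values differ between two
// flat records (compared with strict equality).
function differingFields(
  a: Record<string, unknown>,
  b: Record<string, unknown>,
): string[] {
  const keys = Object.keys(a).concat(Object.keys(b));
  return keys.filter(
    (key, index) => keys.indexOf(key) === index && a[key] !== b[key],
  );
}
```

The result could be attached to the existing log entry, for example as `{ duplicateDiff: ["createdAt", "langId"] }`, where the field names are illustrative only.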

Add support for groups

Currently, the bot is disabled for groups. It might be worth enabling it there. This needs some research.

Enrich readme

add badges (sonar, version, usages)

make table responsive

Handle 403

level:error
message:[Id] [ChatId] Unable to recognize the file AwACAgIAAxkBAAIxXF8FxNuSKWTDCsW3Hv5CxnDXq70fAAIoBwACR30wSBrbPQABj2tXZxoE
metadata-0-code:ETELEGRAM
timestamp:2020-07-08T13:07:06.488Z
metadata-0-response:{ headers: { date: "Wed, 08 Jul 2020 13:07:05 GMT", access-control-allow-origin: "*", server: "nginx/1.16.1", content-length: "84", content-type: "application/json", connection: "keep-alive", strict-transport-security: "max-age=31536000; includeSubDomains; preload", access-control-expose-headers: "Content-Length,Content-Type,Date,Server,Connection" }, request: { headers: { content-length: 3034, content-type: "application/x-www-form-urlencoded" }, method: "POST", uri: { path: "/:/sendMessage", protocol: "https:", hostname: "api.telegram.org", port: 443, host: "api.telegram.org", href: "https://api.telegram.org/sendMessage", slashes: true, pathname: "/:/sendMessage" } }, body: { description: "Forbidden: bot was blocked by the user", error_code: 403, ok: false }, statusCode: 403 }
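
The response body above carries `error_code: 403` with the description "Forbidden: bot was blocked by the user". A minimal sketch of recognizing this case so it can be logged at a lower level, or skipped, instead of being reported as a recognition failure. `isBlockedByUser` is a hypothetical name.

```typescript
// Shape of the error body returned by the Telegram Bot API.
interface TelegramErrorBody {
  ok: boolean;
  error_code?: number;
  description?: string;
}

// Hypothetical check for the "bot was blocked by the user" response.
function isBlockedByUser(body: TelegramErrorBody): boolean {
  return (
    body.ok === false &&
    body.error_code === 403 &&
    (body.description ?? "").includes("bot was blocked by the user")
  );
}
```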
