swiftraccoon / cpp-sdrtrunk-transcriber
Monitor directory for SDRTrunk Project25 mp3 files. Categorize files. Create transcription file.
License: GNU General Public License v3.0
Operating System: Ubuntu 22.04.4 LTS
Kernel: Linux 6.5.0-28-generic
$ make
[ 2%] Built target CLI11
[ 2%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/binary.cpp.o
[ 4%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/contrib/graphbuilderadapter.cpp.o
[ 6%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/contrib/graphbuilder.cpp.o
[ 8%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/convert.cpp.o
[ 10%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/depthguard.cpp.o
[ 12%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/directives.cpp.o
[ 14%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/emit.cpp.o
[ 17%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/emitfromevents.cpp.o
[ 19%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/emitter.cpp.o
[ 21%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/emitterstate.cpp.o
[ 23%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/emitterutils.cpp.o
[ 25%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/exceptions.cpp.o
[ 27%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/exp.cpp.o
[ 29%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/memory.cpp.o
[ 31%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/node.cpp.o
[ 34%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/node_data.cpp.o
[ 36%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/nodeevents.cpp.o
[ 38%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/nodebuilder.cpp.o
[ 40%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/null.cpp.o
[ 42%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/ostream_wrapper.cpp.o
[ 44%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/parse.cpp.o
[ 46%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/parser.cpp.o
[ 48%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/regex_yaml.cpp.o
[ 51%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/scanner.cpp.o
[ 53%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/scanscalar.cpp.o
[ 55%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/scantag.cpp.o
[ 57%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/scantoken.cpp.o
[ 59%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/simplekey.cpp.o
[ 61%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/singledocparser.cpp.o
[ 63%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/stream.cpp.o
[ 65%] Building CXX object external/yaml-cpp/CMakeFiles/yaml-cpp.dir/src/tag.cpp.o
[ 68%] Linking CXX static library libyaml-cpp.a
[ 68%] Built target yaml-cpp
[ 70%] Building CXX object external/yaml-cpp/util/CMakeFiles/yaml-cpp-read.dir/read.cpp.o
[ 72%] Building CXX object external/yaml-cpp/util/CMakeFiles/yaml-cpp-parse.dir/parse.cpp.o
[ 74%] Building CXX object external/yaml-cpp/util/CMakeFiles/yaml-cpp-sandbox.dir/sandbox.cpp.o
[ 76%] Building CXX object CMakeFiles/sdrTrunkTranscriber.dir/src/main.cpp.o
[ 78%] Linking CXX executable parse
[ 80%] Linking CXX executable read
[ 82%] Linking CXX executable sandbox
[ 82%] Built target yaml-cpp-sandbox
[ 82%] Built target yaml-cpp-read
[ 87%] Building CXX object CMakeFiles/sdrTrunkTranscriber.dir/src/curlHelper.cpp.o
[ 87%] Building CXX object CMakeFiles/sdrTrunkTranscriber.dir/src/DatabaseManager.cpp.o
[ 87%] Built target yaml-cpp-parse
[ 89%] Building CXX object CMakeFiles/sdrTrunkTranscriber.dir/src/fileProcessor.cpp.o
[ 91%] Building CXX object CMakeFiles/sdrTrunkTranscriber.dir/src/ConfigSingleton.cpp.o
[ 93%] Building CXX object CMakeFiles/sdrTrunkTranscriber.dir/src/transcriptionProcessor.cpp.o
/home/pi/cpp-sdrtrunk-transcriber/src/transcriptionProcessor.cpp:15:10: fatal error: nlohmann/json.hpp: No such file or directory
15 | #include <nlohmann/json.hpp>
| ^~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/sdrTrunkTranscriber.dir/build.make:146: CMakeFiles/sdrTrunkTranscriber.dir/src/transcriptionProcessor.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:219: CMakeFiles/sdrTrunkTranscriber.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
CodeFactor found an issue: Possible OS Command Injection (CWE-78)
It's currently on:
src\fileProcessor.cpp:114
Commit 9770ab0
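One common mitigation for CWE-78 is to quote any file path before it reaches the shell. The flagged source line is not shown here, so the sketch below assumes it builds a shell command string from an mp3 path. `shellQuote` is a hypothetical helper, not a function from the repository, and it targets POSIX shells only.

```cpp
#include <string>

// Hypothetical helper (not from the repository): wraps a string in single
// quotes for a POSIX shell so the shell treats it as one literal argument.
std::string shellQuote(const std::string& arg) {
    std::string quoted = "'";
    for (char c : arg) {
        if (c == '\'') {
            // Close the quoted run, emit an escaped quote, reopen the run.
            quoted += "'\\''";
        } else {
            quoted += c;
        }
    }
    quoted += "'";
    return quoted;
}
```

Avoiding the shell entirely (for example, passing an argument vector to an exec-family call) is generally the more robust option; quoting is the smaller change.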
presently the minimum mp3 duration is set in fileProcessor.cpp
it should move to config.yaml
[13:15:57] curlHelper.cpp curl_transcribe_audio Received response: {"text":"Thank you for watching."}
[13:15:57] curlHelper.cpp curl_transcribe_audio Valid response received.
Describe the bug
transcriptionProcessor.cpp JSON Error: [json.exception.parse_error.101] parse error at line 3, column 1: syntax error while parsing object key - unexpected '}'; expected string literal
implement optional flag --email to accept an email.
if optional flag --email detected we require:
--match regex
CodeFactor found an issue: Possible OS Command Injection (CWE-78)
It's currently on:
fileProcessor.cpp:72
Commit 7a92829
let's implement official debugging instead of commenting out stuff. preferably with debug levels
rename curlHelper to something making it obvious it's OpenAI
create files for other transcription services
create optional flag --provider {business1|business2|business3} or create config.yaml option
should probably determine a way to move
std::unordered_set<int> specialTalkgroupIDs = {52198, 52199, 52201};
into config.yaml
would need to redesign how we do tgID<->JSON file
CodeFactor found an issue: Complex Method
It's currently on:
fileProcessor.cpp:123-245
Commit a4f1ef8
add --local arg? https://github.com/ggerganov/whisper.cpp
fuzzy match on keys?
completely new glossary method?
redesign glossary JSON formatting expectation ?
{
  "GLOSSARY": [
    {
      "keys": ["10-1", "101"],
      "value": "SIGNAL WEAK"
    },
    {
      "keys": ["10-2", "102"],
      "value": "SIGNAL GOOD"
    },
    {
      "keys": ["10-3", "103"],
      "value": "STOP TRANSMITTING"
    },
    {
      "keys": ["10-4", "104"],
      "value": "AFFIRMATIVE"
    },
    // ... continue for each group of signal codes
    {
      "keys": ["C2"],
      "value": "DRUGS"
    }
  ]
}
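A rough sketch of how the grouped-keys layout above could be consumed, assuming the JSON has already been parsed into plain structs. `GlossaryEntry` and `buildGlossaryLookup` are hypothetical names for illustration, not existing code. Every alias key is flattened into one map so a lookup stays a single hash probe.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory form of one entry in the proposed "GLOSSARY" array.
struct GlossaryEntry {
    std::vector<std::string> keys;  // e.g. {"10-1", "101"}
    std::string value;              // e.g. "SIGNAL WEAK"
};

// Flatten the entries so every alias key maps directly to its value.
// If a key appears in more than one entry, the first occurrence wins.
std::unordered_map<std::string, std::string>
buildGlossaryLookup(const std::vector<GlossaryEntry>& entries) {
    std::unordered_map<std::string, std::string> lookup;
    for (const auto& entry : entries) {
        for (const auto& key : entry.keys) {
            lookup.emplace(key, entry.value);  // emplace keeps an existing mapping
        }
    }
    return lookup;
}
```

Fuzzy matching on keys would need a different structure; this only covers exact aliases.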
[mp3 @ 0x55b170ed8500] Failed to read frame size: Could not seek to 1026.
/home/foxtrot/SDRTrunk/recordings/20231023_003834North_Carolina_VIPER_Rutherford_T-Control__TO_41020_FROM_1610092.mp3: Invalid argument
fileProcessor.cpp Invalid argument: stof
i think this is probably due to the file not being completely written yet
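If the stof failure comes from converting a duration string that is empty or non-numeric for a half-written mp3 (an assumption, the failing call is not shown here), a guarded parse would let the file be skipped and retried on the next pass instead of raising. `parseDuration` is a hypothetical helper name.

```cpp
#include <cmath>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse a duration in seconds, returning std::nullopt
// instead of throwing when the text is empty, non-numeric, or out of range.
std::optional<float> parseDuration(const std::string& text) {
    try {
        float seconds = std::stof(text);
        if (!std::isfinite(seconds) || seconds < 0.0f) {
            return std::nullopt;  // reject NaN, infinity, and negative values
        }
        return seconds;
    } catch (const std::invalid_argument&) {
        return std::nullopt;  // no digits to convert, e.g. empty output
    } catch (const std::out_of_range&) {
        return std::nullopt;  // value does not fit in a float
    }
}
```

A caller could treat std::nullopt as "not ready yet" and leave the file in place for the next loop.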
The error message you're seeing in the build log indicates that the build process failed due to a missing header file:
fatal error C1083: Cannot open include file: 'unistd.h': No such file or directory
The unistd.h header is a POSIX header file and is not available on Windows by default, which is why MSBuild cannot find it. This file typically provides access to the POSIX operating system API, and its absence suggests that the code is using POSIX-specific functions that are not available or are named differently on Windows.
To resolve this issue, you have a few options:
Conditional Compilation: If the code that includes unistd.h is not needed on Windows, you can exclude it using preprocessor directives. For example:
#ifndef _WIN32
#include <unistd.h>
#endif
Find Windows Alternatives: If the code in fileProcessor.cpp uses functions from unistd.h (like read, write, close, etc.), you will need to find the equivalent Windows functions and use them when compiling on Windows.
Use a Compatibility Layer: There are libraries that provide a unistd.h replacement for Windows, offering a compatibility layer for Unix-like functions. You can include such a library in your project to bridge the gap.
Refactor the Code: If the POSIX-specific code is not essential, you could refactor the code to be cross-platform, avoiding the use of unistd.h altogether.
https://github.com/swiftraccoon/cpp-sdrtrunk-transcriber/actions/runs/6973035628/job/18976332083
Run .\vcpkg\vcpkg install libsndfile libmpg123 curl:x64-windows sqlite3:x64-windows yaml-cpp:x64-windows nlohmann-json:x64-windows
warning: In the September 2023 release, the default triplet for vcpkg libraries changed from x86-windows to the detected host triplet (x64-windows). For the old behavior, add --triplet x86-windows . To suppress this message, add --triplet x64-windows .
Computing installation plan...
error: while looking for libmpg123:x64-windows:
D:\a\cpp-sdrtrunk-transcriber\cpp-sdrtrunk-transcriber\vcpkg\ports\libmpg123: error: libmpg123 does not exist
Error: Process completed with exit code 1.
would it be more efficient to be tagging mp3s with the text transcription?
enable a new config field for max threads
enable new flag --parallel
then design around multiple threads of main.cpp/processDirectory
[21:58:10] L66 transcriptionProcessor.cpp Could not extract actual transcription from JSON-like string.
[17:23:44] L66 transcriptionProcessor.cpp Could not extract actual transcription from JSON-like string.
should investigate the cause of this and determine if there's better way of handling whatever error
some mp3s will be over the duration but still not contain any (detectable) speech
add flag --speech_check <delete|move> to enable mp3 deletion on (multiple (3 attempts?)) speech check failures for an mp3
if --speech_check move, we move it to a folder called "investigate" or something so the user can keep the mp3 and try to determine why it fails speech check
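A possible shape for the flag handling, as a sketch only. The `--speech_check` flag is a proposal at this point, and `SpeechCheckAction`, `parseSpeechCheckAction`, and `shouldActOnFailure` are hypothetical names. The default of 3 attempts mirrors the number floated above and is not a settled value.

```cpp
#include <optional>
#include <string>

// Hypothetical actions for the proposed --speech_check flag.
enum class SpeechCheckAction { Delete, Move };

// Map the flag's argument to an action; unknown values yield std::nullopt
// so the caller can report a usage error.
std::optional<SpeechCheckAction> parseSpeechCheckAction(const std::string& value) {
    if (value == "delete") return SpeechCheckAction::Delete;
    if (value == "move") return SpeechCheckAction::Move;
    return std::nullopt;
}

// True once an mp3 has failed the speech check at least maxAttempts times.
bool shouldActOnFailure(int failedAttempts, int maxAttempts = 3) {
    return failedAttempts >= maxAttempts;
}
```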
previously statically set talkgroupIDs<->talkgroupNames and stored in sqlite db
decide to leave as TODO
or implement translation
we already do a translation on the website
presently we accept a millisecond configuration value to loop main.cpp/processDirectory -- is it significantly better to do file-based events to trigger processDirectory?
even at a 100ms loop there is basically no CPU load though (~0%).
if we remove the milliseconds and just do while (true), we get to ~6% CPU usage
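For reference, the sleep-between-passes pattern described above looks roughly like this. `pollDirectory` is a hypothetical stand-in for the real loop, and the `iterations` bound exists only to make the sketch testable (the real loop presumably runs until shutdown).

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the main loop: runs `task` `iterations` times,
// sleeping `interval` after each run so the thread yields the CPU
// instead of spinning.
void pollDirectory(const std::function<void()>& task,
                   std::chrono::milliseconds interval,
                   int iterations) {
    for (int i = 0; i < iterations; ++i) {
        task();
        std::this_thread::sleep_for(interval);
    }
}
```

File-based events (inotify on Linux, for instance) would avoid the wake-ups entirely, but with a measured load of ~0% at 100ms the polling version may be good enough.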
redesign recordings db for efficiency
DatabaseManager.cpp Execution failed: database is locked
fileProcessor.cpp processFile Error: curlHelper.cpp CURL request failed: Couldn't resolve host name
[20:34:26] fileProcessor.cpp processFile Error: [20:34:26]curlHelper.cpp CURL request failed: Failed sending data to the peer
CodeFactor found an issue: Possible OS Command Injection (CWE-78)
It's currently on:
src\fileProcessor.cpp:114
Commit 9770ab0
modify how we're doing this to be less bothersome. let's just call 'em all a glossary and then simply require that the sample JSON format is followed. users attach a glossary (or multiple) by talkgroup IDs
transcriptionProcessor.cpp JSON Error: [json.exception.type_error.302] type must be number, but is object
no longer seeing [json.exception.parse_error.101]; instead am seeing 302. I suspect these may actually be errors in my callsign/tencode/signal files rather than in the transcription