shindoumihou / rosedb

A super simple, fast NoSQL database made in Java.

License: Apache License 2.0

Java 100.00%

nosql java database java-database java-database-application rosedb nosql-database nosql-databases nosql-server nosql-data-storage

rosedb's Introduction

❤️ What is RoseDB?

RoseDB is a simple NoSQL database written completely in Java, containing the most basic functions that are needed for a database. This project started as a random idea of mine (a shower thought) but has evolved into a learning experience for me.

⚙️ How does it work

RoseDB works with both in-memory and file data storage. Every request it receives is placed on a queue and applied to its cache, and the queued changes are saved to the specified directory at an interval of 5 seconds. It utilizes websockets to receive and send data to clients and should be more than capable of processing plenty of requests per second.
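
To make the queue-and-cache idea above more concrete, here is a minimal sketch of a store that answers reads from memory and flushes queued writes every 5 seconds. The class name `WriteBehindStore`, its methods and the one-file-per-item layout are assumptions made for illustration, not RoseDB's actual code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: an in-memory cache with a queue of pending writes.
final class WriteBehindStore {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final Path directory;

    WriteBehindStore(Path directory) {
        this.directory = directory;
    }

    // Reads are served from memory.
    String get(String identifier) {
        return cache.get(identifier);
    }

    // Writes update the cache right away and are queued for the next flush.
    void put(String identifier, String json) {
        cache.put(identifier, json);
        pending.add(identifier);
    }

    // Drains the queue and writes the current cached value of each queued
    // item to "<identifier>.json" (assumed layout). Returns the write count.
    int flush() {
        try {
            Files.createDirectories(directory);
            int written = 0;
            String identifier;
            while ((identifier = pending.poll()) != null) {
                String json = cache.get(identifier);
                if (json != null) {
                    Path file = directory.resolve(identifier + ".json");
                    Files.write(file, json.getBytes(StandardCharsets.UTF_8));
                    written++;
                }
            }
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs flush() every 5 seconds, the interval mentioned above.
    ScheduledExecutorService startFlushing() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                flush();
            } catch (RuntimeException e) {
                // Keep the schedule alive; an uncaught exception would cancel it.
                e.printStackTrace();
            }
        }, 5, 5, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Calling `startFlushing()` once at startup and `shutdown()` on the returned scheduler at exit would be one way to wire this in; a final `flush()` before exiting avoids losing the last few seconds of writes.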

✨ Goal

My primary goal for this project is to create a simple but decent database that can get you up and running in mere seconds with little to no configuration at all.

Are you not convinced? Have a look at our no-configuration setup.

Download the jar from Releases.
Run the jar from a console: java -jar RoseDB.jar.
Type in your Authorization code (be sure to remember it) then restart the application.
Install one of our drivers (for example, the official Java driver) and follow the instructions to use the driver.

🛡️ Security

RoseDB supports SSL for its websocket connections and enforces an Authorization header, which is compared against a hash (the code is written to the disk only as a hash value for security). There will be more security features, and if you have more to suggest, feel free to send a Pull Request or an Issue explaining everything. We want to keep bringing more security features to the application, but since we are still in our very early stages, we are trying to get everything up and running first before focusing on security.

That said, in my opinion, RoseDB is more suited to simple applications, like tiny Discord bots that are shared among friends, than to large applications that require super complicated features. After all, the main aim of RoseDB is to be as simple as possible, and that includes replication and load balancing (planned for the future).

🖥️ Requirements

JDK 11 (Preferably, OpenJDK 11).
A computer with storage, memory and a terminal.
A keyboard that you can type on.
Internet Connection (to download the JAR file, naturally).

🖱️ Installation

Installation of RoseDB is simple: all you need is JDK 11 (preferably OpenJDK 11) and very basic knowledge of JSON. Here are the steps for installing RoseDB.

Download RoseDB.jar from the Releases on GitHub.
Place RoseDB.jar in its own dedicated, empty folder.
Open Terminal or Powershell then execute the following line: java -jar RoseDB.jar
OPTIONAL: Stop the application with CTRL + C (or exit), then head to the folder, where you will find a config.json.
OPTIONAL: Configure the JSON config as you like.
OPTIONAL: Run the jar file again with the same line: java -jar RoseDB.jar

📝 Configuration

Configuration of RoseDB is straightforward. The configuration file (config.json) is in JSON format and supports the following fields.

FIELD	TYPE	DESCRIPTION	DEFAULT VALUE
Cores	integer	The number of cores the application should use.	1
maxTextMessageBufferSizeMB	integer	The maximum message buffer size for each message (request) received (MB)	5
maxTextMessageSizeMB	integer	The maximum text (message/request) size to receive (MB)	5
port	integer	The port that RoseDB should use.	5995
versioning	boolean	Whether RoseDB should save a backup version for all items that are modified. (Recommended)	true
preload	boolean	Whether to preload all items that are saved on the database. (Recommended)	true
updateChecker	boolean	Whether to check for RoseDB updates from the maintainer's server. (Recommended)	true
directory	string	The exact directory folder where RoseDB will save all the data.	running location of the jar
heartbeatIntervalSeconds	integer	The interval, in seconds, at which the server sends a heartbeat packet to all connections.	30
loggingLevel	string	The minimum level that RoseDB should log (INFO is recommended for performance). Options: INFO, WARNING, DEBUG, ERROR.	INFO

💌 Wrappers

If you want to quickly get up and running with your application then feel free to use our wrappers.

🌠 How simple is RoseDB?

RoseDB is very simple and easy to use. After following Installation, you can quickly get up and running by sending requests to the server in Queria format or JSON format.

An example of a Queria GET request is:

database.collection.get(item)

An example of a Queria ADD request is:

database.collection.add(item, {"someKey":"someValue"})
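
If you are building these request strings from Java, a tiny helper along these lines would do. It is only a sketch: `QueriaRequests` and its method names are made up for illustration and are not part of any official driver. The resulting string is what you would then send over the websocket connection.

```java
// Hypothetical helper: builds Queria request strings in the shape shown above.
final class QueriaRequests {
    private QueriaRequests() {
    }

    // Produces: database.collection.get(identifier)
    static String get(String database, String collection, String identifier) {
        return database + "." + collection + ".get(" + identifier + ")";
    }

    // Produces: database.collection.add(identifier, {...json...})
    static String add(String database, String collection, String identifier, String json) {
        return database + "." + collection + ".add(" + identifier + ", " + json + ")";
    }
}
```

For example, `QueriaRequests.get("database", "collection", "item")` gives back exactly the GET request shown earlier.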

Interested? Learn more at our GitHub Wiki.

❤️‍🔥 Reporting a Vulnerability

To report a vulnerability, simply file an issue at Issue Template.

🚀 Add a suggestion

To suggest a new feature of some sort, feel free to send a suggestion issue at Suggestion Template.

🌟 Maintainers

Shindou Mihou, creator and developer.

💫 Credits

Bucket4j for Rate-limiter.
TooTallNate for Websocket (replacing io.javalin).
Resilience4j for Retry and Timeouts.
Apache Commons for FilenameUtils, Hashing, FileUtils, Hash Validation.
org.json for JSON Decoding and Encoding.
GSON for JSON Serialization and Deserialization (working in tandem with org.json)


rosedb's Issues

[Beautify config.json]

org.json offers a method to print a beautified JSON string, which I realized existed a week or three back but forgot to implement in v1.1.0. If anyone wants to handle it, all you need to do is make the current config read method read all lines:

String.join("\n", Files.readAllLines(...))

And change the toString() when creating the JSONObject for the config to toString(4).

Replication/Load-balancing

WARNING

This is a concept from someone who started taking programming seriously a year or two ago. I have been writing code since my early days, but that was mainly for fun. I want to learn, which is why I am leaving this concept here for others to improve on and also for myself to learn from. Thank you.

Idea

Here is my general idea for the future, which is to support load balancing with the database. As I am still very inexperienced with Java, feel free to give suggestions and opinions that could improve this (even better with a code demonstration, but that is purely optional).

Specification

The idea I have come up with is a simple concept on paper, which involves nodes that would act as load balancers for the main process.
The nodes (or children) would act as additional gateways/endpoints for requests to come in and be answered. They send all the requests to the main server, which then decides which node will save the data and, in the meantime, broadcasts the data to all nodes, which update their cache immediately.
This way, the data is balanced between several node servers, which, from my shower thoughts, should make storage space a minor problem, as all the data is stored on different nodes while all of the processes already have the data cached.
Now, the concept may sound a bit weird, and there are several issues with it. For example, how would all nodes and the main process fill their cache if all the data is scattered everywhere? This is where another option comes in: main storage, a concept of mine where the main storage, during shutdown, temporarily saves every single piece of data it has.
In general, though, all the nodes should report the data they have on their storage to the main process immediately on boot-up, and the main process would immediately distribute it to all the other nodes to store in their cache. As for collisions, the hash of (identifier + collection + database name) will be assigned to a node, and this node assignment will be stored on all nodes, where it serves another purpose as well.

What if the main process disconnects?

This is where the hash storage comes in. When the main process disconnects without notice, all the nodes will immediately decide on a temporary master node, which will take the place of the original master node until it comes back online. This temporary master node will ask for the hash storages of all the nodes and check whether there are any conflicts. If there are, it will pick the version that is held by the highest number of nodes and distribute it.
After that, the temporary master will proceed with the tasks of the original master, since all nodes have the same cache inside of them (this will be checked as well by combining the lengths of all the data values and identifiers).
Once the master process comes back online, the temporary master will immediately do a handover, which is a process that brings the original master up to date by sending it all of the temporary master's data.
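
The "pick the version held by the most nodes" step could look roughly like the sketch below. `ConflictResolver` is a hypothetical name, and the tie-breaking behaviour is my own assumption since the concept does not define one: the version that first reaches the highest count wins, and that depends on the map's iteration order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the majority pick done by the temporary master.
final class ConflictResolver {
    private ConflictResolver() {
    }

    // versionByNode maps a node id to the hash of the version that node
    // holds for one item. Returns the most common version hash, or null
    // when no node reported anything.
    static String majorityVersion(Map<String, String> versionByNode) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String version : versionByNode.values()) {
            int count = counts.merge(version, 1, Integer::sum);
            if (count > bestCount) {
                best = version;
                bestCount = count;
            }
        }
        return best;
    }
}
```

With three nodes where two report the same version hash, that shared hash is the one returned and would then be distributed to the remaining node.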

Protocol

There is also the question of which protocol to use. HTTP will not work, since we need to know when a node or the main server suddenly disconnects. That leaves us with one choice, the same protocol as before: websocket, but on its own endpoint, ws://127.0.0.1:5563/node, which has its own separate functions. All the nodes will also have to identify themselves when connecting with the Authorization code (Authorization: Node [TOKEN]), and the server will check its own configuration to see if the node is actually registered on the list (to prevent hijacking).
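
As a rough illustration of the identification step, the server side could parse the header value and look the token up in its list of registered nodes. Everything named here (`NodeAuthorization`, `tokenFrom`, `isRegistered`, keeping the registered tokens in a plain `Set`) is an assumption for the sake of the example.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: validates an "Authorization: Node [TOKEN]" header value.
final class NodeAuthorization {
    private static final String PREFIX = "Node ";

    private NodeAuthorization() {
    }

    // Extracts TOKEN from "Node TOKEN"; empty when the header is missing,
    // uses another scheme, or carries no token.
    static Optional<String> tokenFrom(String header) {
        if (header == null || !header.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String token = header.substring(PREFIX.length()).trim();
        return token.isEmpty() ? Optional.empty() : Optional.of(token);
    }

    // True only when the header carries a token found in the registered list.
    static boolean isRegistered(String header, Set<String> registeredTokens) {
        return tokenFrom(header).map(registeredTokens::contains).orElse(false);
    }
}
```

A connection whose header fails `isRegistered` would simply be refused before any node functions are exposed.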

Data Assignment to Node

As written earlier, each item will have a unique hash of its (item + collection + database) names, which will be recorded in a hash storage together with the id of the node that is assigned to handle and save the data.
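
Here is a minimal sketch of such a hash storage. The concept does not say which hash function is used or how a node gets picked, so SHA-256, the "/" separator between the names and the modulo-based pick are all assumptions, as is the class name `HashStorage`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: maps the hash of (item + collection + database)
// to the id of the node that is responsible for saving the item.
final class HashStorage {
    private final List<String> nodeIds;
    private final Map<String, String> nodeIdByHash = new HashMap<>();

    HashStorage(List<String> nodeIds) {
        if (nodeIds.isEmpty()) {
            throw new IllegalArgumentException("At least one node is required");
        }
        this.nodeIds = new ArrayList<>(nodeIds);
    }

    // SHA-256 over the three names, returned as 64 hex characters.
    static String hashOf(String identifier, String collection, String database) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            String key = identifier + "/" + collection + "/" + database;
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(key.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Returns the node assigned to the item, assigning one on first use.
    String assign(String identifier, String collection, String database) {
        String hash = hashOf(identifier, collection, database);
        return nodeIdByHash.computeIfAbsent(hash, this::pickNode);
    }

    // Assumed policy: first 8 hex digits of the hash modulo the node count.
    private String pickNode(String hash) {
        long prefix = Long.parseLong(hash.substring(0, 8), 16);
        return nodeIds.get((int) (prefix % nodeIds.size()));
    }
}
```

Because the assignment is remembered in the map, asking again for the same item always gives the same node, which is what the other nodes would need when they compare hash storages.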

More details to be added and this concept will slowly be improved over time, please note that THIS IS STILL A CONCEPT IDEA AND HAS PLENTY OF FLAWS.

[Security Improvements #1]

Proposal

Currently, the Authorization key is stored in config.json as plain text, which is pretty bad from a security point of view, which is why I am thinking of changing this to a more secure version.

Upon first startup (new) of the application, it will first ask for the Authorization token to use for the user which will be changeable from the console.
After the master enters the token, the application will immediately hash it before saving the hash onto a secret file location where it will be used to validate the tokens of all requests.
Token validation will basically be hash comparison with the request token from their Authorization header.
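
The hash-then-compare steps above could be sketched like this. The proposal does not name a hash algorithm, so SHA-256 from the JDK is an assumption here, and `TokenValidator` is a made-up class name; where the hash is stored on disk is left out on purpose.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: keeps only the hash of the Authorization token and
// validates request tokens by hashing them and comparing the hashes.
final class TokenValidator {
    private final byte[] storedHash;

    TokenValidator(byte[] storedHash) {
        this.storedHash = storedHash.clone();
    }

    // Hashes a token with SHA-256 (assumed algorithm).
    static byte[] hash(String token) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return digest.digest(token.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Compares the hash of the request token with the stored hash.
    boolean isValid(String requestToken) {
        return requestToken != null
                && MessageDigest.isEqual(storedHash, hash(requestToken));
    }
}
```

On first startup the application would call `TokenValidator.hash(...)` on the entered token and save only the result; every request's Authorization header would then go through `isValid`.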

Second layer of protection:

After the client has successfully authorized with the server and connected, the server will plant a secret token in the client's cookies, and every request will be checked for whether the server knows the value in the cookie.
If the cookie is invalid, the server will immediately disconnect the client with a 4007 error, which means: "Client did not pass Security Requirements."

[Patch v1.0.5]

TODO

Feel free to place your suggestions on the comments.

Data versioning (for every change, the database will rename the previous file and create a new one with the newer data); this can be rolled back and disabled. DELAYED for LATER VERSIONS
Automatic update check (it won't download anything, it simply checks whether there is an update and notifies) at an interval set in the config.
Aggregation methods, for example, aggregating an entire collection or database.
Insertion of multiple items with one request. DELAYED for LATER VERSIONS
Focus on thread-safe reads and writes, since the current implementation does nothing about it.
Pre-loading of all databases and collections.
Configurable maximum message buffer & size.
Create a thread before shutdown to finish writing everything.
Customizable heartbeat interval.
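
To give an idea of what the data versioning item in this list means in practice, here is a minimal sketch of "rename the previous file, then create a new one" with a rollback. The `.bak` suffix and the `VersionedWriter` name are assumptions; the real naming scheme is not decided in this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of versioned writes with a single backup per file.
final class VersionedWriter {
    private VersionedWriter() {
    }

    // Assumed naming: "item.json" is backed up as "item.json.bak".
    static Path backupOf(Path file) {
        return file.resolveSibling(file.getFileName() + ".bak");
    }

    // Renames the existing file to its backup name, then writes the new data.
    static void write(Path file, String json) {
        try {
            if (Files.exists(file)) {
                Files.move(file, backupOf(file), StandardCopyOption.REPLACE_EXISTING);
            }
            Files.write(file, json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Restores the backup over the current file; false when there is no backup.
    static boolean rollback(Path file) {
        Path backup = backupOf(file);
        if (!Files.exists(backup)) {
            return false;
        }
        try {
            Files.move(backup, file, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Disabling versioning would then just mean skipping the rename and writing the file directly.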

[Next Patch]

TODO

Feel free to place your suggestions on the comments.

Data versioning (for every change, the database will rename the previous file and create a new one with the newer data); this can be rolled back and disabled. DELAYED for LATER VERSIONS
Insertion of multiple items with one request. DELAYED for LATER VERSIONS
Ability to perform requests through the terminal.
Compress all files with GZIP from now on.
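
For the GZIP item in this list, the JDK already ships everything that is needed in java.util.zip. The sketch below compresses and decompresses in memory to keep it short; `GzipCodec` is a hypothetical name, and writing the resulting bytes to the data files is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: GZIP round trip for the JSON text of an item.
final class GzipCodec {
    private GzipCodec() {
    }

    // Compresses UTF-8 text into GZIP bytes.
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // Decompresses GZIP bytes back into UTF-8 text.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Files written before this change would not be compressed, so a reader would need some way to tell old and new files apart; checking for the GZIP magic bytes (0x1f 0x8b) at the start of the file is one option.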

[New Query Format]

Queria

Starting from the next patch, RoseDB will add support for its new query format, which will shorten the length of each request while making it more readable to the human eye.

This change will become mandatory in v1.3.0 though all versions < v1.3.0 will have backwards compatibility with the original JSON format.

The new query format looks similar to MongoDB with an example method of a GET request being: database.collection.get(identifier)

An example for ADD request:
database.collection.add(identifier, {"someKey":"someValue"})

Example for aggregation request:
database.collection.aggregate()

database.aggregate()

Please note that all of these do not have a semicolon at the end.
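
To show how a request in this format can be taken apart, here is a small parsing sketch based only on the examples above. The regular expression, the `QueriaParser` name and the rule of splitting the arguments at the first comma are my own assumptions, not the parser RoseDB ships with.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: splits a Queria request into its parts.
final class QueriaParser {
    // database[.collection].method(arguments), with no trailing semicolon.
    private static final Pattern REQUEST = Pattern.compile(
            "^(\\w+)(?:\\.(\\w+))?\\.(\\w+)\\((.*)\\)$", Pattern.DOTALL);

    private QueriaParser() {
    }

    static final class Request {
        final String database;
        final String collection; // null for database-level requests
        final String method;
        final String identifier; // null when there are no arguments
        final String payload;    // null when no JSON follows the identifier

        Request(String database, String collection, String method,
                String identifier, String payload) {
            this.database = database;
            this.collection = collection;
            this.method = method;
            this.identifier = identifier;
            this.payload = payload;
        }
    }

    static Request parse(String raw) {
        Matcher matcher = REQUEST.matcher(raw.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a Queria request: " + raw);
        }
        String arguments = matcher.group(4).trim();
        String identifier = null;
        String payload = null;
        if (!arguments.isEmpty()) {
            // Everything before the first comma is the identifier.
            int comma = arguments.indexOf(',');
            if (comma < 0) {
                identifier = arguments;
            } else {
                identifier = arguments.substring(0, comma).trim();
                payload = arguments.substring(comma + 1).trim();
            }
        }
        return new Request(matcher.group(1), matcher.group(2), matcher.group(3),
                identifier, payload);
    }
}
```

A request that ends with a semicolon does not match the pattern and is rejected with an `IllegalArgumentException`, in line with the note above.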

Potentially Chaotic Retries.

While developing a wrapper for a library, I realized that the way I was using Resilience4j's Retry library is pretty broken and shouldn't actually be done in such a way. A patch will therefore come when I have free time, which will revert the retry changes back to the original thread-blocking system (since RoseDB runs multiple threads for its requests, it wouldn't hurt to block some of them when doing tasks that are usually done at startup... unless you have preload turned off....).
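
For readers wondering what a thread-blocking retry looks like without Resilience4j, here is a plain-Java sketch. It is not the original RoseDB code, which this issue does not show; `BlockingRetry` and its parameters are made up for illustration.

```java
import java.util.function.Supplier;

// Hypothetical sketch: retries a task on the calling thread, sleeping
// between attempts, and rethrows the last failure when all attempts fail.
final class BlockingRetry {
    private BlockingRetry() {
    }

    static <T> T retry(int maxAttempts, long waitMillis, Supplier<T> task) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Blocks the current thread until the next attempt.
                        Thread.sleep(waitMillis);
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting to retry", interrupted);
                    }
                }
            }
        }
        throw last;
    }
}
```

Since the calling thread sleeps between attempts, this only makes sense on worker threads that can afford to wait, which is the trade-off described above.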

Fix 'All overloaded methods should be placed next to each other. Placing non-overloaded methods in between overloaded methods with the same type is a violation. Previous overloaded method located at line '42'.' issue in src\main\java\pw\mihou\rosedb\io\FileHandler.java

CodeFactor found an issue: All overloaded methods should be placed next to each other. Placing non-overloaded methods in between overloaded methods with the same type is a violation. Previous overloaded method located at line '42'.

It's currently on:
src\main\java\pw\mihou\rosedb\io\FileHandler.java:71
Commit 873cdf4

It's a reasonable issue for readability, so i am placing this
