divroll / datafactory

DataFactory is a library for the JetBrains Xodus database that provides a compute abstraction layer through Actions and Conditions with a fluent API. It allows managed access to the underlying database through a consistent Java API that works both in an embedded context and in a remote context through Java RMI.

License: Apache License 2.0

Java 100.00%

datafactory's People

Contributors

kerbymart

datafactory's Issues

Ensure EntityStore is not being closed prematurely

The error message "Attempt to read closed log" suggests that there is an attempt to read from a log file that has already been closed. This can happen if the database connection was closed prematurely or if there is a concurrency issue where multiple threads are trying to access the database simultaneously.

In the context of JetBrains Xodus, this error can occur when performing operations on the EntityStore after it has been closed. Ensure that the EntityStore is not being closed prematurely. Also, make sure that all database operations are properly synchronized if they are being accessed from multiple threads. Identify where and why the EntityStore is being closed.

[main] INFO com.divroll.datafactory.conditions.PropertyNearbyConditionClientPerfTest - Time to complete query (ms): 141
[main] INFO jetbrains.exodus.env.EnvironmentImpl - Environment[/tmp/test/b38100ad-62d7-4e24-8a84-c0d47bf0ddf9] is active: 1 transaction(s) not finished
[EntityStoreSharedAsyncProcessor0] ERROR jetbrains.exodus.entitystore.EntityStoreSharedAsyncProcessor - Attempt to read closed log
java.lang.IllegalStateException: Attempt to read closed log
	at jetbrains.exodus.log.DataCorruptionException.checkLogIsClosing(DataCorruptionException.java:38)
	at jetbrains.exodus.log.BlockNotFoundException.raise(BlockNotFoundException.java:27)
	at jetbrains.exodus.log.BlockNotFoundException.raise(BlockNotFoundException.java:32)
	at jetbrains.exodus.log.Log.readBytes(Log.kt:782)
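
One way to rule out application-level races between database operations and shutdown is to route both through a shared guard. The following is a minimal sketch, not DataFactory or Xodus code: CloseGuard and its methods are hypothetical names, and it only protects calls that go through it. It does not control Xodus's own background threads, such as the EntityStoreSharedAsyncProcessor shown in the log above.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper, not part of DataFactory or Xodus.
final class CloseGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closed; // guarded by lock

    /** Runs an operation under the read lock so close() cannot interleave with it. */
    <T> T run(Supplier<T> operation) {
        lock.readLock().lock();
        try {
            if (closed) {
                throw new IllegalStateException("store already closed");
            }
            return operation.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Waits for in-flight operations, then runs the close action at most once. */
    void close(Runnable closeAction) {
        lock.writeLock().lock();
        try {
            if (!closed) {
                closed = true;
                closeAction.run();
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Many readers can run concurrently, while close() blocks until they finish. An operation attempted after close() fails fast with a clear message instead of reaching the closed log.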

Clean up existing unit test

Clean up the existing JUnit test so that it is more readable and can serve as a reference for future refactorings.

Add Unit Tests for BlobRemoveAction Class

The actions package classes, specifically the BlobRemoveAction class, currently lack unit tests.

Here's a brief outline of what the tests for BlobRemoveAction could cover:

  1. Instantiation: Test that the BlobRemoveAction class can be instantiated correctly, and that all initial values are set as expected.

  2. Method functionality: For each method in the BlobRemoveAction class, write tests that confirm the method works as expected under normal conditions.

  3. Edge cases: Consider edge cases and write tests for them. For example, what happens if we pass null or unexpected values to the methods?

  4. Error handling: Test that the class handles errors gracefully and as expected.

Simplify naming of core classes

Simplify the naming of the core database classes:

  • DataFactoryBlob to be renamed to Blob or EntityBlob
  • DataFactoryEntities to be renamed to Entities
  • DataFactoryEntity to be renamed to Entity
  • DataFactoryEntityType to be renamed to EntityType
  • DataFactoryEntityTypes to be renamed to EntityTypes
  • DataFactoryProperty to be renamed to Property or EntityProperty

Serialization Issue with RMI Tests Due to `DatabaseManagerImpl` Reference

Issue Description:

When running RMI-based tests in isolation, everything functions as expected. However, when these tests are executed alongside other RMI-utilizing tests, a serialization issue arises with DatabaseManagerImpl. Specifically, we encounter a java.rmi.UnmarshalException related to an attempt to serialize DatabaseManagerImpl, which is not intended to be serializable. This problem manifests in tests that should not require DatabaseManagerImpl to be serialized.

Error Message:

java.rmi.UnmarshalException: error unmarshalling return; nested exception is: 
	java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: ....datafactory.database.impl.DatabaseManagerImpl

Potential Cause:

The issue likely stems from EntityStoreImpl or similar classes holding a direct reference to DatabaseManagerImpl. When RMI attempts to return objects from remote calls, it inadvertently tries to serialize the DatabaseManagerImpl due to its presence in the object graph.

Suggested Solution:

Refactor the implementation to avoid direct references to DatabaseManagerImpl within serializable classes. Instead, consider using a provider pattern or a factory method to dynamically access DatabaseManagerImpl without directly including it in the serialization context.

Example Refactoring Approach:

  • Introduce a DatabaseManagerProvider interface with a method to retrieve the DatabaseManagerImpl instance.
  • Modify EntityStoreImpl (and any other relevant classes) to use DatabaseManagerProvider for accessing DatabaseManagerImpl, instead of holding a direct reference.

Code Snippet Example:

public interface DatabaseManagerProvider {
  DatabaseManager getDatabaseManager();
}

public class EntityStoreImpl ... {
  private DatabaseManagerProvider databaseManagerProvider;
  
  // Use databaseManagerProvider to access DatabaseManagerImpl
}

This approach aims to decouple the database management concerns from serializable objects involved in RMI communication, thus avoiding serialization issues.
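
The failure mode and one way around it can be reproduced without RMI. The sketch below uses hypothetical stand-in classes (FakeManager, DirectHolder, TransientHolder, SerializationCheck), not the real DataFactory types. It shows that a serializable object holding a direct reference to a non-serializable manager cannot be written, while the same reference in a transient field is skipped. Note that a provider field is subject to the same rule: it must itself be serializable or transient, otherwise the problem only moves one level down.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Stand-in for a non-serializable manager such as DatabaseManagerImpl (hypothetical).
final class FakeManager { }

// Holds the manager directly: Java serialization walks the field and fails.
final class DirectHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    final FakeManager manager = new FakeManager();
}

// Holds the manager in a transient field: the field is skipped when writing.
// After deserialization the field is null and must be resolved again,
// for example through a provider or factory.
final class TransientHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    final transient FakeManager manager = new FakeManager();
}

final class SerializationCheck {
    /** Returns true if the object graph can be written with Java serialization. */
    static boolean isSerializable(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A check like isSerializable could also be used in a unit test to find which returned object drags the manager into the object graph.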

Exception when instantiating an empty entity.

When an entity is instantiated with no parameters, for example:

DataFactoryEntity emptyEntity = new DataFactoryEntityBuilder().build();

A missing required attribute such as the environment should be caught at compile time rather than surfacing as a NullPointerException at runtime:

java.lang.NullPointerException: environment

	at java.util.Objects.requireNonNull(Objects.java:228)
	at com.divroll.datafactory.builders.DataFactoryEntityBuilder$ImmutableDataFactoryEntity$InitShim.environment(DataFactoryEntityBuilder.java:728)
	at com.divroll.datafactory.builders.DataFactoryEntityBuilder$ImmutableDataFactoryEntity.<init>(DataFactoryEntityBuilder.java:701)
	at com.divroll.datafactory.builders.DataFactoryEntityBuilder$ImmutableDataFactoryEntity.<init>(DataFactoryEntityBuilder.java:656)
	at com.divroll.datafactory.builders.DataFactoryEntityBuilder.build(DataFactoryEntityBuilder.java:612)
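
One common way to move this check to compile time is a staged (step) builder, where build() only becomes reachable after the required attribute has been supplied. The sketch below is an illustration under assumptions: SketchEntity and its two fields are hypothetical stand-ins, not the real DataFactoryEntity.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for DataFactoryEntity: only two fields are modelled.
final class SketchEntity {
    final String environment;
    final String entityType;

    private SketchEntity(String environment, String entityType) {
        this.environment = environment;
        this.entityType = entityType;
    }

    /** Only entry point: the caller must supply the environment first. */
    static EnvironmentStage builder() {
        return environment ->
                new FinalStage(Objects.requireNonNull(environment, "environment"));
    }

    /** First stage: exposes nothing but environment(...), so build() is unreachable here. */
    interface EnvironmentStage {
        FinalStage environment(String environment);
    }

    /** Last stage: optional attributes and build(). */
    static final class FinalStage {
        private final String environment;
        private String entityType; // optional, may stay null

        private FinalStage(String environment) {
            this.environment = environment;
        }

        FinalStage entityType(String entityType) {
            this.entityType = entityType;
            return this;
        }

        SketchEntity build() {
            return new SketchEntity(environment, entityType);
        }
    }
}
```

With this shape, SketchEntity.builder().build() does not compile, because EnvironmentStage has no build() method. Passing null explicitly is still only caught at runtime. The class names in the stack trace suggest the builder is generated by an annotation processor; if so, that tool may offer a staged-builder option that achieves the same effect without hand-written stages.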

Remove or improve the call to forcibly delete the database lock

In the DatabaseManager implementation, retrieving the environment calls deleteLockingProcessAndGetEnvironment to forcibly "get" the environment, which is not the proper way to acquire the database.

A possible solution is to remove the call altogether, or to restrict it to tested edge cases where a remnant lock must be removed because the process that held it could not clean up properly.

Update implementation of Entity property methods

It is recommended to rename the saveProperty method in the EntityStore to updateProperty. This will avoid any ambiguity as the function of this method is to manipulate the properties of existing entities. Furthermore, the removeProperty method can be removed since other property actions, such as rename and delete, can handle the same functionality.

Update the method `removeEntityType`

Replace the EntityQuery parameter of removeEntityType with EntityTypeQuery, or create an overload of removeEntities with the following signature:

Boolean removeEntities(@NotNull EntityTypeQuery query) throws DataFactoryException;

This will make the method less ambiguous.

BindingUtils and other database classes should not print the stacktrace

In the BindingUtils class, e.printStackTrace() is called in the deserialize method.

This is not a good practice because it prints the stack trace to standard error, which is usually the console. This can leak sensitive information such as file paths, server IPs, and other system details.

It is also inflexible in terms of output format and destination. Instead, use a logger to log the exception. A logger gives control over the logging level as well as the output format and destination.

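// Assumes a java.util.logging.Logger named LOGGER is declared in the enclosing class.
public static <T> T deserialize(@NotNull final byte[] data, final Class<T> clazz) {
    // try-with-resources closes both streams even when reading fails
    try (ByteArrayInputStream in = new ByteArrayInputStream(data);
         ObjectInputStream is = new ObjectInputStream(in)) {
        Object readObject = is.readObject();
        // clazz.cast avoids the unchecked (T) cast
        return clazz.isInstance(readObject) ? clazz.cast(readObject) : null;
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
        // Log through the logger instead of calling e.printStackTrace()
        LOGGER.log(Level.SEVERE, "Deserialization failed", e);
    }
    return null;
}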

EmbeddedEntityBinding exception handling and refactoring

  • Renaming: Rename unclear variables to improve readability. For instance, rename the variable is to objectInputStream.

  • Extract Function: The serialization and deserialization operations, which are currently static methods, could be made non-static.

  • Exception Handling: Instead of printing stack traces, we can throw a custom unchecked exception.

  • Using Optional: The serialize and deserialize methods return null when an exception occurs, which can lead to NullPointerExceptions in callers. Instead, these methods can return an Optional.

public class SerializationException extends RuntimeException {
    public SerializationException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Class declaration is unchanged
public class EmbeddedEntityBinding extends ComparableBinding implements Serializable {

    public static final EmbeddedEntityBinding BINDING = new EmbeddedEntityBinding();

    public Optional<byte[]> serialize(Object obj) {
        try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
             ObjectOutputStream objectOutputStream = new ObjectOutputStream(byteArrayOutputStream)) {
            objectOutputStream.writeObject(obj);
            // Flush so buffered bytes reach the byte array before it is copied
            objectOutputStream.flush();
            return Optional.of(byteArrayOutputStream.toByteArray());
        } catch (IOException e) {
            // Keep the original exception as the cause so its stack trace is not lost
            throw new SerializationException("Failed to serialize object", e);
        }
    }

    public <T> Optional<T> deserialize(byte[] data, Class<T> clazz) {
        // try-with-resources closes both streams even when reading fails
        try (ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(data);
             ObjectInputStream objectInputStream = new ObjectInputStream(byteArrayInputStream)) {
            Object readObject = objectInputStream.readObject();
            return Optional.ofNullable(clazz.isInstance(readObject) ? clazz.cast(readObject) : null);
        } catch (IOException e) {
            throw new SerializationException("Failed to deserialize object", e);
        } catch (ClassNotFoundException e) {
            throw new SerializationException("Failed to find class during deserialization", e);
        }
    }

    @Override
    public Comparable readObject(@NotNull ByteArrayInputStream stream) {
        return Try.of(() -> {
            byte[] serialized = ByteStreams.toByteArray(stream);
            return deserialize(serialized, Comparable.class).orElseThrow();
        }).getOrNull();
    }
  
    @Override
    public void writeObject(@NotNull LightOutputStream output, @NotNull Comparable object) {
        byte[] serialized = serialize(object).orElseThrow();
        output.write(serialized);
    }
}

Simplify access to the EntityStore

Accessing the EntityStore should be made simpler by eliminating the redundant getInstance method call:

EntityStore entityStore 
    = isClientMode ? DataFactoryClient.getInstance().getEntityStore() 
          : DataFactory.getInstance().getEntityStore();

Access should be straightforward and require little effort to understand:

EntityStore entityStore 
    = isClientMode ? DataFactoryClient.getEntityStore() 
          : DataFactory.getEntityStore();
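
A static shortcut of this kind can be layered over an existing singleton. The sketch below uses hypothetical stand-ins (SketchStore, SketchFactory), not the real DataFactory classes, and assumes the singleton is created lazily. One constraint to keep in mind: Java does not allow a static and an instance method with the same signature in one class, so the existing instance method would have to be renamed or removed when the static getEntityStore() is introduced.

```java
// Hypothetical stand-ins; the real EntityStore and DataFactory types are not reproduced here.
interface SketchStore {
    String mode();
}

final class SketchFactory {
    // Initialization-on-demand holder: the instance is created lazily and thread-safely.
    private static final class Holder {
        static final SketchFactory INSTANCE = new SketchFactory();
    }

    private final SketchStore store = () -> "embedded";

    private SketchFactory() { }

    /** Existing style: callers go through getInstance() first. */
    static SketchFactory getInstance() {
        return Holder.INSTANCE;
    }

    /** Instance accessor, renamed here so it does not clash with the static shortcut. */
    SketchStore entityStore() {
        return store;
    }

    /** Proposed style: a static shortcut that hides the singleton lookup. */
    static SketchStore getEntityStore() {
        return getInstance().entityStore();
    }
}
```

Both access paths return the same store instance, so existing callers can migrate gradually.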
