doanduyhai / achilles
An advanced Java Object Mapper/Query DSL generator for Cassandra
Home Page: http://achilles.archinnov.info
License: Apache License 2.0
It should be possible to create entities with just one @Id/@EmbeddedId column and any @Column/@JoinColumn
Add request tracing sampling & per column family request tracing client side
For the purpose of normalization, the collections & maps persistence in Thrift should be similar to the one implemented in CQL3.
It's not mandatory to use the special Serialized introduced by CQL3 though
Cassandra doesn't distinguish between an empty collection (set, list, map) and null: it always stores an empty collection as null.
On the other hand, real empty collections make sense in terms of domain model consistency: such collections could be @NotNull, which means less code (fewer null checks), javax validation could take place, etc.
So it seems worthwhile to provide a way to choose how such collections are set on objects: as nulls or as empty collections.
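A minimal sketch of the "empty collections instead of nulls" option, assuming a small helper applied to values read back from Cassandra. The EmptyCollections class and its orEmpty methods are hypothetical names for illustration and are not part of Achilles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not an Achilles API): turns the null that comes back
// for an empty collection into an empty, mutable collection.
final class EmptyCollections {
    private EmptyCollections() {}

    static <T> List<T> orEmpty(List<T> list) {
        return list != null ? list : new ArrayList<T>();
    }

    static <T> Set<T> orEmpty(Set<T> set) {
        return set != null ? set : new HashSet<T>();
    }

    static <K, V> Map<K, V> orEmpty(Map<K, V> map) {
        return map != null ? map : new HashMap<K, V>();
    }

    public static void main(String[] args) {
        // The cast selects the List overload; a bare null would be ambiguous.
        List<String> friends = orEmpty((List<String>) null);
        System.out.println("friends is empty: " + friends.isEmpty());
    }
}
```

With such a helper the entity getters never return null, so a @NotNull constraint on the collection fields can hold after a load.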
How can we create a partition key using the annotations?
You can see the example here:
http://www.slideshare.net/slideshow/embed_code/23166139?startSlide=13
I'm writing a simple validator which should be called before persist for ALL entities (so it should be Interceptor), but I get an exception:
info.archinnov.achilles.exception.AchillesBeanMappingException: The entity class 'java.lang.Object' is not found
at info.archinnov.achilles.internal.validation.Validator.validateBeanMappingTrue(Validator.java:105)
at info.archinnov.achilles.internal.metadata.discovery.AchillesBootstrapper.addInterceptorsToEntityMetas(AchillesBootstrapper.java:116)
at info.archinnov.achilles.persistence.PersistenceManagerFactory.bootstrap(PersistenceManagerFactory.java:95)
at info.archinnov.achilles.persistence.PersistenceManagerFactory$PersistenceManagerFactoryBuilder.build(PersistenceManagerFactory.java:158)
Is there another way?
Right now, due to the 'upsert' nature of Cassandra, there are race conditions where we end up with inconsistent data in Cassandra:
em.persist(new User(userId, "John", "Doe", 52)); // firstname, lastname, age
em.persist(new User(userId, "Helen", "Sue")); // Another firstname & lastname, no age
In Cassandra we have : userId,"Helen","Sue",52
Logically, we expect to have only userId,"Helen","Sue" without the age.
The solution would be to clean the row first before inserting data
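The effect can be reproduced with a toy model in which a row is just a map of column name to value. This is only an illustration of upsert semantics, not Achilles or Cassandra code; UpsertDemo and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: one row is a map of column name -> value.
final class UpsertDemo {
    private UpsertDemo() {}

    // Upsert semantics: only the supplied columns are written, the rest survive.
    static void upsert(Map<String, Object> row, Map<String, Object> columns) {
        row.putAll(columns);
    }

    // The proposed fix in this toy model: clear the row before writing.
    static void cleanThenInsert(Map<String, Object> row, Map<String, Object> columns) {
        row.clear();
        row.putAll(columns);
    }

    // Builds a column map from alternating names and values.
    static Map<String, Object> columns(Object... namesAndValues) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            result.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        upsert(row, columns("firstname", "John", "lastname", "Doe", "age", 52));
        upsert(row, columns("firstname", "Helen", "lastname", "Sue"));
        System.out.println(row); // the age column from the first write is still there

        cleanThenInsert(row, columns("firstname", "Helen", "lastname", "Sue"));
        System.out.println(row); // the age column is gone
    }
}
```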
Since all serialization is done through Jackson, there is no reason to require classes to implement Serializable.
In the case of "normal" entities, duplicate the rowkey as a value to ensure the row is created
In the case of widerow, there should be no problem as the secondary elements of the composite key will be stored as data anyway ensuring the row is created.
please !
Let's consider a clustered entity Clustered
The following query will fail: em.typedQuery(Clustered.class, "SELECT * FROM Clustered").get();
The reason is that Achilles does not map back CQL column to compound primary keys
Expected :'EntityMeta{className=className, tableName/columnFamilyName=cfName, propertyMetas=age,name, idMeta=PropertyMeta{type=SIMPLE, entityClassName=null, propertyName=id, keyClass=class java.lang.Void, valueClass=class java.lang.Long, counterProperties=null, embeddedIdProperties=null, consistencyLevels=(ALL,ALL)}, clusteredEntity=true, consistencyLevels=(ONE,ONE)}'
Actual :'EntityMeta{className=className, tableName/columnFamilyName=cfName, propertyMetas=name,age, idMeta=PropertyMeta{type=SIMPLE, entityClassName=null, propertyName=id, keyClass=class java.lang.Void, valueClass=class java.lang.Long, counterProperties=null, embeddedIdProperties=null, consistencyLevels=(ALL,ALL)}, clusteredEntity=true, consistencyLevels=(ONE,ONE)}'
For JUnit tests we are using an embedded Cassandra but it still dumps data on disk. Between each test, there may be truncate operations to clean up data.
Truncate is very expensive and the test suite can take over 10 seconds to run if we are truncating several column families.
The idea would be to use an in-memory virtual file system and automatically truncate all managed tables/column families (this point should be optional though).
Tutorial https://github.com/doanduyhai/Achilles/wiki/5-minutes-Tutorial says:
PersistenceManagerFactory persistenceManagerFactory = new PersistenceManagerFactory(configMap);
But PersistenceManagerFactory() constructor is package-local.
Currently, the KeyValue<K,V> type exposes the following methods:
public K getKey();
public V getValue()
public int getTtl()
To be comprehensive, the KeyValue<K,V> should also expose the timestamp information which is associated to the underlying Cassandra column:
public long getTimestamp()
Sometimes Achilles entities are also used as model objects to be returned to clients/views.
The proper design is to put all models in a separate Maven project with REST resources.
Right now, doing so forces you to pull in all Achilles dependencies if your entity/model uses Achilles custom annotations.
Putting Achilles annotations and custom types in a separate Maven project will solve the issue.
Expose the Guava LRU cache size as config param
Add warning message in the code when cache is 80% full
Piece of cake
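A rough sketch of the idea, using a plain java.util.LinkedHashMap in access order instead of the Guava cache so that it stays self-contained. BoundedLruCache, its constructor parameter and isNearlyFull() are hypothetical names; with Guava the configured size would presumably be passed to CacheBuilder.maximumSize instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the Guava cache: a size-bounded LRU map with a fill warning.
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxSize; // would come from a config parameter
    private boolean warned;

    BoundedLruCache(int maxSize) {
        super(16, 0.75f, true); // true = access order, which gives LRU eviction
        this.maxSize = maxSize;
    }

    // True once the cache holds at least 80% of its maximum size.
    boolean isNearlyFull() {
        return size() * 10 >= maxSize * 8;
    }

    @Override
    public V put(K key, V value) {
        V previous = super.put(key, value);
        if (!warned && isNearlyFull()) {
            warned = true; // warn only once to avoid flooding the log
            System.err.println("Cache is 80% full (" + size() + "/" + maxSize + ")");
        }
        return previous;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // evict the least recently used entry
    }

    public static void main(String[] args) {
        BoundedLruCache<Integer, String> cache = new BoundedLruCache<>(10);
        for (int i = 0; i < 12; i++) {
            cache.put(i, "value" + i);
        }
        System.out.println("size after 12 puts: " + cache.size());
    }
}
```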
given two entity classes User and Car having a one-to-one bidirectional relationship.
when I persist or merge a User that relates to a Car
then the User and the Car must retain knowledge of each other's id
It would be nice to have different naming strategy for columns:
@Column(namingStrategy=NamingStrategy.SNAKE_CASE)
private String thisIsAComplicatedProperty;
// would be this_is_a_complicated_property
@Column(namingStrategy=NamingStrategy.CASE_SENSITIVE)
private String thisIsAnotherComplicatedProperty;
// would be "thisIsAnotherComplicatedProperty"
There should be a global setting: achilles.naming.strategy
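One possible shape for the two strategies, as a sketch only. The NamingStrategy enum below and its apply method are hypothetical and simply implement the two conversions described above; the snake_case conversion is naive and would split acronyms letter by letter (for example userID would become user_i_d).

```java
// Hypothetical enum: maps a Java property name to a column name.
enum NamingStrategy {
    SNAKE_CASE {
        @Override
        String apply(String propertyName) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < propertyName.length(); i++) {
                char c = propertyName.charAt(i);
                // Insert an underscore before every upper-case letter except the first char.
                if (Character.isUpperCase(c) && i > 0) {
                    sb.append('_');
                }
                sb.append(Character.toLowerCase(c));
            }
            return sb.toString();
        }
    },
    CASE_SENSITIVE {
        @Override
        String apply(String propertyName) {
            // Double quotes preserve the case of an identifier in CQL.
            return "\"" + propertyName + "\"";
        }
    };

    abstract String apply(String propertyName);

    public static void main(String[] args) {
        System.out.println(SNAKE_CASE.apply("thisIsAComplicatedProperty"));
        System.out.println(CASE_SENSITIVE.apply("thisIsAnotherComplicatedProperty"));
    }
}
```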
CGLib takes a significant time to create proxies on entities on the first entity usage. It would be convenient to provide a way to warm up Achilles by creating proxies at configuration time.
AchillesErrorHandler, like:
public interface AchillesErrorHandler {
// Process exception here.
Object onError(Exception exception);
}
Example of usage:
entityManager
.insert(entity)
.withErrorHandlers(myHandler1,myHandler2)
.execute();
Currently the WideMap API offers lots of useful methods, but there are too many of them. In particular, findFirst(), findFirstKey() and findFirstValue() are the same internally; only the return type differs.
The idea is to remove the KeyValue<K,V> type and replace the List<KeyValue<K,V>> return type with SortedMap<K,V>.
The initial motivation for defining the KeyValue<K,V> type was to provide metadata (timestamp & ttl) in addition to the plain couple (K,V).
However, most of the time, the end user just doesn't care about this metadata.
We could introduce a ValueWithMeta<V> type to include metadata when necessary.
The API would become:
...
...
SortedMap<K,V> find(K start, K end, BoundingMode bound,OrderingMode ordering, int count);
SortedMap<K,V> findFirst(int n);
Map.Entry<K,V> findFirst();
...
SortedMap<K,ValueWithMeta<V>> findWithMeta(K start, K end, BoundingMode bound,OrderingMode ordering, int count);
The ValueWithMeta<V> type will expose:
V getValue();
long getTimestamp();
int getTtl();
Currently, the @Key annotation is used with the order attribute, leading to some repeated boilerplate:
@Key(order = 1)
private final String messageType;
@Key(order = 2)
private final UUID messageId;
Allowing the order to be passed through the value attribute would enable the following notation:
@Key(1)
private final String messageType;
@Key(2)
private final UUID messageId;
Another possible solution would be to create an @Order annotation and thus rewrite the previous example as:
@Order(1)
private final String messageType;
@Order(2)
private final UUID messageId;
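For reference, the short notation works in Java whenever the annotation's single attribute is named value. A sketch with a hypothetical @Order annotation, plus a small reflective reader (KeyComponents, also hypothetical) that sorts the key fields by it:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical reader: lists the annotated fields of a key class in key order.
final class KeyComponents {
    private KeyComponents() {}

    static List<String> orderedNames(Class<?> keyClass) {
        List<Field> annotated = new ArrayList<>();
        for (Field field : keyClass.getDeclaredFields()) {
            if (field.isAnnotationPresent(Order.class)) {
                annotated.add(field);
            }
        }
        // Sort by the annotation value, not by declaration order.
        annotated.sort(Comparator.comparingInt((Field f) -> f.getAnnotation(Order.class).value()));
        List<String> names = new ArrayList<>();
        for (Field field : annotated) {
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(orderedNames(MessagesKey.class));
    }
}

// Hypothetical annotation: because the attribute is named "value",
// @Order(1) is shorthand for @Order(value = 1).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Order {
    int value();
}

final class MessagesKey {
    // Declared out of order on purpose; the annotation decides the key order.
    @Order(2)
    private final UUID messageId;
    @Order(1)
    private final String messageType;

    MessagesKey(String messageType, UUID messageId) {
        this.messageType = messageType;
        this.messageId = messageId;
    }
}
```

The same mechanism would apply to @Key if its order attribute were renamed to value.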
Caused by: java.lang.NullPointerException
at info.archinnov.achilles.table.ThriftTableCreator.discoverColumnFamily(ThriftTableCreator.java:51)
at info.archinnov.achilles.table.ThriftTableCreator.validateOrCreateCFForWideMap(ThriftTableCreator.java:107)
at info.archinnov.achilles.table.TableCreator.validateOrCreateColumnFamilies(TableCreator.java:27)
When setting keyspaceName in the test, there is an error "Cannot find PersistenceManagerFactory for keyspace 'achilles_test'".
To help deploy Achilles within an OSGi container it would be helpful if the JARs contained the necessary metadata in their manifest files. This can easily be done by using the maven-bundle-plugin: http://felix.apache.org/site/apache-felix-maven-bundle-plugin-bnd.html
Right now, the current AchillesxxxResource constructor signatures are
public AchillesCQLResource(String entityPackages, String... tables)
public AchillesCQLResource(String entityPackages, Steps cleanUpSteps, String... tables)
It is quite cumbersome to specify all the tables we want to clean up between tests. There should be an option to clean all tables managed by Achilles.
Maybe introduce new constructors to solve the issue:
// Clean all tables BEFORE and AFTER tests
public AchillesCQLResource(String entityPackages);
public AchillesCQLResource(String entityPackages,Steps cleanUpSteps);
Suppose 2 entities
public class User
{
@OneToMany(cascade = CascadeType.ALL)
@JoinColumn
private Identifier identifier;
}
public class Identifier
{
@ManyToOne
@JoinColumn
private User user;
}
When
User user = new User(....);
Identifier identifier = new Identifier(....);
user.setIdentifier(identifier);
identifier.setUser(user);
em.persist(user);
If the flag achilles.consistency.join.check is turned on, Achilles will complain that it cannot find the User with id xxx in the database during the consistency check. This is because, at the time of the consistency check, Achilles has not yet flushed all mutations to Cassandra.
Because of the circular dependency, we should check first in the current "unit of work" for the presence of entity User before checking from Cassandra.
It would be nice to be able to delete rows by key. The current API forces you to do a find before doing a delete...
Looking at this trace, it would be preferable to see that Achilles cannot call the setter because the column data is not present (reported as null in CQL), rather than a stack with ReflectionInvoker, IAE, etc.
info.archinnov.achilles.exception.AchillesException: Cannot invoke 'setIntValue' of type 'domain.Object' on instance 'domain.Object@7759f64d'
at info.archinnov.achilles.proxy.ReflectionInvoker.setValueToField(ReflectionInvoker.java:99) ~[achilles-core-2.0.6.jar:na]
at info.archinnov.achilles.entity.CQLEntityMapper.mapRowToEntity(CQLEntityMapper.java:70) ~[achilles-cql-2.0.6.jar:na]
at info.archinnov.achilles.query.typed.CQLTypedQueryBuilder.get(CQLTypedQueryBuilder.java:80) ~[achilles-cql-2.0.6.jar:na]
at persistence.ObjectDao.findAll(ObjectDao.java:40) ~[ObjectDao.class:na]
at ...
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_21]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_21]
at java.lang.Thread.run(Thread.java:722) [na:1.7.0_21]
Caused by: java.lang.IllegalArgumentException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_21]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_21]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_21]
at java.lang.reflect.Method.invoke(Method.java:601) ~[na:1.7.0_21]
at info.archinnov.achilles.proxy.ReflectionInvoker.setValueToField(ReflectionInvoker.java:96) ~[achilles-core-2.0.6.jar:na] ... 40 common frames omitted
Currently, when accessing a field, we make it accessible, read the value and make it un-accessible again: https://github.com/doanduyhai/Achilles/blob/master/achilles-core/src/main/java/info/archinnov/achilles/internal/reflection/FieldAccessor.java#L28
In multi-threaded environments, there are race conditions around the setAccessible block.
One solution is to make the field accessible once and for all, using a synchronized block, and never restore its previous accessible state.
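A sketch of the "make accessible once" approach. Instead of a synchronized block it uses ConcurrentHashMap.putIfAbsent to publish a single accessible Field per class and property, which is a swapped-in technique and not what FieldAccessor currently does. AccessibleFields and Sample are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache of Field objects that are made accessible once and kept that way.
final class AccessibleFields {
    // Keyed by "className#fieldName"; assumes a single class loader per class name.
    private static final ConcurrentMap<String, Field> CACHE = new ConcurrentHashMap<>();

    private AccessibleFields() {}

    static Field get(Class<?> type, String name) {
        String key = type.getName() + "#" + name;
        Field cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        try {
            Field field = type.getDeclaredField(name);
            field.setAccessible(true); // done once, never reverted
            Field raced = CACHE.putIfAbsent(key, field);
            return raced != null ? raced : field; // keep whichever thread won
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException("No field " + key, e);
        }
    }

    static Object read(Object target, String name) {
        try {
            return get(target.getClass(), name).get(target);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Cannot read " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(new Sample(), "intValue"));
    }
}

// Minimal class with a private field for the demo.
final class Sample {
    private int intValue = 42;
}
```

Because no thread ever flips the flag back, concurrent readers cannot observe the field in a temporarily inaccessible state.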
At the moment, you cannot bootstrap Achilles without providing a valid "entityPackages" parameter and without having at least 1 entity in those packages.
It should be possible to start Achilles without any entity mapping. In this case Achilles will act as a simple object mapper for Native Queries.
Usage examples
Options options = OptionsBuilder.withConsistency(ALL)
.ttl(10)
.timestamp(100L);
em.persist(myEntity,options);
The validatePartitionComponent check in CQLTableValidator.java doesn't work correctly with Cassandra 2.0.
This check: tableMetaData.getPartitionKey().contains(columnMetadata)
assumes that columnMetadata is going to be the identical object as an element of the partitionKey array.
I'm not sure what has changed in the protocol, and I have no idea how this was working in the first place, but using the same CQL driver (1.0.4) and just upgrading Cassandra means that check always fails. An inspection of the objects shows that they are identical in nearly every way. The "table" and "type" fields both reference the identical objects. But the "name" field, while identical in text, references different objects, hence making the entire "contains" check fail.
I haven't checked the others, but I'm guessing that the other methods in that class that check for equality (or contains) run the risk of having similar problems.
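The failure mode can be shown without the driver. In the sketch below ColumnStub is a hypothetical stand-in for the column metadata object, and it is an assumption of the sketch that equality is identity-based (no equals override); a lookup by column name then succeeds where contains fails.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical check: compares partition key columns by name instead of by equals().
final class PartitionKeyCheck {
    private PartitionKeyCheck() {}

    static boolean containsByName(List<ColumnStub> partitionKey, ColumnStub column) {
        for (ColumnStub candidate : partitionKey) {
            if (candidate.name.equals(column.name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<ColumnStub> partitionKey = Arrays.asList(new ColumnStub("id"));
        ColumnStub probe = new ColumnStub("id"); // same text, different object

        System.out.println("contains:       " + partitionKey.contains(probe));
        System.out.println("containsByName: " + containsByName(partitionKey, probe));
    }
}

// Stand-in for the metadata class; no equals/hashCode, so equality is identity.
final class ColumnStub {
    final String name;

    ColumnStub(String name) {
        this.name = name;
    }
}
```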
Enable support of the @Key annotation on constructor parameters to avoid having to create constructors AND setters
public static class MessagesKey implements MultiKey {
@Key(order = 1)
private String messageType;
@Key(order = 2)
private UUID messageId;
public MessagesKey() {}
public MessagesKey(String messageType) {
this.messageType = messageType;
this.messageId = null;
}
public MessagesKey(String messageType,UUID messageId) {
this.messageId = messageId;
this.messageType = messageType;
}
public UUID getMessageId() {
return messageId;
}
public String getMessageType() {
return messageType;
}
public void setMessageId(UUID messageId) {
this.messageId = messageId;
}
public void setMessageType(String messageType) {
this.messageType = messageType;
}
}
Since a key represents immutable data, I would like to be able to write:
public static class MessagesKey implements MultiKey {
private final String messageType;
private final UUID messageId;
public MessagesKey() { this(null, null); }
public MessagesKey(String messageType) {
this.messageType = messageType;
this.messageId = null;
}
public MessagesKey( @Key(order = 1) String messageType, @Key(order = 2) UUID messageId) {
this.messageId = messageId;
this.messageType = messageType;
}
public UUID getMessageId() {
return messageId;
}
public String getMessageType() {
return messageType;
}
}
When using Native Query, it is sometimes necessary to serialize an entity into JSON. Could Achilles expose a helper method to serialize an entity into JSON with its mapper?
This would ensure consistency between Native Query and Achilles.
There is a race condition that is likely to create data corruption:
Client1 persist User1(friendsList = {'John','Helen','Richard'})
Client2 persist User1(friendsList = {'Peter','Richard'})
If Client1 request hits Cassandra before Client2 request, the state in Cassandra would be:
-> User1(friendsList = {'John','Richard','Richard'})
If Client2 request arrives first, the state in Cassandra would be:
-> User1(friendsList = {'Peter','Helen','Richard'})
In both cases, we have data corruption.
Consequently, since we want to allow persist() to succeed even though the entity already exists in Cassandra, we need to fix the above issue.
Another strategy could be to check for entity existence in Cassandra when persist() is called and raise an exception, but that would make concurrent inserts by different clients impossible, at least within the same range of primary keys, and worse, it would imply a read-before-write pattern.
Right now, all statically defined UUID types are translated by Achilles into the Cassandra uuid type.
Because of that, some CQL3 functions like dateOf() or unixTimestampOf() do not work on those columns even though their value is indeed a TimeUUID.
To work around this issue, since the core UUID class is final and cannot be extended, we need to define a custom @TimeUUID annotation.
This will help Achilles map this type to the Cassandra timeuuid type.
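On the Java side, whether a given java.util.UUID value is time-based can be read from its version field, which is what distinguishes a TimeUUID from any other UUID. A small sketch using only java.util.UUID; the TimeUuids helper is a hypothetical name, and the sample value is the example UUID from RFC 4122.

```java
import java.util.UUID;

// Hypothetical helper around java.util.UUID for time-based (version 1) values.
final class TimeUuids {
    // Number of 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private TimeUuids() {}

    static boolean isTimeBased(UUID uuid) {
        return uuid.version() == 1;
    }

    // Milliseconds since the Unix epoch encoded in a version 1 UUID.
    // UUID.timestamp() throws UnsupportedOperationException for other versions.
    static long unixTimestampMillis(UUID uuid) {
        return (uuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000;
    }

    public static void main(String[] args) {
        UUID timeBased = UUID.fromString("f81d4fae-7dec-11d0-a765-00a0c91e6bf6");
        System.out.println("time-based: " + isTimeBased(timeBased));
        System.out.println("millis:     " + unixTimestampMillis(timeBased));
        System.out.println("random is time-based: " + isTimeBased(UUID.randomUUID()));
    }
}
```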