Object model discussion about auraxium (closed)

leonhard-s commented on July 23, 2024
Object model discussion

from auraxium.

Comments (22)

qcoumes commented on July 23, 2024

I completely agree. The objects must be intuitive and not be as convoluted as the data returned by the census API. In the draft I wrote weeks ago, I already implemented some of these changes (like world -> server).

It should provide quick access to most of the data needed for common use cases (mostly Discord bots), with the Query interface only there as a fallback for very specific requirements.

qcoumes commented on July 23, 2024

I also think asyncio is the way to go; coupled with some caching, it should reduce the impact of the slowness of the Census API.

LordFlashmeow commented on July 23, 2024

I think the intermediate object method is the right way to go, creating an object for each unique intermediate table query.
For something like the characters_item table, we could do an object like:

class CharactersItem:
    def __init__(self, character_id):
        self.character_id = character_id
        self.response = []
        self.has_fetched = False

    @property
    def getresponse(self):
        # Run the query on first access only, then reuse the stored response
        if not self.has_fetched:
            self.response = Query(...)
            self.has_fetched = True
        return self.response

This will only work for simple cases, without joins or resolves. In those cases, I think it's up to the user to craft that query themselves. That's what the object model is supposed to be, right? A simple way to expose simple queries to the user.

leonhard-s commented on July 23, 2024

I am not perfectly sure how to go about implementing some of the more obscure object types (e.g. Abilities, Effects), but aside from weapon damage calculations I am not certain they are all that useful. There is no link between implants and abilities, none of it is documented... at least I'd be willing to ignore them until we find a reason/way to make them intuitive.

I'll finish off some basic test cases for the URL generation, then I'll start toying around with the object model in a new branch.

@qcoumes Do you see any reason to not use asyncio for this? I think it is worth using, especially for Discord bot usage.

leonhard-s commented on July 23, 2024

Something I would like some input on: How should we handle large intermediate tables like characters_item in the object model?

Here are the things that come to my mind, feel free to add your own below:

  1. Just resolving it into a list of items would be the naïve way of doing it, but this also means the instantiation of possibly thousands of items. Lots of wasted traffic and unnecessary API load.

    There would be kwargs that allow filtering, but this still makes the most obvious option also a bad one, which feels unintuitive. Is a "you are returning a list of items longer than 100 items, consider filtering your query" error sufficient here?

  2. Use an intermediary object that acts as a placeholder until data is accessed. The object would store the information about what character is targeted, and each time the user interacts with it (like "is this item in the list" or "for each weapon in that list"), the appropriate request is carried out in the background.

  3. Just don't include these tables unless necessary. The main use cases for this table are "Does <player> have <item> unlocked?" and "Who in <player_list> has <item> unlocked?", which could easily be integrated into standalone methods. Are there any use cases that would not be feasible this way?

qcoumes commented on July 23, 2024

> 1. Just resolving it into a list of items would be the naïve way of doing it, but this also means the instantiation of possibly thousands of items. Lots of wasted traffic and unnecessary API load.

A way to resolve this, albeit convoluted, would be to download the whole items table into the module, since it only changes on PS2 updates. Something like a cron job (a GitHub workflow with on.schedule would do) could check once in a while (say, once a week) whether the items changed and, if so, automatically upload a new patch version of auraxium. We could do the same for every large and mostly static table (like hex).

The two main caveats are that we have to rely on a server to do the job, and the end-user of the module would have to update it somewhat frequently (at most every time PS2 is updated).
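
As a rough illustration of the "check whether the table changed" step, here is a minimal sketch that compares a digest of the bundled table against a freshly downloaded one. The function names, the item_id field default and the sample rows are assumptions made for the example; downloading the table and publishing a release are left out.

```python
import hashlib
import json
from typing import Any, Dict, List


def table_digest(rows: List[Dict[str, Any]], id_field: str = 'item_id') -> str:
    """Return a digest that does not depend on row order or key order."""
    ordered = sorted(rows, key=lambda row: str(row[id_field]))
    payload = json.dumps(ordered, sort_keys=True, separators=(',', ':'))
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


def needs_release(bundled: List[Dict[str, Any]],
                  live: List[Dict[str, Any]]) -> bool:
    """Return True if the live table differs from the bundled copy."""
    return table_digest(bundled) != table_digest(live)


# Made-up sample rows, not real Census data
bundled = [{'item_id': '1', 'name': 'Knife'}, {'item_id': '2', 'name': 'Pistol'}]
reordered = [{'name': 'Pistol', 'item_id': '2'}, {'item_id': '1', 'name': 'Knife'}]
changed = bundled + [{'item_id': '3', 'name': 'Rifle'}]
```

A scheduled job would only need to call needs_release() and publish when it returns True, so reordered but otherwise identical responses would not trigger a release.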

> 2. Use an intermediary object that acts as a placeholder until data is accessed. The object would store the information about what character is targeted, and each time the user interacts with it (like "is this item in the list" or "for each weapon in that list"), the appropriate request is carried out in the background.

I think it's the easiest way and would work for every intermediate table.

leonhard-s commented on July 23, 2024

> I think the intermediate object method is the right way to go, creating an object for each unique intermediate table query.

My main issue with the intermediate object type is that it forgoes the intuitive OOP syntax I was looking for in favour of emulating the API model. At that point, we could also just hard-code common joins and leave any more complex queries up to the user.

@LordFlashmeow Regarding your example, we can get a "light" version of this through asynchronous properties/methods:

class Character:
  def __init__(self, id_: int) -> None:
    ...
  @async_property
  @cached(60)  # Cache response for up to 60 seconds
  async def items(self) -> List[Item]:
    # Perform the items query itself as needed
    return await Query('character', character_id=self.id).join('characters_item', list=True).get()

What I had in mind originally was something like a container that could implicitly create the queries as needed based on usage:

char = Character.get_by_name('higby')
if await Weapon.get_by_name('Solstice VE3') in char.items:
  # This dynamically generates a query that is tailored to only check for the Solstice's ID
  pass

However, that might be a lot of complexity just to be clever. Retrieving all items for my main character yields about 2100 items, "smaller" characters tend to sit at 500-1000 items. The full dump of the intermediate table itself is only between 50 and 150 KB total, if no filtering is performed.

The scale of the problem is not as bad if we decide to lazy-load the item objects (i.e. only populate the IDs and add the remaining data later). Here is what that could look like:

class LazyLoadedItem:
  def __init__(self, id_: int) -> None:
    self._id = id_
    # Alternatively we could use a last_polled timestamp to allow refreshing
    # of objects without re-instantiating them
    self._is_populated = False

  @async_property
  async def is_weapon(self) -> bool:
    # If the item is still just a placeholder, populate it before returning
    if not self._is_populated:
      await self._populate()
    return self._is_weapon

The issue with the latter is that it leads to nearly every property having to be asynchronous.

These are the options that remain in my eyes:

  1. Retrieve the entire list (because 100 KB are not worth worrying about)

    1a. Return them as integer IDs to discourage instantiating 1000+ objects for no reason
    1b. Return them as lazy-loaded objects, with all the async creep that follows

  2. Add an intermediate view object that will generate and return the appropriate query:

    • Containment checks: if await <item> in <ItemViewObject>
    • Iteration: async for <item> in <ItemViewObject> # All items retrieved at once
    • Conversion to lists of items and/or IDs
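
A minimal sketch of what such a view object (option 2) could look like. ItemView, its fetch callback and the fake backend are hypothetical names used only for illustration; a real version would build Census queries instead. One caveat: "await <item> in <view>" parses as "(await <item>) in <view>", and the "in" operator always calls the synchronous __contains__, so this sketch uses an explicit contains() coroutine for the containment check.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Optional

# Hypothetical backend: takes a character ID and an optional item ID filter
# and returns the matching item IDs. A real one would query the API.
Fetch = Callable[[int, Optional[int]], Awaitable[List[int]]]


class ItemView:
    """Placeholder for a character's items that only queries when used."""

    def __init__(self, character_id: int, fetch: Fetch) -> None:
        self._character_id = character_id
        self._fetch = fetch

    async def contains(self, item_id: int) -> bool:
        # Targeted query: only ask for the one item of interest
        return bool(await self._fetch(self._character_id, item_id))

    async def __aiter__(self) -> AsyncIterator[int]:
        # Iteration retrieves all items at once, then yields them one by one
        for item_id in await self._fetch(self._character_id, None):
            yield item_id

    async def to_list(self) -> List[int]:
        return [item_id async for item_id in self]


calls: List[Optional[int]] = []  # Records the filter used for each request


async def fake_fetch(character_id: int, item_id: Optional[int]) -> List[int]:
    calls.append(item_id)
    owned = [10, 20, 30]  # Made-up item IDs
    return owned if item_id is None else [i for i in owned if i == item_id]


async def demo():
    view = ItemView(1, fake_fetch)
    return await view.contains(20), await view.contains(99), await view.to_list()


result = asyncio.run(demo())
```

Creating the view costs nothing; each interaction then issues exactly one request, which is narrow for containment checks and broad only when iterating.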

LordFlashmeow commented on July 23, 2024

> My main issue with the intermediate object type is that it forgoes the intuitive OOP syntax I was looking for in favour of emulating the API model.

Can you give us an example of the syntax you were thinking of? Having a better sense of what you're hoping for might help us figure out some solutions.

leonhard-s commented on July 23, 2024

If I had anything specific I'd try to make it work, but I am just not sure which way might be best.

But the best syntax would be the most intuitive one - if I pretend to not know how the Census API is structured, this is how I would try to work with the kinds of data we would need to access characters_item for:

  • Checking if a weapon or item is unlocked

    >>> await Character.get_by_name('...').has_item('...')
    True
  • Iterating over unlocked weapons

    >>> async for weapon in Character.get_by_id('...').weapons:
    ...     print(Character.weapon_stats(weapon))

    Or with filter terms:

    >>> async for weapon in Character.get_by_id('...').weapons.filter(faction='NS'):
    ...     ...

Especially for the last example, an internal WeaponView object would be handy. We could instantiate it with a list of IDs to keep it lightweight (this would also allow "is this weapon in that list" checks without querying the API), and if the user starts iterating we retrieve the list of items before iterating over the returned list.
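
To make the ID-backed variant more concrete, here is a small sketch under stated assumptions: WeaponView, its loader callback and the sample records are invented for the example and are not part of auraxium. Containment is answered from the stored IDs alone, and the full records are only loaded (in one batch) once iteration starts.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, Iterable, List

# Hypothetical loader: resolves a batch of IDs into full records in one request
Loader = Callable[[List[int]], Awaitable[List[Dict[str, Any]]]]


class WeaponView:
    def __init__(self, ids: Iterable[int], loader: Loader, **filters: Any) -> None:
        self._ids = frozenset(ids)
        self._loader = loader
        self._filters = filters

    def __contains__(self, weapon_id: object) -> bool:
        # Answered from the stored IDs, no API request needed.
        # Note: this sketch ignores any filter terms for containment checks.
        return weapon_id in self._ids

    def filter(self, **filters: Any) -> 'WeaponView':
        # Returns a new view; nothing is fetched until iteration starts
        return WeaponView(self._ids, self._loader, **{**self._filters, **filters})

    async def __aiter__(self) -> AsyncIterator[Dict[str, Any]]:
        # One batch request, then filter the returned records locally
        for record in await self._loader(sorted(self._ids)):
            if all(record.get(key) == value for key, value in self._filters.items()):
                yield record


# Made-up records standing in for API data
WEAPONS = {1: {'id': 1, 'name': 'A', 'faction': 'NS'},
           2: {'id': 2, 'name': 'B', 'faction': 'VS'},
           3: {'id': 3, 'name': 'C', 'faction': 'NS'}}
requests: List[List[int]] = []  # One entry per batch request


async def fake_loader(ids: List[int]) -> List[Dict[str, Any]]:
    requests.append(ids)
    return [WEAPONS[i] for i in ids]


async def demo():
    view = WeaponView([1, 2, 3], fake_loader)
    has_two = 2 in view  # No request
    names = [w['name'] async for w in view.filter(faction='NS')]  # One request
    return has_two, names


has_two, names = asyncio.run(demo())
```

Filtering locally keeps the sketch short; a real implementation could instead pass the filter terms on to the query so that fewer records are transferred.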

SMC242 commented on July 23, 2024

I think option 2 is the best for 2 reasons:

  • It only does the queries it needs to do
  • Option 3 could be built off of option 2 (exchanging overhead for usability)
    It seems like the best compromise between performance and usability

LordFlashmeow commented on July 23, 2024

I think it makes sense to have an underlying weapons class (and items, etc.) that is an attribute of Character and handles its own data through @async_property.

leonhard-s commented on July 23, 2024

> I think it makes sense to have an underlying weapons class (and items, etc.) that is an attribute of Character and handles its own data through @async_property.

There will definitely be a Weapon class, probably inheriting from Item. The View object is mostly there to make it easy to select some of the weapons, rather than the API instantiating every Weapon object the user has unlocked every time the weapons attribute is accessed.

I think I'll build out the Character/Item/Weapon set of objects to test, then see if and how View objects would work with those. It seems simple enough, I am just paranoid that I am overlooking a use case or locking down options down the line.

leonhard-s commented on July 23, 2024

I finally added the first parts of the object model. Not a lot of collection types are implemented yet, but the groundwork (requests, caching, bases, etc.) is there.

I still have to do some fiddling to see how to best translate class properties back into API field names for queries, but it's getting there.

spascou commented on July 23, 2024

Hey!

As I'm working on a similar project perhaps you can get some ideas from there too; I kinda went towards the way @qcoumes proposed, that is to collect all data locally and then work with the copy, mostly because I don't want to depend on the Census API being online for data that changes less often than the service unexpectedly dies.

I mostly worked on infantry weapons so far, for the purpose of this.

Basically I'm using ps2-census as a low-level client that reliably builds and executes queries, but does not parse the output. Short-term unreliability of the Census API is mitigated by using tenacity in a retry strategy here.

The query that collects all info about infantry weapons is built within ps2-analysis and executed like this, using the start and limit query parameters to fetch the weapons 10 at a time until all have been collected. Raw results are stored in a local infantry-weapons.ndjson file.
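
The paging loop described above could be sketched roughly as follows. This is not the ps2-analysis code: fetch_all, to_ndjson and the fake page function are hypothetical, the retry strategy is left out, and the real requests are replaced by slicing a local list.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List

Page = List[Dict[str, Any]]


def fetch_all(fetch_page: Callable[[int, int], Page],
              page_size: int = 10) -> Iterator[Dict[str, Any]]:
    """Yield every record, requesting pages until a short page comes back."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        yield from page
        if len(page) < page_size:
            break
        start += page_size


def to_ndjson(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return ''.join(json.dumps(record, sort_keys=True) + '\n' for record in records)


# Fake data source standing in for the API (23 made-up records)
DATA = [{'item_id': str(i)} for i in range(23)]
starts: List[int] = []  # Records the start offset of each request


def fake_page(start: int, limit: int) -> Page:
    starts.append(start)
    return DATA[start:start + limit]


records = list(fetch_all(fake_page))
ndjson = to_ndjson(records)
```

If the total happens to be an exact multiple of the page size, this loop makes one extra request that returns an empty page before stopping.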

Then, at runtime I parse the infantry-weapons.ndjson file using this mess and generate dataclass objects such as InfantryWeapon, DamageProfile, Ammo, Projectile etc.

Although the objectives differ (making a user-friendly API client versus parsing data for further technical analysis), I hope some of my ideas can help here as well c:

leonhard-s commented on July 23, 2024

Hey @spascou, thank you for chiming in!

What file size does your local copy infantry-weapons.ndjson reach? I did not want to force users to store a lot of data on disk, which is why I opted for the caching system instead.

Currently, my cache only works via IDs or names (lazily implemented via OrderedDict to get started), but it could be extended to use a local database instead, which would allow querying the local data by other fields too.
This would give users the option of storing some or all collections locally to be able to access them during an API outage.

Does your system provide any way of querying the underlying data directly?
I am currently in the process of building out the first data classes, and I am unsure how best to translate from the user-friendly properties ("server" instead of "world", "asp_rank" instead of "prestige_level", "playtime" in hours rather than "times.minutes_played", etc.) to the API field names required to build queries. That is something you would have to go through as well, if I'm not mistaken, since you are keeping the original copy of the Census data, not your use-case-specific flavour.
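
One possible way to handle this translation is a per-class mapping table that stores the API field path together with a converter for each direction. The sketch below is only an illustration: FieldSpec, CHARACTER_FIELDS and both helper functions are hypothetical names, and the payload shape is assumed from the field names mentioned above.

```python
from typing import Any, Callable, Dict, NamedTuple


class FieldSpec(NamedTuple):
    api_field: str                 # Dotted path in the API payload
    to_user: Callable[[Any], Any]  # API value -> user-facing value
    to_api: Callable[[Any], Any]   # User-facing value -> API value


# Hypothetical mapping for a character class
CHARACTER_FIELDS: Dict[str, FieldSpec] = {
    'asp_rank': FieldSpec('prestige_level', int, str),
    'playtime': FieldSpec('times.minutes_played',
                          lambda minutes: int(minutes) / 60,    # Minutes -> hours
                          lambda hours: str(round(hours * 60))),  # Hours -> minutes
}


def to_query_terms(**kwargs: Any) -> Dict[str, str]:
    """Translate user-facing keyword arguments into API field/value pairs."""
    return {CHARACTER_FIELDS[name].api_field: CHARACTER_FIELDS[name].to_api(value)
            for name, value in kwargs.items()}


def read_field(payload: Dict[str, Any], name: str) -> Any:
    """Read a user-facing property from a (possibly nested) API payload."""
    spec = CHARACTER_FIELDS[name]
    value: Any = payload
    for key in spec.api_field.split('.'):
        value = value[key]
    return spec.to_user(value)


# Assumed payload shape; the values are made up
payload = {'prestige_level': '1', 'times': {'minutes_played': '90'}}
terms = to_query_terms(asp_rank=1, playtime=2)
hours = read_field(payload, 'playtime')
```

Keeping both directions in one table means a property is renamed or rescaled in a single place, for queries and for parsing alike.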

Anyway, thanks again for the input, I'll look into giving the user the option of keeping a local copy of (some) tables.

spascou commented on July 23, 2024

My infantry-weapons.ndjson (uncompressed) takes up 22 MB of disk space. But to avoid having to store an intermediate state (a more human-readable one than the raw data in the ndjson file), I'm simply parsing all of it at runtime to generate proper in-memory objects, like instances of the InfantryWeapon class.
So the process is "run the data downloader once, or every time you want to update the local copy", and then every execution parses it. Quite inefficient, but simpler; there's no direct access to the underlying data, as the generated objects try to contain everything, just better organized.

For your API case, I think caching requests is a good method. You might also be interested in this enums file btw, as these make sense to be even hardcoded considering they will realistically not change.

leonhard-s commented on July 23, 2024

Ah yes, that makes sense. Unfortunately, that will not be an option for most of what I want to do. A lot of it will be outfit-centric stuff like members, stats or outfit captures, and that stuff should be as up-to-date as the API allows.

I am not sure how to go about these kinds of enum tables yet.
I have tentatively settled on "have their caches be infinite with a 24 hour lifetime" for now, although that is of course more overhead than a proper Enum. Still, those objects should be fairly light-weight, especially once the strings are cast to integers and all that.
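
For illustration, a cache with an entry lifetime and an optional size limit can be sketched on top of OrderedDict as below. This is not the auraxium implementation: TTLCache and its parameters are hypothetical, "infinite" is read here as "no size limit", and the clock is injectable so that expiry can be demonstrated without waiting.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional, Tuple


class TTLCache:
    def __init__(self, ttl: float, max_size: Optional[int] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._max_size = max_size  # None means unbounded
        self._clock = clock
        self._data: 'OrderedDict[Hashable, Tuple[float, Any]]' = OrderedDict()

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock(), value)
        self._data.move_to_end(key)
        if self._max_size is not None:
            while len(self._data) > self._max_size:
                self._data.popitem(last=False)  # Evict the oldest entry

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        added, value = entry
        if self._clock() - added > self._ttl:
            del self._data[key]  # Entry outlived its lifetime
            return None
        return value


now = [0.0]  # Fake clock so the example does not need to sleep
cache = TTLCache(ttl=24 * 3600, clock=lambda: now[0])
cache.put(('faction', 1), 'placeholder value')
fresh = cache.get(('faction', 1))
now[0] = 24 * 3600 + 1
stale = cache.get(('faction', 1))
```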

qcoumes commented on July 23, 2024

I'd also like to use auraxium for outfit stuff, like keeping track of members' unlocks. Now that Django is starting to embrace async/await, I think that rewriting everything asynchronously was a very good idea :)

LordFlashmeow commented on July 23, 2024

I use auraxium for getting character stats, so the caching doesn’t help me. In fact, we may want a way to only enable it for some of the tables which don’t get updated often.

leonhard-s commented on July 23, 2024

The cache is currently implemented in a base class; the individual tables' behaviour is then specified via __init_subclass__. So we can pick and choose which tables use what kind of cache, and there is also a class method that allows overriding this setting at runtime.
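
The general pattern could look something like the sketch below. The class names, the cache_size/cache_ttl keywords and alter_cache() are assumptions chosen for the example rather than a description of the actual auraxium code; only the settings are shown, not the cache itself.

```python
from typing import Any, ClassVar, Optional


class Cached:
    """Base class storing per-subclass cache settings."""

    _cache_size: ClassVar[int] = 0  # 0 disables caching in this sketch
    _cache_ttl: ClassVar[float] = 0.0

    def __init_subclass__(cls, cache_size: int = 0, cache_ttl: float = 0.0,
                          **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Each subclass gets its own settings from its class statement
        cls._cache_size = cache_size
        cls._cache_ttl = cache_ttl

    @classmethod
    def alter_cache(cls, size: int, ttl: Optional[float] = None) -> None:
        """Override the cache settings of this class at runtime."""
        cls._cache_size = size
        if ttl is not None:
            cls._cache_ttl = ttl


class Faction(Cached, cache_size=10, cache_ttl=86400.0):
    """Mostly static table: long-lived cache."""


class Outfit(Cached, cache_size=20, cache_ttl=60.0):
    """Frequently changing table: short-lived cache."""


class CharacterStats(Cached):
    """No keywords given, so caching stays disabled."""


Outfit.alter_cache(0)  # Turn the cache off for one table only
```

Because the settings live on each subclass, changing one table at runtime leaves the others untouched.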

The main issue I am currently dealing with is large intermediate collections, like characters_item or outfit_member. I am still planning on looking into dummy objects that will not retrieve all data until they are used, but I haven't gotten around to it with work and all.

leonhard-s commented on July 23, 2024

I just pushed a first draft of a proxy object system that makes dealing with intermediate collections and related types a little more convenient.

Here are a few lines using this system. I'm pretty happy with it so far, but I'll have to see how well it holds up in the long term.

@classmethod
async def get_by_name(cls, name: str, *, locale: str = 'en',
                      client: Client) -> Optional['Weapon']:
    """Retrieve a weapon by name.

    This is a helper method provided as weapons themselves do not
    have a name. This looks up an item by name, then returns the
    weapon associated with this item.

    Returns:
        The weapon associated with the given item, or None

    """
    item = await Item.get_by_name(name, locale=locale, client=client)
    if item is None:
        return None
    return await item.weapon().resolve()

def item(self) -> InstanceProxy[Item]:
    query = Query('item_to_weapon', service_id=self._client.service_id)
    query.add_term(field=self._id_field, value=self.id)
    join = query.create_join('item')
    join.parent_field = 'item_id'
    proxy: InstanceProxy[Item] = InstanceProxy(
        Item, query, client=self._client)
    return proxy
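
To show the underlying idea in isolation, here is a stripped-down sketch of a proxy that defers its lookup until it is resolved. InstanceProxySketch is a hypothetical stand-in, not the InstanceProxy class used above: it wraps a plain coroutine function instead of a Query, and supporting "await proxy" directly is an assumption of this sketch.

```python
import asyncio
from typing import Any, Awaitable, Callable, Generator, Generic, List, Optional, TypeVar

T = TypeVar('T')


class InstanceProxySketch(Generic[T]):
    """Holds a deferred lookup; nothing runs until the proxy is resolved."""

    def __init__(self, lookup: Callable[[], Awaitable[Optional[T]]]) -> None:
        self._lookup = lookup

    async def resolve(self) -> Optional[T]:
        # Perform the lookup now and return the result (or None)
        return await self._lookup()

    def __await__(self) -> Generator[Any, None, Optional[T]]:
        # Lets callers write "await proxy" instead of "await proxy.resolve()"
        return self.resolve().__await__()


lookups: List[str] = []  # Records every lookup that is actually performed


async def find_item() -> Optional[str]:
    lookups.append('item')
    return 'Solstice VE3'  # Stand-in for a resolved Item instance


async def demo():
    proxy = InstanceProxySketch(find_item)
    created_without_request = not lookups  # Creating the proxy costs nothing
    return created_without_request, await proxy.resolve(), await proxy


result = asyncio.run(demo())
```

This sketch repeats the lookup on every resolve; whether results should be cached inside the proxy is a separate design decision.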

leonhard-s commented on July 23, 2024

With the rewrite effectively complete and #8 closing down, I am moving the remaining object model questions into separate issues to make the individual tasks look less intimidating, or "easier".
