
Instagram Node (node-red-web-nodes, open issue)

node-red commented on July 21, 2024
Instagram Node

from node-red-web-nodes.

Comments (17)

knolleary commented on July 21, 2024

We need to think carefully about passing streams around in messages. There are a number of problems with doing that today.

V1 might only be able to download the image and pass it on as a buffer.

zobalogh commented on July 21, 2024

Yes, I'm going to focus on implementing the "pass image URL as string" functionality first, as we still need to decide whether we're ever going to work with streams.

Secondly, I think I need to simplify one more thing. I think it's better if the output is never an array but we return multiple messages of a single string/image/stream! Keeps things simple!

knolleary commented on July 21, 2024

I think the count option for the input node isn't needed - it should fire whenever a new photo is added/liked from the time the node starts watching for them.

For the additional message options, i.e. what is emitted by the node, it depends on what useful metadata is available from the API.

zobalogh commented on July 21, 2024

OK, I removed the concept of "count". We can add that functionality later, if needed. I also modified my description so the notion of multiple returns/arrays is not mentioned any more.

For each image, Instagram returns location information (lat, long), comments (count, actual comment data), image title, likes/who liked, tags.

I'll implement a simple node first where these parameters are not transferred, then I'll add the additional features as step 2.

The basics of the image, such as latitude, longitude and title, transferred as msg.lat, msg.long and msg.title, will be straightforward to handle and understand. The structure of comments/likes is going to be harder to define and transfer consistently.
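
As a rough illustration of the flat properties described above, a helper might map one Instagram media item onto a message. This is only a sketch: `toMessage` is a hypothetical name, and the field names read from `media` (`location.latitude`, `location.longitude`, `caption.text`, `images.standard_resolution.url`) are assumptions about the shape of the API response, not something confirmed in this thread.

```javascript
// Hypothetical helper: map one Instagram media item to a flat Node-RED style msg.
// The shape of `media` is an assumption about the API response.
function toMessage(media) {
    const msg = {};
    // Payload is the image URL as a string (the "pass image URL as string" option).
    msg.payload = media.images && media.images.standard_resolution
        ? media.images.standard_resolution.url
        : undefined;
    // Location and caption may be missing on some items, so guard each access.
    if (media.location) {
        msg.lat = media.location.latitude;
        msg.long = media.location.longitude;
    }
    if (media.caption) {
        msg.title = media.caption.text;
    }
    return msg;
}

const example = toMessage({
    images: { standard_resolution: { url: "http://example.com/photo.jpg" } },
    location: { latitude: 51.5, longitude: -0.12 },
    caption: { text: "Sunset" }
});
console.log(example);
```

Comments and likes are deliberately left out here, since their structure is the part that is harder to pin down.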

zobalogh commented on July 21, 2024

@knolleary, Mark and I had a conversation today about the issue with polling REST APIs for updates. We can have either:

  1. input node: here the user would specify the query/refresh interval
  2. query node: here the user could trigger a refresh/API check with an input

In order to keep things simple, implementing 2) might be the better option. I updated the description to reflect this. We can discuss this tomorrow, but any comments are welcome. Thanks.

knolleary commented on July 21, 2024

What about 1) with a sensible default value that (in version 1), the user isn't expected to override? That is what I'd assumed we'd do.

hindessm commented on July 21, 2024

I think they are two different things: a query node that returns all (recent) results, and a polling node that only fires for new/updated results. I think we probably want both. I think writing the query node first makes sense since it is easier, and the polling node could re-use code from the query node.
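
To make the distinction concrete, here is a minimal sketch of how a poller could be layered on top of a query. All names (`queryRecent`, `createPoller`, `fetchItems`) are hypothetical, and `fetchItems` stands in for whatever call actually talks to the API.

```javascript
// Hypothetical sketch: a query returns all recent items, while a poller
// reuses the same query and only reports items it has not seen before.
function queryRecent(fetchItems) {
    // fetchItems is an assumed function returning an array of {id, url} objects.
    return fetchItems();
}

function createPoller(fetchItems) {
    const seen = new Set();
    let primed = false;
    return function poll() {
        const items = queryRecent(fetchItems);
        const fresh = items.filter((item) => !seen.has(item.id));
        items.forEach((item) => seen.add(item.id));
        if (!primed) {
            // The first poll only records what already exists, so the poller
            // reports photos added after it started watching.
            primed = true;
            return [];
        }
        return fresh;
    };
}

let store = [{ id: "a", url: "u1" }];
const poll = createPoller(() => store);
const first = poll();   // nothing is reported on the priming poll
store = [{ id: "b", url: "u2" }, { id: "a", url: "u1" }];
const second = poll();  // only the item with id "b" is new
console.log(first.length, second.map((item) => item.id));
```

The refresh interval is left to the caller in this sketch; a real input node would call `poll` on a timer.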

zobalogh commented on July 21, 2024

One of the next big decisions will be what additional metadata the social nodes should return, and in what format. With additional metadata, we could enable users to do cool things, like trigger an action based on location.

APIs usually return tons of data and we'll need to decide how/what we're returning to the user.

See some example data from Instagram:

https://gist.github.com/zobalogh/dc1a719a3df3465c7ad7

We could go down two routes (or more?):

  1. Return everything an API returns, in the format it returns it in.
  2. Standardise on a Node-RED style format.

This decision should not be taken lightly as it will likely impact us in the long run.

I'm tagging all of us IBMers working on Node right now, so you can think about what to do before Tuesday:

@knolleary, @hindessm, @hbeeken, @Raminios, @dceejay, @anna2130

Deciding whether we return Node-RED convention-formatted data or the whole message has implications for whether we can use 3rd party APIs. For example, some social APIs might make it easy or difficult to get at the original message, or hide it altogether. Is this a problem? We need to decide this.

I don't have a strong opinion either way. Returning the whole data is more work for us in many cases (can't use APIs, have to make use of plain HTTP/REST) but is useful for a user who is familiar with the given service, and it gives the user more options. However, a Node-RED specific metadata format could simplify our work and would let less technical users (familiar with Node-RED only) carry their knowledge from one social node to another.

I'm slightly more biased towards having our own global spec/standard and limiting it to as few features as possible for the time being. Maybe add (in the Instagram case): ID, location (lat/long) and title.

Thoughts? We need to discuss this on Tuesday.

Here's a summary of the Instagram node (for reference):

From the Instagram node, I personally believe, the following fields are/could be important (in order of importance, as ordered by me):

id, location (lat, long, name, id), caption (created time, text, from (as in who uploaded it)), tags, user_in_photo, user_has_liked (list of users), comments, type, images (low res, thumbnail, standard res).

Type (image/video) is something we could decide to ignore and focus on images only. This is to be decided. Images (the different resolutions) could be ignored completely and rely on the normal res only.

Location is very important as it enables us to do social stuff based on that. It has lat, long, name and an internal ID (Instagram has locations defined). Caption is also quite important (naming) and the ID can also help us in naming/backup as it's technically an Instagram GUID. Tags and comments are less important, I believe, and could be dealt with later.

zobalogh commented on July 21, 2024

@knolleary, I updated the description to reflect what has been implemented, provided that pull request #44 is pulled. Thanks

knolleary commented on July 21, 2024

I've been thinking about how these nodes would get used. I think the fact the query node could return 0-n individual messages makes it hard to use in a flow that expects a single outcome. I think the query node version should return the most recent image/like - the same way the swarm query node returns the most recent check-in.

I can't think of a scenario where you'd use the inject/query node combination where just the input node would do. I can think of scenarios where some event occurs (an http request) and you need to provide a single response, which you cannot do if the query node emits multiple messages.

What do you think? /cc @hindessm @hbeeken

hindessm commented on July 21, 2024

@knolleary:

I've not made up my mind. However, it might be useful to consider a use case: how would you write a flow that begins with an inject at 6am every day, queries all my instagram photos for the last 24 hours, rates them (using an opencv-based function node or a web service), then picks the one with the best score and uploads it as a facebook cover photo?

Regarding the first paragraph, there are other similar use cases, such as venues from swarm that might be returned as a list ordered by proximity but that you might like to rate on some criteria, for instance to pick the closest with the highest food standards rating - if you don't trust your friends' judgment alone ;-)

Do you have query nodes send a msg array or multiple msg objects? I tend to prefer the latter because it makes it easier for other nodes to just process one msg at a time, but then you need some way to identify related batches - so you can have a node that, for instance, takes the first or takes the best based on some score and reduces the flow back to one msg object again.
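
The "takes the best based on some score" step could look roughly like this once a batch has been collected. It is a sketch under assumptions: `pickBest` and the `score` property are hypothetical names, and how the batch gets collected is left out.

```javascript
// Hypothetical reducer: given a collected batch of msgs, keep the one with the
// highest `score` (an assumed numeric property set by an earlier rating step).
function pickBest(batch) {
    return batch.reduce(
        (best, msg) => (best === null || msg.score > best.score ? msg : best),
        null
    );
}

const winner = pickBest([
    { payload: "photo-1", score: 0.4 },
    { payload: "photo-2", score: 0.9 },
    { payload: "photo-3", score: 0.7 }
]);
console.log(winner.payload);
```

An empty batch yields `null` here, which a flow would have to treat as "nothing to upload".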

hbeeken commented on July 21, 2024

I think it does make sense for a node to return an array of msgs if it's appropriate, while giving the user the option in the UI to specify how many msgs they want (there would need to be a description of what this number or shortened list represents).

For example, in the case of "recommended venues nearby" (issue #42) the query node should return an array of recommended venues as returned by a call to the Foursquare API. This is perfectly valid because the user may want to then filter this list on some other criteria (as mentioned in the previous comment by @hindessm) which they can then do using the function node, or another node they've written themselves. After various decisions have been made a "venue to go to" result will be made and then we're back to one msg. I don't believe in this case the foursquare node should do any analysis on the output as that is restricting its use. You could foresee that in the future more options could be added to the node to cover popular usecases but the node should always be able to return the list of venues.

Returning one massive msg vs an array of messages...I'd vote for an array of msgs. It keeps the manipulation in subsequent nodes simpler. It's also difficult to assign msg.lat, msg.lon etc. if the msg.payload is the list of venues.

The problem that some of the nodes expect a single msg rather than an array....I'd say that there are some fundamental ones (debug for example) which should be able to handle both a single msg and an array of msgs. It's not necessary for all to handle an array because sometimes that doesn't make sense. If the user really does just want one result then they can choose that in the UI for the node. If the user wants more than one result you can assume they want to do some further analysis in which case they should be happy to write a node which does the analysis relevant to their flow which then ultimately results in a single msg on which they can use some of the existing nodes.

Finally, the node-RED documentation specifically talks about sending multiple msgs if appropriate - what was the original use case for this and surely if there was one it still stands?

knolleary commented on July 21, 2024

> The node-RED documentation specifically talks about sending multiple msgs if appropriate - what was the original use case for this and surely if there was one it still stands?

There are occasions when a node wants to send on multiple messages in response to a single input. For example a node that splits a body of text into individual lines, where the user wants to do some discrete processing on each line. Or maybe it wants to send a sequence of commands to a serial port with a delay node adding a pause between each.

> Returning one massive msg vs an array of messages...

For all of the scenarios that Mark describes, where a user wants to pick from a list based on some criteria, it is much harder to do that if each item of the list arrives as its own message. A node would never know when it has received all of the items in the 'set'. We would need to invent yet another convention for tagging the messages; you start with tagging them with an id to identify the set, but then you also have to tag them with an index and a count so the receiver knows how many are coming in the set. I've got some vague sketches for a scatter/gather pair of nodes that would do this in a generic fashion, so that each node wouldn't have to implement it themselves. Maybe I need to convert the vague sketches to a more concrete implementation to properly enable some of these scenarios.
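
The tagging convention described above (a set id, plus an index and a count) can be sketched as a scatter/gather pair. This is not the implementation being referred to, only an illustration under assumptions: the function names and the `msg.set` property are hypothetical.

```javascript
// Hypothetical scatter: tag each message with a set id, its index and the total count.
function scatter(items, setId) {
    return items.map((item, index) => ({
        payload: item,
        set: { id: setId, index: index, count: items.length }
    }));
}

// Hypothetical gather: buffer messages per set id and release the ordered
// payloads once `count` messages have arrived; returns null until then.
function createGather() {
    const pending = new Map();
    return function gather(msg) {
        const { id, index, count } = msg.set;
        if (!pending.has(id)) {
            pending.set(id, { received: 0, payloads: new Array(count) });
        }
        const entry = pending.get(id);
        entry.payloads[index] = msg.payload;
        entry.received += 1;
        if (entry.received < count) {
            return null;
        }
        pending.delete(id);
        return { payload: entry.payloads };
    };
}

const gather = createGather();
const parts = scatter(["x", "y", "z"], "batch-1");
// Deliver out of order to show that the index restores the original order.
const results = [parts[2], parts[0], parts[1]].map((msg) => gather(msg));
console.log(results);
```

The sketch assumes every message of a set arrives exactly once; lost or duplicated messages, and empty sets (which produce no messages at all), would need extra handling.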

Regardless, it is much easier to convert a single message containing multiple results to multiple individual messages than it is to go the other way.

But maybe that needs to be an option on the query nodes: emit as a single message (default) or emit each as its own message. We do that with other nodes (tcp, for example), although they are Input nodes rather than Query nodes.
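
Such an option might be sketched as follows. `buildOutput` and its `asSingle` flag are hypothetical names; the function only builds the list of messages to emit and does not send anything.

```javascript
// Hypothetical helper: build what a query node would emit for a list of results.
// `asSingle` mirrors the proposed option, with a single message as the default.
function buildOutput(results, asSingle = true) {
    if (asSingle) {
        // One message whose payload carries every result.
        return [{ payload: results }];
    }
    // One message per result.
    return results.map((result) => ({ payload: result }));
}

const single = buildOutput(["url-1", "url-2"]);
const separate = buildOutput(["url-1", "url-2"], false);
console.log(single.length, separate.length);
```

Going from the single form to the separate form is a one-line `map`, which is the asymmetry the previous paragraph points out.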

On reusing message objects

This is what started me down this thread. We have made a change to the core where we no longer clone every single message a node sends. For efficiency, on each call to node.send, we don't clone the first message passed to the function; we only clone the subsequent messages in the array. This makes a huge saving when the majority of nodes are only dealing with single messages at a time. But it does mean nodes must not reuse msg objects within their implementation - something that, afaik, none of the core nodes do.

The instagram node, and maybe some of the other web nodes, do reuse the object passed to node.send - which needs to be fixed. There is no reason for the Input node version to reuse msg objects - there's nothing preceding it in a flow, so it can create new msg objects for each message it emits. The query node is trickier, because there may be existing properties on the received message that must be preserved on the emitted message. If the query node only emits a single message, this is easy. If it outputs multiple messages, how does it decide which message to attach existing properties to?

In the case of multiple messages, there is also the question of whether it should call node.send multiple times, or call node.send once with an array of messages. The cloning change means it must do the latter if it reuses message objects so that they will get cloned properly.
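
One way a query node could avoid reusing the incoming object is to build a fresh msg per result and copy the existing top-level properties onto each. This is a sketch of that idea only, with a hypothetical `fanOut` helper; how the resulting messages are handed to node.send is deliberately left out.

```javascript
// Hypothetical sketch: create a new msg per result instead of reusing the
// incoming one, copying over the properties that arrived on the trigger.
function fanOut(incoming, results) {
    return results.map((result) => {
        // Shallow copy: top-level properties are preserved, but any nested
        // objects are still shared between the copies.
        const msg = Object.assign({}, incoming);
        msg.payload = result;
        return msg;
    });
}

const trigger = { topic: "instagram", payload: "go" };
const out = fanOut(trigger, ["url-1", "url-2"]);
console.log(out, trigger);
```

This answers "which message gets the existing properties" with "all of them", at the cost of one copy per result.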

zobalogh commented on July 21, 2024

I'm numbering my thoughts into separate points as I'm talking about many things:

  1. I recently thought about a possible feature where nodes could signal state to each other. Knowing about state could be useful in many scenarios, such as one where a flow splits up to two separate chains of action where each chain has some relation to the other (possibly later on joining up as a single chain again). Furthermore state then could be used to control knowledge about sending multiple payloads/messages. What do you think about this alternative approach?
  2. In terms of the Instagram node, I think an alternative solution for the query node's behaviour is to return a blank message if there's no change and the most recent image is still the same as what we've already returned. Flows could still make sense of that.
  3. On the other hand, we could hugely simplify the Instagram node (should we wish to) where the node could maybe only return the single most recently liked/uploaded photo irrespective of anything. Then the cloning of the message would not be an issue.

knolleary commented on July 21, 2024

On your first point, the whole flow-based model is based on the principle that nodes only interact via the messages that pass between them. We have thought through ideas of sharing state (ie a started/stopped state) and it gets sticky and overly complicated quite quickly.

I think there are (at least) two modes of query for instagram (and swarm, and ... and ... and ...):

  1. give me the most recent image/like/checkin. This avoids the multiple message issue.
  2. give me everything I've not seen since last time I asked. This hits the multiple message issue.

In the interests of being able to do a release of the web nodes, and not rushing a solution to multiple messages in the next day, it may be more pragmatic to change the instagram node to the 'most recent' query instead. That keeps it consistent with the swarm node. The input node can be used to do the 'ensure I get everything' mode of operation. It doesn't allow the "at 6am each day give me everything since 6am yesterday" scenario, but we can solve that one next week.

A simple function node can be used to detect if the most recent is the same as the one processed last time, and drop it accordingly. We might add that as an optional behaviour to the instagram node if it feels right to do, but I don't think it should be in this iteration.
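
The logic of such a filter might look like the sketch below. It is written as a standalone function rather than as function node code, so the state lives in a closure; `createDropRepeats` and the `msg.id` property are assumptions. In a function node, returning null means nothing is passed on.

```javascript
// Hypothetical filter: pass a msg on only if its id differs from the last one seen.
function createDropRepeats() {
    let lastId = null;
    let hasLast = false;
    return function dropRepeats(msg) {
        if (hasLast && msg.id === lastId) {
            return null; // same item as last time, so drop it
        }
        lastId = msg.id;
        hasLast = true;
        return msg;
    };
}

const dropRepeats = createDropRepeats();
const outputs = [{ id: "a" }, { id: "a" }, { id: "b" }].map((msg) => dropRepeats(msg));
console.log(outputs);
```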

hindessm commented on July 21, 2024

> In the case of multiple messages, there is also the question of whether it should call node.send multiple times, or call node.send once with an array of messages. The cloning change means it must do the latter if it reuses message objects so that they will get cloned properly.

@knolleary Huh? How could an instagram query node send multiple messages with different payloads without calling clone itself, with the same special cases as node.send()? I don't see how calling node.send once helps at all unless all the messages are the same?

knolleary commented on July 21, 2024

@hindessm I know what I meant, but re-reading it, it didn't come across well.

The inherent issue with a query node returning multiple messages is what to do with all of the properties that already existed on the triggering message. To attach them to every message means cloning them all - and that is expensive when the system will clone them again... unless you call node.send() for each individually... at which point my argument eats its own tail and goes home for the evening....
