This is currently in the portal repo. Should we consider something like gravatar? Unsu

Move PI profile images to official metadata schema about nmdc-schema HOT 24 CLOSED

microbiomedata commented on July 19, 2024

Move PI profile images to official metadata schema

from nmdc-schema.

Comments (24)

dehays commented on July 19, 2024

I don't see anywhere to add an image to a profile in ORCID, so I'd guess we can't get researcher images from there. Gravatar is a maybe, in cases where someone uses WordPress or something else that caused them to add an avatar to Gravatar.

The PI profiles for the initial FICUS studies were very manually provided. I think the best the schema can do here is provide an optional attribute for an image URL. Thinking of the case where 10K+ studies are imported from NCBI - probably nothing there to set a profile image on import.

jeffbaumes commented on July 19, 2024

Agree, a URL would work fine and may be what we want to stick with, at least near-term. Submitters could use the URL of their Gravatar if they are savvy enough to do so, I suppose. The important thing is to get it into the schema and not hard-coded in the client.

jeffbaumes commented on July 19, 2024

@jbeezley could you put together a PR that adds an image URL to the PI schema? We could just point current PI URLs to our server for now, which is maybe not ideal, but at least the PI info would be upstream. I expect the current mapping of PI to image URL could also be placed in the PR and used in the scripts that pull together that JSON from our current dataset.

@wdduncan could you point us to the code that creates PI JSON?

wdduncan commented on July 19, 2024

@jbeezley I am pulling the principal investigator's name from the contact table in GOLD. To add this info to my ETL, I think I might have to add another table to the ETL ingest. It might be easier to have a call about how best to do it ...

jbeezley commented on July 19, 2024

We can't use a URL from our server because I store the data as a binary blob in the database. It also depends on the UUID of the principal investigator row. Perhaps we could base64-encode the data and put it in the provided study JSON? For reference, the images used are stored in https://github.com/microbiomedata/nmdc-server/tree/master/nmdc_server/ingest/pis.
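
A minimal sketch of the base64 idea, for illustration only: it turns raw image bytes into a `data:` URI that could be stored as a string in a study JSON document. The helper name `to_data_uri` and the default MIME type are assumptions, not code from nmdc-server.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI (hypothetical helper)."""
    # base64.b64encode returns bytes; decode to ASCII so it can sit in JSON.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Base64 inflates the payload by roughly a third, so embedding full-size photos this way would noticeably grow each study document.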

jbeezley commented on July 19, 2024

> I don't see anywhere to add an image to a profile in ORCID, so I'd guess we can't get researcher images from there. Gravatar is a maybe, in cases where someone uses WordPress or something else that caused them to add an avatar to Gravatar.

Gravatar would be a good alternative, but we don't necessarily get email addresses from ORCID. I checked some of our existing PIs and they don't make their email addresses public.
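
To show why the email address matters, here is a small sketch of how a Gravatar URL is typically derived: the address is trimmed, lowercased, and hashed. The function name `gravatar_url` is hypothetical, and MD5 is the long-standing hashing scheme; check Gravatar's current documentation for the preferred hash before relying on this.

```python
import hashlib


def gravatar_url(email: str, size: int = 200) -> str:
    """Build a Gravatar avatar URL from an email address (illustrative only)."""
    # Gravatar hashes the trimmed, lowercased address.
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    # d=404 asks for an HTTP 404 instead of a default image when no avatar exists.
    return f"https://www.gravatar.com/avatar/{digest}?s={size}&d=404"
```

Without a public email address there is nothing to hash, which is the limitation described above.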

jeffbaumes commented on July 19, 2024

Would it be reasonable to support both external URLs and image content by allowing either a URL or a data URI? As long as it is validated as one or the other, we could safely pass this through to the img src attribute and it would work in either case.

dwinston commented on July 19, 2024

@jeffbaumes I'd rather there be only one (external URL) versus two modes. It should be no more difficult to supply a URL for a profile image than for a data object.

On the topic of offering use of a Gravatar image, I think that can be convenient, but it should be opt-in somehow. Some PIs may have Gravatar images from a decade ago that they have forgotten about and would prefer not to use.

jeffbaumes commented on July 19, 2024

You could think of my suggestion as actually only one thing: provide a URI to an image. It just happens there are multiple types of URI we could support fairly easily. They could be handled identically start to finish, with no extra logic anywhere to support them other than a more complex regex for validation, so I'm not seeing much downside.

I'd be ok with URL, and savvy PIs could find and use their gravatar URL so in that sense it would be opt-in. The main extra-work-for-us for URL-only is that we would need to decide where to host the current PI images. They need to be at static, stable URLs.
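
As a rough sketch of the "validate as one or the other" idea, the check below accepts either an http(s) URL or a base64 image data URI and rejects everything else. The function name, the regex, and the list of allowed image types are assumptions for illustration, not anything agreed in this thread.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: only base64-encoded PNG, JPEG, or GIF data URIs are accepted.
_DATA_IMAGE_URI = re.compile(
    r"^data:image/(?:png|jpeg|gif);base64,[A-Za-z0-9+/]+={0,2}$"
)


def is_valid_image_uri(value: str) -> bool:
    """Return True for an http(s) URL or a base64 image data URI."""
    if _DATA_IMAGE_URI.match(value):
        return True
    try:
        parsed = urlparse(value)
    except ValueError:
        # urlparse can raise on malformed input such as a bad IPv6 literal.
        return False
    # Restricting schemes keeps values like "javascript:..." out of img src.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Either accepted form can be passed unchanged to an `img` `src` attribute, which is the property the proposal relies on.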

wdduncan commented on July 19, 2024

All these ideas are fine with me, but where in the ETL do we insert the URL/URI? I can do it on my end by simply having a file that gives the URI for each investigator in the contact table. However, this won't work for investigators who are not registered in the GOLD database.

jbeezley commented on July 19, 2024

We can make it optional and on the portal show a placeholder image. A lot of these questions on what to make required (#310) depend on features needed for the portal and what we are willing to leave blank.
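
One way the optional slot plus placeholder could fit together, sketched under assumptions: the ETL looks up a curated URL per investigator and leaves the value unset when none exists, and the portal substitutes a placeholder at display time. The placeholder path, function names, and mapping shape are all hypothetical.

```python
from typing import Mapping, Optional

# Hypothetical placeholder path served by the portal.
PLACEHOLDER_IMAGE = "/static/placeholder-avatar.png"


def image_url_for(pi_name: str, curated: Mapping[str, str]) -> Optional[str]:
    """ETL side: return the curated image URL, or None if nothing was curated."""
    return curated.get(pi_name)


def display_image(image_url: Optional[str]) -> str:
    """Portal side: fall back to the placeholder when the slot is empty."""
    return image_url or PLACEHOLDER_IMAGE
```

Keeping the fallback in the portal means the metadata never stores a fake value for investigators without a curated image.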

dehays commented on July 19, 2024

Moving to nmdc-schema to add optional pi_image_url slot

For the studies we currently have (12 FICUS) - will need to put image files on NERSC and manually set them in metadata - there is no standard source for these images so I don't see a way the GOLD ETL can set these

@wdduncan Question on implementation - an image_url on the person entity seems correct to me. Then study would refer to the PI (a person) image_url attribute. I think this is what you are doing with the principal_investigator_name on study. Does this make sense?

wdduncan commented on July 19, 2024

@dehays yes, that is what I was thinking.

dwinston commented on July 19, 2024

So the plan is to

eliminate core/person_value class
give core/person class an orcid slot
give core/person class an image_url slot
remove nmdc/study class' principal_investigator_name slot
add nmdc/study class principal_investigator slot with range core/person

Is that right?
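
The shape this plan implies can be sketched with Python dataclasses. This is only an illustration of the relationships (a person with optional `orcid` and `image_url`, and a study whose `principal_investigator` is a person); the `name` fields and the optionality of each slot are assumptions, and the real definitions would live in the schema itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str                        # assumed field, not listed in the plan
    orcid: Optional[str] = None      # new slot on core/person
    image_url: Optional[str] = None  # new slot on core/person


@dataclass
class Study:
    id: str
    name: str
    # Replaces principal_investigator_name; range is Person.
    principal_investigator: Optional[Person] = None


study = Study(
    id="gold:Gs0112340",
    name="Example study",
    principal_investigator=Person(name="Virginia Rich"),
)
```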

jeffbaumes commented on July 19, 2024

@wdduncan I added you as an assignee since I don't think @jbeezley can make the actual query changes himself. Please correct this or delegate if I'm off here.

ssarrafan commented on July 19, 2024

Adding comments from email exchange for reference:

Agree with your comments here David (and yours Kjiersten). I commented in parallel on GitHub, but the gist is that we need to resolve which fields we can and can't expect to require. Many "required" things for the portal could be made optional if needed.

#41

On Wed, Apr 28, 2021 at 1:31 PM Kjiersten Fagnan wrote:
I support the approach David laid out for what fields are required vs optional. I can add the following comments to the ticket, but we seem to be getting into this via email.

Could we create some default values for the portal to populate the page - avatar, URL, DOI and scientific objectives would be harder if not impossible.

In the future, when contributors are providing data to NMDC, could we also collect - photo, website URL, etc as part of the submission process - or perhaps give the PI the ability to add this themselves? This depends on having some level of access controls (different roles in the data portal than exist right now). Maybe this is part of working directly with the PIs to get their help on the study/data landing pages?

Kjiersten

On Wed, Apr 28, 2021 at 10:20 AM David Hays wrote:
Bill and Emiley said:

#41 I do not have access to the data for some of the fields that Jeff is requesting. I can give an estimate until we track the data.

@emiley Eloe-Fadrosh any idea of who to follow up on with to get access to the data needed?

#19 Again, I do not have access to images of PIs.

@emiley Eloe-Fadrosh any idea of who to follow up on with to get access to the PI images needed?

David should be able to address #41, seems like a GOLD database dump issue?

For #19, as was indicated in the github ticket, these were all manually collected by me. Not sure the best solution here, but this could tie into the more general discussion of the study pages (and some items from #41 like scientific objectives that are not part of the GOLD db dump). Not everything can be fully automated.... just my two cents.

On #41, I believe the fields that Bill is referring to are the ones that are NOT available from GOLD; i.e. those listed in https://github.com/microbiomedata/nmdc-metadata/issues/301 and #19 such as PI web site, PI image, scientific objective, publication DOIs. Basically, the ones that Emiley collected manually and provided to Kitware.

For these, Bill could add these to the schema as non required attributes. He has no way of making the GOLD ETL populate these because they do not exist in GOLD.

Jon states that the portal UI depends on these fields - but I believe the portal UI will need to treat them as optional fields as well because normally they will not be available. If we add 10K studies tomorrow from GOLD or NCBI - we will not be waiting for Emiley to collect values for these fields before they can be displayed in the UI. The portal UI needs to be flexible enough to handle cases where these values are not available.

I also believe it should not be the responsibility of search portal development to merge additional metadata for studies to extend what was made available for ingest. So that implies the need for a curate/annotate procedure that is available between GOLD or NCBI ETL and search portal ingest. And in the case of images, in addition to a way to edit the study JSON docs to add PI image URLs, there is also the need to add and manage the image files at a location associated with the metadata URL.

So for #41, there are a number of fields that are always available (We will always have a PI for a study.) that can be made required in the schema. But for those for which there is no available source except manual curation, at best these could be optional fields in the schema. Make sense?

-David
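
A curate/annotate step of the kind described could look roughly like the sketch below: manually curated values are overlaid on an ETL-produced study document, and only for fields the ETL did not supply. The field names, function name, and the shape of the curated mapping are assumptions made for illustration.

```python
from typing import Any, Dict

# Assumed names for the manually curated, optional fields.
OPTIONAL_CURATED_FIELDS = (
    "pi_image_url",
    "pi_website",
    "scientific_objective",
    "publication_dois",
)


def apply_curation(
    study: Dict[str, Any], curated: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a copy of the study with curated optional fields filled in."""
    merged = dict(study)
    extra = curated.get(study["id"], {})
    for field in OPTIONAL_CURATED_FIELDS:
        # Never overwrite what the ETL produced; only fill gaps.
        if field in extra and field not in merged:
            merged[field] = extra[field]
    return merged
```

Studies with no curated entry pass through unchanged, which matches the expectation that bulk imports will usually lack these fields.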

On Tue, Apr 27, 2021 at 1:51 PM Emiley Eloe-Fadrosh wrote:
For these two:

#41 I do not have access to the data for some of the fields that Jeff is requesting. I can give an estimate until we track the data.

@emiley Eloe-Fadrosh any idea of who to follow up on with to get access to the data needed?

#19 Again, I do not have access to images of PIs.

@emiley Eloe-Fadrosh any idea of who to follow up on with to get access to the PI images needed?

David should be able to address #41, seems like a GOLD database dump issue?

wdduncan commented on July 19, 2024

Please move to May sprint.

dwinston commented on July 19, 2024

@jeffbaumes is this subsumed by / a component of #41?

emiley commented on July 19, 2024

FYI - I think I’m possibly on this thread in error. Perhaps someone inadvertently @-mentioned me rather than the correct recipient.

dwinston commented on July 19, 2024

sorry @emiley, thank you for alerting us! We meant @emileyfadrosh. You can unsubscribe yourself.

ScreenFlow.mp4

jeffbaumes commented on July 19, 2024

> @jeffbaumes is this subsumed by / a component of #41?

@dwinston This issue has a slight additional complication attached (we need to host the profile images elsewhere and link to them by URL in the schema) so I feel it could be deserving of its own issue. But I'm also ok rolling it into #41.

ssarrafan commented on July 19, 2024

Based on the meeting today with @dehays, @emileyfadrosh, @dwinston, @wdduncan, and @jbeezley, @wdduncan will add image URL on the person object to the schema. The images can be stored on an object store at NERSC.

ssarrafan commented on July 19, 2024

Removing Jon from assignee.

wdduncan commented on July 19, 2024

I've added a profile image URL slot (see PR #68).

This is to be used with study objects like so:

{
    "id": "gold:Gs0112340",
    "name": "Thawing permafrost microbial communities from the Arctic, studying carbon transformations",
    "description": "....",
    "principal_investigator": {
        "has_raw_value": "Virginia Rich",
        "profile image url": "http://....."    <--- new slot
    }
}

NB: the property principal_investigator_name is now named principal_investigator.
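
For anyone consuming these documents, a small sketch of reading the new slot follows. It assumes the key names appear exactly as in the example above (`has_raw_value` and `profile image url`); the helper name and the example.org URL are made up for illustration.

```python
import json
from typing import Optional, Tuple


def pi_name_and_image(study: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (PI name, profile image URL); either may be None."""
    pi = study.get("principal_investigator") or {}
    return pi.get("has_raw_value"), pi.get("profile image url")


# Hypothetical study document; the image URL is a stand-in.
study = json.loads("""
{
    "id": "gold:Gs0112340",
    "principal_investigator": {
        "has_raw_value": "Virginia Rich",
        "profile image url": "https://example.org/images/rich.jpg"
    }
}
""")

name, image_url = pi_name_and_image(study)
```

Because the slot is optional, callers should expect `image_url` to be `None` for most imported studies.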

Closing this ticket. But it can be reopened if needed.
