
neo4j-partners / hands-on-lab-neo4j-and-vertex-ai

Hands on Lab for Neo4j and Vertex AI

License: Apache License 2.0

Jupyter Notebook 100.00%
data-science datascience google-cloud machine-learning neo4j vertexai genai

hands-on-lab-neo4j-and-vertex-ai's People

Contributors

benofben, ezhilvendhan, gogitguhan, jeffneo, jexp, runfourestrun, sidagarwal04, zach-blumenfeld

hands-on-lab-neo4j-and-vertex-ai's Issues

Cleanup Gitignore

The gitignore is full of cruft that must have been copied from somewhere. Clean out anything unused.

Vertex AI Quota

Free trials now allow only one ML pipeline. That means we can't run both the embedding and raw notebooks simultaneously. Not sure what to do about this.

Investigate Forecasting

Vertex AI has a new forecasting feature. Figure out if it makes sense for the lab as our data set is a time series.

Lab 3 - fix syntax errors due to Neo4j 5

Old syntax:
CREATE CONSTRAINT IF NOT EXISTS ON (p:Company) ASSERT (p.cusip) IS NODE KEY;
CREATE CONSTRAINT IF NOT EXISTS ON (p:Manager) ASSERT (p.filingManager) IS NODE KEY;
CREATE CONSTRAINT IF NOT EXISTS ON (p:Holding) ASSERT (p.filingManager, p.cusip, p.reportCalendarOrQuarter) IS NODE KEY;

Replace it with the new syntax:
CREATE CONSTRAINT IF NOT EXISTS FOR (p:Company) REQUIRE (p.cusip) IS NODE KEY;
CREATE CONSTRAINT IF NOT EXISTS FOR (p:Manager) REQUIRE (p.filingManager) IS NODE KEY;
CREATE CONSTRAINT IF NOT EXISTS FOR (p:Holding) REQUIRE (p.filingManager, p.cusip, p.reportCalendarOrQuarter) IS NODE KEY;
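
The rewrite above is mechanical, so it could be scripted across the notebooks. A minimal sketch, assuming every old statement follows the `ON (...) ASSERT` shape shown here; `upgrade_constraint` is a hypothetical helper name, not something that exists in the lab:

```python
import re

# Matches the Neo4j 4 form: "ON (<node pattern>) ASSERT"
_OLD_CONSTRAINT = re.compile(r"\bON\s*(\([^)]*\))\s*ASSERT\b")


def upgrade_constraint(statement: str) -> str:
    """Rewrite 'ON (...) ASSERT' to the Neo4j 5 'FOR (...) REQUIRE' form."""
    return _OLD_CONSTRAINT.sub(r"FOR \1 REQUIRE", statement, count=1)
```

Statements that are already in the new form do not match the pattern and are returned unchanged, so the helper is safe to run more than once.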

Simplify Moving Data lab

Right now we load a day of data and then a year. Walking through that Cypher improvement should probably stay. However, given the change in GDS approach, we don't need the two data models any more.

Lab 5 - upload to google cloud storage

This step gives me an error message saying that 'Bucket.location' is deprecated. Did the Google code change?
I'm not sure if I'm doing the step right or not.

Full error message:

:2: DeprecationWarning: Assignment to 'Bucket.location' is deprecated, as it is only valid before the bucket is created. Instead, pass the location to Bucket.create.
bucket.location=REGION

TransportError Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/google/auth/compute_engine/credentials.py in refresh(self, request)
110 try:
--> 111 self._retrieve_info(request)
112 self.token, self.expiry = _metadata.get_service_account_token(

15 frames
TransportError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f544792bca0>)

The above exception was the direct cause of the following exception:

RefreshError Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/six.py in raise_from(value, from_value)

RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f544792bca0>)
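
Two separate things show up in the output above. The DeprecationWarning is only a warning and asks for the location to be passed when the bucket is created. The RefreshError appears to be a different problem: the client could not fetch default credentials from the metadata service. A minimal sketch of the first fix, assuming `client` is a `google.cloud.storage.Client` and that the region and bucket name come from the notebook; `create_regional_bucket` is a hypothetical helper, not lab code:

```python
def create_regional_bucket(client, bucket_name: str, region: str):
    """Create a bucket with its location set at creation time.

    `client` is assumed to be a google.cloud.storage.Client. Passing
    `location=` here replaces the deprecated `bucket.location = REGION`.
    """
    return client.create_bucket(bucket_name, location=region)


# Usage in the notebook (requires google-cloud-storage and working credentials):
#   from google.cloud import storage
#   bucket = create_regional_bucket(storage.Client(), STORAGE_BUCKET, REGION)
# STORAGE_BUCKET is a placeholder for whatever bucket name the lab uses.
```

This does not address the credentials failure, which still needs the notebook to run somewhere a default service account is available or to authenticate explicitly.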

Update to Neo4j 5

There are various syntax issues caused by the version bump from Neo4j 4 to 5.

Use gcloud Environment Variables

I really feel like Project ID and Region are probably already set in the managed Vertex AI Workbench environment. But, I'm not seeing what they are set as. Probably should revisit this.

I was able to delete the auth code.
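
One way to avoid hard-coding these values is to look them up at runtime. This is a rough sketch under assumptions: the environment variable names in the usage comment are guesses (the issue notes it is unclear what Workbench actually sets), and the fallback relies on the `gcloud config get-value` command being available in the notebook environment. `gcloud_setting` is a hypothetical helper.

```python
import os
import subprocess


def gcloud_setting(env_var: str, config_key: str, default: str = "") -> str:
    """Return env_var if set, else the gcloud config value, else default."""
    value = os.environ.get(env_var, "").strip()
    if value:
        return value
    try:
        result = subprocess.run(
            ["gcloud", "config", "get-value", config_key],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # gcloud is not installed or the lookup failed
        return default
    out = result.stdout.strip()
    return out if out and out != "(unset)" else default


# Usage (variable names are assumptions, not confirmed Workbench settings):
#   PROJECT_ID = gcloud_setting("GOOGLE_CLOUD_PROJECT", "project")
#   REGION = gcloud_setting("REGION", "compute/region", default="us-central1")
```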

Remove Workspace

Right now we're telling people to switch the config to workspace. Once that becomes the product default, that's great. But let's not use a non-default configuration in the lab. It's just going to confuse people.

Key on CUSIP

Lab 7 is using a text name rather than the CUSIP as a key.

Rename and Delete Data Files

We've wound up with a bunch of files in the neo4j-datasets bucket. Some say things like -v2. We need to delete everything that's not being used and name the files in a way that represents their contents.

Lab 4 - bloom is empty (no

The Perspective is created during an earlier step, before any data is loaded, so it is empty. Students will be confused.
The easiest solution is probably to remove the steps in the earlier lab where students connect to Bloom.
A second option is to tell students to create or import a new Perspective.

Bloom Issue

Something obscure is going on when Bloom is first loaded. In some cases, the node types aren't populated. This seems to occur intermittently. Need to figure out what exactly is going on.

Streamline Neo4j Auth

We now have a bunch of notebooks. They all authenticate to the same Neo4j database. When we had two, it wasn't too painful to enter creds. Now it's pretty awful. We need to set up creds in a way that they cascade through all the notebooks. It might just be a walkthrough of how to set an environment variable.
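
A minimal sketch of the environment-variable approach, assuming each notebook runs in a kernel that inherits the same environment. The variable names `NEO4J_URI`, `NEO4J_USERNAME` and `NEO4J_PASSWORD` and the helper `neo4j_credentials` are illustrative choices, not names the lab already uses:

```python
import os


def neo4j_credentials(env=None):
    """Read Neo4j connection settings from the environment.

    Returns (uri, username, password). The username defaults to "neo4j".
    """
    env = os.environ if env is None else env
    missing = [key for key in ("NEO4J_URI", "NEO4J_PASSWORD") if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Set these environment variables first: " + ", ".join(missing)
        )
    return env["NEO4J_URI"], env.get("NEO4J_USERNAME", "neo4j"), env["NEO4J_PASSWORD"]
```

Each notebook could then call `uri, user, password = neo4j_credentials()` and pass the result to `GraphDatabase.driver(uri, auth=(user, password))`, so the credentials are entered once rather than per notebook.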

Remove Form 13, Replace with 10-K

Now that we're not doing a supervised learning component in here, it probably makes sense to just switch entirely to the 10-K. Need to discuss.

Transaction in LOAD CSV

Use transactions in load

:auto LOAD CSV WITH HEADERS FROM 'https://storage.googleapis.com/neo4j-datasets/form13/form13-v2.csv' AS row
CALL {
  // import row into the subquery scope
  WITH row
  MATCH (m:Manager {managerName:row.managerName})
  MATCH (c:Company {cusip:row.cusip})
  MERGE (m)-[r:OWNS {reportCalendarOrQuarter:date(row.reportCalendarOrQuarter)}]->(c)
  SET r.value = toFloat(row.value), r.shares = toInteger(row.shares)
} IN TRANSACTIONS OF 1000 ROWS;

Vertex AI Lab - Batch Prediction

Can't seem to use the new batch prediction feature because it doesn't know about multi-region buckets. As a workaround, we could try creating the bucket in a single region. Not sure how that works. Asked Henry.

Ensure Data Model Matches

Right now the data model doesn't match across labs. To work around it, we're deleting the data and then reloading it. Make the data model match everywhere and remove the delete statements as they'll no longer be needed.

Clean up form10k.zip

This has a __MACOSX directory in it. Likely there's no repeatable script to generate it. Need to look into this...
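
A repeatable cleanup could look something like the sketch below. It assumes the unwanted entries are the metadata macOS adds to archives (a directory normally named `__MACOSX`, plus `.DS_Store` files); the function name and file paths are hypothetical.

```python
import zipfile


def strip_macos_metadata(src_path: str, dst_path: str) -> list:
    """Copy src zip to dst, skipping macOS metadata; return the kept names."""
    kept = []
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            parts = info.filename.split("/")
            # Skip anything under __MACOSX/ and any .DS_Store file
            if "__MACOSX" in parts or parts[-1] == ".DS_Store":
                continue
            dst.writestr(info, src.read(info.filename))
            kept.append(info.filename)
    return kept
```

For example, `strip_macos_metadata("form10k.zip", "form10k-clean.zip")` would write a cleaned copy next to the original, leaving the source archive untouched.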
