Comments (7)
I guess we could add this as a MatterFile, but there I feel like we would need to add a type attribute or something? Something to signify what each MatterFile represents (i.e. just a report, an amendment, or the bill text).
Since the full text URI is kinda distinct from other MatterFiles, I think we could just add full_text_uri to Matter (this also saves us a query if we want to fetch it for a specific Matter).
Unless there are very discrete categories that we could classify MatterFile into, I don't think we need MatterFile.type; name would be sufficient.
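To make the query-saving point concrete, here is a minimal sketch of the two options. The dataclasses are simplified stand-ins, not the actual cdp-backend models: only the names Matter, MatterFile, name, and full_text_uri come from this discussion. The matter_name field, the two helper functions, and the "Legislation Text" name convention are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Matter:
    # Simplified stand-in for the database model (assumption).
    name: str
    # Proposed field: a direct link to the full text of the matter.
    full_text_uri: Optional[str] = None


@dataclass
class MatterFile:
    # Simplified stand-in; matter_name is a hypothetical way to link
    # a file back to its Matter for this sketch only.
    matter_name: str
    name: str
    uri: str


def full_text_from_field(matter: Matter) -> Optional[str]:
    """Option 1: read the URI straight off the Matter, no extra lookup."""
    return matter.full_text_uri


def full_text_from_files(
    matter: Matter, files: List[MatterFile]
) -> Optional[str]:
    """Option 2: scan the matter's files and pick one by its name.

    The scan stands in for the extra query. Matching on the name
    "Legislation Text" is a hypothetical convention.
    """
    for matter_file in files:
        if (
            matter_file.matter_name == matter.name
            and matter_file.name == "Legislation Text"
        ):
            return matter_file.uri
    return None
```

With option 1 the caller only needs the Matter it already has. With option 2 the caller has to load every MatterFile for the matter and rely on name being set consistently by each scraper.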
from cdp-backend.
What would be appropriate for the matter full_text_uri in this example? (I'm presuming the corresponding ingestion_models.Matter must change as well.)
https://seattle.legistar.com/MeetingDetail.aspx?ID=930274&GUID=903D2508-9840-4878-8334-1AEF77335BB8
https://gist.github.com/dphoria/3134769fe44686a82fdca2a55b822397
I will take a look later myself. Just wanted to start the question / conversation.
from cdp-backend.
Great question! Yes, the ingestion model would need to be updated as well to add the same property / attribute.
Taking this meeting from legistar: https://seattle.legistar.com/MeetingDetail.aspx?ID=929921&GUID=3EB77948-2243-425A-9864-8CD868B96048&Options=&Search=
And selecting the first council bill (CB 120263), we get to: https://seattle.legistar.com/LegislationDetail.aspx?ID=5448143&GUID=4F8010D6-BEBB-46AF-BE22-F579AD681B68&Options=&Search=
I think what we want is really just a link to that page (the link above), since it has the full details. But if we wanted to get even more specific, I would say clicking "Reports" and then clicking "Legislation Text" (or really any of the options) gives us more of a "document view" like this: https://seattle.legistar.com/ViewReport.ashx?M=R&N=Text&GID=393&ID=4717976&GUID=660120D3-9C6F-4314-AFC7-A44217E71237&Title=Legislation+Text
from cdp-backend.
This is really a bigger deal because, like... currently we don't even store that info in CDP at all. Here is the corresponding meeting page for that meeting on Seattle staging: http://councildataproject.org/seattle-staging/#/events/f3351cc9822f
Notice that the minutes item CB 120263 doesn't have any attachments / documents.
from cdp-backend.
Do we need a separate field for the full text, or could it just be another MatterFile? If we want to handle the full text differently in the UI than other MatterFiles, then I'm all for adding full_text_uri, but otherwise I think it could be another MatterFile.
I think the above try-except block may be dropping some MatterFile / MinutesItemFile attachments that would be useful to keep and so we may want to try to fix it if we do see that behavior.
For this, it's most likely failing due to a connection timeout or an error when making an HTTP request. Since the only validation run on MatterFile is resource_exists, I think it has to be one of these two.
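If transient failures like these are what is dropping attachments, one possible mitigation is to retry the check before giving up on a file. The sketch below is not the cdp-backend implementation: filter_files_with_retry and its parameters are hypothetical, and check is a stand-in for a validator such as resource_exists. It also assumes the check raises on a timeout or request error; if the real validator returns False in that case instead, the retry condition would need to change.

```python
import time
from typing import Callable, List


def filter_files_with_retry(
    uris: List[str],
    check: Callable[[str], bool],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> List[str]:
    """Keep the URIs whose check passes, retrying when the check raises.

    An exception (for example a timeout) is treated as transient and
    retried up to `attempts` times. A plain False result is treated as
    a definitive "resource does not exist" and is not retried.
    """
    kept: List[str] = []
    for uri in uris:
        for attempt in range(attempts):
            try:
                if check(uri):
                    kept.append(uri)
                # The check returned an answer, so stop retrying.
                break
            except Exception:
                # Assumed transient; wait before the next attempt, if any.
                if attempt < attempts - 1:
                    time.sleep(delay_seconds)
    return kept
```

A file is only dropped here if the check says it does not exist or if every attempt raises, which separates "missing" from "could not reach it right now".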
from cdp-backend.
Do we need a separate field for the full text, or could it just be another MatterFile? If we want to handle the full text differently in the UI than other MatterFiles, then I'm all for adding full_text_uri, but otherwise I think it could be another MatterFile.
I guess we could add this as a MatterFile, but there I feel like we would need to add a type attribute or something? Something to signify what each MatterFile represents (i.e. just a report, an amendment, or the bill text).
from cdp-backend.
Yeah, the benefit to query time is also a major plus.
from cdp-backend.