This project aims to grow the Aztec database from 10,000 to more than 50,000 entries while keeping their metadata up to date. Doing so requires a one-time, large-scale scrape of text and/or PDF downloads, plus a web crawler that refreshes the metadata for each entry. The crawler should run weekly.
ucla-bd2k / aztec-db-expansion
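As a rough illustration of the weekly metadata update described above, the sketch below picks out entries that have not been checked in the last seven days and refreshes them. It is not taken from the repository: the entry layout (`url`, `metadata`, `last_checked`), the function names, and the caller-supplied `fetch_metadata` callable are all hypothetical, and the actual scraping logic is deliberately left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, List, Optional

# "Weekly" comes from the project description; the rest is assumed.
REFRESH_INTERVAL = timedelta(days=7)

Entry = Dict[str, Any]  # hypothetical shape: url, metadata, last_checked


def stale_entries(entries: List[Entry], now: datetime) -> List[Entry]:
    """Return entries never checked, or last checked a week or more ago."""
    return [
        e for e in entries
        if e.get("last_checked") is None
        or now - e["last_checked"] >= REFRESH_INTERVAL
    ]


def refresh(
    entries: List[Entry],
    fetch_metadata: Callable[[str], Dict[str, Any]],
    now: Optional[datetime] = None,
) -> int:
    """Update stale entries in place and return how many were refreshed.

    fetch_metadata is a placeholder for whatever scraper the project uses;
    it takes an entry URL and returns a metadata dict.
    """
    now = now or datetime.now(timezone.utc)
    updated = 0
    for entry in stale_entries(entries, now):
        try:
            new_metadata = fetch_metadata(entry["url"])
        except Exception:
            # Leave the entry untouched so the next run retries it.
            continue
        entry["metadata"] = new_metadata
        entry["last_checked"] = now
        updated += 1
    return updated
```

A scheduler such as cron could call `refresh` once a week. Because staleness is derived from each entry's `last_checked` timestamp, a missed or partially failed run is picked up by the next one.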