q-m / food-fish-parser-ruby
Extract fish details from food product descriptions
License: MIT License
Sometimes the catch area is specified using an ICES code. This should be recognized as well.
Let's add an ices_codes field to catch_areas, and make the presence of both fao_codes and ices_codes optional.
https://www.ices.dk/marine-data/maps/Pages/ICES-statistical-rectangles.aspx
http://gis.ices.dk/sf/index.html?widget=StatRec choose Mapping layers > ICES Areas
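A minimal sketch of what such an area entry could look like, reading "optional" as "the key is left out when no codes were found". The helper name build_area, the example codes, and the use of integers for FAO codes are assumptions for illustration, not the gem's actual API.

```ruby
# Hypothetical helper: build one catch_areas entry.
# fao_codes and ices_codes are only included when they are non-empty.
def build_area(text, fao_codes: [], ices_codes: [])
  area = { text: text }
  area[:fao_codes] = fao_codes unless fao_codes.empty?
  area[:ices_codes] = ices_codes unless ices_codes.empty?
  area
end

# Example values are illustrative only.
with_codes = build_area("Noordzee", fao_codes: [27], ices_codes: ["IVa"])
without_codes = build_area("de Stille Oceaan")
```

If consumers rely on fao_codes always being present (possibly empty), the keys could instead be kept and default to an empty array.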
Right now only fishing details are recognized. Aquaculture details should be recognized as well.
The text
Alaska pollakfilet, uit de Stille Oceaan
has a fish name and an area, but it is unknown whether the fish is wild or from aquaculture. It would be nice to still recognize this. This requires a change in the data structure (something the flat parser sometimes already returns), to something like:
[
{
:names => [{ :common=>"Alaska pollakfilet", :latin=>nil }],
:areas => [{ :text=>"de Stille Oceaan", :fao_codes=>[]}],
:methods => [],
:kind => nil,
},
]
When it is unknown whether the fish was caught or raised using aquaculture, the current output format is unsuitable. This was encountered with the flat parser, which may return methods and areas (instead of catch_areas, catch_methods, aquaculture_areas and aquaculture_methods).
It would be clearer to use the following output format:
[
{
:names => [{ :common=>"Alaska pollakfilet", :latin=>nil }],
:areas => [{ :text=>"de Stille Oceaan", :fao_codes=>[]}],
:methods => [],
:kind => "caught",
},
]
This would have the following benefits:
Note that this breaks the API, requiring a major version bump.
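As a rough sketch of the migration, the conversion from the current keys to the proposed shape could look like the following. The key names come from the text above. The method name to_proposed_format and the kind value "aquaculture" are assumptions, since only "caught" and nil appear in the examples.

```ruby
# Hypothetical converter from the current output shape to the proposed one.
# kind is "caught" or "aquaculture" (assumed value) when only one side has
# details, and nil when it cannot be determined.
def to_proposed_format(entry)
  catch_details = entry.fetch(:catch_areas, []) + entry.fetch(:catch_methods, [])
  aqua_details = entry.fetch(:aquaculture_areas, []) + entry.fetch(:aquaculture_methods, [])

  kind =
    if catch_details.any? && aqua_details.empty?
      "caught"
    elsif aqua_details.any? && catch_details.empty?
      "aquaculture"
    end

  {
    names: entry.fetch(:names, []),
    # Also accept the generic keys the flat parser may already return.
    areas: entry.fetch(:areas, []) + entry.fetch(:catch_areas, []) + entry.fetch(:aquaculture_areas, []),
    methods: entry.fetch(:methods, []) + entry.fetch(:catch_methods, []) + entry.fetch(:aquaculture_methods, []),
    kind: kind,
  }
end

old_entry = {
  names: [{ common: "Alaska pollakfilet", latin: nil }],
  catch_areas: [{ text: "de Stille Oceaan", fao_codes: [] }],
  catch_methods: [],
}
new_entry = to_proposed_format(old_entry)
```

An entry that has only the generic areas and methods keys ends up with kind set to nil, which matches the "unknown" case described above.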
Some fish detail texts have an option list with different possibilities, e.g.
Alaska KOOLVIS gevangen in: A = het Noordwestelijke deel Stille Oceaan (FAO 61) gevangen met trawlnetten of B = het Noordoostelijke deel Stille Oceaan (FAO 67)
gevangen met trawlnetten in fao 27-i barentszzee (a), fao27-iia noorse zee (b), fao27-iib spitsbergen en bereneiland (c)
Gevangen in de (A) Barentszzee, (B) Noorse Zee of (C) Spitsbergen en Bereneiland.
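To make the idea concrete, here is a small sketch that splits the last example into labelled options. It only handles the "(A) ..., (B) ... of (C) ..." form. The other two variants ("A = ..." and the trailing "(a)" labels) would need their own patterns. OPTION_PATTERN and parse_option_list are hypothetical names, not part of the gem.

```ruby
# Matches "(A) area text" up to the next ", (", " of (" or the end of the text.
OPTION_PATTERN = /\(([A-Z])\)\s*(.+?)(?=,\s*\(|\s+of\s+\(|\.?\z)/

# Returns a hash mapping each option label to its area text.
def parse_option_list(text)
  text.strip.scan(OPTION_PATTERN).to_h { |label, area| [label, area.strip] }
end

options = parse_option_list(
  "Gevangen in de (A) Barentszzee, (B) Noorse Zee of (C) Spitsbergen en Bereneiland."
)
# options["B"] is "Noorse Zee"
```

Each option text could then be passed through the existing area recognition, so that every label ends up with its own area entry.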
Dutch law includes a list of trade names for fish. These can be integrated into the known species list.
https://wetten.overheid.nl/BWBR0019099/2016-02-19