free-programming-books-parser's Issues
Improve title text extraction
According to the current code
free-programming-books-parser/index.js
Lines 92 to 95 in dc53b8c
the first node, children[0], is used as the resource title without checking whether there are more meaningful tokens. The rest is stripped, which sometimes makes it difficult to search for resources by title.
Until this is fixed, the title part of resource links has to be escaped when submitting, so rebuilding the Markdown here is mandatory.
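The difference between reading only children[0] and rebuilding the title from every child can be sketched as follows. This is a minimal illustration, not the parser's actual code: linkTitle is a hypothetical helper, and the node shapes are hand-built objects that assume the mdast layout (text and inlineCode carry value, emphasis and strong carry children).

```javascript
// Hypothetical helper (not in index.js): rebuilds a link title from all of
// its child nodes instead of reading only children[0].
function linkTitle(linkNode) {
  return (linkNode.children || [])
    .map((child) => {
      if (typeof child.value === "string") return child.value; // text, inlineCode
      if (Array.isArray(child.children)) return linkTitle(child); // emphasis, strong
      return ""; // unknown node types contribute nothing
    })
    .join("")
    .trim();
}

// Hand-built node for: [Learn *C++* in `21` days](https://example.com)
const link = {
  type: "link",
  url: "https://example.com",
  children: [
    { type: "text", value: "Learn " },
    { type: "emphasis", children: [{ type: "text", value: "C++" }] },
    { type: "text", value: " in " },
    { type: "inlineCode", value: "21" },
    { type: "text", value: " days" },
  ],
};

console.log(link.children[0].value); // only the first token: "Learn "
console.log(linkTitle(link)); // full title: "Learn C++ in 21 days"
```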
Context
See EbookFoundation/free-programming-books#7086
Related to #2 (same workaround)
Improve Index section detection
Resolve these TODOs across localized files
free-programming-books-parser/index.js
Lines 178 to 192 in 5eaf00b
part of EbookFoundation/free-programming-books#6988 (comment)
The word Index is not translated according to the file locale. E.g.:
- Índice for -es.md files
- Índice for -pt_BR.md files
- 目录 for -zh.md files
- other variants at EbookFoundation/free-programming-books@78913af
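One possible way to detect the Index section regardless of locale is a lookup against a set of known translations. The sketch below is an assumption, not the existing implementation: INDEX_TITLES and isIndexHeading are hypothetical names, and the set contains only the three words quoted above, so other locales would have to be added.

```javascript
// Normalizes a heading title so that comparisons ignore case, surrounding
// whitespace and Unicode composition differences.
function normalizeTitle(text) {
  return text.normalize("NFC").trim().toLowerCase();
}

// Assumed list of localized "Index" titles (only the ones named in this issue).
const INDEX_TITLES = new Set(["Index", "Índice", "目录"].map(normalizeTitle));

// Hypothetical helper: true when the heading text is a known Index title.
function isIndexHeading(text) {
  return INDEX_TITLES.has(normalizeTitle(text));
}

console.log(isIndexHeading("Índice")); // true
console.log(isIndexHeading("Introduction")); // false
```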
Parser doesn't take into account bold format in notes
With the current code:
free-programming-books-parser/index.js
Lines 140 to 172 in dc53b8c
If bold formatting is found, i.value is undefined and the program crashes. It should check whether i.type == "strong" or whether the node has i.children.
Resources affected:
- https://github.com/EbookFoundation/free-programming-books/blob/fc4b0c5c139b952de979aa54a4de5141ea280906/books/free-programming-books-fr.md?plain=1#L73
In general, we should extend the fix in depth to other inline formats such as emphasis (already handled), bold, code, image...
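A recursive walk is one way to cover every inline format at once, so that a node without value no longer crashes the parser. The following is a sketch under assumptions: inlineText is a hypothetical name, and the nodes are hand-built objects that follow the mdast layout (image nodes expose alt, container nodes expose children).

```javascript
// Hypothetical helper: returns the visible text of any inline node.
function inlineText(node) {
  if (typeof node.value === "string") return node.value; // text, inlineCode
  if (node.type === "image") return node.alt || ""; // images only have alt text
  if (Array.isArray(node.children)) {
    // strong, emphasis, link, delete...: concatenate the children recursively
    return node.children.map(inlineText).join("");
  }
  return ""; // nothing readable in this node
}

// Hand-built note for: - **PDF** (*email required*)
const note = {
  type: "paragraph",
  children: [
    { type: "text", value: " - " },
    { type: "strong", children: [{ type: "text", value: "PDF" }] },
    { type: "text", value: " (" },
    { type: "emphasis", children: [{ type: "text", value: "email required" }] },
    { type: "text", value: ")" },
  ],
};

console.log(inlineText(note)); // " - PDF (email required)"
```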
Parser doesn't take into account resources organized in sublists (fascicles/parts)
There is a kind of resource that is not covered by the parser.
Examples:
- https://github.com/EbookFoundation/free-programming-books/blob/fc4b0c5c139b952de979aa54a4de5141ea280906/books/free-programming-books-langs.md?plain=1#L1265-L1268
- https://github.com/EbookFoundation/free-programming-books/blob/fc4b0c5c139b952de979aa54a4de5141ea280906/books/free-programming-books-zh.md?plain=1#L303
- https://github.com/EbookFoundation/free-programming-books/blob/fc4b0c5c139b952de979aa54a4de5141ea280906/casts/free-podcasts-screencasts-en.md?plain=1#L109-L112
As we can see, they are listed with a title that has no link, and the links are either in a separate sublist or use multi-format syntax.
Discovered while fixing #8, because the resources after it appear in fpb.json.
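The sublist case could be handled roughly as below. This is only a sketch: resourcesFromListItem, findLinks and plainText are hypothetical helpers, the "Parent - Part" title format is an assumption, and the multi-format syntax is not covered. The node shapes assume the mdast layout, where a listItem holds a paragraph followed by an optional nested list.

```javascript
// Concatenates the visible text of a node and its descendants.
function plainText(node) {
  if (typeof node.value === "string") return node.value;
  return (node.children || []).map(plainText).join("");
}

// Collects every link node found under the given node.
function findLinks(node, out = []) {
  if (node.type === "link") out.push(node);
  else (node.children || []).forEach((child) => findLinks(child, out));
  return out;
}

// Hypothetical helper: returns the resources described by one list item.
function resourcesFromListItem(item) {
  const [head, ...rest] = item.children || [];
  const directLinks = head ? findLinks(head) : [];
  if (directLinks.length > 0) {
    // Regular resource: the title itself is a link.
    return [{ title: plainText(directLinks[0]), url: directLinks[0].url }];
  }
  // Title without link: take the links from the nested list(s).
  const parentTitle = head ? plainText(head).trim() : "";
  return rest
    .flatMap((child) => findLinks(child))
    .map((link) => ({ title: `${parentTitle} - ${plainText(link)}`, url: link.url }));
}

// Hand-built item: "Some Book" with two linked parts in a sublist.
const part = (name, url) => ({
  type: "listItem",
  children: [{ type: "paragraph", children: [{ type: "link", url, children: [{ type: "text", value: name }] }] }],
});
const item = {
  type: "listItem",
  children: [
    { type: "paragraph", children: [{ type: "text", value: "Some Book" }] },
    { type: "list", children: [part("Part 1", "https://example.com/1"), part("Part 2", "https://example.com/2")] },
  ],
};

console.log(resourcesFromListItem(item));
```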
Improve file media type extraction from directory name
It seems that the function in charge of extracting the file type doesn't work well.
free-programming-books-parser/index.js
Lines 171 to 180 in ce6be65
It always returns "fpb" instead of "books", "courses", etc.
See https://raw.githubusercontent.com/EbookFoundation/free-programming-books-search/main/fpb.json
It is even worse if a non-sanitized path is provided or the parser is executed with customized inputs.
Tasks
- Sanitize the input so that it is independent of the OS.
- Extract the right slug in both cases: whether the input parameter is a file or a directory.
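Both tasks can be sketched with plain string handling. getMediaType is a hypothetical name, and the sketch assumes that resource files end in .md and that the media type is the name of the directory that directly contains them; it does not resolve ".." segments.

```javascript
// Hypothetical helper: extracts the media type ("books", "courses"...) from a
// file or directory path, whatever separator the OS uses.
function getMediaType(inputPath) {
  const segments = inputPath
    .replace(/\\/g, "/") // Windows separators become POSIX ones
    .split("/")
    .filter((segment) => segment !== "" && segment !== "."); // drop empty and "." parts
  if (segments.length === 0) return null;

  const last = segments[segments.length - 1];
  const isFile = /\.md$/i.test(last); // assumption: resource files are Markdown
  // For a file, the slug is its parent directory; for a directory, its own name.
  const slug = isFile ? segments[segments.length - 2] : last;
  return slug === undefined ? null : slug;
}

console.log(getMediaType("./books/free-programming-books-es.md")); // "books"
console.log(getMediaType("C:\\repo\\courses\\")); // "courses"
```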
Improve extraction of section texts from Markdown headings
According to the current code...
free-programming-books-parser/index.js
Lines 192 to 212 in 5eaf00b
it seems that the parser supports neither HTML anchor aliases nor Markdown syntax in headings. It takes for granted that children[0] will be plain text.
It is necessary to run an ES6 Array.reduce over the heading's item.children, taking all cases into account, in order to rebuild the desired text. Cases by type:
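A minimal sketch of such a reduce is shown here. headingText is a hypothetical name, and the cases it handles (text, inlineCode, html, image, plus any node with children) are an assumed selection based on mdast node types, not the list from the original issue.

```javascript
// Hypothetical helper: rebuilds the visible text of a heading node.
function headingText(heading) {
  return (heading.children || [])
    .reduce((acc, child) => {
      switch (child.type) {
        case "text":
        case "inlineCode":
          return acc + child.value;
        case "html":
          return acc; // anchor aliases such as <a id="csharp"></a> have no visible text
        case "image":
          return acc + (child.alt || "");
        default:
          // emphasis, strong, link...: recurse into the children if there are any
          return acc + (child.children ? headingText(child) : "");
      }
    }, "")
    .trim();
}

// Hand-built node for: ### <a id="csharp"></a>C`#` *basics*
const heading = {
  type: "heading",
  depth: 3,
  children: [
    { type: "html", value: '<a id="csharp">' },
    { type: "html", value: "</a>" },
    { type: "text", value: "C" },
    { type: "inlineCode", value: "#" },
    { type: "text", value: " " },
    { type: "emphasis", children: [{ type: "text", value: "basics" }] },
  ],
};

console.log(headingText(heading)); // "C# basics"
```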