
eldamo's Introduction

Eldamo

An Elvish Data Model in XML.

Most consumers of this project are probably interested in the finished product, the generated Elvish lexicon. This can be found at http://eldamo.org (mirror http://pfstrack.github.io/eldamo/) and project releases are zip files of this site.

This document provides information for those interested in how the Eldamo site is generated. To fully understand this process, you need to know:

  • Java web application development
  • XML editing and parsing, including XSL and XQuery

Requirements

  1. Git
  2. Java 1.6+
  3. Gradle 2.6+ (which will download other dependencies)
  4. A good text editor (I use TextWrangler on OS X).
  5. For generating the site, a web crawler (I use SiteSucker on OS X)

Command-line Execution

  1. Clone the project from GitHub:
git clone https://github.com/pfstrack/eldamo.git
  2. Copy the data directory inside the web application directory:
cp -R src/data src/main/webapp

Or simply copy the directory manually.

  3. Use Gradle to launch the Jetty web application server (Tomcat probably also works):
gradle :jettyRun
  4. Browse the site:
http://localhost:8080/eldamo/
  5. Edit the data file (src/main/webapp/eldamo-data.xml) and refresh the pages to see changes.

Building in Eclipse

If you want to use a Java development tool, here is how to set the project up in Eclipse.

  1. Use Gradle to generate an Eclipse project:
gradle :eclipse
  2. Import the project into Eclipse: File > Import > General > Existing Projects into Workspace

  3. In the project properties, set the file encoding to UTF-8:

  • Project > Properties
  • Resource > Text File Encoding
  • Select Other: UTF-8
  4. Install the Jetty plug-in for Eclipse.

  5. Select the project and Run > Run As > Run Jetty

  6. Browse the site as above:

http://localhost:8080/eldamo/
  7. For best performance, you will need to increase the memory in the JVM arguments for the Jetty run configuration:
  • Run > Run Configurations
  • Jetty Webapp > eldamo
  • Arguments > VM Arguments
-Xmx1024m

Building the Eldamo Site

  1. Switch to the "pub" (publish) version of the site:
http://localhost:8080/eldamo/pub/index.html
  2. Use your web crawler of choice to crawl and download the site contents.

eldamo's People

Contributors

pfstrack

eldamo's Issues

Fix sort order for search

When searching for an element in the middle or at the end of a word, don't put initial matches first in the sort order.

Also ditch gloss prioritization in sorting, because it is too confusing.

Minor Sindarin addition

Add Tirith Aear, the Sea-ward Tower of Dol Amroth. It is found in The Adventures of Tom Bombadil.

Documentation for new schema

Is there documentation available describing how the current version of the schema (https://eldamo.org/general/eldamo-schema.html) has changed compared to the pre-0.6.1 version?

Also, is there more information on the language-type attribute beyond the list of potential values? In particular, it would be helpful to better understand how the language codes indicated here map to Neo-Eldarin vs. "classical Eldarin".

Question on page-id in the data model

Greetings,

This is a mere question. The XSD for the data model states:

The @page-id is the numeric identifier of the generated HTML Lexicon, and can be used to link to the lexicon.

How is this id generated in the first place, and does it stay unchanged across revisions of the lexicon (unless, of course, an entry is removed for some reason)?

Let me clarify the question behind the question. I am looking for some sort of unique identifier to unambiguously link a word entry from other documents (web site, or even from another XML lexicon :) (note) ). From current usage (and as other sites did, e.g. Parf Edhellen), it seems the page-id is the way to go. I am nevertheless asking, to be sure it is indeed unique in some way and persistent across revisions of the Eldamo lexicon.

(note) E.g. the new (now online) HSD has a fuzzy search for matching entries on Eldamo and linking to them, but of course it falls short for homographs. I thought I could use the same ids as you to mark an entry as "vetted" in some way (which would also help for cross-checks).
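
For illustration only, linking by page-id might look roughly like the sketch below. It assumes a URL pattern inferred from the word-page links quoted elsewhere in these issues, and it uses made-up sample XML: the element names, the `l` and `v` attributes, and the ids are hypothetical, and only the `page-id` attribute name comes from the XSD text quoted above. Whether the id is persistent across revisions is the open question here, and the sketch does not answer it.

```python
import xml.etree.ElementTree as ET

# URL pattern inferred from links such as
# http://eldamo.org/content/words/word-1738233795.html (an assumption).
WORD_URL = "https://eldamo.org/content/words/word-{page_id}.html"

# Hypothetical sample data: every name except page-id is assumed,
# and the ids are invented for the example.
SAMPLE = """<word-data>
  <word l="s" v="example-one" page-id="111"/>
  <word l="s" v="example-two" page-id="222"/>
  <word l="q" v="no-id"/>
</word-data>"""


def page_links(xml_text):
    """Map each page-id found in the XML to a lexicon URL."""
    root = ET.fromstring(xml_text)
    links = {}
    for elem in root.iter():
        page_id = elem.get("page-id")
        if page_id is not None:
            links[page_id] = WORD_URL.format(page_id=page_id)
    return links
```

Elements without a page-id are simply skipped, so an external document would only ever store ids that actually exist in the data file it was built against.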

Q. mat-, “to eat” ERROR:

I've identified a little error with the verb conjugation:
Matuvatye | future 1st-sg | “you [shall] eat” |
Should be:
Matuvatye | future 2nd-sg-familiar | “you [shall] eat” |

Just to make this easier, I'll mark the pages linked to this error:
Q. mat- page: eldamo-0.5.6/content/words/word-342927741.html
Q. future tense page: eldamo-0.5.6/content/words/word-2817430301.html
Q. Quenya phrase: eldamo-0.5.6/content/words/word-1368676801.html

Hope this helps!

Incorrect Words

From Damien Bador via email

  • nelcë “tooth”: your list of entries suggests that nelcë is the latest Quenya version of the word for “tooth”, but as the detailed information in the nelet entry shows, the latest word is in fact nelë, replacing nelet, since the alternate version nelke was later stricken through by Tolkien (see PE 17, p. 56 n. 12).

  • raiqi “angry”: the word is mistakenly quoted from PE 22 p. 124; the sentence in fact reads “i·nér né raiqa…”, so the adjective “angry” should be listed as raiqa (and it will look like a normal adjective).

  • endaquet- “answer”: you mention it as a noun, despite the dash present in PE 17, p. 167, which seems to show it is a verb (and it is much more comparable to other verbs such as quet-, aquet-, avaquet-, enquet-, láquet-, etc. than to any other noun derived from the same root).

  • órë: though you rightly mention the most common uses according to VT 41, you do not list the secondary use “counsel” mentioned in VT 41, p. 11: “it could be used of the influence of one person upon another by visible or audible means (words or signs) — in which case ‘counsel’ was nearer to its sense”

  • vanga “staff”: this is a Qenya word entirely missed in your entries; yet one finds it in PE 11, p. 21, entry bang, and in fact you actually quote it in the list of cognates of Gn bang. It would definitely deserve its own entry, especially because its phonology is much nearer to Quenya than its cognate vandl.

  • yava- “bear fruit”: though you give the bare stem in this way, it does not seem to fit with the form yavin, translated as “bears fruit” in PE 12, p. 105: one would rather expect a stem yav-.

While I’m on it, I noticed that your entry Q. axo only mentions the meaning “bone”, while your entry Q. axë “neck, rock ridge” also includes the variant axo attested in RC and PE 17.

I believe there should be two entries for axo, or at least that the alternate (restricted?) meaning “neck, rock ridge” be mentioned in the entry axo, with a redirection toward axë (and vice-versa).

Note about the Easterlings Part.

Many of the names in the Easterling Part of the Lexicon should not be there (in my opinion of course :-P )

The names Algund, Androg, Andvir, Asgon, Ban, Blodren, Dirhavel, Forweg and Indor belonged to people of the Edain, so those names should be under that language. (Let it also be noted that some of them actually have Sindarin meanings, so in any case they are highly unconnected with the Easterling languages.)

The names of the Tale of Tal-Elmar:
Agar, Buldar, Elmar, Go-Hilleg, Gorbelegod, Ishmalog, Hazad, Mogru, Udul, Tal-Elmar also should not be there. These names belong to the Pre-Numenorean people of Gondor (like Arnach, Eilenach, Eilenaer, etc.), so they should either be moved to the unknown language or placed in a new language dubbed Pre-Numenorean Gondor.

And since I opened the issue, another detail.
In the Valarin language:
http://eldamo.org/content/words/word-1738233795.html
Ithir means light, not fire ;-)

Umbarto, Oarel, Amon Gwareth, Amras entries

http://eldamo.org/content/words/word-389468831.html

The name was given by Nerdanel, not Indis :-)

http://eldamo.org/content/words/word-2470950471.html

Aurel comes from ✶awādelo as said
but Oarel comes from ✶atvadelo
WJ page 363: '3(c) Aurel < *aw(a)delo. Oärel < *atvadelo.'

In WJ page 200 it is seen that S. Amon Gwareth was changed to Amon Gwared... They are not variant forms... As I see it, the suffix -eth was outdated, and Gwared seems rather like an adjective, giving it the meaning of Guarded Hill...

http://eldamo.org/content/words/word-2758841993.html

In Peoples of Middle-earth page 353 it is clearly seen that Amras was the last of the brothers to be born, so it is clear that he was the seventh. So Amrod was Minyarussa and Amras was Atyarussa.

Search problems

“*th*th” matches words only ending in “th”

Searching for “eket” does not match “ecet”
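
The two expected behaviours could be sketched as follows. This is not Eldamo's actual search code: the k-to-c folding rule, the treatment of a pattern without `*` as a substring search, and all function names are assumptions made for the example.

```python
import re


def normalize(text):
    """Fold spelling variants so 'k' and 'c' compare equal (assumed rule)."""
    return text.lower().replace("k", "c")


def wildcard_to_regex(pattern):
    """Turn a '*' wildcard pattern into an anchored regular expression."""
    parts = [re.escape(part) for part in normalize(pattern).split("*")]
    return re.compile("^" + ".*".join(parts) + "$")


def search(pattern, words):
    """Return the words matching the pattern, in their original order.

    A pattern without '*' is treated as a substring search (an assumption).
    """
    if "*" not in pattern:
        pattern = "*" + pattern + "*"
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.match(normalize(word))]
```

With this sketch, `*th*th` requires two separate occurrences of "th", the second one word-final, and a search for "eket" also finds "ecet" because both sides are normalized before matching. The words used to try it out can be arbitrary strings.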

About the Rohirric entries

I just noticed that Anglo-Saxon(-ish) words such as Théoden, Holbytla, etc. have been categorized as Rohirric.
Well, I know it's OK to say that in some contexts, just like in the body of the books, where English equals Westron.
But readers should know that the Anglo-Saxon names are only translations of the original Rohirric. Holbytla(n) is the AS translation of kûd-dûkan, the actual original Rohirric name, not an obsolete Westron name as currently labeled (as Wes. †). Appendix F, II "On Translation", has it all explained.

Westron and Rohirric were contemporary languages that were related. When Tolkien (fictionally) translated the Red Book (supposedly mainly written in Westron) into English, he chose to translate all the closely related languages such as Rohirric into something related to English. He chose Anglo-Saxon for Rohirric, and Gothic, I believe, for some of the ancient northern Mannish names.

So kûd-dûkan, Lohtûr and the like should be Rohirric, while Holbytla and Éothéod should be put into a new category, say "Rohirric (in Anglo-Saxon representation)".

Revisit I-intrusion ordering

  1. Final vowel loss
  2. Voicing of voiceless stops
  3. Monosyllabic lengthening
  4. Vowel loss at morpheme boundaries
  5. Reduction of ss
  6. I-intrusion
  7. Reduction of mm/ng

Update to latest version of glaemscribe

From Benjamin Babut:

Again, just a small notification for you: I have recently introduced a new feature in glaemscribe called "virtual chars". It allows the modes to be written with characters that do not exist explicitly in charsets, such as all tehtar, because they exist in multiple variants (at least 4). The charset itself describes how these virtual chars behave; for example, an "a tehta" will become the very large version after big tengwar, or the very small one after telco, ara, etc.

This makes it possible to have a single mode file that works with multiple compatible charsets, and it greatly simplifies the writing of modes. Thus, there is now a dedicated charset for "tengwar eldamar" which will handle all tehtar variants with precision for the font "Tengwar Eldamar" that you like to use. OpenType fonts with anchors will also be able to use the same mode files, because their charset will contain explicit chars instead of virtual ones with the same names.

To use this enhanced feature, you will have to pass the charset explicitly to the "transcribe" function, because the default charset used by the modes for transcribing with tengwar modes is still the one corresponding to "Tengwar sindarin". This is done as in the documentation (http://bentalagan.github.io/glaemscribe/).

Instead of using:

quenya.transcribe("Ai ! Laurie lantar lassi súrinen !")
 => [true, "lE Á j.E7T`V jE4#7 jE,T 8~M7T5$5 Á"]

You will be using:

// Use an alternative charset for better rendering, depending on the font 
// you want to use
var eldamar = Glaemscribe.resource_manager.loaded_charsets["tengwar_ds_eldamar"]

quenya.transcribe("Ai ! Laurie lantar lassi súrinen !", eldamar)
 => [true, "lD Á j.E7T`V jE4#6 jE,G 8~M7T5$5 Á"] 

And that's all! Pretty simple. Note the small changes in the tehtar. Don't forget to include the corresponding js charset file and you will be good:

<script src="../glaemscribe/js/charsets/tengwar_ds_eldamar.cst.js"></script>

Telperim should be marked with #, and rather Telperin

Since Telperim is isolated from Telperimpar, a # marker should be present, if the notation scheme is to be followed.
And the form Telperim- is most probably due to assimilation to the following p. #Telperin is much safer. Also cf. its cognate celebren in Sindarin (N.) (which, unlike Quenya, allows word-final -m). And in fact #-rin is already listed as a T. suffix.

(very minor) Typo at entry tew

Hi Paul!

The following sentence :

but is clearly glossed as singular “letter” wiwith primitive form <i>tekmē</i>

Should be corrected as :

but is clearly glossed as singular “letter” with primitive form <i>tekmē</i>

There's also a non-breaking space (U+00A0) before the 'wiwith' typo, and I think there are a few others here and there in the XML source, generally (if not always) after a closing quote character (”). I don't know whether this is intended or not.
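
A quick way to list such occurrences could look like the sketch below. It assumes the data file has already been read into a string; the function name is invented for the example, and it only reports positions without deciding whether a given no-break space is intentional.

```python
NBSP = "\u00a0"           # NO-BREAK SPACE
CLOSING_QUOTE = "\u201d"  # RIGHT DOUBLE QUOTATION MARK


def find_nbsp_after_quote(text):
    """Return (line, column) pairs, both 1-based, for each no-break space
    that directly follows a closing quote character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        start = 0
        while True:
            idx = line.find(CLOSING_QUOTE + NBSP, start)
            if idx == -1:
                break
            # idx is the 0-based index of the quote, so the no-break
            # space sits at 0-based idx + 1, i.e. 1-based idx + 2.
            hits.append((line_no, idx + 2))
            start = idx + 1
    return hits
```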

Accessing /index.jsp fails

After trying for several hours to get the website to work, I finally found out that I had to download the Gradle 2.6 binaries from their servers instead of using my pre-installed Gradle instance (this might be resolved by adding an install script?), because Jetty is no longer supported since Gradle 4.0 (if I remember correctly). But now I have run into another issue: when trying to access /index.jsp (http://localhost:8080/eldamo/), the server throws this error:


HTTP ERROR 500
Problem accessing /eldamo/. Reason:

PWC6033: Unable to compile class for JSP

PWC6197: An error occurred at line: 14 in the jsp file: /index.jsp
PWC6199: Generated servlet error:
The type java.lang.CharSequence cannot be resolved. It is indirectly referenced from required .class files

Caused by:
org.apache.jasper.JasperException: PWC6033: Unable to compile class for JSP

PWC6197: An error occurred at line: 14 in the jsp file: /index.jsp
PWC6199: Generated servlet error:
The type java.lang.CharSequence cannot be resolved. It is indirectly referenced from required .class files

at org.apache.jasper.compiler.DefaultErrorHandler.javacError(DefaultErrorHandler.java:123)
at org.apache.jasper.compiler.ErrorDispatcher.javacError(ErrorDispatcher.java:296)
at org.apache.jasper.compiler.Compiler.generateClass(Compiler.java:376)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:437)
at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:608)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:360)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:486)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:380)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
at org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:503)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:926)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

I have Java 1.8 installed and, as far as I can tell, compatibility mode for 1.6 is already activated in the build.gradle file. I do not have much experience with either Gradle or Jetty, so please excuse me if this is a simple problem.

Thank you!

Ithe, the name of Varda

Īthē, the mystical name of Varda, could be related to the Valarin word for light: ithīr

Invalid XML

At line 18824 (entry for "caras"), it looks like the <combine> tag does not have a matching closing tag. I am not sure whether some info is missing or the <combine> tag should simply not be there, but as it stands the XML document is flagged as invalid.
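
A check of this kind can be reproduced with a few lines of Python. This is a minimal sketch that only tests well-formedness (it does not validate against the schema); the function name is made up, and the sample in the usage note is an invented fragment, not the real entry.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if the XML parses, otherwise (line, column, message)
    describing the first error reported by the parser."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None
```

For example, passing a fragment in which a `<combine>` element is opened inside a `<word>` element and never closed makes the parser stop at the `</word>` line and report a mismatched tag there.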

Version 0.8.0

lungu, parya

http://eldamo.org/content/words/word-1877407.html - the absence of a macron over paranye is most definitely a slip, and even if a possibility of A-verb *para or sirya-type *parya is not out of the question, the aorist form pare is now well-attested in contemporary PE22:154, cuita'r pare.

http://eldamo.org/content/words/word-4278088233.html - if a separate word, should be *lungo (PE19:060 et al.). Though could be a compound-only form on the lines of PE21:80 (Ia) description, especially if we wish to retain Ety/LUG.

HSD references

Greetings M. Paul Strack, and congratulations for the excellent work done here.

A few minor details regarding references to HSD follow below, as I recently used your own dictionary, in turn, to re-check a few points. Please feel free to take them into account or not, at your own convenience!

Kind regards,
Didier.

--

References:

Hiswelókë’s Sindarin Dictionary (HSD)
Didier Willis; 1999-2008; http://www.jrrvf.com/hisweloke/sindar/online/sindar/dict-sd-en.htm
The most extensive list of Sindarin words available on the internet. Unfortunately, it does not yet include all the Sindarin words from PE17 or later. (...)

You might possibly want to rephrase a bit, by new name, and refer to the github repository:

A Sindarin and Noldorin dictionary (HSD, formerly Hiswelókë’s Sindarin Dictionary)
The Sindarin Dictionary Project, 1999-2008, 2010-2011, 2014, 2019, https://github.com/Omikhleia/sindict
(As of 2021, it probably does not yet include all the Sindarin words from PE17.)

I am pretty sure the latest version there should include everything up to VT50 and PE22, except some entries from PE17 where the coverage status is unclear (it's likely partial, at best). The online version hosted on JRRVF should, I expect, be updated sooner or later (so technically the link could also stay, I assume), but in any case the "latest" XML source, FWIW, is now on GitHub.

Other points:

  • Entry dem was fixed (but basically states what you said, with reference to Eldamo, so I think your text may likely stay unchanged.)
  • Entry ened "In his Sindarin dictionary (HSD/ened), Didier Willis suggested ened as the likely “final” form based on Enedwaith": I can't remember having personally authored that annotation (which is not unexpected, in a work that at one point tried to be collegial, FWIW). Could be rephrased in a more neutral way here: "In HSD/ened, it is suggested..."
  • Entry fuir: "Didier Willis suggested fair, but this is likely incorrect, as suggested to me in a private chat by Elaran on 2018-08-26" — Not very clear to me what it refers to (though fair is suggested as normalization for feir, but in a different entry, so perhaps unsound indeed. Anyhow I don't think it pertains to the fuir entry).
  • Entry arnen: "Roman Rausch suggested ... This theory was also suggested by Didier Willis in his Sindarin dictionary (HSD/arnen), but I am not sure who first proposed this theory." — Personally, I don't really care who might have said it first or not, esp. for such an unimportant point of detail. Anyhow, for the record, my first mentioning of it was seemingly in (French) Hiswelókë - Quatrième feuillet, (2000, re-ed. august 2001), "Du détail géographique : Les Emyn Arnen" (pp. 119-124). The 2001 re-edition included an addendum on the (then) newly revealed meaning of Emyn Arnen and amended the earlier notes in accordance, leading to a formulation close to what the HSD eventually included. I have no remembrance of having discussed the topic with M. Rausch. Of course, he could as well have discovered the idea independently, earlier or later. And I wouldn't be surprised if the topic had been brought to Elfling at some point in 2000-2001, though I can't remember if it indeed did.
  • Entry dambeth: "HSD incorrectly has dambeth" — Oops. This was fixed and annotated. Congrats for catching it.

Better separation of “academic mode” and “Neo-Eldarin” mode

As of v0.6.8, Eldamo does an imperfect job of separating Neo-Eldarin content from Tolkien’s own content. Although everything is properly marked, when browsing the site in “academic mode”, you can still see Neo-Eldarin content as links, for example as derivatives of attested roots. This extra clutter complicates the use of Eldamo as a pure research tool.

Eldamo needs to hide all Neo-Eldarin content when browsing the site in “academic mode”, showing only forms created by Tolkien.

See the discussion here:
https://plus.google.com/u/0/101464290458691045803/posts/RNMbXwLfhCj
