giocoal / word-embedding-italian-literature

Using distributional semantics (word2vec family algorithms and the CADE framework) to learn word embeddings from the Italian literary corpora we generated.

Home Page: https://github.com/giocoal/word-embedding-italian-literature/blob/main/Project%20Report%20EN.pdf

License: MIT License

Python 100.00%
cade word2vec distributional-semantics topos bigrams corpus-linguistics italian-language lemmatization

Distributional Semantics (word embedding with Word2Vec and CADE): the evolution of tópoi in the Italian literary tradition

Requirements

  • Python 3.7+
  • multiprocessing (part of the Python standard library)
  • pandas
  • NumPy
  • cade
  • gensim
  • smart_open
  • spacy
    • it_core_news_lg: python -m spacy download it_core_news_lg
  • simplemma
  • nltk
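
As a quick sanity check before running the project, the following minimal sketch reports which of the listed third-party packages cannot be found in the current environment. It uses only the standard library; the import names in `REQUIRED` are assumed to match the package names above, and the helper `missing_packages` is a hypothetical name introduced here, not part of the repository. The spaCy model `it_core_news_lg` is not checked by this sketch.

```python
import importlib.util

# Import names assumed to match the packages listed in the requirements
REQUIRED = [
    "pandas",
    "numpy",
    "cade",
    "gensim",
    "smart_open",
    "spacy",
    "simplemma",
    "nltk",
]


def missing_packages(names):
    """Return the names for which no importable module can be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

`importlib.util.find_spec` only locates a top-level module without importing it, so the check stays fast even for heavy libraries such as spaCy.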

Introduction

The Greek term tópoi (singular tópos), usually translated as "commonplaces", identifies the repertoire of thematic and formal constants that make up the morphological framework of the Western and Italian literary tradition. Although the narrative patterns that have characterized the literature of the Italian peninsula are varied and have changed over time, tópoi represent a form of imitatio that has never completely faded away, and are therefore a useful tool for handing down the literary tradition. Their conventionality and recurrence allow tópoi to traverse centuries and literary phases while still lending themselves to the different formulations and interpretations of individual authors. Consider the different visions of the locus amoenus tópos, the ideal place, which ranges from a "natural earthly paradise" in classical literature to a grotesque and solitary place in Decadentism.

Goals

The goals of our project were:

  1. to obtain corpora consistent with our research questions from a collection of texts drawn from two main sources
  2. to use distributional semantics, in particular algorithms from the word2vec family along with the CADE framework, to learn word embeddings from the generated and processed corpora
  3. and finally to analyze some particularly long-lived tópoi, chosen arbitrarily, in order to answer our research questions
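
To make the third goal more concrete, here is a minimal sketch of how a word's meaning can be compared across two periods once the embeddings live in a shared, aligned space (which is the role CADE plays in this project). It is not the project's code: the 3-dimensional vectors are invented purely for illustration, and the names `cosine`, `nearest`, `period_a` and `period_b` are hypothetical. Real embeddings would come from the trained models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(word, space, topn=2):
    """Return the topn words most similar to `word` within one space."""
    target = space[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in space.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:topn]]


# Invented toy vectors (NOT real results), one dictionary per period.
# They are assumed to be already aligned to the same vector space.
period_a = {
    "giardino": [0.9, 0.1, 0.0],
    "paradiso": [0.8, 0.2, 0.1],
    "rovina": [0.0, 0.1, 0.9],
    "fonte": [0.7, 0.3, 0.0],
}
period_b = {
    "giardino": [0.1, 0.2, 0.9],
    "paradiso": [0.9, 0.1, 0.0],
    "rovina": [0.0, 0.3, 0.8],
    "fonte": [0.6, 0.4, 0.1],
}

neighbours_a = nearest("giardino", period_a)
neighbours_b = nearest("giardino", period_b)

# Cosine distance between the two period-specific vectors of the same word
shift = 1.0 - cosine(period_a["giardino"], period_b["giardino"])
```

Comparing the neighbour lists of the same word in each period, together with the cosine distance between its two vectors, gives a simple way to describe how a tópos is framed differently over time. The cross-period distance is only meaningful when the spaces are aligned; independently trained word2vec models are not directly comparable.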

The questions we asked ourselves were:

  1. How do the longest-lived literary tópoi change across historical periods, and does the historical and cultural context therefore influence recurring themes?
  2. How do the canons of the different currents of Italian literature shape the representation of these common themes?
  3. Given some tópoi and concepts peculiar to some of the greatest authors of Italian literature, what are their counterparts in the works of other great authors?
