
pilarhidalgo / latinamerican_thesis_analysis


This repository contains documentation about the paper: Social and scientific relevance of Undergraduate Theses in Latin America

Jupyter Notebook 100.00%


LatinAmerican_thesis_Analysis


Data collection and processing:

We used scraping and crawling techniques to obtain the data. These procedures have limitations, and the legality and ethics of using such tools are often overlooked. To avoid legal violations [24], we followed these rules:

  • Use the free-access API to collect information from Google Scholar through the Serapi tool.

  • Avoid pages that are protected by CAPTCHA blocking codes.

  • Use the collected data exclusively for scientific dissemination and teaching, not for commercial purposes.

    [Figure: grap1]
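The second rule above can be illustrated with a minimal sketch that drops already-downloaded pages that look like CAPTCHA challenges. This is not the authors' code: the marker strings and the function names `looks_like_captcha` and `keep_unblocked` are assumptions made for illustration, and real CAPTCHA pages may use other markers.

```python
# Hypothetical markers (an assumption); real CAPTCHA pages vary.
CAPTCHA_MARKERS = ("g-recaptcha", "h-captcha", "captcha")


def looks_like_captcha(html):
    """Return True if the page source contains any known CAPTCHA marker."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def keep_unblocked(pages):
    """Keep only the pages (url -> html) that show no CAPTCHA marker."""
    return {url: html for url, html in pages.items()
            if not looks_like_captcha(html)}


if __name__ == "__main__":
    # Made-up example pages, not real repository content.
    pages = {
        "https://example.org/thesis/1": "<html><body>Thesis abstract</body></html>",
        "https://example.org/thesis/2": "<html><div class='g-recaptcha'></div></html>",
    }
    print(sorted(keep_unblocked(pages)))
```

A substring check like this is deliberately conservative: it may skip a legitimate page that merely mentions the word "captcha", which errs on the side of not scraping blocked content.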

The metrics were obtained as follows:

  • For social relevance: we evaluate whether a thesis contains at least one of the required social-relevance items. To avoid manual reading, we used the natural language processing (NLP) technique Word2Vec [26], which automatically identifies the terms and stores them in records to determine whether the thesis under study mentions the social-relevance terms defined by the theoretical framework [27].
  • For scientific relevance: these data are obtained from the statistics available on the website of each DSpace repository. Additionally, to build the database, we grouped other indicators (such as author and thesis citation statistics) that were scattered across Google Scholar. The purpose was to find out whether a thesis complied with the indicators of scientific relevance and whether this was justified within the statistical description process.
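The record-keeping step for social relevance can be sketched as follows. This is a simplified illustration, not the authors' implementation: it does not train Word2Vec, and it assumes the social-relevance vocabulary has already been produced. The term list `SOCIAL_TERMS`, the function names, and the record fields are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical terms (an assumption); the real list is defined by the
# theoretical framework [27] and identified with Word2Vec [26].
SOCIAL_TERMS = {"community", "poverty", "health", "education"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-záéíóúñü]+", text.lower())


def social_term_record(thesis_id, text, terms=SOCIAL_TERMS):
    """Build one record listing which social-relevance terms a thesis mentions."""
    counts = Counter(tok for tok in tokenize(text) if tok in terms)
    return {
        "thesis_id": thesis_id,
        "terms_found": sorted(counts),       # distinct terms, alphabetical
        "mentions": sum(counts.values()),    # total number of occurrences
        "socially_relevant": bool(counts),   # True if at least one term appears
    }


if __name__ == "__main__":
    # Made-up abstract used only to exercise the function.
    abstract = "Access to health and education in a rural community. Health first."
    print(social_term_record("t1", abstract))
```

Exact token matching is the simplest choice here; a Word2Vec model would additionally allow terms that are semantically close to the seed vocabulary to be counted.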

Data analysis:

In this phase, we applied the instrument presented in this work to measure social and scientific relevance.

[Figure: gra2]

Metrics description:

We organized the description and visualization of the results for each indicator in a dashboard, a graphical interface that shows the metrics and key data so that their evolution can be monitored. It can be accessed from an Application Programming Interface (API) that allows extracting metadata from Google searches. One advantage of the dashboard is that users can interact with it. In addition, visits are registered according to the search requirement. Several free-access platforms can connect a data set to a dashboard.

Notes:

  • CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. CAPTCHAs are machine-controlled challenge-response tests used to determine whether the user is a human or an automated program [25].

  • DSpace is the most popular software used by universities to organize and publish academic data online [28].
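Before an indicator is shown on a dashboard, it is usually reduced to a few descriptive statistics. The sketch below shows one way to do that with the Python standard library. The field name `citations`, the function `summarize_indicator`, and the sample records are hypothetical and are not taken from the repository's data.

```python
from statistics import mean, median


def summarize_indicator(records, field):
    """Return count, mean, median, and max of one numeric indicator."""
    # Skip records where the indicator is missing or None.
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return {"indicator": field, "n": 0}
    return {
        "indicator": field,
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # Made-up records used only to exercise the function.
    theses = [{"citations": 0}, {"citations": 2},
              {"citations": 4}, {"citations": None}]
    print(summarize_indicator(theses, "citations"))
```

A summary dictionary like this can be passed to whichever dashboard platform is used, since most of them accept tabular or key-value input.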
