KonoSuba Data

Text data of KonoSuba: God's Blessing on This Wonderful World! Light Novel Volume 1 to 17 + short stories (English fan translation).

Note:
Most of the unrelated metadata and translator's notes (TL notes) have been removed.
This cleanup might have accidentally removed some lines of the light novel itself, but the damage should be minimal.
Feel free to create an issue if you notice lines that were removed by mistake.

Context

KonoSuba: God's Blessing on This Wonderful World!, often referred to simply as KonoSuba, is a Japanese light novel series written by Natsume Akatsuki. The series follows Kazuma Satou, a boy who is sent to a fantasy world with MMORPG elements following his death, where he forms a dysfunctional adventuring party with a goddess, an archwizard, and a crusader.

Source: https://en.wikipedia.org/wiki/KonoSuba

Usage

Download the files below.

File | Lines | Size | Description
konosuba.txt | 47573 | 4.5MB | All 17 volumes of the KonoSuba light novel combined into one file. Includes both dialogue and monologue.
konosuba-dialogue.txt | 18689 | 2.3MB | Contains only dialogue enclosed in curly quotes (“”). Monologue is excluded.
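
As a quick sanity check after downloading, the line counts and sizes listed above can be compared against your local copies. This is a minimal sketch, not part of the repository: the helper name `summarize` is hypothetical, and it assumes the files are UTF-8 and sit in the current directory.

```python
from pathlib import Path


def summarize(path):
    """Return (line_count, size_in_bytes) for a UTF-8 text file."""
    path = Path(path)
    with path.open(encoding="utf-8") as f:
        line_count = sum(1 for _ in f)
    return line_count, path.stat().st_size


if __name__ == "__main__":
    # File names are taken from the table above; missing files are skipped.
    for name in ("konosuba.txt", "konosuba-dialogue.txt"):
        if Path(name).exists():
            lines, size = summarize(name)
            print(f"{name}: {lines} lines, {size / 1_000_000:.1f} MB")
```

The numbers may differ slightly from the table if the files have been regenerated since it was written.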

Shameless self-plug:

  • Wanna make a Markov chain random sentence generator? Check out aqua.
  • Wanna make an AI chatbot? Check out kazuma.
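
For a rough idea of what a Markov chain sentence generator does with this data, here is a minimal word-level sketch. It is not how aqua is implemented (that project's code is not shown here); the function names `build_chain` and `generate` are hypothetical, and the chain is first-order, meaning each word depends only on the one before it.

```python
import random
from collections import defaultdict


def build_chain(lines):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=20, rng=random):
    """Walk the chain from `start` until a dead end or `max_words` is reached."""
    words = [start]
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


if __name__ == "__main__":
    # Assumes konosuba-dialogue.txt has been downloaded next to this script.
    with open("konosuba-dialogue.txt", encoding="utf-8") as f:
        chain = build_chain(f)
    print(generate(chain, start=random.choice(list(chain))))
```

Because followers are stored with repetition, frequent word pairs are picked more often, which is what gives the output its resemblance to the source text.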

I wanna DIY

If you want to generate the data yourself, I recommend using a proxy/VPN before running the webscraper.

Clone the project.

git clone https://github.com/MarsRon/konosuba-data

Create a Python virtual environment.

python3 -m venv venv
source venv/bin/activate

Install libraries.

pip install -r requirements.txt

Run the webscraper.

python scrape.py

This will create a ./data directory which temporarily stores each chapter from Volume 1 to Volume 17 in text form.

The script then merges all the scraped posts into konosuba.txt and generates konosuba-dialogue.txt, which contains only the spoken dialogue.
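
The merge and dialogue-extraction step can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual contents of scrape.py: the function names are hypothetical, and it assumes the chapter files in ./data end in .txt, sort into reading order by file name, and mark speech with curly double quotes.

```python
import re
from pathlib import Path

# Text between a left (U+201C) and right (U+201D) curly double quote.
DIALOGUE = re.compile("\u201c([^\u201d]*)\u201d")


def extract_dialogue(text):
    """Return every non-empty quoted span in `text`, with whitespace collapsed."""
    return [" ".join(m.split()) for m in DIALOGUE.findall(text) if m.strip()]


def merge_chapters(data_dir, merged_path, dialogue_path):
    """Concatenate chapter files and write the quoted speech to a second file."""
    # Assumption: sorting by file name yields the correct chapter order.
    chapters = sorted(Path(data_dir).glob("*.txt"))
    merged = "\n".join(p.read_text(encoding="utf-8").strip() for p in chapters)
    Path(merged_path).write_text(merged + "\n", encoding="utf-8")
    dialogue = extract_dialogue(merged)
    Path(dialogue_path).write_text("\n".join(dialogue) + "\n", encoding="utf-8")
    return len(chapters), len(dialogue)


if __name__ == "__main__":
    if Path("data").is_dir():
        print(merge_chapters("data", "konosuba.txt", "konosuba-dialogue.txt"))
```

A pattern like this keeps only text inside matched quote pairs, so narration and inner monologue fall away, and a stray unmatched quote in the source would silently drop or merge lines.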

Acknowledgements

The data is scraped from cgtranslations.me and crimsonmagic.me.

License

Distributed under the MIT License. See LICENSE.md for more information.

Contact

MarsRon - [email protected] - marsron.name.my
