txt_reform's Introduction

Novel text formatting script

Purpose

Pirated novels downloaded from the internet are often badly formatted. Common problems include:

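  • Inconsistent volume and chapter title formats
  • Erratic leading indentation on paragraphs
  • Unsatisfactory paragraph spacing and blank lines around titles
  • Paragraphs out of order and duplicated chapters
  • Inconsistent text encoding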

So I spent a few hours writing this script. It does the following:

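  • Automatically detects volume and chapter titles using freely configurable rules, strips stray spaces, and reformats them
  • Scans the whole file and reports the volume/chapter structure along with obvious errors
  • Cleans up implicit or nonstandard volume/chapter structures
  • Removes all leading and trailing spaces from paragraphs, then applies the configured leading indentation
  • Adjusts and adds blank lines consistently
  • Sorts all volumes and chapters, then removes duplicates
  • Optionally converts the file encoding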
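
As an illustration of the title and paragraph clean-up described above, here is a minimal Python sketch. It is not the script's actual code: the regular expression `TITLE_RE`, the two full-width-space indent, and the `reformat` function are all assumptions made for this example.

```python
import re

# Hypothetical pattern: matches headings such as "第十二章 风起" or "第3卷 序".
TITLE_RE = re.compile(r"^第\s*([0-9零〇一二两三四五六七八九十百千]+)\s*([卷章])\s*(.*)$")

# Assumed indent: two full-width spaces, a common convention in Chinese text.
INDENT = "\u3000\u3000"


def reformat(text: str) -> str:
    """Normalize headings, paragraph indentation, and blank lines."""
    out = []
    for raw in text.splitlines():
        # Drop leading/trailing half-width and full-width spaces.
        line = raw.strip(" \t\u3000")
        if not line:
            continue  # discard existing blank lines; spacing is rebuilt below
        m = TITLE_RE.match(line)
        if m:
            number, kind, name = m.groups()
            heading = f"第{number}{kind} {name}".rstrip()
            if out:
                out.append("")  # one blank line before each heading
            out.append(heading)
            out.append("")  # one blank line after each heading
        else:
            out.append(INDENT + line)
    return "\n".join(out) + "\n"
```

Calling `reformat` on a string with ragged indentation and stray blank lines yields headings of the form `第1章 开始`, each followed by a single blank line, with every paragraph indented uniformly. A real novel would need a more careful pattern, since a body sentence that happens to start with `第一章` would also match.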

Scope

Any novel whose volume/chapter structure the detector and filter can recognize.

Filters and detectors can be freely configured in config_file/detect_config.py and config_file/filter_config.py.
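
The contents of those two files are not shown here, so the following is only a hypothetical sketch of how configurable detectors and filters could be expressed in Python. The names `DETECTORS`, `FILTERS`, and `classify`, and the rules themselves, are assumptions for illustration and may not resemble the project's real configuration.

```python
import re
from typing import Callable, List, Optional

# Hypothetical detectors: each structural level maps to a heading regex.
DETECTORS = {
    "volume": re.compile(r"^第\s*\S+?\s*卷"),
    "chapter": re.compile(r"^第\s*\S+?\s*章"),
}

# Hypothetical filters: each returns True when a line must NOT be
# treated as a heading, even if a detector would match it.
FILTERS: List[Callable[[str], bool]] = [
    lambda line: "http://" in line or "https://" in line,  # embedded links
    lambda line: len(line) > 40,  # too long to be a plausible heading
]


def classify(line: str) -> Optional[str]:
    """Return "volume", "chapter", or None for an ordinary line."""
    line = line.strip()
    if any(rejects(line) for rejects in FILTERS):
        return None
    for level, pattern in DETECTORS.items():
        if pattern.match(line):
            return level
    return None
```

Keeping the patterns and the rejection rules in separate tables means that either can be tuned for a particular novel without touching the scanning logic.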

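Usage

Edit the settings to suit your needs, following the comments in config.py in the config_file folder.

Then run python main.py directly.

Alternatively, download the exe version and double-click it to run. (In this version the configuration files are packaged and compiled in, so they cannot be changed; the next version will decouple them into JSON.)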

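Note

Some authors slip up when updating a novel and publish new content without incrementing the chapter number. The result is two chapters that share the same chapter number but differ in both title and content, which is troublesome to handle.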
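
One possible way to cope with this case, sketched below in Python, is to remove only exact duplicates and to report chapter numbers that are shared by differing chapters, so they can be reviewed by hand. This is an assumption about a reasonable approach, not a description of what the script does; the `Chapter` tuple and `sort_and_dedupe` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical representation of a chapter: (number, title, body).
Chapter = Tuple[int, str, str]


def sort_and_dedupe(chapters: List[Chapter]) -> Tuple[List[Chapter], List[int]]:
    """Sort chapters by number and drop exact duplicates.

    Also returns the chapter numbers that are shared by chapters
    whose title or body differs.
    """
    by_number: Dict[int, List[Chapter]] = defaultdict(list)
    for chapter in chapters:
        if chapter not in by_number[chapter[0]]:  # drop exact repeats only
            by_number[chapter[0]].append(chapter)

    ordered: List[Chapter] = []
    conflicts: List[int] = []
    for number in sorted(by_number):
        group = by_number[number]
        ordered.extend(group)  # keep every distinct variant, in input order
        if len(group) > 1:
            conflicts.append(number)  # same number, different title or body
    return ordered, conflicts
```

Keeping both variants and flagging the number avoids silently discarding a chapter that merely carries the wrong number.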

txt_reform's People

Contributors

intmian
