pofile's Introduction

pofile - gettext .po parsing for JavaScript

Parse and serialize Gettext PO files.

Usage

Add pofile to your project:

Installation (Node.JS, or the browser via Browserify)

npm install --save pofile

Reference it in your code:

var PO = require('pofile');

Installation (via bower)

bower install --save pofile

Add it to your HTML file:

<script src="bower_components/pofile/dist/pofile.js"></script>

Reference it in your code:

var PO = require('pofile');

Loading and parsing

You can create a new empty PO file by using the class:

var po = new PO();

Or by loading a file (Node.JS only):

PO.load('text.po', function (err, po) {
    // Handle err if needed
    // Do things with po
});

Or by parsing a string:

var po = PO.parse(myString);

The PO class

The PO class exposes three members:

  • comments: An array of comments (found at the header of the file).
  • headers: A dictionary of the headers.
  • items: An array of PO.Item objects, each of which represents a string from the gettext catalog.

There are two methods available:

  • save: Accepts a filename and callback, writes the po file to disk.
po.save('out.po', function (err) {
    // Handle err if needed
});
  • toString: Serializes the po file to a string.

The PO.Item class

The PO.Item class exposes the following members:

  • msgid: The message id.
  • msgid_plural: The plural message id (null if absent).
  • msgstr: An array of translated strings. Items that have no plural msgid only have one element in this array.
  • references: An array of reference strings.
  • comments: An array of string translator comments.
  • extractedComments: An array of string extracted comments.
  • flags: A dictionary of the string flags. Each flag is mapped to a key with value true. For instance, a string with the fuzzy flag set will have item.flags.fuzzy == true.
  • msgctxt: Context of the message, an arbitrary string, can be used for disambiguation.
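As a rough illustration of how these members can be used, the sketch below filters a list of items by flag and by translation state. It runs on plain objects shaped like PO.Item so that it needs no file; the helper names fuzzyItems and untranslatedItems are made up for this example and are not part of pofile.

```javascript
// Hypothetical helpers, not part of pofile. Plain objects shaped like
// PO.Item stand in for the items of a parsed catalog.

// Items whose "fuzzy" flag is set (item.flags.fuzzy == true).
function fuzzyItems(items) {
  return items.filter((item) => Boolean(item.flags && item.flags.fuzzy));
}

// Items for which every msgstr entry is empty.
function untranslatedItems(items) {
  return items.filter((item) => item.msgstr.every((text) => !text));
}

const sample = [
  { msgid: 'Hello', msgid_plural: null, msgstr: ['Hallo'], flags: { fuzzy: true } },
  { msgid: 'Bye', msgid_plural: null, msgstr: [''], flags: {} },
];

console.log(fuzzyItems(sample).map((item) => item.msgid));
console.log(untranslatedItems(sample).map((item) => item.msgid));
```

With a real catalog, the same helpers would presumably be called on po.items.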

Contributing

In lieu of a formal style guide, take care to maintain the existing coding style. Add unit tests for any new or changed functionality. Lint and test your code using Grunt.

Credits

Originally based on node-po (written by Michael Holly). Rebranded because node-po is unmaintained and because this library is no longer limited to Node.JS: it works in the browser too.

Changes compared to node-po

  • Proper handling of async methods that won't crash your Node.JS process when something goes wrong.
  • Support for parsing string flags (e.g. fuzzy).
  • A test suite.
  • Browser support (through Browserify and bower).

Migrating from node-po

You'll need to update the module reference: require('pofile') instead of require('node-po').

At the initial release, node-po and pofile have identical APIs, with one small exception: the save and load methods now take a callback that has an err parameter: (err) for save and (err, po) for load. This is similar to Node.JS conventions.

Change code such as:

PO.load('text.po', function (po) {

To:

PO.load('text.po', function (err, po) {
    // Handle err if needed

License

(The MIT License)

Copyright (C) 2013-2017 by Ruben Vermeersch <[email protected]>
Copyright (C) 2012 by Michael Holly

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

pofile's People

Contributors

camillem, cdauth, dedesite, dependabot[bot], elewinso, gabegorelick, iddan-flycode, janhommes, johnyb, kirillku, ma2ciek, megaboich, mikejholly, paulpdaniels, remko, rosston, rubenv, sanderhouttekier, septs, tafkanator, tearwyx

pofile's Issues

TypeScript issues

I'm using pofile with ts-node to process some PO files locally. As mentioned in #35 there are some issues lately with pofile and TypeScript. I'm not sure whether it is related to the recent changes in pofile or my TypeScript lib upgrade.

Anyway, I'm not an expert at creating TypeScript definition files manually 😅, so I quickly converted the current po.js file into a proper TypeScript implementation to get the TSDs out. Here's the result:

export interface IHeaders {
    'Project-Id-Version'?: string;
    'Report-Msgid-Bugs-To'?: string;
    'POT-Creation-Date'?: string;
    'PO-Revision-Date'?: string;
    'Last-Translator'?: string;
    Language?: string;
    'Language-Team'?: string;
    'Content-Type'?: string;
    'Content-Transfer-Encoding'?: string;
    'Plural-Forms'?: string;
    [name: string]: string;
}
export declare class PO {
    comments: any[];
    extractedComments: any[];
    headers: IHeaders;
    headerOrder: any[];
    items: any[];
    constructor();
    save(filename: any, callback: any): void;
    toString(): string;
    static load(filename: any, callback: any): void;
    static parse(data: any): PO;
    static parsePluralForms(pluralFormsString: any): {
        nplurals: any;
        plural: any;
    };
}
export declare class Item {
    msgid: string;
    msgctxt: any;
    references: any[];
    msgid_plural: any;
    msgstr: any[];
    comments: any[];
    extractedComments: any[];
    flags: {};
    obsolete: boolean;
    nplurals: any;
    constructor(options?: any);
    toString(): string;
}

It's slightly different, though: for instance, there's no PO.Item. Item is its own class instead.

If you're interested in converting po.js into a TypeScript file, here's the conversion:

import * as fs from 'fs';

function trim(string) {
  return string.replace(/^\s+|\s+$/g, '');
}

export interface IHeaders {
  'Project-Id-Version'?: string;
  'Report-Msgid-Bugs-To'?: string;
  'POT-Creation-Date'?: string;
  'PO-Revision-Date'?: string;
  'Last-Translator'?: string;
  Language?: string;
  'Language-Team'?: string;
  'Content-Type'?: string;
  'Content-Transfer-Encoding'?: string;
  'Plural-Forms'?: string;
  [name: string]: string;
}

export class PO {
  comments = [];
  extractedComments = [];
  headers: IHeaders = {};
  headerOrder = [];
  items = [];

  constructor() {}

  save(filename, callback) {
    fs.writeFile(filename, this.toString(), callback);
  }

  toString() {
    var lines = [];

    if (this.comments) {
      this.comments.forEach(function(comment) {
        lines.push(('# ' + comment).trim());
      });
    }
    if (this.extractedComments) {
      this.extractedComments.forEach(function(comment) {
        lines.push(('#. ' + comment).trim());
      });
    }

    lines.push('msgid ""');
    lines.push('msgstr ""');

    var self = this;
    var headerOrder = [];

    this.headerOrder.forEach(function(key) {
      if (key in self.headers) {
        headerOrder.push(key);
      }
    });

    var keys = Object.keys(this.headers);

    keys.forEach(function(key) {
      if (headerOrder.indexOf(key) === -1) {
        headerOrder.push(key);
      }
    });

    headerOrder.forEach(function(key) {
      lines.push('"' + key + ': ' + self.headers[key] + '\\n"');
    });

    lines.push('');

    this.items.forEach(function(item) {
      lines.push(item.toString());
      lines.push('');
    });

    return lines.join('\n');
  }

  static load(filename, callback) {
    fs.readFile(filename, 'utf-8', function(err, data) {
      if (err) {
        return callback(err);
      }
      var po = PO.parse(data);
      callback(null, po);
    });
  }

  static parse(data) {
    //support both unix and windows newline formats.
    data = data.replace(/\r\n/g, '\n');
    var po = new PO();
    var sections = data.split(/\n\n/);
    var headers: any = [];
    //everything until the first 'msgid ""' is considered header
    while (sections[0] && (headers.length === 0 || headers[headers.length - 1].indexOf('msgid ""') < 0)) {
      if (sections[0].match(/msgid "[^"]/)) {
        //found first real string, adding a dummy header item
        headers.push('msgid ""');
      } else {
        headers.push(sections.shift());
      }
    }

    headers = headers.join('\n');
    var lines = sections.join('\n').split(/\n/);

    po.headers = {
      'Project-Id-Version': '',
      'Report-Msgid-Bugs-To': '',
      'POT-Creation-Date': '',
      'PO-Revision-Date': '',
      'Last-Translator': '',
      Language: '',
      'Language-Team': '',
      'Content-Type': '',
      'Content-Transfer-Encoding': '',
      'Plural-Forms': ''
    };
    po.headerOrder = [];

    headers
      .split(/\n/)
      .reduce(function(acc, line) {
        if (acc.merge) {
          //join lines, remove last resp. first "
          line = acc.pop().slice(0, -1) + line.slice(1);
          delete acc.merge;
        }
        if (/^".*"$/.test(line) && !/^".*\\n"$/.test(line)) {
          acc.merge = true;
        }
        acc.push(line);
        return acc;
      }, [])
      .forEach(function(header) {
        if (header.match(/^#\./)) {
          po.extractedComments.push(header.replace(/^#\.\s*/, ''));
        } else if (header.match(/^#/)) {
          po.comments.push(header.replace(/^#\s*/, ''));
        } else if (header.match(/^"/)) {
          header = header
            .trim()
            .replace(/^"/, '')
            .replace(/\\n"$/, '');
          var p = header.split(/:/);
          var name = p.shift().trim();
          var value = p.join(':').trim();
          po.headers[name] = value;
          po.headerOrder.push(name);
        }
      });

    var parsedPluralForms = PO.parsePluralForms(po.headers['Plural-Forms']);
    var nplurals = parsedPluralForms.nplurals;
    var item = new Item({ nplurals: nplurals });
    var context = null;
    var plural = 0;
    var obsoleteCount = 0;
    var noCommentLineCount = 0;

    function finish() {
      if (item.msgid.length > 0) {
        if (obsoleteCount >= noCommentLineCount) {
          item.obsolete = true;
        }
        obsoleteCount = 0;
        noCommentLineCount = 0;
        po.items.push(item);
        item = new Item({ nplurals: nplurals });
      }
    }

    function extract(string) {
      string = trim(string);
      string = string.replace(/^[^"]*"|"$/g, '');
      string = string.replace(/\\([abtnvfr'"\\?]|([0-7]{3})|x([0-9a-fA-F]{2}))/g, function(match, esc, oct, hex) {
        if (oct) {
          return String.fromCharCode(parseInt(oct, 8));
        }
        if (hex) {
          return String.fromCharCode(parseInt(hex, 16));
        }
        switch (esc) {
          case 'a':
            return '\x07';
          case 'b':
            return '\b';
          case 't':
            return '\t';
          case 'n':
            return '\n';
          case 'v':
            return '\v';
          case 'f':
            return '\f';
          case 'r':
            return '\r';
          default:
            return esc;
        }
      });
      return string;
    }

    while (lines.length > 0) {
      var line = trim(lines.shift());
      var lineObsolete = false;
      var add = false;

      if (line.match(/^#\~/)) {
        // Obsolete item
        // Only remove the obsolete comment mark here.
        // This might be a new item, so only remember that this line is
        // marked obsolete; it is counted after the line is parsed.
        line = trim(line.substring(2));
        lineObsolete = true;
      }

      if (line.match(/^#:/)) {
        // Reference
        finish();
        item.references.push(trim(line.replace(/^#:/, '')));
      } else if (line.match(/^#,/)) {
        // Flags
        finish();
        var flags = trim(line.replace(/^#,/, '')).split(',');
        for (var i = 0; i < flags.length; i++) {
          item.flags[flags[i]] = true;
        }
      } else if (line.match(/^#($|\s+)/)) {
        // Translator comment
        finish();
        item.comments.push(trim(line.replace(/^#($|\s+)/, '')));
      } else if (line.match(/^#\./)) {
        // Extracted comment
        finish();
        item.extractedComments.push(trim(line.replace(/^#\./, '')));
      } else if (line.match(/^msgid_plural/)) {
        // Plural form
        item.msgid_plural = extract(line);
        context = 'msgid_plural';
        noCommentLineCount++;
      } else if (line.match(/^msgid/)) {
        // Original
        finish();
        item.msgid = extract(line);
        context = 'msgid';
        noCommentLineCount++;
      } else if (line.match(/^msgstr/)) {
        // Translation
        var m = line.match(/^msgstr\[(\d+)\]/);
        plural = m && m[1] ? parseInt(m[1], 10) : 0;
        item.msgstr[plural] = extract(line);
        context = 'msgstr';
        noCommentLineCount++;
      } else if (line.match(/^msgctxt/)) {
        // Context
        finish();
        item.msgctxt = extract(line);
        context = 'msgctxt';
        noCommentLineCount++;
      } else {
        // Probably multiline string or blank
        if (line.length > 0) {
          noCommentLineCount++;
          if (context === 'msgstr') {
            item.msgstr[plural] += extract(line);
          } else if (context === 'msgid') {
            item.msgid += extract(line);
          } else if (context === 'msgid_plural') {
            item.msgid_plural += extract(line);
          } else if (context === 'msgctxt') {
            item.msgctxt += extract(line);
          }
        }
      }

      if (lineObsolete) {
        // Count obsolete lines for this item
        obsoleteCount++;
      }
    }
    finish();

    return po;
  }

  static parsePluralForms(pluralFormsString) {
    var results = (pluralFormsString || '').split(';').reduce(function(acc, keyValueString) {
      var trimmedString = keyValueString.trim();
      var equalsIndex = trimmedString.indexOf('=');
      var key = trimmedString.substring(0, equalsIndex).trim();
      var value = trimmedString.substring(equalsIndex + 1).trim();
      acc[key] = value;
      return acc;
    }, {});
    return {
      nplurals: results.nplurals,
      plural: results.plural
    };
  }
}

export class Item {
  msgid = '';
  msgctxt = null;
  references = [];
  msgid_plural = null;
  msgstr = [];
  comments = []; // translator comments
  extractedComments = [];
  flags = {};
  obsolete = false;
  nplurals;

  constructor(options: any = null) {
    var nplurals = options && options.nplurals;

    var npluralsNumber = Number(nplurals);
    this.nplurals = isNaN(npluralsNumber) ? 2 : npluralsNumber;
  }
  toString() {
    var lines = [];
    var self = this;

    // reverse what extract(string) method during PO.parse does
    var _escape = function(string) {
      // don't escape \n here, since string can never contain it
      // because split('\n') is called on it first
      string = string.replace(/[\x07\b\t\v\f\r"\\]/g, function(match) {
        switch (match) {
          case '\x07':
            return '\\a';
          case '\b':
            return '\\b';
          case '\t':
            return '\\t';
          case '\v':
            return '\\v';
          case '\f':
            return '\\f';
          case '\r':
            return '\\r';
          default:
            return '\\' + match;
        }
      });
      return string;
    };

    var _process = function(keyword, text, i) {
      var lines = [];
      var parts = text.split(/\n/);
      var index = typeof i !== 'undefined' ? '[' + i + ']' : '';
      if (parts.length > 1) {
        lines.push(keyword + index + ' ""');
        parts.forEach(function(part) {
          lines.push('"' + _escape(part) + '"');
        });
      } else {
        lines.push(keyword + index + ' "' + _escape(text) + '"');
      }
      return lines;
    };

    //handle \n in single-line texts (can not be handled in _escape)
    var _processLineBreak = function(keyword, text, index) {
      var processed = _process(keyword, text, index);
      for (var i = 1; i < processed.length - 1; i++) {
        processed[i] = processed[i].slice(0, -1) + '\\n"';
      }
      return processed;
    };

    // https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html
    // says order is translator-comments, extracted-comments, references, flags

    this.comments.forEach(function(c) {
      lines.push('# ' + c);
    });

    this.extractedComments.forEach(function(c) {
      lines.push('#. ' + c);
    });

    this.references.forEach(function(ref) {
      lines.push('#: ' + ref);
    });

    var flags = Object.keys(this.flags).filter(function(flag) {
      return !!this.flags[flag];
    }, this);
    if (flags.length > 0) {
      lines.push('#, ' + flags.join(','));
    }
    var mkObsolete = this.obsolete ? '#~ ' : '';

    ['msgctxt', 'msgid', 'msgid_plural', 'msgstr'].forEach(function(keyword) {
      var text = self[keyword];
      if (text != null) {
        var hasTranslation = false;
        if (Array.isArray(text)) {
          hasTranslation = text.some(function(text) {
            return text;
          });
        }

        if (Array.isArray(text) && text.length > 1) {
          text.forEach(function(t, i) {
            var processed = _processLineBreak(keyword, t, i);
            lines = lines.concat(mkObsolete + processed.join('\n' + mkObsolete));
          });
        } else if (self.msgid_plural && keyword === 'msgstr' && !hasTranslation) {
          for (var pluralIndex = 0; pluralIndex < self.nplurals; pluralIndex++) {
            lines = lines.concat(mkObsolete + _process(keyword, '', pluralIndex));
          }
        } else {
          var index = self.msgid_plural && Array.isArray(text) ? 0 : undefined;
          text = Array.isArray(text) ? text.join() : text;
          var processed = _processLineBreak(keyword, text, index);
          lines = lines.concat(mkObsolete + processed.join('\n' + mkObsolete));
        }
      }
    });

    return lines.join('\n');
  }
}

po.js errors out with latest version

/node_modules/gulp-angular-gettext/node_modules/angular-gettext-tools/node_modules/pofile/lib/po.js:69
while (headers[headers.length - 1].indexOf('msgid ""') < 0) {
^
TypeError: Cannot read property 'indexOf' of undefined

Empty line causes failure to read header

The following example will not pick up the headers because of the empty line after the first comment:

# placeholder_format_custom = \{\{.+?}}

#. extracted from code/
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2015-08-20 18:05-0700\n"
"PO-Revision-Date: 2015-08-20 18:06-0700\n"
"Language-Team: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Poedit 1.8.4\n"
"Last-Translator: \n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"
"Language: gi_US\n"

It is a simple fix to adjust it, but I am getting these files generated automatically, so I would prefer to not have to manually fix. I think these are valid files, but I am not familiar enough with .po files to say for sure.

Assign translated values

How can I use the translated values? I am developing a multi-language system and would like to change the values of the variables according to the language obtained from Node's process.env.LANG.

Example of my .po file:

msgid ""
msgstr ""
"Project-Id-Version: date-pt-bt 1.2.2\n"
"Report-Msgid-Bugs-To: [email protected] \n"
"Last-Translator: Victor Gianvechio <[email protected]>\n"
"Language: pt_BR\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"

msgid "day"
msgid_plural "days"
msgstr[0] "dia"
msgstr[1] "dias"
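One possible approach, sketched here under assumptions rather than as the library's recommended method: index the catalog items by msgid and pick the msgstr entry that matches the plural rule. The helpers buildCatalog, pluralIndex and translate are hypothetical names. The plural rule is hard-coded from the Plural-Forms header above (plural=(n > 1)), and plain objects shaped like PO.Item stand in for the items of a parsed file.

```javascript
// Hypothetical helpers, not part of pofile.

// Index item-shaped objects by msgid for quick lookup.
function buildCatalog(items) {
  const catalog = new Map();
  for (const item of items) {
    catalog.set(item.msgid, item);
  }
  return catalog;
}

// Plural rule copied by hand from the header: plural=(n > 1).
function pluralIndex(n) {
  return n > 1 ? 1 : 0;
}

// Look up a translation, falling back to the source text when it is missing.
function translate(catalog, msgid, count) {
  const item = catalog.get(msgid);
  if (!item) {
    return msgid; // unknown string: keep the source text
  }
  const isPlural = item.msgid_plural != null && typeof count === 'number';
  const index = isPlural ? pluralIndex(count) : 0;
  const fallback = index === 0 ? item.msgid : item.msgid_plural;
  return item.msgstr[index] || fallback;
}

const catalog = buildCatalog([
  { msgid: 'day', msgid_plural: 'days', msgstr: ['dia', 'dias'] },
]);

console.log(translate(catalog, 'day', 1)); // dia
console.log(translate(catalog, 'day', 3)); // dias
```

In a real program the items would presumably come from the items member of a parsed file, and the file to load could be chosen from process.env.LANG. A different language needs a different pluralIndex, since the rule is not evaluated from the header here.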

More than one reference on a line

Comments starting with #: can contain more than one reference per line. For example, if xgettext is used with standard input, a reference comment can contain the following:

#: standard input:11 standard input:26
msgid "One"
msgstr ""

However, PO.parse finds only one reference in such cases (standard input:11 standard input:26 for the example above).
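Until the parser handles this, a reference string could be split after parsing. The sketch below is a workaround under assumptions, not pofile behaviour: splitReferences is a hypothetical helper, and it only splits at whitespace that directly follows a ":<line number>" suffix, so file names containing spaces (as in "standard input") stay intact. References without line numbers are not separated by it.

```javascript
// Hypothetical helper, not part of pofile.
// Splits "a.js:11 b.js:26" style strings into individual references.
function splitReferences(referenceLine) {
  return referenceLine
    .trim()
    // Split only at whitespace preceded by ":<digits>" (regex lookbehind).
    .split(/(?<=:\d+)\s+/)
    .filter(Boolean);
}

console.log(splitReferences('standard input:11 standard input:26'));
```

Applied to each entry of item.references, this would presumably yield one array element per source location.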

Add webpack support for use in the browser

I have an issue:

ERROR in ./node_modules/pofile/lib/po.js
26 unchanged chunks
chunk {main} main.js, main.js.map (main) 2.91 MB [initial] [rendered]
Module not found: Error: Can't resolve 'fs' in '/Users/user/work/project/node_modules/pofile/lib'
chunk {vendor} vendor.js, vendor.js.map (vendor) 8.2 MB [initial] [rendered]
「wdm」: Failed to compile.

Maybe you could create separate builds for Node.JS and the browser?
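A possible workaround on the webpack side, offered as a sketch and not as an official pofile recommendation, is to tell webpack to resolve fs to an empty module. The object name webpackConfig is only illustrative; the settings would be merged into the project's own configuration.

```javascript
// Sketch of a webpack configuration fragment; merge it into your own config.
const webpackConfig = {
  // webpack 5: resolve `fs` to an empty module in browser bundles.
  resolve: {
    fallback: { fs: false },
  },
  // webpack 4 equivalent (use instead of resolve.fallback):
  // node: { fs: 'empty' },
};
```

With fs stubbed out, only the string-based API (PO.parse and toString) can be expected to work in the bundle. Loading and saving files stays Node.JS only.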

Item typing not exported in TS declaration file

Currently running into this problem after last release:

import PO, { Item } from 'pofile'; // Cannot export Item

private doSomething(str: string): PO.Item { // Error: can't use the PO type as a namespace to get the Item type

	const item = new PO.Item(); // Works fine.
	item.msgid = str;
	return item;
}

I'm probably going to have the same issue with the Header type.

`obsolete` is marked as private

The set of items returned from parse includes obsolete items. However, it seems you can't check for obsolete, as this field is marked private in the TypeScript definition. Is there a reason for this?

msgctxt spanning more than one line is not captured

xgettext output:

#: standard input:49
msgctxt ""
"hello world hello world hello world hello world hello world hello world hello world"
"hello world hello world"
msgid "inviting friends"
msgstr ""

Parsed output:

{ 
       msgid: 'inviting friends',
       msgctxt: '',
       references: [Object],
       msgid_plural: null,
       msgstr: [Object],
       comments: [],
       extractedComments: [],
       flags: {},
       obsolete: false 
}

It appears as though we need to add a context = 'msgctxt'; when we're on a msgctxt line, and then add another conditional to // Probably multiline string or blank. Since this is not typical usage of context (very large string), I thought it might be better to discuss first.
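The idea can be sketched with a deliberately simplified entry parser. This is not pofile's actual parser: parseEntry and unquote are hypothetical names, escape sequences and plurals are ignored, and the point is only to show a context variable routing continuation lines to msgctxt as well as to msgid and msgstr.

```javascript
// Simplified illustration, not pofile's parser.

// Strip the keyword (if any) and the surrounding double quotes.
function unquote(line) {
  return line.trim().replace(/^[^"]*"|"$/g, '');
}

function parseEntry(lines) {
  const entry = { msgctxt: null, msgid: '', msgstr: '' };
  let context = null; // keyword that continuation lines belong to

  for (const raw of lines) {
    const line = raw.trim();
    const keyword = /^(msgctxt|msgid|msgstr)\b/.exec(line);
    if (keyword) {
      // Remember the keyword so that following quoted lines are appended to it.
      context = keyword[1];
      entry[context] = unquote(line);
    } else if (line.startsWith('"') && context) {
      // Continuation line: append to whichever field was seen last.
      entry[context] += unquote(line);
    }
  }
  return entry;
}

const entry = parseEntry([
  '#: standard input:49',
  'msgctxt ""',
  '"hello world "',
  '"hello world"',
  'msgid "inviting friends"',
  'msgstr ""',
]);
console.log(entry.msgctxt);
```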

pofile misparses po files with "Mac" linebreaks

I haven't been able to interpret whether the po format specifies a particular variant of line break.

It would make plenty of sense if "UNIX style" (LF only) was preferred, but this is not mentioned in any of the official docs. Also "DOS style" (CR+LF) is obviously supported by dozens of gettext-related tools. So far, so good.

Anyway, I'm posting this issue after spending far too much time discovering that pofile.js misparses files which use the old Mac style of linebreak (CR only). The interesting thing is that it parses most of the file correctly, but the header seems to get glommed into the first 'true' po entry, so that the header appears as a multiline translator comment. There are various other symptoms concerning this mixup, for example, the first true entry loses its references entirely.

Not sure if it's absolutely crucial that this be fixed, because it's usually easy enough to switch to LF or CR+LF, and Mac OSX defaults to LF these days.

But until or unless a fix appears, I think it would be worth mentioning in the pofile docs. This issue report can act as a kind of documentation for the problem for now.
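In the meantime, the line breaks can be normalized before the text reaches the parser. The helper below is a minimal sketch with a made-up name (normalizeLineBreaks). It assumes the file contains no raw CR characters that are meant to be kept, which should hold for typical PO files because carriage returns inside strings are written as the escape sequence \r.

```javascript
// Hypothetical helper, not part of pofile.
// Converts DOS (CR+LF) and old Mac (CR only) line breaks to UNIX (LF).
function normalizeLineBreaks(text) {
  return text.replace(/\r\n?/g, '\n');
}

const macStyle = 'msgid "One"\rmsgstr "Een"\r';
console.log(JSON.stringify(normalizeLineBreaks(macStyle)));
```

The normalized string can then be handed to PO.parse in place of the raw file contents.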

Update documentation

  • Usage instructions (Node.JS)
  • Usage instructions (browserified)
  • Usage instructions (bower)
  • API
  • Contribution guidelines
