
robohack / yajl


A fast streaming JSON parsing library in C. This variant uses BSDMake to build and includes various fixes and enhancements.

Home Page: http://robohack.github.io/yajl

License: Other

C 83.94% Shell 1.32% Makefile 14.71% Emacs Lisp 0.03%
c json json-parser json-data json-api json-api-serializer json-api-normalizer bsdmake

yajl's Introduction

# Welcome to Yet Another JSON Library (YAJL)

## NOTE:  This is a variant of the original [YAJL][LLOYD] by Lloyd Hilaiel.

This variant started as a fork of Lloyd's original.  This YAJL uses
BSDMake for building, Cxref for documentation, and it includes a few
minor bug fixes and other enhancements.  See the Git history.  Further
fixes or enhancements are welcome.

This YAJL is maintained in [robohack's GitHub][GHRY] by Greg A. Woods.

Motto:  Write Portable C without complicating the build!

See, e.g.: https://nullprogram.com/blog/2017/03/30/

Also, and perhaps more importantly:

	https://queue.acm.org/detail.cfm?id=2349257

## Why does the world need another C library for parsing JSON?

Good question.  In a review of current C JSON parsing libraries I was
unable to find one that satisfies my requirements.  Those are,

0. written in Plain Standard ANSI C (C99!)
1. i.e. portable
2. robust -- as close to "crash proof" as possible
3. data representation independent
4. fast
5. generates verbose, useful error messages including context of where
   the error occurs in the input text.
6. can parse JSON data off a stream, incrementally
7. simple to use
8. tiny
9. can use a custom memory allocator

Numbers 3, 5, 6, and 7 were particularly hard to find, and were what
caused me to ultimately create YAJL.  This document is a tour of some
of the more important aspects of YAJL.

## YAJL is Free.

Permissive licensing means you can use it in open source and
commercial products alike without any fees.  My request beyond the
licensing is that if you find bugs, drop me an email or, better yet,
fork and fix.

Porting YAJL should be trivial: the implementation is ANSI C.  If you
port to new systems I'd love to hear of it and integrate your patches.

## YAJL is data representation independent.

BYODR!  Many JSON libraries impose a structure-based data representation
on you.  This is a benefit in some cases and a drawback in others.
YAJL uses callbacks to remain agnostic of the in-memory representation.
So if you wish to build up an in-memory representation, you may do so
using YAJL, but you must bring the code that defines and populates the
in-memory structure.

This also means that YAJL can be used by other (higher level) JSON
libraries if so desired.
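
As a rough illustration of the callback idea, here is a minimal, self-contained sketch.  It does not use YAJL's real API: `demo_callbacks`, `demo_parse`, and `demo_stats` are hypothetical names, and the toy scanner only understands a flat array of integers.  The point is that the parser reports events and the client alone decides what to store.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical event interface for illustration; NOT YAJL's real API. */
typedef struct {
    int (*on_integer)(void *ctx, long value);   /* return 0 to cancel */
    int (*on_end_array)(void *ctx);
} demo_callbacks;

/* Client-owned representation: the parser never sees this type. */
typedef struct {
    long sum;
    size_t count;
    int closed;
} demo_stats;

int stats_integer(void *ctx, long value)
{
    demo_stats *s = ctx;
    s->sum += value;
    s->count++;
    return 1;
}

int stats_end_array(void *ctx)
{
    ((demo_stats *)ctx)->closed = 1;
    return 1;
}

/* Toy scanner for a flat array of integers such as "[1, 2, 3]".
 * Returns 1 on success, 0 on malformed input or client cancel. */
int demo_parse(const char *text, const demo_callbacks *cb, void *ctx)
{
    const char *p = text;
    assert(text != NULL && cb != NULL);
    while (isspace((unsigned char)*p)) p++;
    if (*p++ != '[') return 0;
    for (;;) {
        char *end;
        long v;
        while (isspace((unsigned char)*p)) p++;
        if (*p == ']') return cb->on_end_array(ctx);
        v = strtol(p, &end, 10);
        if (end == p) return 0;               /* not a number */
        if (!cb->on_integer(ctx, v)) return 0;
        p = end;
        while (isspace((unsigned char)*p)) p++;
        if (*p == ',') p++;
        else if (*p != ']') return 0;         /* missing separator */
    }
}
```

Swapping `stats_integer` for a callback that appends to a list or writes to a file changes the representation without touching the scanner.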

## YAJL supports stream parsing

This means you do not need to hold the whole JSON representation in
textual form in memory.  This makes YAJL ideal for filtering projects,
where you're converting JSON from one form to another (e.g. to XML).  The
included JSON pretty printer is an example of such a filter program.
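
The essence of stream parsing is that all state needed to resume lives in a small struct, so input can arrive in arbitrary chunks.  The sketch below is not YAJL code: `demo_scan` and its functions are hypothetical, and it only tracks nesting depth and string state rather than doing a full parse.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner state; everything needed to resume lives here. */
typedef struct {
    int depth;       /* current nesting of {} and [] */
    int max_depth;   /* deepest nesting seen so far */
    int in_string;   /* currently inside a "..." literal */
    int escaped;     /* previous byte was a backslash inside a string */
} demo_scan;

void demo_scan_init(demo_scan *s)
{
    s->depth = s->max_depth = 0;
    s->in_string = s->escaped = 0;
}

/* Feed one chunk.  Returns 0 on an unbalanced close, 1 otherwise. */
int demo_scan_feed(demo_scan *s, const char *chunk, size_t len)
{
    size_t i;
    assert(s != NULL && (chunk != NULL || len == 0));
    for (i = 0; i < len; i++) {
        char c = chunk[i];
        if (s->in_string) {
            if (s->escaped) s->escaped = 0;         /* skip escaped byte */
            else if (c == '\\') s->escaped = 1;
            else if (c == '"') s->in_string = 0;
        } else if (c == '"') {
            s->in_string = 1;
        } else if (c == '{' || c == '[') {
            if (++s->depth > s->max_depth) s->max_depth = s->depth;
        } else if (c == '}' || c == ']') {
            if (s->depth == 0) return 0;
            s->depth--;
        }
    }
    return 1;
}

/* The input is structurally complete when nothing is left open. */
int demo_scan_complete(const demo_scan *s)
{
    return s->depth == 0 && !s->in_string;
}
```

Because the `escaped` and `in_string` flags persist between calls, a chunk boundary may fall in the middle of a string or even between a backslash and the character it escapes.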

## YAJL is fast

Minimal memory copying is performed.  YAJL, when possible, returns
pointers into the client provided text (i.e. for strings that have no
embedded escape chars, hopefully the common case).  I've put a lot of
effort into profiling and tuning performance, but I have ignored a
couple of possible performance improvements to keep the interface clean,
small, and flexible.  My hope is that YAJL will perform comparably to
the fastest JSON parser out there.

YAJL should impose both minimal CPU and memory requirements on your
application.
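
The "pointers into the client provided text" idea can be sketched as follows.  This is an illustration under stated assumptions, not YAJL's implementation: `demo_view` and `demo_string_view` are hypothetical, `body` is assumed to be the text between the quotes of a string literal, and only single-character escapes are decoded (no `\uXXXX`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result: a view of the string contents plus its origin. */
typedef struct {
    const char *data;   /* points into the input or into scratch */
    size_t len;
    int copied;         /* 1 if escapes forced a decode into scratch */
} demo_view;

/* `scratch` must hold at least `len` bytes; decoding never grows the text. */
demo_view demo_string_view(const char *body, size_t len, char *scratch)
{
    demo_view v;
    size_t i, out = 0;
    assert(body != NULL && scratch != NULL);
    if (memchr(body, '\\', len) == NULL) {
        /* Fast path: no escapes, so hand back the caller's own bytes. */
        v.data = body; v.len = len; v.copied = 0;
        return v;
    }
    for (i = 0; i < len; i++) {
        char c = body[i];
        if (c == '\\' && i + 1 < len) {
            c = body[++i];
            switch (c) {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            case 'r': c = '\r'; break;
            case 'b': c = '\b'; break;
            case 'f': c = '\f'; break;
            default: break;   /* \" \\ and \/ map to themselves */
            }
        }
        scratch[out++] = c;
    }
    v.data = scratch; v.len = out; v.copied = 1;
    return v;
}
```

In the common escape-free case the only cost is one `memchr` pass; nothing is allocated or copied.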

## YAJL is tiny.

Fat free.  No whip.  Now truly so with the elimination of CMake!

enjoy,
Lloyd - July, 2007
Greg - April, 2024

[GHRY]: https://github.com/robohack/yajl/
[LLOYD]: https://github.com/lloyd/yajl/

yajl's People

Contributors

7ac, bluemarvin, bovine, conradirwin, dougm, emaste, gno, halostatue, hstern, jstamp, lloyd, mirek, mxcl, octo, patperry, plaguemorin, rflynn, robohack, sgravrock, shahbag, timgates42, tjw, z00b

yajl's Issues

Null pointer 3

In yajl_tree.c, on line 299, we use ctx->stack->value->type, but at this location ctx->stack->value may be NULL.

NULL pointer 2

In yajl_alloc(), there is no NULL pointer check after the YA_MALLOC:

hand = (yajl_handle) YA_MALLOC(afs, sizeof(struct yajl_handle_t));
+if (!hand) return NULL;

Same in yajl_lex_alloc():

yajl_lexer lxr = (yajl_lexer) YA_MALLOC(alloc, sizeof(struct yajl_lexer_t));
+if (!lxr) return NULL;

Invalid "state" of buffer when decoding string

In yajl_parser.c, on line 258, we pass yajl_buf_data(hand->decodeBuf) to the callback instead of the usual buffer "buf". Because this points to a different memory location, the callback can receive pointers into two unrelated buffers.
Concrete problem: in ModSecurity, we use the callback to get the decoded value of the string, and we calculate the offset of a variable value in order to mask it in the log. When the string has to be decoded, the callback receives a location outside the original buffer and we cannot calculate the offset.

We could perform this trivial change:

-yajl_string_decode(hand->decodeBuf, buf, bufLen);
-_CC_CHK(hand->callbacks->yajl_string(
                                    hand->ctx, yajl_buf_data(hand->decodeBuf),
                                    yajl_buf_len(hand->decodeBuf)));
+if (yajl_string_decode(hand->decodeBuf, buf, bufLen) < 0) return yajl_status_error;
+bufLen = yajl_buf_len(hand->decodeBuf);
+memcpy((void *)buf, yajl_buf_data(hand->decodeBuf), bufLen);
+_CC_CHK(hand->callbacks->yajl_string(hand->ctx, buf, bufLen));

Same on line 397

Enhancement: conditional define for alloc macros

For enhanced performance in prod:

#ifdef NDEBUG
# define YA_MALLOC(afs, sz) malloc(sz)
# define YA_FREE(afs, ptr) free(ptr)
# define YA_REALLOC(afs, ptr, sz) realloc(ptr, sz)
#else
# define YA_MALLOC(afs, sz) (afs)->malloc((afs)->ctx, (sz))
# define YA_FREE(afs, ptr) (afs)->free((afs)->ctx, (ptr))
# define YA_REALLOC(afs, ptr, sz) (afs)->realloc((afs)->ctx, (ptr), (sz))
#endif

Compatibility syntax

Visual C++, in debug mode, redefines alloc functions (to detect memory leaks).
The syntax used in this project is incompatible with the redefinition.
Would it be possible to prefix the functions with 'yajl_' in the yajl_alloc_funcs struct?
That is: yajl_malloc(), yajl_free(), and yajl_realloc().
It's trivial but it makes a huge difference when you need to compile with Visual C++.
Thanks

yajl.pc is not installed

After a successful build and bmake install, the yajl.pc file does not appear in the pkgconfig directory.

Version release-2.2

NULL pointer check

yajl_buf yajl_buf_alloc(yajl_alloc_funcs * yajl_alloc)
{
    yajl_buf b = YA_MALLOC(yajl_alloc, sizeof(struct yajl_buf_t));
    memset((void *) b, 0, sizeof(struct yajl_buf_t));

Should be

    yajl_buf b = YA_MALLOC(yajl_alloc, sizeof(struct yajl_buf_t));
    if (!b) return NULL;
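
A self-contained sketch of the checked-allocation pattern follows.  The names are hypothetical stand-ins (`demo_alloc_funcs`, `demo_handle`, `demo_budget_alloc`) rather than YAJL's own types; the budget allocator exists only to force a failure so that the NULL path can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator table, loosely modelled on the issue text. */
typedef struct {
    void *(*alloc)(void *ctx, size_t sz);
    void (*release)(void *ctx, void *ptr);
    void *ctx;
} demo_alloc_funcs;

typedef struct {
    size_t used;
    int depth;
} demo_handle;

/* Returns NULL instead of writing through a failed allocation. */
demo_handle *demo_handle_alloc(const demo_alloc_funcs *afs)
{
    demo_handle *h;
    assert(afs != NULL && afs->alloc != NULL);
    h = afs->alloc(afs->ctx, sizeof *h);
    if (h == NULL) return NULL;     /* check before touching the block */
    memset(h, 0, sizeof *h);
    return h;
}

/* Test allocator: fails once the budget stored in ctx is used up. */
void *demo_budget_alloc(void *ctx, size_t sz)
{
    int *budget = ctx;
    if (*budget <= 0) return NULL;
    (*budget)--;
    return malloc(sz);
}

void demo_budget_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}
```

An allocator that can be told to fail makes it cheap to test every allocation site, which is how unchecked calls like the ones reported here tend to be found.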

Enhancement: compatibility with memory pools without realloc

Some memory pool APIs (like Apache APR) don't implement a realloc function, so it's not possible to use them with yajl. It is also impossible to write a generic realloc on top of them from outside the library.

A (quite simple) possibility is to allow the realloc function to be left unspecified and, in that case, to use an internal function. We can do that because in every call to realloc we know the old size (which an external function does not know).

Concretely:

Remove existing tests "afs->yajl_realloc == NULL"

Create the internal "extended" realloc:

void* yajl_realloc2(yajl_alloc_funcs* afs, void* previous, size_t sz, size_t oldsz)
{
    void* new = afs->yajl_malloc(afs->ctx, sz);
    if (!new) return NULL;
    if (!previous) return new;
    /* copy no more than the new block can hold */
    if (oldsz) memcpy(new, previous, oldsz < sz ? oldsz : sz);
    afs->yajl_free(afs->ctx, previous);
    return new;
}

Extend the definition of YA_REALLOC:
#define YA_REALLOC(afs, ptr, sz, oldsz) ((afs)->yajl_realloc ? (afs)->yajl_realloc((afs)->ctx, (ptr), (sz)) : yajl_realloc2((afs), (ptr), (sz), (oldsz)))

I attached a complete diff, tested with mod_security2 (APR).
For info, this speeds up the parsing by 250% on big JSON.

yajl_realloc2.zip
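
To make the idea concrete, here is a minimal sketch of growing a block on top of a pool that has no realloc.  It is not APR and not the attached diff: `demo_pool`, `demo_pool_alloc`, and `demo_pool_regrow` are hypothetical, and the pool is a simple bump allocator over a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bump-pointer pool: it can allocate but never realloc. */
typedef struct {
    unsigned char *base;
    size_t cap;
    size_t used;
} demo_pool;

void *demo_pool_alloc(demo_pool *p, size_t sz)
{
    void *out;
    /* Round up so block offsets stay multiples of 16. */
    size_t rounded = (sz + 15u) & ~(size_t)15u;
    assert(p != NULL && p->used <= p->cap);
    if (rounded < sz || rounded > p->cap - p->used) return NULL;
    out = p->base + p->used;
    p->used += rounded;
    return out;
}

/* Resize by allocating anew and copying.  The caller supplies oldsz
 * because the pool does not record block sizes.  The old block is
 * simply abandoned; a pool releases everything at once. */
void *demo_pool_regrow(demo_pool *p, void *old, size_t oldsz, size_t sz)
{
    void *fresh = demo_pool_alloc(p, sz);
    if (fresh == NULL) return NULL;          /* old block stays valid */
    if (old != NULL && oldsz > 0)
        memcpy(fresh, old, oldsz < sz ? oldsz : sz);
    return fresh;
}
```

The extra `oldsz` argument is the whole trick: the library always knows how large its buffers are, so it can do the copy that a size-blind external wrapper cannot.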
