waltzofpearls / dateparser

Parse dates in commonly used string formats with Rust.

License: MIT License

Rust 96.91% Makefile 3.09%
rust dateparser datetime cli lib chrono timezones

dateparser's Introduction

Parse dates in commonly used string formats with Rust.

This repo contains 2 cargo workspaces:

  • dateparser: Rust crate for parsing date strings in commonly used formats.
  • belt: Command-line tool that can display a given time in a list of selected time zones. It also serves as an example showcasing how you could use dateparser in your project.

A minimal example of using dateparser:

use dateparser::parse;
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let parsed = parse("6:15pm")?;
    println!("{:#?}", parsed);
    Ok(())
}

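This parses the input 6:15pm and prints the parsed date and time in UTC, for example 2023-03-26T01:15:00Z. The exact value depends on the current date and the local time zone, because the input specifies neither. More about this crate on Docs.rs and in the examples folder.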
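The jump from 6:15pm to 01:15 on the following day in the sample output is the local-to-UTC conversion. Below is a minimal std-only sketch of that arithmetic. It assumes a local offset of UTC-7, which is inferred from the sample output rather than stated anywhere, and `to_utc` is a hypothetical helper, not part of dateparser:

```rust
/// Convert a local 12-hour clock time to UTC.
/// Returns (day_shift, hour, minute), where day_shift is -1, 0 or +1.
/// `offset_minutes` is the local offset from UTC, e.g. -420 for UTC-7.
fn to_utc(hour12: u32, minute: u32, pm: bool, offset_minutes: i32) -> (i32, u32, u32) {
    // 12am maps to hour 0 and 12pm maps to hour 12.
    let hour24 = (hour12 % 12 + if pm { 12 } else { 0 }) as i32;
    let local = hour24 * 60 + minute as i32;
    // UTC is local time minus the zone offset.
    let utc = local - offset_minutes;
    let day_shift = utc.div_euclid(1440);
    let in_day = utc.rem_euclid(1440);
    (day_shift, (in_day / 60) as u32, (in_day % 60) as u32)
}

fn main() {
    // 6:15pm at UTC-7 (assumed offset).
    let (day_shift, hour, minute) = to_utc(6, 15, true, -420);
    println!("day {:+}, {:02}:{:02} UTC", day_shift, hour, minute);
}
```

With the assumed offset this yields a day shift of +1 and a time of 01:15, which matches the date rolling over in the sample output.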

Accepted date formats

// unix timestamp
"1511648546",
"1620021848429",
"1620024872717915000",
// rfc3339
"2021-05-01T01:17:02.604456Z",
"2017-11-25T22:34:50Z",
// rfc2822
"Wed, 02 Jun 2021 06:31:39 GMT",
// postgres timestamp yyyy-mm-dd hh:mm:ss z
"2019-11-29 08:08-08",
"2019-11-29 08:08:05-08",
"2021-05-02 23:31:36.0741-07",
"2021-05-02 23:31:39.12689-07",
"2019-11-29 08:15:47.624504-08",
"2017-07-19 03:21:51+00:00",
// yyyy-mm-dd hh:mm:ss
"2014-04-26 05:24:37 PM",
"2021-04-30 21:14",
"2021-04-30 21:14:10",
"2021-04-30 21:14:10.052282",
"2014-04-26 17:24:37.123",
"2014-04-26 17:24:37.3186369",
"2012-08-03 18:31:59.257000000",
// yyyy-mm-dd hh:mm:ss z
"2017-11-25 13:31:15 PST",
"2017-11-25 13:31 PST",
"2014-12-16 06:20:00 UTC",
"2014-12-16 06:20:00 GMT",
"2014-04-26 13:13:43 +0800",
"2014-04-26 13:13:44 +09:00",
"2012-08-03 18:31:59.257000000 +0000",
"2015-09-30 18:48:56.35272715 UTC",
// yyyy-mm-dd
"2021-02-21",
// yyyy-mm-dd z
"2021-02-21 PST",
"2021-02-21 UTC",
"2020-07-20+08:00",
// hh:mm:ss
"01:06:06",
"4:00pm",
"6:00 AM",
// hh:mm:ss z
"01:06:06 PST",
"4:00pm PST",
"6:00 AM PST",
"6:00pm UTC",
// Mon dd hh:mm:ss
"May 6 at 9:24 PM",
"May 27 02:45:27",
// Mon dd, yyyy, hh:mm:ss
"May 8, 2009 5:57:51 PM",
"September 17, 2012 10:09am",
"September 17, 2012, 10:10:09",
// Mon dd, yyyy hh:mm:ss z
"May 02, 2021 15:51:31 UTC",
"May 02, 2021 15:51 UTC",
"May 26, 2021, 12:49 AM PDT",
"September 17, 2012 at 10:09am PST",
// yyyy-mon-dd
"2021-Feb-21",
// Mon dd, yyyy
"May 25, 2021",
"oct 7, 1970",
"oct 7, 70",
"oct. 7, 1970",
"oct. 7, 70",
"October 7, 1970",
// dd Mon yyyy hh:mm:ss
"12 Feb 2006, 19:17",
"12 Feb 2006 19:17",
"14 May 2019 19:11:40.164",
// dd Mon yyyy
"7 oct 70",
"7 oct 1970",
"03 February 2013",
"1 July 2013",
// mm/dd/yyyy hh:mm:ss
"4/8/2014 22:05",
"04/08/2014 22:05",
"4/8/14 22:05",
"04/2/2014 03:00:51",
"8/8/1965 12:00:00 AM",
"8/8/1965 01:00:01 PM",
"8/8/1965 01:00 PM",
"8/8/1965 1:00 PM",
"8/8/1965 12:00 AM",
"4/02/2014 03:00:51",
"03/19/2012 10:11:59",
"03/19/2012 10:11:59.3186369",
// mm/dd/yyyy
"3/31/2014",
"03/31/2014",
"08/21/71",
"8/1/71",
// yyyy/mm/dd hh:mm:ss
"2014/4/8 22:05",
"2014/04/08 22:05",
"2014/04/2 03:00:51",
"2014/4/02 03:00:51",
"2012/03/19 10:11:59",
"2012/03/19 10:11:59.3186369",
// yyyy/mm/dd
"2014/3/31",
"2014/03/31",
// mm.dd.yyyy
"3.31.2014",
"03.31.2014",
"08.21.71",
// yyyy.mm.dd
"2014.03.30",
"2014.03",
// yymmdd hh:mm:ss mysql log
"171113 14:14:20",
// chinese yyyy mm dd hh mm ss
"2014年04月08日11时25分18秒",
// chinese yyyy mm dd
"2014年04月08日",
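The three unix timestamp samples at the top of the list differ only in length: 10, 13 and 19 digits, which correspond to seconds, milliseconds and nanoseconds for present-day dates. A rough std-only sketch of telling them apart by digit count follows. The names `Precision` and `epoch_seconds` are made up for illustration, and this is not necessarily how dateparser does it internally:

```rust
#[derive(Debug, PartialEq)]
enum Precision {
    Seconds,
    Millis,
    Nanos,
}

/// Classify an all-digit timestamp by its length and reduce it to whole seconds.
fn epoch_seconds(input: &str) -> Option<(Precision, i64)> {
    if input.is_empty() || !input.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let value: i64 = input.parse().ok()?;
    let (precision, divisor) = match input.len() {
        10 => (Precision::Seconds, 1),
        13 => (Precision::Millis, 1_000),
        19 => (Precision::Nanos, 1_000_000_000),
        _ => return None,
    };
    Some((precision, value / divisor))
}

fn main() {
    for sample in ["1511648546", "1620021848429", "1620024872717915000"].iter() {
        println!("{} -> {:?}", sample, epoch_seconds(sample));
    }
}
```

Anything that is not purely digits, such as "2021-02-21", falls through to `None` and would be handed to the other format matchers.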

belt CLI tool

Run belt to parse a given date:

$> belt 'MAY 12, 2021 16:44 UTC'
+-------------------+---------------------------+
| Zone              | Date & Time               |
+===================+===========================+
| Local             | 2021-05-12 09:44:00 -0700 |
|                   | 1620837840                |
+-------------------+---------------------------+
| UTC               | 2021-05-12 16:44:00 +0000 |
|                   | 2021-05-12 16:44 UTC      |
+-------------------+---------------------------+
| America/Vancouver | 2021-05-12 09:44:00 -0700 |
|                   | 2021-05-12 09:44 PDT      |
+-------------------+---------------------------+
| America/New_York  | 2021-05-12 12:44:00 -0400 |
|                   | 2021-05-12 12:44 EDT      |
+-------------------+---------------------------+
| Europe/London     | 2021-05-12 17:44:00 +0100 |
|                   | 2021-05-12 17:44 BST      |
+-------------------+---------------------------+
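Every row in the table is the same instant (epoch 1620837840) shown at a different UTC offset. As a sanity check, here is a small std-only sketch that derives the time of day from the epoch value and a fixed offset. The offsets are copied from the table; `time_of_day` is a hypothetical helper and does no time zone database lookup, unlike belt:

```rust
/// Time of day (hour, minute, second) for an epoch timestamp at a fixed UTC offset.
fn time_of_day(epoch_secs: i64, offset_secs: i64) -> (i64, i64, i64) {
    // rem_euclid keeps the result in 0..86400 even for negative inputs.
    let secs = (epoch_secs + offset_secs).rem_euclid(86_400);
    (secs / 3600, secs % 3600 / 60, secs % 60)
}

fn main() {
    let epoch: i64 = 1_620_837_840;
    // Offsets in hours as shown in the table above (DST already applied).
    let zones = [
        ("UTC", 0),
        ("America/Vancouver", -7),
        ("America/New_York", -4),
        ("Europe/London", 1),
    ];
    for &(zone, offset_hours) in zones.iter() {
        let (h, m, s) = time_of_day(epoch, offset_hours * 3600);
        println!("{:<18} {:02}:{:02}:{:02}", zone, h, m, s);
    }
}
```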

Installation

macOS Homebrew or Linuxbrew:

brew tap waltzofpearls/belt
brew install belt

How to make a new release

List files that need to be updated with new version number:

make show-version-files

It will output something like this:

./dateparser/Cargo.toml:3:version = "0.1.5"
./dateparser/README.md:26:dateparser = "0.1.5"
./dateparser/README.md:60:dateparser = "0.1.5"
./belt/Cargo.toml:3:version = "0.1.5"

Next, either bump the version automatically with make bump-version or manually update the version numbers in the listed files. make bump-version only bumps the patch version; for example, 0.1.5 becomes 0.1.6. The automatic bump creates a git branch, then commits and pushes the changes. You will need to create a pull request on GitHub to merge the changes from that automatically created branch.
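For illustration only, a patch bump such as 0.1.5 to 0.1.6 can be sketched as below. The Makefile target itself is not shown here, so this is not its implementation; `bump_patch` is a hypothetical helper that handles plain MAJOR.MINOR.PATCH strings and nothing else (no pre-release or build suffixes):

```rust
/// Increment the patch component of a "MAJOR.MINOR.PATCH" version string.
fn bump_patch(version: &str) -> Option<String> {
    let parts: Vec<&str> = version.split('.').collect();
    if parts.len() != 3 {
        return None;
    }
    let major: u64 = parts[0].parse().ok()?;
    let minor: u64 = parts[1].parse().ok()?;
    let patch: u64 = parts[2].parse().ok()?;
    // checked_add avoids a panic on overflow.
    Some(format!("{}.{}.{}", major, minor, patch.checked_add(1)?))
}

fn main() {
    println!("{:?}", bump_patch("0.1.5"));
}
```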

NOTE: If the files with version numbers are edited manually, you will need to run cargo update to update the dateparser and belt versions in Cargo.lock, then commit and push those changes to a git branch and create a pull request from that branch.

Once the pull request is merged and those files are updated, run the following command to tag a new version with git and push the new tag to GitHub. This will trigger a build and release workflow run in GitHub Actions:

make release

dateparser's People

Contributors

autarch, gyfis, kdwarn, kkysen, waltzofpearls

dateparser's Issues

Seemingly arbitrary leniency towards commas

Currently some formats allow unnecessary commas, others enforce commas in specific places, and others disallow commas entirely. A few examples:

  • 12 September 2013 parses correctly
  • 12 September, 2013 does not parse
  • September 12, 2013 parses correctly
  • September 12 2013 does not parse
  • September 12 2013 08:00 UTC parses correctly
  • September 12 2013 08:00 does not parse
  • September 12, 2013 08:00 parses correctly

I'm looking to put together a PR for this (I've already done some brief tests, aiming to make it lenient in every case I could think of), but I wanted to check whether this is something you'd be interested in, and whether there are any major pitfalls I haven't noticed that require the current behaviour.
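One possible shape for such leniency, sketched under the assumption that commas carry no meaning in these formats, is to normalize the input before matching. `normalize_commas` is a hypothetical pre-processing step, not something taken from the crate or from the proposed PR:

```rust
/// Replace commas with spaces and collapse runs of whitespace,
/// so "12 September, 2013" and "12 September 2013" become identical.
fn normalize_commas(input: &str) -> String {
    input
        .replace(',', " ")
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    for sample in ["12 September, 2013", "September 12, 2013 08:00"].iter() {
        println!("{:?} -> {:?}", sample, normalize_commas(sample));
    }
}
```

A blanket strip like this would also touch formats where the comma is part of the standard form, such as the RFC 2822 sample "Wed, 02 Jun 2021 06:31:39 GMT", so those would need to be matched before normalization or have comma-free patterns added.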

Convert `anyhow::Result`s in parsing functions to custom error type.

I like this library, but one downside is that it returns an anyhow::Result instead of a custom error type. Generally, it's better for libraries to implement these custom error types as it gives more information to the user about what went wrong, as anyhow::Result can be very vague at times. Consider possibly using thiserror or snafu? That way I can actually see (and report!) what went wrong instead of dealing with an opaque type returned by anyhow.
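To make the request concrete, here is a std-only sketch of what a custom error type could look like, written by hand instead of with thiserror or snafu. The variant names and the `parse_hour` function are invented for illustration and do not reflect dateparser's actual API:

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type a parsing library could expose.
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    UnrecognizedFormat(String),
    OutOfRange { field: &'static str, value: i64 },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Empty => write!(f, "input is empty"),
            ParseError::UnrecognizedFormat(s) => write!(f, "unrecognized format: {:?}", s),
            ParseError::OutOfRange { field, value } => {
                write!(f, "{} out of range: {}", field, value)
            }
        }
    }
}

impl Error for ParseError {}

/// Toy parser for an hour value, used only to produce each error variant.
fn parse_hour(input: &str) -> Result<u32, ParseError> {
    let trimmed = input.trim();
    if trimmed.is_empty() {
        return Err(ParseError::Empty);
    }
    let value: i64 = trimmed
        .parse()
        .map_err(|_| ParseError::UnrecognizedFormat(trimmed.to_string()))?;
    if !(0..=23).contains(&value) {
        return Err(ParseError::OutOfRange { field: "hour", value });
    }
    Ok(value as u32)
}

fn main() {
    // Callers can match on the variant instead of inspecting an opaque error.
    match parse_hour("25") {
        Ok(hour) => println!("hour = {}", hour),
        Err(ParseError::OutOfRange { field, value }) => {
            println!("{} is outside the valid range: {}", field, value)
        }
        Err(other) => println!("error: {}", other),
    }
}
```

The point of the enum is that callers can branch on the failure kind and report it precisely, which an opaque anyhow error does not allow without downcasting.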

awk mktime date format support

Thanks for a good library for parsing dates. I use AWK daily; could you support the gawk mktime date format?

mktime(datespec [, utc-flag ])
Turn datespec into a timestamp in the same form as is returned by systime(). It is similar to the function of the same name in ISO C. The argument, datespec, is a string of the form "YYYY MM DD HH MM SS [DST]". The string consists of six or seven numbers representing, respectively, the full year including century, the month from 1 to 12, the day of the month from 1 to 31, the hour of the day from 0 to 23, the minute from 0 to 59, the second from 0 to 60, and an optional daylight-savings flag.

$ echo | awk '{print strftime("%d-%m-%Y",mktime("2012 12 21 0 0 0"));}'
21-12-2012

https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html
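As a rough idea of what supporting this format would involve, the sketch below splits a datespec string into its six or seven numeric fields using only the standard library. `DateSpec` and `parse_datespec` are hypothetical names, and the range checks simply follow the ranges quoted above, which is a simplification:

```rust
/// Fields of a gawk-style "YYYY MM DD HH MM SS [DST]" string.
#[derive(Debug, PartialEq)]
struct DateSpec {
    year: i32,
    month: u32,
    day: u32,
    hour: u32,
    minute: u32,
    second: u32,
    dst: Option<i32>,
}

fn parse_datespec(input: &str) -> Option<DateSpec> {
    // Every whitespace-separated token must be an integer.
    let nums = input
        .split_whitespace()
        .map(|token| token.parse::<i32>().ok())
        .collect::<Option<Vec<i32>>>()?;
    if nums.len() != 6 && nums.len() != 7 {
        return None;
    }
    let in_range = (1..=12).contains(&nums[1])
        && (1..=31).contains(&nums[2])
        && (0..=23).contains(&nums[3])
        && (0..=59).contains(&nums[4])
        && (0..=60).contains(&nums[5]);
    if !in_range {
        return None;
    }
    Some(DateSpec {
        year: nums[0],
        month: nums[1] as u32,
        day: nums[2] as u32,
        hour: nums[3] as u32,
        minute: nums[4] as u32,
        second: nums[5] as u32,
        dst: nums.get(6).copied(),
    })
}

fn main() {
    println!("{:?}", parse_datespec("2012 12 21 0 0 0"));
}
```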

parse returns datetime with time set to current local time

Reviving #16

I was wondering if you would consider adding a feature or accepting a PR to allow zeroing out of unknown components?

My use case is I wish to use this parser to parse arbitrary dates where adding the local time makes the dates incorrect. Because I'll be parsing arbitrary data where the input is not under my direct control I will be unable to implement the workaround provided in the previous issue.

`parse` returns datetime with time set to current local time...

...when calling parse with a plain date string. For example:

use dateparser::parse;
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let parsed = parse("July 14, 2021")?;
    println!("{:#?}", parsed);
    Ok(())
}

returns a value like 2021-07-14T22:51:35.983216400Z where the time portion happens to be the current local time.

Shouldn't it just return 2021-07-14T00:00:00Z?

Thanks regardless @waltzofpearls !

Natural language support

Hey @waltzofpearls,

I'm currently using chrono-english, which is a library similar to dateparser. The chrono-english maintainer is pretty inactive though and I was about to rewrite his library, when I stumbled upon your library 😁 .

I like your architectural approach and I wanted to ask whether you could imagine merging a parser for natural language, such as 3 weeks ago, in 2 months and 2 days, friday at 12pm. I'm currently using this for pueue and would really like to continue using it.
There is, however, an issue with chrono-english and some backwards-incompatible changes in chrono itself, which lead to compilation errors.

My approach for this would be to use pest to create a parser for a well-known syntax.
I would create a new module for natural language parsing and a generic trait and a dedicated parser for the english language that implements said trait, so other languages may be added in the future. Those could also be gated behind features.
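A very small sketch of the trait-plus-language-parser layout described above might look like this. It uses hand-written token matching instead of pest, covers only "N unit ago" and "in N unit", and every name in it (`NaturalLanguageParser`, `English`, `parse_offset`) is hypothetical:

```rust
/// Hypothetical trait that each language-specific parser would implement.
trait NaturalLanguageParser {
    /// Returns an offset in seconds relative to "now", negative for the past.
    fn parse_offset(&self, input: &str) -> Option<i64>;
}

/// English implementation; other languages could be added behind features.
struct English;

impl NaturalLanguageParser for English {
    fn parse_offset(&self, input: &str) -> Option<i64> {
        let lower = input.trim().to_lowercase();
        let tokens: Vec<&str> = lower.split_whitespace().collect();
        // Accept "<n> <unit> ago" or "in <n> <unit>".
        let (amount, unit, sign) = match tokens.as_slice() {
            [n, unit, "ago"] => (*n, *unit, -1),
            ["in", n, unit] => (*n, *unit, 1),
            _ => return None,
        };
        let amount: i64 = amount.parse().ok()?;
        // Fixed-length units only; months and years need calendar arithmetic.
        let unit_secs: i64 = match unit.trim_end_matches('s') {
            "second" => 1,
            "minute" => 60,
            "hour" => 3_600,
            "day" => 86_400,
            "week" => 604_800,
            _ => return None,
        };
        amount.checked_mul(unit_secs).map(|secs| sign * secs)
    }
}

fn main() {
    let parser = English;
    for phrase in ["3 weeks ago", "in 2 days", "friday at 12pm"].iter() {
        println!("{:?} -> {:?}", phrase, parser.parse_offset(phrase));
    }
}
```

Phrases outside the two supported shapes, such as friday at 12pm, return `None` here; a grammar-based parser would be the natural way to grow beyond this.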

How do you feel about this? If you think this is worthwhile and you want to continue maintaining this crate for the foreseeable future, I would go ahead and start working on this.

Edit:
Current progress:
https://github.com/Nukesor/dateparser/tree/natural-language

Default time for incomplete datetime strings

QUERY:
I'm quite keen to use this library, as I try to migrate from Scala/JVM world to Rust where I can. One of the better (though now really obsolete) libraries in the JVM ecosystem is the Joda datetime handler. What is tripping me up is subtle differences in handling parse for "datetimes" that only contain the date part. For example in scala:

scala> import org.joda.time.DateTime
import org.joda.time.DateTime

scala> DateTime.parse("2021-12-02")
res0: org.joda.time.DateTime = 2021-12-02T00:00:00.000Z

whereas in your dateparser, the same kind of parse would yield

[src/main.rs:161] date1 = 2021-12-02T12:06:22.907945488Z

Looking at the code, I can see that you are using the current time as "filler" for the fields that are missing. This seems like a perfectly valid approach, but I'm wondering if you would consider extending the API to support passing arbitrary DateTime values to be used as filler. That would allow me to replicate the behaviour of Joda a little more easily, without having to do a lot of regex parsing and date truncation in my own code.
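The idea of a caller-supplied filler can be sketched without any date library. In the example below, the types `Time` and `DateTime` and the function `parse_with_default` are all hypothetical stand-ins, not dateparser or chrono APIs, and no calendar validation is done:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Time {
    hour: u32,
    minute: u32,
    second: u32,
}

#[derive(Debug, PartialEq)]
struct DateTime {
    year: i32,
    month: u32,
    day: u32,
    time: Time,
}

/// Parse "yyyy-mm-dd" or "yyyy-mm-dd hh:mm:ss".
/// When the time part is missing, use `default_time` instead of the current time.
fn parse_with_default(input: &str, default_time: Time) -> Option<DateTime> {
    let mut parts = input.trim().splitn(2, ' ');
    let mut ymd = parts.next()?.split('-');
    let year: i32 = ymd.next()?.parse().ok()?;
    let month: u32 = ymd.next()?.parse().ok()?;
    let day: u32 = ymd.next()?.parse().ok()?;
    if ymd.next().is_some() {
        return None;
    }
    let time = match parts.next() {
        None => default_time,
        Some(text) => {
            let mut hms = text.split(':');
            let hour: u32 = hms.next()?.parse().ok()?;
            let minute: u32 = hms.next()?.parse().ok()?;
            let second: u32 = hms.next()?.parse().ok()?;
            if hms.next().is_some() {
                return None;
            }
            Time { hour, minute, second }
        }
    };
    Some(DateTime { year, month, day, time })
}

fn main() {
    let midnight = Time { hour: 0, minute: 0, second: 0 };
    println!("{:?}", parse_with_default("2021-12-02", midnight));
}
```

Passing midnight as the filler gives the Joda-like result for a date-only input, while a caller who prefers the current time could pass that instead.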

Default date is not configurable

Would a patch be welcome that's similar to the one below, but for setting the default date to be used? Right now there are a lot of places that default to Utc::now(), which doesn't work for my use case. I need to set some context for what day I expect dateless timestamps to be.

826b8c4
