
sisyphsu / dateparser


dateparser is a smart, high-performance date parser library. It supports hundreds of different formats, covering nearly every format in common use. It is also a showcase for the "retree" algorithm.

License: MIT License

Java 100.00%
dateparser dateformat dateformatter date

dateparser's Introduction

dateparser


Introduction

dateparser is a smart, high-performance datetime parser library that supports hundreds of different patterns.

For better performance and flexibility, dateparser doesn't use SimpleDateFormat or DateTimeFormatter. Instead, it uses retree to split the input String into matched parts and converts each part into a property such as year, month, day, hour, minute, second, or zone.

dateparser has lots of predefined regular expressions as rules, like:

  • (?<week>%s)\W* to match Monday as week
  • ?(?<year>\d{4})$ to match 2019 as year
  • ^(?<year>\d{4})(?<month>\d{2})$ to match 201909 as year and month
  • ?(?<hour>\d{1,2}) o’clock\W* to match 12 o’clock as hour
  • More rules in DateParserBuilder.java

With so many regular expressions, matching them one by one with java.util.regex.Pattern would be a performance disaster. So I chose retree, which merges many regular expressions into one. In my opinion it works more like a tree, which can execute matching quickly and concurrently.
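
As a rough illustration of the "merge many rules into one" idea, the sketch below joins a few named-group rules into a single java.util.regex alternation and walks the input part by part. This is not retree and not the library's implementation; the class name, the rule set, and the ASCII apostrophe in o'clock are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MergedRulesSketch {
    // Illustrative group names; each alternative carries exactly one of them.
    static final String[] GROUPS = {"week", "hour", "year"};

    // Three example rules merged into one pattern with '|'.
    static final Pattern MERGED = Pattern.compile(
            "(?<week>monday|tuesday|wednesday|thursday|friday|saturday|sunday)\\W*"
          + "|(?<hour>\\d{1,2}) o'clock\\W*"
          + "|(?<year>\\d{4})\\W*",
            Pattern.CASE_INSENSITIVE);

    /** Splits the input into "property=value" parts, failing on unmatched text. */
    static List<String> parts(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = MERGED.matcher(input);
        int offset = 0;
        // Each match must start exactly where the previous one ended.
        while (offset < input.length() && m.find(offset) && m.start() == offset) {
            for (String g : GROUPS) {
                if (m.group(g) != null) {
                    out.add(g + "=" + m.group(g));
                }
            }
            offset = m.end();
        }
        if (offset != input.length()) {
            throw new IllegalArgumentException("cannot parse at " + offset);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parts("Monday, 12 o'clock 2019"));
    }
}
```

A plain alternation like this still tries the alternatives in order, so it only shows the single-pass structure, not the speed-up the README attributes to retree.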

You can also customize your own parser by adding new rules.

Install

Add maven dependency:

<dependency>
  <groupId>com.github.sisyphsu</groupId>
  <artifactId>dateparser</artifactId>
  <version>1.0.11</version>
</dependency>

Basic Usage

Parse String into Date, Calendar, LocalDateTime, OffsetDateTime:

Date date = DateParserUtils.parseDate("Mon Jan 02 15:04:05 -0700 2006");
// Tue Jan 03 06:04:05 CST 2006
Calendar calendar = DateParserUtils.parseCalendar("Fri Jul 03 2015 18:04:07 GMT+0100 (GMT Daylight Time)");
// 2015-07-03T17:04:07Z
LocalDateTime dateTime = DateParserUtils.parseDateTime("2019-09-20 10:20:30.12345678 +0200");
// 2019-09-20T16:20:30.123456780
OffsetDateTime offsetDateTime = DateParserUtils.parseOffsetDateTime("2015-09-30 18:48:56.35272715 +0000 UTC");
// 2015-09-30T18:48:56.352727150Z

Note that a TimeZone or ZoneOffset such as -0700 in the input affects the resulting time.
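
The shift can be reproduced with plain java.time. The sketch below assumes the sample outputs above were produced on a machine in a UTC+8 zone (the CST in the comments reads as China Standard Time); the class name and the choice of Asia/Shanghai are assumptions for illustration.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;

class OffsetShiftSketch {
    /** Converts an ISO-8601 date-time with offset into local time in the given zone. */
    static LocalDateTime toLocal(String isoOffsetDateTime, String zoneId) {
        return OffsetDateTime.parse(isoOffsetDateTime)
                .atZoneSameInstant(ZoneId.of(zoneId))
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        // 10:20:30 at +02:00 is 08:20:30 UTC, which is 16:20:30 at +08:00.
        System.out.println(toLocal("2019-09-20T10:20:30.12345678+02:00", "Asia/Shanghai"));
    }
}
```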

Create new DateParser

Because DateParser isn't thread safe, and a parse operation is quite fast (about 1 µs), DateParserUtils maintains one default parser and wraps it with synchronized.

If you want to parse concurrently, create a new parser like this:

DateParser parser = DateParser.newBuilder().build();
Date date = parser.parseDate("Mon Jan 02 15:04:05 -0700 2006");
// Tue Jan 03 06:04:05 CST 2006

A DateParser instance is a little heavy, so try to reuse it.
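
One common way to reuse a parser that is not thread safe is to keep one instance per thread. The sketch below shows that pattern with SimpleDateFormat as a stand-in, because it is also not thread safe and needs no extra dependency. With dateparser, the supplier would presumably be () -> DateParser.newBuilder().build(); the class name here is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class PerThreadParserSketch {
    // One parser per thread: created lazily, then reused by that thread only.
    // SimpleDateFormat is a stand-in for DateParser in this sketch.
    private static final ThreadLocal<SimpleDateFormat> PARSER =
            ThreadLocal.withInitial(
                    () -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss Z", Locale.ROOT));

    static Date parse(String text) {
        try {
            return PARSER.get().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2006-01-02 15:04:05 -0700").getTime());
    }
}
```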

Prefer MM/dd or dd/MM

In most cases, dateparser can recognize which part is the month and which part is the day.

But MM/dd/yy and dd/MM/yy are ambiguous. Most countries use dd/MM/yy; only a few, mainly the USA, use MM/dd/yy.

So dateparser gives dd/MM priority by default, but you can change that:

DateParserUtils.preferMonthFirst(true);
DateParserUtils.parseCalendar("08.03.71");
// 1971-08-03
DateParserUtils.preferMonthFirst(false);
DateParserUtils.parseCalendar("08.03.71");
// 1971-03-08

Notice: if either number is larger than 12, preferMonthFirst has no effect.
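
The documented rule can be written down as a small decision function. This is a sketch of the behaviour described above, not the library's code; the class and method names are made up.

```java
import java.time.LocalDate;

class MonthDayOrderSketch {
    /**
     * Resolves two ambiguous numbers into a date.
     * A number above 12 can only be a day, so it overrides the preference.
     * If both numbers exceed 12, LocalDate.of throws DateTimeException.
     */
    static LocalDate resolve(int first, int second, int year, boolean preferMonthFirst) {
        boolean monthFirst;
        if (first > 12) {
            monthFirst = false;          // first must be the day
        } else if (second > 12) {
            monthFirst = true;           // second must be the day
        } else {
            monthFirst = preferMonthFirst;
        }
        int month = monthFirst ? first : second;
        int day = monthFirst ? second : first;
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        System.out.println(resolve(8, 3, 1971, true));
        System.out.println(resolve(8, 3, 1971, false));
    }
}
```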

Customize Parser

You can use DateParserBuilder to build your own parser and add new rules to parse different input.

For example, to add support for 【2019】, which isn't supported by default:

DateParser parser = DateParser.newBuilder().addRule("【(?<year>\\d{4})】").build();
Calendar calendar = parser.parseCalendar("【1991】");
assert calendar.get(Calendar.YEAR) == 1991;

The group name year is very important; you cannot use an unknown group name.

However, you can register a new handler to parse a new rule:

DateParser parser = DateParser.newBuilder()
    .addRule("民国(\\d{3})年", (input, matcher, dt) -> {
        int offset = matcher.start(1);
        int i0 = input.charAt(offset) - '0';
        int i1 = input.charAt(offset + 1) - '0';
        int i2 = input.charAt(offset + 2) - '0';
        dt.setYear(i0 * 100 + i1 * 10 + i2 + 1911);
    })
    .build();
Calendar calendar = parser.parseCalendar("民国101年");
assert calendar.get(Calendar.YEAR) == 2012;

The 民国101年 represents 101 years after 1911.

Performance

Compared to a single SimpleDateFormat, the performance of dateparser:

Benchmark               Mode  Cnt     Score    Error  Units
SingleBenchmark.java    avgt    6   921.632 ± 12.299  ns/op
SingleBenchmark.parser  avgt    6  1553.909 ± 70.664  ns/op

Compared to a single DateTimeFormatter, the performance of dateparser:

Benchmark                       Mode  Cnt     Score    Error  Units
SingleDateTimeBenchmark.java    avgt    6   654.553 ± 16.703  ns/op
SingleDateTimeBenchmark.parser  avgt    6  1680.690 ± 34.214  ns/op

So, for a String with a known format, dateparser is slower.

But when there is more than one format, say 16, the performance of dateparser:

Benchmark              Mode  Cnt      Score      Error  Units
MultiBenchmark.format  avgt    6  47385.021 ± 1083.649  ns/op
MultiBenchmark.parser  avgt    6  22852.113 ±  310.720  ns/op

dateparser is very stable: as the number of formats increases, it suffers no performance loss.
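
For context, the usual way to handle several known formats without such a library is to try formatters one after another, so the cost grows with the number of candidates. The sketch below shows that approach with three made-up patterns. It is only an assumption that the format side of the multi-format benchmark works along these lines.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

class TryEachFormatSketch {
    // Candidate formats, tried in order until one accepts the input.
    static final DateTimeFormatter[] FORMATS = {
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss", Locale.ROOT),
        DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss", Locale.ROOT),
        DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss", Locale.ROOT)
    };

    static LocalDateTime parse(String text) {
        for (DateTimeFormatter format : FORMATS) {
            try {
                return LocalDateTime.parse(text, format);
            } catch (DateTimeParseException ignored) {
                // Every failed attempt costs an exception; try the next format.
            }
        }
        throw new IllegalArgumentException("no format matched: " + text);
    }

    public static void main(String[] args) {
        System.out.println(parse("03/19/2012 10:11:59"));
    }
}
```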

You can check out the benchmark source code in the repository.

Showcase

Here are some examples of datetime formats that dateparser supports:

May 8, 2009 5:57:51 PM                               
oct 7, 1970                                          
oct 7, '70                                           
oct. 7, 1970                                         
oct. 7, 70                                           
Mon Jan  2 15:04:05 2006                             
Mon Jan  2 15:04:05 MST 2006                         
Mon Jan 02 15:04:05 -0700 2006                       
Monday, 02-Jan-06 15:04:05 MST                       
Mon, 02 Jan 2006 15:04:05 MST                        
Tue, 11 Jul 2017 16:28:13 +0200 (CEST)               
Mon, 02 Jan 2006 15:04:05 -0700                      
Thu, 4 Jan 2018 17:53:36 +0000                       
Mon Aug 10 15:44:11 UTC+0100 2015                    
Fri Jul 03 2015 18:04:07 GMT+0100 (GMT Daylight Time)
September 17, 2012 10:09am                         
September 17, 2012 at 10:09am PST-08               
September 17, 2012, 10:10:09                       
October 7, 1970                                    
October 7th, 1970                                  
12 Feb 2006, 19:17                                 
12 Feb 2006 19:17                                  
7 oct 70                                           
7 oct 1970                                         
03 February 2013                                   
1 July 2013                                        
2013-Feb-03                                        
3/31/2014                                          
03/31/2014                                         
08/21/71                                           
8/1/71                                             
4/8/2014 22:05                                     
04/08/2014 22:05                                   
4/8/14 22:05                                       
04/2/2014 03:00:51                                 
8/8/1965 12:00:00 AM                               
8/8/1965 01:00:01 PM                               
8/8/1965 01:00 PM                                  
8/8/1965 1:00 PM                                   
8/8/1965 12:00 AM                                  
4/02/2014 03:00:51                                 
03/19/2012 10:11:59                                
03/19/2012 10:11:59.3186369                        
2014/3/31                                          
2014/03/31                                         
2014/4/8 22:05                                     
2014/04/08 22:05                                   
2014/04/2 03:00:51                                 
2014/4/02 03:00:51                                 
2012/03/19 10:11:59                                
2012/03/19 10:11:59.3186369                        
2014年04月08日                                      
2006-01-02T15:04:05+0000                           
2009-08-12T22:15:09-07:00                          
2009-08-12T22:15:09                                
2009-08-12T22:15:09Z                               
2014-04-26 17:24:37.3186369                        
2012-08-03 18:31:59.257000000                      
2014-04-26 17:24:37.123                            
2013-04-01 22:43                                   
2013-04-01 22:43:22                                
2014-12-16 06:20:00 UTC                            
2014-12-16 06:20:00 GMT                          
2014-04-26 05:24:37 PM                           
2014-04-26 13:13:43 +0800                        
2014-04-26 13:13:43 +0800 +08                    
2014-04-26 13:13:44 +09:00                       
2012-08-03 18:31:59.257000000 +0000 UTC          
2015-09-30 18:48:56.35272715 +0000 UTC           
2015-02-18 00:12:00 +0000 GMT                    
2015-02-18 00:12:00 +0000 UTC                    
2015-02-08 03:02:00 +0300 MSK m=+0.000000001     
2015-02-08 03:02:00.001 +0300 MSK m=+0.000000001 
2017-07-19 03:21:51+00:00
2014-04-26               
2014-04                  
2014                     
2014-05-11 08:20:13,787  
3.31.2014       
03.31.2014      
08.21.71        
2014.03         
2014.03.30      
20140601        
20140722105203  
1332151919      
1384216367189   
1384216367111222
1384216367111222333 

Lots of these examples were copied from https://github.com/araddon/dateparse.

Support

Let me know if you run into any issues when using this library.

Let me know if you need any features that this library doesn't have yet.

Pull requests are welcome.

License

MIT

dateparser's People

Contributors

fdlsk2r, kiruthikaaarthi, sisyphsu


dateparser's Issues

Returning a hint whether the parsing was deterministic or not

We have a use case where we need to get all possible interpretations of a date when the format is non-deterministic. For example, with 3/4/2023 the parser can't know which number is the day and which is the month. In this case, could the library provide one of the following options:

  • Either provide both possible dates
  • Or, return a hint that the group was "dayOfMonth" in which case, the caller can use the parsed date to convert to the alternate format and later resolve the conflict based on the use case.
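
On the caller side, the first option could look roughly like the sketch below: build both readings of the two numbers and keep the ones that are real dates. This is a hypothetical helper written against java.time only, not something the library is known to offer.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class AmbiguousDateSketch {
    /** Returns every valid reading of "first/second/year"; one result means unambiguous. */
    static List<LocalDate> candidates(int first, int second, int year) {
        List<LocalDate> out = new ArrayList<>();
        addIfValid(out, year, first, second);       // month-first reading
        if (first != second) {
            addIfValid(out, year, second, first);   // day-first reading
        }
        return out;
    }

    private static void addIfValid(List<LocalDate> out, int year, int month, int day) {
        try {
            out.add(LocalDate.of(year, month, day));
        } catch (DateTimeException e) {
            // Not a real calendar date (for example month 17); skip this reading.
        }
    }

    public static void main(String[] args) {
        System.out.println(candidates(3, 4, 2023));
    }
}
```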

Support modern JDK versions by updating lombok to 1.18.10 (or newer)

See original Lombok issue here: projectlombok/lombok#2790

It looks like updating to a more recent version of lombok will do the trick!

Personally, I use OpenJDK 17 and see the error:

java.lang.RuntimeException: java.lang.IllegalAccessError: class lombok.javac.apt.LombokProcessor (in unnamed module @0x4d0cbf3f) cannot access class com.sun.tools.javac.processing.JavacProcessingEnvironment (in module jdk.compiler) because module jdk.compiler does not export com.sun.tools.javac.processing to unnamed module @0x4d0cbf3f

during compilation.

Thank you for this wonderful date parsing solution!

Get date format

Hi,
I would like to be able to get the date format from a string. For example, one could add a getFormatPattern method to get a format string that can be used further for business logic.
My case:
I am doing a csv parser, I need to be able to define the data type in a column, this library could help me if it had this function.

(How to?) Improve performance when parsing many strings in the same format

I was wondering if there is an option to improve the performance even further when parsing many strings that are all in the same format.
My use case is parsing timestamps from a CSV file that has millions of rows, where every timestamp is in the same format.
It would be ideal if I could just say to the parser: "remember that format you detected for the previous string. I'm pretty sure this string is in the same format, so try that first when parsing this string".

To illustrate this, my situation is similar to this benchmark

package com.github.sisyphsu.dateparser.benchmark;

import com.github.sisyphsu.dateparser.DateParser;
import org.openjdk.jmh.annotations.*;

import java.util.Random;
import java.util.concurrent.TimeUnit;

@Warmup(iterations = 2, time = 2)
@BenchmarkMode(Mode.AverageTime)
@Fork(2)
@Measurement(iterations = 3, time = 3)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class MultiSameBenchmark {

    private static String[] TEXTS;

    static {
        Random random = new Random(123456789l);
        TEXTS = new String[10000000];
        for(int i = 0; i < TEXTS.length; i++){
            TEXTS[i] = String.format("2020-0%d-1%d 00:%d%d:00 UTC",
                    random.nextInt(8) + 1,
                    random.nextInt(8) + 1,
                    random.nextInt(5),
                    random.nextInt(9));
        }
    }

    @Benchmark
    public void parser() {
        DateParser parser = DateParser.newBuilder().build();
        for (String text : TEXTS) {
            parser.parseDate(text);
        }
    }
}

Is there already such an option on the parser that I overlooked ?
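
The "try the previous format first" idea can be sketched outside the library as a small wrapper around a list of DateTimeFormatter candidates. Whether dateparser has such an option is not stated here; the class, field names, and patterns below are hypothetical, and the wrapper is not thread safe.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

class LastFormatFirstSketch {
    private final DateTimeFormatter[] formats = {
        DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss", Locale.ROOT),
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss 'UTC'", Locale.ROOT)
    };
    private int last = 0;   // index of the format that matched most recently
    int attempts = 0;       // counts parse attempts, only to make the effect visible

    LocalDateTime parse(String text) {
        for (int i = 0; i < formats.length; i++) {
            int index = (last + i) % formats.length;   // start with the last winner
            attempts++;
            try {
                LocalDateTime value = LocalDateTime.parse(text, formats[index]);
                last = index;
                return value;
            } catch (DateTimeParseException ignored) {
                // Fall through to the next candidate.
            }
        }
        throw new IllegalArgumentException("no format matched: " + text);
    }

    public static void main(String[] args) {
        LastFormatFirstSketch parser = new LastFormatFirstSketch();
        parser.parse("2020-01-11 00:10:00 UTC");   // two attempts on the first row
        parser.parse("2020-02-12 00:21:00 UTC");   // one attempt afterwards
        System.out.println(parser.attempts);
    }
}
```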

Format exception

Hi, I want to parse the input "17JAN2023/00:00".
I use the rule "(?\d{2})\W+(?%s)\W+(?\d{4})./(?\d{2})$" but get an error like this:
Exception in thread "main" java.time.format.DateTimeParseException: Text 17JAN2023/00:00 cannot parse at 0

Fatal Exception: java.lang.NoClassDefFoundError

We get a crash on Android 6 and 7 devices (LG and Samsung so far):

Fatal Exception: java.lang.NoClassDefFoundError: Failed resolution of: Ljava/time/ZoneId;
       at com.github.sisyphsu.dateparser.DateBuilder.<clinit>(DateBuilder.java:21)
       at com.github.sisyphsu.dateparser.DateParser.<init>(DateParser.java:23)
       at com.github.sisyphsu.dateparser.DateParserBuilder.build(DateParserBuilder.java:207)
       at com.github.sisyphsu.dateparser.DateParserUtils.<clinit>(DateParserUtils.java:20)
       at com.github.sisyphsu.dateparser.DateParserUtils.parseDate(DateParserUtils.java:29)

https://developer.android.com/reference/java/time/ZoneId is Android 8+ only so your library doesn't support older OS versions.

Parsing a date with some negative offsets raises an exception.

Steps to reproduce:
Try to parse a date with a negative time zone offset whose minutes are set to 30, e.g. "2020-12-31 01:33-09:30" and "2020-12-31 07:33-03:30". An error is raised: "Zone offset minutes and seconds must be negative because hours is negative".
Please note: parsing works for most time zones (negative and positive). The problem happens only when a negative time zone has non-zero minutes.

UTC-09:30 and UTC-03:30 are real offsets:
https://en.wikipedia.org/wiki/List_of_UTC_time_offsets
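
The quoted message is the one java.time.ZoneOffset raises when hours and minutes carry different signs, for example ZoneOffset.ofHoursMinutes(-9, 30). That suggests, as an assumption rather than something confirmed from the library source, that the sign is applied to the hours only. A sketch of a sign-safe conversion, with a hypothetical class name and regular expression:

```java
import java.time.ZoneOffset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignedOffsetSketch {
    // Accepts forms such as -09:30, -0930, +08 and +5:30.
    private static final Pattern OFFSET = Pattern.compile("([-+])(\\d{1,2}):?(\\d{2})?");

    static ZoneOffset parse(String text) {
        Matcher m = OFFSET.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("bad offset: " + text);
        }
        int sign = m.group(1).equals("-") ? -1 : 1;
        int hours = Integer.parseInt(m.group(2));
        int minutes = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        // Apply the sign to the total, so hours and minutes can never disagree.
        return ZoneOffset.ofTotalSeconds(sign * (hours * 3600 + minutes * 60));
    }

    public static void main(String[] args) {
        System.out.println(parse("-09:30"));
        System.out.println(parse("-03:30"));
    }
}
```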

Wrong selection of a matching rule

Hi, I'm trying to parse dates in a format of month-year. This format without a day is very common for documents like CV. But I found that I cannot add a custom rule e.g. for the following dates:

September 2010
September/2003

DateParser parser = DateParser.newBuilder()
                    .addRule("(?<month>september)\\s{1,4}(?<year>\\d{4})")
                    .addRule("(?<month>\\w+)\\s{1,4}(?<year>\\d{4})")
                    .addRule("(?<month>\\w+)/(?<year>\\d{4})")
                    .build();
Calendar calendar = parser.parseCalendar(date.toLowerCase());

I added custom rules and checked that these must be working fine as a common Regex. But I'm getting an error Text september 2010 cannot parse at 12. The reason is, in the code:

    private void DateParser::parse(final CharArray input) {
        matcher.reset(input);
        int offset = 0;
        int oldEnd = -1;
        while (matcher.find(offset)) {
       // ....
        }
        if (offset != input.length()) {
            throw error(offset);
        }
    }

every time I see matcher.re() is equal to (?<month>september)\W+(?<day>\d{1,2})(?:th)?\W* with offset equal to 12 instead of 14 and, definitely, this doesn't cover whole template.

Is there any way to force the longest match instead of taking the first one? Or to return a set of matches instead of failing outright?
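
To make the request concrete, the sketch below picks the rule with the longest match at a given offset by testing each pattern separately with java.util.regex. It is a hypothetical helper, not how retree or DateParser selects rules, and it gives up the single-pass matching that the merged rules are meant to provide.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchSketch {
    /** Returns the index of the rule with the longest match starting at offset, or -1. */
    static int longest(List<Pattern> rules, CharSequence input, int offset) {
        int best = -1;
        int bestEnd = offset;
        for (int i = 0; i < rules.size(); i++) {
            // Restrict the matcher to the remaining input and anchor at its start.
            Matcher m = rules.get(i).matcher(input).region(offset, input.length());
            if (m.lookingAt() && m.end() > bestEnd) {
                best = i;
                bestEnd = m.end();
            }
        }
        return best;
    }
}
```

With the two rules from this report, the month-day rule stops at offset 12 of "september 2010" while the month-year rule reaches 14, so the second one would win.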

Custom Rule Failure

I have the following code, which I would expect to be able to extract dates like 28X01X2020 (ddXmmXyyyy):

DateParser parser = DateParser.newBuilder()
    .addRule("(?<day>[0-9]{2})X(?<month>[0-9]{2})X(?<year>[0-9]{4})")
    .build();

System.out.println(parser.parseDate("28X01X2020"));

However, on running this, I get the following error:

java.time.format.DateTimeParseException: Text 28X01X2020 cannot parse at 0

I can't see how this differs significantly from the example in the README. Am I doing something wrong, or is this a bug?

Locale support

Currently, localized dates like "23. März 1999" (German for the 23rd of March 1999) aren't detected.

preferMonthFirst is not reset when month is given greater than 12

Hi ,

While using DateParser lib for en_US language, where it considers month first, it works as expected if we provide any values within 12 for month. Example: 09/21/2021. But when we provide greater than 12( example, 17/03/2021), it is behaving a bit different. From my understanding from document, it should not consider a preferMonthFirst if it is greater than 12. So, expected date in this case is March 17, 2021. But it displays, May 3, 2022.

Case 1: prefer date first:
In the below example, when we provided the month greater than 12, it prefers first value as a month. This is expected and correct behaviour

dateParser.setPreferMonthFirst(false);
dateParser.parseDate("12/17/21")
Result: Date@64 "Fri Dec 17 00:00:00 IST 2021"

Case 2: prefer month first:
In the below example, when we provide month greater than 12, it ideally should reset to prefer date first.

dateParser.setPreferMonthFirst(true);
dateParser.parseDate("17/09/2021")
Result: Date@72 "Mon May 09 00:00:00 IST 2022"
Expected Value: Mon Sept 17 00:00:00 IST 2021

Not sure why the behaviour is like this. We would also expect it to parse without considering preferMonthFirst.

Could you please help us here on how we can handle this so that we get date as first prefered only in this edge case?

Unneeded patterns/rules influence the result of the parsing

This might be an issue with retree rather than with the dateparser though.

The following test (which you cannot execute via the public API) fails:

    @Test
    public void parserWithLimitedPatterns(){
        List<String> rules = Arrays.asList(
          "(?<year>\\d{4})\\W{1}(?<month>\\d{1,2})\\W{1}(?<day>\\d{1,2})[^\\d]?",
          "\\W*(?:at )?(?<hour>\\d{1,2}):(?<minute>\\d{1,2})(?::(?<second>\\d{1,2}))?(?:[.,](?<ns>\\d{1,9}))?(?<zero>z)?",
          " ?(?<zoneOffset>[-+]\\d{1,2}:?(?:\\d{2})?)"
        );

        DateParser dateParser = new DateParser(rules, new HashSet<>(rules), Collections.emptyMap(), true, false);
        String input = "2022-08-09 19:04:31.600000+00:00";
        Date date = dateParser.parseDate(input);
        assertEquals(parser.parseDate(input), date);
    }

Note how those 3 rules should be sufficient to parse the date.

  • There is a rule for the year-month-day part
  • There is a rule for the hours:minutes:seconds.ns part
  • There is a rule for the zone offset part

However, during parsing the zoneoffset rule is never used. Instead, it uses the rule for the hours twice.

The weird thing is that when I add a rule that should not be used (" ?(?<year>\d{4})$"), the test suddenly succeeds:

    @Test
    public void parserWithLimitedPatterns(){
        List<String> rules = Arrays.asList(
          "(?<year>\\d{4})\\W{1}(?<month>\\d{1,2})\\W{1}(?<day>\\d{1,2})[^\\d]?",
          " ?(?<year>\\\\d{4})$",
          "\\W*(?:at )?(?<hour>\\d{1,2}):(?<minute>\\d{1,2})(?::(?<second>\\d{1,2}))?(?:[.,](?<ns>\\d{1,9}))?(?<zero>z)?",
          " ?(?<zoneOffset>[-+]\\d{1,2}:?(?:\\d{2})?)"
        );

        DateParser dateParser = new DateParser(rules, new HashSet<>(rules), Collections.emptyMap(), true, false);
        String input = "2022-08-09 19:04:31.600000+00:00";
        Date date = dateParser.parseDate(input);
        assertEquals(parser.parseDate(input), date);
    }

The position where I add that additional rule is important. For example adding it at the end of the list instead of at index 1 makes the test fail again.

I bumped into this issue for PR #28 , where I try to reduce the number of rules that are used for parsing to improve the performance.

Missing date format

Hi,

The following format throws an exception:

2020-30-03T18:28:47.382Z

I tried to add a customized rule:
DateParser parser = DateParser.newBuilder().addRule("(?\d{4})\W{1}(?\d{1,2})\W{1}(?\d{1,2})[^\\d]?(?\d{1,2}):(?\d{1,2})(?:(?\d{1,2}))?(?:.,)?(?z)?").build();

But it didn't work. What am I doing wrong?

I need to support both YYYY-mm-dd and YYYY-dd-mm.

Thank you.

Strange timezone offsets

Hi,

public static void main(String[] args) {
    final DateParser dp = DateParser.newBuilder().build();
    final String date = "2020-06-08T13:45:05-00:00";
    System.out.println(dp.parseDate(date).toString());
    System.out.println(dp.parseDateTime(date).toString());
    System.out.println(dp.parseOffsetDateTime(date).toString());
}

The code above gives me the results:

Mon Jun 08 14:45:05 CEST 2020
2020-06-08T14:45:05
2020-06-08T13:45:05Z

My local time was 13:45:05 I'm using GMT+2.
I expected to get something like 15:45:05 (13:45:05 + 2 hours)
Am I doing something wrong?

Version 1.0.2-1.0.4

Version 1.0.0-1.0.1
Gives me the following:

Mon Jun 08 15:45:05 CEST 2020
2020-06-08T15:45:05
2020-06-08T13:45:05Z

Weird bug - Custom date parser for parsing dates with zero prefixes

For a specific usecase, I needed to parse a date of format yyyy-mm-dd where each component might be prefixed by a zero

I am trying to write a custom parser which is able to parse this

for eg

02022-012-009 should be parsed as 2022-12-09

This is my code


import com.github.sisyphsu.dateparser.DateParser;

public class DateUtilsApplication  {

    public static void main(String[] args) {
        DateParser dateParser = DateParser.newBuilder()
                .addRule("0?(?<year>\\d{4})\\W{1}0?(?<month>\\d{1,2})\\W{1}0?(?<day>\\d{1,2})")
                .build();;

        //example 1  (no zeros)
        System.out.println(dateParser.parseDateTime("2022-12-09").toLocalDate());
        //prints "2022-12-09"

        //example 2 (year, month and date have zero prefix)
        System.out.println(dateParser.parseDateTime("02022-012-009").toLocalDate());
        //prints "2022-12-09"

        //example 3 (month and date have zero prefix)
        System.out.println(dateParser.parseDateTime("2022-012-009").toLocalDate());
        //prints "2022-12-09"

        //example 4 (date has zero prefix)
        System.out.println(dateParser.parseDateTime("2022-12-009").toLocalDate());
        //expected  "2022-12-09", but errors out


    }

}

All examples use the same dateparser but the fourth errors out.
very weird because I have given 0? for all 3 components.

What is the problem here?
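
One way to narrow this down is to run the same expression through java.util.regex on its own. The sketch below does only that, with a hypothetical class name. It shows that the expression accepts all four inputs as a standalone pattern, and says nothing about how the library's merged matcher treats it.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroPrefixRuleCheck {
    // The rule from the report, compiled as a standalone pattern.
    static final Pattern RULE = Pattern.compile(
            "0?(?<year>\\d{4})\\W{1}0?(?<month>\\d{1,2})\\W{1}0?(?<day>\\d{1,2})");

    /** Returns the date as yyyy-MM-dd, or null when the whole input does not match. */
    static String normalize(String input) {
        Matcher m = RULE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return String.format(Locale.ROOT, "%04d-%02d-%02d",
                Integer.parseInt(m.group("year")),
                Integer.parseInt(m.group("month")),
                Integer.parseInt(m.group("day")));
    }

    public static void main(String[] args) {
        System.out.println(normalize("2022-12-009"));
    }
}
```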

Dateparser parsed the date incorrectly

This is string : "Sat, 29 Feb 2020 01:21:19+5:30"

This is the output by dateparser : Sat, 29 Feb 2020 00:00:19 UTC

Expected output : Fri Feb 28 19:51:19 UTC 2020

The code is used :

Date date = DateParserUtils.parseDate("Sat, 29 Feb 2020 01:21:19+5:30");
SimpleDateFormat dateFormat = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z");
dateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
System.out.println("Output: " + dateFormat.format(date));

Date Parsing Problem

Hi,
In Slovenia date format is "d. M. yyyy" (Example: "13. 4. 2022") and the problem is that this dateParser can't parse it:

Exception in thread "main" java.time.format.DateTimeParseException: Text 12. 4. 2022 cannot parse at 0
at com.github.sisyphsu.dateparser.DateParser.error(DateParser.java:401)
at com.github.sisyphsu.dateparser.DateParser.error(DateParser.java:397)
at com.github.sisyphsu.dateparser.DateParser.parse(DateParser.java:131)
at com.github.sisyphsu.dateparser.DateParser.parseDate(DateParser.java:67)
at com.github.sisyphsu.dateparser.DateParserUtils.parseDate(DateParserUtils.java:29)
at si.zzi.eforms.wp.utils.DateJSFConverter.main(DateJSFConverter.java:51)

Can you help? regards

Thank you so much for writing this library!

I was about to write my own "lenient" datetime parser when I stumbled across this project. It works so well, and saved me many hours. Thank you!

(Feel free to close this 🙂 )

ISO formatted string is parsed wrong

    @Test
    public void testISOString() throws ParseException {
        String input = "2016-10-29T09:20:19.000Z";
        Date simpleDateFormatDate = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSX").parse(input);
        Date instantDate = Date.from(Instant.parse(input));
        Date parsedDate = parser.parseDate(input);

        assert Objects.equals(simpleDateFormatDate, instantDate);
        //The following assert fails
        assert Objects.equals(simpleDateFormatDate, parsedDate);
    }

fails.

The input should parse to (as the JDK code does):

Sat Oct 29 11:20:19 CEST 2016

But you get

Sat Oct 29 10:20:19 CEST 2016

The ISO timestamp is expressed in GMT. CEST is Central Europe Summer Time and is GMT+2. So 09:20 in GMT becomes 11:20 CEST. To me it looks like the JDK code is correct, and this parser is 1 hour off.

This was tested with OpenJDK11, with

  • Locale.getDefault(Locale.Category.FORMAT): en_US
  • TimeZone.getDefault(): id="Europe/Brussels"
