Comments (6)

tsgit commented on May 23, 2024

The issue is in Invenio:

https://github.com/inspirehep/invenio/blob/prod/modules/miscutil/lib/sequtils_cnum.py#L77-L82

but arguably a 'start_date' should be a fully qualified date.

from inspire.

annetteholtkamp commented on May 23, 2024

kaplun commented on May 23, 2024

@tsgit is it something you can quickly fix? Otherwise we can close it.

michamos commented on May 23, 2024

the workaround is to have a starting date of the form 2012-05-00 on legacy, so it's not terribly needed. I am closing this.
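
A minimal sketch of that workaround, assuming partial start dates only ever arrive as `YYYY` or `YYYY-MM`. `pad_partial_date` is a hypothetical helper, not something from Invenio, and padding a year-only value to `YYYY-00-00` is an extrapolation of the `2012-05-00` form mentioned above:

```python
import re

# Accepts "YYYY", "YYYY-MM" or "YYYY-MM-DD" (assumed input shapes).
PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def pad_partial_date(value):
    """Pad a partial date with "00" components, e.g. "2012-05" -> "2012-05-00".

    Hypothetical helper for illustration only; a full date is returned unchanged.
    """
    match = PARTIAL_DATE.match(value)
    if match is None:
        raise ValueError("unrecognised date: %r" % (value,))
    year, month, day = match.groups()
    # Missing month/day groups are None, so fall back to the "00" placeholder.
    return "%s-%s-%s" % (year, month or "00", day or "00")


if __name__ == "__main__":
    for raw in ("2012-05", "1974", "2012-05-17"):
        print(raw, "->", pad_partial_date(raw))
```

The padded value still does not name a real calendar day, so this only keeps the string in the three-component shape.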

tsgit commented on May 23, 2024

just looking at the data in Conferences, there is often sufficient info in $$d to construct a full $$x

complete list of short dates in $$x (not matching r'^\d{4}-\d{2}-\d{2}$'):

963698  1974    d:['1974']
963699  1974    d:['1974']
963780  1974-09 d:['Sep 1974']
964810  1981-07 d:['Jul 1981 - Jan 1982']
965286  1968    d:['1968']
965288  1976-08 d:['Aug 1976']
965313  1983-05 d:['May-Sep 1983']
965528  1981    d:['1981']
965529  1983    d:['1983']
966335  1987-04 d:['7-9 Apr 1987']      y:1987-04-09
966465  1980-07 d:['Jul 1980']
966466  1982-07 d:['Jul 1982']
966467  1984-07 d:['Jul 1984']
967146  1987-12 d:['December 1987']
970176  1995    d:['?-? Month 1995']
970766  1997-09 d:['Sep Oct 1997']
970772  1997-10 d:['0-0 Oct 1997']      y:1997-10-00
978206  1965-09 d:['Sep 1965']
979446  2008    d:['8-11 Jun 2008']     y:2008-06-11
979917  1966    d:['1966']
979923  1968-11 d:['Nov 1968']
979927  1969-12 d:['Dec 1969']
980079  1967    d:['9-11 Oct 1967']     y:1967-10-11
980080  1967    d:['21-23 Aug 1967']    y:1967-08-23
980082  1967    d:['11-13 Jul 1967']    y:1967-07-13
980084  1966    d:['27 Jun - 9 Jul 1966']       y:1966-07-09
980086  1963    d:['10 Jun - 19 Jul 1963']      y:1963-07-19
980103  1966    d:['5-10 Sep 1966']     y:1966-09-10
980104  1968    d:['15-18 Jul 1968']    y:1968-07-18
980105  1967    d:['7-19 Aug 1967']     y:1967-08-19
980106  1968    d:['19-25 May 1968']    y:1968-05-25
980116  1967    d:['12-16 Dec 1967']    y:1967-12-16
980118  1968    d:['28-30 Oct 1968']    y:1968-10-30
980123  1969    d:['8-21 Jun 1969']     y:1969-06-21
980124  1965    d:['1965']
980151  1969    d:['24 Feb - 8 Mar 1969']       y:1969-03-08
980162  1969-02 d:['Feb 1969']
980163  1967    d:['7-15 Feb 1967']     y:1967-02-15
980188  1968    d:['10-14 Sep 1968']    y:1968-09-14
980189  1968    d:['11-16 Nov 1968']    y:1968-11-16
980190  1969    d:['31 Aug - 13 Sep 1969']      y:1969-09-13
980195  1968    d:['6-11 Sep 1968']     y:1968-09-11
980196  1969    d:['20 Mar 1969']
980201  1968    d:['12-24 Aug 1968']    y:1968-08-24
980204  1968    d:['17-20 Apr 1968']    y:1968-04-20
980205  1970    d:['12 - 25 Jun 1970']  y:1970-06-25
980223  1969    d:['15 Jul 1969']
980231  1970    d:['30 Mar - 1 Apr 1970']       y:1970-04-01
980232  1969    d:['23-25 Apr 1969']    y:1969-04-25
980233  1969    d:['25 Jun - 1 Jul 1969']       y:1969-07-01
980245  1968    d:['29 Jul - 3 Aug 1968']       y:1968-08-03
980246  1968    d:['17 Jun - 23 Aug 1968']      y:1968-08-23
980272  1969-04 d:['Apr 1969']
1198487 2013    d:['17-20 Dec 2012']    y:2012-12-20
1477158 1999-07
1621863 1983-07 y:1983-07

so the majority of these records could be fixed
T.
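
One way such a fix could be sketched: derive a full opening date from the free-text $$d value whenever it starts with a day number. This is only an illustration under stated assumptions, not the procedure actually used on the records. `opening_date`, `DATE_RANGE` and `MONTHS` are hypothetical names, the regex covers only the shapes seen in the list above (`7-9 Apr 1987`, `27 Jun - 9 Jul 1966`, `20 Mar 1969`), and anything without a usable day (`Sep 1974`, `0-0 Oct 1997`, `?-? Month 1995`) is left alone:

```python
import re
from datetime import date

# Three-letter English month abbreviations mapped to month numbers.
MONTHS = {name: number for number, name in enumerate(
    "jan feb mar apr may jun jul aug sep oct nov dec".split(), 1)}

DATE_RANGE = re.compile(r"""
    ^(?P<day>\d{1,2})                          # first day of the event
    (?:\s+(?P<start_month>[A-Za-z]+)(?=\s*-))? # start month, only if a dash follows
    (?:\s*-\s*\d{1,2})?                        # optional "- last day"
    \s+(?P<end_month>[A-Za-z]+)                # month (of the last day, if a range)
    \s+(?P<year>\d{4})$                        # year
""", re.VERBOSE)


def opening_date(date_string):
    """Return the opening date as 'YYYY-MM-DD', or None if it cannot be derived."""
    match = DATE_RANGE.match(date_string.strip())
    if match is None:
        return None
    month_name = match.group("start_month") or match.group("end_month")
    month = MONTHS.get(month_name[:3].lower())
    if month is None:
        return None
    try:
        # date() rejects impossible values such as day 0.
        return date(int(match.group("year")), month,
                    int(match.group("day"))).isoformat()
    except ValueError:
        return None


if __name__ == "__main__":
    for d in ("7-9 Apr 1987", "27 Jun - 9 Jul 1966", "Sep 1974", "0-0 Oct 1997"):
        print("%-22s %s" % (d, opening_date(d)))
```

Returning `None` instead of guessing keeps records like `Jul 1981 - Jan 1982` for manual review. Note that for record 1198487 this would give 2012-12-17, which disagrees with the year currently in $$x (2013).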

tsgit commented on May 23, 2024

Kirsten improved most of the above; the remaining recs with incomplete dates:

In [5]: datere = re.compile(r'^\d{4}-\d{2}-\d{2}$')

In [6]: for r in get_collection_reclist('Conferences'):
    v = get_fieldvalues(r, '111__x')
    if not datere.match(v[0]):
        d = get_fieldvalues(r, '111__d')
        y = get_fieldvalues(r, '111__y')
        if d and y:
            print "%s\t%s\td:%s\ty:%s" % (r, v[0], d, y[0])
        elif d and not y:
            print "%s\t%s\td:%s" % (r, v[0], d)
        elif y and not d:
            print "%s\t%s\ty:%s" % (r, v[0], y[0])
        else:
            print "%s\t%s" % (r, v[0])
   ...:             
963698  1974    d:['1974']
963699  1974    d:['1974']
963780  1974-09 d:['Sep 1974']
964810  1981-07 d:['Jul 1981 - Jan 1982']
965286  1968    d:['1968']
965288  1976-08 d:['Aug 1976']
965313  1983-05 d:['May-Sep 1983']
965528  1981    d:['1981']
965529  1983    d:['1983']
966465  1980-07 d:['Jul 1980']
966466  1982-07 d:['Jul 1982']
966467  1984-07 d:['Jul 1984']
967146  1987-12 d:['December 1987']
970176  1995    d:['?-? Month 1995']
970766  1997-09 d:['Sep Oct 1997']
970772  1997-10 d:['0-0 Oct 1997']      y:1997-10-00
978206  1965-09 d:['Sep 1965']
979917  1966    d:['1966']
979923  1968-11 d:['Nov 1968']
979927  1969-12 d:['Dec 1969']
980124  1965    d:['1965']
980162  1969-02 d:['Feb 1969']
980272  1969-04 d:['Apr 1969']
1477158 1999-07
1621863 1983-07 y:1983-07
