courtsbr / esaj

Scrapers for many e-SAJ systems

Home Page: http://courtsbr.github.io/esaj/

License: GNU General Public License v2.0

R 100.00%
brazil law scraper captcha

esaj's Introduction

esaj

Overview

The esaj R package is a simple interface for downloading several kinds of files from Brazil's e-SAJ (Electronic Justice Automation System) portals. With this package you can save and parse lawsuits, queries, and decisions with very simple, tidyverse-compliant functions.

To install esaj, run the code below:

# install.packages("devtools")
devtools::install_github("courtsbr/esaj")

Usage

Lawsuits

Before esaj, if you wanted to gather information about lawsuits being processed by Brazil's state-level Judiciary, you had to go to each state's e-SAJ portal, manually input each lawsuit's ID, break a captcha, and only then download an HTML file with the information you wanted. Now you can simply run download_cpopg() or download_cposg() and spend your valuable time analysing the data.

# Download first degree lawsuits from multiple states
ids <- c(
  "0123479-07.2012.8.26.0100",
  "0552486-62.2015.8.05.0001",
  "0303349-44.2014.8.24.0020")
esaj::download_cpopg(ids, "~/Desktop/")
#> [1] "/Users/user/Desktop/01234790720128260100.html"
#> [2] "/Users/user/Desktop/05524866220158050001.html"
#> [3] "/Users/user/Desktop/03033494420148240020.html"

# Download second degree lawsuits from São Paulo
ids <- c(
  "1001869-51.2017.8.26.0562",
  "1001214-07.2016.8.26.0565")
esaj::download_cposg(ids, "~/Desktop/")
#> [1] "/Users/user/Desktop/10018695120178260562.html"
#> [2] "/Users/user/Desktop/10012140720168260565.html"

For more information on how to use these functions and which TJs are implemented, please see Downloading Lawsuits.
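
The first-degree output paths above suggest that each file is named after the lawsuit ID with its punctuation removed. The sketch below reproduces that naming rule in base R and also reads the court out of each ID, assuming the IDs follow the unified CNJ layout NNNNNNN-DD.AAAA.J.TR.OOOO. The helpers id_to_filename() and id_court() are hypothetical names used only for illustration; they are not part of esaj, and the lookup table covers only the three courts that appear in the example.

```r
# Hypothetical helpers for illustration; not part of esaj's API.

# Strip every non-digit character and append the extension,
# mirroring the file names shown in the example output.
id_to_filename <- function(ids, ext = ".html") {
  paste0(gsub("[^0-9]", "", ids), ext)
}

# Read the "J.TR" segment (justice branch and court code) of each ID
# and map it to a court. Assumes the NNNNNNN-DD.AAAA.J.TR.OOOO layout;
# the table below only covers the courts used in this example.
id_court <- function(ids) {
  pattern <- "^[0-9]{7}-[0-9]{2}\\.[0-9]{4}\\.([0-9]\\.[0-9]{2})\\.[0-9]{4}$"
  segment <- sub(pattern, "\\1", ids)
  courts <- c("8.26" = "TJSP", "8.05" = "TJBA", "8.24" = "TJSC")
  unname(courts[segment])
}

ids <- c(
  "0123479-07.2012.8.26.0100",
  "0552486-62.2015.8.05.0001",
  "0303349-44.2014.8.24.0020")

id_to_filename(ids)
id_court(ids)
```

Grouping IDs by court before downloading can make it easier to tell which portal a failure belongs to.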

Queries

Besides downloading lawsuits (see the Downloading Lawsuits article), esaj also allows the user to download the results of a query on lawsuits. This kind of query is very useful for finding out what lawsuits contain certain words, were filed in a given period, were filed in a given court, etc.

# Download results of a simple first degree query
esaj::download_cjpg("recurso", "~/Desktop/")
#> [1] "/Users/user/Desktop/search.html"
#> [2] "/Users/user/Desktop/page1.html"

# Download results of a slightly more complex second degree query
esaj::download_cjsg("recurso", "~/Desktop/", classes = c("1231", "1232"))
#> [1] "/Users/user/Desktop/search.html"
#> [2] "/Users/user/Desktop/page1.html"

For more information on how to use these functions and all their auxiliary methods (like peek_cj*g() and cj*g_table()), please see Downloading Queries.

Decisions

Of all functions in the esaj package, download_decision() is probably the simplest: it downloads the PDF belonging to a decision and that's it.

# Download one decision
esaj::download_decision("10000034", "~/Desktop/")
#> [1] "/Users/user/Desktop/10000034.pdf"

# Download more than one decision
esaj::download_decision(c("10800758", "10000034"), "~/Desktop/")
#> [1] "/Users/user/Desktop/10800758.pdf"
#> [2] "/Users/user/Desktop/10000034.pdf"

For more information on how to use this function, please see Downloading Decisions.

esaj's People

Contributors

azeloc, clente, jtrecenti, mifarhat

esaj's Issues

Installation fails on R 3.4.4 on Windows

I ran devtools::install_github("courtsbr/esaj") and got the following error message:

  • installing source package 'esaj' ...
    Warning in .write_description(db, file.path(outDir, "DESCRIPTION")) :
    Unknown encoding with non-ASCII data: converting to ASCII
    Error in iconv(x[ind], "latin1", "ASCII", sub = "byte") :
    objeto 'ind' não encontrado
    ERROR: installing package DESCRIPTION failed for package 'esaj'
  • removing 'C:/Users/mgaldino/Documents/R/win-library/3.4/esaj'
    In R CMD INSTALL

My session info:
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

download_2deg_lawsuit should be aware of list pages

Parsers won't work as intended on second degree docket numbers, because sometimes one number defines two lawsuits.

library(magrittr)  # provides %>%

wd <- tempdir()
downloaded_file <- esaj::download_2deg_lawsuit(id = '00000144420138260352', path = wd)

# Build a parser that extracts the lawsuit's data
parser <- esaj::make_parser() %>%
  esaj::parse_data()

esaj::run_parser(downloaded_file, parser, path = wd)

https://esaj.tjsp.jus.br/cposg/search.do?conversationId=&paginaConsulta=1&localPesquisa.cdLocal=-1&cbPesquisa=NUMPROC&tipoNuProcesso=SAJ&numeroDigitoAnoUnificado=&foroNumeroUnificado=&dePesquisaNuUnificado=&dePesquisa=00000144420138260352&uuidCaptcha=&pbEnviar=Pesquisar

download_2deg_lawsuit should download both lawsuits on such cases.

Captcha

Hey folks. The program is failing to break the captcha on the TJSP site. Have you made any improvements yet, or is this the latest version?

download_cpopg does not work for SC lawsuits

Hello!

First of all, thank you very much for developing this library!

I am running some tests and noticed that, for SC lawsuits, the "download_cpopg" function downloads an HTML file containing an error message (see https://ibb.co/t46gT8j).

I believe Google's (re)captcha was not properly "broken".

If you need any help, just let me know!

Problems with esaj::parse_cjsg()

Zip file with a reproducible example (Issues.zip)

I used esaj::download_cjsg to download class "279" (inquérito policial, i.e. police inquiry) and listed the downloaded files in order to parse them with esaj::parse_cjsg. However, only some of the HTML files were parsed: of the 52 downloaded files, only 23 were parsed successfully (for the rest, the "result" column equals "error").

esaj::download_cjsg(query = "",classes = class_inq$id5, 
                    path = "InqPolicial_AL-SP/data_raw_SP/cjsg_SP/", 
                    max_page = Inf, tj = "TJSP")

# The files are attached to the issue
files <- dir("/cjsg_SP/", full.names = TRUE, pattern = "page")

d_cjsg <- esaj::parse_cjsg(files)

I tried running the functions called inside esaj::parse_cjsg separately, among them esaj:::parse_cjsg_one and esaj:::parse_cjsg_file. The first one always returned the following error:

# files[2] is "page40.html". The same problem occurs with "page1.html", even though esaj::parse_cjsg returns perfectly structured data for that file.
esaj:::parse_cjsg_one(xml2::read_html(files[2], encoding = "UTF-8") %>%
                                  rvest::html_nodes(".fundocinza1"))
# Erro: Duplicate identifiers for rows (1, 8, 16, 24, 32, 40, 47, 54, 61, 68, 75, 82, 89, 96, 103, 110, 117, 125, 133, 141), (3, 10, 18, 26, 34, 42, 49, 56, 63, 70, 77, 84, 91, 98, 105, 112, 119, 127, 135, 143), (5, 12, 20, 28, 36, 44, 51, 58, 65, 72, 79, 86, 93, 100, 107, 114, 121, 129, 137, 145), (6, 13, 21, 29, 37, 45, 52, 59, 66, 73, 80, 87, 94, 101, 108, 115, 122, 130, 138, 146), (15, 23, 31, 39, 124, 132, 140), (4, 11, 19, 27, 35, 43, 50, 57, 64, 71, 78, 85, 92, 99, 106, 113, 120, 128, 136, 144), (7, 14, 22, 30, 38, 46, 53, 60, 67, 74, 81, 88, 95, 102, 109, 116, 123, 131, 139, 147), (2, 9, 17, 25, 33, 41, 48, 55, 62, 69, 76, 83, 90, 97, 104, 111, 118, 126, 134, 142)

The second function returns a tibble with a single column (result) whose observations are all "error".

# files[2] is "page40.html". The same problem does NOT occur with "page1.html"
esaj:::parse_cjsg_file(file = files[2])
# # A tibble: 20 x 1
#    result
#    <chr> 
#  1 error 
#  2 error 
#  3 error 
#  4 error 
#  5 error 
#  6 error 
#  7 error
# ...
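
When a batch parser only reports "error", it can help to capture the underlying message for each file. The following is a minimal base R sketch of that idea, not esaj's implementation: try_parse() and toy_parser() are hypothetical names, and toy_parser() merely stands in for a real parsing function so that the example runs without network access.

```r
# Hypothetical helper: run `parser` on each input and record the error
# message for failures instead of stopping at the first one.
try_parse <- function(inputs, parser) {
  out <- lapply(inputs, function(x) {
    tryCatch(
      {
        parser(x)
        list(input = x, ok = TRUE, message = NA_character_)
      },
      error = function(e) {
        list(input = x, ok = FALSE, message = conditionMessage(e))
      }
    )
  })
  data.frame(
    input = vapply(out, function(r) r$input, character(1)),
    ok = vapply(out, function(r) r$ok, logical(1)),
    message = vapply(out, function(r) r$message, character(1)),
    stringsAsFactors = FALSE
  )
}

# Stand-in parser that fails on one file, for illustration only.
toy_parser <- function(x) {
  if (grepl("page40", x)) stop("Duplicate identifiers for rows")
  x
}

report <- try_parse(c("page1.html", "page40.html"), toy_parser)
report[!report$ok, ]
```

The rows with ok equal to FALSE list the failing files together with the message raised for each one.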

Note: this is my first issue on Git.

Major API change proposal

CJPG tables and files

  • cjpg_table()
  • browse_table()
  • cjpg() -> download_cjpg()
  • cjpg_npags()
  • cjpg_parms()
  • cjpg_session()
  • peek_cjpg()

CJSG tables and files

  • cjsg_table()
  • browse_cjsg() -> browse_table()
  • download_cjsg()
  • peek_cjsg()

Decisions

  • download_decision()

CPOPG and CPOSG

  • download_lawsuit()
  • download_2deg_lawsuit() -> download_lawsuit(degree = 2)

Parsers

  • make_parser() make_parser(type = "cposg")
  • parse_data()
  • parse_movs()
  • parse_parts()
  • run_parser()
  • parse_cjpg() make_parser(type = "cjpg")
  • parse_cjsg() make_parser(type = "cjsg")
  • parse_cpopg() make_parser(type = "cpopg")

peek_cjsg() returns error when there are 0 results

Example

n_pags <- esaj::peek_cjsg(query = "", subjects = "14")  
#> There are 169 pages to download
#> This should take around 1.4 minutes
n_pags                                                  
#> [1] 167
                                                        
n_pags <- esaj::peek_cjsg(query = "", subjects = "5944")
#> Error in ceiling(.): non-numeric argument to mathematical function

I think it should show a message and return 0L.

Also, the result is always of type numeric; it should be an integer.
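
The behaviour requested here can be sketched in base R as follows. This is only an illustration of the proposal, not the package's code: count_pages() is a hypothetical helper, and the default of 20 results per page is an assumption.

```r
# Hypothetical helper: turn a result count into an integer page count.
# `per_page = 20L` is an assumption, not a value confirmed by esaj.
count_pages <- function(n_results, per_page = 20L) {
  n <- suppressWarnings(as.integer(n_results))
  # A missing, empty, or non-positive count means there is nothing to fetch
  if (length(n) == 0L || is.na(n) || n <= 0L) {
    message("There are no pages to download")
    return(0L)
  }
  as.integer(ceiling(n / per_page))
}

count_pages(45)            # 45 results at 20 per page
count_pages(character(0))  # count missing from the page
count_pages(NA)            # count could not be parsed
```

Returning 0L rather than raising an error lets callers loop over many queries without wrapping every call in error handling.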

Low success rate when using download_decision()

I ran this with only 200 decisions, but in my original code I end up using about 3000.

The issue is that the download success rate has dropped a lot since the last time I used the package. Before, I would guess the rate was above 50%; now I believe it has fallen below 1%.

The code I used for testing is below. The result was 4 decisions downloaded out of 200; any suggestion for changing the code is welcome:

teste <- c("11617159" ,"11616826", "11614197", "11613204", "11612465", "11612382",
"11610790", "11596184", "11585837", "11579077", "11579466", "11577358", "11572937",
"11573007", "11574174", "11568555", "11565531", "11564407", "11556810", "11549104",
"11545315", "11537176", "11537950", "11537951", "11526058", "11516110", "11511079", 
"11513449", "11508610", "11508568", "11503389", "11498955", "11489817", "11485411",
"11479724", "11480155", "11460948", "11462275", "11448583","11445232", "11441593",
"11440951", "11438440", "11436243", "11423825", "11421460", "11416047", "11409629",
"11403745", "11403744", "11406635", "11403663", "11403377", "11403630", "11398831",
"11393143", "11385625", "11381633", "11379419", "11378051", "11369112", "11365960",
"11367136", "11365345", "11365327", "11359719", "11360049", "11357297", "11353364",
"11353520", "11354949", "11343282", "11340297", "11340155", "11334865", "11329995", 
"11327349", "11323885", "11323884", "11317873", "11317179", "11315337", "11315241", 
"11315030", "11311922", "11310228", "11309236", "11308380", "11297992", "11299057", 
"11297076", "11296682", "11295834", "11295887", "11293266", "11291271", "11290014", 
"11288893", "11286811", "11281715", "11275175", "11273259", "11266539", "11263203", 
"11261741", "11260513", "11257231", "11251051", "11243957", "11242693", "11226301", 
"11227568", "11222617", "11218730", "11214487", "11209927", "11208034", "11203952", 
"11196930", "11200642", "11189730", "11194175", "11191469", "11191379", "11186632", 
"11183063", "11179295", "11182386", "11180725", "11180854", "11179799", "11180913", 
"11172762", "11165020", "11162177", "11158851", "11157543", "11163216", "11155797", 
"11151707", "11150154", "11133850", "11131344", "11129935", "11125810", "11120858", 
"11113894", "11110115", "11107224", "11101203", "11098796", "11087477", "11087782", 
"11091022", "11094604", "11094684", "11085162", "11076721", "11076080", "11078846", 
"11080229", "11073569", "11065171", "11065617", "11062180", "11058370", "11060600", 
"11058198", "11058378", "11053116", "11050031", "11045719", "11038303", "11040633", 
"11041094", "11040996", "11038209", "11033692", "11033691", "11034738", "11033245", 
"11028949", "11031748", "11030768", "11028519", "11031738", "11019675", "11022262", 
"11025347", "11014711", "11012362", "11014089", "11016115", "11013972", "11006254", 
"11006467", "11007225", "10996923", "10998816", "10999542")
library(esaj)
esaj::download_decision(teste, "./decisoes/")
#>   [1] ""                                     
#>   [2] "/tmp/Rtmp4es8XU/decisoes/11616826.pdf"
#>   [3] "/tmp/Rtmp4es8XU/decisoes/11614197.pdf"
#>   [4] "/tmp/Rtmp4es8XU/decisoes/11613204.pdf"
#>   [5] "/tmp/Rtmp4es8XU/decisoes/11612465.pdf"
#>   [6] ""                                     
#>   [7] ""                                     
#>   [8] ""                                     
#>   [9] ""                                     
#>  [10] ""                                     
#>  [11] ""                                     
#>  [12] ""                                     
#>  [13] ""                                     
#>  [14] ""                                     
#>  [15] ""                                     
#>  [16] ""                                     
#>  [17] ""                                     
#>  [18] ""                                     
#>  [19] ""                                     
#>  [20] ""                                     
#>  [21] ""                                     
#>  [22] ""                                     
#>  [23] ""                                     
#>  [24] ""                                     
#>  [25] ""                                     
#>  [26] ""                                     
#>  [27] ""                                     
#>  [28] ""                                     
#>  [29] ""                                     
#>  [30] ""                                     
#>  [31] ""                                     
#>  [32] ""                                     
#>  [33] ""                                     
#>  [34] ""                                     
#>  [35] ""                                     
#>  [36] ""                                     
#>  [37] ""                                     
#>  [38] ""                                     
#>  [39] ""                                     
#>  [40] ""                                     
#>  [41] ""                                     
#>  [42] ""                                     
#>  [43] ""                                     
#>  [44] ""                                     
#>  [45] ""                                     
#>  [46] ""                                     
#>  [47] ""                                     
#>  [48] ""                                     
#>  [49] ""                                     
#>  [50] ""                                     
#>  [51] ""                                     
#>  [52] ""                                     
#>  [53] ""                                     
#>  [54] ""                                     
#>  [55] ""                                     
#>  [56] ""                                     
#>  [57] ""                                     
#>  [58] ""                                     
#>  [59] ""                                     
#>  [60] ""                                     
#>  [61] ""                                     
#>  [62] ""                                     
#>  [63] ""                                     
#>  [64] ""                                     
#>  [65] ""                                     
#>  [66] ""                                     
#>  [67] ""                                     
#>  [68] ""                                     
#>  [69] ""                                     
#>  [70] ""                                     
#>  [71] ""                                     
#>  [72] ""                                     
#>  [73] ""                                     
#>  [74] ""                                     
#>  [75] ""                                     
#>  [76] ""                                     
#>  [77] ""                                     
#>  [78] ""                                     
#>  [79] ""                                     
#>  [80] ""                                     
#>  [81] ""                                     
#>  [82] ""                                     
#>  [83] ""                                     
#>  [84] ""                                     
#>  [85] ""                                     
#>  [86] ""                                     
#>  [87] ""                                     
#>  [88] ""                                     
#>  [89] ""                                     
#>  [90] ""                                     
#>  [91] ""                                     
#>  [92] ""                                     
#>  [93] ""                                     
#>  [94] ""                                     
#>  [95] ""                                     
#>  [96] ""                                     
#>  [97] ""                                     
#>  [98] ""                                     
#>  [99] ""                                     
#> [100] ""                                     
#> [101] ""                                     
#> [102] ""                                     
#> [103] ""                                     
#> [104] ""                                     
#> [105] ""                                     
#> [106] ""                                     
#> [107] ""                                     
#> [108] ""                                     
#> [109] ""                                     
#> [110] ""                                     
#> [111] ""                                     
#> [112] ""                                     
#> [113] ""                                     
#> [114] ""                                     
#> [115] ""                                     
#> [116] ""                                     
#> [117] ""                                     
#> [118] ""                                     
#> [119] ""                                     
#> [120] ""                                     
#> [121] ""                                     
#> [122] ""                                     
#> [123] ""                                     
#> [124] ""                                     
#> [125] ""                                     
#> [126] ""                                     
#> [127] ""                                     
#> [128] ""                                     
#> [129] ""                                     
#> [130] ""                                     
#> [131] ""                                     
#> [132] ""                                     
#> [133] ""                                     
#> [134] ""                                     
#> [135] ""                                     
#> [136] ""                                     
#> [137] ""                                     
#> [138] ""                                     
#> [139] ""                                     
#> [140] ""                                     
#> [141] ""                                     
#> [142] ""                                     
#> [143] ""                                     
#> [144] ""                                     
#> [145] ""                                     
#> [146] ""                                     
#> [147] ""                                     
#> [148] ""                                     
#> [149] ""                                     
#> [150] ""                                     
#> [151] ""                                     
#> [152] ""                                     
#> [153] ""                                     
#> [154] ""                                     
#> [155] ""                                     
#> [156] ""                                     
#> [157] ""                                     
#> [158] ""                                     
#> [159] ""                                     
#> [160] ""                                     
#> [161] ""                                     
#> [162] ""                                     
#> [163] ""                                     
#> [164] ""                                     
#> [165] ""                                     
#> [166] ""                                     
#> [167] ""                                     
#> [168] ""                                     
#> [169] ""                                     
#> [170] ""                                     
#> [171] ""                                     
#> [172] ""                                     
#> [173] ""                                     
#> [174] ""                                     
#> [175] ""                                     
#> [176] ""                                     
#> [177] ""                                     
#> [178] ""                                     
#> [179] ""                                     
#> [180] ""                                     
#> [181] ""                                     
#> [182] ""                                     
#> [183] ""                                     
#> [184] ""                                     
#> [185] ""                                     
#> [186] ""                                     
#> [187] ""                                     
#> [188] ""                                     
#> [189] ""                                     
#> [190] ""                                     
#> [191] ""                                     
#> [192] ""                                     
#> [193] ""                                     
#> [194] ""                                     
#> [195] ""                                     
#> [196] ""                                     
#> [197] ""                                     
#> [198] ""                                     
#> [199] ""                                     
#> [200] ""

Created on 2018-07-17 by the reprex package (v0.2.0).
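
Since the output above contains an empty string for every ID that was not saved, the success rate and the list of IDs to retry can be computed directly from the returned vector. The sketch below assumes that convention and uses only the first six IDs and results from the report as stand-in data, so it runs without calling esaj.

```r
# Stand-in data: the first six IDs and the matching results shown above.
# Empty strings are assumed to mark IDs whose PDF was not saved.
ids <- c("11617159", "11616826", "11614197",
         "11613204", "11612465", "11612382")
paths <- c("",
           "/tmp/Rtmp4es8XU/decisoes/11616826.pdf",
           "/tmp/Rtmp4es8XU/decisoes/11614197.pdf",
           "/tmp/Rtmp4es8XU/decisoes/11613204.pdf",
           "/tmp/Rtmp4es8XU/decisoes/11612465.pdf",
           "")

ok <- nzchar(paths)       # TRUE where a file path was returned
success_rate <- mean(ok)  # share of IDs that produced a file
failed_ids <- ids[!ok]    # candidates for a retry

success_rate
failed_ids
```

Passing failed_ids back to the download function is one way to retry only the missing decisions.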

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  C                           
#>  tz       America/Sao_Paulo           
#>  date     2018-07-17
#> Packages -----------------------------------------------------------------
#>  package     * version    date       source                        
#>  assertthat    0.2.0      2017-04-11 cran (@0.2.0)                 
#>  backports     1.1.2      2017-12-13 CRAN (R 3.4.4)                
#>  base        * 3.4.4      2018-04-21 local                         
#>  base64enc     0.1-3      2015-07-28 cran (@0.1-3)                 
#>  bindr         0.1.1      2018-03-13 CRAN (R 3.4.4)                
#>  bindrcpp    * 0.2.2      2018-03-29 CRAN (R 3.4.4)                
#>  compiler      3.4.4      2018-04-21 local                         
#>  crayon        1.3.4      2017-09-16 cran (@1.3.4)                 
#>  curl          3.2        2018-03-28 CRAN (R 3.4.4)                
#>  datasets    * 3.4.4      2018-04-21 local                         
#>  devtools      1.13.6     2018-06-27 CRAN (R 3.4.4)                
#>  digest        0.6.15     2018-01-28 CRAN (R 3.4.2)                
#>  dplyr         0.7.6      2018-06-29 CRAN (R 3.4.4)                
#>  esaj        * 0.1.2.9000 2018-07-16 Github (courtsbr/esaj@2fc11fe)
#>  evaluate      0.10.1     2017-06-24 cran (@0.10.1)                
#>  glue          1.2.0      2017-10-29 cran (@1.2.0)                 
#>  graphics    * 3.4.4      2018-04-21 local                         
#>  grDevices   * 3.4.4      2018-04-21 local                         
#>  hms           0.4.2      2018-03-10 CRAN (R 3.4.4)                
#>  htmltools     0.3.6      2017-04-28 CRAN (R 3.4.4)                
#>  httr          1.3.1      2017-08-20 CRAN (R 3.4.2)                
#>  jsonlite      1.5        2017-06-01 CRAN (R 3.4.2)                
#>  knitr         1.20       2018-02-20 CRAN (R 3.4.4)                
#>  lubridate     1.7.4      2018-04-11 CRAN (R 3.4.4)                
#>  magick        1.9        2018-05-11 CRAN (R 3.4.4)                
#>  magrittr      1.5        2014-11-22 cran (@1.5)                   
#>  memoise       1.1.0      2017-04-21 CRAN (R 3.4.2)                
#>  methods     * 3.4.4      2018-04-21 local                         
#>  pillar        1.3.0      2018-07-14 CRAN (R 3.4.4)                
#>  pkgconfig     2.0.1      2017-03-21 cran (@2.0.1)                 
#>  png           0.1-7      2013-12-03 cran (@0.1-7)                 
#>  prettyunits   1.0.2      2015-07-13 cran (@1.0.2)                 
#>  progress      1.2.0      2018-06-14 CRAN (R 3.4.4)                
#>  purrr         0.2.5      2018-05-29 CRAN (R 3.4.4)                
#>  R6            2.2.2      2017-06-17 CRAN (R 3.4.2)                
#>  rappdirs      0.3.1      2016-03-28 cran (@0.3.1)                 
#>  Rcpp          0.12.17    2018-05-18 CRAN (R 3.4.4)                
#>  rlang         0.2.1      2018-05-30 CRAN (R 3.4.4)                
#>  rmarkdown     1.10       2018-06-11 CRAN (R 3.4.4)                
#>  rprojroot     1.3-2      2018-01-03 CRAN (R 3.4.4)                
#>  stats       * 3.4.4      2018-04-21 local                         
#>  stringi       1.2.3      2018-06-12 CRAN (R 3.4.4)                
#>  stringr       1.3.1      2018-05-10 CRAN (R 3.4.4)                
#>  tesseract     2.2        2018-07-10 CRAN (R 3.4.4)                
#>  tibble        1.4.2      2018-01-22 cran (@1.4.2)                 
#>  tidyr         0.8.1      2018-05-18 CRAN (R 3.4.4)                
#>  tidyselect    0.2.4      2018-02-26 CRAN (R 3.4.4)                
#>  tools         3.4.4      2018-04-21 local                         
#>  utils       * 3.4.4      2018-04-21 local                         
#>  withr         2.1.2      2018-03-15 CRAN (R 3.4.4)                
#>  yaml          2.1.19     2018-05-01 CRAN (R 3.4.4)

download_decision() does not work

Hello.

First of all, I think this error is related to #25.

I cannot get download_decision() to work. It does not download anything.

I tested the following examples from the home page, with no luck:

esaj::download_decision("10000034", "~/Desktop/")
esaj::download_decision(c("10800758", "10000034"), "~/Desktop/")

I also tried another arbitrary ID that I had picked for testing, and it did not work either.

download_decision("11616826",  "~/Desktop/")

Finally, I ran the test that @lagolucas posted, only replacing "esaj:::download_decision_" with "esaj:::download_decision_" so that it would accept the parameter ntry=1, and it did not download any of the 200 IDs listed.

Any suggestions? Thank you!

Strange results in R CMD check

In merge f2ebd0a I did what I could so that devtools::check() would return no warnings or notes. I managed to get rid of almost every problem, except two:

# R CMD check results
# 0 errors | 1 warning  | 1 note 
# checking R files for non-ASCII characters ... WARNING
# Found the following files with non-ASCII characters:
# cpopg-tjba.R
# cpopg-tjsc.R
# cposg-tjsc.R
# dje.R
# parse-cpopg.R
# parse-cposg.R
# Portable packages must use only ASCII characters in their R code,
# except perhaps in comments.
# Use \uxxxx escapes for other characters.
# 
# checking R code for possible problems ... NOTE
# set_values2: no visible global function definition for ‘%||%’
# Undefined global functions or variables:
#   %||%

If anyone has a suggestion, please comment below.
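
One possible way to address both findings, sketched in base R under stated assumptions: the warning asks for \uxxxx escapes in place of non-ASCII characters, and the note goes away once %||% is defined or imported inside the package. escape_non_ascii() is a hypothetical helper, not something esaj provides, and it only handles code points up to U+FFFF. The local definition of %||% follows the usual "default for NULL" semantics.

```r
# Hypothetical helper: rewrite non-ASCII characters as \uxxxx escapes so the
# result can be pasted into R source files. Handles code points up to U+FFFF.
escape_non_ascii <- function(x) {
  vapply(x, function(s) {
    cps <- utf8ToInt(s)
    chars <- ifelse(cps > 127L,
                    sprintf("\\u%04x", cps),
                    intToUtf8(cps, multiple = TRUE))
    paste0(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# Local definition of the "default for NULL" operator, so that the package
# does not rely on an undefined global function.
`%||%` <- function(x, y) if (is.null(x)) y else x

escape_non_ascii("S\u00e3o Paulo")
NULL %||% "fallback"
```

tools::showNonASCIIfile() can be used to locate the offending lines in each listed file before escaping them.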

Beginner question about the R language

I am having trouble installing the dependencies needed to run the project.

  • installing to library ‘/Library/Frameworks/R.framework/Versions/3.3/Resources/library’
    ERROR: dependencies ‘captchaTJSC’, ‘captchasaj’, ‘multidplyr’ are not available for package ‘esaj’
  • removing ‘/Library/Frameworks/R.framework/Versions/3.3/Resources/library/esaj’

install.packages("captchasaj")
Warning in install.packages :
package ‘captchasaj’ is not available (for R version 3.3.2)

Can anyone help me?

Get decision from id

Hello,

Is there a way to get the decision of a lawsuit using its ID as input?

I would like to download a set of .pdfs and for that I should use esaj::download_decision, right?
But I only have their ids and not their decisions.

Thanks!

"Error: object 'orgao_julgador' not found" na função "parse_cjsg()"

Hello, how are you?

The package has a function called "parse_cjsg()" that can turn the HTML from lawsuit searches into a data frame. However, every time I call this function it fails with the following error: "Error: object 'orgao_julgador' not found".

I uploaded the HTML files produced by the search I ran.

error 'orgao julgador'.zip

As for the code I used, it follows below.

Your package is great and makes life much easier; any suggestion from you would already help a lot.
Thanks.

TESTE_2013_jan <- esaj::download_cjsg(query='coletivo OU empresarial OU adesão',
                             subjects=c('6233','10000629','10000983','10001132','10001318'),
                             registration_start="2013-01-01",
                             registration_end="2013-01-31",
                             tj="tjsp",max_page=11)

files_2013_jan <- as.vector(fs::dir_ls(regexp="page"))

df_files_2013_jan <- esaj::parse_cjsg(files_2013_jan)
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- Error: object 'orgao_julgador' not found
- [... the same error line repeats between progress updates ...]
- [============>---------------------------------------------------------]  18%Error: object 'orgao_julgador' not found
- [==================>---------------------------------------------------]  27%Error: object 'orgao_julgador' not found
- [========================>---------------------------------------------]  36%Error: object 'orgao_julgador' not found
- [===============================>--------------------------------------]  45%Error: object 'orgao_julgador' not found
- [=====================================>--------------------------------]  55%Error: object 'orgao_julgador' not found
- [============================================>-------------------------]  64%Error: object 'orgao_julgador' not found
- [==================================================>-------------------]  73%Error: object 'orgao_julgador' not found
- [========================================================>-------------]  82%Error: object 'orgao_julgador' not found
- [===============================================================>------]  91%Error: object 'orgao_julgador' not found

download_cjsg() with max_page=Inf not working

> esaj::download_cjsg("trecenti", max_page = Inf)
Error in cjsg_npags(dirname(file)) : could not find function "cjsg_npags"

In fact, esaj:::cjsg_npags does not exist. I think it was deleted in an earlier commit.
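
One way to confirm that an internal helper such as cjsg_npags is missing from an installed package is to look it up in the package namespace. The sketch below uses only base R; has_internal() is a hypothetical helper name (not part of esaj), and it is demonstrated on the stats package so that it runs without esaj installed.

```r
# Hypothetical helper: report whether `name` is a function defined in the
# namespace of package `pkg` (exported or internal).
# Returns NA when the package is not installed.
has_internal <- function(pkg, name) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  exists(name, envir = asNamespace(pkg), mode = "function", inherits = FALSE)
}

has_internal("stats", "median")
#> [1] TRUE
has_internal("stats", "cjsg_npags")
#> [1] FALSE
```

With esaj installed, has_internal("esaj", "cjsg_npags") would be expected to return FALSE if the function really was removed.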

translating error

(I had to zip the HTML file in order to upload it.)

page40.html.zip

library(esaj)
parse_cjsg("page40.html")
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
Error: object 'data_julgamento' not found
# A tibble: 0 x 1
# ... with 1 variable: file <chr>

parse_movs problem

parse_movs() for CPOSG does not work for files like the one attached:

20180731_00000013518858260642.html.zip

library(magrittr)
# devtools::install_github('courtsbr/esaj')
arq <- "~/Downloads/20180731_00000013518858260642.html"
parser <- esaj::make_parser("cposg") %>% esaj::parse_movs()
res <- arq %>% esaj::run_parser(parser)
#> Error: html_name(x) == "table" is not TRUE

This correction works for me:

library(magrittr)
# devtools::install_github("courtsbr/esaj")
parse_movs2 <- function(parser) {
  stopifnot("parser" %in% class(parser))
  get_movs <- function(html) {
    xp0 <- "//*[@id='tabelaTodasMovimentacoes']"
    tab <- xml2::xml_find_all(html, paste0(xp0, "//parent::table"))
    # if (length(tab) == 0) tab <- xml2::xml_find_all(html, xp0)
    tab %>% 
      rvest::html_table(fill = TRUE) %>% 
      purrr::pluck(1) %>% 
      janitor::clean_names() %>% 
      dplyr::as_tibble() %>%
      dplyr::select(movement = data, X3 = movimento) %>% 
      dplyr::filter(movement != "") %>% 
      tidyr::separate(X3, c("title", "txt"), sep = "\n\t",
                      extra = "merge", fill = "right") %>% 
      dplyr::mutate_all(stringr::str_squish) %>% 
      dplyr::mutate(movement = lubridate::dmy(movement, quiet = TRUE))
  }
  purrr::list_merge(parser, name = "movs", getter = get_movs)
}
arq <- "~/Downloads/20180731_00000013518858260642.html"
parser <- esaj::make_parser("cposg") %>% 
  parse_movs2()
res <- arq %>% esaj::run_parser(parser)
res
#> # A tibble: 1 x 4
#>   id                   file                            hidden movs        
#>   <chr>                <chr>                           <lgl>  <list>      
#> 1 00000013518858260642 ~/Downloads/20180731_000000135… FALSE  <tibble [41…

I think it should be tested with more files.
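
Testing against more files could be done by running the parser over a batch and recording which files fail instead of stopping at the first error. The following is a minimal base R sketch under that assumption; try_parse_files() and toy_parser() are hypothetical names introduced for illustration, and the toy parser stands in for a real one so the example runs on its own.

```r
# Hypothetical helper: apply `parse_fun` to every file, catching errors so
# that one bad file does not abort the whole batch.
try_parse_files <- function(files, parse_fun) {
  results <- lapply(files, function(f) {
    tryCatch(
      {
        parse_fun(f)
        list(file = f, ok = TRUE, error = NA_character_)
      },
      error = function(e) {
        list(file = f, ok = FALSE, error = conditionMessage(e))
      }
    )
  })
  data.frame(
    file = vapply(results, function(r) r$file, character(1)),
    ok = vapply(results, function(r) r$ok, logical(1)),
    error = vapply(results, function(r) r$error, character(1)),
    stringsAsFactors = FALSE
  )
}

# Stand-in parser for the demonstration: fails when the file is missing.
toy_parser <- function(f) {
  if (!file.exists(f)) stop("file not found")
  length(readLines(f))
}

good <- tempfile(fileext = ".html")
writeLines(c("<html>", "</html>"), good)
missing <- tempfile(fileext = ".html")  # never created

report <- try_parse_files(c(good, missing), toy_parser)
report$ok
#> [1]  TRUE FALSE
```

For the case above, parse_fun could be something like function(f) esaj::run_parser(f, parser), with parser built as in the snippet that uses parse_movs2().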

Warning of Pseudo Error

  • Fix the spurious warning raised when running the function esaj::run_parser(...).
    Warning message: "1 failed to parse.no non-missing arguments to max; returning -Inf ".
    The code itself still works despite the warning.
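
The second half of that message matches what base R's max() emits when it is called on an empty vector. This suggests, as an assumption (the issue does not say where the call happens), that a maximum is being computed over zero items somewhere in the parsing step. A minimal base R sketch of the behaviour and of one possible guard; safe_max() is a hypothetical helper, not an esaj function:

```r
# max() on an empty vector returns -Inf and raises a warning;
# the warning is suppressed here only to keep the demonstration quiet.
empty_result <- suppressWarnings(max(numeric(0)))
empty_result
#> [1] -Inf

# Hypothetical guard: return a default instead of warning on empty input.
safe_max <- function(x, default = NA_real_) {
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(default)
  }
  max(x)
}

safe_max(numeric(0))
#> [1] NA
safe_max(c(2, 7, NA))
#> [1] 7
```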
