batchgetsymbols's Introduction

About me

I'm an associate professor of Finance at EA/UFRGS.

You can find more details about my work at my personal site.

Feel free to reach me at [email protected].

batchgetsymbols's People

Contributors

evelynmitchell, msperlin, samprohaska, sandroraabe, zecaclasher

batchgetsymbols's Issues

Error: Input must be a vector, not NULL

If BatchGetSymbols is not able to get data for ANY of the symbols provided, it gives an error "Error: Input must be a vector, not NULL", which occurs on this statement in the function:

df.tickers$ret.adjusted.prices <- calc.ret(df.tickers$price.adjusted, df.tickers$ticker, type.return)

At that point, df.tickers is an empty tibble.
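One way to avoid this failure is to check for an empty result before computing returns. The following is only a sketch in base R: `add_returns()` is a hypothetical helper, not the package's `calc.ret()`, and it assumes a simple arithmetic return per ticker.

```r
# Hypothetical helper (not part of BatchGetSymbols): adds a return column,
# but hands back the input untouched when no rows were downloaded.
add_returns <- function(df) {
  if (is.null(df) || nrow(df) == 0) {
    warning("No data available for any ticker; skipping return calculation.")
    return(df)
  }
  # Arithmetic return within each ticker; the first observation is NA.
  df$ret.adjusted.prices <- ave(
    df$price.adjusted, df$ticker,
    FUN = function(p) c(NA, p[-1] / p[-length(p)] - 1)
  )
  df
}

# Toy inputs: an empty frame and a small valid one (values are made up).
empty <- data.frame(ticker = character(0), price.adjusted = numeric(0))
ok <- data.frame(ticker = c("A", "A", "B", "B"),
                 price.adjusted = c(10, 11, 20, 19))

res_empty <- suppressWarnings(add_returns(empty))
res_ok <- add_returns(ok)
```

With a guard of this kind, the caller gets an empty table and a warning instead of an error.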

Aggregation to Data by Period

BatchGetSymbols is very useful code. Thank you for writing it.
I have only looked at monthly data produced by BatchGetSymbols. It appears that the closing price and adjusted closing price are computed from the data for the first day of the period. Is that intended?

I believe the current code snippet is:

df.tickers <- df.tickers %>%
  group_by(time.groups, ticker) %>%
  summarise(ref.date = min(ref.date),
            volume = sum(volume, na.rm = TRUE),
            price.open = first(price.open),
            price.high = max(price.close),
            price.low = min(price.close),
            price.close = first(price.close),
            price.adjusted = first(price.adjusted)) %>%
  #select(-time.groups) %>%
  arrange(ticker, ref.date)

Should this code instead be the following (changing the formulas for price.close and price.adjusted)?

df.tickers <- df.tickers %>%
  group_by(time.groups, ticker) %>%
  summarise(ref.date = min(ref.date),
            volume = sum(volume, na.rm = TRUE),
            price.open = first(price.open),
            price.high = max(price.close),
            price.low = min(price.close),
            price.close = last(price.close),
            price.adjusted = last(price.adjusted)) %>%
  #select(-time.groups) %>%
  arrange(ticker, ref.date)

I would appreciate your thoughts. Thank you in advance for your help.
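The difference between the two variants can be seen on a small toy example. This sketch uses base R and made-up prices rather than the package's own pipeline, and it assumes the rows are already sorted by date.

```r
# Toy daily data for one ticker spanning two months (values are made up).
daily <- data.frame(
  ref.date = as.Date(c("2020-01-02", "2020-01-15", "2020-01-31",
                       "2020-02-03", "2020-02-28")),
  price.close = c(100, 105, 110, 108, 120)
)
daily$month <- format(daily$ref.date, "%Y-%m")

# Closing price taken from the first vs. the last trading day of each month.
close_first <- tapply(daily$price.close, daily$month, function(p) p[1])
close_last  <- tapply(daily$price.close, daily$month, function(p) p[length(p)])

print(rbind(first = close_first, last = close_last))
```

Using the first observation reports the close of the first trading day in each month, while using the last one reports the month-end close.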

Error with google-style check

Hello there! Great package; however, I keep getting the following error (which is only triggered by certain stocks):

Error in if (any(tickers.src == "google")) { : missing value where TRUE/FALSE needed

I tried to clear the options for SymbolLookup but I can't get rid of the error. It is my understanding that Google no longer provides access to stock data, so I'm not sure anyone still uses google-style symbols.

Any help would be appreciated :)
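The message "missing value where TRUE/FALSE needed" appears whenever the condition of an `if` evaluates to NA. One plausible cause, which is an assumption and not confirmed in this thread, is an NA entry in the source vector for some tickers. A short sketch with a hypothetical `tickers.src` vector shows the behaviour and two NA-safe alternatives:

```r
# Hypothetical source vector with a missing entry for one ticker.
tickers.src <- c("yahoo", NA, "yahoo")

# any() returns NA when no element is TRUE and at least one is NA;
# if (NA) then raises "missing value where TRUE/FALSE needed".
raw_check <- any(tickers.src == "google")

# Two NA-safe alternatives.
safe_check_1 <- any(tickers.src == "google", na.rm = TRUE)
safe_check_2 <- isTRUE(raw_check)

if (safe_check_1) {
  message("Google-style symbols found")
} else {
  message("No Google-style symbols")
}
```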

Duplicated dates with different prices

Prices at 2020-04-17 are being duplicated with inconsistent values.

How to reproduce:

future::plan(future::multisession, workers = floor(parallel::detectCores()/2))

TICKERS = BatchGetSymbols(
  tickers=c('^GSPC','^DJI', '^IXIC'),
  first.date='2010-01-01',
  last.date=Sys.Date(),
  freq.data = 'daily',
  type.return='log',
  do.complete.data = FALSE,
  do.fill.missing.prices = FALSE,
  do.cache=TRUE,
  do.parallel=TRUE,
  cache.folder='data')

key = TICKERS$df.tickers %>% select(c(ticker, ref.date))
TICKERS$df.tickers[duplicated(key) | duplicated(key, fromLast=TRUE),]

Output:

Running BatchGetSymbols for:
   tickers =^GSPC, ^DJI, ^IXIC
   Downloading data for benchmark ticker
^GSPC | yahoo (1|1) | Found cache file
Running parallel BatchGetSymbols with 6 cores (12 available)

 Progress: ──────────────────────────────────────────────────────────────── 100%


^GSPC | yahoo (1|3) | Found cache file - Got 100% of valid prices | Youre doing good!
^DJI | yahoo (2|3) | Found cache file - Got 100% of valid prices | Got it!
^IXIC | yahoo (3|3) | Found cache file - Got 100% of valid prices | Good stuff!
> key = TICKERS$df.tickers %>% select(c(ticker, ref.date))
> TICKERS$df.tickers[duplicated(key) | duplicated(key, fromLast=TRUE),]
     price.open price.high price.low price.close     volume price.adjusted
2590    2842.43   2879.220  2830.880    2874.560 5792140000       2874.560
2591    2842.43   2879.220  2830.880    2874.560 3554592893       2874.560
5181   23817.15  24264.211 23817.150   24242.490  525950000      24242.490
5182   23817.15  24264.211 23817.150   24242.490  530277705      24242.490
7772    8667.48   8670.300  8531.690    8650.140 4335020000       8650.140
7773    8667.48   8670.304  8531.688    8650.141 3915690406       8650.141
       ref.date ticker ret.adjusted.prices ret.closing.prices
2590 2020-04-17  ^GSPC        2.644093e-02       2.644093e-02
2591 2020-04-17  ^GSPC        0.000000e+00       0.000000e+00
5181 2020-04-17   ^DJI        2.950436e-02       2.950436e-02
5182 2020-04-17   ^DJI        0.000000e+00       0.000000e+00
7772 2020-04-17  ^IXIC        1.370943e-02       1.370943e-02
7773 2020-04-17  ^IXIC        1.129461e-07       1.129461e-07
> 
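Until the cause is found, a possible workaround is to drop repeated (ticker, ref.date) pairs after the download. The sketch below uses a toy data frame in base R and keeps the first occurrence, which is an arbitrary choice: the thread does not establish which of the two rows is correct, and returns computed from the duplicated row would need to be recalculated.

```r
# Toy frame mimicking the duplicated rows above (only a few columns kept).
df <- data.frame(
  ticker   = c("^GSPC", "^GSPC", "^GSPC", "^DJI"),
  ref.date = as.Date(c("2020-04-16", "2020-04-17", "2020-04-17", "2020-04-17")),
  volume   = c(1, 5792140000, 3554592893, 525950000)
)

# Flag every row whose (ticker, ref.date) pair already appeared earlier.
key <- df[, c("ticker", "ref.date")]
is_dup <- duplicated(key)

# Keep only the first occurrence of each pair.
deduped <- df[!is_dup, ]
```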

parallel does not work

I ran the following lines in R 4.0.4 on macOS Big Sur 11.4:

tickers = c('AAPL', 'ASH')

future::plan(future::multisession, workers = floor(parallel::detectCores()/2))

a = BatchGetSymbols(tickers,
                    first.date = Sys.Date() - 30,
                    last.date = Sys.Date(),
                    do.parallel = TRUE)

future::plan(future::multisession, workers = floor(parallel::detectCores()/2))
Running BatchGetSymbols for:
tickers =AAPL, ASH
Downloading data for benchmark ticker
^GSPC | yahoo (1|1) | Not Cached | Saving cache
Running parallel BatchGetSymbols with 4 cores (8 available)

Error in makeClusterPSOCK(workers, ...) :
Cluster setup failed. 4 of 4 workers failed to connect.

I found that people use this workaround:
cl <- parallel::makeCluster(2, setup_strategy = "sequential")
(from https://stackoverflow.com/questions/61700586/r-makecluster-command-used-to-work-but-now-fails-in-rstudio)
But it returns the original error:
Error in BatchGetSymbols(tickers, first.date = Sys.Date() - 3, last.date = Sys.Date(), :
When using do.parallel = TRUE, you need to call future::plan() to configure your parallel settings.
A suggestion, write the following lines:

future::plan(future::multisession, workers = floor(parallel::detectCores()/2))

The last line should be placed just before calling BatchGetSymbols. Notice it will use half of your available cores so that your OS has some room to breathe.

BatchGetSymbols

Hello,

My personal R website went down in the last few days, and I narrowed it down to the BatchGetSymbols package.

When I try to run the package, I get an error message. The code and the error message are below. Any suggestions?

Code:

library(BatchGetSymbols)
df.SP500 <- GetSP500Stocks()
print(df.SP500)

Error:

'names' attribute [8] must be the same length as the vector [2]
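This message is what R reports when more names are assigned than an object has elements. A possible explanation, which is an assumption here, is that the scraped table came back with 2 columns where 8 were expected. The sketch below reproduces the situation on a toy data frame and shows a defensive check; `rename_if_match()` and the placeholder column names are hypothetical and not part of the package.

```r
# Toy stand-in for a scraped table that came back with 2 columns instead of 8.
scraped <- data.frame(V1 = c("MMM", "AOS"), V2 = c("3M", "A. O. Smith"))
expected <- c("ticker", "company", "c3", "c4", "c5", "c6", "c7", "c8")  # hypothetical names

# Assigning 8 names to 2 columns fails; capture the message instead of stopping.
msg <- tryCatch({
  names(scraped) <- expected
  "no error"
}, error = function(e) conditionMessage(e))

# Defensive variant: only rename when the column count matches.
rename_if_match <- function(df, new_names) {
  if (ncol(df) != length(new_names)) {
    stop("Scraped table has ", ncol(df), " columns, expected ", length(new_names))
  }
  names(df) <- new_names
  df
}
```

A check like this turns a cryptic message into one that points directly at a changed page layout.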

Difference from raw data from yfinance?

From email:

"Basically, I'm trying to download price data for the constituents of the S&P 500 similar to what you do in your examples. However in certain cases the retrieved data does not seem to be accurate. Specifically, Apple (AAPL) recently split their stock and the adjusted price data does not reflect this. Rather the adjusted price falls from 438 to 109 on August 5th 2020 just like the close. I've checked and I'm using Yahoo Finance as a source and on their website the data has been updated to reflect the split so I'm not sure what I could be missing."

get.clean.data is not working with a vector of tickers

Hi! Thanks for the package!

get.clean.data is not working with a vector of tickers. Here's what I've been trying:

first.date <- "2008-12-09"
last.date <- Sys.Date()
freq.data <- 'daily'
tickers <- c('^AXJO', '^BVSP', '^GSPTSE', '^N100', '^FTSE', '^N225', '^MXX', '^NZ50', 'IMOEX.ME', '^SSMI', '^GSPC')
dfclean <- get.clean.data(tickers, src = "yahoo", first.date, last.date)

The result is:

dput(dfclean)
structure(list(), .Names = character(0), row.names = integer(0), class = "data.frame")
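If the function is meant to handle one ticker per call (an assumption; the thread does not confirm it), a common pattern is to call it once per ticker and stack the results. The sketch below uses a hypothetical `fetch_one()` stub that returns fake data, so it runs offline; in practice the stub would be replaced by the real single-ticker download.

```r
# Hypothetical single-ticker fetcher standing in for a real download call;
# it returns fake data so the sketch runs without network access.
fetch_one <- function(ticker) {
  data.frame(ticker = ticker,
             ref.date = as.Date("2020-01-01") + 0:1,
             price.close = c(1, 2))
}

tickers <- c("^BVSP", "^FTSE", "^GSPC")

# Call the fetcher once per ticker and stack the results row-wise.
pieces <- lapply(tickers, fetch_one)
dfclean <- do.call(rbind, pieces)
```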

My session info:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BatchGetSymbols_2.5.7 rvest_0.3.5           xml2_1.3.2            quantmod_0.4.17       TTR_0.23-6            plyr_1.8.6           
 [7] lubridate_1.7.9       forcats_0.5.0         stringr_1.4.0         purrr_0.3.4           readr_1.3.1           tidyr_1.1.0          
[13] tibble_3.0.1          ggplot2_3.3.2         tidyverse_1.3.0       dplyr_1.0.0           openxlsx_4.1.5        Quandl_2.10.0        
[19] xts_0.12-0            zoo_1.8-8            

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.0 haven_2.3.1      lattice_0.20-41  colorspace_1.4-1 vctrs_0.3.1      generics_0.0.2   yaml_2.2.1      
 [8] utf8_1.1.4       blob_1.2.1       rlang_0.4.6      pillar_1.4.6     glue_1.4.1       withr_2.2.0      DBI_1.1.0       
[15] dbplyr_1.4.4     modelr_0.1.8     readxl_1.3.1     lifecycle_0.2.0  munsell_0.5.0    gtable_0.3.0     cellranger_1.1.0
[22] zip_2.0.4        curl_4.3         fansi_0.4.1      broom_0.7.0      Rcpp_1.0.5       backports_1.1.7  scales_1.1.1    
[29] jsonlite_1.7.0   fs_1.4.2         hms_0.5.3        stringi_1.4.6    cowplot_1.0.0    grid_4.0.2       cli_2.0.2       
[36] tools_4.0.2      magrittr_1.5     crayon_1.3.4     pkgconfig_2.0.3  ellipsis_0.3.1   reprex_0.3.0     assertthat_0.2.1
[43] httr_1.4.1       rstudioapi_0.11  R6_2.4.1         compiler_4.0.2  

Missing dates for BTC?

library(BatchGetSymbols)

titoli <- c('BTC-EUR')
data.inizio <- "2018-01-01"
data.fine <- "2019-01-01"

lista <- BatchGetSymbols(tickers = titoli,
                         first.date = data.inizio,
                         last.date = data.fine,
                         do.cache = FALSE)
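To see which dates are missing, the observed dates can be compared against a full calendar for the requested window. This is a sketch in base R with made-up dates; `observed` stands in for the `ref.date` column of the downloaded data, and the calendar includes weekends on the assumption that the asset trades every day.

```r
# Hypothetical observed dates, standing in for lista$df.tickers$ref.date.
observed <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-04", "2018-01-07"))

# Every calendar day in the window of interest.
all_days <- seq(as.Date("2018-01-01"), as.Date("2018-01-07"), by = "day")

# Days in the window with no observation.
missing_days <- all_days[!all_days %in% observed]
print(missing_days)
```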

Error in charToDate(x)

For random stocks, the function is not able to convert the date string properly and throws this error:
Error in charToDate(x) : character string is not in a standard unambiguous format.

The error is hard to troubleshoot because it does not always occur with the same stocks. I tried the option do.cache=F, which works when the cached data is affected, but it doesn't solve the problem all the time...

I still haven't figured out what's going on here...
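For reference, this error comes from `as.Date()` when it receives a string it cannot interpret and no explicit format is given. Where the bad string originates inside the package is not established in this thread; the sketch below uses a made-up input value only to show the mechanism and how an explicit format turns the error into an NA that can be filtered out.

```r
# Without a format, as.Date() has to guess and fails on unparseable strings.
bad_input <- "null"  # made-up example of a non-date value
msg <- tryCatch(
  { as.Date(bad_input); "no error" },
  error = function(e) conditionMessage(e)
)

# With an explicit format, unparseable strings become NA instead of an error.
parsed <- as.Date(c("2020-04-17", bad_input), format = "%Y-%m-%d")
valid <- parsed[!is.na(parsed)]
```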

combine with getquote

Suggestion: in addition to historical data, also get the latest data from the current day.
Here is my code for that:

l.out$df.tickers <- l.out$df.tickers %>% bind_rows(tickers %>%
  getQuote() %>%
  rownames_to_column("ticker") %>%
  transmute(ticker = ticker,
            price.open = Open, price.high = High, price.low = Low,
            price.close = NA, volume = Volume, price.adjusted = Last,
            ref.date = as.Date(`Trade Time`),
            ret.adjusted.prices = NA, ret.closing.prices = NA))

BatchGetSymbols: Error in download

The function call in the snippet below used to work very reliably, but it no longer works at all.

I tried several symbols without success. I also tried your new package "yfR", with the same result: nothing is returned.

Any idea?

Snippet:
start_lng <- '2019-01-01'
list_yahoo_out <- BatchGetSymbols(c('BZ=F','NG=F', 'EURUSD=X', 'TTF=F'), first.date=start_lng, last.date=Sys.Date())

Incomplete data for ticker RDS-A

The function BatchGetSymbols::BatchGetSymbols() doesn't retrieve all the available data from Yahoo Finance for the ticker RDS-A (Royal Dutch Shell plc).

The following code using quantmod::getSymbols() seems to retrieve all the available data as we can see in https://finance.yahoo.com/quote/RDS-A?p=RDS-A&.tsrc=fin-srch

quantmod::getSymbols(
    Symbols = 'RDS-A',
    from = '1990-01-01',
    to = '2019-07-18',
    auto.assign = FALSE
)

But the following code using BatchGetSymbols::BatchGetSymbols() retrieves only data from 2018-01-02 to 2019-07-18

BatchGetSymbols::BatchGetSymbols(
    tickers = 'RDS-A',
    first.date = '1990-01-01',
    last.date = '2019-07-18',
    thresh.bad.data = 0
)

time lag in the data

Hi guys, thanks for the package. One question:
it seems there is a time lag in the data.
For example, if I run it on 2020.01.23, the latest date in the reference date column is 2020.01.17.
Could you please comment?

Error in charToDate(x)

Hi,

In the latest github release, I have been experiencing problems with retrieving some stock prices.

Following #3, the issue described seems to be related to caching. However, the problem persists even after caching was turned off and no cache folder was created.

E.g.
On my system this works:
getSymbols("BKZ.SI")

But this does not work:
BatchGetSymbols("BKZ.SI")

Returned error:

BKZ.SI | yahoo (1|1) | Not CachedError in charToDate(x) :
character string is not in a standard unambiguous format

In the past, I could use BatchGetSymbols with no issues 100% of the time. Now, because of this error, I often have to manually remove the problematic stock tickers one by one, which defeats the original purpose of using BatchGetSymbols.

Any ideas how to resolve this?

Thanks :)

Unrecognized backslash escape

Hello Mr. Perlin,
I want to congratulate you on this useful package.
I have a problem and would appreciate your help solving it.
[Image: error_BatchGetSymbols]

Company and tickers headers are interchanged in GetSP500Stocks()

Problem

The company and tickers headers are interchanged in the output from GetSP500Stocks(). So if we use the results as input for BatchGetSymbols() by using the header "tickers", we actually get the company name. Consequently, the download from Yahoo will fail.

I have a pull request ready to submit soon; it only requires swapping the two headers in line 42 of the script GetSP500Stocks.R.

Most likely, the Wikipedia page was changed recently, causing this bug.

Expected behavior

The function should return the ticker information under the proper header so as not to confuse users, even though the required information is still in the data frame, just in another column.

my.tickets <- GetSP500Stocks()$tickers

BatchGetSymbols version

2.5.3
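Until a fix is released, the swap could be detected and undone on the user's side. The sketch below is a heuristic only: `looks_like_tickers()` is a hypothetical helper, the regular expression assumes tickers are short upper-case strings with optional dots or hyphens, and the toy table mimics the interchanged headers described above.

```r
# Hypothetical check: does a column look like ticker symbols?
# Assumes tickers are short, upper-case, with optional dots or hyphens.
looks_like_tickers <- function(x) {
  mean(grepl("^[A-Z][A-Z.-]{0,5}$", x)) > 0.9
}

# Toy table with the two headers interchanged.
df.sp500 <- data.frame(
  tickers = c("3M Company", "Abbott Laboratories", "Apple Inc."),
  company = c("MMM", "ABT", "AAPL"),
  stringsAsFactors = FALSE
)

# Swap the headers back when the "tickers" column fails the check.
if (!looks_like_tickers(df.sp500$tickers) && looks_like_tickers(df.sp500$company)) {
  idx <- match(c("tickers", "company"), names(df.sp500))
  names(df.sp500)[idx] <- c("company", "tickers")
}
```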
