tbep-tech / tbeptools

R package for Tampa Bay Estuary Program functions

Home Page: https://tbep-tech.github.io/tbeptools/

License: Other

R 38.85% TeX 1.58% HTML 59.57%
package data-analysis water-quality tbep tampa-bay

tbeptools's Introduction

tbeptools

R package for Tampa Bay Estuary Program functions. Please see the vignettes for a full description.

Installation

The package can be installed from r-universe. The source code is available on the tbep-tech GitHub group web page: https://github.com/tbep-tech/tbeptools. Note that tbeptools only needs to be installed once, but it needs to be loaded every new R session (i.e., library(tbeptools)).

# enable repos
options(repos = c(
    tbeptech = 'https://tbep-tech.r-universe.dev',
    CRAN = 'https://cloud.r-project.org'))

# install tbeptools
install.packages('tbeptools')

# load tbeptools
library(tbeptools)

After the package is loaded, you can view the help file for any function by typing a question mark followed by the function name on the console, e.g., ?read_importwq. Each help file briefly describes what the function does and the arguments required to run it.

Package vignettes

The vignettes are organized by topic and are an excellent place to start for understanding how to use the package. Currently, there are seven vignettes available for tbeptools:

  • Water Quality Data: Overview of functions for working with water quality data and the water quality report card
  • Tampa Bay Nekton Index: Overview of functions to import, analyze, and plot results for the Tampa Bay Nekton Index
  • Tampa Bay Benthic Index: Overview of functions to import data for Tampa Bay Benthic Index, under development
  • Tidal Creeks Assessment: Overview of functions to import, analyze, and plot results for the assessment of tidal creeks in southwest Florida
  • Seagrass Transect Data: Overview of functions to import, analyze, and plot results for the seagrass transect data collected in Tampa Bay
  • Habitat Master Plan: Overview of functions to analyze and create a report card for the Tampa Bay Habitat Master Plan 2020 update
  • Fecal Indicator Bacteria: Overview of functions to import, analyze, and plot results for Fecal Indicator Bacteria (FIB)

Usage

The core functions in tbeptools are in three categories based on mode of use. Each function is named using a prefix for the mode of use, followed by what the function does. The prefixes are:

  • read: Import current data from the main site.

  • anlz: Analyze or summarize the imported data.

  • show: Create a plot of the analyzed data.

The functions can be easily found in RStudio after loading the package and typing the prefix at the command line. An autofill dialog box will pop up showing all functions that apply for the prefix. This eliminates the need for searching for individual functions if all you know is the category of function you need (e.g., read, anlz, or show).
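
Outside of RStudio's autofill, the same prefix convention can be explored with base R. The sketch below is only an illustration: group_by_prefix() is a hypothetical helper that is not part of tbeptools, and the example names are the functions mentioned in this README.

```r
# Hypothetical helper (not part of tbeptools): group function names by prefix
group_by_prefix <- function(fun_names, prefixes = c("read", "anlz", "show")) {
  # keep only names that start with one of the prefixes followed by "_"
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")_")
  keep <- fun_names[grepl(pattern, fun_names)]
  # group by the text before the first underscore
  split(keep, sub("_.*$", "", keep))
}

# function names mentioned in this README
example_names <- c("read_importwq", "anlz_avedat", "show_thrplot",
                   "anlz_transectocc", "show_transect")
grouped <- group_by_prefix(example_names)
grouped$anlz
```

With the package attached, group_by_prefix(ls("package:tbeptools")) applies the same grouping to everything the package exports.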

Each function also includes a semi-descriptive suffix that generally describes what category it applies to (e.g., water quality, seagrass) and what it does (e.g., imports, formats). These follow a loose convention that attempts to strike a balance between description and brevity. The optimal balance is often hard to achieve. To aid in understanding, we provide a brief description of suffixes that are used more than once.

Suffix descriptions:

  • attain: Analyze functions that summarize data relative to attainment categories specific to bay segments
  • ave, med: Analyze functions that summarize data into averages or medians
  • benthic: Applies to benthic monitoring data used for the Tampa Bay Benthic Index
  • fim: Applies to data from the Fisheries Independent Monitoring program used for the Tampa Bay Nekton Index
  • form: An intermediate function for formatting imported data for downstream analysis
  • hmp: Functions that work with Habitat Master Plan data
  • import: A function used to import data from a source external to the package
  • indic: A function that analyzes or plots individual tidal creek indicator values, as opposed to integrated creek scores
  • iwr: Functions or data that apply to the Impaired Waters Rule (IWR) data maintained by the Florida Department of Environmental Protection used as source data for the tidal creek functions
  • matrix: A plotting function that creates a report card style matrix
  • met: A function that analyzes or plots individual metrics for integrated indices, e.g., TBBI, TBNI
  • phyto: Applies to phytoplankton data from the Hillsborough County Environmental Protection Commission
  • plotly: A plotting function that returns an interactive plotly object
  • scr: A function that analyzes or plots summary scores for integrated indices, e.g., TBBI, TBNI
  • seg, site: Functions that analyze or plot results relative to bay segments or individual monitoring sites
  • tbbi: Applies to the Tampa Bay Benthic Index (TBBI)
  • tbni: Applies to the Tampa Bay Nekton Index (TBNI)
  • tdlcrk: Applies to tidal creeks
  • transect: Applies to seagrass transect data
  • wq: Applies to water quality

The function reference page can also be viewed for a complete list of functions organized by category, a description of what they do, and links to the help files.

The following example demonstrates use of a subset of the functions for water quality data to read a file from the Hillsborough County Environmental Protection Commission long-term monitoring dataset (available from https://www.tampabay.wateratlas.usf.edu/), analyze monthly and annual averages by major bay segments of Tampa Bay, and plot an annual time series for one of the bay segments.

# load the package
library(tbeptools)

# read current data
wqdat <- read_importwq(xlsx = "wqdata.xlsx", download_latest = TRUE)
wqdat
## # A tibble: 26,611 x 22
##   bay_segment epchc_station SampleTime             yr    mo
##   <chr>               <dbl> <dttm>              <dbl> <dbl>
## 1 HB                      6 2021-06-08 10:59:00  2021     6
## 2 HB                      7 2021-06-08 11:13:00  2021     6
## 3 HB                      8 2021-06-08 14:15:00  2021     6
## 4 MTB                     9 2021-06-08 13:14:00  2021     6
## 5 MTB                    11 2021-06-08 11:30:00  2021     6
## # ... with 26,606 more rows, and 17 more variables:
## #   Latitude <dbl>, Longitude <dbl>, Total_Depth_m <dbl>,
## #   Sample_Depth_m <dbl>, tn <dbl>, tn_q <chr>, sd_m <dbl>,
## #   sd_raw_m <dbl>, sd_q <chr>, chla <dbl>, chla_q <chr>,
## #   Sal_Top_ppth <dbl>, Sal_Mid_ppth <dbl>,
## #   Sal_Bottom_ppth <dbl>, Temp_Water_Top_degC <dbl>,
## #   Temp_Water_Mid_degC <dbl>, ...
# analyze monthly and annual means by bay segment
avedat <- anlz_avedat(wqdat)
avedat
## $ann
## # A tibble: 584 x 4
##      yr bay_segment var         val
##   <dbl> <chr>       <chr>     <dbl>
## 1  1974 HB          mean_chla 22.4 
## 2  1974 LTB         mean_chla  4.24
## 3  1974 MTB         mean_chla  9.66
## 4  1974 OTB         mean_chla 10.2 
## 5  1975 HB          mean_chla 27.9 
## # ... with 579 more rows
## 
## $mos
## # A tibble: 4,484 x 5
##   bay_segment    yr    mo var         val
##   <chr>       <dbl> <dbl> <chr>     <dbl>
## 1 HB           1974     1 mean_chla 36.2 
## 2 LTB          1974     1 mean_chla  1.75
## 3 MTB          1974     1 mean_chla 11.5 
## 4 OTB          1974     1 mean_chla  4.4 
## 5 HB           1974     2 mean_chla 42.4 
## # ... with 4,479 more rows
# show annual time series of chlorophyll for Hillsborough bay segment
show_thrplot(wqdat, bay_segment = "HB", yrrng = c(1975, 2020))
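
To make the shape of the $mos and $ann output above more concrete, here is a toy sketch in base R. It assumes that annual values are averages of the monthly means, which is a simplification for illustration only; the data values are invented and ?anlz_avedat remains the authoritative description of the method.

```r
# invented toy data: two samples per month for one bay segment
toy <- data.frame(
  bay_segment = "HB",
  yr = 2020,
  mo = c(1, 1, 2, 2),
  chla = c(10, 20, 30, 50)
)

# monthly means by bay segment (analogous to the $mos element)
mos <- aggregate(chla ~ bay_segment + yr + mo, data = toy, FUN = mean)

# annual means taken over the monthly means (analogous to the $ann element)
ann <- aggregate(chla ~ bay_segment + yr, data = mos, FUN = mean)
ann
```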

Functions in tbeptools also support the creation of content for interactive, online dashboards that can facilitate more informed decisions without requiring an intimate understanding of the R programming language or the methods for analysis. These dashboards include assessments for water quality, seagrasses, nekton communities, and tidal creeks.

Issues and suggestions

Please report any issues and suggestions on the issues link for the repository. A guide to posting issues can be found here.

Contributing

Please view our contributing guidelines for any changes or pull requests.

tbeptools's People

Contributors

bbest, dependabot[bot], esherwoo77, fawda123, meagschrandtphd, mikewessel

tbeptools's Issues

Remove mapview dependency

This mapview package was only included to provide an option to interactively select the base map. This can be done with leaflet as in show_fibmap(), using the addLayersControl() function once relevant provider tiles are added.

leaflet::addLayersControl(baseGroups = names(esri),
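
A minimal sketch of the leaflet-only approach, assuming esri is a named vector of provider tile names. The names and providers below are placeholders chosen for illustration and are not necessarily those used in show_fibmap().

```r
library(leaflet)

# assumed: named vector of provider tiles; the names become the layer labels
esri <- c(Imagery = "Esri.WorldImagery", Streets = "Esri.WorldStreetMap")

m <- leaflet()
for (nm in names(esri)) {
  # add each base map to its own group so it can be toggled
  m <- addProviderTiles(m, esri[[nm]], group = nm)
}

# radio-button control for switching between the base maps
m <- addLayersControl(m, baseGroups = names(esri))
```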

Update any EPC links with new ones

EPC recently migrated to sharepoint for hosting water quality and phytoplankton results (not sure about benthic). Any links that point to the old FTP will need to be updated using a publicly accessible URL. Will also need to verify the current/old data checks in read_chkdate() still work. This affects other repos on tbep-tech that use tbeptools (e.g., wq-dash).

Data analysis using current year

Is the data analysis pulling in data from the current year? This could be problematic since not all data is in. May want to exclude current year until fully incorporated (hopefully early Nov this year).
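
One possible way to exclude the current year before analysis is sketched below. drop_current_year() is a hypothetical helper, not a tbeptools function, and it assumes the data have a numeric yr column like the one returned by read_importwq().

```r
# Hypothetical helper: keep only years before the current calendar year
drop_current_year <- function(dat,
                              current_yr = as.integer(format(Sys.Date(), "%Y"))) {
  dat[dat$yr < current_yr, , drop = FALSE]
}

# invented toy data; current_yr is fixed here so the result is reproducible
toy <- data.frame(yr = c(2019, 2020, 2021), chla = c(9.1, 8.4, 7.7))
complete_yrs <- drop_current_year(toy, current_yr = 2021)
complete_yrs
```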

[JOSS Review] Consider more verbose function names

This comment is really about coding style, and while I hope the authors consider it seriously I don't consider this to be a blocking issue. I recognize that changing function names is a major breaking change, especially for a package that is as fully-developed as this one, and likely has a long history of use already within the institutions for which it was designed.

I like the consistency in the function verb prefixes (e.g. read_, show_, anlz_), however as an unfamiliar user I find it hard to parse the specific intentions of the functions from the suffixes.

In general, I would prefer to see fully descriptive function names -- e.g. analyze_ instead of anlz_, water_quality instead of wq, etc. As an example, anlz_avedat() is completely opaque to me as a function name -- the documentation says that it "Estimates annual means", so why not call it analyze_annual_means()? The argument against this is that it's a lot of typing, but in this day and age of good IDEs and code completion I prefer real words instead of confusing abbreviations.

Another thing to consider is that the mix of snake_case for the prefixes and no case convention for the suffixes is also hard to parse. I think that much of the functionality of the package would be more easily deduced if the functions were named as above, e.g. analyze_annual_means() or analyze_site_attainments().

(Linking openjournals/joss-reviews#3485 )

Include code examples?

Enjoyed your presentation today for the TBEP TAC. It would help me a lot to see your presented code examples included or linked to from the readme. It wasn't immediately clear to me to look under "articles".

missing example

Same area as previous issue:

transectave <- anlz_transectave(transectocc)
transectave
#> # A tibble: 132 x 4

I think the penultimate sentence in the paragraph is missing whatever example you're providing:

Results for an individual bay segment can be returned with the appropriate argument, e.g., . Results can also be filtered

Update function names with concept conventions

Should follow the convention defined in the pkgdown site yaml

reference:
- title: "Read"
  contents:
  - starts_with("read_")
- title: "Analyze"
  contents:
  - starts_with("anlz_")
- title: "Visualize"
  contents:
  - starts_with("map_")
  - starts_with("show_")

Seagrass - plotting 1 or 2 species

For this graphing section: show_transectavespp(transectocc)

Is there an option where the total frequency of occurrence can be omitted so the graph is scaled to a better level, because species like Halodule and Ruppia will often just get lost in the bottom of the graph.

transect function improvements

  • Remove (or round up/down) weird BB estimates from show_transect(), e.g., 2.5
  • Visualize multiple species in show_transect(), could be simple p/a of any seagrass species or something more complex, e.g., multiple lines per species in a given year
  • Link transect locations to depth, e.g., in trnpts or as separate file for every placement over time. This will need to be done with topo-bathy data to compare depths on the same scale. This will also lead into methods for better eval of seagrass depth edge and potential migration with SLR

@Gmangrove anything else?

believe it or not...

Language after this:
transectocc <- anlz_transectocc(transect)
transectocc
#> # A tibble: 11,430 x 6

suggest using % sign in this phrase: red < 25%, orange 25-50%, yellow 50-75%, and green > 75%
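
For illustration, the color categories in that phrase can be expressed with base R's cut(). This is a sketch only: the percentages are invented, and assigning values that fall exactly on a boundary (25, 50, 75) to the higher category is an assumption, not something stated in the vignette.

```r
# invented frequency occurrence values, in percent
foest <- c(10, 30, 60, 90)

# right = FALSE gives [0,25), [25,50), [50,75), [75,100]
cols <- cut(foest,
            breaks = c(0, 25, 50, 75, 100),
            labels = c("red", "orange", "yellow", "green"),
            right = FALSE, include.lowest = TRUE)
as.character(cols)
```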

[JOSS Review] Widen the scope of literature review in paper

Hi again!

In your paper submission you've noted one other package that reports estuarine data. I think there's two more broad types of packages that would be useful for readers of your paper to know about: (1) general water quality data access (like dataRetrieval, which I think is in your Imports) and (2) calculation of policy thresholds in an open and repeatable way (like https://github.com/bcgov/wqindex ). This gives you an opportunity to comment on how you solved the issues associated with each of these, such as the download/cache system you need for a data access package (or as package data? or I think you have some of both here?). For policy thresholds (here, the difference between 'Stay the course', 'Caution', and 'On alert'), I imagine it gave you an opportunity to document how these are calculated and allow clients/stakeholders to reproduce your work.

(Linking: openjournals/joss-reviews#3485 )

Re-wording suggestion

2nd paragraph, 1st sentence, please consider:

There are two datasets included in tbeptools that show the actively monitored transect locations in Tampa Bay.

[JOSS Review] Clarify statement of need in README and in paper

Hi!

This package is an excellent example of packaging up code that can be well tested to back a Shiny app or other type of automated reporting (e.g., Rmd report). You've backed it up with integration testing with impressive coverage. I didn't see any reference to the Shiny app in the README or the paper and I think that's a shame...the Statement of Need in my opinion is that this is a good example of a well-designed reporting system to communicate environmental data to clients and stakeholders. A short paragraph with a link might work well for the README...in the paper I would love to see an all-out flowchart/screenshot of the Shiny app. I know the Shiny app isn't your submission but the design of the whole thing is where the value is for your readers (very few of whom actually need data from the Tampa Bay Estuary but might be interested in doing this kind of thing with their own monitoring program).

(Linking openjournals/joss-reviews#3485 )

[JOSS Review] Testing on MacOS?

It looks like you're skipping MacOS on your GitHub Actions...I'm guessing because of the latest issues with sf building. If that is the case, you can conditionally make GitHub actions install the binary version of sf until the build issue has been solved (see https://github.com/paleolimbot/wk/blob/master/.github/workflows/R-CMD-check.yaml#L69-L70 ). I know this is just temporary but I do think it's important to make sure your package will work on MacOS + Windows + Linux.

(Linking openjournals/joss-reviews#3485 )

The show_transectsum() graph - legend

This is a data header issue, for the legend, ALL the names are genera so should be followed by sp. or spp. Several ways to skin this cat (full genus species, g. species, genus, common name). I'd probably prefer g. species, except Caulerpa, which does get spp., but does this mean going into the database? Other option would be just to remove spp. from the two that have it, but prob similar issue.

H. engelmannii (Halophila, star grass)
Caulerpa spp. (attached algae)
R. maritima (widgeon grass)
T. testudinum (turtle grass)
S. filiforme (manatee grass)
H. wrightii (Halodule, shoal grass)

Updates to creeks dashboard

details page should reference updated tidelcreekreport.png graphic
discuss connection to additional resources/analyses re: manuscript
Evaluate efficacy of connection to IWR database

3F creeks with missing data automatically get assigned target/green

An example of geometric means for JEI MC02 and wbid 1816:

   id wbid  JEI class Creek_Length_m year CHLAC COLOR COND DO DOSAT NO23 ORGN SALIN TKN TN TP TSS TURB
1 375 1816 MC02    3F       23783.52   NA    NA    NA   NA NA    NA   NA   NA    NA  NA NA NA  NA   NA

And the final score:

     id wbid  JEI   class target caution investigate   act score 
  <int> <chr> <chr> <chr>  <int>   <int>       <int> <int> <chr> 
1   375 1816  MC02  3F         1      NA          NA    NA Target

This is done here

class %in% c('3F', '1') & (TN < investigate | is.na(TN)) ~ 1,

Output from the tbeptools matches that from MW, so need to verify that this was on purpose.
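
The behavior follows from how R handles NA in logical expressions: NA < x is NA, but NA | TRUE is TRUE, so the is.na(TN) term makes the condition pass for creeks with no data. The sketch below uses invented values and a hypothetical threshold to compare the current condition with a stricter alternative that requires TN to be present.

```r
# invented example: one creek with missing TN, one with data
creeks <- data.frame(JEI = c("MC02", "XX01"),
                     class = c("3F", "3F"),
                     TN = c(NA, 0.8))
investigate <- 1.1  # hypothetical threshold for illustration only

# condition as written in the issue: missing TN passes
current <- creeks$class %in% c("3F", "1") &
  (creeks$TN < investigate | is.na(creeks$TN))

# stricter alternative: missing TN does not pass
strict <- creeks$class %in% c("3F", "1") &
  !is.na(creeks$TN) & creeks$TN < investigate

data.frame(JEI = creeks$JEI, current, strict)
```

Whether missing data should score as target is a judgment call for the maintainers; the sketch only shows where the difference arises.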

figure out better way to update fimdata

Currently it requires an updated fimstations data object because stations have different reference numbers each year. If fimstations is not current, a right join here:

dplyr::right_join(fimstations, by = c("Reference", "Zone", "Grid")) %>%

will exclude the most recent data. One option is to not have fimstations as a data object in the package and just create an sf object on the fly.
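
The effect can be reproduced on toy data. The sketch below uses base R's merge() as a stand-in for dplyr::right_join() (all.y = TRUE keeps every row of the second table, as a right join does); the reference codes and catch values are invented.

```r
# invented catch data, including a record from a newer sampling year
fimdat <- data.frame(Reference = c("R2019a", "R2020a", "R2021a"),
                     catch = c(5, 3, 8))

# station lookup that has not been updated with the newest reference
stations <- data.frame(Reference = c("R2019a", "R2020a"),
                       Zone = c("A", "B"))

# right-join equivalent: rows without a matching station are dropped
right <- merge(fimdat, stations, by = "Reference", all.y = TRUE)

# left-join equivalent: newest record is kept, with Zone set to NA
left <- merge(fimdat, stations, by = "Reference", all.x = TRUE)

nrow(right)
nrow(left)
```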

Wording suggestion

Paragraph right after the transect map, shoot count description is in there twice, suggest truncating the last sentence to:

Abundance is reported as a numeric value from 0-5 for Braun-Blanquet coverage estimates and blade length is in cm.

Missing data types

For the graph following this:
show_transect(transect, site = 'S3T10', species = 'Halodule', varplo = 'Abundance')

It looks like BB values of 0.1 and 0.5 are being displayed but not listed in the legend.

And another wording suggestion

2nd paragraph after 'Calculating seagrass frequency occurrence' 1st sentence, suggesting:

The anlz_transectocc() function summarizes frequency occurrence for all transects and dates by collapsing species results across quadrats within each transect.

Site Map Feature Request

@fawda123 : It would be nice to be able to export a site map of the EPC chl-a or la data for a particular period of time other than the annual timestep. For instance export a map image of chl-a data for Nov. 2020 or export a map image for averages over the May - Oct. 2020 period.
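
Until such a feature exists, the data behind the map could be subset by hand before plotting. The sketch below is an assumption-laden illustration in base R: the values are invented, and it relies only on the epchc_station, yr, mo, and chla columns shown in the read_importwq() output.

```r
# invented toy data with the column names used by read_importwq()
toy <- data.frame(
  epchc_station = c(6, 6, 6, 7, 7, 7),
  yr = 2020,
  mo = c(4, 5, 10, 5, 11, 11),
  chla = c(2, 4, 6, 10, 20, 30)
)

# keep May through October 2020 only
sel <- toy[toy$yr == 2020 & toy$mo %in% 5:10, ]

# average chlorophyll-a by station over the selected period
period_ave <- aggregate(chla ~ epchc_station, data = sel, FUN = mean)
period_ave
```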

Incomplete sentence?

For the last sentence in the paragraph after this

transectave <- anlz_transectave(transectocc)
transectave
#> # A tibble: 132 x 4

Is the end of the sentence missing? Or does it lead into the next piece of code?
