wendtke / psyphr
legacy repo for R package suite for psychophysiological data; see github.com/psyphr-dev
Or are separate functions needed? Right now, they are separate (transform_eda_stats.R and transform_hrv_stats.R) but contain the same content.
Introducing myself again! My name is Siqi Zhang. I've been using R since 2012 and am a freelance R developer. Rather than using the language for analysis, my edge is sharper on the language itself. It is my pleasure to be onboard this open source project.
I'm excited that you've already made very substantial progress. When we're ready to take it further, we should compare notes on our ideas and situations. Hit me up at [email protected].
In the meantime, I'm going to branch it off and start poking around. Looking forward to hearing from you!
Implement quick visualization:
Sample code from @wendtke at https://drive.google.com/drive/u/0/folders/1fvFlD5CT1ZP2Bgm4eaF9ERtNgg7WxBxN
See also #14
We want psyphr to work on a normal laptop, which nowadays has somewhere between 4 and 12 GB of usable memory, and R normally should not use more than half of the total memory. Currently read_study() reads everything at once, so a really big study can create a problem.
If the problem exists, there are at least two ways to mitigate the problem:
What is the likely total size of a study? I'm looking for a figure at about the 80th percentile, and I hope it will be small enough.
... composed of many psyphr_workbook objects, with subject/activity identification inferred from file structure of workbooks (See #21).
Able to:
saveRDS() (?)
S3 object, class name: psyphr_study
generics:
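A minimal sketch of what such an S3 class could look like, using base R only. The constructor name new_psyphr_study(), the internal workbooks field, and the choice of print() and length() as generics are assumptions for illustration, not the package's actual design.

```r
# Hypothetical constructor: wraps a named list of workbook objects.
new_psyphr_study <- function(workbooks = list()) {
  stopifnot(is.list(workbooks))
  structure(list(workbooks = workbooks), class = "psyphr_study")
}

# print() method: a one-line summary instead of dumping every workbook.
print.psyphr_study <- function(x, ...) {
  cat("<psyphr_study> with", length(x$workbooks), "workbook(s)\n")
  invisible(x)
}

# length() method: number of workbooks in the study.
length.psyphr_study <- function(x) length(x$workbooks)

# Placeholder workbooks; real ones would hold parsed sheets.
study <- new_psyphr_study(list(
  subj01 = structure(list(), class = "psyphr_workbook"),
  subj02 = structure(list(), class = "psyphr_workbook")
))
print(study)

# An S3 object is an ordinary R object, so saveRDS()/readRDS() round-trips it.
tmp <- tempfile(fileext = ".rds")
saveRDS(study, tmp)
restored <- readRDS(tmp)
```

Keeping the workbooks in a plain named list means the object can be saved without any custom serialization code.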
Use a control file in YAML or DCF.
Update #11 and #45 with final decision
End-user license (MIT vs. GPL)
"Worst" case scenario: company takes psyphr, puts GUI on it, and sells it (could be with or without attribution). Let's talk through the scenario with each license and consider whether we are comfortable with either outcome.
Can we change license after making repo public or submitting/publishing?
GBA 20190714 suggested not to do so:
Yes, I think you should be able to change licenses down the line, with all coauthors’ agreement. I’d try not to too often, though—if people ever use it within other things they make, I think a change from MIT to GPL might affect what they can do (if they’re creating under a license that isn’t open source). The general consideration, when you are maintaining a package, is to try to limit the changes you make that could break a lot of things “downstream” for people who might be building off your package. This, of course, is only a big issue when the package has a lot of users, which isn’t the case for plenty of packages (although I think yours could get a lot of downstream development, where people are using your package as a dependency in their own package). But it’s not the end of the world if you change your license later, I think.
Resources from GBA
R Packages
Understanding Open Source and Free Software Licensing
Emailed CSU HDFS labs for process input
Similar to #3, will transform_editing_sheet work for both HRV and EDA? Currently, one file exists.
check:
Hi @wendtke, as we discussed on the phone, it seems valuable to bring some of the data QA work into the package.
Could you please make a short list of common expectations, starting with those we've talked about?
For example:
@iqis I met with @geanders today. She recommended making the following changes:
importFrom magrittr %>% in roxygen notes to specify pipe
suppressWarnings() within function; instead, use purrr::quietly or purrr::safely
roxygen notes (deferred, pending better sample data)
a character string that gives path to... rather than path only
print(psyphr_workbook); see #43
Read Issue #19 first for more ideas.
Some ideas for questions. There are a lot, so we will probably have to cut some.
Workbook formats:
BPV
EMG
Startle EMG: @wendtke Are you familiar with this type? The sample data contain no information on "Right Eye". Is this expected?
IMP
BSA: Unstable format, need a closer look
We would need some examples of commonly applicable visualizations, within a subject or across a study.
Use assertthat for condition checking in functions, replacing if() or stopifnot()
Discord is a popular free app for group audio chat, available on desktop and mobile. Let's give it a try!
The following link is to our channel on Discord:
https://discord.gg/swvHChq
Leftover from #32: BSA files have an unstable format and need special treatment.
read_MW() ->
validate data format ->
dispatch corresponding parsing function
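The three-step flow above could be sketched roughly as follows, in base R. Everything here is an assumption for illustration: the function names, the idea of identifying a format by a sheet name unique to it, and the sheet names "HRV Stats" and "EDA Stats". The sketch takes an already-loaded named list of sheets so it runs without an Excel reader.

```r
# Hypothetical parsers, one per workbook format; real ones would clean sheets.
parse_hrv <- function(sheets) structure(sheets, class = c("psyphr_hrv", "list"))
parse_eda <- function(sheets) structure(sheets, class = c("psyphr_eda", "list"))

# Assumed detection rule: a format is identified by a sheet only it contains.
detect_format <- function(sheets) {
  signature_sheets <- c(hrv = "HRV Stats", eda = "EDA Stats")
  hit <- names(signature_sheets)[signature_sheets %in% names(sheets)]
  if (length(hit) != 1L) {
    stop("Cannot determine workbook format from sheet names.", call. = FALSE)
  }
  hit
}

# Validate the format, then dispatch to the matching parser.
read_mw_sketch <- function(sheets) {
  parser <- switch(detect_format(sheets), hrv = parse_hrv, eda = parse_eda)
  parser(sheets)
}

wb <- read_mw_sketch(list("HRV Stats" = data.frame(), "Settings" = data.frame()))
class(wb)
```

Failing loudly when zero or several formats match keeps an unstable format such as BSA from being silently parsed with the wrong function.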
Automatically detect and parse workbook format, using:
For every data format:
I have a video chat with a MW representative on Tuesday, June 4. I had to reschedule from a few weeks back.
I will ask about
@iqis Do you have any other questions for MindWare?
The "Interval Stats" sheet is an optional sheet appearing in BPV, HRV and EDA.
An Epoch File contains the metadata of a subject's activity period; manually tagged? How to integrate with measurement data?
In addition to reading
consider (@geanders suggestions):
psyphr_read_wb()
mutate to allow users to bring in non-MindWare data
lubridate or tidyr for examples of maintaining consistency across functions and packages (e.g., verbose = TRUE option within function)
brief description of data munging process (from start to end with data examples)
@iqis do you still want/need this? You mentioned it in the phone call.
What is the best location (in repo or out) for extra materials like the templates and example data from MindWare and the BIOPAC editing steps that one lab shared? Do we need a cloud folder, @iqis ?
Continuous integration is a service that automatically checks your code for errors each time a new commit is pushed to GitHub. A badge can be displayed showing whether the build passes the tests. At the current stage, it is most likely that our code will fail the CI's stringent standards. But don't be discouraged.
Before we can implement free CI service, our repo needs to be open.
Set up:
Create example data (or ask @MalloryJfeldman for samples) for end-user practice. We can check the readxl repo for example data placement within package. For example,
@geanders referred to a function to give path name for user to pull data; I assume it is file.path()?
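For comparison, readxl exposes its bundled files through readxl_example(), which is built on system.file() rather than file.path(). A minimal sketch of an equivalent helper follows; the name psyphr_example(), the package argument, and the file name hrv.xlsx are hypothetical, and it assumes example files would live under inst/extdata in the package source.

```r
# Hypothetical helper, modelled on readxl::readxl_example().
# With no argument it lists bundled example files; with a file name it
# returns the full path to that file inside the installed package.
psyphr_example <- function(file = NULL, package = "psyphr") {
  if (is.null(file)) {
    return(dir(system.file("extdata", package = package)))
  }
  system.file("extdata", file, package = package, mustWork = TRUE)
}

# file.path() only joins path components; it does not locate anything.
file.path("inst", "extdata", "hrv.xlsx")
```

With mustWork = TRUE, asking for a file that is not bundled raises an error instead of returning an empty string.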
compare output data across all MindWare applications; create example data from demo software
Per Discord conversation @iqis @wendtke 20190625:
A workbook generally has three ID dimensions:
This information shall be inferred from folder/name structure. See: #21.
This information is key to downstream analysis.
Issue Suspension:
See: #19
See #31 for original formulation of survey.
@MalloryJfeldman and @wendtke shared survey via email (departments; colleagues) and Twitter.
I would like to change the repo/package name to psyphyr. Do you think the best approach to this would be to (eventually) create a new repo, transfer the content and collaborators, and delete psyr?
Any suggestions would be helpful. This is not a vital change at the moment, but I figured changing the name earlier (before making the repo public or trying to publish the package) would be better.
create a README page with:
See here
Navigating authorship and contributions (from discussion with GBA)
"This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 006784-00002 [to KEW]. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation."
deferred pending more stable API and better sample data
Originally posted by @iqis in #55 (comment)
add a function to rename/restructure a collection of files to a specified naming convention that is used in psyphr.
Is submitting to rOpenSci a good idea? The platform has hosted many scientific packages through a rigorous peer-review process, example here. The dev guide is very thorough and helpful.
psyphr can fall into the data munging category.
MindWare Technologies, Ohio sells 6 analysis applications that provide output data we are interested in wrangling. These include Basic Signal Analysis (BSA), Blood Pressure Variability (BPV) Analysis, Electrodermal Activity (EDA) Analysis, Electromyography (EMG) Analysis, Heart Rate Variability (HRV) Analysis, and Impedance Cardiography (IMP) Analysis. So far, I am only familiar with EDA and HRV. Eventually, I would like psyphr to wrangle data from all MindWare analysis applications and then move to add options for data from BIOPAC Research Solutions.
Here is some more information on EDA and HRV.
EDA Analysis 3.2 Manual
HRV Analysis 3.2 Manual
Aside: BioLab is the data acquisition software, which provides the raw data files for the analysis applications. The analysis applications then export the edited output data for compilation, analysis, and visualization.
Interesting tidbit: Years ago, MindWare had its own proprietary study compilation tool for use across analysis applications. They do not offer it to clients anymore, but maybe there is content in the manual that might inform our approach to managing the file naming problem or other things. It looks like they required users to enter subject ID, etc.
Right now, as I'm trying to figure out the best approach, I need to know some common characteristics of downstream analyses. Some detailed use cases will help. For example, what are some frequently used statistical models? Is modeling usually done for each and every subject, or across some kind of summary of a group?
Originally posted by @iqis in #58 (comment)
Currently all data are read in verbatim as "character".
Make a parse_MW_() function family to address all kinds, then call from read_MW_() family
Use dplyr::mutate_*() family.
Keep categorical variables as "character" or convert them to "factor"? @wendtke This also raises another question: what are the possible levels of a factor? e.g., SCR Type in SCR Stats from the EDA databook.
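As a rough illustration of the parsing step, here is a base R sketch that swaps utils::type.convert() in for the dplyr::mutate_*() family so it runs without extra packages. The function name parse_mw_sketch(), the column names, and the values type_a and type_b are placeholders, not real MindWare output. Columns become factors only when the caller names them, which leaves the character-versus-factor question open.

```r
# Hypothetical parser: convert all-character columns read from a workbook.
parse_mw_sketch <- function(df, factor_cols = character()) {
  # type.convert() guesses logical/integer/double per column;
  # as.is = TRUE leaves non-numeric text as character rather than factor.
  df[] <- lapply(df, utils::type.convert, as.is = TRUE)
  # Only columns named by the caller become factors.
  for (col in factor_cols) {
    df[[col]] <- factor(df[[col]])
  }
  df
}

# Made-up data standing in for a sheet that was read in as character.
raw <- data.frame(
  segment = c("1", "2", "3"),
  mean_hr = c("71.2", "69.8", "74.0"),
  scr_type = c("type_a", "type_b", "type_a"),
  stringsAsFactors = FALSE
)
parsed <- parse_mw_sketch(raw, factor_cols = "scr_type")
str(parsed)
```

If the set of valid levels turns out to be fixed, passing them explicitly to factor(levels = ...) would catch unexpected values at parse time.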
See #15 for rOpenSci info
See #45 for author discussion
Proposed dissemination timeline