
kangchenghou / admix-kit


Toolkit for analyzing genetics data from admixed populations

Home Page: https://kangchenghou.github.io/admix-kit

Python 99.72% Shell 0.28%

admix-kit's People

Contributors

kangchenghou, ziqixu091

admix-kit's Issues

p-value = 0, inconsistency between marginal and marginal_fast

import admix
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1234)

binary_sim = admix.simulate.binary_pheno(dset, beta=beta, hsq=0.15, method="logit")

dset.indiv["pheno"] = binary_sim["pheno"][:, sim_i]
df_assoc_ATT = admix.assoc.marginal_fast(dset, pheno_col="pheno", family="logistic", method="ATT")
df_assoc_TRACTOR = admix.assoc.marginal_fast(dset, pheno_col="pheno", family="logistic", method="TRACTOR")

fig, axes = plt.subplots(figsize=(4, 2), ncols=2, dpi=150)
admix.plot.manhattan(df_assoc_ATT.P.values, s=10, ax=axes[0])
axes[0].set_title("ATT")
admix.plot.manhattan(df_assoc_TRACTOR.P.values, s=10, ax=axes[1])
axes[1].set_title("TRACTOR")
fig.tight_layout()

ValueError: Lengths must match to compare

Hello,

I am trying to use admix-kit with my dataset. I first downloaded the plink2 files for 1000G from the plink2 website: https://www.cog-genomics.org/plink/2.0/resources

Then I decompressed and processed them the same way as described in the toy.sh script,

here for chr 1:
plink2 --pfile chr1_phase3  --rm-dup exclude-all --max-alleles 2 --maf 0.01 --snps-only --seed 0 --make-pgen --out chr1_phase3.admix

Then I ran admix on it:

admix lanc --pfile mydataset.chr1.QCed --ref-pfile chr1_phase3.admix --ref-pop-col "SuperPop" --ref-pops "EUR,AFR,AMR,EAS,SAS" --out test.lanc

But I got an error :

Traceback (most recent call last):
  File "../bin/admix", line 392, in <module>
    fire.Fire()
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "../bin/admix", line 22, in lanc
    assert np.all(sample_dset.snp.index == ref_dset.snp.index), (
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/pandas/core/ops/common.py", line 65, in new_method
    return method(self, other)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/pandas/core/arraylike.py", line 29, in __eq__
    return self._cmp_method(other, operator.eq)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 5615, in _cmp_method
    raise ValueError("Lengths must match to compare")
ValueError: Lengths must match to compare

Any idea how to solve this?

Toy dataset works well btw.

Thanks
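
A minimal sketch of what the assertion in the traceback runs into, assuming the sample and reference pfiles simply contain different numbers of SNPs. The SNP IDs below are made up for illustration; `sample_snps` and `ref_snps` stand in for `sample_dset.snp.index` and `ref_dset.snp.index`. Comparing two pandas indexes of different lengths with `==` raises before `np.all` is evaluated, and an intersection shows which SNPs both sides share.

```python
import pandas as pd

# Hypothetical SNP IDs; the real indexes come from the two pfiles.
sample_snps = pd.Index(["rs1", "rs2", "rs3"])
ref_snps = pd.Index(["rs1", "rs3", "rs4", "rs5"])

# Element-wise comparison requires equal lengths, so this raises.
try:
    sample_snps == ref_snps
    message = None
except ValueError as err:
    message = str(err)

# SNPs present in both datasets; both files could be subset to these.
shared = sample_snps.intersection(ref_snps)
print(message)
print(list(shared))
```

If this is the cause, restricting both pfiles to the shared SNP set before running admix lanc may avoid the error. The list of available commands quoted further down this page includes pfile_align_snp, which may be relevant.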

dataset loading with region specified

dset_region = admix.data.load_lab_dataset(
    name="ukb_eur_afr_imputed", region="1:1000-2000"
)

This fails because no SNPs in the data set fall within the region.

dset_region = admix.data.load_lab_dataset(
    name="ukb_eur_afr_imputed", region="1:100-2000000"
)

This is because the start of the region is selected.

admix cli pheno single files

All CLI functionality should take a single phenotype file, with the 1st column as IID, the 2nd column as phenotype values, and the 3rd and later columns as covariates. Missing values should be coded as NA. This should be documented in prepare dataset.
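
A small sketch of the proposed layout, assuming a whitespace-delimited file with a header row. The column names (PHENO, AGE, SEX) and values are made up for illustration and are not part of admix-kit.

```python
import io
import pandas as pd

# Hypothetical file content: IID, phenotype, then covariates; NA marks missing.
text = """IID PHENO AGE SEX
id1 1.2 54 1
id2 NA 61 0
id3 0.4 NA 1
"""

df = pd.read_csv(io.StringIO(text), sep=r"\s+", index_col=0, na_values=["NA"])

pheno = df.iloc[:, 0]   # 2nd column of the file: phenotype values
covar = df.iloc[:, 1:]  # 3rd and later columns: covariates

print(pheno.isna().sum(), list(covar.columns))
```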

Error in Admix assoc marginal (binary)

Hello,

I am sending information about my attempt to run ATT, TRACTOR and SNP1 on our data (cases and controls for Parkinson's disease).

Our covar file has 13 columns:
indiv AGE SEX PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10

Our pheno file has 2 columns:
indiv DISEASE

Our pfile is from imputed data (VCF to PFILE) with a modified PSAM:

$ head ./PFILES/LARGE_chr22_Biallelic_COVAR.psam -n 1
#IID AGE SEX PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10

Thank you very much,
Thiago Peixoto Leal
Log.txt

Simulate 3-way admixture issues

Hi,
I am trying the command:
admix haptools-simu-admix \
    --pfile data/example_data/1kg-ref \
    --admix-prop '{"CEU": 0.4, "YRI": 0.1, "PEL": 0.5}' \
    --pop-col Population \
    --mapdir data/1kg-ref-${BUILD}/metadata/genetic_map/ \
    --n-gen 10 \
    --n-indiv 10000 \
    --out data/example_data/CEU-YRI-PEL

But, I get an error:
ERROR: Cannot find key: haptools-simu-admix
Usage: admix <group|command>
available groups: fire
available commands: assoc | append_snp_info | calc_pgs |
calc_partial_pgs | grm | log_params | get_1kg_ref |
select_admix_indiv | simulate_admix_pheno |
simulate_pheno | lanc | lanc_convert | lanc_count |
prune | pca | liftover | pfile_align_snp |
pfile_merge_indiv | pfile_freq_within |
subset_hapmap3 | subset_pop_indiv | hapgen2 |
admix_simu | download_dependency | plot_joint_pca |
admix_grm | admix_grm_merge | genet_cor |
admix_grm_rho | estimate_genetic_cor |
summarize_genet_cor | meta_analyze_genet_cor | cli

For detailed information on this command, run:
admix --help

Can't Download The Reference

Describe the bug
Hi, I downloaded the package and tried to run the simulation example code. After the setup step (in the prepare source data section), the following command ran into an error:
admix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD}
I would like to know whether I need to download anything or provide more information beyond copying and pasting the example. Thank you very much!

To Reproduce

BUILD=hg19
N_INDIV=100000
CHROM=22
N_GEN=8

REF_DATA_DIR=data/1kg-ref-${BUILD}
RAW_DATA_DIR=data/raw
GENO_DATA_DIR=data/geno
mkdir -p ${RAW_DATA_DIR}
mkdir -p ${GENO_DATA_DIR}
admix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD}

Error Message
CalledProcessError: Command 'b'# genome build to use\nBUILD=hg19\n# number of admixed individuals to simulate\nN_INDIV=100000\n# chromosome \nCHROM=22\n# number of generations\nN_GEN=8\n\n# setup\nREF_DATA_DIR=data/1kg-ref-${BUILD}\nRAW_DATA_DIR=data/raw\nGENO_DATA_DIR=data/geno\nmkdir -p ${RAW_DATA_DIR}\nmkdir -p ${GENO_DATA_DIR}\n\n# Create the necessary directories if they do not already exist\nmkdir -p ${RAW_DATA_DIR}\nmkdir -p ${GENO_DATA_DIR}\n\nadmix get-1kg-ref --dir ${REF_DATA_DIR} --build ${BUILD}\n\n'' returned non-zero exit status 1.

Numerical overflow

import xarray as xr
dset_region = xr.open_zarr("/u/scratch/y/yiding/admixture-finemapping/data/ukb_eur_afr_imputed/1/1:1000000-2000000")
X = dset_region.geno.values
X[:,3,:].T @ X[:,3,:]
array([[ -7,  21],
       [ 21, -12]], dtype=int8)

The int8 data type leads to numerical overflow when computing the X.T @ X matrix.
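
A self-contained sketch of the same effect on synthetic data (not the UKB dataset above). The result of an int8 matrix product is itself int8, so any entry above 127 wraps around. Casting to a wider integer type before multiplying is one way to avoid this.

```python
import numpy as np

# Synthetic genotypes: 200 rows, 2 columns, every entry equal to 1.
X = np.ones((200, 2), dtype=np.int8)

# The product keeps the int8 dtype, so the true value 200 cannot be represented.
xtx_int8 = X.T @ X

# Casting first gives the accumulation enough range.
X_wide = X.astype(np.int32)
xtx_int32 = X_wide.T @ X_wide

print(xtx_int8.dtype, xtx_int32.dtype)
print(xtx_int32)
```

The cast costs extra memory (4 bytes per entry instead of 1), so for large regions it may be preferable to cast block by block.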

consistency between lanc and bp2anc.pl

We should generate more complicated 3-way admixture data and check consistency between lanc and bp2lanc.pl.

import admix
import numpy as np
import pandas as pd

lanc1 = admix.data.Lanc("data/admix.lanc").dask().compute()
lanc2 = admix.io.read_digit_mat("data/admix.hanc").T

admix.plot.lanc(lanc=lanc1)

assert np.all(lanc2[:, 0::2] == lanc1[:, :, 0]) and np.all(
    lanc2[:, 1::2] == lanc1[:, :, 1]
)

Index of df_indiv does not match the columns of lanc

Describe the bug
Hi, I am running admix lanc-convert with a pfile and an msp file. The msp spans the entire chr1 and the pfile is a shard of chr1 (not sure if that matters). I get the following error after running for ~1 hour:

df_snp["POS"][-1] + 1, df_rfmix_info.loc[len(df_rfmix_info) - 1, "epos"]
Traceback (most recent call last):
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/bin/admix", line 8, in <module>
sys.exit(cli())
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/lib/python3.9/site-packages/admix/cli/__init__.py", line 39, in cli
fire.Fire()
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/lib/python3.9/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/lib/python3.9/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/lib/python3.9/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/lib/python3.9/site-packages/admix/cli/lanc.py", line 135, in lanc_convert
lanc = admix.io.read_rfmix(
File "/group/tools/Anaconda/Anaconda3-2019.10/envs/admix-kit/lib/python3.9/site-packages/admix/io/_read.py", line 320, in read_rfmix
assert np.all(df_indiv.index == lanc.columns)
AssertionError

Admix kit function:

admix lanc-convert \
    --pfile $pfilestem \
    --rfmix $mspchr1 \
    --out $out/test-lanc

Thanks,
Mike
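
A hedged diagnostic sketch for the failing assertion `np.all(df_indiv.index == lanc.columns)`. Since the error is an AssertionError rather than a length mismatch, the two sides presumably have the same number of entries but differ in order or in the IDs themselves. The sample IDs below are made up; `psam_ids` and `msp_ids` are hypothetical stand-ins for the individual IDs in the pfile and the sample labels in the msp file.

```python
import pandas as pd

# Hypothetical sample IDs from the .psam file and from the msp header.
psam_ids = pd.Index(["s1", "s2", "s3"])
msp_ids = pd.Index(["s2", "s1", "s3"])

same_order = psam_ids.equals(msp_ids)       # exact match, including order
same_set = set(psam_ids) == set(msp_ids)    # same IDs, ignoring order
only_in_psam = psam_ids.difference(msp_ids)
only_in_msp = msp_ids.difference(psam_ids)

print(same_order, same_set, list(only_in_psam), list(only_in_msp))
```

If `same_set` is true but `same_order` is false, reordering one file to match the other may be enough. Otherwise the two difference lists show which IDs are missing on each side.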

Error in assoc-quant

I followed the steps for performing association analysis.

I have a .lanc file for mydataset.chr1 generated by admix lanc.

admix assoc-quant --pfile data/mydataset.chr1 --pheno samples_pheno.txt --pheno-col PHENO --method ATT --out test.assoc

samples_pheno.txt has the same sample order and has two columns: col1 (indiv) contains sample IDs and col2 (PHENO) contains phenotypes (2: cases; 1: controls).

An error occurred:

admix.Dataset: `n_anc` is not provided, infered n_anc from the first 1,000 SNPs is 2. If this is not correct, provide `n_anc` when constructing admix.Dataset
Traceback (most recent call last):
  File "admix", line 392, in <module>
    fire.Fire()
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "../bin/admix", line 382, in assoc_quant
    dict_rls[m] = admix.assoc.marginal(
  File "/home/nicolas/Documents/admix-kit/admix/assoc/__init__.py", line 260, in marginal
    model = sm.OLS(pheno, design).fit(disp=0)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/regression/linear_model.py", line 872, in __init__
    super(OLS, self).__init__(endog, exog, missing=missing,
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/regression/linear_model.py", line 703, in __init__
    super(WLS, self).__init__(endog, exog, missing=missing,
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/regression/linear_model.py", line 190, in __init__
    super(RegressionModel, self).__init__(endog, exog, **kwargs)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py", line 237, in __init__
    super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py", line 77, in __init__
    self.data = self._handle_data(endog, exog, missing, hasconst,
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py", line 101, in _handle_data
    data = handle_data(endog, exog, missing, hasconst, **kwargs)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/base/data.py", line 672, in handle_data
    return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/base/data.py", line 87, in __init__
    self._handle_constant(hasconst)
  File "/home/nicolas/anaconda3/lib/python3.8/site-packages/statsmodels/base/data.py", line 133, in _handle_constant
    raise MissingDataError('exog contains inf or nans')
statsmodels.tools.sm_exceptions.MissingDataError: exog contains inf or nans

Also, when is the .lanc file used in the association, given that we do not pass it to the function?

Thank you
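
The statsmodels error says that the design matrix (`exog`) contains NaN or inf. Which column is responsible in this report is not known. The sketch below only shows, on a made-up matrix, how the offending rows and columns could be located before fitting.

```python
import numpy as np

# Hypothetical design matrix: intercept, one genotype column, one covariate.
design = np.array(
    [
        [1.0, 0.0, 0.3],
        [1.0, np.nan, -1.2],
        [1.0, 2.0, np.inf],
    ]
)

finite = np.isfinite(design)
bad_cols = np.where(~finite.all(axis=0))[0]  # columns with any NaN/inf
bad_rows = np.where(~finite.all(axis=1))[0]  # individuals with any NaN/inf

print(bad_cols.tolist(), bad_rows.tolist())
```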

beagle vs. eagle2 genetic map

Currently we use the beagle genetic map for rfmix and the eagle2 genetic map for hapgen. Ideally we would keep only one.
TODO: check consistency between these two genetic maps.
Make an x-y plot of genetic position vs. physical position and compare across the two data sources.
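
One possible way to run the comparison, sketched on made-up numbers (the positions and cM values below are not taken from either map): interpolate both maps onto a common grid of physical positions and look at the difference in genetic position.

```python
import numpy as np

# Hypothetical maps: physical position (bp) and genetic position (cM).
pos_a = np.array([1_000_000, 2_000_000, 3_000_000, 4_000_000])
cm_a = np.array([0.0, 1.0, 2.5, 3.0])
pos_b = np.array([1_000_000, 2_500_000, 4_000_000])
cm_b = np.array([0.0, 1.8, 3.1])

# Common grid of physical positions covered by both maps.
grid = np.linspace(1_000_000, 4_000_000, 7)

# Linear interpolation of genetic position at each grid point.
interp_a = np.interp(grid, pos_a, cm_a)
interp_b = np.interp(grid, pos_b, cm_b)

max_abs_diff = np.max(np.abs(interp_a - interp_b))
print(max_abs_diff)
```

The two interpolated curves can then be drawn against `grid` for the x-y plot described above.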

Fail to install admix-kit

Describe the bug
I tried to install admix-kit following the instructions. I updated cmake first, but the install still seems to use the old version.

To Reproduce
cd admix-kit && pip install -e .

Screenshots
ERROR: Command errored out with exit status 1:
command: /n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-_1ohrzxs/tinygwas/setup.py'"'"'; file='"'"'/tmp/pip-install-_1ohrzxs/tinygwas/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-f_9vj7u1
cwd: /tmp/pip-install-_1ohrzxs/tinygwas/
Complete output (39 lines):
running bdist_wheel
running build
running build_ext
CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
CMake 3.12 or higher is required. You are running version 2.8.12.2

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-_1ohrzxs/tinygwas/setup.py", line 74, in <module>
setup(
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 290, in run
self.run_command('build')
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/tmp/pip-install-_1ohrzxs/tinygwas/setup.py", line 36, in run
self.build_extension(ext)
File "/tmp/pip-install-_1ohrzxs/tinygwas/setup.py", line 65, in build_extension
subprocess.check_call(
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-install-_1ohrzxs/tinygwas', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/pip-install-_1ohrzxs/tinygwas/build/lib.linux-x86_64-3.8', '-DPYTHON_EXECUTABLE=/n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python', '-DCMAKE_BUILD_TYPE=Release']' returned non-zero exit status 1.

ERROR: Failed building wheel for tinygwas
Running setup.py clean for tinygwas
Building wheel for pylampld (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-_1ohrzxs/pylampld/setup.py'"'"'; file='"'"'/tmp/pip-install-_1ohrzxs/pylampld/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-v_wd4m4h
cwd: /tmp/pip-install-_1ohrzxs/pylampld/
Complete output (39 lines):
running bdist_wheel
running build
running build_ext
CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
CMake 3.12 or higher is required. You are running version 2.8.12.2

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-_1ohrzxs/pylampld/setup.py", line 69, in <module>
setup(
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 290, in run
self.run_command('build')
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/tmp/pip-install-_1ohrzxs/pylampld/setup.py", line 34, in run
self.build_extension(ext)
File "/tmp/pip-install-_1ohrzxs/pylampld/setup.py", line 62, in build_extension
subprocess.check_call(['cmake', ext.sourcedir] + cmake_args,
File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-install-_1ohrzxs/pylampld', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/pip-install-_1ohrzxs/pylampld/build/lib.linux-x86_64-3.8', '-DPYTHON_EXECUTABLE=/n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python', '-DCMAKE_BUILD_TYPE=Release']' returned non-zero exit status 1.

ERROR: Failed building wheel for pylampld
Running setup.py clean for pylampld
Failed to build tinygwas pylampld
Installing collected packages: tinygwas, pylampld, admix-kit
Running setup.py install for tinygwas ... error
ERROR: Command errored out with exit status 1:
command: /n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-_1ohrzxs/tinygwas/setup.py'"'"'; file='"'"'/tmp/pip-install-_1ohrzxs/tinygwas/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-s0fxgx4d/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /n/home13/xinanwang/.local/include/python3.8/tinygwas
cwd: /tmp/pip-install-_1ohrzxs/tinygwas/
Complete output (41 lines):
running install
running build
running build_ext
CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
CMake 3.12 or higher is required. You are running version 2.8.12.2

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-_1ohrzxs/tinygwas/setup.py", line 74, in <module>
    setup(
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/site-packages/setuptools/command/install.py", line 61, in run
    return orig.install.run(self)
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/command/install.py", line 545, in run
    self.run_command('build')
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/tmp/pip-install-_1ohrzxs/tinygwas/setup.py", line 36, in run
    self.build_extension(ext)
  File "/tmp/pip-install-_1ohrzxs/tinygwas/setup.py", line 65, in build_extension
    subprocess.check_call(
  File "/n/sw/eb/apps/centos7/Anaconda3/2020.11/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-install-_1ohrzxs/tinygwas', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/pip-install-_1ohrzxs/tinygwas/build/lib.linux-x86_64-3.8', '-DPYTHON_EXECUTABLE=/n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python', '-DCMAKE_BUILD_TYPE=Release']' returned non-zero exit status 1.
----------------------------------------

ERROR: Command errored out with exit status 1: /n/sw/eb/apps/centos7/Anaconda3/2020.11/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-_1ohrzxs/tinygwas/setup.py'"'"'; file='"'"'/tmp/pip-install-_1ohrzxs/tinygwas/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-s0fxgx4d/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /n/home13/xinanwang/.local/include/python3.8/tinygwas Check the logs for full command output.
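
The log shows that the build picked up CMake 2.8.12.2 while 3.12 or higher is required, which suggests an older cmake comes first on PATH. A small sketch for checking a version string against that minimum follows. The helper names `parse_cmake_version` and `meets_minimum` are hypothetical, and the input is assumed to look like the first line of `cmake --version`.

```python
import re


def parse_cmake_version(text):
    """Extract the numeric version from text such as 'cmake version 3.12.0'."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version found in: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum=(3, 12)):
    """Return True if the parsed version is at least `minimum`."""
    return parse_cmake_version(text) >= minimum


print(meets_minimum("cmake version 2.8.12.2"))
print(meets_minimum("cmake version 3.12.0"))
```

In practice the text could come from `subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout`, and `shutil.which("cmake")` shows which executable is actually found first.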


versioning / packaging

  1. replace setup.py with pyproject.toml. setup.py can still be useful for editable installs (but as this package matures, we may not need that anymore).
  2. add versioning to each update.
  3. update to pip

Timeline:

  1. Have a releasable version and upload it to pip (this can be done after going through the whole package; the pyproject.toml update happens here).

cyvcf2 conflict in haptools-simu-admix

Describe the bug
Hi, I am going through the simulation tutorial (https://kangchenghou.github.io/admix-kit/simulate-admix-genotype.html) and running into an error that seems to be caused by a conflict between the Python packages in the admix-kit conda env and cyvcf2. I thought it might be related to the first answer here (https://stackoverflow.com/questions/35006614/what-does-symbol-not-found-expected-in-flat-namespace-actually-mean).

To Reproduce
admix haptools-simu-admix \
    --pfile ${OUT_DIR}/1kg-ref \
    --admix-prop '{"CEU": 0.4, "YRI": 0.1, "PEL": 0.5}' \
    --pop-col Population \
    --mapdir data/1kg-ref-${BUILD}/metadata/genetic_map/ \
    --n-gen 15 \
    --n-indiv 1000 \
    --out ${OUT_DIR}/CEU-YRI-PEL

2024-05-09 19:27:29 [info ] Received parameters:
haptools-simu-admix
--pfile=data/example_data/1kg-ref
--admix_prop={'CEU': 0.4, 'YRI': 0.1, 'PEL': 0.5}
--pop_col=Population
--mapdir=data/1kg-ref-hg38/metadata/genetic_map/
--n_gen=15
--n_indiv=1000
--out=data/example_data/CEU-YRI-PEL
1000 ADMIX CEU YRI PEL
15 0 0.4 0.1 0.5
/miniconda3/envs/admix-kit/lib/python3.12/site-packages/dapgen/_read.py:435: FutureWarning: The 'delim_whitespace' keyword in pd.read_csv is deprecated and will be removed in a future version. Use sep='\s+' instead
df_psam = pd.read_csv(path, delim_whitespace=True, skiprows=skiprows)
/miniconda3/envs/admix-kit/lib/python3.12/site-packages/dapgen/_read.py:412: FutureWarning: The 'delim_whitespace' keyword in pd.read_csv is deprecated and will be removed in a future version. Use sep='\s+' instead
pd.read_csv(path, delim_whitespace=True, skiprows=skiprows)
2024-05-09 19:27:29 [info ] haptools simgenotype --model data/example_data/CEU-YRI-PEL.tmpdata/admix.dat --mapdir data/1kg-ref-hg38/metadata/genetic_map/ --ref_vcf data/example_data/1kg-ref.pgen --sample_info data/example_data/CEU-YRI-PEL.tmpdata/sample_info.txt --out data/example_data/CEU-YRI-PEL.tmpdata/admix.pgen --region 22:16406147-49474282 --seed 1234
Traceback (most recent call last):
File "/miniconda3/envs/admix-kit/bin/haptools", line 8, in <module>
sys.exit(main())
^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/click/core.py", line 783, in invoke
return _callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/haptools/main.py", line 234, in simgenotype
from .sim_genotype import (
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/haptools/sim_genotype.py", line 8, in <module>
from cyvcf2 import VCF
File "~/miniconda3/envs/admix-kit/lib/python3.12/site-packages/cyvcf2/__init__.py", line 1, in <module>
from .cyvcf2 import (VCF, Variant, Writer, r_ as r_unphased, par_relatedness,
ImportError: dlopen(/miniconda3/envs/admix-kit/lib/python3.12/site-packages/cyvcf2/cyvcf2.cpython-312-darwin.so, 0x0002): symbol not found in flat namespace '_bcf_float_missing'
Traceback (most recent call last):
File "/miniconda3/envs/admix-kit/bin/admix", line 8, in <module>
sys.exit(cli())
^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/admix/cli/__init__.py", line 39, in cli
fire.Fire()
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/admix/cli/_ext.py", line 319, in haptools_simu_admix
admix.tools.haptools_simu_admix(
File "/miniconda3/envs/admix-kit/lib/python3.12/site-packages/admix/tools/_standalone.py", line 441, in haptools_simu_admix
subprocess.check_output(cmd, shell=True)
File "/miniconda3/envs/admix-kit/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/miniconda3/envs/admix-kit/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'haptools simgenotype --model data/example_data/CEU-YRI-PEL.tmpdata/admix.dat --mapdir data/1kg-ref-hg38/metadata/genetic_map/ --ref_vcf data/example_data/1kg-ref.pgen --sample_info data/example_data/CEU-YRI-PEL.tmpdata/sample_info.txt --out data/example_data/CEU-YRI-PEL.tmpdata/admix.pgen --region 22:16406147-49474282 --seed 1234' returned non-zero exit status 1.

Thanks,
Mike

Error in admix lanc_convert

Hello,

I am trying to convert rfmix output to lanc format using admix lanc_convert:

admix lanc_convert input out.lanc --rfmix input.msp.tsv

I got this error :

2022-02-07 10:11.59 [info     ] Received parameters: 
lanc-convert
  --pfile=input
  --out=output.lanc
  --rfmix=input.msp.tsv
  --raw=None
Traceback (most recent call last):
  File "/home/user/.local/bin/admix", line 33, in <module>
    sys.exit(load_entry_point('admix-kit', 'console_scripts', 'admix')())
  File "/home/users/tools/admix-kit/new/admix-kit/admix/cli/__init__.py", line 37, in cli
    fire.Fire()
  File "/home/user/.local/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/user/.local/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/user/.local/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/users/tools/admix-kit/new/admix-kit/admix/cli/_lanc.py", line 54, in lanc_convert
    lanc = admix.io.read_rfmix(
  File "/home/users/tools/admix-kit/new/admix-kit/admix/io/_read.py", line 340, in read_rfmix
    chunk_start, chunk_stop = np.where(chunk_mask)[0][[0, -1]]
IndexError: index 0 is out of bounds for axis 0 with size 0

Any idea how to solve this? Thank you.
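
The failing line, `np.where(chunk_mask)[0][[0, -1]]`, raises exactly this IndexError whenever `chunk_mask` has no True entry. Why the mask is empty in this run cannot be told from the log; one guess is that no SNP in the pfile falls inside some RFMix segment. The sketch below reproduces the failure on a toy mask and shows a guarded variant. `chunk_bounds` is a hypothetical helper, not part of admix-kit.

```python
import numpy as np


def chunk_bounds(chunk_mask):
    """Return (first, last) index where the mask is True, or None if it is empty."""
    idx = np.where(chunk_mask)[0]
    if idx.size == 0:
        return None
    return int(idx[0]), int(idx[-1])


# An all-False mask reproduces the error from the traceback.
empty_mask = np.zeros(5, dtype=bool)
try:
    np.where(empty_mask)[0][[0, -1]]
    message = None
except IndexError as err:
    message = str(err)

print(message)
print(chunk_bounds(empty_mask))
print(chunk_bounds(np.array([False, True, True, False])))
```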

use plink2 as the storage engine

Instead of using zarr as the storage engine, consider using plink2.

It should support random access; see https://github.com/chrchang/plink-ng/tree/master/2.0/Python

The big advantage is that we could reuse much of plink2's existing functionality.

TODO: check whether plink2 supports storing phasing information.

Other than this, we can have a separate file to store local ancestries.

The first step may be to implement functions similar to those in pandas-plink.
